Sampling Example

Let's step through a few RASON data science examples, starting with how to draw a sample from a dataset.


  {
    "modelName": "Sampling",
    "modelDescription": "transformation: sampling",
    "modelType": "datamining",
    "datasources": {
      "mySrc": {
        "type": "csv",
        "connection": "hald-small-binary.txt",
        "direction": "import"
      }
    },
    "datasets": {
      "myData": {
        "binding": "mySrc"
      }
    },
    "transformer": {
      "mySampler": {
        "type": "transformation",
        "algorithm": "sampling",
        "parameters": {
          "sampleSize": 4,
          "replaceOption": "false",
          "sortIndexes": "false",
          "seed": 123
        }
      }
    },
    "actions": {
      "sampleData": {
        "data": "myData",
        "action": "transform",
        "evaluations": [
          "transformation"
        ]
      }
    }
  }

Within "datasources", the hald-small-binary dataset (contained within hald-small-binary.txt) is declared as an import source and given the name "mySrc". This file contains data in CSV format.

Note: Input files in a Data Science Rason model must not contain a path to a file location.

hald-small-binary.txt

Within "datasets", the data imported through mySrc is bound to myData. The sampling transformer, mySampler, is specified within the "transformer" attribute. Since we are sampling, the "type" is specified as "transformation" and the "algorithm" is specified as "sampling". The sampling options are set under "parameters": sample size (sampleSize), sample with replacement (replaceOption), index sorting (sortIndexes) and the random seed value (seed). For a complete list of options associated with the sampling transformer, please see the RASON Reference Guide.

Finally, under "actions", the transformation (the sampling) is performed on the myData dataset. Under "evaluations" we see the quantities to be computed and reported. In this example, the sample is the result.
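To make the four sampling parameters concrete, here is a minimal Python sketch of what each one controls. It is not RASON's implementation: the function name sample_rows and its arguments are hypothetical, indices are 0-based, and Python's random generator differs from the one used by the RASON Server, so the same seed will not reproduce the records returned by the model above.

```python
import random


def sample_rows(n_rows, sample_size, replace=False, sort_indexes=False, seed=None):
    """Return row indices for a random sample (illustrative only, not RASON code).

    sample_size  ~ "sampleSize"
    replace      ~ "replaceOption"
    sort_indexes ~ "sortIndexes"
    seed         ~ "seed"
    """
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    if replace:
        # With replacement: the same row may be drawn more than once.
        indices = [rng.randrange(n_rows) for _ in range(sample_size)]
    else:
        if sample_size > n_rows:
            raise ValueError("sample_size cannot exceed n_rows without replacement")
        # Without replacement: every drawn row is distinct.
        indices = rng.sample(range(n_rows), sample_size)
    return sorted(indices) if sort_indexes else indices


# n_rows=13 is an assumed row count; the source does not state the dataset size.
picked = sample_rows(n_rows=13, sample_size=4, replace=False, sort_indexes=False, seed=123)
print(picked)
```

Calling the function twice with the same seed returns the same indices, which is the practical reason for setting "seed" in the model.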

  

  Getting model results: GET https://rason.net/api/model/2590+Sampling+2020-01-20-01-07-51-620648/result

  {
    "status": {
      "id": "2590+Sampling+2020-01-20-01-07-51-620648",
      "code": 0,
      "codeText": "Success"
    },
    "results": ["sampleData.transformation"],
    "sampleData": {
      "transformation": {
        "objectType": "dataFrame",
        "name": "Sample:mydata",
        "order": "col",
        "rowNames": ["Record 5", "Record 3", "Record 9", "Record 7"],
        "colNames": ["Y", "X1", "X2", "X3", "X4", "Weights"],
        "colTypes": ["double", "double", "double", "double", "double", "double"],
        "indexCols": null,
        "data": [
          [1, 1, 0, 1],
          [7, 11, 2, 3],
          [52, 56, 54, 71],
          [6, 8, 18, 17],
          [33, 20, 22, 6],
          [1, 2, 1, 1]
        ]
      }
    }
  }
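A client could request and unpack this response along the following lines. This is a sketch under stated assumptions: the helper names build_result_request and extract_results are hypothetical, the bearer-token Authorization header is assumed rather than taken from this topic, and treating any nonzero status code as a failure is inferred only from the "code": 0 / "Success" pair shown above. The request object is built but not sent.

```python
import urllib.request

RASON_BASE_URL = "https://rason.net/api"


def build_result_request(model_id, token):
    """Build (but do not send) the GET request for a model's result."""
    url = f"{RASON_BASE_URL}/model/{model_id}/result"
    headers = {
        "Authorization": f"Bearer {token}",  # assumed authentication scheme
        "Accept": "application/json",
    }
    return urllib.request.Request(url, method="GET", headers=headers)


def extract_results(response):
    """Return {dotted name: value} for every entry listed under "results"."""
    status = response["status"]
    if status["code"] != 0:  # assumption: 0 is the only success code
        raise RuntimeError(f"Model {status['id']} failed: {status['codeText']}")
    found = {}
    for dotted_name in response["results"]:
        node = response
        for key in dotted_name.split("."):  # "sampleData.transformation"
            node = node[key]
        found[dotted_name] = node
    return found


request = build_result_request("2590+Sampling+2020-01-20-01-07-51-620648", "YOUR_TOKEN")
print(request.get_method(), request.full_url)
```

Resolving each name in "results" as a dotted path avoids hard-coding the action and evaluation names, so the same helper works for models with several actions.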

According to the results, the records sampled were:

Sampling Results

Index   Y   X1   X2   X3   X4   Weights
5       1    7   52    6   33   1
3       1   11   56    8   20   2
9       0    2   54   18   22   1
7       1    3   71   17    6   1
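Because the result uses "order": "col", each inner list of "data" holds one column rather than one record. The table above can be rebuilt from the returned DataFrame by transposing those columns, as in this minimal Python sketch. The function name frame_to_rows is hypothetical, and the frame literal simply copies the values from the response shown earlier.

```python
frame = {
    "order": "col",
    "rowNames": ["Record 5", "Record 3", "Record 9", "Record 7"],
    "colNames": ["Y", "X1", "X2", "X3", "X4", "Weights"],
    "data": [
        [1, 1, 0, 1],      # Y
        [7, 11, 2, 3],     # X1
        [52, 56, 54, 71],  # X2
        [6, 8, 18, 17],    # X3
        [33, 20, 22, 6],   # X4
        [1, 2, 1, 1],      # Weights
    ],
}


def frame_to_rows(frame):
    """Turn a column-ordered frame into {row name: {column name: value}}."""
    if frame.get("order") != "col":
        raise ValueError("this sketch only handles column-ordered frames")
    records = zip(*frame["data"])  # transpose: one tuple per record
    return {
        row_name: dict(zip(frame["colNames"], record))
        for row_name, record in zip(frame["rowNames"], records)
    }


rows = frame_to_rows(frame)
for row_name, values in rows.items():
    print(row_name, values)
```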

A Note about Data Frames

A DataFrame, in Rason DM, is a collection of data organized into named columns of equal length, where each column holds values of a single type. Rason DM uses DataFrames to deliver input data to an algorithm and to deliver the results of the algorithm back to the user. Types may differ from one column (variable) to the next: numeric, categorical, or textual.

Examples of basic DataFrame tasks are:

  • Creating and filling DataFrames
  • Selecting a subset of columns/rows
  • Appending columns or rows
  • Selecting subsets for training and verification models

RASON V2020 introduces two new REST API endpoints, POST rason.net/api/solve and POST rason.net/api/model/{nameorid}/solve, that automatically create an OData endpoint returning the result as a DataFrame object. For more information, see the Using the REST API topic.