RASON Analytics API Help

Essential Rason Model Sections

As mentioned in Defining a Data Science Model, there are four essential sections that must exist in a single Rason Data Science model. This Help topic describes each section, besides modelName and modelType, in more detail.

"datasources"

As mentioned above, this section is used to specify how the data will be acquired. Typically, data will be contained in an external data source such as a delimited file, Excel workbook, or database.

This section, "datasources", is an object with user-defined attributes, where each attribute defines an object with "type", "connection" and "direction" properties. The following example defines three data sources: myTrainingData, myValidationData and myTestData.

  
    "datasources": {
      "myTrainingData":{
        "type":"csv",
        "connection":"PathToDataFilesOrTrainingData.txt",
        "direction": "import"
      },
      "myValidationData":{
        "type":"csv",
        "connection":"PathToDataFilesOrValidationData.txt",
        "direction": "import"
      },
      "myTestData":{
        "type":"csv",
        "connection":"PathToDataFilesOrTestData.txt",
        "direction": "import"
      }
    }

The "type" property describes the file type of the data file being imported into the RASON model; here, all three data sources are CSV files. The "connection" property gives the location of each data file, and the "direction" property specifies whether the file is being imported or exported. The default for "direction" is "import".

Aside from the "type" and "connection" properties, additional properties exist for specific types of data sources, such as "headerExists" for delimited files or "selection" for SQL database selection. For the full list of properties for the "datasources" section, see the RASON Reference Guide. For examples of how to import from various data sources, see either the RASON Reference Guide or the Editor page on RASON.com.
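The default for "direction" described above can be illustrated with a short Python sketch that fills in the property before a model is serialized. This is only an illustration, not part of RASON: the helper normalize_datasources, the data source myExportData and the file name results.csv are hypothetical, and treating "type" and "connection" as required is an assumption based on the examples in this topic.

```python
import json


def normalize_datasources(datasources):
    """Return a copy in which every data source has an explicit "direction"."""
    normalized = {}
    for name, source in datasources.items():
        # Assumption of this sketch: "type" and "connection" are always present,
        # as they are in every example in this topic.
        for required in ("type", "connection"):
            if required not in source:
                raise ValueError(f'data source "{name}" is missing "{required}"')
        entry = dict(source)
        # Documented default: "direction" is "import" when omitted.
        entry.setdefault("direction", "import")
        normalized[name] = entry
    return normalized


datasources = {
    "myTrainingData": {
        "type": "csv",
        "connection": "PathToDataFilesOrTrainingData.txt",
    },
    # Hypothetical export target, used only to show an explicit "direction".
    "myExportData": {
        "type": "csv",
        "connection": "results.csv",
        "direction": "export",
    },
}

normalized = normalize_datasources(datasources)
print(json.dumps({"datasources": normalized}, indent=2))
```

The printed fragment has the same shape as the "datasources" example above, with "direction" written out for every entry.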

Note: RASON V2020 makes it exceptionally easy to work with data sources in the Microsoft ecosystem, by creating a Data Connection on the user's My Account page on www.RASON.com. The RASON service supports the following data connections.

OneDrive and OneDrive for Business
Common Data Service for Dynamics 365, Power Apps and Power Automate
OData and CDS support for Power BI
CData Cloud Hub support for access to 100+ enterprise data sources.

For more information on how to create and maintain Data Connections, see the previous Data Connections topic within the RASON Subscriptions topic.

"datasets"

The "datasets" section is an object with user-defined attributes, where each attribute defines an object with a "binding" property. The following example defines two data sets: myTrainData and myValidData.

  
    "datasets": {
      "myTrainData": {
        "binding": "myTrainSrc",
        "targetCol": "Y"
      },
      "myValidData": {
        "binding": "myValidSrc",
        "targetCol": "Y"
      }
    },

In this example code snippet, two datasets are initialized, "myTrainData" and "myValidData". The data source "myTrainSrc" is bound to the "myTrainData" dataset. Likewise, the data source "myValidSrc" is bound to the "myValidData" dataset.

The "binding" property specifies the data source to be bound. It can also be bound to the output of, or data sources in, other stages. The "binding" property is not applicable if the user provides the data inline, i.e. enters data manually into the RASON model. For a list of all properties that may appear in a given data set definition, see the RASON Reference Guide.
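The relationship between "binding" and "datasources" can be sketched as a simple cross-reference check in Python. The function unbound_datasets is a hypothetical helper, not a RASON feature, and it only looks inside a single model: a binding that legitimately points to another stage would be reported here as well, so the result is a hint rather than an error.

```python
def unbound_datasets(model):
    """Return names of datasets whose "binding" matches no data source in this model."""
    sources = model.get("datasources", {})
    missing = []
    for name, dataset in model.get("datasets", {}).items():
        binding = dataset.get("binding")
        # Datasets with inline data have no "binding", so they are skipped.
        if binding is not None and binding not in sources:
            missing.append(name)
    return missing


model = {
    "datasources": {
        "myTrainSrc": {"type": "csv", "connection": "hald-small-train.txt"},
    },
    "datasets": {
        "myTrainData": {"binding": "myTrainSrc", "targetCol": "Y"},
        # "myValidSrc" is deliberately left undefined to trigger the check.
        "myValidData": {"binding": "myValidSrc", "targetCol": "Y"},
    },
}

missing = unbound_datasets(model)
print(missing)
```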

"estimator"/"transformer"

The "estimator" object estimates a model from the training data and stores the fitted model, which may be used later. The "estimator" object implements the "fit" interface. The "transformer" object is used to differentiate the algorithms that do not have a model, i.e. they do not implement the "fit" interface. Rather, these algorithms implement the "transform" interface (only).

"estimator"

The "estimator" section defines the estimator used to fit the model. Estimators extract a model from the input data. This model can be used in other RASON models using a dataset binding to the output. This element is mutually exclusive with the "transformer" element. Both may not appear in the same stage definition. (Only one transformer or estimator may appear in a given RASON Data Science model.) An example of the estimator "Find Best Model" is shown below.

  
    "estimator": {
      "cfbmEstimator": {
        "type": "classification",
        "algorithm": "findBestModel"
      }
    },

In this example, a new estimator, cfbmEstimator, is initialized. This estimator will run all available classification learners and determine the learner with the "best fit" to the dataset based on user specifications. See the classification and regression examples below for more information.

Properties for "estimator" are:

"type" – Must be one of the following: "classification", "regression", "clustering", "textMining", "transformation", "timeSeries".
"algorithm" – The selection for this property varies with the selected "type". See the chart below to see which algorithms correspond to the selected "type".

Options for Type and Algorithm Properties
Type	Algorithm Choice
"classification"	"boosting", "bagging", "neuralNetwork", "decisionTree", "randomTrees", "nearestNeighbors", "naiveBayes", "discriminantAnalysis" or "logisticRegression"
"regression"	"boosting", "bagging", "neuralNetwork", "decisionTree", "randomTrees", "nearestNeighbors", "linearRegression"
"clustering"	"kMeans" or "hierarchical"
"textMining"	"tfIdf" or "latentSemanticAnalysis"
"transformation"	"oneHotEncoding", "imputation", "rescaling", "principalComponentAnalysis", "binning", "factorization", "canonicalVariateAnalysis", "syntheticDataGenerator", "summarization"
"featureSelection"	"univariate", "linearWrapping" or "logisticWrapping"
"timeSeries"	"addHoltWinters", "mulHoltWinters", "noTrendHoltWinters", "doubleExponential", "exponential", "movingAverage", "arima" or "lagAnalysis"

"parameters" – The property options for "parameters" will vary depending on the algorithm selected. For a complete list of properties for each algorithm, see the RASON Reference Guide.

"simulation" - In order to run the synthetic data generator, described later in this chapter, “simulation”:{} must be called within the estimator. All parameters applying to the synthetic data generator are passed within “simulation”. For example:

    
      "estimator": {
        "mlrEstimator": {
          "type": "regression",
          "algorithm": "linearRegression",
          "parameters": {
            "fitIntercept": true
          },
          "simulation": {
            "metalogAuto": true,
              "numMetalogTerms": [
                ["CRIM", 5],
                ["ZN", 5],
                ["INDUS", 5],
                …
              ]
          }
        }
      }
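The chart of "type" and "algorithm" options above can be expressed as a lookup table. The following Python sketch copies the chart as written and reports whether a given pair appears in it; the function is_listed is a hypothetical helper. Note that "findBestModel", used in the earlier example, does not appear in the chart, so a negative result here means "not listed in this topic" rather than "invalid".

```python
# Type -> algorithm choices, copied from the estimator chart in this topic.
ESTIMATOR_ALGORITHMS = {
    "classification": {
        "boosting", "bagging", "neuralNetwork", "decisionTree", "randomTrees",
        "nearestNeighbors", "naiveBayes", "discriminantAnalysis",
        "logisticRegression",
    },
    "regression": {
        "boosting", "bagging", "neuralNetwork", "decisionTree", "randomTrees",
        "nearestNeighbors", "linearRegression",
    },
    "clustering": {"kMeans", "hierarchical"},
    "textMining": {"tfIdf", "latentSemanticAnalysis"},
    "transformation": {
        "oneHotEncoding", "imputation", "rescaling",
        "principalComponentAnalysis", "binning", "factorization",
        "canonicalVariateAnalysis", "syntheticDataGenerator", "summarization",
    },
    "featureSelection": {"univariate", "linearWrapping", "logisticWrapping"},
    "timeSeries": {
        "addHoltWinters", "mulHoltWinters", "noTrendHoltWinters",
        "doubleExponential", "exponential", "movingAverage", "arima",
        "lagAnalysis",
    },
}


def is_listed(estimator):
    """Return True if the estimator's type/algorithm pair appears in the chart."""
    allowed = ESTIMATOR_ALGORITHMS.get(estimator.get("type"), set())
    return estimator.get("algorithm") in allowed


mlr = {"type": "regression", "algorithm": "linearRegression"}
cfbm = {"type": "classification", "algorithm": "findBestModel"}

print(is_listed(mlr))   # listed in the chart
print(is_listed(cfbm))  # used in the example above, but absent from the chart
```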

"transformer"

A "transformer" applies to algorithms that do not fit a model but rather transform data, such as Feature Selection or Sampling. Since no model is stored (i.e. transformers take data in and return data out), transformation algorithms are represented by a single object. For example, when applying a sampling algorithm to a dataset, there is nothing to estimate from the training data, so there is nothing to store in a model for future actions.

This element is mutually exclusive with the "estimator" element. Both may not appear in the same RASON model. (Only one transformer or estimator may appear in a given RASON Data Science model.) An example of the transformer "mySampler" (appearing in the Transformation - Sampling.json RASON example on RASON.com) is shown below.

  
    "transformer": {
      "mySampler": {
        "type": "transformation",
        "algorithm": "sampling",
        "parameters": {
          "sampleSize": 4,
          "replaceOption": "false",
          "sortIndexes": "false",
          "seed": 123
        }
      }
    },

In the example code snippet above, the transformer "mySampler" is initialized. This transformer will perform a "transformation" (type: transformation) using the sampling algorithm (algorithm: sampling). Four options, sampleSize, replaceOption, sortIndexes and seed, are specified.
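The rule that "estimator" and "transformer" are mutually exclusive, with only one of either per model, can be sketched as a small Python check. The function stage_kind is a hypothetical helper; returning None when neither element is present is an assumption of this sketch, not documented RASON behavior.

```python
def stage_kind(model):
    """Return "estimator" or "transformer", whichever the model defines, else None."""
    present = [key for key in ("estimator", "transformer") if key in model]
    if len(present) > 1:
        raise ValueError('"estimator" and "transformer" are mutually exclusive')
    if not present:
        return None  # assumption: left to the caller to decide if this is an error
    definitions = model[present[0]]
    if len(definitions) != 1:
        raise ValueError(f'exactly one {present[0]} may be defined, found {len(definitions)}')
    return present[0]


sampler_model = {
    "transformer": {
        "mySampler": {
            "type": "transformation",
            "algorithm": "sampling",
            "parameters": {"sampleSize": 4, "seed": 123},
        }
    }
}

kind = stage_kind(sampler_model)
print(kind)

# A model that defines both elements is rejected.
both = dict(sampler_model, estimator={"e": {"type": "regression", "algorithm": "linearRegression"}})
try:
    stage_kind(both)
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```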

Properties for "transformer" are:

"type" – Must be one of the following: "affinityAnalysis", "bigData", "featureSelection" or "transformation".
"algorithm" – The selection for this property varies with the selected "type". See the chart below to see which algorithms correspond to the selected "type".

Options for Type and Algorithm Properties
Type	Algorithm Choice
"affinityAnalysis"	"associationRules"
"bigData"	"sampling" or "summarization"
"transformation"	"sampling", "stratifiedSampling", "partitioning", "oversamplePartitioning", "categoryReduction", "syntheticDataGenerator" and "summarization"

"parameters" – The property options for "parameters" will vary depending on the algorithm selected. For a complete list of properties for each algorithm, see the RASON Reference Guide.

"actions"

The estimator or transformer is applied to the data within the "actions" section. An example of the action "nnpModel" (RASON Example Models – Data Science – Regression – Fitted Models POSTed to Server -- NeuralNetworkPostFM.json) is shown below.

  
    "actions": {
      "nnpModel": {
        "trainData": "myTrainData",
        "estimator": "nnpEstimator",
        "export": "json",
        "action": "fit",
        "evaluations": [
          "trainingLog",
          "neuronWeights",
          "numEpochsUsed",
          "trainingTime",
          "stoppingReason",
          "partitionCausedStopping"
        ]
      }
    },

In the example code snippet above, the "nnpModel" action is initialized. The model created from the "nnpEstimator" (estimator: nnpEstimator) will be applied or "fit" (action: fit) to the "myTrainData" dataset. Several results or "evaluations" are requested in the final results: the training log (trainingLog), the neuron weights (neuronWeights), number of epochs (numEpochsUsed), the time spent training the model (trainingTime), the reason the algorithm stopped (stoppingReason) and the partition used to evaluate the performance of the algorithm (partitionCausedStopping).

Note the "export" property. This property posts the fitted model, in either JSON or PMML format ("export": "json" or "export": "pmml"), to the RASON Server under the "modelName" property setting. To export the fitted model to a JSON or XML file instead, replace the "export" property with the "binding" property. If neither "export" nor "binding" is included within "actions", the fitted model will only be produced in-memory. An in-memory fitted model may be used in a decision flow. For more information on POSTing a fitted model to the RASON server or exporting a fitted model, see the section POSTing/Exporting Fitted Models, below.

A second example of an action for a transformer is below. Notice there is no keyword to replace "estimator" within "actions", as in the example above. When using a transformer, there's no estimator/model, so the actions can unambiguously refer to the transformer object only.

  
    transformer: {
      mySampler: {
        type: 'transformation',
        algorithm: 'sampling',
        parameters: {
          sampleSize: 4,
          replaceOption: false,
          sortIndexes: false,
          seed: 123
        }
      }
    },
    actions: {
      sampleData: {
        data: 'myData',
        action: 'transform',
        export: 'json',
        evaluations: [
          'transformation'
        ]
      }
    }
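Because each action refers to other sections by name, a quick consistency check can catch typos before a model is submitted. The Python sketch below is illustrative only: check_actions is a hypothetical helper, the estimator definition for nnpEstimator is assumed for the example, and the sketch assumes that "trainData" and "data" name entries in "datasets". The four valid "action" values are those listed in the properties for this section.

```python
VALID_ACTIONS = {"fit", "predict", "transform", "forecast"}


def check_actions(model):
    """Return a list of human-readable problems found in the "actions" section."""
    problems = []
    datasets = model.get("datasets", {})
    estimators = model.get("estimator", {})
    for name, action in model.get("actions", {}).items():
        if action.get("action") not in VALID_ACTIONS:
            problems.append(f'{name}: unknown action {action.get("action")!r}')
        estimator = action.get("estimator")
        # Transformer actions carry no "estimator" reference, so None is fine.
        if estimator is not None and estimator not in estimators:
            problems.append(f'{name}: estimator {estimator!r} is not defined')
        for key in ("trainData", "data"):
            ref = action.get(key)
            if ref is not None and ref not in datasets:
                problems.append(f'{name}: {key} {ref!r} is not a defined dataset')
    return problems


model = {
    "datasets": {"myTrainData": {"binding": "myTrainSrc", "targetCol": "Y"}},
    # Assumed definition, for illustration only.
    "estimator": {"nnpEstimator": {"type": "regression", "algorithm": "neuralNetwork"}},
    "actions": {
        "nnpModel": {
            "trainData": "myTrainData",
            "estimator": "nnpEstimator",
            "action": "fit",
            "evaluations": ["trainingLog"],
        }
    },
}

ok_problems = check_actions(model)
print(ok_problems)

# Introduce a misspelled estimator name to show a reported problem.
model["actions"]["nnpModel"]["estimator"] = "nnEstimator"
bad_problems = check_actions(model)
print(bad_problems)
```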

The following properties are available for the "actions" section:

"trainData" – This property may be used interchangeably with the property, "data". In some algorithms, it is possible to provide both "trainData" and "validData" i.e. for classification and regression algorithms.

"estimator" – This property is used to reference the estimator defined in the "estimator" object.

"export" – Use the "export" property to POST the fitted model to the RASON Server. Use "binding" to export the fitted model to a XML or JSON file. If neither property exists, then the fitted model will only be produced in-memory. For more information on POSTing a fitted model to the RASON server or exporting a fitted model, see the section POSTing/Exporting Fitted Models, below. Note: The following transformation methods do not generate a fitted model: sampling, partitioning and SQL transformation.

Notes

The following transformation methods do not generate a fitted model: sampling, partitioning and SQL transformation.
The following transformation methods only produce a fitted model in JSON format: categoryreduction, factorization, imputation, and principalcomponentsanalysis.
The following estimators do not generate a fitted model: Feature Selection (logisticWrapping, linearWrapping, univariate).
The following estimators only produce a fitted model in JSON format: hierarchical and kmeans clustering.
The following estimator only produces a fitted model in PMML: affinityAnalysis.

"fittedModel" – Used when scoring a model, this property is used to reference the model generated inside of the "model" object. For more information on scoring, see the example below.

"action" - Valid values for this property are "fit", "predict", "transform", or "forecast". As the name suggests, "fit" fits the model given "estimator" and "trainData". The remaining options, "predict", "transform" and "forecast", apply the fitted model for further options on partitions or new data.

"parameters" – The selection for this property depends on the "model" or "estimator" selected. If these options are directly applicable to the prediction/transformation/forecast of the data within this action specifically (i.e. the "successProbability" when classifying different datasets), you may use different values for scoring each dataset using the same model. If using "numPrincipalComponents" when running Principal Components Analysis, you may request a different number of components when transforming each dataset using the same PCA model. For all valid parameters and evaluations for each algorithm, see the Rason Reference Guide.

"evaluations" – This property specifies the results to be reported back to the user. Only those evaluations specified for this property will be computed or reported. Evaluation results may either be 1. A part of the RASON response or 2. Bound to a writeable datasource. In the example below, "fittedModelJson" and "regressionSummary" are part of the RASON response while "influenceDiagnostics" is bound to the writeable datasource "myExportSrc". To view this complete example, see LinearRegression.json on the Editor page on RASON.com. Note: Some code has been removed from the example below for simplicity.

  
  {
    "modelName": "LinearRegression",
    "modelDescription": "regression: linear model; scoring examples JSONLinearRegression.json and PMMLRegressor.json use exported fitted model, mlrModel, to score new data",
    "modelType": "datamining",
    "datasources": {
      "myTrainSrc": {
        "type": "csv",
        "connection": "hald-small-train.txt",
        "direction": "import"
      },
    …
      "myExportSrc": {
        "type": "csv",
        "content": "export",
        "connection": "influence-diagnostics.csv",
        "direction": "export"
      }
    },
    "datasets": {
      "myTrainData": {
        "binding": "myTrainSrc",
        "targetCol": "Y"
      },
    …
    },
    "estimator": {
      "mlrEstimator": {
        "type": "regression",
        "algorithm": "linearRegression",
        "parameters": {
          "fitIntercept": true
        }
      }
    },
    "actions": {
      "mlrModel": {
        "trainData": "myTrainData",
        "estimator": "mlrEstimator",
        "action": "fit",
        "evaluations": [
          "fittedModelJson",
          {
            "name": "influenceDiagnostics",
            "binding": "myExportSrc"
          },
          "regressionSummary"
          …
        ]
        …
      }
    }
  }
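The two forms an entry in "evaluations" can take in the example above, a plain name or an object with "name" and "binding", can be separated with a few lines of Python. This is a sketch for reading a model definition, not part of RASON; split_evaluations is a hypothetical helper.

```python
def split_evaluations(evaluations):
    """Split evaluations into those returned in the response and those bound to a datasource."""
    in_response = []
    bound = {}
    for item in evaluations:
        if isinstance(item, str):
            # A plain name: the result is reported in the RASON response.
            in_response.append(item)
        else:
            # An object: the result is written to the named writeable datasource.
            bound[item["name"]] = item["binding"]
    return in_response, bound


evaluations = [
    "fittedModelJson",
    {"name": "influenceDiagnostics", "binding": "myExportSrc"},
    "regressionSummary",
]

in_response, bound = split_evaluations(evaluations)
print(in_response)
print(bound)
```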

Notes on exporting to a writeable data source:

1. "Type": "CSV" and "Type": "JSON" simply create or overwrite the files with the dataframe/table evaluation.
2. The "selection" property specifies the Excel worksheet and is optional when "Type": "Excel". If not provided, the worksheet name is assigned automatically based on the RASON script's name, action and evaluation, e.g. mlr-mlrmodel-influenceDiagnostics.
3. The "selection" property specifies the database table name and is optional for all database types. If not provided, the table name is assigned automatically as in 2 above.
4. Users can write to the same Excel workbook or the same database, adding new worksheets/tables with subsequent evaluations.
5. It is also possible to create a new MS Access database file and write evaluations there.
6. Creating a new database for MS SQL/Oracle types, or when using an ODBC connection string, is not supported. As a result, "connection" must point to an existing database.

For more examples on exporting results to a writeable datasource, see the "datasources" topic in the Rason Reference Guide.
