Models
After you create a project, you can build models. Using models, you can leverage the historical data learned within the project to make predictions about the future.
To create a new model:
Right-click the Exploration icon in the project tree, and then click the appropriate model for your needs.
Assign a new name to the model. Certain models allow additional options for feature selection and model reduction. Furthermore, because these models rely heavily on matrix inversion, you can adjust the sensitivity of such operations through the Reciprocal Condition Number Threshold. Note that this setting is intended for advanced users only.
Check the attributes you want to use as inputs and targets. Click Build Model to continue.
After completing the model-building process, double-click the model icon to see information about your new model.
Most SML models allow model-specific options to be enabled before building. One example of such an option is the pseudo matrix inverse. With the pseudo inverse enabled, a linear model that would typically not be built because its matrix is not invertible can still be created. However, such a model is likely to be unstable, so this option should only be used by advanced users.
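To make the relationship between the Reciprocal Condition Number Threshold and the pseudo inverse option more concrete, the following is a minimal NumPy sketch, not SymetryML's implementation. The function name `fit_linear` and the parameters `rcond_threshold` and `use_pinv` are hypothetical and chosen for illustration only. The sketch solves the least-squares normal equations directly when the matrix is well conditioned, and falls back to the Moore-Penrose pseudo inverse only when that is explicitly requested.

```python
import numpy as np


def fit_linear(X, y, rcond_threshold=1e-12, use_pinv=False):
    """Fit least-squares coefficients from the normal equations.

    rcond_threshold and use_pinv are hypothetical names standing in for
    the Reciprocal Condition Number Threshold and pseudo inverse options.
    """
    xtx = X.T @ X
    xty = X.T @ y
    # Reciprocal condition number: values near 0 mean the matrix is
    # close to singular and a plain inversion is numerically unsafe.
    rcond = 1.0 / np.linalg.cond(xtx)
    if rcond >= rcond_threshold:
        return np.linalg.solve(xtx, xty)
    if use_pinv:
        # The pseudo inverse ignores singular values below the cutoff.
        return np.linalg.pinv(xtx, rcond=rcond_threshold) @ xty
    raise np.linalg.LinAlgError(
        "matrix is not invertible at the given threshold"
    )


# Two identical columns make X'X singular.
X = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
y = np.array([2.0, 4.0, 6.0])
coef = fit_linear(X, y, use_pinv=True)
print(coef)
```

With duplicated columns there are infinitely many coefficient vectors that fit the data equally well. The pseudo inverse picks the one with the smallest norm, which illustrates why a model built this way can be unstable.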
The predefined build parameters will typically have an effect on how the model is fitted. However, the user is free to add additional tags of their own choosing. These will not affect the build process and will simply be treated as additional metadata for the model.
When working with only a handful of attributes, creating models by manually specifying the input attributes is often sufficient to get an acceptable result. However, as the number of attributes grows, the choice of which attributes to use can be aided by a model selection heuristic.
The model selection heuristic within SymetryML works as follows. First, the user decides on the general type of modelling to be done: Classification or Regression. Then, an out-of-sample data file is provided. SymetryML uses this out-of-sample file to test various combinations of input attributes and selects the one that provides the highest AUC for Classification problems, or the lowest RMSE for Regression problems.
If the user has a particular algorithm in mind, then the only remaining choices concern attributes and model hyperparameters. To make them, right-click the Exploration node, pick Select Model, choose Regression or Classification as the task type, and finally pick the model you want.
However, if there is no strong preference regarding the algorithm, the user can simply use the Auto Select option. This requires specifying the task type: Binary Classification or Regression. Multiclass Classification is also available for partitioned projects. SymetryML then cycles through the appropriate algorithms for the task and picks their hyperparameters accordingly.
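The idea of scoring attribute combinations against an out-of-sample file can be sketched as follows for the Regression case. This is an illustrative assumption rather than SymetryML's actual search: it tries every subset exhaustively and fits an ordinary least-squares model with NumPy, whereas the product's heuristic and search space are not described here. The names `rmse` and `select_attributes` are hypothetical.

```python
import itertools

import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between two sequences."""
    diff = np.asarray(y_true, float) - np.asarray(y_pred, float)
    return float(np.sqrt(np.mean(diff ** 2)))


def select_attributes(X_train, y_train, X_test, y_test, names):
    """Return (best_rmse, best_names) over all non-empty attribute subsets.

    Exhaustive search is an assumption made to keep the sketch short.
    """
    best = None
    for k in range(1, len(names) + 1):
        for subset in itertools.combinations(range(len(names)), k):
            cols = list(subset)
            # Fit on the training data, with an intercept column.
            A = np.column_stack([np.ones(len(X_train)), X_train[:, cols]])
            coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
            # Score on the out-of-sample data only.
            B = np.column_stack([np.ones(len(X_test)), X_test[:, cols]])
            score = rmse(y_test, B @ coef)
            if best is None or score < best[0]:
                best = (score, [names[i] for i in cols])
    return best


X_train = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [3.0, 0.0]])
y_train = np.array([1.0, 3.0, 5.0, 7.0])  # y = 2 * x0 + 1
X_test = np.array([[4.0, 1.0], [5.0, 0.0]])
y_test = np.array([9.0, 11.0])
print(select_attributes(X_train, y_train, X_test, y_test, ["x0", "x1"]))
```

For a Classification task, the same loop would compute AUC on the out-of-sample data and keep the subset with the highest value instead of the lowest.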
The available task options are based on the type of project selected. Binary Classification and Regression are always available. For Multiclass Classification, the project must be of type Partition.
Models within a particular class (i.e., Classification or Regression) can be cloned and rebuilt with a different algorithm by simply selecting the preferred algorithm within the model info view.
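The availability rule above can be summarized in a few lines. This is only a restatement of the rule as code: the function name `available_tasks` and the use of plain strings for project types are assumptions, not part of any SymetryML API.

```python
def available_tasks(project_type):
    """List the task types offered for a given project type.

    The string "Partition" mirrors the project type named in the text;
    any other value is treated as a non-partitioned project.
    """
    tasks = ["Binary Classification", "Regression"]
    if project_type == "Partition":
        tasks.append("Multiclass Classification")
    return tasks


print(available_tasks("Partition"))
```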
Sequence models, unlike their classification and regression counterparts, require the project to be constructed as a sequence project from the start.
Once the model is built, you can double-click the corresponding model icon in the projects view to see the associated metadata.
The type of information displayed depends on the class of model.
LDA models have a special property that differentiates them from other algorithms: they can be refined, or updated, in real time. This means you can add or remove attributes as you see fit, and the model is rebuilt instantly. To refine a model, select the Input Attributes you want to add and remove the ones you no longer need.
SymetryML can reduce LDA models automatically to find one with the best subset of attributes. This is useful when working with datasets that have many attributes. SymetryML uses a heuristic to build various models, compares the models, and keeps the best one. This comparison is performed efficiently by using the intrinsic values of the models.
To remove a model:
Right-click the model node, and then click Delete Model.
You can assess the performance of most models (excluding MC) using a labeled file.
Right-click the model node, and then click Assessment.
Specify the input data source you will use for the assessment, and then click Next to continue.
Verify the data source you added, and then click Next to continue.
Ensure that your model’s target variables are mapped to the target variables of your file.
Click Finish to start the assessment process.
After the assessment process finishes, a dialog box asks whether you want to see the assessment results. If you click Yes, the assessment information panel appropriate for your model type appears (see the following figures).
Simple binary classification models show a confusion matrix with lift curve information.
Regression models show a distribution of errors curve in addition to RMSE and MAE.
Multi-target classification models show a descriptive confusion matrix.
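The quantities named in the assessment panels can be reproduced by hand from a labeled file. The sketch below uses NumPy and is an illustration only. The function names are hypothetical, the binary labels are assumed to be encoded as 0 and 1, and the lift curve and error distribution plots are left out.

```python
import numpy as np


def confusion_matrix(y_true, y_pred):
    """2x2 counts: rows are actual labels (0, 1), columns are predictions."""
    matrix = np.zeros((2, 2), dtype=int)
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual, predicted] += 1
    return matrix


def regression_errors(y_true, y_pred):
    """Return RMSE and MAE for a set of regression predictions."""
    err = np.asarray(y_pred, float) - np.asarray(y_true, float)
    return {
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "mae": float(np.mean(np.abs(err))),
    }


print(confusion_matrix([0, 0, 1, 1, 1], [0, 1, 1, 1, 0]))
print(regression_errors([1.0, 2.0, 3.0], [1.0, 2.0, 6.0]))
```

RMSE penalizes large errors more heavily than MAE, so a wide gap between the two usually points to a small number of large misses.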
Once built, a model can be exported from a project and later imported into a different project. This allows for workflows in which a production environment is completely separate from a development environment while still leveraging models built in the latter.
Importing a previously exported model is as simple as selecting the Import Model option and specifying the model file.
In addition to performing an assessment, you can use an existing model for predictions against a file.
Right-click the model node, and then click Predict.
Specify the input data source, and then click Next to continue.
Specify the output data source, which is the location where you want your predictions saved. Click Next to continue. A preview of the input data source appears.
Verify that the data source has been read correctly, and then click Next to continue.
Ensure the types in the data source match those of the model. Click Next to continue.
After the prediction process completes, a sample of the prediction result appears. To download the entire prediction file, click the Download button.
For more information detailing the Search Heuristic, Search Space, Optimization Metric see section on