Classification

As of MATLAB 2015, the Statistics and Machine Learning Toolbox comes with a GUI classifier. Here is a skeleton for training and testing a classifier, since the MATLAB documentation for the prediction step is incorrect (see step 9).

  1. Create a training data Excel file with variable names in the top row
    • Avoid spaces, "-", etc. in the variable names
  2. Create a second test data Excel file using the same column names
    • The outcome variable is not necessary, but it's fine to include it
  3. Import both Excel files as tables:
    • T_train = readtable('input.xlsx');
    • T_test = readtable('test.xlsx');
  4. Go to the "Classification Learner" app (Apps tab)
  5. Start a new session, select data from Workspace
  6. Select the "T_train" data, and identify the response variable
  7. Start session, play around till you get something you like
  8. Export the model to the workspace
  9. To test, assuming you used the default name "trainedClassifier":
    • yfit = trainedClassifier.predictFcn( T_test )
    • This is incorrectly documented in MATLAB.
    • This outputs a predicted response for each row in the table T_test.
  10. To report accuracy, use the accuracy reported for the trained model; you can also report the test accuracy. Reporting only the model accuracy is valid only if you have some protection against overfitting (the default is cross-validation).
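
Steps 9 and 10 can be sketched in code as follows. This is a minimal illustration, not the app's own output: the tables are synthetic stand-ins for the imported Excel data, the column names X1, X2 and Outcome are hypothetical, and the struct built here only imitates an exported "trainedClassifier" by wrapping a decision tree behind a predictFcn field. With a real exported model, only the last part (prediction and accuracy) would be needed, and the test table must include the outcome column for accuracy to be computed.

```matlab
% Sketch only: synthetic data stands in for the imported Excel tables,
% and "Outcome" is a hypothetical response column name.
rng(0);                                  % reproducible random data
n = 100;
X1 = randn(n, 1);
X2 = randn(n, 1);
Outcome = double(X1 + X2 > 0);           % made-up binary response (0 or 1)
T_all = table(X1, X2, Outcome);
T_train = T_all(1:70, :);
T_test = T_all(71:end, :);

% Stand-in for the exported struct (assumption): a decision tree wrapped
% so that it exposes a predictFcn field that accepts a table.
predictors = {'X1', 'X2'};
mdl = fitctree(T_train{:, predictors}, T_train.Outcome);
trainedClassifier.predictFcn = @(T) predict(mdl, T{:, predictors});

% Step 9: one predicted response per row of the test table.
yfit = trainedClassifier.predictFcn(T_test);

% Step 10: test accuracy is the fraction of rows predicted correctly.
testAccuracy = mean(yfit == T_test.Outcome);
fprintf('Test accuracy: %.1f%%\n', 100 * testAccuracy);

% Confusion matrix: rows are true classes, columns are predicted classes.
C = confusionmat(T_test.Outcome, yfit);
disp(C);
```

If the response column holds text labels stored as a cell array, the comparison would use strcmp(yfit, T_test.Outcome) instead of ==.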

Excel format example
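
As a rough illustration of the layout described in step 1, the sketch below checks candidate column names with isvarname and builds a small table with one variable per column. All names and values here are made up for illustration and are not taken from the original spreadsheet.

```matlab
% Hypothetical column names, for illustration only.
goodNames = {'Age', 'Blood_Pressure', 'Outcome'};
badNames = {'Blood Pressure', 'Blood-Pressure', '2nd_Visit'};

% isvarname returns true only for valid MATLAB variable names.
goodOk = cellfun(@isvarname, goodNames);
badOk = cellfun(@isvarname, badNames);

% Made-up rows showing the expected layout: variable names on top,
% one observation per row, response variable in its own column.
T_example = table([34; 51; 47], [120; 135; 128], [0; 1; 0], ...
    'VariableNames', goodNames);
disp(T_example);
```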

Select imported table and response variable

Protection against overfitting
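
The idea behind cross-validation as protection against overfitting can also be sketched outside the app. The example below is an assumption-laden illustration: it uses synthetic data, a decision tree rather than whichever model was picked in the app, and 5 folds chosen for illustration. It compares the accuracy on the training data itself with the cross-validated accuracy, which is the more honest estimate.

```matlab
% Sketch only: synthetic data and a decision tree stand in for the real
% table and for the model type selected in the app.
rng(0);                                  % reproducible random data
n = 200;
X = randn(n, 2);
y = double(X(:, 1) + X(:, 2) > 0);       % made-up binary response

mdl = fitctree(X, y);

% Accuracy measured on the same data used for training (optimistic).
resubAccuracy = 1 - resubLoss(mdl);

% 5-fold cross-validation: each fold is predicted by a model that was
% trained on the other four folds.
cvmdl = crossval(mdl, 'KFold', 5);
cvAccuracy = 1 - kfoldLoss(cvmdl);

fprintf('Resubstitution accuracy: %.1f%%\n', 100 * resubAccuracy);
fprintf('Cross-validated accuracy: %.1f%%\n', 100 * cvAccuracy);
```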