Microsoft Machine Learning Studio Recommendation System Data Splitting Explained

In Microsoft Machine Learning Studio, the data for a recommendation system is split differently from Lab1 and Lab2: every test user also takes part in the training process. To split the data this way, use the 'Split Data' module in 'Recommender Split' mode and set two parameters.

The first parameter is the 'Fraction of training-only users,' which is set to 0.8. This means that 80% of the users in the dataset are training-only users: all of their ratings go into the training set and none are held out for testing.

The remaining 20% of the users are the test users. The second parameter, the 'Fraction of test user ratings for training,' applies only to them. It is set to 0.25, which means that 25% of each test user's ratings are added to the training set. The remaining 75% of their ratings are held out and used to evaluate the model's performance.
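The two-step split described above can be sketched in plain Python. This is only an illustration of the idea, not Studio's actual implementation: the function name recommender_split, its parameter names, the use of round() to turn fractions into counts, and the toy data are all assumptions made for this example.

```python
import random
from collections import defaultdict


def recommender_split(ratings, training_only_user_fraction=0.8,
                      test_user_training_fraction=0.25, seed=0):
    """Split (user, item, rating) rows in the spirit of 'Recommender Split'.

    Hypothetical helper for illustration; Studio may round or sample differently.
    """
    rng = random.Random(seed)

    # Group the ratings by user so whole users can be assigned to a role.
    by_user = defaultdict(list)
    for row in ratings:
        by_user[row[0]].append(row)

    # Step 1: pick the training-only users (80% by default).
    users = sorted(by_user)
    rng.shuffle(users)
    n_training_only = round(len(users) * training_only_user_fraction)
    training_only_users = users[:n_training_only]
    test_users = users[n_training_only:]

    train, test = [], []

    # All ratings of training-only users go to the training set.
    for user in training_only_users:
        train.extend(by_user[user])

    # Step 2: for each test user, send 25% of the ratings to training
    # and hold out the remaining 75% for testing.
    for user in test_users:
        rows = by_user[user][:]
        rng.shuffle(rows)
        n_train = round(len(rows) * test_user_training_fraction)
        train.extend(rows[:n_train])
        test.extend(rows[n_train:])

    return train, test, set(test_users)


# Toy data (assumed): 10 users, each rating the same 8 items.
ratings = [(f"u{u}", f"item{i}", (u + i) % 5 + 1)
           for u in range(10) for i in range(8)]

train, test, test_users = recommender_split(ratings)
print(len(train), len(test))  # 68 12
```

With 10 users and 8 ratings each, 8 users are training-only (64 ratings). Each of the 2 test users contributes 2 ratings to training and 6 to testing, which gives 68 training ratings and 12 test ratings. Every test user therefore appears in both sets.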

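Overall, this approach is designed so that every user, including each test user, contributes ratings to training, while only the held-out ratings of the test users are used for testing. Because the model has already seen some ratings from each test user, the evaluation measures how well it predicts the remaining ratings of users it knows, rather than of completely new users.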
