Friday, 30 October 2015

Mahout recommender, how to use different CSV files for training and testing

I am building a recommender system in Mahout. I have two files (train.csv and test.csv) that should be used separately for training and testing. I have searched a lot online for how to pass train.csv to the training step and test.csv to the evaluation step, but I could not understand what others suggested. In the question "Test and training with different dataset with MAHOUT", the answers mention that we can change the evaluate class so that it takes two different files. Can someone explain exactly how to do that? I am using Eclipse, and I could not find the (evaluate) class or work out how to modify it as suggested. I am new to Mahout, and I am not sure what they are talking about!

Or, if you have any other way to supply train.csv and test.csv separately, I would really appreciate it. Thanks for your help. Here is some of my code:

...
        RandomUtils.useTestSeed();
        DataModel model = new FileDataModel(new File("train.csv"));
        UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
        UserNeighborhood neighborhood = new ThresholdUserNeighborhood(0.1, similarity, model);
        UserBasedRecommender recommender = 
                new GenericUserBasedRecommender(model, neighborhood, similarity);
....

    RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();
    RecommenderBuilder builder = new MyRecommenderBuilder();
    // How can I use evaluate() with both train.csv and test.csv?
    double result = evaluator.evaluate(builder, null, model, 0.9, 1.0);
    System.out.println(result);
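
For context, the `evaluate(builder, null, model, 0.9, 1.0)` call above splits one `DataModel` into training and evaluation parts by percentage, so it has no slot for a second file. One possible workaround, sketched here rather than taken from Mahout itself, is to build the recommender on train.csv only and compute the average absolute difference over the test.csv rows by hand. The sketch below uses only the Java standard library so that it runs on its own. The class `SeparateTestSetEvaluation`, the interface `PreferenceEstimator`, and the method `meanAbsoluteError` are hypothetical names introduced for illustration, and the sample rows are made up. It also assumes test.csv uses `userID,itemID,rating` lines.

```java
import java.util.Arrays;
import java.util.List;

public class SeparateTestSetEvaluation {

    /**
     * Hypothetical stand-in for a trained recommender. With Mahout, a method
     * reference such as recommender::estimatePreference should fit this shape,
     * where the recommender was built from train.csv only.
     */
    @FunctionalInterface
    public interface PreferenceEstimator {
        float estimate(long userID, long itemID) throws Exception;
    }

    /**
     * Average absolute difference between estimated and actual ratings over
     * held-out lines of the assumed form "userID,itemID,rating".
     * Rows the estimator cannot score (exception or NaN) are skipped.
     */
    public static double meanAbsoluteError(List<String> testLines,
                                           PreferenceEstimator estimator) {
        double totalError = 0.0;
        int scored = 0;
        for (String line : testLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue; // skip blank lines and comment lines
            }
            String[] fields = trimmed.split(",");
            if (fields.length < 3) {
                continue; // no rating column, nothing to compare against
            }
            long userID = Long.parseLong(fields[0].trim());
            long itemID = Long.parseLong(fields[1].trim());
            float actual = Float.parseFloat(fields[2].trim());

            float estimated;
            try {
                estimated = estimator.estimate(userID, itemID);
            } catch (Exception e) {
                continue; // e.g. the user or item never appears in the training data
            }
            if (Float.isNaN(estimated)) {
                continue; // the estimator had no basis for a prediction
            }
            totalError += Math.abs(estimated - actual);
            scored++;
        }
        return scored == 0 ? Double.NaN : totalError / scored;
    }

    public static void main(String[] args) {
        // Made-up rows standing in for the contents of test.csv. In real use,
        // read them with Files.readAllLines(Paths.get("test.csv")).
        List<String> testLines = Arrays.asList("1,10,4.0", "1,11,2.0", "2,10,5.0");

        // Placeholder estimator that always predicts 3.0. Replace it with the
        // recommender trained on train.csv.
        PreferenceEstimator estimator = (userID, itemID) -> 3.0f;

        System.out.println(meanAbsoluteError(testLines, estimator));
    }
}
```

In the Mahout code above, the recommender is already built from `new FileDataModel(new File("train.csv"))`, so passing `recommender::estimatePreference` as the estimator should score it against test.csv without touching the evaluator classes. Skipping rows that cannot be scored mirrors the fact that `estimatePreference` may return NaN or throw for users and items missing from the training data. Whether to skip such rows or count them as errors is a design choice to make explicitly.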
