Classifying data points with a Random Forest model using MLlib
In this recipe, we will demonstrate how you can classify data points using the Random Forest algorithm with MLlib.
Getting ready
You will be using the Maven project you created in the recipe named Solving simple text mining problems with Apache Spark. If you have not done so yet, then follow steps 1-6 in the Getting ready section of that recipe.
Go to https://github.com/apache/spark/blob/master/data/mllib/sample_binary_classification_data.txt, download the data, and save it as rf-data.txt in the data folder of the project that you created by following the instructions in step 1. Alternatively, you can create a text file named rf-data.txt in the data folder of your project and copy and paste the data from the aforementioned URL.
In the package that you created, create a Java class file named RandomForestMlib.java. Double-click it to start writing your code in it.
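Before writing any Spark code, it can help to confirm that the file you saved is laid out as expected. The following is a minimal sketch in plain Java, assuming the data is in the LIBSVM text format (a label followed by space-separated index:value pairs), which is the layout Spark's sample MLlib data files normally use. The class name LibSvmLineCheck and the sample line are hypothetical and used only for illustration; they are not part of Spark or of this recipe's project.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Spark or of this recipe's code.
final class LibSvmLineCheck {

    // Holds one parsed record: a label plus sparse (index -> value) features.
    static final class Parsed {
        final double label;
        final Map<Integer, Double> features;

        Parsed(double label, Map<Integer, Double> features) {
            this.label = label;
            this.features = features;
        }
    }

    // Parses a line of the assumed form "<label> <index>:<value> <index>:<value> ...".
    static Parsed parseLine(String line) {
        String[] tokens = line.trim().split("\\s+");
        double label = Double.parseDouble(tokens[0]);
        Map<Integer, Double> features = new LinkedHashMap<>();
        for (int i = 1; i < tokens.length; i++) {
            String[] pair = tokens[i].split(":", 2);
            if (pair.length != 2) {
                throw new IllegalArgumentException(
                        "Expected index:value but got: " + tokens[i]);
            }
            features.put(Integer.parseInt(pair[0]), Double.parseDouble(pair[1]));
        }
        return new Parsed(label, features);
    }

    public static void main(String[] args) {
        // Made-up sample line; it is not copied from rf-data.txt.
        Parsed p = parseLine("1 3:0.5 10:2.0");
        System.out.println("label=" + p.label + ", features=" + p.features);
    }
}
```

If a line from rf-data.txt fails to parse with this sketch, the file was probably saved as an HTML page rather than as the raw data, in which case download the raw file again.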
How to do it...
Create a class named RandomForestMlib:
public class RandomForestMlib...