This book uses several key technologies for big data mining and, more generally, data science. The Analytics Toolkit includes Hadoop and Spark as well as R and Python, all of which can be installed either locally on your machine or on a cloud platform. The complete toolkit consists of:
| Software/platform | Used for data mining | Used for machine learning |
|---|---|---|
| Hadoop | X | |
| Spark | X | X |
| Redis | X | |
| MongoDB | X | |
| Open Source R | X | X |
| Python (Anaconda) | X | X |
| Vowpal Wabbit | X | |
| LIBSVM, LIBLINEAR | X | |
| H2O | X |