Cross-validating the results
As I've already mentioned, the dataset for this chapter is a manually coded group of 500 hotel reviews taken from the OpinRank dataset. For this experiment, we'll break these into 10 chunks of 50 reviews each.
These chunks will allow us to use K-fold cross-validation to test how our system is doing. Cross-validation checks your algorithm and procedures by splitting your data into equally sized chunks, or folds. You train your system on all of the folds but one; those folds make up the training set, and the held-out fold is the validation set. You then calculate the error by running the trained system on the validation set. Next, you hold out a different fold as the validation set and start over. Finally, you average the error across all of the trials.
For example, suppose the validation procedure uses four folds: A, B, C, and D. For the first run, A, B, and C would be the training set, and D would be the validation set. Next, A, B, and D would be the training set, and C would be the validation set. This would continue until every fold has served as the validation set once, and the four error measurements would then be averaged.
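The procedure described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the chapter's actual implementation: the `train_and_error` callback stands in for whatever training and error-measurement steps your system uses.

```python
def k_fold_splits(data, k):
    """Split the data into k equally sized folds (chunks)."""
    fold_size = len(data) // k
    return [data[i * fold_size:(i + 1) * fold_size] for i in range(k)]

def cross_validate(data, k, train_and_error):
    """Hold out each fold in turn as the validation set,
    train on the rest, and average the per-fold errors.

    train_and_error is a hypothetical callback: it takes
    (training, validation) and returns an error measurement."""
    folds = k_fold_splits(data, k)
    errors = []
    for i in range(k):
        validation = folds[i]
        training = [item for j, fold in enumerate(folds)
                    if j != i
                    for item in fold]
        errors.append(train_and_error(training, validation))
    return sum(errors) / k
```

With the chapter's 500 reviews and `k=10`, each run trains on 450 reviews and validates on the remaining 50.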