You're reading from Data Science for Web3 A comprehensive guide to decoding blockchain data with data analysis basics and machine learning cases

Product type Paperback

Published in Dec 2023

Publisher Packt

ISBN-13 9781837637546

Length 344 pages

Edition 1st Edition

Languages

Python

Tools

Blockchain

Concepts

Blockchain

Author (1):

Gabriela Castillo Areco

Building a machine learning pipeline

After cleaning the data and selecting the most important features, the machine learning flow can be summarized into steps, as shown in Figure 7.4:

Figure 7.4 – Machine learning pipeline

To carry out this process, we must do the following:

Select a model and its initial parameters based on the problem and available data.
Train: First, we must split the data into a training set and a test set. Training consists of making the model learn from the training set. Training time and computational cost vary from model to model. To improve the model’s performance, we tune its hyperparameters with techniques such as grid search or random search.
Predict and evaluate: The trained model is then used to make predictions on the test set, which contains rows of data that the algorithm has not seen. If we evaluate the model with the data that we used to train...
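The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the book's implementation: the synthetic two-cluster dataset, the k-nearest-neighbors model, the candidate values of `k`, and all function names (`make_data`, `knn_predict`, `accuracy`) are hypothetical and chosen only to keep the example self-contained. In practice, libraries such as scikit-learn provide `train_test_split` and `GridSearchCV` for the same purpose.

```python
import random
from collections import Counter


def make_data(n_per_class=100, seed=0):
    """Hypothetical dataset: two 2-D Gaussian clusters labelled 0 and 1."""
    rng = random.Random(seed)
    rows = []
    for label, center in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
            rows.append((point, label))
    rng.shuffle(rows)
    return rows


def knn_predict(train_rows, point, k):
    """Step 1 (model choice, assumed here): majority vote of the k nearest rows."""
    nearest = sorted(
        train_rows,
        key=lambda row: (row[0][0] - point[0]) ** 2 + (row[0][1] - point[1]) ** 2,
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train_rows, eval_rows, k):
    """Fraction of eval_rows whose label is predicted correctly."""
    hits = sum(knn_predict(train_rows, x, k) == y for x, y in eval_rows)
    return hits / len(eval_rows)


data = make_data()

# Step 2a: split into a training set (80%) and a held-out test set (20%).
split = int(0.8 * len(data))
train, test = data[:split], data[split:]

# Step 2b: grid search over the hyperparameter k, scored on a validation
# split carved out of the training set so the test set stays unseen.
val_split = int(0.75 * len(train))
fit_rows, val_rows = train[:val_split], train[val_split:]
param_grid = [1, 3, 5, 7]  # assumed candidate values
val_scores = {k: accuracy(fit_rows, val_rows, k) for k in param_grid}
best_k = max(val_scores, key=val_scores.get)

# Step 3: predict on the test set and evaluate with the tuned hyperparameter.
test_accuracy = accuracy(train, test, best_k)
print(f"best k: {best_k}, test accuracy: {test_accuracy:.2f}")
```

The test set is touched only once, at the very end. The grid search compares candidates on a separate validation split, so the final score reflects rows that played no part in either training or tuning.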