Preface
Note
About
This section briefly introduces the authors, the coverage of this book, the technical skills you'll need to get started, and the hardware and software required to complete all of the included activities and exercises.
About the Book
Data Science for Marketing Analytics covers every stage of data analytics, from working with a raw dataset to segmenting a population and modeling different parts of it based on the segments.
The book starts by teaching you how to use Python libraries, such as pandas and Matplotlib, to read data in Python, manipulate it, and create plots using both categorical and continuous variables. Then, you'll learn how to segment a population into groups and use different clustering techniques to evaluate customer segmentation. As you make your way through the chapters, you'll explore ways to evaluate and select the best segmentation approach, and go on to create a linear regression model on customer value data to predict lifetime value. In the concluding chapters, you'll gain an understanding of regression techniques and tools for evaluating regression models, and explore ways to predict customer choice using classification algorithms. Finally, you'll apply these techniques to create a churn model for modeling customer product choices.
By the end of this book, you will be able to build your own marketing reporting and interactive dashboard solutions.
About the Authors
Tommy Blanchard earned his PhD from the University of Rochester and did his postdoctoral training at Harvard. Now, he leads the data science team at Fresenius Medical Care North America. His team performs advanced analytics and creates predictive models to solve a wide variety of problems across the company.
Debasish Behera works as a data scientist for a large Japanese corporate bank, where he applies machine learning/AI to solve complex problems. He has worked on multiple use cases involving AML, predictive analytics, customer segmentation, chat bots, and natural language processing. He currently lives in Singapore and holds a Master's in Business Analytics (MITB) from the Singapore Management University.
Pranshu Bhatnagar works as a data scientist in the telematics, insurance, and mobile software space. He has previously worked as a quantitative analyst in the FinTech industry and often writes about algorithms, time series analysis in Python, and similar topics. He graduated with honors from the Chennai Mathematical Institute with a degree in Mathematics and Computer Science and has completed certification courses in Machine Learning and Artificial Intelligence from the International Institute of Information Technology, Hyderabad. He is based in Bangalore, India.
Objectives
Analyze and visualize data in Python using pandas and Matplotlib
Study clustering techniques, such as hierarchical and k-means clustering
Create customer segments based on manipulated data
Predict customer lifetime value using linear regression
Use classification algorithms to understand customer choice
Optimize classification algorithms to extract maximal information
Audience
Data Science for Marketing Analytics is designed for developers and marketing analysts looking to use new, more sophisticated tools in their marketing analytics efforts. It'll help if you have prior experience coding in Python and knowledge of high-school-level mathematics. Some experience with databases, Excel, statistics, or Tableau is useful but not necessary.
Approach
Data Science for Marketing Analytics takes a hands-on approach to the practical aspects of using Python data analytics libraries to ease marketing analytics efforts. It contains multiple activities that use real-life business scenarios for you to practice and apply your new skills in a highly relevant context.
Minimum Hardware Requirements
For an optimal student experience, we recommend the following hardware configuration:
Processor: Dual Core or better
Memory: 4 GB RAM
Storage: 10 GB available space
Software Requirements
You'll also need the following software installed in advance:
Any of the following operating systems: Windows 7 SP1 32/64-bit, Windows 8.1 32/64-bit, Windows 10 32/64-bit, Ubuntu 14.04 or later, or macOS Sierra or later
Browser: Google Chrome or Mozilla Firefox
Conda
Python 3.x
Conventions
Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "Import the cluster module from the sklearn package."
A block of code is set as follows:
plt.xlabel('Income')
plt.ylabel('Age')
plt.show()
New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "The Year column appears to have matched to the right values, but the line column does not seem to make much sense."
Installation and Setup
We recommend installing Python using the Anaconda distribution, available here: https://www.anaconda.com/distribution/.
It contains most of the modules that will be used. Additional Python modules can be installed using the methods described here: https://docs.python.org/3/installing/index.html. Only one module used in this book is not part of the standard Anaconda distribution; install it using one of the methods on the linked page:
kmodes
If you do not use the Anaconda distribution, make sure you have the following modules installed:
jupyter
pandas
matplotlib
scikit-learn (imported as sklearn)
numpy
scipy
seaborn
statsmodels
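One way to confirm that the environment is ready is to check that each module can be located before starting the exercises. The following is a minimal sketch and is not part of the book's code bundle: the helper name find_missing_modules is hypothetical, and the list is an assumption built from the modules named in this preface (using their import names, which can differ from the names used to install them).

```python
import importlib.util

# Import names assumed from this preface: the modules listed above,
# plus matplotlib (used for plotting) and kmodes (installed separately).
REQUIRED_MODULES = [
    "jupyter", "pandas", "matplotlib", "sklearn", "numpy",
    "scipy", "seaborn", "statsmodels", "kmodes",
]


def find_missing_modules(names):
    """Return the top-level module names that cannot be located."""
    # find_spec returns None when a top-level module is not installed;
    # it locates the module without importing it.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Any module reported as missing can then be installed using one of the methods on the page linked above.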
Installing the Code Bundle
Copy the code bundle for this book to the C:/Code folder.
Additional Resources
The code bundle for this book is also hosted on GitHub at: https://github.com/TrainingByPackt/Data-Science-for-Marketing-Analytics.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!