Python Data Analysis

You're reading from Python Data Analysis: Perform data collection, data processing, wrangling, visualization, and model building using Python

Product type: Paperback
Published in: Feb 2021
Publisher: Packt
ISBN-13: 9781789955248
Length: 478 pages
Edition: 3rd Edition
Authors (2): Ivan Idris, Avinash Navlani

The skillsets of data analysts and data scientists

A data analyst is someone who discovers insights from data and creates value out of it. This helps decision-makers understand how the business is performing. Data analysts must acquire the following skills:

  • Exploratory Data Analysis (EDA): EDA is an essential skill for data analysts. It helps with inspecting data to discover patterns, test hypotheses, and check assumptions.
  • Relational Database: Knowledge of at least one relational database system, such as MySQL or PostgreSQL, is mandatory. SQL is a must for working with relational databases.
  • Visualization and BI Tools: A picture is worth a thousand words. Visuals have a stronger impact on people than raw numbers and are a clear, easy way to present insights. Visualization and BI tools such as Tableau, QlikView, MS Power BI, and IBM Cognos help analysts visualize data and prepare reports.
  • Spreadsheet: Knowledge of MS Excel, WPS Office, LibreOffice, or Google Sheets is mandatory for storing and managing data in tabular form.
  • Storytelling and Presentation Skills: The art of storytelling is another necessary skill. A data analyst should be an expert in connecting data facts to an idea or an incident and turning it into a story.
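The relational database and EDA skills listed above can be illustrated with a small sketch. It is not taken from the book: the `sales` table, its columns, and the sample rows are hypothetical, and SQLite is used in place of MySQL or PostgreSQL only because it ships with Python's standard library.

```python
import sqlite3
import statistics

# Hypothetical sample data: (region, revenue) rows invented for illustration.
ROWS = [
    ("north", 120.0),
    ("north", 80.0),
    ("south", 200.0),
    ("south", 100.0),
    ("south", 150.0),
]


def load_sales(rows):
    """Create an in-memory SQLite database holding a hypothetical sales table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, revenue REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()
    return conn


def revenue_by_region(conn):
    """Aggregate with SQL: total and average revenue per region."""
    query = """
        SELECT region, SUM(revenue), AVG(revenue)
        FROM sales
        GROUP BY region
        ORDER BY region
    """
    return conn.execute(query).fetchall()


def summarize(values):
    """Return basic EDA-style summary statistics for one numeric column."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


conn = load_sales(ROWS)
by_region = revenue_by_region(conn)
revenues = [row[0] for row in conn.execute("SELECT revenue FROM sales")]
summary = summarize(revenues)

print(by_region)
print(summary)
```

The SQL query does the grouping inside the database, while the summary statistics are computed in Python. On real projects, analysts would typically run the same kind of query against a production database and explore the result with a library such as pandas.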

On the other hand, the primary job of a data scientist is to solve problems using data. To do this, they need to understand the client's requirements, domain, and problem space, and ensure that the client gets exactly what they want. The tasks that data scientists undertake vary from company to company. Some companies employ data analysts and offer them the title of data scientist just to glorify the job designation. Some combine data analyst tasks with data engineering tasks under the data scientist designation; others assign data scientists to machine learning-intensive tasks along with data visualization.

A data scientist has to be a jack of all trades and wear multiple hats, including those of a data analyst, statistician, mathematician, programmer, and ML or NLP engineer. Most people are not experts in all of these trades, and becoming skilled in them requires a lot of effort and patience. This is why data science cannot be learned in 3 or 6 months; learning data science is a journey. A data scientist should have a wide variety of skills, such as the following:

  • Mathematics and Statistics: Most machine learning algorithms are based on mathematics and statistics. Knowledge of mathematics helps data scientists develop custom solutions.
  • Databases: Knowledge of SQL allows data scientists to interact with the database and collect the data for prediction and recommendation.
  • Machine Learning: Data scientists need knowledge of supervised machine learning techniques, such as regression analysis and classification, and of unsupervised machine learning techniques, such as cluster analysis, outlier detection, and dimensionality reduction.
  • Programming Skills: Knowledge of programming helps data scientists automate their suggested solutions. Knowledge of Python and R is recommended.
  • Storytelling and Presentation Skills: Data scientists communicate their results as a story, for example through PowerPoint presentations.
  • Big Data Technology: Knowledge of big data platforms such as Hadoop and Spark helps data scientists develop big data solutions for large-scale enterprises.
  • Deep Learning Tools: Deep learning tools such as TensorFlow and Keras are used in NLP and image analytics.
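To make the link between statistics and supervised learning in the list above concrete, here is a minimal sketch of simple linear regression fitted by ordinary least squares in plain Python. The function names `fit_line` and `predict` and the sample data are hypothetical, chosen only for illustration; in practice a library such as scikit-learn would normally be used instead.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations of x and y, and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predict the target for a new input using the fitted line."""
    return slope * x + intercept


# Hypothetical training data generated from y = 2x + 1 (no noise).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
prediction = predict(slope, intercept, 5.0)

print(slope, intercept, prediction)
```

The model is "supervised" because it learns from inputs paired with known targets; unsupervised techniques such as clustering work on inputs alone.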

Apart from these skillsets, knowledge of web scraping packages and tools for extracting data from diverse sources, and of web application frameworks such as Flask or Django for designing prototype solutions, is also useful. Together, these make up the skillset of data science professionals.

Now that we have covered the basics of data analysis and data science, let's dive into the basic setup needed to get started with data analysis. In the next section, we'll learn how to install Python.
