Data Analysis with Python

You're reading from Data Analysis with Python: A Modern Approach

Product type Paperback
Published in Dec 2018
Publisher Packt
ISBN-13 9781789950069
Length 490 pages
Edition 1st Edition
Author: David Taieb

What kind of skills are required to become a data scientist?

In industry, the reality is that data science is so new that companies do not yet have a well-defined career path for it. How do you get hired for a data scientist position? How many years of experience are required? What skills do you need to bring to the table? Math, statistics, machine learning, information technology, computer science, and what else?

Well, the answer is probably a little bit of everything plus one more critical skill: domain-specific expertise.

There is an ongoing debate about whether applying generic data science techniques to a dataset, without an intimate understanding of its meaning, leads to the desired business outcome. Many companies are leaning toward making sure data scientists have a substantial amount of domain expertise, the rationale being that without it you may unknowingly introduce bias at any step, such as when filling gaps during the data cleansing phase or during feature selection, and ultimately build models that fit a given dataset well but still end up being worthless. Imagine a data scientist with no chemistry background studying unwanted molecule interactions for a pharmaceutical company developing new drugs. This is probably also why we're seeing a proliferation of statistics courses specialized in a particular domain, such as biostatistics for biology or supply chain analytics for analyzing the operations management of supply chains.

To summarize, a data scientist should, in theory, be somewhat proficient in the following areas:

  • Data engineering / information retrieval
  • Computer science
  • Math and statistics
  • Machine learning
  • Data visualization
  • Business intelligence
  • Domain-specific expertise

Note

If you are thinking about acquiring these skills but don't have the time to attend traditional classes, I strongly recommend using online courses.

I particularly recommend this course on Coursera (https://www.coursera.org/): https://www.coursera.org/learn/data-science-course.

The classic Drew Conway Venn diagram provides an excellent visualization of what data science is and why data scientists are a bit of a unicorn:

Drew Conway's Data Science Venn Diagram

By now, I hope it has become clear that the perfect data scientist who fits the preceding description is more the exception than the norm and that, most often, the role involves multiple personas. Yes, that's right: the point I'm trying to make is that data science is a team sport, and this idea will be a recurring theme throughout this book.
