You're reading from Hands-On Data Science with the Command Line: Automate everyday data science tasks using command-line tools

Product type: Paperback
Published: Jan 2019
Publisher: Packt
ISBN-13: 9781789132984
Length: 124 pages
Edition: 1st Edition
Languages: Python
Tools: UNIX
Concepts: Data Science
Authors (3): Jason Morris, Raymond Page, Chris McCubbin

Table of Contents (8 chapters)

Preface

1. Data Science at the Command Line and Setting It Up

2. Essential Commands

3. Shell Workflows, and Data Acquisition and Massaging

4. Bash Functions and Data Visualization

5. Loops, Functions, and String Processing

6. SQL, Math, and Wrapping it up

7. Other Books You May Enjoy

Python (pandas, numpy, scikit-learn)

Counting things often gets you where you need to be, but sometimes the job calls for more complex tools. Fortunately, we can write our own tools in the UNIX paradigm and use them in our pipelines alongside the other command-line tools.
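
As a rough illustration of a tool written in the UNIX paradigm, here is a minimal sketch of a filter that reads one number per line from standard input and prints summary statistics to standard output. The script name numstats.py, the summarize function, and the output format are all hypothetical choices made for this example, not something taken from the book.

```python
#!/usr/bin/env python3
"""numstats.py (hypothetical name): summarize numbers arriving on stdin."""
import statistics
import sys


def summarize(lines):
    """Return (count, mean, median) for an iterable of lines, one number each.

    Blank lines are skipped; when no numbers are found, the mean and
    median are reported as None.
    """
    values = [float(line) for line in lines if line.strip()]
    if not values:
        return 0, None, None
    return len(values), statistics.mean(values), statistics.median(values)


def main():
    # With nothing piped in, print a usage hint instead of waiting for input.
    if sys.stdin.isatty():
        print("usage: <some command> | python3 numstats.py", file=sys.stderr)
        return
    count, mean, median = summarize(sys.stdin)
    print(f"count={count} mean={mean} median={median}")


if __name__ == "__main__":
    main()
```

Because it only touches standard input and standard output, a script like this can sit at the end of a pipeline, for example: cut -d, -f2 data.csv | tail -n +2 | python3 numstats.py (data.csv being a placeholder file with a header row).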

One such tool is Python, along with popular data science libraries such as pandas, NumPy, and scikit-learn. This isn't a text on all the great things those libraries can do for you; if you'd like to learn, a good place to start is the official Python tutorial (https://docs.python.org/3/tutorial/) and the basics of pandas data structures in the pandas documentation (https://pandas.pydata.org/pandas-docs/stable/basics.html). Make sure you have Python, pip, and pandas installed before you continue (see Chapter 1, Data Science at the Command Line and Setting It Up)...
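
To give a feel for how pandas can slot into a pipeline, here is a minimal sketch, assuming pandas is installed, of a script that reads CSV data from standard input and writes per-group means to standard output. The script name group_means.py, the function name, and the column names in the usage example are hypothetical; the locked part of the chapter may well take a different approach.

```python
#!/usr/bin/env python3
"""group_means.py (hypothetical name): per-group column means for CSV on stdin."""
import sys

import pandas as pd


def group_means(stream, key, value):
    """Read CSV from a file-like object and return the mean of `value` per `key`."""
    frame = pd.read_csv(stream)
    return frame.groupby(key)[value].mean()


def main():
    # Expect two column names as arguments and CSV data piped in on stdin.
    if len(sys.argv) != 3 or sys.stdin.isatty():
        print("usage: cat data.csv | python3 group_means.py KEY VALUE",
              file=sys.stderr)
        return
    means = group_means(sys.stdin, sys.argv[1], sys.argv[2])
    # Write the result as CSV so the next command in the pipe can consume it.
    means.to_csv(sys.stdout, header=True)


if __name__ == "__main__":
    main()
```

Assuming a placeholder file weather.csv with city and temp columns, it could be run as: cat weather.csv | python3 group_means.py city temp | sort. Emitting CSV rather than a pretty-printed table keeps the output easy for tools such as sort, cut, or grep to process.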

Authors (3)

Morris

Jason Morris is a systems and research engineer with over 19 years of experience in system architecture, research engineering, and large data analysis. His primary focus is machine learning with TensorFlow, CUDA, and Apache Spark. Jason is also a speaker and a consultant for designing large-scale architectures, implementing best security practices on the cloud, creating near real-time image detection analytics with deep learning, and developing serverless architectures to aid in ETL. His most recent roles include solution architect, big data engineer, big data specialist, and instructor at Amazon Web Services. He is currently the Chief Technology Officer of Next Rev Technologies, and his favorite command-line program is netcat.

McCubbin

Chris McCubbin is a data scientist and software developer with 20 years of experience in developing complex systems and analytics. He co-founded the successful big data security startup Sqrrl, since acquired by Amazon. He has also developed smart swarming systems for drones, social network analysis systems in MapReduce, and big data security analytic platforms using the Apache projects Accumulo and Spark. He has used the Unix command line since starting on IRIX platforms in college, and his favorite command-line program is find.

Page

Raymond Page is a computer engineer specializing in site reliability. His experience with embedded development engendered a passion for removing the pervasive bloat from web technologies and cloud computing. His favorite command is cat.