Hands-On Data Science with Anaconda

Utilize the right mix of tools to create high-performance data science applications

Product type: Paperback
Published in: May 2018
Publisher: Packt
ISBN-13: 9781788831192
Length: 364 pages
Edition: 1st Edition
Authors (2): James Yan, Yuxing Yan

Table of Contents

Preface
1. Ecosystem of Anaconda
2. Anaconda Installation
3. Data Basics
4. Data Visualization
5. Statistical Modeling in Anaconda
6. Managing Packages
7. Optimization in Anaconda
8. Unsupervised Learning in Anaconda
9. Supervised Learning in Anaconda
10. Predictive Data Analytics – Modeling and Validation
11. Anaconda Cloud
12. Distributed Computing, Parallel Computing, and HPCC
13. References
14. Other Books You May Enjoy

Preface

Anaconda is an open source data science platform that brings the best tools for data science together. It is a data science stack that includes more than 100 popular packages based on Python, Scala, and R. With the help of its package manager, conda, users can work with hundreds of packages in different languages and perform data preprocessing, modeling, clustering, classification, and validation with ease.

This book will get you started with Anaconda and show you how to use it to perform data science operations in the real world. You will start by setting up the environment for the Anaconda platform and Jupyter, and installing the relevant packages. You will then cover the basics of data science and linear algebra needed to perform data science tasks. Once you are ready to go, you will start with data science operations such as cleaning, sorting, and data classification. You will then learn how to perform tasks such as clustering, regression, prediction, building machine learning models, and optimizing them. You will also learn how to visualize data and share your projects.

Throughout this book, you will learn how to use different packages with Anaconda to get the best results. You will learn how to make efficient use of conda, the package, dependency, and environment manager for Anaconda. You will also be introduced to several powerful features of Anaconda that are available in the paid version for advanced data handling, such as additional projects, project add-ons, shared project drives, and powerful compute nodes. You will learn how to build scalable and functionally efficient packages, and how to perform heterogeneous data exploration, distributed computing, and more. You will learn to discover and share packages, notebooks, and environments to increase productivity. You will also learn about Anaconda Accelerate, a feature that can help you to meet service-level agreements (SLAs) and optimize computational power.

In this book, we introduce four programming languages: R, Python, Octave, and Julia. There are several reasons for doing so. Firstly, all four are open source, and open source is a growing trend. Secondly, one of the most obvious advantages of the Anaconda platform is that it allows you to run programs written in different languages. However, for many new readers, learning four languages at the same time would be quite challenging. The best strategy is to focus on R and Python first. Later, or after finishing the whole book, learn Octave or Julia on a second reading.

  • R: This is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, as well as on Windows and macOS. We think that R might be the easiest of many good computer languages, especially among those available as free software. The author has published a book entitled Financial Modeling using R; you can refer to its Amazon link at http://canisius.edu/~yany/webs/amazon2018R.shtml.
  • Python: This is an interpreted high-level programming language for general-purpose programming. For business analytics/data science, Python is probably the number 1 choice out of many promising computer languages. In 2017, the author published a book entitled Python for Finance (second edition); you can refer to its Amazon link at http://canisius.edu/~yany/webs/amazonP4F2.shtml.
  • Octave: This is a piece of software featuring a high-level programming language, primarily intended for numerical computations. Octave helps with solving linear and nonlinear problems numerically, as well as performing other numerical experiments. Octave is also free. Its syntax is largely compatible with MATLAB, which is quite popular on Wall Street and in other industries.
  • Julia: This is a high-level, high-performance dynamic programming language for numerical computing. It provides a sophisticated compiler, distributed parallel execution, numerical accuracy, and an extensive mathematical function library. Julia’s base library, largely written in Julia itself, also integrates mature, best-of-breed, open source C and Fortran libraries for linear algebra, random number generation, signal processing, and string processing.

Happy reading!
