Julia for Data Science

Julia for Data Science: high-performance computing simplified

Chapter 1. The Groundwork – Julia's Environment

Julia is a fairly young programming language. In 2009, three developers (Stefan Karpinski, Jeff Bezanson, and Viral Shah) in the Applied Computing group at MIT, under the supervision of Prof. Alan Edelman, started working on a project that led to Julia. In February 2012, Julia was presented publicly and became open source. The source code is available on GitHub (https://github.com/JuliaLang/julia). The source of the registered packages can also be found on GitHub. Currently, all four of the initial creators, along with developers from around the world, actively contribute to Julia.

Note

The current release is 0.4, which is still some way from a 1.0 release candidate.

Julia is based on solid principles, and its popularity is steadily increasing in the fields of scientific computing, data science, and high-performance computing.

This chapter will guide you through the download and installation of all the necessary components of Julia. This chapter covers the following topics:

  • How is Julia different?
  • Setting up Julia's environment
  • Using Julia's shell and REPL
  • Using Jupyter notebooks
  • Package management
  • Parallel computation
  • Multiple dispatch
  • Language interoperability

Traditionally, the scientific community has used slower dynamic languages to build their applications, although they have required the highest computing performance. Domain experts who had experience with programming, but were not generally seasoned developers, always preferred dynamic languages over statically typed languages.

Julia is different

Over the years, advances in compiler techniques and language design have made it possible to eliminate the trade-off between performance and dynamic prototyping. What scientific computing needed was a good dynamic language like Python with performance like C. Then came Julia, a general-purpose programming language designed around the requirements of scientific and technical computing. It provides performance comparable to C/C++ in an environment productive enough for prototyping, like the high-level dynamic language Python. The key to Julia's performance is its design and its Low Level Virtual Machine (LLVM) based Just-in-Time (JIT) compiler, which enables it to approach the performance of C and Fortran.

The key features offered by Julia are:

  • A general purpose high-level dynamic programming language designed to be effective for numerical and scientific computing
  • A Low-Level Virtual Machine (LLVM) based Just-in-Time (JIT) compiler that enables Julia to approach the performance of statically-compiled languages like C/C++

The following quote is from the development team of Julia—Jeff Bezanson, Stefan Karpinski, Viral Shah, and Alan Edelman:

Note

We are greedy: we want more.

We want a language that's open source, with a liberal license. We want the speed of C with the dynamism of Ruby. We want a language that's homoiconic, with true macros like Lisp, but with obvious, familiar mathematical notation like Matlab. We want something as usable for general programming as Python, as easy for statistics as R, as natural for string processing as Perl, as powerful for linear algebra as Matlab, as good at gluing programs together as the shell. Something that is dirt simple to learn, yet keeps the most serious hackers happy. We want it interactive and we want it compiled.

(Did we mention it should be as fast as C?)

Julia is quite often compared with Python, R, MATLAB, and Octave. These have been around for quite some time, and Julia is highly influenced by them, especially when it comes to numerical and scientific computing. Although Julia excels at scientific computing, it is not restricted to it and can also be used for web and general-purpose programming.

The development team of Julia aims to create a never-before-achieved combination of power and efficiency in a single language without compromising ease of use. Most of Julia's core is implemented in C/C++. Julia's parser is written in Scheme. Julia's efficient, cross-platform I/O is provided by libuv, the library that also underlies Node.js.

Features and advantages of Julia can be summarized as follows:

  • It's designed for distributed and parallel computation.
  • Julia provides an extensive library of mathematical functions with great numerical accuracy.
  • Julia gives the functionality of multiple dispatch. Multiple dispatch refers to using many combinations of argument types to define function behaviors.
  • The PyCall package enables Julia to call Python functions from its code, and MATLAB.jl does the same for MATLAB. Functions and libraries written in C can also be called directly without any need for wrapper code or special APIs.
  • Julia provides powerful shell-like capabilities for managing other processes in the system.
  • User-defined types in Julia are as fast and compact as built-ins.
  • Data analysis makes great use of vectorized code to gain performance benefits. Julia eliminates the need to vectorize code to gain performance. De-vectorized code written in Julia can be as fast as vectorized code.
  • It uses lightweight "green" threading also known as tasks or coroutines, cooperative multitasking, or one-shot continuations.
  • Julia has a powerful type system. The conversions provided are elegant and extensible.
  • It has efficient support for Unicode.
  • It has facilities for metaprogramming and Lisp-like macros.
  • It has a built-in package manager (Pkg).
  • Julia provides efficient, specialized and automatic generation of code for different argument types.
  • It's free and open source with an MIT license.

Setting up the environment

Julia is available free. It can be downloaded from its website at the following address: http://julialang.org/downloads/. The website also has exhaustive documentation, examples, and links to tutorials and community. The documentation can be downloaded in popular formats.

Installing Julia (Linux)

Ubuntu and Linux Mint are among the most popular Linux distributions, and deb packages of Julia are provided for them. These are available for both 32-bit and 64-bit systems.

To install Julia, add the PPA (personal package archive). A PPA is treated as an apt repository and is used to build and publish Ubuntu source packages. In the terminal, type the following:

sudo add-apt-repository ppa:staticfloat/juliareleases
sudo apt-get update 

This adds the PPA and updates the package index in the repository.

Now install Julia:

sudo apt-get install julia

The installation is complete. To check that the installation was successful, type the following in the Terminal:

julia --version 

This gives the installed Julia's version.


To open Julia's interactive shell, type julia into the Terminal. To uninstall Julia, simply use apt to remove it:

sudo apt-get remove julia 

For Fedora/RHEL/CentOS or distributions based on them, enable the EPEL repository for your distribution version. Then, click on the link provided. Enable Julia's repository using the following:

dnf copr enable nalimilan/julia

Or copy the relevant .repo file into the following directory:

/etc/yum.repos.d/

Finally, in the Terminal type the following:

yum install julia

Installing Julia (Mac)

Users with Mac OS X need to click on the downloaded .dmg file to run the disk image. After that, drag the app icon into the Applications folder. You may be asked whether you want to continue, because the file has been downloaded from the Internet and so is not considered secure. Click on continue if it was downloaded from the official Julia language website.

Julia can also be installed using Homebrew on the Mac as follows:

brew update 
brew tap staticfloat/julia 
brew install julia 

The installation is complete. To check if the installation is successful in the Terminal, type the following:

julia --version 

This gives you the installed Julia version.

Installing Julia (Windows)

Download the .exe file provided on the download page according to your system's configuration (32-bit/64-bit). Julia is installed on Windows by running the downloaded .exe file, which will extract Julia into a folder. Inside this folder is a batch file called julia.bat, which can be used to start the Julia console.

To uninstall, delete the Julia folder.

Exploring the source code

For enthusiasts, Julia's source code is available and users are encouraged to contribute by adding features or by bug fixing. This is the directory structure of the tree:

base/         Source code for Julia's standard library
contrib/      Editor support for Julia source, miscellaneous scripts
deps/         External dependencies
doc/manual    Source for the user manual
doc/stdlib    Source for standard library function help text
examples/     Example Julia programs
src/          Source for Julia language core
test/         Test suites
test/perf     Benchmark suites
ui/           Source for various frontends
usr/          Binaries and shared libraries loaded by Julia's standard libraries

Using REPL

The Read-Eval-Print Loop (REPL) is an interactive language shell for testing out pieces of code. Julia provides an interactive shell with a Just-in-Time compiler at the backend. We type input on one line, it is compiled and evaluated, and the result is shown on the next line.


The benefit of using the REPL is that we can test out our code for possible errors. Also, it is a good environment for beginners. We can type in the expressions and press Enter to evaluate.

A Julia library, or custom-written Julia program, can be included in the REPL using include. For example, I have a file called hello.jl, which I will include in the REPL by doing the following:

julia> include("hello.jl")
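
As a minimal sketch of what include does, the snippet below writes a small script to a temporary file and then loads it. The file contents and the greet function are hypothetical stand-ins for the hello.jl mentioned above, not code from the book.

```julia
# Create a throwaway directory and a hypothetical hello.jl inside it.
dir = mktempdir()
path = joinpath(dir, "hello.jl")
open(path, "w") do io
    # The script defines one function; its name and body are illustrative only.
    println(io, "greet(name) = string(\"Hello, \", name, \"!\")")
end

# include evaluates the file's contents in the current session,
# so greet is available afterwards.
include(path)
println(greet("Julia"))  # prints: Hello, Julia!
```

In an interactive session the same effect is obtained by typing the include call at the julia> prompt with the path to an existing file.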

Julia also stores all the commands written in the REPL in the .julia_history file. This file is located in /home/$USER on Ubuntu, in C:\Users\username on Windows, or at ~/.julia_history on OS X.

As with a Linux Terminal, we can reverse-search using Ctrl + R in Julia's shell. This is a really nice feature as we can go back in the history of typed commands.

Typing ? in the language shell will change the prompt to:

help?>  


To clear the screen, press Ctrl + L. To exit the REPL, press Ctrl + D or type the following:

julia> exit()

Using Jupyter Notebook

Data science and scientific computing are privileged to have an amazing interactive tool called Jupyter Notebook. With Jupyter Notebook you can write and run code in an interactive web environment, which can also display visualizations, images, and videos. It makes testing equations and prototyping a lot easier. It supports over 40 programming languages and is completely open source.

GitHub supports Jupyter notebooks. The notebook with the record of computation can be shared via the Jupyter notebook viewer or other cloud storage. Jupyter notebooks are extensively used for coding machine-learning algorithms, statistical modeling and numerical simulation, and data munging.

Jupyter Notebook is implemented in Python, but you can run code in any of the 40 languages provided you have their kernel. You can check whether Python is installed on your system by typing the following into the Terminal:

python --version

This will give the version of Python if it is there on the system. It is best to have Python 2.7.x or 3.5.x or a later version.

If Python is not installed then you can install it by downloading it from the official website for Windows. For Linux, typing the following should work:

sudo apt-get install python 

It is highly recommended to install Anaconda if you are new to Python and data science. Commonly used packages for data science, numerical, and scientific computing including Jupyter notebook come bundled with Anaconda making it the preferred way to set up the environment. Instructions can be found at https://www.continuum.io/downloads.

Jupyter is present in the Anaconda package, but you can check if the Jupyter package is up to date by typing in the following:

conda install jupyter 

Another way to install Jupyter is by using pip:

pip install jupyter 

To check if Jupyter is installed properly, type the following in the Terminal:

jupyter --version

It should print the Jupyter version if it is installed.

Now, to use Julia with Jupyter we need the IJulia package. This can be installed using Julia's package manager.

After installing IJulia, we can create a new notebook by selecting Julia under the Notebooks section in Jupyter.


To get the latest version of all your packages, in Julia's shell type the following:

julia> Pkg.update() 

After that add the IJulia package by typing the following:

julia> Pkg.add("IJulia") 

In Linux, you may face some warnings, so it's better to build the package:

julia> Pkg.build("IJulia") 

After IJulia is installed, come back to the Terminal and start the Jupyter notebook:

jupyter notebook 

A browser window will open. Under New, you will find options to create new notebooks with the kernels already installed. As we want to start a Julia notebook we will select Julia 0.4.2. This will start a new Julia notebook. You can try out a simple example.

In this example, we are creating a histogram of random numbers. This is just an example; we will study the components used in detail in the coming chapters.

[Figure: Julia notebook in Jupyter plotting a histogram of random numbers]

Popular editors such as Atom and Sublime have plugins for Julia. Atom has language-julia and Sublime has Sublime-IJulia, both of which can be downloaded from their package managers.

Package management

Julia provides a built-in package manager. Using Pkg we can install libraries written in Julia. For external libraries, we can also compile them from their source or use the standard package manager of the operating system. A list of registered packages is maintained at http://pkg.julialang.org.

Pkg is provided in the base installation. The Pkg module contains all the package manager commands.

Pkg.status() – package status

Pkg.status() is a function that prints out a list of currently installed packages with a summary. This is handy when you need to know whether the package you want to use is installed.

When a Pkg command is run for the first time, the package directory is created automatically, because Pkg.status() needs it in order to return a valid list of installed packages. The packages listed by Pkg.status() are registered versions managed by Pkg.

Pkg.installed() can also be used to return a list of all the installed packages with their versions.


Pkg.add() – adding packages

Julia's package manager is declarative and intelligent. You only have to tell it what you want and it will figure out what version to install and will resolve dependencies if there are any. Therefore, we only need to add the list of requirements that we want and it resolves which packages and their versions to install.

The ~/.julia/v0.4/REQUIRE file contains the package requirements. We can open it using a text editor such as vi or atom, or use Pkg.edit() in Julia's shell to edit this file. After editing the file, run Pkg.resolve() to install or remove the packages.

We can also use Pkg.add(package_name) to add packages and Pkg.rm(package_name) to remove packages. Earlier, we used Pkg.add("IJulia")  to install the IJulia package.

When we don't want to have a package installed on our system anymore, Pkg.rm() is used for removing the requirement from the REQUIRE file. Similar to Pkg.add(), Pkg.rm() first removes the requirement of the package from the REQUIRE file and then updates the list of installed packages by running Pkg.resolve() to match.

Working with unregistered packages

Frequently, we would like to use packages created by our team members, or published by someone on Git, that are not among the registered packages of Pkg. Julia allows us to do that by cloning. Julia packages are hosted in Git repositories and can be cloned using the mechanisms supported by Git. The index of registered packages is maintained in METADATA.jl. For unofficial packages, we can use the following:

Pkg.clone("git://example.com/path/unofficialPackage/Package.jl.git") 

Sometimes unregistered packages have dependencies that require fulfilling before use. If that is the scenario, a REQUIRE file is needed at the top of the source tree of the unregistered package. The dependencies of the unregistered packages on the registered packages are determined by this REQUIRE file. When we run Pkg.clone(url), these dependencies are automatically installed.

Pkg.update() – package update

It's good to keep packages updated. Julia is under active development, so its packages are updated frequently and gain new functionality.

To update all of the packages, type the following:

Pkg.update() 

Under the hood, new changes are pulled into the METADATA file in the directory located at ~/.julia/v0.4/ and it checks for any new registered package versions which may have been published since the last update. If there are new registered package versions, Pkg.update() attempts to update the packages which are not dirty and are checked out on a branch. This update process satisfies the top-level requirements by computing the optimal set of package versions to be installed. The packages with specific versions that must be installed are defined in the REQUIRE file in Julia's directory (~/.julia/v0.4/).

METADATA repository

Registered packages are downloaded and installed using the official METADATA.jl repository. A different METADATA repository location can also be provided if required:

julia> Pkg.init("https://julia.customrepo.com/METADATA.jl.git", "branch") 

Developing packages

Julia allows us to view the source code and as it is tracked by Git, the full development history of all the installed packages is available. We can also make our desired changes and commit to our own repository, or do bug fixes and contribute enhancements upstream.

You may also want to create your own packages and publish them at some point in time. Julia's package manager allows you to do that too.

It is a requirement that Git is installed on the system and the developer needs an account at their hosting provider of choice (GitHub, Bitbucket, and so on). Having the ability to communicate over SSH is preferred—to enable that, upload your public ssh-key to your hosting provider.

Creating a new package

It is preferable to have a REQUIRE file in the package repository. At a bare minimum, it should state the Julia version that the package requires.

For example, if we would like to create a new Julia package called HelloWorld we would have the following:

Pkg.generate("HelloWorld", "MIT") 

Here, HelloWorld is the package that we want to create and MIT is the license that our package will have. The license should be known to the package generator.

This will create a directory as follows: ~/.julia/v0.4/HelloWorld. The directory that is created is initialized as a Git repository. Also, all the files required by the package are kept in this directory. This directory is then committed to the repository.

This can now be pushed to the remote repository for the world to use.

Parallel computation using Julia

Advances in modern computing have led to multi-core CPUs, and systems are sometimes combined into a cluster capable of performing a task that a single system could not perform alone, or could perform only in an undesirable amount of time. Julia's parallel processing environment is based on message passing. Programs can run multiple processes in separate memory domains.

Message passing is implemented differently in Julia from other popular environments such as MPI. Julia provides one-sided communication, therefore the programmer explicitly manages only one process in the two-process operation.

Julia's parallel programming paradigm is built on the following:

  • Remote references
  • Remote calls

A request to run a function on another process is called a remote call. A remote reference is an object that can be used from any process to refer to an object stored on a particular process; it is a construct used in most distributed object systems. In other words, a remote call asks another (possibly the same) process to call a function on specific arguments, and it returns a remote reference to the result.

The remote call returns a remote reference to its result. Remote calls return immediately. The process that made the call proceeds to its next operation. Meanwhile, the remote call happens somewhere else. A call to wait() on its remote reference waits for the remote call to finish. The full value of the result can be obtained using fetch(), and put!() is used to store the result to a remote reference.
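
The sequence of remote call, wait, fetch, and put! can be sketched as follows. This is an illustration under an assumption: it targets Julia 1.x, where these functions live in the Distributed standard library, and the 0.4 release discussed in this chapter uses a slightly different API. The call is sent to the current process (myid()), so the sketch runs even when no workers have been added.

```julia
using Distributed  # assumption: Julia 1.x standard library, not the 0.4 API

# Ask a process to compute 2 + 3. The call returns a reference
# immediately rather than the value itself.
r = remotecall(+, myid(), 2, 3)

wait(r)                 # block until the remote call has finished
sum_result = fetch(r)   # retrieve the value behind the reference
println(sum_result)

# A remote reference can also be filled explicitly with put!.
box = RemoteChannel()   # refers to a one-slot channel on this process
put!(box, 42)
boxed = fetch(box)      # reads the stored value without removing it
println(boxed)
```

With worker processes available, replacing myid() with a worker id would run the addition on that worker instead.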

By default, Julia uses a single process. To start Julia with multiple processes, use the following:

julia -p n

where n is the number of worker processes. Alternatively, it is possible to create extra processes from a running system by using addprocs(n). It is advisable to set n equal to the number of CPU cores in the system.

pmap and @parallel are the two most frequently used and useful constructs.

Julia provides a parallel for loop, used to run a number of processes in parallel. This is used as follows.

[Figure: parallel for loop combining results with a (+) reduction]
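
A parallel for loop with a (+) reduction might look like the sketch below. It assumes Julia 1.x, where the macro is called @distributed and lives in the Distributed standard library; in the 0.4 release covered here the same construct is spelled @parallel. The loop body and range are illustrative only.

```julia
using Distributed  # Julia 1.x; the same macro is named @parallel in 0.4

# Each iteration contributes the value of its last expression (here i),
# and the (+) reducer combines the contributions from all processes.
total = @distributed (+) for i = 1:1000
    i
end
println(total)  # 500500
```

When no worker processes have been added, the loop simply runs on the current process and produces the same result.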

The parallel for loop works by assigning iterations to multiple processes and then reducing the results (in this case with (+)). It is somewhat similar to the map-reduce concept. Iterations run independently on different processes, and the results obtained by these processes are combined at the end. The result of one loop can also feed another loop. The answer is the result of the whole parallel loop.

It is very different from a normal iterative loop because the iterations do not take place in a specified sequence. As the iterations run on different processes, any writes to variables or arrays are not globally visible. The variables used are copied and broadcast to each process of the parallel for loop.

For example:

arr = zeros(500000) 
@parallel for i=1:500000 
  arr[i] = i 
end 

This will not give the desired result as each process gets its own separate copy of arr. The vector will not be filled in with i as expected. We must avoid such parallel for loops.

pmap refers to parallel map. For example:

[Figure: pmap applied to a list of random matrices]
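
A pmap call of this kind can be sketched as follows, assuming Julia 1.x with its Distributed and LinearAlgebra standard libraries. The matrix sizes and count are arbitrary and kept small so the sketch runs quickly; the variable names are illustrative.

```julia
using Distributed                 # Julia 1.x standard library
@everywhere using LinearAlgebra   # make svdvals available on every process

# A few random matrices standing in for the "large" ones in the text.
M = [rand(50, 50) for _ in 1:4]

# pmap hands each matrix to an available process and collects the
# results in input order.
singular_values = pmap(svdvals, M)

println(length(singular_values))      # one result vector per matrix
println(length(singular_values[1]))   # one singular value per matrix column
```

Each call to svdvals is a comparatively heavy unit of work, which is the situation pmap is suited to.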

This code handles the case where we have a number of large random matrices and need to obtain their singular values in parallel.

Julia's pmap() is designed differently. It is well suited for cases where a large amount of work is done by each function call, whereas @parallel is suited for handling situations which involve numerous small iterations. Both pmap() and @parallel for utilize worker nodes for parallel computation. However, the node from which the calling process originated does the final reduction in @parallel for.

Julia's key feature – multiple dispatch

A function is an object that maps a tuple of arguments to a return value using some expression. When the function object is unable to return a value, it throws an exception. The same conceptual function can have different implementations for different types of arguments. For example, we can have one function to add two floating point numbers and another to add two integers, but conceptually we are only adding two numbers. Julia makes it easy to provide different implementations of the same concept. The function doesn't need to be defined all at once. It can be defined piecewise, with a specific behavior for each combination of argument types. The definition of one of these behaviors is called a method.

The types and the number of arguments that a method definition accepts are indicated by the annotations in its signature. The most suitable method is applied whenever a function is called with a certain set of arguments. Applying a method when a function is invoked is known as dispatch. Traditionally, object-oriented languages consider only the first argument in dispatch. Julia is different: it considers all of the function's arguments (not just the first) when choosing which method to invoke. This is known as multiple dispatch.

Multiple dispatch is particularly useful for mathematical and scientific code. We shouldn't consider that the operations belong to one argument more than any of the others. All of the argument types are considered when implementing a mathematical operator. Multiple dispatch is not limited to mathematical expressions as it can be used in numerous real-world scenarios and is a powerful paradigm for structuring the programs.

Methods in multiple dispatch

+ is a function in Julia that uses multiple dispatch, as do all of Julia's standard functions and operators. Each of them has many methods defining its behavior for the various possible combinations of argument types and counts. A method is restricted to certain types of arguments using the :: type-assertion operator:

julia> f(x::Float64, y::Float64) = x + y 

The function definition will only be applied for calls where x and y are both values of type Float64:

julia> f(10.0, 14.0) 
24.0 

If we try to apply this definition to other types of arguments, it will give a method error.
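
The failure described above can be observed without ending the session by catching the exception. This is a small sketch; the function name add_floats is made up for the illustration.

```julia
# A method that only accepts two Float64 arguments (name is illustrative).
add_floats(x::Float64, y::Float64) = x + y

# Calling it with integers finds no matching method, so Julia throws
# a MethodError, which is caught here instead of stopping the program.
outcome = try
    add_floats(10, 14)
catch err
    err
end

println(isa(outcome, MethodError))  # true
```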


The arguments must be of precisely the same type as defined in the function definition.

The function object is created in the first method definition. New method definitions add new behaviors to the existing function object. When a function is invoked, the number and types of the arguments are matched, and the most specific method definition matching will be executed.

The following example creates a function with two methods. One method definition takes two arguments of the type Float64 and adds them. The second method definition takes two arguments of the type Number, multiplies each by two, and adds them. When we invoke the function with Float64 arguments, the first method definition is applied, and when we invoke it with Integer arguments, the second is applied, because Number covers any numeric type. In the following example, we are playing with floating point numbers and integers using multiple dispatch.

[Figure: REPL session defining two methods of one function and calling it with Float64 and integer arguments]
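
A function with the two methods described in the text could be written as in the sketch below. The name f follows the earlier snippet; the particular argument values are chosen only for illustration.

```julia
# Most specific method: both arguments are Float64, so simply add them.
f(x::Float64, y::Float64) = x + y

# More general method: any two numbers are each doubled and then added.
f(x::Number, y::Number) = 2x + 2y

println(f(10.0, 14.0))  # Float64 method: 24.0
println(f(10, 14))      # Number method: 48
println(f(10, 14.0))    # mixed types fall back to the Number method: 48.0
```

The mixed call shows that the Float64 method is chosen only when both arguments match it; otherwise the most specific remaining method applies.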

In Julia, all values are instances of the abstract type Any. When a parameter has no type declaration with ::, its type defaults to Any, which places no restriction on the type of value it accepts. Generally, such a method definition is written so that it applies to the arguments to which no other method definition applies. It is one of the Julia language's most powerful features.
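
A fallback method of this kind can be sketched as follows. The function name describe and its return strings are invented for the example.

```julia
# No annotation on x means x::Any, so this method accepts every value
# and acts as the catch-all.
describe(x) = "something else"

# More specific methods take precedence when their types match.
describe(x::Number) = "a number"
describe(x::AbstractString) = "a string"

println(describe(3.5))      # a number
println(describe("text"))   # a string
println(describe([1, 2]))   # something else
```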

Using Julia's multiple dispatch and flexible parametric type system, it is efficient and expressive to generate specialized code and implement complex algorithms without caring much about the low-level implementation.

Ambiguities – method definitions

Sometimes function behaviors are defined in such a way that there isn't a unique method to apply for a certain set of arguments. Julia throws a warning in such cases about this ambiguity, but proceeds by arbitrarily picking a method. To avoid this ambiguity we should define a method to handle such cases.

In the following example, we define a method definition with one argument of the type Any and another argument of the type Float64. In the second method definition, we just changed the order, but this doesn't differentiate it from the first definition. In this case, Julia will give a warning of ambiguous method definition but will allow us to proceed.

[Figure: REPL session showing the ambiguous method definition warning]
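
One way to handle the ambiguous case is to define a method for exactly the overlapping combination of types, so that a unique most specific method exists. The sketch below uses a made-up function g; how an unresolved ambiguity is reported (a warning at definition time or an error at call time) depends on the Julia version.

```julia
# Defining the most specific method first gives the overlapping case
# (both arguments Float64) a unique best match.
g(x::Float64, y::Float64) = "both Float64"

# These two methods overlap only when both arguments are Float64,
# which the method above already covers.
g(x::Float64, y) = "first is Float64"
g(x, y::Float64) = "second is Float64"

println(g(1.0, 2.0))  # both Float64
println(g(1.0, 2))    # first is Float64
println(g(1, 2.0))    # second is Float64
```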

Facilitating language interoperability

Although Julia can be used to write most kinds of code, there are mature libraries for numerical and scientific computing that we would like to exploit. These libraries can be in C, Fortran, or Python. Julia makes it easy to use existing code written in these languages by making calls to C, Fortran, or Python functions simple and efficient.

The C/Fortran libraries should be available to Julia as shared libraries. The code is then called with an ordinary ccall. Julia's JIT generates the same machine instructions as a native C call, so the overhead is generally no greater than calling the function from C code.
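
As a small illustration of ccall, the sketch below calls strlen from the C standard library. It assumes a Unix-like system where the C library is already loaded into the Julia process; the input string is arbitrary.

```julia
# ccall takes the C function name, the return type, a tuple of argument
# types, and then the arguments themselves. No wrapper code is written.
len = ccall(:strlen, Csize_t, (Cstring,), "Julia")

# Csize_t is an unsigned integer type; convert it for ordinary use.
println(Int(len))
```

For a function in some other shared library, the first argument would instead be a tuple of the function name and the library name.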

Importing Python code can be beneficial and sometimes needed, especially for data science, because it already has an exhaustive library of implementations of machine learning and statistical functions. For example, it contains scikit-learn and pandas. To use Python in Julia, we require PyCall.jl. To add PyCall.jl do the following:

Pkg.add("PyCall") 


PyCall contains a macro @pyimport that facilitates importing Python packages and provides Julia wrappers for all of the functions and constants therein, including automatic conversion of types between Julia and Python.

PyCall also provides functionality for lower-level manipulation of Python objects, including a PyObject type for opaque Python objects. It also has a pycall function (similar to Julia's ccall function), which can be used in Julia to call Python functions with type conversions. PyCall does not use the python program but links directly to the libpython library. During Pkg.build, it finds the location of libpython by running python.

Calling Python code in Julia

The @pyimport macro automatically makes the appropriate type conversions to Julia types in most scenarios, based on a runtime inspection of the Python objects. Finer control over these type conversions is available through lower-level functions. Using them in scenarios where the return type is known can help improve performance, both by eliminating the overhead of runtime type inference and by providing more type information to the Julia compiler:

  • pycall(function::PyObject, returntype::Type, args...): This calls the given Python function (typically looked up from a module) with the given args... (of standard Julia types, which are converted automatically to the corresponding Python types if possible) and converts the return value to returntype. Use a returntype of PyObject to return the unconverted Python object reference, or PyAny to request an automated conversion.
  • pyimport(s): This imports the Python module s (a string or symbol) and returns a pointer to it (a PyObject). Functions or other symbols in the module may then be looked up by s[name], where name is a string (for the raw PyObject) or a symbol (for automatic type conversion). Unlike the @pyimport macro, this does not define a Julia module, and members cannot be accessed with s.name.
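The two functions can be combined as in the sketch below. It assumes a recent PyCall release, where members of an imported module are reachable with dot syntax (math.sqrt); in the older releases this chapter describes, the same lookup is spelled math[:sqrt]. The names root and raw are only illustrative.

```julia
using PyCall

# Import Python's standard-library math module as a PyObject.
math = pyimport("math")

# Call Python's sqrt and ask for the result as a Julia Float64.
# Assumption: a recent PyCall release that allows math.sqrt;
# older releases use math[:sqrt] for the same lookup.
root = pycall(math.sqrt, Float64, 16.0)

# Ask for the unconverted Python object instead.
raw = pycall(math.sqrt, PyObject, 16.0)

println(root)
```

Stating the return type up front spares PyCall the runtime inspection it would otherwise perform on the result.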

Summary

In this chapter, we learned how Julia is different and how its LLVM-based JIT compiler enables Julia to approach the performance of C/C++. We covered how to download Julia, install it, and build it from source. Notable features are that the language is elegant, concise, and powerful, with strong capabilities for numeric and scientific computing.

We worked through some examples of using Julia from the command line (REPL) and saw how feature-rich the language shell is, with tab completion, reverse search, and help functions. We also discussed why we should use Jupyter Notebook and set up Jupyter with the IJulia package. We worked on a simple example using Jupyter Notebook and Julia's visualization package, Gadfly.

In addition, we learned about Julia's powerful built-in package management and how to add, update, and remove modules. We also went through the process of creating our own package and publishing it to the community, and introduced one of the most powerful features of Julia, multiple dispatch, with some basic examples of how to create method definitions that implement it.

Finally, we introduced parallel computation, explaining how it differs from conventional message passing and how to make use of all the available compute resources. We also learned about Julia's language interoperability and how to call a Python module or library from a Julia program.

Key benefits

  • An in-depth exploration of Julia's growing ecosystem of packages
  • Work with the most powerful open-source libraries for deep learning, data wrangling, and data visualization
  • Learn about deep learning using Mocha.jl and give speed and high performance to data analysis on large data sets

Description

Julia is a fast, high-performance language that is well suited to data science, with a mature package ecosystem, and is now feature complete. It is a good tool for a data science practitioner. A famous Harvard Business Review post called data scientist the sexiest job of the 21st century (https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century). This book will help you get familiar with Julia's rich and continuously evolving ecosystem, allowing you to stay on top of your game. It covers the essentials of data science and gives a high-level overview of advanced statistics and techniques. You will generate insights by performing inferential statistics and reveal hidden patterns and trends using data mining, with practical coverage of statistics and machine learning. You will develop the knowledge to build statistical models and machine learning systems in Julia with attractive visualizations. You will then delve into deep learning in Julia and learn the Mocha.jl framework, with which you can create artificial neural networks and implement deep learning. This book addresses the challenges of real-world data science problems, including data cleaning, data preparation, inferential statistics, statistical modeling, building high-performance machine learning systems, and creating effective visualizations using Julia.

Who is this book for?

This book is aimed at data analysts and aspiring data scientists who have a basic knowledge of Julia or are completely new to it. The book also appeals to those who are competent in R and Python and wish to adopt Julia to improve their skill set in data science. It would be beneficial if readers have a good background in statistics and computational mathematics.

What you will learn

  • Apply statistical models in Julia for data-driven decisions
  • Understand the process of data munging and data preparation using Julia
  • Explore techniques to visualize data using Julia and D3-based packages
  • Use Julia to create self-learning systems using cutting-edge machine learning algorithms
  • Create supervised and unsupervised machine learning systems using Julia, and explore ensemble models
  • Build a recommendation engine in Julia
  • Dive into Julia's deep learning framework and build a system using Mocha.jl

Product Details

Publication date: Sep 30, 2016
Length: 346 pages
Edition: 1st
Language: English
ISBN-13: 9781785289699

Table of Contents

11 Chapters
1. The Groundwork – Julia's Environment
2. Data Munging
3. Data Exploration
4. Deep Dive into Inferential Statistics
5. Making Sense of Data Using Visualization
6. Supervised Machine Learning
7. Unsupervised Machine Learning
8. Creating Ensemble Models
9. Time Series
10. Collaborative Filtering and Recommendation System
11. Introduction to Deep Learning

Customer reviews

Rating: 4.5 out of 5 (6 ratings)
5 star 83.3%
4 star 0%
3 star 0%
2 star 16.7%
1 star 0%

RareComplexCollectionOfMatter, Aug 17, 2020 (5 stars, Amazon verified review)
Good

Deepankar A., Jan 25, 2017 (5 stars, Amazon verified review)
The way the book is written is really amazing with various practical examples. This actually gives a good insights of how to use Julia with Data Science. The language is easy by comparing the complexity the book is dealing with so it is easy for starters to start with. It is a must read if one is adopting Julia as a language for Data Science.

Amazon Customer, Dec 27, 2016 (5 stars, Amazon verified review)
I would definitely recommend this book. I have worked on many projects in the past and have used Python,R and Scala, this book has added an entire new area for me to work on. It is well structured and was easy to go through, a good job by the author.

Amazon Customer, Dec 22, 2016 (5 stars, Amazon verified review)
Learning a new language is always a difficult task, but this book has covered all the crucial topics and thus makes learning very smooth and easy.

Rahul, Dec 17, 2017 (5 stars, Amazon verified review)
As someone who knew nothing about data science and machine learning models, this book proves to be a great asset for people looking out for serious and immersive content over the topic. The author explains each and every topic in detail and uses Julia code, which btw is something that every modern data scientist should be looking out for. Overall, this is a great book and I would highly recommend it.