To get the most out of this book
You should have basic familiarity with Python programming, as all the code in the practical sections is written in Python. Familiarity with major Python libraries, such as pandas and scikit-learn, is not essential (because the book covers some basics) but will help you get through the book much faster. Familiarity with PyTorch, the framework the book uses for deep learning, is also not essential but would accelerate your learning considerably. None of the software requirements should stop you because, in today’s internet-enabled world, the only thing standing between you and a world of knowledge is the search bar in your favorite search engine.
Another key to getting the most out of this book is to run the associated notebooks as you work through the lessons. Also, feel free to experiment with variations that the book doesn’t go into. That is a surefire way to internalize what the book discusses. And for that, we need to set up an environment, as you’ll see in the following section.
Setting up an environment
The easiest way to set up an environment is by using Anaconda, a distribution of Python for scientific computing. You can use Miniconda, a minimal installer for Conda, as well if you do not want the pre-installed packages that come with Anaconda:
- Install Anaconda/Miniconda: Anaconda can be installed from https://www.anaconda.com/products/distribution. Depending on your operating system, choose the corresponding file and follow the instructions. Alternatively, you can install Miniconda from here: https://docs.conda.io/en/latest/miniconda.html#latest-miniconda-installer-links.
- Open conda prompt: To open Anaconda Prompt (or Terminal on Linux or macOS), do the following:
  - Windows: Open the Anaconda Prompt (Start | Anaconda Prompt).
  - macOS: Open Launchpad and then open Terminal. Type conda activate.
  - Linux: Open Terminal. Type conda activate.
- Navigate to the downloaded code: Use operating system-specific commands to navigate to the folder where you have downloaded the code. For instance, in Windows, use cd.
- Install the environment: Using the anaconda_env.yml file that is included, install the environment:
conda env create -f anaconda_env.yml
This creates a new environment under the name modern_ts and installs all the required libraries in it. This can take a while.
- Checking the installation: We can check whether all the libraries required for the book are installed properly by executing a script in the downloaded code folder:
python test_installation.py
- Activating the environment and running notebooks: Every time you want to run the notebooks, first activate the environment using the conda activate modern_ts command and then start Jupyter Notebook (jupyter notebook) or JupyterLab (jupyter lab), according to your preference.
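As a rough illustration of what an installation check can look like, here is a minimal sketch that tests whether a few libraries can be located for import. It is not the book's test_installation.py, whose contents are not shown here. The LIBRARIES mapping and the find_missing helper are hypothetical names, and the mapping covers only the libraries named in this preface, not everything the environment file installs.

```python
import importlib.util

# Hypothetical list: only the libraries this preface names explicitly.
# scikit-learn is imported as "sklearn" and PyTorch as "torch".
LIBRARIES = {"pandas": "pandas", "scikit-learn": "sklearn", "PyTorch": "torch"}


def find_missing(libraries):
    """Return the display names of libraries that cannot be located for import."""
    missing = []
    for display_name, module_name in libraries.items():
        # find_spec returns None when a top-level module is not installed
        if importlib.util.find_spec(module_name) is None:
            missing.append(display_name)
    return missing


if __name__ == "__main__":
    missing = find_missing(LIBRARIES)
    print("Missing libraries:", ", ".join(missing) if missing else "none")
```

Run it inside the activated modern_ts environment. Any name it reports as missing suggests the environment creation step did not complete.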
Download the data
You are going to be using a single dataset throughout the book. The book uses the London Smart Meters dataset from Kaggle for this purpose. Therefore, if you don’t have an account with Kaggle, please go ahead and create one: https://www.kaggle.com/account/login?phase=startRegisterTab.
There are two ways you can download the data: automated and manual.
For the automated way, we need to download a key from Kaggle. Let’s do that first (if you are going to choose the manual way, you can skip this):
- Click on your profile picture in the top-right corner of Kaggle.
- Select Account, and find the section for API.
- Click the Create New API Token button. A file with the name kaggle.json will be downloaded.
- Copy the file and place it in the api_keys folder in the downloaded code folder.
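If you want to confirm that the token ended up in the right place, a quick check along the following lines can help. This is a sketch under assumptions: check_kaggle_token is a hypothetical helper, not part of the book's code, and it assumes the token file is a JSON object with username and key fields.

```python
import json
from pathlib import Path


def check_kaggle_token(token_path):
    """Return a list of problems found with the token file (empty if none)."""
    path = Path(token_path)
    if not path.is_file():
        return [f"{path} does not exist"]
    try:
        token = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return [f"{path} is not valid JSON"]
    if not isinstance(token, dict):
        return [f"{path} does not contain a JSON object"]
    # Assumption: the token stores a "username" and a "key"
    return [f"missing field: {field}" for field in ("username", "key")
            if not token.get(field)]


if __name__ == "__main__":
    # Run from the root of the downloaded code folder
    problems = check_kaggle_token(Path("api_keys") / "kaggle.json")
    print("\n".join(problems) if problems else "kaggle.json looks fine")
```

The function returns messages instead of raising, so it can report a missing file and a malformed file in the same way.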
Now that we have kaggle.json downloaded and placed in the right folder, let’s look at the two methods to download data:
Method one – automated download
- Activate the environment using conda activate modern_ts.
- Run the provided script from the root directory of the downloaded code:
python scripts/download_data.py
That’s it. Now, just wait for the script to finish downloading the data, unzipping it, and organizing the files in the expected format.
Method two – manual download
- Go to https://www.kaggle.com/jeanmidev/smart-meters-in-london and download the dataset.
- Unzip the contents to data/london_smart_meters.
- Unzip hhblock_dataset to get the raw files we want to work with.
- Make sure the unzipped files are in the expected folder structure (see the next section).
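If you prefer to script the unzipping steps rather than use a file manager, the standard library's zipfile module is enough. The sketch below is illustrative only: extract_archive is a hypothetical helper, and archive.zip is a placeholder for whatever name your browser gave the download.

```python
import zipfile
from pathlib import Path


def extract_archive(archive_path, target_dir):
    """Extract a zip archive into target_dir and return the names it contained."""
    target = Path(target_dir)
    # Create the target folder (and any parents) if it does not exist yet
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
        return archive.namelist()


if __name__ == "__main__":
    # Placeholder file name; replace it with the actual downloaded file
    download = Path("archive.zip")
    if download.is_file():
        names = extract_archive(download, "data/london_smart_meters")
        print(f"Extracted {len(names)} entries")
    else:
        print(f"{download} not found; nothing extracted")
```

The same function can then be pointed at the inner hhblock_dataset archive to extract the raw block files.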
Now that you have downloaded the data, we need to make sure it is arranged in the following folder structure. The automated download creates this structure for you, but with the manual download, you need to create it yourself. To avoid ambiguity, the expected folder structure is as follows:
data
├── london_smart_meters
│   ├── hhblock_dataset
│   │   ├── hhblock_dataset
│   │       ├── block_0.csv
│   │       ├── block_1.csv
│   │       ├── ...
│   │       ├── block_109.csv
│   ├── acorn_details.csv
│   ├── informations_households.csv
│   ├── uk_bank_holidays.csv
│   ├── weather_daily_darksky.csv
│   ├── weather_hourly_darksky.csv
The extraction process may leave additional files behind. You can remove them without impacting anything. A helper script in the downloaded code checks this structure:
python test_data_download.py
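To give a sense of what such a structure check involves, here is a minimal sketch using pathlib. It is not the book's test_data_download.py. EXPECTED_FILES and missing_files are hypothetical names, and the relative paths assume the nested hhblock_dataset/hhblock_dataset layout described above, with only the first block file checked.

```python
from pathlib import Path

# Assumed layout, relative to data/london_smart_meters
EXPECTED_FILES = [
    "hhblock_dataset/hhblock_dataset/block_0.csv",
    "acorn_details.csv",
    "informations_households.csv",
    "uk_bank_holidays.csv",
    "weather_daily_darksky.csv",
    "weather_hourly_darksky.csv",
]


def missing_files(root, expected=EXPECTED_FILES):
    """Return the expected relative paths that are not present under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    # Run from the root of the downloaded code folder
    missing = missing_files("data/london_smart_meters")
    if missing:
        print("Missing files:")
        for name in missing:
            print(" ", name)
    else:
        print("All expected files are in place")
```

Checking a handful of representative files keeps the sketch short. A fuller check would also confirm the complete set of block files.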
If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book’s GitHub repository. Doing so will help you avoid any potential errors related to the copying and pasting of code.
The code provided with the book is in no way a library; it is a guide for you to start experimenting with. The amount of learning you can derive from the book and code is directly proportional to how much you experiment with the code and stray outside your comfort zone. So, go ahead and start experimenting and putting the skills you pick up in the book to good use.