Installing Beautiful Soup
Installing Beautiful Soup in Linux
Installing Beautiful Soup on a Linux machine is straightforward. On recent versions of Debian or Ubuntu, Beautiful Soup is available as a package, and we can install it using the system package manager. On other versions of Debian or Ubuntu, where Beautiful Soup is not available as a package, we can use alternative installation methods.
Normally, there are three ways to install Beautiful Soup on Linux machines:
- Using the package manager
- Using pip
- Using easy_install
The choices are ranked by complexity, simplest first, so that we can avoid trial and error. Using the package manager is the easiest method because it requires the least effort from the user, so we will cover it first. If one method succeeds, we don't need to try the others, because all three install the same library.
Installing Beautiful Soup using package manager
Linux machines normally come with a package manager for installing software packages. On recent versions of Debian or Ubuntu, Beautiful Soup is available as a package, so we will use the system package manager to install it. On Linux machines such as Ubuntu and Debian, the default package manager is based on apt-get, and hence we will use apt-get to do the task.
Just open up a terminal and type in the following command:
sudo apt-get install python-bs4
The preceding command will install Beautiful Soup Version 4 on our Linux operating system. Installing new packages on the system normally requires root user privileges, which is why we put sudo in front of the apt-get command. Without sudo, we will end up with a permission denied error. If the package list is already up to date, the command finishes with a success message in the terminal itself.
Since we are using a recent version of Ubuntu or Debian, python-bs4 will be listed in the apt repository. But if the preceding command fails with a package-not-found error, it means that the package list is not up to date. This normally happens if we have just installed the operating system and the package list has not yet been downloaded from the package repository. In this case, we first need to update the package list using the following command:
sudo apt-get update
The preceding command will update the package list from the online package repositories. After this, we need to run the install command again to install Beautiful Soup.
On older versions of the Linux operating system, even after running the apt-get update command, we might not be able to install Beautiful Soup because it might not be available in the repositories. In these scenarios, we can rely on the other installation methods, using either pip or easy_install.
Installing Beautiful Soup using pip or easy_install
pip and easy_install are tools used for managing and installing Python packages. Either of them can be used to install Beautiful Soup.
Installing Beautiful Soup using pip
From the terminal, type the following command:
pip install beautifulsoup4
The preceding command will install Beautiful Soup Version 4 in the system after downloading the necessary packages from http://pypi.python.org/.
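When more than one Python version is installed, the pip found on the path may belong to a different interpreter than the one we intend to use. One way around this, sketched below under the assumption of Python 3 with pip available as a module for that interpreter, is to invoke pip through the interpreter itself. The function names pip_install_command and install are illustrative and are not part of pip or Beautiful Soup.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build a pip command bound to the interpreter running this script."""
    # sys.executable is the full path of the current Python interpreter,
    # so the package lands in that interpreter's site-packages.
    return [sys.executable, "-m", "pip", "install", package]


def install(package):
    """Run pip; raises subprocess.CalledProcessError if the install fails."""
    subprocess.check_call(pip_install_command(package))


# Printing instead of running keeps this sketch free of side effects;
# call install("beautifulsoup4") to actually perform the installation.
print(" ".join(pip_install_command("beautifulsoup4")))
```

Calling install("beautifulsoup4") still needs network access and, for a system-wide Python, may need the same root privileges as the plain pip command.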
Installing Beautiful Soup using easy_install
The easy_install tool installs the package from the Python Package Index (PyPI). So, in the terminal, type the following command:
easy_install beautifulsoup4
None of the previous methods will work if we do not have an active network connection. If they all fail, we can still install Beautiful Soup: the last option is to use the setup.py script that comes with every Python package downloaded from pypi.python.org. This is also the recommended method for installing Beautiful Soup on Windows and Mac OS X machines, so we will discuss it in the Installing Beautiful Soup in Windows section.
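Whichever method we used, we can confirm that the installation worked by checking whether the interpreter can find the bs4 package. The following is a minimal sketch, assuming Python 3.4 or later; the helper name beautiful_soup_version is illustrative.

```python
import importlib
import importlib.util


def beautiful_soup_version():
    """Return the installed bs4 version string, or None if bs4 is missing."""
    # find_spec looks the package up without importing it.
    if importlib.util.find_spec("bs4") is None:
        return None
    bs4 = importlib.import_module("bs4")
    # Fall back to a placeholder if the version attribute is absent.
    return getattr(bs4, "__version__", "unknown")


version = beautiful_soup_version()
if version is None:
    print("Beautiful Soup 4 is not installed for this interpreter")
else:
    print("Beautiful Soup 4 found, version", version)
```

If the check reports that the package is missing even though an installation succeeded, the package was most likely installed for a different Python interpreter than the one running the script.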
Installing Beautiful Soup in Windows
On Windows, we will download the latest Beautiful Soup source package from https://pypi.python.org/packages/source/b/beautifulsoup4/ and use the setup.py script to install it. But before doing this, it will be easier for us if we add the path of Python to the system path. The next section discusses setting up the path to Python on a Windows machine.
Verifying Python path in Windows
Often, the path to python.exe is not added to the Path environment variable by default on Windows. To check this from the Windows command-line prompt, type the following command:
python
The preceding command will work without any errors if the path to Python is already in the Path environment variable or if we are already inside the Python installation directory. Even so, it is good to check the Path variable for the Python directory entry.
If it doesn't exist in the Path variable, we have to find out the actual path, which depends entirely on where Python was installed. For Python 2.x, it will be C:\Python2x by default, and for Python 3.x, it will be C:\Python3x by default.
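Rather than guessing the directory, we can ask a running interpreter where it lives and whether that directory is already listed in the Path variable. The following is a minimal sketch, assuming Python 3 and that it is run from IDLE or by starting python.exe with its full path; the helper name python_dir_on_path is hypothetical.

```python
import os
import sys


def python_dir_on_path(python_dir, path_value):
    """Return True if python_dir is an entry in a PATH-style string."""
    def norm(p):
        # Strip whitespace and quotes, then normalize case and separators
        # so that C:\Python27 and c:\python27\ compare equal on Windows.
        return os.path.normcase(os.path.normpath(p.strip().strip('"')))

    target = norm(python_dir)
    entries = [e for e in path_value.split(os.pathsep) if e.strip()]
    return any(norm(e) == target for e in entries)


# Directory that holds the interpreter currently running this script.
python_dir = os.path.dirname(sys.executable)
print("Interpreter directory:", python_dir)
print("On PATH:", python_dir_on_path(python_dir, os.environ.get("PATH", "")))
```

The directory printed on the first line is the one to add to the Path variable if the second line reports False.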
We have to add this to the Path environment variable on the Windows machine. To do this, right-click on My Computer and go to Properties | Environment Variables | System Variables.
Pick the Path variable and append the following entry to it:
;C:\PythonXY
For example, ;C:\Python27 for Python 2.7.
This is shown in the following screenshot:
After the Python path is ready, we can follow the steps for installing Beautiful Soup on a Windows machine.
Note
The method of installing Beautiful Soup using setup.py, which will be explained in the next section, is the same for Linux, Windows, and Mac OS X operating systems.