There are several ways of installing and configuring PySpark. You can set it up in a Python IDE such as PyCharm or Spyder; you can use it directly if you have already installed Spark and configured SPARK_HOME; or you can use PySpark from the Python shell. Below we will see how to configure PySpark for running standalone jobs.
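As a quick illustration, once Spark is installed and SPARK_HOME points at the distribution (as configured below), the interactive PySpark shell can be launched from the command line as follows:
$SPARK_HOME/bin/pyspark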
Installation and configuration
By setting SPARK_HOME
First, download the Spark distribution and place it at your preferred location, say /home/asif/Spark. Now let's set SPARK_HOME as follows:
echo "export SPARK_HOME=/home/asif/Spark" >> ~/.bashrc
Now let's set PYTHONPATH as follows:
echo "export PYTHONPATH=$SPARK_HOME/python/" >> ~/.bashrc
echo "...