Tuesday, November 21, 2017

Running jupyter notebook with pyspark shell on Windows 10

# Running jupyter notebook with pyspark shell.

This notebook will not run in an ordinary jupyter notebook server.
Here is how we can load pyspark to use Jupyter notebooks.
On Mac, Linux, something like this will work:
On Windows 10, it works at 2017-Nov-21.

# Install Anaconda

https://www.anaconda.com/download/

## Run "Anaconda Prompt" to open command line:

%windir%\System32\cmd.exe "/K" E:\Users\datab_000\Anaconda3\Scripts\activate.bat E:\Users\datab_000\Anaconda3

Note: above commands are in one single line.

## Then set system environment variables:

rem set PYSPARK_PYTHON=python3
set PYSPARK_DRIVER_PYTHON=jupyter
set PYSPARK_DRIVER_PYTHON_OPTS=notebook

## start pyspark

E:\movie\spark\spark-2.2.0-bin-hadoop2.7\bin\pyspark.cmd


Reference:

No comments: