This notebook will not run on an ordinary Jupyter notebook server; the kernel must be started through PySpark. Here is how to configure PySpark so that it launches Jupyter notebooks.
On Mac or Linux, something similar will work. On Windows 10, the steps below were verified as of 2017-Nov-21.
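For Mac/Linux, a minimal sketch of the equivalent setup, assuming Spark is unpacked at `~/spark-2.2.0-bin-hadoop2.7` (that path is an assumption; adjust it to your own install location):

```shell
# Tell pyspark to use Jupyter as the driver Python front end
# (mirrors the Windows "set" commands further below).
export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS=notebook

# Assumed Spark install path; starting pyspark now opens a Jupyter notebook server.
~/spark-2.2.0-bin-hadoop2.7/bin/pyspark
```

Because the variables are exported in the shell before `pyspark` runs, the launcher script picks them up and starts Jupyter instead of a plain Python REPL.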
# Install Anaconda
https://www.anaconda.com/download/
## Run "Anaconda Prompt" to open a command line:
%windir%\System32\cmd.exe "/K" E:\Users\datab_000\Anaconda3\Scripts\activate.bat E:\Users\datab_000\Anaconda3
Note: the above command is one single line.
## Then set system environment variables:
rem set PYSPARK_PYTHON=python3
set PYSPARK_DRIVER_PYTHON=jupyter
set PYSPARK_DRIVER_PYTHON_OPTS=notebook
## Start pyspark
E:\movie\spark\spark-2.2.0-bin-hadoop2.7\bin\pyspark.cmd
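Before launching, it can help to confirm the two driver variables that pyspark reads at startup. A minimal check in a POSIX shell (on the Windows command line, `echo %PYSPARK_DRIVER_PYTHON%` plays the same role):

```shell
# Set and then echo the two variables the pyspark launcher consults.
export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS=notebook
echo "driver=$PYSPARK_DRIVER_PYTHON opts=$PYSPARK_DRIVER_PYTHON_OPTS"
# prints: driver=jupyter opts=notebook
```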