Import custom package or module in PySpark
First, zip all of the dependencies into a single zip file with the layout shown below, placing each module at the root of the archive. Then you can use one of the following methods to import it.
|-- kk.zip
| |-- kk.py
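As a rough sketch of how an archive with this layout could be built, the snippet below uses Python's standard zipfile module. The helper name build_zip and the contents of kk.py are made up for illustration; the point is only that kk.py sits at the root of the archive, so that import kk can resolve once the zip is on the module search path.

```python
import os
import tempfile
import zipfile


def build_zip(module_paths, zip_path):
    """Hypothetical helper: store each file at the root of the archive."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in module_paths:
            # arcname drops the directory part, so kk.py lands at the zip root
            zf.write(path, arcname=os.path.basename(path))
    return zip_path


# Create a throwaway kk.py (placeholder content) and package it as kk.zip.
workdir = tempfile.mkdtemp()
module_path = os.path.join(workdir, "kk.py")
with open(module_path, "w") as f:
    f.write("def greet():\n    return 'hello from kk'\n")

zip_path = build_zip([module_path], os.path.join(workdir, "kk.zip"))

# List what ended up inside the archive.
with zipfile.ZipFile(zip_path) as zf:
    archived_names = zf.namelist()
print(archived_names)
```

Running zip kk.zip kk.py from the directory that contains kk.py should produce an equivalent archive.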
Using --py-files in spark-submit
When you submit a Spark job, add the --py-files=kk.zip parameter. kk.zip will be distributed along with the main script file, and it will be added to the Python module search path (PYTHONPATH) on the driver and the executors.
Then you can use import kk in your main script file.
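The mechanism behind this can be illustrated locally without Spark: Python can import modules directly from a zip file that appears on sys.path. The sketch below imitates that step by hand. The greet function and the contents of kk.py are assumptions for the demo; in a real job Spark handles the path setup, and the submit command would look something like spark-submit --py-files kk.zip main.py, where main.py is a placeholder name for your main script.

```python
import os
import sys
import tempfile
import zipfile

# Build a small kk.zip with a placeholder kk.py at the archive root.
workdir = tempfile.mkdtemp()
zip_path = os.path.join(workdir, "kk.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("kk.py", "def greet():\n    return 'hello from kk'\n")

# Local stand-in for what --py-files arranges: put the zip on the
# module search path so the import system can look inside it.
sys.path.insert(0, zip_path)

import kk  # resolved from inside kk.zip via Python's zip import support

message = kk.greet()
print(message)
```

Because the import is resolved from the zip root, a module nested in a subdirectory of the archive would need to be imported through its package path instead.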