
Python

2024


Modern pip build process (--use-pep517)

·3 mins

Nowadays, pyproject.toml has become the standard configuration file for packaging. Compared with the old setup.py, it brings two features: PEP 517 and PEP 518.

PEP 517 defines two hooks, build_wheel and build_sdist, which are required to build a package from source. Every build backend must implement both of them. This makes it possible to create other build backends such as flit or poetry.
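To make the two hooks concrete, here is a minimal sketch of a toy backend module. The hook names and signatures follow PEP 517; the project name `demo_pkg`, the version, and the archive contents are hypothetical, and the archives are deliberately incomplete (for example, the RECORD file is empty), so treat this as an illustration of the interface rather than a usable backend.

```python
import io
import os
import tarfile
import zipfile

# Hypothetical project metadata, used only for illustration.
NAME = "demo_pkg"
VERSION = "0.1.0"
METADATA = f"Metadata-Version: 2.1\nName: {NAME}\nVersion: {VERSION}\n"


def build_sdist(sdist_directory, config_settings=None):
    """PEP 517 hook: write a .tar.gz into sdist_directory, return its basename."""
    base = f"{NAME}-{VERSION}"
    filename = f"{base}.tar.gz"
    payload = METADATA.encode("utf-8")
    with tarfile.open(os.path.join(sdist_directory, filename), "w:gz") as tar:
        info = tarfile.TarInfo(f"{base}/PKG-INFO")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return filename


def build_wheel(wheel_directory, config_settings=None, metadata_directory=None):
    """PEP 517 hook: write a .whl into wheel_directory, return its basename."""
    filename = f"{NAME}-{VERSION}-py3-none-any.whl"
    dist_info = f"{NAME}-{VERSION}.dist-info"
    wheel_file = (
        "Wheel-Version: 1.0\n"
        "Generator: toy-backend\n"
        "Root-Is-Purelib: true\n"
        "Tag: py3-none-any\n"
    )
    with zipfile.ZipFile(os.path.join(wheel_directory, filename), "w") as whl:
        whl.writestr(f"{NAME}.py", "")               # empty placeholder module
        whl.writestr(f"{dist_info}/METADATA", METADATA)
        whl.writestr(f"{dist_info}/WHEEL", wheel_file)
        whl.writestr(f"{dist_info}/RECORD", "")      # a real backend fills this in
    return filename
```

A frontend such as pip imports the backend named in pyproject.toml and calls these functions with a target directory; the return value is the file name of the archive that was written.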

sys.path in Python

·3 mins

Here is how sys.path is set up in Python, with some parts omitted.

Python Command Line Arguments

By default, as initialized upon program startup, a potentially unsafe path is prepended to sys.path:
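A small sketch for inspecting this at runtime. The prepended entry is the script's directory for `python script.py`, the current working directory for `python -m module`, and an empty string for `python -c` or the REPL. The helper name `first_path_entry` is made up for this example, and `sys.flags.safe_path` only exists on Python 3.11 and later, so it is read defensively.

```python
import sys


def first_path_entry():
    """Return (safe_path_enabled, first entry of sys.path or None)."""
    # sys.flags.safe_path is set by the -P option or PYTHONSAFEPATH (3.11+);
    # on older interpreters the attribute is missing, so default to False.
    safe_path = bool(getattr(sys.flags, "safe_path", False))
    first = sys.path[0] if sys.path else None
    return safe_path, first


safe_path, first = first_path_entry()
print(f"safe_path enabled: {safe_path}")
print(f"sys.path[0]: {first!r}")
```

When safe path mode is enabled, the potentially unsafe entry is not prepended, so `sys.path[0]` is whatever the rest of the initialization put first.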

__import__ in Python

·2 mins

It is well known that Python's import statement is implemented by the __import__ function. In general, if we want to import a module dynamically, we can use the import_module function, which is a wrapper around __import__.

The most important difference between these two functions is that import_module() returns the specified package or module (e.g. pkg.mod), while __import__() returns the top-level package or module (e.g. pkg). – https://docs.python.org/3/library/importlib.html#importlib.import_module
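The difference is easy to see with a dotted name. This sketch uses the standard library package `email.mime` purely as a convenient example of a `pkg.mod` style name.

```python
import importlib

# __import__ with a dotted name returns the top-level package.
top = __import__("email.mime")

# import_module returns the module that was actually named.
sub = importlib.import_module("email.mime")

print(top.__name__)  # email
print(sub.__name__)  # email.mime
```

With `__import__`, the submodule is still imported and reachable as an attribute (`top.mime`), but you have to walk down to it yourself.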

2023


2022


Memory Leak in Python multiprocessing.Pool

·4 mins

Our Django app had a long-standing memory leak, which I fixed recently. Over time, the memory usage of the app kept growing, and so did the CPU usage.

After some research, I figured out the cause: some views did not close their multiprocessing.Pool after using it. The problem disappeared once I used Pool in a with statement.
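A minimal sketch of the pattern, not the app's actual code: `square` and `compute_squares` are hypothetical stand-ins for the real work done in a view. Leaving the `with` block calls `terminate()` on the pool, so its worker processes do not outlive the request.

```python
from multiprocessing import Pool


def square(x):
    """Stand-in for the real per-item work."""
    return x * x


def compute_squares(values, processes=2):
    """Run square() over values in a pool that is always cleaned up."""
    # The context manager terminates the pool on exit, even if map() raises.
    with Pool(processes=processes) as pool:
        return pool.map(square, values)
```

When calling this from a script, put the call under an `if __name__ == "__main__":` guard, since some start methods re-import the main module in each worker.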

2021


How to disable auto strip in CharField in Django

·2 mins

In Django, when you edit a field in the admin page or post data to a form, the leading and trailing whitespace in CharField and TextField values is removed.

The reason is the strip=True parameter of forms.CharField, which was added in Django 1.9. You can see the discussion in Django ticket #4960 and here is source code. models.CharField and models.TextField use formfield() to create the form field that interacts with the user, and both of them eventually create a forms.CharField.

Using JSONField before Django 3.1

·2 mins

Since Django 3.1, Django supports saving Python data to the database as JSON-encoded data, and it is also possible to query on values inside a JSONField. The detailed usage can be found here. If you are using an older version and want to try this feature, many packages have ported this functionality; I recommend django-jsonfield-backport.

2020


Program Crash Caused by CPU Instruction

·3 mins

Dealing with bugs is inevitable in a coding career. The main parts of coding are implementing new features, fixing bugs, and improving performance. For me, two kinds of bugs are difficult to tackle: those that are hard to reproduce, and those that occur in code you did not write.

Import custom package or module in PySpark

·1 min

First, zip all of the dependencies into a zip file like this. Then you can use one of the following methods to import it.

|-- kk.zip
|   |-- kk.py

Using --py-files in spark-submit

When submitting a Spark job, add the --py-files=kk.zip parameter. kk.zip will be distributed along with the main script file, and kk.zip will be added to the Python module search path (PYTHONPATH).
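The reason a zip file works here is that Python can import modules directly from a zip archive on its search path. The sketch below shows that mechanism with the standard library only; it does not run Spark, and the contents of `kk.py` and the helper `import_from_zip` are invented for the example.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def import_from_zip(zip_path, module_name):
    """Put a zip archive on sys.path and import a module from it."""
    sys.path.insert(0, zip_path)
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


# Build a kk.zip with the same layout as above: a single kk.py at the top level.
workdir = tempfile.mkdtemp()
zip_path = os.path.join(workdir, "kk.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("kk.py", "def greet():\n    return 'hello from kk'\n")

kk = import_from_zip(zip_path, "kk")
print(kk.greet())  # hello from kk
```

Note that the modules must sit at the top level of the archive (or inside a package directory within it) for `import kk` to resolve.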