Skip to main content

Python

2024


sys.path in Python

·3 mins

Here is the process how sys.path is set in Python, with some parts omitted.

Python Command Line Arguments #

By default, as initialized upon program startup, a potentially unsafe path is prepended to sys.path:

__import__ in Python

·2 mins

It’s known that Python’s import statement is implemented by __import__ function. In general, if we want to import a module dynamically, we can use import_module function, which is a wrapper around __import__.

The most important difference between these two functions is that import_module() returns the specified package or module (e.g. pkg.mod), while import() returns the top-level package or module (e.g. pkg). – https://docs.python.org/3/library/importlib.html#importlib.import_module

2023


2022


Memory Leak in Python multiprocessing.Pool

·4 mins

There is a historical memory leak problem in our Django app and I fixed it recently. As time goes by, the memory usage of app keeps growing and so does the CPU usage.

After some research, I figure out the cause. Some views does not close multiprocessing.Pool after using it. The problem disappears when I use Pool with with statement.

2021


How to disable auto strip in Charfield in Django

·2 mins

In Django, when edit field in admin page or post data to forms, the leading and tailing whitespace in CharField and TextField are removed.

The reason is strip=True parameter in forms.CharField, which is added in Djagno 1.9. You can see the discussion in django tiket #4960 and here is source code. models.CharField and models.TextField use formfield() to create form to interact with user, then both of them eventually create a forms.CharField

Using JSONField before Django 3.1

·2 mins

In Django 3.1, Django support save python data into database as JSON encoded data and it is also possible to make query based on field value in JSONField. The detailed usage can be found here. If you are using older version and want to try this feature. Though there are many packages ported this function, I recommend django-jsonfield-backport.

2020


Program Crash Caused by CPU Instruction

·3 mins

It’s inevitable to dealing with bugs in coding career. The main part of coding are implementing new features, fixing bugs and improving performance. For me, there are two kinds of bugs that is difficult to tackle: those are hard to reproduce, and those occur in code not wrote by you.

Import custom package or module in PySpark

·1 min

First zip all of the dependencies into zip file like this. Then you can use one of the following methods to import it.

|-- kk.zip
|   |-- kk.py

Using –py-files in spark-submit #

When submit spark job, add --py-files=kk.zip parameter. kk.zip will be distributed with the main scrip file, and kk.zip will be inserted at the beginning of PATH environment variable.

C3 Linearization and Python MRO(Method Resolution Order)

·3 mins

Python supports multiple inheritance, its class can be derived from more than one base classes. If the specified attribute or methods was not found in current class, how to decide the search sequence from superclasses? In simple scenario, we know left-to right, bottom to up. But when the inheritance hierarchy become complicated, it’s not easy to answer by intuition.