Program Crash Caused by CPU Instruction

Dealing with bugs is inevitable in a coding career. The main parts of coding are implementing new features, fixing bugs, and improving performance. For me, two kinds of bugs are difficult to tackle: those that are hard to reproduce, and those that occur in code you did not write.

Recently, I met a bug that had both of these features. I had written a Spark program to analyse logs and cluster them. Last week I updated the code to use Facebook’s faiss library to accelerate the search for similar vectors. After I pushed the new code to Spark, the program crashed. I found this log on the Spark driver:

C-m, RET and Return Key in Emacs

I use Emacs to write my blog. After a recent update, I found that M-RET no longer behaves as the leader key in org mode; it behaves as org-meta-return instead. Even stranger, in other modes it still behaves as the leader key, and M-RET still works in org mode in the terminal. In the GUI, pressing C-M-m can trigger the leader key.

So I opened this issue, and with the help of these friends it has been fixed. Here is the cause of the bug.

Import custom package or module in PySpark

First, zip all of the dependencies into a zip file like this. Then you can use one of the following methods to import it.

|-- kk.zip
|   |-- kk.py

Using --py-files in spark-submit

When submitting a Spark job, add the --py-files=kk.zip parameter. kk.zip will be distributed with the main script file, and kk.zip will be inserted at the beginning of the PYTHONPATH environment variable, so that its modules can be imported.
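The import side of this can be tried locally without Spark. The following is a minimal sketch that assumes only standard Python behaviour: a zip archive placed on the module search path can be imported from directly. The contents of kk.py and the greet function are hypothetical, made up for illustration.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Build a throwaway kk.zip containing kk.py (the module body is hypothetical).
workdir = tempfile.mkdtemp()
zip_path = os.path.join(workdir, "kk.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("kk.py", "def greet(name):\n    return 'hello ' + name\n")

# Putting the archive itself on the search path is what makes `import kk` work.
sys.path.insert(0, zip_path)
kk = importlib.import_module("kk")

print(kk.greet("spark"))
```

This only imitates the search-path effect on one machine; shipping the archive to the executors is the part that spark-submit handles.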

Time boundary in InfluxDB Group by Time Statement

These days I use InfluxDB to save some time series data. I love these features it provides:

High Performance

According to its hardware guide, a single node can support more than 750k point writes per second, 100 moderate queries per second, and a series cardinality of 10M.

Continuous Queries

Simple aggregation can be done by InfluxDB’s continuous queries.

Overwrite Duplicated Points

If you submit a new point with the same measurement, tag set, and timestamp as an existing point, the new data will overwrite the old one.
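To make the identity rule concrete, here is a toy in-memory model, not the InfluxDB client API. It assumes a point is keyed by measurement, tag set, and timestamp, and that the fields of a duplicate point are merged into the stored point with the new values winning on conflict. The store dictionary and the write_point helper are hypothetical names.

```python
# Toy model of point identity; `store` and `write_point` are hypothetical names.
store = {}

def write_point(measurement, tags, timestamp, fields):
    # A point is identified by measurement + tag set + timestamp.
    key = (measurement, tuple(sorted(tags.items())), timestamp)
    # Fields of a duplicate point are merged; new values win on conflict.
    store.setdefault(key, {}).update(fields)

write_point("cpu", {"host": "a"}, 1000, {"usage": 0.5})
write_point("cpu", {"host": "a"}, 1000, {"usage": 0.9})  # same identity: overwrites
write_point("cpu", {"host": "b"}, 1000, {"usage": 0.1})  # different tag set: new point

print(len(store))
```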

C3 Linearization and Python MRO (Method Resolution Order)

Python supports multiple inheritance: a class can be derived from more than one base class. If an attribute or method is not found in the current class, how does Python decide the order in which to search the superclasses? In simple scenarios, we know the rule is left to right, bottom to top. But when the inheritance hierarchy becomes complicated, the answer is hard to find by intuition.

For instance, what’s the search sequence of class M?

class X: pass
class Y: pass
class Z: pass
class A(X, Y): pass
class B(Y, Z): pass
class M(B, A, Z): pass

The answer is: M, B, A, X, Y, Z, object
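The answer can be checked with a small sketch of the C3 merge. The helpers c3_merge and c3_linearize are illustrative names, not part of Python; the interpreter's own result is available as M.__mro__. The idea is to repeatedly take the first head that does not appear in the tail of any remaining sequence.

```python
class X: pass
class Y: pass
class Z: pass
class A(X, Y): pass
class B(Y, Z): pass
class M(B, A, Z): pass

def c3_merge(sequences):
    """Merge linearizations, always picking a head that is in no other tail."""
    result = []
    sequences = [list(s) for s in sequences if s]
    while sequences:
        for seq in sequences:
            head = seq[0]
            if not any(head in other[1:] for other in sequences):
                break
        else:
            raise TypeError("inconsistent hierarchy")
        result.append(head)
        # Drop the chosen class everywhere, then discard exhausted sequences.
        sequences = [[c for c in s if c is not head] for s in sequences]
        sequences = [s for s in sequences if s]
    return result

def c3_linearize(cls):
    """L[C] = C + merge(L[B1], ..., L[Bn], [B1, ..., Bn])."""
    bases = list(cls.__bases__)
    return [cls] + c3_merge([c3_linearize(b) for b in bases] + [bases])

names = [c.__name__ for c in c3_linearize(M)]
print(names)
```

Y has to wait until after X because Y still appears in the tail of A's linearization (A, X, Y, object) at the moment it first becomes a candidate.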

Difference between Value and Pointer variable in Defer in Go

defer is a useful statement for cleanup: deferred calls execute in LIFO order before the surrounding function returns. If you don’t know how it works, the execution result may sometimes confuse you.

How it Works and Why Value or Pointer Receiver Matters

I found an interesting piece of code on Stack Overflow:

package main

import "fmt"

type X struct {
    S string
}

func (x X) Close() {
    fmt.Println("Value-Closing", x.S)
}

func (x *X) CloseP() {
    fmt.Println("Pointer-Closing", x.S)
}

func main() {
    x := X{"Value-X First"}
    defer x.Close()
    x = X{"Value-X Second"}
    defer x.Close()

    x2 := X{"Value-X2 First"}
    defer x2.CloseP()
    x2 = X{"Value-X2 Second"}
    defer x2.CloseP()

    xp := &X{"Pointer-X First"}
    defer xp.Close()
    xp = &X{"Pointer-X Second"}
    defer xp.Close()

    xp2 := &X{"Pointer-X2 First"}
    defer xp2.CloseP()
    xp2 = &X{"Pointer-X2 Second"}
    defer xp2.CloseP()
}

The output is: