Posts

2022


QNAP TS-453Dmini Review

·3 mins

My first NAS was a Synology DS120j, an ARM-based entry-level product. It’s fine for downloading and backup, but not powerful enough to run Docker and virtual machines.

So I bought this NAS last month, and I’m satisfied with it. Here are the advantages and disadvantages.

Internet Account Keeps Coming Back after Deletion on macOS

·1 min

Today I tried to delete an inactive Internet account in System Preferences. It was deleted successfully but came back about 20 seconds later. This drove me nuts.

I tried these methods, but none of them worked.

  • Boot in safe mode, delete account.
  • Delete record in ZACCOUNT table in ~/Library/Accounts/Accounts4.sqlite.
  • Delete related items in Keychain Access app.

Later, RedHatDude’s answer in this post gave me a clue: it looked like an iCloud sync problem. I deleted the account on all 3 of my MacBooks at the same time. Thank goodness! It has not shown up again.

2021


How to disable auto strip in CharField in Django

·2 mins

In Django, when you edit a field in the admin page or post data to a form, the leading and trailing whitespace in CharField and TextField values is removed.

The reason is the strip=True parameter in forms.CharField, which was added in Django 1.9. You can see the discussion in Django ticket #4960 and here is the source code. models.CharField and models.TextField use formfield() to create the form field that interacts with the user, and both of them eventually create a forms.CharField.

Using JSONField before Django 3.1

·2 mins

Since Django 3.1, Django supports saving Python data to the database as JSON-encoded data, and it is also possible to query on values inside a JSONField. The detailed usage can be found here. If you are on an older version and want to try this feature, many packages have ported it; I recommend django-jsonfield-backport.

Dynamically Allocate Executors when Executing Jobs in Spark

·4 mins

I wrote a Spark program to process logs. The volume of logs changes over time. To ensure logs are processed without delay, the number of executors is sized for the peak number of logs per minute. As a consequence, CPU usage on the executors is low. To reduce wasted resources, I looked for a way to scale executors up and down while the program is running.

Improve Kafka throughput

·5 mins

Kafka is a high-performance, scalable messaging system. When handling large volumes of data, the default configuration may limit the maximum performance. In this article, I’ll explain how messages are generated and saved in Kafka, and how to improve performance by changing the configuration.

Fix Error: Cask 'java' is unavailable in Homebrew

·1 min

After updating brew to the latest version, cask-related commands such as brew list --cask always output Error: Cask 'java' is unavailable: No Cask with this name exists. However, the plain brew command still works.

After doing some research, I found that Java has been moved to homebrew/core. Now it makes sense: I installed java through cask, but that cask no longer exists, so cask throws this error. If I uninstall the java cask, the error should disappear.

2020


Timezone in JVM

·2 mins

I wrote some Scala code to get the current time. However, the output differs between the development server and Docker.

import java.util.Calendar

println(Calendar.getInstance().getTime)

On my development server, it outputs Sun Oct 18 18:01:01 CST 2020, but in Docker, it prints a UTC time.
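The difference comes from the JVM's default time zone, which each environment supplies on its own. The sketch below is one way to see this and to avoid depending on it. It is only an illustration: the object name TimezoneCheck, the helper nowIn, and the zone Asia/Shanghai are assumptions picked for the example, not taken from the original program.

```scala
import java.time.{ZoneId, ZonedDateTime}
import java.util.TimeZone

object TimezoneCheck {
  // Current time in an explicitly chosen zone, independent of the JVM default.
  // `nowIn` is a hypothetical helper introduced for this example.
  def nowIn(zoneId: String): ZonedDateTime =
    ZonedDateTime.now(ZoneId.of(zoneId))

  def main(args: Array[String]): Unit = {
    // This is the zone that Calendar.getInstance() silently picks up.
    println(s"JVM default zone: ${TimeZone.getDefault.getID}")

    // "Asia/Shanghai" is an assumed zone, used only for illustration.
    println(s"Explicit zone:    ${nowIn("Asia/Shanghai")}")
  }
}
```

Running this in both environments should show different default zones while the explicit-zone line stays consistent. Another option is to leave the code alone and set the default at launch with the JVM system property -Duser.timezone.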

Retrieve Large Dataset in Elasticsearch

·5 mins

It’s easy to get a small dataset from Elasticsearch by using size and from. However, it’s not possible to retrieve a large dataset the same way.

Deep Paging Problem

As we know, Elasticsearch data is organised into indexes, which are logical namespaces, while the real data is stored in physical shards. Each shard is an instance of Lucene. There are two kinds of shards: primary shards and replica shards. Replica shards are copies of primary shards, kept in case nodes or shards fail. By distributing the documents in an index across multiple shards, and distributing those shards across multiple nodes, Elasticsearch can ensure redundancy and scalability. Before version 7.0, Elasticsearch created 5 primary shards by default (1 since 7.0), with one replica shard for each primary shard.
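To see why this layout makes deep pages expensive, here is a rough back-of-the-envelope sketch. For a sorted from/size query, each shard has to collect its own top from + size hits, and the coordinating node then merges the results from every shard. The object name DeepPagingCost and its functions are hypothetical, and the numbers are illustrative, not measurements.

```scala
object DeepPagingCost {
  // Hits each shard must collect to serve one page: everything up to from + size.
  def hitsPerShard(from: Int, size: Int): Long = from.toLong + size

  // Hits the coordinating node has to merge and sort across all queried shards.
  def hitsAtCoordinator(shards: Int, from: Int, size: Int): Long =
    shards * hitsPerShard(from, size)

  def main(args: Array[String]): Unit = {
    // First page of 10 results on 5 shards: 5 * (0 + 10) hits to merge.
    println(hitsAtCoordinator(shards = 5, from = 0, size = 10))     // 50

    // Same page size starting at offset 10000: 5 * (10000 + 10) hits.
    println(hitsAtCoordinator(shards = 5, from = 10000, size = 10)) // 50050
  }
}
```

The page returned to the client is still 10 documents in both cases, yet the work grows linearly with the offset and with the number of shards.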