ZSTD – A stronger compression algorithm

ZSTD, short for Zstandard, is a new lossless compression algorithm, aiming to provide both a great compression ratio and high speed for your standard compression needs. "Standard" translates into everyday situations which require neither the highest possible ratio (which LZMA and ZPAQ cover) nor extreme speeds (which LZ4 covers).

It is provided as a BSD-licensed package, hosted on GitHub.

For a taste of its performance, here are a few benchmark numbers, completed on a Core i5-4300U @ 1.9 GHz, using fsbench 0.14.3, an open-source benchmark program by m^2.

Name               Ratio   C.speed   D.speed
zlib 1.2.8 -6      3.099        18       275
ZSTD               2.872       201       498
zlib 1.2.8 -1      2.730        58       250
LZ4 HC r127        2.720        26      1720
QuickLZ 1.5.1b6    2.237       323       373
LZO 2.06           2.106       351       510
Snappy 1.1.0       2.091       238       964
LZ4 r127           2.084       370      1590
LZF 3.6            2.077       220       502

(C.speed and D.speed in MB/s)

An interesting feature of ZSTD is that it can qualify as both a reasonably strong compressor and a fast one.

ZSTD delivers high decompression speed, around 500 MB/s per core.

Obviously, your exact mileage will vary depending on your target system.

ZSTD compression speed, on the other hand, can be configured to fit different situations.

The first derivative, the fast one, offers ~200 MB/s per core, which is suitable for a few real-time scenarios.

But similar to LZ4, ZSTD can offer derivatives trading compression time for compression ratio, while keeping decompression properties intact. "Offline compression", where compression time is of little importance because the content is only compressed once and decompressed many times, is therefore within scope.
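As a rough illustration of that trade-off, here is a small experiment using zlib from the Python standard library as a stand-in (ZSTD bindings may not be installed on a given system): higher compression levels spend more time to obtain a better ratio, while decompression is unaffected by the level chosen.

```python
# Illustrating the compression-time-vs-ratio trade-off with zlib, used
# here only as a stand-in for a compressor with multiple levels.
import time
import zlib

data = b"the quick brown fox jumps over the lazy dog " * 2000

for level in (1, 6, 9):
    t0 = time.perf_counter()
    compressed = zlib.compress(data, level)
    c_time = time.perf_counter() - t0
    # Decompression is a single code path, independent of the level
    # used to compress: exactly the "offline compression" property.
    assert zlib.decompress(compressed) == data
    print(f"level {level}: ratio {len(data) / len(compressed):.2f}, "
          f"compression time {c_time * 1000:.2f} ms")
```

On repetitive data like this, level 9 produces a smaller output than level 1, at the cost of more compression time; the decompression call is identical in every case.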

Note that high compression derivatives still have to be developed.

It’s a complex area which will certainly benefit from the contributions of a few experts.

Another property ZSTD is designed for is a configurable memory requirement, with the objective of fitting into low-memory configurations, or servers handling many connections in parallel.

On the decoding side, ZSTD’s memory requirement is divided into two main parts:

  1. The entropy tables: ZSTD’s entropy stage is handled by FSE (Finite State Entropy).
    FSE needs several transformation tables, which currently cost 10 KB.
    The intention is to make this size configurable, from a minimum of 2.5 KB to a maximum of 20 KB. This is a relatively mild requirement, mostly interesting for systems with very limited memory resources.
  2. The match window size, which is basically the size of the “look-back buffer” the decompression side must maintain in order to complete “match copy” operations.
    The basic idea is: the larger the window size, the better the compression ratio.
    However, it also increases the memory requirement on the decoding side, so a trade-off must be found.
    The current default window size is 512 KB, but this value will be configurable, from very small (KB) to very large (GB), in the expectation of fitting multiple scenarios’ needs.
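The "match copy" operation that the window exists for can be sketched as follows. This is a minimal illustration in Python; the token format (literal runs vs. (distance, length) matches) is hypothetical, not ZSTD's actual bitstream.

```python
# Minimal sketch of the decoder-side "match copy" operation. The decoder
# can only reach back `window_size` bytes into already-decoded output,
# which is exactly why it must keep that much history in memory.

def decode(tokens, window_size=512 * 1024):
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, bytes):      # literal run: copy as-is
            out += tok
        else:                           # match: (distance, length)
            distance, length = tok
            assert 0 < distance <= min(window_size, len(out))
            for _ in range(length):     # byte-by-byte copy allows the
                out.append(out[-distance])  # match to overlap itself
    return bytes(out)

# "abc" as literals, then a 6-byte match copied from 3 bytes back:
print(decode([b"abc", (3, 6)]))  # b'abcabcabc'
```

Note how the match in the example is longer than its distance: the copy reads bytes it has just written, which is how repetitive runs compress so well.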

The compression stage needs to handle a few more memory segments, the number and size of which are highly dependent on the selected search algorithm. At a minimum, there is a need for a “look-up table”, such as the one used by the “fast mode”. The default size of this table is currently 128 KB, but this too will be configurable, from as low as a few KB to a few MB.
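A minimal sketch of what such a look-up table does, assuming a hypothetical hash-table design (the table size and hash function here are illustrative, not ZSTD's actual parameters): the table maps a hash of the next 4 bytes to the last position where those bytes were seen, so the compressor can find matches with a single probe.

```python
# Sketch of a "fast mode" style look-up table: one fixed-size array is
# the entire search state, which is what keeps memory use bounded.

TABLE_SIZE = 1 << 15  # 32 K entries (illustrative, not ZSTD's value)

def hash4(b: bytes) -> int:
    # Multiplicative hash of 4 bytes, reduced to TABLE_SIZE buckets.
    return (int.from_bytes(b, "little") * 2654435761) % (1 << 32) >> 17

def find_matches(data: bytes):
    """Yield (position, match_position) pairs found via the table."""
    table = [-1] * TABLE_SIZE
    for pos in range(len(data) - 3):
        h = hash4(data[pos:pos + 4])
        candidate = table[h]
        # Verify the candidate: hash collisions are filtered by a
        # direct byte comparison before reporting a match.
        if candidate >= 0 and data[candidate:candidate + 4] == data[pos:pos + 4]:
            yield pos, candidate
        table[h] = pos              # remember the most recent position

sample = b"abcd0123abcd"
print(list(find_matches(sample)))  # [(8, 0)]: "abcd" at 8 matches pos 0
```

Stronger search algorithms keep more candidates per bucket (chains, binary trees), which is where the extra tables and memory mentioned below come from.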

Stronger search algorithms will need more tables, hence more memory.

While such speeds qualify ZSTD as a “fast compressor / decompressor”, they still don’t reach LZ4 territory. Therefore, selecting which algorithm best suits your needs depends highly on your speed objective.

In order to help with such selection, I’ve reused the benchmark methodology proposed here, which adds compression, communication, and decompression times in a simplistic transmission scenario. It results in the following chart:


As expected, using a direct raw-numbers representation results in a cluttered graphic, where the compressors are difficult to distinguish. Therefore, the representation is reworked, using the same scenario and the same numbers, but dynamically scaling each speed sample so that the ranking is preserved, with the best solution always receiving the relative score “1”, and the others following according to their speed difference. It produces the following graph:


which is easier to interpret.

From this graph, we see that LZ4 is the better choice for speeds above ~50 MB/s, while ZSTD takes the lead for speeds between 0.5 MB/s and 50 MB/s. Below that point, stronger alternatives take the lead.
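That crossover can be reproduced from the benchmark numbers alone. The sketch below recomputes the simplistic transmission scenario (compression + transfer + decompression of 1 MB) for ZSTD and LZ4, using the fsbench ratios and speeds quoted in the table above; the link speeds are assumed values chosen to bracket the crossover.

```python
# Total time to ship 1 MB through a link of a given speed:
# compress it, transmit the compressed size, then decompress it.

def total_time_ms(ratio, c_speed, d_speed, link_speed, size_mb=1.0):
    """All speeds in MB/s; returns milliseconds for one transfer."""
    return 1000 * (size_mb / c_speed                  # compress
                   + (size_mb / ratio) / link_speed   # transmit
                   + size_mb / d_speed)               # decompress

# Numbers from the fsbench table earlier in the article.
zstd = dict(ratio=2.872, c_speed=201, d_speed=498)
lz4  = dict(ratio=2.084, c_speed=370, d_speed=1590)

for link in (10, 50, 200):  # assumed link speeds, in MB/s
    print(f"{link:>3} MB/s link: "
          f"ZSTD {total_time_ms(link_speed=link, **zstd):.1f} ms, "
          f"LZ4 {total_time_ms(link_speed=link, **lz4):.1f} ms")
```

On the slow link the transfer term dominates and ZSTD's better ratio wins; on the fast link LZ4's speed wins, matching the ~50 MB/s crossover described above.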

ZSTD development is just starting, so consider the current results merely as early ones. The implementation will gradually evolve and improve over time, especially during this first year. This is a phase which will depend a lot on user feedback, since this feedback will be key in deciding the next priorities and features to add.

Source: RealTime Data Compression
