tweak palette heuristics #3420

jonsneyers · 2024-03-13T17:41:36Z

Some tweaks to the modular palette heuristics:

Refactored the code to remove near-duplicate logic between global and group encoding
Changed the max_colors heuristic so that palette is less likely to be used when the image/group already has low entropy (e.g. a group that is only a smooth gradient, which has few enough colors for a palette but gains nothing from one)
Actually use the implicit palette if its colors happen to occur in the image (but still make them explicit if they are frequent)
Tweaked the palette ordering: still luma-sorted, but with the 'common' colors first and the 'rare' colors after them
Don't do palette if the whole group is just a single color (it's slightly cheaper not to)
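
To make two of the points above concrete, here is a minimal sketch. It is not the code from this PR. The `Color` type, the `Luma` weights, the `common_threshold` parameter and both function names are assumptions made for illustration only. `ResidualEntropyBits` shows why a smooth gradient can have many distinct values and still be cheap to code without a palette, and `OrderPalette` shows one way to put 'common' colors before 'rare' ones while keeping each part luma-sorted.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical RGB triple used only for this sketch (not a libjxl type).
struct Color {
  int r, g, b;
};

// Lexicographic order so Color can be used as a std::map key.
inline bool operator<(const Color& a, const Color& b) {
  if (a.r != b.r) return a.r < b.r;
  if (a.g != b.g) return a.g < b.g;
  return a.b < b.b;
}

// Integer luma proxy with Rec. 601-style weights (an assumed choice).
inline int Luma(const Color& c) { return 299 * c.r + 587 * c.g + 114 * c.b; }

// Shannon entropy, in bits per sample, of left-neighbor prediction residuals
// for one channel. A smooth gradient gives a value near zero even though it
// contains many distinct sample values.
double ResidualEntropyBits(const std::vector<int>& samples) {
  if (samples.size() < 2) return 0.0;
  std::map<int, std::size_t> histogram;
  for (std::size_t i = 1; i < samples.size(); ++i) {
    ++histogram[samples[i] - samples[i - 1]];
  }
  const double total = static_cast<double>(samples.size() - 1);
  double bits = 0.0;
  for (const auto& entry : histogram) {
    const double p = static_cast<double>(entry.second) / total;
    bits -= p * std::log2(p);
  }
  return bits;
}

using Entry = std::pair<Color, std::size_t>;

// Returns the distinct colors of `pixels`: first those occurring at least
// `common_threshold` times, then the rest; each part is sorted by luma.
std::vector<Color> OrderPalette(const std::vector<Color>& pixels,
                                std::size_t common_threshold) {
  assert(common_threshold > 0);
  std::map<Color, std::size_t> counts;
  for (const Color& c : pixels) ++counts[c];
  std::vector<Entry> entries(counts.begin(), counts.end());
  std::stable_sort(entries.begin(), entries.end(),
                   [common_threshold](const Entry& a, const Entry& b) {
                     const bool a_common = a.second >= common_threshold;
                     const bool b_common = b.second >= common_threshold;
                     if (a_common != b_common) return a_common;
                     return Luma(a.first) < Luma(b.first);
                   });
  std::vector<Color> palette;
  palette.reserve(entries.size());
  for (const Entry& e : entries) palette.push_back(e.first);
  return palette;
}
```

With a threshold of 2, an input containing three black pixels, three white pixels and one gray pixel would be ordered black, white, gray: the two common colors come first in luma order, and the rare gray follows even though its luma lies between them.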

Results

QOI test set

Before:
benchmark_xl v0.10.0 453f112 [NEON]
12 total threads, 17088 tasks, 6 threads, 2 inner threads

Encoding      kPixels    Bytes          BPP  E MP/s  D MP/s     Max norm  SSIMULACRA2   PSNR        pnorm       BPP*pnorm   QABPP   Bugs
----------------------------------------------------------------------------------------------------------------------------------------
jxl:d0:4      1321975 813352177    4.9220412   3.663  18.466          nan 100.00000000  99.99   0.00000000  0.000000000000   4.922      0
jxl:d0:5      1321975 786705494    4.7607875   2.556  18.754          nan 100.00000000  99.22   0.00000000  0.000000000000   4.761      0
jxl:d0:6      1321975 774050078    4.6842026   1.801  17.320          nan 100.00000000  99.99   0.00000000  0.000000000000   4.684      0
jxl:d0:7      1321975 747782641    4.5252439   1.236  16.101          nan 100.00000000  99.24   0.00000000  0.000000000000   4.525      0
jxl:d0:8      1321975 732645434    4.4336403   0.403  15.206          nan 100.00000000  99.19   0.00000000  0.000000000000   4.434      0
jxl:d0:9      1321975 724420391    4.3838661   0.212  14.877          nan 100.00000000  99.09   0.00000000  0.000000000000   4.384      0
Aggregate:    1321975 762523609    4.6144496   1.101  16.720   0.00000000 100.00000000  99.45   0.00000000  0.000000000000   4.614      0

After:
benchmark_xl v0.10.0 8c7f3833 [NEON]
12 total threads, 17088 tasks, 6 threads, 2 inner threads

Encoding      kPixels    Bytes          BPP  E MP/s  D MP/s     Max norm  SSIMULACRA2   PSNR        pnorm       BPP*pnorm   QABPP   Bugs
----------------------------------------------------------------------------------------------------------------------------------------
jxl:d0:4      1321975 813579664    4.9234178   3.674  18.429          nan 100.00000000  99.99   0.00000000  0.000000000000   4.923      0
jxl:d0:5      1321975 785292105    4.7522343   2.437  17.808          nan 100.00000000  99.14   0.00000000  0.000000000000   4.752      0
jxl:d0:6      1321975 772569948    4.6752455   1.743  16.399          nan 100.00000000  99.21   0.00000000  0.000000000000   4.675      0
jxl:d0:7      1321975 745875196    4.5137009   1.169  15.259          nan 100.00000000  99.21   0.00000000  0.000000000000   4.514      0
jxl:d0:8      1321975 731995894    4.4297096   0.405  14.482          nan 100.00000000  99.47   0.00000000  0.000000000000   4.430      0
jxl:d0:9      1321975 723616280    4.3789999   0.207  14.115          nan 100.00000000  99.15   0.00000000  0.000000000000   4.379      0
Aggregate:    1321975 761509591    4.6083132   1.073  16.001   0.00000000 100.00000000  99.36   0.00000000  0.000000000000   4.608      0

Overall a 0.25% compression improvement at default effort.

Most of the bytes in the full QOI test set go to photographic images, where little or nothing changes; for non-photographic images (especially small ones) the improvement can be larger.

For example, for the specific subset icon64 (a set of 64x64 rendered vector icons), the improvement is quite significant:

icon64

Before:
benchmark_xl v0.10.0 453f112 [NEON]
12 total threads, 1278 tasks, 12 threads, 0 inner threads

Encoding      kPixels    Bytes          BPP  E MP/s  D MP/s     Max norm  SSIMULACRA2   PSNR        pnorm       BPP*pnorm   QABPP   Bugs
----------------------------------------------------------------------------------------------------------------------------------------
jxl:d0:4          872   708994    6.5011920   1.366   7.894          nan 100.00000000  99.99   0.00000000  0.000000000000   6.501      0
jxl:d0:5          872   681801    6.2518431   0.984   7.347          nan 100.00000000  99.99   0.00000000  0.000000000000   6.252      0
jxl:d0:6          872   660630    6.0577135   0.779   6.804          nan 100.00000000  99.99   0.00000000  0.000000000000   6.058      0
jxl:d0:7          872   653946    5.9964239   0.590   7.291          nan 100.00000000  99.99   0.00000000  0.000000000000   5.996      0
jxl:d0:8          872   610334    5.5965192   0.248   6.321          nan 100.00000000  99.99   0.00000000  0.000000000000   5.597      0
jxl:d0:9          872   649023    5.9512819   0.126   6.126          nan 100.00000000  99.99   0.00000000  0.000000000000   5.951      0
Aggregate:        872   660092    6.0527830   0.518   6.937   0.00000000 100.00000000  99.99   0.00000000  0.000000000000   6.053      0

After:
benchmark_xl v0.10.0 8c7f3833 [NEON]
12 total threads, 1278 tasks, 12 threads, 0 inner threads

Encoding      kPixels    Bytes          BPP  E MP/s  D MP/s     Max norm  SSIMULACRA2   PSNR        pnorm       BPP*pnorm   QABPP   Bugs
----------------------------------------------------------------------------------------------------------------------------------------
jxl:d0:4          872   652203    5.9804412   1.181   5.094          nan 100.00000000  99.99   0.00000000  0.000000000000   5.980      0
jxl:d0:5          872   607202    5.5678000   0.799   4.991          nan 100.00000000  99.99   0.00000000  0.000000000000   5.568      0
jxl:d0:6          872   598654    5.4894183   0.701   5.025          nan 100.00000000  99.99   0.00000000  0.000000000000   5.489      0
jxl:d0:7          872   591846    5.4269916   0.524   4.684          nan 100.00000000  99.99   0.00000000  0.000000000000   5.427      0
jxl:d0:8          872   566868    5.1979533   0.229   4.745          nan 100.00000000  99.99   0.00000000  0.000000000000   5.198      0
jxl:d0:9          872   555405    5.0928422   0.119   4.747          nan 100.00000000  99.99   0.00000000  0.000000000000   5.093      0
Aggregate:        872   594564    5.4519160   0.460   4.878   0.00000000 100.00000000  99.99   0.00000000  0.000000000000   5.452      0

(9.5% compression improvement at default effort for these icons)

Set of large manga images

Before:
benchmark_xl v0.10.0 453f112 [NEON]
12 total threads, 246 tasks, 0 threads, 8 inner threads

Encoding      kPixels    Bytes          BPP  E MP/s  D MP/s     Max norm  SSIMULACRA2   PSNR        pnorm       BPP*pnorm   QABPP   Bugs
----------------------------------------------------------------------------------------------------------------------------------------
jxl:d0:4       299731 166674831    4.4486396  16.941  90.134          nan 100.00000000  99.99   0.00000000  0.000000000000   4.449      0
jxl:d0:5       299731 159307724    4.2520076  12.261  90.202          nan 100.00000000  99.99   0.00000000  0.000000000000   4.252      0
jxl:d0:6       299731 156628791    4.1805055   8.369  81.711          nan 100.00000000  99.99   0.00000000  0.000000000000   4.181      0
jxl:d0:7       299731 151337716    4.0392839   5.807  76.123          nan 100.00000000  99.99   0.00000000  0.000000000000   4.039      0
jxl:d0:8       299731 148798488    3.9715105   2.095  72.507          nan 100.00000000  99.99   0.00000000  0.000000000000   3.972      0
jxl:d0:9       299731 147334255    3.9324294   1.075  70.774          nan 100.00000000  99.99   0.00000000  0.000000000000   3.932      0
Aggregate:     299731 154871432    4.1336006   5.323  79.865   0.00000000 100.00000000  99.99   0.00000000  0.000000000000   4.134      0

After:
benchmark_xl v0.10.0 8c7f3833 [NEON]
12 total threads, 246 tasks, 0 threads, 8 inner threads

Encoding      kPixels    Bytes          BPP  E MP/s  D MP/s     Max norm  SSIMULACRA2   PSNR        pnorm       BPP*pnorm   QABPP   Bugs
----------------------------------------------------------------------------------------------------------------------------------------
jxl:d0:4       299731 166674831    4.4486396  17.012  88.900          nan 100.00000000  99.99   0.00000000  0.000000000000   4.449      0
jxl:d0:5       299731 157296296    4.1983215  11.776  85.809          nan 100.00000000  99.99   0.00000000  0.000000000000   4.198      0
jxl:d0:6       299731 154674529    4.1283452   8.202  78.467          nan 100.00000000  99.99   0.00000000  0.000000000000   4.128      0
jxl:d0:7       299731 149960989    4.0025383   5.604  73.752          nan 100.00000000  99.99   0.00000000  0.000000000000   4.003      0
jxl:d0:8       299731 147630150    3.9403270   2.065  70.100          nan 100.00000000  99.99   0.00000000  0.000000000000   3.940      0
jxl:d0:9       299731 145998497    3.8967773   1.045  67.579          nan 100.00000000  99.99   0.00000000  0.000000000000   3.897      0
Aggregate:     299731 153550597    4.0983468   5.204  77.044   0.00000000 100.00000000  99.99   0.00000000  0.000000000000   4.098      0

About 1% smaller at default effort.
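
The default-effort percentages quoted for the three test sets can be reproduced from the `Bytes` column of the `jxl:d0:7` rows (effort 7 is libjxl's default). The helper below is a small sketch for checking that arithmetic; the function name is an assumption and is not part of benchmark_xl.

```cpp
#include <cstdint>

// Relative size reduction, in percent, when going from `before` to `after`.
double PercentSmaller(std::uint64_t before, std::uint64_t after) {
  return 100.0 *
         (static_cast<double>(before) - static_cast<double>(after)) /
         static_cast<double>(before);
}

// Byte counts taken from the jxl:d0:7 rows of the tables above:
//   QOI test set: PercentSmaller(747782641, 745875196)
//   icon64:       PercentSmaller(653946, 591846)
//   manga:        PercentSmaller(151337716, 149960989)
```

These evaluate to roughly 0.255%, 9.50% and 0.91%, in line with the figures stated for each set.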

lib/jxl/enc_modular.cc

lib/jxl/modular/transform/enc_palette.cc

jonsneyers added the labels encoder (Related to the libjxl encoder) and density (Related to compression density) Mar 13, 2024

tweak palette heuristics

33740ad

jonsneyers force-pushed the palette_tweaks branch from 0eea66d to 33740ad March 13, 2024 18:41

veluca93 reviewed Mar 13, 2024

lib/jxl/enc_modular.cc (outdated, resolved)

veluca93 reviewed Mar 13, 2024

lib/jxl/modular/transform/enc_palette.cc (resolved)

make low-effort bpp estimate for the max_color heuristic depend on bitdepth/nb_chans

3a998c1

veluca93 approved these changes Mar 14, 2024

jonsneyers added this pull request to the merge queue Mar 15, 2024

Merged via the queue into libjxl:main with commit b49cdd5 Mar 15, 2024

jonsneyers mentioned this pull request Mar 28, 2024

Lossless density regression #3441

Closed

jonnyawsom3 added a commit to jonnyawsom3/libjxl that referenced this pull request Apr 3, 2025

Allow changing Delta Palette predictor

f724c72

Flyby fix to reinstate libjxl#627 after being reverted in libjxl#3420
