这是indexloc提供的服务,不要输入任何密码
Skip to content

noise synthesis generates unused Xorshiro batch when noise group size is a multiple of 16 #1888

@Traneptora

Description

@Traneptora

Describe the bug
During noise synthesis, libjxl generates one batch of 16 floating point numbers for every partial 16-pixel wide row. After filling the row, it generates another batch of 16 floats for the stragglers, in case the the group width is not a multiple of 16 (which it isn't guaranteed to be, for example if the image dimensions are odd).

However, the group width usually is a multiple of 16, and if it is, then it generates this extra batch for the stragglers. And because there are no stragglers, it does nothing with it, so for a group width, of say, 512, it generates 33 batches where only 32 are required, and the last batch is unused.

Look at the loop on line 67 of dec_noise.cc:

  for (size_t y = 0; y < ysize; ++y) {
    float* JXL_RESTRICT row = rect.Row(noise, y);

    size_t x = 0;
    // Only entire batches (avoids exceeding the image padding).
    for (; x + kFloatsPerBatch <= xsize; x += kFloatsPerBatch) { // <-- HERE, Line 72
      rng->Fill(batch);
      for (size_t i = 0; i < kFloatsPerBatch; i += Lanes(df)) {
        BitsToFloat(reinterpret_cast<const uint32_t*>(batch) + i, row + x + i);
      }
    }

    // Any remaining pixels, rounded up to vectors (safe due to padding).
    rng->Fill(batch);
    size_t batch_pos = 0;  // < kFloatsPerBatch
    for (; x < xsize; x += N) {
      BitsToFloat(reinterpret_cast<const uint32_t*>(batch) + batch_pos,
                  row + x);
      batch_pos += N;
    }
  }

On line 72 (comment added), there's a non-strict inequality. I believe this should be a strict inequality.
Notice that if xsize is a multiple of kFloatsPerBatch (i.e. 16), then we have x + 16 <= xsize as the limiting condition, which is the same thing as x < xsize, so this fills the entire row. Then rng->Fill(batch); is called again, unconditionally.

Additional Information
This behavior isn't mentioned in the spec. Section K.5.2 in the spec has the following quote:

The next 16 (or the number of remaining samples in the row, whichever is less)

It doesn't explicitly state that this is also generated if the number of remaining samples is zero, and it would be very strange to interpret it that way.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions