DC - Add MIME sink pattern reuse cache. #1392

chri50 · 2025-10-17T13:00:13Z

Based on #1382

So, I have worked on the cupsFilter instead of the entire PPD cache.

Printers with an identical incoming filter graph to the same sink (normalising printer/* → printer/sink) now share the computed list of supported MIME types instead of recomputing it per printer. The cache key is a hash of a canonicalised, sorted edge signature (super/type, cost, maxsize, program hash). This reduces scheduler startup and printer-addition overhead when many printers have identical filter chains, with a safe fallback when the cache is disabled or on a cache miss.
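As an illustration of the signature idea only (this is not the code in the PR), a minimal sketch might sort the incoming edges into a canonical order and feed each field through FNV-1a. The `sink_edge_t` record, the `sink_signature` function, and the choice of FNV-1a are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical edge record: one filter leading into a sink type. */
typedef struct
{
  const char *super;        /* Source super-type, e.g. "application" */
  const char *type;         /* Source type, e.g. "pdf" */
  int         cost;         /* Relative filter cost */
  size_t      maxsize;      /* Maximum file size, 0 = unlimited */
  uint32_t    program_hash; /* Hash of the filter program path */
} sink_edge_t;

/* Total order over edges so that equal edge sets sort identically. */
static int
compare_edges(const void *a, const void *b)
{
  const sink_edge_t *ea = (const sink_edge_t *)a;
  const sink_edge_t *eb = (const sink_edge_t *)b;
  int diff;

  if ((diff = strcmp(ea->super, eb->super)) != 0)
    return (diff);
  if ((diff = strcmp(ea->type, eb->type)) != 0)
    return (diff);
  if (ea->cost != eb->cost)
    return (ea->cost < eb->cost ? -1 : 1);
  if (ea->maxsize != eb->maxsize)
    return (ea->maxsize < eb->maxsize ? -1 : 1);
  if (ea->program_hash != eb->program_hash)
    return (ea->program_hash < eb->program_hash ? -1 : 1);
  return (0);
}

/* 64-bit FNV-1a over a byte range, continuing from "hash". */
static uint64_t
fnv1a(uint64_t hash, const void *data, size_t len)
{
  const unsigned char *p = (const unsigned char *)data;

  while (len-- > 0)
  {
    hash ^= *p++;
    hash *= 1099511628211ULL;
  }
  return (hash);
}

/* Sorts the edges in place, then hashes every field of every edge.
 * Integers are hashed as raw bytes, so the value is only meaningful
 * within one process (fine for an in-memory cache key). */
uint64_t
sink_signature(sink_edge_t *edges, size_t num_edges)
{
  uint64_t hash = 14695981039346656037ULL; /* FNV-1a offset basis */
  size_t   i;

  assert(edges != NULL || num_edges == 0);

  if (num_edges > 1)
    qsort(edges, num_edges, sizeof(sink_edge_t), compare_edges);

  for (i = 0; i < num_edges; i ++)
  {
    int64_t  cost    = edges[i].cost;
    uint64_t maxsize = (uint64_t)edges[i].maxsize;

    /* Include the terminating NUL so "ab"+"c" differs from "a"+"bc". */
    hash = fnv1a(hash, edges[i].super, strlen(edges[i].super) + 1);
    hash = fnv1a(hash, edges[i].type, strlen(edges[i].type) + 1);
    hash = fnv1a(hash, &cost, sizeof(cost));
    hash = fnv1a(hash, &maxsize, sizeof(maxsize));
    hash = fnv1a(hash, &edges[i].program_hash, sizeof(edges[i].program_hash));
  }
  return (hash);
}
```

Two printers whose edge lists contain the same entries in a different order produce the same signature, so the second printer can look up the list computed for the first. A real cache would still need to compare the edges on a hit, since distinct edge sets can collide on the same hash.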

Core testing results:
Base: opencups 2.4.7 with the MIME sink pattern applied, with 6947 print queues set up using four different PPDs.

Before: ~32 minutes
After:   ~10 seconds

chri50 · 2025-10-17T13:05:05Z

With CUPS_MIME_SINK_REUSE=1, the result is stored for reuse by future printers that have the same incoming edges to the sink.

Please let me know what you think.
Thanks.

michaelrsweet

OK, so I'm going to spend a bit more time reviewing these changes. Some initial impressions:

1. I like the more general implementation and reduction in algorithmic complexity.
2. I'm wondering whether we could take advantage of the cupsArray API's hashed array support rather than re-implementing a hashed storage container.
3. In a similar vein, optimizing the implementation of mimeFilter* which has a rough complexity of O(n^4) would dramatically improve overall performance for the non-cached cases.
4. I wonder if it would be worth saving the cache and keep track of the mtime or hash of .types and .convs files?

Finally, whatever we end up with we'll just leave it turned on.

michaelrsweet · 2025-10-17T17:43:10Z

I've taken the liberty of updating the MIME unit test program (testmime.c) so we can do some isolated profiling/tuning:

[master 2fb742f] Update MIME unit test:

The current CUPS 2.5 code seems to take about 0.3ms to generate the array of source types for a given PPD on my MacBook Pro (M4 Max), 0.75ms on a Raspberry Pi 5.

chri50 · 2025-10-20T13:14:56Z

I really appreciate your feedback, thx!

> I'm wondering whether we could take advantage of the cupsArray API's hashed array support rather than re-implementing a hashed storage container.

I'll take the cupsArray advice into account.

> In a similar vein, optimizing the implementation of mimeFilter* which has a rough complexity of O(n^4) would dramatically improve overall performance for the non-cached cases.

I thought caching would be a more defensive approach. You're right, the growth of the graph database and the increase in runtime per printer is a tough topic. I'm curious whether this is still an issue in CUPS 3.

> I wonder if it would be worth saving the cache and keep track of the mtime or hash of .types and .convs files?

I assume you're talking about precompiling the MIME graph when CUPS starts. It depends a lot on the specific setup, and I think it would bring more drawbacks than benefits.

Thank you for your support.

michaelrsweet · 2025-10-20T17:10:58Z

@chri50

>> In a similar vein, optimizing the implementation of mimeFilter* which has a rough complexity of O(n^4) would dramatically improve overall performance for the non-cached cases.
>
> I thought caching would be a more defensive approach. You're right, the growth of the graph database and the increase in runtime per printer is a tough topic. I'm curious whether this is still an issue in CUPS 3.

CUPS 3 is largely based on PAPPL 2.0, which offers a much simpler filter architecture without chaining. The complexity of building the supported document formats is only O(n) there - one loop through the filters, and we list anything that converts to the printer's native format or to PWG Raster (image/pwg-raster) which is the "internal" raster format we use for the "drivers" there whether the printer takes PWG or Apple (image/urf) raster.
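To make the single-loop idea concrete, here is a minimal sketch. It does not use the PAPPL API: `simple_filter_t` and `list_supported_formats` are hypothetical names invented for this example, and only the `image/pwg-raster` and `image/urf` type names come from the comment above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical one-step filter: converts "src" directly to "dst". */
typedef struct
{
  const char *src;  /* Source MIME media type */
  const char *dst;  /* Destination MIME media type */
} simple_filter_t;

/*
 * One pass over the filters: keep every source type whose filter produces
 * either the printer's native format or PWG Raster.  There is no chaining,
 * so the work is O(n) in the number of filters.  Duplicate source types
 * are not removed in this sketch.
 */
size_t
list_supported_formats(
    const simple_filter_t *filters,      /* Available filters */
    size_t                num_filters,   /* Number of filters */
    const char            *native,       /* Printer's native format */
    const char            **formats,     /* Output array */
    size_t                max_formats)   /* Size of output array */
{
  size_t i, count = 0;

  assert(native != NULL && formats != NULL);
  assert(filters != NULL || num_filters == 0);

  for (i = 0; i < num_filters && count < max_formats; i ++)
  {
    if (!strcmp(filters[i].dst, native) ||
        !strcmp(filters[i].dst, "image/pwg-raster"))
      formats[count ++] = filters[i].src;
  }

  return (count);
}
```

With filters for application/pdf → image/pwg-raster, image/jpeg → image/urf, and text/plain → application/postscript, a printer whose native format is image/urf would list application/pdf and image/jpeg, and the text/plain filter would be ignored because nothing chains PostScript onward.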

>> I wonder if it would be worth saving the cache and keep track of the mtime or hash of .types and .convs files?
>
> I assume you're talking about precompiling the MIME graph when CUPS starts. It depends a lot on the specific setup, and I think it would bring more drawbacks than benefits.

My gut agrees but wanted to put it out there.

chri50 · 2025-10-27T13:24:20Z

> 2. I'm wondering whether we could take advantage of the cupsArray API's hashed array support rather than re-implementing a hashed storage container.

I rewrote the mime-sink-pattern to adopt the cupsArray features and simplify the cache hit method, as well as eliminate unnecessary memory waste.

Current E2E testing shows significant performance improvements similar to those seen in the first attempt.

Please let me know what you think.
Thanks.

michaelrsweet · 2025-10-27T20:59:12Z

@chri50 Thanks for this, I'm heads down on some other stuff ATM but will circle back to this in the next day or so.

michaelrsweet · 2025-11-12T19:47:38Z

OK, so I'm finally taking a fresh look and I want to further simplify this. The vast bulk of your current PR is code to calculate a hash of the different filters' values to avoid re-running mimeFilter over N file types, and honestly the result is pretty hard to follow/maintain.

I'm thinking about having a parallel array that tracks destination media types with a list of source types, so we can build out a list of supported document formats more quickly. For this we don't need to worry about cost or max size (it is just answering the question "will it blend?" and not how) and it would provide caching behavior...
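One way to picture such a destination-to-sources index is sketched below. This is not the mimeGetFilterTypes implementation: types are reduced to small integer IDs, the arrays are fixed-size, and `type_edge_t`, `dst_index_t`, `build_dst_index`, and `reachable_sources` are names made up for the example. Cost and maximum size are deliberately left out, since the only question is whether a source type can reach the sink at all.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TYPES   32  /* Illustrative limits for the fixed-size arrays */
#define MAX_FILTERS 64

typedef struct
{
  int src;  /* Source type ID */
  int dst;  /* Destination type ID */
} type_edge_t;

/* For destination d, its sources are srcs[start[d]] .. srcs[start[d+1]-1]. */
typedef struct
{
  size_t start[MAX_TYPES + 1];
  int    srcs[MAX_FILTERS];
} dst_index_t;

/* Groups all filter edges by destination type (a counting sort). */
void
build_dst_index(dst_index_t *idx, const type_edge_t *edges, size_t num_edges)
{
  size_t i, fill[MAX_TYPES];
  int    d;

  assert(idx != NULL && num_edges <= MAX_FILTERS);
  memset(idx, 0, sizeof(*idx));

  for (i = 0; i < num_edges; i ++)
  {
    assert(edges[i].src >= 0 && edges[i].src < MAX_TYPES);
    assert(edges[i].dst >= 0 && edges[i].dst < MAX_TYPES);
    idx->start[edges[i].dst + 1] ++;      /* Count edges per destination */
  }

  for (d = 0; d < MAX_TYPES; d ++)
    idx->start[d + 1] += idx->start[d];   /* Prefix sums give the offsets */

  memcpy(fill, idx->start, sizeof(fill));
  for (i = 0; i < num_edges; i ++)
    idx->srcs[fill[edges[i].dst] ++] = edges[i].src;
}

/*
 * Walks backwards from the sink, collecting every type that can reach it
 * through some chain of filters.  Each type is visited once, so cycles in
 * the filter graph are harmless.  "out" must hold MAX_TYPES entries.
 */
size_t
reachable_sources(const dst_index_t *idx, int sink, int *out)
{
  unsigned char seen[MAX_TYPES] = { 0 };
  int           queue[MAX_TYPES];
  size_t        head = 0, tail = 0, count = 0, i;

  assert(idx != NULL && out != NULL);
  assert(sink >= 0 && sink < MAX_TYPES);

  seen[sink]     = 1;
  queue[tail ++] = sink;

  while (head < tail)
  {
    int dst = queue[head ++];

    for (i = idx->start[dst]; i < idx->start[dst + 1]; i ++)
    {
      int src = idx->srcs[i];

      if (!seen[src])
      {
        seen[src]      = 1;
        queue[tail ++] = src;
        out[count ++]  = src;
      }
    }
  }

  return (count);
}
```

The index is built once for the whole filter database, and each printer then needs only one backward walk from its sink type, which is where the caching-like behavior would come from in a design of this shape.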


michaelrsweet · 2025-11-12T22:00:55Z

Try current GitHub master:

[master 1150b02] Add a new mimeGetFilterTypes function for getting the list of supported document formats for a printer. The new algorithm is O(n log n) vs. the old O(n^4) (Issue #1392)

The new algorithm executes an order of magnitude faster on my M4 Max MacBook Pro and two orders of magnitude faster on my 10th Gen Intel Dell XPS 13 and a Raspberry Pi Zero 2W. Let me know how it works for you...

DC - Add MIME sink pattern reuse cache.

4b54241

michaelrsweet self-assigned this Oct 17, 2025

michaelrsweet added the enhancement and priority-high labels Oct 17, 2025

michaelrsweet added this to the v2.5 milestone Oct 17, 2025

michaelrsweet requested changes Oct 17, 2025


chri50 added 2 commits October 20, 2025 09:24

Merge branch 'master' into testing

79e12e3

DC - correct includes

c0db82b

DC - update mime-sink-patterns to adopt cupsArray, simplification

81f11e5

Merge branch 'OpenPrinting:master' into feature/msink

6d04602

michaelrsweet added a commit that referenced this pull request Nov 12, 2025

Add a new mimeGetFilterTypes function for getting the list of supported document formats for a printer. The new algorithm is O(n log n) vs. the old O(n^4) (Issue #1392)

1150b02
