这是indexloc提供的服务,不要输入任何密码
Skip to content

Conversation

@TedTed
Copy link
Contributor

@TedTed TedTed commented Oct 15, 2025

This seemed like a simple first step to implementing #23, and start moving logic from the measurement visitor to a series of rewrite rules. This PR does this for noise selection, annotating the QueryExprs corresponding to numerical aggregations with a core_mechanism field, and using this field during compilation.

It also creates the basic plumbing for rule rewriting, similarly to what we do for KeySets. As part of that plumbing, it also adds a new field to all QueryExpr containing compilation info, and a rule to fill this field, so the rewrite rules can access the information they need.

That last change requires us to make an additional change to QueryExpr classes, and use the kw_only=True argument, which requires all callers to create QueryExpr using keywords for all arguments. Otherwise, dataclasses cannot have a field with a default value in a base class and fields with no default values in a subclass — the alternative is to have default values everywhere, which seems error-prone, or switch to attrs, which seems more complicated. This change makes a bunch of tests a little more verbose/explicit (and is responsible for most of the line changes in this PR).

As a smaller drive-by fix, it changes ReplaceInfinity to use a custom __post_init__ instead of a custom __init__, so every QueryExpr works the same way. The core logic for noise selection has also been made slightly simpler (without any behavior change).

@TedTed TedTed marked this pull request as draft October 15, 2025 07:21
@Maegereg
Copy link
Contributor

Some initial thoughts:

  • I'm curious why you're storing the compilation info on each QueryExpr, rather than passing them as arguments to the rewrite functions? The latter approach seems simpler to me.
  • I had initially assumed that noise addition could become a node in the QueryExpr tree, rather than an attribute on QueryExprs. I see why this approach makes more sense to start with, but having a separate node seems to match the underlying Core components more closely, so I'd be curious what you think about that alternative.

@TedTed
Copy link
Contributor Author

TedTed commented Oct 19, 2025

  • I'm curious why you're storing the compilation info on each QueryExpr, rather than passing them as arguments to the rewrite functions? The latter approach seems simpler to me.

You're right and I did a lot of work for no good reason :D I'll close this PR and use the simpler approach instead: #94 (not ready yet, I need to fix tests).

  • I had initially assumed that noise addition could become a node in the QueryExpr tree, rather than an attribute on QueryExprs. I see why this approach makes more sense to start with, but having a separate node seems to match the underlying Core components more closely, so I'd be curious what you think about that alternative.

This is probably worth more discussion — I think it might be good to do that, but this would move this logic from Core to Analytics, which I think is less consensual than changing the logic in Analytics. So I will start with the latter and then we'll see whether it makes sense to put even more in rewrite rules.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants