这是indexloc提供的服务,不要输入任何密码
Skip to content

Conversation

@peytondmurray
Copy link
Contributor

@peytondmurray peytondmurray commented Apr 12, 2025

Description

This PR adds a pre/post-transaction hooks plugin system. Pre- (post-) transaction hooks accept a subclass of Action, and are guaranteed to run before (after) every other action is run. Plugin authors must define the following methods:

  1. verify
  2. execute
  3. reverse (only if execute raises an exception)
  4. cleanup

and have access to the following attributes:

  • transaction_context
  • target_prefix
  • unlink_precs
  • link_precs
  • remove_specs
  • update_specs
  • neutered_specs

In this way, you can define plugins that have access to e.g. what links are being generated/removed and where. For tests, I wrote a few that followed pretty closely the ones I found in that same plugin test directory.

Closes #13795.

Related: #13842

Checklist - did you ...

  • Add a file to the news directory (using the template) for the next release's release notes?
  • Add / update necessary tests?
  • Add / update outdated documentation?

@github-project-automation github-project-automation bot moved this to 🆕 New in 🔎 Review Apr 12, 2025
@conda-bot conda-bot added the cla-signed [bot] added once the contributor has signed the CLA label Apr 12, 2025
@peytondmurray peytondmurray force-pushed the 13795-transaction-action-plugins branch from 73dff89 to 0204a0d Compare April 12, 2025 00:55
@peytondmurray peytondmurray changed the title [ENH] Add post transaction plugin hooks [WIP] [ENH] Add post transaction plugin hooks Apr 12, 2025
@peytondmurray peytondmurray force-pushed the 13795-transaction-action-plugins branch from 0204a0d to ddab30b Compare April 12, 2025 07:47
@codspeed-hq
Copy link

codspeed-hq bot commented Apr 12, 2025

CodSpeed Performance Report

Merging #14754 will not alter performance

Comparing peytondmurray:13795-transaction-action-plugins (ba6494b) with main (a6288c2)

Summary

✅ 21 untouched benchmarks

@jaimergp
Copy link
Contributor

jaimergp commented Apr 14, 2025

Clever! This is cleaner and might actually work. Pinging @jezdez too.

My two cents:

  • Instead of adding a new verb, run(), with similar semantics, why not rename the existing execute() as def _execute(), and then turn execute(): _execute(); post_transaction_plugins()? Alternatively, leave execute() as is and introduce a new method execute_with_plugins() which we'll use in the application layer.
  • Is it time to make _Action() public?
  • This might incur in a LOT of extra calls for IO-heavy transactions. I think some of those _Action() subclasses are created per file copy even. I worry about the potential overheads here. I'd like to see numbers / tests where something big is installed and we add a random 1-100ms sleep in the plugin.

Copy link
Member

@jezdez jezdez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with Jaime, let’s not rename the execute functions like this and let’s publish Action for sure. A few more comments in-line.

@peytondmurray peytondmurray changed the title [WIP] [ENH] Add post transaction plugin hooks [ENH] Add post transaction plugin hooks Apr 22, 2025
@peytondmurray peytondmurray force-pushed the 13795-transaction-action-plugins branch from 20e3d70 to 1130cec Compare April 22, 2025 23:30
@peytondmurray peytondmurray force-pushed the 13795-transaction-action-plugins branch from 1130cec to e71511b Compare April 22, 2025 23:33
@peytondmurray
Copy link
Contributor Author

This might incur in a LOT of extra calls for IO-heavy transactions. I think some of those _Action() subclasses are created per file copy even. I worry about the potential overheads here. I'd like to see numbers / tests where something big is installed and we add a random 1-100ms sleep in the plugin.

Okay, some benchmarks on my local machine, using the tests in tests/plugins/test_post_transactions.py. These two tests are very similar, but test_post_transaction_invoked uses a dummy plugin that calls a function that does nothing, while test_post_transaction_overhead uses a plugin that calls a function which adds a random overhead between 1-100ms. The test time is utterly dominated by the length of time it takes to solve the environment:

74.41s call     tests/plugins/test_post_transactions.py::test_post_transaction_invoked
37.40s call     tests/plugins/test_post_transactions.py::test_post_transaction_overhead
31.04s call     tests/plugins/test_post_transactions.py::test_post_transaction_raises_exception

Let me see if there's a straightforward way to mock the solve, but keep the UnlinkLinkTransaction where these hooks are being called from.

@peytondmurray
Copy link
Contributor Author

Okay, measuring execution times it looks like if the hook is called on every action, i.e. action_type=Action:

[no overhead] elapsed time: 3.9611058235168457
[random overhead] elapsed time: 402.5158312320709

But if for example the hooks use action_type=CreatePrefixRecordAction, the dummy plugin is only ~1s faster than the random overhead plugin on my local machine:

[no overhead] elapsed time: 3.776529550552368
[random overhead] elapsed time: 4.754676103591919

So it really depends on what action type is being targeted.

@peytondmurray peytondmurray marked this pull request as ready for review April 24, 2025 01:29
@peytondmurray peytondmurray requested a review from a team as a code owner April 24, 2025 01:29
@jaimergp
Copy link
Contributor

Let me see if there's a straightforward way to mock the solve

You can use an @EXPLICIT lockfile to do that via conda create --file .... There are some at https://github.com/conda/conda-libmamba-solver/tree/main/tests/data, but you can essentially place a few URLs in there and that's all.

The 400s one for action_type=Action is a bit concerning, I wonder if we should add different types of hooks:

  • post_transaction_hook: would be limited to the Action subclasses that only run a few times. This would be recommended.
  • post_transaction_slow_hook: would allow all the others.

Do you think that's a good idea? (Names can be different) Alternatively, really loud warnings in the documentation would help. Do we actually have use cases for the Action subclasses that are called VERY often?

@peytondmurray
Copy link
Contributor Author

Alternatively, we could prevent users from passing Action as a trigger... they'd then be forced to pass only the subclasses they want instead?

@jaimergp
Copy link
Contributor

I like that! I think we can easily achieve that with issubclass(X, Action) and X != Action?

@peytondmurray
Copy link
Contributor Author

Yep, thinking the same thing 🤝

@peytondmurray
Copy link
Contributor Author

Cool, so the way to do this is to use __post_init__ on the CondaPostTransaction class. That way, we only need to check once when the plugin is instantiated 👍.

@peytondmurray peytondmurray force-pushed the 13795-transaction-action-plugins branch from bd402cc to a471442 Compare May 14, 2025 17:44
@peytondmurray peytondmurray moved this from 🏗️ In Progress to 👀 In Review in 🔎 Review May 15, 2025
@travishathaway
Copy link
Contributor

travishathaway commented May 15, 2025

My one hesitation here is about having PrefixActionGroupV2 be public. Really it seems like an implementation detail, and we're already going through one deprecation cycle for its predecessor. Is there actually a reason to make this public?

If not, is there any way I can make it private? Normally I'd prefix the name of it with an underscore, but we're also deprecating _Action here too...

If it makes you feel any better, we also deprecate things with underscores in front of them:

  • conda/conda/core/index.py

    Lines 684 to 735 in a6288c2

    @deprecated("25.3", "25.9", addendum="Use `conda.core.Index.reload` instead.")
    def _supplement_index_with_prefix(
    index: Index | dict[Any, Any],
    prefix: str | os.PathLike[str] | Path | PrefixData,
    ) -> None:
    """
    Supplement the given index with information from the specified environment prefix.
    :param index: The package index to supplement.
    :param prefix: The path to the environment prefix.
    """
    # supplement index with information from prefix/conda-meta
    prefix_data = prefix if isinstance(prefix, PrefixData) else PrefixData(prefix)
    if isinstance(index, Index):
    if index.prefix_data != prefix_data:
    raise OperationNotAllowed(
    "An index can only be supplemented with its own prefix."
    )
    index.reload(prefix=True)
    return
    for prefix_record in prefix_data.iter_records():
    if prefix_record in index:
    current_record = index[prefix_record]
    if current_record.channel == prefix_record.channel:
    # The downloaded repodata takes priority, so we do not overwrite.
    # We do, however, copy the link information so that the solver (i.e. resolve)
    # knows this package is installed.
    link = prefix_record.get("link") or EMPTY_LINK
    index[prefix_record] = PrefixRecord.from_objects(
    current_record, prefix_record, link=link
    )
    else:
    # If the local packages channel information does not agree with
    # the channel information in the index then they are most
    # likely referring to different packages. This can occur if a
    # multi-channel changes configuration, e.g. defaults with and
    # without the free channel. In this case we need to fake the
    # channel data for the existing package.
    prefix_channel = prefix_record.channel
    prefix_channel._Channel__canonical_name = prefix_channel.url()
    del prefix_record._PackageRecord__pkey
    index[prefix_record] = prefix_record
    else:
    # If the package is not in the repodata, use the local data.
    # If the channel is known but the package is not in the index, it
    # is because 1) the channel is unavailable offline, or 2) it no
    # longer contains this package. Either way, we should prefer any
    # other version of the package to this one. On the other hand, if
    # it is in a channel we don't know about, assign it a value just
    # above the priority of all known channels.
    index[prefix_record] = prefix_record

So, making it "private" won't really change the way we treat it.

travishathaway
travishathaway previously approved these changes May 15, 2025
Copy link
Contributor

@travishathaway travishathaway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is looking good! Great job on even taking the time to update the "deep dives" section.

Co-authored-by: Travis Hathaway <travis.j.hathaway@gmail.com>
Comment on lines 188 to 220
@dataclass
class PrefixActionGroupV2:
"""A container for groups of actions carried out during an UnlinkLinkTransaction.
:param remove_menu_action_groups: Actions which remove menu items
:param unlink_action_groups: Actions which unlink files
:param unregister_action_groups: Actions which unregister environment locations
:param link_action_groups: Actions which link files
:param register_action_groups: Actions which register environment locations
:param compile_action_groups: Actions which compile pyc files
:param make_menu_action_groups: Actions which create menu items
:param entry_point_action_groups: Actions which create python entry points
:param prefix_record_groups: Actions which create package json files in ``conda-meta/``
:param initial_action_groups: User-defined actions which run before all other actions
:param final_action_groups: User-defined actions which run after all other actions
"""

remove_menu_action_groups: Iterable[ActionGroup]
unlink_action_groups: Iterable[ActionGroup]
unregister_action_groups: Iterable[ActionGroup]
link_action_groups: Iterable[ActionGroup]
register_action_groups: Iterable[ActionGroup]
compile_action_groups: Iterable[ActionGroup]
make_menu_action_groups: Iterable[ActionGroup]
entry_point_action_groups: Iterable[ActionGroup]
prefix_record_groups: Iterable[ActionGroup]
initial_action_groups: Iterable[ActionGroup] = ()
final_action_groups: Iterable[ActionGroup] = ()

def __iter__(self) -> Generator[Iterable[ActionGroup], None, None]:
for field in fields(self):
yield getattr(self, field.name)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we instead just make the new dataclass quack like a NamedTuple and just deprecate the tuple parts? Something like:

from dataclasses import dataclass, fields, astuple, field
from typing import Iterable, Generator, Any, Iterator, Tuple

from conda.deprecations import deprecated

def deprecate_PrefixActionGroup_tuple() -> None:
    deprecated.topic(
        "25.9",
        "26.3",
        topic="`conda.core.link.PrefixActionGroup` as a NamedTuple",
        addendum="Utilize `conda.core.link.PrefixActionGroup` as a dataclass instead.",
    )

@dataclass
class PrefixActionGroup:
    remove_menu_action_groups: Iterable[ActionGroup]
    unlink_action_groups: Iterable[ActionGroup]
    unregister_action_groups: Iterable[ActionGroup]
    link_action_groups: Iterable[ActionGroup]
    register_action_groups: Iterable[ActionGroup]
    compile_action_groups: Iterable[ActionGroup]
    make_menu_action_groups: Iterable[ActionGroup]
    entry_point_action_groups: Iterable[ActionGroup]
    prefix_record_groups: Iterable[ActionGroup]
    initial_action_groups: Iterable[ActionGroup] = ()
    final_action_groups: Iterable[ActionGroup] = ()
    
    # quack like a tuple
    def __getitem__(self, index):
        deprecate_PrefixActionGroup_tuple()
        return astuple(self)[index]
    
    def __iter__(self) -> Iterator[Iterable[ActionGroup]]:
        deprecate_PrefixActionGroup_tuple()
        return iter(astuple(self))
    
    def __len__(self) -> int:
        deprecate_PrefixActionGroup_tuple()
        return len(fields(self))
    
    def __iter_fields__(self) -> Generator[Iterable[ActionGroup], None, None]:
        yield from astuple(self)
    
    def __eq__(self, other):
        if isinstance(other, tuple):
            deprecate_PrefixActionGroup_tuple()
            return astuple(self) == other
        return super().__eq__(other)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We definitely can, but is this actually substantially better than what we have? Yes, it does avoid us having a PrefixActionGroupV2, but making it quack like a NamedTuple doesn't actually make it a NamedTuple. So if the primary concern here is about backwards compatibility, issubclass(PrefixActionGroup, tuple) == False whereas issubclass(PrefixActionGroupV2, tuple) == True. And to my knowledge we are not using __len__, __iter_fields__, __eq__, or __getitem__, so I just don't see the need here.

We could probably cook up some multiple inheritance scheme to get this to actually behave like a namedtuple. If other folks are really interested in having this or the suggestion above be the path forward I'm happy to do it, but again for what is effectively an internal only implementation detail I just don't see the point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can define __instancecheck__ and __subclasscheck__ magic methods to address isinstance and issubclass

🤷🏼‍♂️ just trying to think of alternatives to needing to introduce a V2 suffix

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we instead do something like this?

  • Introduce PrefixActionGroupDataclass, already deprecated and scheduled to be renamed as simply PrefixActionGroup
  • Deprecate the old PrefixActionGroup, which will be replaced by PrefixActionGroupDataclass when time comes
  • We start using PrefixActionGroupDataclass internally

Copy link
Contributor

@jaimergp jaimergp May 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although this is a bit like mutually canceling deprecations; downstream users won't have the chance to change their code in time 😬 I'd rather stick to well-known APIs instead of trying to each objects how to be have like others. Even if we make it quack like a tuple, the amount and order of elements is different, so it's still an API breakage.

I say we keep the solution as is. If we don't like the V2 suffix, I think Dataclass is descriptive enough. Other alternatives:

  • PrefixActions
  • GroupedPrefixActions
  • PrefixActionGrouper
  • ActionGroupsForPrefix

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

☝️ I vote for what @jaimergp suggested. Let's choose one of the names above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I vote for PrefixActions.

kenodegard
kenodegard previously approved these changes May 15, 2025
Copy link
Member

@jezdez jezdez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's fix those names and then merge this thing. This is awesome work @peytondmurray!

@github-project-automation github-project-automation bot moved this from 👀 In Review to 🏗️ In Progress in 🔎 Review May 16, 2025
@peytondmurray
Copy link
Contributor Author

Okay, here's where we stand:

  • PrefixActionGroup is deprecated, with suggestion to use PrefixActionGroupDataclass instead
  • PrefixActionGroupV2 renamed to PrefixActionGroupDataclass
  • PrefixActionGroupDataclass deprecated, to be renamed in 26.3 to PrefixActionGroup, following @jaimergp's suggestion

Thanks folks for the support!

@jezdez
Copy link
Member

jezdez commented May 16, 2025

Sorry, not PrefixActionGroupDataclass, that's not needed, PrefixActions is a better name

@jezdez
Copy link
Member

jezdez commented May 16, 2025

I'm noticing an increasing number of Java-style class names, and I don't like it ;)

@peytondmurray
Copy link
Contributor Author

100% agreed 🙅 ☕ 🙅

@peytondmurray peytondmurray requested a review from jezdez May 16, 2025 20:00
@jezdez jezdez dismissed jaimergp’s stale review May 17, 2025 11:40

The concerns have been taken care of

@jezdez jezdez merged commit 02956f9 into conda:main May 17, 2025
75 checks passed
@github-project-automation github-project-automation bot moved this from 🏗️ In Progress to 🏁 Done in 🔎 Review May 17, 2025
@beckermr
Copy link
Contributor

@peytondmurray We need to pay much closer attention to detail here. The PrefixAction class is marked as deprecated and points to PrefixActionGroup, but below the PrefixActionGroup is marked as deprecated with a message that says Use PrefixActions instead.

I think the intent is to rename one to the other. However, this doesn't really work since the deprecation warnings are circular and so downstream users cannot switch from one class to the other to remove the warnings.

If the two offer identical functionality and are drop-in replacements, a workable approach is to instead simply do the replacement now and leave PrefixActions as a pointer to PrefixActionGroup with a deprecation warning on PrefixActions only.

If that won't work, then you'll need to go through two deprecation cycles or pick a new unique name or something.

@peytondmurray
Copy link
Contributor Author

@peytondmurray We need to pay much closer attention to detail here. The PrefixAction class is marked as deprecated and points to PrefixActionGroup, but below the PrefixActionGroup is marked as deprecated with a message that says Use PrefixActions instead.

I wouldn't call this an inattention to detail - you'll notice that the original implementation here included a PrefixActionGroupV2 which didn't have any deprecation. After the discussion with 4 code owners above, group consensus was to move forward with the cyclic deprecation, and rename this object now. I understand your objection: this results in a deprecation that in some sense users are unable to solve, but in practice these are PendingDeprecationWarning which many users will never see, as the default warning filter ignores it. If we don't want to be doing this I'm happy to revert that change, and we should escalate warnings into errors across the test suite (as in #12771).

I think the intent is to rename one to the other. However, this doesn't really work since the deprecation warnings are circular and so downstream users cannot switch from one class to the other to remove the warnings.

If the two offer identical functionality and are drop-in replacements, a workable approach is to instead simply do the replacement now and leave PrefixActions as a pointer to PrefixActionGroup with a deprecation warning on PrefixActions only.

If that won't work, then you'll need to go through two deprecation cycles or pick a new unique name or something.

As discussed above PrefixActions is not a drop in replacement for PrefixActionGroup, so we'll either need to live with a new name for this or go through two deprecation cycles. Looks like this is already being handled in #12771?

@beckermr
Copy link
Contributor

this results in a deprecation that in some sense users are unable to solve

They literally cannot solve it. We can't ship warnings that people cannot solve by following the instructions in the warnings.

As discussed above PrefixActions is not a drop in replacement for PrefixActionGroup

OK. Then I will adjust the deprecation of one to be after the other or we can ignore one of the warnings for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla-signed [bot] added once the contributor has signed the CLA

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

Plugin system: transaction action plugins

8 participants