这是indexloc提供的服务,不要输入任何密码
Skip to content

Conversation

@ifitchet
Copy link
Contributor

@ifitchet ifitchet commented Mar 4, 2024

Description

Rationale

MSYS2 supports multiple environments including several native compilation environments as well as the POSIX-oriented MSYS environment and this PR aims provide a cleaner mechanism to support them.

The native compilation environments differ in regards to which compiler suite is used (GCC or LLVM) and which C/C++ runtimes are used.

History

conda will only support a single native compilation environment plus the MSYS environment. One of the MSYS2 mingw32 or mingw64 native compilation environments is reworked into {prefix}/Library/mingw-w64 and the MSYS environment is placed in {prefix}/Library becoming, in effect, {prefix}/Library/usr/*.

(The mingw-w64 name is the base of the MSYS2 native compilation environment recipes, eg. mingw-w64-zlib.)

The following (pertinent) directories are added to the user's PATH in this order:

  1. {prefix}/Library/mingw-w64/bin - the MSYS2 native compilation environment binaries
  2. {prefix}/Library/usr/bin - the MSYS2 POSIX environment binaries
  3. {prefix}/Library/bin - the conda package binary install location

NOTE: The final item, {prefix}/Library/bin, is problematic. Internally, MSYS2 has nothing in /bin and mounts /usr/bin onto /bin (presumably to enable the usage of #! /bin/bash etc.) but /usr/bin is {prefix}/Library/usr/bin and /bin is {prefix}/Library/bin. That is, MSYS2 utilities may have the conda package binary install location obscured with an overlying mount of {prefix}/Library/usr/bin.

Proposal

The proposal is for conda to support any of the 64-bit MSYS2 native compilation environments and always include the existing mingw-w64 system for backwards compatibility. No change will occur to the MSYS environment.

The new environments will use the same distinguishing directory as the MSYS2 distribution, so the UCRT64 environment will use a {prefix}/Library/ucrt64 directory, CLANG64 the {prefix}/Library/clang64 directory etc..

There is an immediate benefit here as the "repack" of the MSYS2 tarball into a conda archive becomes a far simpler mv.

The mechanism will be for a dynamic lookup of the native compilation environment distinguishing directories plus the current mingw-w64 directory.

Observations

Order and PATH Insertion
  1. there is an order to the dynamic lookup of the distinguishing directories: [ucrt64, clang64, mingw64, clangarm64]

    Each {dir} in the above list is looked for as {prefix}/Library/{dir}.

    1. the first detected directory is the only one put on the user's PATH as {prefix}/Library/{dir}/bin

      Extra matching directories will elicit a warning but will not be put on the user's PATH.

    2. if no distinguishing directory is found then nothing will be put on the user's PATH

  2. Then {prefix}/mingw-w64/bin, {prefix}/usr/bin and {prefix}/bin will be put on the user's PATH as usual

Existing conda Behaviour

If nothing else is done the existing conda behaviour will largely continue unchanged: {prefix}/Library/mingw-w64/bin will be put on the user's PATH.

Packages Bundling MSYS2

The main/git package (and others?) is a repack of a Git For Windows release which bundles an entire MSYS2 tree internally. This currently includes the underlying MSYS2 mingw64 hierarchy.

Currently, conda does not add the mingw64/bin directory to the user's PATH. (If you run the supplied Git Bash Start Menu item then /mingw64/bin is added to that shell's PATH -- Git Bash appears to be a re-spin of the MSYS2 MSYS2 MINGW64 Start Menu item.)

The conda packaging puts this tree in {prefix}/Library meaning that {prefix}/Library/mingw64 may be picked up as a distinguishing directory and {prefix}/Library/mingw64/bin will be put on the user's PATH:

  • if no prior distinguishing directories exist then {prefix}/Library/mingw64/bin will be put on PATH ahead of the (potentially non-existent) {prefix}/Library/mingw-w64/bin

    Here the user will have the git-related executables appear on their PATH which they never had had previously.

    NOTE If the {prefix}/Library/mingw-w64 existed and had similarly named executables in it then they would not be visible any more. See the section on Anaconda below.

  • if a prior distinguishing directory does exist, say, clang64, then {prefix}/Library/clang64 will be used instead of {prefix}/Library/mingw64

    Here the user would not see the git-related executables just as previously.

(The Git For Windows people don't seem inclined to switch to UCRT64 at the moment.)

Anaconda Distribution

Anaconda distribution does come bundled with a {prefix}/Library/mingw-w64 directory which contains jq.exe and libwinpthread-1.dll (the m2w64-jq MSYS2 package and dependency). If someone installs main/git (or a similar package with a bundled MSYS2) then those executables may be obscured.

Package mutexes

Packages in the native compilation environments should associate themselves with a discriminating mutex to minimise conflicts. We would propose msys2-mingw-w64-mutex with a build string of ucrt64, clang64 etc..

The rationale for the package mutex is that whilst the code changes to activate.py will only put one native compilation environment on the PATH it is better all round if you can be prevented from installing potentially confusing libraries in the same conda environment in the first place.

There is a question here about backporting mutexes for, say, the existing m2w64-* packages which is raised but left unanswered. A mutex build string of m2w64 will be defined for m2w64-* packages.

Similarly, a mutex for the MSYS (POSIX) environment, msys2-msys-mutex has been defined with a build string of msys2. A variant with a mutex build string of m2 will be defined for m2-* packages.

Dynamic Lookups vs CLI flags, Environment Variables

It was not the intention to have dynamic lookups but it turns out to be necessary as, for example, tests in multi-output recipes are unable to be directed as to which native compilation environment should be used.

conda activate {env} could use a CLI flag or an environment variable reasonably easily although there is the risk that the user's choice could conflict with the conda environment's actual state.

Unix to Native Prefixes

In practice, conda activate|deactivate supply the code with a Windows native pathname, eg. C:\path\to\prefix however during a build, the supplied {prefix} is a Unix-like pathname, eg. /c/path/to/prefix. The code catches both MSYS2-style Unix pathnames, /c/..., and Cygwin-style Unix pathnames, /cygdrive/c/....

Ostensibly this shouldn't matter except that the Python os.isdir() function only works for Windows native pathnames.

Consequently the previously existing native_path_to_unix() has been mirrored into a unix_path_to_native() function.

Checklist - did you ...

  • Add a file to the news directory (using the template) for the next release's release notes?
  • Add / update necessary tests?
  • Add / update outdated documentation?

@ifitchet ifitchet requested a review from a team as a code owner March 4, 2024 17:35
@conda-bot
Copy link
Contributor

We require contributors to sign our Contributor License Agreement and we don't have one on file for @ifitchet.

In order for us to review and merge your code, please e-sign the Contributor License Agreement PDF. We then need to manually verify your signature, merge the PR (conda/infrastructure#888), and ping the bot to refresh the PR.

@codspeed-hq
Copy link

codspeed-hq bot commented Mar 4, 2024

CodSpeed Performance Report

Merging #13649 will not alter performance

Comparing ifitchet:cleaner-msys2-support (5652de9) with main (c1af45f)

Summary

✅ 21 untouched benchmarks

@ifitchet
Copy link
Contributor Author

ifitchet commented Mar 4, 2024

Should I renumber the news article to reflect the PR#?

@beeankha
Copy link
Member

beeankha commented Mar 4, 2024

@conda-bot check

@conda-bot conda-bot added the cla-signed [bot] added once the contributor has signed the CLA label Mar 4, 2024
@beeankha
Copy link
Member

beeankha commented Mar 4, 2024

Should I renumber the news article to reflect the PR#?

Yes, that's fine; alternatively, the number in the name of the news file can be the number associated with the related GitHub issue.

@ifitchet ifitchet force-pushed the cleaner-msys2-support branch from 3dfc75c to 5f9e9b2 Compare April 16, 2024 13:03
@travishathaway travishathaway requested a review from a team April 16, 2024 13:35

win_path = (
re.sub(
r"([a-zA-Z]:[\/\\\\]+(?:[^:*?\"<>|;]+[\/\\\\]*)*)",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see a handful of regular expressions here. Should we pre-compile these and store them as constants?

REGEX_CONST: Final[re.Pattern[str]] = re.compile(r"([a-zA-Z]:[\/\\\\]+(?:[^:*?\"<>|;]+[\/\\\\]*)*)")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above, not on this pass.

@beeankha
Copy link
Member

pre-commit.ci autofix


# MSYS2 /c/
# cygwin /cygdrive/c/
if re.match("^(/[A-Za-z]/|/cygdrive/[A-Za-z]/).*", prefix):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will it only ever be a windows drive path? or could we also say:

Suggested change
if re.match("^(/[A-Za-z]/|/cygdrive/[A-Za-z]/).*", prefix):
if "/" in prefix:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kenodegard to me that seems risky. Can I create an expecting-to-be-Windows conda environment with / in the name? Probably: md here/there\conda-meta (or whatever the minimum conda env is).
That said, anyone who mixes / and \ in the same tree and is using MSYS2 is probably in for a world of pain for other reasons.
I might think that prefix.startswith("/") would be OK but can you create an environment that doesn't start at the filesystem root? Can I conda activate here\there and what will the value of prefix be, will it always be resolved to the filesystem root?

Copy link
Member

@jezdez jezdez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nits about nesting in conversion function and suggestions, nothing major though. Nice to see so many tests!

I do wonder if a) there is prior art elsewhere we could verify this with b) there is a best practice from MS we could get inspired from, I wonder if they've had to account for Cygwin when they rolled out WSL.

#
# mingw-w64 is a legacy variant used by m2w64-* packages
#
# We could include clang32 and mingw32 variants
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could but are't because...?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Erm, good question.
I limited it to 64-bit variants as "we" don't built 32-bit variants of any packages any more.
The choice of "we" and 64-bit only is a little parochial and may not cover the wider conda community's needs.

The cost will be two extra (filename munge + stat(2)) for everyone on Windows each time this function is called (conda activate|deactivate|build).


# MSYS2 /c/
# cygwin /cygdrive/c/
if re.match("^(/[A-Za-z]/|/cygdrive/[A-Za-z]/).*", prefix):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we need the match, it might be slightly faster to assign the return value of re.compile to a variable and reuse it in the loop.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jezdez We don't need the match, we're just looking to identify MSYS2 or Cygwin pathnames hence the grouping for alternates.
I don't know enough the the detail of Python's re to know if (?:...) is a saving.



@pytest.mark.skipif(
not on_win,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be run outside of Windows then to make sure that this is the case?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Windows the native_path_to_unix and unix_path_to_native functions will short-circuit to the path_identity function instead which won't error, they will simply return gibberish for the input values.

(library / path / "bin").mkdir(parents=True)

activator = activator_cls()
paths = activator._replace_prefix_in_path(str(prefix), str(prefix))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a roundabout way to get the directories in PATH because there is no getter function like that in the activator?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this isn't great but fixing it is out of scope for this PR. Alternatively, we'd do something like:

Suggested change
paths = activator._replace_prefix_in_path(str(prefix), str(prefix))
paths = (
*activator._get_path_dirs(prefix),
*activator.path_conversion(activator._get_starting_path_list())
)

So IMO one "private" method is better than two of them.

@ifitchet
Copy link
Contributor Author

ifitchet commented May 7, 2024

@jezdez

I wonder if they've had to account for Cygwin when they rolled out WSL

Not really, they have another spin on cygpath:

ifitchet@CPC-ifitc-2HB9V:~$ wslpath
wslpath: Invalid argument
Usage:
    -a    force result to absolute path format
    -u    translate from a Windows path to a WSL path (default)
    -w    translate from a WSL path to a Windows path
    -m    translate from a WSL path to a Windows path, with '/' instead of '\'

EX: wslpath 'c:\users'

and, FWIW, they use /mnt for Windows drives (and other things):

ifitchet@CPC-ifitc-2HB9V:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
...
C:\             128G  115G   13G  90% /mnt/c
...

(on an Ubuntu/WSL2 instance at any rate)

@kenodegard kenodegard merged commit d8178c9 into conda:main May 8, 2024
kenodegard added a commit that referenced this pull request May 8, 2024
Refactor MSYS2 environment handling, support multiple native compilation
environments, and introduce a new `unix_path_to_native` function (the
inverse of `native_path_to_unix`).

---------

Co-authored-by: Bianca Henderson <bhenderson@anaconda.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Ken Odegard <kodegard@anaconda.com>
@kenodegard kenodegard mentioned this pull request Jul 17, 2024
61 tasks
@github-actions github-actions bot added the locked [bot] locked due to inactivity label May 9, 2025
@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 9, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

cla-signed [bot] added once the contributor has signed the CLA locked [bot] locked due to inactivity

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

8 participants