@vkuzo vkuzo commented Aug 29, 2025

Summary:

Short term fix for #2901 to unblock the 0.13.0 release.

Long version:

  1. torchao's C++ kernels are not using libtorch and therefore are not guaranteed to work across different PyTorch versions.
  2. It looks like we got lucky with (1): torchao kernels just happened to work across PyTorch versions <= 2.8, but the PyTorch 2.9 nightlies introduce a breaking ABI change (I don't know what specifically). As a result, if we build torchao with torch 2.8 and then import it in an environment with torch 2.9+, the Python process crashes with `Aborted (core dumped)`.

For now, I just gate out the "known broken" case: the torch version used to build torchao is < 2.9, and the torch version in the environment where torchao is imported is >= 2.9. When this is detected, this PR skips importing the `.so` files and logs a warning, so that most of the torchao Python API still works and the user gets some information about how to get the custom kernels working.
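The gating condition described above can be sketched as follows. This is a minimal illustration, not the code from this PR: `parse_major_minor` and `should_skip_cpp_extensions` are hypothetical names, and how the build-time torch version gets recorded is left out as an assumption.

```python
import re
from typing import Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a string such as '2.9.0.dev20250829+cu128'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def should_skip_cpp_extensions(build_torch_version: str,
                               runtime_torch_version: str) -> bool:
    """Return True only for the known-broken pairing:
    built against torch < 2.9 and imported under torch >= 2.9."""
    built = parse_major_minor(build_torch_version)
    runtime = parse_major_minor(runtime_torch_version)
    return built < (2, 9) and runtime >= (2, 9)
```

Comparing integer tuples rather than raw strings keeps the check correct for suffixes such as `.dev20250829+cu128` and for two-digit minor versions.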

For future releases, we'll need to make this more robust; leaving that for future PRs.

Test Plan:

    # install the 0.13.0 RC, built with PyTorch 2.8
    with-proxy pip install torchao==0.13.0 --extra-index-url https://download.pytorch.org/whl/test/cu128

    # copy over these changes to the local __init__.py file in the installation:
    # ~/.conda/envs/pytorch_nightly/lib/python3.11/site-packages/torchao/__init__.py

    # install PyTorch 2.9.x nightly
    with-proxy pip3 install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu128

    # import torchao, verify no more crash and the warning message is emitted
    (pytorch_nightly) [vasiliy@devgpu007.eag6 ~/local]$ python -X faulthandler -c "import torch; print(torch.__version__); import torchao"
    2.9.0.dev20250829+cu128
    Skipping import of cpp extensions due to incompatible torch version 2.9.0.dev20250829+cu128 for torchao version 0.13.0+cu128

pytorch-bot bot commented Aug 29, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2908

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 1 Pending

As of commit d22bbf5 with merge base 2f78cfe:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed label Aug 29, 2025
@vkuzo vkuzo added the topic: bug fix label Aug 29, 2025
@vkuzo vkuzo force-pushed the 20250829_torchao_init_workaround branch from 156a99f to aa7f066 Compare August 29, 2025 14:06
@vkuzo vkuzo force-pushed the 20250829_torchao_init_workaround branch from aa7f066 to d22bbf5 Compare August 29, 2025 14:36
@vkuzo vkuzo merged commit 7ea5410 into main Aug 29, 2025
18 checks passed
vkuzo added a commit that referenced this pull request Aug 29, 2025
…ion (#2908)

vkuzo added a commit that referenced this pull request Sep 2, 2025
Summary:

Fix an error in #2908. The version string for PyTorch 2.8 reads "2.8.0...",
so the check needs to compare `>= 2.9` to gate out PyTorch 2.9 without also
catching PyTorch 2.8.
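To illustrate why a version string like "2.8.0..." is easy to mishandle, here is a small sketch. Whether the original bug in #2908 was a raw string comparison is an assumption; the helper names `major_minor` and `torch_is_at_least_2_9` are hypothetical and not taken from torchao.

```python
import re
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Parse the leading 'X.Y' of a version string into integers."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def torch_is_at_least_2_9(version: str) -> bool:
    """Numeric `>= 2.9` check that ignores patch and local suffixes."""
    return major_minor(version) >= (2, 9)


# Raw string comparisons are lexicographic and can misfire:
print("2.8.0+cu128" > "2.8")   # True, although this is still PyTorch 2.8
print("2.10.0" >= "2.9")       # False, although 2.10 is newer than 2.9

# The parsed comparison behaves as intended:
print(torch_is_at_least_2_9("2.8.0+cu128"))               # False
print(torch_is_at_least_2_9("2.9.0.dev20250829+cu128"))   # True
```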

Test Plan:

1. make this change in a locally installed __init__ file of torchao
   downloaded via pip
2. install PyTorch 2.8.0
3. import torchao, verify warning was not hit

vkuzo added a commit that referenced this pull request Sep 2, 2025
Summary:

Undoes part of #2908: the message about missing `.so` files becomes a
debug log instead of a warning, because it is always emitted for builds
without executorch ops.

Keeps the version mismatch log as a warning.
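A rough sketch of this split between log levels, using the standard `logging` module, might look like the following. The function names, the logger name, and the wording of the missing-file message are hypothetical; only the version-mismatch message prefix comes from the output shown in #2908.

```python
import logging
from typing import Optional

logger = logging.getLogger("torchao_import_sketch")  # hypothetical logger name


def extension_log_level(so_files_found: bool,
                        version_mismatch: bool) -> Optional[int]:
    """Pick the log level for the import-time message, or None if nothing is logged."""
    if version_mismatch:
        return logging.WARNING   # user-visible: custom kernels are being skipped
    if not so_files_found:
        return logging.DEBUG     # expected for some builds, so keep it quiet
    return None


def report_extension_status(so_files_found: bool, version_mismatch: bool) -> None:
    """Emit the message at the level chosen above."""
    level = extension_log_level(so_files_found, version_mismatch)
    if level == logging.WARNING:
        logger.log(level, "Skipping import of cpp extensions due to incompatible torch version")
    elif level == logging.DEBUG:
        logger.log(level, "No .so files found, cpp extensions not loaded")
```

With the default logging configuration, debug messages are not shown, so the missing-file case stays silent unless the user opts in to debug output.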

Test Plan:

Make this change locally in an install of torchao on an H100, verify
warning no longer prints.
