extend text-generation-benchmark to xpu, pass #2730
Conversation
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Thanks for updating the text generation benchmark; this was definitely on our list to do soon. Unfortunately, I ran into an issue when trying it out. Could you please check?
Also, once you're finished with your changes, please run ruff on method_comparison/text_generation_benchmark/ (this directory is not automatically checked).
```diff
     accelerator_reserved_mb = 0.0

-    return ram_usage_mb, gpu_allocated_mb, gpu_reserved_mb
+    return ram_usage_mb, accelerator_allocated_mb, accelerator_reserved_mb
```
For some reason, I'm getting incorrect results. When I run python run_base.py -v --force on this branch on a 4090, I get 0 MB memory usage in the report. To debug this, you can set a breakpoint at this line and run the aforementioned command. The first time we get here, accelerator_allocated_mb and accelerator_reserved_mb are 0, which is expected. After this, the model is loaded to the accelerator. Thus, when we get here the second time, these values should be > 0, but I still get 0 here. When running the same on the main branch, I get correct reports.
I could not determine where this difference comes from: nvidia-smi shows the same output for the main branch and this branch, torch.cuda.is_available() always returns True, and the model_kwargs are also identical. Can you replicate this, and do you have an idea what the issue could be?
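For anyone trying to reproduce this, a quick sanity check is to query the allocator directly after loading a model; non-zero values here mean CUDA itself is fine and the 0 MB in the report points at the reporting code. This is an illustrative snippet independent of the benchmark script; the model name is just an example and a CUDA device is assumed:

```python
import torch
from transformers import AutoModelForCausalLM

# Load any small model onto the GPU, then ask the CUDA allocator what it sees.
model = AutoModelForCausalLM.from_pretrained("gpt2").to("cuda")

print(f"allocated: {torch.cuda.memory_allocated() / 1024**2:.1f} MB")
print(f"reserved:  {torch.cuda.memory_reserved() / 1024**2:.1f} MB")
```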
@BenjaminBossan Oh, my fault: the chain should be if-elif-else, but I split it into two chains (an if, followed by an if-else), so a CUDA GPU enters the first if and then the second chain's else (which sets the values to 0). Sorry, and thanks for checking.
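To make the fix concrete, here is a minimal sketch of the two control-flow shapes being described (illustrative only, not the actual benchmark code; it assumes a recent PyTorch build that exposes both torch.cuda and torch.xpu):

```python
import torch

def accelerator_memory_mb():
    # Broken shape: two separate chains. On a CUDA machine the first `if`
    # fills in the values, but the second chain's `else` then resets them to 0:
    #
    #     if torch.cuda.is_available():
    #         allocated = torch.cuda.memory_allocated() / 1024**2
    #         reserved = torch.cuda.memory_reserved() / 1024**2
    #     if torch.xpu.is_available():
    #         ...
    #     else:
    #         allocated = reserved = 0.0   # CUDA ends up here -> 0 MB in the report
    #
    # Fixed shape: a single if/elif/else chain, so exactly one branch runs.
    if torch.cuda.is_available():
        allocated = torch.cuda.memory_allocated() / 1024**2
        reserved = torch.cuda.memory_reserved() / 1024**2
    elif torch.xpu.is_available():
        allocated = torch.xpu.memory_allocated() / 1024**2
        reserved = torch.xpu.memory_reserved() / 1024**2
    else:
        allocated = reserved = 0.0
    return allocated, reserved
```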
Ah yes, makes sense.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
BenjaminBossan left a comment
Thanks for fixing the issue with CUDA. I think the script still has a small error for XPU, which I flagged; please check.
```diff
-        return gpu_allocated, gpu_reserved
+        _, accelerator_allocated, accelerator_reserved = get_memory_usage()
+        return accelerator_allocated, accelerator_reserved
+    elif torch.xpu.is_available():
```
Here, elif is not necessary because the previous branch returns, but it also doesn't hurt.
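A tiny sketch of why the spelling does not matter here (hypothetical helper, not the actual script): because every branch ends in a return, at most one branch can ever run, so if and elif are behaviorally identical; elif just states the mutual exclusivity explicitly.

```python
import torch

def current_accelerator_memory_mb():
    # Every branch returns, so writing the XPU check as `if` or `elif`
    # changes nothing at runtime.
    if torch.cuda.is_available():
        return torch.cuda.memory_allocated() / 1024**2
    elif torch.xpu.is_available():
        return torch.xpu.memory_allocated() / 1024**2
    return 0.0
```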
Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
BenjaminBossan left a comment
Thanks for making the text generation benchmark XPU compatible.
@BenjaminBossan, please help review, thanks very much.