Improve training performance with torch.compile and torch.amp #412
This PR makes the following changes:

- Migrated `torch.cuda.amp` to `torch.amp` for broader device compatibility (first sketch below).
- Applied `torch.compile` to the models in various training scripts (SFT, LoRA, DPO, Distillation, Pretrain, Distill Reason) to leverage potential performance gains (second sketch below).
- Updated `load_state_dict` in `MiniMindForCausalLM` to handle state dict keys potentially modified by `torch.compile` (third sketch below).
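For context, the `torch.cuda.amp` namespace is deprecated in favor of the device-agnostic `torch.amp`, which takes an explicit `device_type` so the same code path covers CUDA and CPU. A minimal sketch of the migration, assuming a typical mixed-precision setup (the `dtype` choice and variable names are illustrative, not taken from the PR):

```python
import torch
from torch import nn

# Old, CUDA-only namespace (now deprecated):
#   ctx = torch.cuda.amp.autocast()
#   scaler = torch.cuda.amp.GradScaler()

# New, device-agnostic API: pass the device type explicitly.
device_type = "cuda" if torch.cuda.is_available() else "cpu"
ctx = torch.amp.autocast(device_type=device_type, dtype=torch.bfloat16)
scaler = torch.amp.GradScaler(device_type, enabled=(device_type == "cuda"))

layer = nn.Linear(8, 8).to(device_type)
x = torch.randn(4, 8, device=device_type)
with ctx:
    y = layer(x)  # forward pass runs under mixed precision
```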
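Applying `torch.compile` is itself a one-line wrap around the model. A hedged sketch of how a training script might do it (the guard and the stand-in model are illustrative; the PR's actual call sites live in the individual scripts):

```python
import torch
from torch import nn

model = nn.Linear(512, 512)  # stand-in for MiniMindForCausalLM

# torch.compile is available from PyTorch 2.0; guard so older
# versions simply fall back to eager execution.
if hasattr(torch, "compile"):
    model = torch.compile(model)
```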
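On the `load_state_dict` change: `torch.compile` wraps a module in an `OptimizedModule` whose state dict keys carry an `_orig_mod.` prefix, so a checkpoint saved from a compiled model will not load into an uncompiled one as-is. A sketch of one way to handle this (the stand-in module body and the prefix-stripping logic are illustrative, not the PR's exact code):

```python
import torch
from torch import nn

class MiniMindForCausalLM(nn.Module):
    """Minimal stand-in for the real model; only the loading logic matters here."""

    def __init__(self):
        super().__init__()
        self.lm_head = nn.Linear(8, 8)

    def load_state_dict(self, state_dict, strict=True):
        # Checkpoints saved from a torch.compile-wrapped model prefix every
        # key with "_orig_mod."; strip it so the same checkpoint also loads
        # into the plain, uncompiled module.
        prefix = "_orig_mod."
        cleaned = {
            (k[len(prefix):] if k.startswith(prefix) else k): v
            for k, v in state_dict.items()
        }
        return super().load_state_dict(cleaned, strict=strict)
```

With this in place, `MiniMindForCausalLM().load_state_dict(torch.load("ckpt.pt"))` should work whether `ckpt.pt` was saved from the compiled wrapper or from the plain model.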