Insights: jingyaogong/minimind
Overview
- 0 Active pull requests
- 0 Merged pull requests
- 0 Open pull requests
- 0 Closed issues
- 8 New issues
There hasn’t been any commit activity on jingyaogong/minimind in the last week.
8 Issues opened by 8 people
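- Why does full fine-tuning fail with multi-GPU training when single-GPU training works? (Multi-GPU pretraining works fine.)
  #462 opened Jul 25, 2025
- jsonl file image
  #461 opened Jul 24, 2025
- A question about the MoE code
  #460 opened Jul 24, 2025
- During pretraining, does the attention mask need to be adjusted to account for input token packing?
  #459 opened Jul 23, 2025
- Hi, I ran the 30M model and found that it performs well and has a decent store of knowledge. The pretraining data is only 1.6 GB; how was this achieved, and how do you judge dataset quality?
  #458 opened Jul 23, 2025
- Why are the results so poor when I rewrite minimind using the Hugging Face Trainer?
  #457 opened Jul 23, 2025
- Tool calling is not supported
  #456 opened Jul 22, 2025
- Loss fluctuation during full SFT
  #455 opened Jul 20, 2025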
6 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- Why add SFT conversation data to the pretraining dataset?
  #434 commented on Jul 20, 2025 • 0 new comments
- RuntimeError: _scaled_dot_product_attention: Explicit attn_mask should not be set when is_causal=True
  #453 commented on Jul 20, 2025 • 0 new comments
- Thanks for the open-source tutorial! Here are 12 notes I wrote while studying MiniMind [latest version], covering explanations of the principles and code annotated with tensor shapes.
  #436 commented on Jul 21, 2025 • 0 new comments
- Question: how do I load the Qwen vocabulary if I want to use it? I have been debugging for a long time and keep getting errors.
  #435 commented on Jul 21, 2025 • 0 new comments
- Not an issue: some personal notes on training minimind
  #26 commented on Jul 24, 2025 • 0 new comments
- [feat] add interactive notebook
  #214 commented on Jul 22, 2025 • 0 new comments