Description
Describe the Bug
Hi, I am running the example custom model from the README (sba), but the model type is not recognized. Any ideas why? Error log:
Traceback (most recent call last):
File "/network/scratch/n/nizar.islah/flame/fla-venv/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 1218, in from_pretrained
config_class = CONFIG_MAPPING[config_dict["model_type"]]
File "/network/scratch/n/nizar.islah/flame/fla-venv/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 914, in __getitem__
raise KeyError(key)
KeyError: 'sba'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/cvmfs/ai.mila.quebec/apps/arch/distro/python/3.10/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/cvmfs/ai.mila.quebec/apps/arch/distro/python/3.10/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/network/scratch/n/nizar.islah/flame/flame/utils/convert_dcp_to_hf.py", line 65, in <module>
save_pretrained(args.path, args.step, args.config, args.tokenizer)
File "/network/scratch/n/nizar.islah/flame/fla-venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/network/scratch/n/nizar.islah/flame/flame/utils/convert_dcp_to_hf.py", line 28, in save_pretrained
config = AutoConfig.from_pretrained(config, trust_remote_code=True)
File "/network/scratch/n/nizar.islah/flame/fla-venv/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 1220, in from_pretrained
raise ValueError(
ValueError: The checkpoint you are trying to load has model type `sba` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
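For context, the `KeyError: 'sba'` comes from a dictionary-style lookup of `model_type` inside `AutoConfig.from_pretrained`, which Transformers then re-raises as the `ValueError` above. A minimal sketch of that resolution path, with simplified, hypothetical names (the real `CONFIG_MAPPING` is a lazy mapping and the config classes are real classes, not strings), and of how registering the custom type first avoids the error:

```python
# Simplified sketch (assumption): Transformers keeps a mapping from
# model_type strings to config classes; "sba" is absent by default.
CONFIG_MAPPING = {"llama": "LlamaConfig", "gpt2": "GPT2Config"}

def from_pretrained(config_dict):
    """Mimic AutoConfig.from_pretrained's model_type resolution."""
    model_type = config_dict["model_type"]
    try:
        # This lookup is what raises KeyError: 'sba'.
        return CONFIG_MAPPING[model_type]
    except KeyError:
        # Transformers re-raises the KeyError as a ValueError.
        raise ValueError(
            f"Transformers does not recognize model type {model_type!r}"
        )

# Registering the custom type before loading avoids the error.
# In real Transformers this would be AutoConfig.register("sba", SBAConfig),
# typically triggered by importing the package that defines the model.
CONFIG_MAPPING["sba"] = "SBAConfig"
print(from_pretrained({"model_type": "sba"}))  # SBAConfig
```

This suggests the conversion script ran in an environment where the package defining the `sba` config was never imported, so its `AutoConfig.register` call never executed.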
Steps to Reproduce the Bug
NNODE=1 NGPU=1 LOG_RANK=0 bash train.sh \
  --job.config_file flame/models/fla.toml \
  --job.dump_folder exp/sba-340M-10B/batch32.seqlen2048.warmup1024.update1.steps20480.lr3e-4 \
  --model.config configs/sba_340m.json \
  --model.tokenizer_path fla-hub/transformer-1.3B-100B \
  --optimizer.name AdamW \
  --optimizer.eps 1e-15 \
  --optimizer.lr 3e-4 \
  --lr_scheduler.warmup_steps 1024 \
  --lr_scheduler.lr_min 0.1 \
  --lr_scheduler.decay_type cosine \
  --training.batch_size 32 \
  --training.seq_len 2048 \
  --training.gradient_accumulation_steps 1 \
  --training.steps 20480 \
  --training.max_norm 1.0 \
  --training.skip_nan_inf \
  --training.dataset HuggingFaceFW/fineweb-edu \
  --training.dataset_name default \
  --training.dataset_split train \
  --training.streaming \
  --training.num_workers 32 \
  --training.prefetch_factor 2 \
  --training.seed 42 \
  --training.compile \
  --training.tensor_parallel_degree 1 \
  --training.disable_loss_parallel \
  --checkpoint.interval 2048 \
  --checkpoint.load_step -1 \
  --metrics.log_freq 1
Expected Behavior
Training should begin normally.
Environment Information
pip install flame .