YongganFu committed
Commit 0750f7e · verified · 1 Parent(s): 426f2fa

Update README.md

Files changed (1): README.md +4 -4
README.md CHANGED

```diff
@@ -3,14 +3,14 @@ library_name: transformers
 tags: []
 ---
 
-# Nemotron-Hymba2-3B-Instruct
+# Nemotron-Flash-3B-Instruct
 
-Nemotron-Hymba2 is a new hybrid SLM model family that outperforms Qwen models in accuracy (math, coding, and commonsense), batch-size-1 latency, and throughput. More details are in our NeurIPS 2025 [paper](https://drive.google.com/drive/folders/17vOGktwUfUpRAJPGJUV6oX8XwLSczZtv?usp=sharing).
+Nemotron-Flash is a new hybrid SLM model family that outperforms Qwen models in accuracy (math, coding, and commonsense), batch-size-1 latency, and throughput. More details are in our NeurIPS 2025 [paper](https://drive.google.com/drive/folders/17vOGktwUfUpRAJPGJUV6oX8XwLSczZtv?usp=sharing).
 
 Docker path: `/lustre/fsw/portfolios/nvr/users/yongganf/docker/megatron_py25_fast_slm.sqsh` on NRT.
 
 
-## Chat with Nemotron-Hymba2-3B-Instruct
+## Chat with Nemotron-Flash-3B-Instruct
 
 We wrap the model into CUDA Graph for fast generation:
 
@@ -18,7 +18,7 @@ We wrap the model into CUDA Graph for fast generation:
 from transformers import AutoModelForCausalLM, AutoTokenizer
 import torch
 
-repo_name = "nvidia/Nemotron-Hymba2-3B-Instruct"
+repo_name = "nvidia/Nemotron-Flash-3B-Instruct"
 
 tokenizer = AutoTokenizer.from_pretrained(repo_name, trust_remote_code=True)
 model = AutoModelForCausalLM.from_pretrained(repo_name, trust_remote_code=True)
```
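The snippet in the diff stops after loading the tokenizer and model. A minimal single-turn chat sketch built on it might look like the following; note this is not part of the commit, and the chat-template usage and generation settings here are assumptions, not confirmed details of the model:

```python
def chat(prompt,
         repo_name="nvidia/Nemotron-Flash-3B-Instruct",
         max_new_tokens=256):
    """Generate a reply for a single user prompt.

    Imports are deferred so the function can be defined and inspected
    without pulling in transformers/torch or downloading the checkpoint.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_name, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(repo_name, trust_remote_code=True)
    model.eval()

    # Format the prompt with the model's chat template (assumes the
    # tokenizer ships one, which instruct models typically do).
    messages = [{"role": "user", "content": prompt}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    )

    with torch.no_grad():
        output = model.generate(input_ids, max_new_tokens=max_new_tokens)

    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output[0][input_ids.shape[-1]:],
                            skip_special_tokens=True)
```

Usage: `print(chat("Write a haiku about fast language models."))` — the checkpoint is downloaded on the first call.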