Make sure to update your `transformers` installation via `pip install --upgrade transformers`.
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_ID = "uzzalHossen/llama-3.1-8b-bengali-empathetic"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,
    device_map="auto",
)
model.eval()
```
## Run Inference
```python
def chat(prompt):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=128,
            do_sample=True,  # required for temperature/top_p to take effect
            temperature=0.7,
            top_p=0.9,
        )
    return tokenizer.decode(output[0], skip_special_tokens=True)

# "আমি খুব একা অনুভব করছি" ("I feel very lonely")
print(chat("আমি খুব একা অনুভব করছি"))
```
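Since this model was fine-tuned from a Llama 3.1 *Instruct* base, wrapping the input in the Llama 3.1 chat format (which `tokenizer.apply_chat_template` produces automatically) typically gives better responses than a raw prompt. As a sketch of what that format looks like, here is a hypothetical helper that builds it by hand; the special tokens below are the standard Llama 3.1 header tokens, not something specific to this model:

```python
# Sketch (assumption): manually reproducing the Llama 3.1 instruct chat
# format that tokenizer.apply_chat_template would generate for one user turn.
def build_llama31_prompt(user_message, system_message=None):
    """Wrap a message in Llama 3.1 header/EOT special tokens."""
    parts = ["<|begin_of_text|>"]
    if system_message:
        parts.append(
            f"<|start_header_id|>system<|end_header_id|>\n\n{system_message}<|eot_id|>"
        )
    parts.append(
        f"<|start_header_id|>user<|end_header_id|>\n\n{user_message}<|eot_id|>"
    )
    # Trailing assistant header cues the model to generate its reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_llama31_prompt("আমি খুব একা অনুভব করছি")
```

In practice, prefer the tokenizer's own template rather than hand-building strings, e.g. `tokenizer.apply_chat_template([{"role": "user", "content": "..."}], tokenize=False, add_generation_prompt=True)`.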
## Uploaded model
- Developed by: uzzalHossen
- License: apache-2.0
- Finetuned from model: `unsloth/meta-llama-3.1-8b-instruct-bnb-4bit`

This Llama model was trained 2x faster with Unsloth.
