Make sure your `transformers` installation is up to date: `pip install --upgrade transformers`.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_ID = "uzzalHossen/llama-3.1-8b-bengali-empathetic"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# Load in half precision and let accelerate place layers across available devices
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,
    device_map="auto",
)

model.eval()
```
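If the float16 weights do not fit in GPU memory, the model can instead be loaded in 4-bit via bitsandbytes (the base model was itself a bnb-4bit checkpoint). This is a sketch of the standard `BitsAndBytesConfig` route, assuming `bitsandbytes` is installed; it has not been verified against this specific checkpoint:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit quantization config: NF4 quantization with fp16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "uzzalHossen/llama-3.1-8b-bengali-empathetic",
    quantization_config=bnb_config,
    device_map="auto",
)
```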

Run Inference

```python
def chat(prompt):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=128,
            do_sample=True,   # required for temperature / top_p to take effect
            temperature=0.7,
            top_p=0.9,
        )
    return tokenizer.decode(output[0], skip_special_tokens=True)

# "I feel very lonely"
print(chat("আমি খুব একা অনুভব করছি"))
```
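The `chat` helper above feeds raw text to the model. Llama 3.1 Instruct models are trained on a specific chat template, which `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` produces automatically and is usually the safer choice. As a rough illustration of what that template looks like (format taken from the public Llama 3.1 prompt spec; verify against this tokenizer's own `chat_template` before relying on it):

```python
def format_llama31_prompt(user_message, system_message=None):
    """Hand-build a single-turn Llama 3.1 Instruct prompt string.

    Illustrative only -- in practice prefer tokenizer.apply_chat_template.
    """
    parts = ["<|begin_of_text|>"]
    if system_message:
        parts.append(
            "<|start_header_id|>system<|end_header_id|>\n\n"
            + system_message + "<|eot_id|>"
        )
    parts.append(
        "<|start_header_id|>user<|end_header_id|>\n\n"
        + user_message + "<|eot_id|>"
    )
    # Trailing assistant header cues the model to generate its reply
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)
```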

Uploaded model

  • Developed by: uzzalHossen
  • License: apache-2.0
  • Finetuned from model: unsloth/meta-llama-3.1-8b-instruct-bnb-4bit

This Llama model was trained 2x faster with Unsloth.
