Qwen2.5-3B-Instruct-GRPO (Gmail Tool-Calling)

A Qwen2.5-3B-Instruct model fine-tuned with GRPO for a Gmail tool-calling task.
Goal: provide an assistant that can plan email reading, searching, labeling, attachment downloading, and email sending in a Gmail-like environment via tool calls.

Base Model

This model was built on the Qwen/Qwen2.5-3B base model (see the model tree below).

During fine-tuning:

  • First, SFT was performed on examples of Gmail tool-calling scenarios.
  • Then GRPO (Group Relative Policy Optimization) fine-tuning was applied, with rewards based on correct tool selection and argument quality.
  • Finally, the LoRA weights were merged into the base model, and the fully merged model was published in this repo.
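The exact GRPO reward is not published in this repo. As a minimal illustrative sketch of "reward by correct tool selection and argument quality", something like the following could be used; `EXPECTED`, the 0.5/0.5 weighting, and the JSON parsing are all assumptions, not the training code:

```python
import json

# Hypothetical target for one training prompt; the real reward spec is not published.
EXPECTED = {"name": "search_emails", "arguments": {"query": "from:[email protected]"}}

def reward_fn(completion: str) -> float:
    """Score a completion by tool choice and argument quality (illustrative)."""
    try:
        call = json.loads(completion)
    except json.JSONDecodeError:
        return -1.0  # unparsable tool call
    score = 0.0
    if call.get("name") == EXPECTED["name"]:
        score += 0.5  # correct tool selected
    args = call.get("arguments", {})
    matched = sum(args.get(k) == v for k, v in EXPECTED["arguments"].items())
    score += 0.5 * matched / len(EXPECTED["arguments"])  # argument quality
    return score

def group_advantages(rewards):
    """GRPO normalizes rewards within each sampled group (group-relative advantage)."""
    mean = sum(rewards) / len(rewards)
    std = (sum((r - mean) ** 2 for r in rewards) / len(rewards)) ** 0.5 or 1.0
    return [(r - mean) / std for r in rewards]
```

The group-relative normalization is what distinguishes GRPO from vanilla policy-gradient rewards: each sampled completion is scored against its siblings for the same prompt rather than against a learned value baseline.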

Languages

  • English
  • Turkish (partially, instructions and prompts can be in Turkish)

Intended Use & Tasks

  • Conversational Assistant
  • Tool-Calling / Function-Calling
  • Gmail-style email operations:
    • Search emails
    • Read / delete / label emails
    • Download attachments
    • Draft and send emails via tools
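The vLLM example below relies on the `hermes` tool-call parser; under that convention (an assumption based on the common Qwen/Hermes chat format, not stated in this repo), the model is expected to emit tool calls as JSON inside `<tool_call>` tags, roughly:

```text
<|im_start|>assistant
<tool_call>
{"name": "search_emails", "arguments": {"query": "from:[email protected] has:attachment filename:pdf newer_than:7d"}}
</tool_call><|im_end|>
```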

Usage (Transformers)

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TurkishCodeMan/Qwen2.5-3B-Instruct-GRPO"  # replace with your own repo name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {
        "role": "system",
        "content": "You are a Gmail assistant. Use tools to satisfy the user request.",
    },
    {
        "role": "user",
        "content": "Find emails from [email protected] with PDF attachments from last week and send a hello email.",
    },
]

def format_chat(messages):
    # If your tokenizer ships the Qwen chat template, prefer
    # tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True);
    # a manual concatenation is shown here for clarity.
    text = ""
    for m in messages:
        text += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    # Open the assistant turn so generation continues as the assistant.
    text += "<|im_start|>assistant\n"
    return text

prompt = format_chat(messages)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
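The decoded output above is plain text, including any tool calls the model emits. If it uses Hermes-style `<tool_call>` tags (an assumption; adjust the pattern to whatever your generations actually contain), a small parser can extract them:

```python
import json
import re

# Matches a JSON object wrapped in Hermes-style <tool_call> tags (assumed format).
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)

def extract_tool_calls(text: str):
    """Pull JSON tool calls out of a raw generation string."""
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        try:
            calls.append(json.loads(match.group(1)))
        except json.JSONDecodeError:
            continue  # skip malformed calls
    return calls

sample = (
    '<tool_call>\n'
    '{"name": "search_emails", "arguments": {"query": "from:[email protected]"}}\n'
    '</tool_call>'
)
print(extract_tool_calls(sample))
```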

Using vLLM (Inference)

vllm serve /path/to/Qwen2.5-3B-Instruct-GRPO \
  --host 0.0.0.0 \
  --port 8000 \
  --max-model-len 8096 \
  --dtype bfloat16 \
  --gpu-memory-utilization 0.6 \
  --enforce-eager \
  --enable-auto-tool-choice \
  --tool-call-parser hermes

Once the server is up, call the OpenAI-compatible endpoint and pass your tool schemas:

import requests
import json

VLLM_URL = "http://localhost:8000/v1/chat/completions"
MODEL_ID = "TurkishCodeMan/Qwen2.5-3B-Instruct-GRPO"  # replace with your own repo name

tools = [
    {
        "type": "function",
        "function": {
            "name": "search_emails",
            "description": "Searches for emails using Gmail search syntax",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"},
                    "maxResults": {"type": "number"},
                },
                "required": ["query"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "send_email",
            "description": "Sends a new email",
            "parameters": {
                "type": "object",
                "properties": {
                    "to": {
                        "type": "array",
                        "items": {"type": "string"},
                    },
                    "subject": {"type": "string"},
                    "body": {"type": "string"},
                },
                "required": ["to", "subject", "body"],
            },
        },
    },
    # Define your other Gmail tools here as well...
]

payload = {
    "model": MODEL_ID,
    "messages": [
        {
            "role": "system",
            "content": "You are a Gmail assistant. Use the provided tools to satisfy the user request.",
        },
        {
            "role": "user",
            "content": "Find emails from [email protected] with PDF attachments from last week and send email saying Hello.",
        },
    ],
    "tools": tools,
    "tool_choice": "auto",
    "temperature": 0.7,
    "top_p": 0.9,
    "max_tokens": 256,
}

resp = requests.post(VLLM_URL, json=payload, timeout=60)
resp.raise_for_status()
data = resp.json()

print(json.dumps(data, indent=2, ensure_ascii=False))

tool_calls = data["choices"][0]["message"].get("tool_calls", [])
for tc in tool_calls:
    func = tc["function"]
    name = func["name"]
    args = json.loads(func["arguments"])
    print("Tool:", name)
    print("Args:", json.dumps(args, indent=2, ensure_ascii=False))
Model Details

  • Format: Safetensors
  • Size: 3B params
  • Tensor type: BF16

Model tree for TurkishCodeMan/Qwen2.5-3B-Instruct-GRPO

  • Base model: Qwen/Qwen2.5-3B
  • Finetuned: this model