Qwen2.5-3B-Instruct-GRPO (Gmail Tool-Calling)

A Qwen2.5-3B-Instruct model fine-tuned with GRPO for a Gmail tool-calling task.
Goal: provide an assistant that can plan email reading, searching, labeling, attachment downloading, and email sending in a Gmail-like environment via tool calls.

Base Model

This model was built on the Qwen/Qwen2.5-3B base model (see the model tree below).

During fine-tuning:

  • First, SFT was performed on examples of Gmail tool-calling scenarios.
  • Then GRPO (Group Relative Policy Optimization) fine-tuning was applied, with rewards based on correct tool selection and argument quality.
  • Finally, the LoRA weights were merged into the base model, and the fully merged model was published in this repo.
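The exact GRPO reward is not published in this repo. As a minimal illustrative sketch of "reward by correct tool selection and argument quality", something like the following could be used; `EXPECTED`, the 0.5/0.5 weighting, and the JSON parsing are all assumptions, not the training code:

```python
import json

# Hypothetical target for one training prompt; the real reward spec is not published.
EXPECTED = {"name": "search_emails", "arguments": {"query": "from:[email protected]"}}

def reward_fn(completion: str) -> float:
    """Score a completion by tool choice and argument quality (illustrative)."""
    try:
        call = json.loads(completion)
    except json.JSONDecodeError:
        return -1.0  # unparsable tool call
    score = 0.0
    if call.get("name") == EXPECTED["name"]:
        score += 0.5  # correct tool selected
    args = call.get("arguments", {})
    matched = sum(args.get(k) == v for k, v in EXPECTED["arguments"].items())
    score += 0.5 * matched / len(EXPECTED["arguments"])  # argument quality
    return score

def group_advantages(rewards):
    """GRPO normalizes rewards within each sampled group (group-relative advantage)."""
    mean = sum(rewards) / len(rewards)
    std = (sum((r - mean) ** 2 for r in rewards) / len(rewards)) ** 0.5 or 1.0
    return [(r - mean) / std for r in rewards]
```

The group-relative normalization is what distinguishes GRPO from vanilla policy-gradient rewards: each sampled completion is scored against its siblings for the same prompt rather than against a learned value baseline.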

Languages

  • English
  • Turkish (partially, instructions and prompts can be in Turkish)

Intended Use & Tasks

  • Conversational Assistant
  • Tool-Calling / Function-Calling
  • Gmail-style email operations:
    • Search emails
    • Read / delete / label emails
    • Download attachments
    • Draft and send emails via tools
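The vLLM example below relies on the `hermes` tool-call parser; under that convention (an assumption based on the common Qwen/Hermes chat format, not stated in this repo), the model is expected to emit tool calls as JSON inside `<tool_call>` tags, roughly:

```text
<|im_start|>assistant
<tool_call>
{"name": "search_emails", "arguments": {"query": "from:[email protected] has:attachment filename:pdf newer_than:7d"}}
</tool_call><|im_end|>
```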

Usage (Transformers)

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TurkishCodeMan/Qwen2.5-3B-Instruct-GRPO"  # replace with your own repo name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {
        "role": "system",
        "content": "You are a Gmail assistant. Use tools to satisfy the user request.",
    },
    {
        "role": "user",
        "content": "Find emails from [email protected] with PDF attachments from last week and send a hello email.",
    },
]

def format_chat(messages):
    # If your tokenizer ships the Qwen chat template, prefer
    # tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True);
    # a manual concatenation is shown here for clarity.
    text = ""
    for m in messages:
        text += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    # Open the assistant turn so generation continues as the assistant.
    text += "<|im_start|>assistant\n"
    return text

prompt = format_chat(messages)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
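The decoded output above is plain text, including any tool calls the model emits. If it uses Hermes-style `<tool_call>` tags (an assumption; adjust the pattern to whatever your generations actually contain), a small parser can extract them:

```python
import json
import re

# Matches a JSON object wrapped in Hermes-style <tool_call> tags (assumed format).
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)

def extract_tool_calls(text: str):
    """Pull JSON tool calls out of a raw generation string."""
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        try:
            calls.append(json.loads(match.group(1)))
        except json.JSONDecodeError:
            continue  # skip malformed calls
    return calls

sample = (
    '<tool_call>\n'
    '{"name": "search_emails", "arguments": {"query": "from:[email protected]"}}\n'
    '</tool_call>'
)
print(extract_tool_calls(sample))
```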

Using vLLM (Inference)

vllm serve /path/to/Qwen2.5-3B-Instruct-GRPO \
  --host 0.0.0.0 \
  --port 8000 \
  --max-model-len 8096 \
  --dtype bfloat16 \
  --gpu-memory-utilization 0.6 \
  --enforce-eager \
  --enable-auto-tool-choice \
  --tool-call-parser hermes

Once the server is up, call the OpenAI-compatible endpoint and pass your tool schemas:

import requests
import json

VLLM_URL = "http://localhost:8000/v1/chat/completions"
MODEL_ID = "TurkishCodeMan/Qwen2.5-3B-Instruct-GRPO"  # replace with your own repo name

tools = [
    {
        "type": "function",
        "function": {
            "name": "search_emails",
            "description": "Searches for emails using Gmail search syntax",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"},
                    "maxResults": {"type": "number"},
                },
                "required": ["query"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "send_email",
            "description": "Sends a new email",
            "parameters": {
                "type": "object",
                "properties": {
                    "to": {
                        "type": "array",
                        "items": {"type": "string"},
                    },
                    "subject": {"type": "string"},
                    "body": {"type": "string"},
                },
                "required": ["to", "subject", "body"],
            },
        },
    },
    # Define your other Gmail tools here as well...
]

payload = {
    "model": MODEL_ID,
    "messages": [
        {
            "role": "system",
            "content": "You are a Gmail assistant. Use the provided tools to satisfy the user request.",
        },
        {
            "role": "user",
            "content": "Find emails from [email protected] with PDF attachments from last week and send email saying Hello.",
        },
    ],
    "tools": tools,
    "tool_choice": "auto",
    "temperature": 0.7,
    "top_p": 0.9,
    "max_tokens": 256,
}

resp = requests.post(VLLM_URL, json=payload, timeout=60)
resp.raise_for_status()
data = resp.json()

print(json.dumps(data, indent=2, ensure_ascii=False))

tool_calls = data["choices"][0]["message"].get("tool_calls", [])
for tc in tool_calls:
    func = tc["function"]
    name = func["name"]
    args = json.loads(func["arguments"])
    print("Tool:", name)
    print("Args:", json.dumps(args, indent=2, ensure_ascii=False))
Model Details

  • Format: Safetensors
  • Size: 3B params
  • Tensor type: BF16

Model tree for TurkishCodeMan/Qwen2.5-3B-Instruct-GRPO

  • Base model: Qwen/Qwen2.5-3B
  • Finetuned: this model