Qwen2.5-1.5B Browser Action LoRA

This is a LoRA adapter fine-tuned from Qwen/Qwen2.5-1.5B-Instruct for browser-use action prediction.

Training objective

The model was trained with step-level, action-only supervision for a browser agent. Each example contains:

  • a system prompt derived from the original BrowserGym teacher prompt
  • the current task goal, URL, short recent history, and observation text
  • the next BrowserGym action as the assistant target

The goal is not broad open-web generality. This is a scoped research model for testing whether a small model can improve on synthetic browser-use tasks after supervised fine-tuning (SFT).
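As an illustrative sketch, a single training example in chat-message form might look like the following. The field contents and action syntax shown here are assumptions for illustration, not the exact training schema:

```python
# Illustrative sketch of one training example in chat-message form.
# Field contents and the action syntax are assumptions, not the exact
# schema used during training.
example = {
    "messages": [
        {"role": "system", "content": "You are a browser agent. Reply with exactly one BrowserGym action."},
        {"role": "user", "content": (
            "Goal: add the first search result to the cart\n"
            "URL: https://shop.example/search?q=mug\n"
            "History: click('search-box'); fill('search-box', 'mug')\n"
            "Observation: [12] link 'Blue Mug' [13] button 'Add to cart'"
        )},
        # The assistant turn is the only supervised target: one next action.
        {"role": "assistant", "content": "click('13')"},
    ]
}
```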

Dataset

Dataset summary:

  • 6508 train rows
  • 240 validation rows
  • a strictly filtered export from a larger collection corpus of 10k+ steps

Fine-tuning setup

  • Base model: Qwen/Qwen2.5-1.5B-Instruct
  • Method: PEFT LoRA
  • Epochs: 1
  • Sequence length: 2048
  • Learning rate: 2e-4
  • Batch size: 4
  • Gradient accumulation: 4
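The setup above can be collected into a configuration sketch. The LoRA-specific values (rank, alpha, dropout, target modules) are illustrative assumptions, since the card does not state them; the remaining values are taken directly from the list:

```python
# Sketch of the fine-tuning configuration.
lora_config = {
    "r": 16,                      # assumed rank (not stated in the card)
    "lora_alpha": 32,             # assumed scaling (not stated)
    "lora_dropout": 0.05,         # assumed dropout (not stated)
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
}
train_config = {                  # values stated in the card
    "num_train_epochs": 1,
    "max_seq_length": 2048,
    "learning_rate": 2e-4,
    "per_device_train_batch_size": 4,
    "gradient_accumulation_steps": 4,
}

# Effective batch size = per-device batch size x accumulation steps
effective_batch = (train_config["per_device_train_batch_size"]
                   * train_config["gradient_accumulation_steps"])
```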

Evaluation

Validation set size: 240

Before fine-tuning:

  • Parseable action rate: 100%
  • Exact-match action accuracy: 17.08%

After fine-tuning:

  • Parseable action rate: 100%
  • Exact-match action accuracy: 79.58%

This is a 62.5-point absolute improvement in exact-match accuracy over the untuned base model on the target task distribution.
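The two reported metrics can be sketched as follows, over toy predictions. The real evaluation harness is not published here; the parseability check in particular is an assumed, simplified definition:

```python
# Minimal sketch of "parseable action rate" and "exact-match accuracy".
# The parseability check is an assumed, simplified definition for
# illustration, not the project's actual parser.
import re

def is_parseable(action: str) -> bool:
    # Treat any single call-shaped string, e.g. click('13'), as parseable.
    return re.fullmatch(r"\w+\(.*\)", action.strip()) is not None

def evaluate(predictions, references):
    n = len(references)
    parseable = sum(is_parseable(p) for p in predictions)
    exact = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return parseable / n, exact / n

preds = ["click('13')", "fill('7', 'mug')", "scroll(0, 200)", "click('99')"]
refs  = ["click('13')", "fill('7', 'mug')", "scroll(0, 100)", "click('42')"]
parse_rate, exact_match = evaluate(preds, refs)  # 1.0, 0.5
```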

Usage

Load the adapter with PEFT on top of the (here 4-bit-quantized) base model:

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

base_model_id = "Qwen/Qwen2.5-1.5B-Instruct"
adapter_id = "saital/qwen25-1.5b-browser-action-lora"

# 4-bit NF4 quantization keeps the base model's memory footprint small
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Attach the LoRA adapter weights on top of the quantized base model
model = PeftModel.from_pretrained(model, adapter_id)

# Load the tokenizer from the adapter repo so any saved chat template is used
tokenizer = AutoTokenizer.from_pretrained(adapter_id, trust_remote_code=True)

Limitations

  • evaluated only on the project's synthetic browser-task distribution
  • exact-match evaluation is strict and may undercount formatting-equivalent actions
  • this adapter is intended for research iteration, not production deployment
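To illustrate the second limitation: two predictions can encode the same action and still disagree byte-for-byte. A hypothetical normalizer (not part of the evaluation above) shows the kind of equivalence strict exact match misses:

```python
# Hypothetical normalizer illustrating formatting-equivalent actions that
# strict exact match counts as wrong. Not part of the actual evaluation.
def normalize(action: str) -> str:
    return action.strip().replace('"', "'").replace(" ", "")

# Same action, different formatting:
a = 'click("42")'
b = "click('42')"
strict_match = (a == b)                       # False under exact match
lenient_match = (normalize(a) == normalize(b))  # True after normalizing
```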