qwen3-4b-structured-output-20260102_V5_3T4_lora

This repository provides a LoRA adapter fine-tuned from unsloth/Qwen3-4B-Instruct-2507 using Unsloth (QLoRA, 4-bit base).

This repo contains LoRA adapter weights only (PEFT). You must load the base model separately.


Model Overview

  • Base model: unsloth/Qwen3-4B-Instruct-2507
  • Adapter type: LoRA (PEFT)
  • Fine-tuning method: QLoRA (Unsloth, 4-bit base)
  • Max sequence length: 1536
  • Primary objective: improve assistant-side structured output generation (format conversion and information extraction with high output consistency)

Trainable Parameters

  • Trainable (LoRA only): ~33M
  • Base model params: ~4B
  • Trainable ratio: ~0.8%
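
A quick way to verify these figures once the adapter is attached (a minimal sketch, assuming the model has been loaded as in the Usage section below; variable names are illustrative):

# Assumes `model` is the PeftModel built in the Usage section below.
lora = sum(p.numel() for n, p in model.named_parameters() if "lora_" in n)
total = sum(p.numel() for p in model.parameters())
print(f"LoRA params: {lora / 1e6:.1f}M / total: {total / 1e9:.2f}B "
      f"({100 * lora / total:.2f}%)")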

Training Data

  • Dataset: daichira/AppliedCourse_SFT_datasets
  • Format: OpenAI-style messages
  • Filtering: samples without a non-empty assistant turn are removed
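
A minimal sketch of that filtering step, assuming each sample carries an OpenAI-style "messages" list (field and function names are illustrative, not the exact preprocessing code):

def has_assistant_response(sample):
    # Keep only samples with at least one non-empty assistant turn.
    return any(
        msg.get("role") == "assistant" and msg.get("content", "").strip()
        for msg in sample.get("messages", [])
    )

# With a Hugging Face datasets.Dataset:
# dataset = dataset.filter(has_assistant_response)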

Supervision strategy (assistant-only loss)

  • Render each conversation with the model's chat template
  • Mask system/user tokens (labels set to -100)
  • Compute the loss only on assistant response tokens (see the sketch below)

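A minimal sketch of this masking, assuming the final message in each sample is the assistant response (helper and variable names are illustrative, not the exact training code):

IGNORE_INDEX = -100  # positions with this label are excluded from the loss

def build_input_and_labels(messages, tokenizer, max_len=1536):
    # Render the full conversation with the model's chat template.
    input_ids = tokenizer.apply_chat_template(messages, tokenize=True)

    # Render everything up to (but not including) the assistant response,
    # with the generation prompt appended, to find the prompt length.
    prompt_ids = tokenizer.apply_chat_template(
        messages[:-1], tokenize=True, add_generation_prompt=True
    )

    # Mask system/user tokens; keep labels only for assistant tokens.
    labels = [IGNORE_INDEX] * len(prompt_ids) + input_ids[len(prompt_ids):]

    return input_ids[:max_len], labels[:max_len]
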
Training Configuration (Key)

  • Seed: 42
  • Max seq len: 1536
  • LR: 2e-4
  • Per-device batch: 2
  • Grad acc: 8 (effective batch 16)
  • Optim: AdamW 8-bit (Unsloth)
  • LoRA: r=16, alpha=32, dropout=0.0
  • Target modules: q/k/v/o_proj, gate/up/down_proj
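
In PEFT terms, the adapter settings above correspond roughly to the following configuration (a sketch of the key arguments, not the exact Unsloth training script):

from peft import LoraConfig

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.0,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)

# Trainer-side hyperparameters (TRL/Unsloth-style argument names):
# learning_rate=2e-4, per_device_train_batch_size=2,
# gradient_accumulation_steps=8, optim="adamw_8bit",
# max_seq_length=1536, seed=42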

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_id = "unsloth/Qwen3-4B-Instruct-2507"
adapter_id = "daichira/qwen3-4b-structured-output-20260102_V5_3T4_lora"

# The tokenizer comes from the base model.
tokenizer = AutoTokenizer.from_pretrained(base_id, use_fast=True)

# Load the base model in fp16, then attach the LoRA adapter.
model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

model = PeftModel.from_pretrained(model, adapter_id)
model.eval()
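
A generation example following the snippet above; the prompt and generation settings are illustrative:

messages = [
    {"role": "system", "content": "Return the answer as valid JSON."},
    {"role": "user", "content": "Extract name and price from: Acme Mug - $12.50"},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)

print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))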

Intended Use

  • Structured text generation (JSON / CSV / XML)
  • Format conversion & normalization
  • Information extraction pipelines
  • Educational baselines for student competitions

Limitations

  • Output validity is not guaranteed for underspecified prompts.
  • Optimized for structured output fidelity, not general-purpose reasoning.

Sources & Terms (IMPORTANT)

Training data (daichira/AppliedCourse_SFT_datasets) is derived from multiple upstream open-data sources. You must comply with upstream licenses/terms when using this adapter and any downstream derivatives.

  • OpenFoodFacts product database: database under ODbL, contents under DbCL, images under CC BY-SA (images are excluded from the training set)
  • Shopify product catalogue: product attribute text (safe columns only)
  • ontologicalapple/vrts-gtfs-archive: use is governed by the linked BC Transit Open Data Terms of Use (safe columns only)

This adapter repository contains LoRA weights only and does not redistribute upstream datasets. However, training on licensed data does not remove upstream obligations: attribution and terms compliance remain your responsibility.


License

  • License: other
  • Base model license and restrictions apply: unsloth/Qwen3-4B-Instruct-2507

Acknowledgements

  • Qwen / Alibaba for the base model
  • Unsloth for efficient QLoRA training
  • Hugging Face Transformers & PEFT ecosystem