qwen3-4b-structured-output-20260102_V5_3T4_lora

This repository provides a LoRA adapter fine-tuned from unsloth/Qwen3-4B-Instruct-2507 using Unsloth (QLoRA, 4-bit base).

This repo contains LoRA adapter weights only (PEFT). You must load the base model separately.


Model Overview

  • Base model: unsloth/Qwen3-4B-Instruct-2507
  • Adapter type: LoRA (PEFT)
  • Fine-tuning method: QLoRA (Unsloth, 4-bit base)
  • Max sequence length: 1536
  • Primary objective: improve assistant-side structured output generation (format conversion and information extraction with high output consistency)

Trainable Parameters

  • Trainable (LoRA only): ~33M
  • Base model params: ~4B
  • Trainable ratio: ~0.8%
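
A quick way to verify these figures once the adapter is attached (a minimal sketch, assuming the model has been loaded as in the Usage section below; variable names are illustrative):

# Assumes `model` is the PeftModel built in the Usage section below.
lora = sum(p.numel() for n, p in model.named_parameters() if "lora_" in n)
total = sum(p.numel() for p in model.parameters())
print(f"LoRA params: {lora / 1e6:.1f}M / total: {total / 1e9:.2f}B "
      f"({100 * lora / total:.2f}%)")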

Training Data

  • Dataset: daichira/AppliedCourse_SFT_datasets
  • Format: OpenAI-style messages
  • Filtering: samples without a non-empty assistant turn are removed
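
A minimal sketch of that filtering step, assuming each sample carries an OpenAI-style "messages" list (field and function names are illustrative, not the exact preprocessing code):

def has_assistant_response(sample):
    # Keep only samples with at least one non-empty assistant turn.
    return any(
        msg.get("role") == "assistant" and msg.get("content", "").strip()
        for msg in sample.get("messages", [])
    )

# With a Hugging Face datasets.Dataset:
# dataset = dataset.filter(has_assistant_response)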

Supervision strategy (assistant-only loss)

  • Render each conversation with the model's chat template
  • Mask system/user tokens (labels set to -100)
  • Compute the loss only on assistant response tokens (see the sketch below)

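A minimal sketch of this masking, assuming the final message in each sample is the assistant response (helper and variable names are illustrative, not the exact training code):

IGNORE_INDEX = -100  # positions with this label are excluded from the loss

def build_input_and_labels(messages, tokenizer, max_len=1536):
    # Render the full conversation with the model's chat template.
    input_ids = tokenizer.apply_chat_template(messages, tokenize=True)

    # Render everything up to (but not including) the assistant response,
    # with the generation prompt appended, to find the prompt length.
    prompt_ids = tokenizer.apply_chat_template(
        messages[:-1], tokenize=True, add_generation_prompt=True
    )

    # Mask system/user tokens; keep labels only for assistant tokens.
    labels = [IGNORE_INDEX] * len(prompt_ids) + input_ids[len(prompt_ids):]

    return input_ids[:max_len], labels[:max_len]
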
Training Configuration (Key)

  • Seed: 42
  • Max seq len: 1536
  • LR: 2e-4
  • Per-device batch: 2
  • Grad acc: 8 (effective batch 16)
  • Optim: AdamW 8-bit (Unsloth)
  • LoRA: r=16, alpha=32, dropout=0.0
  • Target modules: q/k/v/o_proj, gate/up/down_proj
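
In PEFT terms, the adapter settings above correspond roughly to the following configuration (a sketch of the key arguments, not the exact Unsloth training script):

from peft import LoraConfig

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.0,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)

# Trainer-side hyperparameters (TRL/Unsloth-style argument names):
# learning_rate=2e-4, per_device_train_batch_size=2,
# gradient_accumulation_steps=8, optim="adamw_8bit",
# max_seq_length=1536, seed=42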

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_id = "unsloth/Qwen3-4B-Instruct-2507"
adapter_id = "daichira/qwen3-4b-structured-output-20260102_V5_3T4_lora"

# The tokenizer comes from the base model.
tokenizer = AutoTokenizer.from_pretrained(base_id, use_fast=True)

# Load the base model in fp16, then attach the LoRA adapter.
model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

model = PeftModel.from_pretrained(model, adapter_id)
model.eval()
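
A generation example following the snippet above; the prompt and generation settings are illustrative:

messages = [
    {"role": "system", "content": "Return the answer as valid JSON."},
    {"role": "user", "content": "Extract name and price from: Acme Mug - $12.50"},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)

print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))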

Intended Use

  • Structured text generation (JSON / CSV / XML)
  • Format conversion & normalization
  • Information extraction pipelines
  • Educational baselines for student competitions

Limitations

  • Output validity is not guaranteed for underspecified prompts.
  • Optimized for structured output fidelity, not general-purpose reasoning.

Sources & Terms (IMPORTANT)

Training data (daichira/AppliedCourse_SFT_datasets) is derived from multiple upstream open-data sources. You must comply with upstream licenses/terms when using this adapter and any downstream derivatives.

  • OpenFoodFacts product database: database under ODbL, contents under DbCL, images under CC BY-SA (images are excluded from the training set)
  • Shopify product catalogue: product attribute text (safe columns only)
  • ontologicalapple/vrts-gtfs-archive: use is governed by the linked BC Transit Open Data Terms of Use (safe columns only)

This adapter repository contains LoRA weights only and does not redistribute upstream datasets. However, training on licensed data does not remove upstream obligations: attribution and terms compliance remain your responsibility.


License

  • License: other
  • Base model license and restrictions apply: unsloth/Qwen3-4B-Instruct-2507

Acknowledgements

  • Qwen / Alibaba for the base model
  • Unsloth for efficient QLoRA training
  • Hugging Face Transformers & PEFT ecosystem