---
datasets:
- iimran/Medical-Intelligence-Questions
base_model:
- Qwen/Qwen2.5-3B
language:
- en
tags:
- medical
- text-generation-inference
- transformers
- unsloth
---

# Qwen2.5-3B-R1-MedicalReasoner

**Qwen2.5-3B-R1-MedicalReasoner** is a clinical reasoning language model fine-tuned for advanced diagnostic and case-based problem solving. It is intended for applications in medical education, clinical decision support, and research, and generates detailed chain-of-thought responses that include both the reasoning process and the final answer.

## Overview

- **Model Name:** Qwen2.5-3B-R1-MedicalReasoner
- **Base Architecture:** Qwen2.5 (3B)
- **Primary Application:** Clinical reasoning and medical problem solving
- **Key Features:**
  - **Chain-of-Thought Outputs:** Responds with structured reasoning (`<reasoning> ... </reasoning>`) followed by a concise answer (`<answer> ... </answer>`).
  - **Multi-Specialty Coverage:** Well suited to scenarios in internal medicine, surgery, pediatrics, OB/GYN, emergency medicine, and more.
  - **Explainable AI:** Generates detailed, educational explanations that support clinical decision-making.

## Model Capabilities

- **Expert-Level Clinical Reasoning:** Analyzes complex clinical scenarios and provides in-depth diagnostic reasoning.
- **Structured Outputs:** Enforces a response format that separates the thought process from the final answer, aiding transparency and interpretability.
- **Optimized for Speed:** Uses Unsloth and vLLM for fast, efficient inference on GPU systems.
## Inference and Usage

Below is an example of how to use the model for inference (see also `inference.py` in the Files section):

```python
from unsloth import FastLanguageModel
from vllm import SamplingParams
from huggingface_hub import snapshot_download

# Load the base model in 4-bit with vLLM-backed fast inference.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="iimran/Qwen2.5-3B-R1-MedicalReasoner",
    load_in_4bit=True,
    fast_inference=True,
    gpu_memory_utilization=0.5,
)

# Attach a LoRA configuration matching the one used during fine-tuning.
lora_rank = 64
model = FastLanguageModel.get_peft_model(
    model,
    r=lora_rank,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_alpha=lora_rank,
    use_gradient_checkpointing="unsloth",
    random_state=3407,
)

# Download and load the trained LoRA adapter weights.
lora_path = snapshot_download("iimran/Qwen2.5-3B-R1-MedicalReasoner-lora-adapter")
print("LoRA adapter downloaded to:", lora_path)
model.load_lora(lora_path)

SYSTEM_PROMPT = (
    "Respond in the following format:\n"
    "<reasoning>\n"
    "...\n"
    "</reasoning>\n"
    "<answer>\n"
    "...\n"
    "</answer>"
)

USER_PROMPT = (
    "In the context of disseminated intravascular coagulation (DIC), "
    "which blood component is expected to show an increase due to the excessive breakdown of fibrin?"
)

text = tokenizer.apply_chat_template(
    [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": USER_PROMPT},
    ],
    tokenize=False,
    add_generation_prompt=True,
)

sampling_params = SamplingParams(
    temperature=0.1,
    top_p=0.95,
    max_tokens=4096,
)

outputs = model.fast_generate(
    text,
    sampling_params=sampling_params,
    lora_request=None,
)
print(outputs[0].outputs[0].text)
```

### Adapter Integration

For further fine-tuning or experiments with LoRA adapters, the LoRA adapter for this model is available in a separate repository.
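For downstream use, it is often convenient to split the raw generation into its reasoning and answer parts. Below is a minimal sketch, assuming the model follows the tag-based output format requested in the system prompt; `parse_response` is a hypothetical helper, not part of this repository:

```python
import re

def parse_response(text: str):
    """Split a model response into (reasoning, answer).

    Assumes the <reasoning>/<answer> tag format from SYSTEM_PROMPT.
    Either element is None if the model did not emit that section.
    """
    reasoning = re.search(r"<reasoning>\s*(.*?)\s*</reasoning>", text, re.DOTALL)
    answer = re.search(r"<answer>\s*(.*?)\s*</answer>", text, re.DOTALL)
    return (
        reasoning.group(1) if reasoning else None,
        answer.group(1) if answer else None,
    )

# Illustrative response text, not actual model output:
sample = (
    "<reasoning>\nFibrin breakdown in DIC releases fibrin degradation "
    "products.\n</reasoning>\n<answer>\nD-dimer\n</answer>"
)
reasoning, answer = parse_response(sample)
print(answer)  # D-dimer
```

Because generation is non-deterministic even at low temperature, callers should handle the `None` case rather than assume the tags are always present.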
- **LoRA Adapter Repo:** [iimran/Qwen2.5-3B-R1-MedicalReasoner-lora-adapter](https://huggingface.co/iimran/Qwen2.5-3B-R1-MedicalReasoner-lora-adapter)

To download and integrate the LoRA adapter:

```python
from huggingface_hub import snapshot_download

# Download the LoRA adapter repository:
lora_path = snapshot_download("iimran/Qwen2.5-3B-R1-MedicalReasoner-lora-adapter")
print("LoRA adapter downloaded to:", lora_path)

# Load the adapter into the model:
model.load_lora(lora_path)
```

## Installation

To use this model, install the required packages:

```bash
pip install unsloth vllm trl datasets huggingface-hub
```

A compatible GPU is recommended for optimal performance.

## Citation

If you use **Qwen2.5-3B-R1-MedicalReasoner** in your research, please cite:

```bibtex
@misc{sarwar2025reinforcement,
  author       = {Imran Sarwar and Muhammad Rouf Mustafa},
  title        = {Reinforcement Learning Elevates Qwen2.5-3B Medical Reasoning Performance},
  year         = {2025},
  month        = apr,
  day          = {10},
  publisher    = {Imran Sarwar's Blog},
  howpublished = {\url{https://www.imransarwar.com/blog-posts/Reinforcement-Learning-Elevates-Qwen2.5-Medical-Reasoning-Performance.html}},
  note         = {Accessed: 2025-04-09}
}
```

```bibtex
@misc{Qwen2.5-3B-R1-MedicalReasoner,
  author    = {Imran Sarwar and Muhammad Rouf Mustafa},
  title     = {Qwen 2.5-3B Meets Deepseek R1: A Fine-Tuned Medical Reasoning Model for Enhanced Diagnostics},
  year      = {2025},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/iimran/Qwen2.5-3B-R1-MedicalReasoner}
}
```

## Disclaimer

This model is intended for research and educational purposes only. It should not be used as the sole basis for clinical decision-making. All outputs should be validated by qualified healthcare professionals.