# bge-reranker-v2-m3

Multi-format version of BAAI/bge-reranker-v2-m3, optimized for deployment.
## Model Information
| Property | Value |
|---|---|
| Base Model | BAAI/bge-reranker-v2-m3 |
| Task | text-classification |
| Type | Text Model |
| Trust Remote Code | False |
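Since the task is standard `text-classification` and no remote code is required, the stock `transformers` pipeline can also drive the model. A minimal sketch, assuming the FP16 subfolder listed in the next section also contains the tokenizer files:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

# Plain text-classification task, trust_remote_code not needed;
# load model and tokenizer explicitly so the subfolder is respected.
model = AutoModelForSequenceClassification.from_pretrained(
    "n24q02m/bge-reranker-v2-m3", subfolder="safetensors-fp16"
)
tokenizer = AutoTokenizer.from_pretrained(
    "n24q02m/bge-reranker-v2-m3", subfolder="safetensors-fp16"
)
reranker = pipeline("text-classification", model=model, tokenizer=tokenizer)

# Query/passage pairs go in as text / text_pair.
print(reranker({"text": "what is panda?", "text_pair": "The giant panda is a bear species."}))
```

Recent `transformers` versions apply a sigmoid to single-logit models in this pipeline, so the reported score should land in [0, 1].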
## Available Versions

| Folder | Format | Description | Size |
|---|---|---|---|
| `safetensors-fp32/` | PyTorch FP32 | Baseline, highest accuracy | 2187 MB |
| `safetensors-fp16/` | PyTorch FP16 | GPU inference, ~50% smaller | 1104 MB |
## Usage

### PyTorch (GPU)
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# GPU inference with FP16
model = AutoModelForSequenceClassification.from_pretrained(
    "n24q02m/bge-reranker-v2-m3",
    subfolder="safetensors-fp16",
    torch_dtype=torch.float16,
).cuda()
tokenizer = AutoTokenizer.from_pretrained(
    "n24q02m/bge-reranker-v2-m3",
    subfolder="safetensors-fp16",
)

# Rerank inference: one relevance logit per (query, passage) pair
pairs = [["what is panda?", "The giant panda is a bear species."]]
with torch.no_grad():
    inputs = tokenizer(pairs, padding=True, truncation=True, return_tensors="pt").to("cuda")
    scores = model(**inputs).logits.view(-1).float()
print(scores)
```
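The raw logits are unbounded; higher means more relevant. To rank several candidates for one query, or to get scores in [0, 1] for thresholding, a sigmoid can be applied (this mirrors the normalized-score option in the upstream FlagEmbedding reranker). A minimal sketch, reusing the model and tokenizer loaded above:

```python
# Score several candidate passages against one query, then sort by relevance.
query = "what is panda?"
passages = [
    "The giant panda is a bear species endemic to China.",
    "pandas is a Python library for data analysis.",
]
pairs = [[query, p] for p in passages]

with torch.no_grad():
    inputs = tokenizer(pairs, padding=True, truncation=True, return_tensors="pt").to("cuda")
    logits = model(**inputs).logits.view(-1).float()
    probs = torch.sigmoid(logits)  # squash logits to [0, 1] for thresholding

ranked = sorted(zip(passages, probs.tolist()), key=lambda x: x[1], reverse=True)
for passage, score in ranked:
    print(f"{score:.3f}  {passage}")
```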
## Notes
- SafeTensors FP16 is the primary format for GPU inference; an FP32 CPU fallback is sketched below.
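For CPU-only environments, the FP32 baseline loads the same way. A minimal sketch, assuming the `safetensors-fp32/` subfolder ships the same tokenizer files as the FP16 one:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# CPU inference with the FP32 baseline weights
model = AutoModelForSequenceClassification.from_pretrained(
    "n24q02m/bge-reranker-v2-m3",
    subfolder="safetensors-fp32",
)
tokenizer = AutoTokenizer.from_pretrained(
    "n24q02m/bge-reranker-v2-m3",
    subfolder="safetensors-fp32",
)

pairs = [["what is panda?", "The giant panda is a bear species."]]
with torch.no_grad():
    inputs = tokenizer(pairs, padding=True, truncation=True, return_tensors="pt")
    scores = model(**inputs).logits.view(-1).float()
print(scores)
```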
## License
Apache 2.0 (following the base model's license)
## Credits
- Base Model: BAAI/bge-reranker-v2-m3
- Conversion: PyTorch + SafeTensors