---
library_name: peft
base_model: michiyasunaga/BioLinkBERT-large
tags:
- medical
- cardiology
- embeddings
- domain-adaptation
- lora
- sentence-transformers
- sentence-similarity
language:
- en
license: apache-2.0
---

# CardioEmbed-BioLinkBERT

**Domain-specialized cardiology text embeddings using LoRA-adapted BioLinkBERT-large**

This is the **best-performing model** from our comparative study of 10 embedding architectures for clinical cardiology.

## Performance

| Metric | Value |
|--------|-------|
| Separation Score | **0.510** |
| Similar-pair avg similarity | 0.811 |
| Different-pair avg similarity | 0.301 |
| Throughput | 143.5 embeddings/sec |
| Memory | 1.51 GB |

A sketch of how the separation score can be computed appears at the end of this card.

## Usage

```python
import torch
from transformers import AutoModel, AutoTokenizer
from peft import PeftModel

# Load the base model and tokenizer
base_model = AutoModel.from_pretrained("michiyasunaga/BioLinkBERT-large")
tokenizer = AutoTokenizer.from_pretrained("michiyasunaga/BioLinkBERT-large")

# Attach the LoRA adapter
model = PeftModel.from_pretrained(base_model, "richardyoung/CardioEmbed-BioLinkBERT")
model.eval()

# Generate embeddings with attention-mask-aware mean pooling,
# so padding tokens do not skew the average when batching
text = "Atrial fibrillation with rapid ventricular response"
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)
with torch.no_grad():
    outputs = model(**inputs)
mask = inputs["attention_mask"].unsqueeze(-1).float()
embeddings = (outputs.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)
```

## Training

- **Training data**: 106,535 cardiology text pairs from medical textbooks
- **Method**: LoRA fine-tuning (r=16, alpha=32)
- **Loss**: Multiple Negatives Ranking Loss (InfoNCE)

Illustrative sketches of the LoRA configuration and the training loss appear at the end of this card.

## Citation

```bibtex
@article{young2024comparative,
  title={Comparative Analysis of LoRA-Adapted Embedding Models for Clinical Cardiology Text Representation},
  author={Young, Richard J and Matthews, Alice M},
  journal={arXiv preprint},
  year={2024}
}
```

## Related Models

This model is part of the CardioEmbed family. See [richardyoung/CardioEmbed](https://huggingface.co/richardyoung/CardioEmbed) for the other models.
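## Computing the Separation Score

The card reports the separation score without defining it. Below is a minimal sketch under the assumption that it is the mean cosine similarity of similar pairs minus that of different pairs, which matches the table above (0.811 - 0.301 = 0.510); the function name and input format are hypothetical:

```python
import torch.nn.functional as F

def separation_score(similar_pairs, different_pairs):
    """Assumed metric: mean cosine similarity of similar pairs
    minus mean cosine similarity of different pairs.

    Each argument is a list of (embedding_a, embedding_b) tuples of
    1-D torch tensors, e.g. produced by the pooling code above.
    """
    def mean_sim(pairs):
        sims = [F.cosine_similarity(a, b, dim=-1).item() for a, b in pairs]
        return sum(sims) / len(sims)

    return mean_sim(similar_pairs) - mean_sim(different_pairs)
```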
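## Reproducing the LoRA Setup

Only the rank (r=16) and alpha (32) are reported above. The sketch below shows how such an adapter could be attached with `peft`; the target modules, dropout, and bias setting are assumptions for a BERT-style encoder, not the configuration actually used:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModel

# r and lora_alpha come from the card; everything else is an assumption
config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["query", "key", "value"],  # assumed attention projections
    lora_dropout=0.1,                          # assumed
    bias="none",                               # assumed
)

base_model = AutoModel.from_pretrained("michiyasunaga/BioLinkBERT-large")
peft_model = get_peft_model(base_model, config)
peft_model.print_trainable_parameters()
```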
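## Training Loss Sketch

Multiple Negatives Ranking Loss treats each pair's positive as the target and every other positive in the batch as a negative, i.e. InfoNCE over in-batch negatives. A minimal PyTorch version of that objective follows; the scale of 20.0 mirrors the sentence-transformers default but is an assumption here, and in practice `sentence_transformers.losses.MultipleNegativesRankingLoss` would be used directly:

```python
import torch
import torch.nn.functional as F

def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """InfoNCE over in-batch negatives.

    anchors, positives: (batch, dim) embedding tensors where row i of
    each forms a positive pair; every j != i acts as a negative.
    """
    anchors = F.normalize(anchors, dim=-1)
    positives = F.normalize(positives, dim=-1)
    # (batch, batch) cosine similarities; the diagonal holds true pairs
    scores = anchors @ positives.T * scale
    labels = torch.arange(scores.size(0), device=scores.device)
    return F.cross_entropy(scores, labels)
```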