SentenceTransformer based on sentence-transformers/embeddinggemma-300m-medical

This is a sentence-transformers model finetuned from sentence-transformers/embeddinggemma-300m-medical. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/embeddinggemma-300m-medical
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'Gemma3TextModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 768, 'out_features': 3072, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (3): Dense({'in_features': 3072, 'out_features': 768, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (4): Normalize()
)
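
As a quick sanity check of the stack above, this minimal sketch (assuming the published model id greyplan/loinc-multilingual-embeddings named at the end of this card) loads the model and confirms the 256-token sequence limit and the 768-dimensional, unit-length output produced by the final Dense and Normalize modules:

import numpy as np
from sentence_transformers import SentenceTransformer

# Load the finetuned model (model id assumed from the model tree below)
model = SentenceTransformer("greyplan/loinc-multilingual-embeddings")

print(model.max_seq_length)                      # 256
print(model.get_sentence_embedding_dimension())  # 768

# The Normalize() module makes every embedding unit-length
emb = model.encode(["Velocity slope 2 by US"])
print(emb.shape)               # (1, 768)
print(np.linalg.norm(emb[0]))  # ~1.0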

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("greyplan/loinc-multilingual-embeddings")
# Run inference
queries = [
    "Velocity slope 2 by US",
]
documents = [
    'Sonogram',
    '顺序型 存在情况',
    'Cardiology',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 768] [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[0.4595, 0.0490, 0.4026]])
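
Beyond pairwise similarity, the same encoders can drive retrieval over a larger candidate list with util.semantic_search. A small sketch (the corpus strings are illustrative placeholders, not taken from the training data):

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("greyplan/loinc-multilingual-embeddings")

# Illustrative candidate descriptions; substitute your own LOINC term list
corpus = [
    "Sonogram",
    "Cardiology",
    "Hemoglobin [Mass/volume] in Blood",
]
corpus_embeddings = model.encode_document(corpus)
query_embeddings = model.encode_query(["Velocity slope 2 by US"])

# Top-2 candidates per query, ranked by cosine similarity
hits = util.semantic_search(query_embeddings, corpus_embeddings, top_k=2)
for hit in hits[0]:
    print(corpus[hit["corpus_id"]], round(hit["score"], 4))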

Training Details

Training Dataset

Unnamed Dataset

  • Size: 5,483,754 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    anchor: string; min: 4 tokens, mean: 16.23 tokens, max: 54 tokens
    positive: string; min: 3 tokens, mean: 7.32 tokens, max: 99 tokens
  • Samples:
    anchor: Calcium [Mass/volume] in Serum or Plasma --4 hours post XXX challenge
    positive: 4Hr之后于XXX刺激
    anchor: Centers for Environmental Health abrine and ricinine panel [Mass/volume] - Urine
    positive: Ricidine
    anchor: HIV 1 Ag [Presence] in Serum
    positive: 艾滋病
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
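
For reference, the parameters above map onto the loss constructor roughly as in this sketch; the actual training script is not part of this card, and the base model id is taken from the title:

from sentence_transformers import SentenceTransformer, losses, util

model = SentenceTransformer("sentence-transformers/embeddinggemma-300m-medical")

# scale=20.0 with cosine similarity, per the listed parameters; every other
# (anchor, positive) pair in the batch serves as an in-batch negative
loss = losses.MultipleNegativesRankingLoss(
    model,
    scale=20.0,
    similarity_fct=util.cos_sim,
    gather_across_devices=False,  # default: negatives stay per-device
)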
    

Evaluation Dataset

Unnamed Dataset

  • Size: 5,000 evaluation samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    anchor: string; min: 4 tokens, mean: 16.46 tokens, max: 56 tokens
    positive: string; min: 3 tokens, mean: 7.48 tokens, max: 74 tokens
  • Samples:
    anchor: Propylparaben IgE Ab [Units/volume] in Serum
    positive: Propylparaben Ab.IgE in Ser
    anchor: SLCO1B1 gene targeted mutation analysis in Blood or Tissue by Molecular genetics method
    positive: 临床医疗文书 全血或组织
    anchor: Borrelia burgdorferi 39kD IgG Ab [Presence] in Cerebral spinal fluid by Immunoblot
    positive: West blt
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 512
  • per_device_eval_batch_size: 512
  • learning_rate: 2e-05
  • warmup_ratio: 0.1
  • bf16: True
  • dataloader_num_workers: 8
  • load_best_model_at_end: True
  • ddp_find_unused_parameters: False
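
These values map directly onto SentenceTransformerTrainingArguments; a sketch with only the non-default values filled in (output_dir is an assumption, not stated in the card):

from sentence_transformers import SentenceTransformerTrainingArguments

# Non-default hyperparameters from the list above; everything else keeps its default
args = SentenceTransformerTrainingArguments(
    output_dir="loinc-multilingual-embeddings",  # assumed name, not stated in the card
    eval_strategy="steps",
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    learning_rate=2e-5,
    warmup_ratio=0.1,
    bf16=True,
    dataloader_num_workers=8,
    load_best_model_at_end=True,
    ddp_find_unused_parameters=False,
)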

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 512
  • per_device_eval_batch_size: 512
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 1
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: True
  • dataloader_num_workers: 8
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: False
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss Validation Loss
0.0187 50 4.957 -
0.0374 100 4.1103 -
0.0560 150 3.6413 -
0.0747 200 3.4118 -
0.0934 250 3.2548 -
0.1121 300 3.1812 -
0.1307 350 3.0903 -
0.1494 400 3.0572 -
0.1681 450 2.9969 -
0.1868 500 2.9774 -
0.2055 550 2.9353 -
0.2241 600 2.9169 -
0.2428 650 2.906 -
0.2615 700 2.9007 -
0.2802 750 2.8852 -
0.2988 800 2.8636 -
0.3175 850 2.8541 -
0.3362 900 2.8332 -
0.3549 950 2.8334 -
0.3736 1000 2.8159 -
0.3922 1050 2.8018 -
0.4109 1100 2.7881 -
0.4296 1150 2.7733 -
0.4483 1200 2.7622 -
0.4669 1250 2.7567 -
0.4856 1300 2.7561 -
0.4998 1338 - 2.7056
0.5043 1350 2.7557 -
0.5230 1400 2.7536 -
0.5417 1450 2.734 -
0.5603 1500 2.7352 -
0.5790 1550 2.7109 -
0.5977 1600 2.7291 -
0.6164 1650 2.7186 -
0.6350 1700 2.7192 -
0.6537 1750 2.7166 -
0.6724 1800 2.7015 -
0.6911 1850 2.6988 -
0.7097 1900 2.6962 -
0.7284 1950 2.6809 -
0.7471 2000 2.6928 -
0.7658 2050 2.6989 -
0.7845 2100 2.6916 -
0.8031 2150 2.6834 -
0.8218 2200 2.6836 -
0.8405 2250 2.6692 -
0.8592 2300 2.676 -
0.8778 2350 2.6723 -
0.8965 2400 2.6733 -
0.9152 2450 2.6605 -
0.9339 2500 2.6687 -
0.9526 2550 2.6549 -
0.9712 2600 2.652 -
0.9899 2650 2.6467 -
0.9996 2676 - 2.6273
1.0086 2700 2.6369 -
1.0273 2750 2.6196 -
1.0459 2800 2.6226 -
1.0646 2850 2.6201 -
1.0833 2900 2.6257 -
1.1020 2950 2.6275 -
1.1207 3000 2.6225 -
1.1393 3050 2.6181 -
1.1580 3100 2.6162 -
1.1767 3150 2.6213 -
1.1954 3200 2.6278 -
1.2140 3250 2.6092 -
1.2327 3300 2.6217 -
1.2514 3350 2.6179 -
1.2701 3400 2.6167 -
1.2888 3450 2.5976 -
1.3074 3500 2.6204 -
1.3261 3550 2.6267 -
1.3448 3600 2.6226 -
1.3635 3650 2.6226 -
1.3821 3700 2.6127 -
1.4008 3750 2.6072 -
1.4195 3800 2.6006 -
1.4382 3850 2.6111 -
1.4569 3900 2.6043 -
1.4755 3950 2.6061 -
1.4942 4000 2.6149 -
1.4994 4014 - 2.5884
1.5129 4050 2.6128 -
1.5316 4100 2.6023 -
1.5502 4150 2.6046 -
1.5689 4200 2.6043 -
1.5876 4250 2.5917 -
1.6063 4300 2.6104 -
1.6250 4350 2.6028 -
1.6436 4400 2.6005 -
1.6623 4450 2.6005 -
1.6810 4500 2.604 -
1.6997 4550 2.5974 -
1.7183 4600 2.5987 -
1.7370 4650 2.6011 -
1.7557 4700 2.59 -
1.7744 4750 2.6034 -
1.7931 4800 2.581 -
1.8117 4850 2.589 -
1.8304 4900 2.5926 -
1.8491 4950 2.5929 -
1.8678 5000 2.5889 -
1.8864 5050 2.5999 -
1.9051 5100 2.5768 -
1.9238 5150 2.5732 -
1.9425 5200 2.5784 -
1.9612 5250 2.5808 -
1.9798 5300 2.5846 -
1.9985 5350 2.5894 -
1.9993 5352 - 2.5568
2.0172 5400 2.5608 -
2.0359 5450 2.5517 -
2.0545 5500 2.5537 -
2.0732 5550 2.5534 -
2.0919 5600 2.5629 -
2.1106 5650 2.5509 -
2.1292 5700 2.5585 -
2.1479 5750 2.5531 -
2.1666 5800 2.5539 -
2.1853 5850 2.5651 -
2.2040 5900 2.5584 -
2.2226 5950 2.5475 -
2.2413 6000 2.5572 -
2.2600 6050 2.5531 -
2.2787 6100 2.554 -
2.2973 6150 2.5589 -
2.3160 6200 2.556 -
2.3347 6250 2.5622 -
2.3534 6300 2.5417 -
2.3721 6350 2.5595 -
2.3907 6400 2.5552 -
2.4094 6450 2.5509 -
2.4281 6500 2.5439 -
2.4468 6550 2.5573 -
2.4654 6600 2.554 -
2.4841 6650 2.5569 -
2.4991 6690 - 2.5476
2.5028 6700 2.5393 -
2.5215 6750 2.5419 -
2.5402 6800 2.5516 -
2.5588 6850 2.5529 -
2.5775 6900 2.5548 -
2.5962 6950 2.5443 -
2.6149 7000 2.5365 -
2.6335 7050 2.5376 -
2.6522 7100 2.5539 -
2.6709 7150 2.5559 -
2.6896 7200 2.5506 -
2.7083 7250 2.55 -
2.7269 7300 2.5602 -
2.7456 7350 2.5537 -
2.7643 7400 2.5404 -
2.7830 7450 2.5464 -
2.8016 7500 2.5446 -
2.8203 7550 2.5376 -
2.8390 7600 2.5504 -
2.8577 7650 2.5507 -
2.8764 7700 2.5358 -
2.8950 7750 2.5476 -
2.9137 7800 2.5295 -
2.9324 7850 2.5337 -
2.9511 7900 2.5449 -
2.9697 7950 2.5457 -
2.9884 8000 2.5403 -
2.9989 8028 - 2.5336
  • The bold row (step 8028, validation loss 2.5336) denotes the saved checkpoint.

Framework Versions

  • Python: 3.12.3
  • Sentence Transformers: 5.1.2
  • Transformers: 4.57.3
  • PyTorch: 2.9.1+cu128
  • Accelerate: 1.12.0
  • Datasets: 4.4.1
  • Tokenizers: 0.22.1
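
To approximate this environment, the Python packages can be pinned to the listed versions (the CUDA-specific PyTorch build usually needs the matching wheel index, which is omitted here):

pip install "sentence-transformers==5.1.2" "transformers==4.57.3" "accelerate==1.12.0" "datasets==4.4.1" "tokenizers==0.22.1"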

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Model tree for greyplan/loinc-multilingual-embeddings