SentenceTransformer based on sentence-transformers/embeddinggemma-300m-medical

This is a sentence-transformers model finetuned from sentence-transformers/embeddinggemma-300m-medical. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/embeddinggemma-300m-medical
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'Gemma3TextModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 768, 'out_features': 3072, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (3): Dense({'in_features': 3072, 'out_features': 768, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (4): Normalize()
)
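
As a quick sanity check of the stack above, this minimal sketch (assuming the published model id greyplan/loinc-multilingual-embeddings named at the end of this card) loads the model and confirms the 256-token sequence limit and the 768-dimensional, unit-length output produced by the final Dense and Normalize modules:

import numpy as np
from sentence_transformers import SentenceTransformer

# Load the finetuned model (model id assumed from the model tree below)
model = SentenceTransformer("greyplan/loinc-multilingual-embeddings")

print(model.max_seq_length)                      # 256
print(model.get_sentence_embedding_dimension())  # 768

# The Normalize() module makes every embedding unit-length
emb = model.encode(["Velocity slope 2 by US"])
print(emb.shape)               # (1, 768)
print(np.linalg.norm(emb[0]))  # ~1.0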

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("greyplan/loinc-multilingual-embeddings")
# Run inference
queries = [
    "Velocity slope 2 by US",
]
documents = [
    'Sonogram',
    '顺序型 存在情况',
    'Cardiology',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 768] [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[0.4595, 0.0490, 0.4026]])
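
Beyond pairwise similarity, the same encoders can drive retrieval over a larger candidate list with util.semantic_search. A small sketch (the corpus strings are illustrative placeholders, not taken from the training data):

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("greyplan/loinc-multilingual-embeddings")

# Illustrative candidate descriptions; substitute your own LOINC term list
corpus = [
    "Sonogram",
    "Cardiology",
    "Hemoglobin [Mass/volume] in Blood",
]
corpus_embeddings = model.encode_document(corpus)
query_embeddings = model.encode_query(["Velocity slope 2 by US"])

# Top-2 candidates per query, ranked by cosine similarity
hits = util.semantic_search(query_embeddings, corpus_embeddings, top_k=2)
for hit in hits[0]:
    print(corpus[hit["corpus_id"]], round(hit["score"], 4))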

Training Details

Training Dataset

Unnamed Dataset

  • Size: 5,483,754 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    anchor: string; min: 4 tokens, mean: 16.23 tokens, max: 54 tokens
    positive: string; min: 3 tokens, mean: 7.32 tokens, max: 99 tokens
  • Samples:
    anchor: Calcium [Mass/volume] in Serum or Plasma --4 hours post XXX challenge
    positive: 4Hr之后于XXX刺激
    anchor: Centers for Environmental Health abrine and ricinine panel [Mass/volume] - Urine
    positive: Ricidine
    anchor: HIV 1 Ag [Presence] in Serum
    positive: 艾滋病
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
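
For reference, the parameters above map onto the loss constructor roughly as in this sketch; the actual training script is not part of this card, and the base model id is taken from the title:

from sentence_transformers import SentenceTransformer, losses, util

model = SentenceTransformer("sentence-transformers/embeddinggemma-300m-medical")

# scale=20.0 with cosine similarity, per the listed parameters; every other
# (anchor, positive) pair in the batch serves as an in-batch negative
loss = losses.MultipleNegativesRankingLoss(
    model,
    scale=20.0,
    similarity_fct=util.cos_sim,
    gather_across_devices=False,  # default: negatives stay per-device
)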
    

Evaluation Dataset

Unnamed Dataset

  • Size: 5,000 evaluation samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    anchor: string; min: 4 tokens, mean: 16.46 tokens, max: 56 tokens
    positive: string; min: 3 tokens, mean: 7.48 tokens, max: 74 tokens
  • Samples:
    anchor: Propylparaben IgE Ab [Units/volume] in Serum
    positive: Propylparaben Ab.IgE in Ser
    anchor: SLCO1B1 gene targeted mutation analysis in Blood or Tissue by Molecular genetics method
    positive: 临床医疗文书 全血或组织
    anchor: Borrelia burgdorferi 39kD IgG Ab [Presence] in Cerebral spinal fluid by Immunoblot
    positive: West blt
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 512
  • per_device_eval_batch_size: 512
  • learning_rate: 2e-05
  • warmup_ratio: 0.1
  • bf16: True
  • dataloader_num_workers: 8
  • load_best_model_at_end: True
  • ddp_find_unused_parameters: False
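
These values map directly onto SentenceTransformerTrainingArguments; a sketch with only the non-default values filled in (output_dir is an assumption, not stated in the card):

from sentence_transformers import SentenceTransformerTrainingArguments

# Non-default hyperparameters from the list above; everything else keeps its default
args = SentenceTransformerTrainingArguments(
    output_dir="loinc-multilingual-embeddings",  # assumed name, not stated in the card
    eval_strategy="steps",
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    learning_rate=2e-5,
    warmup_ratio=0.1,
    bf16=True,
    dataloader_num_workers=8,
    load_best_model_at_end=True,
    ddp_find_unused_parameters=False,
)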

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 512
  • per_device_eval_batch_size: 512
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 1
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: True
  • dataloader_num_workers: 8
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: False
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss Validation Loss
0.0187 50 4.957 -
0.0374 100 4.1103 -
0.0560 150 3.6413 -
0.0747 200 3.4118 -
0.0934 250 3.2548 -
0.1121 300 3.1812 -
0.1307 350 3.0903 -
0.1494 400 3.0572 -
0.1681 450 2.9969 -
0.1868 500 2.9774 -
0.2055 550 2.9353 -
0.2241 600 2.9169 -
0.2428 650 2.906 -
0.2615 700 2.9007 -
0.2802 750 2.8852 -
0.2988 800 2.8636 -
0.3175 850 2.8541 -
0.3362 900 2.8332 -
0.3549 950 2.8334 -
0.3736 1000 2.8159 -
0.3922 1050 2.8018 -
0.4109 1100 2.7881 -
0.4296 1150 2.7733 -
0.4483 1200 2.7622 -
0.4669 1250 2.7567 -
0.4856 1300 2.7561 -
0.4998 1338 - 2.7056
0.5043 1350 2.7557 -
0.5230 1400 2.7536 -
0.5417 1450 2.734 -
0.5603 1500 2.7352 -
0.5790 1550 2.7109 -
0.5977 1600 2.7291 -
0.6164 1650 2.7186 -
0.6350 1700 2.7192 -
0.6537 1750 2.7166 -
0.6724 1800 2.7015 -
0.6911 1850 2.6988 -
0.7097 1900 2.6962 -
0.7284 1950 2.6809 -
0.7471 2000 2.6928 -
0.7658 2050 2.6989 -
0.7845 2100 2.6916 -
0.8031 2150 2.6834 -
0.8218 2200 2.6836 -
0.8405 2250 2.6692 -
0.8592 2300 2.676 -
0.8778 2350 2.6723 -
0.8965 2400 2.6733 -
0.9152 2450 2.6605 -
0.9339 2500 2.6687 -
0.9526 2550 2.6549 -
0.9712 2600 2.652 -
0.9899 2650 2.6467 -
0.9996 2676 - 2.6273
1.0086 2700 2.6369 -
1.0273 2750 2.6196 -
1.0459 2800 2.6226 -
1.0646 2850 2.6201 -
1.0833 2900 2.6257 -
1.1020 2950 2.6275 -
1.1207 3000 2.6225 -
1.1393 3050 2.6181 -
1.1580 3100 2.6162 -
1.1767 3150 2.6213 -
1.1954 3200 2.6278 -
1.2140 3250 2.6092 -
1.2327 3300 2.6217 -
1.2514 3350 2.6179 -
1.2701 3400 2.6167 -
1.2888 3450 2.5976 -
1.3074 3500 2.6204 -
1.3261 3550 2.6267 -
1.3448 3600 2.6226 -
1.3635 3650 2.6226 -
1.3821 3700 2.6127 -
1.4008 3750 2.6072 -
1.4195 3800 2.6006 -
1.4382 3850 2.6111 -
1.4569 3900 2.6043 -
1.4755 3950 2.6061 -
1.4942 4000 2.6149 -
1.4994 4014 - 2.5884
1.5129 4050 2.6128 -
1.5316 4100 2.6023 -
1.5502 4150 2.6046 -
1.5689 4200 2.6043 -
1.5876 4250 2.5917 -
1.6063 4300 2.6104 -
1.6250 4350 2.6028 -
1.6436 4400 2.6005 -
1.6623 4450 2.6005 -
1.6810 4500 2.604 -
1.6997 4550 2.5974 -
1.7183 4600 2.5987 -
1.7370 4650 2.6011 -
1.7557 4700 2.59 -
1.7744 4750 2.6034 -
1.7931 4800 2.581 -
1.8117 4850 2.589 -
1.8304 4900 2.5926 -
1.8491 4950 2.5929 -
1.8678 5000 2.5889 -
1.8864 5050 2.5999 -
1.9051 5100 2.5768 -
1.9238 5150 2.5732 -
1.9425 5200 2.5784 -
1.9612 5250 2.5808 -
1.9798 5300 2.5846 -
1.9985 5350 2.5894 -
1.9993 5352 - 2.5568
2.0172 5400 2.5608 -
2.0359 5450 2.5517 -
2.0545 5500 2.5537 -
2.0732 5550 2.5534 -
2.0919 5600 2.5629 -
2.1106 5650 2.5509 -
2.1292 5700 2.5585 -
2.1479 5750 2.5531 -
2.1666 5800 2.5539 -
2.1853 5850 2.5651 -
2.2040 5900 2.5584 -
2.2226 5950 2.5475 -
2.2413 6000 2.5572 -
2.2600 6050 2.5531 -
2.2787 6100 2.554 -
2.2973 6150 2.5589 -
2.3160 6200 2.556 -
2.3347 6250 2.5622 -
2.3534 6300 2.5417 -
2.3721 6350 2.5595 -
2.3907 6400 2.5552 -
2.4094 6450 2.5509 -
2.4281 6500 2.5439 -
2.4468 6550 2.5573 -
2.4654 6600 2.554 -
2.4841 6650 2.5569 -
2.4991 6690 - 2.5476
2.5028 6700 2.5393 -
2.5215 6750 2.5419 -
2.5402 6800 2.5516 -
2.5588 6850 2.5529 -
2.5775 6900 2.5548 -
2.5962 6950 2.5443 -
2.6149 7000 2.5365 -
2.6335 7050 2.5376 -
2.6522 7100 2.5539 -
2.6709 7150 2.5559 -
2.6896 7200 2.5506 -
2.7083 7250 2.55 -
2.7269 7300 2.5602 -
2.7456 7350 2.5537 -
2.7643 7400 2.5404 -
2.7830 7450 2.5464 -
2.8016 7500 2.5446 -
2.8203 7550 2.5376 -
2.8390 7600 2.5504 -
2.8577 7650 2.5507 -
2.8764 7700 2.5358 -
2.8950 7750 2.5476 -
2.9137 7800 2.5295 -
2.9324 7850 2.5337 -
2.9511 7900 2.5449 -
2.9697 7950 2.5457 -
2.9884 8000 2.5403 -
2.9989 8028 - 2.5336
  • The bold row (step 8028, validation loss 2.5336) denotes the saved checkpoint.

Framework Versions

  • Python: 3.12.3
  • Sentence Transformers: 5.1.2
  • Transformers: 4.57.3
  • PyTorch: 2.9.1+cu128
  • Accelerate: 1.12.0
  • Datasets: 4.4.1
  • Tokenizers: 0.22.1
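
To approximate this environment, the Python packages can be pinned to the listed versions (the CUDA-specific PyTorch build usually needs the matching wheel index, which is omitted here):

pip install "sentence-transformers==5.1.2" "transformers==4.57.3" "accelerate==1.12.0" "datasets==4.4.1" "tokenizers==0.22.1"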

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Model tree for greyplan/loinc-multilingual-embeddings