---
language: []
library_name: sentence-transformers
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:557850
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
base_model: intfloat/multilingual-e5-small
datasets: []
metrics:
- pearson_cosine
- spearman_cosine
- pearson_manhattan
- spearman_manhattan
- pearson_euclidean
- spearman_euclidean
- pearson_dot
- spearman_dot
- pearson_max
- spearman_max
widget:
- source_sentence: ذكر متوازن بعناية يقف على قدم واحدة بالقرب من منطقة شاطئ المحيط النظيفة
  sentences:
  - رجل يقدم عرضاً
  - هناك رجل بالخارج قرب الشاطئ
  - رجل يجلس على أريكه
- source_sentence: رجل يقفز إلى سريره القذر
  sentences:
  - السرير قذر.
  - رجل يضحك أثناء غسيل الملابس
  - الرجل على القمر
- source_sentence: الفتيات بالخارج
  sentences:
  - امرأة تلف الخيط إلى كرات بجانب كومة من الكرات
  - فتيان يركبان في جولة متعة
  - >-
    ثلاث فتيات يقفون سوية في غرفة واحدة تستمع وواحدة تكتب على الحائط والثالثة تتحدث إليهن
- source_sentence: الرجل يرتدي قميصاً أزرق.
  sentences:
  - >-
    رجل يرتدي قميصاً أزرق يميل إلى الجدار بجانب الطريق مع شاحنة زرقاء وسيارة حمراء مع الماء في الخلفية.
  - كتاب القصص مفتوح
  - رجل يرتدي قميص أسود يعزف على الجيتار.
- source_sentence: يجلس شاب ذو شعر أشقر على الحائط يقرأ جريدة بينما تمر امرأة وفتاة شابة.
  sentences:
  - ذكر شاب ينظر إلى جريدة بينما تمر إمرأتان بجانبه
  - رجل يستلقي على وجهه على مقعد في الحديقة.
  - الشاب نائم بينما الأم تقود ابنتها إلى الحديقة
pipeline_tag: sentence-similarity
model-index:
- name: SentenceTransformer based on intfloat/multilingual-e5-small
  results:
  - dataset:
      config: ar
      name: MTEB MIRACLRetrievalHardNegatives (ar)
      revision: 95c8db7d4a6e9c1d8a60601afd63d553ae20a2eb
      split: dev
      type: mteb/miracl-hard-negatives
    metrics:
    - type: main_score
      value: 33.441
    task:
      type: Retrieval
  - dataset:
      config: ara-ara
      name: MTEB MLQARetrieval (ara-ara)
      revision: 397ed406c1a7902140303e7faf60fff35b58d285
      split: test
      type: facebook/mlqa
    metrics:
    - type: main_score
      value: 64.488
    task:
      type: Retrieval
  - dataset:
      config: ar
      name: MTEB MintakaRetrieval (ar)
      revision: efa78cc2f74bbcd21eff2261f9e13aebe40b814e
      split: test
      type: jinaai/mintakaqa
    metrics:
    - type: main_score
      value: 16.162
    task:
      type: Retrieval
  - dataset:
      config: default
      name: MTEB SadeemQuestionRetrieval (default)
      revision: 3cb0752b182e5d5d740df547748b06663c8e0bd9
      split: test
      type: sadeem-ai/sadeem-ar-eval-retrieval-questions
    metrics:
    - type: main_score
      value: 63.235
    task:
      type: Retrieval
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: sts test 384
      type: sts-test-384
    metrics:
    - type: pearson_cosine
      value: 0.7883137447514015
      name: Pearson Cosine
    - type: spearman_cosine
      value: 0.7971624317482785
      name: Spearman Cosine
    - type: pearson_manhattan
      value: 0.7845904338398069
      name: Pearson Manhattan
    - type: spearman_manhattan
      value: 0.7939541836133244
      name: Spearman Manhattan
    - type: pearson_euclidean
      value: 0.7882887522003604
      name: Pearson Euclidean
    - type: spearman_euclidean
      value: 0.7971601260546269
      name: Spearman Euclidean
    - type: pearson_dot
      value: 0.7883137483129774
      name: Pearson Dot
    - type: spearman_dot
      value: 0.7971605875966696
      name: Spearman Dot
    - type: pearson_max
      value: 0.7883137483129774
      name: Pearson Max
    - type: spearman_max
      value: 0.7971624317482785
      name: Spearman Max
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: sts test 256
      type: sts-test-256
    metrics:
    - type: pearson_cosine
      value: 0.7851969391652749
      name: Pearson Cosine
    - type: spearman_cosine
      value: 0.7968026743946358
      name: Spearman Cosine
    - type: pearson_manhattan
      value: 0.7852783784725356
      name: Pearson Manhattan
    - type: spearman_manhattan
      value: 0.7935883492889713
      name: Spearman Manhattan
    - type: pearson_euclidean
      value: 0.7882018230746569
      name: Pearson Euclidean
    - type: spearman_euclidean
      value: 0.7963116553267949
      name: Spearman Euclidean
    - type: pearson_dot
      value: 0.7786421988393841
      name: Pearson Dot
    - type: spearman_dot
      value: 0.7867782644180616
      name: Spearman Dot
    - type: pearson_max
      value: 0.7882018230746569
      name: Pearson Max
    - type: spearman_max
      value: 0.7968026743946358
      name: Spearman Max
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: sts test 128
      type: sts-test-128
    metrics:
    - type: pearson_cosine
      value: 0.7754967709350954
      name: Pearson Cosine
    - type: spearman_cosine
      value: 0.7933453885370457
      name: Spearman Cosine
    - type: pearson_manhattan
      value: 0.7832834632297865
      name: Pearson Manhattan
    - type: spearman_manhattan
      value: 0.7907589269176767
      name: Spearman Manhattan
    - type: pearson_euclidean
      value: 0.7867583047946054
      name: Pearson Euclidean
    - type: spearman_euclidean
      value: 0.7935816990844704
      name: Spearman Euclidean
    - type: pearson_dot
      value: 0.7317253736607925
      name: Pearson Dot
    - type: spearman_dot
      value: 0.7335574962775742
      name: Spearman Dot
    - type: pearson_max
      value: 0.7867583047946054
      name: Pearson Max
    - type: spearman_max
      value: 0.7935816990844704
      name: Spearman Max
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: sts test 64
      type: sts-test-64
    metrics:
    - type: pearson_cosine
      value: 0.7625204599039478
      name: Pearson Cosine
    - type: spearman_cosine
      value: 0.7837078735068292
      name: Spearman Cosine
    - type: pearson_manhattan
      value: 0.7752889433866854
      name: Pearson Manhattan
    - type: spearman_manhattan
      value: 0.7790888579029828
      name: Spearman Manhattan
    - type: pearson_euclidean
      value: 0.777961287133872
      name: Pearson Euclidean
    - type: spearman_euclidean
      value: 0.7815940757356076
      name: Spearman Euclidean
    - type: pearson_dot
      value: 0.6685094830550401
      name: Pearson Dot
    - type: spearman_dot
      value: 0.6621206899696827
      name: Spearman Dot
    - type: pearson_max
      value: 0.777961287133872
      name: Pearson Max
    - type: spearman_max
      value: 0.7837078735068292
      name: Spearman Max
---
# SentenceTransformer based on intfloat/multilingual-e5-small

This is a sentence-transformers model fine-tuned from intfloat/multilingual-e5-small on the Omartificial-Intelligence-Space/arabic-n_li-triplet dataset. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Usage

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Omartificial-Intelligence-Space/E5-Matro")

# Run inference
sentences = [
    'يجلس شاب ذو شعر أشقر على الحائط يقرأ جريدة بينما تمر امرأة وفتاة شابة.',
    'ذكر شاب ينظر إلى جريدة بينما تمر إمرأتان بجانبه',
    'الشاب نائم بينما الأم تقود ابنتها إلى الحديقة',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
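Because the model was trained with MatryoshkaLoss, the 384-dimensional embeddings can be truncated to the smaller dimensions evaluated above (256, 128, or 64) with only a modest drop in quality. The sketch below shows the truncation mechanics on stand-in NumPy vectors rather than real model output; recent versions of sentence-transformers can also do this directly via the `truncate_dim` constructor argument.

```python
import numpy as np

# Toy stand-ins for model.encode() output: 3 embeddings of dimension 384.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(3, 384)).astype(np.float32)

def truncate_and_normalize(emb: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components and re-normalize to unit length."""
    truncated = emb[:, :dim]
    return truncated / np.linalg.norm(truncated, axis=1, keepdims=True)

small = truncate_and_normalize(embeddings, 64)
similarities_64 = small @ small.T  # cosine similarity at 64 dims
print(small.shape)           # (3, 64)
print(similarities_64.shape) # (3, 3)
```

Re-normalizing after truncation matters: the leading 64 components are not unit-length on their own, so the dot product is only a cosine similarity once each prefix is rescaled.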
## Citation

### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```
### MatryoshkaLoss

```bibtex
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```
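The core idea of Matryoshka Representation Learning is to apply an underlying embedding loss at several truncation dimensions and sum the results (optionally with per-dimension weights), so that every prefix of the embedding stays useful. A minimal NumPy sketch of that idea, with a toy inner loss standing in for the real one; this is an illustration, not the library's actual implementation:

```python
import numpy as np

def toy_inner_loss(emb_a: np.ndarray, emb_b: np.ndarray) -> float:
    # Stand-in for any embedding loss (e.g. MultipleNegativesRankingLoss):
    # here, mean squared distance between paired embeddings.
    return float(np.mean((emb_a - emb_b) ** 2))

def matryoshka_loss(emb_a, emb_b, dims=(384, 256, 128, 64), weights=None):
    """Sum the inner loss over truncated (re-normalized) embedding prefixes."""
    weights = weights or [1.0] * len(dims)
    total = 0.0
    for w, d in zip(weights, dims):
        a = emb_a[:, :d] / np.linalg.norm(emb_a[:, :d], axis=1, keepdims=True)
        b = emb_b[:, :d] / np.linalg.norm(emb_b[:, :d], axis=1, keepdims=True)
        total += w * toy_inner_loss(a, b)
    return total

rng = np.random.default_rng(0)
a, b = rng.normal(size=(2, 8, 384))  # a toy batch of 8 embedding pairs
print(matryoshka_loss(a, b))
```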
### MultipleNegativesRankingLoss

```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```
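MultipleNegativesRankingLoss treats each (anchor, positive) pair in a batch as the correct match and every other positive in the same batch as a negative, then applies cross-entropy over the scaled similarity matrix. A hedged NumPy sketch of that objective (the actual training used the library's PyTorch implementation; the scale of 20 is a common default, not a value confirmed by this card):

```python
import numpy as np

def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """In-batch-negatives loss: anchor i's positive is row i of `positives`;
    every other row in the batch serves as a negative."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = scale * (a @ p.T)  # cosine similarity matrix, scaled
    # Row-wise softmax cross-entropy with the diagonal as the correct class.
    shifted = scores - scores.max(axis=1, keepdims=True)  # numeric stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

rng = np.random.default_rng(0)
anchors = rng.normal(size=(4, 384))
positives = anchors + 0.1 * rng.normal(size=(4, 384))  # near-duplicates
print(multiple_negatives_ranking_loss(anchors, positives))
```

With well-matched pairs the diagonal dominates and the loss is near zero; shuffling the positives breaks the pairing and drives the loss up, which is the signal the model is trained on.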