---
license: bigscience-bloom-rail-1.0
datasets:
- xnli
language:
- fr
- en
pipeline_tag: zero-shot-classification
---

# Presentation
We introduce the Bloomz-7b1-mt-NLI model, fine-tuned from the [Bloomz-7b1-mt-chat-dpo](https://huggingface.co/cmarkea/bloomz-7b1-mt-dpo-chat) foundation model.
This model is trained on a Natural Language Inference (NLI) task in a language-agnostic manner. The NLI task consists of determining the semantic relationship between a hypothesis and a premise, expressed as a pair of sentences.

The goal is to predict textual entailment (does sentence A imply, contradict, or say nothing about sentence B?): given two sentences, the model predicts one of three labels.
If sentence A is called the *premise* and sentence B the *hypothesis*, the model estimates the following probability:

$$P(premise=c\in\{contradiction, entailment, neutral\}\vert hypothesis)$$
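For reference, here is a minimal sketch of querying these class probabilities directly, assuming the checkpoint loads as a standard sequence-classification model (which is what the zero-shot pipeline relies on under the hood); the example sentences are purely illustrative:

```python
from transformers import pipeline

# Minimal sketch: score one (premise, hypothesis) pair with the NLI head.
# Assumes the checkpoint exposes contradiction/entailment/neutral labels.
nli = pipeline(task="text-classification", model="cmarkea/bloomz-7b1-mt-nli")

scores = nli(
    {
        "text": "Le chat dort sur le canapé.",             # premise (illustrative)
        "text_pair": "Un animal est en train de dormir.",  # hypothesis (illustrative)
    },
    top_k=None,  # return a score for every class, not just the best one
)
print(scores)
# e.g. [{"label": "entailment", "score": ...}, {"label": "neutral", ...}, ...]
```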

### Language-agnostic approach
It should be noted that hypotheses and premises are chosen randomly between English and French, so each of the four language combinations (en/en, en/fr, fr/en, fr/fr) occurs with a probability of 25%.
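As an illustration of this sampling scheme, here is a sketch of how such mixed-language pairs could be assembled from aligned English and French XNLI examples; the helper and field names are hypothetical, not the actual training code:

```python
import random

# Illustrative helper: build a language-agnostic NLI example from the
# aligned English and French versions of the same XNLI sentence pair.
def mix_languages(example_en: dict, example_fr: dict, rng=random) -> dict:
    return {
        # each side independently picks a language, giving the 4 combinations
        # en/en, en/fr, fr/en, fr/fr with probability 25% each
        "premise": rng.choice([example_en["premise"], example_fr["premise"]]),
        "hypothesis": rng.choice([example_en["hypothesis"], example_fr["hypothesis"]]),
        "label": example_en["label"],  # the label is language-independent
    }
```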

### Performance

| **class** | **precision (%)** | **f1-score (%)** | **support** |
| :----------------: | :---------------: | :--------------: | :---------: |
| **global** | 83.31 | 83.02 | 5,010 |
| **contradiction** | 81.27 | 86.63 | 1,670 |
| **entailment** | 87.54 | 83.57 | 1,670 |
| **neutral** | 81.13 | 78.86 | 1,670 |

### Benchmark

Here are the performances when both the hypothesis and the premise are in French:

| **model** | **accuracy (%)** | **MCC (x100)** |
| :--------------: | :--------------: | :------------: |
| [cmarkea/distilcamembert-base-nli](https://huggingface.co/cmarkea/distilcamembert-base-nli) | 77.45 | 66.24 |
| [BaptisteDoyen/camembert-base-xnli](https://huggingface.co/BaptisteDoyen/camembert-base-xnli) | 81.72 | 72.67 |
| [MoritzLaurer/mDeBERTa-v3-base-mnli-xnli](https://huggingface.co/MoritzLaurer/mDeBERTa-v3-base-mnli-xnli) | 83.43 | 75.15 |
| [cmarkea/bloomz-560m-nli](https://huggingface.co/cmarkea/bloomz-560m-nli) | 68.70 | 53.57 |
| [cmarkea/bloomz-3b-nli](https://huggingface.co/cmarkea/bloomz-3b-nli) | 81.08 | 71.66 |
| [cmarkea/bloomz-7b1-mt-nli](https://huggingface.co/cmarkea/bloomz-7b1-mt-nli) | 83.13 | 74.89 |

And now with the hypothesis in French and the premise in English (cross-language context):

| **model** | **accuracy (%)** | **MCC (x100)** |
| :--------------: | :--------------: | :------------: |
| [cmarkea/distilcamembert-base-nli](https://huggingface.co/cmarkea/distilcamembert-base-nli) | 16.89 | -26.82 |
| [BaptisteDoyen/camembert-base-xnli](https://huggingface.co/BaptisteDoyen/camembert-base-xnli) | 74.59 | 61.97 |
| [MoritzLaurer/mDeBERTa-v3-base-mnli-xnli](https://huggingface.co/MoritzLaurer/mDeBERTa-v3-base-mnli-xnli) | 85.15 | 77.74 |
| [cmarkea/bloomz-560m-nli](https://huggingface.co/cmarkea/bloomz-560m-nli) | 68.84 | 53.55 |
| [cmarkea/bloomz-3b-nli](https://huggingface.co/cmarkea/bloomz-3b-nli) | 82.12 | 73.22 |
| [cmarkea/bloomz-7b1-mt-nli](https://huggingface.co/cmarkea/bloomz-7b1-mt-nli) | 85.43 | 78.25 |

# Zero-shot Classification
The primary interest of training such models lies in their zero-shot classification performance. This means that the model is able to classify any text with any set of labels
without any task-specific training. What sets the Bloomz-NLI LLMs apart in this domain is their ability to model and extract information from significantly more complex
and lengthy text structures compared to models like BERT, RoBERTa, or CamemBERT.

The zero-shot classification task can be summarized by:

$$P(hypothesis=i\in\mathcal{C}\vert premise)=\frac{e^{P(premise=entailment\vert hypothesis=i)}}{\sum_{j\in\mathcal{C}}e^{P(premise=entailment\vert hypothesis=j)}}$$

Here, each hypothesis *i* is built from a template (for example, "This text is about {}.") filled with one of the candidate labels in $\mathcal{C}$ ("cinema", "politics", etc.). The resulting set of hypotheses is {"This text is about cinema.", "This text is about politics.", ...}, and each of them is scored against the premise, which is the sentence we aim to classify.
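Concretely, the formula is just a softmax over the per-candidate entailment scores. A minimal sketch (the function name and input values are illustrative, not the pipeline's internals):

```python
import numpy as np

def zero_shot_scores(entailment_scores: np.ndarray) -> np.ndarray:
    """Softmax over per-candidate entailment scores, as in the formula above."""
    e = np.exp(entailment_scores - entailment_scores.max())  # numerically stable
    return e / e.sum()

# e.g. entailment scores for the hypotheses built from 3 candidate labels
print(zero_shot_scores(np.array([2.1, 0.3, -1.0])))
# the label whose hypothesis is most "entailed" gets the highest probability
```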

### Performance

The model is evaluated on a sentiment analysis task using the French film review dataset [Allociné](https://huggingface.co/datasets/allocine). The dataset is labeled
into 2 classes, positive and negative comments. We use the hypothesis template "Ce commentaire est {}." ("This comment is {}.") and the candidate classes "positif" and "negatif".
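This evaluation setup can be sketched as follows (illustrative only; the exact protocol is not detailed here, and the sample size is kept small on purpose):

```python
from datasets import load_dataset
from transformers import pipeline

classifier = pipeline(
    task="zero-shot-classification",
    model="cmarkea/bloomz-7b1-mt-nli"
)

# Allociné test split: "review" is the text, "label" is 0 (negative) / 1 (positive)
allocine = load_dataset("allocine", split="test")
labels = ["negatif", "positif"]  # index i matches the dataset's label i

correct = 0
sample = allocine.select(range(100))  # small sample for illustration
for example in sample:
    pred = classifier(
        example["review"],
        candidate_labels=labels,
        hypothesis_template="Ce commentaire est {}."
    )
    correct += int(labels.index(pred["labels"][0]) == example["label"])
print(f"accuracy on the sample: {correct / len(sample):.2%}")
```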

| **model** | **accuracy (%)** | **MCC (x100)** |
| :--------------: | :--------------: | :------------: |
| [cmarkea/distilcamembert-base-nli](https://huggingface.co/cmarkea/distilcamembert-base-nli) | 80.59 | 63.71 |
| [BaptisteDoyen/camembert-base-xnli](https://huggingface.co/BaptisteDoyen/camembert-base-xnli) | 86.37 | 73.74 |
| [MoritzLaurer/mDeBERTa-v3-base-mnli-xnli](https://huggingface.co/MoritzLaurer/mDeBERTa-v3-base-mnli-xnli) | 84.97 | 70.05 |
| [cmarkea/bloomz-560m-nli](https://huggingface.co/cmarkea/bloomz-560m-nli) | 71.13 | 46.30 |
| [cmarkea/bloomz-3b-nli](https://huggingface.co/cmarkea/bloomz-3b-nli) | 89.06 | 78.10 |
| [cmarkea/bloomz-7b1-mt-nli](https://huggingface.co/cmarkea/bloomz-7b1-mt-nli) | 95.12 | 90.27 |

# How to use Bloomz-7b1-mt-NLI

```python
from transformers import pipeline

# Load the zero-shot classification pipeline backed by the NLI model
classifier = pipeline(
    task="zero-shot-classification",
    model="cmarkea/bloomz-7b1-mt-nli"
)

result = classifier(
    sequences="Le style très cinéphile de Quentin Tarantino "
              "se reconnaît entre autres par sa narration postmoderne "
              "et non linéaire, ses dialogues travaillés souvent "
              "émaillés de références à la culture populaire, et ses "
              "scènes hautement esthétiques mais d'une violence "
              "extrême, inspirées de films d'exploitation, d'arts "
              "martiaux ou de western spaghetti.",
    candidate_labels="cinéma, technologie, littérature, politique",
    hypothesis_template="Ce texte parle de {}."
)

print(result)
# {"labels": ["cinéma", "littérature", "technologie", "politique"],
#  "scores": [0.8745610117912292, 0.10403601825237274,
#             0.014962797053158283, 0.0064402492716908455]}

# Resilience in a cross-language French/English context: English premise,
# French hypothesis template and candidate labels
result = classifier(
    sequences="Quentin Tarantino's very cinephile style is "
              "recognized, among other things, by his postmodern and "
              "non-linear narration, his elaborate dialogues often "
              "peppered with references to popular culture, and his "
              "highly aesthetic but extremely violent scenes, inspired by "
              "exploitation films, martial arts or spaghetti western.",
    candidate_labels="cinéma, technologie, littérature, politique",
    hypothesis_template="Ce texte parle de {}."
)

print(result)
# {"labels": ["cinéma", "littérature", "technologie", "politique"],
#  "scores": [0.9314399361610413, 0.04960821941494942,
#             0.013468802906572819, 0.005483036395162344]}
```