Fix test set result
README.md
CHANGED
@@ -17,13 +17,13 @@ model-index:
 revision: 2814b78e7af4b5a1f1886fe7ad49632de4d9dd25
 metrics:
 - type: f1
-  value: 0.
+  value: 0.9270
   name: F1
 - type: precision
-  value: 0.
+  value: 0.9281
   name: Precision
 - type: recall
-  value: 0.
+  value: 0.9259
   name: Recall
 license: apache-2.0
 datasets:
@@ -52,13 +52,23 @@ should probably proofread and complete it, then remove this comment. -->
 # span-marker-bert-base-multilingual-cased-multinerd
 
 This model is a fine-tuned version of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) on an [Babelscape/multinerd](https://huggingface.co/datasets/Babelscape/multinerd) dataset.
-It achieves the following results on the
+It achieves the following results on the evaluation set:
 - Loss: 0.0049
 - Overall Precision: 0.9242
 - Overall Recall: 0.9281
 - Overall F1: 0.9261
 - Overall Accuracy: 0.9852
 
+Test set results:
+- test_loss: 0.005226554349064827,
+- test_overall_accuracy: 0.9851129807294873,
+- test_overall_f1: 0.9270450073152169,
+- test_overall_precision: 0.9281906912835416,
+- test_overall_recall: 0.9259021481405626,
+- test_runtime: 2690.9722,
+- test_samples_per_second: 150.748,
+- test_steps_per_second: 4.711
+
 
 This is a replication of Tom's work. Everything remains unchanged,
 except that we extended the number of training epochs to 3 for a
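Since this commit copies test-set numbers into the `model-index` metadata, a quick consistency check may be useful. The sketch below is not part of the commit. It assumes the overall scores are micro-averaged, so that F1 is the harmonic mean of precision and recall, and it compares the 4-decimal metadata values against the full-precision test results. The helper names `f1_from` and `truncate` are hypothetical.

```python
import math

# Values copied from the "Test set results" list in this diff.
test_results = {
    "test_overall_f1": 0.9270450073152169,
    "test_overall_precision": 0.9281906912835416,
    "test_overall_recall": 0.9259021481405626,
}

# Values written to the model-index metadata in this diff.
metadata = {"f1": 0.9270, "precision": 0.9281, "recall": 0.9259}


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumes micro-averaged scores)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def truncate(value: float, digits: int = 4) -> float:
    """Cut a value off after `digits` decimal places, without rounding."""
    factor = 10 ** digits
    return math.floor(value * factor) / factor


recomputed_f1 = f1_from(
    test_results["test_overall_precision"],
    test_results["test_overall_recall"],
)
# Difference between the recomputed and the reported F1.
f1_gap = abs(recomputed_f1 - test_results["test_overall_f1"])
print(f"recomputed F1: {recomputed_f1:.6f} (gap to reported: {f1_gap:.2e})")

# Check each metadata value against the truncated full-precision result.
for name, short_value in metadata.items():
    full_value = test_results[f"test_overall_{name}"]
    matches = truncate(full_value) == short_value
    print(f"{name}: metadata={short_value} full={full_value} truncated match={matches}")
```

Under these assumptions, the metadata values correspond to the test results cut off at four decimals rather than rounded, which would explain why precision appears as 0.9281 and not 0.9282.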