Update README.md
README.md
CHANGED
@@ -69,7 +69,7 @@ print(output)

 ## Training Data

-Karamaru was trained using a custom Edo-period text dataset totaling approximately
+Karamaru was trained using a custom Edo-period text dataset totaling approximately 25 million characters.
 1. [Minna de Honkoku](https://www.honkoku.org/) 12 millions characters.
 2. [Kuzushiji Dataset](https://codh.rois.ac.jp/char-shape/) 1 million characters.
 3. [Pre-Modern Japanese Text Dataset](https://codh.rois.ac.jp/pmjt/) 12 million characters using AI Kuzushiji OCR model [RURI](https://codh.rois.ac.jp/miwo/) and using Sakana AI's LLM based [classical Japanese OCR Refiner](https://ipsj.ixsq.nii.ac.jp/records/241512).