magibu's Collections
Pretrain Datasets
Updated about 1 month ago
Datasets we use for pretraining large language models.
omarkamali/wikipedia-monthly • Updated about 21 hours ago • 4.11k • 51
alibayram/hukuk_soru_cevap • Updated Nov 6, 2024 • 2.08k • 17 • 13
umutertugrul/turkish-hospital-medical-articles • Updated Oct 2, 2025 • 24.6k • 56 • 8
umutertugrul/turkish-medical-articles • Updated Oct 2, 2025 • 42.8k • 9 • 3
alibayram/tr-books • Updated Dec 17, 2025 • 3.7k • 3
selimfirat/bilkent-turkish-writings-dataset • Updated May 24, 2025 • 25.1k • 64 • 8
umutertugrul/turkish-academic-theses-dataset • Updated Aug 18, 2025 • 649k • 64 • 8
alibayram/onedio_haberler • Updated Jun 18, 2024 • 66.7k • 6 • 5
habanoz/news-tr-1.8M • Updated Oct 6, 2024 • 1.85M • 93 • 7
alibayram/hepsiburada_yorumlar • Updated Jun 18, 2024 • 2.66M • 13 • 13
alibayram/kitapyurdu_yorumlar • Updated Jun 18, 2024 • 405k • 30
alibayram/beyazperde_yorumlar • Updated Jun 18, 2024 • 192k • 13 • 5
BILGEM-AI/BILGE-Synthetic-Stories • Updated Nov 20, 2025 • 2.87M • 373 • 5