---
license: mit
---

# Templar-I: Permissionless Distributed Training

> A 1.2B-parameter causal language model trained with **Gauntlet**, an incentive system that rewards permissionless contributors for useful pseudo-gradients on the Bittensor network. [[Paper]](https://arxiv.org/abs/2505.21684)

---

## Overview

* **Setting:** Fully open, permissionless, internet-scale training; no control over who registers or what hardware they bring.
* **Mechanism:** Two-stage peer filtering (uptime, reliability, synchronization) followed by scoring of each peer's pseudo-gradient quality.
* **Run:** 20K communication rounds on FineWeb-Edu data; the top **15** peers aggregated per round, out of up to 250 registered peers.
* **Result:** On a per-iteration basis, convergence outpaced a centralized AdamW baseline; downstream metrics are competitive.

---

## Gauntlet

* **Stage 1:** Filters peers by uptime, reliability, and synchronization.
* **Stage 2:** Estimates the loss before and after applying each peer's pseudo-gradient to measure its contribution.
* **Ratings:** Uses **OpenSkill** to track peer competitiveness over time.
* **Aggregation:** In each round, aggregates updates from the top-scoring **G = 15** peers (see the sketch below).
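
A minimal sketch of the Stage 2 signal and top-G aggregation, assuming plain PyTorch and uncompressed pseudo-gradients. The helper names (`estimate_loss`, `score_peer`, `aggregate_round`) and the SGD-style trial step are illustrative, not the production implementation; in the real system the per-round scores feed OpenSkill ratings rather than being used raw.

```python
import torch

def estimate_loss(model, batch):
    """Mean causal-LM loss on an evaluation batch (hypothetical helper)."""
    with torch.no_grad():
        return model(**batch, labels=batch["input_ids"]).loss.item()

def score_peer(model, batch, pseudo_grad, lr=1e-4):
    """Stage 2 (sketch): loss improvement from provisionally applying one
    peer's pseudo-gradient as a plain SGD-style step, then rolling it back."""
    loss_before = estimate_loss(model, batch)
    with torch.no_grad():
        for p, g in zip(model.parameters(), pseudo_grad):
            p.add_(g, alpha=-lr)      # trial-apply the peer's update
    loss_after = estimate_loss(model, batch)
    with torch.no_grad():
        for p, g in zip(model.parameters(), pseudo_grad):
            p.add_(g, alpha=lr)       # undo the trial step
    return loss_before - loss_after   # positive => the update helped

def aggregate_round(peer_grads, scores, G=15):
    """Average the pseudo-gradients of the top-G scoring peers."""
    top = sorted(scores, key=scores.get, reverse=True)[:G]
    agg = [sum(gs) / len(top) for gs in zip(*(peer_grads[pid] for pid in top))]
    return top, agg
```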

---

## Training setup

* **Data:** FineWeb-Edu [11].
* **Rounds:** 20,000 communication rounds (evaluation windows matched rounds).
* **Tokens:** 100–200B (see the scale check below).
* **Baseline:** a centralized AdamW run trained for 120B tokens.
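
As a back-of-the-envelope scale check (plain arithmetic on the numbers above, not a reported figure), the token budget works out to roughly 5–10M tokens per communication round:

```python
# Average tokens per communication round, from the reported totals.
for total in (100e9, 200e9):
    print(f"{total / 20_000 / 1e6:.0f}M tokens/round")  # -> 5M and 10M
```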

---

## Quickstart

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "tplr/TEMPLAR-I"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Generate a short continuation as a smoke test.
inputs = tok("Distributed training is", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=50)
print(tok.decode(out[0], skip_special_tokens=True))
```

---

## Results

### Downstream Benchmarks (zero-shot)

| Model        | Dataset     | Tokens    | HellaSwag (acc_norm) | PIQA (acc_norm) | ARC-E (acc) |
|--------------|-------------|-----------|---------------------:|----------------:|------------:|
| TEMPLAR-1B   | FineWeb-Edu | 100B–200B |                 51.0 |            71.4 |        59.2 |
| DeMo 1B [12] | Dolma       | 100B      |                 48.0 |            70.0 |        55.0 |
| AdamW DDP 1B | FineWeb-Edu | 120B      |                 51.0 |            71.9 |        58.9 |
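
These zero-shot numbers should be reproducible with EleutherAI's lm-evaluation-harness; the README does not state the exact evaluation stack, so the harness choice, task names, and settings below are assumptions:

```python
# pip install lm-eval
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=tplr/TEMPLAR-I,dtype=bfloat16",
    tasks=["hellaswag", "piqa", "arc_easy"],
    batch_size=8,
)
for task, metrics in results["results"].items():
    print(task, metrics)
```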

### Per-Iteration Loss

![Training loss](./figures/per_iteration_loss.png)

---

## Citation

If you use this model or Gauntlet, please cite it as follows:

```bibtex
@article{lidin2025incentivizing,
  title={Incentivizing Permissionless Distributed Learning of LLMs},
  author={Lidin, Joel and Sarfi, Amir and Pappas, Evangelos and Dare, Samuel and Belilovsky, Eugene and Steeves, Jacob},
  journal={arXiv preprint arXiv:2505.21684},
  year={2025}
}
```