shallowMind

AI & ML interests

Repo for my AI projects. Just like DeepMind, but way more stupid

Organization Card

ShallowMind - Just like DeepMind, but way more stupid 🧠

Hi there! My name is Alessandro, and I'm an AI research engineer. ShallowMind is my workspace for training and experimenting with language models.
The name is playful, but the goal is straightforward: to build increasingly capable models while exploring new ideas in pretraining and reasoning.

Research Interests

Information-theoretic pretraining
Looking at ways to identify and prioritize the most informative tokens during pretraining, to see whether current scaling laws can be adjusted. (Work in progress; I’ll share results once experiments are further along.)
Reasoning models
Testing approaches that improve step-by-step and compositional reasoning.
Architectural variations
Extending my training pipeline to support Mixture-of-Experts (MoE) and other non-standard components.
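
As a rough illustration of what "prioritizing the most informative tokens" could look like, here is a minimal sketch that scores each token by its surprisal, -log2(p), and keeps the highest-scoring ones. This is an assumption made for illustration, not the method used in the ShallowMind pipeline: the function names, the `keep_fraction` parameter, and the probabilities are all hypothetical, and a real setup would take the probabilities from a reference model.

```python
import math


def token_surprisals(token_probs):
    """Surprisal in bits, -log2(p), for each model-assigned token probability."""
    return [-math.log2(p) for p in token_probs]


def select_informative(tokens, token_probs, keep_fraction=0.5):
    """Keep the highest-surprisal tokens (hypothetical rule), in original order."""
    surprisals = token_surprisals(token_probs)
    n_keep = max(1, round(len(tokens) * keep_fraction))
    # Rank positions by surprisal, highest first; ties keep the earlier position.
    ranked = sorted(range(len(tokens)), key=lambda i: -surprisals[i])
    kept = sorted(ranked[:n_keep])
    return [(tokens[i], surprisals[i]) for i in kept]


# Made-up probabilities, standing in for a reference model's predictions.
tokens = ["The", "capital", "of", "France", "is", "Paris"]
probs = [0.5, 0.125, 0.5, 0.25, 0.5, 0.0625]

selected = select_informative(tokens, probs, keep_fraction=0.5)
for token, bits in selected:
    print(f"{token}: {bits:.1f} bits")
```

In this toy example the function words are the most predictable and get dropped, while the content words carry more bits. The same scores could instead be used to reweight the per-token loss rather than to filter tokens outright.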

Current Work

Built a custom pretraining pipeline and pretrained a first model from scratch (~1B scale) as a proof of concept.
Iterating on the pipeline to add MoE layers and information-gain–based logic.
Next steps:
- Fine-tune the first model into Promptasaurus-Zero.
- Train Blahblahthron-7B as a larger-scale follow-up experiment.
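
For readers unfamiliar with the MoE layers mentioned above, the following is a minimal sketch of top-k expert routing for a single token, written in plain Python. It is a generic illustration under stated assumptions, not the pipeline's implementation: the names `top_k_route` and `moe_forward` are hypothetical, the experts are toy scaling functions, and the router logits are passed in directly, whereas a real layer would compute them from the token with a learned linear projection.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(router_logits, k=2):
    """Return [(expert_index, weight)] for the k best experts, weights summing to 1."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: -probs[i])[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, router_logits, k=2):
    """Weighted sum of the selected experts' outputs for one token vector x."""
    out = [0.0] * len(x)
    for idx, weight in top_k_route(router_logits, k):
        y = experts[idx](x)
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out


# Toy experts (assumption): each one just scales the input by a constant.
experts = [lambda v, s=s: [s * vi for vi in v] for s in (1.0, 2.0, 3.0, 4.0)]
logits = [0.0, 2.0, 1.0, -1.0]  # made-up router scores for one token

routes = top_k_route(logits, k=2)
y = moe_forward([1.0, -1.0], experts, logits, k=2)
print(routes, y)
```

Only the two selected experts run for this token, which is what keeps the compute per token well below that of a dense layer with the same total parameter count.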

Roadmap

Share ablations and code from early experiments.
Scale training to larger models.
Document results on token selection and reasoning tasks.

models 1

ShallowMind-abeat/blahblahthron-1.1b

1B • Updated Oct 11 • 4

datasets 0

None public yet