- Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
  Paper • 1502.01852 • Published • 1
- Deep Residual Learning for Image Recognition
  Paper • 1512.03385 • Published • 9
- Focal Loss for Dense Object Detection
  Paper • 1708.02002 • Published
- Scaling Proprioceptive-Visual Learning with Heterogeneous Pre-trained Transformers
  Paper • 2409.20537 • Published • 14
Collections including paper arxiv:2503.10622
- You Do Not Fully Utilize Transformer's Representation Capacity
  Paper • 2502.09245 • Published • 37
- LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers
  Paper • 2502.15007 • Published • 174
- Transformers without Normalization
  Paper • 2503.10622 • Published • 171
- Forgetting Transformer: Softmax Attention with a Forget Gate
  Paper • 2503.02130 • Published • 32
- Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders
  Paper • 2503.03601 • Published • 232
- Transformers without Normalization
  Paper • 2503.10622 • Published • 171
- RWKV-7 "Goose" with Expressive Dynamic State Evolution
  Paper • 2503.14456 • Published • 153
- ReCamMaster: Camera-Controlled Generative Rendering from A Single Video
  Paper • 2503.11647 • Published • 145