---
base_model: LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct
language:
  - en
  - ko
library_name: transformers
license: other
pipeline_tag: text-generation
tags:
  - lg-ai
  - exaone
  - exaone-deep
base_model_relation: finetune
---


# EXAONE-Deep-7.8B

## Introduction

This repository contains the model described in the paper EXAONE Deep: Reasoning Enhanced Language Models.

We introduce EXAONE Deep, a series of models ranging from 2.4B to 32B parameters, developed and released by LG AI Research, which exhibits superior capabilities in various reasoning tasks including math and coding benchmarks. Evaluation results show that 1) EXAONE Deep 2.4B outperforms other models of comparable size, 2) EXAONE Deep 7.8B outperforms not only open-weight models of comparable scale but also the proprietary reasoning model OpenAI o1-mini, and 3) EXAONE Deep 32B demonstrates competitive performance against leading open-weight models.

## Paper Abstract

We present the EXAONE Deep series, which exhibits superior capabilities in various reasoning tasks, including math and coding benchmarks. We train our models mainly on a reasoning-specialized dataset that incorporates long streams of thought processes. Evaluation results show that our smaller models, EXAONE Deep 2.4B and 7.8B, outperform other models of comparable size, while the largest model, EXAONE Deep 32B, demonstrates competitive performance against leading open-weight models. All EXAONE Deep models are openly available for research purposes and can be downloaded from https://huggingface.co/LGAI-EXAONE.

For more details, please refer to our paper, blog and GitHub.

This repository contains the reasoning 7.8B language model with the following features (see the configuration sketch after this list):

- Number of Parameters (without embeddings): 6.98B
- Number of Layers: 32
- Number of Attention Heads: GQA with 32 Q-heads and 8 KV-heads
- Vocab Size: 102,400
- Context Length: 32,768 tokens
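
These figures can be read straight from the published configuration. The snippet below is a minimal sketch, not part of the official card; it assumes the standard transformers `AutoConfig` API and common config attribute names, which may differ slightly for the EXAONE architecture.

```python
# Minimal sketch: read the architecture details listed above from the Hub config.
# Attribute names follow common transformers conventions and are assumptions here.
from transformers import AutoConfig

config = AutoConfig.from_pretrained(
    "LGAI-EXAONE/EXAONE-Deep-7.8B",
    trust_remote_code=True,  # EXAONE ships custom modeling code on the Hub
)

num_layers = getattr(config, "num_layers", getattr(config, "num_hidden_layers", None))
print("layers:", num_layers)                                      # expect 32
print("q-heads:", config.num_attention_heads)                     # expect 32
print("kv-heads:", getattr(config, "num_key_value_heads", None))  # expect 8 (GQA)
print("vocab size:", config.vocab_size)                           # expect 102,400
print("context length:", getattr(config, "max_position_embeddings", None))  # expect 32,768
```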

## Quickstart

We recommend using transformers v4.43.1 or later.
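
The full usage example appears later in the original card. As a minimal sketch of the typical flow, assuming the standard `AutoModelForCausalLM` / `apply_chat_template` APIs and illustrative sampling settings (the official Quickstart is authoritative for the exact prompt format and decoding parameters):

```python
# Minimal generation sketch; refer to the official Quickstart for the exact
# prompt conventions and recommended decoding parameters.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "LGAI-EXAONE/EXAONE-Deep-7.8B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,  # EXAONE provides custom modeling code on the Hub
    device_map="auto",
)

messages = [{"role": "user", "content": "How many prime numbers are there below 20?"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant turn so generation starts there
    return_tensors="pt",
).to(model.device)

# Reasoning models emit a long chain of thought before the final answer,
# so allow a generous new-token budget. Sampling values here are illustrative.
output = model.generate(
    input_ids,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```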

[Remaining content as is from the original model card]