---
title: GAIA Agent
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.0.2
app_file: app.py
pinned: false
---
# 🤖 GAIA Agent
A sophisticated AI agent designed to solve GAIA (General AI Assistants) benchmark questions using multiple tools and capabilities.
## Overview
This agent is built to tackle the GAIA benchmark, which tests AI systems on real-world tasks requiring reasoning, multi-modal understanding, web browsing, and tool usage. The agent combines multiple tools to provide accurate answers to complex questions.
## Features
The GAIA Agent has access to the following tools:
- **Web Search** (DuckDuckGo): Search the web for up-to-date information
- **Code Interpreter**: Execute Python code for calculations and data processing
- **Image Processing**: Analyze images from URLs
- **Weather Information**: Get weather data for any location
- **Hub Statistics**: Fetch model statistics from Hugging Face Hub
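For illustration, a custom tool such as the weather lookup could be registered with smolagents' `@tool` decorator along these lines. This is a minimal sketch: the function body and the `wttr.in` endpoint are assumptions, not necessarily what `tools.py` actually does.

```python
import requests
from smolagents import tool

@tool
def get_weather(location: str) -> str:
    """Return a short plain-text weather report for a location.

    Args:
        location: City name to look up, e.g. "Berlin".
    """
    # wttr.in offers a simple text API; format=3 returns a one-line summary.
    # (Illustrative endpoint choice, not necessarily the one used in tools.py.)
    response = requests.get(f"https://wttr.in/{location}?format=3", timeout=10)
    response.raise_for_status()
    return response.text
```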
## Architecture
- **Framework**: smolagents
- **Model**: meta-llama/llama-3.3-70b-instruct:free, served via the OpenRouter API
- **Planning**: Enabled, with a planning step every 3 agent steps
- **Base Tools**: smolagents' built-in base tools are enabled alongside the custom tools
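Putting this together, `agent.py` likely constructs something close to the following. This is a sketch under the assumptions above; the exact tool list and arguments may differ.

```python
import os
from smolagents import CodeAgent, DuckDuckGoSearchTool, OpenAIServerModel

# OpenRouter exposes an OpenAI-compatible endpoint, so OpenAIServerModel can be pointed at it.
model = OpenAIServerModel(
    model_id="meta-llama/llama-3.3-70b-instruct:free",
    api_base="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

agent = CodeAgent(
    tools=[DuckDuckGoSearchTool()],  # plus the custom tools listed above
    model=model,
    add_base_tools=True,   # enable smolagents' built-in base tools
    planning_interval=3,   # insert a planning step every 3 agent steps
)

print(agent.run("What is the capital of France?"))
```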
## Project Structure
```
agent_hugging/
├── agent.py              # Main agent implementation
├── app.py                # Gradio interface for interaction
├── code_interpreter.py   # Python code execution tool
├── image_processing.py   # Image analysis tool
├── tools.py              # Custom tools (search, weather, hub stats)
├── system_prompt.txt     # System prompt for the agent
├── requirements.txt      # Python dependencies
└── README.md             # This file
```
## Setup
1. **Install dependencies:**
```bash
pip install -r requirements.txt
```
2. **Set environment variables:**
```bash
export OPENROUTER_API_KEY="your-api-key-here"
export HF_TOKEN="your-huggingface-token" # Optional, for Hugging Face Hub operations
export HF_USERNAME="ArdaKaratas" # Optional, defaults to ArdaKaratas
export HF_SPACE_NAME="agent_hugging" # Optional, defaults to agent_hugging
```
Get a free API key from: https://openrouter.ai/keys
Get your Hugging Face token from: https://huggingface.co/settings/tokens
3. **Run the agent:**
```bash
python agent.py
```
4. **Launch the Gradio interface:**
```bash
python app.py
```
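For reference, the environment variables from step 2 might be read in `agent.py` along these lines. This is a sketch; the defaults simply mirror the ones documented above.

```python
import os

OPENROUTER_API_KEY = os.environ["OPENROUTER_API_KEY"]       # required
HF_TOKEN = os.getenv("HF_TOKEN")                            # optional, for Hub operations
HF_USERNAME = os.getenv("HF_USERNAME", "ArdaKaratas")       # optional, with default
HF_SPACE_NAME = os.getenv("HF_SPACE_NAME", "agent_hugging") # optional, with default
```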
## Usage
### Testing a Single Question
Use the "Test Single Question" tab in the Gradio interface to:
- Enter a question manually
- Fetch a random question from the benchmark
- Get the agent's answer
### Submitting All Answers
Use the "Submit All Answers" tab to:
1. Enter your Hugging Face username
2. Optionally provide your Space code link
3. Click "Process & Submit All Questions"
4. View the submission status and results
### Viewing Questions
Use the "View All Questions" tab to browse all GAIA benchmark questions.
## API Integration
The app connects to the scoring API at: `https://agents-course-unit4-scoring.hf.space`
Endpoints:
- `GET /questions`: Retrieve all questions
- `GET /random-question`: Get a random question
- `POST /submit`: Submit answers for scoring
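A minimal way to call these endpoints with `requests` is sketched below. The payload keys for `/submit` (`username`, `agent_code`, `answers`, `task_id`, `submitted_answer`) are assumptions based on the course setup; check the API's own documentation for the exact schema.

```python
import requests

API_BASE = "https://agents-course-unit4-scoring.hf.space"

# Fetch all questions (assumed to be a list of dicts with "task_id" and "question").
questions = requests.get(f"{API_BASE}/questions", timeout=30).json()

# Build answers; in practice each submitted_answer comes from the agent.
answers = [
    {"task_id": q["task_id"], "submitted_answer": "42"}
    for q in questions
]

payload = {
    "username": "ArdaKaratas",
    "agent_code": "https://huggingface.co/spaces/ArdaKaratas/agent_hugging/tree/main",
    "answers": answers,
}

result = requests.post(f"{API_BASE}/submit", json=payload, timeout=60)
print(result.json())
```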
## Metadata.jsonl Support
The project includes `metadata.jsonl`, which contains the GAIA benchmark questions and their correct answers. This file is used for:
1. **Testing & Validation**: Compare agent answers with correct answers from metadata
2. **Debugging**: See expected answers when testing the agent
3. **Development**: Understand question patterns and expected answer formats
**Note**: In production, the agent generates its own answers. The metadata is only used for comparison and validation purposes.
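Loading the metadata and comparing an agent answer against the reference with an exact match could look like this. The field names (`task_id`, `Final answer`) follow the GAIA metadata format but should be treated as assumptions here.

```python
import json

def load_metadata(path: str = "metadata.jsonl") -> dict:
    """Map each task_id to its reference answer."""
    reference = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            reference[record["task_id"]] = record["Final answer"]
    return reference

def is_correct(task_id: str, agent_answer: str, reference: dict) -> bool:
    # Exact match after trimming surrounding whitespace.
    return agent_answer.strip() == str(reference.get(task_id, "")).strip()
```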
## Notes
- The agent returns answers directly, without a "FINAL ANSWER" prefix
- Answers are scored using exact string matching
- Make sure your Space is public for verification
- The code interpreter has security restrictions to prevent dangerous operations (see the sketch after this list)
- Use the "Compare with metadata.jsonl" checkbox in the test interface to see how your agent's answers compare to the correct answers
## License
This project is part of the Hugging Face AI Agents Course - Unit 4 Final Assignment.