title: GAIA Agent
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.0.2
app_file: app.py
pinned: false
🤖 GAIA Agent
An AI agent that answers GAIA (General AI Assistants) benchmark questions by combining several tools.
Overview
This agent is built to tackle the GAIA benchmark, which tests AI systems on real-world tasks requiring reasoning, multi-modal understanding, web browsing, and tool usage. The agent combines multiple tools to provide accurate answers to complex questions.
Features
The GAIA Agent has access to the following tools:
- Web Search (DuckDuckGo): Search the web for up-to-date information
- Code Interpreter: Execute Python code for calculations and data processing
- Image Processing: Analyze images from URLs
- Weather Information: Get weather data for any location
- Hub Statistics: Fetch model statistics from Hugging Face Hub
Architecture
- Framework: smolagents
- Model: OpenRouter API (meta-llama/llama-3.3-70b-instruct:free)
- Planning: Enabled, with a planning step every 3 steps
- Base Tools: The framework's built-in base tools are enabled in addition to the custom tools
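The settings above could map onto smolagents roughly as follows. This is a minimal sketch, not the code in agent.py: the choice of `CodeAgent`, the use of `OpenAIServerModel` against OpenRouter's OpenAI-compatible endpoint, and the `build_agent` helper are all assumptions, and the project's custom tools are left out.

```python
import os

# Values taken from the Architecture list above.
MODEL_ID = "meta-llama/llama-3.3-70b-instruct:free"
PLANNING_INTERVAL = 3

# Assumption: OpenRouter is reached through its OpenAI-compatible endpoint.
OPENROUTER_API_BASE = "https://openrouter.ai/api/v1"


def build_agent(api_key=None):
    """Build an agent configured like the list above (hypothetical helper)."""
    # Imported lazily so this module loads even without smolagents installed.
    from smolagents import CodeAgent, OpenAIServerModel

    model = OpenAIServerModel(
        model_id=MODEL_ID,
        api_base=OPENROUTER_API_BASE,
        api_key=api_key or os.environ["OPENROUTER_API_KEY"],
    )
    return CodeAgent(
        tools=[],  # the project's custom tools would be passed here
        model=model,
        add_base_tools=True,  # corresponds to "Base Tools" above
        planning_interval=PLANNING_INTERVAL,  # plan again every 3 steps
    )
```

Calling `build_agent().run("your question")` would then send a question through the agent, provided smolagents is installed and OPENROUTER_API_KEY is set.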
Project Structure
agent_hugging/
├── agent.py # Main agent implementation
├── app.py # Gradio interface for interaction
├── code_interpreter.py # Python code execution tool
├── image_processing.py # Image analysis tool
├── tools.py # Custom tools (search, weather, hub stats)
├── system_prompt.txt # System prompt for the agent
├── requirements.txt # Python dependencies
└── README.md # This file
Setup
- Install dependencies:
pip install -r requirements.txt
- Set environment variables:
export OPENROUTER_API_KEY="your-api-key-here"
export HF_TOKEN="your-huggingface-token" # Optional, for Hugging Face Hub operations
export HF_USERNAME="ArdaKaratas" # Optional, defaults to ArdaKaratas
export HF_SPACE_NAME="agent_hugging" # Optional, defaults to agent_hugging
Get a free API key from: https://openrouter.ai/keys
Get your Hugging Face token from: https://huggingface.co/settings/tokens
- Run the agent:
python agent.py
- Launch the Gradio interface:
python app.py
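On the Python side, the environment variables from the setup steps could be read as sketched below. The `load_settings` helper and its dictionary keys are hypothetical; only the variable names and defaults come from the steps above. Treating OPENROUTER_API_KEY as required is an assumption based on it being the only variable not marked optional.

```python
import os


def load_settings(env=None):
    """Read the documented variables, applying the documented defaults."""
    env = os.environ if env is None else env
    api_key = env.get("OPENROUTER_API_KEY")
    if not api_key:
        # Assumption: the key is mandatory because it is not marked optional.
        raise RuntimeError("OPENROUTER_API_KEY is not set")
    return {
        "openrouter_api_key": api_key,
        "hf_token": env.get("HF_TOKEN"),  # optional, None when unset
        "hf_username": env.get("HF_USERNAME", "ArdaKaratas"),
        "hf_space_name": env.get("HF_SPACE_NAME", "agent_hugging"),
    }
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to exercise in tests without touching the real environment.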
Usage
Testing a Single Question
Use the "Test Single Question" tab in the Gradio interface to:
- Enter a question manually
- Fetch a random question from the benchmark
- Get the agent's answer
Submitting All Answers
Use the "Submit All Answers" tab to:
- Enter your Hugging Face username
- Optionally provide your Space code link
- Click "Process & Submit All Questions"
- View the submission status and results
Viewing Questions
Use the "View All Questions" tab to browse all GAIA benchmark questions.
API Integration
The app connects to the scoring API at: https://agents-course-unit4-scoring.hf.space
Endpoints:
- GET /questions: Retrieve all questions
- GET /random-question: Get a random question
- POST /submit: Submit answers for scoring
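A small client for these endpoints might look like the sketch below, using only the standard library. The helper names are hypothetical, and the shape of the /submit body (`username`, `agent_code`, `answers` with `task_id` and `submitted_answer`) is an assumption based on the course template; this README does not confirm it.

```python
import json
import urllib.request

API_BASE = "https://agents-course-unit4-scoring.hf.space"


def endpoint(path):
    """Join the base URL and an endpoint path such as "/questions"."""
    return API_BASE.rstrip("/") + "/" + path.lstrip("/")


def get_json(path, timeout=30):
    """GET an endpoint and decode the JSON response body."""
    with urllib.request.urlopen(endpoint(path), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def build_submission(username, agent_code, answers):
    """Assemble a /submit body from a {task_id: answer} mapping.

    Assumption: these field names follow the course template.
    """
    return {
        "username": username,
        "agent_code": agent_code,
        "answers": [
            {"task_id": task_id, "submitted_answer": answer}
            for task_id, answer in answers.items()
        ],
    }


def post_json(path, payload, timeout=60):
    """POST a JSON payload and decode the JSON response body."""
    request = urllib.request.Request(
        endpoint(path),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With these helpers, `get_json("/questions")` would fetch the question list and `post_json("/submit", build_submission(...))` would send the answers for scoring.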
Metadata.jsonl Support
The project includes metadata.jsonl which contains GAIA benchmark questions and their correct answers. This file is used for:
- Testing & Validation: Compare agent answers with correct answers from metadata
- Debugging: See expected answers when testing the agent
- Development: Understand question patterns and expected answer formats
Note: In production, the agent generates its own answers. The metadata is only used for comparison and validation purposes.
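The comparison against metadata.jsonl could be sketched as follows. The record keys `task_id` and `Final answer` are assumptions taken from the public GAIA metadata format, and trimming surrounding whitespace before comparing is a choice made here, not something the scoring API is known to do.

```python
import json


def load_expected_answers(lines):
    """Map task_id to expected answer from an iterable of JSONL lines.

    Assumption: each record has "task_id" and "Final answer" keys.
    """
    expected = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        expected[record["task_id"]] = record["Final answer"]
    return expected


def is_exact_match(answer, expected):
    """Compare as strings after trimming surrounding whitespace."""
    return str(answer).strip() == str(expected).strip()
```

Since an open file iterates over its lines, `load_expected_answers(open("metadata.jsonl", encoding="utf-8"))` would load the whole file.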
Notes
- The agent returns answers directly, without a "FINAL ANSWER" prefix
- Answers are compared using exact match
- Make sure your Space is public for verification
- The code interpreter has security restrictions to prevent dangerous operations
- Use the "Compare with metadata.jsonl" checkbox in the test interface to see how your agent's answers compare to the correct answers
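Because answers are scored by exact match, a stray "FINAL ANSWER" label in the model output would count as a wrong answer. One way to guard against that is sketched below; the `clean_answer` helper is hypothetical and not necessarily how agent.py handles it.

```python
import re

# Optional leading "FINAL ANSWER" label, with or without a colon.
_PREFIX = re.compile(r"^\s*final\s+answer\b\s*:?\s*", re.IGNORECASE)


def clean_answer(raw):
    """Return the answer text without a leading "FINAL ANSWER" label."""
    return _PREFIX.sub("", raw, count=1).strip()
```

The word boundary after "answer" keeps ordinary text such as "Final answers vary" untouched.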
License
This project is part of the Hugging Face AI Agents Course - Unit 4 Final Assignment.