---
title: GAIA Agent
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.0.2
app_file: app.py
pinned: false
---

# 🤖 GAIA Agent

An AI agent that answers GAIA (General AI Assistants) benchmark questions by combining several tools.

## Overview

This agent is built to tackle the GAIA benchmark, which tests AI systems on real-world tasks requiring reasoning, multi-modal understanding, web browsing, and tool usage. The agent combines multiple tools to answer complex questions.

## Features

The GAIA Agent has access to the following tools:

- **Web Search** (DuckDuckGo): Search the web for the latest information
- **Code Interpreter**: Execute Python code for calculations and data processing
- **Image Processing**: Analyze images from URLs
- **Weather Information**: Get weather data for any location
- **Hub Statistics**: Fetch model statistics from the Hugging Face Hub

## Architecture

- **Framework**: smolagents
- **Model**: OpenRouter API (meta-llama/llama-3.3-70b-instruct:free)
- **Planning**: Enabled with an interval of 3 steps
- **Base Tools**: Additional base tools enabled

## Project Structure

```
agent_hugging/
├── agent.py              # Main agent implementation
├── app.py                # Gradio interface for interaction
├── code_interpreter.py   # Python code execution tool
├── image_processing.py   # Image analysis tool
├── tools.py              # Custom tools (search, weather, hub stats)
├── system_prompt.txt     # System prompt for the agent
├── requirements.txt      # Python dependencies
└── README.md             # This file
```

## Setup

1. **Install dependencies:**

   ```bash
   pip install -r requirements.txt
   ```

2. **Set environment variables:**

   ```bash
   export OPENROUTER_API_KEY="your-api-key-here"
   export HF_TOKEN="your-huggingface-token"  # Optional, for Hugging Face Hub operations
   export HF_USERNAME="ArdaKaratas"  # Optional, defaults to ArdaKaratas
   export HF_SPACE_NAME="agent_hugging"  # Optional, defaults to agent_hugging
   ```

   Get a free API key from: https://openrouter.ai/keys

   Get your Hugging Face token from: https://huggingface.co/settings/tokens

3. **Run the agent:**

   ```bash
   python agent.py
   ```

4. **Launch the Gradio interface:**

   ```bash
   python app.py
   ```

## Usage

### Testing a Single Question

Use the "Test Single Question" tab in the Gradio interface to:

- Enter a question manually
- Fetch a random question from the benchmark
- Get the agent's answer

### Submitting All Answers

Use the "Submit All Answers" tab to:

1. Enter your Hugging Face username
2. Optionally provide your Space code link
3. Click "Process & Submit All Questions"
4. View the submission status and results

### Viewing Questions

Use the "View All Questions" tab to browse all GAIA benchmark questions.

## API Integration

The app connects to the scoring API at: `https://agents-course-unit4-scoring.hf.space`

Endpoints:

- `GET /questions`: Retrieve all questions
- `GET /random-question`: Get a random question
- `POST /submit`: Submit answers for scoring

## Metadata.jsonl Support

The project includes `metadata.jsonl`, which contains GAIA benchmark questions and their correct answers. This file is used for:

1. **Testing & Validation**: Compare agent answers with correct answers from metadata
2. **Debugging**: See expected answers when testing the agent
3. **Development**: Understand question patterns and expected answer formats

**Note**: In production, the agent generates its own answers. The metadata is only used for comparison and validation purposes.

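As a rough illustration of the comparison described above, the sketch below loads a JSON Lines file and checks agent answers against it. It is not the project's actual implementation: the function names are hypothetical, the `task_id` and `Final answer` field names are assumptions about the layout of `metadata.jsonl`, and trimming surrounding whitespace before comparing is an assumed convenience on top of exact match.

```python
import json
from pathlib import Path


def load_expected_answers(path):
    """Map task_id -> expected answer from a JSON Lines file.

    Assumes each non-empty line is a JSON object with "task_id" and
    "Final answer" keys (an assumption about metadata.jsonl).
    """
    expected = {}
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # skip blank lines between records
            record = json.loads(line)
            expected[record["task_id"]] = record["Final answer"]
    return expected


def is_exact_match(agent_answer, expected_answer):
    """Case-sensitive string comparison, ignoring surrounding whitespace."""
    return str(agent_answer).strip() == str(expected_answer).strip()


def compare_answers(agent_answers, expected):
    """Return {task_id: bool} for every task that has an expected answer."""
    return {
        task_id: is_exact_match(answer, expected[task_id])
        for task_id, answer in agent_answers.items()
        if task_id in expected
    }
```

Tasks missing from the metadata are left out of the result rather than counted as wrong, so a partial metadata file does not skew the comparison.
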
## Notes

- The agent returns answers directly, without a "FINAL ANSWER" prefix
- Answers are compared using exact match
- Make sure your Space is public for verification
- The code interpreter has security restrictions to prevent dangerous operations
- Use the "Compare with metadata.jsonl" checkbox in the test interface to see how your agent's answers compare to the correct answers

## License

This project is part of the Hugging Face AI Agents Course - Unit 4 Final Assignment.