---
title: GAIA Agent
emoji: π€
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.0.2
app_file: app.py
pinned: false
---
# GAIA Agent

An AI agent that answers GAIA (General AI Assistants) benchmark questions by combining several tools.

## Overview

This agent targets the GAIA benchmark, which tests AI systems on real-world tasks that require reasoning, multimodal understanding, web browsing, and tool use. It combines the tools listed under Features to answer complex questions.
## Features

The GAIA Agent has access to the following tools:

- **Web Search** (DuckDuckGo): Search the web for up-to-date information
- **Code Interpreter**: Execute Python code for calculations and data processing
- **Image Processing**: Analyze images from URLs
- **Weather Information**: Get weather data for a given location
- **Hub Statistics**: Fetch model statistics from the Hugging Face Hub
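
The Notes section says the code interpreter has security restrictions. One way such a restriction could work is a static check of the submitted code before it runs. The sketch below is an assumption, not the contents of `code_interpreter.py`: the allowlist `ALLOWED_MODULES`, the blocklist `BLOCKED_CALLS`, and the function `find_violations` are hypothetical names chosen for illustration.

```python
import ast

# Hypothetical policy; the project's actual restrictions are not documented here.
ALLOWED_MODULES = {"math", "statistics", "json", "re", "datetime"}
BLOCKED_CALLS = {"eval", "exec", "open", "__import__", "compile"}


def find_violations(source: str) -> list[str]:
    """Return the reasons why `source` should be rejected (empty list if none)."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        # Collect the module names touched by import statements.
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            names = []
        for name in names:
            if name.split(".")[0] not in ALLOWED_MODULES:
                violations.append(f"import of '{name}' is not allowed")
        # Flag direct calls to blocked built-ins such as open() or eval().
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in BLOCKED_CALLS
        ):
            violations.append(f"call to '{node.func.id}' is not allowed")
    return violations
```

A static check like this is easy to bypass and is not a sandbox on its own; it would normally be paired with process isolation and timeouts.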
## Architecture

- **Framework**: smolagents
- **Model**: `meta-llama/llama-3.3-70b-instruct:free`, served through the OpenRouter API
- **Planning**: Enabled, with a planning interval of 3 steps
- **Base Tools**: The framework's base tools are enabled in addition to the custom tools
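
The settings above might map onto smolagents roughly as follows. This is a sketch under assumptions, not the code in `agent.py`: it assumes a `CodeAgent` driven by `OpenAIServerModel` pointed at OpenRouter's OpenAI-compatible endpoint, and the helper names `agent_kwargs` and `build_agent` are hypothetical.

```python
import os

MODEL_ID = "meta-llama/llama-3.3-70b-instruct:free"
OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"  # OpenAI-compatible endpoint


def agent_kwargs() -> dict:
    """Agent options mirroring the Architecture list (planning every 3 steps, base tools on)."""
    return {"planning_interval": 3, "add_base_tools": True}


def build_agent(tools=None):
    """Build the agent. smolagents is imported lazily so this module loads without it."""
    from smolagents import CodeAgent, OpenAIServerModel

    model = OpenAIServerModel(
        model_id=MODEL_ID,
        api_base=OPENROUTER_BASE_URL,
        api_key=os.environ["OPENROUTER_API_KEY"],
    )
    # Assumption: the custom tools from tools.py are passed in by the caller.
    return CodeAgent(tools=list(tools or []), model=model, **agent_kwargs())
```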
## Project Structure

```
agent_hugging/
├── agent.py              # Main agent implementation
├── app.py                # Gradio interface for interaction
├── code_interpreter.py   # Python code execution tool
├── image_processing.py   # Image analysis tool
├── tools.py              # Custom tools (search, weather, hub stats)
├── system_prompt.txt     # System prompt for the agent
├── requirements.txt      # Python dependencies
└── README.md             # This file
```
## Setup

1. **Install dependencies:**

   ```bash
   pip install -r requirements.txt
   ```

2. **Set environment variables:**

   ```bash
   export OPENROUTER_API_KEY="your-api-key-here"
   export HF_TOKEN="your-huggingface-token"   # Optional, for Hugging Face Hub operations
   export HF_USERNAME="ArdaKaratas"           # Optional, defaults to ArdaKaratas
   export HF_SPACE_NAME="agent_hugging"       # Optional, defaults to agent_hugging
   ```

   Get a free API key from https://openrouter.ai/keys and a Hugging Face token from https://huggingface.co/settings/tokens.

3. **Run the agent:**

   ```bash
   python agent.py
   ```

4. **Launch the Gradio interface:**

   ```bash
   python app.py
   ```
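
For reference, the environment variables from step 2 could be read in Python along these lines. This is a minimal sketch, not the project's actual loading code: the `Settings` dataclass and `load_settings` function are hypothetical names, and only the variable names and defaults come from the setup instructions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    openrouter_api_key: str
    hf_token: Optional[str]
    hf_username: str
    hf_space_name: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the variables from step 2, applying the documented defaults."""
    api_key = env.get("OPENROUTER_API_KEY", "").strip()
    if not api_key:
        # The API key is the only variable not marked optional.
        raise RuntimeError("OPENROUTER_API_KEY is not set; see https://openrouter.ai/keys")
    return Settings(
        openrouter_api_key=api_key,
        hf_token=env.get("HF_TOKEN") or None,
        hf_username=env.get("HF_USERNAME", "ArdaKaratas"),
        hf_space_name=env.get("HF_SPACE_NAME", "agent_hugging"),
    )
```

Failing early on a missing key gives a clearer error than a rejected request later in the run.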
## Usage

### Testing a Single Question

Use the "Test Single Question" tab in the Gradio interface to:

- Enter a question manually
- Fetch a random question from the benchmark
- Get the agent's answer

### Submitting All Answers

Use the "Submit All Answers" tab to:

1. Enter your Hugging Face username
2. Optionally provide a link to your Space's code
3. Click "Process & Submit All Questions"
4. View the submission status and results

### Viewing Questions

Use the "View All Questions" tab to browse all GAIA benchmark questions.
## API Integration

The app connects to the scoring API at `https://agents-course-unit4-scoring.hf.space`.

Endpoints:

- `GET /questions`: Retrieve all questions
- `GET /random-question`: Get a random question
- `POST /submit`: Submit answers for scoring
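
The endpoints above can be called with the standard library alone. The sketch below is illustrative rather than the code in `app.py`: the helper names are hypothetical, and the submission field names (`username`, `agent_code`, `answers`, `task_id`, `submitted_answer`) are an assumption about the payload the scoring API expects, so check them against the API before relying on them.

```python
import json
import urllib.request

API_BASE = "https://agents-course-unit4-scoring.hf.space"


def endpoint(path: str) -> str:
    """Join a path such as '/questions' onto the scoring API base URL."""
    return f"{API_BASE}/{path.lstrip('/')}"


def get_json(path: str, timeout: float = 30.0):
    """GET an endpoint (e.g. '/questions' or '/random-question') and decode the JSON body."""
    with urllib.request.urlopen(endpoint(path), timeout=timeout) as response:
        return json.load(response)


def build_submission(username: str, agent_code: str, answers: dict) -> dict:
    """Assemble a submission body. Field names are assumed, not taken from this README."""
    return {
        "username": username,
        "agent_code": agent_code,
        "answers": [
            {"task_id": task_id, "submitted_answer": answer}
            for task_id, answer in answers.items()
        ],
    }


def submit(payload: dict, timeout: float = 60.0):
    """POST the payload to /submit and return the decoded JSON response."""
    request = urllib.request.Request(
        endpoint("/submit"),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```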
## `metadata.jsonl` Support

The project includes `metadata.jsonl`, which contains GAIA benchmark questions and their correct answers. This file is used for:

1. **Testing & Validation**: Compare the agent's answers with the correct answers from the metadata
2. **Debugging**: See the expected answer when testing the agent
3. **Development**: Understand question patterns and expected answer formats

**Note**: In production, the agent generates its own answers. The metadata is used only for comparison and validation.
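
A lookup of expected answers from the file could be built as follows. This is a sketch under assumptions: `load_expected_answers` is a hypothetical helper, and the field names `task_id` and `Final answer` follow the GAIA dataset's metadata layout, which may differ from the copy shipped with this project.

```python
import json
from typing import Iterable


def load_expected_answers(lines: Iterable[str]) -> dict:
    """Map task_id -> expected answer from JSONL lines (one JSON object per line)."""
    expected = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        # Assumed field names; adjust if the local file uses different keys.
        expected[record["task_id"]] = record["Final answer"]
    return expected
```

With a file on disk, pass the open handle directly: `with open("metadata.jsonl", encoding="utf-8") as f: expected = load_expected_answers(f)`.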
## Notes

- The agent returns answers directly, without a "FINAL ANSWER" prefix
- Answers are scored by exact match
- Make sure your Space is public so it can be verified
- The code interpreter has security restrictions to prevent dangerous operations
- Use the "Compare with metadata.jsonl" checkbox in the test interface to see how the agent's answers compare with the correct answers
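
Because scoring is by exact match, it helps to strip any leftover prefix before comparing. The sketch below is one possible approach, not the project's implementation: `clean_answer` and `is_exact_match` are hypothetical names, and trimming surrounding whitespace is an assumption about what the comparison tolerates.

```python
import re

# Matches a leading "FINAL ANSWER" label, with or without a colon, in any letter case.
_PREFIX = re.compile(r"^\s*final answer\s*:?\s*", re.IGNORECASE)


def clean_answer(raw: str) -> str:
    """Remove a leading 'FINAL ANSWER' label and surrounding whitespace."""
    return _PREFIX.sub("", raw, count=1).strip()


def is_exact_match(agent_answer: str, expected: str) -> bool:
    """Case-sensitive comparison after cleaning; whitespace trimming is an assumption."""
    return clean_answer(agent_answer) == expected.strip()
```

The comparison stays case-sensitive on purpose, so "paris" does not match "Paris"; loosen it only if the scorer is known to do the same.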
## License

This project is part of the Hugging Face AI Agents Course (Unit 4 Final Assignment).