
title: GAIA Agent
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.0.2
app_file: app.py
pinned: false

🤖 GAIA Agent

An AI agent that answers GAIA (General AI Assistants) benchmark questions by combining several tools.

Overview

This agent is built to tackle the GAIA benchmark, which tests AI systems on real-world tasks requiring reasoning, multi-modal understanding, web browsing, and tool usage. The agent combines multiple tools to provide accurate answers to complex questions.

Features

The GAIA Agent has access to the following tools:

Web Search (DuckDuckGo): Search the web for up-to-date information
Code Interpreter: Execute Python code for calculations and data processing
Image Processing: Analyze images from URLs
Weather Information: Get weather data for any location
Hub Statistics: Fetch model statistics from Hugging Face Hub

Architecture

Framework: smolagents
Model: OpenRouter API (meta-llama/llama-3.3-70b-instruct:free)
Planning: Enabled with interval of 3 steps
Base Tools: The framework's built-in base tools are enabled in addition to the custom tools listed above
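
The settings above might map onto smolagents roughly as follows. This is a minimal sketch, not the project's actual agent.py: the helper name build_agent, the choice of CodeAgent, and the use of OpenAIServerModel against OpenRouter's OpenAI-compatible endpoint are all assumptions. Only the model ID and the planning interval come from the list above.

```python
import os

# Values taken from the Architecture list.
MODEL_ID = "meta-llama/llama-3.3-70b-instruct:free"
PLANNING_INTERVAL = 3

# OpenRouter's OpenAI-compatible base URL (assumed to be how the model is reached).
OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"


def build_agent(tools=(), api_key=None):
    """Hypothetical helper: build a smolagents agent with the settings above."""
    # Imported lazily so this module loads even when smolagents is not installed.
    from smolagents import CodeAgent, OpenAIServerModel

    model = OpenAIServerModel(
        model_id=MODEL_ID,
        api_base=OPENROUTER_BASE_URL,
        api_key=api_key or os.environ["OPENROUTER_API_KEY"],
    )
    return CodeAgent(
        tools=list(tools),                    # custom tools would be passed here
        model=model,
        add_base_tools=True,                  # "Base Tools" setting
        planning_interval=PLANNING_INTERVAL,  # re-plan every 3 steps
    )
```

With a valid key set, calling build_agent().run("your question") would start a run; the custom tools from tools.py would be supplied through the tools argument.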

Project Structure

agent_hugging/
├── agent.py              # Main agent implementation
├── app.py                # Gradio interface for interaction
├── code_interpreter.py   # Python code execution tool
├── image_processing.py   # Image analysis tool
├── tools.py              # Custom tools (search, weather, hub stats)
├── system_prompt.txt     # System prompt for the agent
├── requirements.txt      # Python dependencies
└── README.md             # This file

Setup

Install dependencies:

pip install -r requirements.txt

Set environment variables:

export OPENROUTER_API_KEY="your-api-key-here"
export HF_TOKEN="your-huggingface-token"  # Optional, for Hugging Face Hub operations
export HF_USERNAME="ArdaKaratas"  # Optional, defaults to ArdaKaratas
export HF_SPACE_NAME="agent_hugging"  # Optional, defaults to agent_hugging

Get a free API key from: https://openrouter.ai/keys
Get your Hugging Face token from: https://huggingface.co/settings/tokens
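
On the Python side, the variables above could be read along these lines. This is a sketch only: the function name load_settings and the returned dictionary keys are hypothetical, and treating OPENROUTER_API_KEY as required is an assumption based on it not being marked optional. The defaults are the ones stated in the comments above.

```python
import os


def load_settings(env=None):
    """Hypothetical helper: collect the environment variables described above."""
    env = os.environ if env is None else env

    api_key = env.get("OPENROUTER_API_KEY")
    if not api_key:
        # Assumed to be required, since it is the only variable not marked optional.
        raise RuntimeError("OPENROUTER_API_KEY is not set")

    return {
        "openrouter_api_key": api_key,
        "hf_token": env.get("HF_TOKEN"),  # optional, None when unset
        "hf_username": env.get("HF_USERNAME", "ArdaKaratas"),
        "hf_space_name": env.get("HF_SPACE_NAME", "agent_hugging"),
    }
```

Passing a plain dictionary instead of os.environ makes the helper easy to exercise without touching the real environment.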

Run the agent:

python agent.py

Launch the Gradio interface:

python app.py

Usage

Testing a Single Question

Use the "Test Single Question" tab in the Gradio interface to:

Enter a question manually
Fetch a random question from the benchmark
Get the agent's answer

Submitting All Answers

Use the "Submit All Answers" tab to:

Enter your Hugging Face username
Optionally provide your Space code link
Click "Process & Submit All Questions"
View the submission status and results

Viewing Questions

Use the "View All Questions" tab to browse all GAIA benchmark questions.

API Integration

The app connects to the scoring API at: https://agents-course-unit4-scoring.hf.space

Endpoints:

GET /questions: Retrieve all questions
GET /random-question: Get a random question
POST /submit: Submit answers for scoring
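
A small standard-library client for these endpoints might look like the sketch below. The function names are hypothetical, and the submission payload shape (username, agent_code, and a list of task_id / submitted_answer pairs) is an assumption that is not stated in this README; check it against the scoring API before relying on it.

```python
import json
import urllib.request

API_URL = "https://agents-course-unit4-scoring.hf.space"


def get_json(path, timeout=30):
    """GET a path such as "/questions" or "/random-question" and decode the JSON body."""
    with urllib.request.urlopen(f"{API_URL}{path}", timeout=timeout) as resp:
        return json.load(resp)


def build_submission(username, agent_code, answers):
    """Build the POST /submit body from a {task_id: answer} mapping (assumed schema)."""
    return {
        "username": username,
        "agent_code": agent_code,
        "answers": [
            {"task_id": task_id, "submitted_answer": answer}
            for task_id, answer in answers.items()
        ],
    }


def submit(payload, timeout=60):
    """POST the payload to /submit and return the decoded JSON response."""
    request = urllib.request.Request(
        f"{API_URL}/submit",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Keeping build_submission separate from the network call means the payload can be inspected or logged before anything is sent.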

Metadata.jsonl Support

The project includes metadata.jsonl, which contains GAIA benchmark questions and their correct answers. This file is used for:

Testing & Validation: Compare agent answers with correct answers from metadata
Debugging: See expected answers when testing the agent
Development: Understand question patterns and expected answer formats

Note: In production, the agent generates its own answers. The metadata is only used for comparison and validation purposes.
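
The comparison described above could be sketched as follows. The field names task_id and "Final answer" are assumptions about the layout of metadata.jsonl, the function names are hypothetical, and trimming surrounding whitespace before comparing is a choice made here rather than documented behavior.

```python
import json


def load_expected_answers(lines):
    """Map task_id to expected answer from JSONL lines (field names are assumed)."""
    expected = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        expected[record["task_id"]] = record["Final answer"]
    return expected


def is_correct(task_id, answer, expected):
    """Exact match after trimming surrounding whitespace (trimming is an assumption)."""
    if task_id not in expected:
        return False
    return answer.strip() == str(expected[task_id]).strip()
```

Because load_expected_answers accepts any iterable of lines, an open file handle for metadata.jsonl can be passed to it directly.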

Notes

The agent returns answers directly, without a "FINAL ANSWER" prefix
Answers are compared using exact match
Make sure your Space is public for verification
The code interpreter has security restrictions to prevent dangerous operations
Use the "Compare with metadata.jsonl" checkbox in the test interface to see how your agent's answers compare to the correct answers
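
Because answers are scored by exact match, a leftover "FINAL ANSWER" prefix would turn an otherwise correct answer into a mismatch. One way to guard against that is sketched below; the function name and the exact pattern are hypothetical and not taken from the project's code.

```python
import re

# Matches an optional leading "FINAL ANSWER" marker, with or without a colon.
_PREFIX = re.compile(r"^\s*final answer\s*:?\s*", re.IGNORECASE)


def strip_final_answer_prefix(text):
    """Remove a leading "FINAL ANSWER" marker and surrounding whitespace."""
    return _PREFIX.sub("", text, count=1).strip()
```

Text that carries no prefix passes through unchanged apart from whitespace trimming.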

License

This project is part of the Hugging Face AI Agents Course - Unit 4 Final Assignment.