ArdaKaratas committed · verified · Commit 39ea998 · Parent(s): a4d2216

Update README.md

Files changed (1): README.md (+129 −130)
---
title: GAIA Agent
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.0.2
app_file: app.py
pinned: false
---

# 🤖 GAIA Agent

A sophisticated AI agent designed to solve GAIA (General AI Assistants) benchmark questions using multiple tools and capabilities.

## Overview

This agent is built to tackle the GAIA benchmark, which tests AI systems on real-world tasks requiring reasoning, multi-modal understanding, web browsing, and tool usage. The agent combines multiple tools to provide accurate answers to complex questions.

## Features

The GAIA Agent has access to the following tools:

- **Web Search** (DuckDuckGo): Search the web for up-to-date information
- **Code Interpreter**: Execute Python code for calculations and data processing
- **Image Processing**: Analyze images from URLs
- **Weather Information**: Get weather data for any location
- **Hub Statistics**: Fetch model statistics from the Hugging Face Hub

## Architecture

- **Framework**: smolagents
- **Model**: OpenRouter API (meta-llama/llama-3.3-70b-instruct:free)
- **Planning**: Enabled, with a planning interval of 3 steps
- **Base Tools**: smolagents base tools enabled in addition to the custom tools

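The wiring above might look like the following sketch. This is hypothetical, not the repository's actual `agent.py`: it assumes smolagents' `OpenAIServerModel` pointed at OpenRouter's OpenAI-compatible endpoint, which is one common way to combine the two.

```python
# Hypothetical wiring sketch; the actual agent.py may differ.
import os

from smolagents import CodeAgent, DuckDuckGoSearchTool, OpenAIServerModel

# OpenRouter exposes an OpenAI-compatible endpoint, so OpenAIServerModel works.
model = OpenAIServerModel(
    model_id="meta-llama/llama-3.3-70b-instruct:free",
    api_base="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

agent = CodeAgent(
    tools=[DuckDuckGoSearchTool()],  # plus the custom tools from tools.py
    model=model,
    add_base_tools=True,             # the base tools mentioned above
    planning_interval=3,             # re-plan every 3 steps
)

print(agent.run("Which city hosts the Hugging Face headquarters?"))
```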
## Project Structure

```
agent_hugging/
├── agent.py             # Main agent implementation
├── app.py               # Gradio interface for interaction
├── code_interpreter.py  # Python code execution tool
├── image_processing.py  # Image analysis tool
├── tools.py             # Custom tools (search, weather, hub stats)
├── system_prompt.txt    # System prompt for the agent
├── requirements.txt     # Python dependencies
└── README.md            # This file
```

## Setup

1. **Install dependencies:**
   ```bash
   pip install -r requirements.txt
   ```

2. **Set environment variables:**
   ```bash
   export OPENROUTER_API_KEY="your-api-key-here"
   export HF_TOKEN="your-huggingface-token"  # Optional, for Hugging Face Hub operations
   export HF_USERNAME="ArdaKaratas"          # Optional, defaults to ArdaKaratas
   export HF_SPACE_NAME="agent_hugging"      # Optional, defaults to agent_hugging
   ```

   Get a free API key from: https://openrouter.ai/keys
   Get your Hugging Face token from: https://huggingface.co/settings/tokens

3. **Run the agent:**
   ```bash
   python agent.py
   ```

4. **Launch the Gradio interface:**
   ```bash
   python app.py
   ```
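In Python, the variables above can be read in one place with the documented defaults. This is a sketch: the function name `load_config` is illustrative, not taken from the repository.

```python
import os

def load_config(env=None):
    """Read the agent's settings from environment variables.

    OPENROUTER_API_KEY is required; the others fall back to the
    defaults documented above. `load_config` is an illustrative
    helper, not a function from the repository.
    """
    env = os.environ if env is None else env
    return {
        "openrouter_api_key": env["OPENROUTER_API_KEY"],  # required
        "hf_token": env.get("HF_TOKEN"),                  # optional
        "hf_username": env.get("HF_USERNAME", "ArdaKaratas"),
        "hf_space_name": env.get("HF_SPACE_NAME", "agent_hugging"),
    }
```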

## Usage

### Testing a Single Question

Use the "Test Single Question" tab in the Gradio interface to:
- Enter a question manually
- Fetch a random question from the benchmark
- Get the agent's answer

### Submitting All Answers

Use the "Submit All Answers" tab to:
1. Enter your Hugging Face username
2. Optionally provide your Space code link
3. Click "Process & Submit All Questions"
4. View the submission status and results

### Viewing Questions

Use the "View All Questions" tab to browse all GAIA benchmark questions.

## API Integration

The app connects to the scoring API at `https://agents-course-unit4-scoring.hf.space`.

Endpoints:
- `GET /questions`: Retrieve all questions
- `GET /random-question`: Get a random question
- `POST /submit`: Submit answers for scoring

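A minimal client for these endpoints could look like the sketch below, using `requests`. The payload field names (`agent_code`, `task_id`, `submitted_answer`) are assumptions about the scoring API's schema, not something this README specifies.

```python
import requests

BASE_URL = "https://agents-course-unit4-scoring.hf.space"

def get_questions(base_url=BASE_URL):
    """GET /questions -- retrieve all questions."""
    resp = requests.get(f"{base_url}/questions", timeout=30)
    resp.raise_for_status()
    return resp.json()

def build_submission(username, agent_code_url, answers):
    """Assemble the POST /submit payload.

    Field names here are assumptions about the API's schema.
    """
    return {
        "username": username,
        "agent_code": agent_code_url,
        "answers": answers,  # e.g. [{"task_id": ..., "submitted_answer": ...}]
    }

def submit_answers(username, agent_code_url, answers, base_url=BASE_URL):
    """POST /submit -- submit answers for scoring."""
    payload = build_submission(username, agent_code_url, answers)
    resp = requests.post(f"{base_url}/submit", json=payload, timeout=60)
    resp.raise_for_status()
    return resp.json()
```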
## `metadata.jsonl` Support

The project includes `metadata.jsonl`, which contains GAIA benchmark questions and their correct answers. This file is used for:

1. **Testing & Validation**: Compare agent answers with the correct answers from metadata
2. **Debugging**: See expected answers when testing the agent
3. **Development**: Understand question patterns and expected answer formats

**Note**: In production, the agent generates its own answers. The metadata is used only for comparison and validation.

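Loading the file and checking an answer against it can be sketched as follows. The function names are illustrative, and the record field names in `metadata.jsonl` may differ from what any given test assumes.

```python
import json

def load_metadata(path="metadata.jsonl"):
    """Read a JSONL file: one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

def exact_match(predicted, expected):
    """The exact-match comparison mentioned in the Notes:
    answers must be identical after trimming surrounding whitespace."""
    return predicted.strip() == expected.strip()
```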
## Notes

- The agent returns answers directly, without a "FINAL ANSWER" prefix
- Answers are compared using exact match
- Make sure your Space is public for verification
- The code interpreter has security restrictions to prevent dangerous operations
- Use the "Compare with metadata.jsonl" checkbox in the test interface to see how your agent's answers compare to the correct answers

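The "security restrictions" bullet is not specified further here. One common approach, shown purely as an illustrative sketch (not the repository's actual `code_interpreter.py`), is to scan the code's AST for blocked imports before executing it:

```python
import ast

# Illustrative blocklist -- the real interpreter's rules are not documented here.
BLOCKED_MODULES = {"os", "subprocess", "shutil", "socket"}

def check_code(source):
    """Return the blocked top-level modules imported by `source`.

    An empty list means the code passes this (hypothetical) check.
    """
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [(node.module or "").split(".")[0]]
        else:
            continue
        violations += [name for name in names if name in BLOCKED_MODULES]
    return violations
```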
## License

This project is part of the Hugging Face AI Agents Course - Unit 4 Final Assignment.