Merged
Commits
57 commits
e540aa6
tutorial
recursix Aug 8, 2025
68ba773
tutorial
recursix Aug 8, 2025
19e1774
Update readme to include test note
recursix Aug 8, 2025
39850df
update toml to dynamic requirements and add uv.lock file
amanjaiswal73892 Aug 8, 2025
882a828
Add tutorial to setup python env with uv
amanjaiswal73892 Aug 8, 2025
3249687
tutorial 2
recursix Aug 8, 2025
12018d3
Merge branch 'tutorial' of https://github.com/ServiceNow/AgentLab int…
recursix Aug 8, 2025
d61e8a1
Update dependencies in pyproject.toml and uv.lock to allow for newer …
amanjaiswal73892 Aug 8, 2025
bae21b9
Implement code changes to enhance functionality and improve performance
recursix Aug 8, 2025
108ad77
Merge branch 'tutorial' of https://github.com/ServiceNow/AgentLab int…
recursix Aug 8, 2025
57c10ad
Fix tutorial instructions by moving git clone and cd commands to the …
recursix Aug 8, 2025
69088d0
Refactor tutorial content and remove commented-out dependencies in py…
recursix Aug 11, 2025
6121ea9
add instruction to activate the env
amanjaiswal73892 Aug 11, 2025
af237a3
Add support for GPT-5 models and update tutorial instructions
recursix Aug 11, 2025
f454098
Update OpenAI API Key instructions in tutorial
recursix Aug 11, 2025
ae7a02d
Refactor tutorial headings for consistency and clarity
recursix Aug 11, 2025
8f90090
add oai oss and gpt-5 models
amanjaiswal73892 Aug 7, 2025
032e893
Update deprecated param `max_tokens` -> `max_completion_tokens` in ch…
amanjaiswal73892 Aug 7, 2025
101d2c9
add OpenRouter versions of gpt 5 model series.
amanjaiswal73892 Aug 7, 2025
e79fb28
port o3 model to openrouter
amanjaiswal73892 Aug 11, 2025
7740643
update response api test
amanjaiswal73892 Aug 11, 2025
cf6826f
remove deprecated o1-mini model from main.py
amanjaiswal73892 Aug 11, 2025
acc74e8
Add Gpt-5-nano in tool-use-agent
amanjaiswal73892 Aug 11, 2025
9c8e7e3
fix GPT 5 mini and nano config
amanjaiswal73892 Aug 11, 2025
99e69aa
Add litellm pricing as a backup pricing backend.
amanjaiswal73892 Aug 11, 2025
752a485
Add GPT-5 mini agent
amanjaiswal73892 Aug 11, 2025
53b74ad
Add GPT-5-Mini to agentlab-assistant.
amanjaiswal73892 Aug 11, 2025
963c13f
Add initial readme for prompt injection tutorial
recursix Aug 11, 2025
ef50be5
add ipykernel and dot_env to dependency
amanjaiswal73892 Aug 11, 2025
03ed6d3
add notebook to setup miniwob and launch experiments.
amanjaiswal73892 Aug 11, 2025
7e177bf
update formatting in launch_experiments.ipynb
amanjaiswal73892 Aug 11, 2025
13e8f98
update readme in 2_eval_on_miniwob
amanjaiswal73892 Aug 11, 2025
00db3ca
update readme for 2_eval_on_miniwob and grammar fix.
amanjaiswal73892 Aug 11, 2025
c7559c4
grammar fix readme tutorial 2.
amanjaiswal73892 Aug 11, 2025
3710519
Add prompt injection tutorials and update attack scenarios
recursix Aug 12, 2025
605a6a5
update T1 readme with a note to install additional playwright deps.
amanjaiswal73892 Aug 12, 2025
bbeef14
Update readme.md
recursix Aug 12, 2025
501c8c3
Update readme.md
recursix Aug 12, 2025
97e89b3
Update readme.md
recursix Aug 12, 2025
b6f1062
clear output
recursix Aug 12, 2025
158210c
Merge branch 'tutorial' of https://github.com/ServiceNow/AgentLab int…
recursix Aug 12, 2025
8b7ab0d
add miniwob automatic install in agentlab.
amanjaiswal73892 Aug 13, 2025
3591c8f
update experiment.py to include miniwob auto-install and envars expo…
amanjaiswal73892 Aug 13, 2025
5429aca
black refactor agent-config.py
amanjaiswal73892 Aug 13, 2025
ad09e2a
Add cmd to checkout tutorial branch
amanjaiswal73892 Aug 13, 2025
a25b291
remove launch_experiment notebook from T2
amanjaiswal73892 Aug 13, 2025
87351f4
minor fixes in T1 read me and spell check,
amanjaiswal73892 Aug 13, 2025
e7785d7
update CI/CD to use uv
amanjaiswal73892 Aug 13, 2025
1c198fb
merge with main
amanjaiswal73892 Aug 13, 2025
450105d
Implement code changes to enhance functionality and improve performance
recursix Aug 13, 2025
3e4b72e
Update README and experiment script for clarity and consistency
recursix Aug 13, 2025
5c96a6b
Fix stale tests.
amanjaiswal73892 Aug 13, 2025
e440c44
fix stale test
amanjaiswal73892 Aug 13, 2025
b566211
add darglint as dev dependency
amanjaiswal73892 Aug 13, 2025
179b1f4
update CI/CD for uv.
amanjaiswal73892 Aug 13, 2025
43789ed
update CI/CD apply formatting only src.
amanjaiswal73892 Aug 13, 2025
2ed6a1c
update darglint to be run from py3.12
amanjaiswal73892 Aug 13, 2025
20 changes: 11 additions & 9 deletions .github/workflows/code_format.yml
@@ -18,17 +18,19 @@ jobs:
       - name: Checkout Repository
         uses: actions/checkout@v4

-      - name: Set up Python
-        uses: actions/setup-python@v5
+      - name: Install uv
+        uses: astral-sh/setup-uv@v4
         with:
-          python-version: '3.11'
-          cache: 'pip' # caching pip dependencies
+          enable-cache: true

+      - name: Set up Python
+        run: uv python install 3.11
+
-      - name: Pip install
-        run: pip install -r requirements.txt
+      - name: Install dependencies
+        run: uv sync --frozen --extra dev

-      - name: Pip list
-        run: pip list
+      - name: List packages
+        run: uv pip list

       - name: Code Formatting
-        run: black . --check --diff
+        run: uv run black src/ --check --diff
20 changes: 11 additions & 9 deletions .github/workflows/darglint.yml
@@ -18,17 +18,19 @@ jobs:
       - name: Checkout Repository
         uses: actions/checkout@v4

-      - name: Set up Python
-        uses: actions/setup-python@v5
+      - name: Install uv
+        uses: astral-sh/setup-uv@v4
         with:
-          python-version: '3.12'
-          cache: 'pip' # caching pip dependencies
+          enable-cache: true

+      - name: Set up Python
+        run: uv python install 3.12 # this fails in 3.11
+
-      - name: Pip install
-        run: pip install darglint
+      - name: Install dependencies
+        run: uv sync --frozen --extra dev

-      - name: Pip list
-        run: pip list
+      - name: List packages
+        run: uv pip list

       - name: Darglint checks
-        run: darglint -v 2 -z short src/
+        run: uv run darglint -v 2 -z short src/
15 changes: 9 additions & 6 deletions .github/workflows/python_version_compatibility.yml
@@ -32,9 +32,12 @@ jobs:
       - name: Check Python ${{ matrix.python-version }}
         continue-on-error: true
         run: |
-          export PATH="$HOME/.cargo/bin:$PATH"
-          if uvx --python ${{ matrix.python-version }} --from python --with-requirements requirements.txt python -c "print('✅ Compatible')"; then
-            echo "✅ Python ${{ matrix.python-version }} works"
-          else
-            echo "❌ Python ${{ matrix.python-version }} incompatible"
-          fi
+          export PATH="$HOME/.cargo/bin:$PATH"
+          uv python install ${{ matrix.python-version }}
+          if uv sync --frozen --python ${{ matrix.python-version }}; then
+            uv run -p ${{ matrix.python-version }} python -c "import sys; print('✅ Compatible:', sys.version)"
+            echo "✅ Python ${{ matrix.python-version }} works"
+          else
+            echo "❌ Python ${{ matrix.python-version }} incompatible"
+            exit 1
+          fi
24 changes: 13 additions & 11 deletions .github/workflows/unit_tests.yml
@@ -23,23 +23,25 @@ jobs:
       - name: Set up Git user
         run: git config --global user.email "not_a_real_email@address.com" && git config --global user.name "GitHub Actions"

+      - name: Install uv
+        uses: astral-sh/setup-uv@v4
+        with:
+          enable-cache: true
+
       - name: Set up Python
-        uses: actions/setup-python@v5
-        with: # python at least 3.11
-          python-version: '3.11'
-          cache: 'pip' # caching pip dependencies
+        run: uv python install 3.11

       - name: Install AgentLab
-        run: pip install -e .
+        run: uv sync --frozen --extra dev

-      - name: Pip list
-        run: pip list
+      - name: List packages
+        run: uv pip list

       - name: Install Playwright
-        run: playwright install chromium --with-deps
+        run: uv run playwright install chromium --with-deps

       - name: Download WebArena / VisualWebArena ressource files
-        run: python -c 'import nltk; nltk.download("punkt_tab")'
+        run: uv run python -c 'import nltk; nltk.download("punkt_tab")'

       - name: Fetch MiniWob
         uses: actions/checkout@v4
@@ -59,9 +61,9 @@ jobs:
         run: curl -I "http://localhost:8080/miniwob/" || echo "MiniWob not reachable"

       - name: Pre-download nltk ressources
-        run: python -c "import nltk; nltk.download('punkt_tab')"
+        run: uv run python -c "import nltk; nltk.download('punkt_tab')"

       - name: Run AgentLab Unit Tests
         env:
           MINIWOB_URL: "http://localhost:8080/miniwob/"
-        run: pytest -n 5 --durations=10 -m 'not pricy' -v tests/
+        run: uv run pytest -n 5 --durations=10 -m 'not pricy' -v tests/
4 changes: 4 additions & 0 deletions .readthedocs.yaml
@@ -32,4 +32,8 @@ sphinx:
 # See https://docs.readthedocs.io/en/stable/guides/reproducible-builds.html
 python:
   install:
+    - method: pip
+      path: .
+      extra_requirements:
+        - dev
     - requirements: docs/source/requirements.txt
4 changes: 2 additions & 2 deletions docs/source/requirements.txt
@@ -1,2 +1,2 @@
-agentlab
-sphinx-rtd-theme
+sphinx-rtd-theme
+sphinx
2 changes: 1 addition & 1 deletion main.py
@@ -15,9 +15,9 @@
     AGENT_4o,
     AGENT_4o_MINI,
     AGENT_o3_MINI,
-    AGENT_o1_MINI,
     AGENT_37_SONNET,
     AGENT_CLAUDE_SONNET_35,
+    AGENT_GPT5_MINI,
 )
 from agentlab.experiments.study import Study

53 changes: 51 additions & 2 deletions pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

 [project]
 name = "agentlab"
-dynamic = ["version", "dependencies"]
+dynamic = ["version"]
 description = "Main package for developing agents and experiments"
 authors = [
     {name = "Rim Assouel", email = "rim.assouel@gmail.com"},
@@ -27,13 +27,55 @@ classifiers = [
     "Topic :: Scientific/Engineering :: Artificial Intelligence",
     "License :: OSI Approved :: Apache Software License",
 ]
+dependencies = [
+    "pydantic~=2.9",
+    "dask",
+    "distributed",
+    "browsergym>=0.7.1",
+    "joblib>=1.2.0",
+    "openai>=1.7,<2",
+    "langchain_community",
+    "tiktoken",
+    "huggingface_hub",
+    "contexttimer",
+    "ipython",
+    "pyyaml>=6",
+    "pandas",
+    "gradio>=5.5",
+    "gitpython",
+    "requests",
+    "matplotlib",
+    "ray[default]",
+    "python-slugify",
+    "pillow",
+    "gymnasium>=0.27",
+    "torch>=2.2.2",
+    "safetensors>=0.4.0",
+    "transformers>=4.38.2",
+    "anthropic>=0.62.0",
+    "litellm>=1.75.3",
+    "python-dotenv>=1.1.1",
+]
+
+[project.optional-dependencies]
+dev = [
+    "black[jupyter]>=24.2.0",
+    "blacken-docs",
+    "pre-commit",
+    "pytest==7.3.2",
+    "flaky",
+    "pytest-xdist",
+    "pytest-playwright",
+]
+# tapeagents = [
+# "tapeagents[converters]",
+# ]

 [project.urls]
 "Homepage" = "https://github.com/ServiceNow/AgentLab"

 [tool.setuptools.dynamic]
 version = {attr = "agentlab.__version__"}
-dependencies = {file = ["requirements.txt"]}

 [tool.black]
 line-length = 100
@@ -54,6 +96,13 @@ exclude = '''
 )/
 '''

+[dependency-groups]
+dev = [
+    "darglint>=1.8.1",
+    "ipykernel>=6.30.1",
+    "pip>=25.2",
+]
+

 [project.scripts]
 agentlab-assistant = "agentlab.ui_assistant:main"
29 changes: 0 additions & 29 deletions requirements.txt

This file was deleted.

2 changes: 2 additions & 0 deletions src/agentlab/agents/generic_agent/__init__.py
@@ -26,6 +26,7 @@
     AGENT_o3_MINI,
     FLAGS_GPT_4o,
     GenericAgentArgs,
+    AGENT_GPT5_MINI,
 )

 __all__ = [
@@ -46,4 +47,5 @@
     "AGENT_4o_VISION",
     "AGENT_4o_MINI_VISION",
     "AGENT_CLAUDE_SONNET_35_VISION",
+    "AGENT_GPT5_MINI",
 ]
18 changes: 17 additions & 1 deletion src/agentlab/agents/generic_agent/agent_configs.py
@@ -270,8 +270,12 @@
     chat_model_args=CHAT_MODEL_ARGS_DICT["openrouter/anthropic/claude-3.7-sonnet"],
     flags=FLAGS_GPT_4o,
 )
+# AGENT_o3_MINI = GenericAgentArgs(
+# chat_model_args=CHAT_MODEL_ARGS_DICT["openai/o3-mini-2025-01-31"],
+# flags=FLAGS_GPT_4o,
+# )
 AGENT_o3_MINI = GenericAgentArgs(
-    chat_model_args=CHAT_MODEL_ARGS_DICT["openai/o3-mini-2025-01-31"],
+    chat_model_args=CHAT_MODEL_ARGS_DICT["openrouter/openai/o3-mini"],
     flags=FLAGS_GPT_4o,
 )

@@ -302,6 +306,18 @@
     chat_model_args=CHAT_MODEL_ARGS_DICT["openrouter/meta-llama/llama-4-maverick"],
     flags=BASE_FLAGS,
 )
+GPT5_MINI_FLAGS = BASE_FLAGS.copy()
+GPT5_MINI_FLAGS.action = dp.ActionFlags(  # action should not be str to work with agentlab-assistant
+    action_set=HighLevelActionSetArgs(
+        subsets=["bid"],
+        multiaction=False,
+    )
+)
+
+AGENT_GPT5_MINI = GenericAgentArgs(
+    chat_model_args=CHAT_MODEL_ARGS_DICT["openai/gpt-5-mini-2025-08-07"],
+    flags=GPT5_MINI_FLAGS,
+)

 DEFAULT_RS_FLAGS = GenericPromptFlags(
     flag_group="default_rs",
32 changes: 30 additions & 2 deletions src/agentlab/agents/tool_use_agent/tool_use_agent.py
@@ -510,6 +510,26 @@ def get_action(self, obs: Any) -> float:
     vision_support=True,
 )

+GPT_5_mini = OpenAIChatModelArgs(
+    model_name="gpt-5-mini-2025-08-07",
+    max_total_tokens=400_000,
+    max_input_tokens=400_000 - 4_000,
+    max_new_tokens=4_000,
+    temperature=1,  # Only temperature 1 works for gpt-5-mini
+    vision_support=True,
+)
+
+
+GPT_5_nano = OpenAIChatModelArgs(
+    model_name="gpt-5-nano-2025-08-07",
+    max_total_tokens=400_000,
+    max_input_tokens=400_000 - 4_000,
+    max_new_tokens=4_000,
+    temperature=1,  # Only temperature 1 works for gpt-5-nano
+    vision_support=True,
+)
+
+
 GPT_4_1_MINI = OpenAIResponseModelArgs(
     model_name="gpt-4.1-mini",
     max_total_tokens=200_000,
@@ -578,7 +598,7 @@ def get_action(self, obs: Any) -> float:
     general_hints=GeneralHints(use_hints=False),
     task_hint=TaskHint(use_task_hint=True),
     keep_last_n_obs=None,
-    multiaction=True,  # whether to use multi-action or not
+    multiaction=False,  # whether to use multi-action or not
     # action_subsets=("bid",),
     action_subsets=("coord"),
     # action_subsets=("coord", "bid"),
@@ -590,7 +610,15 @@ def get_action(self, obs: Any) -> float:
 )

 OAI_AGENT = ToolUseAgentArgs(
-    model_args=GPT_4_1,
+    model_args=GPT_5_mini,
     config=DEFAULT_PROMPT_CONFIG,
 )
+GPT5_1_NANO_AGENT = ToolUseAgentArgs(
+    model_args=GPT_5_nano,
+    config=DEFAULT_PROMPT_CONFIG,
+)
+GPT5_1_MINI_AGENT = ToolUseAgentArgs(
+    model_args=GPT_5_mini,
+    config=DEFAULT_PROMPT_CONFIG,
+)

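Both GPT-5 entries in `tool_use_agent.py` use the same budget arithmetic: the input budget is the total context window minus the tokens reserved for the completion. A small sketch of that relationship, using a hypothetical `TokenBudget` dataclass (an illustration only, not AgentLab's `OpenAIChatModelArgs`) with the values taken from the diff:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenBudget:
    """Hypothetical stand-in for the token fields of a model config."""

    max_total_tokens: int
    max_new_tokens: int

    @property
    def max_input_tokens(self) -> int:
        # Reserve room for the completion inside the total window.
        return self.max_total_tokens - self.max_new_tokens

    def fits(self, prompt_tokens: int) -> bool:
        """True if a prompt of this size leaves the full completion budget."""
        return 0 <= prompt_tokens <= self.max_input_tokens


gpt5_mini_budget = TokenBudget(max_total_tokens=400_000, max_new_tokens=4_000)
print(gpt5_mini_budget.max_input_tokens)  # 396000
print(gpt5_mini_budget.fits(396_000), gpt5_mini_budget.fits(396_001))  # True False
```

Deriving `max_input_tokens` instead of writing it out a second time is a design choice of this sketch; the diff spells out `400_000 - 4_000` explicitly in each config.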