@ammar-agent

Adds a new web_fetch tool that:

  • Fetches web pages using curl via the Runtime (respects workspace network context)
  • Extracts main content using Mozilla Readability
  • Converts to clean markdown using Turndown
  • Supports both markdown and plain text output formats
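
As a rough illustration of the fetch step, the sketch below builds a curl argument list for a single page request. The helper name `buildCurlArgs`, the `FetchOptions` shape, and the 15-second timeout are assumptions made for this example and are not taken from the PR; only the 5MB input limit comes from the description. The flags themselves (`-sS`, `-L`, `--compressed`, `--max-time`, `--max-filesize`) are standard curl options.

```typescript
// Hypothetical helper: names and the default timeout are illustrative only.
interface FetchOptions {
  timeoutSeconds: number; // assumed default below, not stated in the PR
  maxBytes: number;       // cap on the downloaded HTML size
}

// 5MB HTML input limit, as stated in the PR description.
const MAX_HTML_BYTES = 5 * 1024 * 1024;

function buildCurlArgs(
  url: string,
  opts: FetchOptions = { timeoutSeconds: 15, maxBytes: MAX_HTML_BYTES },
): string[] {
  return [
    "-sS",                                   // quiet progress output, keep error messages
    "-L",                                    // follow redirects
    "--compressed",                          // request and decode compressed responses
    "--max-time", String(opts.timeoutSeconds),
    "--max-filesize", String(opts.maxBytes), // refuse bodies larger than the cap
    url,                                     // passed as its own argv entry, never interpolated
  ];
}

console.log(buildCurlArgs("https://example.com").join(" "));
```

Passing the URL as a separate argument rather than splicing it into a command string avoids shell-quoting problems if the runtime executes the arguments directly. Whether the actual tool does this is not stated in the PR.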

Features

  • Network isolation: requests originate from workspace, not Mux host
  • Robust HTTP handling: curl handles redirects, SSL, encoding, compression natively
  • Size limits: output truncated to 64KB, HTML input limited to 5MB
  • Graceful errors: DNS failures, timeouts, empty responses handled cleanly
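
The 64KB output limit above could be enforced along these lines. This is a minimal sketch that assumes the limit is measured in UTF-8 bytes and that truncation should not split a multi-byte character; the PR does not say how the limit is measured, and `truncateUtf8` is a hypothetical name.

```typescript
// 64KB output cap from the PR description; measuring in bytes is an assumption.
const MAX_OUTPUT_BYTES = 64 * 1024;

interface TruncationResult {
  text: string;
  truncated: boolean;
}

function truncateUtf8(text: string, maxBytes: number = MAX_OUTPUT_BYTES): TruncationResult {
  const bytes = new TextEncoder().encode(text);
  if (bytes.length <= maxBytes) {
    return { text, truncated: false };
  }
  // bytes[end] is the first byte that would be dropped. If it is a UTF-8
  // continuation byte (10xxxxxx), cutting here would split a character,
  // so back up until the cut falls on a character boundary.
  let end = maxBytes;
  while (end > 0 && (bytes[end] & 0xc0) === 0x80) {
    end--;
  }
  return { text: new TextDecoder().decode(bytes.subarray(0, end)), truncated: true };
}

const result = truncateUtf8("a".repeat(70_000));
console.log(result.text.length, result.truncated);
```

Returning a `truncated` flag lets the caller tell the reader that the output was cut, instead of silently dropping the tail.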

Dependencies added

  • @mozilla/readability: article extraction
  • jsdom: DOM parsing for Readability
  • turndown: HTML to markdown conversion

Testing

9 integration tests that run against the real runtime (no mocks):

  • Real network calls to example.com
  • Local file:// URLs for HTML parsing tests
  • Error scenarios (DNS failure, connection refused, empty files)
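
The error scenarios listed above correspond to documented curl exit codes, so one plausible way to turn them into clean failures is a small classifier like the one below. The function name, the `FetchFailure` labels, and the treatment of a whitespace-only body as empty are assumptions for illustration; the exit codes themselves (6, 7, 28, 63) are curl's documented values.

```typescript
// Hypothetical failure labels; the PR does not specify its error types.
type FetchFailure =
  | "dns_failure"
  | "connect_failed"
  | "timeout"
  | "too_large"
  | "empty_response"
  | "unknown";

// Returns null when the fetch looks successful, otherwise a failure label.
function classifyCurlResult(exitCode: number, body: string): FetchFailure | null {
  switch (exitCode) {
    case 0:
      // curl succeeded, but an empty body (e.g. an empty file) is still unusable.
      return body.trim().length === 0 ? "empty_response" : null;
    case 6:
      return "dns_failure";    // could not resolve host
    case 7:
      return "connect_failed"; // failed to connect, e.g. connection refused
    case 28:
      return "timeout";        // operation timed out
    case 63:
      return "too_large";      // maximum file size exceeded
    default:
      return "unknown";
  }
}

console.log(classifyCurlResult(6, ""));
```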

Also updates AGENTS.md testing section to clarify preferred test types:

  1. True integration tests (no mocks)
  2. Unit tests on pure/isolated logic

Generated with mux

@ammar-agent ammar-agent force-pushed the web-fetch-tool-plan branch 2 times, most recently from cdc7475 to 0279398 Compare November 25, 2025 01:18
@ammar-agent ammar-agent force-pushed the web-fetch-tool-plan branch 8 times, most recently from ff158ee to 509aed6 Compare November 25, 2025 02:02
@ammario ammario linked an issue Nov 25, 2025 that may be closed by this pull request
@ammario ammario merged commit 7fe9486 into main Nov 25, 2025
15 checks passed
@ammario ammario deleted the web-fetch-tool-plan branch November 25, 2025 02:54
Development

Successfully merging this pull request may close these issues.

Add web_fetch tool

2 participants