Conversation

@alex-w-99
Contributor

@alex-w-99 alex-w-99 commented Jan 18, 2026

alex-w-99 and others added 30 commits January 15, 2026 19:22
- Add StartRoutineDiscoveryJobCreationParams Pydantic model for tool schema
- Add data_models/guide_agent/ with conversation state and message types
- Add data_models/websockets/ with base WS types and guide-specific commands/responses
- Update GuideAgent with callback pattern, tool confirmation flow, state management
- Business logic stubs marked with NotImplementedError for a subsequent PR

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Move all WebSocket types (base, browser, guide) into one consolidated
websockets.py file. Also move test_websockets.py from servers repo.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace the Pydantic model and constants with a simple function stub
that a colleague will implement. The guide agent now uses
register_tool_from_function and calls the function directly.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add tool_utils.py with extract_description_from_docstring and
  generate_parameters_schema for converting Python functions to
  LLM tool definitions using pydantic TypeAdapter
- Add register_tool_from_function method to LLMClient that extracts
  name, description, and parameters schema from a function
- Add unit tests for tool_utils

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
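
A rough sketch of the function-to-tool conversion described in the commit above. The commit uses pydantic's TypeAdapter to generate the schema; to stay dependency-free, this sketch swaps in a small hand-written type map built on the stdlib inspect module. The names extract_description, build_parameters_schema, tool_definition, and start_discovery are hypothetical and are not the signatures in tool_utils.py.

```python
import inspect
from typing import Any, Callable, get_type_hints

# Hand-written mapping used only in this sketch; the commit itself relies on
# pydantic's TypeAdapter to produce the JSON schema.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def extract_description(func: Callable[..., Any]) -> str:
    """Return the first paragraph of the docstring, or an empty string."""
    doc = inspect.getdoc(func) or ""
    return doc.split("\n\n", 1)[0].replace("\n", " ").strip()


def build_parameters_schema(func: Callable[..., Any]) -> dict:
    """Build a JSON-schema-like dict from the function's annotated parameters."""
    hints = get_type_hints(func)
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {"type": "object", "properties": properties, "required": required}


def tool_definition(func: Callable[..., Any]) -> dict:
    """Combine name, description, and parameters into one tool definition."""
    return {
        "name": func.__name__,
        "description": extract_description(func),
        "parameters": build_parameters_schema(func),
    }


def start_discovery(task: str, max_steps: int = 10) -> str:
    """Start a routine discovery job.

    Longer details that should not appear in the tool description.
    """
    return f"{task}:{max_steps}"
```

Calling tool_definition(start_discovery) yields a dict whose "required" list contains only "task", since max_steps has a default.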
- Merge GuideWebSocketClientCommandType into WebSocketClientCommandType
- Merge ParsedGuideWebSocketClientCommand into ParsedWebSocketClientCommand
- Remove Guide- prefix from response types (WebSocketMessageResponse, etc.)
- Consolidate response type enums (MESSAGE, STATE, TOOL_INVOCATION_RESULT)
- Add tests for all previously untested models and commands
- Increase test coverage from ~50% to 100% of websockets module

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…LLM API

- Add Chat and ChatThread models extending ResourceBase with bidirectional linking
- Rename duplicate ChatMessage to EmittedChatMessage for callback messages
- Add LLMToolCall and LLMChatResponse models for tool calling support
- Implement GuideAgent with conversation logic, persistence callbacks, and
  self-aware system prompt for web automation routine creation
- Update all LLM client methods to accept messages array instead of single prompt
  (get_text_sync/async, get_structured_response_sync/async, chat_sync)
- Add run_guide_agent.py terminal chat script

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replaces stub script with full terminal interface featuring ANSI colors,
ASCII banner, tool invocation confirmation flow, and conversation commands.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Update welcome message to describe CDP capture analysis workflow
- Add links to Vectorly docs and console
- Change banner color to purple
- Fix OpenAI client to use max_completion_tokens for GPT-5 models

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
GPT-5 models only support temperature=1 (default), so we omit the
parameter entirely to avoid API errors.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
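
The two GPT-5 fixes above (max_completion_tokens and the omitted temperature) could be combined in a single request-building helper. This is only a sketch: build_completion_kwargs and is_gpt5_model are hypothetical names, and detecting GPT-5 models by a "gpt-5" name prefix is an assumption, not necessarily how the client decides.

```python
from typing import Any, Optional


def is_gpt5_model(model: str) -> bool:
    # Assumption: GPT-5 family models are identified by a "gpt-5" name prefix.
    return model.startswith("gpt-5")


def build_completion_kwargs(
    model: str,
    messages: list,
    max_tokens: int,
    temperature: Optional[float] = None,
) -> dict:
    """Build keyword arguments for a chat completion request."""
    kwargs: dict = {"model": model, "messages": messages}
    if is_gpt5_model(model):
        # GPT-5 models take max_completion_tokens, and temperature is left
        # out entirely so the API default (1) applies.
        kwargs["max_completion_tokens"] = max_tokens
    else:
        kwargs["max_tokens"] = max_tokens
        if temperature is not None:
            kwargs["temperature"] = temperature
    return kwargs
```

Dropping the key, rather than sending temperature=1 explicitly, keeps the request valid even if the caller passes a different value.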
- Add chat_stream_sync method to abstract, OpenAI, and Anthropic clients
- Add stream_chunk_callable parameter to GuideAgent
- Update terminal CLI to print chunks as they arrive for typewriter effect

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
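
A minimal sketch of the chunk-callback pattern behind the typewriter effect. The parameter name stream_chunk_callable comes from the commit above; fake_chat_stream_sync and consume_stream are hypothetical stand-ins for the real client method and agent loop.

```python
from typing import Callable, Iterable, Iterator, Optional


def fake_chat_stream_sync(text: str, chunk_size: int = 4) -> Iterator[str]:
    """Stand-in for a streaming LLM call: yield the reply in small pieces."""
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]


def consume_stream(
    chunks: Iterable[str],
    stream_chunk_callable: Optional[Callable[[str], None]] = None,
) -> str:
    """Forward each chunk to the callback and return the accumulated text."""
    parts = []
    for chunk in chunks:
        if stream_chunk_callable is not None:
            stream_chunk_callable(chunk)
        parts.append(chunk)
    return "".join(parts)


def print_chunk(chunk: str) -> None:
    # Terminal usage: print without a newline and flush so text appears
    # as it arrives.
    print(chunk, end="", flush=True)
```

A terminal client would call consume_stream(fake_chat_stream_sync("Hello there"), print_chunk); a caller that does not need streaming can omit the callback and use only the returned string.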
- Add STREAM_CHUNK and STREAM_END to WebSocketStreamResponseType
- Add WebSocketStreamChunkResponse for text deltas during streaming
- Add WebSocketStreamEndResponse with full accumulated content
- Update WebSocketServerResponse union to include new types

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
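
One way the chunk and end responses might fit together, sketched with stdlib dataclasses rather than the models actually defined in websockets.py. The class names are shortened and the field names delta and content are assumptions; only STREAM_CHUNK and STREAM_END are taken from the commit.

```python
from dataclasses import dataclass
from enum import Enum


class StreamResponseType(str, Enum):
    STREAM_CHUNK = "stream_chunk"
    STREAM_END = "stream_end"


@dataclass
class StreamChunkResponse:
    delta: str  # text delta received during streaming
    type: StreamResponseType = StreamResponseType.STREAM_CHUNK


@dataclass
class StreamEndResponse:
    content: str  # full accumulated content
    type: StreamResponseType = StreamResponseType.STREAM_END


def stream_responses(deltas: list) -> list:
    """Emit one chunk response per delta, then a single end response."""
    responses = [StreamChunkResponse(delta=d) for d in deltas]
    responses.append(StreamEndResponse(content="".join(deltas)))
    return responses
```

Sending the full content in the end response lets a client that missed some chunks still recover the complete message.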
Update tests to use thread_id instead of guide_chat_id to match
the WebSocketStateResponse model change.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
alex-w-99 and others added 23 commits January 16, 2026 17:22
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Include message IDs in emitted chat responses so WebSocket clients
can track and reference individual messages.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Update callback signatures from Callable[[T], None] to Callable[[T], T]
so the persistence layer can assign IDs and return them to GuideAgent.
This allows servers to use ResourceBase-generated IDs while keeping
web_hacker's models decoupled from ResourceBase.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
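
The Callable[[T], T] change above can be illustrated as follows. This is a sketch under assumptions: Message, InMemoryStore, and Agent are hypothetical stand-ins, and the uuid-based ID stands in for whatever ResourceBase generates on the server side.

```python
import uuid
from dataclasses import dataclass, replace
from typing import Callable, Optional


@dataclass(frozen=True)
class Message:
    content: str
    id: Optional[str] = None  # unset until the persistence layer assigns one


# The callback returns the (possibly updated) object instead of None.
PersistMessage = Callable[[Message], Message]


class InMemoryStore:
    """Stand-in persistence layer that assigns IDs on save."""

    def __init__(self) -> None:
        self.saved = {}

    def persist(self, message: Message) -> Message:
        stored = replace(message, id=str(uuid.uuid4()))
        self.saved[stored.id] = stored
        return stored


class Agent:
    """Stand-in agent that knows nothing about how IDs are generated."""

    def __init__(self, persist_message: PersistMessage) -> None:
        self._persist = persist_message

    def add_message(self, content: str) -> Message:
        # Use the returned object so the agent sees the assigned ID.
        return self._persist(Message(content=content))
```

Because the agent only depends on the callback type, the server can swap in its own ID scheme without the agent's models importing anything from it.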
…ness-logic

Guide agent context docs business logic
Initial copy of async CDP session, event broadcaster, and monitors from
the servers repo. These files will be refactored in subsequent commits
to remove AWS-specific dependencies (Firehose, S3, etc.) and use
callback patterns instead.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Rename cdp/async to cdp/async_cdp to avoid Python keyword conflict
- Create local data_models.py with CDP event models:
  - BaseCDPEvent, NetworkTransactionEvent, StorageEvent
  - WindowPropertyChange, WindowPropertyEvent
- Rewrite EventBroadcaster to use pure callbacks instead of AWS Firehose/S3
- Update all imports to use web_hacker paths
- Remove all AWS-specific code and references from docstrings

The async CDP monitors now use a callback pattern where callers can inject
their own event handlers for storage and streaming purposes.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
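
A minimal sketch of a callback-based broadcaster in the spirit of the rewrite above. CallbackBroadcaster, subscribe, and emit are hypothetical names, and the (category, event) callback signature is an assumption rather than the actual EventBroadcaster interface.

```python
import asyncio
from typing import Awaitable, Callable

# Assumed callback shape: an async function taking a category and an event dict.
EventCallback = Callable[[str, dict], Awaitable[None]]


class CallbackBroadcaster:
    """Fan events out to injected callbacks instead of a fixed backend."""

    def __init__(self) -> None:
        self._callbacks = []

    def subscribe(self, callback: EventCallback) -> None:
        self._callbacks.append(callback)

    async def emit(self, category: str, event: dict) -> None:
        # Run all callbacks concurrently; each caller decides where events go.
        await asyncio.gather(*(cb(category, event) for cb in self._callbacks))


async def _demo() -> list:
    received = []

    async def store(category: str, event: dict) -> None:
        received.append((category, event))

    broadcaster = CallbackBroadcaster()
    broadcaster.subscribe(store)
    await broadcaster.emit("network", {"url": "https://example.com"})
    return received


demo_result = asyncio.run(_demo())
```

A server could subscribe one callback that writes to storage and another that streams to clients, while a local run subscribes only a file writer.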
@alex-w-99 alex-w-99 mentioned this pull request Jan 19, 2026
alex-w-99 and others added 4 commits January 19, 2026 01:32
- Add AsyncInteractionMonitor: tracks mouse/keyboard events via JS injection
  - Monitors click, mousedown, mouseup, dblclick, contextmenu, mouseover
  - Monitors keydown, keyup, keypress, input, change, focus, blur
  - Uses Runtime.addBinding for CDP communication
  - Emits UiInteractionEvent-compatible events via callback
  - Includes consolidate_interactions() for JSONL aggregation

- Integrate into AsyncCDPSession:
  - Setup interaction monitoring in setup_cdp()
  - Handle interaction messages and command replies
  - Include interactions in monitoring summary

- Update FileEventWriter:
  - Add AsyncInteractionMonitor category handling
  - Create interaction/ directory in output structure

- Update browser_monitor.py and SDK monitor.py:
  - Add interaction directory and events path
  - Include interaction stats in session summary

- Clean up: Remove sync CDP monitors (now using async versions)
  - Delete cdp_session.py, network_monitor.py, storage_monitor.py
  - Delete window_property_monitor.py, event_broadcaster.py

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
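
A sketch of how JS injection and Runtime.addBinding can work together, shown only as the CDP messages involved. Runtime.addBinding, Runtime.bindingCalled, and Page.addScriptToEvaluateOnNewDocument are real CDP methods and events, but the binding name, the payload fields, and the use of Page.addScriptToEvaluateOnNewDocument for injection are assumptions and may differ from what AsyncInteractionMonitor does.

```python
import json
from typing import Optional

BINDING_NAME = "__interactionEvent"  # hypothetical binding name
EVENT_TYPES = ["click", "keydown", "input", "change"]  # subset for brevity

# Injected script: forward selected DOM events to the binding as JSON.
INTERACTION_JS = """
(() => {
  const types = %s;
  for (const type of types) {
    document.addEventListener(type, (e) => {
      window.%s(JSON.stringify({type: e.type, tag: e.target && e.target.tagName}));
    }, true);
  }
})();
""" % (json.dumps(EVENT_TYPES), BINDING_NAME)


def build_setup_commands(first_id: int = 1) -> list:
    """Return the CDP commands that register the binding and inject the script."""
    return [
        {"id": first_id, "method": "Runtime.addBinding",
         "params": {"name": BINDING_NAME}},
        {"id": first_id + 1, "method": "Page.addScriptToEvaluateOnNewDocument",
         "params": {"source": INTERACTION_JS}},
    ]


def parse_binding_called(message: dict) -> Optional[dict]:
    """Decode a Runtime.bindingCalled event for our binding, else return None."""
    if message.get("method") != "Runtime.bindingCalled":
        return None
    params = message.get("params", {})
    if params.get("name") != BINDING_NAME:
        return None
    return json.loads(params["payload"])
```

The session's message loop would pass each incoming CDP message to parse_binding_called and hand any non-None result to the interaction callback.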
Flattens the directory structure by moving all files from
web_hacker/cdp/async_cdp/ to web_hacker/cdp/. Updates all
import paths accordingly.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@rayruizhiliao rayruizhiliao changed the base branch from beta to main January 22, 2026 01:15
@alex-w-99 alex-w-99 linked an issue Jan 22, 2026 that may be closed by this pull request

Development

Successfully merging this pull request may close these issues.

Add async analogues of sync CDP classes

3 participants