VLAgent #250

erikchwang · 2025-05-26T03:13:03Z

Description by Korbit AI

What change is being made?

Introduce the VLAgent functionality, including model configurations, a UI agent implementation, utility functions for image-handling, and an extensible prompt system for browser-based interactions.

Why are these changes being made?

The VLAgent addition facilitates high-level web-based task automation by allowing the configuration of both primary and auxiliary models, leveraging visual inputs and historical context to choose appropriate browser actions. This modular approach, with abstract base classes and concrete implementations, supports flexibility and future extensions, though further validation on extensive real-world datasets would strengthen its general applicability.

Is this description stale? Ask me to generate a new description by commenting /korbit-generate-pr-description

korbit-ai

Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.

Category	Issue	Status
	Missing return type hint ▹ view
	Mixed Responsibilities in VLAgentArgs ▹ view
	Unclear Action Format Specification ▹ view
	Unspecified Observation Preprocessing Requirements ▹ view
	Unsafe image data processing without validation ▹ view
	Inflexible Image Format Handling ▹ view
	Unsafe Numpy Array Conversion ▹ view
	Silent Exception Masking ▹ view
	Missing API Key Validation ▹ view
	Unbounded Exponential Backoff ▹ view

Files scanned

File Path	Reviewed
src/agentlab/agents/vl_agent/vl_model/base.py	✅
src/agentlab/agents/vl_agent/vl_prompt/base.py	✅
src/agentlab/agents/vl_agent/vl_agent/base.py	✅
src/agentlab/agents/vl_agent/main.py	✅
src/agentlab/agents/vl_agent/utils.py	✅
src/agentlab/agents/vl_agent/config.py	✅
src/agentlab/agents/vl_agent/vl_model/openrouter_api_model.py	✅
src/agentlab/agents/vl_agent/vl_model/llama_model.py	✅
src/agentlab/agents/vl_agent/vl_agent/ui_agent.py	✅
src/agentlab/agents/vl_agent/vl_prompt/ui_prompt.py	✅

Explore our documentation to understand the languages and file types we support and the files we ignore.

Check out our docs on how you can make Korbit work best for you and your team.

Loving Korbit!? Share us on LinkedIn Reddit and X

korbit-ai · 2025-05-26T03:16:58Z

src/agentlab/agents/vl_agent/utils.py

+import numpy as np
+
+
+def image_to_image_url(image: Union[Image.Image, np.ndarray]):


Missing return type hint

Tell me more

What is the issue?

Return type hint is missing for the function.

Why this matters

Missing return type hints make it harder for developers to understand the expected output type without reading the implementation details.

Suggested change ∙ Feature Preview

def image_to_image_url(image: Union[Image.Image, np.ndarray]) -> str:

Provide feedback to improve future suggestions

_{💬 Looking for more details? Reply to this comment to chat with Korbit.}

korbit-ai · 2025-05-26T03:16:58Z

src/agentlab/agents/vl_agent/vl_agent/base.py

+@dataclass
+class VLAgentArgs(ABC):
+    @property
+    @abstractmethod
+    def agent_name(self) -> str:
+        raise NotImplementedError
+
+    @abstractmethod
+    def make_agent(self) -> VLAgent:
+        raise NotImplementedError


Mixed Responsibilities in VLAgentArgs

Tell me more

What is the issue?

VLAgentArgs is mixing the Factory pattern (make_agent) with configuration (agent_name) and lifecycle management (prepare, close) responsibilities, violating the Single Responsibility Principle.

Why this matters

This design makes the class harder to maintain and test, while reducing reusability. Lifecycle management, configuration, and object creation should be handled by separate components.

Suggested change ∙ Feature Preview

Split the class into separate components:

@dataclass class VLAgentConfig: agent_name: str # other configuration parameters class VLAgentFactory(ABC): @abstractmethod def make_agent(self, config: VLAgentConfig) -> VLAgent: pass class VLAgentLifecycle(ABC): @abstractmethod def prepare(self): pass @abstractmethod def close(self): pass

Provide feedback to improve future suggestions

_{💬 Looking for more details? Reply to this comment to chat with Korbit.}

korbit-ai · 2025-05-26T03:16:58Z

src/agentlab/agents/vl_agent/vl_agent/base.py

+
+class VLAgent(ABC):
+    @abstractmethod
+    def get_action(self, obs: dict) -> tuple[str, dict]:


Unclear Action Format Specification

Tell me more

What is the issue?

The return type tuple[str, dict] lacks documentation about what the string and dictionary represent in the action tuple, which could lead to incorrect implementations by derived classes.

Why this matters

Without clear documentation of the expected format and content of the action tuple, implementations may return incompatible or incorrect action formats that could cause runtime errors or unexpected behavior.

Suggested change ∙ Feature Preview

Add a docstring explaining the expected format of the action tuple:

@abstractmethod def get_action(self, obs: dict) -> tuple[str, dict]: """Get the next action based on the observation. Args: obs (dict): The observation from the environment Returns: tuple[str, dict]: A tuple containing: - action_type (str): The type of action to perform (e.g., 'click', 'type') - action_params (dict): Parameters specific to the action type """ raise NotImplementedError

Provide feedback to improve future suggestions

_{💬 Looking for more details? Reply to this comment to chat with Korbit.}

korbit-ai · 2025-05-26T03:16:58Z

src/agentlab/agents/vl_agent/vl_agent/base.py

+        raise NotImplementedError
+
+    @abstractmethod
+    def obs_preprocessor(self, obs: dict) -> dict:


Unspecified Observation Preprocessing Requirements

Tell me more

What is the issue?

The obs_preprocessor method lacks specification about what preprocessing should be performed and what the output format should be.

Why this matters

Implementations might process observations incorrectly or inconsistently, leading to unexpected agent behavior when processing environment observations.

Suggested change ∙ Feature Preview

Add a docstring specifying the preprocessing requirements:

@abstractmethod def obs_preprocessor(self, obs: dict) -> dict: """Preprocess the raw observation from the environment. Args: obs (dict): Raw observation from the environment Returns: dict: Processed observation with standardized format: - Required fields should be documented here - Any transformations that must be applied """ raise NotImplementedError

Provide feedback to improve future suggestions

_{💬 Looking for more details? Reply to this comment to chat with Korbit.}

korbit-ai · 2025-05-26T03:16:58Z

src/agentlab/agents/vl_agent/utils.py

+def image_url_to_image(image_url: str) -> Image.Image:
+    image_base64 = image_url.replace("data:image/jpeg;base64,", "")
+    image_data = base64.b64decode(image_base64.encode())
+    buffer = io.BytesIO(image_data)
+    image = Image.open(buffer)


Unsafe image data processing without validation

Tell me more

What is the issue?

The function accepts and processes base64 image data without proper input validation, which could lead to processing of malicious image data.

Why this matters

Malformed or malicious image data could lead to memory exhaustion, buffer overflows, or arbitrary code execution through image parsing vulnerabilities.

Suggested change ∙ Feature Preview

def image_url_to_image(image_url: str) -> Image.Image: if not image_url.startswith("data:image/jpeg;base64,"): raise ValueError("Invalid image URL format") try: image_base64 = image_url.replace("data:image/jpeg;base64,", "") image_data = base64.b64decode(image_base64.encode()) buffer = io.BytesIO(image_data) image = Image.open(buffer) if image.format != "JPEG": raise ValueError("Invalid image format") return image except (ValueError, IOError) as e: raise ValueError("Invalid image data") from e

Provide feedback to improve future suggestions

_{💬 Looking for more details? Reply to this comment to chat with Korbit.}

korbit-ai · 2025-05-26T03:16:58Z

src/agentlab/agents/vl_agent/utils.py

+def image_url_to_image(image_url: str) -> Image.Image:
+    image_base64 = image_url.replace("data:image/jpeg;base64,", "")


Inflexible Image Format Handling

Tell me more

What is the issue?

The function assumes all image URLs will be JPEG format, but the input URL could contain other image formats (PNG, GIF, etc.).

Why this matters

This could cause conversion failures when processing non-JPEG images, leading to runtime errors in the virtual learning agent.

Suggested change ∙ Feature Preview

Modify the function to handle different image formats by extracting the format from the URL:

def image_url_to_image(image_url: str) -> Image.Image: # Extract format from data URL format_match = image_url.split(';')[0] if not format_match.startswith('data:image/'): raise ValueError('Invalid image data URL') # Remove the data URL prefix image_base64 = image_url.split(',')[1] image_data = base64.b64decode(image_base64.encode()) buffer = io.BytesIO(image_data) image = Image.open(buffer) return image

Provide feedback to improve future suggestions

_{💬 Looking for more details? Reply to this comment to chat with Korbit.}

korbit-ai · 2025-05-26T03:16:59Z

src/agentlab/agents/vl_agent/utils.py

+def image_to_image_url(image: Union[Image.Image, np.ndarray]):
+    if isinstance(image, np.ndarray):
+        image = Image.fromarray(image)


Unsafe Numpy Array Conversion

Tell me more

What is the issue?

The function doesn't handle invalid numpy array inputs that can't be converted to PIL Images.

Why this matters

Invalid array shapes, data types, or value ranges could cause crashes when converting numpy arrays to PIL Images.

Suggested change ∙ Feature Preview

Add error handling for numpy array conversion:

def image_to_image_url(image: Union[Image.Image, np.ndarray]): if isinstance(image, np.ndarray): try: image = Image.fromarray(image) except TypeError as e: raise ValueError(f"Invalid numpy array format: {e}")

Provide feedback to improve future suggestions

_{💬 Looking for more details? Reply to this comment to chat with Korbit.}

korbit-ai · 2025-05-26T03:16:59Z

src/agentlab/agents/vl_agent/vl_model/openrouter_api_model.py

+            try:
+                response = completion.choices[0].message.content
+            except:
+                response = ""


Silent Exception Masking

Tell me more

What is the issue?

Bare except clause that silently catches all exceptions and returns an empty string

Why this matters

This could mask critical security-related exceptions like authentication failures or API key issues, making it harder to detect and respond to security problems

Suggested change ∙ Feature Preview

try: response = completion.choices[0].message.content except Exception as e: # Log the error appropriately raise # Re-raise the exception to handle it at a higher level

Provide feedback to improve future suggestions

_{💬 Looking for more details? Reply to this comment to chat with Korbit.}

korbit-ai · 2025-05-26T03:16:59Z

src/agentlab/agents/vl_agent/vl_model/openrouter_api_model.py

+        max_tokens: int,
+        reproducibility_config: dict,
+    ):
+        self.client = OpenAI(base_url=base_url, api_key=os.getenv("OPENROUTER_API_KEY"))


Missing API Key Validation

Tell me more

What is the issue?

No validation of required API key environment variable

Why this matters

If OPENROUTER_API_KEY is not set, it will be None, potentially causing unclear errors or unexpected behavior when making API calls

Suggested change ∙ Feature Preview

api_key = os.getenv("OPENROUTER_API_KEY") if not api_key: raise ValueError("OPENROUTER_API_KEY environment variable is required") self.client = OpenAI(base_url=base_url, api_key=api_key)

Provide feedback to improve future suggestions

_{💬 Looking for more details? Reply to this comment to chat with Korbit.}

korbit-ai · 2025-05-26T03:16:59Z

src/agentlab/agents/vl_agent/vl_model/openrouter_api_model.py

+        @backoff.on_exception(backoff.expo, RateLimitError)
+        def get_response(messages, max_tokens, **kwargs):


Unbounded Exponential Backoff

Tell me more

What is the issue?

The backoff decorator's exponential retry strategy lacks maximum retry attempts and maximum delay parameters.

Why this matters

Without limits on retries and delays, the system could potentially keep retrying indefinitely or with extremely long delays, impacting system performance and resource utilization.

Suggested change ∙ Feature Preview

Add max_tries and max_time parameters to the backoff decorator:

@backoff.on_exception( backoff.expo, RateLimitError, max_tries=5, max_time=30 )

Provide feedback to improve future suggestions

_{💬 Looking for more details? Reply to this comment to chat with Korbit.}

erikchwang added 25 commits May 5, 2025 22:33

update

4f7c035

update

e619dfd

update

c7aa049

update

f6318b4

update

414cf68

update

e6e9724

update

6c27267

update

1cfe416

update

3b6561a

update

a1361ad

update

c536f4f

update

cbbad61

update

28dd340

update

7767826

update

7fb9d39

update

8a727fd

update

acfd6a8

update

e2624c1

update

41cbc3b

update

77ed33b

update

fd55958

update

03d777c

update

7d19c07

update

294e526

update

a7b9999

korbit-ai bot reviewed May 26, 2025

View reviewed changes

erikchwang added 4 commits May 28, 2025 05:15

update

30bc57b

update

ace3e41

update

12016cb

update

6ebcdd8

erikchwang closed this Jun 3, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

VLAgent #250

VLAgent #250

Uh oh!

erikchwang commented May 26, 2025 •

edited by korbit-ai bot

Loading

Uh oh!

korbit-ai bot left a comment •

edited

Loading

Uh oh!

korbit-ai bot May 26, 2025

Uh oh!

korbit-ai bot May 26, 2025

Uh oh!

korbit-ai bot May 26, 2025

Uh oh!

korbit-ai bot May 26, 2025

Uh oh!

korbit-ai bot May 26, 2025

Uh oh!

korbit-ai bot May 26, 2025

Uh oh!

korbit-ai bot May 26, 2025

Uh oh!

korbit-ai bot May 26, 2025

Uh oh!

korbit-ai bot May 26, 2025

Uh oh!

korbit-ai bot May 26, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		import numpy as np


		def image_to_image_url(image: Union[Image.Image, np.ndarray]):

		def image_url_to_image(image_url: str) -> Image.Image:
		image_base64 = image_url.replace("data:image/jpeg;base64,", "")

		@backoff.on_exception(backoff.expo, RateLimitError)
		def get_response(messages, max_tokens, **kwargs):

VLAgent #250

VLAgent #250

Uh oh!

Conversation

erikchwang commented May 26, 2025 • edited by korbit-ai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description by Korbit AI

What change is being made?

Why are these changes being made?

Uh oh!

korbit-ai bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.

Uh oh!

korbit-ai bot May 26, 2025

Choose a reason for hiding this comment

Missing return type hint

What is the issue?

Why this matters

Suggested change ∙ Feature Preview

Provide feedback to improve future suggestions

Uh oh!

korbit-ai bot May 26, 2025

Choose a reason for hiding this comment

Mixed Responsibilities in VLAgentArgs

What is the issue?

Why this matters

Suggested change ∙ Feature Preview

Provide feedback to improve future suggestions

Uh oh!

korbit-ai bot May 26, 2025

Choose a reason for hiding this comment

Unclear Action Format Specification

What is the issue?

Why this matters

Suggested change ∙ Feature Preview

Provide feedback to improve future suggestions

Uh oh!

korbit-ai bot May 26, 2025

Choose a reason for hiding this comment

Unspecified Observation Preprocessing Requirements

What is the issue?

Why this matters

Suggested change ∙ Feature Preview

Provide feedback to improve future suggestions

Uh oh!

korbit-ai bot May 26, 2025

Choose a reason for hiding this comment

Unsafe image data processing without validation

What is the issue?

Why this matters

Suggested change ∙ Feature Preview

Provide feedback to improve future suggestions

Uh oh!

korbit-ai bot May 26, 2025

Choose a reason for hiding this comment

Inflexible Image Format Handling

What is the issue?

Why this matters

Suggested change ∙ Feature Preview

Provide feedback to improve future suggestions

Uh oh!

korbit-ai bot May 26, 2025

Choose a reason for hiding this comment

Unsafe Numpy Array Conversion

What is the issue?

Why this matters

Suggested change ∙ Feature Preview

Provide feedback to improve future suggestions

Uh oh!

korbit-ai bot May 26, 2025

Choose a reason for hiding this comment

Silent Exception Masking

What is the issue?

Why this matters

Suggested change ∙ Feature Preview

Provide feedback to improve future suggestions

Uh oh!

erikchwang commented May 26, 2025 •

edited by korbit-ai bot

Loading

korbit-ai bot left a comment •

edited

Loading