Skip to content

Conversation

@erikchwang
Copy link
Collaborator

@erikchwang erikchwang commented May 26, 2025

Description by Korbit AI

What change is being made?

Introduce the VLAgent functionality, including model configurations, a UI agent implementation, utility functions for image-handling, and an extensible prompt system for browser-based interactions.

Why are these changes being made?

The VLAgent addition facilitates high-level web-based task automation by allowing the configuration of both primary and auxiliary models, leveraging visual inputs and historical context to choose appropriate browser actions. This modular approach, with abstract base classes and concrete implementations, supports flexibility and future extensions, though further validation on extensive real-world datasets would strengthen its general applicability.

Is this description stale? Ask me to generate a new description by commenting /korbit-generate-pr-description

Copy link

@korbit-ai korbit-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.
Category Issue Status
Readability Missing return type hint ▹ view
Design Mixed Responsibilities in VLAgentArgs ▹ view
Documentation Unclear Action Format Specification ▹ view
Documentation Unspecified Observation Preprocessing Requirements ▹ view
Security Unsafe image data processing without validation ▹ view
Functionality Inflexible Image Format Handling ▹ view
Functionality Unsafe Numpy Array Conversion ▹ view
Security Silent Exception Masking ▹ view
Security Missing API Key Validation ▹ view
Performance Unbounded Exponential Backoff ▹ view
Files scanned
File Path Reviewed
src/agentlab/agents/vl_agent/vl_model/base.py
src/agentlab/agents/vl_agent/vl_prompt/base.py
src/agentlab/agents/vl_agent/vl_agent/base.py
src/agentlab/agents/vl_agent/main.py
src/agentlab/agents/vl_agent/utils.py
src/agentlab/agents/vl_agent/config.py
src/agentlab/agents/vl_agent/vl_model/openrouter_api_model.py
src/agentlab/agents/vl_agent/vl_model/llama_model.py
src/agentlab/agents/vl_agent/vl_agent/ui_agent.py
src/agentlab/agents/vl_agent/vl_prompt/ui_prompt.py

Explore our documentation to understand the languages and file types we support and the files we ignore.

Check out our docs on how you can make Korbit work best for you and your team.

Loving Korbit!? Share us on LinkedIn Reddit and X

import numpy as np


def image_to_image_url(image: Union[Image.Image, np.ndarray]):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing return type hint category Readability

Tell me more
What is the issue?

Return type hint is missing for the function.

Why this matters

Missing return type hints make it harder for developers to understand the expected output type without reading the implementation details.

Suggested change ∙ Feature Preview
def image_to_image_url(image: Union[Image.Image, np.ndarray]) -> str:
Provide feedback to improve future suggestions

Nice Catch Incorrect Not in Scope Not in coding standard Other

💬 Looking for more details? Reply to this comment to chat with Korbit.

Comment on lines +16 to +25
@dataclass
class VLAgentArgs(ABC):
@property
@abstractmethod
def agent_name(self) -> str:
raise NotImplementedError

@abstractmethod
def make_agent(self) -> VLAgent:
raise NotImplementedError
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mixed Responsibilities in VLAgentArgs category Design

Tell me more
What is the issue?

VLAgentArgs is mixing the Factory pattern (make_agent) with configuration (agent_name) and lifecycle management (prepare, close) responsibilities, violating the Single Responsibility Principle.

Why this matters

This design makes the class harder to maintain and test, while reducing reusability. Lifecycle management, configuration, and object creation should be handled by separate components.

Suggested change ∙ Feature Preview

Split the class into separate components:

@dataclass
class VLAgentConfig:
    agent_name: str
    # other configuration parameters

class VLAgentFactory(ABC):
    @abstractmethod
    def make_agent(self, config: VLAgentConfig) -> VLAgent:
        pass

class VLAgentLifecycle(ABC):
    @abstractmethod
    def prepare(self):
        pass
    @abstractmethod
    def close(self):
        pass
Provide feedback to improve future suggestions

Nice Catch Incorrect Not in Scope Not in coding standard Other

💬 Looking for more details? Reply to this comment to chat with Korbit.


class VLAgent(ABC):
@abstractmethod
def get_action(self, obs: dict) -> tuple[str, dict]:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unclear Action Format Specification category Documentation

Tell me more
What is the issue?

The return type tuple[str, dict] lacks documentation about what the string and dictionary represent in the action tuple, which could lead to incorrect implementations by derived classes.

Why this matters

Without clear documentation of the expected format and content of the action tuple, implementations may return incompatible or incorrect action formats that could cause runtime errors or unexpected behavior.

Suggested change ∙ Feature Preview

Add a docstring explaining the expected format of the action tuple:

@abstractmethod
def get_action(self, obs: dict) -> tuple[str, dict]:
    """Get the next action based on the observation.
    
    Args:
        obs (dict): The observation from the environment
        
    Returns:
        tuple[str, dict]: A tuple containing:
            - action_type (str): The type of action to perform (e.g., 'click', 'type')
            - action_params (dict): Parameters specific to the action type
    """
    raise NotImplementedError
Provide feedback to improve future suggestions

Nice Catch Incorrect Not in Scope Not in coding standard Other

💬 Looking for more details? Reply to this comment to chat with Korbit.

raise NotImplementedError

@abstractmethod
def obs_preprocessor(self, obs: dict) -> dict:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unspecified Observation Preprocessing Requirements category Documentation

Tell me more
What is the issue?

The obs_preprocessor method lacks specification about what preprocessing should be performed and what the output format should be.

Why this matters

Implementations might process observations incorrectly or inconsistently, leading to unexpected agent behavior when processing environment observations.

Suggested change ∙ Feature Preview

Add a docstring specifying the preprocessing requirements:

@abstractmethod
def obs_preprocessor(self, obs: dict) -> dict:
    """Preprocess the raw observation from the environment.
    
    Args:
        obs (dict): Raw observation from the environment
        
    Returns:
        dict: Processed observation with standardized format:
            - Required fields should be documented here
            - Any transformations that must be applied
    """
    raise NotImplementedError
Provide feedback to improve future suggestions

Nice Catch Incorrect Not in Scope Not in coding standard Other

💬 Looking for more details? Reply to this comment to chat with Korbit.

Comment on lines +23 to +27
def image_url_to_image(image_url: str) -> Image.Image:
image_base64 = image_url.replace("data:image/jpeg;base64,", "")
image_data = base64.b64decode(image_base64.encode())
buffer = io.BytesIO(image_data)
image = Image.open(buffer)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsafe image data processing without validation category Security

Tell me more
What is the issue?

The function accepts and processes base64 image data without proper input validation, which could lead to processing of malicious image data.

Why this matters

Malformed or malicious image data could lead to memory exhaustion, buffer overflows, or arbitrary code execution through image parsing vulnerabilities.

Suggested change ∙ Feature Preview
def image_url_to_image(image_url: str) -> Image.Image:
    if not image_url.startswith("data:image/jpeg;base64,"):
        raise ValueError("Invalid image URL format")
    try:
        image_base64 = image_url.replace("data:image/jpeg;base64,", "")
        image_data = base64.b64decode(image_base64.encode())
        buffer = io.BytesIO(image_data)
        image = Image.open(buffer)
        if image.format != "JPEG":
            raise ValueError("Invalid image format")
        return image
    except (ValueError, IOError) as e:
        raise ValueError("Invalid image data") from e
Provide feedback to improve future suggestions

Nice Catch Incorrect Not in Scope Not in coding standard Other

💬 Looking for more details? Reply to this comment to chat with Korbit.

Comment on lines +23 to +24
def image_url_to_image(image_url: str) -> Image.Image:
image_base64 = image_url.replace("data:image/jpeg;base64,", "")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inflexible Image Format Handling category Functionality

Tell me more
What is the issue?

The function assumes all image URLs will be JPEG format, but the input URL could contain other image formats (PNG, GIF, etc.).

Why this matters

This could cause conversion failures when processing non-JPEG images, leading to runtime errors in the virtual learning agent.

Suggested change ∙ Feature Preview

Modify the function to handle different image formats by extracting the format from the URL:

def image_url_to_image(image_url: str) -> Image.Image:
    # Extract format from data URL
    format_match = image_url.split(';')[0]
    if not format_match.startswith('data:image/'):
        raise ValueError('Invalid image data URL')
    # Remove the data URL prefix
    image_base64 = image_url.split(',')[1]
    image_data = base64.b64decode(image_base64.encode())
    buffer = io.BytesIO(image_data)
    image = Image.open(buffer)
    return image
Provide feedback to improve future suggestions

Nice Catch Incorrect Not in Scope Not in coding standard Other

💬 Looking for more details? Reply to this comment to chat with Korbit.

Comment on lines +11 to +13
def image_to_image_url(image: Union[Image.Image, np.ndarray]):
if isinstance(image, np.ndarray):
image = Image.fromarray(image)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsafe Numpy Array Conversion category Functionality

Tell me more
What is the issue?

The function doesn't handle invalid numpy array inputs that can't be converted to PIL Images.

Why this matters

Invalid array shapes, data types, or value ranges could cause crashes when converting numpy arrays to PIL Images.

Suggested change ∙ Feature Preview

Add error handling for numpy array conversion:

def image_to_image_url(image: Union[Image.Image, np.ndarray]):
    if isinstance(image, np.ndarray):
        try:
            image = Image.fromarray(image)
        except TypeError as e:
            raise ValueError(f"Invalid numpy array format: {e}")
Provide feedback to improve future suggestions

Nice Catch Incorrect Not in Scope Not in coding standard Other

💬 Looking for more details? Reply to this comment to chat with Korbit.

Comment on lines +28 to +31
try:
response = completion.choices[0].message.content
except:
response = ""
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silent Exception Masking category Security

Tell me more
What is the issue?

Bare except clause that silently catches all exceptions and returns an empty string

Why this matters

This could mask critical security-related exceptions like authentication failures or API key issues, making it harder to detect and respond to security problems

Suggested change ∙ Feature Preview
try:
    response = completion.choices[0].message.content
except Exception as e:
    # Log the error appropriately
    raise  # Re-raise the exception to handle it at a higher level
Provide feedback to improve future suggestions

Nice Catch Incorrect Not in Scope Not in coding standard Other

💬 Looking for more details? Reply to this comment to chat with Korbit.

max_tokens: int,
reproducibility_config: dict,
):
self.client = OpenAI(base_url=base_url, api_key=os.getenv("OPENROUTER_API_KEY"))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing API Key Validation category Security

Tell me more
What is the issue?

No validation of required API key environment variable

Why this matters

If OPENROUTER_API_KEY is not set, it will be None, potentially causing unclear errors or unexpected behavior when making API calls

Suggested change ∙ Feature Preview
api_key = os.getenv("OPENROUTER_API_KEY")
if not api_key:
    raise ValueError("OPENROUTER_API_KEY environment variable is required")
self.client = OpenAI(base_url=base_url, api_key=api_key)
Provide feedback to improve future suggestions

Nice Catch Incorrect Not in Scope Not in coding standard Other

💬 Looking for more details? Reply to this comment to chat with Korbit.

Comment on lines +23 to +24
@backoff.on_exception(backoff.expo, RateLimitError)
def get_response(messages, max_tokens, **kwargs):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unbounded Exponential Backoff category Performance

Tell me more
What is the issue?

The backoff decorator's exponential retry strategy lacks maximum retry attempts and maximum delay parameters.

Why this matters

Without limits on retries and delays, the system could potentially keep retrying indefinitely or with extremely long delays, impacting system performance and resource utilization.

Suggested change ∙ Feature Preview

Add max_tries and max_time parameters to the backoff decorator:

@backoff.on_exception(
    backoff.expo,
    RateLimitError,
    max_tries=5,
    max_time=30
)
Provide feedback to improve future suggestions

Nice Catch Incorrect Not in Scope Not in coding standard Other

💬 Looking for more details? Reply to this comment to chat with Korbit.

@erikchwang erikchwang closed this Jun 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants