Questions about the unified action space and benchmark action categories #22

@zhshj0110

Thanks for your excellent work.

I was unable to reproduce the reported performance on GUI-Odyssey and Android-Control.

During training and evaluation, did you customize the prompts and action spaces for each dataset? For example, GUI-Odyssey does not have Launch(app=app_name), while Android-Control has a similar action.

When I use the same action space as the benchmark, the model predicts fewer unexpected actions and performance improves.
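
To make the question concrete, here is a minimal sketch of a per-benchmark action space. The action names other than `launch`, the dictionary keys, and the helper functions are hypothetical placeholders, not the repository's actual prompts or the benchmarks' full action lists. It only shows the idea of building the prompt from a dataset-specific action list and flagging predictions that fall outside it.

```python
import re

# Placeholder action lists for illustration only (assumption): replace them
# with the real action categories defined by each benchmark.
ACTION_SPACES = {
    "gui_odyssey": ["click", "scroll", "type"],
    "android_control": ["click", "scroll", "type", "launch"],
}


def build_prompt(benchmark: str, instruction: str) -> str:
    """Build a prompt that lists only the actions valid for this benchmark."""
    actions = ", ".join(ACTION_SPACES[benchmark])
    return f"Allowed actions: {actions}\nTask: {instruction}"


def action_name(prediction: str) -> str:
    """Extract the lowercased action name from a string like 'Launch(app=Chrome)'."""
    match = re.match(r"\s*([A-Za-z_]+)\s*\(", prediction)
    return match.group(1).lower() if match else ""


def is_allowed(benchmark: str, prediction: str) -> bool:
    """Return True if the predicted action belongs to the benchmark's action space."""
    return action_name(prediction) in ACTION_SPACES[benchmark]


if __name__ == "__main__":
    print(build_prompt("gui_odyssey", "Open the settings page"))
    print(is_allowed("gui_odyssey", "Launch(app=Chrome)"))
    print(is_allowed("android_control", "Launch(app=Chrome)"))
```

With a setup like this, a `Launch(...)` prediction would count as an unexpected action on GUI-Odyssey but as a valid one on Android-Control, which is the mismatch I am asking about.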
