During training and evaluation, did you customize the prompts and action spaces for each dataset? For example, GUI-Odyssey does not have Launch(app=app_name), but Android-Control has a similar action.
Using the same action space as the benchmark reduces the number of unexpected actions the model predicts, which improves performance.
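As a rough illustration of what matching the action space to each benchmark could look like, here is a minimal sketch. Everything in it is an assumption for illustration: the dataset keys, the action names, and the helpers `build_prompt` and `is_valid_action` are hypothetical placeholders, not the benchmarks' real definitions or the code used in this repository.

```python
# Hypothetical per-dataset action spaces. The keys and action names are
# placeholders, not the actual definitions used by any benchmark.
ACTION_SPACES = {
    "dataset_a": ["click", "scroll", "type"],
    "dataset_b": ["click", "scroll", "type", "open_app"],
}


def build_prompt(dataset: str, instruction: str) -> str:
    """Build a prompt that lists only the actions the given dataset allows."""
    actions = ", ".join(ACTION_SPACES[dataset])
    return f"Allowed actions: {actions}\nTask: {instruction}"


def is_valid_action(dataset: str, action_name: str) -> bool:
    """Return True if the predicted action exists in the dataset's action space."""
    return action_name in ACTION_SPACES[dataset]


if __name__ == "__main__":
    print(build_prompt("dataset_a", "Open the settings page"))
    # An action that exists in only one dataset is rejected for the other.
    print(is_valid_action("dataset_a", "open_app"))
    print(is_valid_action("dataset_b", "open_app"))
```

Listing only the allowed actions in the prompt, and checking predictions against the same list, is one simple way to keep a model from emitting an action that the benchmark being evaluated does not define.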