
Add an evaluation for model robustness #36

@aron0093

Description


We need to add an evaluation that tests the robustness of the inferred programs across multiple runs (different random seeds) and across multiple values of K.

  1. A weak test could assess the similarity of the overall information captured by each run, without matching individual programs.
  2. A stronger test would match individual programs across runs and assess their consistency; see the sketch below.
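
A minimal sketch of both tests, assuming each run produces a features-by-K loading matrix (the matrices and function names below are illustrative placeholders, not part of the current codebase). The weak test correlates the aggregate feature-level signal of two runs; the stronger test matches programs one-to-one with the Hungarian algorithm on cosine similarity and averages the matched similarities.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics.pairwise import cosine_similarity


def weak_robustness(loadings_a, loadings_b):
    """Weak test: compare the overall information captured by two runs.

    Inputs are (features x K) loading matrices. The feature-level signal is
    aggregated over all programs, so individual program identity is ignored.
    """
    agg_a = np.abs(loadings_a).sum(axis=1)
    agg_b = np.abs(loadings_b).sum(axis=1)
    return np.corrcoef(agg_a, agg_b)[0, 1]


def strong_robustness(loadings_a, loadings_b):
    """Stronger test: match programs one-to-one across runs and score consistency.

    Programs are paired with the Hungarian algorithm on cosine similarity
    (this also handles runs with different K); the score is the mean
    similarity of the matched pairs.
    """
    sim = cosine_similarity(loadings_a.T, loadings_b.T)  # K_a x K_b
    row, col = linear_sum_assignment(-sim)               # maximise total similarity
    return sim[row, col].mean()


# Toy example: two runs with K=10 programs over 2000 features (random data,
# the second run is a small perturbation of the first to mimic another seed).
rng = np.random.default_rng(0)
run_a = rng.random((2000, 10))
run_b = run_a + rng.normal(scale=0.05, size=run_a.shape)
print(weak_robustness(run_a, run_b), strong_robustness(run_a, run_b))
```

For multiple seeds or K values, either score could be computed for every pair of runs and summarised (e.g. mean and minimum pairwise consistency per K).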
