111 create yaml template for problem submission #121

Open
kvdblom wants to merge 5 commits into main from 111-create-yaml-template-for-problem-submission

kvdblom (Collaborator) commented Nov 19, 2025

Draft yaml template for review

kvdblom linked an issue Nov 19, 2025 that may be closed by this pull request

Dvermetten (Contributor) left a comment

Maybe it would be useful to have some sort of comments to describe what the fields are / what would be valid input, specifically for the cases where there is a difference between quoted and non-quoted input. Example:

  variables: # information about the input variables
    types: continuous # can be one of (continuous, integer, binary, mixed)
    conditional: 'no' # whether there are conditional dependencies between variables, 'yes' or 'no'
    dimensionality: scalable # number of input variables, either as a number (in quotes) or scalable
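
To make the quoted versus non-quoted difference concrete, here is a minimal Python sketch of how a YAML 1.1 style loader might type a single scalar. It is not a YAML parser: `resolve_scalar` is a hypothetical helper, the boolean word list follows the YAML 1.1 bool type, and the integer rule is simplified. Real loaders differ, and YAML 1.2 core-schema parsers only treat the true/false spellings as booleans.

```python
import re

# Boolean spellings from the YAML 1.1 "bool" type (assumption: the loader
# follows YAML 1.1; YAML 1.2 core-schema loaders are stricter).
_TRUE = {"y", "Y", "yes", "Yes", "YES", "true", "True", "TRUE", "on", "On", "ON"}
_FALSE = {"n", "N", "no", "No", "NO", "false", "False", "FALSE", "off", "Off", "OFF"}
_INT = re.compile(r"[-+]?[0-9]+")  # simplified: decimal integers only


def resolve_scalar(text, quoted):
    """Hypothetical helper: mimic how a YAML 1.1 loader types one scalar."""
    if quoted:
        return text  # quoted scalars always stay strings
    if text in _TRUE:
        return True
    if text in _FALSE:
        return False
    if _INT.fullmatch(text):
        return int(text)
    return text  # any other plain word stays a string


print(repr(resolve_scalar("no", quoted=True)))         # string 'no'
print(repr(resolve_scalar("no", quoted=False)))        # boolean False
print(repr(resolve_scalar("1", quoted=True)))          # string '1'
print(repr(resolve_scalar("scalable", quoted=False)))  # string 'scalable'
```

Under these assumptions, `conditional: no` and `conditional: 'no'` load as different types, which is why the inline comments should state when quotes are expected.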

CIGbalance (Collaborator) left a comment

Thanks for this! The comments are mostly easy fixes or not that serious.

But I am mostly wondering how this is supposed to be used? It is called a template, but it has BBOB-specific entries? I think this increases the risk of biasing the results.

I think I would suggest doing a template with a lot more comments (like @Dvermetten suggested) and then we can use this as real data as well as link to it as an example?

- name:
    short: BBOB
    full: Real-Parameter Black-Box Optimization Benchmarking
  suite/generator/single: suite
Collaborator

I still think it would be best to enter problems separately, but I understand the necessity of streamlining this process. So if we want to allow entering suites, maybe we have different templates? Because some values might only make sense for a suite? Otherwise, for a suite, a lot of the input would be "varies" for number of objectives, variables, constraints, dynamic, noise, etc?

Collaborator Author

Yes, it would be nice to have problems separately. I have also prototyped how it might work to have both the suite and component problems, see here:

OPL/problems.yaml

Line 1107 in 84b34b6

problems:

I am not sure about separate templates for suites or problems, but we can think about this and discuss what makes sense. I would imagine, e.g., the BBOB sphere still has a bunch of "varies", because it is scalable in the number of variables for example. For a suite, I would want an exhaustive list (e.g., noise: yes, no / optional, or something like that) of the options, rather than a vague "varies". I will describe this in a comment in the template.

Collaborator Author

TODO: create a new issue for this.

- name: COCO
  link: https://github.com/numbbo/coco
  languges: 'C, Python'
  evaluation time: 'less than a second'
Collaborator

This is going to be wild to analyse if we don't give some structure

Collaborator Author

What would you suggest? @CIGbalance

My initial idea would be to ask for something like:
Enter approximate time with units and greater/less than symbol if relevant, e.g.: <1s, or 2h5m3s

Are there good standards we can refer to/use?

Collaborator Author

We did something similar for the real-world benchmarks questionnaire. There we had less than a second/minute/hour/day, or more than a day.

Perhaps the 2h5m3s format is nicer, so we can automatically group things as desired in whatever analysis code is used. (Probably with some optional symbols like <, >, ~.)
Thoughts?
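
For reference, a minimal Python sketch of what parsing the 2h5m3s style could look like, so entries can be grouped automatically. The function name `parse_eval_time`, the `d` (day) unit, and the exact set of accepted qualifier symbols are assumptions for illustration, not an agreed format.

```python
import re

# Assumed units; "d" for days is an addition not spelled out in the thread.
_UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}
_PATTERN = re.compile(r"([<>~]?)((?:\d+[dhms])+)")
_PART = re.compile(r"(\d+)([dhms])")


def parse_eval_time(text):
    """Return (qualifier, seconds) for strings such as '<1s' or '2h5m3s'."""
    match = _PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised evaluation time: {text!r}")
    qualifier, body = match.groups()
    # Sum every number/unit pair, e.g. 2h + 5m + 3s.
    seconds = sum(int(n) * _UNIT_SECONDS[u] for n, u in _PART.findall(body))
    return qualifier, seconds


print(parse_eval_time("2h5m3s"))  # ('', 7503)
print(parse_eval_time("<1s"))     # ('<', 1)
```

With the value in seconds, analysis code could bin entries into whatever categories are wanted later.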

Collaborator Author

Decision: go with categories (second, minute, hour, ...); multiple categories can be entered if the time varies.
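
A rough Python sketch of how the category decision could be checked when reading submissions. The category labels are assumed from the questionnaire wording mentioned earlier in this thread, and the comma-separated form for multiple entries is also an assumption; `validate_eval_time` is a hypothetical helper.

```python
# Assumed labels, based on the questionnaire options mentioned in this thread.
CATEGORIES = (
    "less than a second",
    "less than a minute",
    "less than an hour",
    "less than a day",
    "more than a day",
)


def validate_eval_time(value):
    """Accept one category or a comma-separated list; return them ordered."""
    entries = [part.strip() for part in value.split(",") if part.strip()]
    unknown = [entry for entry in entries if entry not in CATEGORIES]
    if not entries or unknown:
        raise ValueError(f"unknown evaluation time categories: {unknown or value!r}")
    # Remove duplicates and sort from fastest to slowest category.
    return sorted(set(entries), key=CATEGORIES.index)


print(validate_eval_time("less than a minute, less than a second"))
# ['less than a second', 'less than a minute']
```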

bn12 commented Dec 11, 2025

Work in progress ...

  # Please enter the relevant information.
  # Fields that are not relevant can be left empty.
  - name:
      short: BBOB
      full: Real-Parameter Black-Box Optimization Benchmarking
    suite/generator/single: suite
    objectives:
      number: '1'
      types: single
      # 1 - single
      # 2 - bi or multi or multiple
      # 3 - multi or multiple
      # 4 or more - many
      # scalable - scalable
    variables:
      types: continuous
      # continuous or discrete or mixed-integer
      conditional: 'no'
      # yes or no
      dimensionality: scalable
      # scalable or fixed number or interval or ranges (5-11,19,85)
    constraints:
      present: 'no'
      # yes or no
      soft: '0'
      # number of soft constraints
      hard: '0'
      # number of hard constraints
      boundary/box: 'yes'
      # yes or no, in case of yes the number should be equal to the dimensionality under variables
      permutation: 'no'
      # yes or no, in case of yes the number should be equal to the length of the permutation
    dynamic:
      present: 'no'
      # yes or no
      types: ''
      # ??
    noise: 'no'
    # yes or no, guess we don't differentiate types of noise (by now?)
    modality:
      types: 'unimodal, multimodal'
    evaluations:
      multi-fidelity: 'no'
      partial possible: 'no'
      independent objectives: 'no'
    reference:
      links:
      - https://doi.org/10.1080/10556788.2020.1808977
      authors: ''
      contact person: ''
    implementations:
    - name: COCO
      link: https://github.com/numbbo/coco
      languges: 'C, Python'
      evaluation time: 'less than a second'
      specific requirements: 'no'
    source:
      real-world:
        degree: ''
        open/closed: ''
      artificial: 'yes'
      other: 'no'
    textual description:
      general info: ''
      motivation: 'evaluate algorithm performance for typical difficulties that occur in continuous problems'
      challenage/key characteristics: ''
      limitations: ''
      other info: ''

template.yaml Outdated
  number: '1'
  types: single
variables:
  types: continuous

Type of variables should be continuous or discrete or mixed (mixed-integer)

variables:
  types: continuous
  conditional: 'no'
  dimensionality: scalable

scalable or fixed number or interval / ranges (e.g. 5-11,19,85)
What do we expect the practitioner to add here? I'd prefer the exact number or range if available.

Collaborator Author

Yes: Prefer exact numbers/ranges. 'scalable' only if it can be any number.

Collaborator Author

Under dimensionality there should be two fields:

  • scalable yes/no
  • range (but can be single value)
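
As an illustration of the range field, a small Python sketch that expands a spec such as 5-11,19,85 into explicit dimensions. The name `parse_dimensions` is hypothetical, and treating intervals as inclusive on both ends is an assumption.

```python
def parse_dimensions(spec):
    """Expand a spec such as '5-11,19,85' into a sorted list of ints."""
    values = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # Interval "low-high", assumed inclusive on both ends.
            low, high = (int(x) for x in part.split("-"))
            if low > high:
                raise ValueError(f"empty interval: {part!r}")
            values.update(range(low, high + 1))
        else:
            values.add(int(part))  # single value
    return sorted(values)


print(parse_dimensions("5-11,19,85"))
# [5, 6, 7, 8, 9, 10, 11, 19, 85]
```

A single value such as `'10'` goes through the same path, so the range field can hold either form while the separate scalable yes/no field records whether any number is allowed.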

present: 'no'
soft: '0'
hard: '0'
boundary/box: 'yes'

We don't consider discrete, categorical variables here, right?
That's fine with me but we should be aware of the missing thing.

In case of non-categorical variables, we expect the number of boxes to be equal to the number of variables. Could it be different?

Collaborator Author

Option: if only some variables are bounded, people could enter 'some'.

Put general comment at the start of the template: Suggested values for fields are recommendations. If they are not a good fit, feel free to add something else (and ideally, explain why)

soft: '0'
hard: '0'
boundary/box: 'yes'
permutation: 'no'

A yes would expect the variables to be discrete, right?

In the case of a mixed-integer problem, can there be a permutation present in (parts of) the discrete part?
Not the regular case, I'd assume, but again we should be aware of the possibility if it exists.

Collaborator Author

See above, people can deviate. And here too, it indicates the property is present in the problem, but does not necessarily apply to all variables.

dynamic:
  present: 'no'
  types: ''
noise: 'no'

For consistency, I suggest following the present/types logic here as well. This also keeps things simple.
We should ask practitioners to fill the other info field with corresponding information in case this is needed, e.g. for the dynamics, the noise, etc.

Collaborator Author

Add at top of template: Present indicates it exists somewhere in the problem, but not necessarily everywhere.
Add types to other fields where it makes sense.
Specifically for noise, copy the types from the Google form + other types field.

kvdblom (Collaborator, Author) commented Feb 5, 2026

> Maybe it would be useful to have some sort of comments to describe what the fields are / what would be valid input, specifically for the cases where there is a difference between quoted and non-quoted input. Example:
>
>   variables: # information about the input variables
>     types: continuous # can be one of (continuous, integer, binary, mixed)
>     conditional: 'no' # whether there are conditional dependencies between variables, 'yes' or 'no'
>     dimensionality: scalable # number of input variables, either as a number (in quotes) or scalable

Yes we should do that.

kvdblom (Collaborator, Author) commented Feb 5, 2026

> Thanks for this! The comments are mostly easy fixes or not that serious.
>
> But I am mostly wondering how this is supposed to be used? It is called a template, but it has BBOB-specific entries? I think this increases the risk of biasing the results.
>
> I think I would suggest doing a template with a lot more comments (like @Dvermetten suggested) and then we can use this as real data as well as link to it as an example?

We should split between

  • A template
  • An example

Dvermetten (Contributor)

To check what fields from the other info (based on the form) should be added to the template (taken from a different problem):
other info:
  name: null
  partial evaluations: Not Present
  full name: Electric Motor Design Optimization
  constraint properties: Hard Constraints, Soft Constraints, Box Constraints
  number of constraints: '12'
  type of dynamicism: ''
  form of noise model: ''
  type of noise space: ''
  other noise properties: ''
  description of multimodality: Constraints are multimodal
  key challenges / characteristics: Time-consuming solution evaluation, highly-constrained
    problem
  scientific motivation: Challenging to find good solutions in a limited time
  limitations: 'Unavailability, even if available, it wouldn''t be helpful to use
    for benchmarking due taking a long time to evaluate a single solution '
  implementation languages: Python
  links to implementations: Implementation not freely available
  approximate evaluation time: 8 minutes
  links to usage examples: ''
  general: This is not an available problem, but could be interesting to show to
    researchers which difficulties appear in real-world problems

Development

Successfully merging this pull request may close these issues.

Create YAML template for problem submission

4 participants