Add provenance output support to execute() response #768
hapix wants to merge 1 commit into Open-EO:master
import logging
from pathlib import Path
from typing import Callable, Dict, List, Optional, Union
import os

# Return the result and get the workflow provinance (yprov4wfs)
result = pg.to_callable(PROCESS_REGISTRY)()
workflow = pg.workflow

As far as I understand, this depends on a new feature of openeo-pg-parser-networkx, so the minimum version of this dependency has to be bumped at
Lines 47 to 52 in ee18290
workflow = pg.workflow

# To save the provenance file in the specific path use:
# workflow.prov_to_json(directory_path=save_path)

I don't think it's useful to have this as a comment here. If this is for users, it should be in the docblock.
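To illustrate the suggestion, here is a minimal sketch of how the usage hint could move from an inline comment into the docblock. The function below is a hypothetical stub, not the real LocalConnection.execute(); only `return_provenance` and `workflow.prov_to_json(directory_path=save_path)` are taken from the diff, and the parameter descriptions are assumptions.

```python
from typing import Any


def execute(process_graph: dict, *, return_provenance: bool = False) -> Any:
    """Execute a process graph locally (illustrative stub, not the real method).

    :param process_graph: the process graph to run.
    :param return_provenance: if True, also return the workflow provenance
        object next to the result (behaviour as proposed in this PR).

    To save the provenance file in a specific path, call
    ``workflow.prov_to_json(directory_path=save_path)`` on the returned
    workflow object.
    """
    # The body is deliberately left out: the point is the docblock layout.
    raise NotImplementedError("stub used only to show the docblock layout")
```

With the hint in the docblock, it shows up in `help(execute)` and in generated API docs instead of being visible only to people reading the source.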
# workflow.prov_to_json(directory_path=save_path)

if return_provenance:
    return result, workflow

This custom return should be documented in the docblock and in the return annotation.
That being said, I'm not a big fan of the pattern of returning different data structures (a tuple of two things instead of a single DataArray) depending on input arguments.
Especially because, in normal usage of the openEO Python client, the execute method of connection objects (LocalConnection here) is usually not called directly by users, but indirectly through DataCube.execute() or something equivalent. So changing the input and output API of Connection.execute is going to create problems.
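For illustration only, one possible way to avoid an argument-dependent return type is to keep execute() returning a single value and expose the provenance separately. This is a sketch under assumptions, not something proposed in the PR: `SketchWorkflow`, `SketchLocalConnection` and `last_provenance` are hypothetical names, and the "processing" is a stand-in.

```python
from typing import Any, Dict, List, Optional


class SketchWorkflow:
    """Hypothetical stand-in for the provenance object (not the yProv4WFS API)."""

    def __init__(self, node_ids: List[str]) -> None:
        self.node_ids = node_ids


class SketchLocalConnection:
    """Hypothetical connection whose execute() always returns one kind of value."""

    def __init__(self) -> None:
        # Provenance of the most recent execute() call, if any.
        self.last_provenance: Optional[SketchWorkflow] = None

    def execute(self, process_graph: Dict[str, Any]) -> int:
        # Stand-in for real processing: count the nodes of the graph.
        result = len(process_graph)
        # Record provenance as a side effect instead of changing the return type.
        self.last_provenance = SketchWorkflow(sorted(process_graph))
        return result
```

Callers that go through an indirect path still get the plain result, while code that cares about provenance can read `last_provenance` afterwards. Whether that trade-off (hidden state on the connection) is acceptable is a separate design question.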
*,
validate: Optional[bool] = None,
auto_decode: bool = True,
return_provenance: bool = False,

You are adding a custom argument here to the more public API of Connection.execute(), which is fine as long as you call this LocalConnection.execute() yourself. But in general this method will be called automatically without this argument (because it's not in the official API), e.g.:
    cube = local_connection.load_collection(...)
    res = cube.execute()
The latter execute() is a method defined on DataCube and does not support return_provenance, let alone pass it on properly to LocalConnection.execute().
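The forwarding problem can be reproduced with a small toy model. `ToyConnection` and `ToyCube` below are hypothetical stand-ins, not the real openEO classes; they only mirror the situation described above, where the cube-level execute() has no `return_provenance` parameter and so never forwards it.

```python
class ToyConnection:
    """Hypothetical connection with the extra keyword proposed in the PR."""

    def execute(self, graph, *, return_provenance=False):
        result = f"result of {graph}"
        if return_provenance:
            # Return shape changes depending on the flag.
            return result, "provenance"
        return result


class ToyCube:
    """Hypothetical cube that delegates execution to its connection."""

    def __init__(self, connection, graph):
        self._connection = connection
        self._graph = graph

    def execute(self):
        # No return_provenance parameter here, so nothing is forwarded.
        return self._connection.execute(self._graph)
```

In this model, `ToyCube(ToyConnection(), "g").execute()` can only ever return the plain result, and passing `return_provenance=True` to it raises a `TypeError`, so the new flag is unreachable through the usual entry point.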
Summary
This PR adds support for returning workflow provenance data as part of the execute() result in the OpenEO Python client. It complements the provenance generation added in openeo-pg-parser-networkx via the yProv4WFS library.
Key Changes
provenance output to execute() result structure
Dependencies
This PR depends on the provenance functionality in openeo-pg-parser-networkx.