@lupko lupko commented Dec 11, 2024

FlexConnect's implementation of GetFlightInfo uses a task executor to submit tasks that invoke the FlexConnect functions. GetFlightInfo then waits for the task to complete (generate the flight) and returns the result.

The GetFlightInfo implementation waited with a timeout that was hardcoded in a constant and set to 60 seconds. This can sometimes be too short, and worse, it could not be changed.

This PR introduces a new setting, `call_deadline_ms`, which is expected to appear in the `[flexconnect]` configuration section. It is the deadline for the call, in milliseconds. The default is raised from 60 seconds to 180 seconds.
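As a rough illustration of how the setting and its default fit together, the sketch below reads `call_deadline_ms` from a `[flexconnect]` section and falls back to 180 seconds when it is missing. This is not the server's actual configuration loader: the helper name `read_call_deadline_ms`, the constant name, and the use of Python's `configparser` are assumptions made only for the example.

```python
import configparser

# 180 seconds: the new default described in this PR.
DEFAULT_CALL_DEADLINE_MS = 180_000


def read_call_deadline_ms(config_text: str) -> int:
    """Hypothetical helper: return the configured deadline or the default."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # The fallback covers both a missing section and a missing option.
    return parser.getint(
        "flexconnect", "call_deadline_ms", fallback=DEFAULT_CALL_DEADLINE_MS
    )


example_config = """
[flexconnect]
call_deadline_ms = 300000
"""

deadline_ms = read_call_deadline_ms(example_config)
```

With the example configuration above, the deadline is five minutes; dropping the `call_deadline_ms` line yields the 180-second default.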


This PR also includes two additional changes:

  • Try to cancel the task when the deadline is exceeded. This is a basic sanity measure. If the task is still stuck in the queue, the cancel removes it immediately. If the task is invoking a function that is cancellable, the cancel indicator is propagated to the function so that it can act on it.

  • When a task that executes a FlexConnect function finishes the invocation and has a result, it should check whether it was cancelled and switch itself to a non-cancellable state before it returns the result. This is also basic hygiene. A task that runs a non-cancellable function may be cancelled while the function still runs to completion and returns a result, which the server then retains for the configured amount of time. Since the task was cancelled, no client will ever come for that result, so it would sit in memory unnecessarily.
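The two changes above can be sketched roughly as follows. This is a minimal illustration built on `concurrent.futures`, not the actual FlexConnect code: the `Task` class, its `cancel` and `run` methods, and `call_with_deadline` are hypothetical names, and dropping the result of a cancelled task by returning `None` is an assumption about how "not retaining the result" might look.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


class Task:
    """Hypothetical task wrapper around a function invocation."""

    def __init__(self, fn):
        self._fn = fn
        self._lock = threading.Lock()
        self._cancelled = False
        self._cancellable = True

    def cancel(self) -> bool:
        """Request cancellation; returns False once the result is handed over."""
        with self._lock:
            if not self._cancellable:
                return False
            self._cancelled = True
            return True

    def run(self):
        result = self._fn()
        with self._lock:
            if self._cancelled:
                # Nobody will come for this result, so do not retain it.
                return None
            # From here on the result is being returned; refuse late cancels.
            self._cancellable = False
        return result


def call_with_deadline(executor, task, deadline_ms):
    """Submit the task and wait at most deadline_ms for its result."""
    future = executor.submit(task.run)
    try:
        return future.result(timeout=deadline_ms / 1000.0)
    except FutureTimeout:
        # Still queued: future.cancel() removes it from the queue.
        # Already running: flag the task so it can discard its result.
        if not future.cancel():
            task.cancel()
        raise
```

A caller would pass the configured `call_deadline_ms` as `deadline_ms` and translate the timeout into an error for the client. The lock makes the "check cancelled, then become non-cancellable" step atomic, so a cancel cannot slip in between the check and the return.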

- the deadline was hardcoded to 60 seconds before
- added new option `call_deadline_ms` that configures the
  deadline for the FlexConnect function call, in milliseconds
- if not specified, the default of 180 seconds is used
- added extra e2e test verifying the behavior
- sanitized test fixtures / test function

JIRA: CQ-1005
- for now, this will mainly work in cases when the task is still
  in the queue
- once the task is running & making the function call, there is no
  mechanism to tell the call to cancel

JIRA: CQ-1005
- see code comments for explanation

JIRA: CQ-1005
@lupko lupko enabled auto-merge December 12, 2024 07:49
@lupko lupko merged commit 484fc20 into gooddata:master Dec 12, 2024
8 checks passed