
Conversation

@tenthe (Contributor) commented Nov 11, 2025

Hi all,

In Apache StreamPipes we encountered an issue where, if a connection drops during a lease and a subsequent reconnection fails, the container leaves behind a “zombie” leasedConnection as well as a non-empty queue containing futures that have already completed. To address this, we added a queue.clear() call in the ConnectionContainer.

For StreamPipes we had temporarily copied the corresponding PLC4X classes [1] to apply this fix on our side. If this change is acceptable from your perspective, we can remove the duplicated PLC4X classes from the StreamPipes codebase.

I hope this update makes sense. If there’s anything else I should adjust, please let me know.

[1] https://github.com/apache/streampipes/tree/dev/streampipes-extensions/streampipes-connectors-plc/src/main/java/org/apache/streampipes/extensions/connectors/plc/cache
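For illustration, here is a heavily simplified, hypothetical sketch of the failure mode and the fix. The class and method names mirror the PLC4X ones, but this is not the actual implementation; a `String` stands in for a real `PlcConnection`:

```java
import java.util.LinkedList;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

// Illustrative sketch only, not the real PLC4X ConnectionContainer.
class ConnectionContainer {
    private final Queue<CompletableFuture<String>> queue = new LinkedList<>();
    private String leasedConnection;

    // The first caller gets the connection; later callers wait in the queue.
    synchronized CompletableFuture<String> lease() {
        CompletableFuture<String> future = new CompletableFuture<>();
        if (leasedConnection == null) {
            leasedConnection = "connection";
            future.complete(leasedConnection);
        } else {
            queue.add(future);
        }
        return future;
    }

    // Called when the connection breaks and reconnecting fails.
    synchronized void invalidate() {
        leasedConnection = null; // drop the zombie leasedConnection
        queue.forEach(f -> f.completeExceptionally(new IllegalStateException("connection lost")));
        queue.clear(); // the fix: don't leave stale futures behind
    }

    synchronized int queueSize() {
        return queue.size();
    }
}
```

Without the `queue.clear()` call, subsequent leases would keep encountering the stale entries even though the underlying connection was gone.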

@sruehl (Contributor) commented Nov 11, 2025

LGTM, thanks for the PR

@sruehl sruehl merged commit b430d0f into apache:develop Nov 11, 2025
8 of 9 checks passed
@chrisdutz (Contributor)

Oh ... I was just about to review this and I do have my issues with it.

The original idea was that when multiple threads ask for the same resource, the first gets it and the others go into the queue. If the thread using the connection then causes it to throw an exception, we invalidate the connection and create a new one.

The idea was to return the new connection to the next waiting thread.

By clearing the queue, we also immediately kill the waiting threads.
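A hypothetical sketch of that intended hand-off, under the same simplifying assumptions (a `String` stands in for a `PlcConnection`; the names are illustrative, not the actual PLC4X code):

```java
import java.util.LinkedList;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

// Illustrative sketch: on failure, create a replacement connection and hand
// it to the next waiting lease instead of clearing the whole queue.
class HandOffContainer {
    private final Queue<CompletableFuture<String>> queue = new LinkedList<>();
    private String leasedConnection;
    private int generation = 0;

    synchronized CompletableFuture<String> lease() {
        CompletableFuture<String> future = new CompletableFuture<>();
        if (leasedConnection == null) {
            leasedConnection = "connection-" + (++generation);
            future.complete(leasedConnection);
        } else {
            queue.add(future);
        }
        return future;
    }

    // The broken connection is invalidated, a new one is created, and the
    // next thread in the queue receives it: the waiters survive the failure.
    synchronized void invalidateAndReconnect() {
        leasedConnection = "connection-" + (++generation); // stand-in for a real reconnect
        CompletableFuture<String> next = queue.poll();
        if (next != null) {
            next.complete(leasedConnection);
        }
    }
}
```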

@chrisdutz (Contributor)

If the component is not doing the reconnecting and handing the connection to the next, then I think we should fix this instead.

@tenthe (Contributor, Author) commented Nov 12, 2025

Hi @chrisdutz,

thanks for your comment. Yes, that is how our client in StreamPipes currently works: if a connection fails, we skip the read and retry it during the next pull interval.

The issue before this PR was that we always received a broken connection because it was never removed from the queue.

@chrisdutz (Contributor)

I guess the main issue is that the cache didn't create a new connection and pass it to the next one in the queue. Or, if for example the connection is lost, the re-connect will also fail; I'm not sure how it currently would handle that.

Ideally the container should keep on trying to connect and once that works, it passes the connection to the next in the queue.
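A hypothetical sketch of that retry-until-connected behaviour (`Reconnector`, `reconnectAndHandOff`, and the `Supplier` wiring are illustrative assumptions, not PLC4X API):

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Illustrative sketch: keep attempting to reconnect, and only complete the
// next waiting future once a connect attempt actually succeeds.
class Reconnector {
    static String reconnectAndHandOff(Supplier<String> connector,
                                      CompletableFuture<String> nextInQueue,
                                      int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                String connection = connector.get(); // may throw while the PLC is unreachable
                nextInQueue.complete(connection);    // hand the fresh connection to the waiter
                return connection;
            } catch (RuntimeException e) {
                // connection still down; try again (a real impl would back off between attempts)
            }
        }
        nextInQueue.completeExceptionally(new IllegalStateException("reconnect failed"));
        return null;
    }
}
```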
