Python lambdas lose connection to cluster nodes #705

@thomhickey

I know it's a long shot, and I apologize for asking a semi-out-of-bounds question, since our backend is not ScyllaDB. Here's our setup:

scylladb python driver version: 3.29.7
aws lambda runtime: python3.13 arm w/ aws http api gateway
backend: aws keyspaces or ec2 cassandra cluster, multi-region

With either your driver or the DataStax driver, we very consistently see 'host marked down' followed by 'no host available' when the Lambda sits idle for 1-2 minutes, but before AWS tears it down completely. During that window the Lambda runtime is stopped but still exists, so a new request does not trigger a cold start (which would, of course, open a new connection to the cluster). While the Lambda is in this idle state, the driver loses its connection to Keyspaces/Cassandra. The first query fails; we catch the error, reconnect, and all is well, but the latency of that request suffers greatly: what is normally a 30-50ms response time lands in 300ms territory.
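
For context, the catch-and-reconnect workaround described above can be sketched roughly as follows. This is an illustrative sketch, not the actual handler code: `ReconnectingSession`, the `connect` factory, and the `retryable` tuple are hypothetical names. With the real driver, `connect` would presumably be something like `lambda: Cluster(...).connect()` and `retryable` would include `cassandra.cluster.NoHostAvailable`.

```python
from typing import Any, Callable, Optional, Tuple, Type


class ReconnectingSession:
    """Wrap a driver session and rebuild it once when a query fails.

    `connect` is a hypothetical factory that returns any object exposing
    `execute(query, params)`. `retryable` lists the exception types that
    should trigger a reconnect; everything else propagates unchanged.
    """

    def __init__(
        self,
        connect: Callable[[], Any],
        retryable: Tuple[Type[BaseException], ...] = (Exception,),
    ) -> None:
        self._connect = connect
        self._retryable = retryable
        self._session: Optional[Any] = None
        self.reconnects = 0  # counter, useful for logging or metrics

    def execute(self, query: str, params: Any = None) -> Any:
        # Connect lazily so a warm invocation reuses the existing session.
        if self._session is None:
            self._session = self._connect()
        try:
            return self._session.execute(query, params)
        except self._retryable:
            # Assumption: the connection went stale while the Lambda was idle.
            # Build a fresh session and retry exactly once; a second failure
            # is raised to the caller. A real handler would probably also
            # shut down the old cluster object here.
            self._session = self._connect()
            self.reconnects += 1
            return self._session.execute(query, params)
```

The single retry keeps the failure path bounded, but the reconnect itself is still where the extra latency comes from, which is why a faster reconnect strategy would help.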

With Cassandra we're able to experiment with heartbeat settings and low-level socket settings, but none of it fixes the issue. I wouldn't say we've tried everything, but we've tried a lot, including provisioned concurrency for the Lambdas, which, surprisingly, does not fix the issue. We've also tried pinging our own APIs every minute with a real request, and we still saw issues.
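
To make "heartbeat settings and low-level socket settings" concrete, here is a minimal sketch of the kind of configuration meant. The helper name `build_cluster_kwargs` and the specific values are assumptions for illustration, not the exact settings tried. The keys `idle_heartbeat_interval`, `idle_heartbeat_timeout`, `connect_timeout`, and `sockopts` are keyword arguments of the driver's `Cluster` constructor.

```python
import socket
from typing import Any, Dict, List, Tuple


def build_cluster_kwargs(
    heartbeat_s: int = 10, keepalive_idle_s: int = 30
) -> Dict[str, Any]:
    """Return keyword arguments for the driver's Cluster constructor.

    The values are illustrative assumptions, not recommended settings.
    """
    # Each tuple is passed by the driver to socket.setsockopt().
    sockopts: List[Tuple[int, int, int]] = [
        (socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1),
    ]
    # The TCP_KEEP* names exist on Linux (the Lambda platform) but not on
    # every OS, so look them up defensively.
    for name, value in (
        ("TCP_KEEPIDLE", keepalive_idle_s),  # idle seconds before first probe
        ("TCP_KEEPINTVL", 10),               # seconds between probes
        ("TCP_KEEPCNT", 3),                  # failed probes before giving up
    ):
        opt = getattr(socket, name, None)
        if opt is not None:
            sockopts.append((socket.IPPROTO_TCP, opt, value))
    return {
        "idle_heartbeat_interval": heartbeat_s,  # driver default is 30
        "idle_heartbeat_timeout": heartbeat_s,   # driver default is 30
        "connect_timeout": 5,                    # driver default is 5
        "sockopts": sockopts,
    }


# Hypothetical usage with the driver installed:
#   from cassandra.cluster import Cluster
#   cluster = Cluster(contact_points=["<host>"], **build_cluster_kwargs())
```

Driver heartbeats are sent from a background thread, so they presumably cannot fire while the Lambda process is frozen between invocations, which may be why tuning them has no effect here.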

Has anyone else reported this? Any fix you can recommend? Any faster reconnect strategies you can recommend?

Thanks in advance!
