@tcdowney
Member

@tcdowney tcdowney commented Jul 1, 2025

Issue: #4372

  • I have reviewed the contributing guide

  • I have viewed, signed, and submitted the Contributor License Agreement

  • I have made this pull request to the main branch

  • I have run all the unit tests using bundle exec rake

  • [ in progress ] I have run CF Acceptance Tests

@tcdowney tcdowney requested review from Gerg, Samze and sethboyles July 1, 2025 16:43
@tcdowney tcdowney force-pushed the add-user-to-tasks branch from 8c9ee27 to 3fd097b Compare July 1, 2025 18:47
@tcdowney tcdowney requested a review from Gerg July 1, 2025 22:31
@tcdowney tcdowney merged commit cc006c0 into main Jul 2, 2025
18 of 21 checks passed
@tcdowney tcdowney deleted the add-user-to-tasks branch July 2, 2025 18:52
ari-wg-gitbot added a commit to cloudfoundry/capi-release that referenced this pull request Jul 2, 2025
Changes in cloud_controller_ng:

- Add ability to set user on Tasks
    PR: cloudfoundry/cloud_controller_ng#4433
    Author: Tim Downey <tcdowney@users.noreply.github.com>
@spgreenberg

spgreenberg commented Jul 17, 2025

@tcdowney @Gerg We are seeing errors querying the task API for existing applications once this change is applied. Tasks still execute successfully but querying the task API fails. The error message is vague:

cf tasks <app>

Unexpected Response
Response Code: 500
Request ID:    d346f3ca-6195-...
Code: 0, Title: , Detail: {"errors":[{"title":"UnknownError","detail":"An unknown error occurred.","code":10001}]}
FAILED

This only fails for apps that existed before the update and had already run tasks. I suspect this points to a nil/null issue related to the missing user in the task records created before the update.

Please let me know if there is additional info I can provide to help track this down.

@Gerg
Member

Gerg commented Jul 18, 2025

The underlying error should be available in the cloud controller logs.

I wasn't able to reproduce this by deploying an older version, running a task, deploying a new version, and then listing tasks. I attempted to reproduce on an environment using MySQL; are you using Postgres?

@spgreenberg

Thanks, I was able to find some logs. This impacted multiple customers before we rolled back.

{"timestamp":"2025-07-17T12:20:50.127204482Z","message":"Completed 500 vcap-request-id: d346f3ca-6195...","log_level":"info","source":"cc.api","data":{"request_guid":"d346f3ca-6195...","user_guid":"...","b3_trace_id":"d346f3ca...","b3_span_id":"bc24...","status_code":500,"time_taken_in_ms":12,"request_method":"GET","request_fullpath":"/v3/apps/bb94003e.../tasks?per_page=5000"},"thread_id":16200,"fiber_id":16220,"process_id":12,"file":"/var/vcap/data/packages/cloud_controller_ng/82618a.../cloud_controller_ng/lib/cloud_controller/logs/request_logs.rb","lineno":33,"method":"complete_request"}

Yes, we are using postgres.
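
For anyone triaging a similar failure, here is a minimal sketch (not part of the PR) of pulling failed requests out of JSON-formatted log lines like the one above. It assumes one JSON object per line and relies only on the `data.status_code`, `data.request_guid`, `data.request_method`, and `data.request_fullpath` fields visible in that entry. The function name `failed_requests` and the sample values are hypothetical.

```python
import json
from typing import Iterable, Iterator


def failed_requests(lines: Iterable[str], status: int = 500) -> Iterator[dict]:
    """Yield the `data` payload of each JSON log line with the given status code."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if not isinstance(entry, dict):
            continue
        data = entry.get("data")
        if isinstance(data, dict) and data.get("status_code") == status:
            yield data


# Hypothetical sample shaped like the log entry above (values are placeholders).
sample = json.dumps({
    "timestamp": "2025-07-17T12:20:50.127204482Z",
    "message": "Completed 500",
    "source": "cc.api",
    "data": {
        "request_guid": "example-guid",
        "status_code": 500,
        "request_method": "GET",
        "request_fullpath": "/v3/apps/example-app/tasks?per_page=5000",
    },
})

for data in failed_requests([sample, "not json", ""]):
    print(data["request_guid"], data["request_method"], data["request_fullpath"])
```

The request GUID printed here is the value to search for when looking for the corresponding error entry elsewhere in the logs.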

@spgreenberg

We have been trying to recreate this issue in our dev environment and have not been able to yet. Oddly, we saw this in our dev and prod environments at the same time. We are still working on it.

@Gerg
Member

Gerg commented Jul 21, 2025

I tried to reproduce with Postgres, and it still worked fine. My guess is it's something specific to those apps/tasks.

There should be a stack trace in the logs coming from here.

It should look something like:

{"timestamp":"...","message":"Request failed: 500: {\"errors\"=\u003e[{\"title\"=\u003e\"UnknownError\", \"detail\"=\u003e\"An unknown error occurred.\", \"code\"=\u003e10001, \"test_mode_info\"=\u003e{\"detail\"=\u003e\"
<actual error message>\", \"title\"=\u003e\"<actual error type>\", \"backtrace\"=\u003e[<stack trace>...
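
As a rough sketch (not part of the PR), entries of this shape could be located by scanning JSON log lines for a `message` that starts with `Request failed: 500`. This assumes one JSON object per line with `timestamp` and `message` fields, as in the example above. The function name `request_failures` and the sample lines are hypothetical, and whether these entries also carry the request GUID is not confirmed in this thread.

```python
import json
from typing import Iterable, Iterator, Tuple


def request_failures(
    lines: Iterable[str], prefix: str = "Request failed: 500"
) -> Iterator[Tuple[str, str]]:
    """Yield (timestamp, message) for log entries whose message starts with prefix."""
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if not isinstance(entry, dict):
            continue
        message = entry.get("message")
        if isinstance(message, str) and message.startswith(prefix):
            yield str(entry.get("timestamp", "")), message


# Hypothetical sample lines; timestamps and payloads are placeholders.
sample_lines = [
    json.dumps({"timestamp": "t1", "message": "Completed 500 vcap-request-id: example"}),
    json.dumps({"timestamp": "t2", "message": 'Request failed: 500: {"errors"=>[{"title"=>"UnknownError"}]}'}),
]

for timestamp, message in request_failures(sample_lines):
    print(timestamp, message[:80])
```

The matching message is the place to look for the actual error type, error message, and backtrace.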

@cweibel

cweibel commented Jul 21, 2025

Having a heck of a time reproducing the problem, but DID see what Steve saw. Unfortunately, the two apps that had the problem have since been either deleted or redeployed, and in each case that made the problem go away.

There are a few log entries from the next day with PG::UndefinedColumn: ERROR: column processes.user does not exist, but I attribute those to the short window (2 minutes?) between when I manually rolled back the schema changes and when I pinned back and started the deployment in Concourse.

If we find anything more definitive we'll drop an update here.

@spgreenberg

spgreenberg commented Jul 22, 2025

@cweibel Did notice missing droplets around the same time. Perhaps this is our root issue: #4467.

One of the apps should not have had its droplets purged though. It could have, but this would be impossible for us to validate at this point.
