Skip to content

fix(otel): fix broken schedule run spans#2727

Merged
ericallam merged 1 commit intomainfrom
ea-branch-106
Dec 2, 2025
Merged

fix(otel): fix broken schedule run spans#2727
ericallam merged 1 commit intomainfrom
ea-branch-106

Conversation

@ericallam
Copy link
Member

@ericallam ericallam commented Dec 2, 2025

schedule spans can sometimes show as generic spans when using the task_events_v2 table because of the inserted_at filter. Increasing the buffer for the start time does the trick and doesn’t cause any perf issues (and is in general just more robust)

CleanShot 2025-12-02 at 17 04 35

schedule spans can sometimes show as generic spans when using the 
task_events_v2 table because of the inserted_at filter. Increasing the 
buffer for the start time does the trick and doesn’t cause any perf 
Issues (and is in general just more robust)
@changeset-bot
Copy link

changeset-bot bot commented Dec 2, 2025

⚠️ No Changeset found

Latest commit: 46a906d

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 2, 2025

Walkthrough

This pull request introduces a taskEventStore field through the application layer. The field is added to the SpanRun object returned by the presenter layer, propagated into a new admin-only UI display in the run details page, and includes a timing adjustment to a database query filter in the event repository that increases the lower time bound from 1 second to 60 seconds.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Verify that taskEventStore is consistently typed across presenter, API contract, and UI layers
  • Confirm the new admin-only property in the RunBody component correctly restricts visibility
  • Validate that the start_time filter adjustment (1000 ms to 60_000 ms) in getTraceSummary does not unintentionally exclude relevant trace records or degrade query performance

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Description check ⚠️ Warning The description explains the issue and solution but lacks required template sections like issue reference, checklist, and testing details. Add issue reference (Closes #), complete the checklist, include testing steps, and provide a changelog section following the template format.
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main fix: addressing broken schedule run spans in OpenTelemetry integration.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch ea-branch-106

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
apps/webapp/app/v3/eventRepository/clickhouseEventRepository.server.ts (1)

1034-1070: Widened time buffer looks correct; consider centralizing and checking other callers.

Using a 60s buffer for startCreatedAtWithBuffer (and thus start_time / inserted_at lower bounds) should address the missing schedule spans and still keeps partition pruning in place. The only nuance is that other query methods here (getSpan, getTraceDetailedSummary, getRunEvents) still use a 1s buffer; if the underlying issue is specific to getTraceSummary that’s fine, but it’s worth double‑checking that they don’t suffer from the same edge case.

You might also want to extract 60_000 into a shared constant (e.g. TRACE_SUMMARY_START_TIME_BUFFER_MS) so it’s easier to tune and reason about later.

apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs.$runParam.spans.$spanParam/route.tsx (1)

801-822: Admin “Task event store” display is wired correctly; consider a fallback label.

Showing run.taskEventStore in the admin‑only section is consistent with the presenter change and useful for debugging. If there’s any chance this field can be null/undefined (older runs, migrations), you might want to render a placeholder like "–" or "unknown" instead of a bare empty/null value.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9f27422 and 46a906d.

📒 Files selected for processing (3)
  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts (1 hunks)
  • apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs.$runParam.spans.$spanParam/route.tsx (1 hunks)
  • apps/webapp/app/v3/eventRepository/clickhouseEventRepository.server.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead

Files:

  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs.$runParam.spans.$spanParam/route.tsx
  • apps/webapp/app/v3/eventRepository/clickhouseEventRepository.server.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use zod for validation in packages/core and apps/webapp

Files:

  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs.$runParam.spans.$spanParam/route.tsx
  • apps/webapp/app/v3/eventRepository/clickhouseEventRepository.server.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use function declarations instead of default exports

Files:

  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs.$runParam.spans.$spanParam/route.tsx
  • apps/webapp/app/v3/eventRepository/clickhouseEventRepository.server.ts
apps/webapp/app/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

Access all environment variables through the env export of env.server.ts instead of directly accessing process.env in the Trigger.dev webapp

Files:

  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs.$runParam.spans.$spanParam/route.tsx
  • apps/webapp/app/v3/eventRepository/clickhouseEventRepository.server.ts
apps/webapp/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

apps/webapp/**/*.{ts,tsx}: When importing from @trigger.dev/core in the webapp, use subpath exports from the package.json instead of importing from the root path
Follow the Remix 2.1.0 and Express server conventions when updating the main trigger.dev webapp

Files:

  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs.$runParam.spans.$spanParam/route.tsx
  • apps/webapp/app/v3/eventRepository/clickhouseEventRepository.server.ts
**/*.{js,ts,jsx,tsx,json,md,css,scss}

📄 CodeRabbit inference engine (AGENTS.md)

Format code using Prettier

Files:

  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs.$runParam.spans.$spanParam/route.tsx
  • apps/webapp/app/v3/eventRepository/clickhouseEventRepository.server.ts
🧠 Learnings (4)
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Attach metadata to task runs using the metadata option when triggering, and access/update it inside runs using metadata functions

Applied to files:

  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs.$runParam.spans.$spanParam/route.tsx
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Subscribe to run updates using `runs.subscribeToRun()` for realtime monitoring of task execution

Applied to files:

  • apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs.$runParam.spans.$spanParam/route.tsx
📚 Learning: 2025-07-12T18:06:04.133Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 2264
File: apps/webapp/app/services/runsRepository.server.ts:172-174
Timestamp: 2025-07-12T18:06:04.133Z
Learning: In apps/webapp/app/services/runsRepository.server.ts, the in-memory status filtering after fetching runs from Prisma is intentionally used as a workaround for ClickHouse data delays. This approach is acceptable because the result set is limited to a maximum of 100 runs due to pagination, making the performance impact negligible.

Applied to files:

  • apps/webapp/app/v3/eventRepository/clickhouseEventRepository.server.ts
📚 Learning: 2025-06-14T08:07:46.625Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 2175
File: apps/webapp/app/services/environmentMetricsRepository.server.ts:202-207
Timestamp: 2025-06-14T08:07:46.625Z
Learning: In apps/webapp/app/services/environmentMetricsRepository.server.ts, the ClickHouse methods (getTaskActivity, getCurrentRunningStats, getAverageDurations) intentionally do not filter by the `tasks` parameter at the ClickHouse level, even though the tasks parameter is accepted by the public methods. This is done on purpose as there is not much benefit from adding that filtering at the ClickHouse layer.

Applied to files:

  • apps/webapp/app/v3/eventRepository/clickhouseEventRepository.server.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (23)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
  • GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
  • GitHub Check: typecheck / typecheck
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (1)
apps/webapp/app/presenters/v3/SpanPresenter.server.ts (1)

213-278: Plumbing taskEventStore through SpanPresenter looks good.

taskEventStore is selected in findRun and surfaced unchanged in the run payload, which keeps SpanRun coherent with the DB shape and supports the new admin UI without altering control flow.

@ericallam ericallam merged commit df4ab97 into main Dec 2, 2025
31 checks passed
@ericallam ericallam deleted the ea-branch-106 branch December 2, 2025 17:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

Comments