Skip to content

Conversation

@NathanFlurry
Copy link
Member

No description provided.

Copy link
Member Author

NathanFlurry commented Jan 24, 2026

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more


How to use the Graphite Merge Queue

Add the label merge-queue to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@NathanFlurry NathanFlurry changed the title queues example workflow example Jan 24, 2026
// Workflow Sandbox - Actor Registry
// Each actor demonstrates a different workflow feature using actor-per-workflow pattern

import { setup } from "rivetkit";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 'setup' function is imported from 'rivetkit' but the module cannot be found. This is because the root package.json needs to have resolutions for RivetKit packages. Add 'rivetkit': 'workspace:*' to the resolutions in the root package.json.

Spotted by Graphite Agent (based on CI logs)

Fix in Graphite


Is this helpful? React 👍 or 👎 to let us know.

@claude
Copy link

claude bot commented Jan 24, 2026

Pull Request Review: Workflow Example

This PR adds a comprehensive workflow sandbox example and addresses several friction points in the RivetKit workflow API. Overall, this is a valuable contribution that improves developer experience and provides excellent documentation through a working example.


✅ Strengths

1. Excellent Documentation

  • workflow-friction.md is exceptional - documenting pain points during development is invaluable for improving the API
  • The example README follows the template correctly with clear feature sections
  • Good inline comments explaining workflow patterns

2. API Improvements

The PR addresses real friction points:

  • Added missing send() method to ActorQueue
  • Added missing broadcast() method to ActorWorkflowContext
  • Re-exported Loop from rivetkit/workflow to fix package resolution
  • Updated CLAUDE.md with RivetKit package resolution guidelines

3. Comprehensive Example

The workflow-sandbox example demonstrates:

  • Sequential steps (order processing)
  • Sleep/timers (countdown)
  • Loops with state (batch processing)
  • Listen with timeout (approval queue)
  • Join (parallel data fetching)
  • Race (work vs timeout)
  • Rollback/compensating transactions (payment)

Each pattern is implemented as a separate actor with clean separation of concerns.


⚠️ Issues & Recommendations

1. Type System Workaround (Critical)

Issue: The actorCtx() helper in _helpers.ts is a workaround for a fundamental type system problem documented in workflow-friction.md.

Location: examples/workflow-sandbox/src/actors/_helpers.ts:10-12

Problem: Every example needs to cast the loop context to access state and broadcast, which defeats type safety.

Recommendation:

  • Short-term: This workaround is acceptable for the example, but add a TODO comment linking to a tracking issue
  • Long-term: Fix the root cause by making ActorWorkflowContext the declared type for loop callbacks (as noted in friction log)

2. Type Cast in workflow/mod.ts

Issue: The return type is cast to satisfy RunConfig.

Recommendation: Consider:

  1. Adjusting the RunConfig type to accept the actual return type
  2. Ensuring the workflow run function signature matches what RunConfig expects
  3. Adding a comment explaining why the cast is necessary if it's intentional

3. Error Handling Patterns

Issue: All actor files use try-catch with error serialization, but there's no consistent error handling pattern.

Examples:

  • order.ts:85-96 - catches and sets state.error
  • payment.ts:158-170 - same pattern
  • No validation of error types or structured error objects

Recommendation:

  • Consider creating a shared error handling utility
  • Define error types instead of using Error | unknown
  • Follow the custom error system documented in CLAUDE.md (packages/common/error/)

4. Logging Style Inconsistency

Issue: Pick a consistent naming convention for IDs (orderId vs txId vs id)

5. Documentation Completeness

Issue: The workflow-friction.md document is excellent but lives at the repo root instead of in a docs/ directory.

Recommendation:

  • Move to docs/development/workflow-friction.md or similar
  • Add a link from the main README or CONTRIBUTING.md
  • Consider converting this into tracking issues for each friction point

🔍 Security Considerations

✅ No security concerns identified:

  • No sensitive data handling
  • No external API calls
  • Simulated payment/order processing only

📊 Test Coverage

Missing: No test files in tests/ directory for the workflow-sandbox example.

Recommendation: Add tests following the pattern from examples/CLAUDE.md


📝 Code Quality

Positive:

  • Consistent formatting
  • TypeScript strict mode enabled
  • Good separation of concerns
  • Clear naming conventions

Areas for Improvement:

  • Reduce duplication in actor patterns (consider shared utilities)
  • Add JSDoc comments for exported types

📋 Checklist

  • ✅ Follows repository conventions (CLAUDE.md)
  • ✅ Uses pnpm workspace patterns correctly
  • ✅ ESM imports with .ts extensions
  • ✅ Structured logging with tracing
  • ⚠️ Type safety (workarounds present)
  • ❌ Missing test coverage
  • ⚠️ Friction points documented but not fully resolved

🎬 Conclusion

Recommendation: Approve with minor changes

This PR significantly improves the RivetKit developer experience by:

  1. Adding missing API methods that were clearly needed
  2. Providing a comprehensive workflow example
  3. Documenting friction points for future improvement

The type system workarounds are acceptable for now, given they're well-documented in the friction log. However, these should be addressed in follow-up work.

Required before merge:

  • None (approval can proceed)

Recommended follow-ups:

  1. Create tracking issues for each friction point in workflow-friction.md
  2. Add test coverage for the workflow-sandbox example
  3. Fix the type system issues so actorCtx helper isn't needed
  4. Review and potentially refactor the 880-line App.tsx component

Great work on documenting the friction points - this kind of developer feedback is invaluable! 🎉

Comment on lines +44 to +50
import { timer } from "./actors/timer.ts";
import { order } from "./actors/order.ts";
import { batch } from "./actors/batch.ts";
import { approval } from "./actors/approval.ts";
import { dashboard } from "./actors/dashboard.ts";
import { race } from "./actors/race.ts";
import { payment } from "./actors/payment.ts";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These imports need to be sorted alphabetically to pass Biome linting. Run 'pnpm biome check --apply' to automatically fix this issue.

Spotted by Graphite Agent (based on CI logs)

Fix in Graphite


Is this helpful? React 👍 or 👎 to let us know.

Comment on lines +134 to +136
useEffect(() => {
actor.connection?.getOrder().then(setOrder);
}, [actor.connection]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This creates an unhandled floating Promise. Add a .catch() handler to properly handle potential errors: actor.connection?.getOrder().then(setOrder).catch(error => console.error('Failed to get order:', error));

Spotted by Graphite Agent (based on CI logs)

Fix in Graphite


Is this helpful? React 👍 or 👎 to let us know.

Comment on lines +714 to +721
useEffect(() => {
actor.connection?.getTask().then(setTask);
}, [actor.connection]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add error handling to this Promise: actor.connection?.getTask().then(setTask).catch(error => console.error('Failed to get task:', error));

Spotted by Graphite Agent (based on CI logs)

Fix in Graphite


Is this helpful? React 👍 or 👎 to let us know.

Comment on lines +527 to +532
<button className="approve" onClick={() => onApprove("Admin")}>
Approve
</button>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If onApprove returns a Promise, handle it properly: onClick={() => { const result = onApprove('Admin'); if (result instanceof Promise) result.catch(err => console.error('Approval failed:', err)); }}

Spotted by Graphite Agent (based on CI logs)

Fix in Graphite


Is this helpful? React 👍 or 👎 to let us know.

This was referenced Jan 26, 2026
@NathanFlurry NathanFlurry changed the base branch from 01-21-feat_rivetkit_integrate_workflows_in_to_actors to graphite-base/4032 January 28, 2026 01:12
@NathanFlurry NathanFlurry changed the base branch from graphite-base/4032 to 01-24-feat_frontend_traces_ui January 28, 2026 01:12
@claude
Copy link

claude bot commented Jan 28, 2026

PR Review: Workflow Example

Overview

This PR adds comprehensive workflow and queue sandbox examples with a rich frontend UI. It includes significant changes to RivetKit's workflow engine and actor configuration, along with new documentation and two major example applications.

Changes Summary:

  • 99 files changed: +9,859 additions, -302 deletions
  • New examples: queue-sandbox and workflow-sandbox (both local and Vercel versions)
  • Workflow engine improvements: logging support, rollback actions
  • Actor configuration: RunConfig support for workflow metadata
  • Documentation: New flushing guide, updated style guide

Code Quality & Best Practices

✅ Strengths

  1. Excellent Type Safety - Strong TypeScript types throughout RivetKit changes with proper generics
  2. Comprehensive Documentation - New FLUSHING.md and detailed frontend style guide
  3. Good Separation of Concerns - Clean actor/workflow separation in examples

⚠️ Concerns

  1. Inconsistent Logging (context.ts:178-181) - Logger is optional but used throughout without clear configuration docs
  2. Demo Error Patterns (payment.ts:130-132) - Generic Error throws might be copied by users
  3. Magic Numbers (config.ts:91-106) - Hardcoded timeouts without tuning guidance

Critical Issues

🔴 Must Fix

1. Rollback State Transition Bug (index.ts:878-893)

storage.state = "rolling_back";
await flush(storage, driver);
try {
    await executeRollback(/* ... */);
} catch (rollbackError) {
    if (rollbackError instanceof EvictedError) {
        return { state: storage.state };  // Returns "rolling_back" not "failed"!
    }
}

Issue: If evicted during rollback, workflow stays in "rolling_back" state. On resume, no recovery logic exists for this state.

Fix: Transition to "failed" before return or add recovery logic on resume.

2. Memory Leak in Live Runtime (index.ts:188-203)

function notifyMessage(runtime: LiveRuntime, name: string): void {
    runtime.pendingMessageNames.push(name);  // Unbounded growth!
}

Issue: Array grows indefinitely if messages arrive without consumers.

Fix: Add max size + cleanup for old unmatched messages.

3. Missing Rollback Tests

  • New rollback functionality has no comprehensive tests
  • Only 5 lines added to existing test suite
  • Critical paths like rollback+eviction untested

Medium Priority Issues

4. setTimeout Without Cancellation (payment.ts:91-93)
Demo code uses setTimeout without abortSignal - bad pattern for users to copy.

5. No Input Validation (payment.ts:42-44)
Amount can be negative/zero/huge in example code.

6. Stack Traces in Production (index.ts:922-925)
Full stack traces persisted - could leak paths/structure.

7. Excessive Loop Flushing (context.ts:722-735)
Default commitInterval=20 may be too frequent for tight loops. Needs tuning docs.


Architecture & Design

✅ Good Patterns

  • RunConfig Design - Clean, backward-compatible extension point
  • Rollback Actions - Type-safe with automatic completion tracking
  • Dirty Flag Pattern - Efficient batched writes

⚠️ Concerns

  • WorkflowContextImpl Size - 1893 lines, handles too many responsibilities
  • Complex Execution Flow - executeWorkflow ↔ executeRollback ↔ workflowFn with shared mutable state

Recommendation: Extract rollback logic + validation into separate modules.


Test Coverage

Missing:

  • Rollback checkpoint enforcement tests
  • Rollback action execution order tests
  • Rollback with eviction scenarios
  • RunConfig metadata tests
  • Logger integration tests
  • Example applications (no automated tests)

Documentation

✅ Excellent

  • FLUSHING.md persistence guide
  • Frontend style guide (examples/CLAUDE.md)
  • JSDoc on public APIs

⚠️ Needs

  • Logger configuration examples
  • RunConfig migration guide
  • Timeout tuning guidance
  • Rollback patterns in CLAUDE.md
  • Troubleshooting sections in example READMEs

Specific File Recommendations

workflow-engine/src/index.ts

  • Line 878-893: Fix rollback eviction state transition ⚠️ CRITICAL
  • Line 188: Add max size to pendingMessageNames ⚠️ CRITICAL
  • Line 631: Consider batch message API
  • Line 925: Add option to exclude stack traces

workflow-engine/src/context.ts

  • Line 124: Document logger parameter
  • Line 336: Add JSDoc about entryInProgress try-finally
  • Line 503-558: Note timeout doesn't cancel underlying operation
  • Consider refactoring (1893 lines is large)

rivetkit/src/actor/config.ts

  • Line 62-64: Add RunConfig usage examples
  • Line 91-111: Document timeout value selection

examples/workflow-sandbox/src/actors/payment.ts

  • Line 42-44: Add input validation (amount > 0)
  • Line 91-93: Use ctx.abortSignal or add disclaimer
  • Line 130-132: Use custom error class

Summary & Recommendation

Overall Assessment: 🟡 Approve with Required Changes

This is high-quality work that adds significant value to RivetKit. The examples are well-designed, documentation is comprehensive, and type safety is excellent. However, critical bugs must be fixed before merge.

Action Items

Before Merge (Critical):

  1. ✋ Fix rollback state transition on eviction
  2. ✋ Add memory cleanup to live runtime
  3. ✋ Add comprehensive rollback tests

Soon After Merge:

  1. Document logger configuration
  2. Add input validation to examples
  3. Document timeout tuning

Future Improvements:

  1. Refactor WorkflowContextImpl (extract rollback logic)
  2. Add batch message API
  3. Make stack trace persistence configurable
  4. Add automated tests for examples

The core functionality is solid and the examples will be valuable learning resources. Once the critical issues are addressed, this will be ready to merge.


Review completed: 2026-01-30

@jog1t jog1t force-pushed the 01-24-feat_frontend_traces_ui branch from 389f28e to 3c7fabc Compare January 28, 2026 19:55
NathanFlurry and others added 2 commits January 30, 2026 00:01
…WithTimeout

When listenWithTimeout/listenUntil was used, workflows would only wake
when the deadline passed, not when a message arrived. This broke
approval workflows and other human-in-the-loop patterns.

Changes:
- Extended SleepError to optionally carry messageNames array
- Modified listen methods to pass message names when throwing SleepError
- Updated execution loop to wait for EITHER messages OR deadline
- Fixed race branches to properly yield on SleepError instead of failing

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@NathanFlurry NathanFlurry force-pushed the 01-24-feat_frontend_traces_ui branch from 3c7fabc to 7dc3fdd Compare January 30, 2026 08:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants