Fix duplicate cron trigger events when callback blocks on I/O #667

Draft: bolekk wants to merge 1 commit into main from httptrigger_multi

bolekk (Contributor) commented Jun 23, 2026

Production workflows with a single cron trigger were occasionally delivering the same trigger event ID twice, causing the workflow engine to log "Skipping duplicate execution" for a delayed second delivery.

Root cause: gocron allows overlapping task runs by default. The cron callback read trigger.nextRun at the start but only advanced it after blocking work (org resolver lookup, TriggerExecutionStarted emission). If the callback blocked long enough for the next scheduled tick to fire, a concurrent gocron invocation read the same stale nextRun and emitted a duplicate event with the identical scheduled execution time / legacy execution ID.

Fix:

  • Advance nextRun via job.NextRun() immediately after capturing the scheduled execution time, before any blocking I/O in the callback.
  • Register cron jobs with gocron.WithSingletonMode(LimitModeWait) so a second invocation cannot run concurrently while the first is still in progress; overdue ticks queue and run sequentially afterward.

Add TestCronTrigger_DelayedDuplicateEventWhenCallbackBlocks with a blocking org resolver to reproduce the production scenario and assert that no duplicate trigger event IDs are delivered.
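The core assertion of such a test is that no trigger event ID appears more than once among the delivered events. A minimal sketch of that check follows; `duplicateIDs` and the ID strings are hypothetical and do not reflect the real event ID format or the actual test helper.

```go
package main

import "fmt"

// duplicateIDs returns each ID that occurs more than once in ids, reported
// once, in the order the second occurrence was seen.
func duplicateIDs(ids []string) []string {
	seen := make(map[string]int, len(ids))
	var dups []string
	for _, id := range ids {
		seen[id]++
		if seen[id] == 2 {
			dups = append(dups, id)
		}
	}
	return dups
}

func main() {
	// Hypothetical IDs: the second delivery repeats the first scheduled time.
	delivered := []string{"cron-10:00", "cron-10:00", "cron-10:01"}
	fmt.Println(duplicateIDs(delivered)) // prints "[cron-10:00]"
}
```

A test in this style would collect the IDs delivered while the resolver is blocked and fail if `duplicateIDs` returns a non-empty slice.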

Co-authored-by: Cursor <cursoragent@cursor.com>