Race condition when a task is interrupted leads to exceptions #335

@etiennebarrie

| First Job | Second Job | Run status |
| --- | --- | --- |
| Task is created, job enqueued in Runner.run | | enqueued |
| TaskJob#before_enqueue | | running |
| Job is re-enqueued in JobIteration::Iteration#reenqueue_iteration_job | | |
| ~ | TaskJob#before_enqueue | running |
| TaskJob#after_perform | | interrupted |
| ~ | Task is processing | interrupted |

At this point the Task is marked as interrupted while a job is still actively processing it, so the UI looks a bit weird: you can see progress being made on a seemingly interrupted Task.
An exception is then raised when the Task succeeds, because we try to move the status from interrupted to succeeded.
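
To make the failing transition concrete, here is a minimal sketch in plain Ruby. `SketchRun`, its `ALLOWED` table, and `transition_to` are hypothetical names made up for illustration; they are not the gem's model or its real transition rules, only an assumed table covering the statuses named in this issue.

```ruby
# Hypothetical stand-in for a run record; not the gem's actual model.
class SketchRun
  class InvalidTransition < StandardError; end

  # Assumed transition table, limited to the statuses named in this issue.
  ALLOWED = {
    enqueued: [:running],
    running: [:interrupted, :succeeded, :errored],
    interrupted: [:running],
  }.freeze

  attr_reader :status

  def initialize
    @status = :enqueued
  end

  def transition_to(new_status)
    # Re-marking the current status is treated as a no-op (assumption).
    return @status if new_status == @status

    unless ALLOWED.fetch(@status, []).include?(new_status)
      raise InvalidTransition, "cannot move from #{@status} to #{new_status}"
    end

    @status = new_status
  end
end

run = SketchRun.new
run.transition_to(:running)      # first job starts
run.transition_to(:running)      # second job starts after the re-enqueue
run.transition_to(:interrupted)  # first job's after_perform runs late

# The second job finishes while the status is already interrupted.
error_message =
  begin
    run.transition_to(:succeeded)
    nil
  rescue SketchRun::InvalidTransition => e
    e.message
  end
```

Under these assumptions, `error_message` holds the rejected transition and the status stays `:interrupted`, which mirrors the sequence in the table above.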

Before 1.1.0, this exception is caught and another one is raised when we try to move from interrupted to errored.

What happened next with 1.0.0 and Sidekiq is that the job was retried because of the unhandled exception (with the previous cursor, as explained in the README). That retry finally let the task finish and succeed. Reusing the last cursor meant that elements were processed multiple times; luckily, the process was idempotent.
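
The duplicate processing can be illustrated with a toy model. Everything here is an assumption for the sake of the sketch: the item list, the `saved_cursor` value, and the idea that no newer cursor was persisted before the retry. It does not reproduce the real cursor handling of the gem or of Sidekiq.

```ruby
items = %w[a b c d e]
processed = []

# Assumption: the cursor was last saved at index 2, when the job was re-enqueued.
saved_cursor = 2

# The second job runs from the saved cursor to the end, then the status update
# raises before a newer cursor is stored.
items[saved_cursor..].each { |item| processed << item }

# The queue adapter retries the job, which resumes from the stale cursor.
items[saved_cursor..].each { |item| processed << item }

# Items that were handled more than once across both attempts.
duplicates = processed.tally.select { |_item, count| count > 1 }.keys
```

In this toy run every item from the stale cursor onward is handled twice, which is harmless only if processing an item is idempotent.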

With 1.1.0, the exception is caught, and the task is moved to errored thanks to the second commit of #293, which was meant for deleted tasks but applies here as well: dbe3234.
The only solution at this point is to run the task again. The collection method may skip the already processed items; failing that, the last line of defense is that the process method is, hopefully, idempotent.
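
As a rough sketch of both safeguards, the plain-Ruby class below mirrors the collection/process split mentioned above. `SketchBackfillTask`, its `Record` struct, and the email normalization are hypothetical and only serve as an example; the class does not inherit from anything in the gem.

```ruby
# Hypothetical task shape; it does not subclass anything from the gem.
class SketchBackfillTask
  Record = Struct.new(:id, :normalized_email, :email)

  def initialize(records)
    @records = records
  end

  # Only return records that still need work, so a rerun skips finished ones.
  def collection
    @records.select { |record| record.normalized_email.nil? }
  end

  # Idempotent: processing the same record twice leaves it in the same state.
  def process(record)
    record.normalized_email = record.email.strip.downcase
  end
end

records = [
  SketchBackfillTask::Record.new(1, nil, " A@Example.com "),
  SketchBackfillTask::Record.new(2, nil, "b@example.com"),
]
task = SketchBackfillTask.new(records)

task.process(records.first)             # simulate a partial first run
remaining = task.collection.map(&:id)   # what a rerun would still visit
```

Filtering in `collection` keeps a rerun cheap, while the idempotent `process` covers the case where an item is visited again anyway.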
