Fix background jobs that previously failed and were never completed properly #3113

Draft
tw4l wants to merge 1 commit into main from issue-3067-rerun-bg-jobs-not-finished

@tw4l tw4l commented Jan 15, 2026

Fixes #3067

This PR includes two migrations:

  • One that fixes background jobs that never had finished or success set properly in the database
  • One that finds all crawl and profile files that have not been replicated (if replica locations are set) and creates new background jobs to re-replicate them.

Work in progress. In particular, we should check that each file has the expected number of replicas, rather than just whether it has been replicated at all.


crawls_match_query = {
    "oid": {"$in": orgs_with_replicas},
    "files": {"$elemMatch": {"replicas": {"$in": [None, []]}}},

We may want to check instead whether the length of the replicas array differs from the number of configured replica locations. A file could have been replicated to one location but not to another.
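A minimal sketch of the length-based check suggested here, assuming the expected number of replicas is known per org (it may differ between orgs, so the filter is built for one org at a time). The function names and the `expected_count` parameter are hypothetical and not part of this PR. `$not` combined with `$size` is standard MongoDB query syntax and should also match a missing, null, or empty `replicas` field.

```python
from typing import Any, Optional


def needs_replication(replicas: Optional[list], expected_count: int) -> bool:
    """Pure-Python mirror of the check: True when the replica count differs
    from the number of configured replica locations (None counts as zero)."""
    return len(replicas or []) != expected_count


def crawls_missing_replicas_query(oid: Any, expected_count: int) -> dict:
    """Hypothetical per-org filter: crawls with at least one file whose
    replicas array does not have exactly expected_count entries."""
    return {
        "oid": oid,
        "files": {
            "$elemMatch": {"replicas": {"$not": {"$size": expected_count}}}
        },
    }
```

Building the filter per org trades one query for several, but it avoids assuming that every org has the same number of replica locations.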


profiles_match_query = {
    "oid": {"$in": orgs_with_replicas},
    "resource.replicas": {"$in": [None, []]},

We may want to check instead whether the length of the replicas array differs from the number of configured replica locations. A file could have been replicated to one location but not to another.
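For the partial-replication case, a count alone does not say which location is missing. The sketch below works that out by name, assuming each entry in `replicas` is a dict with a `name` key that matches a configured replica location name. That shape and the function name are assumptions for illustration, not confirmed by this PR.

```python
from typing import Iterable, List


def missing_replica_locations(resource: dict, configured_names: Iterable[str]) -> List[str]:
    """Return the configured location names that have no replica entry yet.

    Assumes resource["replicas"] is a list of dicts with a "name" key,
    or is None or absent when nothing has been replicated.
    """
    present = {replica["name"] for replica in (resource.get("replicas") or [])}
    return [name for name in configured_names if name not in present]
```

A migration could then create one replication job per returned name, so a profile already copied to one location is only replicated to the others.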

@tw4l tw4l force-pushed the issue-3067-rerun-bg-jobs-not-finished branch from acbddbf to e5823e7 on April 16, 2026 09:47
Some background jobs previously failed and did not have success
or finished fields set due to bugs. The first migration targets
these and ensures that their finished and success values are set
in the database as expected.

The second migration looks for all unreplicated crawl and profile
files and replicates them. This is preferable to simply
retrying older failed replication background jobs, as it is
possible that the objects they refer to have been deleted or
changed since, so those old jobs are no longer applicable.
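
The first migration described in the commit message above could be sketched roughly as follows. This is not the PR's implementation: the `started` field, the cutoff used to exclude jobs that are still running, and both function names are assumptions made for illustration.

```python
from datetime import datetime, timezone
from typing import Optional


def stale_jobs_query(started_before: datetime) -> dict:
    """Hypothetical filter for jobs missing finished or success.

    In MongoDB, {"field": None} matches both null and absent fields.
    The started cutoff is an assumption to skip jobs still in progress.
    """
    return {
        "started": {"$lt": started_before},
        "$or": [{"finished": None}, {"success": None}],
    }


def job_fix_update(job: dict, now: Optional[datetime] = None) -> dict:
    """Build a $set update that fills in the missing fields.

    Keeps an existing finished value, and treats a missing success as failure.
    """
    finished = job.get("finished") or now or datetime.now(timezone.utc)
    success = bool(job.get("success"))
    return {"$set": {"finished": finished, "success": success}}
```

Each matched job would then be updated with the document returned by `job_fix_update`, for example through an `update_one` call keyed on the job's `_id`.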
@tw4l tw4l force-pushed the issue-3067-rerun-bg-jobs-not-finished branch from e5823e7 to a95d2bb on April 16, 2026 10:04

Development

Successfully merging this pull request may close these issues.

[Task]: Re-run previously failed background jobs that never had finished/success set
