[ENG-10788] add nodes and preprints files reinding on account merging#11727
[ENG-10788] add nodes and preprints files reinding on account merging#11727mkovalua wants to merge 6 commits into
Conversation
|
LGTM |
cslzchen
left a comment
There was a problem hiding this comment.
The fix looks good but I have two notes:
- Is this feature testable by unit tests? If so, please update/add.
- Are there any possible impact to the celery queue if too many
update_share(file), could it crowd/block the celery queue?
Hi @cslzchen
|
+1 for putting bulk tasks like these on a lower-priority queue -- currently always high priority over in SHARE there's task routing to prioritize certain tasks called with |
cslzchen
left a comment
There was a problem hiding this comment.
The approach should work 👍 .
I'd like to see if we can improve and generalize it so that this can be used by other tasks and that it doesn't conflict with other tasks.
Afterwards, make sure we have good docstring and unit tests.
|
|
||
|
|
||
| def update_share(resource): | ||
| def update_share(resource, urgent=True): |
There was a problem hiding this comment.
Add docstring, mentioning which queue we are using, the default behavior and why we should use one over the other.
|
|
||
|
|
||
| def _enqueue_update_share(osfresource): | ||
| def _enqueue_update_share(osfresource, urgent): |
| retry_backoff=True, | ||
| ) | ||
| def task__update_share(self, guid: str, is_backfill=False, osfmap_partition_name='MAIN'): | ||
| def task__update_share(self, guid: str, is_backfill=False, osfmap_partition_name='MAIN', urgent=True): |
| if kwargs and (kwargs.get('urgent') is False): | ||
| return {'queue': CeleryConfig.task_low_queue} |
There was a problem hiding this comment.
Can you take a further look at https://docs.celeryq.dev/en/latest/userguide/routing.html#routers to see if this is the proper way?
In addition, my understanding is the args and kwargs comes from the task. It's very likely other tasks may use urgent for a different purpose or even with different type.
- Please check if we have any tasks that uses previously uses
urgent? - Even if not, should we provide a way to avoid
urgentbeing used in the future. My though is use a more verbose/explicit name. e.g.
def update_share(..., target_queue=None)
...
update_share(..., target_queque=CeleryConfig.task_low_queue)- In addition, make sure
CeleryRouterverifies the queue is valid if manually selected.
Ticket
Purpose
User merge does not seem to be working for files
Changes
Adding nodes and preprints files reinding on account merging
Also some updates for SHARE is good to have
CenterForOpenScience/SHARE#892
Side Effects
File indexing does not work for local setup, so maybe it is needed add something else because something may be missed because of a bit blame fix
QE Notes
CE Notes
Documentation