Skip to content

[ENG-10788] add nodes and preprints files reinding on account merging#11727

Open
mkovalua wants to merge 6 commits into
CenterForOpenScience:feature/pbs-26-9from
mkovalua:fix/ENG--10788
Open

[ENG-10788] add nodes and preprints files reinding on account merging#11727
mkovalua wants to merge 6 commits into
CenterForOpenScience:feature/pbs-26-9from
mkovalua:fix/ENG--10788

Conversation

@mkovalua
Copy link
Copy Markdown
Contributor

@mkovalua mkovalua commented May 5, 2026

Ticket

Purpose

User merge does not seem to be working for files

Changes

Adding nodes and preprints files reinding on account merging

Also some updates for SHARE is good to have

CenterForOpenScience/SHARE#892

Side Effects

File indexing does not work for local setup, so maybe it is needed add something else because something may be missed because of a bit blame fix

image

QE Notes

CE Notes

Documentation

@mkovalua mkovalua changed the title add nodes and preprints files reinding on account merging [ENG-10788] add nodes and preprints files reinding on account merging May 5, 2026
Copy link
Copy Markdown
Contributor

@antkryt antkryt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a small import fix

Comment thread osf/models/user.py
@antkryt
Copy link
Copy Markdown
Contributor

antkryt commented May 8, 2026

LGTM

Copy link
Copy Markdown
Collaborator

@cslzchen cslzchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix looks good but I have two notes:

  • Is this feature testable by unit tests? If so, please update/add.
  • Are there any possible impact to the celery queue if too many update_share(file), could it crowd/block the celery queue?

@mkovalua
Copy link
Copy Markdown
Contributor Author

mkovalua commented Jun 3, 2026

The fix looks good but I have two notes:

  • Is this feature testable by unit tests? If so, please update/add.
  • Are there any possible impact to the celery queue if too many update_share(file), could it crowd/block the celery queue?

Hi @cslzchen

  1. Have updated unittests

  2. Hard to say from my side (had problems with local files share indexing as shown in PR description), I suppose that it may be possible. If yes - maybe the solution is to create low priority tasks for files and setup several workers and it may be helpful. Maybe @mfraezz and @aaxelb have ideas. Thanks

@mkovalua mkovalua requested a review from cslzchen June 3, 2026 20:49
@aaxelb
Copy link
Copy Markdown
Collaborator

aaxelb commented Jun 3, 2026

  • Are there any possible impact to the celery queue if too many update_share(file), could it crowd/block the celery queue?
  1. Hard to say from my side (had problems with local files share indexing as shown in PR description), I suppose that it may be possible. If yes - maybe the solution is to create low priority tasks for files and setup several workers and it may be helpful. Maybe mfraezz and aaxelb have ideas. Thanks

+1 for putting bulk tasks like these on a lower-priority queue -- currently always high priority

over in SHARE there's task routing to prioritize certain tasks called with urgent=True -- could do similar in OSF with (or replacing) the is_backfill kwarg to task__update_share (which indirectly decides urgency in SHARE already), and allowing something like update_share(obj, urgent=False) for updates not directly spurred by user interaction

Copy link
Copy Markdown
Collaborator

@cslzchen cslzchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approach should work 👍 .

I'd like to see if we can improve and generalize it so that this can be used by other tasks and that it doesn't conflict with other tasks.

Afterwards, make sure we have good docstring and unit tests.

Comment thread api/share/utils.py


def update_share(resource):
def update_share(resource, urgent=True):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add docstring, mentioning which queue we are using, the default behavior and why we should use one over the other.

Comment thread api/share/utils.py Outdated


def _enqueue_update_share(osfresource):
def _enqueue_update_share(osfresource, urgent):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto, docstring

Comment thread api/share/utils.py
retry_backoff=True,
)
def task__update_share(self, guid: str, is_backfill=False, osfmap_partition_name='MAIN'):
def task__update_share(self, guid: str, is_backfill=False, osfmap_partition_name='MAIN', urgent=True):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto, docstring

Comment on lines +35 to +36
if kwargs and (kwargs.get('urgent') is False):
return {'queue': CeleryConfig.task_low_queue}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you take a further look at https://docs.celeryq.dev/en/latest/userguide/routing.html#routers to see if this is the proper way?

In addition, my understanding is the args and kwargs comes from the task. It's very likely other tasks may use urgent for a different purpose or even with different type.

  • Please check if we have any tasks that uses previously uses urgent?
  • Even if not, should we provide a way to avoid urgent being used in the future. My though is use a more verbose/explicit name. e.g.
def update_share(..., target_queue=None)
   ...

update_share(..., target_queque=CeleryConfig.task_low_queue)
  • In addition, make sure CeleryRouter verifies the queue is valid if manually selected.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants