cdh.files: various improvements#132

Open
tymees wants to merge 6 commits intodevelopfrom
fix/files

Conversation

@tymees
Member

@tymees tymees commented Feb 24, 2026

This pull request is in response to the current 'missing files' situation, but I'm not 100% convinced it fixes the problem. It does effectively close two avenues I found that could result in the behavior we've seen; however, one I could only replicate in a scenario that does not happen in the grant tool, and the other is purely theoretical.

However, I am hoping that these changes might actually fix the real problem, as the code is now a lot more hesitant to delete files in general. But without logs, I can't say for certain.
(In any case, I've added a killswitch on file deletion as well, so worst case that should fix the issue in a stupid way).

The changes:

Added a logger
Pretty self-explanatory: a lot of debug (and some info/warning/error) log calls have been sprinkled across the code. This should help with debugging any issues.

Fixed a deletion bug
There was a somewhat convoluted check to make sure a File wasn't deleted before its time. Except it didn't exactly work, and it allowed files on disk to be removed when they shouldn't have been deleted.
Honestly, I'm having a hard time explaining what it tried to do and why it doesn't work; it's one of those things my abstract reasoning understands, but my linguistic reasoning doesn't.
An attempt: it was trying to account for a dangling reference that might be there, but it was 1) doing it using the wrong ref. counter and 2) couldn't actually do that using the correct ref. counter anyway.

In any case, it was fixed by simplifying and rearranging the deletion logic/checks a bit. The dangling reference can no longer occur, and we can hard-fail if we find a reference anyway.

As part of these changes, the code path that removed files without their corresponding File DB entry should now be closed off. (As it's a nonsensical path to take)

Run IO operations post-DB transaction
I don't think we actually ran into this (yet), but it was possible for files to 'disappear' if a file deletion happened during a DB transaction that was later rolled back. The code simply deleted the files as soon as the code in question was hit, instead of waiting for the DB changes to actually be committed. Thus, you could end up in a situation where a valid code path was taken (scoped to cdh.files itself), but an unrelated issue caused side effects that majorly screwed up the files.

Now the actual IO operations are wrapped using a transaction.on_commit callback. When no DB transaction is active, that callback is executed directly. If one is active, the code will only be run once that transaction is committed.
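To illustrate the semantics, here's a toy model of the deferred-IO pattern in pure Python (this is NOT Django's actual implementation; the class and method names are made up, but the behavior mirrors what transaction.on_commit guarantees: run immediately outside a transaction, run on commit inside one, never run on rollback):

```python
class FakeTransaction:
    """Toy model of transaction.on_commit semantics (names are hypothetical)."""

    def __init__(self):
        self._callbacks = []
        self.active = False

    def begin(self):
        self.active = True
        self._callbacks = []

    def on_commit(self, func):
        if self.active:
            self._callbacks.append(func)  # defer the IO until commit
        else:
            func()  # no transaction active: run immediately

    def commit(self):
        self.active = False
        for func in self._callbacks:
            func()  # DB changes are final, now the IO may happen
        self._callbacks = []

    def rollback(self):
        self.active = False
        self._callbacks = []  # deferred IO never runs: files survive


deleted = []
tx = FakeTransaction()

tx.begin()
tx.on_commit(lambda: deleted.append("file-1"))
tx.rollback()
assert deleted == []  # rollback: the file was never touched

tx.begin()
tx.on_commit(lambda: deleted.append("file-2"))
tx.commit()
assert deleted == ["file-2"]  # commit: the deferred deletion ran
```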

For those that don't know what a transaction is:
It's basically a mechanism that allows you to run SQL commands but still roll back all the changes you made after the transaction started.

As an example, let's say you are writing bank-accounting software and you want to transfer money from account A to account B. You'd first need to subtract the amount from account A and then add it to account B. Makes sense, right?
Okay, now imagine that an error happens between subtracting the amount from account A and adding it to account B.
The money is now gone from account A but was never added to account B.
If you use a transaction, you avoid that issue, as you are telling the DB 'hey, these two operations MUST both succeed or both fail'.
The DIAPP is set up to run every request in a transaction for this reason :)
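The bank example can be run directly against SQLite using Python's stdlib sqlite3 module (the table and amounts here are made up for illustration; `with conn:` opens a transaction that commits on success and rolls back on an exception):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 0)])
conn.commit()


def transfer(conn, src, dst, amount, crash_midway=False):
    # 'with conn' wraps both UPDATEs in one transaction
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        if crash_midway:
            raise RuntimeError("error between subtract and add")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
        )


# A crash mid-transfer rolls everything back: no money disappears
try:
    transfer(conn, "A", "B", 40, crash_midway=True)
except RuntimeError:
    pass
assert dict(conn.execute("SELECT * FROM accounts")) == {"A": 100, "B": 0}

# A successful transfer commits both updates atomically
transfer(conn, "A", "B", 40)
assert dict(conn.execute("SELECT * FROM accounts")) == {"A": 60, "B": 40}
```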

Global killswitch for file deletion
To make it easy to disable file deletion globally[1], the setting FILE_BLOCK_DELETION was added. It defaults to False, but if set to True it will disable all file-deletion operations for cdh.files. This does break some code paths, but in 'normal' operation that shouldn't be a problem.

To be exact, changing the underlying file-contents of File and FileWrapper will break, as that does a remove+write operation. (As replacing contents isn't allowed)
If used normally, this won't be a problem, as cdh.files doesn't actually touch the contents of existing File(Wrapper) objects. (If you change the contents of a FileField, a new File and FileWrapper are created and the old ones are marked for cleanup[2])

[1] So globally means 'for everything using cdh.files'; Django's default storage backend is unaffected
[2] Marked for cleanup does not mean it will be deleted; it just means the code will check if it's now unreferenced and only then will it delete stuff.
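For illustration, the killswitch guard presumably boils down to something like the sketch below. Only the setting name (CDH_FILES_FILE_BLOCK_DELETION, defaulting to False) comes from this PR; the `_Settings` stand-in, the `delete_from_storage` helper, and the log message are hypothetical:

```python
import logging

logger = logging.getLogger(__name__)


class _Settings:
    """Stand-in for django.conf.settings in this sketch."""

    CDH_FILES_FILE_BLOCK_DELETION = True  # killswitch engaged


settings = _Settings()

# Mirrors the getattr-with-default pattern used in the PR's settings module
FILE_BLOCK_DELETION = getattr(settings, "CDH_FILES_FILE_BLOCK_DELETION", False)


def delete_from_storage(delete_func, name, block=FILE_BLOCK_DELETION):
    """Hypothetical guard: refuse any storage deletion while the killswitch is on."""
    if block:
        logger.warning("FILE_BLOCK_DELETION is set; refusing to delete %s", name)
        return False
    delete_func(name)
    return True


removed = []
assert delete_from_storage(removed.append, "foo.txt") is False  # blocked
assert removed == []  # the file was left alone
```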

@tymees tymees requested review from EdoStorm96 and miggol February 24, 2026 21:12
@tymees
Member Author

tymees commented Feb 25, 2026

Okay, the problem turned out to be a completely different issue, unrelated to the code. However, I still think the problems I found yesterday should be fixed, which keeps this PR relevant.

Contributor

@miggol miggol left a comment


Thanks for helping us out looking into this issue! My main suggestion is in the logging department

Copy link
Contributor


I would like to suggest not importing from our own logging module, unless there's a rationale I'm missing here. I would usually get a logger wherever I need one with:

logger = logging.getLogger(__name__)

And then leave the rest to the formatter and filters in settings.py. I feel like this leaves the most flexibility to the client app in question, allowing them to mute/show messages from only wrappers.py, for example. Would save you a few helper functions as well.

A default formatter like this:

"format": "{levelname} {name} {message}",

Would then give you the following:

DEBUG cdh.files.db.descriptors FileDescriptor: Getting required_file for SingleFile object (1)

Making the prefix from the helper function a bit redundant.
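Concretely, a settings.py LOGGING dict along those lines might look like this (the handler and level choices are just an example, not a prescription for this project):

```python
import logging.config

LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        # '{'-style formatting, as in the suggestion above
        "simple": {"format": "{levelname} {name} {message}", "style": "{"},
    },
    "handlers": {
        "console": {"class": "logging.StreamHandler", "formatter": "simple"},
    },
    "loggers": {
        # Per-module control: e.g. show debug output from wrappers.py only
        "cdh.files.db.wrappers": {"level": "DEBUG", "handlers": ["console"]},
        # ...while keeping the rest of cdh.files quieter
        "cdh.files": {"level": "WARNING", "handlers": ["console"]},
    },
}

# Django applies this itself at startup; calling dictConfig directly
# demonstrates the same configuration outside of Django.
logging.config.dictConfig(LOGGING)
```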

Member Author

@tymees tymees Mar 2, 2026


The reasoning is mostly just following a pattern I started years ago; I can't even really say why I did it that way (probably just saw other libraries doing it like this).

But yes this PR does show the limits of this approach, so revisiting that pattern is probably a good idea. I'll try and rewrite it

_debug(f"Deleting FileWrapper {self.uuid} (save={save})")

if not self.storage.exists(self.name_on_disk):
_warning(f"File {self.name_on_disk} does not exist on disk, skipping deletion")
Contributor


This isn't new to this PR but I would like to ask why we abort deletion of metadata in this case?

Member Author


Because it may indicate a problem with the storage backend. In normal operations the file should always exist if there is a File entry for it, so if it's missing something went wrong.

As delete is very destructive, I decided it best to go for the safe route and abort. That way, if there is a storage problem that caused the file to (temporarily) disappear, the app will not forget about the file if the issue is later resolved. (And you can then do a safe delete)

if self.file_instance:
model = self.file_instance._meta.model
if model.objects.filter(Q(pk=self.file_instance.pk) | Q(uuid=self.uuid)).exists():
_error(f"FileWrapper {self.uuid} still has references in DB, this should not happen!")
Contributor


Perhaps this situation could raise an exception?

Member Author


That's a matter of philosophy I guess; currently the code only raises exceptions if the implementing developer interacts with the FileField in a way that's not supported/correct. (So in the public API)
If something goes wrong in the underlying code (the private API) I'm hesitant to throw exceptions because I don't want the implementing developer to have to anticipate problems in that part of the code.

However, if you feel like this should raise an exception, I'm happy to add one
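For the record, a sketch of what that could look like; the exception class, helper name, and `strict` flag are all hypothetical and not part of this PR:

```python
import logging

logger = logging.getLogger(__name__)


class DanglingFileReferenceError(RuntimeError):
    """Hypothetical: raised when a FileWrapper slated for deletion still has DB references."""


def ensure_unreferenced(has_references, uuid, strict=False):
    """Return True if deletion may proceed; log or raise when references remain."""
    if has_references:
        msg = f"FileWrapper {uuid} still has references in DB, this should not happen!"
        if strict:
            # Public-API style: surface the inconsistency to the caller
            raise DanglingFileReferenceError(msg)
        # Private-API style (current behavior): log the error and abort deletion
        logger.error(msg)
        return False
    return True
```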

Comment on lines +34 to +38
FILE_BLOCK_DELETION = getattr(
settings,
'CDH_FILES_FILE_BLOCK_DELETION',
False,
)
Contributor


Let's hope we never need this :D

# force=True means we will ALWAYS delete the file, even if the ORM still
# sees some references to it
instance.get_file_wrapper().delete(save=False, force=True)
logger.debug(f"Signals: pre_delete called for {sender.__name__}; deleting file for {instance}")
Contributor


This is now connected to the post_delete signal. Might also be better to just mention {sender} so that its __str__() can provide instance information.
