Same-table join DB optimizations. #6395

Open: dessalines wants to merge 43 commits into main from notification_optimizations

dessalines (Member) commented Mar 9, 2026

I've spent many days on these and have exhausted my SQL expertise. This really needs eyes other than mine, although the performance is at least passable for most things.

Comment table changes

  • This creates a community_id column on the comment table, which allows for faster moderator-view and subscription listings (i.e., we can use comment::community_id.eq_any).
  • This migration takes ~8 hours (even with all the indexes and triggers disabled and then re-enabled), and IMO doesn't bring much benefit other than keeping ListingType::ModeratorView and ListingType::Suggested from crashing out. It also balloons the size of the comment table.
  • Still, it may be a good idea to keep it, because it allows for faster joining to the community table. Removing it again and testing with a fresh DB would take days. There are also potential speed gains that others could find, in which case it'd be better to keep it on the table.
  • A better option might be to create a CommentListingType that omits those two types. Or even better, remove listing_type for comments entirely.
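
As a rough illustration of the first bullet, here is a minimal in-memory sketch in Rust, not the PR's actual Diesel query. The `Comment` struct, both function names, and the `post_community` map are hypothetical stand-ins. The first function models finding a comment's community through its post; the second models filtering on the denormalized column, which is what `comment::community_id.eq_any` allows.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical, simplified row: the real `comment` table has many more columns.
struct Comment {
    id: i32,
    post_id: i32,
    // Denormalized copy of the parent post's community id.
    community_id: i32,
}

// Without the column: each comment's community is found through its post,
// which stands in for joining `comment` to `post`.
fn ids_via_post(
    comments: &[Comment],
    post_community: &HashMap<i32, i32>,
    wanted: &HashSet<i32>,
) -> Vec<i32> {
    comments
        .iter()
        .filter(|c| {
            post_community
                .get(&c.post_id)
                .map_or(false, |community| wanted.contains(community))
        })
        .map(|c| c.id)
        .collect()
}

// With the column: the filter reads only the comment row, mirroring
// `comment::community_id.eq_any(...)`.
fn ids_direct(comments: &[Comment], wanted: &HashSet<i32>) -> Vec<i32> {
    comments
        .iter()
        .filter(|c| wanted.contains(&c.community_id))
        .map(|c| c.id)
        .collect()
}
```

Both functions return the same ids as long as the denormalized column is kept in sync with post.community_id; the difference is only in how much data each filter has to touch.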

Notification / report_combined

  • Adds community_id and instance_id columns to notifications, and makes every insert into notifications fill the relevant fields as well. These ended up not bringing big speed improvements, but they did simplify the joins.
  • Does the same type of operation for report_combined, where, again, the improvements were not very large.
  • A lot of the time spent in report_combined and notifications goes to map_to_enum and building the large objects, while the SQL queries themselves stayed below 100ms.
  • We could do what we did for modlog and build just the Comment rather than full object views like CommentView. This would save a lot of joins and still provide the core comment info. It just wouldn't show things like your votes for that comment, whether the comment creator is banned, etc.

Person saved / liked / content combined

  • The biggest speed improvements were here. Adding the extra post_id, especially on rows filled with a comment, reduced a cascading OR condition in the joins.
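
One way to read this bullet, sketched in Rust under stated assumptions: `CombinedRow`, both function names, and the `comment_post` map are hypothetical, and the real change lives in SQL joins rather than in application code. If a combined-table row for a comment carries no post_id, the post has to be reached either directly or through the comment, which is where an OR appears. Once post_id is filled on comment rows too, a single equality is enough.

```rust
use std::collections::HashMap;

// Hypothetical, simplified row of a combined table.
struct CombinedRow {
    post_id: Option<i32>,
    comment_id: Option<i32>,
}

// Before: comment rows carry no post_id, so the post is reached either
// directly or through the comment (the "OR" in the join condition).
fn post_id_with_fallback(
    row: &CombinedRow,
    comment_post: &HashMap<i32, i32>,
) -> Option<i32> {
    row.post_id.or_else(|| {
        row.comment_id
            .and_then(|comment_id| comment_post.get(&comment_id).copied())
    })
}

// After: post_id is filled on comment rows too, so one lookup is enough.
fn post_id_direct(row: &CombinedRow) -> Option<i32> {
    row.post_id
}
```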

Full speed results are below. I'm mainly not happy with getting comments for a post, listing notifications, and listing reports, but I don't think there's much more I can do there. I played around A LOT with joins, indexes, and filters, and was unable to do better than this.

List posts with different sorts

sort time
active 102ms
hot 100ms
new 100ms
old 77ms
top 74ms
most_comments 74ms
new_comments 86ms
controversial 75ms
scaled 97ms

List posts for a community with different sorts

sort time
active 72ms
hot 74ms
new 73ms
old 71ms
top 73ms
most_comments 72ms
new_comments 73ms
controversial 71ms
scaled 73ms

List posts with different listing types

type time
all 115ms
local 97ms
subscribed 96ms
moderator_view 74ms
suggested 77ms

List posts with show hidden

show hidden: 104ms

List posts with hide read

show read: 103ms

List posts with higher pages

page # time
0 99ms
1 98ms
2 101ms
3 95ms
4 93ms
5 97ms
6 93ms
7 92ms
8 100ms
9 100ms
avg 97ms

List posts for a multi-community with different sorts

sort time
active 81ms
hot 81ms
new 78ms
old 70ms
top 74ms
most_comments 75ms
new_comments 83ms
controversial 73ms
scaled 81ms

List communities with different sorts

sort time
active_six_months 60ms
active_monthly 60ms
active_weekly 58ms
active_daily 60ms
hot 56ms
new 55ms
old 55ms
name_asc 56ms
name_desc 56ms
comments 53ms
posts 56ms
subscribers 46ms
subscribers_local 23ms

List communities with different listing types

type time
all 60ms
local 30ms
subscribed 41ms
moderator_view 16ms
suggested 12ms

List multi-communities with different sorts

sort time
new 6ms
old 5ms
name_asc 6ms
name_desc 6ms
communities 6ms
subscribers 6ms
subscribers_local 6ms

Get a community

get community: 23ms

Get a post

type time
url post 60ms
text post 37ms

Get comments for a post with different sorts, tree

sort time
hot 446ms
top 449ms
new 448ms
old 440ms
controversial 441ms

Get comments for a post with different sorts, flat

sort time
hot 421ms
top 421ms
new 419ms
old 413ms
controversial 415ms

Get comments for a post slim

getCommentsSlim: 370ms

Get all comments with different sorts

sort time
hot 81ms
top 77ms
new 91ms
old 76ms
controversial 84ms

Get comments with different types

type time
all 123ms
local 82ms
subscribed 117ms
moderator_view 97ms
suggested 74ms

List person content with types

type time
all 82ms
comments 76ms
posts 81ms

List person saved with types

type time
all 81ms
comments 77ms
posts 81ms

List person liked with types

type time
all 125ms
comments 122ms
posts 125ms

type time
all 124ms
liked_only 124ms
disliked_only 131ms

List person read

list person read: 67ms

List person hidden

list person hidden: 26ms

List registration applications

type time
unread only 10ms
all 16ms

List reports

type time
unresolved only 126ms
all 340ms

List reports with type

type time
all 360ms
posts 390ms
comments 382ms
private_messages 295ms
communities 123ms

Search with types

type sort time
post active 165ms
post hot 167ms
post new 171ms
post old 85ms
post top 121ms
post most_comments 177ms
post new_comments 158ms
post controversial 99ms
post scaled 165ms
comment hot 106ms
comment top 103ms
comment new 105ms
comment old 74ms
comment controversial 94ms
community active_six_months 72ms
community active_monthly 73ms
community active_weekly 71ms
community active_daily 70ms
community hot 63ms
community new 63ms
community old 52ms
community name_asc 63ms
community name_desc 62ms
community comments 68ms
community posts 71ms
community subscribers 70ms
community subscribers_local 55ms
multi community new 6ms
multi community old 5ms
multi community name_asc 5ms
multi community name_desc 5ms
multi community communities 5ms
multi community subscribers 5ms
multi community subscribers_local 5ms
person new 26ms
person old 25ms
person post_score 26ms
person comment_score 35ms

Notifications with types

type time
all 218ms
mention 225ms
reply 219ms
subscribed 144ms
private_message 302ms
mod_action 146ms

Notifications with unread only

type time
all 221ms
unread_only 208ms

Liking a comment / post

type time
comment 60ms
post 61ms

Get modlog with types

type time
all 35ms
admin_add 27ms
admin_ban 29ms
admin_allow_instance 73ms
admin_block_instance 29ms
admin_purge_comment 40ms
admin_purge_community 35ms
admin_purge_person 34ms
admin_purge_post 42ms
mod_add_to_community 320ms
mod_ban_from_community 30ms
admin_feature_post_site 41ms
mod_feature_post_community 238ms
mod_change_community_visibility 29ms
mod_lock_post 287ms
mod_remove_comment 30ms
admin_remove_community 45ms
mod_remove_post 25ms
mod_transfer_community 45ms
mod_lock_comment 29ms

Context: #5929

ADD COLUMN instance_id int REFERENCES instance (id) ON UPDATE CASCADE ON DELETE CASCADE,
ADD COLUMN community_id int REFERENCES community (id) ON UPDATE CASCADE ON DELETE CASCADE,
-- drop the check constraint
DROP CONSTRAINT notification_check;
Member

This should be added back, similar to modlog_check, with different values depending on kind.

Member Author

Once I get the performance stuff done I'll re-add this.

Member Author

K done, and it works on prod lemmy.ml data.

Comment on lines +3 to +8
ALTER TABLE comment
ADD COLUMN community_id int REFERENCES community (id) ON UPDATE CASCADE ON DELETE CASCADE;

-- Fill the rows
UPDATE
comment AS c
dessalines (Member Author) commented Mar 23, 2026

I've been running this on my dev machine for 3 days now, and it's still not done.

Really not sure what to do about it, because I do believe it'd massively speed up our comment queries.

And removing community from the CommentView (and just making CommentViewSlim the default) isn't really an option IMO unless we want to remove all community-type context from comments, which means comments couldn't be viewed from /home (at least not with the community context shown) or /community.

EDIT: This could have been due to the trigger on comment filling search_combined, which is gone now in main. I'm rerunning.

dessalines changed the title from "Some notification optimizations" to "Same-table join DB optimizations." on Mar 24, 2026
dessalines (Member Author) commented:

Finally finished the long migration adding community_id to the comment table. It sped up a few things, but I also found some other things to speed up.

dessalines (Member Author) commented Mar 25, 2026

Comment listings for a post, report_combined, and notifications are still slow after this. It's been a long day trying to figure these out.

EDIT: Finally figured out speeding up comments.

dessalines marked this pull request as ready for review on March 27, 2026, 14:07
dessalines requested a review from Nutomic on March 27, 2026, 14:07
pub federation_pending: bool,
/// Whether the comment is locked.
pub locked: bool,
/// This field is a dupe of post.community_id, but necessary for join performance.
Member

This is the API documentation, but the comment is completely irrelevant for API clients, so it's better to leave it out here, or use a simple // comment. You could also consider marking these fields as #[serde(skip)] so we can remove or change them in the future without any breaking API change.

Nutomic (Member) commented Apr 7, 2026

Is this ready to merge? It's not entirely clear from your last comment. You mentioned that the migration is very slow and bloats the table; what are the duration and table size now?

The comment queries look quite slow compared to others. Could this simply be because they are returning more data (in bytes) compared to your other tests? This could explain why Multi-community queries are very fast compared to Communities, even though queries are very similar.

I also wonder how long GetComments takes on 0.19 with the same setup, to see whether 400ms is reasonable or not.

It's very odd that some modlog queries take ~300ms, while most of them are around 30ms.


Development

Successfully merging this pull request may close these issues:

  • Speed up comment list queries
  • Speed up notification queries