Cache H264/H265 GOPs in order to allow readers to decode frames immediately by jean343 · Pull Request #4189 · bluenviron/mediamtx

jean343 · 2025-01-23T01:15:47Z

GOP Cache

This PR introduces Group of Pictures (GOP) caching to MediaMTX, enhancing its performance and reducing latency in streaming scenarios. By caching the last GOP for each stream, new subscribers can immediately receive the latest video data without waiting for the next keyframe, improving the user experience, especially for streams with long keyframe intervals.

This works for both H264 and H265, as well as for RTSP and WebRTC.

Configurable Cache Settings:

Introduced a new configuration parameter gopCache in mediamtx.yml for enabling/disabling GOP caching.

Fix: #1209

aler9 · 2025-01-24T08:35:14Z

This is great work. The important thing is to confirm that the feature works, and in which scenarios (protocols, codecs, with/without B-frames).
We can then adjust the details. The ones that come to mind are:

check on the GOP size to prevent RAM exhaustion
additional codecs, at least AV1

jean343 · 2025-01-25T01:55:36Z

Thanks @aler9 for your kind comment!

The feature does work in the following scenarios:

Protocol:

WebRTC
RTSP

Codecs:

H.264
H.265

I have only tried videos without B-frames, as WebRTC does not support B-frames. Adjustments might be needed when dealing with B-frames over RTSP.

To limit RAM usage, we do not cache anything until we receive a key frame. This prevents unsupported codecs from storing anything, and saves a little memory for supported codecs.
If the GOP is really long, we trim the cache at 512 packets to conserve memory. In that case the cached GOP is no longer complete, so clients need to wait until the next key frame before video playback starts.
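The two rules above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the `unit` type, the `gopCache` struct, and the `push` and `decodable` methods are hypothetical names; only the 512-packet limit and the "start caching on a key frame" rule come from the description above.

```go
package main

import "fmt"

// maxCachedPackets mirrors the 512-packet limit described above.
const maxCachedPackets = 512

// unit is a hypothetical stand-in for a cached video packet.
type unit struct {
	keyFrame bool
	payload  []byte
}

// gopCache is a hypothetical cache holding the packets of the current GOP.
type gopCache struct {
	units []unit
}

// push stores nothing before the first key frame, restarts the cache on
// every key frame, and trims the oldest packets once the limit is exceeded.
func (c *gopCache) push(u unit) {
	if u.keyFrame {
		c.units = c.units[:0]
	} else if len(c.units) == 0 {
		return // no key frame seen yet
	}
	c.units = append(c.units, u)
	if l := len(c.units); l > maxCachedPackets {
		c.units = c.units[l-maxCachedPackets:]
	}
}

// decodable reports whether the cache still starts with a key frame.
// After trimming, the key frame is gone and readers must wait for the next one.
func (c *gopCache) decodable() bool {
	return len(c.units) > 0 && c.units[0].keyFrame
}

func main() {
	c := &gopCache{}
	c.push(unit{})               // ignored: no key frame yet
	c.push(unit{keyFrame: true}) // starts the GOP
	c.push(unit{})
	fmt.Println(len(c.units), c.decodable()) // 2 true
}
```

Once trimming removes the leading key frame, `decodable` returns false until the next key frame resets the cache.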

As for additional codecs, I could not find a reliable way to detect their key frames.

In the WebRTC playback scenario, the PTS and timestamp need to be modified to prevent gaps. WebRTC will pause and stop playback if they are set incorrectly.

For example, an incorrect timestamp looks like:

Untitled.mov

A correct timestamp looks like:

Screen.Recording.2025-01-23.at.1.40.16.PM.mov
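One possible reading of "modified to prevent gaps" is that cached frames get new, evenly spaced RTP timestamps that end right before the first live frame. The sketch below illustrates that reading only; the `restamp` function and its parameters are hypothetical and are not taken from the PR.

```go
package main

import "fmt"

// restamp returns new RTP timestamps for n cached frames so that they are
// evenly spaced by step ticks and the last one sits one step before the
// first live frame. uint32 arithmetic wraps around, as RTP timestamps do.
func restamp(n int, firstLive uint32, step uint32) []uint32 {
	out := make([]uint32, n)
	for i := range out {
		out[i] = firstLive - uint32(n-i)*step
	}
	return out
}

func main() {
	// 3 cached frames; the live stream continues at timestamp 90000.
	// 3000 ticks correspond to 1/30 s on a 90 kHz video clock.
	fmt.Println(restamp(3, 90000, 3000)) // [81000 84000 87000]
}
```

A smaller `step` makes the catch-up phase shorter but plays the cached frames faster, which is the trade-off discussed later in this thread.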

yairzahavi · 2025-03-04T09:44:44Z

Some fixes needed to align with the base branch are missing.
I made them in my own fork, since I needed both the GOP cache and the base branch alignment.

You can reference them here:
#4282

angry-beaver · 2025-03-10T09:56:32Z

@aler9 @jean343 This is a cool feature. Do you have any thoughts on when it will be completed?

aler9 · 2025-03-10T09:58:17Z

@angry-beaver I need to finish a couple of other things, then I'll focus on this. In the meantime, @jean343 and @yairzahavi can try to go on by themselves.

yairzahavi · 2025-03-10T10:06:33Z

> @angry-beaver I need to finish a couple of other things, then I'll focus on this. In the meantime, @jean343 and @yairzahavi can try to go on by themselves.

@jean343 I'll try to do this soon, including the AV1 support and the RAM exhaustion prevention, and I'll ping you for a review. 🙌

jean343 · 2025-03-14T00:18:55Z

Thanks everyone for the help. I merged from master and fixed the build!

@yairzahavi, the AV1 work is awesome. I merged the AV1 work into this branch and fixed the merge conflicts. I did not change the CacheLength logic, as it's not clear that the additional logic helps performance.

We should aim at making a final PR as coauthors!

yairzahavi · 2025-03-16T09:23:10Z

internal/stream/stream_format.go

+		if s.CachedUnits != nil {
+			s.CachedUnits = append(s.CachedUnits, u)
+		}
+		l := len(s.CachedUnits)
+		if l > maxCachedGOPSize {
+			s.CachedUnits = s.CachedUnits[l-maxCachedGOPSize:]
+			sf.decodeErrLogger.Log(logger.Warn, "GOP cache is full, dropping packets")
+		}
+	}


@jean343

> Thanks everyone for the help. I merged from master and fixed the build!
>
> @yairzahavi, the AV1 work is awesome. I merged the AV1 work into this branch and fixed the merge conflicts. I did not change the CacheLength logic, as it's not clear that the additional logic helps performance.

You are right that the change is not clear. Additionally, it doesn't really work, and I have yet to figure out why.

The reason I tried to change it is that I inspected this code block, and it seems there is another memory allocation when you surpass maxCachedGOPSize.
If you truncate afterwards, the memory allocation has already happened.
Additionally, I tried to reduce memory allocations and usage by allocating a fixed size.

I hope you have a solution or idea for this.

jean343 · 2025-03-16

We could drop the entire cache once it reaches maxCachedGOPSize. That would avoid the extra memory allocations, and once the cache reaches maxCachedGOPSize the GOP is incomplete anyway, so the video player will need to wait for the next key frame regardless.
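The drop-on-overflow idea could look roughly like this. It is a sketch under assumptions: `boundedCache`, `startGOP`, and `add` are hypothetical names, and packets are modeled as plain byte slices. The backing array is allocated once, so `append` never grows it.

```go
package main

import "fmt"

const maxCachedGOPSize = 512

// boundedCache is a hypothetical fixed-capacity GOP cache.
type boundedCache struct {
	units    [][]byte
	overflow bool // true after the current GOP exceeded the limit
}

// newBoundedCache allocates the backing array once, up front.
func newBoundedCache() *boundedCache {
	return &boundedCache{units: make([][]byte, 0, maxCachedGOPSize)}
}

// startGOP resets the cache with the key frame that opens a new GOP.
func (c *boundedCache) startGOP(key []byte) {
	c.units = append(c.units[:0], key)
	c.overflow = false
}

// add stores a non-key packet. When the cache is already full, the whole
// cache is dropped and further packets are ignored until the next key frame.
func (c *boundedCache) add(p []byte) {
	if c.overflow || len(c.units) == 0 {
		return
	}
	if len(c.units) == maxCachedGOPSize {
		c.units = c.units[:0]
		c.overflow = true
		return
	}
	c.units = append(c.units, p)
}

func main() {
	c := newBoundedCache()
	c.startGOP([]byte{0x65})
	for i := 0; i < maxCachedGOPSize; i++ {
		c.add([]byte{0x41})
	}
	fmt.Println(len(c.units), cap(c.units), c.overflow) // 0 512 true
}
```

One caveat of this sketch: after a reset, the backing array still references the old packets until they are overwritten, so they are not garbage collected immediately.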

yairzahavi · 2025-03-16T09:24:18Z

Another point I noticed is that the GOP cache causes higher WebRTC jitter,
so I'm considering whether and how to solve this as well.

aler9 · 2025-03-16T12:45:03Z

Hello, I've tested the patch. While the working principle is present, there are some aspects that can be improved:

1. The startup phase is not pleasant, as all past frames of the GOP are shown and played very fast (see attached video). I understand that this behavior is needed as a workaround to prevent freezing, but there must be some alternative. For instance, we had a similar problem when implementing the Playback server. In that case, the duration of past frames is set to zero, because the transport format (which is MP4) allows that (more or less!). However, I know that it's not possible to do that with most streaming protocols. An alternative might be grouping multiple H264 access units into one big H264 access unit that contains all frames.
2. Audio sync: linked to point 1. If you add or remove some delta T from video, then audio will get desynced.
3. RTSP: this patch does not cover RTSP. In RTSP, packets are not written to individual readers; they are sent by calling ServerStream.WritePacketRTP once, in order to support the multicast transport, in which packets are sent once to the network.
   There should be a mechanism by which, when an RTSP reader is created and the transport protocol is not multicast, GOP frames are sent through ServerSession.WritePacketRTP.
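The grouping alternative mentioned in point 1 could be sketched as a plain concatenation of NAL units. This only illustrates the idea; the `accessUnit` type and `mergeAccessUnits` function are hypothetical, and whether decoders and players accept such a merged access unit is exactly what would still need to be tested.

```go
package main

import "fmt"

// accessUnit is a hypothetical representation of one frame:
// an ordered list of NAL units.
type accessUnit [][]byte

// mergeAccessUnits concatenates the NAL units of all cached access units,
// in decoding order, into a single large access unit.
func mergeAccessUnits(gop []accessUnit) accessUnit {
	n := 0
	for _, au := range gop {
		n += len(au)
	}
	merged := make(accessUnit, 0, n)
	for _, au := range gop {
		merged = append(merged, au...)
	}
	return merged
}

func main() {
	// SPS, PPS and IDR slice, followed by two non-IDR frames.
	gop := []accessUnit{{{0x67}, {0x68}, {0x65}}, {{0x41}}, {{0x41}}}
	fmt.Println(len(mergeAccessUnits(gop))) // 5
}
```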

The GOP caching feature has always been difficult to implement because it has to take into consideration how players react when receiving a bunch of access units at the same time. It involves testing all possible ways to send the GOP, digging into source code of all players and codec specifications.

The feature can be merged into the main branch only when a high level of compatibility with all major protocols and players is reached.

out.mp4

jean343 · 2025-03-16T14:39:36Z

Thank you @aler9 for testing and for your feedback.

I did not expect to uncover this many corner cases when I started implementation :)

I agree that playing back all frames very quickly is not optimal. In our testing, where the GOP setting is ~12s, it still ends up being better than waiting up to 12s for video playback.
As you know, we can't set the duration to 0s, and because the transport is UDP, we cannot send an unlimited number of packets instantly.
We have tested sending a key frame and dropping all P-frames until the next key frame. In this case, playback starts instantly and, obviously, freezes until the next key frame. This indicates that we could possibly combine the P-frames into one...
It would be amazing if you could help with merging the frames together into one!
I do not believe audio desync should be a problem, because we do not play audio back during catch-up.
Unicast RTSP is supported, and in this case, playing back all frames at the same time does work, which is great.
Multicast RTSP should not have the GOP cache enabled. Is there a check in the code I could add to disable that portion?
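Such a check might be as small as the following sketch. The `transport` type and its constants are hypothetical placeholders defined here for illustration; they are not the identifiers used by MediaMTX or its RTSP library.

```go
package main

import "fmt"

// transport is a hypothetical enumeration of RTSP transports.
type transport int

const (
	transportUDP transport = iota
	transportUDPMulticast
	transportTCP
)

// shouldReplayGOP reports whether cached packets should be sent to a new
// reader: the feature must be enabled, the cache must not be empty, and the
// reader must not be on multicast, where packets are sent once to the network.
func shouldReplayGOP(enabled bool, t transport, cachedPackets int) bool {
	return enabled && cachedPackets > 0 && t != transportUDPMulticast
}

func main() {
	fmt.Println(shouldReplayGOP(true, transportTCP, 120))          // true
	fmt.Println(shouldReplayGOP(true, transportUDPMulticast, 120)) // false
}
```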

hugeleaf · 2025-04-22T09:25:41Z

internal/stream/stream_format.go

 	atomic.AddUint64(s.bytesReceived, size)

+	if sf.gopCache && medi.Type == description.MediaTypeVideo {
+		if isKeyFrame(u) {


Would it be better to start recording from the SPS/PPS? Or save the most recent SPS/PPS and prepend them here.
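To make the question concrete, here is a sketch of how a key frame and its parameter sets can be recognized from NAL unit headers. The NAL unit type values are the standard ones (H.264: 5 = IDR, 7 = SPS, 8 = PPS; H.265: 16 to 21 = IRAP pictures), but `inspectH264` and `isH265IRAP` are hypothetical helpers, not the `isKeyFrame` function from the diff above.

```go
package main

import "fmt"

// H.264 NAL unit types relevant for random access.
const (
	h264NALUTypeIDR = 5
	h264NALUTypeSPS = 7
	h264NALUTypePPS = 8
)

// inspectH264 reports whether an access unit contains an IDR slice, and
// whether it also carries both SPS and PPS.
func inspectH264(au [][]byte) (idr, hasParams bool) {
	var sps, pps bool
	for _, nalu := range au {
		if len(nalu) == 0 {
			continue
		}
		switch nalu[0] & 0x1F { // low 5 bits of the first byte hold the type
		case h264NALUTypeIDR:
			idr = true
		case h264NALUTypeSPS:
			sps = true
		case h264NALUTypePPS:
			pps = true
		}
	}
	return idr, sps && pps
}

// isH265IRAP reports whether an H.265 NAL unit is a random access picture
// (BLA, IDR or CRA, types 16 to 21). The type is in bits 1..6 of byte 0.
func isH265IRAP(nalu []byte) bool {
	if len(nalu) < 2 {
		return false
	}
	t := (nalu[0] >> 1) & 0x3F
	return t >= 16 && t <= 21
}

func main() {
	idr, params := inspectH264([][]byte{{0x67}, {0x68}, {0x65}})
	fmt.Println(idr, params)                    // true true
	fmt.Println(isH265IRAP([]byte{0x26, 0x01})) // true
}
```

If `idr` is true but `hasParams` is false, the cache would need the most recently seen SPS/PPS in front of the key frame for a new reader to decode it.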

JeDiE99 · 2025-06-06T21:09:59Z

mediamtx.yml

 udpMaxPayloadSize: 1472
+# Enable GOP cache to improve initial playback experience for new clients.
+# Note: will increase memory usage.
+gopCache: false


Maybe this needs a size parameter (in bytes? per path?) for more transparent memory control?

jean343 · 2025-06-27

The cache is in-memory and does not need a path. I like the idea of specifying the cache size, in bytes or packets.
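A byte-based limit could be sketched like this. Everything here is hypothetical: the `byteBudgetCache` type, its `maxBytes` field (which could be fed by a new setting), and the choice to discard the whole GOP once the budget is exceeded.

```go
package main

import "fmt"

// byteBudgetCache is a hypothetical GOP cache limited by total payload size.
type byteBudgetCache struct {
	maxBytes int
	size     int
	packets  [][]byte
	valid    bool // true while the cache holds a complete GOP prefix
}

// push restarts the cache on a key frame and discards the whole GOP as soon
// as the byte budget would be exceeded.
func (c *byteBudgetCache) push(p []byte, keyFrame bool) {
	if keyFrame {
		c.packets, c.size, c.valid = c.packets[:0], 0, true
	}
	if !c.valid {
		return
	}
	if c.size+len(p) > c.maxBytes {
		c.packets, c.size, c.valid = c.packets[:0], 0, false
		return
	}
	c.packets = append(c.packets, p)
	c.size += len(p)
}

func main() {
	c := &byteBudgetCache{maxBytes: 10}
	c.push(make([]byte, 4), true)
	c.push(make([]byte, 4), false)
	fmt.Println(len(c.packets), c.size) // 2 8
	c.push(make([]byte, 4), false)      // exceeds the budget: cache is dropped
	fmt.Println(len(c.packets), c.size) // 0 0
}
```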

PsymoNiko

Great for better performance

tuan3w · 2025-11-22T02:05:24Z

It would be interesting to see this feature merged. Is there any progress here? Thanks

jean343 · 2025-11-22T02:10:05Z

> It would be interesting to see this feature merged. Is there any progress here? Thanks

I would definitely like to see this being merged.

I cannot find a good solution for issue 1 in this comment: if we accelerate playback too much, the client refuses to load the video.
#4189 (comment)

Maybe we could document this limitation and leave the feature disabled by default.

Javier-d98 · 2025-12-04T10:48:08Z

This feature would be great for me too.
I think the current implementation is good enough, especially if it's disabled by default as @jean343 suggests.

xiaoxuan010 · 2026-02-07T20:08:34Z

> I agree that playing back all frames very quickly is not optimal. In our testing, where the GOP setting is ~12s, it still ends up being better than waiting up to 12s for video playback.
> As you know, we can't set the duration to 0s, and because the transport is UDP, we cannot send an unlimited number of packets instantly.
> We have tested sending a key frame and dropping all P-frames until the next key frame. In this case, playback starts instantly and, obviously, freezes until the next key frame. This indicates that we could possibly combine the P-frames into one...
> It would be amazing if you could help with merging the frames together into one!

Thanks for your contribution! It seems that handling the previous frames (those that occur before the new reader connects to the server) is a big problem. The idea of combining the P-frames reminded me of a paper that implements the feature in a very similar and effective way.

In short, the article introduces a video decoder and a temporary video encoder on the server side, so that newcomers always receive the newest frames (re-encoded from the normal stream into a temporary GOP that starts with an I-frame). I believe this mechanism is effective and elegant. The only problem is that it needs something like FFmpeg on the server side, and I'm not sure whether we could implement it through something like an external command hook rather than embedding the decoder and encoder inside MediaMTX.

Refs:

jean343 added 2 commits January 22, 2025 19:55

Add GOP cache.

5d63214

Update stream.go

4b104c1

jean343 added 5 commits January 27, 2025 09:12

Merge branch 'main' into SL-1800-Cache-the-GOP-in-MediaMTX

5e0be0d

Restart on GOP setting change

64801a1

Merge branch 'main' into SL-1800-Cache-the-GOP-in-MediaMTX

36fad69

Update session.go

61a8614

Merge branch 'main' into SL-1800-Cache-the-GOP-in-MediaMTX

5c1d09e

yairzahavi mentioned this pull request Feb 27, 2025

Feature/mediamtx gop cache #4282

Closed

jean343 added 3 commits March 13, 2025 20:50

Merge remote-tracking branch 'upstream/main' into SL-1800-Cache-the-GOP-in-MediaMTX

d72fd90

Merge updates and fix build

27edd47

Update stream_format.go

cfdeebe

yairzahavi reviewed Mar 16, 2025

jean343 added 6 commits March 18, 2025 13:10

Merge remote-tracking branch 'upstream/main' into SL-1800-Cache-the-GOP-in-MediaMTX

c753226

Update stream.go

e5284c4

Update session.go

518c105

up

e6b8094

Update server_test.go

41fd5f0

Check for empty GetRTPPackets

aaffa28

hugeleaf reviewed Apr 22, 2025

JeDiE99 reviewed Jun 6, 2025

PsymoNiko reviewed Oct 12, 2025
