encode & decode 32-bit float (IEEE 754 standard) samples. by aliheidary1381 · Pull Request #842 · xiph/flac

aliheidary1381 · 2025-08-11T17:04:25Z

I've also added support to encode & decode raw, wave riff, & aiff float formats.

The float encoding feature achieves a near 70% compression ratio, which is better than nothing. No oss-fuzz or tests have been added yet (sorry). The replay gain feature should also be expanded in the future to support the new feature. Sorry for the big PR. It's pretty readable tho! I tried to make it as modular & independent as possible. Documentation is good. Should this make it to a release version, an update to the standard RFC could also be considered. The changes are backwards-compatible and are as follows:

1. When storing float samples, the bps bits in the streaminfo metadata block should be 0b00000.
2. When storing float samples, the zero-padded bit, originally reserved, after the bit-depth bits of each frame header should be 1.
3. When storing float samples, the actual data samples stored in each subframe are obtained by doing some bit manipulation before encoding and after decoding. That part is in src/libFLAC/transform_float.c and is necessary to boost the compression ratio (more info in the file comments). Monkey's Audio .ape does something similar, but this one seems to achieve better ratios (~8%). Sorry for the unorthodox type conversions. Will fix it if I see it being considered for a merge.
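The comment does not spell out the bit manipulation itself (it lives in src/libFLAC/transform_float.c), so the following is not the PR's transform. It is only a minimal sketch of the general idea of a lossless float-to-integer mapping applied before encoding and undone after decoding, using a well-known order-preserving remap as a stand-in. The helper names are hypothetical, and `memcpy` is used as the well-defined way to reinterpret a float's bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, NOT taken from the PR: map a binary32 float to an
 * unsigned integer whose ordering matches the float's numeric ordering. */
static uint32_t float_to_ordered_u32(float f)
{
    uint32_t u;
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&u, &f, sizeof u);            /* reinterpret bits without aliasing issues */
    /* negative values: flip every bit; non-negative values: set the sign bit */
    return (u & 0x80000000u) ? ~u : (u | 0x80000000u);
}

/* Exact inverse of float_to_ordered_u32(), so the round trip is lossless. */
static float ordered_u32_to_float(uint32_t u)
{
    float f;
    u = (u & 0x80000000u) ? (u & 0x7FFFFFFFu) : ~u;
    memcpy(&f, &u, sizeof f);
    return f;
}
```

Because the inverse restores every bit, any such mapping keeps the stream lossless; which mapping compresses best is a separate question that the PR's own file comments address.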


ktmf01 · 2025-08-12T10:14:58Z

Hi,

Many thanks for filing a PR. I haven't tested it, and as you might understand, this will need a lot of testing :) I've tried doing this in the past, but didn't get very far.

Float PCM coding has been an often-requested feature, so this would be a great addition in that regard. I have some objections on principle to floating point audio in FLAC. For example, floating point audio isn't meant for playback. It also breaks forward compatibility. It feels to me like this will cause confusion for a lot of users, as these FLAC files will not play back on current equipment. But then again, that was also the case (and still is for some newly manufactured equipment) for 24-bit audio. I'm not saying this won't be merged, but I'll need to hear a few opinions of other people involved in the FLAC project. Still, I already know a lot of people want this.

I am surprised you didn't need any new coding methods (subframe types or residual coding methods). I think this is really nice.

FLAC is also used for compressing measurement signals (for scientific experiments and such), where this might also be useful. I'm not sure whether your bit manipulation works for that as well.

After I test this myself (which might take a while), I would like to propose this change through various channels with FLAC enthusiasts. However, before that, I would like to make the format changes future-proof. The problem is that the reserved bit after the bit depth bits that you use is the last remaining reserved bit. Using it makes it impossible for the frame header to be extended any further in the future. I would like this bit to have a different function: signal an extension to the frame header with one byte of extra flag bits (or feature bits) and a channel mask.

Anyway, many thanks for taking on this task. Please be patient, as merging might take quite a while, and publishing a release after that a while longer.

ktmf01 · 2025-08-12T10:51:27Z

By the way, perhaps we should also define a new STREAMINFO metadata block variant.

aliheidary1381 · 2025-08-12T13:30:41Z

Thx!

> The problem is that the reserved bit after the bit depth bits that you use is the last remaining reserved bit. Using it makes it impossible for the frame header to be extended any further in the future. I would like this bit to have a different function: signal an extension to the frame header with one byte of extra flag bits (or feature bits) and a channel mask.

There's also another reserved bit in the frame header, just after the sync bits at the very beginning (FLAC__FRAME_HEADER_RESERVED_LEN). That one suits better for your proposed (alternative) functionality, as it is closer to the beginning.
I think using the sample type indicator fits better in this place (FLAC__FRAME_HEADER_ZERO_PAD_LEN, right after the sample bps), IMHO.

> By the way, perhaps we should also define a new STREAMINFO metadata block variant.

I don't think it'd be necessary. A bps of 1-3 is forbidden, according to the current standard rfc. Using them seems like a no-brainer to me. It also suggests (to the older decoders) to stop playing these (cause of the forbidden value), and requires minimal changes in the standard and other implementations.

p.s. Sorry for closing the PR, wrong button 😅

ktmf01 · 2025-08-12T14:41:41Z

> There's also another reserved bit in the frame header

No, there isn't. The code still refers to it, but the RFC made it part of the sync code. That is to make it distinguishable from MPEG. See here: https://lists.xiph.org/pipermail/flac-dev/2008-December/002607.html So, we only have one bit remaining.

> > By the way, perhaps we should also define a new STREAMINFO metadata block variant.
>
> I don't think it'd be necessary. A bps of 1-3 is forbidden, according to the current standard rfc. Using them seems like a no-brainer to me. It also suggests (to the older decoders) to stop playing these (cause of the forbidden value), and requires minimal changes in the standard and other implementations.

When it is necessary to deviate from the standard, I'd like to do it in a clear and conscious way. So, there'll be more features incorporated in a new streaminfo metadata block, like increasing the max number of samples or perhaps increasing the max samplerate and the total number of samples. Still really niche stuff, so it is really only necessary for exotic stuff. The thing is, like using floats (which is a niche), there are others using FLAC for stuff it wasn't made for, like RF captures.

So, this will take a long time.

aliheidary1381 · 2025-08-12T15:38:07Z

> No, there isn't.

Oh, OK.
There's also a reserved bit pattern for the frame header's bit depth bits (0b011). How about doing something similar to what's done to the streaminfo bps bits?

H2Swine · 2025-08-16T12:25:22Z

I completely agree both on

- cool! and
- think over this more than just once.

There is also the 64-bit float format (in both endiannesses) - not that it is any more urgently needed for listening, but for compatibility (edit: with DAW plugins, for example) it wouldn't be a bad thing for a FLAC plugin to be able to handle "everything" the application could save. (64-bits would likely need more than 5-bit Rice, but who cares if that element also makes today's decoders err out on something they cannot decode.)

Here is a part of a possible solution, if we adopt the following view:

> For example, floating point audio isn't meant for playback.

The FLAC format admits sample rate "0" for non-audio. It could also be used for "audio in files that are stored as non-audio", with the FLAC format being used to compress the file and not just the audio stream: interpret the use of "0" as "files you need to treat as a full file", and a player/DAW should then skip it unless it knows what it is doing.

There are a bunch of APPLICATION block types left to be used. You got ones for foreign metadata already, but here you might consider ones for mandatory file headers/footers (I think footer would be potentially more crucial for float than for integer, with possible metadata chunks for volume?!), and maybe one for audio properties the decoder needs to reconstruct the files (endianness, and signedness although that isn't applicable for float) - and source file extension (like WavPack does)?

So the workflow of a "player"/DAW would then be:

Step 0: Read the "0" sample rate. If it doesn't know there could be something for it, then treat it as per current RFC, refuse to recognize it as anything playable. Otherwise:
Step 1: Read and understand those APPLICATION block types and the content - or err out. If it continues:
Step 2: Receive the "audio"; likely then flac-the-decoder should know enough to reorder endianness (/signedness) and interleave the channels
Step 3: Treat the bitstream as if it were the original file, from the beginning of the file headers.

This could also make it possible to store AIFF(/CAF) with non-integer sampling rates - or object-based audio in the BW64 format (like Monkey's and WavPack do), as there would be no need for FLAC to "understand" the objects metadata. Sure the _en_coder does need to know what a sample and a channel is, and if it is not aware of the input format it could be force-fed with "raw format options" plus OptimFROG-style --headersize and --tailsize. As long as the source file is structured in the order header--audio--footer and nothing between audio and audio, then it would future-proof against new file types? (And past-proof enough to encode .au and A-law and µ-law ... just what the world has not been waiting for.)
To handle headerless input, the _en_coder needs to have a minimal file header and store it. The default for FLAC would be a WAVE header although AIFF can contain more info, like non-integer sample rates - and the level of support would be up to implementing more "raw format" parameters.

~~One more consideration at file-level: lift the max metadata size limit. Attached art could be much bigger nowadays.~~ Edit: Thanks to a correction that the size limit is per metadata block, it could just be done by distinguishing "first 16 meg of file header" from "next 16 meg of file header" and same for footer.

In frame headers then:

> There's also a reserved bit pattern for the frame header's bit depth bits

And more:

- Bit depth info as you say, has vacant/forbidden values 1 to 3, although I suggest that "1" should be available in case of a future extension to compress 1-bit PWM (there is one such compressor out there, WavPack - that has a permissive license but I would surely take the polite way to ask first. "1" for "other, please specify" of course also keeps it available).
- A frame header has three distinct ways to set "zero" sample rate. This idea likely spends two of them to distinguish between "0 that means rewind to beginning and see file metadata" and one for "0 that means this IS STILL NOT audio so don't even try!".
- Channel bits info has vacant values/patterns (already when the channel mask Vorbis comment was introduced, one such could be used for "see that info or err out").

aliheidary1381 · 2025-08-18T14:26:16Z

I don't know how WavPack (or its patented competitor, DST) compresses 1-bit PWM streams. I suppose we also need to reserve some space for defining new subframe types later?

AFAIK, with --keep-foreign-metadata used, FLAC could store BW64 XML chunks (including ADM chna and axml chunks used for immersive/3d moving sound objects) as-is. It can also store channel masks in Vorbis comments, at least as a file header.

As for the frame header, I will use bit_depth=0b000 to indicate that it is only stored in the streaminfo metadata block, with no changes to the frame header definition. The float samples feature is already outside of the streamable subset. There was no need to change anything in frame headers. Though I think a channel mask on frame headers is a very welcome addition for the streamable subset, it is outside the scope of this PR.

I'm more in favour of defining a new STREAMINFO_EXTENSION metadata block type (instead of wrapping it inside an APPLICATION block), with the same workflow you said.
The required fields, using the ideas mentioned here, are:

- sample type (float PCM, int PCM, 1-bit PWM, or even A-law, µ-law, etc). It should also specify the number of bits, endianness, and signedness, ultimately specifying the samples' bit structure.
- sample rate, stored in floats (adding support for bigger and non-integer rates).
- more channels (than 8). It could also be used when/if support for immersive/3d moving sound objects becomes better. Dolby Atmos, for example, supports up to ~128 channels (and 16 even for its downmixed spatial coded streams).
- more reserved bits.

H2Swine · 2025-08-19T10:09:07Z

Oof my bad, I wrote APPLICATION when I meant "metadata". I agree yes, different (new) type - there are 120 left, not running out soon. STREAMINFO_EXTENSION to inform a "new audio type-aware" decoder/player how to play, and a "this block type-aware but won't play" decoder how to order the decoded bits wrt. endianness and signedness and interleaving. Then I suggest types to inform about and store file header and footer when those are "mandatory" to get output right; If full file headers/footers are indeed indispensable, there should be a way to get them past the 16MiB by spanning them over several blocks if necessary; whether the STREAMINFO_EXTENSION contains the info about that or there are just block types for "first" and "continued" header and ditto footer ... there are many ways.

> Though I think a channel mask on frame headers is a very welcome addition for the streamable subset, it is outside the scope of this PR.

Though I tend to disagree (I think that "subset" should not become more permissive; that would put new demands on finalized software that claims to decode subset), and maybe it has to be sorted out before committing, but ... not needed yet.

aliheidary1381 · 2025-08-19T14:05:30Z

Well, now that we're modifying the STREAMINFO block, we can also replace the 3-byte block size indicator with an elias gamma representation...


aliheidary1381 · 2025-11-03T00:21:23Z

Sorry guys... Been busy lately.
What is the final consensus on this? How should we extend the 16MiB block size limitation? A "continued" flag bit or a more fundamental change like replacing the 3-byte block size indicator with an elias gamma representation for streams with STREAMINFO_EXTENSION?

ktmf01 · 2026-01-25T14:12:24Z

This needs a really thorough review and a lot of work to make it happen. Lately, I haven't had a lot of time to work on this, and it might take quite a while before I do.

madah81pnz1 · 2026-02-18T12:26:28Z

The only backwards compatible way to extend the 16 MiB block size is to add a new block type, e.g. FLAC__METADATA_TYPE_EXTENDED. This is then chained one after another, and is a container for other blocks, which can then be defined with a new header. For example, a 40 MiB metadata block would need to be split up in 3 of those extended blocks.
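The "3 blocks for 40 MiB" figure follows from the 24-bit length field in the metadata block header, which caps a block's payload at 2^24 - 1 bytes. A small sketch of that arithmetic is below; the `per_block_overhead` parameter is an assumption standing in for whatever inner header an extended block might carry, since the thread does not define one.

```c
#include <assert.h>
#include <stdint.h>

/* Largest payload a metadata block can declare with a 24-bit length field. */
#define MAX_BLOCK_PAYLOAD 16777215u

/* Number of chained blocks needed to carry total_bytes of data, if each
 * block spends per_block_overhead bytes on a (hypothetical) inner header. */
static uint64_t blocks_needed(uint64_t total_bytes, uint32_t per_block_overhead)
{
    uint64_t usable;
    assert(per_block_overhead < MAX_BLOCK_PAYLOAD);
    usable = (uint64_t)MAX_BLOCK_PAYLOAD - per_block_overhead;
    return (total_bytes + usable - 1) / usable;  /* ceiling division */
}
```

With no overhead, 40 MiB (41943040 bytes) needs two full blocks plus a third holding the remaining 8388610 bytes.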

But I don't think this is needed for this float proposal, which mostly needs a new FLAC__METADATA_TYPE_STREAMINFO2 block type. +1 for 64-bit float support, but if we're thinking ahead, don't forget (b)float16_t, float128_t.

Extending the blocksize is useful for big pictures, but also for future-proofing. It would be useful to have a new way of encoding metadata blocks, since you could add optional checksums, compression, etc.

I can start a new discussion topic about this, maybe others have good ideas about it too.

madah81pnz1 · 2026-03-09T04:46:49Z

include/FLAC/format.h

 #define FLAC__STREAM_SYNC_LENGTH (4u)

+typedef enum { NOT_SPECIFIED = -1,
+			   INT = 0,


Since this is put into the global scope/namespace, I suggest adding a prefix like SAMPLE_TYPE_ to all enum values, or even FLAC__SAMPLE_TYPE_.
INT in particular is too short and too common a name not to be confusing.

madah81pnz1 · 2026-03-09T04:47:26Z

include/FLAC/format.h


+typedef enum { NOT_SPECIFIED = -1,
+			   INT = 0,
+			   FLOAT = 1 } SampleType; // TODO: merge with endian & unsigned


The name of the type should be FLAC__SampleType, to follow naming style

madah81pnz1 · 2026-03-09T04:48:33Z

src/CMakeLists.txt

@@ -1,4 +1,5 @@
 option(ENABLE_64_BIT_WORDS "Set FLAC__BYTES_PER_WORD to 8, for 64-bit machines. For 32-bit machines, turning this off might give a tiny speed improvement" ON)
+option(ENABLE_EXPERIMENTAL_FLOAT_SAMPLE_CODING "Enable experimental feature to encode & decode 32-bit float (IEEE 754 standard) samples" ON)


Should be OFF by default?

madah81pnz1 · 2026-03-09T04:50:22Z

src/flac/decode.c

 #include "share/compat.h"
 #include "decode.h"

+#define sample_type(is_float) is_float == FLOAT ? "float" : "int"


Make this a static inline function instead; sample_type is too short a name and could be confused with a local variable somewhere, especially since it is used as sample_type(sample_type).

The name could be something more like FLAC__get_sample_type_string().
Don't forget about the undefined enum value (-1); for debugging purposes I'd recommend checking for it and printing it as well, instead of printing "int" for everything else.
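Taken together with the naming comments on include/FLAC/format.h, the suggestion could look roughly like the sketch below. The identifiers are the ones proposed in these review comments, not names that exist in libFLAC, and the strings returned for the non-int/float cases are placeholders. A function also sidesteps the precedence hazard of the unparenthesised macro.

```c
#include <assert.h>

/* Prefixed names as suggested in review; the numeric values match the PR. */
typedef enum {
    FLAC__SAMPLE_TYPE_NOT_SPECIFIED = -1,
    FLAC__SAMPLE_TYPE_INT = 0,
    FLAC__SAMPLE_TYPE_FLOAT = 1
} FLAC__SampleType;

/* Guard against accidentally renumbering values the PR already uses. */
static_assert(FLAC__SAMPLE_TYPE_NOT_SPECIFIED == -1, "keep the PR's numeric values");

/* Hypothetical replacement for the sample_type() macro. */
static inline const char *FLAC__get_sample_type_string(FLAC__SampleType type)
{
    switch (type) {
    case FLAC__SAMPLE_TYPE_INT:
        return "int";
    case FLAC__SAMPLE_TYPE_FLOAT:
        return "float";
    case FLAC__SAMPLE_TYPE_NOT_SPECIFIED:
        return "not specified";
    }
    return "invalid";  /* any value outside the enum, useful when debugging */
}
```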

madah81pnz1 · 2026-03-09T04:56:15Z

src/flac/encode.c

 				/* first part of GUID */
 				if(!read_uint16(e->fin, /*big_endian=*/false, &x, e->inbasefilename))
 					return false;
 				if(x != 1) {


Here you should also check for 3, since that is the value used when WAVEFORMATEXTENSIBLE carries 32-bit float samples.
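For context, the WAVEFORMATEXTENSIBLE SubFormat GUIDs for integer PCM and IEEE float differ only in their first 16-bit field (1 versus 3); the remaining 14 bytes are identical. A standalone sketch of that check is below. It works on a 16-byte buffer and does not use the `read_uint16()` helper from encode.c; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef enum {
    WAVE_SUBFORMAT_UNKNOWN,
    WAVE_SUBFORMAT_PCM,        /* first field == 1 */
    WAVE_SUBFORMAT_IEEE_FLOAT  /* first field == 3 */
} wave_subformat;

/* Classifies a SubFormat GUID given in on-disk (little-endian) byte order. */
static wave_subformat classify_subformat(const uint8_t guid[16])
{
    /* bytes 2..15, shared by the PCM and IEEE float GUIDs */
    static const uint8_t tail[14] = {
        0x00, 0x00, 0x00, 0x00, 0x10, 0x00, 0x80,
        0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71
    };
    uint16_t first;
    assert(guid != NULL);
    if (memcmp(guid + 2, tail, sizeof tail) != 0)
        return WAVE_SUBFORMAT_UNKNOWN;
    first = (uint16_t)(guid[0] | (guid[1] << 8));
    if (first == 1)
        return WAVE_SUBFORMAT_PCM;
    if (first == 3)
        return WAVE_SUBFORMAT_IEEE_FLOAT;
    return WAVE_SUBFORMAT_UNKNOWN;
}
```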

madah81pnz1 · 2026-03-09T05:13:03Z

src/metaflac/options.c

+	else if(0 == strcmp(opt, "set-sample-type")) {
+		FLAC__ASSERT(0 != option_argument);
+		op = append_shorthand_operation(options, OP__SET_SAMPLE_TYPE);
+		if(strcmp(option_argument, "float") == 0 || strcmp(option_argument, "IEEE754") == 0 || strcmp(option_argument, "binary32") == 0) {


Same here: only allow "float", and "int" below.
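A sketch of what the stricter parsing could look like, accepting exactly the two spellings and rejecting the aliases. This is a standalone illustration with hypothetical names, not metaflac's actual option code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

typedef enum {
    SAMPLE_TYPE_ARG_INT = 0,
    SAMPLE_TYPE_ARG_FLOAT = 1
} sample_type_arg;

/* Returns true and stores the parsed value only for "int" or "float";
 * aliases such as "IEEE754" or "binary32" are rejected. */
static bool parse_sample_type_argument(const char *arg, sample_type_arg *out)
{
    assert(arg != NULL && out != NULL);
    if (strcmp(arg, "float") == 0) {
        *out = SAMPLE_TYPE_ARG_FLOAT;
        return true;
    }
    if (strcmp(arg, "int") == 0) {
        *out = SAMPLE_TYPE_ARG_INT;
        return true;
    }
    return false;
}
```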

madah81pnz1 · 2026-03-09T05:17:13Z

src/libFLAC/transform_float.c

+	subtracting an (automatically recognised) DC offset from the "exponents" channel
+	and	storing the offset in each frame header (i.e. unsigned to signed conversion)
+		could be better (haven't tried it yet. hard for me to implement).
+		my guess would be a *consistent* ~60% ratio, at least.


A 10% improvement in compression might be worth investigating a bit further. It would need more discussion on how to split up the signal into "sub-channels"; maybe it is too much complexity to be worth it.

How does the current (choice 2) compare to other coders like WavPack? If it is already as good or very close, then it might not be worth it.

madah81pnz1 · 2026-03-09T05:24:20Z

src/flac/decode.c

 			return false;

 		/* GUID = {0x00000001, 0x0000, 0x0010, {0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71}} */
 		if(flac__utils_fwrite("\x01\x00\x00\x00\x00\x00\x10\x00\x80\x00\x00\xaa\x00\x38\x9b\x71", 1, 16, f) != 16)


You also need to handle float here for WAVEFORMATEXTENSIBLE; the first value of the GUID would be 3 for float.
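On the writing side this amounts to changing one byte of the GUID shown in the hunk. A minimal sketch, filling a buffer that the caller would then write out; the function name is hypothetical and the existing `flac__utils_fwrite()` call is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Fills out[] with the SubFormat GUID in on-disk byte order:
 * first field 1 for integer PCM, 3 for IEEE float. */
static void build_subformat_guid(uint8_t out[16], bool is_float)
{
    static const uint8_t pcm_guid[16] = {
        0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00,
        0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71
    };
    assert(out != NULL);
    memcpy(out, pcm_guid, sizeof pcm_guid);
    if (is_float)
        out[0] = 0x03;  /* only the first field differs */
}
```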

madah81pnz1 · 2026-03-09T05:25:00Z

src/flac/decode.c


+#if ENABLE_EXPERIMENTAL_FLOAT_SAMPLE_CODING
+	/* sanity-check the sample type */
+	if(decoder_session->sample_type != -1) {


Use the enum value, don't hardcode -1

madah81pnz1 · 2026-03-09T05:26:30Z

src/flac/decode.c

+#if ENABLE_EXPERIMENTAL_FLOAT_SAMPLE_CODING
+			if(decoder_session->sample_type == FLOAT && metadata->data.stream_info.sample_type != FLOAT) {
+				stats_print_name_and_stream_number(1, decoder_session->inbasefilename, decoder_session->stream_counter);
+				flac__utils_printf(stderr, 1, "ERROR, this link's STREAMINFO is set to float PCM format but was int PCM in previous one\n");


Use the sample_type string function here, so the name is consistent everywhere

aliheidary1381 closed this Aug 12, 2025

aliheidary1381 reopened this Aug 12, 2025

aliheidary1381 added 2 commits August 20, 2025 00:56

reversed some bits in experimental floats, got 20% more compression. …

48db1cc

…success!

minor fix in the docs

41998ed

madah81pnz1 reviewed Mar 9, 2026
