Path hash collision de-duplication #1652

ripplebiz · 2026-02-10T04:56:27Z

ripplebiz
Feb 10, 2026
Maintainer

I have come up with a way, that doesn't require a V2 protocol, to resolve the path hash/ID collision problem.
Remember that this is, essentially, just a diagnostic issue. It doesn't affect routing in any meaningful way.

So, I think the objective is purely: improve the ability of diagnostic tools to resolve paths.

Method

The path hashes reserve 0x00 and 0xFF, and the purpose of this was for potential future escape sequences. This proposal makes use of this.

The sender (in flood mode) primes the path byte array with (hex): 00, 00, 00, 00, 00 (the length here is debatable).
Each repeater appends its own 1-byte path hash, as usual, but also checks whether the first byte is 00. If so, it takes the next 4 bytes (X0) and computes: sha256(X0, repeater's public key).truncate(4) -> X1
The repeater overwrites those 4 bytes in the path (i.e. at byte array offset 1) with X1.
The repeater then retransmits per the usual rules.
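
The steps above could be sketched roughly as follows. This is only an illustration, not firmware code: the function names are made up, the 1-byte path hash is assumed to be the first byte of the public key, and the order in which X0 and the public key are fed into sha256 (simple concatenation here) is an assumption that the proposal does not pin down.

```python
import hashlib

ESCAPE = 0x00    # reserved path-hash value, used here as the escape marker
CHAIN_LEN = 4    # bytes of truncated hash carried in the path (debatable)

def prime_path() -> bytes:
    """Sender side: escape byte followed by an all-zero chain value X0."""
    return bytes([ESCAPE]) + bytes(CHAIN_LEN)

def path_hash(pubkey: bytes) -> int:
    """Assumption: the 1-byte path hash is the first byte of the public key."""
    return pubkey[0]

def chain_step(x: bytes, pubkey: bytes) -> bytes:
    """X_next = sha256(X || pubkey) truncated to CHAIN_LEN bytes (ordering assumed)."""
    return hashlib.sha256(x + pubkey).digest()[:CHAIN_LEN]

def repeater_forward(path: bytes, pubkey: bytes) -> bytes:
    """Repeater side: update the chain if the path is escaped, then append own hash."""
    if len(path) >= 1 + CHAIN_LEN and path[0] == ESCAPE:
        x_next = chain_step(path[1:1 + CHAIN_LEN], pubkey)
        path = path[:1] + x_next + path[1 + CHAIN_LEN:]
    return path + bytes([path_hash(pubkey)])
```

After two hops the path would look like: 00, 4 chain bytes, then the two 1-byte path hashes. A path that does not start with 00 is passed through the current way, with only the 1-byte hash appended.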

Receiver/Observer de-duplication

Any node that receives the packet, like a client/companion or web diagnostics like LetsMesh, can then determine which path was actually taken. Most of these environments should have a fairly comprehensive list of discovered nodes that they have cached/stored. So, if the received (flood) packet path starts with 00, then it simply becomes a brute-force compute task.

For example, if there were 8 hops, and on average there are 2 known public keys per 1-byte path hash, then there are 2^8 = 256 sha256 chains to calculate. Obviously this balloons exponentially the longer the path is, so a 16-hop path could mean around 65,000 chains (2^16 = 65,536). The likelihood of any two chains resulting in the same hash is extremely low, so diagnostic environments could very confidently display the verified path.
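
A rough sketch of that brute-force step on the receiver/observer side, under the same assumptions as the method description (4-byte chain, all-zero starting value, sha256 over the concatenation of the previous value and the public key, path hash equal to the first byte of the public key). The names `chain_value` and `resolve_path` are hypothetical.

```python
import hashlib
from itertools import product

CHAIN_LEN = 4

def chain_value(pubkeys) -> bytes:
    """Recompute the truncated hash chain for an ordered list of public keys."""
    x = bytes(CHAIN_LEN)
    for pk in pubkeys:
        x = hashlib.sha256(x + pk).digest()[:CHAIN_LEN]
    return x

def resolve_path(path: bytes, known_keys):
    """Return the public keys matching an escaped path, or None if unresolved."""
    if len(path) < 1 + CHAIN_LEN or path[0] != 0x00:
        return None  # not an escaped path
    target, hops = path[1:1 + CHAIN_LEN], path[1 + CHAIN_LEN:]
    # Lookup table: 1-byte path hash -> candidate public keys from the local cache
    table = {}
    for pk in known_keys:
        table.setdefault(pk[0], []).append(pk)
    candidates = [table.get(h, []) for h in hops]
    # Try every combination of candidates, one per hop
    for combo in product(*candidates):
        if chain_value(combo) == target:
            return list(combo)
    return None  # e.g. an unknown repeater, or one on old firmware
```

A `None` result corresponds to the 'unknown' diagnostic output: no combination of cached keys reproduces the received chain value.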

For most client environments like smartphones or web servers, this is easily achievable. Even a low-end smartphone can calculate around 5 million sha256 hashes a second. So, I don't think there'll be any noticeable cost.
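
As a back-of-the-envelope check of those numbers, here is a small estimate. It assumes a naive search that recomputes every chain from scratch (one sha256 per hop per chain) and takes the 5 million hashes per second figure above at face value; the function name and defaults are just for illustration.

```python
def worst_case_cost(hops: int, keys_per_hash: int = 2,
                    hashes_per_second: int = 5_000_000):
    """Return (number of chains, worst-case seconds) for a naive brute force."""
    chains = keys_per_hash ** hops
    hashes = chains * hops  # naive: every chain is recomputed hop by hop
    return chains, hashes / hashes_per_second

print(worst_case_cost(8))   # 256 chains, about 0.0004 s
print(worst_case_cost(16))  # 65536 chains, about 0.21 s
```

A search that shares the common prefixes between chains would need fewer hashes than this naive count.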

The BIG cost, as we are all very aware, is airtime. This is the biggest thing we need to minimise.

Compatibility

All of the PAYLOAD_TYPE_PATH packets remain untouched. The only issue with these is that the sender who receives them back has to check for the leading 00, then strip the escaped sequence from the path before storing it on the client device (i.e. for later use as the path when sending direct-mode packets). Or the client could save the full path (incl. the escaped sequence) and strip it out just before using it for direct-mode.
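
The stripping step could be as small as the sketch below. It assumes the 1 + 4 byte layout described in the method (escape byte plus a 4-byte truncated hash); `strip_escape` is a hypothetical helper name.

```python
ESCAPE = 0x00
CHAIN_LEN = 4  # assumed length of the truncated hash after the escape byte

def strip_escape(path: bytes) -> bytes:
    """Drop the escape byte and chain value, leaving only the 1-byte path hashes."""
    if len(path) >= 1 + CHAIN_LEN and path[0] == ESCAPE:
        return path[1 + CHAIN_LEN:]
    return path  # already a plain path, leave untouched

# Example: 00 | de ad be ef | a1 b2  ->  a1 b2
direct_path = strip_escape(bytes.fromhex("00deadbeefa1b2"))
```

Because 0x00 is reserved and never used as a real path hash, a plain path is never mistaken for an escaped one.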

Also, another issue (which this also shares with every V2 proposal) is that if there's just one repeater in the path which is on an old firmware, ie. doesn't know about the escaped sequence, then the resulting truncated hash at receiving node cannot be de-duplicated (ie. it won't match any calculated chain). So, the diagnostic at each receiver can have an 'unknown' as output.

Misc

The escaped truncated hash could potentially be 3 bytes. It might take some experimentation to find out what the optimal length is.

I also wondered about killing two birds with one stone, and having the sender, say with group channel chat packets, prime the truncated hash with their public key. Then receivers could potentially also verify the sender, which is currently missing in group chats. But that introduces many more sha256 chains to compute, and a lot of diagnostic environments may only have repeaters in their discover/cache, so they wouldn't even be able to do the de-duplication.

If anyone has ideas around this, would be cool to potentially weave in sender-id to group chats.

Also, something other than sha256 could be used, if there's a more compute-friendly alternative. But sha256 is everywhere now, and is even hardware-optimised in a lot of compute environments.

ripplebiz · 2026-02-10T05:12:28Z

ripplebiz
Feb 10, 2026
Maintainer Author

Also, forgot to mention, that this is still in line with the core principles of MeshCore, namely privacy. The sender can still opt to send without the escape, ie. just send the current way. No truncated hashes get calculated. There are still places around the world, 'political environments', where this could be very important.

0 replies

mikecarper · 2026-02-10T05:49:46Z

mikecarper
Feb 10, 2026

CRC32 should be quicker than sha256.
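
For illustration, a CRC32-based chain step might look like the sketch below, using Python's standard `zlib.crc32`. The function name and the concatenation order are assumptions. Note that CRC32 is a checksum rather than a cryptographic hash, so whether it mixes well enough for this purpose would need checking.

```python
import zlib

def crc_chain_step(x: bytes, pubkey: bytes) -> bytes:
    """X_next = crc32(X || pubkey) as 4 big-endian bytes (ordering assumed)."""
    return zlib.crc32(x + pubkey).to_bytes(4, "big")
```

The output is naturally 4 bytes, so no truncation is needed, but a 5-byte chain value would not be possible with this variant.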

0 replies

mikecarper · 2026-02-10T06:51:49Z

mikecarper
Feb 10, 2026

In terms of not being able to reverse the hash because of older versions; you could just add additional checks for non hashed steps while trying to reverse the hash.

This seems like a mini blockchain, is that correct? Where the hash from hop 1 is used when calculating the hash for hop 2, etc. Using the v1 path, it seems like you can get a list of keys to quickly try at each step of the hash reversal. I don't think it'll be as computationally complicated if you can narrow down the possibilities by using the 1-byte hash as a lookup table. If the flood advert frequency is zero, the repeater should not do the hashing, as it would be impossible to reverse.
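
One way to picture this is a depth-first search over the 1-byte lookup table that also tries a "this hop did not hash" option at every step. The sketch below is only an illustration: it assumes a non-hashing repeater still appends its 1-byte hash but leaves the 4-byte chain value untouched, and it uses the same assumed sha256 concatenation as the original proposal. All names are hypothetical.

```python
import hashlib

CHAIN_LEN = 4

def step(x: bytes, pk: bytes) -> bytes:
    return hashlib.sha256(x + pk).digest()[:CHAIN_LEN]

def resolve_with_gaps(target: bytes, hops: bytes, table: dict):
    """Depth-first search; None in the result marks a hop assumed not to have hashed."""
    def search(i, x, chosen):
        if i == len(hops):
            return chosen if x == target else None
        # Try every known key whose first byte matches this hop's path hash
        for pk in table.get(hops[i], []):
            found = search(i + 1, step(x, pk), chosen + [pk])
            if found is not None:
                return found
        # Fallback: assume this hop left the chain value unchanged
        return search(i + 1, x, chosen + [None])
    return search(0, bytes(CHAIN_LEN), [])
```

The extra option raises the search space from roughly k^n to (k+1)^n combinations for n hops with k candidates each, and every added combination slightly raises the chance of a false match.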

0 replies

mikecarper · 2026-02-10T10:46:56Z

mikecarper
Feb 10, 2026

On paper, 5 bytes to store the hash is probably the correct number of bits to reduce collisions. In practice, because we have a lookup table of keys to try in order to find the exact keys used to create the blockchain hash, 4 bytes/32 bits is probably good enough. We know the hop count, which limits the number of inputs needed to hit the path. With 4 bytes there's still a chance of a collision; with 5 the odds drop dramatically.
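
To put rough numbers on that, the sketch below computes an upper bound on the chance that at least one wrong candidate chain matches the received value. It assumes truncated hash outputs are uniformly distributed and uses a simple union bound; the function name is made up.

```python
def false_match_bound(candidate_chains: int, hash_bytes: int) -> float:
    """Upper bound on the probability that some wrong chain matches by chance."""
    wrong_chains = max(candidate_chains - 1, 0)
    return wrong_chains / float(256 ** hash_bytes)

# 16-hop example from the original post: 2^16 = 65,536 candidate chains
print(false_match_bound(65_536, 4))  # roughly 1.5e-05
print(false_match_bound(65_536, 5))  # roughly 6e-08
```

Each extra byte divides the bound by 256, which is the "drop dramatically" effect described above.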

0 replies

446564 · 2026-02-10T14:35:16Z

446564
Feb 10, 2026

Honestly this seems more complicated than necessary and still introduces not one but two breaking changes. As said, older repeaters will not understand it and will break it, plus all client apps need updating.

As for the alternate proposal to have a variable size path prefix ID length, where the user still maintains the same level of privacy if they choose (given they are the only one to set the resolution), if we take a phased approach all repeaters still function until the final break.

This allows users to decide how precise their paths are, not just a cosmetic issue for monitoring and data collection.

0 replies

MrAlders0n · 2026-02-10T19:28:09Z

MrAlders0n
Feb 10, 2026

I really appreciate you engaging with the community in a public forum regarding this. I genuinely appreciate you and all of the other devs and the efforts you continue to put into this awesome project.

I like the concept of the hash chain approach and I see the appeal of not needing a protocol version bump, however I feel like this is patchwork instead of fixing the underlying issue. Here's my thinking:

The proposal requires changes across the entire stack: the sender primes 5 bytes, every repeater computes SHA-256 and updates the chain, and the receiver brute-forces combinations of candidate repeaters to find a match. That's a lot of moving parts, and if any single repeater in the path is on old firmware, the whole chain breaks for that packet.

That firmware update requirement is actually the part that stands out most to me. If we're asking the entire network to update firmware to support this, we're going through the same migration effort as a protocol change. But instead of coming out the other side with a clean, permanent fix, we'd have a workaround that adds ongoing complexity and still leaves the core ID collision issue in place. And as we know, there will always be repeaters on old firmware that have been forgotten about, sitting on a rooftop or solar install somewhere. Those will break the hash chain for any packet they touch, and there's nothing anyone can do about it. On top of that, this only addresses the diagnostic side. It still doesn't help with the routing problem where two repeaters sharing the same ID within range of each other both repeat a direct message that was only intended for one of them.

Regarding privacy, I understand the concern and respect that there are political environments where this matters. But I think we both know that the privacy ship has already sailed with 1-byte IDs. With noise floor and SNR data alone, which is readily available to anyone listening on the mesh, you can already pinpoint the approximate location of every repeater in an area quite easily. We actually had to mask noise floor and SNR reporting on MeshMapper for exactly this reason. Keeping repeater IDs short and ambiguous doesn't meaningfully protect location privacy when the RF characteristics already give it away. If anything, I worry it's more dangerous in those political environments to give users a false sense of privacy when a repeater is on their roof. Users should understand and be aware that beaconing RF is like holding up a red flag saying "I'm participating in a decentralized network." Ambiguous IDs don't change that.

My overall feeling is: if the network has to go through a firmware update cycle regardless, it feels like the right time to solve this cleanly rather than build a layer on top that we'd need to maintain and explain going forward. A larger ID space is simpler to implement, simpler for tools to consume, and addresses both observability and routing in one shot.

I really like Liam's concept of a variable size path prefix ID length, and I love the idea of reserving a block of IDs that cannot be generated automatically on the repeater itself.

5 replies

ripplebiz Feb 11, 2026
Maintainer Author

I honestly don't follow most of those points. You're talking about a mandatory V2 update for EVERYONE. The 1-byte path hash dups, for direct mode, is not a 'problem/mistake'. It all still works, but just a bit less efficiently.

What the variable-length path hashes do is bloat the packet sizes and force more airtime on the whole network, just because someone wants to do some diagnostics. My proposal puts the least burden on everyone else, and most of the burden on the node wanting the diagnostics.

446564 Feb 11, 2026

People have been asking for this for a long time, I didn't hear any complaints about making trace packets variable length.

DrakiaXYZ Feb 11, 2026

> You're talking about a mandatory V2 update for EVERYONE

Wouldn't this proposal also require a mandatory update for everyone? An outdated client or repeater seems like it would make the whole thing fall apart. An outdated repeater would break the chain, an outdated client/companion would end up storing/using/forwarding invalid route data

446564 Feb 11, 2026

100%

alextemp Feb 11, 2026

It is not a big deal to upgrade everything if it improves scalability and dependability of complete path progression. Things need to grow and improve. I appreciate everyone’s effort in making this project work and improving it. Cheers

unfocused8 · 2026-02-11T06:00:04Z

unfocused8
Feb 11, 2026

I agree with @MrAlders0n and @446564 in that a V2 protocol would provide more benefits for the same cost of switching and is more in line with the community's request compared to the proposed patchwork.

However, I support @ripplebiz's comment that privacy is a core principle. Reasonable effort should be made to further improve privacy protections in V2.

We should appreciate the work done by MeshMapper, which highlights current limitations in privacy protection, and try to learn from them.

2 replies

liamcottle Feb 11, 2026
Maintainer

> However, I support @ripplebiz's comment that privacy is a core principle. Reasonable effort should be made to further improve privacy protections in V2.
>
> We should appreciate the work done by MeshMapper, which highlights current limitations in privacy protection, and try to learn from them.

I'm working on a v2 packet proposal document, that we will discuss internally and then share with the community for feedback if it looks like it might be a good approach. What limitations in privacy protection are you referring to, so I can keep these in mind during the design phase?

unfocused8 Feb 11, 2026

I am glad to hear that and I appreciate you listening to the community.

I was referring to locating repeaters based on the noise floor (see #1613 (comment)).
However, I do not think there is a solution to this in the context of a protocol. I would like to be proven wrong, but avoiding being geolocated through noise floor or signal triangulation is outside the scope of this project.

mikecarper · 2026-02-12T00:27:40Z

mikecarper
Feb 12, 2026

A 4-5 byte blockchain of the path, even for v2 of the protocol, still has its benefits. We're always going to have prefix collisions with a large network. This is another way to figure out the exact path a message took.

3 replies

446564 Feb 12, 2026

Yes but more complicated for what purpose? What benefit does the extra complexity provide?

mikecarper Feb 12, 2026

yeah that's a great point. It'd allow for 59 hops in v2 (assuming variable len) and you could mine the path if you really wanted to see where it went. Mainly for DX nerding out. But for getting a message from point A to point B in a repeatable manner it doesn't help.

446564 Feb 12, 2026

Very neat

DrakiaXYZ · 2026-02-12T00:42:53Z

DrakiaXYZ
Feb 12, 2026

Further pondering on this, wouldn't a "blockchain"-like implementation require that every repeater in the chain be known to the receiving end (or observer) to be able to decipher the actual path?

If so, this would break if any non-advertised repeaters are in the chain, such as a local house repeater that never adverts.

2 replies

mikecarper Feb 12, 2026

If it never adverts it doesn't add to the blockchain. The miner needs to take this possibility into account when trying to find the path. Adds one more permutation to check on every hop. With a lot of observers you can check your work as the message goes out to make mining less costly.

DrakiaXYZ Feb 12, 2026

A single entity with multiple observers can, sure. But as an end user, who wants to be able to see the path via my own personal standalone observer, I don't have the luxury of "checking my work" before the packet arrives to me.

The computational overhead a method such as this adds seems to outweigh the benefits of trying to use this as a means of "validating" a route

vfortin99-ctrl · 2026-02-12T01:07:43Z

vfortin99-ctrl
Feb 12, 2026

I share the view held by @MrAlders0n and @446564: a full V2 protocol offers a better return on investment than a patchwork solution and aligns much more closely with the community's desires.

Regarding privacy, I support @ripplebiz's focus on the principle, but I acknowledge the reality @MrAlders0n pointed out: true privacy in an RF-based protocol is essentially unfixable. As MeshMapper has demonstrated, the physical layer leaks everything a motivated listener needs.
We should appreciate MeshMapper for exposing this reality. Instead of trying to 'patch' privacy where it can't be fixed, I think we should focus on a clean V2 protocol that prioritizes routing efficiency and transparency, ensuring users are fully aware of the inherent trade-offs in beaconing RF.

0 replies
