Add `geoLoopAreaRads2` function by ajfriend · Pull Request #1101 · uber/h3

ajfriend · 2025-11-28T05:15:20Z

This PR introduces a new area calculation algorithm which works for large, globe-spanning spherical polygons. That is, this algorithm should work for polygons that enclose the poles, cross the antimeridian, or even ones that are not contained in a hemisphere. The algorithm critically pays attention to vertex ordering, so, for example, you can switch the order of vertices in a cell to get a region representing "everything on the globe except that cell", and then compute that area correctly.

I'll try to keep this PR small by introducing only the core algorithm for simple loops of lat/lngs and for cells (replacing our existing algorithm in cellAreaRads2). In a follow-up PR, I can add functions that compute area for polygons (with holes), multi-polygons, their linked list versions, and the different units (rads^2, km^2, m^2). Although, this will be a lot of functions, so we should probably think about which ones we want to actually expose and support going forward. I would even consider making geoLoopAreaRads2() an internal-only function, and only exposing higher level functions at the polygon and multipolygon level. Curious to hear what others think.

I'll use the "global polygon" area algorithm from this PR for a new implementation of the "cellsToMultiPolygon" algorithm.

PR summary:

- Introduce geoLoopAreaRads2() function as the core algorithm for computing area. In this PR, we use it to implement cellAreaRads2(), and in the future, it will be the core algorithm for computing area of polygons and multipolygons.
- Enables area calculation of "global" polygons without issues along poles or the antimeridian, or with polygons larger than a hemisphere.
- 1.64x faster than the existing cellAreaRads2 function.
- Adds a "compensated summation" algorithm to compute sums of floating point numbers more accurately, which we use when summing terms to compute the area of polygons/loops. Accuracy benchmarks are improved in this PR, although much of that improvement is not in the cell area calculation itself, but in summing up the areas in the benchmark when testing if we arrive at the area of the globe. We'll keep the compensated summation code since we expect an improvement when computing the area of larger and more complex loops/polygons, but we should add a benchmark/test to demonstrate this in a follow-up PR dealing with such polygons.

coveralls · 2025-11-28T05:20:40Z

coverage: 98.952% (+0.003%) from 98.949%
when pulling 4931800 on ajfriend:cagnoli_area
into 312fbbc on uber:master.

ajfriend · 2025-11-28T05:31:35Z

I'm still playing around with some optimizations to make this code faster. Here's one failed attempt: https://gist.github.com/ajfriend/ba522a38fa46dbc54875a9042157d2e3

Reusing some of the trig operations across points that appear twice in the area calculation doesn't seem to save much time. Maybe 2% improvement at most.

ajfriend · 2025-11-29T03:51:54Z

Comparison of Naive, Kahan, and Neumaier summation

Accuracy

Here are the accuracy results from benchmarkArea.c, which adds up the areas of all the cells at a given resolution and computes the difference between that sum and the area of the entire globe, 4*pi:

Naive summation

res: 0, diff: 3.552714e-15
res: 1, diff: 3.907985e-14
res: 2, diff: 1.456613e-13
res: 3, diff: 4.654055e-13
res: 4, diff: 2.220446e-12
res: 5, diff: 1.520561e-12
res: 6, diff: 1.761613e-11
res: 7, diff: 1.515765e-11
res: 8, diff: 2.926903e-11

Kahan

res: 0, diff: 5.329071e-15
res: 1, diff: 3.552714e-15
res: 2, diff: 3.552714e-15
res: 3, diff: 0.000000e+00
res: 4, diff: 3.552714e-15
res: 5, diff: 1.776357e-15
res: 6, diff: 0.000000e+00
res: 7, diff: 0.000000e+00
res: 8, diff: 0.000000e+00

Neumaier

res: 0, diff: 3.552714e-15
res: 1, diff: 3.552714e-15
res: 2, diff: 3.552714e-15
res: 3, diff: 1.776357e-15
res: 4, diff: 1.776357e-15
res: 5, diff: 0.000000e+00
res: 6, diff: 0.000000e+00
res: 7, diff: 0.000000e+00
res: 8, diff: 1.776357e-15

Notice that Kahan and Neumaier give a big improvement over naive summation, with Neumaier usually being just slightly better than Kahan.

Compute time

I ran some quick benchmarks on my Apple M3 laptop, and found (roughly) that Kahan is about 0.5% slower than simple summation, and Neumaier is around 1.0% slower than simple summation.

Conclusion

Kahan improves accuracy over simple almost as well as Neumaier. Kahan is slightly slower than simple and slightly faster than Neumaier. Kahan has a slightly simpler implementation than Neumaier. I'll go with Kahan for the time being. If we run into examples where Neumaier provides obviously better behavior (e.g., degenerate polygons?) we can consider switching.
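
For readers unfamiliar with the two compensated schemes compared above, here is a minimal sketch of both. The type and function names (`KahanAdder`, `kahan_add`, `NeumaierAdder`, `neumaier_add`, `neumaier_result`) are hypothetical and chosen for illustration; this is not the code from this PR.

```c
#include <math.h>

/* Hypothetical accumulator; names are illustrative, not H3's. */
typedef struct {
    double sum; /* running total */
    double c;   /* compensation: low-order bits lost so far (negated) */
} KahanAdder;

/* Kahan: fold the previous rounding error back into the next term. */
void kahan_add(KahanAdder *k, double x) {
    double y = x - k->c;     /* correct the incoming term */
    double t = k->sum + y;   /* low-order bits of y may be lost here */
    k->c = (t - k->sum) - y; /* recover what was lost */
    k->sum = t;
}

/* Hypothetical accumulator for the Neumaier variant. */
typedef struct {
    double sum; /* running total (uncorrected) */
    double c;   /* accumulated correction, applied at the end */
} NeumaierAdder;

/* Neumaier: like Kahan, but also handles a term larger than the sum. */
void neumaier_add(NeumaierAdder *n, double x) {
    double t = n->sum + x;
    if (fabs(n->sum) >= fabs(x)) {
        n->c += (n->sum - t) + x; /* low-order bits of x were lost */
    } else {
        n->c += (x - t) + n->sum; /* low-order bits of sum were lost */
    }
    n->sum = t;
}

/* The Neumaier total is the running sum plus the correction. */
double neumaier_result(const NeumaierAdder *n) { return n->sum + n->c; }
```

The extra branch in `neumaier_add` is the implementation cost mentioned above. It matters when a single term dwarfs the running sum: feeding 1.0, 1e100, 1.0, -1e100 to both accumulators leaves Kahan at 0.0 while Neumaier recovers 2.0. Both schemes rely on strict IEEE arithmetic, so flags such as `-ffast-math` can optimize the compensation away.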

ajfriend · 2025-12-02T00:54:01Z

All comments should be addressed. Ready for re-review.

ajfriend · 2025-12-11T04:09:40Z

Some notes on translating the "Cagnoli math" from d3-geo to what we have here:

We drop the absolute value logic from d3-geo because, as far as I can tell, it is no longer necessary. See the discussion in "Simplify dLambda handling in area calculation" (d3/d3-geo#294):

sdLambda = dLambda >= 0 ? 1 : -1,
adLambda = sdLambda * dLambda,

The d3 code takes in one point at a time, and updates "previous" and "current" points:

// Advance the previous points.
lambda0 = lambda, cosPhi0 = cosPhi, sinPhi0 = sinPhi;

Our code, on the other hand, takes in two points (one edge arc) at a time. I thought this made the code more decoupled and clearer, and there was no performance loss: #1101 (comment)

The d3 code dLambda = lambda - lambda0 becomes our double d = y.lng - x.lng;

Instead of "phi" and "lambda", I just reused "lat" and "lng".
The d3 areaRingSum.add(atan2(v, u)); is the same as our -2.0 * atan2(sa * sd, sa * cd + ca);
- The 2.0 comes from us doing the multiplication earlier, while d3 does it in https://github.com/d3/d3-geo/blob/8c53a90ae70c94bace73ecb02f2c792c649c86ba/src/area.js#L75
- The -1 factor comes from the fact that D3 uses the opposite "right hand rule" from what we use.
- The remaining differences are just rearranging the algebra to highlight symmetries in the formula: https://github.com/ajfriend/h3/blob/ce65b2d04d3db0f80b47fea4801ce76808bf7f68/src/h3lib/lib/area.c#L33-L45

dfellis · 2025-12-17T16:15:03Z

@@ -0,0 +1,27 @@
+/*


My only question here is why in new area.h/.c files instead of in one of the existing files. Do you expect to add more to this later?

I will be adding more functions for areas of polygons and multipolygons as a follow-up to this PR (#1104), and the area.c file seemed like a good way to group them and their helper functions.

But I agree this area.h file is a little odd with just one function. Currently, it just enables the tests in testGeoLoopArea.c, but it will serve the same purpose in #1104.

If we decide to expose the area functions in the public API, we can probably drop this area.h file (but keep the area.c file).

This is also a byproduct of this being a sequence of stacked PRs. I've added a TODO in #1103 to remind me to review file organization questions like this after we have all the changes landed.

nrabinowitz

This looks good to me!

nrabinowitz · 2025-12-17T21:02:56Z

+summation), which allows us to add up sequences of floating-point numbers
+with better precision than naive summation, especially when the terms in
+the sum vary significantly in magnitude.
+See: https://en.wikipedia.org/wiki/Kahan_summation_algorithm


TIL. Very cool!

nrabinowitz · 2025-12-17T21:13:01Z

+ * The edge arcs between adjacent vertices are assumed to be the shortest
+ * geodesic path between them; that is, all arcs are interpreted to be less
+ * than 180 degrees or pi radians.
+ * Avoid arcs that are exactly pi (i.e, two antipodal vertices).


Might want to add a comment on consequences of this, and maybe throw an error if such an arc is encountered? Though presumably you could only tell if they were using the same const for Pi.

There isn't a numerically stable test for two points being exactly pi radians apart, so I'm not sure what we can do other than to let users know that two points will always be interpreted as describing the shortest arc between them. We can also note that the area is discontinuous around arcs of pi radians, so they should be avoided.

I suppose one thing we could do would be to pick some tolerance and throw an error if an arc's length is within tol of pi (that is, greater than pi - tol). Let me create an issue for that.

nrabinowitz · 2025-12-17T21:17:07Z

+H3Error H3_EXPORT(cellAreaKm2)(H3Index cell, double *out) {
+    H3Error err = H3_EXPORT(cellAreaRads2)(cell, out);
+    if (!err) {
+        *out *= EARTH_RADIUS_KM * EARTH_RADIUS_KM;


Nit: Worth precalculating this as a new const?

This is how it was done before when this function lived in the latlng.c file. I'm 95% certain modern compilers pick up on this and will make the constant for us, but we probably should verify that at some point.

I think compilers will form the constant for us automatically.
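
For reference, a sketch of the explicit-constant alternative raised in the nit. The value of `EARTH_RADIUS_KM` below is an assumption (check constants.h for the real definition), and `EARTH_RADIUS_KM2` and `rads2_to_km2` are hypothetical names.

```c
/* Assumed value; the authoritative definition lives in H3's constants.h. */
#define EARTH_RADIUS_KM 6371.007180918475

/* Hypothetical precomputed constant. The parentheses keep the product a
 * single constant expression wherever the macro is expanded. */
#define EARTH_RADIUS_KM2 (EARTH_RADIUS_KM * EARTH_RADIUS_KM)

/* Convert an area on the unit sphere (rads^2) to km^2. */
double rads2_to_km2(double rads2) {
    return rads2 * EARTH_RADIUS_KM2;
}
```

In the quoted snippet, `*out *= EARTH_RADIUS_KM * EARTH_RADIUS_KM` already evaluates the right-hand side as one expression before the compound multiply, so the product of the two literals is a constant expression that a compiler can fold at compile time. A named constant would mainly be a readability change.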

(Side note: Compiler optimizations can be pretty impressive. I spent a while seeing how I could speed up the area calculation math, and one of the ideas was to use a fused sincos like https://developer.apple.com/documentation/simd/sincos(_:)-9pa2l which avoids duplicating some computation. I didn't get any speedup... because the compiler was already doing exactly that in the cagnoli function!)

ajfriend added 11 commits November 26, 2025 23:46

area.c

df1b842

adding in more stuff

b6d9ddb

merp

8080e56

kahan accumulator

5c8950a

neumaier_add

b82cab2

notes

ec26cb8

benchmark area

74e277f

turn it to 11.

c915b51

sadly, no improvement from reusing some trig

a74b9d9

back to original

98ecbee

simplify

db21eb0

ajfriend added 5 commits November 28, 2025 20:49

try adding constants

04640ef

format, we must

2349ab4

Adder docs

0dd6a5e

bench

7528234

adder docs

4a4fb07

ajfriend added 11 commits November 28, 2025 23:03

Settled on Kahan implementation

6bd3299

Tighten tolerances in testH3CellAreaExhaustive.c due to compensated sum

1b21bed

6

64853b3

notes

3547c47

starting new tests

0c73921

tests work

31fd416

more tests

61702f1

comments

4b33b02

docstring

26eb143

better docstrings

06b2a6b

Adder note

f55e1b1

ajfriend added 3 commits November 30, 2025 20:33

drop numVerts < 3

6db33a3

degenerate loop tests

e2c20b0

try _compareArea(NULL, 0, 0.0);

eeb3046

ajfriend mentioned this pull request Dec 1, 2025

Add cellsToMultiPolygon function #1103

Open

6 tasks

ajfriend added 8 commits December 1, 2025 18:20

remove docs

28105c3

drop H3_EXPORT

a3ee94d

move geoLoopAreaRads2 out of public API for now

571d72e

slim down benchmark

7d0f56a

clear these comments, maybe?

77278a6

try adding area.h to APP_SOURCE_FILES

fa555e3

can't fail if you don't try!

64e9ad8

Also, probably do not need to fuzz if no longer in the public API

minor

a059397

ajfriend added 2 commits December 1, 2025 19:54

remove justfile

b24442f

Adder adder = {};

dd316e0

ajfriend mentioned this pull request Dec 2, 2025

(Multi)polygon helper functions #1104

Merged

ajfriend added 2 commits December 2, 2025 07:41

degenerate loop note

0c7efce

use same public/private underscore convention as iterators.h

ce65b2d

isaacbrodsky approved these changes Dec 10, 2025

Comment thread src/apps/testapps/testGeoLoopArea.c

isaacbrodsky reviewed Dec 10, 2025

Comment thread src/h3lib/lib/area.c

ajfriend added 3 commits December 16, 2025 19:41

move other cellArea* functions to area.c

2dac047

Stitch two 1/4 triangles

27911ae

add copyright headers

4931800

dfellis reviewed Dec 17, 2025

nrabinowitz approved these changes Dec 17, 2025

dfellis approved these changes Dec 17, 2025

ajfriend merged commit 7a9fdf1 into uber:master Dec 17, 2025
45 checks passed

ajfriend deleted the cagnoli_area branch December 17, 2025 22:58

ajfriend added this to the v4.5 milestone Feb 11, 2026

5 participants
