Embree geometry instancing for surface element BVH setup#130
Embree geometry instancing for surface element BVH setup#130Waqar-ukaea wants to merge 3 commits intoxdg-org:mainfrom
Conversation
204cc60 to
006ff68
Compare
|
Latest commit is essentially me having started fresh with the workflow changes to creation of Embree BVHs which have been recently merged in main. I.e making the new The main caveat however is still that I haven't been able to figure out how to get RTCPointQueries to work with the instancing in place. I'm seeing the same incorrect results I saw when I first refactored |
006ff68 to
e8974c9
Compare
|
I have managed to resolve the issues I was having with I'm not actually seeing any difference in memory usage in the instancing :/ But maybe that does actually make sense. Previously we were creating creating one set of RTCGeometry and surface_user_data per surface and attaching them to multiple RTCScenes (volume TLASes). Now with instances, I am still doing the same thing and re-using the RTCGeometry/user data for each surface but instead I am actually creating an RTCScene (BLAS) for each surface and then having the volume TLAS instance those BLASes. So if anything memory should be about the same or slightly higher because there are now more, smaller BVHs created for any given model. Because I think we have essentially gone from a few large per-volume BVHs to many small per-surface BVHs + ligher TLAS BVHs per-volume. Whilst we don't get any memory gains, we are essentially getting a finer level of modularity and the potential for extra control over filtering (i.e culling surfaces via masks). |
|
Copied over from #129 (comment) - A description on how we may be able to use instancing for finer grained control over surface culling. Another benefit of switching to per-surface instancing is that we can use instance masks to pre-filter which surface instances participate in a query. This lets us drop most instances before we enter their BLAS, reducing TLAS work for surface-scoped queries (for example Bucketed masksEmbree exposes a 32-but geometry mask. So we could use that to bucket surfaces into 32 possible groups: // Build time (on the instance geometry):
uint32_t bucket = surfaceId % 32;
uint32_t bit = 1u << bucket;
rtcSetGeometryMask(inst, bit);Then at ray query time set: // Ray query time:
RTCRayHit rh{};
uint_32_t bucket = (surfaceId % 32);
rh.ray.mask = 1u << bucket ; // select the bucket for this surface
rtcIntersect(scene, &rh);This prunes ~31/32 of instances up-front for that query, so if a volume TLAS has many surfaces, the traversal visits far fewer instance leaves. It won’t be exactly a 32× speedup but I think it will still be meaningful. If we ever need to perform a global surface queries (using a TLAS that instantiates all surfaces), the benefit is more obvious because we have a single TLAS with many surfaces. For example, with 3200 surfaces, bucketing reduces candidates to ~100 (≈1/32) for a surface-specific query. I'm not sure if this kind of query will ever be required but geometry masking seems like a cheap easy way to get some free speedups without any extra cost if we do end up merging this PR. Embree Point QueriesIt might not actually be possible for this to work with Embree Point Queries. I need to do some more reading but It's not clear whether or not they support geometry masking in the same way the standard ray traversal algorithms do. Vulkan/GPRTIn our GPRT implementation we actually already make use of explicit BLAS instancing. Though I haven't currently made use of geometry masking at all. I also think that in Vulkan geometry masks are limited to 8bits which means we can only cull a maximum of 7/8 buckets before BVH traversal. |
Looking through the Embree docs, it seems that PointQueries doesn't support masking unfortunately :/ So that does unfortunately put this PR in a slightly weird spot. We already perform volume scoped ray queries (ray_fire and point_in_volume) anyway which won't actually benefit from geometry masking since we don't want to exclude any surfaces in those BVH traversals. I do still think having instances in place is probably the more modern/conventional way to approach BVH setup. It is inherently more modular and provides additional flexibility in extra levers which we might be able to use in the future. The PR does also include some nice simplification of the
EDIT - To clarify, this is only for surface element BVH setup. I have not touched volumetric elements. I could do so in this PR if preferred. |
ad99003 to
fc2ff6e
Compare
Started work on switching over to embree instances to hopefully reduce the memory cost for BVHs on shared surfaces (to close issue #129). This branch seems to be working fine for ray intersections but point queries are not working. Currently the memory footprint has not been reduced at all from these changes
As far as I can tell the idea is that we create an
rtcGeometryof typeRTC_GEOMETRY_TYPE_INSTANCEand attach it to anrtcScenefor each unique surface and then thatrtcScenecan be attached as an instance to our volumertcSceneTLAS. The changes I've made so far seem to be working for ray intersections but not point queries. It's possible that ray intersections automatically descend down into the BLAS instance from the TLAS but point queries maybe don't? I did make an attempt to try and do this descent manually but it didn't fix the failing tests so I've commented it out in my latest commit.The tests in
test_closest.cppandtest_normal.cppare both failing due toxdg->closest()not returning the expected value:Test result:
Other changes:
rtcReleaseGeometryon anrtcGeometryafter its been attached to anrtcScenebut I notice that we hadn't been doing that previously so I don't know if this is strictly required or not.register_surface()is nowcreate_surface_instance()and returns anrtcScene(to be attached as an instance to the volume) rather than anrtcGeometry.primitive_idsto be handled entirely in thecreate_surface_instance().Useful links:
RTC_GEOMETRY_TYPE_INSTANCErtcPointQuerywhich mentions some stuff about instancing.