Hi, I'm the author of the SAT chapter in the Physics Pearls book.
Long ago (10-15 years, so my memory is fuzzy), I implemented this very idea. The scan without some O(N^2) loop is nontrivial - I used the O(N+M) Bentley-Ottmann algorithm for finding all the edge-edge intersections of the Gauss maps.
It has awesome theoretical properties because you're essentially just enumerating the very minimal set of axes, and you don't even have to evaluate the support function for them! The perf bottleneck was the graph algorithm itself: it took a ~100-vertex hull for it to start being faster than the highly SIMDizable sweep of classic SAT (with the optimization you're mentioning here).
I implemented two variants, sweeping from north to south and from west to east (planar graphs on a sphere are a bit tricky), but ultimately 100-vertex hulls aren't super useful in games, so I dropped the idea.
It'd be very humbling to find out I was wrong and there is a fast algorithm for finding all the intersections that makes this idea workable for smaller (~10 vertex) hulls. I'd appreciate it if you described the algorithm, as I can't tell what it is from skimming the code. At what point does it become faster than optimized, SIMDized classic SAT? I didn't try any algorithms other than BO; it's probably possible to make a practical BVH on the Gauss map that works better, but I haven't tried it. BO is quite fast, but it's hard to beat the speed of a simple, optimized, SIMDized loop.
Hello Sergiy, thank you for the history. I have not read Physics Pearls, but I learned a lot from your Valve presentation.
> because you're essentially just enumerating the very minimal set of axes, and you don't even have to evaluate the support function for them!
Yes, that is what my algorithm is doing, though you do need to evaluate one support function to seed it.
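To make "one support function to seed it" concrete, here is a minimal sketch (made-up names, not the code from the gist): a brute-force support map over the hull's vertices, called once to find the starting vertex region.

    /* tiny helpers shared by the sketches below */
    typedef struct { float x, y, z; } vec3;
    static float dot3(vec3 a, vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

    /* Brute-force support function: index of the hull vertex with the largest
     * projection onto direction d. One call like this seeds the traversal
     * with its starting vertex region. */
    static int support_vertex(const vec3 *verts, int count, vec3 d)
    {
        int best = 0;
        float bestDot = dot3(verts[0], d);
        for (int i = 1; i < count; ++i)
        {
            float t = dot3(verts[i], d);
            if (t > bestDot) { bestDot = t; best = i; }
        }
        return best;
    }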
However, moving between vertex regions requires a ray cast against a linear convex cone, which I think might not be much better once you tally it all up (a rough sketch of what even a single region test involves is below). I also realized I uploaded the wrong version of the code to GitHub.
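For a feel of the cost, here is an illustration only (hypothetical adjacency layout, not the traversal from the gist): a direction d lies in the vertex region of vertex v exactly when it has a non-positive dot product with every edge leaving v, so even the membership check is a handful of dot products per neighbor.

    /* adj[] holds the indices of v's neighboring vertices (hypothetical layout).
     * Returns 1 if direction d lies in v's normal cone, 0 otherwise.
     * vec3/dot3 as in the earlier sketch. */
    static int dir_in_vertex_region(const vec3 *verts, int v,
                                    const int *adj, int adjCount, vec3 d)
    {
        for (int i = 0; i < adjCount; ++i)
        {
            vec3 u = verts[adj[i]];
            vec3 e = { u.x - verts[v].x, u.y - verts[v].y, u.z - verts[v].z };
            if (dot3(d, e) > 0.0f)
                return 0;   /* d exits the cone across the arc shared with this edge */
        }
        return 1;
    }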
Edit: I had a link here, but it was also the wrong version, so I removed it to avoid confusing people. It has been a few years since I looked at this code, and I had so many branches that it's hard to remember which one is the right one.
Edit 2: OK, I think I found the right version. I will update the version in the repo, but this one is much simpler. I was experimenting with pre-transforming geometry (since you can do it fast with SIMD), but you can ignore that part and just calculate the geometry of B in the space of A on the fly. https://gist.github.com/cairnc/cf6ecbd43ee7552c10cfcd1c448769ed
Also, don't ask me why it is in C.
I will leave this code up, but I just realized it is also not quite the right version. It still does all the full support function evaluations for the face normals of A. To remove them, as you trace the edges of B, for every arc you hit you save the minimum dot product from your origin vertex region; i.e., you relax the values of the support function for A's face normals. Since every face normal of A is contained in a vertex region of B, traversing all the arcs of B means you've computed the dot product of each face normal of A with a vertex of its Minkowski difference.
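Concretely, once the walk tells you which vertex of B owns the region containing one of A's face normals, that face's separation needs no per-face support scan anymore; it's a single dot product (made-up names, just a sketch):

    /* nA           = face normal of A
     * planeOffsetA = dot(nA, any vertex of A on that face)
     * vB           = vertex of B whose region (on the Gauss map of -B)
     *                contains nA, i.e. B's support point in direction -nA.
     * A positive result means this face normal is a separating axis. */
    static float face_separation(vec3 nA, float planeOffsetA, vec3 vB)
    {
        return dot3(nA, vB) - planeOffsetA;
    }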
If a normal of A is not connected to any arc that directly intersects an arc of B, then you do a flood fill at the end to color those "islands", which are entirely contained in a single vertex region of B.
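A rough sketch of that last pass (hypothetical data layout, only to show the idea): the region label can be propagated across any arc of A that crossed nothing, since both endpoints of such an arc must sit in the same vertex region of B.

    /* faceAdj[f][i]    = i-th face of A sharing an edge with face f
     * arcCrossed[f][i] = 1 if the Gauss-map arc between those two normals
     *                    crossed any arc of B
     * regionOf[f]      = vertex of B whose region contains normal f, or -1.
     * Seeded by the faces classified during the arc walk; this loop colors
     * the remaining islands. */
    static void flood_fill_islands(int **faceAdj, const int *adjCount,
                                   int **arcCrossed, int faceCount, int *regionOf)
    {
        int changed = 1;
        while (changed)
        {
            changed = 0;
            for (int f = 0; f < faceCount; ++f)
            {
                if (regionOf[f] != -1) continue;
                for (int i = 0; i < adjCount[f]; ++i)
                {
                    int g = faceAdj[f][i];
                    if (!arcCrossed[f][i] && regionOf[g] != -1)
                    {
                        regionOf[f] = regionOf[g];
                        changed = 1;
                        break;
                    }
                }
            }
        }
    }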
I'll leave it there and keep looking for the code. I'll include a more detailed explanation in the next post.
I updated the code on GitHub to use the correct algorithm. I could not find the original code, so I tried to reimplement it from memory. It seems to be working...
Unsurprisingly, moving between regions ends up costing about the same as a full support evaluation. I suspect this is a theoretical limit for small hulls, and you can probably prove it using Euler's polyhedron formula. Oh well!
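The back-of-the-envelope version of that Euler argument (a hunch, not a proof): for a convex polyhedron

    V - E + F = 2   =>   E = V + F - 2

and the Gauss map has exactly one arc per edge of the hull, so the number of region boundaries you can be asked to cross is on the same order as faces plus vertices. For a ~10-vertex hull that is not meaningfully smaller than V, so paying a few dot products per crossing ends up comparable to just scanning all the vertices with a plain support evaluation.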
Perhaps a better name for this article would be "How to get the same result in a more roundabout way", but it doesn't quite have the same ring to it.
It remains to be seen whether a SIMDized version of this is better than a SIMD version of the traditional algorithm. In any case, here are both methods side by side, each in a standalone function. I've tried to comment the graph traversal approach:
https://gist.github.com/cairnc/039a0a8253db33991fd9dd0bf2e01e0c
Thanks for sharing the code!
Yes, this is still a superlinear loop, with numerical issues, but IMHO it still hints at the possibility of a better-complexity algorithm. The optimization that tests whether arcs overlap on the Gauss map obviates the need to compute the full support for arc-arc cases, as you correctly observed. That's step 1. Step 2 is eliminating the O(N^2) scan over all the arc-arc (edge-edge) pairs; it's unnecessary because there is an algorithm that finds the intersecting pairs in O(N+M) time and space. Unfortunately, sometimes O(N^2) is better because the constant is small. The side effect of step 2 is that we don't have to compute support for face-vertex cases, because the algorithm tells us which vertex region contains which face normal. That removes the O(F*V) step, which is another source of quadratic complexity.
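For anyone following along, the step 1 test is genuinely cheap: two arcs of the overlaid Gauss maps intersect exactly when the corresponding edge pair builds a face of the Minkowski difference, and the usual formulation of that check is a few triple products, roughly like this (vec3/dot3 as in the sketches above; the names are mine):

    static vec3 cross3(vec3 a, vec3 b)
    {
        vec3 r = { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x };
        return r;
    }

    /* Arc AB joins two adjacent face normals a, b of hull A; arc DC joins the
     * *negated* adjacent face normals c, d of hull B (the Gauss map of -B).
     * Only edge pairs whose arcs intersect need an actual distance test. */
    static int arcs_intersect(vec3 a, vec3 b, vec3 c, vec3 d)
    {
        vec3 bxa = cross3(b, a);   /* plane of arc AB */
        vec3 dxc = cross3(d, c);   /* plane of arc DC */
        float cba = dot3(c, bxa);
        float dba = dot3(d, bxa);
        float adc = dot3(a, dxc);
        float bdc = dot3(b, dxc);
        /* c,d straddle plane(a,b), a,b straddle plane(c,d), same hemisphere */
        return cba * dba < 0.0f && adc * bdc < 0.0f && cba * bdc > 0.0f;
    }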
My Bentley-Ottmann implementation took about 1300 lines of code with all the debug and logging code included, so it's not all that complex. It basically returns the arc-arc intersections without having to loop through the neighbors and throw away most of the portal candidates. I'd encourage you to try it. I wasn't successful at making it competitive for small practical hulls, but I may have missed something.
But fundamentally, as you observed, it's a concave optimization problem. There's no easy way (that I know of) to make it sublinear. You still have to diligently test every single axis you find, so it's still O(F+M) (where M is the number of arc-arc intersections of the overlaid Gauss maps). GJK can skip over swaths of vertices without ever testing them; it can be warm-started, and it's incremental. SAT is fundamentally different. Step 3 would be to build some kind of BVH over the Gauss map, and maybe a primal-dual kind of iteration to quickly zoom in on the region that contains the answer. I implemented the Expanding Polytope Algorithm many years ago; it has some of these qualities, but just like Bentley-Ottmann, it's too complicated for practical purposes.