Last time, I made the triangulation part fast enough for real time use in my game. But, adding the constraints is now way too slow with over 400 constraints to apply in a full level scenario. Finding the constraints, and the removing inside triangles is way too slow, as we do iterate over all the triangles every time. We will need to find a smarter solution to get away with it.
First Improvement : Point lookup
The first real problem is that we cannot look a point up and find a triangle it belongs to. We need to iterate over all the triangles to find it, which is very slow to begin with. So the first thing will be to have another structure that will enable fast point lookup, but which will be cheap enough to not slow down the triangulation phase. I decided on an array of triangle index with the same scheme as the adjacency: 30 top bits are the index + 1 and the two bottom bits are the offset in the triangle :
u32 triangleAggreg = pointTriangle[pointIndex];
u32 triangleIndex = (triangleAggreg >> 2) - 1;
u32 trianglePointIndex = triangleAggreg & 0b11;
Triangle* t = triangles + triangleIndex;
Assert(t->E[trianglePointIndex] == pointIndex);
It can give you any triangle that incorporate this point, but with the adjacency structure from the previous post, we can iterate all the triangles containing this point.
The good thing is that as we can give back any triangle, it’s very cheap on the triangulation side. Every time I create a triangle (or flip an edge), I assign the new triangle to the 3 points, resulting in a very cheap function to call :
inline void updatePointTriangle(u32* p, Triangle* t, u32 tIndex)
{
p[t->a] = ((tIndex + 1) << 2) | 0b00;
p[t->b] = ((tIndex + 1) << 2) | 0b01;
p[t->c] = ((tIndex + 1) << 2) | 0b10;
}
Don’t get me wrong that do incur some overhead on the triangulation side, but it’s not so bad, some movs, two ors and a load effective address on msvc with 02:
mov eax, DWORD PTR [rdx]
lea r9d, DWORD PTR [r8*4+4]
mov r8d, r9d
or r8d, 1
mov DWORD PTR [rcx+rax*4], r9d
or r9d, 2
mov eax, DWORD PTR [rdx+4]
mov DWORD PTR [rcx+rax*4], r8d
mov eax, DWORD PTR [rdx+8]
mov DWORD PTR [rcx+rax*4], r9d
ret 0
We do have one small problem though, as if the point is on the hull we may not be able to reach all the triangles if we keep turning anti-clockwise. In the image below if the structure gives us back the T1 point, we won’t get the T0 in our iteration :
So I need to have a way to restart clockwise if we hit a hull boundary. This makes the iteration a little ugly to write by hand every time, so I extracted this into a little point Iterator used like this :
for(PointTriangleIterator it = iterateTrianglePoint(triangles, i); isValid(it); it = nextTriangle(it))
{
u32 triangleIndex = getTriangleIndex(it);
u32 p0 = getPointIndex(it);
Triangle* start = triangles.triangles + triangleIndex;
u32 p2 = pointIndexTable[p0][1];
u32 p1 = pointIndexTable[p0][0];
// @Note do stuff
}
Finding intersecting Edges faster
The current version is the naive way, iterate over all triangles and check if we intersect the constrained edge. This clearly becomes a problem when you have large triangulations even with fast intersection check. But we can make an observation here. Now that we can look up triangles from a point we can iterate the triangles and look for 2 things :
Either we find the second constrain point in a triangle, meaning the constraint is already there and we can bail, or the opposite edge is intersecting the constrained line. If so, we can start what I call the Shish kebab algorithm, meaning the next intersecting triangle must share this edge, so add the edge to the intersecting list and go to next triangle sharing the edge. Here is an image to help visualize the process :
This make a huge speed up, as we now only iterate over the triangles that intersects the constraint (plus some overhead for other A triangles) instead of all the triangles in the triangulation. Here what the pseudo code looks like for an [AB] edge:
Iterate over all A triangles T : // @Note t[pointIndex] is A
if B belongs to T => stops
else if edge[t[pointIndex + 1 % 3]-t[pointIndex + 2 % 3]] intersects AB :
Add [t[pointIndex + 1 % 3]-t[pointIndex + 2 % 3]] to edge list
While(B is not found)
GetTriangle T2 that shares the last Edge on the list
if B belongs to T2 => stops
else if edge[t2[1]-t2[2]] intsersects add to the list
else add [t2[2]-t2[0]] to the list
This is already very good compared to our naive version, but we do an intersection check at each iteration. But if we think about it, we can have a lighter check operation. We already have the previous intersecting edge, which will know will have one of its point in the next one. As we know its intersects [AB], we know that a simple lineSide check will give opposite signs for the 2 points. So if we do a lineSide check for the new point and we take the opposite sign from the previous edge, that will be our new intersecting edge.
inline r32 lineSide(Vector2 edgeStart, Vector2 edgeEnd, Vector2 point)
{
return (point.x - edgeStart.x) * (edgeEnd.y - edgeStart.y) -
(point.y - edgeStart.y) * (edgeEnd.x - edgeStart.x);
}
This is a very light operation and our edgeStart and edgeEnd stay the same during the whole iteration, meaning we can cache half of those computation. Which makes our lineside check takes 2 muls and 3 subs.
With this finding intersection edges becomes pretty cheap. But we have another slow element we need to improve.
Removing Triangles faster
Once we forced an edge into the triangulation, we need to remove the triangles that are inside the polygon. For this we iterate over all the triangles, and do a line side with each constraints edge, if each point of the triangle is negative with all the edges it is inside and we can remove it. When removing a triangle, I remove it from the adjacency structure and the point-triangle structure. When one polygon is removed (case 1 in the image below) point-triangle structure is fine because that’s already handled with the hull case. But when 2 polygons removed share a point, then the point iteration is now broken, as there is no adjacency bewteen T2 and T3 :
This bugs me for a while, as I did not want to make a more complicated Point-Triangle structure, as it would have a larger impact on the triangulation code as well. The solution struck me when I thought that it was working but I was breaking it with removals. Well, then just do not remove them. Or at least remove it from the user-accessible portion. Here how I did removal :
My triangles are in a flat array (as you already know if you read the previous posts), so removing one was pretty easy, swap the triangle with the last one, decrement the size, and fix the adjacency and point-Triangle indices like this :
Triangle* t = triangles.triangles + i;
u32 index = --triangles.triangleCount;
if(index != i)
{
Triangle* adj = triangles.adjacency + i;
Triangle* t2 = triangles.triangles + index;
Triangle* adj2 = triangles.adjacency + index;
// @Note fix the point-triangle structure
for(u32 k = 0; k < 3; ++k)
{
triangles.pointTriangle[t2->E[k]] = ((i + 1) << 2) | k;
}
// @Note fix the adjacency structure
for(u32 k = 0; k < 3; ++k)
{
if(adj2->E[k])
{
u32 otherInd = (adj2->E[k] >> 2) - 1;
u32 offset = adj2->E[k] & 0b11;
triangles.adjacency[otherInd].E[offset] = ((i + 1) << 2) | k;
}
}
*t = *t2;
*adj = *adj2;
}
Now, I just need to keep 2 sizes, and do the fix up for the other triangle as well which looks like this:
Triangle* t = triangles.triangles + i;
u32 index = --triangles.triangleCount;
if(index != i)
{
Triangle* adj = triangles.adjacency + i;
Triangle* t2 = triangles.triangles + index;
Triangle* adj2 = triangles.adjacency + index;
for(u32 k = 0; k < 3; ++k)
{
triangles.pointTriangle[t2->E[k]] = ((i + 1) << 2) | k;
triangles.pointTriangle[t->E[k]] = ((index + 1) << 2) | k;
}
for(u32 k = 0; k < 3; ++k)
{
if(adj->E[k])
{
u32 otherInd = (adj->E[k] >> 2) - 1;
triangles.adjacency[otherInd].E[adj->E[k] & 0b11] = ((index + 1) << 2) | k;
}
}
for(u32 k = 0; k < 3; ++k)
{
if(adj2->E[k])
{
u32 otherInd = (adj2->E[k] >> 2) - 1;
// @Note if we see index in there, it's because the loop above modified us
if(otherInd == index)
{
otherInd = i;
}
u32 offset = adj2->E[k] & 0b11;
triangles.adjacency[otherInd].E[offset] = ((i + 1) << 2) | k;
}
}
Triangle oldT = *t;
Triangle oldadj = *adj;
*t = *t2;
*adj = *adj2;
*t2 = oldT;
*adj2 = oldadj;
}
So it’s a little more work, but that means that you can always have the triangles that are within a constrained, and the same final results could be used with multiple users who can toggle on and off some constraints, still using the same triangulation results.
We do need an additional function which is :
inline bool isRemovedTriangle(TriangulationResult& triangles, u32 index)
{
return index >= triangles.triangleCount;
}
And we are set, the Point-Triangle structure always works, you just need to call this function to know if you’re in bounds or not.
This is all good but it just fixed a bug we’re not faster by any means, we’re even a little slower. Our main issue is that we are iterating over every triangles, but maybe we can use the same trick as before ? Iterate over the triangle of a point, find the triangle containing the edge, with the third point inside. Remove this, and then walk the adjacency structure, if it leads to a removed triangle or a constrained edge do nothing, else remove the triangle.
The direct implementation of this is not great. It is fast for holes containing not many triangles, but the cost per triangle is way higher than the dumb all triangle iteration so as soon as we have a lot of triangles inside the cost skyrocket. I changed this, to have an array of bool for visited cells and first iterate on the constraints, marking outside triangles as visited so the core of the algorithm can now just follow the adjacecny structure and add not visited traingles. It is better but the very nature of the algorithm is a little too much “pointer tracking” which makes it not the best. Anyway, it will do for now and the major improvement point from this is that my constrained polygons does not have to be convex anymore which is good.
Small Roadbump: overlaping holes
The way we handle holes is to force its edges into the triangulation but if we have holes that overlaps then we have an issue as we will not be able to force a triangulation that contains both edges. As we can’t assume anything about what the gameplay code will do we have to handle it. Here is a little image that shows the problem:
There is a simple solution: allowing us to add points into the triangulation. If 2 constrained edges intersects, we add a point at the intersection, and change the constrained edges to include it.
In Game
I added a small trim operation that removes adjacency between triangles if the moving entity have a bounding box projection bigger than the edge we’re trying to go through. Here what it looks like in the game real time with a way bigger nav mesh that it will probably ever need.
The most expensive operation for now is, by far, the sanitizing of the obstacles which the game could probably do way faster with its spatial partition. But this is it for now, I am now able to generate a pretty big nav mesh each frame and do very simple navigation in it. But, if we put the dynamic part aside, this is a very common planification right now. It only works with entities that does not have their own movement constraints (like gravity), which is almost no entities in the game. So to improve that we’ll need a way to represent an entity movement in this triangulated space. But this will be for another post.
Thank you for reading.
The next post in this serie is out, go read it here