My first lightmap baker: Bakery
A note about my struggle while I was making my first lightmap baker.
It’s been a couple of years since I started making my own lightmap baker.
And before I start this article: the project is finished, although there is still a lot to do, such as optimization, completing some minor features, and fixing many bugs…
But now I’m planning to make a Mark II of it, because I judged that there are too many things that need to be changed. Also, I won’t use radiosity anymore; instead, I will use ray tracing with the Monte Carlo method for my next project.
Anyway, here’s how I created it.
At the beginning
Let’s go back to when I started this project.
I got a request that our game needed an indirect lighting technique to make it look fancier.
At that time I didn’t have any knowledge of indirect lighting, so I had no idea where to start.
But thankfully, one of my bosses gave me a keyword: radiosity.
So the first thing I read was Radiosity on Wikipedia. That taught me how it works, but I still needed something more detailed, or an actual example of an implementation.
And I found Radiosity, DX11 style. It described how to calculate irradiance (here, the result of the radiosity calculation) at every vertex, and bake it into each vertex using spherical harmonics.
But in my case, irradiance needed to be calculated for every texel of each object. So I decided I’d better make it possible to store it in texels first.
Find sample points
Unlike using vertices as sample points (what I call a sample point is a point whose position and normal direction are used to calculate irradiance, which is then stored there), using texels as sample points means their positions and normal directions have to be found from their UV coordinates.
So my idea for finding each sample point was:
- Pick a specific segment size. The number of sample points on the object surface will depend on this segment size.
- Multiply each UV position by the segment size. The result for each UV position (let’s call this a vertex on pixels) is that vertex’s position on the lightmap, in pixels.
- Connect the vertices on pixels by their indices (vertex order). You will get polygons on the lightmap (let’s call these polygons on pixels).
- Put each polygon on pixels on a grid whose line spacing is 1 pixel.
- Now you can see the intersections between each polygon on pixels and the 1x1 squares made by the grid.
- Find the centroid of every polygon made by these intersections.
- Convert every centroid to 3D space. That gives you the position and normal direction of each sample point.
Figure 01 shows how the steps described above work:
The next thing I had to figure out was an algorithm to find these centroids in UV space.
And after a few attempts, I got an idea:
It’s quite simple. Suppose the segment size is 32.
Then let’s start off by making a 32x32 square in UV space.
- Check whether a box (square) and a triangle (polygon on pixels) intersect. The result of the test can be one of these four:
- Triangle contains box
- Box contains triangle
- They’re intersected
- Nothing intersected
- If the triangle contains the box, every 1x1 square inside the box is also contained by the triangle. Therefore, chop the box into 1x1 squares and calculate the centroid of each square.
- If the box contains the triangle, or they’re just intersected, chop the box in half and go back to step 1, performing this procedure again for each of the two chopped boxes. If the box is taller than it is wide, chop it in the horizontal direction; otherwise, chop it in the vertical direction.
- If nothing is intersected between them, this loop can be finished.
- If the box size reaches 1x1, the loop can also be finished, but only after performing the following steps:
- If the triangle contains the box, calculate the centroid of the box.
- If the box contains the triangle, calculate the centroid of the triangle.
- If they’re just intersected, find the generated polygon (the polygon produced by their intersection) and calculate its centroid.
And here’s Figure 02. As I mentioned, when a box is chopped in half, the two chopped boxes each need to go through the procedure above individually, but I depicted only one of them here for the sake of explanation.
Also, I used the Sutherland–Hodgman algorithm to find the polygon generated by the intersection of two polygons.
The article mentions that if the subject polygon is concave, the output polygon may have extraneous lines, since the algorithm outputs only a single vertex list.
But luckily, the subject polygon here is always a triangle, so it will never be concave.
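Since that clipping step is the only non-trivial geometric routine here, this is a minimal sketch of Sutherland–Hodgman clipping in C++; the Vec2/Polygon2D types and the counter-clockwise winding assumption are mine, not part of the baker.
#include <vector>
struct Vec2 { float x, y; };
using Polygon2D = std::vector<Vec2>;
// Signed test: > 0 when 'p' lies on the inner side of the directed edge a->b
// (assuming the clip polygon is wound counter-clockwise).
static float edgeSide(const Vec2& a, const Vec2& b, const Vec2& p){
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}
// Intersection of segment p0->p1 with the infinite line through a->b.
static Vec2 lineIntersect(const Vec2& a, const Vec2& b, const Vec2& p0, const Vec2& p1){
    float d0 = edgeSide(a, b, p0);
    float d1 = edgeSide(a, b, p1);
    float t = d0 / (d0 - d1);
    return { p0.x + t * (p1.x - p0.x), p0.y + t * (p1.y - p0.y) };
}
// Sutherland–Hodgman: clip 'subject' (the triangle) against every edge of the
// convex 'clip' polygon (the 1x1 box). Returns the generated polygon.
Polygon2D clipPolygon(const Polygon2D& clip, const Polygon2D& subject){
    Polygon2D output = subject;
    for(size_t i = 0; i < clip.size(); ++i){
        const Vec2& a = clip[i];
        const Vec2& b = clip[(i + 1) % clip.size()];
        Polygon2D input = output;
        output.clear();
        for(size_t j = 0; j < input.size(); ++j){
            const Vec2& cur = input[j];
            const Vec2& prev = input[(j + input.size() - 1) % input.size()];
            bool curIn = edgeSide(a, b, cur) >= 0.0f;
            bool prevIn = edgeSide(a, b, prev) >= 0.0f;
            if(curIn){
                if(!prevIn)
                    output.push_back(lineIntersect(a, b, prev, cur));
                output.push_back(cur);
            }
            else if(prevIn){
                output.push_back(lineIntersect(a, b, prev, cur));
            }
        }
    }
    return output;
}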
And here is C++-like pseudo-code of my idea. I used the Intel® TBB library, which let me parallelize the code easily because it has its own task scheduler.
If you’re working with MSVC, you don’t necessarily have to integrate TBB. As an alternative, you can use the built-in PPL library (available since Visual Studio 2010), which has almost the same interface as TBB.
#include "tbb/task_group.h"
...
Triangles triangles;
Centroids centroids;
tbb::task_group worker;
...
Polygon calculateIntersectGeneratedPolygon(const Polygon& clipPolygon, const Polygon& subjectPolygon){
// use Sutherland–Hodgman algorithm to calculate 'generated polygon'.
// mind you that 'subjectPolygon' will be clipped by each side of 'clipPolygon'.
...
}
void collectCentroidsInTriangle(
const Triangle& triangle,
unsigned left,
unsigned top,
unsigned right,
unsigned bottom
)
{
Box box(left, top, right, bottom);
// the 'state' can be expressed by:
// "triangle contains box"
// "box contains triangle"
// "they're intersected"
// "nothing intersected"
IntersectState state = box.intersect(triangle);
// if 'box' has 1x1 of size, you can't chop it anymore.
if(box.width == 1 && box.height == 1){
if(state == "triangle contains box"){
centroids.add(box.calCentroid());
}
else if(state == "box contains triangle"){
centroids.add(triangle.calCentroid());
}
else if(state == "they're intersected"){
// find polygon generated by intersection between 'box' and 'triangle'.
// then calculate its centroid.
centroids.add(calculateIntersectGeneratedPolygon(box, triangle).calCentroid());
}
}
else{
if(state == "triangle contains box"){
// chop 'box' in every 1x1 square.
// then calculate centroid of each square.
for(unsigned y = top; y < bottom; ++y){
for(unsigned x = left; x < right; ++x){
Box boxLocal(x, y, x + 1, y + 1);
centroids.add(boxLocal.calCentroid());
}
}
}
else if(state == "box contains triangle" || state == "they're intersected"){
// chop current 'box' in half and execute this function again.
// if 'box' has larger height than width, chop it in horizontal direction.
// or else, chop it in vertical direction.
if(box.height > box.width){
worker.run([&]{
collectCentroidsInTriangle(triangle, left, top, right, (top + bottom) / 2);
});
worker.run([&]{
collectCentroidsInTriangle(triangle, left, (top + bottom) / 2, right, bottom);
});
}
else{
worker.run([&]{
collectCentroidsInTriangle(triangle, left, top, (left + right) / 2, bottom);
});
worker.run([&]{
collectCentroidsInTriangle(triangle, (left + right) / 2, top, right, bottom);
});
}
}
}
}
void collectCentroidsInTriangles(unsigned segmentSize){
// scale UV coordinate by segmentSize
triangles.setSegment(segmentSize);
// since multiple triangles can exist in the same object,
// you need to iterate all triangles in the object.
for(const Triangle& triangle : triangles){
unsinged minX, minY, maxX, maxY;
// find min/max UV coordinate of each 'triangle'
triangle.findMinMax(&minX, &minY, &maxX, &maxY);
worker.run([&]{
collectCentroidsInTriangle(triangle, minX, minY, maxX + 1, maxY + 1);
});
}
}
...
int main(){
...
triangles.init();
centroids.init();
collectCentroidsInTriangles(32);
// wait for every task to complete
worker.wait();
...
}
Now there are a couple of steps left before we’re ready to calculate irradiance.
Centroids become sample points in 3D space
So far we’ve seen how to find centroids in 2D space. Now it’s time to convert them to 3D space so that we can use them as actual sample points.
Fortunately, there was a good tool for converting my centroids: Barycentric coordinates.
According to the article linked above, each centroid can be converted as follows:
- Here you have a triangle \(\triangle ABC\).
- And there’s a point \(P\) which is placed inside \(\triangle ABC\).
- If you connect the points \(A\), \(B\), \(C\) and \(P\), you will get 3 sub-triangles: \(\triangle ABP\), \(\triangle ACP\) and \(\triangle BCP\).
- Let’s say \(x = \frac{\triangle BCP_{\mathrm{Area}}}{\triangle ABC_{\mathrm{Area}}}\), \(y = \frac{\triangle ACP_{\mathrm{Area}}}{\triangle ABC_{\mathrm{Area}}}\) and \(z = \frac{\triangle ABP_{\mathrm{Area}}}{\triangle ABC_{\mathrm{Area}}}\). Then you can see that \(x + y + z = 1\).
- Finally, you can calculate \(P\) as follows: \(P = xA + yB + zC\).
So first, in the current situation, \(\triangle ABC\) represents each triangle of the object, and \(P\) is each calculated centroid inside that triangle (note that this is not the centroid of \(\triangle ABC\); I’m talking about the centroids we calculated above).
Therefore, it is possible to find \(x\), \(y\) and \(z\) for each centroid.
Also, you don’t have to compute all of \(x\), \(y\) and \(z\) from their areas, only two of them, since \(x + y + z = 1\).
Second, remember that the points \(A\), \(B\) and \(C\) of the triangle are actually vertices of geometry in 3D space, and so far we have only used the UV component of each vertex.
So let \(A^\prime\), \(B^\prime\) and \(C^\prime\) be the position components of those vertices in 3D space.
Then, finally, each centroid’s 3D coordinate \(P^\prime\) is: \(P^\prime = xA^\prime + yB^\prime + zC^\prime\).
You can apply the same weights to every component of the vertex, such as the normal direction. But in that case, make sure to normalize after calculating \(P^\prime\).
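As a concrete illustration, here is a small sketch of that conversion; the Vec2/Vec3 types and the triangleArea helper are my own assumptions, with the weights computed from the UV-space sub-triangle areas and then applied to the 3D vertex attributes.
#include <cmath>
struct Vec2 { float x, y; };
struct Vec3 { float x, y, z; };
static float triangleArea(const Vec2& a, const Vec2& b, const Vec2& c){
    return std::fabs((b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x)) * 0.5f;
}
struct SamplePoint { Vec3 position; Vec3 normal; };
// 'a', 'b', 'c' are the triangle's vertices: UV in lightmap space, pos/nrm in 3D space.
SamplePoint centroidToSamplePoint(
    const Vec2& centroidUV,
    const Vec2& aUV, const Vec2& bUV, const Vec2& cUV,
    const Vec3& aPos, const Vec3& bPos, const Vec3& cPos,
    const Vec3& aNrm, const Vec3& bNrm, const Vec3& cNrm)
{
    float areaABC = triangleArea(aUV, bUV, cUV);
    float x = triangleArea(bUV, cUV, centroidUV) / areaABC; // weight of A
    float y = triangleArea(aUV, cUV, centroidUV) / areaABC; // weight of B
    float z = 1.0f - x - y;                                 // weight of C (x + y + z = 1)
    SamplePoint sp;
    sp.position = { x * aPos.x + y * bPos.x + z * cPos.x,
                    x * aPos.y + y * bPos.y + z * cPos.y,
                    x * aPos.z + y * bPos.z + z * cPos.z };
    Vec3 n = { x * aNrm.x + y * bNrm.x + z * cNrm.x,
               x * aNrm.y + y * bNrm.y + z * cNrm.y,
               x * aNrm.z + y * bNrm.z + z * cNrm.z };
    float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    sp.normal = { n.x / len, n.y / len, n.z / len };        // don't forget the normalization
    return sp;
}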
One pixel, multiple sample points
I briefly mentioned in the pseudo-code above that multiple triangles can exist in the same object.
Let’s think about what will happen if more than one triangle is placed in the same space.
With a high chance, many pixels will contain more than one centroid. One of the common cases where this happens is a rectangle.
As you know, every primitive is expressed as a sequence of triangles, and a rectangle is no exception. More precisely, a rectangle is made up of two triangles.
Then where will it be? Which spot of the rectangle has multiple centroids? As you might intuitively notice, it’s the diagonal side of the rectangle.
The diagonal side of the rectangle is, in other words, the spot where one edge from each of the two triangles overlaps.
And that edge also crosses pixels. See Figure 03.
We just saw the case where two centroids are placed in one pixel. But there are many more cases where more than two centroids land in one pixel, depending on how the triangles are laid out.
One pixel, multiple sample points: What’s the matter?
Then I asked myself what the actual problem is if several irradiance values are stored in the same pixel. To answer that, I needed to remember how the lightmap is used.
The final lightmap I’m going to make will be used as a texture when the primitives in a scene are rendered.
So if multiple irradiance values are stored in one pixel, it will cause an unexpected (or weird-looking) result when the shader samples that pixel’s position while rendering the primitives.
But there was one more thing I had to remember. I said it will cause an unexpected result, but in a way it is quite predictable. What I mean is that I assumed the triangles in UV space do not overlap each other. See Figure 04.
And that means the sample points whose irradiance needs to be stored in the same place will be quite close together in 3D space. So I immediately thought: what if I calculate an average position for the overlapped sample points, and then compare the result with the original approach?
And this is how I calculate irradiance with both the original and the average-position approach.
The original:
- Calculate the irradiance of each sample point.
- Calculate the area of each polygon whose centroid in 3D space is that sample point.
- Add up all the areas in the current pixel, and divide each area by this sum (I’m going to call this the participation rate).
- Multiply each irradiance by its participation rate, then add them together.
- This sum is the irradiance of the current pixel: a weighted-average irradiance (see the sketch below).
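A sketch of that weighted average, assuming each overlapping sample carries its irradiance and the area of its polygon (the names are mine):
#include <vector>
struct Color { float r, g, b; };
struct PixelSample {
    Color irradiance;  // irradiance calculated at this sample point
    float polygonArea; // area of the polygon whose centroid is this sample point
};
// Weighted average of all samples that landed in the same lightmap pixel.
Color resolvePixel(const std::vector<PixelSample>& samples){
    float totalArea = 0.0f;
    for(const PixelSample& s : samples)
        totalArea += s.polygonArea;
    Color result{0.0f, 0.0f, 0.0f};
    for(const PixelSample& s : samples){
        float participationRate = s.polygonArea / totalArea; // area / total area of the pixel
        result.r += s.irradiance.r * participationRate;
        result.g += s.irradiance.g * participationRate;
        result.b += s.irradiance.b * participationRate;
    }
    return result;
}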
The average position:
- Calculate the area of each polygon whose centroid in 3D space is that sample point.
- Add up all the areas in the current pixel, and divide each area by this sum (this is the same participation rate as above).
- Multiply the position and normal direction of each sample point by its participation rate, then sum the positions and the normal directions separately. Also, don’t forget to normalize the summed normal direction.
- Shoot a ray from the summed position along the summed normal direction, and test whether it intersects any of the polygons belonging to the current pixel.
- Find the farthest intersection. That is the new sample point position of the current pixel.
- If nothing is intersected, shoot a ray from the same point along the negated summed normal direction, and likewise test whether it intersects any of the polygons belonging to the current pixel.
- Find the nearest intersection. That is the new sample point position of the current pixel.
- If still nothing is intersected, the summed position is the new sample point position of the current pixel.
- Calculate the irradiance at the new sample point position, using the summed normal direction (a sketch of this procedure follows the list).
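Here is a rough sketch of that control flow; the types, the operator overloads and the caller-supplied intersection test are my own assumptions, so treat it as an outline rather than the baker's actual code.
#include <vector>
#include <optional>
#include <functional>
#include <cmath>
struct Vec3 { float x, y, z; };
static Vec3 operator+(const Vec3& a, const Vec3& b){ return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 operator*(const Vec3& a, float s){ return {a.x * s, a.y * s, a.z * s}; }
static Vec3 normalize(const Vec3& v){
    float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return {v.x / len, v.y / len, v.z / len};
}
struct WeightedSample { Vec3 position; Vec3 normal; float participationRate; };
// Caller-supplied test against one polygon of the current pixel; returns the hit
// distance along the ray, or nothing if there is no hit.
using PolygonIntersect = std::function<std::optional<float>(size_t polygonIndex,
                                                            const Vec3& origin,
                                                            const Vec3& dir)>;
Vec3 averagedSamplePosition(const std::vector<WeightedSample>& samples,
                            size_t pixelPolygonCount,
                            const PolygonIntersect& intersect,
                            Vec3* outNormal)
{
    // weighted sums of position and normal direction by participation rate
    Vec3 pos{0.0f, 0.0f, 0.0f}, nrm{0.0f, 0.0f, 0.0f};
    for(const WeightedSample& s : samples){
        pos = pos + s.position * s.participationRate;
        nrm = nrm + s.normal * s.participationRate;
    }
    nrm = normalize(nrm);
    *outNormal = nrm;
    // 1) shoot a ray along the summed normal and take the farthest hit
    std::optional<float> best;
    for(size_t i = 0; i < pixelPolygonCount; ++i)
        if(auto t = intersect(i, pos, nrm))
            if(!best || *t > *best) best = t;
    if(best) return pos + nrm * (*best);
    // 2) otherwise shoot along the negated normal and take the nearest hit
    Vec3 neg = nrm * -1.0f;
    for(size_t i = 0; i < pixelPolygonCount; ++i)
        if(auto t = intersect(i, pos, neg))
            if(!best || *t < *best) best = t;
    if(best) return pos + neg * (*best);
    // 3) if still nothing was hit, keep the weighted-average position itself
    return pos;
}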
And thankfully there was not much difference between the two, except when the segment size gets too small. In the worst case, every triangle in the object lands in the same pixel, and that may cause an obvious difference between the two. But note that neither the original nor the averaged method will show a correct result in that case.
Direct light computation
Now it’s time to jump into the actual calculation of the lightmap.
And this will be the first visibly observable result of the whole process.
The direct lighting itself is dead easy, and probably has no particular relation to the radiosity method.
I did it with traditional diffuse lighting, which is usually called Lambertian shading. As you know, it’s just done by calculating \(\cos\theta\) between the surface normal direction \(N\) and the negated incident light direction \(L\).
Also, I added a tiny step that calculates the occlusion between the light and the surface, so that other objects (or even the object itself) can cast shadows.
And this is simple pseudo-code of the direct light calculation.
The code checks the occlusion between each light and each sample point. If something occludes it, that light’s contribution to the sample point is zero; otherwise, it calculates the diffuse colour using Lambertian shading depending on the light type (e.g., directional light or point light).
for(auto& sample : samplePoints){
    sample.totalLight = 0;
    for(const auto& light : lights){
        Ray rayBetween = calculateRay(light, sample);
        // check whether any object blocks the ray between the light and the sample.
        // (in practice the hit at the sample's own surface has to be skipped,
        // e.g. with a small bias, or every sample would shadow itself.)
        bool bOccluded = false;
        for(const auto& object : objects){
            if(object.intersect(rayBetween)){
                bOccluded = true;
                break;
            }
        }
        sample.totalLight += bOccluded ? 0 : light.calculateColorOf(sample);
    }
}
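For completeness, light.calculateColorOf(sample) above hides the actual Lambertian term. This is a sketch of what it could look like for directional and point lights; the Light and Sample fields are my assumptions, not the baker's real data layout.
#include <algorithm>
#include <cmath>
struct Vec3 { float x, y, z; };
static float dot(const Vec3& a, const Vec3& b){ return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 sub(const Vec3& a, const Vec3& b){ return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 normalize(const Vec3& v){
    float len = std::sqrt(dot(v, v));
    return {v.x / len, v.y / len, v.z / len};
}
struct Color { float r, g, b; };
struct Sample { Vec3 position; Vec3 normal; };
enum class LightType { Directional, Point };
struct Light {
    LightType type;
    Vec3 direction;   // incident direction (directional light)
    Vec3 position;    // world position (point light)
    Color color;
    float intensity;
    Color calculateColorOf(const Sample& sample) const {
        Vec3 toLight;             // direction from the sample towards the light (= -L)
        float attenuation = 1.0f;
        if(type == LightType::Directional){
            toLight = normalize({ -direction.x, -direction.y, -direction.z });
        }
        else{                     // point light: attenuate by the squared distance
            Vec3 d = sub(position, sample.position);
            float distSq = dot(d, d);
            toLight = normalize(d);
            attenuation = 1.0f / distSq;
        }
        // Lambertian term: cos(theta) between the surface normal and -L, clamped at 0
        float lambert = std::max(0.0f, dot(sample.normal, toLight));
        float s = lambert * intensity * attenuation;
        return { color.r * s, color.g * s, color.b * s };
    }
};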
And Figure 07 shows the result of this direct light pass. I placed a box on the ground, with only one directional light set up in the scene. In order to observe the result clearly, I didn’t use any texture filtering to render the lightmap.
So the direct light calculation is easily done. But as you can notice, there is obvious aliasing. Of course, in actual practice I used a bilinear filter to render the lightmap, but it was still hard to hide the zigzag pattern.
See how different it looks compared to the unfiltered one.
As you can see, the aliasing remains.
And I thought the simplest way to reduce the aliasing was to increase the lightmap size. Yet that would take too much time to calculate.
So, as a halfway solution, I divided each sample into small pieces, but only for the direct light calculation. Hence, the indirect light calculation is not affected.
And I made a simple logic to divide them. You may remember that in the early stage I converted each centroid to 3D space, and that became the sample point; each centroid also has its own polygon (if you don’t remember, check here). So what I did is connect the centroid with every vertex of the polygon, and connect each vertex to its neighbouring vertices. Then you will find sub-triangles in the polygon. Now find the centroid of each sub-triangle and convert them to 3D space. Those will be the sub-divided sample points for the direct light calculation.
Since the sub-divided sample points are placed in the same pixel, don’t forget to weight each of them by the area of its sub-divided polygon (a sketch of the subdivision follows below).
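Here is a small sketch of that subdivision step in UV space, under my own assumptions about the types: it fans the centroid’s polygon into sub-triangles (the centroid plus each pair of neighbouring vertices) and returns the centroid and area of each sub-triangle. Each sub-centroid can then be converted to 3D with the same barycentric step as before.
#include <vector>
#include <cmath>
struct Vec2 { float x, y; };
struct SubSample {
    Vec2 centroid; // centroid of the sub-triangle, still in UV/pixel space
    float area;    // used later to weight the sub-sample inside its pixel
};
static float triangleArea(const Vec2& a, const Vec2& b, const Vec2& c){
    return std::fabs((b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x)) * 0.5f;
}
// 'polygon' is the clipped polygon belonging to one sample point,
// 'centroid' is that polygon's centroid (the original sample point in UV space).
std::vector<SubSample> subdivideForDirectLight(const std::vector<Vec2>& polygon,
                                               const Vec2& centroid)
{
    std::vector<SubSample> result;
    for(size_t i = 0; i < polygon.size(); ++i){
        const Vec2& v0 = polygon[i];
        const Vec2& v1 = polygon[(i + 1) % polygon.size()];
        SubSample s;
        // centroid of the sub-triangle (centroid, v0, v1)
        s.centroid = { (centroid.x + v0.x + v1.x) / 3.0f,
                       (centroid.y + v0.y + v1.y) / 3.0f };
        s.area = triangleArea(centroid, v0, v1);
        result.push_back(s);
    }
    return result;
}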
And the result of this subdivision is Figure 09.
Along the edges of the shaded region (especially the diagonal side in the lightmap), it looks multi-sampled. Now let’s see what happens when the bilinear filter is applied.
Now we can see the aliasing is somewhat relieved.
Direct light computation: A small tweak for speed up
While writing this article, I randomly got an idea to improve the speed of the direct light computation. But it is just an idea, not implemented yet, so I’ll have to test it in future work.
Anyway, what I think is that most pixels of a shaded region have the same value, except where they are close to the unshaded region. Therefore, if a pixel has the same (or even similar) result as its nearby pixels, it doesn’t have to be sub-divided.
So the estimation would go like this (a sketch follows the list):
- Calculate direct light at the sample points.
- Iterate over the sample points and calculate the difference with their neighbouring pixels (sample points).
- If the difference is large enough (above a threshold), sub-divide the polygon of that sample point and calculate direct light again.
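Just to make the idea concrete, this is how such a check might look; it is not what the baker actually does, and the neighbour lookup and threshold are my own assumptions.
#include <algorithm>
#include <cmath>
#include <vector>
struct Color { float r, g, b; };
// Simple difference metric between two already-computed pixel values.
static float difference(const Color& a, const Color& b){
    return std::fabs(a.r - b.r) + std::fabs(a.g - b.g) + std::fabs(a.b - b.b);
}
// 'directLight' is the lightmap after the coarse pass, width*height pixels.
// Returns the indices of pixels whose neighbours differ enough to justify sub-division.
std::vector<size_t> pixelsToSubdivide(const std::vector<Color>& directLight,
                                      unsigned width, unsigned height,
                                      float threshold)
{
    std::vector<size_t> result;
    for(unsigned y = 0; y < height; ++y){
        for(unsigned x = 0; x < width; ++x){
            const Color& c = directLight[y * width + x];
            float maxDiff = 0.0f;
            // compare against the 4 direct neighbours
            if(x > 0)          maxDiff = std::max(maxDiff, difference(c, directLight[y * width + x - 1]));
            if(x + 1 < width)  maxDiff = std::max(maxDiff, difference(c, directLight[y * width + x + 1]));
            if(y > 0)          maxDiff = std::max(maxDiff, difference(c, directLight[(y - 1) * width + x]));
            if(y + 1 < height) maxDiff = std::max(maxDiff, difference(c, directLight[(y + 1) * width + x]));
            if(maxDiff > threshold)
                result.push_back(y * width + x);
        }
    }
    return result;
}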
But I’m not sure whether it will be effective if the direct light calculation contains point lights or spot lights, or if it allows transparent/translucent materials or caustic effects. In those cases, each sample point has a much higher chance of having a different value from the others, even inside the shaded region.
The radiosity
Now it’s time to look at how to calculate the indirect light. As I mentioned in the early stages, I used the radiosity method to calculate it. The reason I chose it is, well, that it was the only keyword I had to search for.
According to the slides Radiosity OverView (see slide 5 on the page), the radiosity method has its basis in the theory of thermal radiation, and here that theory is applied to describe light energy.
Anyway, the important part is its equation. For a finite number \(n\) of surfaces, the equation is:
\(B_i = E_i + \rho_i \sum_{j=1}^{n} B_j \cdot F_{ij}\)
where:
\(B_i =\) radiosity of surface \(i\)
\(E_i =\) emissivity of surface \(i\)
\(\rho_i =\) reflectivity of surface \(i\)
\(B_j =\) radiosity of surface \(j\)
\(F_{ij} =\) form factor of surface \(j\) relative to surface \(i\)
As you can see, the equation is recursive, because each \(B_j\) on the right-hand side is the result of the previous iteration.
Also, \(B_i\) is the irradiance we’ve been looking for.
In the case of \(E_i\) and \(\rho_i\), I could leave them as constant values, or make them configurable depending on the surface material.
And there’s one term left, called the form factor, \(F_{ij}\). It stands for the relation between the energy that leaves one surface and arrives at another. The following equation shows how the form factor should be calculated.
\(F_{ij} = \frac{1}{A_i} \int_{A_i} \int_{A_j} \frac{\cos\theta_i \cdot \cos\theta_j}{\pi r^2} \, dA_j \, dA_i\)
(It will be easier to see slide 7 of this page.)
Each symbol means:
\(A_i\) and \(A_j\) are the areas of surface \(i\) and surface \(j\).
\(\theta_i\) is the angle between the normal direction of surface \(i\) and the direction from surface \(i\) towards surface \(j\).
\(\theta_j\) is the angle between the normal direction of surface \(j\) and the direction from surface \(j\) towards surface \(i\).
\(r\) is the distance between surface \(i\) and surface \(j\).
And based on the equation, I wrote code like this:
for(auto& sample : samplePoints){
    sample.totalLight = directLightOfCurrentSample() + sample.emissivity;
    sample.emitLight = sample.totalLight;
}
for(unsigned bounce = 0; bounce < bounceCount; ++bounce){
    Color totalEmitOfBounce = 0;
    for(auto& iSample : samplePoints){
        Color totalIncidentOfSample = 0;
        for(const auto& jSample : samplePoints){
            if(iSample == jSample)
                continue;
            Ray rayBetween = calculateRay(iSample, jSample);
            bool bOccluded = false;
            for(const auto& object : objects){
                if(object.intersect(rayBetween)){
                    bOccluded = true;
                    break;
                }
            }
            if(!bOccluded){
                Color addLight = jSample.emitLight * jSample.reflectivity;
                addLight *= calFormFactor(iSample, jSample);
                totalIncidentOfSample += addLight;
            }
        }
        iSample.totalLight += totalIncidentOfSample;
        // store the newly gathered light separately, so that every sample in this
        // bounce reads the emitLight of the previous bounce, not a half-updated one.
        iSample.nextEmitLight = totalIncidentOfSample;
        totalEmitOfBounce += totalIncidentOfSample;
    }
    // what was gathered in this bounce becomes the emitted light of the next bounce
    for(auto& sample : samplePoints)
        sample.emitLight = sample.nextEmitLight;
    if(isTooSmall(totalEmitOfBounce))
        break;
}
Just like the equation, the totalLight of each sample point is calculated from the previous bounce’s result.
And bounceCount is a configurable value. By increasing bounceCount, the final result becomes more accurate. The reason I left it configurable is that the light bouncing needs a huge amount of calculation; it’s literally too slow. Moreover, in most cases there are only tiny differences after 3 or 4 bounces.
Also, if something exists between iSample and jSample, it means there is no incident light from jSample onto iSample at the current bounce.
calFormFactor(…) returns a value between 0.0 and 1.0, so totalEmitOfBounce keeps decreasing with every bounce iteration. As a result, when totalEmitOfBounce gets close to zero, further bounce iterations are cancelled.
Apparently, there are plenty of ways to implement the radiosity method for lightmap baking, but I could start off with two different ones. The biggest difference between the two is how the form factor is calculated, and I’m going to review both. But before starting: each way has its pros and cons, so I can’t say which is better.
Sample point to sample point
The first way is to visit every sample point for every other sample point. What I’m saying is: suppose we have a table which contains every sample point in the scene. We iterate over all the sample points in the table, and while iterating each sample point, we also iterate over all the other sample points and calculate the form factor with each of them.
As you can see, it takes \(O(n^2)\) time, where \(n\) is the number of sample points. But in general, it’s quite hard to find a case where one sample point relates to every other sample point in the scene, because the equation shows that the form factor is only valid when the angle between a sample point’s normal direction and the direction from that sample point to the other one is smaller than 90°. In other words, if two sample points are not within each other’s hemisphere (oriented along their normal directions), they don’t need to be related.
Also, if two sample points are too far away from each other, it’s okay to ignore the pair, since the equation divides the form factor by \(r^2\).
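This is a sketch of such a point-to-point (differential) form factor between two sample points. It is my own approximation of the double integral above, not the baker's exact calFormFactor, and it assumes each sample carries the area of its polygon.
#include <algorithm>
#include <cmath>
struct Vec3 { float x, y, z; };
static float dot(const Vec3& a, const Vec3& b){ return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 sub(const Vec3& a, const Vec3& b){ return {a.x - b.x, a.y - b.y, a.z - b.z}; }
struct SamplePoint { Vec3 position; Vec3 normal; float area; };
// Differential approximation of F_ij for two small patches i and j:
// cos(theta_i) * cos(theta_j) / (pi * r^2) * A_j
float calFormFactor(const SamplePoint& i, const SamplePoint& j){
    Vec3 d = sub(j.position, i.position);
    float distSq = dot(d, d);
    if(distSq < 1e-6f)
        return 0.0f;                      // degenerate / coincident samples
    float dist = std::sqrt(distSq);
    Vec3 dir = { d.x / dist, d.y / dist, d.z / dist };
    float cosI = dot(i.normal, dir);      // angle at i, towards j
    float cosJ = -dot(j.normal, dir);     // angle at j, towards i
    if(cosI <= 0.0f || cosJ <= 0.0f)
        return 0.0f;                      // outside each other's hemisphere
    float f = (cosI * cosJ) / (3.14159265f * distSq) * j.area;
    return std::min(f, 1.0f);             // clamp: the approximation blows up at very close range
}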
This might be the most accurate way, as well as the slowest way, to calculate radiosity, because there is no trick or tweak in the form factor calculation. The only thing that affects the quality of the calculation is the segment size of the lightmap, and that is also strongly related to the calculation time.
This is what vrad of the old Source engine does. Of course, vrad contains many more optimizations; the form factor calculation is the same as above, but it has a much faster algorithm for iterating over the sample points.
I tried to build my baker this way at first, but then I realized it’s way too slow to use. I guess if I had applied a k-d tree or something like that, it could have been better. Anyway, I gave up and started looking for another way…
Hemicube form factor
So the second way I tried is the hemicube form factor, which is well explained by Hugo Elias. The biggest difference from the previous approach is that instead of gathering the reachable sample points, it actually renders the scene from each sample point’s position.
The reason it’s called the hemicube form factor is that the renderer renders the scene from the sample point’s position towards each face of a hemicube. See Figure 10.
Also, since it is a hemicube, the viewports facing along the tangent and binormal directions render only half the size of the viewport facing along the normal.
After rendering the scene, the five faces need to be combined with the form factor. But before doing that, there is an issue to solve first.
Hemicube form factor: Perspective projection
To render the scene, I used a perspective projection matrix with a 90° FoVY value, to fit each of the five faces of the hemicube.
And here is the perspective projection matrix I used. I followed DirectX’s way.
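For reference, a DirectX-style (left-handed, row-vector) perspective projection with vertical field of view \(\mathit{fovY}\), aspect ratio \(a\), and near/far planes \(z_n\), \(z_f\) has this form:
\(\begin{pmatrix} w & 0 & 0 & 0 \\ 0 & h & 0 & 0 \\ 0 & 0 & \frac{z_f}{z_f - z_n} & 1 \\ 0 & 0 & \frac{-z_n z_f}{z_f - z_n} & 0 \end{pmatrix}, \qquad h = \cot\!\left(\frac{\mathit{fovY}}{2}\right), \quad w = \frac{h}{a}\)
With a 90° FoVY and square hemicube faces, \(w = h = 1\).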
But there is a problem. Hugo Elias already dealt with it in his article: if an object is rendered close to the side of the viewport, it looks stretched compared to when it is located at the center, and that causes the object to emit more energy at the sides.
So, to prevent this, Hugo suggested multiplying by the cosine of the angle between the direction the camera is facing and the line from the camera to the texel (see Compensating for the hemicube’s shape).
And Ignacio Castaño, who was a programmer on The Witness, pointed out that this has a little bug, because the term also needs to be divided by the squared distance between the camera location and the center of the texel (see Integration), not just multiplied by the cosine.
But I don’t think Hugo was wrong, because once the scene is rendered with a perspective projection, the area of a rendered object already shrinks as it gets farther from the camera, and I think this gives a pretty similar effect when it comes to the form factor. So I decided to follow only Hugo’s way.
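To make the compensation concrete, this is roughly what a per-pixel weight map for one hemicube face could look like under Hugo's scheme. It is my own sketch: it combines the shape-compensation cosine with the Lambert cosine at the receiving point, and the face resolution and camera basis are assumptions.
#include <algorithm>
#include <cmath>
#include <vector>
struct Vec3 { float x, y, z; };
static float dot(const Vec3& a, const Vec3& b){ return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 normalize(const Vec3& v){
    float len = std::sqrt(dot(v, v));
    return {v.x / len, v.y / len, v.z / len};
}
// Per-pixel compensation map for one hemicube face, in the sample point's local
// frame where the surface normal is +Z. 'faceForward' is the direction that
// face's camera looks along (+Z for the top face, a tangent for the side faces).
// weight = cos(angle to the camera forward)      -> compensates the hemicube's shape
//        * cos(angle to the surface normal, +Z)  -> Lambert's cosine law
std::vector<float> buildFaceWeights(unsigned width, unsigned height, const Vec3& faceForward){
    std::vector<float> weights(width * height);
    // pick two axes orthogonal to faceForward to span the 90-degree frustum
    Vec3 right = (std::fabs(faceForward.z) < 0.9f) ? Vec3{ faceForward.y, -faceForward.x, 0.0f }
                                                   : Vec3{ 1.0f, 0.0f, 0.0f };
    right = normalize(right);
    Vec3 up = { faceForward.y * right.z - faceForward.z * right.y,
                faceForward.z * right.x - faceForward.x * right.z,
                faceForward.x * right.y - faceForward.y * right.x };
    for(unsigned y = 0; y < height; ++y){
        for(unsigned x = 0; x < width; ++x){
            // position of the pixel center in [-1, 1] on the 90-degree frustum
            float u = (x + 0.5f) / width  * 2.0f - 1.0f;
            float v = (y + 0.5f) / height * 2.0f - 1.0f;
            Vec3 dir = normalize({ faceForward.x + u * right.x + v * up.x,
                                   faceForward.y + u * right.y + v * up.y,
                                   faceForward.z + u * right.z + v * up.z });
            float shape   = std::max(0.0f, dot(dir, faceForward)); // hemicube shape compensation
            float lambert = std::max(0.0f, dir.z);                 // cosine to the surface normal
            weights[y * width + x] = shape * lambert;
        }
    }
    return weights;
}
After rendering, each face's pixels can be multiplied by these weights and summed (with the weights normalized so they add up to 1 across all five faces), which is the combination step mentioned above; for the side faces only the half above the surface contributes.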