Implementing SSAO in Iterum

  • Post category: coding

Author: Max Fagerström, September 2025

What is SSAO?

Screen Space Ambient Occlusion (SSAO) is a technique for simulating how much ambient light should reach each point in a scene, resulting in soft shadows within creases, holes, and other areas where surfaces are close together. It was first introduced in games by Crysis in 2007. Although a subtle effect, it gives scenes a greater sense of depth and realism.

How does SSAO work?

SSAO works by calculating a “visibility factor” for each pixel on the screen. This visibility factor represents how much of the ambient light can reach that pixel, depending on how occluded it is by nearby geometry.

To determine this, the algorithm samples multiple points in a hemisphere oriented along the surface normal of the pixel being shaded. This collection of sample points is called a kernel. For each sample, the depth buffer is checked to see whether that point is occluded by surrounding geometry. If many of the samples are occluded, the pixel is in a crevice or near another surface, and less ambient light is applied. If few samples are occluded, the pixel remains fully lit by ambient light.

The use of a hemisphere is a well-known improvement on the SSAO implemented in Crysis, where the kernel was a sphere. That resulted in self-occlusion on flat surfaces, since half of the sphere intersects the surface itself.

How is it implemented in Iterum?

Tutorials for SSAO often deal with implementing it in the context of a deferred renderer. This is because calculating SSAO requires depth and normal information for each pixel on the screen. In deferred rendering we have easy access to these in the form of the depth and normal buffers, before any lighting calculations are done. In a forward renderer like Iterum's, we don't have a complete depth buffer until the entire scene has been rendered, and we don't have a normal buffer at all.

This necessitated adding an extra depth pre-pass to the render pipeline. This pass renders all the geometry we want to calculate SSAO for, using only a depth buffer.

Implementing render passes

The Evergreen game engine uses .pipe files to configure graphics pipelines; they specify all render passes in the order they should be rendered. To add a depth pre-pass, we add the following at the start of the .pipe file:
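As a sketch, a pass entry of this kind might look like the following (the field names and syntax here are assumptions based on the description, not necessarily Evergreen's actual .pipe format):

```
Pass "DepthPrePass" {
    Clear  = true            // clear the target each frame
    Shader = "DepthPrePass"  // fragment stage is empty; depth only
    Input  = Entities(SSAO)  // all entities with SSAO enabled
    Output = "DepthBuffer"   // framebuffer with only a depth attachment
}
```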

This specifies a pass with the name "DepthPrePass". We specify that it should clear each frame and use the "DepthPrePass" shader. We take all the entities that have SSAO enabled as input, and we output to "DepthBuffer", a framebuffer with only a depth buffer attached. The fragment shader for this pass contains no code, since we only want depth information.

The output of this pass looks like this:

We now have the depth information, but we still need the normals to orient the hemisphere correctly.

Using the depth buffer and the inverse projection matrix we can calculate the view space position of each pixel in the depth buffer. Once we have the position, we can use it to approximate the surface normal by looking at how it changes across neighbouring fragments.
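In GLSL, this reconstruction might look like the following sketch (the function and uniform names are illustrative, not the engine's actual ones):

```glsl
// Reconstruct the view-space position of a fragment from its depth.
// uv: screen-space texture coordinates in [0, 1].
vec3 viewPositionFromDepth(vec2 uv, sampler2D depthMap, mat4 invProjection) {
    float depth = texture(depthMap, uv).r;
    // Map UV and depth from [0, 1] back to NDC [-1, 1].
    vec4 ndc = vec4(uv * 2.0 - 1.0, depth * 2.0 - 1.0, 1.0);
    // Un-project, then divide by w to undo the perspective projection.
    vec4 viewPos = invProjection * ndc;
    return viewPos.xyz / viewPos.w;
}
```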

We do this using the GLSL functions dFdx and dFdy, which compute partial derivatives of a value in screen space. Applying these to the view position gives us two tangent vectors along the surface, and their cross product produces the surface normal.
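A minimal sketch of this derivative trick:

```glsl
// Approximate the view-space surface normal from neighbouring fragments.
// dFdx/dFdy give the change of viewPos across one pixel horizontally and
// vertically, i.e. two tangent vectors lying in the surface.
vec3 normalFromPosition(vec3 viewPos) {
    vec3 tangentX = dFdx(viewPos);
    vec3 tangentY = dFdy(viewPos);
    return normalize(cross(tangentX, tangentY));
}
```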

If we output this normal, it looks like this:

We now have both the depth and normal information needed to calculate SSAO, so we can implement the main SSAO render pass. We specify it in the pipeline file right after our depth pre-pass:
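Again as a hypothetical sketch of the .pipe entry, with field names inferred from the description that follows rather than taken from the actual file:

```
Pass "AOPass" {
    AOPass = true           // enables pass-specific uniforms, like the kernel
    Shader = "SSAO"
    Input  = "DepthBuffer"  // bound as a texture; triggers a full-screen quad
    Output = "AORaw"
}
```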

We set "AOPass" to true, so we get uniforms specific to this pass, like the kernel. We input the DepthBuffer, which was the output of our depth pre-pass, and we output to the "AORaw" framebuffer. Setting the input to DepthBuffer tells the renderer to bind the DepthBuffer as a texture and render a full-screen quad.

Setting up the rendering of this full screen quad required some extra work, since the usual function to do this in the engine nullifies the camera matrices to render this quad. We need the projection matrix to be able to figure out the view-space position from the depth value.

Instead we use a modified version of this function that just sends the quad vertices without touching the camera matrices. Then, in the vertex shader, we don't multiply the position by the camera matrices, effectively nullifying them from that end instead. We now have access to the camera matrices while still rendering a full-screen quad correctly.
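The vertex stage then reduces to a pass-through; a sketch of what it might look like (attribute and varying names are illustrative):

```glsl
// Vertex shader for the full-screen quad. The quad vertices are already in
// clip space, so we forward them untouched instead of multiplying by the
// (still available) camera matrices.
layout(location = 0) in vec2 inPosition;
out vec2 uv;

void main() {
    uv = inPosition * 0.5 + 0.5;            // map [-1, 1] to [0, 1] for sampling
    gl_Position = vec4(inPosition, 0.0, 1.0);
}
```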

Generating the kernel of sample points is done on the CPU; it is then passed to the shader as a uniform. We add a bias so that points are denser towards the center.
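A C++ sketch of such kernel generation (the struct and function names are illustrative, and the exact bias curve is an assumption; a common choice is to lerp the sample scale quadratically):

```cpp
#include <cmath>
#include <cstdlib>
#include <vector>

struct Vec3 { float x, y, z; };

// Generate `count` sample points inside a unit hemisphere oriented along +Z.
// A quadratic bias on the scale pushes samples towards the centre, weighting
// occlusion from nearby geometry more heavily.
std::vector<Vec3> generateKernel(int count) {
    std::vector<Vec3> kernel;
    kernel.reserve(count);
    for (int i = 0; i < count; ++i) {
        auto rnd = []() { return std::rand() / static_cast<float>(RAND_MAX); };
        Vec3 s;
        float len;
        do {
            // Random direction: x, y in [-1, 1], z in [0, 1] (hemisphere).
            s = Vec3{rnd() * 2.0f - 1.0f, rnd() * 2.0f - 1.0f, rnd()};
            len = std::sqrt(s.x * s.x + s.y * s.y + s.z * s.z);
        } while (len < 1e-6f);  // reject degenerate zero-length directions

        float t = static_cast<float>(i) / static_cast<float>(count);
        float scale = 0.1f + 0.9f * t * t;   // bias: denser towards the centre
        float r = rnd() * scale / len;       // normalize and rescale
        kernel.push_back(Vec3{s.x * r, s.y * r, s.z * r});
    }
    return kernel;
}
```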

We currently produce a kernel with 20 samples. The output of the AOPass right now:

While this does show some shadowing in the spots we want, it doesn't look great. Because our sample kernel contains only 20 samples, the AO is calculated at very low precision, and we get an artifact called banding. If we increase the sample count to 512, the banding is gone and we get a good result:

With this output we can see what the final effect will contribute: soft shadows in corners. From this SSAO map alone we can get a sense of the depth of the scene. Increasing the number of samples does come at quite a high performance cost, however, since we are now doing many calculations and texture lookups for every pixel on the screen. So instead of increasing the number of samples, we can rotate the kernel positions by a random vector for every fragment.

We do this using a small 4×4 noise texture that we sample in the main SSAO pass. If we do this, at our original kernel size of 20, we get this result:

The shader for the main SSAO pass looks like this:
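A hypothetical GLSL reconstruction, assembled from the step-by-step description below ("kernel", "sampleRadius", and "depthMap" follow the text; the remaining uniform names, the bias constant, and the weight function are assumptions):

```glsl
in vec2 uv;
out float occlusionOut;

uniform sampler2D depthMap;
uniform sampler2D noiseTex;   // 4x4 texture of random rotation vectors
uniform vec3 kernel[20];
uniform float sampleRadius;
uniform mat4 projection;
uniform mat4 invProjection;
uniform vec2 screenSize;

vec3 viewPosAt(vec2 coord) {
    float depth = texture(depthMap, coord).r;
    vec4 ndc = vec4(coord * 2.0 - 1.0, depth * 2.0 - 1.0, 1.0);
    vec4 p = invProjection * ndc;
    return p.xyz / p.w;
}

void main() {
    // 1. Reconstruct view-space position and normal.
    vec3 viewPos = viewPosAt(uv);
    vec3 normal = normalize(cross(dFdx(viewPos), dFdy(viewPos)));

    // 2. Build a local tangent basis from a tiled noise vector.
    vec3 noise = texture(noiseTex, uv * screenSize / 4.0).xyz;
    vec3 tangent = normalize(noise - normal * dot(noise, normal));
    vec3 bitangent = cross(normal, tangent);
    mat3 TBN = mat3(tangent, bitangent, normal);

    // 3. Accumulate occlusion over all kernel samples.
    float occlusion = 0.0;
    for (int i = 0; i < 20; ++i) {
        // Rotate to view space, scale by radius, translate to the fragment.
        vec3 samplePos = viewPos + TBN * kernel[i] * sampleRadius;

        // Project back to clip space and convert to UV coordinates.
        vec4 offset = projection * vec4(samplePos, 1.0);
        vec2 sampleUV = (offset.xy / offset.w) * 0.5 + 0.5;

        float sampleDepth = viewPosAt(sampleUV).z;

        // Distance-based weight, so distant geometry doesn't occlude.
        float weight = smoothstep(0.0, 1.0,
                                  sampleRadius / abs(viewPos.z - sampleDepth));

        // Is the scene surface in front of our sample point?
        occlusion += (sampleDepth >= samplePos.z + 0.025 ? 1.0 : 0.0) * weight;
    }

    // 4. Normalize: 1.0 = fully visible, 0.0 = fully occluded.
    occlusionOut = 1.0 - occlusion / 20.0;
}
```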

Step by step it functions like this:

  • Reconstruct the view-space position.
  • Reconstruct the view-space normal, using the partial derivative method mentioned earlier.
  • Create a local tangent basis (TBN matrix): sample a noise vector, orthogonalize it to the normal, compute the bitangent, and assemble the TBN matrix.
  • Loop over all kernel samples: rotate each kernel sample to view space using the TBN matrix, scale and translate it by “sampleRadius” and “viewPos”, project the sample back to clip space, convert to UV coordinates to read depth from “depthMap”, compute a distance-based weight, and finally check whether the projected sample is closer than this fragment. If it is, add the calculated weight to the accumulated occlusion factor.

After the loop we normalize the accumulated occlusion factor and output it to the single channel output.

Introducing the noise texture has taken care of banding at a fraction of the performance cost of increasing the kernel sample count, but if we zoom in, we can see that the output is very noisy:

To remedy this, we do an additional render pass where we just blur the output of the main SSAO pass, and output it to a framebuffer. We specify it in the pipeline file right after the main SSAO pass:
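A hypothetical .pipe sketch for this pass, following the same assumed syntax as before:

```
Pass "AOBlur" {
    Shader = "SSAOBlur"
    Input  = "AORaw"   // output of the main SSAO pass, bound as a texture
    Output = "AOBlur"  // identical single-channel framebuffer
}
```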

The shader is a simple 4×4 box blur that, for every pixel on the screen, takes the average of the surrounding 16 pixels. The input tells the renderer to bind the framebuffer we wrote to in the main SSAO pass and render a full-screen quad. The output "AOBlur" is a single-channel framebuffer identical to "AORaw", but once it has been written to, it is available as a texture for other shaders to take in as a uniform.
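Such a box blur can be sketched in GLSL like this (the uniform name is illustrative):

```glsl
// 4x4 box blur over the raw AO map: average the 16 texels around each
// fragment to smooth out the noise introduced by the random rotations.
in vec2 uv;
out float blurred;
uniform sampler2D aoMap;

void main() {
    vec2 texelSize = 1.0 / vec2(textureSize(aoMap, 0));
    float result = 0.0;
    for (int x = -2; x < 2; ++x) {
        for (int y = -2; y < 2; ++y) {
            result += texture(aoMap, uv + vec2(x, y) * texelSize).r;
        }
    }
    blurred = result / 16.0;
}
```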

If we zoom in on the output of this pass, it now looks like this:

Using the SSAO map

Now, in our standard Phong shader, we want to find out the visibility factor of the current fragment we are working with.

We first get the current fragment's 2D position in Normalized Device Coordinates, adjust its range to correspond to the UVs of the screen-space AO map, and sample the texture at those coordinates. This gives us the visibility factor for this fragment, which we can use as we calculate the ambient and diffuse light:
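A sketch of how this lookup and its use might look in the Phong fragment shader (assuming the clip-space position is passed down as clipPos; the aoMap name and the 0.5 mix factor are illustrative):

```glsl
// Sample the blurred AO map at this fragment's screen position.
vec2 screenUV = (clipPos.xy / clipPos.w) * 0.5 + 0.5;  // NDC -> [0, 1] UVs
float visibility = texture(aoMap, screenUV).r;

// Ambient is scaled by visibility directly; diffuse only partially, via a
// mix, so the occlusion doesn't flatten directly lit areas.
vec3 ambient = ambientColor * visibility;
vec3 diffuse = diffuseColor * mix(1.0, visibility, 0.5);
```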

We multiply the ambient term by our visibility factor directly. While not physically correct, we also let the visibility factor influence the diffuse lighting. This is done in some implementations of SSAO to prevent the scene from looking too flat when strong directional light is present.

We don't multiply it in directly like the ambient component; we limit it with a mix so it doesn't have too strong an influence.

Results

In the GIF below we see SSAO turned on and off. With SSAO enabled we clearly see more shading between the trees and the ground.