Thursday, 6 September 2012

2.5D XNA RPG Engine - Some Technical Details

I keep getting questions regarding some of the more technical aspects of my 2.5D shading implementation in my earlier YouTube videos like this one, so I want to dedicate an entry here to explaining a little bit what's going on behind the scenes there and fielding questions.

First things first, the original codebase at the moment is very outdated (still on XNA 3.1) and very tangled. It was one of my first projects, it's very badly organised and, at the same time, I do have plans for it in the future, so I won't be releasing the whole source code for the time being.
That said, let's take a look at the basics.

Overview

The idea was very simple - to do away with drawing polygons for the most part and, instead, deal with geometry in screenspace, as an extension of the sprites you'd normally use in a 2D engine.

Cons:

Right off the bat, I'll come out and say that this isn't necessarily as great a performance-saving idea as it seemed to me back then. You're still doing a lot of work across the whole screen with pixel shaders, which is where the majority of the computation time comes in anyway. Moreover, you're essentially loading every sprite three times over, so you have a pretty high memory usage too (and unlike in 3D rendering, where you can easily have much smaller resolution normal textures with high-resolution diffuse maps, you can't really do that here without sacrificing quality dramatically).

I'm pretty sure something similar could implemented in say, Unity3D, with the right scripting and customised shaders, but I don't think this approach will ever be much of a performance saver - certainly not as much of one as I thought back at the time, when my understanding of computational expenses was still very limited. And while on more modern handheld devices you could probably get something like it to run, it'll probably need to be a fairly pared-down version due to restraints on how much per-pixel computation you can do and due to memory restraints.

Bear in mind you could easily get a very similar effect with a fully 3D engine, rendered out with an isometric perspective. If you don't make the geometry too complicated, your memory usage would likely be significantly lower as you'd be able to share textures between objects, while the per-pixel computation overhead would be the same as with the 2.5D engine, if not lower.

By going 2.5D you give up quite a few things that would come naturally to a properly 3-dimensional engine - things like having a proper zoom function.

Pros:

With that in mind, here's what I see as some of the advantages. The main reason to go down this route, I think, mostly comes down to content creation and stylistic considerations since, as I've pointed out above, there's no real reason to do so for performance or graphics, with one exception.

I think, graphically, the only real advantages here are the potential capacity for dealing with really huge polycounts by baking them into your sprites and getting a limited sort of multisampling for free, as your assets are likely going to come out of 3DS Max nicely antialised (I say 'limited' because geometrical intersections are not going to antialias themselves and, indeed, can end up looking quite rough). Even so, they're a bit dubious - you don't really need huge polycounts and even if you do:

They're not all that expensive, relatively speaking.
It's actually pretty easy to combine pre-rendered elements with an otherwise fully 3D isometric engine.

But here's a few boons contentwise. The main, really, I think is that you can do a lot of the texturing work much more easily on flat sprites. While you need a 3D mesh initially to generate a rudimentary colour map, normal map and heightmap, you can then do a lot of extra detailing in an image editor like Photoshop. You can bake in very high-quality ambient lighting from your 3D program without any overhead and you can also potentially save time on a lot of UV mapping and texture creation.

You also get a little bit more freedom with how you go about creating assets in the first place - while you can start off making a simple mesh for calculating normal and height data, you could use a flat photo, or a hand-drawn sprite for the colour, essentially giving it extra depth.

Assets

Quite briefly, this is how I generated the assets I used in my videos and here's a sample of one. I'm going with the Utah teapot on this:

To save on memory a bit, only the colour map actually has an alpha - the rest borrow the colour map's alpha channel when they're being drawn. Also, as you can see, the colour map was rendered with some basic 3DS Max ambient occlusion.

All three I rendered in 3DS Max in my case, though it might be easier to create an XNA-based tool for generating the maps more reliably via custom shaders. Or, perhaps, you could just Maxscript it.

Here's what the normal map material looks like:

You're just adding together a red, green and blue object-space falloff map, it's as simple as that.

As far as the height map goes, you've got a few ways to go with that one. In my case, I set up a vertical gradient with a bit of extra dithering and worked out what the top value should be for every given object. The heightmap, like the normal map, is of course in worldspace.

The issue with the heightmap though is that you end up being restricted by the range of colours in it quite badly. The purpose of the heightmap is to, literally, state the worldspace height of every pixel - and that means you need a range of values as large as your object is tall, in pixels. So if you stick with a grayscale 256-colour bitmap, you run out of values pretty damned quickly - if your image is more than 256 pixels high, it's going to run out of height.

My solution, a fairly hacky one, was to displace the gradients in the RGB channels separately by a unit and, when calculating the height of a pixel, just add them for the result. That gave me a maximum height of 768 pixels (once you get really high, you can actually let the gradient clamp, as long as you don't have point-lights that high up to mess things up).

Alternatively, if you make your own content preprocessor, you could do more elaborate things, such as multiplying the RGB values to get the output. That's just something that's harder to do with Max materials on the fly (and you'd need to think about how you render it if you want to change the object's height).

Code

Last but not least, I'll go over what actually happens in the render pipeline in some broad strokes.

The way my original code works is that you set up three separate RenderTargets that you draw the entire scene onto - once with the diffuse sprites, once with the normal maps and once with the heightmaps. The normal maps and heightmaps, as mentioned earlier, all borrow their alphas from the diffuse map.

A neat trick you get to do at this stage is that you can use the heightmap for depth testing. Now, the heightmap is in worldspace, so it doesn't exactly tell you how 'deep' into a scene a given pixel is, but it's good enough - if we have two objects occupying the same bit of space, one of them's grey, the other's white, that means the white object is higher up in worldspace than the darker object, which in turn means it has to be in front of it. So that's the basis of our depth query.

This is the shader that does it - as an extra parameter, it scales the heightmap by the scale of the object, allowing you to adjust for, well, scaling:

sampler AlphaSampler : register(s0);
sampler ColorSampler : register(s1);

float Scale = 1;

struct PS_OUTPUT
{
float4 color : COLOR0;
float depth : DEPTH;
};

PS_OUTPUT DepthMapPS(in float2 texCoord : TEXCOORD0)
{ 
 PS_OUTPUT Out = (PS_OUTPUT)0;
 
 float4 alpha = tex2D(AlphaSampler, texCoord);
 float4 tex = tex2D(ColorSampler, texCoord);
 
 Out.depth = ((tex.r + tex.g + tex.b) / 3) * Scale;
 Out.color = float4(Out.depth, Out.depth, Out.depth, alpha.a);
 return (Out);
}

technique DepthMapping
{
    pass Pass0
    {
        PixelShader = compile ps_2_0 DepthMapPS();
    }
}

Next up, let's look at the lighting code. The game had two lighting shaders - one for directional light (i.e. sunlight) and another for point lights. Since the directional light shader is just a subset of the point light shader and overall quite simple, I'll just look at the point light shader.

Now, this implementation is, again, far from being anywhere near optimal. For starters, each light involves a whole new pass being rendered across the entire screen, with no compensation for falloff (it also uses a really goofy light falloff curve I came up with, which never really peters out). As you can see below, it uses three texture samplers to feed in the screen as it was rendered with colour, normal and heightmap sprites, then uses all three to figure out the position of any given pixel. That's a lot of crap going on on a per-pixel basis.

sampler ColorSampler : register(s0);
sampler NormalSampler : register(s1);
sampler DepthSampler : register(s2);

float3 LightPosition;
float LightIntensity;
float LightRange;
float4 LightColor = float4(2, 0, 0, 1);

float ScreenWidth;
float ScreenHeight;

float4 NormalMappingPointPS(float4 color : COLOR0, 
                                 float2 texCoord : TEXCOORD0) : COLOR0
{
 float4 tex = tex2D(ColorSampler, texCoord);
 float3 normal =(2.0 * (tex2D(NormalSampler, texCoord))) - 1.0;
 float3 depth = tex2D(DepthSampler, texCoord);
 
 float Z = ((depth.r + depth.g + depth.b) / 3) * 1024;
 
 float3 pixelPosition = float3(ScreenWidth * texCoord.x, 
                                  (ScreenHeight * texCoord.y) + (Z * 0.7547), 
                                  Z);
                                  
    float3 lightDir = (LightPosition - pixelPosition) * float3(0.75, 1.0, 1.0);
    
    float lightDistance = length(lightDir);    
    float distModifier = LightIntensity / (max(lightDistance * LightRange, 1.0 / LightRange));
 float lightAmount = max(dot(normal, normalize(lightDir)), 0.0) * distModifier;
  
 float4 output = tex * lightAmount * LightColor;
 
 return output;
}

technique Deferred2DNormalMapping
{
    pass Pass0
    {
        PixelShader = compile ps_2_0 NormalMappingPointPS();
    }
}

I think overall it's pretty self explanatory otherwise. Between these two functions, that's the main chunk of the work being done, really.

Final Notes

To cap off, a few ways to improve on what I've got so far. There's definitely much that can be done to improve the efficiency of my shaders from, what was it, two? No, three years ago, goodness. It's been a while. But yeah, they're not great and drawing point lights in particular is a massive drain at the moment. If you wanted to do this for a handheld, your best bet would be to limit the number of lights on screen and do them with a single shader pass for all of them.

Secondly - and this is an important one - don't render more than you have to. I mean, the vast majority of the time, all you're going to be doing is moving the camera around while everything else remains static. So the best course of action is to do all the expensive pixel computations once and just keep a large chunk of the playable area in memory, if you've got the memory for it.

If a given area is just made up of lots of static sprites, you could even render your diffuse/normal/heightmap layers for the whole area and then dispose of all those other sprites to free up space, since you don't need them. You can then render dynamic objects on top of it all using depth testing.

Needless to say, when a dynamic object updates, you can draw it separately with lighting computed just for that object. So really, you only need to do the expensive light computation when the lights are actually moving around, and you'll probably want to keep that to a minimum (if you're concerned about performance or battery drain, anyway).

As it stands, the engine isn't doing any of that, so as I say, overall it's in actually a pretty basic state.

41 comments:

Unknown7 September 2012 at 08:55
I think your ideas for depth/height testing are brilliant. The thing I'm most curious about is how you would handle things like 3D world-space position, collision, etc. Having the depth calculated during rendering without an actual Z position in 3d space seems to make it awkward to say the least.

And thanks for this by the way...

-sublm66 from youtube
ReplyDelete
Replies
9of97 September 2012 at 09:04
Can you elaborate a bit?

As the game's isometric, the way the engine was set up was that collision between, say, the player and other objects is just calculated with polygons drawn out on a flat plane.

If you mean collision in the sense of just having objects intersect, that's just done through the depth testing shader. The one I've got quoted in the code block, by the way, seems to just draw the depth map out into the RGB output, but you can replace Out.color with something like Out.color = alpha and have it output the colour map (which you'd pass into the 'alpha' sampler) which is Z-buffer tested according to the heightmap.
ReplyDelete
Replies
Unknown7 September 2012 at 11:06
Ok, so by "polygon drawn out on a flat plane" do you basically mean like a quad? So as objects pass each other you can see if they collide (horizontally), but you have to make sure they are at the same depth, right? Otherwise, they're not colliding. But, if I'm understanding correctly, you don't calculate the depth of (each pixel of) an object until the rendering stage.

So, I guess what I'm asking is how would you tell the objects that they've collided? I'm sure there's something I'm just not understanding. Do you do your rendering and then pass on that information to the objects? I'm imagining an example of say, a player walking behind a tall skinny building. Their polys would be colliding, but in reality (world-space) they shouldn't be.

Also, the weather system looks pretty incredible. I really like the snow laying on the ground. Is that some sort of particle system you created?

Thanks for the answers by the way...I'm just really interested and I want to make sure I understand what's happening :)
ReplyDelete
Replies
Evgeniy7 September 2012 at 11:39
Hi 9of9,

It's amazing.
Thanks to this post I understood where was my mistake.

I am very grateful to you.

Best regards,
Evgeniy
ReplyDelete
Replies
Evgeniy7 September 2012 at 17:00
I added the bloom, glow and blur effects to this lightning system and its looks amazing and sweet :)

Thank you for idea about fog shader I forgot about it.
ReplyDelete
Replies
Unknown8 September 2012 at 10:01
Being not much of a 3d artist at all myself ><, I keep ending up with a baby blue material when trying to duplicate your normal mapping method. By any chance, can you elaborate on how you ended up with your normal mapping material the way that you did? Also, did you do any modifying to your color map texture that you got after you rendered it out in 3ds max? The reason I'm asking is because I cannot seem to get my shadows as dark as yours with ambient occlusion - http://tinypic.com/r/1z70pw8/6

Cheers!
ReplyDelete
Replies
Anonymous8 September 2012 at 23:05
You also take another performance hit by outputting depth from the pixel shader, since this disables early-Z testing. That means you end up running the pixel shader for every pixel, even if it's hidden behind another object that you've already drawn!

In addition, dynamic shadows don't work with this method.

On the other hand, your "models" ended up being extremely detailed and beautiful, so it turns out looking really nice.
ReplyDelete
Replies
Evgeniy11 September 2012 at 10:03
Please advise how best to make the treatment more than one source of light?

Provide an array sources to the shader or cause shader for each source?

Best regards,
Evgeniy
ReplyDelete
Replies
Evgeniy14 September 2012 at 12:22
Thank you for advice, I'll test different techniques.

This is a link where is an example how to calculate multiple lights in the pixel shader 2_0 , in this article author suggest create more of 3 lights with help of several passes of the shader.

Here is a link: http://habrahabr.ru/post/134819/
( sorry, this article in russian but you can see the code )
ReplyDelete
Replies
WhtsTheDeal22 September 2012 at 13:42
This comment has been removed by the author.
ReplyDelete
Replies
Jim25 September 2012 at 20:08
Thanks for posting this. I've implemented your technique in a small sample application. I have a question about render passes.

My first instinct was to completely render the height map first, and then use that for depth testing when I created the normal and diffuse layers.

However, since additions to the height map require testing against the existing height map, this led to switching render targets for each sprite that needed to go into the height map. (A texture cannot be both the input and output of a pixel shader in XNA 4.0)

Is there a better way to generate the composite height map? My understanding is that switching render targets is fairly expensive.
ReplyDelete
Replies
Stephen2 October 2012 at 03:29
Thanks for your reply to my message, it was helpful.

I'm also having some doubts about the method, and I wonder whether it would be better to go with a 3D engine, although in my case that involve throwing away a lot of code which I always hate. Originally I was making a simple 2D RPG with a similar look and feel to the infinity engine games. Then I started looking for some kind of dynamic lighting solution, this was after seeing this done in the Eschalon series to very atmospheric effect. (of course those games are tile based so they have an easier time with this.)

The file size issue is one thing I think might be significant for a serious project, there's another indie RPG called Underrail which released an alpha demo recently that came to 500 MB almost all of which was the animated sprites, and this isn't using a method like this.

I had a question about the depth map, wouldn't it be better to use 16 bit greyscale image inside of adding the channels as you do? 16 bits is probably overkill, but I think it would be nice in principle to be able to support large vertical structures like say the D'arnise keep in BG2:

http://www.gamebanshee.com/baldursgateii/walkthrough/images/ar1300.jpg

Most of the areas on the walls of the keep are actually walkable by the player, which is kind of tricky to do. (I imagine in the IE games it wasn't an issue since you didn't have to worry about dynamic lights in the courtyard area.)

Another I'm not sure about is exactly what settings to use when creating the colour textures for your tiles in the renderer. I'm not really an art guy but I suppose there's a fair bit of freedom here. My first thought was just to set self-illumination to 100, in which case the lighting contribution would be entirely derived from in game dynamic lights. I was looking at some sprites from Icewind Dale:

http://www.planetbaldursgate.com/iwd/encounters/monsters/animations/index.shtml

and I see a lot of self-shadowing. They generally look great, but they might be made more for a static lighting set-up, or maybe there's a good way to combine the two things.

ReplyDelete
Replies
Unknown2 November 2012 at 18:38
This comment has been removed by the author.
ReplyDelete
Replies
Unknown2 November 2012 at 18:41
This comment has been removed by the author.
ReplyDelete
Replies
Anonymous24 November 2012 at 08:19
thanks for the normal map materia, was exactly what I needed!:)
ReplyDelete
Replies
Narcis24 May 2013 at 00:42
Thx!!!!!!
I'm amazed what you can do with xna +c#+hlsl.....!
Keep up the good work!
ReplyDelete
Replies
Narcis24 May 2013 at 01:26
how do i apply this shader on 2d sprite?
ReplyDelete
Replies
Apox25 July 2013 at 09:17
Can you Share the Engine?
ReplyDelete
Replies
Kristijonas21 March 2015 at 12:13
Hello,
First I want to express my level of sympathies to you. As I can see in comments your work inspires a lot of game developers or just simple gamers. Thanks man for your devotion to your work. Secondly I wish to introduce myself since it is related to the topic (:. I consider myself devoted to my work too, for example spent a lot of time on university degrees, learning from internet and doing projects. Thing is that I am an artist designer, who knows about art but nothing about programming languages. I always dreamed about creating my game with my own world but it was always to much to understand how to do it. I can see that I could probably deal with engine which you designed. If you would share the engine with me I could credit you as a co creator. Or maybe you have better ideas for me. Maybe you want to work on something together... Sincerely, Kris
ReplyDelete
Replies
Natures Realm13 March 2019 at 05:23
The lighting and weather effects is the best I've seen along with the asset layering works spot on. The only thing that is missing is shadows.
Dude, seriously, I'd be willing to pay a decent amount to use this engine but you do you of course. I wander how hard it is to recreate in unity
ReplyDelete
Replies
Jamie17 June 2019 at 22:05
I wonder if Obsidian Entertainment would be interested in this.

They use a fusion of 2d and 3d elements in pillars of eternity, in unity. They probably want to make Pillars of Eternity 3 after their current project, they now have funding from Microsoft (they were acquired), and no doubt they want to make PoE 3 as pretty as humanly possible.

There might be something here for them. If you are not going to use this, you should try and reach out.
ReplyDelete
Replies

InFictitious