[Repost] Deferred Shading

Date: 2019-12-14

Tags: repost, deferred shading


Source: http://www.cnblogs.com/rickerliang/archive/2011/05/07/2040062.html

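Deferred shading is currently a popular real-time rendering technique. It decouples geometry from lighting, reducing the cost of forward shading, which scales as Geometry Pass * Lighting Pass, to Geometry Pass + Lighting Pass. This makes it especially well suited to scenes with many dynamic lights. This article walks quickly through the stages of a deferred shading implementation and provides a simple sample program with source code. The sample runs on SM 2.0 or later hardware and is implemented with the DX9 API.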

For an introduction to deferred shading, see "Real-Time Rendering" (3rd edition), Section 7.9.2, as well as "GPU Gems 2" and "GPU Gems 3". The "Deferred Shading Tutorial" also describes an OpenGL implementation in detail. Sample code available online includes the nVidia SDK 9.52 and Intel's "Deferred Rendering for Current and Future Rendering Pipelines", at http://software.intel.com/en-us/articles/deferred-rendering-for-current-and-future-rendering-pipelines/.

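Deferred shading can be divided into four stages: Geometry, Lighting, Post-Processing, and MergeOutput, of which the third is optional. Each stage renders to a texture, so deferred shading relies on Render To Texture (RTT) and Multiple Render Targets (MRT). The stages and their outputs are listed in the table below: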

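Stage	Output	Purpose
Geometry	G-Buffer	Records the geometric information of the whole scene, such as normal, depth (position), diffuse color, and specular intensity
Lighting	P-Buffer1	Computes per-pixel lighting from the G-Buffer data
Post-Processing	P-Buffer2	Post-processing, such as motion blur, bloom, and anti-aliasing
MergeOutput	BackBuffer	Blends the data from all previous buffers and outputs the result to the BackBuffer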

Geometry Stage G-Buffer

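This stage renders (records) the geometric information of every 3D model in the scene into the G-Buffer. The G-Buffer has the same resolution as the screen so that later stages can work per pixel. The G-Buffer can consist of several textures; typically MRT is used to render all of these attributes in a single batch. In this stage the scene's geometric information is interpolated in the same way as texture coordinates and projected onto the G-Buffer, so the transformation matrices between the various spaces must be set up correctly. The sample program outputs normal, depth, diffuse color, and specular intensity to the G-Buffer. Because the sample computes lighting in view space, the normals written here are transformed to view space. The depth written here is already in normalized device space; during lighting, the depth together with the projection matrix can be used to recover the view-space coordinates. The G-Buffer output is shown in the figure below: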

From top to bottom: normal (view space), depth, diffuse color, specular intensity.
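To make the contents of a G-Buffer texel concrete, here is a minimal CPU-side sketch in C++ of what the Geometry stage records per pixel. It is not the sample program's shader code: the type and function names (`GBufferTexel`, `writeTexel`, `ndcDepthFromViewZ`) are hypothetical, and the depth formula assumes a D3D-style left-handed perspective projection with near plane n and far plane f.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Hypothetical CPU-side mirror of one G-Buffer texel (illustrative names).
struct GBufferTexel {
    Vec3  normalView;  // unit normal in view space
    float depthNdc;    // z after projection and divide by w
    Vec3  diffuse;     // diffuse color
    float specular;    // specular intensity
};

// Depth in normalized device space for a view-space z.
// Assumes a D3D-style left-handed perspective projection, where the
// clip-space w equals the view-space z.
float ndcDepthFromViewZ(float zView, float n, float f) {
    assert(zView > 0.0f && n > 0.0f && f > n);
    float zClip = zView * f / (f - n) - n * f / (f - n);
    return zClip / zView;
}

Vec3 normalize(Vec3 v) {
    float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    assert(len > 0.0f);
    return { v.x / len, v.y / len, v.z / len };
}

// Packs the attributes the article lists into one texel.
GBufferTexel writeTexel(Vec3 posView, Vec3 normalView, Vec3 diffuse,
                        float specular, float n, float f) {
    return { normalize(normalView),
             ndcDepthFromViewZ(posView.z, n, f),
             diffuse,
             specular };
}
```

With n = 1 and f = 100, a point on the near plane gets depth 0 and a point on the far plane gets depth 1, which is the range the Lighting stage later inverts.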

Lighting Stage P-Buffer1

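This stage combines the lighting model and the light positions with the geometric information in the G-Buffer to compute the color of every pixel of the G-Buffer. If there are multiple lights, the stage is executed once per light and the results are accumulated into the P-Buffer. Again, the sample program computes lighting in view space, so the depth stored in the G-Buffer has to be converted back to a view-space position. To understand how the view-space position is recovered, a few concepts are needed first: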

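The depth in the G-Buffer is in normalized device space. Going from view space to normalized device space takes two steps: view --> homogeneous --> normalized device. The view --> homogeneous step is done by the projection matrix; the homogeneous --> normalized device step divides every component of the 4D vector by w, where w is the view-space z. The projection matrix, in the standard D3D left-handed perspective form, is as follows (note that D3D uses row-major matrices and pre-multiplication, i.e. a row vector multiplied by the matrix):

$$
P = \begin{pmatrix}
\frac{1}{R\tan(a/2)} & 0 & 0 & 0 \\
0 & \frac{1}{\tan(a/2)} & 0 & 0 \\
0 & 0 & \frac{f}{f-n} & 1 \\
0 & 0 & -\frac{nf}{f-n} & 0
\end{pmatrix}
$$

Here R is the aspect ratio, a is the vertical field of view (fovy), f is the far plane, and n is the near plane. The z in homogeneous space is therefore

$$
z_h = z\,\frac{f}{f-n} - \frac{nf}{f-n},
$$

and dividing it by the view-space z gives the depth in normalized device space:

$$
depth = \frac{z_h}{z} = \frac{f}{f-n} - \frac{nf}{(f-n)\,z}.
$$

Every z in these expressions is the view-space z. f and n are specified when the projection matrix is defined, and the value of the expression (the stored depth) is known, so the expression can be solved for the view-space z:

$$
z = \frac{nf}{f - depth\,(f-n)}.
$$

The xy in normalized device space are also known: they are u*2-1 and -(v*2-1) of the texture coordinate. This is because the G-Buffer is rendered point-to-point onto the P-Buffer, and the texture coordinates in [0,1] must be mapped to the [-1,1] xy range of normalized device space. For a detailed treatment of the spaces and the derivation of the transformation matrices, see "Real-Time Rendering" (3rd edition) and "Introduction to 3D Game Programming with DirectX 9.0c: A Shader Approach".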
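The two conversions described above can be sketched in a few lines of C++. This is a CPU-side illustration rather than the sample program's shader: the function names are hypothetical, and the depth inversion assumes the standard D3D left-handed perspective projection, for which depth = f/(f-n) - n*f/((f-n)*z).

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { float x, y; };

// Recovers the view-space z from a normalized device space depth.
// Assumes depth = f/(f-n) - n*f/((f-n)*z), solved here for z.
float viewZFromNdcDepth(float depth, float n, float f) {
    assert(n > 0.0f && f > n);
    return n * f / (f - depth * (f - n));
}

// Maps a texture coordinate in [0,1] to normalized device space xy in
// [-1,1]. v is flipped because texture v grows downward while NDC y
// grows upward.
Vec2 ndcXYFromUV(float u, float v) {
    return { u * 2.0f - 1.0f, -(v * 2.0f - 1.0f) };
}
```

For example, with n = 1 and f = 100, a stored depth of 0 maps back to z = 1 (the near plane) and a depth of 1 maps back to z = 100 (the far plane); the texture coordinate (0, 0), the top-left texel, maps to the NDC point (-1, 1).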

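Once we have the xyz in normalized device space and the z in view space, there are two ways to get back to view space. The first multiplies the normalized device space xyzw (with w=1) by the view-space z to return to homogeneous clip space, then applies the inverse of the projection matrix to return to view space (the projection matrix does not actually project points onto a plane, it only transforms them into homogeneous space, so it is invertible). The second uses elements (0,0) and (1,1) of the projection matrix, which in the standard D3D perspective matrix are 1/(R tan(a/2)) and 1/tan(a/2), to compute the view-space xy:

$$
x_{view} = x_{ndc}\, z\, R \tan(a/2), \qquad y_{view} = y_{ndc}\, z \tan(a/2),
$$

where R is the aspect ratio, a is fovy, and z is the view-space z. The sample program uses the second method.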
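As an illustration of the second method only, the following C++ sketch rebuilds a view-space position from the normalized device space xy and the view-space z. The function name and parameters are hypothetical, and the (0,0) and (1,1) elements are assumed to be those of a standard D3D perspective matrix, 1/(R*tan(a/2)) and 1/tan(a/2).

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Second method: undo the x/y scaling of the projection matrix.
// Projection gives x_ndc = x_view * P00 / z_view (and likewise for y with
// P11), so x_view = x_ndc * z_view / P00.
Vec3 viewPosFromNdc(float xNdc, float yNdc, float zView,
                    float aspect, float fovY) {
    assert(aspect > 0.0f && fovY > 0.0f);
    float p11 = 1.0f / std::tan(fovY * 0.5f);  // element (1,1)
    float p00 = p11 / aspect;                  // element (0,0)
    return { xNdc * zView / p00, yNdc * zView / p11, zView };
}
```

With a 90 degree vertical field of view and an aspect ratio of 2, the NDC point (0.5, -0.5) at view-space z = 10 comes back as roughly (10, -5, 10). This route needs only two scalar matrix elements per pixel, whereas the first method needs a full multiplication by the inverse projection matrix.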

With the view-space coordinates of every pixel, per-pixel lighting can be performed to produce P-Buffer1, as shown in the figure below:
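The per-light accumulation into P-Buffer1 can be sketched on the CPU as follows. The article does not specify the lighting model, so this sketch assumes Lambert diffuse plus Blinn-Phong specular for point lights without attenuation; all type and function names are hypothetical, and the camera sits at the origin because everything is in view space.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

Vec3 operator+(Vec3 a, Vec3 b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
Vec3 operator-(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
Vec3 operator*(Vec3 a, float s) { return { a.x * s, a.y * s, a.z * s }; }
Vec3 modulate(Vec3 a, Vec3 b) { return { a.x * b.x, a.y * b.y, a.z * b.z }; }
float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

Vec3 normalize(Vec3 v) {
    float len = std::sqrt(dot(v, v));
    assert(len > 0.0f);
    return v * (1.0f / len);
}

// What the Lighting stage reads back from the G-Buffer for one pixel.
struct Surface { Vec3 posView; Vec3 normalView; Vec3 diffuse; float specular; };
struct PointLight { Vec3 posView; Vec3 color; };

// One light's contribution (assumed model: Lambert + Blinn-Phong).
Vec3 shadeOneLight(const Surface& s, const PointLight& l, float shininess) {
    Vec3 toLight = normalize(l.posView - s.posView);
    Vec3 toEye = normalize(Vec3{0.0f, 0.0f, 0.0f} - s.posView);
    Vec3 halfVec = normalize(toLight + toEye);
    float nDotL = std::max(dot(s.normalView, toLight), 0.0f);
    float nDotH = std::max(dot(s.normalView, halfVec), 0.0f);
    float spec = (nDotL > 0.0f) ? std::pow(nDotH, shininess) * s.specular : 0.0f;
    return modulate(s.diffuse, l.color) * nDotL + l.color * spec;
}

// One pass per light; results are summed, which corresponds to additive
// blending into P-Buffer1.
Vec3 accumulateLights(const Surface& s, const std::vector<PointLight>& lights,
                      float shininess) {
    Vec3 sum{0.0f, 0.0f, 0.0f};
    for (const PointLight& l : lights) sum = sum + shadeOneLight(s, l, shininess);
    return sum;
}
```

Each light only touches data already stored per pixel, so adding a light adds one screen-space pass rather than another pass over the scene geometry.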

Post-Processing P-Buffer2

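The sample program applies AA and bloom. The AA pass uses the G-Buffer normals to detect triangle edges and to decide the blend weights of the 3x3 neighboring pixels, which are then blended to produce the center pixel. For more effective post-processing AA, see MLAA and SRAA (to be covered in follow-up articles). Bloom simply blurs P-Buffer1 vertically and horizontally. The Post-Processing output is shown below: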
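The horizontal-plus-vertical blur used for bloom can be sketched as two 1D passes over an image. This C++ example is a CPU stand-in for the shader passes: the `Image` type is hypothetical, the image is grayscale for brevity, and the 3-tap [1 2 1]/4 kernel with clamped borders is an assumption, since the article does not give the kernel the sample program uses.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Grayscale image in row-major order; a stand-in for P-Buffer1.
struct Image {
    int width;
    int height;
    std::vector<float> pixels;

    // Reads a pixel, clamping coordinates to the image border.
    float at(int x, int y) const {
        x = std::min(std::max(x, 0), width - 1);
        y = std::min(std::max(y, 0), height - 1);
        return pixels[y * width + x];
    }
};

// One 1D blur pass with an assumed [1 2 1]/4 kernel.
// (dx, dy) = (1, 0) blurs horizontally, (0, 1) blurs vertically.
Image blurPass(const Image& src, int dx, int dy) {
    assert(src.width > 0 && src.height > 0);
    Image dst{src.width, src.height, std::vector<float>(src.pixels.size())};
    for (int y = 0; y < src.height; ++y) {
        for (int x = 0; x < src.width; ++x) {
            dst.pixels[y * src.width + x] =
                0.25f * src.at(x - dx, y - dy) +
                0.50f * src.at(x, y) +
                0.25f * src.at(x + dx, y + dy);
        }
    }
    return dst;
}

// Horizontal pass followed by a vertical pass.
Image bloomBlur(const Image& src) {
    return blurPass(blurPass(src, 1, 0), 0, 1);
}
```

Running two 1D passes costs 3 + 3 texture reads per pixel instead of the 9 reads of the equivalent 3x3 kernel, and the saving grows with kernel size.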

MergeOutput Backbuffer

This step is simple: the Post-Processing outputs are blended and rendered to the Backbuffer. The figure below shows the final rendered result:

Finally, note that the sample program uses the framework code and textures from "Introduction to 3D Game Programming with DirectX 9.0c: A Shader Approach".

Sample program source code download: http://files.cnblogs.com/rickerliang/AmbientDiffuseSpecularDemo-DeferredShading.zip

I hope this article helps those who want to learn about deferred shading.