Rendering in UE4 (Gnomon School UE4 Masterclass Notes)

Date: 2019-12-08

Tags: rendering, ue4, gnomon school, masterclass notes

Rendering in UE4

Presented at the Gnomon School of VFX in January 2018, part two of the class offers an in-depth look at the rendering pipeline in Unreal Engine, its terminology and best practices for rendering scenes in real-time. This course also presents guidelines and profiling techniques that improve the debugging process for both CPU and GPU performance.

These notes cover the UE4 rendering pipeline in 7 parts.

Index

1.Intro
2.Before Rendering
3.Geometry Rendering
4.Rasterizing and GBuffer
5.Dynamic Lighting/Shadows
6.Static Lighting/Shadows
7.Post Processing

1.INTRO

Everything needs to be as efficient as possible
Adjust pipelines to engine and hardware restrictions
Try to offload parts to pre-calculations
Use the engine's pool of techniques to achieve quality at suitable cost
CPU and GPU handle different parts of the rendering calculations
They are interdependent and can bottleneck each other
Know how the load is distributed between the two
Real-time rendering is used not only for high-quality still images but also for interactive, dynamic scenes.
It is a trade-off between quality, features, and performance.
Adapt pipelines to the engine and hardware limits.
Precompute where possible.

Shading techniques

Real-time rendering techniques are different from offline rendering
Expensive ray-tracing features are approximated or pre-calculated
Depends on projection (rasterization)
Shading/lighting is mainly done through either deferred or forward shading; UE4 supports both

Deferred Shading

1.Composition based using the GBuffer

2.Shading happens in deferred passes

3.Good at rendering dynamic lighting

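4.More flexible when it comes to disabling features, less flexible when it comes to surface attributes

Deferred rendering: works through the GBuffer and is better suited to rendering dynamic lighting. It is more flexible when it comes to disabling features and less flexible when it comes to surface attributes.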

============================================

2.BEFORE RENDERING

CPU-Game Thread

Calculate all logic and transforms

1.Animations
2.Position of models and objects
3.Physics
4.AI
5.Spawn and destroy, hide and unhide

Anything that causes the position of objects to change

The CPU game thread computes all logic and transforms: animation, positions, physics, and spawning and destroying objects.

============================================

CPU-Draw Thread

Before we can use the transforms to render the image, we need to know what to include in the rendering

Ignoring this question might make rendering expensive on the GPU

Occlusion process: builds up a list of all visible models/objects

Happens per object, not per triangle

Staged process, in order of execution

1.Distance Culling
2.Frustum Culling
3.Precomputed Visibility
4.Occlusion Culling

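The culling stages are distance culling, frustum culling, precomputed visibility, and occlusion culling.

Culling works per object, not per triangle.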
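
The four culling stages above can be sketched as a per-object filter. This is a minimal illustration, not UE4's implementation: `SceneObject`, the bounding-sphere tests, the `precomputed_hidden` set, and the `is_occluded` callback are all hypothetical stand-ins, and the "frustum" here is just a list of planes with normals pointing inward.

```python
import math
from dataclasses import dataclass


@dataclass
class SceneObject:
    name: str
    center: tuple                       # bounding-sphere center (x, y, z)
    radius: float                       # bounding-sphere radius
    max_draw_distance: float = math.inf


def distance_culled(obj, camera_pos):
    # Stage 1: drop objects beyond their own maximum draw distance.
    return math.dist(obj.center, camera_pos) > obj.max_draw_distance


def frustum_culled(obj, planes):
    # Stage 2: each plane is (nx, ny, nz, d), normal pointing into the frustum.
    # The sphere is culled if it lies fully behind any plane.
    x, y, z = obj.center
    return any(nx * x + ny * y + nz * z + d < -obj.radius
               for nx, ny, nz, d in planes)


def visible_objects(objects, camera_pos, planes, precomputed_hidden, is_occluded):
    visible = []
    for obj in objects:                  # per object, never per triangle
        if distance_culled(obj, camera_pos):
            continue
        if frustum_culled(obj, planes):
            continue
        if obj.name in precomputed_hidden:   # Stage 3: baked visibility data
            continue
        if is_occluded(obj):                 # Stage 4: runtime occlusion test
            continue
        visible.append(obj.name)
    return visible


objects = [
    SceneObject("far_rock", (500, 0, 0), 1.0, max_draw_distance=100.0),
    SceneObject("behind_camera", (-10, 0, 0), 1.0),
    SceneObject("baked_hidden", (10, 0, 0), 1.0),
    SceneObject("crate_behind_wall", (20, 0, 0), 1.0),
    SceneObject("wall", (15, 0, 0), 1.0),
]
near_plane = [(1.0, 0.0, 0.0, -1.0)]     # keeps everything with x >= 1
result = visible_objects(
    objects, (0, 0, 0), near_plane,
    precomputed_hidden={"baked_hidden"},
    is_occluded=lambda o: o.name == "crate_behind_wall",
)
```

Ordering the cheap tests first means the more expensive occlusion test only runs on objects that survived the earlier stages.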

============================================

Occlusion Performance Implications

UE4 has a list of models to render

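1.Set up manual culling (i.e. distance culling, precomputed visibility)
2.Even things like particles occlude
3.Many small objects cause more stress on the CPU for culling
4.Large models are rarely occluded and thus increase GPU load
5.Know your world and balance object size vs. count

Performance takeaways:

1.Set up distance culling and precomputed visibility to improve performance.

2.Too many small objects hurt performance; large objects are rarely culled by occlusion.

3.Find a balance between the size and the number of objects in the scene.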

3.GEOMETRY RENDERING

GPU-Prepass/Early z pass

The GPU now has a list of models and transforms but if we just render this info out we could possibly cause a lot of redundant pixel rendering

Similar to excluding objects, we need to exclude pixels

We need to figure out which pixels are occluded

To do this, we generate a depth pass and use it to determine if the given pixel is in front and visible

The Z pass decides which pixels get rendered: occluded pixels are skipped.
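
A toy model of the idea: count how many pixel-shader invocations happen with and without a depth prepass. The fragment list and the `shade_count` function are hypothetical, and real hardware does this with a depth buffer and early depth testing rather than Python lists.

```python
def shade_count(fragments, width, height, use_prepass):
    """fragments: list of (x, y, depth) in submission order.
    Returns the number of pixel-shader invocations."""
    inf = float("inf")
    depth = [[inf] * width for _ in range(height)]

    if use_prepass:
        # Pass 1: depth only, no shading.
        for x, y, z in fragments:
            depth[y][x] = min(depth[y][x], z)
        # Pass 2: shade only the fragment that matches the nearest depth.
        return sum(1 for x, y, z in fragments if z == depth[y][x])

    # No prepass: every fragment that is nearest *so far* gets shaded,
    # even if a later fragment ends up covering it.
    shaded = 0
    for x, y, z in fragments:
        if z < depth[y][x]:
            depth[y][x] = z
            shaded += 1
    return shaded


# Pixel (0, 0) is drawn back to front three times; pixel (1, 0) once.
fragments = [(0, 0, 3.0), (0, 0, 2.0), (0, 0, 1.0), (1, 0, 5.0)]
without_prepass = shade_count(fragments, 2, 1, use_prepass=False)  # 4
with_prepass = shade_count(fragments, 2, 1, use_prepass=True)      # 2
```

The saving depends on draw order: if geometry happened to arrive front to back, the version without a prepass would shade the same number of pixels.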

============================================

Drawcalls

The GPU renders drawcall by drawcall, not triangle by triangle

A drawcall is a group of triangles sharing the same properties

Drawcalls are prepared by the CPU (Draw) thread

Distilling rendering info for objects into a GPU state ready for submission

The GPU renders objects per drawcall rather than per triangle; the CPU draw thread prepares each drawcall as GPU state and submits it.

============================================

Drawcall counts for UE4 on current-gen high-end PCs

2000-3000 drawcalls is reasonable

More than 5000 is getting high
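
As a rough sketch of what "a group of triangles sharing the same properties" means, the snippet below groups triangles by a (mesh, material) key and rates the resulting drawcall count against the figures quoted above. The grouping key and the "elevated" label for the 3000-5000 band are assumptions made for illustration; the notes only state the "reasonable" and "getting high" ranges.

```python
from collections import defaultdict


def build_drawcalls(triangles):
    """triangles: iterable of (mesh, material) pairs, one per triangle.
    Returns {(mesh, material): triangle_count}, one entry per drawcall."""
    groups = defaultdict(int)
    for mesh, material in triangles:
        groups[(mesh, material)] += 1
    return dict(groups)


def rate_drawcall_count(count):
    # Thresholds taken from the notes; the middle label is an assumption.
    if count <= 3000:
        return "reasonable"
    if count <= 5000:
        return "elevated"
    return "high"


triangles = (
    [("chair", "wood")] * 100
    + [("chair", "fabric")] * 50
    + [("table", "wood")] * 200
)
drawcalls = build_drawcalls(triangles)
rating = rate_drawcall_count(len(drawcalls))
```

Note that 350 triangles collapse into only 3 drawcalls here: the count is driven by how many distinct mesh and material combinations exist, not by the triangle total.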

4.RASTERIZING AND GBUFFER

Rasterizing

GPU ready to render pixels

Determining which pixels should be shaded is called rasterizing

Done drawcall by drawcall, then triangle by triangle

Pixel shaders are responsible for calculating the pixel color

Input is generally interpolated vertex data and texture samplers

Rasterizing inefficiency

When rasterizing dense meshes at a distance, they converge to only a few pixels

A waste of vertex processing

A 100k-triangle object seen from so far away that it is 1 pixel big will only show 1 pixel of its closest triangle!

Rasterization: the pixel shader processes the vertex data passed on from the vertex shader stage. A mesh that is very far away may cover only a few pixels, which wastes much of the vertex shader work.

============================================

Overshading

Due to hardware design, the GPU always uses a 2x2 pixel quad for processing

If a triangle is very small or very thin, it might process 4 pixels while only 1 pixel is actually filled

Because of the hardware design, pixels are processed in 2x2 quads, 4 pixels at a time.
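
The quad behaviour can be illustrated by counting how many 2x2 quads a set of covered pixels touches. This is a simplified model (the function name and pixel sets are made up for the example) that ignores everything else the hardware does.

```python
def quad_overshading(covered_pixels):
    """covered_pixels: iterable of (x, y) pixels a triangle actually fills.
    Returns (pixels_processed, pixels_filled)."""
    pixels = set(covered_pixels)
    # Every touched 2x2 quad is processed in full, i.e. 4 pixels each.
    quads = {(x // 2, y // 2) for x, y in pixels}
    return 4 * len(quads), len(pixels)


# A tiny triangle that fills a single pixel still costs a full quad.
tiny = quad_overshading([(5, 5)])

# A thin diagonal sliver: 8 filled pixels spread over 4 quads.
sliver = quad_overshading([(i, i) for i in range(8)])

# A solid 4x4 block fills its quads completely, so nothing is wasted.
block = quad_overshading([(x, y) for x in range(4) for y in range(4)])
```

Here `tiny` is `(4, 1)`, `sliver` is `(16, 8)`, and `block` is `(16, 16)`: the sliver and the block cost the same to process, but the sliver fills only half of what it pays for.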

============================================

Rasterization and Overshading Performance Implications

Triangles are more expensive to render in great density
When seen at a distance the density increases
Thus reducing triangle count at a distance (LODs/culling) is critical
Very thin triangles are inefficient because they pass through many 2x2 pixel quads yet only fill a fraction of them
The more complex the pixel shader is, the more expensive it is

Performance takeaways: dense triangles are expensive to render. Density increases with distance, so reduce the triangle count as much as possible; thin triangles are costly.
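
To make the density argument concrete, here is an illustrative heuristic that picks the most detailed LOD whose triangles-per-pixel stays under a budget. It is not how UE4 selects LODs; the `max_tris_per_pixel` budget and the LOD triangle counts are hypothetical values chosen for the example.

```python
def pick_lod(lod_tri_counts, covered_pixels, max_tris_per_pixel=1.0):
    """lod_tri_counts: triangle counts from most to least detailed.
    Returns the index of the first LOD within the density budget,
    falling back to the least detailed LOD."""
    pixels = max(covered_pixels, 1)
    for index, tri_count in enumerate(lod_tri_counts):
        if tri_count / pixels <= max_tris_per_pixel:
            return index
    return len(lod_tri_counts) - 1


lods = [100_000, 20_000, 2_000, 200]

close_up = pick_lod(lods, covered_pixels=200_000)   # 0.5 tris/pixel at LOD 0
mid_range = pick_lod(lods, covered_pixels=50_000)   # LOD 0 would be 2 tris/pixel
one_pixel = pick_lod(lods, covered_pixels=1)        # nothing fits the budget
```

With these numbers the object uses LOD 0 up close, LOD 1 at mid range, and the last LOD when it shrinks to a single pixel, at which point culling it entirely is usually the better option.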

============================================

Results are written out to:

Multiple GBuffers in case of deferred shading

Shaded buffer in case of forward shading

After rasterization, the data is written to the GBuffers used for deferred lighting.

GBuffer Performance Implications

The GBuffer takes up a lot of memory and bandwidth, and thus there is a limit on how many different GBuffer images you can render out

GBuffer memory is resolution dependent

Performance takeaways: the GBuffer uses a lot of memory and bandwidth, so the number of GBuffer images that can be rendered is limited.
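
A back-of-the-envelope calculation shows why GBuffer memory depends on resolution. The layout below (five render targets at 4 bytes per pixel each) is an assumption for illustration only and is not UE4's actual GBuffer format.

```python
def gbuffer_bytes(width, height, bytes_per_pixel_per_target):
    """Total memory for all GBuffer targets at the given resolution."""
    return width * height * sum(bytes_per_pixel_per_target)


# Hypothetical layout: 5 targets, 4 bytes per pixel each.
layout = [4, 4, 4, 4, 4]

full_hd = gbuffer_bytes(1920, 1080, layout)    # 41,472,000 bytes
ultra_hd = gbuffer_bytes(3840, 2160, layout)   # 165,888,000 bytes
```

Doubling the resolution in each dimension quadruples the memory, and every additional target adds the same per-pixel cost again, which is why the number of GBuffer images is limited.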

5.LIGHT AND SHADOWS

Two approaches for lighting and shadows

Dynamic
Static

Lighting (Deferred Shading)

Is calculated and applied using pixel shaders

Dynamic point lights are rendered as spheres

The spheres act like a mask

Anything within the sphere receives a pixel shader operation to blend in the dynamic light

A dynamic point light is rendered as a sphere that acts as a mask; pixels inside the mask are blended in the pixel shader.

============================================

Light calculation requires position

The depth buffer is used to get the pixel's position in 3D

The normal buffer is used to apply shading, based on the direction between the normal and the light

Lighting is computed from the depth buffer and the normal buffer together.
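
A minimal sketch of the per-pixel calculation, assuming the pixel's world position has already been reconstructed from the depth buffer and the normal read from the normal buffer. The Lambert N·L term is standard diffuse shading; the linear distance falloff is an assumption chosen for simplicity, not UE4's attenuation model.

```python
import math


def lambert_point_light(pixel_pos, normal, light_pos, light_radius):
    """Diffuse intensity in [0, 1] for one pixel and one point light.
    normal is expected to be unit length."""
    to_light = [l - p for l, p in zip(light_pos, pixel_pos)]
    dist = math.sqrt(sum(c * c for c in to_light))
    # Outside the light's sphere nothing is shaded (the sphere acts as a mask).
    if dist == 0.0 or dist >= light_radius:
        return 0.0
    direction = [c / dist for c in to_light]
    n_dot_l = max(0.0, sum(n * d for n, d in zip(normal, direction)))
    falloff = 1.0 - dist / light_radius   # assumed linear falloff
    return n_dot_l * falloff


up = (0.0, 0.0, 1.0)
lit = lambert_point_light((0, 0, 0), up, (0, 0, 2), 4.0)          # 0.5
facing_away = lambert_point_light((0, 0, 0), up, (0, 0, -2), 4.0)  # 0.0
out_of_range = lambert_point_light((0, 0, 0), up, (0, 0, 10), 4.0)  # 0.0
```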

============================================

Shadows

A common technique for rendering shadows is shadow maps

Check for each pixel whether it is visible to the given light or not

Requires rendering depth from the light's point of view

The shadow map is rendered in the light's view space.
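
The shadow-map test itself is a single depth comparison. In this sketch the shadow map is a small 2D list of depths as seen from the light, and the pixel has already been projected into the light's view to get its texel and depth. The `bias` parameter is an assumed value, commonly used in practice to avoid self-shadowing artifacts.

```python
def in_shadow(shadow_map, texel, depth_from_light, bias=0.01):
    """True if something closer to the light covers this pixel."""
    x, y = texel
    nearest_to_light = shadow_map[y][x]
    return depth_from_light > nearest_to_light + bias


# Depths rendered from the light's point of view (1 row, 2 texels).
shadow_map = [[5.0, 2.0]]

# This pixel *is* the nearest surface at texel (0, 0), so it is lit.
floor_lit = in_shadow(shadow_map, (0, 0), 5.0)

# At texel (1, 0) an occluder sits at depth 2.0 in front of this pixel.
floor_shadowed = in_shadow(shadow_map, (1, 0), 6.0)
```

The cost noted in the text comes from producing `shadow_map`: the scene's geometry has to be rendered again from each shadow-casting light.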

Process Pros/Cons

Pros
- Is rendered in real time using the GBuffer
- Lights can be changed, moved, or added
- Does not need any special model preparation
Cons
- Shadows especially are performance heavy

Pros and cons:
Pros: rendered in real time using the GBuffer, so lights can be adjusted dynamically
Cons: performance cost

============================================

Quality Pros/Cons

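Shadows are heavy on performance, so usually render quality is reduced to compensate
Does not do radiosity/global illumination for the majority of content
Dynamic soft shadows are very hard to do well; dynamic shadows often look sharp or blocky

Quality pros and cons:
The performance cost is high, so quality is lowered to gain performance; radiosity and global illumination are not rendered for most content; dynamic soft shadows look poor.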

============================================

Dynamic Lighting Performance Implications

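A small dynamic light is relatively cheap in a deferred renderer
The cost is down to the pixel shader operations, so the more pixels the slower it is
The radius must be as small as possible
Prevent excessive and regular overlap

Dynamic lighting performance takeaways:
Small dynamic lights are relatively cheap in a deferred renderer.
The cost comes from the pixel shader: the more pixels processed, the slower it is.
Keep the radius as small as possible and avoid excessive overlap.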
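
The relationship between radius, overlap, and cost can be sketched by counting per-pixel lighting operations. This is a simplified model in which each light is treated as a screen-space circle (a stand-in for the projected light sphere); the function and the numbers are illustrative only.

```python
def light_shading_cost(width, height, lights):
    """lights: list of (cx, cy, radius) in screen pixels.
    Returns (total lighting operations, worst per-pixel overlap)."""
    overlap = [[0] * width for _ in range(height)]
    for cx, cy, r in lights:
        for y in range(max(0, cy - r), min(height, cy + r + 1)):
            for x in range(max(0, cx - r), min(width, cx + r + 1)):
                if (x - cx) ** 2 + (y - cy) ** 2 <= r * r:
                    overlap[y][x] += 1   # one pixel-shader op per light
    total = sum(map(sum, overlap))
    worst = max(map(max, overlap))
    return total, worst


small = light_shading_cost(32, 32, [(8, 8, 2)])              # (13, 1)
large = light_shading_cost(32, 32, [(8, 8, 4)])              # (49, 1)
stacked = light_shading_cost(32, 32, [(8, 8, 2), (8, 8, 2)])  # (26, 2)
```

Doubling the radius roughly quadruples the number of shaded pixels, and stacking lights on the same area multiplies the work for every pixel they share.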

============================================

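Dynamic Shadows Performance Implications

Turn off shadow casting if not needed
The triangle count of geometry affects shadow performance
Fade or toggle off shadows when far away

Dynamic shadows performance takeaways:
Turn off unneeded shadows; triangle count affects shadow performance; simplify shadows at a distance.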

6.STATIC LIGHTING AND SHADOWS

Dynamic lights and shadows are expensive

Thus part of it is offloaded to pre-calculations/pre-rendering

This is referred to as static lights and shadows

Lighting data is stored mainly in lightmaps
Dynamic lighting is expensive, so lightmaps are used.

Lightmaps

A lightmap is a texture with the lighting and shadows baked into it

An object usually requires UV lightmap coordinates for this to work

This texture is then multiplied on top of the basecolor

Lighting information is baked into a texture that is applied on top of the existing texture data.
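
The multiply step is a per-channel product of the base color and the baked lightmap texel. A minimal sketch with colors as RGB tuples in the 0 to 1 range; the function name is hypothetical.

```python
def apply_lightmap(base_color, lightmap_texel):
    """Multiply the baked lighting on top of the base color, per channel."""
    return tuple(b * l for b, l in zip(base_color, lightmap_texel))


base = (0.8, 0.4, 0.2)

fully_lit = apply_lightmap(base, (1.0, 1.0, 1.0))    # unchanged base color
half_lit = apply_lightmap(base, (0.5, 0.5, 0.5))     # (0.4, 0.2, 0.1)
in_shadow_px = apply_lightmap(base, (0.0, 0.0, 0.0))  # black
```

Because the lighting is already stored in the texel, the runtime cost is one texture lookup and one multiply, regardless of how many lights contributed to the bake.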

============================================

Lightmass

Standalone application that handles light rendering, baking to lightmaps, and integrating into materials

Raytracer supporting GI

Supports distributed rendering over a network

Bake quality is determined by Light Build Quality as well as settings in the Lightmass section of each level

Better to have a Lightmass Importance Volume around part of the scene

Light baking: a separate module handles light rendering. It supports global illumination, and both the baked region and the quality are adjustable.

============================================

Process Pros/Cons

Super fast for performance in real time, but increases memory
Takes a long time to pre-calculate the lighting
Each time something is changed, it must be re-rendered
Models require lightmap UVs, an additional prep step that takes time

Pros and cons:
Faster at runtime, but memory use increases; precomputation takes time; scene changes require a rebake; models need lightmap UVs.

============================================

Quality Pros/Cons

Handles Radiosity and Global Illumination
Renders realistic shadows including soft shadows
Quality is dependent on lightmap resolution and UV layout
May have seams in the lighting due to the UV layout

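Quality pros and cons:
Handles radiosity and global illumination;
renders realistic shadows;
quality depends on lightmap resolution and UV layout;
the UV layout may cause seams.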

============================================

Static Lighting Performance Implications

Static Lighting always renders at the same speed
Lightmap resolution affects memory and filesize,not framerate
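Bake times are increased by:
- Lightmap resolutions
- Number of models/lights
- Higher quality settings
- Lights with a large attenuation radius or source radius

Static lighting performance takeaways:
Lightmaps affect memory and file size. Higher lightmap resolution, more lights and models, higher quality settings, and larger light radii all increase bake time.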

============================================

7.POST PROCESSING

Visual effects applied at the very end of the rendering process

Uses the GBuffers to calculate its effects

Once more relies heavily on Pixel Shaders

Post processing:
Uses the GBuffers to calculate its effects.

Examples:

Light Bloom
Depth of Field/Blurring
Some types of lens flares
Light Shafts
Vignette
Tonemapping/Color correction
Exposure
Motion Blur

Bloom,
depth of field/blur,
tonemapping/color correction,
exposure,
motion blur.

Post Processing Performance Implications

Affected directly by final resolution

Affected by shader complexity

Affected by parameters (e.g. DoF blur radius)

Post processing performance takeaways:
Affected by resolution, by shader complexity, and by parameters such as blur radius.
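
To see how resolution and a parameter such as blur radius combine, the sketch below counts texture samples for a naive, non-separable box blur. That blur is an assumption picked because its cost is easy to reason about; it is not how UE4 implements depth of field.

```python
def blur_samples(width, height, radius):
    """Texture samples for a naive box blur run over every pixel."""
    kernel = (2 * radius + 1) ** 2   # samples per pixel
    return width * height * kernel


full_hd_r2 = blur_samples(1920, 1080, 2)    # 25 samples per pixel
full_hd_r4 = blur_samples(1920, 1080, 4)    # 81 samples per pixel
ultra_hd_r2 = blur_samples(3840, 2160, 2)   # 4x the pixels of full HD
```

In this model, going from 1080p to 4K quadruples the cost at a fixed radius, while doubling the radius from 2 to 4 more than triples it at a fixed resolution.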

Reference video:

Gnomon Masterclass Part II: Rendering in UE4 | Event Coverage | Unreal Engine

https://www.youtube.com/watch?v=kp3zcyZZBVY