http://www.cnblogs.com/oiramario/archive/2012/09/26/2703277.html
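I read Minmin's post at http://www.klayge.org/2012/09/21/%E5%8E%8B%E7%BC%A9tangent-frame/
Back in February and March of this year I worked on this myself and got as far as storing the handedness in tangent.w, which solved the UV mirroring problem.
I had no idea vertex data compression went this deep, so I modified my Max exporter plugin following those materials, and the result exceeded my expectations.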
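So far I store normal and tangent as unsigned char x 4 each, and texcoord as short x 2. We can roughly work out the savings:
Before: normal = float x 3, tangent = float x 4, texcoord = float x 2 (this also depends on how many UV layers there are), for a total of 12 + 16 + 8 = 36 bytes.
After compression: normal = unsigned char x 4, tangent = unsigned char x 4, texcoord = short x 2, for a total of 4 + 4 + 4 = 12 bytes.
These attributes shrink from 36 bytes to 12 bytes per vertex, a reduction of more than half. On a model with a little over 20,000 faces, the mesh size dropped from 1388 KB to 552 KB, about 0.40 times the original size.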
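The byte counts above can be checked with a small C++ sketch. The struct and field names here are hypothetical, chosen only for illustration; they are not taken from the exporter plugin, and position data is left out because the post does not compress it.

```cpp
#include <cstdint>

// Hypothetical attribute layouts; names are assumptions for illustration.
struct UncompressedAttribs {
    float normal[3];    // 12 bytes
    float tangent[4];   // 16 bytes, w holds the handedness
    float texcoord[2];  //  8 bytes, a single UV layer
};

struct CompressedAttribs {
    std::uint8_t normal[4];    // 4 bytes: x, y, z, one spare byte
    std::uint8_t tangent[4];   // 4 bytes: x, y, z, handedness
    std::int16_t texcoord[2];  // 4 bytes
};

// Compile-time checks of the arithmetic in the text.
static_assert(sizeof(UncompressedAttribs) == 36, "12 + 16 + 8");
static_assert(sizeof(CompressedAttribs) == 12, "4 + 4 + 4");
```

With more than one UV layer the uncompressed total grows by 8 bytes per layer and the compressed total by 4, so the ratio stays in the same range.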
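I have not yet reached the point described in that article, where the tangent frame is compressed down to only 8 bytes.
The advantage is that the amount of data drops sharply, which should improve the vertex cache hit rate; I observed roughly a 5% increase in fps.
The drawbacks are slightly more computation in the vertex shader and some loss of precision caused by the compression.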
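To make the precision loss concrete, here is a minimal C++ sketch of one way to quantize a unit-vector component to an unsigned byte and to keep the handedness in the fourth byte. The function names and the rounding scheme are assumptions for illustration; the post does not show how the plugin actually encodes its data.

```cpp
#include <cmath>
#include <cstdint>

struct PackedVec4 { std::uint8_t x, y, z, w; };

// Map a component in [-1, 1] to an unsigned byte in [0, 255].
// Out-of-range input is clamped first.
inline std::uint8_t PackSnormToByte(float v) {
    if (v < -1.0f) v = -1.0f;
    if (v > 1.0f) v = 1.0f;
    return static_cast<std::uint8_t>(std::lround((v * 0.5f + 0.5f) * 255.0f));
}

// Pack tangent xyz and store the handedness sign in w:
// 0 means -1 (mirrored UVs), 255 means +1.
inline PackedVec4 PackTangent(float x, float y, float z, float handedness) {
    return { PackSnormToByte(x), PackSnormToByte(y), PackSnormToByte(z),
             static_cast<std::uint8_t>(handedness < 0.0f ? 0 : 255) };
}

// Inverse mapping, the same arithmetic a shader would apply.
inline float UnpackByteToSnorm(std::uint8_t b) {
    return static_cast<float>(b) / 255.0f * 2.0f - 1.0f;
}
```

With this scheme each component is off by at most about 1/255 (roughly 0.004) after a round trip, so renormalizing in the shader is usually worthwhile.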
References:
http://www.humus.name/Articles/Persson_CreatingVastGameWorlds.pdf
http://www.crytek.com/download/izfrey_siggraph2011.pdf
http://fabiensanglard.net/dEngine/index.php
http://oddeffects.blogspot.com/2010/09/optimizing-vertex-formats.html
Notes:
When declaring the vertex elements, use UBYTE4 or SHORT4.
D3DVERTEXELEMENT9 declExt[] =
{
    // stream, offset, type, method, usage, usageIndex
    { 0,  0, D3DDECLTYPE_FLOAT3,  D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_POSITION, 0 },
    { 0, 12, D3DDECLTYPE_UBYTE4,  D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_NORMAL,   0 },
    // 2d uv
    { 0, 16, D3DDECLTYPE_FLOAT2,  D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_TEXCOORD, 0 },
    { 0, 24, D3DDECLTYPE_SHORT2N, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_TEXCOORD, 1 },
    // tangent
    { 0, 28, D3DDECLTYPE_UBYTE4,  D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_TEXCOORD, 2 },
    D3DDECL_END()
};
In the shader, however, the input can be declared directly as float4; the GPU converts it automatically.
float4 normal : NORMAL;
or:
float4 normal : BLENDINDICES;
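Some graphics cards do not support UBYTE4 input bound to the NORMAL semantic; in that case try binding it as BLENDINDICES instead. That is also the way UBYTE4 is most commonly used.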
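One detail to keep in mind: D3DDECLTYPE_UBYTE4 is not normalized (unlike D3DDECLTYPE_UBYTE4N), so the float4 the shader receives holds values in 0..255 and has to be rescaled by hand. The sketch below mirrors that shader arithmetic in C++ so it can be checked on the CPU. The type and function names are hypothetical, and the bitangent convention (cross(normal, tangent) scaled by the handedness) is an assumption that must match whatever the exporter used.

```cpp
struct Float3 { float x, y, z; };
struct Float4 { float x, y, z, w; };

// raw holds what the shader sees for a UBYTE4 element: each component in 0..255.
// The result is rescaled to [-1, 1].
inline Float4 DecodeUByte4(const Float4& raw) {
    const float s = 2.0f / 255.0f;
    return { raw.x * s - 1.0f, raw.y * s - 1.0f,
             raw.z * s - 1.0f, raw.w * s - 1.0f };
}

inline Float3 Cross(const Float3& a, const Float3& b) {
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

// Rebuild the bitangent from the decoded normal and tangent.
// tangent.w is the handedness (-1 or +1) and flips the result for mirrored UVs.
inline Float3 RebuildBitangent(const Float4& normal, const Float4& tangent) {
    const Float3 c = Cross({ normal.x, normal.y, normal.z },
                           { tangent.x, tangent.y, tangent.z });
    return { c.x * tangent.w, c.y * tangent.w, c.z * tangent.w };
}
```

Because the bitangent is rebuilt in the shader, it does not need to be stored per vertex at all, which is part of where the size savings come from.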
================================================
Pack the normals into the w value of the position of each vertex, then you should be able to do something similar to this to read it back, and then you just need to convert it back to a normal vector in the shader (multiply by 2, then subtract 1).
To pack the normal into a float you should be able to use something like this (not tested and should probably use the proper casts instead of C style casts, and the normal needs to be normalized):
float PackNormal(const Vector3& normal)
{
    //Use 127.99999f instead of 128 so that if the value was 1 it won't be 256 which screws things up
    unsigned int packed = (unsigned int)((normal.x + 1.0f) * 127.99999f);
    packed += (unsigned int)((normal.y + 1.0f) * 127.99999f) << 8;
    packed += (unsigned int)((normal.z + 1.0f) * 127.99999f) << 16;
    return *((float*)(&packed));
}