
Unity實(shí)現(xiàn)Nanite



[USparkle Column] If you have deep expertise, love "doing a bit of research", and enjoy both sharing and learning from others, we look forward to you joining us — let sparks of insight collide and knowledge keep flowing!

This is UWA's (侑虎科技) article No. 1939, contributed by 傻頭傻腦亞古獸 — thanks to the author. Feel free to share; please do not repost without the author's permission. If you have any insights or findings of your own, we would love to hear from you and discuss (QQ group: 793972859).

Author's homepage:

https://www.zhihu.com/people/tian-cai-ya-gu-shou

I. Preliminaries

1. Introduction

Nanite is UE5's virtualized geometry system (Virtualized Geometry System); its main purpose is rendering very high-polygon models efficiently. Nanite automatically generates a LOD structure for a model. Unlike traditional LOD, Nanite's LOD is no longer per model but fine-grained down to local regions of the model, so artists no longer need to worry about authoring or managing LODs. On top of that, it gains the GPU-driven benefits of efficient culling and a single draw call.


2. Key Techniques

Nanite combines several techniques to achieve efficient rendering:

1. Cluster Rendering: triangles are organized into clusters, enabling more efficient culling.

2. Auto LOD: graph partitioning is used to split and simplify the model when building LODs, and the data is organized into a BVH so LODs can be selected efficiently and in parallel at runtime; LOD transitions built this way are extremely smooth.

3. GPU Driven Pipeline: drawing is driven by the GPU, reducing CPU overhead.

4. Occlusion Culling: finer-grained occlusion culling that rejects invisible triangles.

5. Hardware/Software Rasterization: small triangles are very unfriendly to hardware rasterization, so those triangles are rasterized in software with a compute shader for better efficiency.

6. Visibility Buffer: a visibility buffer reduces overdraw, further improving GPU efficiency.

7. Streaming: only the currently visible data is loaded, reducing the memory pressure of geometry.

3. Results in This Article

Because Nanite is a huge system with many engineering details to handle, this article simplifies and skips some parts and implements only the core; it will also differ somewhat from the UE5 version.

The images below show the results of this implementation. Each colored block is a triangle, and you can see that both LOD switching and camera culling are very smooth.


Colored blocks represent triangles


Colored blocks represent clusters

II. Implementation

1. Clusterize

Step one happens offline: efficiently and sensibly split the complex, ultra-high-resolution mesh into smaller, more manageable clusters, each with at most 128 triangles. This split is not a naive cut; it aims to minimize the number of edges connecting clusters (the cut size) while keeping cluster sizes roughly balanced.
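To make the "cut size" concrete, here is a small illustrative Python sketch (not the METIS/meshoptimizer algorithm itself): it counts the edges shared between two candidate clusters of an indexed triangle mesh, which is exactly the quantity a good partition tries to minimize.

```python
# Illustrative only: measuring the "cut" between two clusters.
# Triangles are (a, b, c) vertex-index triples; an edge is an unordered pair.

def shared_edges(cluster_a, cluster_b):
    """Count undirected edges appearing in triangles of both clusters."""
    def edges(tris):
        es = set()
        for a, b, c in tris:
            for u, v in ((a, b), (b, c), (c, a)):
                es.add((min(u, v), max(u, v)))
        return es
    return len(edges(cluster_a) & edges(cluster_b))

# two quads that share the edge (1, 2)
left = [(0, 1, 2), (0, 2, 3)]
right = [(1, 4, 5), (1, 5, 2)]
print(shared_edges(left, right))  # -> 1
```

A real partitioner balances this cut objective against keeping every cluster near the 128-triangle budget.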



UE uses the METIS library for partitioning:


https://github.com/KarypisLab/METIS

For a reference implementation, see this part of the UE5 source:

UnrealEngine-release\Engine\Source\Developer\NaniteBuilder\Private\NaniteBuilder.cpp

This article uses meshoptimizer to split the mesh into clusters and partition them; the library also offers overdraw optimization, shadow index buffer generation, and more:


https://github.com/zeux/meshoptimizer

We create a C++ project that exports a DLL and wrap a few key functions so Unity can use them. The amount of code involved is small, so porting it directly to C# would also work.

They are:

  • meshopt_buildMeshlets (build clusters)

  • meshopt_partitionClusters (partition clusters into groups)

  • meshopt_buildMeshletsBound (upper bound on cluster count)

  • meshopt_computeSphereBounds (merge bounding spheres)


Referencing these functions from C#:



unsafe static List<Cluster> clusterize(Vector3[] vertices, int[] indices)
{
    const int max_vertices = 192; // TODO: depends on kClusterSize, also may want to dial down for mesh shaders
    const int max_triangles = kClusterSize; // 128
    const int min_triangles = (kClusterSize / 3) & ~3;
    const float split_factor = 2.0f;
    const float fill_weight = 0.75f;
    int max_meshlets = BuildMeshletsBound(indices.Length, max_vertices, max_triangles); // meshopt_buildMeshletsBound
    var meshlets = new Meshlet[max_meshlets * 2];
    var meshlet_vertices = new int[max_meshlets * max_vertices];
    var meshlet_triangles = new byte[max_meshlets * max_triangles * 3];
    var meshlet_count = BuildMeshletFlex(meshlets, meshlet_vertices, meshlet_triangles, indices, indices.Length,
        vertices, vertices.Length, sizeof(float) * 3, max_vertices, min_triangles, max_triangles, 0.0f,
        split_factor); // meshopt_buildMeshlets
    List<Cluster> clusters = new List<Cluster>(meshlet_count);
    for (int i = 0; i < meshlet_count; i++)
    {
        ref Meshlet meshlet = ref meshlets[i];
        fixed (int* ptr = &meshlet_vertices[meshlet.vertex_offset])
        {
            fixed (byte* ptr2 = &meshlet_triangles[meshlet.triangle_offset])
            {
                OptimizeMeshlet(ptr, ptr2, (int)meshlet.triangle_count, (int)meshlet.vertex_count);
            }
        }

        Cluster cluster = new Cluster();
        cluster.indices = new int[meshlet.triangle_count * 3];
        for (int j = 0; j < meshlet.triangle_count * 3; ++j)
            cluster.indices[j] =
                meshlet_vertices[meshlet.vertex_offset + meshlet_triangles[meshlet.triangle_offset + j]];

        cluster.parent.error = float.MaxValue;
        clusters.Add(cluster);
    }

    return clusters;
}

The meshopt_buildMeshlets call then gives us each cluster's indices directly.

2. Build DAG

With these clusters we can now build the "LODs". We simply loop the operation group -> merge -> simplify -> clusterize, as shown below:





The process works much like mipmaps, merging and simplifying level by level while recording an error value and bounds for runtime LOD selection. The merged nodes are called Cluster Groups. The end result is a DAG (Directed Acyclic Graph).
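The loop can be sketched as a toy Python skeleton. The helpers here are stand-ins (real grouping and simplification come from METIS/meshoptimizer); only triangle counts and the monotonically accumulated error are modeled:

```python
# Toy model of the DAG build: group -> merge -> simplify to ~half ->
# re-clusterize, with the group error forced to grow monotonically so that
# parent error >= child error for every cluster.

def build_dag(leaf_tri_counts, group_size=4, max_tris=128):
    clusters = [{"tris": t, "error": 0.0, "mip": 0} for t in leaf_tri_counts]
    pending = list(range(len(clusters)))
    mip = 1
    while len(pending) > 1:
        next_pending = []
        for i in range(0, len(pending), group_size):              # group
            group = pending[i:i + group_size]
            merged = sum(clusters[c]["tris"] for c in group)      # merge
            simplified = max(1, merged // 2)                      # simplify ~half
            err = max(clusters[c]["error"] for c in group) + 1.0  # monotonic
            while simplified > 0:                                 # re-clusterize
                n = min(simplified, max_tris)
                clusters.append({"tris": n, "error": err, "mip": mip})
                next_pending.append(len(clusters) - 1)
                simplified -= n
        pending = next_pending
        mip += 1
    return clusters, pending  # all DAG nodes, plus the root cluster(s)

clusters, roots = build_dag([100] * 8)  # 8 leaf clusters of 100 triangles
```

In the real build, the error added per level comes from the mesh simplifier, and the per-group bounds are merged the same monotonic way.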

public struct ClusterGroup
{
    public List<int> Children;
    public Vector3 Bounds;
    public float radius;
    public Vector3 LODBounds;
    public float MinLODError;
    public float MaxParentLODError;
    public int MipLevel;
}

public class NaniteSubMesh
{
    public List<ClusterGroup> clusterGroupList;
    public List<Cluster> clusterList;
    public int maxMipLevel;
}

static NaniteSubMesh Nanite(Vector3[] vertices, Vector3[] normals, int[] indices)
{
    NaniteSubMesh res = new NaniteSubMesh();
    List<ClusterGroup> clusterGroupList = new List<ClusterGroup>();
    var clusters = clusterize(vertices, indices);
    res.clusterList = clusters;
    res.clusterGroupList = clusterGroupList;
    res.maxMipLevel = 0;
    for (int i = 0; i < clusters.Count; ++i)
    {
        var c = clusters[i];
        c.self = Bounds(vertices, clusters[i].indices, 0f);
        c.mip = 0;
        clusters[i] = c;
    }

    List<int> pending = new List<int>(clusters.Count);
    int[] remap = new int[vertices.Length];
    for (int i = 0; i < remap.Length; ++i)
        remap[i] = i;
    for (int i = 0; i < clusters.Count; ++i)
        pending.Add(i);

    int curMip = 1;
    byte[] locks = new byte[vertices.Length];
    while (pending.Count > 1)
    {
        List<List<int>> groups = partition(clusters, pending, remap, vertices);
        if (kUseLocks)
            lockBoundary(locks, groups, clusters, remap);
        pending.Clear();
        List<int> retry = new List<int>();
        int triangles = 0;
        int stuck_triangles = 0;
        for (int i = 0; i < groups.Count; ++i)
        {
            var curGroupClusters = groups[i];
            if (curGroupClusters.Count == 0)
                continue; // metis shortcut

            List<int> merged = new List<int>(vertices.Length);
            for (int j = 0; j < curGroupClusters.Count; ++j)
                merged.AddRange(clusters[curGroupClusters[j]].indices);
            LODBounds groupb = boundsMerge(clusters, curGroupClusters);
            ClusterGroup clusterGroup = new ClusterGroup();
            clusterGroup.Bounds = groupb.center;
            clusterGroup.MaxParentLODError = groupb.error;
            clusterGroup.radius = groupb.radius;
            clusterGroup.Children = new List<int>(merged.Count);
            clusterGroup.MipLevel = curMip - 1;
            for (int j = 0; j < curGroupClusters.Count; ++j)
                clusterGroup.Children.Add(curGroupClusters[j]);
            clusterGroupList.Add(clusterGroup);

            // aim to reduce group size in half
            int target_size = (merged.Count / 3) / 2 * 3;
            float error = 0f;
            var simplified = simplify(vertices, normals, merged.ToArray(), kUseLocks ? locks : null, target_size,
                ref error);
            if (simplified.Count > merged.Count * kSimplifyThreshold)
            {
                stuck_triangles += merged.Count / 3;
                for (int j = 0; j < curGroupClusters.Count; ++j)
                    retry.Add(curGroupClusters[j]);

                continue; // simplification is stuck; abandon the merge
            }

            // enforce bounds and error monotonicity
            // note: it is incorrect to use the precise bounds of the merged or simplified mesh, because this may violate monotonicity
            var split = clusterize(vertices, simplified.ToArray());
            groupb.error += error; // this may overestimate the error, but we are starting from the simplified mesh so this is a little more correct
            // update parent bounds and error for all clusters in the group
            // note that all clusters in the group need to switch simultaneously so they have the same bounds
            for (int j = 0; j < curGroupClusters.Count; ++j)
            {
                int clusterIndex = curGroupClusters[j];
                var t = clusters[clusterIndex];
                t.parent = groupb;
                clusters[clusterIndex] = t;
            }

            for (int j = 0; j < split.Count; ++j)
            {
                var sj = split[j];
                sj.self = groupb;
                sj.mip = curMip;
                split[j] = sj;
                clusters.Add(sj); // std::move
                pending.Add(clusters.Count - 1);
                triangles += sj.indices.Length / 3;
            }
        }

        curMip++;
    }

    if (pending.Count == 1)
    {
        var c = clusters[pending[0]];
        ClusterGroup clusterGroup = new ClusterGroup();
        clusterGroup.Bounds = c.self.center;
        clusterGroup.MaxParentLODError = c.self.error;
        clusterGroup.radius = c.self.radius;
        clusterGroup.Children = new List<int>(1);
        clusterGroup.MipLevel = curMip - 1;
        clusterGroup.Children.Add(pending[0]);
        clusterGroupList.Add(clusterGroup);
    }

    res.maxMipLevel = curMip - 1;
    return res;
}

static void lockBoundary(byte[] locks, List<List<int>> groups, List<Cluster> clusters, int[] remap)
{
    // for each remapped vertex, keep track of index of the group it's in (or -2 if it's in multiple groups)
    int[] groupmap = new int[locks.Length];
    for (int i = 0; i < groupmap.Length; ++i)
        groupmap[i] = -1;

    for (int i = 0; i < groups.Count; ++i)
    {
        var c = groups[i];
        for (int j = 0; j < c.Count; ++j)
        {
            var indices = clusters[c[j]].indices;
            for (int k = 0; k < indices.Length; ++k)
            {
                var v = indices[k];
                var r = remap[v];

                if (groupmap[r] == -1 || groupmap[r] == i)
                    groupmap[r] = i;
                else
                    groupmap[r] = -2;
            }
        }
    }

    // note: we need to consistently lock all vertices with the same position to avoid holes
    for (int i = 0; i < locks.Length; ++i)
    {
        var r = remap[i];
        locks[i] = (byte)((groupmap[r] == -2) ? 1 : 0);
    }
}

This gives us a series of clusters for every mip level.


3. Acceleration Structure

Even after grouping triangles into clusters, there are still too many of them, and a flat parallel pass in a compute shader is not efficient enough. Nanite therefore uses a BVH as an acceleration structure over the cluster groups, traversed with persistent threads to search and filter.





For the persistent-threads BVH traversal, see the UE5 source if interested: Shaders\Private\Nanite\NaniteClusterCulling.usf

UE5 also has a path that does not use persistent threads; in fact, that is generally the default.


UE5 source excerpt

Personally I find the persistent-threads approach to traversing this kind of BVH on the GPU somewhat brute-force and heavyweight, so I simplified it: several clusters are merged into one culling unit (a Part); Parts are culled in parallel first, then the clusters inside the surviving Parts are culled in parallel. This two-level structure serves as a simple replacement for persistent threads.
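A CPU-side Python sketch of this two-level scheme (the visibility tests are placeholders; the real version runs both levels as compute shader dispatches testing frustum, occlusion, and screen-space error):

```python
# Two-level culling: reject whole Parts first, then only test the clusters
# of surviving Parts. Each Part covers a contiguous cluster range.

def cull(parts, clusters, part_visible, cluster_visible):
    visible = []
    for part in parts:                          # level 1: per-Part test
        if not part_visible(part):
            continue
        start, count = part["start"], part["count"]
        for ci in range(start, start + count):  # level 2: per-cluster test
            if cluster_visible(clusters[ci]):
                visible.append(ci)
    return visible

parts = [{"start": 0, "count": 2}, {"start": 2, "count": 2}]
clusters = [{"id": i} for i in range(4)]
# pretend the second Part is frustum-culled and cluster 1 fails its own test
vis = cull(parts, clusters,
           part_visible=lambda p: p["start"] == 0,
           cluster_visible=lambda c: c["id"] != 1)
print(vis)  # -> [0]
```

The point of the first level is that a rejected Part skips all of its clusters at once, which is what makes it a cheap stand-in for BVH traversal.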

Multiple Parts are then organized into Pages for chunked loading. Material handling also differs here: UE5 records a MaterialRange per cluster, while for simplicity this implementation builds independent clusters per SubMesh.

The code is as follows:

[Serializable]
public struct NaniteCluster
{
    public int indiceIndex;
    public int indiceCount;
    public float selfErrer;
    public float parentErrer;
    public Vector4 selfSphere;
    public Vector4 parentSphere;
    public int subMeshID;
    public int vertexOffset;
};

[Serializable]
public struct NaniteClusterGroup
{
    public int ClusterStart;
    public int ClusterCount;
    public Vector3 Bounds;
    public float radius;
    public Vector3 LODBounds;
    public float MinLODError;
    public float MaxParentLODError;
    public int MipLevel;
}

[Serializable]
public struct NaniteMeshPart
{
    public int ClusterStart;
    public int ClusterCount;
    public Vector4 selfSphere;
    public float MaxParentLODError;
}

public class NaniteSubMesh
{
    public List<ClusterGroup> clusterGroupList;
    public List<Cluster> clusterList;
    public int maxMipLevel;
}

public class BuildPart
{
    public List<int> clusterList;
    public int mip;
    public int subMesh;
}
public static void BuildNaniteMesh(Mesh mesh)
{
    var vertices = mesh.vertices;
    var normals = mesh.normals;
    var uvs = mesh.uv;

    int subMeshCount = mesh.subMeshCount;
    int totalClusterCount = 0;
    int totalIndexCount = 0;
    List<NaniteSubMesh> subMeshList = new List<NaniteSubMesh>();
    for (int i = 0; i < subMeshCount; i++)
    {
        var triangles = mesh.GetTriangles(i);
        var subMesh = Nanite(vertices, normals, triangles);
        subMeshList.Add(subMesh);
        totalClusterCount += subMesh.clusterList.Count;
    }

    List<BuildPart> buildPartsList = new List<BuildPart>(totalClusterCount);
    int MAX_PART_PERPAGE = 128;
    int MAX_CLUSTER_PERPART = 8;

    for (int subMeshIndex = 0; subMeshIndex < subMeshList.Count; subMeshIndex++)
    {
        var subMesh = subMeshList[subMeshIndex];
        List<Cluster> clusters = subMesh.clusterList;
        var groupsList = subMesh.clusterGroupList;
        BuildPart buildPart = null;
        for (int i = 0; i < groupsList.Count; i++)
        {
            var gIndex = i; // sortGroups[i].OldIndex;
            var g = groupsList[gIndex];
            var childs = g.Children;
            for (int c = 0; c < childs.Count; c++)
            {
                int cIndex = childs[c];
                int cMip = clusters[cIndex].mip;
                totalIndexCount += clusters[cIndex].indices.Length;
                // new Part
                if (buildPart == null || buildPart.clusterList.Count >= MAX_CLUSTER_PERPART ||
                    buildPart.mip != cMip)
                {
                    buildPart = new BuildPart();
                    buildPart.clusterList = new List<int>(MAX_CLUSTER_PERPART);
                    buildPart.mip = cMip;
                    buildPart.subMesh = subMeshIndex;
                    buildPartsList.Add(buildPart);
                }

                buildPart.clusterList.Add(cIndex);
            }
        }
    }

    int buildPartCount = buildPartsList.Count;
    NaniteMeshPage[] pageArray = new NaniteMeshPage[(buildPartCount + (MAX_PART_PERPAGE - 1)) / MAX_PART_PERPAGE]; // ceil
    List<int> tempIndiceList = new List<int>(totalIndexCount);
    List<int> mipLists = new List<int>(totalClusterCount);
    int partIndex = 0;
    for (int i = 0; i < pageArray.Length; i++)
    {
        // create new page
        var p = ScriptableObject.CreateInstance<NaniteMeshPage>();
        pageArray[i] = p;
        tempIndiceList.Clear();
        // clamp rather than using buildPartCount % MAX_PART_PERPAGE, which yields 0 when the count divides evenly
        int partCount = Mathf.Min(MAX_PART_PERPAGE, buildPartCount - i * MAX_PART_PERPAGE);
        p.parts = new NaniteScene.NaniteMeshPart[partCount];
        List<NaniteScene.NaniteCluster> pageClusters = new List<NaniteScene.NaniteCluster>(partCount * MAX_CLUSTER_PERPART);
        for (int j = 0; j < partCount; j++)
        {
            var buildPart = buildPartsList[partIndex];
            var buildPartCluster = buildPart.clusterList;
            // create part
            var part = new NaniteScene.NaniteMeshPart();
            part.ClusterStart = pageClusters.Count; // local index
            part.ClusterCount = buildPartCluster.Count;
            int subMeshID = buildPart.subMesh;
            float maxParentErr = 0f;
            var clusters = subMeshList[subMeshID].clusterList;
            for (int c = 0; c < buildPartCluster.Count; c++)
            {
                var cluster = clusters[buildPartCluster[c]];
                mipLists.Add(cluster.mip);
                // create cluster
                NaniteScene.NaniteCluster naniteCluster = new NaniteScene.NaniteCluster();
                naniteCluster.indiceIndex = tempIndiceList.Count;
                naniteCluster.indiceCount = cluster.indices.Length;
                naniteCluster.parentErrer = cluster.parent.error;
                naniteCluster.parentSphere = new Vector4(cluster.parent.center.x, cluster.parent.center.y, cluster.parent.center.z, cluster.parent.radius);
                naniteCluster.selfErrer = cluster.self.error;
                naniteCluster.selfSphere = new Vector4(cluster.self.center.x, cluster.self.center.y, cluster.self.center.z, cluster.self.radius);
                naniteCluster.subMeshID = subMeshID;
                tempIndiceList.AddRange(cluster.indices);
                maxParentErr = Mathf.Max(naniteCluster.parentErrer, maxParentErr);
                pageClusters.Add(naniteCluster);
            }

            LODBounds partBounds = boundsMerge(clusters, buildPartCluster, true);
            part.selfSphere = new Vector4(partBounds.center.x, partBounds.center.y, partBounds.center.z, partBounds.radius);
            part.MaxParentLODError = maxParentErr;
            p.parts[j] = part;
            partIndex++;
        }
        p.clusterArray = pageClusters.ToArray();
        p.indiceArray = tempIndiceList.ToArray();
        p.clusterMip = mipLists.ToArray();
    }

    string fileName = AssetDatabase.GetAssetPath(mesh);
    string extension = Path.GetExtension(fileName);
    fileName = fileName.Replace(extension, "");
    // build page vertex data
    int totalVerts = 0;
    for (int i = 0; i < pageArray.Length; i++)
    {
        var page = pageArray[i];
        var clusterArray = page.clusterArray;
        var indiceArray = page.indiceArray;
        Dictionary<int, int> indicesMap = new Dictionary<int, int>();
        List<Vector3> tempVerts = new List<Vector3>(vertices.Length);
        List<Vector3> tempNormals = new List<Vector3>(vertices.Length);
        List<Vector2> tempUVs = new List<Vector2>(vertices.Length);
        List<int> newIndices = new List<int>(totalIndexCount);
        for (int c = 0; c < clusterArray.Length; c++)
        {
            ref var cluster = ref clusterArray[c];
            var indexStart = cluster.indiceIndex;
            var indexEnd = indexStart + cluster.indiceCount;
            for (int index = indexStart; index < indexEnd; index++)
            {
                int vertIndex = indiceArray[index];
                int newIndex;
                if (!indicesMap.TryGetValue(vertIndex, out newIndex))
                {
                    newIndex = newIndices.Count;
                    indicesMap.Add(vertIndex, newIndex);
                    tempVerts.Add(vertices[vertIndex]);
                    tempNormals.Add(normals[vertIndex]);
                    if (uvs.Length == 0)
                        tempUVs.Add(Vector2.zero);
                    else
                        tempUVs.Add(uvs[vertIndex]);

                    newIndices.Add(newIndex);
                }

                indiceArray[index] = newIndex;
            }
        }

        page.vertexStride = 5; // pos3 + uv2
        page.vertexData = new float[tempVerts.Count * page.vertexStride];
        page.vertexCount = tempVerts.Count;
        for (int v = 0; v < tempVerts.Count; v++)
        {
            int vertexIndex = v * page.vertexStride;
            page.vertexData[vertexIndex + 0] = tempVerts[v].x;
            page.vertexData[vertexIndex + 1] = tempVerts[v].y;
            page.vertexData[vertexIndex + 2] = tempVerts[v].z;
            page.vertexData[vertexIndex + 3] = tempUVs[v].x;
            page.vertexData[vertexIndex + 4] = tempUVs[v].y;
        }
        totalVerts += tempVerts.Count;
        string newPath = fileName + "_p" + i + ".asset";
        AssetDatabase.CreateAsset(page, newPath);
    }
    AssetDatabase.Refresh();

    Debug.Log("mesh Verts:" + vertices.Length + " mesh Nanite:" + totalVerts + " cluster:" + totalClusterCount + " part:" + buildPartCount + " page:" + pageArray.Length);
    NaniteMesh naniteMesh = ScriptableObject.CreateInstance<NaniteMesh>();
    {
        naniteMesh.subMeshCount = subMeshCount;
        naniteMesh.pageArray = new NaniteMeshPage[pageArray.Length];
        for (int i = 0; i < pageArray.Length; i++)
        {
            string newPath = fileName + "_p" + i + ".asset";
            naniteMesh.pageArray[i] = AssetDatabase.LoadAssetAtPath<NaniteMeshPage>(newPath);
        }
    }

    var meshBound = mesh.bounds;
    naniteMesh.boundingSphere = meshBound.center;
    naniteMesh.boundingSphere.w = meshBound.extents.magnitude;
    string meshExt = "_mesh.asset";
    AssetDatabase.CreateAsset(naniteMesh, fileName + meshExt);
    AssetDatabase.Refresh();
}

This basically wraps up the offline stage: we now get a Nanite asset. UE5 of course does much more here, such as BVH building, encoding, compression, page splitting, and vertex attribute optimization, which I consider engineering details.


4. Runtime Resources

Now for the runtime part. We need to load this Nanite mesh; for convenience, the assets are referenced directly on a script and the loading code is skipped.


The asset, object, and material information is gathered and uploaded into GPU buffers. The approach here is informal, taking the lazy route; a compute shader could also be used to update page data into the GPU buffer.

// NaniteRenderer: the renderer component type holding naniteMesh/materials (type name assumed; generic arguments were stripped in extraction)
public static List<NaniteRenderer> renderers = new List<NaniteRenderer>();
private static SceneObject[] gpuObjects = new SceneObject[2048];
// cluster -> part -> page
public struct SceneObject
{
    public int naniteMeshID;
    public Matrix4x4 localToWorldMatrix;
    public int materialIDOffset;
}
public struct NaniteRes
{
    public Vector4 boundingSphere;
    public int partIndex;
    public int partCount;
}

unsafe static void UpdateRenderList()
{
    if (renderers.Count == 0)
        return;
    // object update
    if (renderers.Count > gpuObjects.Length)
    {
        gpuObjects = new SceneObject[Mathf.NextPowerOfTwo(renderers.Count)];
    }

    objectCount = 0;
    maxPartCount = 0;
    naniteMeshes.Clear();
    materialList.Clear();
    List<int> materialIndices = new List<int>();
    for (int i = 0; i < renderers.Count; i++)
    {
        var renderer = renderers[i];
        var nMesh = renderer.naniteMesh;
        foreach (var p in nMesh.pageArray)
        {
            maxPartCount += p.parts.Length;
            maxClusterCount += p.clusterArray.Length;
        }

        SceneObject obj = new SceneObject();
        obj.localToWorldMatrix = renderer.transform.localToWorldMatrix;
        // mesh index
        int index = naniteMeshes.IndexOf(nMesh);
        if (index < 0)
        {
            index = naniteMeshes.Count;
            naniteMeshes.Add(nMesh);
        }
        obj.naniteMeshID = index;
        // material indices
        obj.materialIDOffset = materialIndices.Count;
        for (int m = 0; m < renderer.materials.Length; m++)
        {
            var mat = renderer.materials[m];
            int matIndex = materialList.IndexOf(mat);
            if (matIndex < 0)
            {
                matIndex = materialList.Count;
                materialList.Add(mat);
            }
            materialIndices.Add(matIndex);
        }
        gpuObjects[i] = obj;
        renderer.transformChanged = false;
        objectCount++;
    }

    if (candidateClusterBuffer != null)
        candidateClusterBuffer.Dispose();
    candidateClusterBuffer = new GraphicsBuffer(GraphicsBuffer.Target.Structured, maxClusterCount * 2, sizeof(int));

    if (visibleClusterBuffer != null)
        visibleClusterBuffer.Dispose();
    visibleClusterBuffer = new GraphicsBuffer(GraphicsBuffer.Target.Structured, maxClusterCount * 2, sizeof(int));

    if (objectsBuffer != null)
        objectsBuffer.Dispose();
    objectsBuffer = new GraphicsBuffer(GraphicsBuffer.Target.Structured, objectCount, sizeof(SceneObject));
    objectsBuffer.SetData(gpuObjects, 0, 0, objectCount);

    if (visObjectsBuffer != null)
        visObjectsBuffer.Dispose();
    visObjectsBuffer = new GraphicsBuffer(GraphicsBuffer.Target.Structured, objectCount, sizeof(int));

    int vertCount = 0;
    List<NaniteCluster> tempClusters = new List<NaniteCluster>(2048);
    List<NaniteMeshPart> tempParts = new List<NaniteMeshPart>(2048);
    List<NaniteRes> naniteRes = new List<NaniteRes>(2048);
    List<int> tempIndices = new List<int>(2048 * 100);
    List<float> vertexDataList = new List<float>();
    // load pages
    for (int nID = 0; nID < naniteMeshes.Count; nID++)
    {
        NaniteRes res = new NaniteRes();
        var nMesh = naniteMeshes[nID];
        // fill the GPU-side data
        var pages = nMesh.pageArray;
        res.partIndex = tempParts.Count;
        res.partCount = 0;
        res.boundingSphere = nMesh.boundingSphere;
        for (int p = 0; p < pages.Length; p++)
        {
            var page = pages[p];
            var parts = page.parts;
            int vertOffset = vertCount;
            int indicesOffset = tempIndices.Count;
            int clusterOffset = tempClusters.Count;

            // add all clusters
            var clusters = page.clusterArray;
            for (int c = 0; c < clusters.Length; c++)
            {
                var cluster = clusters[c];
                cluster.indiceIndex += indicesOffset;
                cluster.vertexOffset = vertOffset;
                tempClusters.Add(cluster);
            }

            // add all parts
            for (int partIndex = 0; partIndex < parts.Length; partIndex++)
            {
                var part = parts[partIndex];
                part.ClusterStart += clusterOffset;
                tempParts.Add(part);
                res.partCount++;
            }

            // add page data
            tempIndices.AddRange(page.indiceArray);
            vertexDataList.AddRange(page.vertexData);
            vertCount += page.vertexCount;
        }
        naniteRes.Add(res);
    }

    // TODO: update the buffers on the GPU instead
    if (naniteResBuffer != null)
        naniteResBuffer.Dispose();
    naniteResBuffer = new GraphicsBuffer(GraphicsBuffer.Target.Structured, naniteRes.Count, sizeof(NaniteRes));
    naniteResBuffer.SetData(naniteRes);

    if (partsBuffer != null)
        partsBuffer.Dispose();
    partsBuffer = new GraphicsBuffer(GraphicsBuffer.Target.Structured, tempParts.Count, sizeof(NaniteMeshPart));
    partsBuffer.SetData(tempParts);

    if (clusterBuffer != null)
        clusterBuffer.Dispose();
    clusterBuffer = new GraphicsBuffer(GraphicsBuffer.Target.Structured, tempClusters.Count, sizeof(NaniteCluster));
    clusterBuffer.SetData(tempClusters);

    if (indiceseBuffer != null)
        indiceseBuffer.Dispose();
    indiceseBuffer = new GraphicsBuffer(GraphicsBuffer.Target.Raw, tempIndices.Count, sizeof(int));
    indiceseBuffer.SetData(tempIndices);

    if (materialIndexBuffer != null)
        materialIndexBuffer.Dispose();
    materialIndexBuffer = new GraphicsBuffer(GraphicsBuffer.Target.Structured, materialIndices.Count, sizeof(int));
    materialIndexBuffer.SetData(materialIndices);

    if (vertexDataBuffer != null)
        vertexDataBuffer.Dispose();
    vertexDataBuffer = new GraphicsBuffer(GraphicsBuffer.Target.Raw, vertexDataList.Count, sizeof(float));
    vertexDataBuffer.SetData(vertexDataList);
}

// input object ID =>
public unsafe static void UpdateNaniteScene()
{
    if (renderListDirty)
    {
        UpdateRenderList();
        // UpdateRenderListGPU();
        renderListDirty = false;
    }

    for (int i = 0; i < renderers.Count; i++)
    {
        var renderer = renderers[i];
        if (renderer.transformChanged)
        {
            gpuObjects[i].localToWorldMatrix = renderer.transform.localToWorldMatrix;
            renderer.transformChanged = false;
            transformDirty = true;
        }
    }

    if (objectsBuffer != null && transformDirty)
        objectsBuffer.SetData(gpuObjects, 0, 0, objectCount);
}

5. Culling

By this point, the offline stage has already flattened the clusters into arrays, and these clusters can be culled in parallel. The clever part is that each cluster records both its parent's error and its own error: given an error threshold, every cluster can independently decide whether it is culled, with no dependence on its parents or children.
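That decision reduces to a single predicate; a Python sketch (with made-up error values) shows that exactly one level along any root-to-leaf path passes, because the errors are monotonic:

```python
# A cluster is drawn iff its own error is acceptable at the current threshold
# while its parent group's error is not. Monotonic errors (parent >= self)
# guarantee exactly one cluster along each DAG path passes the test.

def cluster_is_drawn(self_error, parent_error, threshold):
    return self_error <= threshold < parent_error

# (self_error, parent_error) pairs along one path, leaf to root
path = [(0.0, 1.0), (1.0, 2.0), (2.0, 4.0), (4.0, float("inf"))]
drawn = [p for p in path if cluster_is_drawn(p[0], p[1], 1.5)]
print(drawn)  # -> [(1.0, 2.0)]
```

In the real shader, both errors are first projected to screen space from the recorded bounding spheres before being compared to the threshold.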




Culling starts with a compute shader dispatch from the CPU. Because the maximum Part/Cluster counts across all objects are already known when the data is assembled, those counts are used directly for the dispatch.


Object culling:


For each object, find the NaniteMesh's Parts and cull them:


Cluster culling:



6. Software Rasterization

Omitted.

7. VisibilityBuffer

The VBuffer is mainly used to reduce overdraw: the shader directly outputs the InstanceID, ClusterID, and material ID, and the vertex data for shading is later reconstructed from this VBuffer.
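For illustration, here is one possible per-pixel packing; the bit split is an assumption for this sketch, not UE5's actual layout:

```python
# 32-bit visibility value: high bits = cluster ID, low bits = triangle index.
# 7 bits suffice for the triangle index since a cluster holds at most 128
# triangles; instance and material IDs can be looked up from the cluster.
TRI_BITS = 7

def pack(cluster_id, tri_index):
    return (cluster_id << TRI_BITS) | tri_index

def unpack(v):
    return v >> TRI_BITS, v & ((1 << TRI_BITS) - 1)

print(unpack(pack(12345, 97)))  # -> (12345, 97)
```

Shading then reads this value per pixel, fetches the triangle's three vertices, and interpolates attributes, so geometry is only shaded where it is actually visible.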



This is one of the benefits of GPU Driven rendering: a single DrawProceduralIndirect can draw all the objects:


One DrawProceduralIndirect draws multiple objects


Which attributes the VBuffer stores, and with how many bits, are engineering details I will not examine here.


8. Shading

With the VBuffer in hand, drawing proceeds per material; the original approach bins material IDs into screen tiles and combines them with indirect draws of quads.
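The tile classification step can be sketched as follows (Python, names mine): scan the VBuffer's material IDs per screen tile, so that each material's indirect draw only needs to touch the tiles where that material actually appears.

```python
# Bin screen tiles by the material IDs present in them.
# material_ids: 2D grid of per-pixel material IDs (a stand-in for the VBuffer).

def bin_tiles(material_ids, tile_size):
    bins = {}
    rows, cols = len(material_ids), len(material_ids[0])
    for ty in range(0, rows, tile_size):
        for tx in range(0, cols, tile_size):
            mats = {material_ids[y][x]
                    for y in range(ty, min(ty + tile_size, rows))
                    for x in range(tx, min(tx + tile_size, cols))}
            for m in mats:
                bins.setdefault(m, []).append((tx // tile_size, ty // tile_size))
    return bins

grid = [[0, 0, 1, 1],
        [0, 0, 1, 1]]
print(bin_tiles(grid, 2))  # material 0 -> tile (0,0), material 1 -> tile (1,0)
```

On the GPU this would be a classification pass writing per-material tile lists and indirect draw arguments, followed by one quad draw per material over its tiles.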




Note that the UVs interpolated from the VBuffer via triangle barycentrics cannot be used to sample textures directly, because the DDX/DDY derivatives are wrong. They must be recomputed (the code is below), and sampling must go through SampleGrad(samplerName, coord2, dpdx, dpdy).

uint MurmurMix(uint Hash)
{
    Hash ^= Hash >> 16;
    Hash *= 0x85ebca6b;
    Hash ^= Hash >> 13;
    Hash *= 0xc2b2ae35;
    Hash ^= Hash >> 16;
    return Hash;
}

float3 IntToColor(uint Index)
{
    uint Hash = MurmurMix(Index);

    float3 Color = float3
    (
        (Hash >> 0) & 255,
        (Hash >> 8) & 255,
        (Hash >> 16) & 255
    );

    return Color * (1.0f / 255.0f);
}

struct FBarycentrics
{
    float3 Value;
    float3 Value_dx;
    float3 Value_dy;
};

// note: the derivatives of a float2 attribute are themselves float2, so they
// are returned as two separate out parameters
float2 Lerp(float2 Value0, float2 Value1, float2 Value2, FBarycentrics Barycentrics, out float2 dx, out float2 dy)
{
    float2 Value = Value0 * Barycentrics.Value.x + Value1 * Barycentrics.Value.y + Value2 * Barycentrics.Value.z;
    dx = Value0 * Barycentrics.Value_dx.x + Value1 * Barycentrics.Value_dx.y + Value2 * Barycentrics.Value_dx.z;
    dy = Value0 * Barycentrics.Value_dy.x + Value1 * Barycentrics.Value_dy.y + Value2 * Barycentrics.Value_dy.z;

    return Value;
}

/** Calculates perspective correct barycentric coordinates and partial derivatives using screen derivatives. */
FBarycentrics CalculateTriangleBarycentrics(float2 PixelClip, float4 PointClip0, float4 PointClip1,
    float4 PointClip2, float2 ViewInvSize)
{
    FBarycentrics Barycentrics;
    PixelClip.y = 1 - PixelClip.y;
    PixelClip.xy = PixelClip.xy * 2 - 1;
    const float3 RcpW = rcp(float3(PointClip0.w, PointClip1.w, PointClip2.w));
    const float3 Pos0 = PointClip0.xyz * RcpW.x;
    const float3 Pos1 = PointClip1.xyz * RcpW.y;
    const float3 Pos2 = PointClip2.xyz * RcpW.z;

    const float3 Pos120X = float3(Pos1.x, Pos2.x, Pos0.x);
    co...
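The excerpt above is truncated, but the idea it implements can be checked numerically with a small Python sketch (2D projected positions plus per-vertex clip-space w; names mine): screen-space affine barycentrics are computed first, then divided by each vertex's w and renormalized to get perspective-correct weights suitable for interpolating UVs.

```python
# Perspective-correct barycentrics from projected 2D positions and clip-space w.

def perspective_barycentrics(p, v0, v1, v2):
    """p: 2D pixel position; each v: ((x, y) projected position, w)."""
    (x0, y0), w0 = v0
    (x1, y1), w1 = v1
    (x2, y2), w2 = v2
    area = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    # affine (screen-space) weights via signed sub-triangle areas
    l1 = ((p[0] - x0) * (y2 - y0) - (x2 - x0) * (p[1] - y0)) / area
    l2 = ((x1 - x0) * (p[1] - y0) - (p[0] - x0) * (y1 - y0)) / area
    l0 = 1.0 - l1 - l2
    b = [l0 / w0, l1 / w1, l2 / w2]  # weight by 1/w
    s = b[0] + b[1] + b[2]
    return [bi / s for bi in b]      # renormalize so the weights sum to 1
```

The derivative terms in the HLSL version apply the same construction to pixels offset by one in x and y (via ViewInvSize), yielding the analytic ddx/ddy that SampleGrad needs.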

Notice: The content above (including the pictures and videos if any) is uploaded and posted by a user of NetEase Hao, which is a social media platform and only provides information storage services.
