Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Excessive batches when we update GltFast 2.1 to 4.0 (low fps) #197

Closed
ketourneau opened this issue Jun 18, 2021 · 15 comments
Closed

Excessive batches when we update GltFast 2.1 to 4.0 (low fps) #197

ketourneau opened this issue Jun 18, 2021 · 15 comments
Assignees
Labels
bug Something isn't working

Comments

@ketourneau
Copy link

Hi,

We have automated benchmark and we see fps regression when we update GltFast 2.1 to 4.0 (we got excessive batch).

GltFast 2.1

  • 19980 batches
  • 674 batches after optimization (StaticBatchingUtility.Combine)
  • 50 FPS

GltFast 4.0

  • 20056 batches
  • 758 batches after optimization (StaticBatchingUtility.Combine)
  • 45 FPS

In both GltFast version we got

  • Vertices: 16 499 855
  • Triangles: 8 869 527
  • Objects: 20 062
  • Materials: 128

It seems is the StaticBatchingUtility.Combine fault ? But why ?
Would it be the way to build a mesh in GltFast 4.0 that would have an impact to StaticBatchingUtility.Combine ?

@atteneder
Copy link
Owner

Hi @ketourneau ,

Thanks for providing this insight.

Unfortunately I cannot think of anything, that could cause this behavior. Are these results reproducable? Could you provide a sample project demonstrating this?

Regards

@atteneder atteneder added the question Further information is requested label Jun 21, 2021
@ketourneau
Copy link
Author

Hi @atteneder ,

After some more test, we got x2 in CPU & render thread.
We attach the project to reproduce (we don't add optimization, we only use GltFast).

GltFast

GltFast-4-0

@ketourneau
Copy link
Author

ketourneau commented Jul 8, 2021

Hi @atteneder,

Do you have time to look at the regression between GltFast 2.1 => GltFast 4.0 ?

@atteneder
Copy link
Owner

Sorry for the late reply, I haven't found time to investigate yet.

I presume some change in how the hierarchy is created is to blame here. Or maybe (if the glTFs contain animation) it's the SkinnnedMeshRenderer that isn't batched anymore. In that case I'd say it's by design, but I don't want to theorize before taking measurements.

@ketourneau
Copy link
Author

The gltf doesn't contains animation, so SkinnnedMeshRenderer is not used.

@ketourneau
Copy link
Author

@atteneder How can we help you to debug the regression ?
We don't really understand why we got more render thread ms (343ms => 662ms).

@atteneder
Copy link
Owner

Note: I'm currently on paternal leave and will only answer sporadically. I'll be able to provide help later this month.

@atteneder
Copy link
Owner

I can reproduce it and I'm baffled by this one as well :/ I suspected the shader changes, but using the old 3.2.1 shaders with 4.2.0 didn't improve performance.

@ketourneau Did you compare builds as well or just Editor?

Testing is hard, since the scene is huge and unoptimized. Would it be able for you to provide a glTF file that's more pleasant to test? This one (although below 100MB) freezes both Blender and online web viewers. I'm actually amazed how fast it loads with glTFast 🏎

@atteneder
Copy link
Owner

@ketourneau Did you compare builds as well or just Editor?

I just did....similar result

@atteneder
Copy link
Owner

Suspicion: DracoUnity (from version 2.x on) always creates 32bit indices, also for smaller meshes that could get away with 16bit indices.

@atteneder
Copy link
Owner

I brought back 16bit indices (not released yet), which made it a bit better (also less memory usage), but still a lot worse than with DracoUnity 1.4.

Turns out putting the normals on a dedicated vertex stream is what's causing the performance impact :/

Expect a fix soonish. I'll try to figure out why it is like that.

@ketourneau
Copy link
Author

Hi @atteneder, I tried with other glTF file but none of them put as well the regression...

Thank you for your time, I hope you will find a fix 🤞

@atteneder
Copy link
Owner

Alright, this should be fixed when updating DracoUnity to version 3.2.0.

@ketourneau Thank you very much for bringing this to my attention. Although tedious to fix I got some useful insights from it.

This scene holds >18k objects with an average size of only ~110 triangles.
It appears as if the graphics API overhead for binding an additional vertex stream (for separated position/normal data) is costing a lot of time.
Therefore DracoUnity now combines streams for meshes with 65k triangles and below.

That being said, you could optimize this scene a lot by merging all objects with equal materials. A lot of materials are identical, but different instances (making it harder to batch/merge them automatically). You could safe a lot of draw calls by batching them together.

hth,
Andi

@atteneder atteneder self-assigned this Aug 27, 2021
@atteneder atteneder added bug Something isn't working and removed question Further information is requested labels Aug 27, 2021
@ketourneau
Copy link
Author

Hi @atteneder,

Yes already make optimization in our app: we merge all objects with equal materials for optimization and we call StaticBatchingUtility.Combine on root GameObject.

We try DracoUnity 3.2.0 and GltFast 4.2.1 and we always have FPS refression (more batches).

  • GltFast 4.2.1 & DracoUnity 3.2.0

GltFast-4 0

  • GltFast 2.1 && DracoUnity 1.4

GltFast

I will try to find a gtlf that I can share.

@atteneder
Copy link
Owner

@ketourneau To be more precise: I recommend merging objects of same material at design-time (before exporting to glTF). You could remove the overhead of merging similar materials at runtime this way.

atteneder added a commit to Unity-Technologies/com.unity.cloud.gltfast that referenced this issue Dec 16, 2024
…o subfolder. (atteneder#190)

* refactor: Moved package content into sub-folder `Packages/com.unity.cloud.gltfast`.

* feat: Added global README.

* refactor: Adopting CI scripts to new monorepo structure.

* refactor: Moved package tests to dedicated tests package.

* feat: Added test projects to be used for development and testing.

* fix: RenderPipelineAssets reside in tests package now.

* fix: Declare related package.

* doc: Added built-in modules to the installation docs.

* doc: Added development docs.

* feat(ci): Static analysis

* chore: Updated test dependencies.

* fix: Changelog markdown structure fix

* fix: XML doc additions and fixes.

* fix: Removed unnecessary "type" property from `package.json`.

* fix: Removed warning about obsolete `GraphicsDeviceType.OpenGLES2` in Unity 2023.1 or newer.

* fix(export): Missing texture transform if texture on glTFast material was scaled vertically only.

* fix: Moving PNG image file to GIT LFS by fixing extension to being lowercase now.

* chore: Removed outdated and unused code coverage badge.

* refactor: Code examples (atteneder#194)
  * refactor: Moved code examples that are referenced by the documentation into a different folder (`DocExamples`).
  * refactor: Renamed code example assembly to `glTFast.Documentation.Examples` and namespace to `GLTFast.Documentation.Examples` for consistency.
  * chore: Removed outdated API documentation filters.
  * fix: Exclude documentation example code from code coverage.
  * fix: Image file name changed (`Images/gltf-extra-data.png`).

* feat(ci): Package Verification Profile checks and tooling. (atteneder#195)
  * feat: Added PVP check to pack job.
  * feat(ci): Replaced upm-ci with upm-pvp/UTR for packing, package tests, player tests and publishing.
  * API validation is still done via upm-ci.
  * refactor: Merged jobs `build_and_run_*` into `run_*` to remove redundant scripts.
  * feat(ci): Automated publish and dry run publish jobs.
  * feat(ci): PR trigger comment `pr` for PRs that don't target `main`/`develop` branch.

* fix: Various issues reported by static analysis. (atteneder#196)

  * fix: Improved reliability by adding null checks and imprecision-aware floating-point comparisons in various places.
  * fix: Async methods return a `Task` type where feasible.
  * fix: Using immutable fields only in hash calculation for `ImageExport` classes.
  * refactor: Renamed property ImageExport.ValidAssetPath to be pascal case.
  * fix(export): Consolidate textures based on dedicated comparison (`TextureComparer.Equals`) instead of problematic equality implementation.
  * refactor: Replaced `GetHashCode` implementation referencing mutable fields with default fallback to avoid potential unexpected behavior.
  * refactor: `MeshPrimitiveComparer` is now used for clustering mesh primitives (instead of `GetHashCode`/`Equals` on `MeshPrimitive` and sub-types).
  * chore(ci): Renamed variable to avoid confusion between project key and name.
  * fix(ci): Use GIT branch as version
  * chore(ci): Use repository relative paths for reports.
  * chore: Updated code coverage exclusion paths.
  * doc: Added missing changelog entry.
  * doc: Explicitly mention lack of morph target export support.
  * fix(ci): Ensure all jobs succeeds with Windows and 2020 LTS.
  * feat(ci): Ported sonarscanner script from macOS to Windows to unify it with code coverage jobs.

* feat(test): Package coherence tests that make sure package versions match across exported generator string and documentation. (atteneder#197)

* fix: Set minimum required Unity version to 2020.3.48f1 in the documentation.

* doc: Developer onboarding (atteneder#198)

* chore: Bumped package versions

* fix(ci): Consistently repeat once after a failed run for all test jobs.

* chore(ci): Disabled `enablePackageManagerTraces`

* chore: Removed obsolete vscode pkg.

* fix(export): Avoid potential loss of data by allocating output streams persistently.

* feat(test): Added glTF assets for testing (atteneder#192)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants