feat(profiling): Integrate Tracy profiler #2202
stephanmeesters wants to merge 6 commits into TheSuperHackers:main
Greptile Overview
| Filename | Overview |
|---|---|
| cmake/config-build.cmake | Adds RTS_BUILD_OPTION_PROFILE_TRACY option and conditionally links Tracy when enabled |
| Core/Libraries/Include/rts/profile.h | Adds Tracy header include and macro definitions with no-op fallbacks for disabled builds |
| GeneralsMD/Code/GameEngineDevice/Include/W3DDevice/GameClient/W3DDisplay.h | Declares Tracy frame capture methods and member variables for texture/surface management |
| GeneralsMD/Code/GameEngineDevice/Source/W3DDevice/GameClient/W3DDisplay.cpp | Implements Tracy frame capture with GPU-side downscaling and BGRA-to-RGBA pixel shader conversion |
| GeneralsMD/Code/GameEngine/Source/GameLogic/System/GameLogic.cpp | Adds Tracy zones throughout game logic update loop and plots for logic frame number |
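The no-op fallback approach described for profile.h can be sketched roughly as follows. This is an illustration only, not the PR's actual header: `RTS_PROFILE_TRACY` is a hypothetical compile definition and `advanceLogicFrame` is a made-up function. Only the Tracy macro names (`ZoneScopedN`, `ZoneScopedNC`, `TracyPlot`, `FrameMark`) come from Tracy itself.

```cpp
#include <cassert>
#include <cstdint>

// RTS_PROFILE_TRACY is a hypothetical compile definition used for this sketch;
// the name the PR actually uses may differ.
#ifdef RTS_PROFILE_TRACY
#include <tracy/Tracy.hpp>  // provides ZoneScopedN, ZoneScopedNC, TracyPlot, FrameMark
#else
// No-op fallbacks: instrumented code compiles unchanged when Tracy is disabled.
#define ZoneScopedN(name) ((void)0)
#define ZoneScopedNC(name, color) ((void)0)
#define TracyPlot(name, value) ((void)0)
#define FrameMark ((void)0)
#endif

// Hypothetical instrumented function: advances a frame counter by one.
int advanceLogicFrame(int frame)
{
    ZoneScopedN("advanceLogicFrame");
    assert(frame >= 0);
    const int next = frame + 1;
    TracyPlot("LogicFrame", static_cast<std::int64_t>(next));
    return next;
}
```

With the definition absent, every macro expands to an empty expression, so call sites need no `#ifdef` guards of their own.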
Sequence Diagram
sequenceDiagram
participant GC as GameClient::update()
participant GL as GameLogic::update()
participant D as Display::draw()
participant W3DD as W3DDisplay::draw()
participant Tracy as Tracy Profiler
Note over GC: Frame Start
GC->>Tracy: FrameMark
Note over GC,GL: Logic Update Phase
GC->>GL: update()
GL->>Tracy: ZoneScopedNC("GameLogic::update", green)
GL->>Tracy: TracyPlot("LogicFrame", frameNumber)
GL->>GL: ScriptEngine, TerrainLogic, AI, etc.
Note over GL: Each subsystem wrapped in ZoneScopedN
GL->>Tracy: TracyPlot("PathfindCells", cellCount)
GL->>Tracy: TracyPlot("PathfindPaths", pathCount)
Note over GC,W3DD: Render Phase
GC->>Tracy: ZoneScopedNC("Render", blue)
GC->>D: draw()
D->>W3DD: draw()
W3DD->>Tracy: ZoneScopedN zones for rendering stages
W3DD->>W3DD: Render 3D scene, UI, particles
W3DD->>W3DD: TracyCaptureImage()
W3DD->>Tracy: FrameImage(pixels, width, height)
Note over GC: Frame End
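To make the nesting in the diagram concrete, here is a small stand-in that mimics how a scoped zone opens at construction and closes at scope exit. `TraceLog`, `ScopedZone` and `runFrame` are hypothetical names for this sketch. A real build would use Tracy's `FrameMark` and `ZoneScopedN` instead, and the "AI" zone simply stands for one of the wrapped subsystems.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical recorder standing in for the profiler.
struct TraceLog {
    std::vector<std::string> events;
    int depth = 0;  // number of zones currently open
};

// Hypothetical stand-in for a scoped zone: logs on entry and on scope exit.
class ScopedZone {
public:
    ScopedZone(TraceLog& log, const char* name) : m_log(log), m_name(name) {
        m_log.events.push_back("begin " + m_name);
        ++m_log.depth;
    }
    ~ScopedZone() {
        assert(m_log.depth > 0);  // every end must match an earlier begin
        --m_log.depth;
        m_log.events.push_back("end " + m_name);
    }
    ScopedZone(const ScopedZone&) = delete;
    ScopedZone& operator=(const ScopedZone&) = delete;

private:
    TraceLog& m_log;
    std::string m_name;
};

// One frame in the order the diagram shows: frame mark, logic update, render.
TraceLog runFrame() {
    TraceLog log;
    log.events.push_back("FrameMark");
    {
        ScopedZone logic(log, "GameLogic::update");
        { ScopedZone ai(log, "AI"); }
    }
    {
        ScopedZone render(log, "Render");
        { ScopedZone draw(log, "W3DDisplay::draw"); }
    }
    return log;
}
```

Because the zone closes in the destructor, early returns inside a subsystem still end the zone correctly, which is why scope-based macros suit a large existing update loop.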
Would you mind adding a few images or a short video to the first post?
Done. Added examples.
Never mind, I accidentally had DXVK DLLs in the binary dir and that changed the profile: the
Merge by rebase
This PR adds the Tracy profiler to the CMake presets win32-profile and win32-vcpkg-profile.
A variable RTS_BUILD_OPTION_PROFILE_TRACY is added to enable Tracy. Enabling it disables the old profiler.
It is not added to the vc6-profile preset because the Tracy library can't compile against it. The preset mingw-w64-i686-profile is also skipped because it does not build yet (see #2163).
Library include
Tracy is added as a package to VCPKG (version 0.11.1) and as a CMake FetchContent_Declare (version 0.13.1). These are the latest available versions at the time. Unfortunately the VCPKG package is old, but the VCPKG GitHub has PRs in progress to update Tracy to the latest version. Version 0.13.1 improves coloring immensely, which is important for seeing at a glance whether we are render-bound or update-bound.
Tracy themselves recommend adding the library as a CMake include because the version of the library included in the game must exactly match the version of their main executable tracy-profiler.exe. This can be guaranteed by building tracy-profiler.exe and copying it to the build dir; however, we cannot do this right now because it requires a 64-bit compile.
What are we tracing?
A number of zones were picked that capture the majority of processing time, and will hopefully be useful for profiling sessions.
Performance impact
Tracy itself has a negligible impact; however, the frame capturing does have a minor impact (0.5 ms average per frame). I have done my best to optimize the frame capturing by downscaling the backbuffer on the GPU before copying it to the CPU. Moving from DX8 to DX9 will make it possible to simplify some of this code and perhaps increase performance too.
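For readers unfamiliar with what the capture path has to produce, the following CPU-side sketch shows the two transformations involved: choosing a downscaled size and swapping BGRA to RGBA. It is not the PR's implementation, which does both on the GPU with a pixel shader. `CaptureSize`, `captureSize` and `bgraToRgba` are hypothetical names, and rounding to a multiple of 4 reflects my understanding of Tracy's frame image requirements, so treat it as an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper type for this sketch.
struct CaptureSize {
    int width;
    int height;
};

// Downscale by an integer divisor, then round each dimension down to a
// multiple of 4 (assumed requirement for profiler frame images).
CaptureSize captureSize(int backbufferWidth, int backbufferHeight, int divisor) {
    assert(backbufferWidth >= 0 && backbufferHeight >= 0 && divisor > 0);
    CaptureSize size;
    size.width = (backbufferWidth / divisor) & ~3;
    size.height = (backbufferHeight / divisor) & ~3;
    return size;
}

// Swap the blue and red bytes of every 4-byte pixel in place: BGRA -> RGBA.
void bgraToRgba(std::vector<std::uint8_t>& pixels) {
    assert(pixels.size() % 4 == 0);
    for (std::size_t i = 0; i < pixels.size(); i += 4) {
        std::swap(pixels[i], pixels[i + 2]);
    }
}
```

Downscaling before the copy matters because the readback cost scales with the number of pixels transferred: dividing both dimensions by 4 cuts the data to roughly one sixteenth.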
Future work
It should be determined whether this will be sufficient to replace the old profiling code. At that point, and when all other profiles build, we can activate Tracy using only RTS_BUILD_OPTION_PROFILE. Another interesting thing to trace would be a plot of the memory usage.
Todo
Example of a capture: render-bound frame (top), update-bound frame (bottom).
