Awesome explanation.
Feralidragon wrote: ↑Wed Mar 17, 2021 6:03 pm
There was a confirmed problem concerning one of the renderers using Shader 2.0 in 469b, and integrated GPUs not really supporting it.
Gustavo6046 wrote: ↑Wed Mar 17, 2021 4:30 pm
OpenGLDrv.so's UOpenGLRenderDevice::DrawComplexSurface(FSceneNode*, FSurfaceInfo&, FSurfaceFacet&) seems to segfault with the Intel driver (i915), which it didn't before; I suspect it's a configuration issue, but I simply set ShaderType to SHADER_ARB, then I set it back to SHADER_None and it still didn't fix it!
It does work with the Nouveau backend (switched by setting the environment variable DRI_PRIME=1 prior to starting the game), but I'd rather it run on the iGPU. It was working so well before... :<
note: I don't know the source code, I just demangled the C++ symbol "_ZN19UOpenGLRenderDevice18DrawComplexSurfaceEP10FSceneNodeR12FSurfaceInfoR13FSurfaceFacet".
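For anyone who wants to reproduce the demangling step mentioned in the note, here is a minimal sketch. It assumes a GCC or Clang toolchain, since `abi::__cxa_demangle` comes from the Itanium C++ ABI support library rather than from the C++ standard; the helper name `demangle` is made up for this example and has nothing to do with the game's code.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper: returns the demangled form of an Itanium-ABI
// mangled symbol, or the input unchanged if demangling fails.
std::string demangle(const std::string& mangled) {
    int status = 0;
    // Passing nullptr for the buffer makes the runtime malloc() the result.
    char* out = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        return mangled;
    }
    std::string result(out);
    std::free(out);  // the caller owns the malloc()'d buffer
    return result;
}
```

Calling `demangle` on the symbol quoted in the note should give back the `UOpenGLRenderDevice::DrawComplexSurface(FSceneNode*, FSurfaceInfo&, FSurfaceFacet&)` signature from the first paragraph. Running the symbol through the `c++filt` command-line tool should do the same job without writing any code.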
Developer Backtrace:
[ 1] Core.so(_Z12HandleSignali+0x20b) [0xf79746cb]
[ 2] linux-gate.so.1(__kernel_sigreturn+0) [0xf7f3f560]
[ 3] /usr/lib32/dri/i965_dri.so(+0x366d74) [0xf5815d74]
[ 4] /usr/lib32/dri/i965_dri.so(+0xabac7) [0xf555aac7]
[ 5] /usr/lib32/dri/i965_dri.so(+0x237fa2) [0xf56e6fa2]
[ 6] OpenGLDrv.so(_ZN19UOpenGLRenderDevice18DrawComplexSurfaceEP10FSceneNodeR12FSurfaceInfoR13FSurfaceFacet+0xb09) [0xdfef7b39]
[ 7] Render.so(_ZN7URender9DrawFrameEP10FSceneNode+0x132a) [0xe15c401a]
[ 8] Render.so(_ZN7URender9DrawWorldEP10FSceneNode+0x351) [0xe15c6161]
[ 9] Engine.so(_ZN11UGameEngine4DrawEP9UViewportiPhPi+0x972) [0xf7c6de72]
SDLDrv.so(_ZN12USDLViewport7RepaintEi+0x93) [0xf2dd3c83]
SDLDrv.so(_ZN10USDLClient4TickEv+0x1a6) [0xf2dccf06]
Engine.so(_ZN11UGameEngine4TickEf+0x19a6) [0xf7c72466]
XC_Engine.so(_ZN14UXC_GameEngine4TickEf+0xa2) [0xf17bbfc2]
./ut-bin(main+0x1018) [0x804eb08]
/usr/lib32/libc.so.6(__libc_start_main+0xed) [0xf72f1a0d]
./ut-bin() [0x804d84f]
Signal: SIGSEGV [segmentation fault]
Aborting.
Signal: SIGSEGV [segmentation fault]
History: UOpenGLRenderDevice::DrawComplexSurface <- URender::DrawFrame <- URender::DrawWorld <- UGameEngine::Draw <- USDLViewport::Repaint <- USDLClient::Tick <- ClientTick <- UGameEngine::Tick <- UXC_GameEngine::Tick <- UpdateWorld <- MainLoopIteration <- MainLoop <- main
But I think that was already fixed in the latest snapshot, although I am not entirely sure (feel free to report it on GitHub or the OldUnreal Discord).
This is something that Anth himself can probably clarify better.
But my own take on it is that the main renderer is SoftDrv, in other words, a software renderer that works only on the CPU (no GPU involved at all).
And it works this way as well, one poly at a time (batching polys in SoftDrv won't really give you much gain).
At the time OpenGL was still in its infancy and, although included, it never really worked well. They pretty much only had D3D7 as one of the main, tested hardware renderers, and graphics cards weren't really a thing back then like they are nowadays.
So it seems to me that they simply replicated what they had done in SoftDrv in the hardware renderers, and it just worked fairly well over time, and it was easy to implement.
So there was no big reason at the time to really batch anything. As far as I know, there were no particularly noticeable gains from doing that back then, especially in a game with such a small number of polys being rendered overall.
I mean, even working like this, it was still one of the best and fastest engines around at the time.
However, as the years passed, GPUs became much more powerful and also much more autonomous (being even able to store everything in their own memory, and to the point they have their own programmable instructions: shaders).
All of a sudden, even the bus speed of AGP wasn't fast enough anymore and PCI Express had to be created.
So it wasn't until much later on that this type of bottleneck was actually noticeable, especially when mods and maps started to have higher poly meshes and models.
Even nowadays you won't really notice the difference in performance if all you play is the vanilla game.
You will only notice a difference if you use mods that actually take advantage of this, like NW3 itself.
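To make the difference between the two approaches a bit more concrete, here is a rough sketch of per-poly submission versus batched submission. To be clear, this is not how 469 actually does it: the `Poly` struct, the function names, and the choice of grouping by texture are all assumptions made up for illustration.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical polygon record; the engine's real structures are not
// shown anywhere in this thread.
struct Poly {
    int texture_id;    // assumed batching key
    int vertex_count;
};

// Unbatched path: every polygon is its own submission to the renderer.
std::size_t count_unbatched_submissions(const std::vector<Poly>& polys) {
    return polys.size();
}

// Batched path: polygons sharing a texture are grouped together, so each
// group could be handed to the GPU in a single submission.
std::map<int, std::vector<Poly>> batch_by_texture(const std::vector<Poly>& polys) {
    std::map<int, std::vector<Poly>> batches;
    for (const Poly& p : polys) {
        batches[p.texture_id].push_back(p);
    }
    return batches;
}
```

With five polys spread over two textures, the unbatched path makes five submissions while the batched one ends up with two groups. With vanilla content the poly counts are small enough that this hardly matters, which fits the point above that the gain only shows up with higher-poly mods.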
That said, one thing I need to clarify: at the moment, this batching is only applied to mesh polys, not BSP surfaces.
BSP surfaces are pretty much still rendered as before, one at a time.
Batching those seems to be more difficult, to the point that simply batch rendering the entire map, disregarding the BSP nodes completely, was already considered as a possible alternative, since on any GPU from the last 10 years that will likely be much faster than rendering each one of them at a time.
That's why maps like CTF-Blice (a map I released, meant as one of the benchmarks for 469 in the end) still lag a lot despite this improvement (not the only reason, but one of the reasons).
The ideal end goal is to move pretty much everything that currently involves the CPU in some way (rendering-wise) to the GPU.
But for that the rendering pipeline of the engine itself will require a significant rewrite, which will also lead to rewrites of the renderers themselves, which won't really be feasible unless we get rid of some of the renderers (like SoftDrv itself, which is nightmare fuel).
If we ever reach that point, then every single map and mod released to this date will be pretty much lag-free as far as rendering goes, with no FPS loss whatsoever in any reasonably modern system (anything from the last 5-10 years).
Even volumetric fog, one of the slowest engine features, could finally be fast and usable.
But for that the community has to help a bit, like letting go of SoftDrv completely, otherwise it will always be impossible.
I appreciate the in-depth reply.
I'm curious, is the SoftDrv code somewhere I can look at? I'd love to see some of the routines.
Also, would it be possible to include some new functions natively for future mod/map use (and for backwards compatibility, just use a naive implementation in UC) to make things faster? Figured there might be some good candidates you've stumbled upon.
Is there a testing plan or document of scenarios to test? If so, I wouldn't mind helping with that and doing enough random tests to help further this.
Also... One more thing...
Can you possibly add some changes to UCC compilation that would make coding lives easier? For example: a predefined/dev-picked order of compilation (such as a keyword or default property); class namespaces (other game engines just threw older code into a standard namespace for backwards compatibility when they added this); and a way to see quick stack traces, provided debug data is available.
Would be very helpful for development :/ Getting a random memory address from an Accessed None and only the filename SUCKS.