XC_Engine megathread

Re: XC_Engine megathread

Postby sektor2111 » Sat Mar 23, 2019 8:53 pm

Everything loads fast if you have already launched the game before. At first launch after system boot, if you have 15+ game types, starting a "Practice Session" takes a while on a slower machine with a slower hard drive (I was about to fall asleep) until the menu shows up, and nothing can speed this up because the disk has a limited read speed no matter what you do. This is why I'm using a sort of "precache" task on Win servers, in order to get the file placement (file list) cached for mods that rely on "DynamicLoad"... otherwise it's all a lag fest until the OS has cached the data.
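The post does not say what the "precache" task actually runs, so the following is only a minimal sketch of the general idea, assuming the task simply reads every game file once so the OS keeps it in its file cache. `WarmFileCache` is a hypothetical name, and the sketch needs C++17 for `std::filesystem`.

```cpp
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <ios>
#include <system_error>
#include <vector>

// Hypothetical helper (not part of UT or XC_Engine): reads every regular file
// under `root` once and returns the number of bytes read. The data itself is
// discarded; the only goal is to make the OS pull the files into its cache.
std::size_t WarmFileCache(const std::filesystem::path& root)
{
    namespace fs = std::filesystem;
    std::vector<char> buffer(64 * 1024);
    std::size_t totalBytes = 0;
    std::error_code walkError;

    fs::recursive_directory_iterator it(
        root, fs::directory_options::skip_permission_denied, walkError);
    const fs::recursive_directory_iterator end;

    for (; !walkError && it != end; it.increment(walkError))
    {
        std::error_code typeError;
        if (!it->is_regular_file(typeError))
            continue; // skip directories and anything we cannot inspect

        std::ifstream in(it->path(), std::ios::binary);
        for (;;)
        {
            in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
            const std::streamsize got = in.gcount();
            if (got <= 0)
                break; // end of file, or the file could not be opened
            totalBytes += static_cast<std::size_t>(got);
        }
    }
    return totalBytes;
}
```

Such a helper would be pointed at the game's package folders once after boot, before the server starts.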
sektor2111
Godlike
 
Posts: 3865
Joined: Sun May 09, 2010 6:15 pm
Location: On the roof.

Re: XC_Engine megathread

Postby darksonny » Sat Mar 23, 2019 11:48 pm

sektor2111 wrote: Everything loads fast if you have already launched the game before. At first launch after system boot, if you have 15+ game types, starting a "Practice Session" takes a while on a slower machine with a slower hard drive (I was about to fall asleep) until the menu shows up, and nothing can speed this up because the disk has a limited read speed no matter what you do. This is why I'm using a sort of "precache" task on Win servers, in order to get the file placement (file list) cached for mods that rely on "DynamicLoad"... otherwise it's all a lag fest until the OS has cached the data.

mmm...understood :tu:
darksonny
Skilled
 
Posts: 198
Joined: Sat Sep 13, 2008 10:24 pm

Re: XC_Engine megathread

Postby Higor » Sun Mar 24, 2019 12:23 am

The main problem with the map list combobox is the sorting algorithm.
UnrealScript is simply bad at sorting, due to the speed of the virtual machine and the iteration limits that crash the game (actually a safety assert, not a crash).
Keep in mind that the GetMapName function retrieves ONE map at a time, so you have to call it 'N' times for 'N' maps.

The new change involves XC_Engine_Menu disabling any UnrealScript sorting whatsoever.
So after disabling the sorting, the map list is constructed almost instantly...

Why was the sorting done in script in the first place?
Maybe because there is no sorting in the native 'GetMapName' function: it returns the maps in the order the operating system supplies them.

Why not sort in GetMapName (C++)?
Remember, GetMapName needs to be called 'N' times, and UT's default GetMapName function retrieves the map names using the operating system API (Win32 and Linux).
The map list may come back unsorted when maps are spread over several folders, and on Linux the map names aren't sorted by name at all.

So how did I fix this problem?
A single-frame cache that gets cleared every time a frame is rendered (or the world is updated on a server).
The first time I call GetMapName (XC version), the map list is constructed and sorted.
The results are then stored in a map list cache, along with a marker for the position of the last map returned (initially 0).
Every time I call GetMapName again to get the other maps (N maps = N calls), the position marker moves to 'next' or 'prev' and the map is simply taken from the cache.
The XC GetMapName's cache won't be rebuilt if GetMapName is called as follows:
Code: Select all
GetMapName( "CTF", FirstMap, Position); //Position increments with every call
GetMapName( "CTF", CurrentMap, 1); //CurrentMap changes with every GetMapName call, but '1' indicates direction to follow (next)
GetMapName( "CTF", CurrentMap, -1);  //CurrentMap changes with every GetMapName call, but '-1' indicates direction to follow (prev)

The above examples should cover pretty much 99% of cases.

The result: no UnrealScript sorting, and a single sort every time we want to generate a map list.
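The cache described above could be sketched roughly like this in C++. This is not the XC_Engine source: `MapListCache`, `Step`, `Current`, `Invalidate` and `BuildCount` are hypothetical names, the enumerator stands in for the OS file listing, and the wrap-around at the ends of the list is an assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the per-frame map list cache.
class MapListCache
{
public:
    using Enumerator = std::function<std::vector<std::string>()>;

    explicit MapListCache(Enumerator enumerateMaps)
        : enumerate(std::move(enumerateMaps)) {}

    // Would be called once per rendered frame (or world update on a server).
    void Invalidate() { valid = false; }

    // Map at the position marker; builds and sorts the list on first use.
    const std::string& Current()
    {
        EnsureBuilt();
        if (names.empty())
            return Empty();
        return names[position];
    }

    // Moves the marker by `direction` (+1 = next, -1 = prev), wrapping around,
    // and returns the map at the new position without re-sorting anything.
    const std::string& Step(int direction)
    {
        EnsureBuilt();
        if (names.empty())
            return Empty();
        const long long count = static_cast<long long>(names.size());
        long long next = (static_cast<long long>(position) + direction) % count;
        if (next < 0)
            next += count;
        position = static_cast<std::size_t>(next);
        return names[position];
    }

    // How many times the list was actually enumerated and sorted.
    int BuildCount() const { return buildCount; }

private:
    static const std::string& Empty()
    {
        static const std::string empty;
        return empty;
    }

    void EnsureBuilt()
    {
        if (valid)
            return;
        names = enumerate();                    // unsorted, as the OS returns it
        std::sort(names.begin(), names.end());  // the one sort per list build
        position = 0;
        valid = true;
        ++buildCount;
    }

    Enumerator enumerate;
    std::vector<std::string> names;
    std::size_t position = 0;
    bool valid = false;
    int buildCount = 0;
};
```

With this shape, N calls to `Step` between two `Invalidate` calls cost one enumeration and one sort in total, which is the point of the change.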

EDIT, play in 1080p:
https://www.youtube.com/watch?v=vDyTO5fn8n8?&vq=hd1080
Higor
Godlike
 
Posts: 1713
Joined: Sun Mar 04, 2012 6:47 pm

Re: XC_Engine megathread

Postby Higor » Mon Apr 08, 2019 12:43 am

Script compiler.
As seen here: https://github.com/CacoFFF/XC-UT99/comm ... 22d6f1e212

In the upcoming XC_Core version, the script compiler will receive a hack (as long as XC_Core is loaded) where most XC_Engine/XC_Core functions will become global methods that no longer need to be defined in any class.

What does it mean?
If you want to make a class use XC_Engine functions, you no longer have to copy the function definitions over to the new class.
An interesting side effect is that you can also call these functions on already existing classes.

Example where these functions are called without being defined, both on 'self' and on 'P'.
Code: Select all
class Test expands Object;

static function int CountDynamic( PlayerPawn P, out float Time)
{
    local float F[2];
    local Actor A;
    local int i;

    Clock(F);
    ForEach P.DynamicActors( class'Actor', A)
        i++;
    Time = UnClock(F);
    return i; //Number of dynamic actors counted
}

Re: XC_Engine megathread

Postby sektor2111 » Thu Apr 11, 2019 6:35 am

I have to ask: what are the minimum CPU specifications for running this XC_Core? I'm asking because XCv23 doesn't work on all my machines, and I have no plans to spend money on upgrading 4 machines on top of... the rest of my needs...

Re: XC_Engine megathread

Postby Higor » Fri Apr 12, 2019 5:43 pm

Post the full specs of the systems having problems with XC (mainly operating system version/distro, CPU name and architecture).
I suspect it's the C++11 requirements in CacusLib, which I'll deal with accordingly.

Re: XC_Engine megathread

Postby sektor2111 » Fri Apr 12, 2019 6:55 pm

All the PIV machines run XCv23 until the game crashes at some random moment (my daughter plays the same map locally many times). What is often seen (or not seen): when a dependent file/object is missing, the game does not return to the entry map but crashes... The launcher used is the default one, because the newer one is not what I want: it was more unstable and wants to send an "error report", bla bla.
AMD proudly presented here... viewtopic.php?f=64&t=11630&start=0#p94126 being
Image

It doesn't even blink in XCv23; all backward compatibility is gone. This was one of my best machines for playing UT'99 back in the day, without "X cores" problems and all that circus. Even if I have more resources here and could use the latest XC_Core, I cannot do anything: I see things as unstable, and that's why I have not shared any configurable UScript publicly, only private versions for testing.

Re: XC_Engine megathread

Postby Higor » Fri Apr 12, 2019 9:23 pm

Well well, decided to take a look at the generated code and found something interesting:

This
Code: Select all
Path->visitedWeight = appRound( appSqrt((*Ptr)->Dist2DSq));


Was producing this, moving numbers between xmm and FPU registers, with an SSE2 DOUBLE->INT conversion because appSqrt returns a DOUBLE instead of a FLOAT.
Code: Select all
fld     dword ptr [eax+0Ch]
sub     esp, 8
fstp    qword ptr [esp+08h]
call    ds:__imp_?appSqrt@@YANN@Z ; appSqrt(double)
add     esp, 8
fstp    [ebp-44h]
cvtsd2si eax, [ebp-44h]  //SSE2 instruction
mov     [ebx+314h], eax


So, a small redefinition of appSqrt that takes a FLOAT and returns a FLOAT, a couple more tweaks, and tada:
Code: Select all
movss   xmm0, dword ptr [eax+0Ch]
movss   [ebp-38h], xmm0
xorps   xmm1, xmm1
movss   xmm1, xmm0
sqrtss  xmm1, xmm1
movss   [ebp-30h], xmm1
xorps   xmm0, xmm0
movss   xmm0, xmm1
cvtss2si eax, xmm0 //SSE instruction
mov     [edx+314h], eax


Now I'm going to have to hunt for more SSE2 instructions all over the generated code, and find a way to get rid of those unnecessary XORPS instructions.
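The redefinition itself is not shown in the post, so here is a minimal C++ sketch of the idea of keeping the whole computation in single precision. `SqrtFloat`, `RoundFloat` and `VisitedWeightFromDistSq` are hypothetical names, not the actual XC_Core functions, and which instructions a compiler emits for them depends on its settings.

```cpp
#include <cmath>
#include <type_traits>

// <cmath> provides a float overload of std::sqrt, so no double is involved.
static_assert(std::is_same<decltype(std::sqrt(1.0f)), float>::value,
              "std::sqrt(float) is expected to return float");

// Hypothetical float-in, float-out square root.
inline float SqrtFloat(float value)
{
    return std::sqrt(value);
}

// Hypothetical float rounding helper. std::lrint rounds according to the
// current rounding mode, which is round-to-nearest by default.
inline int RoundFloat(float value)
{
    return static_cast<int>(std::lrint(value));
}

// Mirrors the shape of the expression from the post: round(sqrt(Dist2DSq)).
inline int VisitedWeightFromDistSq(float dist2DSq)
{
    return RoundFloat(SqrtFloat(dist2DSq));
}
```

Because no double appears anywhere in the chain, the compiler has no reason to emit a DOUBLE->INT conversion for this expression.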

Re: XC_Engine megathread

Postby sektor2111 » Sat Apr 13, 2019 6:09 am

After that, you might want to look at the "Mover history". Is everything in these movers initialized properly? I made a test map with a door and buttons that need to be shot to open the door. If a human player shoots them first, the Bots suddenly know how to handle them; otherwise they don't seem to know what to do during the first seconds. Why do I have to kick things off by using the door first? I don't know. Does Navigation need a bit of delay? At one point they were shooting the door themselves and I was just watching them roam, doing nothing myself, because everything was working properly. Maybe I ruined something and I don't see what... I'm going to switch machines and I'll drop that lab junk here; maybe some people will see that shootable stuff is doable, as long as Bot code can do some stunts, even if UT doesn't have such things in the original stock.
Attachment: MH-ShotOpen.7z (24.81 KiB)

This is the map in question. Tested using 2 freelance Bots.

Re: XC_Engine megathread

Postby Chris » Sat Apr 13, 2019 12:15 pm

sektor2111 wrote: All the PIV machines run XCv23 until the game crashes at some random moment (my daughter plays the same map locally many times). What is often seen (or not seen): when a dependent file/object is missing, the game does not return to the entry map but crashes... The launcher used is the default one, because the newer one is not what I want: it was more unstable and wants to send an "error report", bla bla.
AMD proudly presented here... viewtopic.php?f=64&t=11630&start=0#p94126 being
Image

It doesn't even blink in XCv23; all backward compatibility is gone. This was one of my best machines for playing UT'99 back in the day, without "X cores" problems and all that circus. Even if I have more resources here and could use the latest XC_Core, I cannot do anything: I see things as unstable, and that's why I have not shared any configurable UScript publicly, only private versions for testing.


Trying to execute unsupported instructions usually results in a hardware exception being raised and passed onto the OS. The OS would then say something like "illegal instruction".
Is there any textual info in the error report mentioning unsupported instructions?


Higor wrote:Well well, decided to take a look at the generated code and found something interesting:

This
Code: Select all
Path->visitedWeight = appRound( appSqrt((*Ptr)->Dist2DSq));


Was producing this, moving numbers between xmm and FPU registers, with an SSE2 DOUBLE->INT conversion because appSqrt returns a DOUBLE instead of a FLOAT.
Code: Select all
fld     dword ptr [eax+0Ch]
sub     esp, 8
fstp    qword ptr [esp+08h]
call    ds:__imp_?appSqrt@@YANN@Z ; appSqrt(double)
add     esp, 8
fstp    [ebp-44h]
cvtsd2si eax, [ebp-44h]  //SSE2 instruction
mov     [ebx+314h], eax


So, a small redefinition of appSqrt that takes a FLOAT and returns a FLOAT, a couple more tweaks, and tada:
Code: Select all
movss   xmm0, dword ptr [eax+0Ch]
movss   [ebp-38h], xmm0
xorps   xmm1, xmm1
movss   xmm1, xmm0
sqrtss  xmm1, xmm1
movss   [ebp-30h], xmm1
xorps   xmm0, xmm0
movss   xmm0, xmm1
cvtss2si eax, xmm0 //SSE instruction
mov     [edx+314h], eax


Now I'm going to have to hunt for more SSE2 instructions all over the generated code, and find a way to get rid of those unnecessary XORPS instructions.


Visual C++ "should" never generate SSE2 under x86 unless specifically instructed to do so with the "Enable enhanced instruction set" found in the project config.
I'm not sure if the default "Not set" makes the compiler think that it can use whatever it wants. What if you specifically select SSE or IA32(x87 only) rather than "Not set"?
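One way to check what the compiler was actually allowed to use is to look at its predefined macros. The sketch below assumes the documented meaning of MSVC's `_M_IX86_FP` (0 for IA32, 1 for SSE, 2 for SSE2 and above, defined on 32-bit x86 only) and of GCC/Clang's `__SSE__` / `__SSE2__`; `CompiledInstructionSet` is a hypothetical helper name. As far as I know, Microsoft's documentation lists /arch:SSE2 as the x86 default since Visual Studio 2012, so "Not set" may well mean SSE2 on newer compilers, but that is worth verifying against the compiler version in use.

```cpp
#include <string>

// Reports which x86 floating-point/SIMD level this translation unit was
// compiled for, based only on the compiler's predefined macros.
inline std::string CompiledInstructionSet()
{
#if defined(_M_IX86_FP)                       // MSVC, 32-bit x86 targets
    #if _M_IX86_FP >= 2
        return "SSE2 or higher";
    #elif _M_IX86_FP == 1
        return "SSE";
    #else
        return "x87 only (IA32)";
    #endif
#elif defined(_M_X64) || defined(__SSE2__)    // x64 MSVC, or GCC/Clang with SSE2
    return "SSE2 or higher";
#elif defined(__SSE__)                        // GCC/Clang with SSE only
    return "SSE";
#else
    return "no SSE enabled, or not an x86 target";
#endif
}
```

Logging this string once at startup would show whether a given binary was built in a way that can contain SSE2 instructions at all.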
Chris
Experienced
 
Posts: 116
Joined: Mon Nov 24, 2014 9:27 am

Re: XC_Engine megathread

Postby Higor » Sat Apr 13, 2019 7:36 pm

Chris wrote:Visual C++ "should" never generate SSE2 under x86 unless specifically instructed to do so with the "Enable enhanced instruction set" found in the project config.

I added a variation of appRound() that rounds doubles directly using CVTSD2SI instead of converting to float first, and left it lying there doing 'nothing' because why not.
But on a second look I saw that the compiler picks it up after any of the 'app' functions that work on doubles; a very stupid mistake on my part.
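The effect described here is plain C++ overload resolution: once a double overload exists, any expression of type double selects it as an exact match, whereas before it would have been converted implicitly to float. A minimal sketch with hypothetical stand-in names (the bodies are placeholders, not the XC_Core code):

```cpp
// Which rounding overload the compiler selected.
enum class RoundPath { Float, Double };

// Hypothetical stand-ins for the two appRound variants.
inline RoundPath WhichRound(float)  { return RoundPath::Float; }
inline RoundPath WhichRound(double) { return RoundPath::Double; }

// Hypothetical stand-ins for 'app' functions that return double or float.
// Only the return types matter for this demonstration.
inline double SqrtReturningDouble(double value) { return value; }
inline float  SqrtReturningFloat(float value)   { return value; }
```

Here `WhichRound(SqrtReturningDouble(2.0f))` resolves to the double overload, while `WhichRound(SqrtReturningFloat(2.0f))` stays on the float one, which matches the fix of making the square root return a float.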

In XC_Core and its derivatives, VC++ is not using any of its special built-in type conversions; they are mostly handled manually through platform-specific code.
For example, compare the FTime implementations (mine vs. original):
https://github.com/CacoFFF/XC-UT99/blob ... ore/Core.h
https://github.com/stephank/surreal/blo ... Inc/Core.h

One of the reasons for doing this is to drop the dependency on the VC140 runtime libraries when switching to the new compiler: the binary stays small and nobody needs to install the new runtimes.
https://github.com/CacoFFF/XC-UT99/blob ... /API_MSC.h
But I still have to find a workaround for the FTOL2 conversion; right now it only uses SSE2 instructions.
PS: This method keeps the exception handlers consistent with the original libraries, so new binaries can also add their function history during a crash.

Re: XC_Engine megathread

Postby Chris » Sun Apr 14, 2019 2:01 pm

Higor wrote:
Chris wrote:Visual C++ "should" never generate SSE2 under x86 unless specifically instructed to do so with the "Enable enhanced instruction set" found in the project config.

I added a variation of appRound() that rounds doubles directly using CVTSD2SI instead of converting to float first, and left it lying there doing 'nothing' because why not.
But on a second look I saw that the compiler picks it up after any of the 'app' functions that work on doubles; a very stupid mistake on my part.

In XC_Core and its derivatives, VC++ is not using any of its special built-in type conversions; they are mostly handled manually through platform-specific code.
For example, compare the FTime implementations (mine vs. original):
https://github.com/CacoFFF/XC-UT99/blob ... ore/Core.h
https://github.com/stephank/surreal/blo ... Inc/Core.h

One of the reasons for doing this is to drop the dependency on the VC140 runtime libraries when switching to the new compiler: the binary stays small and nobody needs to install the new runtimes.
https://github.com/CacoFFF/XC-UT99/blob ... /API_MSC.h
But I still have to find a workaround for the FTOL2 conversion; right now it only uses SSE2 instructions.
PS: This method keeps the exception handlers consistent with the original libraries, so new binaries can also add their function history during a crash.


I've also run into the mess caused by the poor versioning of the CRT binaries.
The most common problem is probably libraries linked against different versions that have different malloc implementations.
The differing exception handlers are another one, as you mentioned.

So you pretty much have to resort to the FPU for the FTOL2 conversion?
You could either set bits 10 and 11 of the FPU control word (the rounding-control field) to 0b11 (truncate) and then perform:
Code: Select all
fld [esp]
fistp [esp + 8] ;Or wherever you wish to store it
mov eax, dword ptr[esp + 8]


Or you could subtract .5 from it and then perform the conversion with the default "round to nearest integral" to get the truncation effect.
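The second option is worth testing before relying on it. The sketch below compares it with the truncation a C++ cast performs, assuming the default round-to-nearest-even mode; `TruncateByCast` and `TruncateBySubtractHalf` are hypothetical names. The two agree for positive non-integer inputs, but they differ for odd whole numbers (3.0 - 0.5 = 2.5, which rounds to the even value 2) and for negative inputs (-2.7 - 0.5 = -3.2, which rounds to -3 while truncation gives -2).

```cpp
#include <cmath>

// Truncation toward zero, as defined for a float-to-int cast.
inline int TruncateByCast(float value)
{
    return static_cast<int>(value);
}

// "Subtract 0.5, then round to nearest". std::lrint uses the current rounding
// mode, which is round-half-to-even unless the program changed it.
inline int TruncateBySubtractHalf(float value)
{
    return static_cast<int>(std::lrint(value - 0.5f));
}
```

So if this trick is used as an FTOL2 replacement, whole numbers and negative values would need separate handling, or the rounding-control approach would have to be used instead.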
