encoding
-
- Masterful
- Posts: 569
- Joined: Sun Jul 12, 2009 7:10 pm
- Personal rank: Dude
- Location: Trieste, Italy
Re: encoding
Windows-1252 ?
Personal map database: http://www.ut99maps.net
"These are the days that we will return to one day in the future only in memories." (The Midnight)
-
- Godlike
- Posts: 5489
- Joined: Wed Feb 27, 2008 6:24 pm
- Personal rank: Work In Progress
- Location: Liandri
Re: encoding
I am not sure, but I would say perhaps UTF-8, which seems the most logical one to go with given their Linux support.
-
- Masterful
- Posts: 569
- Joined: Sun Jul 12, 2009 7:10 pm
- Personal rank: Dude
- Location: Trieste, Italy
Re: encoding
Well, I took a look over some params. Don't ask me why but this is what I got from server logs...
the log of a Linux UT99 server v451 shows me this:
Init: Character set: ANSI
...while the log of a Windows UT99 server v451 says this:
Init: Character set: Unicode
I hope this may help you

Personal map database: http://www.ut99maps.net
"These are the days that we will return to one day in the future only in memories." (The Midnight)
-
- Godlike
- Posts: 5489
- Joined: Wed Feb 27, 2008 6:24 pm
- Personal rank: Work In Progress
- Location: Liandri
Re: encoding
But Unicode is not an "encoding"; it's a standard that assigns code points to characters, and the encodings themselves (such as UTF-8 and UTF-16, for example) define how those code points are stored as bytes.
-
- Adept
- Posts: 427
- Joined: Sun Sep 07, 2008 6:20 am
- Personal rank: XFIRE: mmmmtoasty
- Location: PA,USA
-
- Adept
- Posts: 256
- Joined: Thu May 13, 2010 2:23 am
Re: encoding
Linux servers use ASCII (not ANSI!) encoding for playernames and UTF-16LE for logfiles.
Windows uses UTF-16LE for both.
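To make the difference between the two encodings concrete, here is a minimal Python sketch. The player name and the helper names `as_ascii` and `as_utf16le` are made up for illustration, and replacing unsupported characters with "?" is simply what Python does with `errors="replace"`; it is not a claim about how the UT99 server actually handles such names.

```python
def as_ascii(name: str) -> bytes:
    # 7-bit ASCII cannot represent characters above U+007F;
    # errors="replace" substitutes "?" for each of them (Python behavior).
    return name.encode("ascii", errors="replace")


def as_utf16le(name: str) -> bytes:
    # UTF-16LE stores each BMP character as two bytes, low byte first.
    return name.encode("utf-16-le")


name = "Señor"  # hypothetical player name with one non-ASCII character
print(as_ascii(name))         # the "ñ" is lost
print(as_utf16le(name).hex(" "))  # two bytes per character
```

The practical point is that any name containing characters outside ASCII cannot survive a round trip through a 7-bit path, whereas UTF-16LE keeps it intact at twice the size.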
-
- Adept
- Posts: 258
- Joined: Sat Aug 24, 2013 6:04 pm
Re: encoding
I'm going to state UT2004 stuff, but perhaps it applies to UT1 as well. Text files (INI, INT) are first scanned for the BOM of UTF-16LE/BE. If found, the files are treated correspondingly; if not, Windows-1252 is assumed for reading the file. Internally the game supports Unicode characters, but doesn't take advantage of that fact in most places. Yes, UT1 stat log files are UTF-16, but the main log file (and custom log files in UT200x) are Windows-1252. Actually the game simply treats bytes as characters and vice versa. The font textures are based on Windows-1252, though.
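The BOM check described above can be sketched in Python roughly as follows. This is only an illustration of the rule as stated in this post, not the engine's actual code; the function name `decode_text_file` is hypothetical.

```python
import codecs


def decode_text_file(raw: bytes) -> str:
    """Decode bytes using the rule described above:
    a UTF-16 BOM wins, otherwise fall back to Windows-1252."""
    if raw.startswith(codecs.BOM_UTF16_LE):  # b"\xff\xfe"
        return raw[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    if raw.startswith(codecs.BOM_UTF16_BE):  # b"\xfe\xff"
        return raw[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    # Python's cp1252 codec leaves a few byte values unmapped,
    # so errors="replace" keeps the fallback from raising.
    return raw.decode("cp1252", errors="replace")


# Usage: the same text stored both ways decodes to the same string.
text = "[Engine]\r\nName=café"
with_bom = codecs.BOM_UTF16_LE + text.encode("utf-16-le")
without_bom = text.encode("cp1252")
print(decode_text_file(with_bom) == decode_text_file(without_bom))
```

Note that a UTF-8 file without a BOM would go down the Windows-1252 branch under this rule, so any multi-byte UTF-8 sequence would show up as two or three separate characters.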
-
- Novice
- Posts: 7
- Joined: Thu Apr 16, 2020 6:27 pm
Re: encoding
Ok, humongous necro, sorry.
I'm trying to dig up some inconsistencies in UT99 on Linux, and part of the problem is that some of the stuff I'm reading online regards UTF-8 as an 8-bit-only encoding, which it isn't.
ASCII is 7 bits only ("Extended ASCII" isn't an official ASCII representation, though the term is still useful, as is "8 bit ASCII")
ISO-8859(-1) is 8 bits only
UTF-8 is 1-4 groups of 8 bits. Yes, despite its name, a single character in a UTF-8 file can take up to 32 bits, with the leading bits of the first byte indicating how many bytes follow.
BTW, "UTF-16" is an oddball (1-2 groups of 16 bits), and it seems so seldom used that I can't find any information about it in .ini (or any other standard) use. Hardly anyone uses it for anything, AFAICT. I think some of the claims about UTF-16 might be spurious and actually refer to UTF-8.
My question: Is there a problem regarding ISO-8859 that anyone has seen? The focus on Windows encodings (due to its origin) makes sense, but *which* of the various encodings causes the least trouble on Linux?
Thanks!
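The widths listed above can be checked with a few lines of Python. This is just a sketch using Python's built-in codecs; the sample characters and the helper name `widths` are arbitrary choices for illustration.

```python
def widths(ch: str) -> tuple[int, int]:
    """Return (UTF-8 bytes, UTF-16 code units) needed for one character."""
    utf8_bytes = len(ch.encode("utf-8"))
    utf16_units = len(ch.encode("utf-16-le")) // 2  # 2 bytes per code unit
    return utf8_bytes, utf16_units


# One sample from each UTF-8 length class.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]
for ch in samples:
    u8, u16 = widths(ch)
    print(f"U+{ord(ch):04X}: UTF-8 {u8} byte(s), UTF-16 {u16} unit(s)")

# ISO-8859-1 always uses exactly one byte, but only covers U+0000..U+00FF.
print(len("\u00e9".encode("iso-8859-1")))
```

Characters in the ASCII range come out identical in ASCII, ISO-8859-1, Windows-1252 and UTF-8, which is why files restricted to that range rarely cause trouble on either platform.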