encoding

Acid.OMG
Adept
Posts: 427
Joined: Sun Sep 07, 2008 6:20 am
Personal rank: XFIRE: mmmmtoasty
Location: PA,USA

encoding

Post by Acid.OMG » Thu Aug 29, 2013 5:42 am

What type of character encoding does UT99 use for player names?
[/Awesome]

UTPe
Adept
Posts: 471
Joined: Sun Jul 12, 2009 7:10 pm
Personal rank: Dude
Location: Trieste, Italy

Re: encoding

Post by UTPe » Thu Aug 29, 2013 8:15 am

Windows-1252 ?
Personal UT99 website: http://utdatabase.gamezoo.org | Personal forum: http://fragfinity.freeforums.org/index.php
Personal file database: http://ut99files.gamezoo.org | Personal map database: http://ut99maps.gamezoo.org

"These are the days that we will return to one day in the future only in memories." (The Midnight)

Feralidragon
Godlike
Posts: 5179
Joined: Wed Feb 27, 2008 6:24 pm
Personal rank: Work In Progress
Location: Liandri

Re: encoding

Post by Feralidragon » Thu Aug 29, 2013 10:52 am

I am not sure, but I would say perhaps UTF-8, which seems the most logical one to go with given their Linux support.

UTPe
Adept
Posts: 471
Joined: Sun Jul 12, 2009 7:10 pm
Personal rank: Dude
Location: Trieste, Italy

Re: encoding

Post by UTPe » Thu Aug 29, 2013 12:58 pm

Well, I took a look over some params. Don't ask me why but this is what I got from server logs...

the log of a Linux UT99 server v451 shows me this:
Init: Character set: ANSI

...while the log of a Windows UT99 server v451 says this:
Init: Character set: Unicode

I hope this may help you :?

Feralidragon
Godlike
Posts: 5179
Joined: Wed Feb 27, 2008 6:24 pm
Personal rank: Work In Progress
Location: Liandri

Re: encoding

Post by Feralidragon » Fri Aug 30, 2013 12:42 am

But Unicode is not an "encoding"; it's the standard that defines the character set, and encodings such as UTF-8 and UTF-16 are the ways of storing those characters as bytes.

Acid.OMG
Adept
Posts: 427
Joined: Sun Sep 07, 2008 6:20 am
Personal rank: XFIRE: mmmmtoasty
Location: PA,USA

Re: encoding

Post by Acid.OMG » Fri Aug 30, 2013 9:21 am

Thanks for the help guys

anth
Skilled
Posts: 220
Joined: Thu May 13, 2010 2:23 am

Re: encoding

Post by anth » Fri Aug 30, 2013 12:13 pm

Linux servers use ASCII (not ANSI!) encoding for playernames and UTF-16LE for logfiles.
Windows uses UTF-16LE for both.
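
To make that difference concrete, here is a rough Python sketch of what the two encodings do to the same name. The helper `encode_player_name` and the `platform` strings are made up for illustration, and replacing unsupported characters with `?` is only an assumption: the thread does not say whether the engine substitutes, strips, or rejects them.

```python
def encode_player_name(name: str, platform: str) -> bytes:
    """Hypothetical helper: encode a name per the Linux/Windows split described above."""
    if platform == "linux":
        # ASCII only covers code points 0-127. Replacing anything else
        # with '?' is an assumption, not confirmed engine behavior.
        return name.encode("ascii", errors="replace")
    # UTF-16LE stores each of these characters as two bytes, low byte first.
    return name.encode("utf-16-le")


print(encode_player_name("Dräcula", "linux"))
print(encode_player_name("Dräcula", "windows"))
```

Under those assumptions, the `ä` survives the UTF-16LE path but is lost on the ASCII path.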

Wormbo
Adept
Posts: 258
Joined: Sat Aug 24, 2013 6:04 pm

Re: encoding

Post by Wormbo » Thu Sep 05, 2013 10:07 am

I'm going to state UT2004 stuff, but perhaps it applies to UT1 as well. Text files (INI, INT) are first scanned for the BOM of UTF-16LE/BE. If one is found, the file is decoded accordingly; if not, Windows-1252 is assumed for reading the file. Internally the game supports Unicode characters, but doesn't take advantage of that fact in most places. Yes, UT1 stat log files are UTF-16, but the main log file (and custom log files in UT200x) is Windows-1252. Actually, the game simply treats bytes as characters and vice versa. The font textures are based on Windows-1252, though.
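
The BOM check described above could be sketched in Python roughly like this. It is not the engine's actual code; the function name `decode_ini_bytes` is invented, and using `errors="replace"` for the few byte values Windows-1252 leaves undefined is an assumption.

```python
import codecs


def decode_ini_bytes(raw: bytes) -> str:
    """Decode config-file bytes: honor a UTF-16 BOM, otherwise assume Windows-1252."""
    if raw.startswith(codecs.BOM_UTF16_LE):
        # Skip the 2-byte BOM (FF FE) and decode the rest as little-endian.
        return raw[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    if raw.startswith(codecs.BOM_UTF16_BE):
        # Skip the 2-byte BOM (FE FF) and decode the rest as big-endian.
        return raw[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    # No BOM found: fall back to Windows-1252. Bytes with no mapping in
    # that code page become U+FFFD instead of raising (an assumption).
    return raw.decode("cp1252", errors="replace")
```

A file saved as UTF-8 without a BOM would take the Windows-1252 branch here, so any multi-byte character in it would come out as two or more wrong characters.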

tgm1024
Novice
Posts: 7
Joined: Thu Apr 16, 2020 6:27 pm

Re: encoding

Post by tgm1024 » Fri May 22, 2020 7:04 pm

Ok, humongous necro, sorry.

I'm trying to dig up some inconsistencies in UT99 on Linux, and part of the problem is that some of the stuff I'm reading online regards UTF-8 as an 8-bit-only encoding, which it isn't.

ASCII is 7 bits only ("Extended ASCII" isn't an official ASCII representation, though the term is still useful, as is "8 bit ASCII")
ISO-8859(-1) is 8 bits only
UTF-8 is 1-4 groups of 8 bits. Yes, despite its name, a single character in a UTF-8 file can take up to 32 bits, with the most significant bits of the leading byte indicating how many bytes follow.
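
A quick way to check those widths is to encode a few characters and count the bytes. This is only a sketch: the sample characters are arbitrary and `width` is a throwaway helper, not anything from UT99.

```python
samples = ["A", "é", "€", "😀"]


def width(ch: str, encoding: str):
    """Return the encoded length in bytes, or None if the encoding cannot represent ch."""
    try:
        return len(ch.encode(encoding))
    except UnicodeEncodeError:
        return None


for ch in samples:
    # One row per character: bytes needed in each encoding (None = not representable).
    print(ch, {enc: width(ch, enc) for enc in ("ascii", "iso-8859-1", "utf-8", "utf-16-le")})
```

Only UTF-8 and UTF-16 can represent all four samples; ASCII handles just the first, and ISO-8859-1 handles the first two.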

BTW, "UTF-16" is an oddball (1-2 groups of 16 bits) and seems so seldom used that I can't find any information about its use in .ini (or any other standard) files. Hardly anyone uses it for anything, AFAICT. I think some of the claims about UTF-16 might be spurious and actually refer to UTF-8.

My question: has anyone seen a problem with ISO-8859? The focus on Windows encodings (due to the game's origin) makes sense, but *which* of the various encodings causes the least trouble on Linux?

Thanks!
