UnrealArchive.org

Share interesting stuff you have found or created yourself.
Shrimp
Adept
Posts: 305
Joined: Wed Oct 10, 2018 11:15 am
Location: Australia

UnrealArchive.org

Post by Shrimp »

https://unrealarchive.org/

Hey guys,

So I've been working on putting this together for the past few months, inspired by JackGriffin and others' efforts considering ways of building resilient archives of Unreal/UT content. This is the solution I've come up with.

To start with, it may look pretty bland, just being effectively a static list of files. There are no forums (no need for more forums when this one exists! 8)), comments sections on each piece of content, user ratings, etc. I found all these things incompatible with the goal of building a stable and long-lived content archive. The point was also not to make the fanciest website in the world, but rather one that's portable and somewhat time-proof.

Rather than me simply hosting all content myself, you'll note that each download has multiple mirrors. The idea behind this is to provide some form of resiliency, so if one goes down, hopefully some of the others are still up. I still need to implement functionality to cull dead links.
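
As an illustration only (this is not the project's actual tooling), culling dead links could be sketched roughly as below in Python. The function names `is_alive` and `cull_dead_mirrors` are hypothetical, and treating "answers a HEAD request with a 2xx status" as "alive" is an assumption; some hosts reject HEAD requests and would need a different check.

```python
import urllib.error
import urllib.request


def is_alive(url, timeout=10):
    """Return True if the URL answers a HEAD request with a 2xx status."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # Unreachable host, HTTP error status, timeout, or malformed URL.
        return False


def cull_dead_mirrors(mirrors, check=is_alive):
    """Split a list of mirror URLs into (alive, dead) using the given check."""
    alive, dead = [], []
    for url in mirrors:
        (alive if check(url) else dead).append(url)
    return alive, dead
```

Passing the check in as a parameter keeps the splitting logic testable without touching the network, and a real run would probably want to retry a few times before declaring a mirror dead.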

A significant goal of the project is also to ensure the distribution of content, so even if all the mirrors go down, there's a reasonable chance someone has the files locally somewhere. As such, the built-in tooling makes it simple to create and update a mirror of all content in the archive. You may even generate the website itself and use it locally or re-host it.

Here are a couple of the things I've done and decisions made, to achieve these goals:

- The whole thing is open source and hosted on GitHub (https://github.com/unreal-archive)
- The source code and metadata are unlicensed, meaning I do not hold copyright or ownership over it, and it effectively belongs to the world/community
- The main project itself is written in Java, primarily because I'm familiar with it, but it's also proven to be extremely backward compatible, and I don't see the JVM going away any time soon.
- The metadata is hosted in a Git repository, in YAML format. I chose this over a database, since it's plain text and in the future anyone can use it without any drivers, codecs, connectors, or whatever else. Being version controlled in Git means it's easy for anyone to clone and host in remote repositories, as well as on their own machines.
- The website itself is statically generated, meaning it doesn't rely on a hosting environment with PHP, Java, ASP, or other magical stuff. It can be dumped in a directory on any web server and work, and I've also been careful to ensure it works with local file:// paths, so if you want to use the site for personal use without web hosting, you can do that too.
- To encourage "replication" of the data, the tooling provides the ability to effectively "clone" the entire data store. I want people to download everything; the more copies of stuff we have, the better (over time, this leads to a bit of fragmentation, but I'm OK with that). A problem I have with most of the current archives is that mirroring is actively discouraged, which I don't feel is a particularly healthy approach. If you want to crawl and scrape the website as-is, that's fine, but ideally you should use the tooling.
- The project is not, and cannot be, a cache service. It is intended to host complete "release" packages, rather than single or .uz compressed cache files.
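
To make the "statically generated" point above concrete, here is a minimal sketch in Python of turning metadata records into plain HTML files. It is not the project's generator (which is written in Java); the record fields `slug`, `name`, `author` and `downloads`, and the functions `render_item` and `write_site`, are all assumptions made up for the example. Every link between pages is relative, which is what lets the output work from a file:// path as well as from a web server.

```python
import html
import tempfile
from pathlib import Path


def render_item(item):
    """Render one content item as a standalone HTML page."""
    name = html.escape(item["name"])
    author = html.escape(item["author"])
    mirrors = "\n".join(
        '<li><a href="{0}">{0}</a></li>'.format(html.escape(url, quote=True))
        for url in item["downloads"]
    )
    return (
        "<!DOCTYPE html>\n<html><head><meta charset='utf-8'>"
        "<title>" + name + "</title></head>\n<body>\n"
        "<h1>" + name + "</h1>\n<p>Author: " + author + "</p>\n"
        "<ul>\n" + mirrors + "\n</ul>\n"
        '<p><a href="../index.html">Back to index</a></p>\n'
        "</body></html>\n"
    )


def write_site(items, out_dir):
    """Write index.html plus one page per item, linked with relative paths."""
    out = Path(out_dir)
    (out / "items").mkdir(parents=True, exist_ok=True)
    index_links = []
    for item in items:
        page = "items/" + item["slug"] + ".html"
        (out / page).write_text(render_item(item), encoding="utf-8")
        index_links.append(
            '<li><a href="{}">{}</a></li>'.format(page, html.escape(item["name"]))
        )
    index = (
        "<!DOCTYPE html>\n<html><body>\n<ul>\n"
        + "\n".join(index_links)
        + "\n</ul>\n</body></html>\n"
    )
    index_path = out / "index.html"
    index_path.write_text(index, encoding="utf-8")
    return index_path


# Example: build a one-item site in a temporary directory (hypothetical data).
demo_dir = tempfile.mkdtemp()
index_path = write_site(
    [{
        "slug": "dm-example",
        "name": "DM-Example",
        "author": "A & B",
        "downloads": ["https://example.org/dm-example.zip"],
    }],
    demo_dir,
)
```

Opening the resulting index.html directly in a browser should behave the same as serving the directory from any web server, since nothing depends on server-side code.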

Here's a quick summary of the current contents (Jan 2024):
Loaded content index with 59607 items (340.74GB) in 4.07s
Current content by Type:
> Mutator: 2390
> Map: 40833
> MapPack: 1751
> Model: 1651
> Skin: 2983
> Voice: 1026
Current content by Game:
> Unreal Tournament 3: 2330
> Unreal: 4563
> Unreal Tournament: 28901
> Unreal Tournament 2004: 12196
> Rune: 2594
> Unreal 2: 50
I'm out of time to document more of this wall of text right now, but this afternoon I'll add more notes on the current content sources (with thanks to them), contributions, the current metadata management process, cleanup processes, and more. In the meantime, please drop any questions you have and I'll answer them later.

Cya!
Last edited by Shrimp on Thu Jan 04, 2024 1:38 am, edited 2 times in total.
ShrimpWorks
Unreal Archive - preserving over 25 years of user-created content for the Unreal series!
UTPe
Masterful
Posts: 602
Joined: Sun Jul 12, 2009 7:10 pm
Personal rank: Dude
Location: Trieste, Italy

Re: UnrealArchive.org

Post by UTPe »

Hi Shrimp,
You did a great job! If you need to add maps to your database, try this website: http://ut99maps.gamezoo.org (it hosts mainly DM sniper maps).
Recently I shared a 4GB+ archive of UT99 docs and tools with Kelly. If you want, I can upload it somewhere next weekend.

cheers,
Pietro
Personal map database: http://www.ut99maps.net

"These are the days that we will return to one day in the future only in memories." (The Midnight)
RetroEpoch
Average
Posts: 62
Joined: Wed Oct 03, 2018 3:02 pm
Personal rank: The more the merrier

Re: UnrealArchive.org

Post by RetroEpoch »

Oh my... Thank you for your work! :gj:

I have to ask. Do you have all the ut-files.com site also included in yours? I might like to host a mirror of it but, if yours holds the same, or more content, I'd be more than happy to give it a try! :D
UnrealGGecko
Godlike
Posts: 2975
Joined: Wed Feb 01, 2012 11:26 am
Personal rank: GEx the Gecko
Location: Kaunas, Lithuania

Re: UnrealArchive.org

Post by UnrealGGecko »

Man, fantastic job. :tu: gonna explore this site for days now.

Probably the only inclusion I'd suggest atm would be to make the author clickable, and it would show all the maps/mods that author has made. I know that in some cases the author just puts his nickname on the Author tab and his full name on others, but it would still be really good to see if I missed anything from that author (for example I didn't know GTD-Carthage made an AirFight map, plus I remember he did do another non FoT map that I forgot the name of.)

Thank you so much for your hard work, man. :rock:
Aldebaran
Masterful
Posts: 672
Joined: Thu Jan 28, 2016 7:30 pm

Re: UnrealArchive.org

Post by Aldebaran »

Thank you for the work you and others have put into this project so far, it looks awesome to me.
Hellkeeper
Inhuman
Posts: 914
Joined: Tue Feb 25, 2014 12:32 pm
Personal rank: Soulless Automaton
Location: France

Re: UnrealArchive.org

Post by Hellkeeper »

Great work, and I agree that a simple, plain-looking and clear website is the best way to go.
Now it's just a matter of archiving everything as time goes by.

Great job! :thuup:
You must construct additional pylons.
esnesi
Godlike
Posts: 1050
Joined: Mon Aug 31, 2015 12:58 pm
Personal rank: Dialed in.

Re: UnrealArchive.org

Post by esnesi »

Great looking website!
Added to my UT folder in favs!

Centralisation is needed a lot for UT though.
So many entries like ut-files, medor, community forums etc, the list is really endless.

1 Point of Entry would really be nice and a big wish.
For example UT99.org/files.
Shrimp
Adept
Posts: 305
Joined: Wed Oct 10, 2018 11:15 am
Location: Australia

Re: UnrealArchive.org

Post by Shrimp »

I wanted to go into a bit more detail about the current state of things.

Firstly, thanks to the following who've been keeping files and servers up all these years, from which I scraped the majority of this content. Where I could, I made donations in exchange for the anti-social scraping behaviour (though I did rate limit my scraping to try to be less harmful :oops:)

- Medor (who went through a very long and painful process of uploading his archive to me)
- http://www.ut-files.com/
- http://ut99maps.gamezoo.org/ (yes, I already grabbed all your maps :rock: )
- http://www.uttexture.com/ and http://www.unrealtexture.com/
- http://fpsnetwork.com/
- https://gamebanana.com/
- https://gamefront.online/ (this was a sort-of unofficial mirror of gamefront.com - glad I grabbed their stuff, it's now gone and not on gamefront any more)
- UnrealGGecko's Google Drive :lol:
- Archive.org's BeyondUnreal dump
- Many smaller sites and random forgotten FTPs, a lot linked off http://utdatabase.gamezoo.org/maps.html (thanks again for this resource)

I still want to pull stuff from ModDB and perhaps some other spots.


About the content categorisation process

I've implemented a Java library for reading Unreal Engine packages, which is used for reading map information, extracting textures/screenshots, parsing INT and similar files, reading and unpacking UMOD files, etc. For anyone interested, this is also on GitHub (https://github.com/shrimpza/unreal-package-lib). It's designed to be fast and light on resources.

Using this library, I attempt to inspect the contents of archives (zip, rar, 7z, exe, etc) to determine their content, figure out the game (only supporting Unreal, UT, and UT2004; nobody cares about the black sheep of the family UT3 :ironic:), and categorise them appropriately (maps, mutators, voices, etc). I'm not repackaging any content, it's provided as-is, as found on the various pages above.
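
A much-simplified sketch of the categorisation idea, in Python and for illustration only: the real process reads the package contents with the Java library mentioned above, whereas this example looks only at file names inside a zip. The names `classify_zip`, `make_zip` and `MAP_EXTENSIONS` are hypothetical, and deciding "Map" versus "MapPack" purely from the number of map files is an assumption, not the archive's actual rule.

```python
import io
import zipfile

# Assumption: map files are recognised purely by extension
# (.unr for Unreal/UT maps, .ut2 for UT2004 maps).
MAP_EXTENSIONS = (".unr", ".ut2")


def classify_zip(data):
    """Guess a content type from the file names inside a zip archive."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        # Skip directory entries, compare extensions case-insensitively.
        names = [n.lower() for n in archive.namelist() if not n.endswith("/")]
    map_count = sum(1 for n in names if n.endswith(MAP_EXTENSIONS))
    if map_count > 1:
        return "MapPack"
    if map_count == 1:
        return "Map"
    # Anything else would need deeper inspection or manual curation.
    return "Unknown"


def make_zip(file_names):
    """Build a small in-memory zip for trying out classify_zip."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in file_names:
            archive.writestr(name, b"placeholder")
    return buffer.getvalue()
```

For example, `classify_zip(make_zip(["Maps/DM-Example.unr", "readme.txt"]))` falls into the single-map branch. Mutators, voices, skins and so on cannot be told apart by extension alone, which is where reading the packages themselves comes in.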

The biggest problem currently is finding author information, which I attempt to extract from README files and the like; that's why a lot of the author strings on things which aren't maps look muddled sometimes. A lot of this will require manual touch-up. If you see your content, please submit a patch by simply editing the relevant YML file on GitHub (if you can find it - there's tooling to help with this, but it needs some documentation and exposure; Coming Soon[tm]).

I'm still working on how best to expose gametypes/mods, as they will probably require completely manual curation, much like I've done for patches and renderers and things.

If your content is listed, and you want it removed for some reason, please let me know, and I'll remove it. That will make me sad though, as I think all user created content has contributed to the history of Unreal and UT, and the whole point of this project is to have it preserved.


About written guides

The "Guides and Articles" section is very light on content at the moment. It's only hosting a couple of pages for providing instructions on how to contribute to the project (not content, yet), where it should actually have some useful game-specific stuff. I'd really like this to preserve some of the knowledge the community has gathered over the years on topics such as server hosting, getting started with UnrealEd or UnrealScript, UCC's missing manual, installing the games on exotic platforms, etc.

To kick the section off, I'd like permission to re-publish Everything you need to know about using UT99 on new hardware by Dr.Flay and TUTORIAL: Unreal Tournament Audio & Visuals Tweaking Guide by Raynor. I think these would serve as a good reference for the sorts of things to add to these sections.

You may either just grant me permission to publish these on the site, or feel free to contribute them yourselves for practice and demonstration purposes :). Let me know and I'll guide you through the process (which will then become a guide itself - see, it's a lovely cycle of content creating content! :thuup:).


About submitting your own content

To begin this adventure, I've kicked off everything by seeding the system with as much existing content as I could find.

On the mid-term roadmap is a plan to let you submit your own content. This does have some challenges, since everything is, by design, quite static; once I have a clean plan, I'll provide the ability for you to upload your existing and future content.
Shrimp
Adept
Posts: 305
Joined: Wed Oct 10, 2018 11:15 am
Location: Australia

Re: UnrealArchive.org

Post by Shrimp »

UTPe wrote: Hi Shrimp,
You did a great job! If you need to add maps to your database, try this website: http://ut99maps.gamezoo.org (it hosts mainly DM sniper maps).
Recently I shared a 4GB+ archive of UT99 docs and tools with Kelly. If you want, I can upload it somewhere next weekend.
Thanks! As per my previous post, I've already grabbed your maps :).

I'd be interested in that additional archive. I'll make a plan for somewhere it can be uploaded. Can I download it from you? If not, I'll contact you via PM later with an upload plan.
captainepoch wrote: I have to ask. Do you have all the ut-files.com site also included in yours? I might like to host a mirror of it but, if yours holds the same, or more content, I'd be more than happy to give it a try! :D
I haven't included every file from ut-files.com, but I do have as many maps, mods, mutators, voices, skins and models as possible. Unrealarchive.org currently only hosts content which can be "safely" categorised automatically, so things like corrupt archives, unknown content, etc, are not represented. I'm pulling things together as I implement new functionality, and don't want to mirror everyone's sites all at once. I'd rather only download what I need, when I need it.
UnrealGGecko wrote: Probably the only inclusion I'd suggest atm would be to make the author clickable, and it would show all the maps/mods that author has made. I know that in some cases the author just puts his nickname on the Author tab and his full name on others, but it would still be really good to see if I missed anything from that author (for example I didn't know GTD-Carthage made an AirFight map, plus I remember he did do another non FoT map that I forgot the name of.)
One of my main goals is, and has always been, to implement by-author browsing. I think it should definitely be possible to find and give credit to all the people who've contributed stuff for so many years!
iSenSe wrote: Centralisation is needed a lot for UT though.
So many entries like ut-files, medor, community forums etc, the list is really endless.
I actually think having a lot of mirrors and archives is a good thing :). It ensures continued existence of a lot of stuff if one source goes offline. I know it's a hassle to have stuff all over, and everyone has a little subset of stuff, which is less than ideal, but I really do think that to promote resilience of the loads of content people have created, it needs to be spread around if possible.

That said, if one or two resources can connect all these others simply to make browsing easier, that's maybe not a terrible thing.


Thanks all for the kind words, work will continue and I'll keep this thread updated as things progress!
darksonny
Masterful
Posts: 518
Joined: Sat Sep 13, 2008 10:24 pm

Re: UnrealArchive.org

Post by darksonny »

Thank you dude, it's a huge effort making this nice collection project! Many of the maps have a picture to preview them before downloading. It would be nice (maybe this is a work in progress or a still unfinished task) if all maps had a proper picture, mainly those where the author posted an unrelated preview of his map (pictures like Bugs Bunny or a clan logo; that is useless info that doesn't help a UT fan decide whether to download or not).

Thanks again!
papercoffee
Godlike
Posts: 10531
Joined: Wed Jul 15, 2009 11:36 am
Personal rank: coffee addicted !!!
Location: Cologne, the city with the big cathedral.

Re: UnrealArchive.org

Post by papercoffee »

First of all ...awesome work. :tu:

I noticed you didn't get all the mapping contest packs from UT99.org ...I pinned them to our mapping Discord channel.
https://discordapp.com/channels/2267363 ... 7467059205 (the pin symbol on the right upper corner)
If you prefer those downloads via Dropbox, PM me and I'll upload them there.
sektor2111
Godlike
Posts: 6443
Joined: Sun May 09, 2010 6:15 pm
Location: On the roof.

Re: UnrealArchive.org

Post by sektor2111 »

I'll add this to my bookmarks.
JimmyCognitti

Re: UnrealArchive.org

Post by JimmyCognitti »

ut99.org + unrealarchive.org = 100% Awesomeness!
You're doing a great job, keep it up!
Hook
Inhuman
Posts: 754
Joined: Tue Apr 22, 2008 11:21 pm
Personal rank: UT99 Promoter/Admin
Location: Minnesota USA

Re: UnrealArchive.org

Post by Hook »

Excellent work Shrimp! :tu:
:gj:
Bookmarked and also distributed at my sites!
This IS a real plus for the Unreal Tournament Community for sure! :wink:
=Hook=(Member# 626)
HUTP Active Forums: https://hooksutplace.freeforums.net/forum
HUTP UT99 Community Portal: https://hooksutplace.freeforums.net/
OR: https://hermskii.com/hook/ut99_hutp/
UT99 Server -> CROSSBONES Missile Madness {CMM}

* Newest Versions of: PRO-Redeemers | PRO-SNIPER-Redeemers | PRO-SEEKER-Redeemers <-(the Original)
and Now with FOOD FIGHT and Frying Pan arena !!!
IP: 68.232.181.236:7777 <-(NEW IP to come)
UT99 MH Server -> {CMH} CROSSBONES Monster Hunt (MH) by Mars007 (The Original) - IP: 108.61.238.93:7777
JackGriffin
Godlike
Posts: 3776
Joined: Fri Jan 14, 2011 1:53 pm
Personal rank: -Retired-

Re: UnrealArchive.org

Post by JackGriffin »

Sorry I haven't responded sooner. We got hit with a couple of days of snowstorms, which is not at all normal for where I live. School was out so the wife was home yesterday on my day off.

Shrimp, you are the man. Reading this yesterday really made my day. Honestly I was major bummed that I spent all that time collecting everything and then the dude that was going to host it all just stopped answering any emails. Guess they changed their minds. That really sucked.

Then I saw this and everything was set right again.

I probably need to talk with you about uploading my stuff and what gets uploaded. Over the years I've siteripped everything I could get into and downloaded all the files I ever came in contact with. The things explicitly sent to me for private testing I've always placed in a distinct folder so I didn't mistakenly share them, but the ripped stuff contains all kinds of unreleased content, beta files, test versions, etc. There's for sure things in there that were not meant to be public but I don't know which is which. I just saved everything.

What are your feelings about files like this? Do you want to just mirror everything and remove stuff if the author gets upset? I can work through it and guess at what should be archived safely but if I do that the community will lose out on a bunch of content that's valuable to learn from as well as history of mod development. I also have a lot of anticheat sources and aimbots, etc. What about those? What about bytehacked files? It can all get a little muddy depending on the level of 'community archive' you want to host.

If you want to discuss this privately just drop me an email. I'll explain more in depth what I have saved. The easiest way to visualize it is that many website owners are gamers and not web designers. The majority of them think that their files are private when they aren't, and someone with a crawler is able to grab all of them. Which I did. A lot. :lol2:
So long, and thanks for all the fish