Why can't Elemental cache preparsed XML files?

I was working on a (very boring) project that needed to parse 5 megabytes of XML on load-up. It was for processing transactions, but that's not very important. The load-up time was very frustrating to everyone who used the project. We spent a lot of time pulling our hair out over how to fix the problem, but the eventual solution came down to this:

We created an index (a plaintext file) containing a hash of every XML file. At runtime we would hash the files (this is a very quick operation, especially for us since we were using an Intel processor with the AES instruction set) and compare the hashes to the last known values. Any files that had changed were parsed again and the results saved. We parsed them down to a collection of lists and serialized those. That got the load time down from a minute and a half to twenty seconds.
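
A minimal sketch of that scheme in Python, for illustration only. The function names, the JSON index format, the choice of SHA-256, and the use of pickle for the serialized lists are all assumptions, not details from the original project.

```python
import hashlib
import json
import pickle
import xml.etree.ElementTree as ET
from pathlib import Path


def file_hash(path: Path) -> str:
    """Hash the raw bytes of a file (SHA-256 is an assumption; any fast hash works)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def parse_xml(path: Path) -> list:
    """Parse one XML file down to a plain list of (tag, attributes, text) tuples."""
    root = ET.parse(path).getroot()
    return [(el.tag, dict(el.attrib), (el.text or "").strip()) for el in root.iter()]


def load_all(xml_dir: Path, cache_dir: Path) -> dict:
    """Return parsed data for every XML file, reparsing only files whose hash changed."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    index_path = cache_dir / "index.json"  # plaintext index: file name -> last known hash
    index = json.loads(index_path.read_text()) if index_path.exists() else {}
    result = {}
    for xml_path in sorted(xml_dir.glob("*.xml")):
        digest = file_hash(xml_path)
        blob_path = cache_dir / (xml_path.stem + ".pickle")
        if index.get(xml_path.name) == digest and blob_path.exists():
            # Unchanged since last run: load the serialized lists, skip the XML parser.
            result[xml_path.name] = pickle.loads(blob_path.read_bytes())
        else:
            # New or changed file: parse it again and save the result for next time.
            data = parse_xml(xml_path)
            blob_path.write_bytes(pickle.dumps(data))
            index[xml_path.name] = digest
            result[xml_path.name] = data
    index_path.write_text(json.dumps(index, indent=2))
    return result
```

The first call pays the full parsing cost; later calls only hash the files and unpickle. Pickle should only be used on cache files the program wrote itself, since loading an untrusted pickle can run arbitrary code.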

Why isn't this an option for Stardock? A developer could knock out code to do this in half a day.

Reply #1 Top

It's on the to-do list. He specifically mentions doing something like that in the thread.

Ultimately this is an optimization thing. You do it closer to the end, because when you're focused on making things work, making them faster isn't at the top of the list.

Reply #2 Top

I think it depends on where most of the CPU time is spent.  If it is parsing the text, then pre-parsing should help.

If it is creating the in-game objects - I'm not sure how pre-parsing will help.  I suppose it could be easier to have a binary representation, if most of the objects are static - records/structures with lots of flat data.  But if they are mostly pointers to lots of little things with more pointers, it can get difficult to save this off as binary data.  Of course - in software, anything is possible given enough time and effort.
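
To make the flat-data case concrete, here is a rough Python sketch. The record layout, field names, and the `parent_id` convention are hypothetical and not taken from any real game format. Fixed-size records pack and unpack trivially, while anything pointer-like has to be stored as an id and re-linked after loading.

```python
import struct

# Hypothetical flat record: id (uint32), parent_id (uint32, 0 = none),
# cost (uint32), and a 16-byte, zero-padded name.
RECORD = struct.Struct("<III16s")


def pack_units(units):
    """Serialize a list of unit dicts into one contiguous binary blob."""
    return b"".join(
        RECORD.pack(u["id"], u["parent_id"], u["cost"], u["name"].encode("utf-8"))
        for u in units
    )


def unpack_units(blob):
    """Read the blob back into dicts; no text parsing is involved."""
    units = []
    for uid, parent_id, cost, raw_name in RECORD.iter_unpack(blob):
        name = raw_name.rstrip(b"\0").decode("utf-8")
        units.append({"id": uid, "parent_id": parent_id, "cost": cost, "name": name})
    return units


def link_parents(units):
    """Turn stored ids back into object references (the 'pointer' step)."""
    by_id = {u["id"]: u for u in units}
    for u in units:
        u["parent"] = by_id.get(u["parent_id"])  # None when parent_id is 0 or unknown
    return units
```

The flat fields round-trip for free; the extra `link_parents` pass is the kind of work that grows when objects are mostly references to other objects.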

Reply #3 Top

Hmmm. That's a good view.

On the downside, you get a MUCH longer loading time the first time and any time you install a big mod. But that's understandable, and any user will tolerate a big sign saying "The first load takes longer so that later loads are faster!". Oh yeah, and it costs HDD space.

Reply #4 Top

"It's on the to-do list."

Ah, storing them as binary blobs. Didn't see that in the posts.

Reply #5 Top

Well, if they just distributed the pre-processed XML (or ran the initial XML scan during the install process), it would just look like the game was still installing. Most users expect a long install, and a slightly longer one probably wouldn't be noticed.

Reply #7 Top

The base game will be the same for each patch/installation. Therefore, the "list of binaries" for it can be prepopulated as well. When the game starts up for the first time, it scans for XMLs not covered by that list (mods). It would then parse the mod data and add that to its list.
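
As a rough Python illustration of that idea (the function and the index layout are assumptions, not anything Stardock has described): the game ships an index for its own files, keeps a second index for locally parsed mods, and only parses files that neither index covers with a matching hash.

```python
def files_to_parse(shipped_index: dict, local_index: dict, current_hashes: dict) -> list:
    """Return names of files with no up-to-date entry in either index.

    Each argument maps a file name to a content hash. shipped_index comes with
    the game or patch, local_index is built on the player's machine, and
    current_hashes is what was found on disk at startup.
    """
    known = {**shipped_index, **local_index}  # local entries override shipped ones
    return sorted(
        name for name, digest in current_hashes.items() if known.get(name) != digest
    )
```

With an unmodified install this returns an empty list, so the first launch does no XML parsing at all; only new or edited mod files end up in the result.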

And... I'm pretty sure an installer can be just as powerful as a game? What magical property does a game have that enables it to be multithreaded while an installer can't be?

In fact, packers/unpackers tend to LEAD multithreaded applications and set the standards (along with encryption/decryption). Read about the effect of CAS latency and RAM timings: you'll see that the speed and timings of your RAM are FAR FAR more important for basic packing/unpacking applications such as WinRAR than they are for a video game such as Crysis.

Reply #8 Top

Well, I thought that Windows Installer (which is used in GalCiv 2, AFAIK) was paranoid and wouldn't allow any custom unpacker or indexing program. But it turns out I was wrong: you can add custom DLLs to the installer to do any evil things you want. And you can use advanced packers like 7z (which I'd strongly recommend for digital distribution).

Anyway, the game would need its own indexing module if caching were enabled, so there's no point including one in the installer as well, since an index built during installation would not speed things up much.