1.19E OOM Errors

Yup, it's still here. All games were played straight: no mods, nothing weird. Just play the game until it crashes, then restart, load the last autosave, lather, rinse, repeat. It's mildly better than 1.19d, but not what I would call acceptable. Some large maps, some medium.

http://dl.dropbox.com/u/21911701/6%20Mar%202011%201941%20Error%20Rollup.zip

48,954 views 20 replies
Reply #1 Top

Me too.  OOM in 1.19e when I hit next-turn.  Task Manager showed it using a bit over 1,100,000K at the time.

debug.err     http://dl.dropbox.com/u/8928343/debug-OOM-1-19e.err

don't see a crash zip file this time.

Reply #2 Top

No cheat keys right?

I've seen some OOM where the user used the Ctrl-U cheat key to unhide all and that will, on a large map, definitely ultimately lead to OOM. You're unhiding *everything*. It's really there for debugging (AI debugging specifically).

Reply #3 Top

Thanks guys, we are checking both of these out.  Especially yours, ClaytonHollowell; I love having all the instances, error logs and saves!

Reply #5 Top

I play with cheats enabled, but I'm not using any of them.  Specifically, not turning off FOW.  But I was on a max-sized map, with a big empire and a lot of area visible even without cheats.

FWIW, when I re-loaded, mem usage started a bit under 700,000 k, and grew to around 8-and-a-bit over several turns.

Reply #6 Top

Max sized maps are fine. You should be able to do that.

Unhiding the FOW is the killer.

We're going through it all with a fine-tooth comb. We don't have memory leaks per se; the issue is the nature of our memory allocation philosophy (open-ended, à la GalCiv), which means that when players play in a way we didn't think they would, things can fall apart.  So a lot of time is being spent looking at your logs, seeing how real live humans are playing the game, and optimizing for that.

Reply #9 Top

Brad & Derek - I was bored so I tried to purposely oom my system and success!

All I did was:

  1. Load a fresh instance of Elemental
  2. Load my season 187 save game
  3. When loaded, go back to main menu and reload
  4. Rinse and repeat
  5. OOM error on the 16th load - on each load I recorded the % of physical memory used
  6. What's interesting is it gave me the oom error dialog box, then popped up a box saying the save game was not compatible with the version I was playing, then allowed me to manually exit the game from the main menu

Loads (each # represents the % of physical memory in use per task manager):

  1. 53
  2. 59
  3. 59
  4. 59
  5. 60
  6. 59
  7. 62
  8. 67
  9. 65
  10. 67
  11. 68
  12. 70
  13. 71
  14. 72
  15. 73
  16. CRASH

I hope this helps!

Reply #10 Top

Brad & Derek - round 2:

  1. 1.53GB @ 51%
  2. 1.71GB @ 57%
  3. 1.76GB @ 58%
  4. 1.78GB @ 59%
  5. 1.83GB @ 60%
  6. 1.83GB @ 61%
  7. 1.88GB @ 62%
  8. 1.93GB @ 64%
  9. 1.98GB @ 66%
  10. 2.10GB @ 69%
  11. 2.08GB @ 69%
  12. 2.15GB @ 71%
  13. 2.15GB @ 71%
  14. CRASH

So over 13 successful loads, memory usage went up a net 620MB, an average of 47.69MB per load not clearing from memory. The % of memory used went up 20 points over the same 13 loads, which is a 1.54 average per load. Round 1 was 20 points over 15 successful loads, which is a 1.33 average increase. Things look pretty consistent.

As someone who exits the game per reload I can see where I don't have a consistent oom problem. However, it certainly looks like each reload gobbles up around 50MB of memory, never gives it back, thus making a crash inevitable.
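For anyone who wants to check the arithmetic, here is a rough Python sketch using the readings from rounds 1 and 2. The function and variable names are made up for illustration, and it assumes 1GB = 1000MB, as the figures above do. It reports both the net growth divided by the number of loads (the way the averages above were worked out) and divided by the number of reloads after the first load.

```python
# Readings transcribed from rounds 1 and 2 (crashed loads excluded).
# Round 2 memory is converted to MB assuming 1GB = 1000MB.
ROUND1_PCT = [53, 59, 59, 59, 60, 59, 62, 67, 65, 67, 68, 70, 71, 72, 73]
ROUND2_MB = [1530, 1710, 1760, 1780, 1830, 1830, 1880,
             1930, 1980, 2100, 2080, 2150, 2150]
ROUND2_PCT = [51, 57, 58, 59, 60, 61, 62, 64, 66, 69, 69, 71, 71]


def growth_summary(samples):
    """Return (net growth, average per load, average per reload).

    "Per load" divides by every successful load, including the first.
    "Per reload" divides by the reloads that came after the first load.
    """
    net = samples[-1] - samples[0]
    per_load = net / len(samples)
    per_reload = net / (len(samples) - 1)
    return net, per_load, per_reload


if __name__ == "__main__":
    for label, samples in [
        ("Round 1 (%)", ROUND1_PCT),
        ("Round 2 (MB)", ROUND2_MB),
        ("Round 2 (%)", ROUND2_PCT),
    ]:
        net, per_load, per_reload = growth_summary(samples)
        print(f"{label}: net {net}, {per_load:.2f} per load, "
              f"{per_reload:.2f} per reload")
```

Counting only the 12 reloads after the first load in round 2 gives roughly 51.67MB per reload, which lines up with the "around 50MB" estimate.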

Now back to the AI...:cylon:

[Edit] As a side note, the game is still running at the main menu. I tried to load the game to see what happens: it immediately gives the oom error, tells me my save is not compatible with the game version, and goes right back to the main menu. This should be easily duplicated on your end. Vista shows 2.06GB memory usage @ 68%.

Also note the save is still fine. I can quit the game, reload, and load the save game just fine.

Reply #11 Top

Brad & Derek - round 3. Check this out. I'm batting a 1000 thus far.

Load Save = "L" for my comments below, indicates I was loading this file

Test Save = "S" for comments below, indicates I was saving this file

This time I alternated loading and saving. The L's indicate a load of the load save above. The S's indicate a save to the save above.

  1. L 1.55GB @ 51%
  2. S 1.59GB @ 53%
  3. L 1.68GB @ 56%
  4. S 1.69GB @ 56%
  5. L 1.72GB @ 57%
  6. S 1.71GB @ 57%
  7. L 1.83GB @ 61%
  8. S 1.81GB @ 60%
  9. L 1.78GB @ 59%
  10. S 1.79GB @ 59%
  11. L 1.90GB @ 63%
  12. S 1.91GB @ 63%
  13. L 1.89GB @ 62%
  14. S 1.89GB @ 63%
  15. L 1.95GB @ 64%
  16. S 1.94GB @ 64%
  17. L 1.99GB @ 66%
  18. S 1.98GB @ 66%
  19. L 2.12GB @ 70%
  20. S 2.10GB @ 70%
  21. L 2.14GB @ 71%
  22. S 2.14GB @ 71%
  23. L 2.09GB @ 70%
  24. S 2.10GB @ 70%
  25. L 2.15GB @ 71%
  26. S 2.16GB @ 72%
  27. L 2.18GB @ 72%
  28. S CRASH CTD - DUMP CREATED - 2.19GB @ 73% before CTD

So here's what I noticed:

  • Saves did not appear to materially impact memory
  • Loads are definitely causing a problem when not exiting and restarting the game
  • When the crash occurred on a save, it created a dump and CTD'd. When it oom'd on the loads, the game remained open but would not allow future loads.
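A quick way to back up the first two points is to split the round 3 growth by operation. The Python sketch below is only an illustration: the names are hypothetical, the readings are converted to MB assuming 1GB = 1000MB, and the crashed step 28 is left out. Each change in memory is credited to the operation that came just before the reading.

```python
from collections import defaultdict

# Round 3 readings as (operation, MB) pairs: "L" = load, "S" = save.
ROUND3 = [
    ("L", 1550), ("S", 1590), ("L", 1680), ("S", 1690), ("L", 1720),
    ("S", 1710), ("L", 1830), ("S", 1810), ("L", 1780), ("S", 1790),
    ("L", 1900), ("S", 1910), ("L", 1890), ("S", 1890), ("L", 1950),
    ("S", 1940), ("L", 1990), ("S", 1980), ("L", 2120), ("S", 2100),
    ("L", 2140), ("S", 2140), ("L", 2090), ("S", 2100), ("L", 2150),
    ("S", 2160), ("L", 2180),
]


def growth_by_operation(readings):
    """Sum the change in memory after each operation, grouped by type.

    The first reading has no earlier value to compare against, so it
    only serves as the baseline.
    """
    totals = defaultdict(int)
    for (_, prev_mb), (op, mb) in zip(readings, readings[1:]):
        totals[op] += mb - prev_mb
    return dict(totals)


if __name__ == "__main__":
    for op, total in sorted(growth_by_operation(ROUND3).items()):
        print(f"{op}: {total:+d} MB")
```

On these numbers, loads account for 610MB of the 630MB net growth and saves for only 20MB.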

Let's get this stability crap out of the way and get to Brad's 1.3! I know, easier said than done. Hope this helps.

Reply #12 Top

FWIW, I *never* load without exiting & saving.  That may be why I OOM less often than some others seem to. 

So my OOM above must have some other cause, like maybe playing too many turns or having too much going on.

I was using the "dominate" ability a lot in that campaign.  Could there be a connection there?  I know the game gets confused by "dominate" because every time I use it there is a ghost dead unit left on the screen.

Reply #13 Top

Yeah, I've had oom crashes in prior builds (not yet in 1.19e) from just playing as I always exit game and reload. Sounds like there's a few gremlins running around in the engine. The load from within a game is a sure thing though.

Reply #14 Top

AlLanMandragoran (and others),

The save/load leaks are something I've been focusing on.  Thank you for your detailed testing.  I'm continuing to work on the save/load leaks and that data will be helpful.

- CC

Reply #15 Top

Cool, thanks CodeCritter. I'll definitely keep posting any other OOM type issues I see. I imagine most players aren't loading 16x a session so the other OOM issues have other root causes. Thanks for your hard work, the game is much more stable for me now!

Reply #16 Top

@CodeCritter - Also, I was able to duplicate the ghost stack thing in 1.19d. Not sure if you had a chance to look or fix for 1.19e. I haven't seen it yet in 1.19e. See here, reply #14. https://forums.elementalgame.com/405808

Reply #17 Top

http://dl.dropbox.com/u/16468196/debug.err

Game randomly crashes about every 20 turns or so.

Reply #18 Top

and again.....

http://dl.dropbox.com/u/16468196/2debug.err

Reply #19 Top

getting better....

http://dl.dropbox.com/u/16468196/3debug.err

http://dl.dropbox.com/u/16468196/Auto1Save.EleSav

Reply #20 Top

http://dl.dropbox.com/u/16468196/1QuickSave.EleSav

http://dl.dropbox.com/u/16468196/4debug.err

Latest version. Using some mods, large map, lower difficulty. Quick-saved after a major battle; thankfully no crash, which sometimes happens. I try to save after a major breakthrough that I'd prefer not to repeat.

Hope these help.