ID:2380006
 
Resolved
During analysis of init procs to consolidate them, objects could be created that were sometimes not let go of properly or worse, called the soft Del() proc. Now init procs will no longer create atoms, images, or hard datums during anaylysis.
BYOND Version:512.1432
Operating System:Windows Server 2012 r2 64bit
Web Browser:Chrome 68.0.3440.33
Applies to:Dream Daemon
Status: Resolved (512.1434)

This issue has been resolved.
Descriptive Problem Summary:
We put 1432 on one of the servers, it worked fine

The next morning I woke up and was told that everything was fucked.

Players were going to random mobs on log in, editing variables with admin tools would edit the same variables on other objects instead.
lol this isn't a very useful report, stop bloating up the bug queue
That's odd. Any chance it's related to id:2379947?

I need something to go on of course to get to the bottom of this.
as far as i know, we don't use ./: chaining it violates the repo's policy against using operators that don't compile-time type check
I'll try to elaborate a bit on what I know but I did not see this first-hand.

We generally do not allow chained operations like list[1].thing or proc().thing. I'm sure there are a fair number of places where that actually happens, but i don't think it's related.

New players who connected were logged into types other than /mob/dead/new_player, despite this code in /mob/Login()
log_access("Mob Login: [key_name(src)] was assigned to a [type]")


resulting in
ACCESS: Mob Login: Jordie0608/(Jordie0608) was assigned to a /mob/dead/new_player

In actuality, Jordie here was logged into a /mob/living/simple_animal/hostile/hivebot/strong, or at least his Topic() calls came from a hivebot.

An admin also reported that mob tags resolved wrong, effectively "mob_40" locate()'d into a mob with tag "mob_495"
I'm not entirely sure the chained list/proc thing wasn't an issue though, especially if problems only started after a certain point. I can't see any reason for a mob to login to the wrong type unless something got changed inappropriately.
The fix for the other issue is in 512.1433, so please retest with that build.

You will need to recompile.

If this doesn't fix the issue I'll need some more concrete data to work with.
Well now it's reproducible!

https://travis-ci.org/tgstation/tgstation/jobs/ 398482646#L1042

Here is the same code on 1427 with no runtimes https://travis-ci.org/tgstation/tgstation/jobs/ 397638628#L954

This is actually seeming similar to 1431's bug, as those runtimes weren't a thing in 1432
Yep, its all broken down, can confirm. Began with lots of list errors, but I'm sure you can see the runtimes yourself.
Um, do you have that code somewhere else where I can look at it? That site barfs unless I enable cookies for it.
It seems to be TG code, I think this will work with most SS13 codebases, the Travis is just the errors... Which basically comes down to lots and lots of null.proc() errors. Bad index calls for me, nullified lists.

(So I *think* you can just use your version of TG, or one of their newer ones and reproduce it all the same)
This is correct, I had made a post that said just as much, but the website went down while posting it, and i didn't feel like retyping it
Here's a pastebin version of that log

https://file.house/wgZG.txt

It should be reproducible by running DD with this commandline

DreamDaemon.exe -trusted -params "test-run"

with or without LOWMEMORYMODE

And here the exact revision of the tree that's running https://github.com/tgstation/tgstation/archive/ 9acba9cde801ec523cdcb1404c8ecaaa6f5bea34.zip
Test run failed!
Total runtimes: 13190
I won't run tests in trusted mode but I think that log and that .zip should be helpful to me. I'll see what I can find.
I'm working on this, but boy is it a slog. From what I'm seeing right now in the debugger (low-memory mode), it looks like the air subsystem is being initialized before the atom subsystem and that's what's causing the problem. OR, the atom subsystem is somehow skipping over the turf. The first turf this happens to (to get the null.proc error on open.dm:157) is 0x1000300. I'm not seeing its air var being set to anything, which from what I see the atom subsystem is responsible for, before the atmos subsystem takes over. Hence the problem: the air var is expected not to be null, but it is null.
The subsystems are initialized in code/controllers/master.dm, line 80 onwards:
            var/list/subsytem_types = subtypesof(/datum/controller/subsystem)
sortTim(subsytem_types, /proc/cmp_subsystem_init)
for(var/I in subsytem_types)
_subsystems += new I


cmp_subsystem_init is in /code/__HELPERS/cmp.dm, it provides a comparison function to sort the subsystems by their initialization order:
/proc/cmp_subsystem_init(datum/controller/subsystem/a, datum/controller/subsystem/b)
return initial(b.init_order) - initial(a.init_order) //uses initial() so it can be used on types


The subsystems are defined under the files in code/controllers, but each of them only has init_order = SOME_DEFINE. The actual values are in code/__DEFINES/subsystems.dm, line 51 onwards.

I'm guessing the bug involves initial() on variables containing typepaths somehow, a pattern we use in a few places to avoid instantiating an object just to get a constant value from it.
Well, I ran a test on initial() with a regular object earlier and confirmed that much at least seemed to be working correctly. I don't see any reason that type paths in initial() would have changed, though. However it definitely appears the subsystems are initializing out of order.
I'm trying to drill down on this issue further, right now i've confirmed that sortTim() runs and runs the
if(L && L.len >= 2)
path (as it should), but does not sort the list and never runs /proc/cmp_subsystem_init even once, leaving the subsystem init order up to whatever pops out from typesof()

e: /datum/sortInstance/proc/timSort() runs, which runs /datum/sortInstance/proc/binarySort(), but never runs
if(call(cmp)(fetchElement(L,mid), pivot) > 0)
in the for loop. for loop has an odd signature of
for(,start < hi, ++start)
, note the comma

e2: the for loop never runs.
## WARNING: binarysort called, start is 65, hi is 65, cmp is /proc/cmp_subsystem_init

e3:
this call:
sortTim(subsytem_types, /proc/cmp_subsystem_init)

runs this code:
/proc/sortTim(list/L, cmp=/proc/cmp_numeric_asc, associative, fromIndex=1, toIndex=0)
warning("sortTim here, fromIndex is [fromIndex], toIndex is [toIndex]")

which gives this warning:
## WARNING: sortTim here, fromIndex is , toIndex is

so the default arguments got nulled?

Going to run this with 1427 to see what the difference is
On 1427, the warning is
## WARNING: sortTim here, fromIndex is 1, toIndex is 0

and the sorting proc works as it should. So default arguments seem to be broken?
Page: 1 2