According to Lum, the current scheduler queue is a linked list, chosen to avoid the performance cost of dynamically resizing a collection.
This means that spawn()/sleep() has to walk the scheduler queue element by element from the beginning to find where to insert the new entry. The larger your scheduler queue, and the further in the future the spawn is scheduled, the worse performance gets. For large games the overhead of a scheduler reinsertion is dramatic, and without a full picture of how this impacts developers, the big fish in this community throw their hands up in the air, totally unable to diagnose why their game is shitting the bed.
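The cost difference is easy to model outside DM. A minimal Python sketch (not the engine's actual implementation — the function names here are illustrative) comparing a linked-list-style linear scan against a binary search for the insertion point:

```python
import bisect
import random

def linear_insert(queue, t):
    """Model of the linked-list scheduler: walk from the head until we find
    the first entry scheduled later than t, then insert there. O(n) per insert."""
    for i, existing in enumerate(queue):
        if existing > t:
            queue.insert(i, t)
            return
    queue.append(t)

def binary_insert(queue, t):
    """Binary search for the insertion point instead: O(log n) comparisons."""
    queue.insert(bisect.bisect_left(queue, t), t)

random.seed(1)
times = [random.random() * 10 for _ in range(1000)]

a, b = [], []
for t in times:
    linear_insert(a, t)
    binary_insert(b, t)

assert a == b            # both strategies keep the queue sorted by time
assert a == sorted(times)
```

Both end up with the same sorted queue; the difference is purely how many comparisons each insertion costs, which is exactly where a large game's scheduler spends its time.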
This is especially problematic because the spawn() pattern in particular makes CPU occupancy nearly impossible to profile, effectively hiding problem areas of your code. See my profiling results: using waitfor in place of spawn makes it much more obvious where the CPU hit is coming from, without inducing much extra overhead.
My tests indicate that we can expect nearly a five-fold increase in performance across the board, in both the best and worst cases.
I performed some testing in softcode to confirm my suspicions:
//potential optimization #5: allow multiple schedule lists based on time granularity. A binary split of very large scheduler lists, segregating times with a progressive decrease in granularity vs. list size, should significantly improve scheduling performance and spread the regained performance over time during Tick().
//potential optimization #4: create a special scheduler for cyclical events, removing scheduling overhead from repetition.
scheduler
    var
        time = 0
        tick_lag
        cur_tick
        next_tick = 0
        default_invoke
        list/handles = list() //planned optimization #2: flatten all 5 lists into a single structure. Should result in a dramatic speed boost.
        list/scheduled = list()
        list/invoking = list()
        list/schedule = list()
        list/parameters = list()
    proc
        Schedule(datum/ref, delay, invoke, params, handle)
            if(!handle)
                handle = "\ref[ref]:[time+delay]"
            var/sched = time + delay
            if(delay < 0 || (!cur_tick && sched < next_tick))
                // already due: invoke immediately instead of queueing
                call(ref, invoke || default_invoke)(arglist(params))
            else
                var/pos = binarySeek(schedule, sched)
                handles.Insert(pos, handle)
                scheduled.Insert(pos, ref)
                invoking.Insert(pos, invoke)
                schedule.Insert(pos, sched)
                parameters.Insert(pos, params)
            return handle
        Cancel(datum/handle)
            var/pos, epos
            if(istext(handle))
                pos = handles.Find(handle) //planned optimization #3: circularize scheduler references to speed explicit deletion and patch holes.
            else if(isnum(handle))
                pos = handle
            else if(istype(handle)) //planned optimization #3: should be able to pop the specific index if done properly.
                pos = scheduled.Find(handle)
            if(!pos)
                return 0
            epos = pos + 1
            handles.Cut(pos, epos)
            scheduled.Cut(pos, epos)
            invoking.Cut(pos, epos)
            schedule.Cut(pos, epos)
            parameters.Cut(pos, epos)
            return 1
        Reschedule(datum/handle, delay, params)
            var/pos, epos
            if(istext(handle))
                pos = handles.Find(handle)
            else if(isnum(handle))
                pos = handle
            else if(istype(handle))
                pos = scheduled.Find(handle)
            if(!pos)
                return 0
            epos = pos + 1
            handle = handles[pos]
            var/ref = scheduled[pos]
            var/invoke = invoking[pos]
            if(!params) params = parameters[pos]
            handles.Cut(pos, epos)
            scheduled.Cut(pos, epos)
            invoking.Cut(pos, epos)
            schedule.Cut(pos, epos)
            parameters.Cut(pos, epos)
            var/sched = time + delay
            if(delay < 0 || (!cur_tick && sched < next_tick))
                call(ref, invoke || default_invoke)(arglist(params))
            else
                pos = binarySeek(schedule, sched)
                handles.Insert(pos, handle)
                scheduled.Insert(pos, ref)
                invoking.Insert(pos, invoke)
                schedule.Insert(pos, sched)
                parameters.Insert(pos, params)
            return 1
        Tick(time)
            cur_tick = TRUE
            src.time = time // keep the scheduler's clock in sync with the caller, so Schedule()'s sort keys stay correct
            next_tick = time + (tick_lag || world.tick_lag)
            var/datum/ref, invoke, pos, params
            while((pos = schedule.len))
                if(schedule[pos] >= next_tick)
                    break
                // soonest events sit at the end of the lists, so popping them is cheap
                ref = scheduled[pos]
                invoke = invoking[pos] || default_invoke
                params = parameters[pos]
                --handles.len
                --scheduled.len
                --invoking.len
                --schedule.len
                --parameters.len
                call(ref, invoke)(arglist(params))
            cur_tick = null // clear the in-tick flag so Schedule() can invoke already-due events immediately again
proc
    binarySeek(list/l, value)
        // returns the 1-based position at which value should be inserted to
        // keep l sorted in descending order (largest first, soonest last)
        var/lo = 1
        var/hi = length(l) + 1
        while(lo < hi)
            var/mid = round((lo + hi) / 2)
            if(l[mid] > value)
                lo = mid + 1
            else
                hi = mid
        return lo
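For reference, the insertion-point search the scheduler needs can be modeled in Python (the real thing is DM softcode; `binary_seek_desc` is an illustrative name): a binary search over a descending list, returning DM-style 1-based positions.

```python
def binary_seek_desc(lst, value):
    """Return the 1-based position at which `value` should be inserted to
    keep `lst` sorted in descending order (largest first, soonest last)."""
    lo, hi = 0, len(lst)          # 0-based half-open search range
    while lo < hi:
        mid = (lo + hi) // 2
        if lst[mid] > value:      # still among larger values: search right
            lo = mid + 1
        else:
            hi = mid
    return lo + 1                 # convert to DM's 1-based indexing

sched = [9, 7, 5, 3, 1]           # latest events first, soonest last
assert binary_seek_desc(sched, 6) == 3   # lands between 7 and 5
assert binary_seek_desc(sched, 10) == 1  # new latest event goes first
assert binary_seek_desc(sched, 0) == 6   # new soonest event goes last
```

Because the list is descending, "greater than" sends the search right rather than left, which is the only real difference from a textbook lower-bound search.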
The scheduler datum is intended to function similarly to spawn(). It stores all the information it needs to schedule callbacks. The basic idea is that when an event is scheduled, its parameters are stored sorted by invocation time, using a binary search to find the insertion point. Everything is stored backward to increase processing speed, as it's significantly faster to pop items from the end of a list than to cut them from the beginning. This means the soonest events are at the end of the list, and the latest events are at the beginning.
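To make the backward-storage idea concrete, here is a toy Python model of the same design (hypothetical class and method names, not part of the DM code): parallel lists kept sorted descending by time, with due events popped cheaply off the tail each tick.

```python
class MiniScheduler:
    """Toy model of the parallel-list scheduler: events are kept sorted by
    invocation time in *descending* order so that due events can be popped
    from the end instead of cut from the front."""
    def __init__(self):
        self.schedule = []    # invocation times, largest (latest) first
        self.callbacks = []   # parallel list of callables

    def add(self, when, callback):
        # binary search for the insertion point in the descending list
        lo, hi = 0, len(self.schedule)
        while lo < hi:
            mid = (lo + hi) // 2
            if self.schedule[mid] > when:
                lo = mid + 1
            else:
                hi = mid
        self.schedule.insert(lo, when)
        self.callbacks.insert(lo, callback)

    def tick(self, now):
        # soonest events live at the tail: pop until the tail is in the future
        while self.schedule and self.schedule[-1] <= now:
            self.schedule.pop()
            self.callbacks.pop()()

fired = []
s = MiniScheduler()
s.add(5, lambda: fired.append(5))
s.add(1, lambda: fired.append(1))
s.add(3, lambda: fired.append(3))
s.tick(3)
assert fired == [1, 3]       # due events fire in time order
assert s.schedule == [5]     # the future event stays queued
```

Popping from the tail never moves the other elements, which is exactly the property the backward ordering buys.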
The below functions are the tests that I used for benchmarking:
mob
    var
        tested = 0
    verb
        test_scheduler()
            var/scheduler/s = new()
            s.default_invoke = "schedtest"
            tested = 0
            for(var/count in 1 to 100000)
                s.Schedule(src, rand()*10)
            var/etime = world.time + 10
            while(world.time <= etime)
                s.Tick(world.time)
                sleep(world.tick_lag)
        test_spawn()
            tested = 0
            for(var/count in 1 to 100000)
                spawn(rand()*10)
                    ++tested
        test_waitfor()
            tested = 0
            for(var/count in 1 to 100000)
                waitfortest()
    proc
        schedtest()
            ++tested
        spawntest()
            ++tested
        waitfortest()
            set waitfor = 0
            sleep(rand()*10)
            ++tested
Results indicate that the scheduler is averaging 11 seconds to schedule and run 100,000 callbacks, while spawn() and waitfor are taking roughly 52.9 seconds each.
The difference between the waitfor and spawn functions in the profiler is negligible; the waitfor test just shows much more clearly than spawn() where the specific CPU occupancy comes from.
![](https://i.imgur.com/pbSh8kk.png)
It should be noted that my scheduler outperforms spawn in every single configuration even though its logic is written in interpreted softcode. I would normally expect the built-in solution to absolutely shit on an equivalent softcode implementation. By the same token, a built-in implementation of my algorithm should offer even more dramatic performance improvements.
My algorithm is also hugely unoptimized at the moment: I've only done one of the five major optimizations I outlined after initially writing it.
As for benefit: spawn()/sleep() is a massively overused pattern in DM. Spires, every shitty anime game, and SS13 in particular pepper it everywhere. This performance increase would be a dramatic global benefit to every single project using the language.