Current state:
When del is called on an object, the VM iterates everything that could be holding a reference to that object, and sets any references it finds to null. This means that the more objects there are in the world, and the more properties those objects have, the longer it takes to forcibly delete an object. This is a known problem, as described in http://www.byond.com/docs/ref/info.html#/proc/del.
Aside from attaching a reverse engineering tool like IDA to Dream Daemon (no thanks), or iterating the entire world (also no thanks), there is no way to inspect the references being held to an object. This makes it very difficult to find and fix circular or hanging references. This is the biggest performance issue in some of the SS13 codebases, and one that we cannot build userland tools to solve.
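For a sense of what "iterating the entire world" means in userland, here is a minimal sketch. The proc name find_references_slow is made up for illustration. It only looks at atoms in the world and one level of list nesting, so it misses plain datums, globals, proc-local variables and deeper nesting, and it is still far too slow to run on a live server.

```dm
// Brute-force userland search: check every var on every atom in the world.
// find_references_slow is a hypothetical helper name, not an existing proc.
/proc/find_references_slow(datum/target)
	var/list/found = list()
	for(var/atom/holder in world)
		for(var/varname in holder.vars)
			var/value = holder.vars[varname]
			if(value == target)
				// Direct reference held in a var.
				found += list(list(holder, varname))
			else if(istype(value, /list))
				// Reference held inside a list var (top level only).
				var/list/L = value
				if(target in L)
					found += list(list(holder, varname))
	return found
```

The cost grows with the number of atoms times the number of vars on each, which is the same scaling problem that makes del slow in the first place.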
Ideal state:
DM has a references(obj) proc.
    args:
        obj: an object
    returns:
        if obj is an object:
            a list(list(
                the object referencing the target,
                the string property name where the target is referenced
            ))
        if obj is a primitive type or null:
            an empty list
So that's a proc that returns a list of tuples showing everything that's holding the given object and how it's being held.
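As a rough illustration of how the return value might be consumed, assuming references() existed with exactly the shape described above. It does not exist in DM today, so this will not compile as-is, and log_references is a made-up helper name.

```dm
// Hypothetical: references() is the proc proposed in this post.
// It is NOT part of DM today.
/proc/log_references(datum/target)
	for(var/entry in references(target))
		// Each entry is assumed to be list(holder, "varname").
		var/list/pair = entry
		var/holder = pair[1]
		var/varname = pair[2]
		world.log << "[target.type] is held by \ref[holder] in var '[varname]'"
```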
We can use this proc to:
- Proactively trace hard deletions while load testing the world locally. This will let us easily find the worst culprits of things holding references to objects after we need them to go away.
- Defensively find hard deletion issues in "production" by tracing hard deletions all the time at a low sample rate, like 1/1000. This will let us catch new issues that arise with hard deletions and maintain good performance.
- Make more objects safe to pool. We use object pools in the game to reduce the overhead of allocating small spammy data objects, like signals between machines. We can use this new function at development time to find all the references to these objects and ensure they're cleaned up appropriately before we enable object pooling for them.
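The low-sample-rate tracing idea could look something like the sketch below. The hard_delete wrapper, the HARD_DEL_SAMPLE define and the log format are all assumptions for illustration, and references() is the proposed proc, which does not exist in DM today.

```dm
// Hypothetical sketch: references() is the proc proposed in this post and
// does not exist in DM today. hard_delete() is a made-up wrapper name.
#define HARD_DEL_SAMPLE 1000 // trace roughly 1 in 1000 hard deletions

/proc/hard_delete(datum/target)
	if(!target)
		return
	// rand(1, N) returns an integer from 1 to N, so this fires about 1/N of the time.
	if(rand(1, HARD_DEL_SAMPLE) == 1)
		for(var/entry in references(target))
			var/list/pair = entry
			var/holder = pair[1]
			var/varname = pair[2]
			world.log << "hard del [target.type]: held by \ref[holder] in var '[varname]'"
	del(target)
```

Sampling keeps the tracing cost negligible on a live server while still surfacing the types that get hard deleted most often.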
Why:
In the Goonstation SS13 codebase, we have approximately 16,000 object types. Many of the other SS13 codebases are in similar situations.
The vast number of object types, the large number of contributors, and the lack of strongly enforced static types make it difficult to ensure that new and modified code properly cleans up references to unneeded objects.
Given the large codebase and large number of objects alive in the world during the game, hard deletions take a very long time. The game has tons of great content that we're happy with, so simply killing code and reducing the amount of content is not a viable option (although we've done a fair amount of code size reduction over the years).
Things we've tried:
- Delete queueing: Our delete queue gives objects a chance to be garbage collected. We remove an object from the queue, keep only a text ref to it, sleep, and then try to `locate` it to see whether it was GCed. This lets us avoid forcibly deleting quite a lot of objects, but there are still tons of objects with hanging references that don't get GCed.
- Code cleanup: Because the codebase is large and the type system is very loose (it doesn't allow static analysis like 'find all things that could hold this thing'), this is a manual process requiring extremely high effort and providing generally low impact. I could spend a week trying to clean these things up by reading the code and finding all the references to a thing manually, but it wouldn't be very effective work, because the development environment gives me no assistance in finding the biggest of these deletion problems. I could write a thing to iterate every property on every object in the world, but that would be incredibly slow to run in userland code.
- Time profiling: We did some profiling to see what objects are slow to forcibly delete, and that gave us a few clues, but it's painful to do this. Being able to enumerate references directly would make solving these problems vastly simpler.
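For readers unfamiliar with the delete queue approach, a simplified sketch of the idea follows. This is not the actual Goonstation implementation. The names delete_queue, queue_delete, process_delete_queue, queued_for_delete and GC_GRACE_TIME are all made up for illustration, and this version stores the text ref at queue time rather than when the entry is dequeued.

```dm
#define GC_GRACE_TIME 50 // deciseconds to wait before checking (assumed value)

// Marker so a new object that happens to reuse an old ref is not deleted by mistake.
/datum/var/tmp/queued_for_delete = 0

// Holds text refs only, so the queue itself does not keep objects alive.
var/list/delete_queue = list()

/proc/queue_delete(datum/target)
	if(!target || target.queued_for_delete)
		return
	target.queued_for_delete = 1
	delete_queue += "\ref[target]"

/proc/process_delete_queue()
	while(1)
		var/list/batch = delete_queue
		delete_queue = list()
		// Give the garbage collector time to reclaim unreferenced objects.
		sleep(GC_GRACE_TIME)
		for(var/textref in batch)
			var/datum/survivor = locate(textref)
			// Still locatable and still marked: something holds a reference.
			if(survivor && survivor.queued_for_delete)
				del(survivor)
```

The loop would be started once in the background, for example with spawn() process_delete_queue() from world/New(). Every object that reaches the del call is one that a reference-listing proc could help diagnose.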