ID:2733495
 
Resolved
When movable atoms were continually moving, entries built up in a client-side list of movers, causing major progressive slowdowns even with just a few objects. Normally this was not visible because most objects stop moving at some point. This issue is very old and probably had a small but occasionally noticeable impact on many games.
BYOND Version: 514
Operating System: Windows 10 Home 64-bit
Web Browser: Chrome 94.0.4606.81
Applies to: Dream Seeker
Status: Resolved (514.1570)

Descriptive Problem Summary:
When the client has atoms in view that are constantly moving, it runs progressively slower until those atoms leave view or stop moving (and possibly in other situations).

The test case below will usually start to exhibit signs of lag around 100 or 150 objs, but it can happen with fewer, and it can fail to show up with more. Sometimes it also gets bad enough that it somehow fixes itself temporarily before starting to lag again.

The effect may differ depending on whether you are running it locally or as a hosted session with Dream Daemon.


Numbered Steps to Reproduce Problem:
1. Create (potentially) large amounts of objs
2. Move them around constantly (using walk()? Unsure.)
3. Wait a while. Amount of time varies wildly depending on obj count, computer speed, current rotational axis of Mercury.


Code Snippet (if applicable) to Reproduce Problem:
Source code for test case (fairly small)

Example video:
Discord CDN hotlink in 60fps
YouTube version of video (crappier, 30fps)



Expected Results:
The client does not slow down when there are a static number of (moving) objs.

Actual Results:
Client runs slow.


Does the problem occur:
Every time? Or how often?
- Easily repeatable with a simple test case (see link)

In other games?
- Originally discovered in SS13, replicated in simple test case

In other user accounts?
- No change

On other computers?
- Computer speed magnifies/lessens effects


When does the problem NOT occur?
- Problem goes away upon getting out of view of the moving items, or when the moving items stop moving.


Did the problem NOT occur in any earlier versions? If so, what was the last version that worked? (Visit http://www.byond.com/download/build to download old versions for testing.)
- This has been a bug for several months, but I don't have an exact time.


Workarounds:
- Leaving the problematic area or stopping movement.
I don't believe this is a bug. You're just hitting the ceiling of the renderer. Having hundreds of movables on the screen at once, each moving around, has always taken a toll on DS's renderer. I agree it would be awesome to see some improvements to this limitation, though.

Even a surplus of stationary objects can cause the renderer to chug a little bit. This is something I've struggled with for FEED a fair bit, but ultimately it's manageable ime. I can't think of a case where 100s of movable entities should be all drawn in view of a client at once.
I remember someone saying that this sometimes tends to happen even if it's only a few movables moving for a long time. More importantly, it seems that the client performance decrease gets worse without new objects appearing, which is counter-intuitive.
In response to Kumorii
Kumorii wrote:
I don't believe this is a bug. You're just hitting the ceiling of the renderer.

Note that we originally discovered this because standing on a conveyor belt circle for too long would tank your FPS. How long it takes to tank also varies per player. Large numbers of objects just make it happen faster.
In response to Kumorii
Kumorii wrote:
I don't believe this is a bug. You're just hitting the ceiling of the renderer. Having hundreds of movables on the screen at once, each moving around, has always taken a toll on DS's renderer. I agree it would be awesome to see some improvements to this limitation, though.

Even a surplus of stationary objects can cause the renderer to chug a little bit. This is something I've struggled with for FEED a fair bit, but ultimately it's manageable ime. I can't think of a case where 100s of movable entities should be all drawn in view of a client at once.


Thanks for making me take another look back at this; it's clearly not a perf issue from simply having "too many moving objects", because you can easily trigger it with even just 20 (and probably even fewer).

I was making the mistake of lowering the conveyor belt speed to the minimum 0.025, under the impression that more movement = more lag. But it looks like 0.075 or 0.1 are the prime spots; with even just 20 objs the client is now getting extremely chunky performance.
In response to Kumorii
Double post, but here's a short 30 second video showing the lag with 21 objects and a conveyor speed of 0.075. At the end of the video, you can see that the game suddenly starts performing better when the objects are allowed to go offscreen for a moment.


E: Ah, no <video> tag support...
If your theory was correct, the same number of moving objects would cause this issue if they did not overlap into the same tiles at any point during their movement cycle, yes?

You should test it that way.

The issue is not a BYOND bug. This particular false positive is caused by a very understandable, but incorrect reckoning of the workload this particular situation puts on the server.

The real problem is that since all of these objects are overlapping, and moving one after the other, you wind up with an insane number of Cross()/Uncross() calls under the hood that wind up being a major performance issue.

In reality, the engine can handle tens of thousands of moving objects. The problem is when they start to get stacked up on top of one another. The more clustered movable activity is in the engine, the worse it handles it.

On the other hand, you may have found a new wrinkle to the problem if what you say is true, and this can be triggered by just 20 objects with minimal overlap.
In response to Ter13
there's almost definitely something wrong on the engine's end here. i've played around with this for an hour or so, testing out different ways to handle the movement and can't get it to go past 600+ objects for more than a minute or so before tanking

i just tested 30 objects with no overlap at all and it still has the problem. profiling the client shows that GetMapIcons usage continues to grow over time as if there were some sort of "leak"
Ter13 wrote:
The issue is not a BYOND bug. This particular false positive is caused by a very understandable, but incorrect reckoning of the workload this particular situation puts on the server.

The real problem is that since all of these objects are overlapping, and moving one after the other, you wind up with an insane number of Cross()/Uncross() calls under the hood that wind up being a major performance issue.

In reality, the engine can handle tens of thousands of moving objects. The problem is when they start to get stacked up on top of one another. The more clustered movable activity is in the engine, the worse it handles it.

On the other hand, you may have found a new wrinkle to the problem if what you say is true, and this can be triggered by just 20 objects with minimal overlap.

Are you always this condescending or did I just catch you at a bad time? The amount of arrogance here is incredible. "If what you say is true" -- did you even bother looking at the video, or running the test case, or indeed investigating anything before you ran in here deciding that this was clearly not a bug at all, but a limitation of the engine because poor little BYOND can't handle 200 circles with no code moving around?

It's incredibly frustrating to narrow down a reproducible test case, provide some information on what's going on, and even demonstrate various different ways that this issue plays out, only for you to come in here and decide that it's not actually a bug but a completely unrelated non-issue, without even (apparently) looking at anything I posted.

Like, seriously. "If what you say is true" -- I posted a god damn test case. You can download it. You can run it. You can see the exact same problems that I'm reporting. It takes maybe three minutes at most because all you have to do is extract it, click the DME, and then hit "Compile + Run". The least you could do is fucking try it before you treat me like I'm clueless.




Anyway, I recorded a new video showing that this occurs with even five objects, it just takes a decent while longer for it to manifest. This is also running as a local instance and not a server, but you can see that the CPU % barely flickers to even 1% during this even though the game is visibly slowing down.



And here is the updated test case (6 KB). It is configured exactly as I have it in the video and you can reproduce the results by turning on the spawner and waiting.

The video includes another demonstration of 200+ objects and shows the CPU% meter functioning and the slowdown becoming far more severe. Also demonstrates that making the movement faster seems to paradoxically cause less slowdown!


E: Music not included, of course :)


E2:

i just tested 30 objects with no overlap at all and it still has the problem. profiling the client shows that GetMapIcons usage continues to grow over time as if there were some sort of "leak"

I've gotten it to run really poorly before 'snapping' back to full speed for a bit before it slows down again. It sounds like a leak is plausible, with the GC running causing it to speed up again until the leak starts to manifest. Nice catch!
In response to Zamujasa
I think you took a lot of offense from something that didn't have any hostility behind it. I digress though.


but a limitation of the engine because poor little BYOND can't handle 200 circles with no code moving around?

BYOND does not handle hundreds of entities moving around on a single screen at once very well at all. It never has and controlling the amount of objects and mobs you have in view of a client is one of the headaches with developing games on the engine. Not that I think this is explicitly the issue you're encountering.


I have gotten the test case to produce the described performance issues at ~150 spawned entities. However, I'd be willing to argue that it could be to do with your code. This test case is littered with pretty poor practices. You're abusing New() with loops a ton, not using any ..() calls, and I can't help but get the feeling that using walk() for your conveyors is a big part of the performance tanking. The walk() procs are pretty terrible and I often avoid them at every possible turn because just a few simultaneous walk() calls absolutely tank performance. Also, walk() doesn't stop until told to stop. I don't see where you're telling your things to stop walking unless the conveyors get disabled. I think step() would be much better in this case.


After some fiddling around, i'm tempted to say the issue is with using walk() and calling it every single tick; often leading to multiple walk() calls being called on a thing before it moves out of the conveyor space.

In response to Kumorii
E: I'm not sure why this first part is showing up in bold. BYOND.


Kumorii wrote:
I think you took a lot of offense from something that didn't have any hostility behind it. I digress though.

It might not have been hostile, but it was arrogant to the point of farce. I outlined exactly why I was unhappy with that reply, and pretty much everyone that I've shown this who develops BYOND things agrees that reply was remarkably shitty and dismissive. But I digress, because it happened again:


but a limitation of the engine because poor little BYOND can't handle 200 circles with no code moving around?

BYOND does not handle hundreds of entities moving around on a single screen at once very well at all. It never has and controlling the amount of objects and mobs you have in view of a client is one of the headaches with developing games on the engine. Not that I think this is explicitly the issue you're encountering.

I'm not sure why you are all so quick to say that BYOND's engine is garbage. This is really confusing to me, because while BYOND definitely has its share of performance woes, "moving around 5 objects" is not something I would consider to be difficult (again: see the above video.) Even at 300 objects spinning around, BYOND is barely reporting 15% cpu usage. This is a client perf issue, not a BYOND game engine issue.




Here. I have attached yet another video because two wasn't enough.

This video has two clients with the code running in Dream Daemon. You can VERY CLEARLY!!!!! see that the engine itself is fine. The game is running normally. The client is choking to death on this. The client (renderer) is incapable of handling it.



You can see that the one client that keeps only a small part of the conveyor belt in view runs fine (mostly), while the one with the full thing in view starts to die and in fact actually became completely unresponsive during recording (it got better). If this doesn't prove that it's a client-side renderer issue then I'm not sure I could convince you the sky is blue either.



I have gotten the test case to produce the described performance issues at ~150 spawned entities. However, I'd be willing to argue that it could be to do with your code.

(sigh)

This test case is littered with pretty poor practices.

Yes, it's a test case, developed to do as little as possible beyond causing the bug.

You're abusing New() with loops a ton, not using any ..() calls,

If you can demonstrate that any of this matters I will eat a corn dog.




and I can't help but get the feeling that using walk() for your conveyors is a big part of the performance tanking. The walk() procs are pretty terrible

well boy howdy is it a good thing we have a bug report here about how bad they are so they can be fixed, huh!!!!!!!!!!!!!!!!

and I often avoid them at every possible turn because just a few simultaneous walk() calls absolutely tank performance. Also, walk() doesn't stop until told to stop. I don't see where you're telling your things to stop walking unless the conveyors get disabled. I think step() would be much better in this case.

After some fiddling around, i'm tempted to say the issue is with using walk() and calling it every single tick; often leading to multiple walk() calls being called on a thing before it moves out of the conveyor space.

Instead of peanut posting, I actually changed the walk() calls to use step() instead.



walk() says "just keep moving in this direction until you're told to do otherwise, at a set speed". That's exactly what I want a conveyor belt to do. To use step(), I now need to consider when something entered the space, how fast it was moving (because as you can see, the change is instant -- things are moved multiple times in one tick because they hit the conveyors as they're processed) and when something exits it. It involves a lot more overhead that is immensely simplified by using walk().
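For readers comparing the two approaches, here is a rough sketch of the difference described above. It is not taken from the linked test case: the turf types, the `move_lag` var, and `run_belt()` are hypothetical names chosen for illustration. Only `walk()`, `step()`, `Entered()`, `sleep()`, and `dir` are built-in.

```dm
// Sketch only: the turf types, move_lag and run_belt() are hypothetical.

// Variant A: walk(). One call starts continuous movement.
/turf/conveyor_walk
	var/move_lag = 2	// delay between moves, in 1/10 s

/turf/conveyor_walk/Entered(atom/movable/AM)
	..()
	// Keeps moving AM in this belt's direction until a later
	// walk() call replaces or cancels it.
	walk(AM, dir, move_lag)

// A plain floor cancels the walk, since walk() never stops on its own.
/turf/floor/Entered(atom/movable/AM)
	..()
	walk(AM, 0)

// Variant B: step(). Each call moves once, so the belt runs its own loop.
/turf/conveyor_step
	var/move_lag = 2

/turf/conveyor_step/New()
	..()
	spawn()
		run_belt()

/turf/conveyor_step/proc/run_belt()
	while(1)
		for(var/atom/movable/AM in src)
			step(AM, dir)
		sleep(move_lag)
```

Variant B is where the extra bookkeeping comes in: every belt tile has its own loop, and nothing here stops an object from being stepped by several belts within the same tick.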
For being upset about perceiving others as being shitty and dismissive; you're being awfully dismissive and shitty. But i digress.


Best of luck with figuring out whatever issue you're running into here. Cheers.
Note: this issue only occurs when `world.movement_mode` is set to `LEGACY_MOVEMENT_MODE`
Kumorii wrote:
For being upset about perceiving others as being shitty and dismissive; you're being awfully dismissive and shitty. But i digress.

I mean, if y'all had shown even slightly that you looked into the problem and its causes beyond "I think it's this, and I'm going to disregard everything you've said that provides evidence to the contrary", things would be a lot different. If one of you had said "I tried using step() but I'm not seeing the problem, it might be walk()" that's hugely different than "your code in this very limited test case is awful and also byond is just incapable of handling so many objects"


Tarmunora wrote:
Note: this issue only occurs when `world.movement_mode` is set to `LEGACY_MOVEMENT_MODE`

Tarm, myself, and some others have been looking into this on the Goonstation discord and this is one of the things that Tarm suggested that turned out to have a big impact. Though the visuals look quite different and some other oddities start to happen if the numbers go high enough, it does solve the slowdown.

Here is the world with world.movement_mode set to TILE_MOVEMENT_MODE.

- The slowdown bug no longer seems to happen.
- Visuals will glitch out and stutter, especially with higher object counts.
- The visually distinct conveyor path turns into a sort of limp noodle.
- Occasionally, objects will explode outwards to their actual(?) positions.
- Later on, the game will suddenly freeze for a moment before flinging many objects around, often completely off the belts.
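For anyone wanting to try the same switch, it should be a single compile-time world setting. This is a minimal sketch; `movement_mode` and the `*_MOVEMENT_MODE` constants are built-in as of BYOND 514, and as far as I know `LEGACY_MOVEMENT_MODE` is what you get when nothing is set.

```dm
// Opt out of the legacy movement mode for the whole world.
world
	movement_mode = TILE_MOVEMENT_MODE
```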





E: I also added a simple obj to block visibility with opacity = 1 and it does not affect the slowdown issue. You have to physically move out of range of the problem; even completely hidden behind opaque objects it still occurs.
In response to Zamujasa
the path "noodling" is a glide_size issue, if you set the glide_size to 32/(move_lag/world.tick_lag) it'll stop cutting corners. it's running smooth for me with 700+ objects when in TILE_MOVEMENT_MODE
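The glide_size formula above can be sketched as a small helper. The proc name `match_glide()` and the `move_lag` argument are hypothetical; 32 is the tile size in pixels assumed in the post, and `move_lag` must be greater than zero.

```dm
// Hypothetical helper: make one glide last exactly as long as one move.
/atom/movable/proc/match_glide(move_lag)
	// world.tick_lag is the length of one tick in 1/10 s, so
	// move_lag / world.tick_lag is the number of ticks per move.
	// Dividing the 32 px tile by that gives pixels to glide per tick.
	glide_size = 32 / (move_lag / world.tick_lag)
```

As a worked example, with world.fps at 20 the tick_lag is 0.5, so a move_lag of 2 means 4 ticks per move and a glide_size of 32 / 4 = 8 pixels per tick.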



while not seen in the gif i am seeing some objects fly off the belt, but that's definitely unrelated to the bug report
(edit: i changed to step() from walk() and the objects no longer fly away)

this bug report is something wrong with LEGACY_MOVEMENT_MODE
Video demo of the issue occurring over the course of an hour with 1 object

https://www.youtube.com/watch?v=-z0G2ZmJsZ8
In response to RootAbyss
RootAbyss wrote:
the path "noodling" is a glide_size issue, if you set the glide_size to 32/(move_lag/world.tick_lag) it'll stop cutting corners. it's running smooth for me with 700+ objects when in TILE_MOVEMENT_MODE

while not seen in the gif i am seeing some objects fly off the belt, but that's definitely unrelated to the bug report
(edit: i changed to step() from walk() and the objects no longer fly away)

this bug report is something wrong with LEGACY_MOVEMENT_MODE

sweet catch! i think the things flying off might be related to multiple steps being taken before a loop procs, so it 'steps over' the conveyor that would redirect it. step would fix this since it'd only ever move once.

using step() this way, do you get the same jumping behavior where things will visually skip over segments like in the GIF i posted earlier? i'm not sure how to solve that one without additional tracking to make an object not get moved multiple times in a single tick.

E: This GIF in specific. I changed walk() to step() with no other changes.
In response to Zamujasa
yea, to fix the skip issue i added a move delay variable that just tracks when the next step should be allowed to occur and added that as a condition for the continue
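A move delay check like the one described above might look roughly like this. It is a guess at the shape of the fix, not the poster's actual code: `next_belt_move` and `belt_step()` are hypothetical names, while `world.time` and `step()` are built-in.

```dm
/atom/movable
	// Hypothetical var: earliest world.time at which a belt may move this again.
	var/next_belt_move = 0

// Hypothetical helper: step AM once unless it was already moved too recently.
// Returns 0 when the move is skipped, otherwise whatever step() returns.
/proc/belt_step(atom/movable/AM, dir, move_lag)
	if(world.time < AM.next_belt_move)
		return 0
	AM.next_belt_move = world.time + move_lag
	return step(AM, dir)
```

Inside a conveyor loop the same condition can sit in front of a `continue` instead, which is how the post describes it. Either way an object that has just been moved onto the next belt is left alone until its delay has passed.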
No offense was intended, but it was taken, and ultimately I have a part in that. I'll bow out, because there are new changes to the implementation that I haven't been able to fully test that could be impacting this report, and you are ultimately correct that my approach was lacking grace or tact.

In my defense, this is a situation that was extensively tested back before the 500 series, and the engine never performed well. At the end of it all, it was determined that the impact, given the choices of the implementation, was reasonable.

The "if what you say is true" bit wasn't meant to be a slight, but a self-aware bit of conversational charity that was meant to undercut the tone of dismissal the above could have had if read without charity.

But ultimately, you are right. I had done very little research on the topic and had no hard metrics, given my current time constraints, and never should have butted in without taking the time to gather them. Apologies for the offense.
In response to Ter13
How dare you get out of this gracefully? Get back in the mud!