May 23 2016, 2:56 pm
|
|
If there's no name overlap, then I think combining your turfs into one file could be a big help. (Although it'd probably make the big file a pain to edit in DM.)
|
I'm wondering if preloading resources is working properly, if this behaviour sounds off: The first few uses of a spell tend to result in an FPS drop, but future uses seem better. Our testers report this consistently too.
Using the spell 'Leaping Strike':
First use, 290ms: http://puu.sh/p4NkT/df80aec62e.png http://puu.sh/p4NvV/5af5c66e78.png
Second use, 108.5ms: http://puu.sh/p4NqF/047eab62c9.png http://puu.sh/p4NDR/fc903bf1c2.png
I'll provide a more thorough test case of this later tonight. |
So I figured I'd give this a look on the new stable 510.
https://tgstation13.org/msoshit/ CPU-20160525b510.1345ss13.cpuprofile
I put a local test /tg/station13 on the web client. During this I had an object follow me around with 77 smaller objects orbiting it (using loc setting on moves and a static 36-step animate loop of turn() on an already shift()ed transform). It was the only client-side expensive thing I could think to do, and the number of orbiting mini objects is normally not this high. Overall, the orbiting ball of energy and lightning bolts and the 77 mini orbiting balls of energy and lightning bolts worked rather well. There was, however, a bit of lag stutter about every second or second and a half. The only expensive and choppy thing was moving, mainly from all the objects we tend to have on screen.
NOTE: /tg/station isn't using the webclient on the live servers yet; we need the verb bug fixed (and some other bugs that I haven't spent the time to pin down enough to know they aren't something in our code) because it exposes game state info that is supposed to be private. So don't spend too much time/effort on resolving anything in that profile. I just figured I'd add another point of data to the plot that might expose some easy targets. |
In response to Pixel Realms
|
|
Pixel Realms wrote:
... Reduced the spell's cooldown to 0 and took a live profile while spamming it.
#1, 280ms: http://puu.sh/p5yWE/84c1a37fc2.png
#2, 14ms: http://puu.sh/p5z00/012748282e.png
#3, 18ms: http://puu.sh/p5z0P/68ecb6e105.png
The following 10+ uses are similar: under 20ms. The initial spike is caused by these two functions, which are absent from the following uses: http://puu.sh/p5z48/4b22ae878b.png So hopefully something can be done there. |
For a snapshot of performance during combat, using a variety of spells: http://puu.sh/p5zwv/f71496b273.png
Those dark orange spots are long frames of 60ms or more, with the average being 100ms or so. They're mostly caused by 'Image Decode' and 'getImageData'. |
Again though, the .png snapshots of a live profile aren't a lot of help. I need a real saved profile to work with, so I can study it in depth.
The getImageData() calls are from the mouse foo, and that's not fixable until Chrome fixes the bug where it holds onto getImageData() memory inappropriately--at which time I can finally implement alpha preloading. (Although there is one angle I can come at this from: Maybe it's calling getImageData() repeatedly on the same icon, in which case that's probably something I can handle with a short-term cache.) texImage2D() and Image Decode are internal stuff. The former is a call used by WebGL and the latter is strictly browser-internal. [edit] I'm wondering if it's feasible to force the icon to "display" in a hidden canvas on load, which would front-load all the decoding and texImage2D calls. That might be a long shot because I could see all kinds of ways that could go bad, but it's something I can at least try. |
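As a rough illustration of the short-term cache idea mentioned above, here is a minimal TypeScript sketch. It is not the webclient's actual code: the class name `ShortTermCache`, the string key (imagined as an icon identifier), and the time-to-live value are all assumptions made for the example. The `compute` callback stands in for an expensive read such as a getImageData() call, and the clock is injectable so the behaviour can be checked without a browser.

```typescript
// Hypothetical short-term cache. Names and the TTL are illustrative only.
class ShortTermCache<V> {
  private entries = new Map<string, { value: V; expires: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Return the cached value for `key`, or compute and store it if the
  // entry is missing or has expired.
  get(key: string, compute: () => V): V {
    const t = this.now();
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expires > t) {
      return hit.value;
    }
    const value = compute();
    this.entries.set(key, { value, expires: t + this.ttlMs });
    return value;
  }

  // Drop expired entries so cached buffers are not held longer than needed.
  sweep(): void {
    const t = this.now();
    this.entries.forEach((entry, key) => {
      if (entry.expires <= t) this.entries.delete(key);
    });
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Keeping the lifetime short is the point of the design: repeated reads of the same icon within a brief window reuse one result, while `sweep()` lets the data go again soon afterwards rather than pinning it in memory.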
Sure thing: http://puu.sh/p5SIi/aa7647607a
|
In response to Pixel Realms
|
|
Pixel Realms wrote:
For a snapshot of performance during combat, using a variety of spells: http://puu.sh/p5zwv/f71496b273.png
Check out the difference as of 510.1346: http://puu.sh/puE7c/de5357b55b.png
Here's a profile of moving around/using spells: http://puu.sh/puEeY/476306c058
So it looks like the map tick changes were a massive help. At this point I'm very pleased to say that we're good to test publicly with the webclient and see how performance is for everyone else. |
Have been a little absent lately but Severed is being worked on again now and we're hoping to put up a public server sometime in August.
I'm noticing that the call trees of the worst long frames in the latest BYOND version look pretty similar: http://puu.sh/qiqev/b5b719aec4.png http://puu.sh/qiqf7/0295a9ba97.png http://puu.sh/qiqfR/c314fa1ebb.png http://puu.sh/qiqgi/401eb12067.png http://puu.sh/qiqh2/70f38d5c60.png They all seem to be winset related. Profile: http://puu.sh/qiqrf/09678543e8 |
In response to Pixel Realms
|
|
In that profile info I'm seeing the bulk of the time is taken up not by any of the winset() processing, but by whatever function it's calling; notice the predominance of callMethod(). I believe this is likely to be in your own JS code for one of your custom controls. You might want to compare the functions being called to the ones in your code to see.
At a quick guess I'd surmise jQuery is slowing you down a little, and it's also possible you may be calling winset more often than you need to. If it's something simple like updating a stat bar on the interface or something like that, then I would recommend adding a throttle to your JS code so the meat of it doesn't run so often. |
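For readers unsure what "adding a throttle" might look like, here is a minimal TypeScript sketch. The helper name `throttle`, the 100ms interval, and the stat-bar callback in the usage note are assumptions for illustration, not anything taken from the game's or BYOND's code. The clock is passed in so the helper can be exercised outside a browser.

```typescript
// Hypothetical helper: wraps an update function so its body runs at most
// once per `intervalMs`. Calls that arrive too soon are skipped.
function throttle<A extends unknown[]>(
  fn: (...args: A) => void,
  intervalMs: number,
  now: () => number = () => Date.now()
): (...args: A) => boolean {
  let last = -Infinity; // time of the last call that actually ran
  return (...args: A): boolean => {
    const t = now();
    if (t - last < intervalMs) {
      return false; // too soon since the last run, skip this call
    }
    last = t;
    fn(...args);
    return true;
  };
}
```

A control's handler could then be wrapped once, e.g. `const updateBar = throttle(drawStatBar, 100)` where `drawStatBar` is a hypothetical function holding the expensive DOM work. Note that this simple variant drops skipped calls entirely, so if the final value matters (a health bar settling on its last reading, say), a version that remembers the latest arguments and applies them after the interval would be the safer choice.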
In this particular case, it was some inefficiencies in our code.
The bulk of our performance slowdowns, however, are related to map drawing still. |
Image Decode and texImage2D seem to be the main culprits in all of the bad jank in my latest profiling:
http://puu.sh/qmZai/7f99cfcd0c.png http://puu.sh/qmZb3/89e4feb37b.png http://puu.sh/qmZbC/6f8a0914ae.png Other than the occasional stutter, things are mostly okay on the mediocre machine I'm using to test, so that's cool. The map tick changes helped a bunch and I'm wondering if anything else can be done there; Chrome's FPS meter is giving me all sorts of weird readings despite an overall improvement so I'm suspicious that there are still issues there, but I *think* you clarified otherwise in a pager conversation a while back. |
Image decoding and texImage2D() aren't anything I can make any faster. The decoding process is what it is; I wish it were something I could force it to do ahead of time, though that's a tricky business. texImage2D() is the process of actually sending texture data to the video card via WebGL, and while there may be some internal things on the browser that potentially make it a little slower than a native call, it's probably not by much.
The main issue would be to try and prevent too many of these calls from occurring within the same frame, but I don't know if that's feasible at all; it might be the kind of thing that has to be taken into consideration by the game author, by grouping together as many icons as possible into bigger sprite sheets. |
Ah, that's not great, since it's causing some nasty pauses. If you do figure out a way to speed them up--like spreading the calls out between frames or sending the data out ahead of time, as you say--then it'd definitely be worthwhile to look into. We'll rework our icon sheets and see how that helps, though I think they're already pretty packed, but I'm sure more can be done.
Other than that I'm not really seeing anything on the same level, but any trimming in general you can do based on past profiles etc would help as always. Overall things seem pretty OK and I'm hoping that the majority of our users will be on half-decent machines... |
The problem is I can't control how the calls are made, because they're done when a new icon is drawn. The decoding and texImage2D() calls are inherent to the act of displaying an icon for the first time.
There's maybe, maybe something to be tried in which I could attempt to display icons in a hidden canvas during the load process. I don't know what this would do to memory or if indeed the browser wouldn't simply discard half of these when it was done, rendering the point moot. But it's worth a look anyway. |
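To make the front-loading idea a little more concrete, here is a hedged TypeScript sketch of a warm-up queue that handles a few icons per frame during loading instead of all at once. Everything here is hypothetical: `WarmUpQueue`, the `warm` callback (imagined as whatever would force the decode and upload, such as drawing to a hidden canvas), and the per-frame batch size. Whether a browser would actually keep the decoded data afterwards is exactly the open question raised above, and this sketch does not answer it.

```typescript
// Hypothetical warm-up queue. `warm` stands in for whatever forces an icon
// to be decoded and uploaded; this is not the webclient's actual code.
class WarmUpQueue<T> {
  private pending: T[] = [];
  private warm: (item: T) => void;
  private perFrame: number;

  constructor(warm: (item: T) => void, perFrame: number) {
    this.warm = warm;
    this.perFrame = perFrame;
  }

  // Queue items (e.g. icon identifiers) to be warmed later.
  add(items: T[]): void {
    this.pending.push(...items);
  }

  // Call once per frame: warms at most `perFrame` items and returns how
  // many are still waiting.
  step(): number {
    const batch = this.pending.splice(0, this.perFrame);
    batch.forEach((item) => this.warm(item));
    return this.pending.length;
  }
}
```

In a browser, `step()` could be driven from a requestAnimationFrame callback until it returns 0. Capping the batch size is what keeps any single frame from absorbing all of the decode cost, at the price of a longer loading phase.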
Yeah, it seems like that's the only function causing major stutters, at least on my machine. But any spots that you see where trimming can be done are also valued and will make everything feel smoother, since a lot of frames during play are hitting 40ms+.
The dart stuff, like invokeClosure, appears to be costly, but I don't really know what I'm looking at here. Either way we'll try to get a public server up soon and see how it plays for others. I don't know whether this is helpful or not but here's a sample of the call trees of various 50ms+ frames: http://puu.sh/qpwc2/c803c6731b.png http://puu.sh/qpwd6/faacc89ac0.png http://puu.sh/qpwdG/d4c432a0f9.png http://puu.sh/qpwew/e167f793ed.png http://puu.sh/qpwfu/9a85fb4d3d.png http://puu.sh/qpwgB/f6bdcb9a38.png http://puu.sh/qpwhx/2f83ce0eb9.png http://puu.sh/qpwiB/84108bd0a2.png A lot of them look pretty similar to me so maybe something can be done there. |
In response to Lummox JR
|
|
Lummox JR wrote:
There's maybe, maybe something to be tried in which I could attempt to display icons in a hidden canvas during the load process. I don't know what this would do to memory or if indeed the browser wouldn't simply discard half of these when it was done, rendering the point moot. But it's worth a look anyway.
Just re-emphasising that Image Decode and texImage2D cause the most stutter on the webclient, with frames as long as 400ms, as I assume you've seen in your own profiling. We'll be trying out your suggestion of combining our tiles into larger sprite sheets tonight hopefully, but if there's anything you can do here to speed things up it'd put the webclient in a much better place. |
So our tests with grouping tiles together don't seem to be making much of a difference, but it's hard to quantify. In the starting city, for example, only 4 .dmi's are in play for this map as far as turfs and objects go, but I'm wondering if tiles elsewhere will have an impact on performance? Areas where the player isn't present?
Do we want to group together every single 32x32 tile, for example? This'd be pretty tedious on our end, so I want to make sure it would make a worthwhile difference. Most areas don't have any more than 4 .dmi's of turfs/objects, but I'm wondering if texture changes include every single .dmi rather than the ones onscreen. |