Have you guys upped the version of GCC you're compiling with lately? The debug build (-g -O0) I've been given seems to be behaving fine. I'll continue watching the servers, and we'll see.
|
Nope, that's built on the same setup as the release build. However, the debug setting will use a few different parameters for fault checking. I will get you a release build with the -g flag so it will be identical to your previous setup.
|
It's definitely something related to the OS or libraries.
In my tests, an old 32-bit server running Ubuntu 10.04 had no freeze on the same BYOND code, but one on Ubuntu 12.10 64-bit with the ia32 libs for compatibility had the issue. This may or may not be related, but if you run export MALLOC_CHECK_=2 before starting the server, it freezes a lot more often, during a malloc check, in a seemingly infinite loop as well. |
Jey123456 wrote:
It's definitely something related to the OS or libraries.
I ran a 64-bit version of 12.04 with BYOND 498 and 496, and it would run a server for over a month without issue. I only noticed the issue when 499 clients started joining my 498 server. |
Yeah, I never said it was also present in older versions (I never tried them), but it's definitely here in 499 and seems closely related to the libc libraries.
I am currently running a server on a 32-bit OS using the libs provided at http://www.byond.com/download/build/gcc/ . I'll see in a few days whether it happens here (we sometimes get 20 hours before it freezes, so 48 hours should be a good benchmark). |
Updating this thread. I'm debugging with Stephen now and we've solved the Insert() issue, which will be a bug fix in the next build (499.1200). The ProtoStrCompSigned() hang is a separate issue but we're working on that now.
|
Lummox JR wrote:
In particular you should avoid calling skin procs in client/New(); the skin is not initialized at that point.
Are you saying this is not recommended? client/New() Just trying to see if those are considered "skin procs" or not... |
What else could a skin proc be other than win___()?
(also, client/New() has a return value you're not setting) |
Kaiochao wrote:
What else could a skin proc be other than win___()?
I was afraid that's what he was referring to, because without doing this (my actual project handles it much more neatly; this is just a quick example from a test environment) I get a lot of "inaccessible verbs" when crossing a server boundary. The only place in the entire code that works for setting the keys and removing this inaccessible verb problem is before ..() in client/New(). |
You should be doing the winsets in Login(). At that point the skin is definitely initialized. However there's really no reason for those macros not to be in the skin already.
|
I guess the reason is wanting to either load the client's custom keyboard configuration or simply wanting to avoid the bug. If you put those into mob/Login(), you get "inaccessible verb" errors when transitioning through a link() to the same game. They really gotta be there before client/New()'s default code is called in order to eliminate the inaccessible verb bug.
I don't really want to use BYOND's custom skin to allow the user to create his own key commands; I'd rather handle it with code and server-side save files. In fact, I want to use CONTROL_FREAK.
Edit: Never mind, this is not consistent. I've tested this in more complicated client/New() code blocks, and still get inaccessible verb problems. I've had this problem ever since we were able to add macros using winset. |
Well, a little update.
We are past the 24-hour mark since the last freeze, after the change to LD_PRELOAD the libs provided at http://www.byond.com/download/build/gcc/ on Ubuntu 10.04 32-bit. Considering we very rarely reached 20 hours before, that's a good sign. We will see if it holds over the weekend. |
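For anyone wanting to reproduce this setup from a launcher script, here is a minimal sketch of building the environment for the server process. LD_PRELOAD and MALLOC_CHECK_ are the variables mentioned in this thread; the helper name build_env, the library file names, and the launch command in the comment are hypothetical placeholders, not the actual files from the download page.

```python
import os


def build_env(preload_libs, malloc_check=None, base=None):
    """Return a copy of the environment with LD_PRELOAD / MALLOC_CHECK_ set.

    preload_libs: list of shared-object paths (hypothetical here) that the
                  dynamic loader should load before all others.
    malloc_check: optional value for glibc's MALLOC_CHECK_ (the thread used 2).
    base:         environment to copy; defaults to the current process's.
    """
    env = dict(os.environ if base is None else base)
    if preload_libs:
        # The loader accepts a colon-separated list of objects.
        env["LD_PRELOAD"] = ":".join(preload_libs)
    if malloc_check is not None:
        env["MALLOC_CHECK_"] = str(malloc_check)
    return env


# Hypothetical paths, for illustration only.
env = build_env(["/opt/byond-libs/libc.so.6", "/opt/byond-libs/libstdc++.so.6"],
                malloc_check=2, base={})
print(env)

# The result would then be passed to the process launcher, for example:
#   subprocess.Popen(["DreamDaemon", "game.dmb", "5000"], env=env)
# (command line shown only as an assumed example).
```

Returning a copy rather than modifying os.environ keeps the preload confined to the server process, so the launcher itself keeps running against the system libraries.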
Another update. Running on the libc provided by BYOND definitely had an impact. It took quite a while before the bug showed up, and the bug has slightly changed in nature.
It is now freezing in:
#0 0xb72f7eab in DelProtoStr(unsigned long) () from /usr/local/lib/libbyond.so
#1 0xb7311809 in SetSoftObjectVar(Value, unsigned long, Value) () from /usr/local/lib/libbyond.so
#2 0xb7345c64 in ?? () from /usr/local/lib/libbyond.so
#3 0xb7346908 in ?? () from /usr/local/lib/libbyond.so
#4 0xb73504e5 in ObjectWriteVar(Value, unsigned long, Value) () from /usr/local/lib/libbyond.so
#5 0xb7370506 in ?? () from /usr/local/lib/libbyond.so
#6 0xb7364391 in ?? () from /usr/local/lib/libbyond.so
#7 0xb73521ef in ?? () from /usr/local/lib/libbyond.so
#8 0xb73521ef in ?? () from /usr/local/lib/libbyond.so
#9 0xb736d059 in ?? () from /usr/local/lib/libbyond.so
#10 0xb736d505 in TickProc(long) () from /usr/local/lib/libbyond.so
#11 0xb733450a in ?? () from /usr/local/lib/libbyond.so
#12 0xb7403b07 in TimeLib::SystemAlarm() () from /usr/local/lib/libbyond.so
#13 0xb73d4cea in SocketLib::WaitForSocketIO(long, unsigned char) () from /usr/local/lib/libbyond.so
#14 0x0804a3ee in ?? ()
#15 0xb6c14bd6 in __libc_start_main (main=0x8049f80, argc=6, ubp_av=0xbfb616c4, init=0x804bbe0, fini=0x804bbd0, rtld_fini=0xb772b080 <_dl_fini>, stack_end=0xbfb616bc) at libc-start.c:226
#16 0x08049ed1 in ?? ()
It is never able to actually return from DelProtoStr(unsigned long).
We also had a BUG: Bad ref (6:79622) in DecRefCount(DM mob.dm:701) very shortly before. |
Murrawhip wrote:
Did a ProtoStrCompSigned() fix manage to get in 499.1200?
No. The routine isn't actually broken. Nor is the calling routine, ProtoStrSearch(). What's happening is that at some point, the string tree is getting borked. When I last tested with Stephen, the project he was hosting froze up on attempting to add a new string when it found one in the tree that had itself as its own left child. The code says this is impossible. Therefore what has to be happening is that heap corruption is throwing a string out of whack after it's been added to the tree, which could cause all sorts of nasty issues. The only way I know of to catch heap corruption would be to run something like valgrind or a Windows equivalent. Since this is in a Linux environment, valgrind should be an option. |
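To illustrate why a node that is its own left child turns a lookup into a hang, here is a small sketch using a plain binary search tree in Python. This is not BYOND's string tree code; Node, insert, search, and the max_steps guard are all names assumed for the example. It only shows the general failure mode: the walk keeps following the same link and never reaches a null child.

```python
class Node:
    """One entry in a binary search tree of strings."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into the tree and return the (possibly new) root."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key == node.key:
            return root
        side = "left" if key < node.key else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(key))
            return root
        node = child


def search(root, key, max_steps=10_000):
    """Look up key; raise if the walk exceeds max_steps (a sign of a cycle)."""
    node, steps = root, 0
    while node is not None:
        if key == node.key:
            return node
        node = node.left if key < node.key else node.right
        steps += 1
        if steps > max_steps:
            raise RuntimeError("tree walk did not terminate; links are corrupt")
    return None


root = None
for word in ["m", "f", "t", "c"]:
    root = insert(root, word)

assert search(root, "c") is not None  # healthy tree: lookup terminates

# Simulate the corruption described above: "c" becomes its own left child.
corrupt = search(root, "c")
corrupt.left = corrupt

try:
    search(root, "a")  # "a" sorts before "c", so the walk follows left forever
except RuntimeError as err:
    print("detected:", err)
```

Without the step limit, the last lookup would spin forever, which matches the symptom of a server frozen inside a string search or comparison routine. The guard only detects the damage; it says nothing about what overwrote the link in the first place.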
Jey123456 wrote:
We also had a BUG: Bad ref (6:79622) in DecRefCount(DM mob.dm:701)
I think this is highly relevant. Can you show me the code around the area where this bug occurred (if you have access to it)? Whatever is mangling the string tree may have also messed up the refcount on another string, causing it to self-destruct too early. Again though, I strongly recommend attempting to run any such project in valgrind, because heap corruption seems to be the key to the problem. |
The problem with valgrind is that the bug is very rare. It requires 20-25+ players on the server, and even then it only seems to happen once per day at most.
With valgrind, the server is so laggy that it's impossible to actually play on. |
As for the code around mob.dm, it's a pretty simple thing.
http://sebsauvage.net/ e/?3dfc0a6675aeeb6e#Wsudc5P34n467tDxtSZlKJHx73Wtrl66QxNYDrz/ LfE= It happened in another string before the freeze. |
Just had a freeze hosting a simple chat.
Mon Jul 22 05:01:08 2013 BYOND 4.0 Public (Version 499.1197) on Linux (should update but meh) Ubuntu 12.10. Game has been running fine for the past month or so until this. Log - http://sc.byondpanel.com/log - this is a symlink so it'll be updated as per the current log file, ignore Wed Jun 19 05:19:20 2013 I forgot the images folder. |
More freezes, this time right after an upgrade to the latest build (1201):
http://eternia.byondpanel.com/bp/Eternia.log This is getting frustrating not only for me but for my clients. In this case I had just updated Chances (Eternia Roleplay) to the latest BYOND Linux build when this happened. |
Unfortunately the freeze you're seeing in Insert() makes absolutely no sense unless the StringEditor object has been hosed by heap corruption, in which case it's already too late to trace the problem. Running valgrind could potentially produce a clue, however, since that could catch heap corruption. The only caveat is, I'm afraid there's a decent chance you'd get offsets for that that are just as broken as the ones you've had all along. Something on your system is frelling with these files in a big way, because I've never before seen Linux traces be so completely off; I know I'm using the right Linux build since you're not using FreeBSD.