I've spent the last week or so creating a database PHP script on a webserver, and it's gone peachy until I realized I don't know what function(s) would accomplish the same effect as world.Export in DM. I'm looking to send a topic string to a BYOND game server in much the same way I invoke a PHP script with world.Export; how would I go about accomplishing this?
[I'm not too wild about sticking the data I need in a text file and using world.Export twice in succession, so I'm saving that for my last option.]
ID:276414
Sep 4 2005, 11:03 am (Edited on Sep 4 2005, 11:15 am)
In response to Crispy
This is going to require a bit of explaining as I'm not very knowledgeable of these sorts of things, but I'm determined to complete this script (and learn something in the process).
1.) When I'm sending this data over the socket, do I send the literal hex value in the format you've used, do I use decimal (ala hexdec), or is the data required to be formatted in another fashion?
2.) As for the length of the data, does that mean the character count or something else?
3.) What -is- an unsigned big-endian integer and what format does it use (E.G., must be comprised of six digits, must have a decimal)?
4.) When structuring this query, do I just string all the values together in a big lump or must I use a delimiter of some sort?
In response to Mobius Evalon
Mobius Evalon wrote:
> This is going to require a bit of explaining as I'm not very knowledgeable of these sorts of things, but I'm determined to complete this script (and learn something in the process).

Fair enough. I'm willing to teach if you're willing to learn. =) Some of this stuff I've only properly learnt about in the last few years myself, so I can sympathize if you're a bit lost!

> 1.) When I'm sending this data over the socket, do I send the literal hex value in the format you've used, do I use decimal (ala hexdec), or is the data required to be formatted in another fashion?

Time for a crash course in digital theory! Forgive me if this is too detailed... I got a bit carried away. But you did say you wanted to learn. =) (The last two paragraphs may be all you really need to read, though understanding the rest is important for background knowledge.)

A bit is a single binary digit - 0 or 1. You probably knew this already. A byte is eight bits. It can store an eight-digit binary (or "base 2") number; for example, 11101101. In our numbering system, which is known as decimal (or "base 10"), this is 237. It's the same number, it's just represented in different ways.

It's often convenient to work in a numbering system that's close to binary - but binary is really annoying to work with, so hexadecimal ("base 16") is often used instead. (Why 16? Because it's a power of two, as in 2^4 = 16. Powers of two crop up ALL THE TIME in computer science - it's basically because we're working with binary numbers.)

As we're using base 16, we can't just use 0 to 9 as our digits - so we use the letters A to F as well. "A" is 10, "B" is 11, and so on. Compare the number 237 expressed in binary and hexadecimal:

Binary: 1110 1101
Hex:    E    D

(I've split up the numbers with spaces so you can see it better.) You can see that each hexadecimal digit represents four binary digits - "E" is 1110, and "D" is 1101. Note that each base 16 digit equals 4 binary digits - and 16 = 2^4 (binary is base 2). See why we use hexadecimal?
It matches up to bits and bytes very nicely.

Okay, so we know how numbers are stored in computers. But what about text? Well, computers can't store text, only numbers. Your CPU has no concept of "text". But obviously computers can store text SOMEHOW, because otherwise how are you reading this...? =) And indeed they do - but they do so as sequences (or "strings") of numbers. Every letter and every symbol has its own number.

Which number means which letter? That depends on what's called an encoding. An encoding defines how the numbers in the computer's memory get translated to letters and words on your screen. (There are other kinds of encodings, too; the word is also used in the contexts of video and sound compression, for example. But we're talking about text, or string encodings.)

Encodings can be messy, annoying things to deal with. Thankfully, all we need to use at this point is ASCII encoding, otherwise known as "plain text". This is a standard encoding that absolutely everyone uses. It's mostly only useful for English and other Germanic languages. (Now aren't you glad we don't speak Chinese? I'd have to teach you about Unicode. =P)

Aaaanyway... the thing I'm getting at here is the difference between 123 as a number and "123" as a string. If I were to transfer the number 123 over a network socket, I would send the following byte: 01111011 (as decimal: 123). But if I were to transfer the string "123", I would send the following three bytes: 00110001, 00110010, 00110011 (as decimal: 49, 50, 51, which are the ASCII codes for the characters "1", "2", and "3" respectively).

When I refer to 0x## (for example, 0x2a), that's a hexadecimal number. (Why the 0x? Because that's how the C programming language does it. See http://en.wikipedia.org/wiki/Hexadecimal#Representing_hexadecimal.) I don't mean that you should send the string "0x2a" - I mean that you should send the hexadecimal number 2A, which is 42 in decimal and 00101010 in binary.
This number will take up a single byte (as opposed to the string, which would take up at least four bytes, which is bad - we want to minimize the amount of information that we have to send over the internet, so that it's faster).

And now, because I can't be bothered to go and look it up after typing all that, here's your homework: Find out how to put hexadecimal numbers into a string in PHP, so that the number 0x2a takes up one byte. (That is, you can get the length of the string containing 0x2a, and it will be 1 - not 4.)

> 2.) As for the length of the data, does that mean the character count or something else?

In this case, yes, the character count. (Technically, it's the number of bytes used to store the data in memory - but plain ASCII text, which is what we're dealing with here, uses one byte per character, so it makes no difference. For future reference, though, do be aware that this may not hold true for all encodings - for example, "wide" strings use two bytes per character, UTF-8 uses however many bytes it feels like using, and there are many other encoding schemes that all have different properties.)

> 3.) What -is- an unsigned big-endian integer and what format does it use (E.G., must be comprised of six digits, must have a decimal)?

Warning: If you haven't read the crash course above, you may not have any idea what I'm talking about here. =)

Firstly, the word "integer". It means whole number. The number 3 is an integer; but the number 0.5 is not. Simple enough, yeah?

Secondly, "unsigned". I established above that there are lots of ways that we can represent text as numbers. Unfortunately, it gets even more complicated - you can represent numbers in different ways too. (Will the madness ever stop?!) One of those ways is distinguishing between signed and unsigned numbers. This has nothing to do with signatures - it simply means whether or not the number can be negative.

A "signed" number is a number that can be negative (i.e.
it has a "sign" - as in a positive or negative sign). If we wanted to write a negative number in binary, we'd probably do something like this: -0001 1101. But computers can only store 0s and 1s, so we have to use one of those to represent a negative number, and the other to represent a positive number. Maybe like this: 00001 1101, or 10001 1101 if we want to store a positive number. (I've chosen 0 for negative because it's smaller.)

But wait: That's 9 bits, not 8. That won't fit in a byte! What to do? Steal the first bit of the byte, that's what. So +42 is 10101010, and -42 is 00101010. (Edit: As pointed out by nick.cash below, this is not strictly true; all computers since a fair few decades ago use "two's complement" to store binary numbers. See http://en.wikipedia.org/wiki/Two%27s_complement if you're interested.)

An "unsigned" number is what I was talking about above. Sometimes you don't need to be able to represent negative numbers - so rather than waste a whole bit, we use the full 8 bits to represent the number. So 42 is 00101010, and -42 doesn't exist. =P

Note that 10101010 means 42 if we interpret it as a signed number, but 170 if we interpret it as an unsigned number. Thus it's important to know which one we're using. (Edit: My numbers are off here, but the concept is solid; the same pattern of bits can mean a different number depending on whether you treat it as signed or unsigned.)

And finally, "big-endian". Representing numbers as single bytes is all well and good, but what if we want to store a number greater than 255? (Which is the largest value that can be stored in a one-byte unsigned integer.) I know my bank balance is bigger than that; I wouldn't want the bank to just forget about the rest of my money. =)

Answer: We use two bytes. Or four, or even eight. But, as usual, it's more complicated than that. "Endianness" refers to which order bytes are stored and transmitted in.
Big-endian means the order you would expect; the hexadecimal number 4A32 (which takes up two bytes) is stored in the order 4A, then 32. But there's also little-endian, which uses 32, then 4A. Confusing? Yes. Stupid? Yes. But that's how it goes. Obviously the number 4A32 is different from 324A - so it's important to specify whether we're using big- or little-endian format.

> 4.) When structuring this query, do I just string all the values together in a big lump or must I use a delimiter of some sort?

Just lump 'em together, in the order specified. If you do it right there shouldn't be any problems.

-----

Whew! That's a lot of info to take in. Read it slowly, and make sure you understand most of it. I'll explain things further if you're confused.

Yes, I could have just written the PHP code for you. But how would you learn anything then? ;-) If you get completely and totally stuck and want to give up, I might get around to programming world.Topic() stuff in PHP myself one of these days. If you're lucky.
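For anyone following along, the ideas in the crash course can be poked at directly in PHP. This is only an illustrative sketch, not the packet code itself: it uses built-in functions (decbin, dechex, chr, bin2hex, pack, unpack), and the variable names are made up for the example. It prints the same values in the different forms discussed above, using real two's complement for the signed case.

```php
<?php
// The number 237 written in other bases: same value, different notation.
$n = 237;
$asBinary = decbin($n); // "11101101"
$asHex    = dechex($n); // "ed"

// The number 123 as one byte versus the string "123" as three bytes.
$numberByte  = chr(123); // one byte holding the value 123
$digitString = "123";    // three bytes: the ASCII codes 49, 50, 51

// The same bit pattern (10101010) read as unsigned ("C") and as signed ("c").
$pattern  = chr(0xAA);
$unsigned = unpack("C", $pattern)[1]; // 170
$signed   = unpack("c", $pattern)[1]; // -86 in two's complement

// 0x4A32 as two bytes in big-endian ("n") and little-endian ("v") order.
$bigEndian    = bin2hex(pack("n", 0x4A32)); // "4a32"
$littleEndian = bin2hex(pack("v", 0x4A32)); // "324a"

echo "$asBinary $asHex\n";
echo strlen($numberByte), " ", strlen($digitString), "\n";
echo "$unsigned $signed\n";
echo "$bigEndian $littleEndian\n";
```

The "n" format is the 2-byte unsigned big-endian layout that the packet's length field asks for; "v" is shown only for contrast.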
In response to Crispy
> So +42 is 10101010, and -42 is 00101010.

What, no explanation of 2's complement?! No one uses sign bits for integers :-)
In response to nick.cash
Quiet, you. No need to complicate things further. =P
In response to Crispy
Okay, after four days I've discovered how to insert hexadecimal values into a string resulting in a one-byte space, and that's to lead with a (back?)slash (I always know which one is appropriate, but I don't know which is the forward and which is the backward):
echo(strlen("\x00\x83"));
That results in a length of 2. Two more questions: Am I required to send the entire query in hex, or is it supposed to be "(hex)(integer)(hex)(ascii)(hex)"? As for the byte count/data length, I'm not entirely positive how to make it occupy two bytes. Do I just prepend an "\x00" if it consists of one byte or must I accomplish that another way?
[Edit: thought I'd include this to show progress; function export($addr,$port,$str)
In response to Mobius Evalon
Mobius Evalon wrote:
> I always know which one is appropriate, but I don't know which is the forward and which is the backward

\ = Backslash and / = Forwardslash. You can remember it by the direction of the stroke when writing it. Either that or backslash is above the enter key, while forward slash shares the ? key.
In response to Mobius Evalon
Mobius Evalon wrote:
> Okay, after four days I've discovered how to insert hexadecimal values into a string resulting in a one-byte space, and that's to lead with a (back?)slash(I always know which one is appropriate, but I don't know which is the forward and which is the backward);

Well done. =) It is the backslash. You can tell by looking at the way the slash is leaning. If the top is leant forward, that's a forward slash: / But if the top is leant backwards, that's a backslash: \

> Am I required to send the entire query in hex, or is it supposed to be "(hex)(integer)(hex)(ascii)(hex)"?

Integers must be converted to hexadecimal specially. Remember what I said about the difference between 123 and "123"? Well, when you concatenate a number and a string, PHP turns the number into a string first. So we'd be sending "\x31\x32\x33" (those are the ASCII codes for "1", "2", and "3" in hex) instead of what we wanted to send, which was "\x7b" (that's 123 in hex). Think of it as the difference between spelling out a number digit-by-digit (one, two, three) and just saying it (one hundred and twenty three). So we need to tell PHP that we want the actual value 123 in the string, not just the digits "1", "2", and "3" next to each other.

ASCII strings can just be sent as plain ASCII, since they're encoded as hex in memory anyway.

To encode the integer in a string as hexadecimal, you need to use an appropriate PHP function; pack will do. You'll want the "n" format (for 2-byte unsigned big-endian integers):

$query = '\x00\x83' . pack("n",strlen($str)) . '\x00\x00\x00\x00\x00' . $str . '\x00';

> As for the byte count/data length, I'm not entirely positive how to make it occupy two bytes. Do I just prepend an "\x00" if it consists of one byte or must I accomplish that another way?

That's correct. Note that if it was supposed to be "little-endian", you would put the null byte after the value. However, it's "big-endian", so as you've noted we must prepend it.
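To see the difference between concatenating a number and packing it, here is a small sketch. The variable names are purely illustrative; pack, chr and bin2hex are standard PHP functions. It also checks the "prepend a null byte" idea for a value that fits in one byte.

```php
<?php
$length = 123;

// Concatenation turns the number into its decimal digits "1", "2", "3".
$asDigits = "" . $length;       // 3 bytes: 0x31 0x32 0x33

// pack("n", ...) stores the value itself as a 2-byte unsigned big-endian integer.
$asPacked = pack("n", $length); // 2 bytes: 0x00 0x7b

// For values below 256, a null byte followed by the single byte is the same thing.
$byHand = "\x00" . chr($length);

echo bin2hex($asDigits), "\n";   // 313233
echo bin2hex($asPacked), "\n";   // 007b
var_dump($asPacked === $byHand); // bool(true)
```

Using pack("n", ...) rather than prepending by hand keeps working once the length goes past 255, which is probably the safer habit.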
In response to Crispy
Okay, I've got the packet structured (I think). I looked into sockets in the PHP reference, but there's all sorts of socket-based connections; socket_create/socket_connect/socket_send, fsockopen, stream_socket_client, stream_socket_server... which is the one I want to use to send this packet?
In response to Mobius Evalon
You need to create the socket, connect it to the BYOND server, write your packet, read the response if you want it (I can tell you more about that once you've got writing working), and then close the socket. There's an example at http://www.php.net/sockets - scroll down to Example 2. The data you're sending is different (the example uses the HTTP protocol, which consists only of plain ASCII), but the principle is the same.
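The create/connect/write/close sequence can be sketched roughly like this. Note the assumptions: this version uses fsockopen (one of the functions named in the question) instead of the socket_create family from the linked example, and the function name send_packet and the 5-second default timeout are invented for illustration.

```php
<?php
// Hypothetical helper: connect, write one packet, close.
// Returns the number of bytes written, or false if connecting or writing failed.
function send_packet(string $host, int $port, string $packet, float $timeout = 5.0)
{
    // 1. Create the socket and connect it to the server.
    $conn = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($conn === false) {
        return false;
    }

    // 2. Write the packet. fwrite() reports how many bytes it actually wrote.
    $written = fwrite($conn, $packet);

    // 3. Reading a response (with fread) would go here.

    // 4. Close the socket.
    fclose($conn);

    return $written;
}
```

Comparing the return value against strlen($packet) is a quick way to tell whether the whole packet went out.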
In response to Crispy
You mean if I ever get it working...
I'm pretty sure at this point it's just a wrong socket type or protocol, but this is what I've come up with so far;
function export($addr,$port,$str)
I'm connecting to the game server with world.address and world.port being what I'm using above for $addr and $port. I'm not getting any script errors and also no result on the server end, so what is it that's amiss?
[Edit; Well that can't be right, I'm getting a length of 50 characters on the query with a 15-character $str.]
[Edit*2; Interesting, apparently only a double-quoted string will correctly insert backslashed hex values. I changed the query string to be contained in double quotes and tried it, and got "BYOND(344.887) BUG: network message underflow (131,16)." in the BYOND server. It's getting there, just not in the correct order or something.]
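The quoting behaviour described in that last edit is easy to confirm in isolation. A minimal sketch, with made-up variable names:

```php
<?php
// Single quotes keep the backslash sequence as literal characters.
$single = '\x00\x83'; // eight characters: \ x 0 0 \ x 8 3

// Double quotes interpret \xNN as one byte with that hexadecimal value.
$double = "\x00\x83"; // two bytes: 0x00 and 0x83

echo strlen($single), "\n";  // 8
echo strlen($double), "\n";  // 2
echo bin2hex($double), "\n"; // 0083
```

Every single-quoted "\x00" in a query therefore costs four bytes instead of one, which would explain a length that comes out far too large.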
In response to Mobius Evalon
Yeah, PHP is like that about single-quoted versus double-quoted strings. Good catch.
It took me a while to diagnose this because my PHP installation was a bit dodgy - it wouldn't load the sockets extension until I reinstalled. Sorry about the delay.

Anyway, I worked out the problem, and it's working perfectly on my machine. The problem is that the packet length isn't just the length of the string. Recheck Bob's reverse engineering notes that I posted above. =)

1 byte - 0x00
1 byte - 0x83
2 bytes - The length of the following data, as an unsigned big-endian 2-byte integer
5 bytes - all 0x00
X bytes - The message you want to send, in standard ASCII; this is what would follow the question mark, including the question mark. e.g. "?hello" will call world.Topic with href="hello".
1 byte - 0x00

In other words, the length shouldn't be X. It should be X+6: the five null bytes, the X bytes of the message, and the final null byte all count as "the following data".

One last problem; socket_write is not technically guaranteed to send the number of bytes that you ask it to send. (It usually will, for short messages at least, but it's better to be safe.) You need to put it in a loop, and call it repeatedly until it sends the entire message, something like this:

$bytestosend = strlen($query);

(Edit: Tested and updated the above code with fixes.)

Other than that, your code is perfect. Well done!
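Putting the two fixes together, a sketch might look like the following. It is not the code from the thread: build_topic_packet and write_all are hypothetical names, and the loop uses fwrite on a stream as a stand-in for socket_write so that it runs without the sockets extension. The idea (keep writing until nothing is left) is the same.

```php
<?php
// Hypothetical helper: wraps a message such as "?hello" in the packet layout listed above.
function build_topic_packet(string $message): string
{
    // The length field counts the 5 null bytes, the message, and the trailing null: X + 6.
    $length = strlen($message) + 6;
    return "\x00\x83" . pack("n", $length) . "\x00\x00\x00\x00\x00" . $message . "\x00";
}

// Hypothetical helper: keeps calling fwrite() until all of $data has been written.
function write_all($stream, string $data): bool
{
    $remaining = strlen($data);
    while ($remaining > 0) {
        $written = fwrite($stream, $data);
        if ($written === false || $written === 0) {
            return false; // give up instead of looping forever
        }
        $data = substr($data, $written); // drop the part that was already sent
        $remaining -= $written;
    }
    return true;
}

$packet = build_topic_packet("?hello");
echo bin2hex($packet), "\n"; // 0083000c00000000003f68656c6c6f00
```

For "?hello" the message is 6 bytes, so the length field is 12 (000c in hex) and the whole packet is 16 bytes.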
In response to Crispy
After I updated to include that socket_write loop it wouldn't work for awhile... then I noticed I left out a backslash! The debug echo I put in the script said it was writing three more bytes than expected, and after staring at it awhile I noticed that missing backslash... now it works perfectly!
I can't thank you enough for the amount of help you've willingly given me over the last two weeks, but I can honestly say I've learned quite a bit about sockets and bitwise operations (not to mention some digital data history, how to insert hex digits into a string, the existence of and difference between -endians, among other things). Hee, I'm so excited that I'm actually able to write data directly into a BYOND server from a remote web server with PHP...
In response to Mobius Evalon
No problem! You've done well. =)
This is the format of one world.Topic() packet, as reverse-engineered by BobOfDoom:
1 byte - 0x00
1 byte - 0x83
2 bytes - The length of the following data, as an unsigned big-endian 2-byte integer
5 bytes - all 0x00
X bytes - The message you want to send, in standard ASCII; this is what would follow the question mark, including the question mark. e.g. "?hello" will call world.Topic with href="hello".
1 byte - 0x00
Connect to the server at the appropriate hostname and port, then send the above data over the connection. If you want the BYOND server to send back some data (even just a "yes, I got it" response), wait for a response. If it comes, the response will be in this form:
1 byte - 0x00
1 byte - 0x83
2 bytes - The length of the following data, as an unsigned big-endian 2-byte integer
1 byte - The data type of the following data. Useful values are 0 (null, or unsupported data type), 0x2a (a floating-point number; 4 bytes, big-endian), and 0x06 (a null-terminated standard ASCII string).
X bytes - Depends on the data type, specified above.
You can send multiple world.Topic() calls without having to reconnect; i.e. you can use the same socket over and over. Remember to close it when you're done, though. =)
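Decoding a response in that form could be sketched as below. Several things here are assumptions rather than confirmed facts: the function name parse_topic_response is invented, the length field is taken to cover the type byte plus the data that follows it, and the float is read as big-endian (pack format "G", which needs PHP 7.0.15 or later) because that is what the notes above say. If floats come out looking wrong, the byte order is the first thing to check.

```php
<?php
// Hypothetical helper: decodes one response laid out as described above.
// Returns a float, a string, or null for type 0 and anything unrecognised.
function parse_topic_response(string $raw)
{
    // Header: 0x00, 0x83, then a 2-byte unsigned big-endian length.
    if (strlen($raw) < 5 || $raw[0] !== "\x00" || $raw[1] !== "\x83") {
        return null;
    }
    $length = unpack("n", substr($raw, 2, 2))[1];
    if ($length < 1 || strlen($raw) < 4 + $length) {
        return null; // empty or incomplete response
    }

    // Assumption: the length covers the type byte and the data after it.
    $type = ord($raw[4]);
    $payload = (string) substr($raw, 5, $length - 1);

    if ($type === 0x2a && strlen($payload) >= 4) {
        // 4-byte floating-point number, read as big-endian per the notes.
        return unpack("G", substr($payload, 0, 4))[1];
    }
    if ($type === 0x06) {
        // Null-terminated ASCII string: keep everything before the first null byte.
        $end = strpos($payload, "\x00");
        return $end === false ? $payload : substr($payload, 0, $end);
    }
    return null;
}

// A hand-built reply carrying the string "ok".
$reply = "\x00\x83" . pack("n", 4) . "\x06" . "ok\x00";
var_dump(parse_topic_response($reply)); // string(2) "ok"
```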