Reverse Engineering Zombs Royale
Both sides of my cheek are extremely numb, having just come back from the dentist. But I need to post something to test my new site setup. So note that this post is mostly for me to get used to writing posts and testing out my site. Thank you.
So anyways,
Zombs Royale!
Yes, this game.
I was preparing material for an “Attacking Network Protocols” workshop (Yes, I stole the name from the No Starch Press book) on the night before an exam, and thought to myself “Hm, why don’t I try to reverse engineer some “io” games on the internet?”.
If you’re wondering what an “io game” is, it’s just an online game on a website whose TLD (top level domain, the most popular being “.com”) is “.io”. The best known is “agar.io”, which started the “io game craze” in 2015, with games like “slither.io” and “diep.io” following suit. This is despite the fact that the “io” TLD was originally meant for the British Indian Ocean Territory. Because “IO” is a well-known acronym for “Input/Output” in Computer Science circles, the TLD has gained traction there too. Perhaps the “.my” TLD of my site will start a similar craze ;)
I was searching for games to try, and recalled watching Youtube celebrity Markiplier play this game (among others) several years ago in this video. I even remember playing it a few times a while ago, so it was a prime candidate: I was already somewhat familiar with the game, and it didn’t seem popular enough to warrant heavy protections against attacks.
Reconnaissance
This game was built using Unity (I found references to it in the source code) by Yang C Liu, via his company, “End Game Interactive”. The web version mostly runs in WebAssembly.
The WebAssembly binary is basically bytecode for a stack-machine architecture, and it can be disassembled/decompiled by something like wasm-decompile (See: WebAssembly Binary Toolkit). There is also an application executable for Windows (and integrated into Discord?). These are also targets to attack and reverse engineer, along with the iOS and Android apps, but I went a different way, since my workshop was focused on attacking the network protocol itself.
Websockets
Many of these IO games use Websockets for their networking. The reason is simple: That’s the most efficient way for webpages to communicate with a server when having a realtime connection.
The HTTP protocol is what we call a stateless protocol: a client connects to the server, the server responds, and the connection (usually) ends. If you click through to another page after that (or you have some javascript fetching other pages), a new connection is made for that.
Sure, you can still simulate a realtime connection by polling the server every X seconds. For example, if you want to have a chat application, you can have some javascript on the page making some AJAX requests every second to see if there is a new chat message.
For a web game, this polling system would be far too slow when you need to very quickly broadcast player movements and such. Hence the need for Websockets, which allow for a continuous connection where the client and server can communicate freely for as long as the page is open.
If you’re wondering why there are Websockets instead of the web browser exposing a way to open raw sockets, that’s because a page could then connect to any arbitrary server and do whatever it wants, which is kind of dangerous. For example, with raw sockets a page could talk to a website on a different domain by implementing HTTP on top of the socket itself, sidestepping the browser’s cross-origin restrictions. A Websocket connection, on the other hand, starts with an HTTP upgrade handshake that the server has to explicitly accept, and everything after that is wrapped in Websocket frames, so you can’t do that.
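To make the difference from raw sockets a bit more concrete, here is a rough sketch of one piece of the Websocket opening handshake as described in RFC 6455: the server proves it really speaks Websocket by hashing the client's Sec-WebSocket-Key header together with a fixed GUID and sending the result back. The function name is made up for illustration, and none of this is taken from the game itself.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key taken from the RFC 6455 handshake example.
print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))
```

A server that is not expecting Websocket traffic will not answer with this value, so the browser refuses to continue.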
MITMProxy
Enter: MITMProxy. I used it to analyse the network traffic of the game. Firefox and Chrome already have tools to inspect websocket traffic: go to the Developer Console, head to the Network tab, and filter by the “WS” option to find the websocket connection. Unlike HTTP requests, the traffic is shown on the right, with the red downwards arrow meaning “received” data, and the green upwards arrow being “sent” data. I suspect it was designed this way because websockets would typically have much heavier traffic, and they didn’t want to “spam” the main panel with all the traffic.
Using mitmproxy has some other advantages though. The main one I’ll be using is its python API, which lets me script experiments that modify the data sent/received by the browser.
MiTM stands for Man-in-The-Middle, and what mitmproxy does is act as an intermediary between the browser and the server. This is done by mitmproxy acting as a proxy server. Then, you can configure your device or browser to use this proxy server, and it will be able to see the network traffic and also modify it if it wants.
Decrypting TLS/SSL traffic is slightly (only very slightly) more involved. There is a pre-installed list of “Certificate Authorities” in the browser, but since we can install a “fake” Certificate Authority generated by mitmproxy in our browser, it isn’t difficult at all; just follow the instructions. If this were a video game on a game console, this would have been much more difficult, since consoles, though they might allow setting proxies in the network settings, generally don’t allow installing certificate authorities. Even Android and iOS don’t officially allow it anymore.
Analysing the traffic
I initially wrote a simple script to print out all the websocket traffic from the zombsroyale site. I took the anatomy.py example from the mitmproxy add-on documentation, and modified it to fit my needs.
from mitmproxy import ctx

class ZombsRoyaleAddon:
    def is_zombs(self, flow):
        # There are various different domains like mason-ipv4.zombsroyale.io, pinger-asia.eggs.gg, pinger-useast.eggs.gg
        # and pinger-uswest.eggs.gg
        # I ended up not filtering for those since I only had one websocket instance running.
        # If you want, you can experiment by printing out
        # the host and adding the check here.
        # print(flow.request.host)
        return True

    def websocket_message(self, flow):
        if self.is_zombs(flow):
            # The newest message is always the last one in the list
            latest_message = flow.websocket.messages[-1]
            pretext = 'Sent' if latest_message.from_client else 'Received'
            print(pretext + ": ", end='')
            print(latest_message)

addons = [ZombsRoyaleAddon()]
After running this script, connecting my browser to the proxy, and installing mitmproxy’s “rogue” certificate, I got the following.
If you’re wondering, the -q option is to enable “quiet” mode, so that it doesn’t log anything other than what I asked to print out in the python script.
Already we can glean some information. Besides the bytes of the number “1” being sent continuously back and forth (keep-alive perhaps?), we can see some ASCII text, such as the 0{"sid": "..."} packet received, and the 42[<option>, <value>] commands. If I wanted to experiment, I could change the content of flow.websocket.messages[-1] and mitmproxy would change the message before it is sent. I could try experimenting with changing setPlatform’s value to “mobile”, or changing the version in “setVersion”, or using “setName” to give myself a longer name than it usually allows. (The browser caps it at 10 characters, but changing it manually here lets you go up to 20 characters. I believe this is because the game purposefully allows special characters for users with a non-“english alphabet” name, and encoding those in UTF-8 can take 2 or more bytes per letter.)
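The 0{...} and 42[...] shapes look a lot like Socket.IO framing on top of Engine.IO (where a leading 0 is the "open" packet and 42 is a message carrying an event), though that is only inferred from the format. Under that assumption, a minimal sketch of splitting such a text frame into its numeric prefix and JSON payload, and of rewriting the setName value, could look like this. split_frame and rename are hypothetical helpers, not part of mitmproxy or the game.

```python
import json


def split_frame(text):
    """Split a frame like '42["setName","bob"]' into ('42', ['setName', 'bob'])."""
    i = 0
    while i < len(text) and text[i].isdigit():
        i += 1
    prefix, body = text[:i], text[i:]
    payload = json.loads(body) if body else None
    return prefix, payload


def rename(text, new_name):
    """Return the frame with the setName value swapped; other frames are untouched."""
    prefix, payload = split_frame(text)
    if isinstance(payload, list) and len(payload) >= 2 and payload[0] == "setName":
        payload[1] = new_name
        return prefix + json.dumps(payload, separators=(",", ":"))
    return text


print(split_frame('0{"sid":"abc"}'))
print(rename('42["setName","bob"]', "a" * 20))
```

Inside the addon the message content is bytes, so it would need decoding before and encoding after a helper like this.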
As an aside, I found that occasionally zombsroyale’s loading progress bar will get stuck when I have a proxy on. I’m not sure if this is some sort of intentional anti-cheat system or just a coincidence and something is failing to work properly. I found that I can get around this by disabling the proxy, waiting for the progress bar to load halfway, then turning the proxy back on. I got the idea from the PS1’s “disc swap trick”. Normally the PS1 checks at the start that the disc is a legitimate PS1 game before loading it. With the “disc swap trick”, the user would first put in a legitimate disc (any game, it doesn’t have to be the same one), let the PS1 verify that it is indeed a licensed copy, and then quickly swap in the “backup” of their game so that the PS1 would load that instead.
Unfortunately, it appears that when starting a game, the traffic gets a lot less readable. There are a few more readable messages, but from the “SampleUsername” message onwards, it is all a bunch of unreadable data.
Here is an example of the data sent after that message.
Note how there is a pattern: the packets seem to start off similarly. For example, here you can see the messages starting with b'\x00\xbe\x1f\x00\x00\x00\x00b\xaeTB. With further experimentation, I found out that certain “prefixes” correspond to different commands. For example, a certain type of prefix only appears when I move my mouse, and another only when I type in the game chat (usually just a single byte in the “prefix” changes to indicate this). Using this, I filtered out certain very common “prefixes”, such as the one for mouse movement, so that I could focus and make small changes to analyse. This prefix is likely an application-level header, and much of it is common to the various packets sent, which is why it stays the same. The “changed byte” I observed when doing certain actions likely indicates the “type of the message”, such as a “character movement” packet, a “mouse movement” packet, a “chat message” packet, etc.
mitmproxy very helpfully hot-reloads the python file when you make changes to experiment with it, so you don’t need to terminate the running process and restart it every time you make a change to the python code.
For example, here you can see many similar messages just pouring out and spamming my screen, and I can filter it out with the following code.
# ...
pretext = 'Sent' if latest_message.from_client else 'Received'
if b'\x19\x00\x00\x00b\xaeTB' in latest_message.content:
    return
# ...
I can then do similar things to see what packets are being spammed out during my character movement, or when I move my mouse, etc.
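Instead of eyeballing which prefixes are spamming the screen, one option is to count them. The sketch below groups messages by their first few bytes; the sample payloads are made up for illustration (they are not real captures), and the prefix length of 4 is just a guess you would tune while experimenting.

```python
from collections import Counter


def prefix_histogram(messages, prefix_len=4):
    """Count how many captured messages share each leading byte sequence."""
    return Counter(bytes(m[:prefix_len]) for m in messages)


# Made-up sample payloads, not real captures from the game.
captured = [
    b"\x19\x00\x00\x00" + b"\x01",
    b"\x19\x00\x00\x00" + b"\x02",
    b"\x19\x00\x00\x00" + b"\x03",
    b"\x07\x00\x00\x00" + b"hello",
]

for prefix, count in prefix_histogram(captured).most_common():
    print(prefix, count)
```

The most common prefixes are the first candidates for filtering out, and the rare ones are the interesting ones to trigger on purpose.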
This is a great starting point, but unfortunately, I quickly found out that although these “prefixes” are consistent during each round, they change completely every time the round ends or you leave and join another round. This, along with my chat messages not appearing in the network traffic, led me to believe that the traffic was either compressed or encrypted. I quickly managed to rule out compression by noting that much of the traffic was the same size, and that when I typed messages in chat, the size of the packet increased in step with the length of my chat message. So I deduced it was some form of custom home-grown encryption on top of the SSL/TLS encryption.
Fun fact: When I was analysing it, I tried playing around 3am when nobody was playing, because people kept killing me when I was busy analysing the network capture, and their mere presence was polluting my traffic with tons of data since all the players and their movements would be captured. So I waited until there was nearly no one in the lobby so the game wouldn’t start, and I could experiment alone.
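A rough way to see why that size behaviour argues against compression: with a general-purpose compressor, a message made of one repeated letter barely grows as it gets longer, whereas the packets here grew in step with the chat message. zlib is only an assumed stand-in in this sketch, since nothing in the capture says what the game would have used.

```python
import zlib


def compressed_sizes(lengths):
    """Compressed size of a run of 'a's for each requested length."""
    return [len(zlib.compress(b"a" * n)) for n in lengths]


lengths = [10, 100, 400]
for n, size in zip(lengths, compressed_sizes(lengths)):
    print(f"{n} letters -> {size} compressed bytes")
```

If the traffic were compressed like this, typing a longer run of the same letter would hardly change the packet size at all.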
At one point, I sent out the same “chat message” in the game twice, and noticed that the network traffic stayed the same. That was pretty sus, because if the contents were to be encrypted, unless you were re-using keys, it shouldn’t show multiple packets with the same ciphertext.
This clearly indicated to me that they were re-using the same key, and were also resetting the stream cipher each time, to encrypt every packet. I’m not sure why they didn’t just use a single stream instead of resetting it each time. I still didn’t know the key though, but I now had this idea of trying out a known-plaintext-attack.
The way stream ciphers work is that they take in a key value and output a continuous stream of random-looking bytes. A stream cipher is a type of symmetric encryption, and the reason I thought they used this instead of asymmetric encryption (aka public key encryption) is that asymmetric encryption would be far too slow. There are other types of symmetric encryption besides stream ciphers, such as block ciphers, but the data wasn’t being padded, and even if it were a block cipher (in a mode that XORs a keystream into the data, at least), the trick I’m going to do would still work.
I said earlier the stream cipher would take in a key value and output random bytes. This is similar to a Pseudo-Random Number Generator, except that a PRNG starts from a seed value instead of a key. Every time you use the same key/seed, you get the same output. For the stream cipher, the output is then “combined” with the plaintext somehow, usually through an Exclusive Or (XOR). XOR also has the nice property that if you XOR the resulting ciphertext with the same stream, you get back the original plaintext.
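As a toy illustration of that idea (not the game's actual cipher, which is unknown), the sketch below uses Python's random module as the "stream cipher". It is not cryptographically secure and is only here to show that the same key gives the same stream, and that XORing twice with that stream gives back the plaintext. keystream and xor_bytes are names invented for this example.

```python
import random


def keystream(key, length):
    """Toy keystream: the same key (seed) always yields the same bytes."""
    rng = random.Random(key)  # NOT cryptographically secure, illustration only
    return bytes(rng.randrange(256) for _ in range(length))


def xor_bytes(a, b):
    """XOR two byte strings, stopping at the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


plaintext = b"Hello Zombs"
stream = keystream(1234, len(plaintext))
ciphertext = xor_bytes(plaintext, stream)

print(ciphertext)
print(xor_bytes(ciphertext, stream))  # XOR with the same stream undoes the encryption
```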
$$ StreamCipher(key) \oplus Plaintext = CipherText $$
$$ StreamCipher(key) \oplus CipherText = Plaintext $$
Fortunately for us, the properties of XOR allow us to recover the key stream.
$$ StreamCipher(key) = CipherText \oplus Plaintext $$
The idea is that I would send a large chat message, for example one full of “a”s, and then I would know that part of the “message” packet, when decrypted, would be a run of “a”s. A lot of ciphers use XOR for encryption after generating a stream of bytes from a stream cipher. With the known plaintext, I can XOR the encrypted message with the known plaintext and recover part of the encryption stream.
I unfortunately still didn’t have the key (only part of the encryption stream), but at least now I can decrypt other messages (at the point where they intersect with where the “a”s were in the “chat message” packet).
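Here is a small self-contained sketch of that known-plaintext trick with made-up data. The keystream, the one-byte headers and the packet contents are all invented for the example. It also shows the limitation: wherever the guess of “a” is wrong (here, the header byte), the recovered stream is wrong too, so that position decrypts to garbage in every other packet.

```python
def xor_bytes(a, b):
    """XOR two byte strings, stopping at the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


# Invented stand-in for the unknown keystream that gets reused for every packet.
secret_stream = bytes(range(7, 7 + 32))

# Two "packets" encrypted with the same stream; the 1-byte headers are hypothetical.
chat_packet = xor_bytes(b"\x05" + b"a" * 20, secret_stream)
other_packet = xor_bytes(b"\x02" + b"secret emote data", secret_stream)

# Attacker side: guess that the whole chat packet is "a"s and recover the stream.
guessed_plaintext = b"a" * len(chat_packet)
recovered_stream = xor_bytes(chat_packet, guessed_plaintext)

decrypted = xor_bytes(other_packet, recovered_stream)
print(decrypted)  # readable after the first byte, which was guessed wrong
```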
If you recall, earlier there was a packet sent which was something like “SampleUsername” followed by 8 random bytes. In my testing, these random bytes change every time you join a new round, so I suspect perhaps this is the “key”? If it is, I have not found the algorithm used though, after trying a couple.
So I sent a large chat message of just “a”s. (Note: if it’s too long you’ll get kicked out of the game.)
Then I copy pasted the network traffic. (There are two because the server sends your chat message back to you)
And now I modify the code to XOR it.
import sys
# Thanks https://stackoverflow.com/a/29409299
def xor(a, b, byteorder=sys.byteorder):
    # Truncate both inputs to the length of the shorter one
    a, b = a[:len(b)], b[:len(a)]
    int_a = int.from_bytes(a, byteorder)
    int_b = int.from_bytes(b, byteorder)
    int_enc = int_a ^ int_b
    return int_enc.to_bytes(len(a), byteorder)
# ...
known_ciphertext = b'\t0Y\xb0\xd7k\xcbX\xbfQ5\xb0\xd7n`6\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe6V\xbdQ8\xd1\xb6\x0f\xe67\xdc0Y\xb0\xd7n'
known_plaintext = b'a'*len(known_ciphertext)
stream = xor(known_ciphertext, known_plaintext)
decrypted_message = latest_message.content
if len(stream) > 0:
    decrypted_message = xor(decrypted_message, stream)
print(pretext + ": ", end='')
print(decrypted_message)
# ...
Indeed, after this, sending the chat message “Hello Decrypted World!” shows the same text in the decrypted traffic capture.
Not entirely sure why the first character in the “Sent” packet for the chat message content is wrong.
Anyway, from here I experimented with the “emote” feature. By default, as a non-logged-in player, I had access to 5 emotes.
Upon experimenting, I reacted with two separate emotes and tried to pinpoint their differences. (I knew which packets were “emote” packets by seeing what packets only appear right after I trigger an emote.) Since the only thing I changed when “emoting” was the “type” of emote, I deduced that the one byte that differed was the byte indicating which emote it is (circled in blue).
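Spotting the differing byte by eye works, but it is easy to automate. In this sketch the first packet is the decrypted Happy emote packet that appears in the code excerpt later in this post; the second one is an assumed example, the same packet with the emote byte set to 9. differing_offsets is a hypothetical helper.

```python
def differing_offsets(a, b):
    """Return the indexes at which two equally long packets differ."""
    if len(a) != len(b):
        raise ValueError("packets have different lengths")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


happy = b'aJaaad-\x0e\x02\x08\raaa\xf6`\x00'
other = b'aJaaad-\x0e\x02\x09\raaa\xf6`\x00'  # assumed: same packet, emote id 9

for offset in differing_offsets(happy, other):
    print(offset, happy[offset], other[offset])
```

With only one action changed between the two captures, a single differing offset is a strong hint that this byte carries the value you changed.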
Note that the tab character, \t, is actually just the byte with ASCII value 9. You can check this yourself by looking at an ASCII table online or running ord('\t') in python. When printing bytes, python “helpfully” shows this particular value as the tab escape code \t instead of as a number.
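A quick check of that in plain Python, nothing game-specific, showing how the same byte values get displayed differently depending on whether Python has a short escape code or a printable character for them:

```python
tab_byte = bytes([9])

print(ord('\t'))       # the numeric value of the tab character
print(tab_byte)        # displayed with the escape code, not as \x09
print(tab_byte.hex())  # the raw value in hex

# 8 has no short escape, 9 and 10 do, and 65 is a printable letter.
mixed = bytes([8, 9, 10, 65])
print(mixed)
```

So when reading printed packets, \t, \n and \r, as well as plain letters, are all just byte values like any other.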
Anyway, just for fun, I tried replacing this byte with hand-picked random bytes until I found some other valid emotes. Using an invalid number would result in a “blank” emote being shown (UPDATE: This has been fixed sometime after the start of June 2022).
Anyway, here is my code excerpt that I tag on to the end of websocket_message.
# ...
# Replace Happy Emote with something else
if decrypted_message == b'aJaaad-\x0e\x02\x08\raaa\xf6`\x00':
    # You can change this and try all 1 byte values, so 0 to base 16 FF (255).
    emote_id = 15
    # Remember to remove the b'\x08' (original emote id)
    replaced_message = b'aJaaad-\x0e\x02' + emote_id.to_bytes(length=1, byteorder='big') + b'\raaa\xf6`\x00'
    # Remember to re-encrypt
    replaced_message = xor(replaced_message, stream)
    # be sure to change the original and not decrypted_message!
    flow.websocket.messages[-1].content = replaced_message
# ...
And I tested and confirmed that it indeed showed a custom emote that I probably wasn’t supposed to be able to access.
Unfortunately, the vast majority of emotes seem to be unavailable. I suspect there are other checks going on or some other part of the packet I need to change. But I did find a number of emotes.
# Known emote reacts
# 7 - Glasses guy
# 8 - Happy
# 9 - Cry
# 10 - GG
# 11 - Thumbs Up
# 12 - Thumbs Down
# 14 - Med pack
# 15 - Ammo
# 16 - Gun 1
# 27 - RIP
# 28 - Poop
# 30 - Cry?
# 31 - Devil
# 32 - Bandanna guy
# 33 - Single teardrop guy
# 34 - Moustache guy
# 35 - Wink?
UPDATE: Apparently since I last did this around the start of June, the developers have fixed this. No more “blank” reacts, and most of the “custom” reacts seem to not work anymore. Either that, or I have a bug in my code now that my old code didn’t have. I wonder if the devs saw my attacks in the logs…
This is only a start. I experimented plenty. For example, during a game, I realised a chat message could contain the string “LOCAL” if you set your chat to “team-only”, or “GLOBAL” if you set your chat to show to all players. I realised that moving my weapon in the game sent specific packets, and if I invested enough time, I could probably reverse engineer them and figure out how to make an aimbot. Heck, I even realised that when my network traffic was still and not much was going on, I was probably alone, but once I was getting a lot of traffic, someone was probably nearby, just outside my line of sight, and getting ready to snipe me.
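That last observation, that busy traffic hints at someone being nearby, could be turned into a simple rate counter. This is only a sketch under assumptions: RateMonitor is a made-up class, and the window size and any alert threshold would have to be found by experimenting in an actual match.

```python
from collections import deque


class RateMonitor:
    """Count how many messages arrived within the last `window_seconds`."""

    def __init__(self, window_seconds=1.0):
        self.window = window_seconds
        self.timestamps = deque()

    def record(self, now):
        """Register a message seen at time `now` and return the current count."""
        self.timestamps.append(now)
        # Drop everything that has fallen out of the sliding window.
        while self.timestamps and now - self.timestamps[0] > self.window:
            self.timestamps.popleft()
        return len(self.timestamps)


monitor = RateMonitor(window_seconds=1.0)
for t in (0.0, 0.5, 1.2, 5.0):
    print(t, monitor.record(t))
```

In an addon, record would be called with time.time() for every received message, and a sudden jump in the count would be the cue to look around.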
But that’s enough for one article. If you wish to try this out yourself, good luck and have fun!
Further attacks
It’s unlikely that the “known plaintext” we put in (the bunch of “a”s) exactly matches the real plaintext. The real one was probably prefixed by some sort of header to indicate it is a chat message, and some other info. Plus, it’s annoying having to replace the known ciphertext every round since it always changes. It would be nice if we could defeat the encryption entirely.
I think fully defeating the custom encryption just by looking at it could be possible for someone smarter than me. Perhaps they’re using some well-known implementation such as a Unity plugin or a guide someone wrote on encryption in Unity, and taking a look at that code could shed more light. Maybe for some reason they’re not using a Cryptographically Secure Pseudo Random Number Generator (CSPRNG), and are instead using a standard RNG (such as the Mersenne Twister, though as far as I can tell Unity uses Xorshift128). Since we have some of the plaintext and ciphertext, we might then be able to recover the internal state of the stream and get a key. Besides that, the encryption key might even be transferred somewhere in the network traffic. Since we know it changes every time you join, there must be some way of telling the client how to decrypt it, and I simply haven’t found it or the encryption algorithm yet.
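One thing visible in the known_ciphertext capture earlier is that, across the run of “a”s, the same 8 bytes repeat over and over. That would be consistent with a short repeating XOR key rather than a long keystream, though this is only a guess from a single capture, and where the key starts relative to the packet is an assumption. A sketch of checking the period and deriving the candidate key bytes (smallest_period is a hypothetical helper):

```python
def smallest_period(data, max_period=32):
    """Return the smallest p such that data repeats every p bytes, or None."""
    for p in range(1, max_period + 1):
        if all(data[i] == data[i + p] for i in range(len(data) - p)):
            return p
    return None


# Four repetitions copied from the middle of known_ciphertext.
a_region = b'\xbdQ8\xd1\xb6\x0f\xe6V' * 4

period = smallest_period(a_region)
# If the plaintext here really is all "a"s, XORing with "a" gives the key bytes.
candidate_key = bytes(c ^ ord('a') for c in a_region[:period])

print(period)
print(candidate_key.hex())
```

If that holds up, it may be worth comparing these bytes against the 8 seemingly random bytes sent after “SampleUsername” at the start of the same round.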
Whatever the case may be, my opinion is that the next stage of attack should be a static analysis of the code: decompiling the Windows executable of the game, or the Android APK for example, and then looking for the relevant encryption code. That doesn’t mean the network analysis was a total waste of time. One could use the knowledge gleaned from it as a reference to speed up the process. For example, we now know which values correspond to which emote, and we have some idea of the “structure” of some network packets. For the chat packet, we know roughly at what index the “message” portion starts, since that’s the point from which we get the “correct” cipher stream and meaningful output in other packets. Perhaps then we’ll be able to find out how to decrypt the network traffic.
Other io games
Besides Zombs Royale, I have also, with more success actually, tried reverse engineering gartic.io, a spin on the classic party game of “guessing” what a person is drawing. Despite being more popular, as I had seen several “Twitch influencers” and University events using this site, I found their security more lax. Even before I started studying the game, there were some bots in the site, who would gang up and spam in chat or mass-report players, as well as drawing obscene drawings during their turn.
Their communication was completely in readable ASCII text (in a similar format to Zombs Royale’s initial login actually, perhaps this is a standard protocol that I’m unaware of? Something like "<number>[<options>, ...]"). What also helped is that, instead of using WebAssembly, they had it in more or less readable javascript, which I took a quick look at. I also decompiled the Android APK and took a peek. Interestingly, the code (and the packets sent) weren’t in English, but were, according to Google Translate, in Spanish.
Indeed, I managed to reverse quite a bit of it, and even coded myself my own “bot” that would respond to my commands in chat.
I won’t reveal too much about gartic.io because it is already overrun with bots and I don’t want to ruin the place, but if you want to have some fun, go ahead and try to reverse engineer it. It shouldn’t be too hard. Or let me know if you want me to post an article on it, and I might do so :)
If you want to find other targets, I’d suggest finding a medium-sized one. Not something so big that the developers are experienced enough to make it very difficult to attack, or that is no fun because someone else has already reverse engineered it (e.g. Among Us), but also not something so small that it would be no fun (and where they can probably quickly pinpoint that it is you attacking them, since you’ll be one of the only players).
I focused a lot on Websockets, but mitmproxy should work with raw sockets as well, though I haven’t tested it. This means you can even try it out on your favourite MMO and try to reverse engineer that, though you should be wary and realise you definitely have a chance at getting your account/IP banned.
If you do try it, good luck and have fun!