Hacking into Unity games

hackf5.io
14 min read · Jun 27, 2019

With the rising popularity of game streaming platforms like Twitch that allow developers to build extensions to boost fan engagement, knowing how to hack into games to get data to feed those extensions is a skill that many developers want to learn.

In this article I’m going to explain how to hack into the memory space of a Unity game process running under Windows. This memory often contains a lot of interesting and hidden information about that game that could be used, for example, to build a Twitch extension.

You can find the code that goes with this article here: https://github.com/hackf5/unityspy.

The key ideas have been borrowed from HearthMirror, which is part of the excellent Hearthstone Deck Tracker.

The author has no affiliation with HearthSim, HearthMirror or Hearthstone Deck Tracker, but is a fan of their work.

Legal Stuff

HearthMirror is copyrighted, and although its source is publicly available it is released under a proprietary license. Copyright protects the expression of an idea, not the idea itself. Ideas are protected by patents.

This post has a good description of the protections afforded by copyright. In particular copyright probably makes it illegal to reference the HearthMirror library in your own project, or to copy code from the HearthMirror project. But, as the source is legally available, it is legal to understand the ideas in the source and to replicate them.

What’s Unity?

Unity is the world’s most popular game engine. If you’ve ever tried building a game, the chances are that you tried to build it in Unity. In fact it’s so popular that something like half of all games that are built are built on Unity.

There are many reasons for this popularity, but a significant one is Unity’s outstanding cross-platform support. It achieves this on a number of platforms, including Windows, using a library called Mono.

If you want to find out how to get started building games with Unity I can thoroughly recommend Brackeys YouTube channel, which has lots of great Unity tutorials.

What’s Mono?

For an in-depth discussion of what Mono is, you’re probably best off reading the Wikipedia page, but the tl;dr is that it’s the open source version of the .NET Framework. .NET libraries run under Mono and Mono libraries run under .NET.

The Mono project has been around for almost as long as .NET itself, but for many years it struggled for popularity because, although .NET is a managed platform, it relies on a lot of native libraries. In the case of a Windows Forms or a WPF app, for example, the windows and controls are drawn using Windows libraries that are part of the operating system. What this means is that these apps won’t work under Mono on any non-Windows computer because those libraries simply aren’t there.

In 2011 stewardship of Mono passed to Xamarin, which was in turn acquired by Microsoft in 2016, so now Microsoft owns yet another .NET runtime. There’s nothing quite like diversification!

Mono in Unity

Most of the Unity game engine is written in C++, but game developers write their code in C#. On platforms that are well supported by Mono, such as Windows and Mac, this C# code runs under Mono. On other platforms, such as iOS, Unity uses a transpilation process called IL2CPP (Intermediate Language To C Plus Plus) that converts the compiled C# (intermediate language) into C++. IL2CPP is more restrictive than Mono, but it has better cross-platform support. And where Unity runs under Mono, it in fact uses its own fork.

This post provides an interesting discussion of the internal workings of the Unity game engine and its possible future directions.

With the preamble out of the way it’s time to get into the technical details.

The Goal

The goal is to gain access to and understand the contents of the memory running under Mono in a Unity game on Windows.

Since all of the game logic runs under Mono, this memory contains everything useful about the game, so this isn’t any real restriction.

You can find the code that goes with this article here: https://github.com/hackf5/unityspy.

The solution can be found at /src/HackF5.UnitySpy.sln. It contains two projects:

  • /src/HackF5.UnitySpy/HackF5.UnitySpy.csproj contains the core logic for reading data in a process. I’ll refer to this as the core library.
  • /src/HackF5.UnitySpy.Gui/HackF5.UnitySpy.Gui.csproj contains a simple application that demonstrates how to use the HackF5.UnitySpy library for browsing the state of a Unity game. I’ll refer to this as the GUI app.

The core library must be run from a 32-bit process.

The code is written in C# and can be compiled and run using Visual Studio 2019 Community Edition. The library itself targets .NET Standard 2.0, so it will compile on Linux and Mac as well as Windows; however, it only runs under Windows. In theory it could be modified to run on Mac and Linux, but this would require more knowledge of macOS and Linux than I possess.

The Plan

To save a few bytes in the cloud I’m going to use the word process interchangeably to mean the Unity game process and the memory that the operating system has allocated to that process.

  1. Get hold of a reference to the process.
  2. Get a reference to the process’ root AppDomain.
  3. Using the AppDomain, find the data in the process that represents the main game logic Assembly. By default Unity calls this Assembly Assembly-CSharp.
  4. Using the Assembly data, find the data in the process that represents all of the Types in the Assembly.
  5. Use the Type data to interpret the process’ game state.

This approach works because the Types directly reference their static fields, and game state is often held in a statically rooted data model. So by finding the static objects, looking at their fields, then looking at the fields of those fields and so on, it is possible to build up a picture of the current game state.

Reference the Unity game process

In .NET, processes are represented by a Process object. Process has a number of useful static methods that allow you to get a reference to a running process.

  • Process.GetProcesses() returns a collection of all process resources running on the local computer.
  • Process.GetProcessById(int processId) returns the process running on the local computer with a specific process ID.

The core library takes a process ID. See the ProcessFacade class.

The GUI app gets all running processes, to allow the user to choose the process that they want to inspect. It then passes the ID of this process into the core library. See the MainViewModel class.

Find the root AppDomain

If you’ve spent a while working with .NET applications, then you’ll probably know that .NET applications are composed of one or more AppDomains. You can think of an AppDomain as an isolated .NET process inside a native process.

For our purposes, all you really need to know about AppDomains is that most .NET applications only have one of them, the root AppDomain, and an AppDomain contains a list of its loaded Assemblies.

The root AppDomain location

Heading over to the Mono documentation you can see that Mono declares a function called mono_get_root_domain. Its return value is a pointer to the application’s root AppDomain. That is the thing I want to get hold of.

What I want to do is call mono_get_root_domain to find the location of the root AppDomain in the process, but there is no straightforward way to call a function inside another process, so normally this wouldn’t be of much use. However, I can try to disassemble the mono.dll library to see what the function looks like.

If you have the Unity IDE installed then you can find this library at C:\Program Files\Unity\Editor\Data\Mono\bin\mono.dll.

Unlike .NET, which can be beautifully decompiled using the incredible free JetBrains decompiler dotPeek, C/C++ does not decompile or even disassemble well, but simple functions disassemble to something almost comprehensible. Thankfully an excellent, free, GPL-licensed C/C++ disassembler exists in the form of Snowman.

If you open up Snowman, disassemble Unity’s mono.dll library and look for the mono_get_root_domain function you will see it is defined as

Yuck! But it isn’t actually quite as bad as it first looks. In fact the assembly code is more helpful. What the assembly tells us is that the function starts at address 10027c32 and at this address is an instruction to move the constant 4 byte value 0x101f62cc into the EAX register. See the assembly mov instruction for more details.

So the mono_get_root_domain function is located at 10027c32, where there is a single-byte mov eax opcode, and the next 4 bytes contain the address of the root AppDomain that should be moved into eax.

This means that addressof(AppDomain) = addressof(mono_get_root_domain) + 1.
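To make the "+ 1" concrete, here is a minimal sketch in Python (used here purely for brevity; the library itself is C#). The byte values are an assumption made up for illustration: a one-byte opcode, the 4-byte operand from the disassembly above stored little-endian, and a ret. Only the operand value comes from the article.

```python
import struct

def read_u32(buffer: bytes, offset: int) -> int:
    """Interpret the 4 bytes at `offset` as an unsigned little-endian integer."""
    return struct.unpack_from("<I", buffer, offset)[0]

# Hypothetical bytes of mono_get_root_domain: one opcode byte (its exact value
# is an assumption), the operand 0x101f62cc in little-endian order, then ret.
function_bytes = bytes([0xA1, 0xCC, 0x62, 0x1F, 0x10, 0xC3])

# Skip the single opcode byte to land on the 4-byte operand.
root_domain_address = read_u32(function_bytes, 1)
print(hex(root_domain_address))
```

The only real logic is "skip one byte, read four", which is why the rest of the work is about finding where the function starts.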

For a specific mono.dll this value will be constant, but the constant is calculated by the compiler at compile time, so if I hard-coded the value 0x101f62cc from a specific version of the mono.dll, then the core library would only be able to inspect Unity games that referenced that specific version of the mono.dll library. That means I need to find the definition of the mono_get_root_domain function at runtime and read the AppDomain address out of it.

Read the mono.dll from the Unity game process

The first thing to do is to get the mono.dll module from the Unity game process. The .NET framework makes this straightforward as it can be found by enumerating the process’ Modules collection.

Having got hold of the mono.dll module, I need to use some native code to dump it into a byte array.

All of the hard work is done by the ReadProcessMemory function that is defined in the Windows operating system library kernel32. The mono.dll Module tells me it starts at monoModule.BaseAddress in the process’ memory and it consists of monoModule.ModuleMemorySize bytes. So all I need to do is use ReadProcessMemory to read that chunk of the Unity game process’ memory into a byte array of the same size as the module.

Now the variable moduleDump contains the contents of the mono.dll module.

See the ProcessFacade.ReadModule(...) method.
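For readers who want to see the shape of the native call, the following is a rough Python sketch using ctypes rather than the library's C# P/Invoke code. OpenProcess, ReadProcessMemory and CloseHandle are real kernel32 functions; the wrapper name read_process_memory and its parameters are my own, and the caller is assumed to already know the module's base address and size. It only works on Windows.

```python
import ctypes

PROCESS_VM_READ = 0x0010
PROCESS_QUERY_INFORMATION = 0x0400

def read_process_memory(process_id: int, base_address: int, size: int) -> bytes:
    """Copy `size` bytes starting at `base_address` out of another process."""
    from ctypes import wintypes  # imported lazily: only needed on Windows

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.OpenProcess.argtypes = [wintypes.DWORD, wintypes.BOOL, wintypes.DWORD]
    kernel32.OpenProcess.restype = wintypes.HANDLE
    kernel32.ReadProcessMemory.argtypes = [
        wintypes.HANDLE, wintypes.LPCVOID, wintypes.LPVOID,
        ctypes.c_size_t, ctypes.POINTER(ctypes.c_size_t),
    ]
    kernel32.ReadProcessMemory.restype = wintypes.BOOL
    kernel32.CloseHandle.argtypes = [wintypes.HANDLE]
    kernel32.CloseHandle.restype = wintypes.BOOL

    access = PROCESS_VM_READ | PROCESS_QUERY_INFORMATION
    handle = kernel32.OpenProcess(access, False, process_id)
    if not handle:
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        buffer = ctypes.create_string_buffer(size)
        bytes_read = ctypes.c_size_t(0)
        ok = kernel32.ReadProcessMemory(
            handle, base_address, buffer, size, ctypes.byref(bytes_read))
        if not ok:
            raise ctypes.WinError(ctypes.get_last_error())
        # Return only the bytes that were actually copied.
        return buffer.raw[:bytes_read.value]
    finally:
        kernel32.CloseHandle(handle)
```

Dumping the module is then a single call with the module's base address and memory size as the last two arguments.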

Find the mono_get_root_domain function

All Windows DLLs start with a standard header that describes, amongst other things, the location of each function in the DLL. The header conforms to the PE Format specification.

If you’ve worked with raw byte arrays before, then you’ll know what’s coming. If not you might be in for a shock, as it involves navigating to specific locations in the array and interpreting the bytes that you find there as some known type, usually something like an integer or a string.

In this case the PE Format specification describes how to find the location of the mono_get_root_domain function.

There is a fairly complete .NET PE file reader library called PeNet, but unfortunately when I tried it, it didn’t work out of the box with the mono.dll byte array dump, although with a bit of tweaking I did almost get it working.

Since I only needed a little bit of information from the PE header and the PeNet library needed modifying, I decided that the best way to go was to copy the offsets from the HearthMirror project and to read the data from the module dump directly as they’ve done.

In summary, the core library’s code enumerates the list of functions declared in the module’s PE header and, when it finds one called mono_get_root_domain, sets rootDomainFunctionAddress to the address of this function.

So I’ve found the address of the root AppDomain: it’s located at address rootDomainFunctionAddress + 1 in the Unity game process’ memory.

See the AssemblyImageFactory.GetRootDomainFunctionAddress(…) method.
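The lookup described in this section can be sketched as follows. This is a Python illustration, not the library's C# code. The field offsets are the ones given by the PE format for a 32-bit (PE32) image, and because the dump was taken from a loaded module, relative virtual addresses can be used directly as indexes into the byte array. The helper names and the tiny synthetic image are made up for the demonstration.

```python
import struct

def u16(buf, off):
    return struct.unpack_from("<H", buf, off)[0]

def u32(buf, off):
    return struct.unpack_from("<I", buf, off)[0]

def cstring(buf, off):
    """Read a NUL-terminated ASCII string starting at `off`."""
    return buf[off:buf.index(b"\x00", off)].decode("ascii")

def find_export(module_dump: bytes, name: str) -> int:
    """Return the RVA of the exported function `name` in a PE32 module dump."""
    pe_offset = u32(module_dump, 0x3C)                 # e_lfanew
    if module_dump[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("not a PE image")
    export_dir = u32(module_dump, pe_offset + 0x78)    # export table RVA (PE32)
    name_count = u32(module_dump, export_dir + 0x18)   # NumberOfNames
    functions = u32(module_dump, export_dir + 0x1C)    # AddressOfFunctions
    names = u32(module_dump, export_dir + 0x20)        # AddressOfNames
    ordinals = u32(module_dump, export_dir + 0x24)     # AddressOfNameOrdinals
    for i in range(name_count):
        if cstring(module_dump, u32(module_dump, names + 4 * i)) == name:
            index = u16(module_dump, ordinals + 2 * i)
            return u32(module_dump, functions + 4 * index)
    raise KeyError(name)

def build_demo_image() -> bytes:
    """Build a tiny synthetic image with two named exports (illustration only)."""
    img = bytearray(0x200)
    def put(off, raw):
        img[off:off + len(raw)] = raw
    struct.pack_into("<I", img, 0x3C, 0x80)            # e_lfanew -> 0x80
    put(0x80, b"PE\x00\x00")
    struct.pack_into("<I", img, 0x80 + 0x78, 0x100)    # export directory at 0x100
    struct.pack_into("<I", img, 0x118, 2)              # two names
    struct.pack_into("<III", img, 0x11C, 0x140, 0x150, 0x160)
    struct.pack_into("<II", img, 0x140, 0x1000, 0x27C32)   # function RVAs
    struct.pack_into("<II", img, 0x150, 0x170, 0x180)      # name RVAs
    struct.pack_into("<HH", img, 0x160, 0, 1)              # name -> function index
    put(0x170, b"mono_free\x00")
    put(0x180, b"mono_get_root_domain\x00")
    return bytes(img)

rva = find_export(build_demo_image(), "mono_get_root_domain")
print(hex(rva))
```

Against a real dump the returned value is relative to the module's base address, so the module base has to be added to it before reading from the process.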

Read the AppDomain

The Mono library defines various structs that map onto managed .NET types. The type that maps to the managed AppDomain type is _MonoDomain.

Looking at the _MonoDomain struct you can see that it contains a pointer to a GSList called domain_assemblies. A GSList is a singly linked list of void pointers. You can probably assume these are pointers to structs of type _MonoAssembly, the Mono type that maps to a .NET Assembly.

The problem here is working out what the offset of domain_assemblies is from the start of the _MonoDomain as this involves counting the cumulative size of all of the fields that are declared before domain_assemblies.

Since the Unity game process is 32-bit, the size of a pointer is the size of a 32-bit integer. There are 8 bits in a byte, so in 32 bits there are 4 bytes, so the size of a pointer is 4.

In _MonoDomain I count 18 pointers and three 32-bit integers declared using the typedef gint32 before reaching domain_assemblies, which makes 84 bytes in total. But then at the start of the struct is a rather inconvenient field called lock of type MonoCoopMutex. The problem with this type is that it isn’t possible to work out its size because it is declared externally.

My best guess for the type of MonoCoopMutex is that it is _RTL_CRITICAL_SECTION, which is defined in the standard Windows header file winnt.h. The size of that type looks like 36 bytes, which would give an offset of 120 bytes in total. However, that is not the offset that is used in the HearthMirror project. HearthMirror uses an offset of 112 bytes, so the size of MonoCoopMutex must be 28 bytes.
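Spelled out as arithmetic, using only the figures quoted above (the 36-byte value is my guess from the text, not a verified size), the sums look like this in a few lines of Python:

```python
POINTER_SIZE = 4   # bytes, in a 32-bit process
GINT32_SIZE = 4    # bytes

# Everything declared before domain_assemblies, excluding the lock field.
fields_before = 18 * POINTER_SIZE + 3 * GINT32_SIZE

# Offset if the lock really were 36 bytes, as guessed above.
guessed_offset = fields_before + 36

# Working backwards from the offset HearthMirror uses.
hearthmirror_offset = 112
implied_lock_size = hearthmirror_offset - fields_before

print(fields_before, guessed_offset, implied_lock_size)
```

The mismatch between the guessed offset and 112 is the 8-byte difference between the guessed lock size and the implied one.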

I’m cheating here though, because I’m just copying someone else’s hard work. So how could the HearthMirror devs have worked out that the offset is 112?

Well, I tried counting from the start, but that didn’t work because there was no way to know the size of MonoCoopMutex, so another option is to count from some known location.

The best candidate for this, in my opinion, is the friendly_name field. This field is located two pointers, or 8 bytes, away from our target field. So if I can work out the offset of friendly_name then I can work out the offset of the domain_assemblies field that I’m interested in.

The friendly_name field is a good candidate because it’s a C-style string that probably contains some recognisable value. To find this field I can start at an offset of 92 bytes (two pointers on from the minimum offset of the domain_assemblies field) and try reading the C-style string I find at each position.

There are a number of ways of doing this. I could write a bit of C# code to scan through each candidate in a running Unity game’s memory, or I could use a hex editor capable of reading directly from a process’ memory. I decided to use the excellent and free HxD Hex Editor and do it by hand.

Using this I was able to find that friendly_name has value Unity Root Domain and with it the offset of the friendly_name field from the start of the _MonoDomain object. What I was looking for was a string that wasn’t junk.

Value of friendly_name variable

With this I’ve legitimately found the offset of the domain_assemblies field and can go on to look for the main Unity game logic Assembly called Assembly-CSharp.

See AssemblyImageFactory.GetAssemblyImage(…).
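I did the scan by hand, but the "write a bit of code" alternative might look something like the sketch below. It is Python, and it runs against a fake block of memory rather than a live process so that it can be executed anywhere. FakeProcess, scan_for_strings and the memory layout are all hypothetical; the offset of 120 in the fake data is simply 112 + 8 from the figures in this article, not something read from a game.

```python
import struct

class FakeProcess:
    """Stand-in for a live process: a byte buffer mapped at address `base`."""

    def __init__(self, base, data):
        self.base, self.data = base, bytes(data)

    def read_u32(self, address):
        return struct.unpack_from("<I", self.data, address - self.base)[0]

    def read_cstring(self, address, limit=64):
        start = address - self.base
        if not 0 <= start < len(self.data):
            return b""  # the pointer leads outside the memory we can see
        return self.data[start:start + limit].split(b"\x00", 1)[0]

def scan_for_strings(process, struct_address, first, last):
    """Return (offset, text) for each 4-byte slot that points at printable ASCII."""
    hits = []
    for offset in range(first, last + 1, 4):
        raw = process.read_cstring(process.read_u32(struct_address + offset))
        if len(raw) >= 4 and all(0x20 <= b < 0x7F for b in raw):
            hits.append((offset, raw.decode("ascii")))
    return hits

# Fake _MonoDomain at BASE: one slot points at junk, one at a readable name.
BASE = 0x1000
memory = bytearray(0x200)

def put(offset, raw):
    memory[offset:offset + len(raw)] = raw

struct.pack_into("<I", memory, 100, BASE + 0x180)  # slot pointing at junk bytes
struct.pack_into("<I", memory, 120, BASE + 0x100)  # slot pointing at the name
put(0x100, b"Unity Root Domain\x00")
put(0x180, b"\x01\x02\x03\x04\x05\x00")

hits = scan_for_strings(FakeProcess(BASE, memory), BASE, 92, 160)
friendly_name_offset = hits[0][0]
domain_assemblies_offset = friendly_name_offset - 8  # two pointers earlier
print(hits, domain_assemblies_offset)
```

A real scan would throw up more false positives than this, which is why recognising a meaningful string by eye is still the deciding step.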

Find the Assembly-CSharp main game logic Assembly

To find this Assembly I need to enumerate over the collection of _MonoAssembly objects contained in the domain_assemblies singly linked list until I find an Assembly called Assembly-CSharp.

This is easier than finding the domain_assemblies offset as the singly linked list can be enumerated by following the pointer at a 4 byte offset from the start of each item in the list. The first 4 bytes of each item are a pointer to the _MonoAssembly referenced by the element.

The Assembly’s name is found by reading the aname field, of type _MonoAssemblyName, which sits at offset 8 (two pointers from the start) of the _MonoAssembly.

The C-style string that holds the name is located at the start of the _MonoAssemblyName.

See AssemblyImageFactory.GetAssemblyImage(…).
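As a rough illustration of the list walk, here is a Python sketch over fake memory. The offsets (data pointer at 0, next pointer at 4, name pointer at 8) are the ones described above; FakeProcess, find_assembly and the addresses in the demo are hypothetical.

```python
import struct

class FakeProcess:
    """Stand-in for a live process: a byte buffer mapped at address `base`."""

    def __init__(self, base, data):
        self.base, self.data = base, bytes(data)

    def read_u32(self, address):
        return struct.unpack_from("<I", self.data, address - self.base)[0]

    def read_cstring(self, address):
        start = address - self.base
        return self.data[start:self.data.index(b"\x00", start)]

def find_assembly(process, first_node, wanted):
    """Walk the GSList and return the address of the assembly called `wanted`."""
    node = first_node
    while node != 0:
        assembly = process.read_u32(node)            # first 4 bytes: the data pointer
        name_ptr = process.read_u32(assembly + 8)    # aname starts with the name pointer
        if process.read_cstring(name_ptr) == wanted.encode("ascii"):
            return assembly
        node = process.read_u32(node + 4)            # next 4 bytes: the next node
    return 0

# Fake memory: two list nodes, two assemblies, two names.
BASE = 0x2000
memory = bytearray(0x100)

def put_u32(address, value):
    struct.pack_into("<I", memory, address - BASE, value)

def put_bytes(address, raw):
    memory[address - BASE:address - BASE + len(raw)] = raw

put_u32(0x2000, 0x2040); put_u32(0x2004, 0x2010)   # node 1 -> assembly 1, then node 2
put_u32(0x2010, 0x2060); put_u32(0x2014, 0)        # node 2 -> assembly 2, end of list
put_u32(0x2048, 0x2080)                            # assembly 1 name pointer
put_u32(0x2068, 0x2090)                            # assembly 2 name pointer
put_bytes(0x2080, b"mscorlib\x00")
put_bytes(0x2090, b"Assembly-CSharp\x00")

process = FakeProcess(BASE, memory)
found = find_assembly(process, 0x2000, "Assembly-CSharp")
print(hex(found))
```

In the real process the first node is the pointer read from the domain_assemblies field of the _MonoDomain.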

Find the Types referenced by the main game logic Assembly

The Type information is held in the field image of type _MonoImage. This field has an offset of 64 bytes from the start of a _MonoAssembly.

The information I’m interested in is stored in a field called class_cache of type _MonoInternalHashTable. The hash table tells you how many elements it contains and then provides a pointer to the first item in the table. You can make the assumption that this item is a pointer to an object of type _MonoClass.

I’m going to be completely honest here, I’m not even going to guess how the HearthMirror devs found the offset of the class_cache field.

The best way of solving the problem would be with an industry-grade C/C++ disassembler like IDA from Hex-Rays. Unfortunately for the amateur hacker, this tool starts at around $1000 USD. It’s obviously worth the money if you do this sort of thing regularly, but if you don’t it’s hard to justify the cost.

Another possible alternative is to compile the Mono library from source and then to write a small C program that uses the offsetof macro to calculate the offsets of the fields that you’re interested in. With a little bit of messing around this isn’t too difficult to do.

I compiled and ran the following code in Visual Studio.

It would be nice if this worked, but the offset of the domain_assemblies field doesn’t match the value I found earlier, so it clearly doesn’t work. There are two problems here:

  1. I don’t know what compiler Unity uses to build its Mono DLL.
  2. I don’t know what version of the source code they compile against.

Different compilers can pack types differently, so even if I had the right source code my compiler could return a different offset. And of course if my source code is different from the DLL that Unity ships I could be comparing oranges with apples. So unless you have the exact compiler and source, this strategy is unlikely to work. The correct class_cache offset is actually 672, which is way off the value that I found using the compiler.
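To see how much packing alone can move a field, here is a small Python illustration using ctypes. It is not the C program mentioned above, and the two-field struct is invented for the example. It declares the same fields twice, once with the platform's natural alignment and once packed to 1 byte, and compares the resulting offsets.

```python
import ctypes

# The same two fields, laid out under two different packing rules.
FIELDS = [("state", ctypes.c_uint8), ("domain_assemblies", ctypes.c_uint32)]

class NaturalLayout(ctypes.Structure):
    # Default: each field is aligned to its natural boundary.
    _fields_ = FIELDS

class PackedLayout(ctypes.Structure):
    # Packed to 1 byte: no padding is inserted between fields.
    _pack_ = 1
    _fields_ = FIELDS

natural_offset = NaturalLayout.domain_assemblies.offset
packed_offset = PackedLayout.domain_assemblies.offset
print(natural_offset, packed_offset)
```

On common platforms the 32-bit field lands at offset 4 in the first layout and offset 1 in the second, and that is with identical source. Across a struct with dozens of fields these differences add up quickly.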

Without a disassembler you are going to need to reverse engineer the address from the process’ memory directly.

If I was going to do this then I would start by scanning the whole process memory for class names that I knew were declared in the Assembly-CSharp Assembly. Once I’d found these, then I would assume that they were part of a _MonoClass object and with enough of them I would hope to work my way back to the class_cache by looking for the addresses of those _MonoClass objects. It’s probably doable, but it isn’t straightforward.

Having found the class_cache then extracting the type information is more of the same. Work out which fields you need from the objects you’re interested in, work out their offsets and read them.

See AssemblyImage to see how the Type information is found.

See TypeDefinition to see how the Type information is read.
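As a very rough sketch of what reading the class_cache involves, the Python below walks a hash table laid out in fake memory. Only the 672 offset comes from this article. The size and table offsets assume the public declaration of _MonoInternalHashTable in the Mono source (three function pointers, then size, then entry count, then the bucket array pointer), and NEXT_CLASS_CACHE is a purely hypothetical offset for the pointer that chains classes within a bucket. The real values depend on the Mono build.

```python
import struct

CLASS_CACHE = 672          # offset of class_cache in _MonoImage (from the article)
HASH_TABLE_SIZE = 12       # assumed: bucket count, after three function pointers
HASH_TABLE_TABLE = 20      # assumed: pointer to the bucket array
NEXT_CLASS_CACHE = 0xA8    # hypothetical: chain pointer inside _MonoClass

class FakeProcess:
    """Stand-in for a live process: a byte buffer mapped at address `base`."""

    def __init__(self, base, data):
        self.base, self.data = base, bytes(data)

    def read_u32(self, address):
        return struct.unpack_from("<I", self.data, address - self.base)[0]

def enumerate_classes(process, image_address):
    """Return the address of every _MonoClass reachable from the class cache."""
    table = image_address + CLASS_CACHE
    bucket_count = process.read_u32(table + HASH_TABLE_SIZE)
    buckets = process.read_u32(table + HASH_TABLE_TABLE)
    classes = []
    for i in range(bucket_count):
        klass = process.read_u32(buckets + 4 * i)
        while klass != 0:                      # follow the chain in this bucket
            classes.append(klass)
            klass = process.read_u32(klass + NEXT_CLASS_CACHE)
    return classes

# Fake memory: two buckets, the first holding a chain of two classes.
BASE = 0x3000
memory = bytearray(0x800)

def put_u32(address, value):
    struct.pack_into("<I", memory, address - BASE, value)

put_u32(BASE + CLASS_CACHE + HASH_TABLE_SIZE, 2)
put_u32(BASE + CLASS_CACHE + HASH_TABLE_TABLE, 0x3300)
put_u32(0x3300, 0x3400)                       # bucket 0 -> class at 0x3400
put_u32(0x3304, 0x3500)                       # bucket 1 -> class at 0x3500
put_u32(0x3400 + NEXT_CLASS_CACHE, 0x3600)    # 0x3400 chains to 0x3600

classes = enumerate_classes(FakeProcess(BASE, memory), BASE)
print([hex(c) for c in classes])
```

Each address returned would then be interpreted as a _MonoClass to pull out its name, fields and so on.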

Again all of the hard work has been done by the HearthMirror devs.

Read the process’ game state

The entry point into managed memory is through a Type’s static fields. These are found in _MonoClass->runtime_info->domain_vtables->vtable. This works because the game engine runs in the root AppDomain. If it was running in another AppDomain then it would be necessary to shift domain_vtables by the ID of the AppDomain. As before you probably need a disassembler to work out the appropriate offsets.

At this stage things get easier because the _MonoClass objects describe their fields along with their offsets, so you can read the offsets of the managed object fields directly, without needing to work them out.

Once you have the Type information it’s simply a matter of finding the static fields that hold useful information and then using the field offsets to read this information.

See TypeDefinition.GetStaticValue(…) to see how object data is read.
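The pointer chain can be pictured with the following Python sketch over fake memory. Every offset constant here is hypothetical and exists only to make the example run; the real values are version-specific and have to be found as described above. What the sketch is meant to show is the shape of the traversal, including how a non-root AppDomain would shift the index into domain_vtables.

```python
import struct

# All three offsets are hypothetical placeholders, not values for any real Mono build.
RUNTIME_INFO = 0xA4        # _MonoClass.runtime_info
DOMAIN_VTABLES = 0x04      # start of the domain_vtables array in the runtime info
VTABLE_STATICS = 0x0C      # pointer to static field storage in the vtable

class FakeProcess:
    """Stand-in for a live process: a byte buffer mapped at address `base`."""

    def __init__(self, base, data):
        self.base, self.data = base, bytes(data)

    def read_u32(self, address):
        return struct.unpack_from("<I", self.data, address - self.base)[0]

def read_static_u32(process, klass, field_offset, domain_id=0):
    """Follow class -> runtime info -> vtable -> statics and read one 32-bit field."""
    runtime_info = process.read_u32(klass + RUNTIME_INFO)
    # One vtable pointer per AppDomain; the root domain is index 0.
    vtable = process.read_u32(runtime_info + DOMAIN_VTABLES + 4 * domain_id)
    statics = process.read_u32(vtable + VTABLE_STATICS)
    return process.read_u32(statics + field_offset)

# Fake memory holding one class whose static field at offset 8 is 1234.
BASE = 0x4000
memory = bytearray(0x200)

def put_u32(address, value):
    struct.pack_into("<I", memory, address - BASE, value)

put_u32(0x4000 + RUNTIME_INFO, 0x4100)       # class -> runtime info
put_u32(0x4100 + DOMAIN_VTABLES, 0x4140)     # runtime info -> vtable for domain 0
put_u32(0x4140 + VTABLE_STATICS, 0x4180)     # vtable -> static storage
put_u32(0x4180 + 8, 1234)                    # the static field itself

value = read_static_u32(FakeProcess(BASE, memory), 0x4000, field_offset=8)
print(value)
```

For reference-type fields the value read would itself be a pointer to a managed object, whose fields are then read the same way using the offsets described by its _MonoClass.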

Trying it out

You can see this in action for yourself by:

  1. Getting a copy of Hearthstone.
  2. Getting and building the GUI app in the Unity Spy solution.
  3. Running Hearthstone and then running the GUI app.

If you do this you’ll be able to browse the Hearthstone game data. For example you’ll be able to get a list of all of the cards that you own.

If you enjoyed this article and want to see more like it please give it a clap and leave a comment.