Warning: This is a poorly-written brain dump written solely in the interest of getting some information out.
The Glitch
We've got an interesting new glitch in The Legend of Zelda: Twilight Princess!
A couple days ago, another Twilight Princess glitch hunter, S0ft, posted a screenshot of some logs from Dolphin resembling the following:
Core\PowerPC\MMU.cpp:412 E[MASTER]: Warning: Unable to resolve write address 104100a3 PC 830
This immediately caught my attention. I see new crashes posted in Discord somewhat frequently, and most of them are simple, well-handled memory errors caused by attempting to read from somewhere in the 0x0000–0x1000 range; that is, null pointer reads. There are dozens of ways to make Twilight Princess do that, and they're generally not that interesting, since the game's crash handler kicks in and stops any further execution.
This one is different. Put simply, PC should not be 830. This means we haven't done an out-of-bounds read; we've done an out-of-bounds jump, sending program execution off the rails entirely.
So I started investigating. Thankfully, this one is pretty easy to reproduce:
- perform The Amazing Fly Glitch
- as soon as the FISH ON! text appears, soft-reset the console
- load a save
- view an area banner, pick up an item, talk to an NPC, or a variety of other things to trigger the glitch
Sure enough, a breakpoint I'd set at 0x00000800
tripped. But how did we get here? Well, the answer was a lot weirder than I expected.
An enormous shout-out here to Taka and the rest of the Twilight Princess Decompilation team, without whom this analysis would not have been possible.
What's actually happened here is that we caused an out-of-bounds memory access, the same kind of thing that usually happens with these Twilight Princess crashes. Specifically, in this function:
    void OSSleepThread(OSThreadQueue* queue) {
        BOOL enabled;
        OSThread* currentThread;

        enabled = OSDisableInterrupts();
        currentThread = OSGetCurrentThread();
        currentThread->state = OS_THREAD_STATE_WAITING; // memory error is on this line
        currentThread->queue = queue;
        AddPrio(queue, currentThread, link);
        RunQueueHint = TRUE;
        __OSReschedule();
        OSRestoreInterrupts(enabled);
    }
currentThread is an invalid (not NULL!) pointer, so the attempt to write currentThread->state crashes. But what happened to currentThread?
Well, OSGetCurrentThread returns the value of OS_CURRENT_THREAD, which is always stored at address 0x800000E4. Sure enough, the value stored at OS_CURRENT_THREAD wasn't a valid pointer.
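To make that check concrete, here is a minimal sketch of how one might inspect a dump of low memory for this condition. It assumes the dump starts at 0x80000000 and that main RAM is a 24 MiB window at that address; the function names are hypothetical helpers, not part of the game or the SDK.

```c
#include <stdint.h>

/* Assumptions for illustration: 24 MiB of main RAM mapped at 0x80000000. */
#define MEM1_BASE 0x80000000u
#define MEM1_SIZE 0x01800000u
#define OS_CURRENT_THREAD_OFFSET 0xE4u /* 0x800000E4 - 0x80000000 */

/* Read a big-endian 32-bit value, the byte order the PowerPC CPU uses. */
static uint32_t read_be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* low_mem points at a dump of memory starting at 0x80000000 (hypothetical). */
uint32_t current_thread_pointer(const uint8_t *low_mem) {
    return read_be32(low_mem + OS_CURRENT_THREAD_OFFSET);
}

/* Returns 1 if addr falls inside the assumed main-RAM window, else 0. */
int points_into_main_ram(uint32_t addr) {
    return addr >= MEM1_BASE && (addr - MEM1_BASE) < MEM1_SIZE;
}
```

A pointer that fails points_into_main_ram is not necessarily NULL; it only has to fall outside the mapped window for the access to fault.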
Obviously, my first instinct was to set a memory breakpoint on 0x800000E4 in Dolphin and work from there. I set that up, performed the glitch again, and Dolphin never observed the invalid pointer being written to that address, even though it still reported an invalid access exception.
Huh?
Well, either way, I noticed something else strange while looking at memory around 0x800000E4. It looks for all the world like a Yaz0-compressed archive (which the game uses for various resource files) has been placed at address 0x80000000 instead of the data that's supposed to be there.
About an hour later, I found the answer. This isn't a CPU-level copy or some kind of memory remapping; it's the result of a DMA copy from some part of ARAM to main memory starting at 0x80000000. The mechanism that causes this is refreshingly simple:
    static int JKRDecompressFromAramToMainRam(u32 src, void* dst, u32 srcLength, u32 dstLength,
                                              u32 offset, u32* resourceSize) {
        BOOL interrupts = OSDisableInterrupts();
        if (s_is_decompress_mutex_initialized == false) {
            OSInitMutex(&decompMutex);
            s_is_decompress_mutex_initialized = true;
        }
        OSRestoreInterrupts(interrupts);
        OSLockMutex(&decompMutex);

        u32 szsBufferSize = JKRAram::getSZSBufferSize();
        szpBuf = (u8*)JKRAllocFromSysHeap(szsBufferSize, 32);
        /* ... */
        decompSZS_subroutine(firstSrcData(), (u8*)dst);
        /* ... */
    }
This code is a little unclear because of the use of some global variables, but szpBuf will eventually be used by firstSrcData() as the target of a DMA copy operation. The problem is that JKRDecompressFromAramToMainRam doesn't check whether the JKRAllocFromSysHeap allocation succeeds; if the allocation fails, szpBuf will become 0, and the DMA operation will target the start of main memory. Since the DMA engine (apparently) can't segfault, this just works and the copy result is aliased to 0x80000000. This ends up copying up to 0x2000 bytes of the archive that's intended to be decompressed, usually some kind of font, sound or animation file.
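The failure mode can be modeled in a few lines. The sketch below is a toy simulation, not the game's code: SimMachine, sim_alloc, and sim_dma_to_ram are hypothetical stand-ins, and a failed allocation is represented by offset 0, mirroring how a null szpBuf ends up targeting the start of main memory. It contrasts the unchecked pattern with a version that bails out when the allocation fails.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SIM_RAM_SIZE 0x4000u
#define SIM_ALLOC_FAILED 0u /* models a null pointer: offset 0 of main RAM */

/* Toy machine: a small slice of "main RAM" starting at offset 0. */
typedef struct {
    uint8_t ram[SIM_RAM_SIZE];
} SimMachine;

/* Hypothetical allocator: returns a RAM offset, or 0 when the heap is full. */
static uint32_t sim_alloc(int heap_exhausted) {
    return heap_exhausted ? SIM_ALLOC_FAILED : 0x2000u;
}

/* Hypothetical DMA engine: copies wherever it is told, with no null check. */
static void sim_dma_to_ram(SimMachine *m, uint32_t dst, const uint8_t *src,
                           uint32_t len) {
    assert(len <= SIM_RAM_SIZE && dst <= SIM_RAM_SIZE - len);
    memcpy(m->ram + dst, src, len);
}

/* Mirrors the unchecked pattern: a failed allocation still gets copied to. */
void load_unchecked(SimMachine *m, const uint8_t *src, uint32_t len,
                    int heap_exhausted) {
    uint32_t buf = sim_alloc(heap_exhausted);
    sim_dma_to_ram(m, buf, src, len);
}

/* Guarded variant: returns -1 and leaves RAM alone if allocation fails. */
int load_checked(SimMachine *m, const uint8_t *src, uint32_t len,
                 int heap_exhausted) {
    uint32_t buf = sim_alloc(heap_exhausted);
    if (buf == SIM_ALLOC_FAILED) {
        return -1;
    }
    sim_dma_to_ram(m, buf, src, len);
    return 0;
}
```

With the heap exhausted, load_unchecked leaves the archive's first bytes at offset 0 and overwrites everything up to the copy length, while load_checked reports the failure and leaves low memory untouched.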
When the next thread yields, the OS tries to read the thread structure pointer from 0x800000E4, which the DMA copy overwrote with an invalid pointer. This traps, and execution is transferred to the out-of-bounds read handler at 0x80000300. The trick is that since we copied up to 0x2000 bytes into RAM here, we also overwrote the exception handler! That means that we're now executing the contents of that compressed archive file as code!
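To see how much of the exception table a copy of that size covers, here is a rough sketch. The table lists the standard PowerPC 750-family vector offsets as I understand them (only 0x300 and 0x800 are named in this post; the rest are included as an assumption for context), and the helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t offset;  /* offset from the start of main memory */
    const char *name;
} ExceptionVector;

/* Assumed PowerPC 750-family vector layout; not taken from the game. */
static const ExceptionVector kVectors[] = {
    {0x0100, "system reset"},
    {0x0200, "machine check"},
    {0x0300, "DSI (data access)"},
    {0x0400, "ISI (instruction access)"},
    {0x0500, "external interrupt"},
    {0x0600, "alignment"},
    {0x0700, "program"},
    {0x0800, "floating-point unavailable"},
    {0x0900, "decrementer"},
    {0x0C00, "system call"},
};

/* Returns 1 if the first word of the vector lies in [start, start+length). */
int copy_overwrites(uint32_t vector_offset, uint32_t copy_start,
                    uint32_t copy_length) {
    return vector_offset >= copy_start &&
           (vector_offset - copy_start) < copy_length;
}

/* Counts how many listed vectors have their entry point overwritten. */
size_t count_clobbered_vectors(uint32_t copy_start, uint32_t copy_length) {
    size_t n = 0;
    for (size_t i = 0; i < sizeof kVectors / sizeof kVectors[0]; i++) {
        n += (size_t)copy_overwrites(kVectors[i].offset, copy_start,
                                     copy_length);
    }
    return n;
}
```

Under these assumptions, a full 0x2000-byte copy starting at offset 0 covers every vector in the table, so whichever exception fires first will land in archive data.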
The Problems
So, to recap: Performing a simple sequence of actions in Twilight Princess copies a chunk of data from ARAM to main memory starting at address 0x80000000, overwriting important system data and exception handlers, and usually causing execution to move into that data copied from ARAM.
In my opinion, this is the closest Twilight Princess has ever come to arbitrary code execution. However, there are still several problems:
- The underlying mechanism that causes the JKRAllocFromSysHeap call to fail is not particularly well understood.
- Most pieces of data we can copy from ARAM don't do anything particularly interesting when executed, in part because:
  - The FPU is disabled during this context-switch state, so any attempt to execute a floating-point instruction jumps to 0x80000800.
- Even if we did get a jump to a player-controlled memory location, the amount of work that needs to be done to restore normal game operation from this state is non-trivial due to the critical global variables stored between 0x80000000 and 0x80000100 that are completely obliterated by this glitch.
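As a rough aid for the FPU problem, one could scan candidate archive data for the first word that decodes as a floating-point instruction. The sketch below is deliberately simplified: it only recognizes the classic PowerPC primary opcodes 48 to 55 (floating-point loads and stores), 59, and 63 (floating-point arithmetic), it ignores the indexed forms under opcode 31 and any Gekko-specific instructions, and it scans linearly rather than following branches. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* The primary opcode is the top 6 bits of a 32-bit PowerPC instruction. */
static unsigned primary_opcode(uint32_t insn) {
    return (unsigned)(insn >> 26);
}

/* Simplified check: classic FP load/store (48-55) and arithmetic (59, 63). */
int is_basic_fp_instruction(uint32_t insn) {
    unsigned op = primary_opcode(insn);
    return (op >= 48 && op <= 55) || op == 59 || op == 63;
}

/*
 * Linear scan over big-endian instruction words. Returns the byte offset of
 * the first word that looks like a basic FP instruction, or -1 if none does.
 */
long first_fp_offset(const uint8_t *data, size_t len) {
    for (size_t off = 0; off + 4 <= len; off += 4) {
        uint32_t insn = ((uint32_t)data[off] << 24) |
                        ((uint32_t)data[off + 1] << 16) |
                        ((uint32_t)data[off + 2] << 8) |
                        (uint32_t)data[off + 3];
        if (is_basic_fp_instruction(insn)) {
            return (long)off;
        }
    }
    return -1;
}
```

A scan like this only gives an upper bound on how far straight-line execution could get before hitting the 0x80000800 vector; real execution may branch or fault earlier for other reasons.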
At this point, we're looking for help from folks experienced in doing low-level work like this on the GameCube platform. The Twilight Princess community has barely scratched the surface of what might be possible with this exploit, but it's clearly going to be a lot of work from here to find out just what we can do.
If you've got any information or experience that might be relevant, please come find me in the Twilight Princess Speedrunning Discord server.