I don't know anything about reverse engineering, but I have wanted to reverse engineer/decompile the Disney Animation Studio [1] for DOS for years.
I found the software at a thrift store in 2009, when I was eighteen, and I was immediately impressed. This was actually very intuitive, easy-to-use animation software that was very powerful, years before FutureSplash/Flash was released.
There's not a ton of info available on the internet now, but I have been trying to remedy that a bit [2] by uploading the manual. I reached out to Disney to ask if I could potentially buy the source code from them and release it, and they politely told me "no". I reached out to the creators in the credits on LinkedIn to see if there was any way I could look at the code or if they could at least answer some questions, and they never got back to me.
I think the only way we're going to get the source code to The Animation Studio will be if I learn how to use Ghidra (or something similar) and decompile it myself.
It isn't that hard. I'm currently reverse engineering an old flight simulator game called A-10 Cuba. I had to teach myself x86 assembly and the basic calling conventions, then C++ vtables, struct alignment and struct layout. However, you do need this basic understanding of the core fundamentals to help you along when you use tools like IDA or Ghidra, which turn the assembly code back into C pseudocode.
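To make the struct alignment point concrete, here is a minimal sketch using Python's ctypes, which follows the platform's C layout rules. The record and its field names are invented for illustration and are not taken from A-10 Cuba; the offsets in the comments assume a typical ABI where a 32-bit integer is 4-byte aligned.

```python
import ctypes


# Hypothetical record: the field names are made up for this example.
class Entity(ctypes.Structure):
    _fields_ = [
        ("kind", ctypes.c_uint8),     # offset 0, followed by 3 bytes of padding
        ("health", ctypes.c_uint32),  # offset 4, because it must be 4-byte aligned
        ("flags", ctypes.c_uint16),   # offset 8, followed by 2 bytes of tail padding
    ]


# Same fields, largest first: the compiler needs far less padding.
class Reordered(ctypes.Structure):
    _fields_ = [
        ("health", ctypes.c_uint32),  # offset 0
        ("flags", ctypes.c_uint16),   # offset 4
        ("kind", ctypes.c_uint8),     # offset 6, followed by 1 byte of tail padding
    ]


def layout(struct_type):
    """Return (field name, byte offset) pairs for a ctypes structure."""
    return [(name, getattr(struct_type, name).offset)
            for name, _ in struct_type._fields_]


for struct_type in (Entity, Reordered):
    print(struct_type.__name__, layout(struct_type), ctypes.sizeof(struct_type))
```

This is why an access like `*(int *)(a1 + 4)` in pseudocode does not mean the first field is four bytes wide: the gap may just be padding.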
So there is a big hurdle to get over in the initial stages, but you soon find out that a lot of the higher-level code structure/scaffolding isn't wiped out by the compiler. For example, the generated assembly code very closely mirrors the C/C++ function boundaries. This lets you infer the overall original code structure/layout basically from the call chain, and then you can manually step through and figure out what the original programmer was trying to achieve. The order of execution does get rearranged by the compiler, but it isn't that bad.
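As a rough illustration of reading structure off the call chain, the sketch below turns a caller-to-callee mapping into an indented tree. The `sub_...` names and the `CALLS` table are hypothetical stand-ins for whatever your disassembler exports; this is not the output of any particular tool.

```python
def call_tree(calls, root, depth=0, seen=None):
    """Return indented lines describing the call tree below `root`.

    `calls` maps a function name to the list of functions it calls.
    A function that was already expanded is marked instead of being
    expanded again, so shared helpers and recursion do not loop forever.
    """
    seen = set() if seen is None else seen
    line = "  " * depth + root
    if root in seen:
        return [line + " (already shown)"]
    seen.add(root)
    lines = [line]
    for callee in calls.get(root, []):
        lines.extend(call_tree(calls, callee, depth + 1, seen))
    return lines


# Hypothetical call relationships, e.g. collected by hand from cross-references.
CALLS = {
    "sub_401000": ["sub_401200", "sub_401480"],
    "sub_401200": ["sub_401900"],
    "sub_401480": ["sub_401900", "sub_401a40"],
}

print("\n".join(call_tree(CALLS, "sub_401000")))
```

A helper called from many places (here `sub_401900`) tends to be a utility such as an allocator or file reader, which is often a good first function to name.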
In my project with A-10 Cuba, I was successful in reverse engineering its file format, the overall module layout, and the engine and rendering engine during my three-week break. I still have some time to work out the AI logic and mission design, but one builds on another. What do I mean by one builds on another? Well, when you first start you have no types and no structs. So for the first few days you think you're making absolutely no progress, because you're trying to calculate pointer offsets and struct layouts in IDA. I highly recommend Google Gemini or Claude Code to do this heavy lifting, because you can get away with a lot by asking it something like: "For this IDA pseudocode, infer what the struct layout is and tell me what it is doing."
Getting those first struct layouts is painstaking, but you can soon branch off from one struct, or struct pointer, to another. This works like a feedback loop, because programmers are lazy, and you soon have a large part of the struct/code-flow layout figured out.
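The offset bookkeeping described above can be sketched in a few lines. Assume you have noted every access of the form `*(type *)(ptr + offset)` as an (offset, size) pair; the function name `sketch_struct`, the `field_XX`/`unknown_XX` naming, and the sample accesses are all hypothetical. It only produces a first-guess declaration that you would then refine by hand.

```python
C_TYPES = {1: "uint8_t", 2: "uint16_t", 4: "uint32_t"}


def sketch_struct(name, accesses):
    """Build a first-guess C struct from observed (offset, size) accesses.

    Gaps between known fields are emitted as `unknown_XX` byte arrays so the
    total layout stays correct while the real fields are still unidentified.
    """
    lines = [f"struct {name} {{"]
    cursor = 0
    for offset, size in sorted(set(accesses)):
        if offset < cursor:
            continue  # overlaps a field already emitted (wider read, union, ...)
        if offset > cursor:
            lines.append(f"    uint8_t unknown_{cursor:02x}[{offset - cursor}];")
        if size in C_TYPES:
            lines.append(f"    {C_TYPES[size]} field_{offset:02x};")
        else:
            lines.append(f"    uint8_t field_{offset:02x}[{size}];")
        cursor = offset + size
    lines.append("};")
    return "\n".join(lines)


# Hypothetical accesses collected from pseudocode, duplicates included.
ACCESSES = [(0, 4), (8, 4), (4, 2), (8, 4), (12, 1)]
print(sketch_struct("Entity", ACCESSES))
```

Once one struct is pinned down this way, any field that turns out to be a pointer gives you the starting address for the next struct, which is the feedback loop in practice.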
You then take the structs/code flow and the pseudocode, and do a rewrite with a modern C/C++ compiler until you have a working version.
The plan is to put everything into a repo on GitHub, including documentation on the file format and the rewrite of the original code in modern C++ and DirectX or Vulkan. I don't see much point in reverse engineering the old rendering engine. I can do it, but I already have everything I need to just rewrite the game inside the browser.
Crimsonland (2003) is a top-down shooter that shipped as a stripped DirectX 8 binary with zero symbols. I decompiled it with Ghidra, validated behavior with WinDbg and Frida, and rewrote it from scratch in Python/Raylib — 46,800 lines matching the original behavior faithfully. The write-up covers static and runtime analysis, reverse engineering custom asset formats, and the full rewrite process. Code is on GitHub and it's playable now via uvx crimsonland@latest
I've been thinking about this topic and am glad to see it come up: AI is going to be a huge boon for digital preservation & restoration projects like this. I realized this while building this project (a map explorer for Tribes 2): https://exogen.github.io/t2-mapper/
Old games like this have a small (and shrinking) audience of people who care about them. With Tribes 2, for example, there are only ~50 people who actively play on a regular basis. A subset of those people are programmers, and a subset of those have the time & energy to put into a project like t2-mapper, assuming they're even interested. I got a basic version working, but then Claude Code helped decode and convert obsolete Dynamix/Torque3D file formats (improving existing Blender addons that were incomplete), got TorqueScript running in the browser, wrote shaders, and generally helped figure out what the original C++ code was doing.
In the past, you'd need the stars to perfectly align for stuff like this to happen: a passionate super-fan with the time, resources, knowledge, and persistence to see it through. Now, you mostly just need the persistence (and maybe a couple hundred bucks for tokens). I foresee people with niche interests (but not necessarily a programmer's skillset) being able to extend the lifetime (and maybe audience) of their obscure or obsolete software.
10tons tends to make smaller-scale games and you feel it sometimes, but I've had a great time with quite a few of their other shooters too. You used to be able to get this bundle for cheap from Fanatical sometimes; not sure if that is still the case. They are best known in the modern era for Tesla vs Lovecraft, which doesn't show up in this bundle.
https://store.steampowered.com/bundle/428/10tons_Shooters/
There have been a few attempts to make open source versions of Crimsonland and I had a good time with Violetland
https://github.com/ooxi/violetland
Bravo, that’s a seriously impressive undertaking, and a great demonstration of the augmentation potential in agentic coding. There’s so much focus on replacing entry-level work it gets missed what these power tools can do in the hands of people who know what they’re doing.
I'd be curious to see if there were any discoveries of cut features still present in the code [0] (aside from the demo teaser code mentioned as still being present). I always find software archaeology fascinating, as we get scraps of unused content or code, and can only guess as to the decisions that led to it being scrapped, or why certain custom file formats were used, or why code was structured a certain way.
[0] I am aware that such a wiki exists exactly for this purpose.
As an active reverse engineer, I'm really curious how you used agentic AI for this! Did you just have them going through the code and labeling stuff? Or were they also responsible for writing the reimplementation? This overview is super interesting, I would love to see details about the pipeline itself.
There are many Ghidra plugins, like GhidrAssist, that you can use to connect to an LLM. They will automatically put a name on each function and variable. It is far from perfect, but it is way faster than doing it by hand in my experience.
Curious what parts you think can only be learned by hand? Having read the article I think the auto approach covers all the same ground, just at a much faster pace with no down time.
I really need to start familiarizing myself with these new tools. I'm only using LLMs in interactive, “question and answer” mode, and it feels like using a typewriter when everyone is switching to computer word processors.
Thanks for sharing, it's a really interesting writeup and project!
That's one way. I'm not certain that's the way your project did it; hard to say without looking at the pipeline. But there are N64 "matching decompilation" projects that do it exactly the way you propose.
Any recommended learning materials/resources for basic binary reverse engineering? I'm imagining a resource that teaches the common tools/concepts and provides binaries of increasing complexity.
Very impressive. It makes one wonder what some companies have in private compared to the public tools that we stitch together. E.g. you can combine LLMs with static analysis/proving to get much better results.
[1] https://en.wikipedia.org/wiki/The_Animation_Studio
[2] https://archive.org/details/disney_beginner_guide_2/disney_b...
I'm also impressed by the game's jaz image format. Very cool.
There used to be a Linux version, but apparently it hasn't been updated to compile on, let alone be packaged for, modern Linux kernels and distros.
Someone I know tried to resurrect it a few years back, but now I'm wondering if I couldn't use OpenCode etc. to get it up and running again.
(I did find a recent-ish clone [1] so may start with that)
[0] https://en.wikipedia.org/wiki/Bolo_(1987_video_game)
[1] https://github.com/stephank/orona
Reversing this by hand seems like it would have taken orders of magnitude longer…
https://store.playstation.com/en-us/product/UP4403-PPSA02752...