Blog

2016-11-24: How To Output Info On Non-6502 Assembly Files

The decompiler can process a 6502 assembly file and output its debug info, but what about other .asm files?

Well, only 3 things need doing.

See how easy it is to convert it?

With the above conversion of data, you can now run any x86 assembly file through the 6502 module!

2016-11-12: Decompiler Now Outputting Useful Info For Reverse Engineers

I made my decompiler output useful information for reverse-engineers. The information below ends up in a path like: jobs/zol/logs/zol.usub_6967.txt.debug.txt

File: zol.usub_ED6B.txt


Maybe people will use it now...

2016-03-23: Initial Work on ARM Decompiler

We've got the ARM compiler toolchain and we're writing simple, general code, disassembling it (using objdump), and then decompiling that assembly code. The ARM module is 458 lines of code, and the decompiler itself works fine as usual.

We've done:

We haven't done:

Next up will be a sample to show just how well it works.

2016-03-04: On the Philosophy of Apathy

On reverse engineering forums, you inevitably get someone finding a particular problem which a decompiler will fail at.

Well, let's say the decompiler can do everything else, and such problems are in the minority.

My point is this: Is it better to say it's never going to be perfect, so don't attempt it, OR to just try and get success 99% of the time?

2016-02-28: New Feature: Removing Erroneous BLocal Initialisation

Look at the following code:

int BLocal1;
BLocal1 = 0;
if (((CreatureUnderCursor_HealthEtcAttributes)[2] < 144)) {
       BLocal1 = 1;
} else {
       if (((CreatureUnderCursor_HealthEtcAttributes)[4] == 255)) {
               BLocal1 = 2;
       } else {
               BLocal1 = 3;
       }//EndIF; 67E2
}//EndIF; 67E2
myregs.pc = 0x4341;//DrawBorderTile
myregs.a = BLocal1;
_sys(&myregs); 

Basically, the 'BLocal1 = 0' assignment is dead code: the value it stores is never read.

That's because if the '< 144' condition is true, BLocal1 becomes 1. Else it goes on to the '== 255' condition. If true, it's 2. If the '== 255' condition is false, BLocal1 becomes 3.

Whatever happens, the 'BLocal1 = 0' is simply not needed, so the decompiled code omits that 'BLocal1 = 0'.

2016-01-21: Breakthrough: Recompilation

As you probably know, there are two kinds of decompiled code: one that humans need to edit and fix, and one that recompiles with no human input and runs identically to the original code.

You might also know that 6502 code has various gotchas which should make decompilation impossible.

Well, now we prove the doubters wrong. Thanks to help from the CC65 mailing list, we have a bunch of decompiler-generated code which runs identically to the code it was decompiled from!

What we did was, we compiled the code below with CC65 to a binary file of around 600 bytes. Then we loaded up Temple of Terror (the text adventure the code below is from) and loaded in the binary file at $0801. Then we edited the DropAll subroutine so the first thing it did was JMP to $080D which is from the binary file. Then we ran the game and typed 'DROP ALL'. If anything was wrong, it would have bombed out, but instead, it worked and dropped all objects the player has. Success!

This, of course, means that the decompiler will soon be able to output code from other CPUs without editing.

#include <stdio.h>
#include <stdlib.h>
#include <6502.h>
typedef unsigned char *PUC;
#define PlayerLocation (*(PUC)1024)
#define ObjectIn (*(PUC)1026)
#define NumberOfObjectsHeld (*(PUC)1029)
#define NumberOfObjectsInGame (*(PUC)1030)
#define ObjectInRoomTable ((PUC)1086)
void DropAll() {
int LLocal1;
if ((NumberOfObjectsHeld != 0)) {
        LLocal1 = NumberOfObjectsInGame;
        do {
                if ((ObjectInRoomTable[LLocal1] == ObjectIn)) {
                        ObjectInRoomTable[LLocal1] = PlayerLocation;
                }//EndIF; 313B
                LLocal1 = (LLocal1 - 1);
        } while (LLocal1 != 0);//LoopEndWh 313F
        NumberOfObjectsHeld = 0;
        __asm__("jsr $30B8");
        return;// 3147
} else {
        __asm__("jsr $2E84");//Inventory
        return;// 314A
}//EndIF; 314A
return;// 314A
};

2016-01-13: First recompiled decompiled code and Selective Recompilability

It's a landmark: the first time Detech has put decompiled code into a C compiler, with the aim of having it run exactly the same as the original assembly code.

The compiler used is CC65, a 6502 C compiler, which outputs a 6502 binary.

PUC is a pointer to an unsigned char (typedef unsigned char *PUC), so it can refer to any memory address.

#define Redraw_PointerToColourMap *((PUC*)6)
#define BorderDrawParamColour *((PUC*)133)
#define GUI_Draw_Pointer *((PUC*)4)
#define Global18 *((PUC*)18)
#define Global135 *((PUC*)135)

void DrawBorderTile(int Arg_acc) {
int LLocal1;
Global135 = Arg_acc;
usub_4313();
LLocal1 = 7;
do {
       ((PUC *)GUI_Draw_Pointer)[LLocal1] = ((PUC *)Global18)[LLocal1];
       LLocal1 = (LLocal1 - 1);
} while (LLocal1 >= 0);//LoopEndWh 434D
((PUC *)Redraw_PointerToColourMap)[0] = BorderDrawParamColour;
return();// 4355
};

This code (just 12 lines of assembly code) compiles to a 672-byte C64 .PRG file.

It hasn't been tested yet; that will follow in a couple of days.

For now, just bask in the glory!

This is part of Decompiler Tech's new strategy on decompilation: Selective recompilability.

The goal is to make it possible to recompile the code output by RevEngE. But the twist on the normal strategy is that, rather than decompiling the entire program, people can recompile just the subroutines they want to deal with, and patch them so the new code is called instead of the old code.

And this strategy also means that for a fully-defined decompiler module such as 6502, if we get it working with 6502, it will follow quite easily for any CPU we can fully define.

As a side note, 'fully defined' means every CPU opcode is accounted for and defined in the module. The 6502 module has all opcodes, but of course the x86 module doesn't (and the Java module soon will).

2015-12-28: Initial Work on Java Decompiler

We've started working on a Java module for RevEngE (decompiling Java, not running it). After 2 days, we've got a basic sketch that fits entirely within the RevEngE API. The module is about 150 lines of code. It can handle loops, loop variables, and the stack-ish stuff that Java does a lot.

The stack is emulated the same as normal (e.g. the x86 stack), by changing just 15 lines of main engine code.

In a few days, it should be good enough for more ambitious input code.

Decompiled Java code:
public static void main(java.lang.String[]) {
	int BLocal1;
	java.lang.System.out.println("Hello, World");
	float LLocal1 = 0.0;
	do {
		if (50.0 <= LLocal1) {
			break;
		};
		java.lang.System.out.println(LLocal1);
		LLocal1 = (1.0 + LLocal1);
	}//LoopEnd 1C
	String local2 = "hi there";
	java.lang.System.out.println(local2.charAt(3));
	int local4 = 2;
	BLocal1 = 1;
	if ((2 == local4)) {
			BLocal1 = 1;
	} else {
			BLocal1 = 2;
	}//EndIF; 3F
	java.lang.System.out.println(BLocal1);
	return();// 47
};

Original Java code:
public static void main(String[] args) {
	// Prints "Hello, World" to the terminal window.
	System.out.println("Hello, World");
	for (float i = 0; i < 50; i++)
	{
		System.out.println(i);
	}

	int f = 2;
	
	String tempstr = "hi there";
	System.out.println(tempstr.charAt(3));

	int blocal = 1;
	if (f == 2)
	{
		blocal = 1;
	}
	else
	{
		blocal = 2;
	}
	System.out.println(blocal);
}

2015-12-22: Auto-Detecting Arguments and Return Values

One of the things the RevEngE decompiler does is find which registers a function has as its input, and also which registers the function outputs.

if ((Global163 + Global138) & 1) {
        usub_B6A1(207,1,0);
}

B6A1 has as its function header: void tirnanog.usub_b6a1.txt(int Arg_acc,int Arg_y,int Arg_x)

2015-12-19: Goto's Considered Okay

There are two problems in decompilation. The first is when people use goto's in their code, and the second is the way conditional blocks made from || and && create crazy goto's.

So the only solution is to recognise when to use goto's, and that must be as rare as possible.

We recognise goto's that are things like:

Whenever they occur, we put a goto in the decompiled code, so that the program is readable most of the time, and goto's are only used in extreme cases.

In some cases, a goto used in a conditional block can actually be turned back into the original conditions, so goto's can be optimised out this way.

Just another example of how we get readable, bug-free code 99.5% of the time, and leave the nasty stuff for the last 0.5%.

2015-12-16: Global X-Referencing

There are now listings of Global variables, saying which function reads or writes to them.

See the writes here and the reads here.

2015-12-08: Commodore Free Issue 90

The new issue of C= Free is out, and Decompiler Tech is mentioned! Which is great!

Also in this issue is 'Crack and Train Like a Pro', which has a great tip for finding the start address of a C64 program. Hunt for these bytes (they are LDA #$37 followed by STA $01): H 0000 0800 A9 37 85 01

In Frodo/SuperSAM, that's basically q 0000 ffff a9 37 85 01.

I just found out the start address for Super G-Man! Great tip.

2015-11-22: Notes on static analysis

Because the decompiler simulates the code, it should be possible to find function calls that were previously hidden, and also trace variables across the code base. This should make it useful for analysing binaries for malicious code.

2015-11-12: Notes on bytecode

If a program (or game) has its own internal bytecode (like assembly code), people say you can't decompile it. But you can!

First you decompile the program and figure out the bytecodes it uses.

Then you disassemble the bytecode and put THAT into RevEngE. So it'll decompile the bytecode!

2015-10-01: RevEngE Explained

The new version of the decompiler is called RevEngE. This stands for: Reverse Engineering Emulator (not Engine!).

2015-10-01: We're using Lisp!

The first version of our decompiler, called VBRB (VB Right Back) was written in C++. But it was rewritten after the author learned Scheme at University. The rewrite was from the ground up in Common Lisp.

Advantages include complex data structures made easy, and it's as fast as, if not faster than, C++.

Another advantage is memory usage. I found out C++ was hogging tons of memory in various strings, maps and vectors, and Lisp is much better (around 12MB with a typical decompiled binary).

2015-09-11: VB5/VB6 Decompiling Service is Back

More great news for VB5/6 native code customers.

We have fully resurrected our VB5/6-specific code. It's now working with the new engine (see below). The great thing is the new engine means the decompiled VB code is now quite a bit more accurate than before.

2015-09-07: VB5/VB6 Native Decompiler Backported

The good news is that all the code dealing with VB5/VB6 native executables has been merged back into the most recent decompiler engine.

Therefore, even better news than usual: We can now decompile any VB5/VB6 native executables! And even better, all the improvements to structure of your programs are available for VB5/VB6!

Feel free to send us a message regarding VB5/VB6. Don't worry if you don't know whether it's native or P-code; we'll tell you that for free!

2015-08-27: 68000 CPU Ported to RevEngE Decompiler

It took just 3 days, but we ported RevEngE to deal with 68000 code. We took a small sample C program (DeHex from Fred's Fish Disks) and implemented all the opcodes, without any custom opcodes needed (except movem but that's not part of the actual code). About 18 opcodes were used in this program.

It's the same as 6502. There are NO registers in the code, and just a few special variable types, which aren't changed from 6502 (or even x86).

This has only just been done, and it could take a couple of weeks to implement the rest of the opcodes and operand data types. But this is the future!

Sample here.

2015-08-02: The 6502 Decompiler, RevEngE6502, is now available for download!

Some features in the paid version are:

For now, the demo version:

Also available is a version of the C64 Frodo emulator, with extra code in the SAM machine code monitor. This makes it easier to find areas of code to disassemble.

See here for downloadable binaries.