Compilers are quite similar to assemblers. They take high-level source code and convert it into low-level code, usually emitted as an object file. Once that object file is linked, the end result is an executable file, such as an ELF binary on Linux or a .exe on Windows.
Some compilers include clang or gcc. Not every compiled language is translated straight into machine code, Java being one example. C and C++ compile directly into machine code, whereas Java is compiled into an intermediate form known as JVM bytecode. This bytecode is then interpreted or just-in-time compiled into machine code by the JVM when the program runs.
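The idea of an intermediate form can be seen without installing a JVM. The sketch below uses CPython, which also compiles source to bytecode for its own virtual machine. This is only an analogy: CPython bytecode is not JVM bytecode, and the one-line program is a made-up example.

```python
import dis

# Hypothetical one-line program, used only for illustration.
source = "result = 6 * 7"

# Compile the source text into a code object holding CPython bytecode.
code = compile(source, "<example>", "exec")

# The bytecode is plain bytes for a virtual machine, not native machine code.
raw_bytecode = code.co_code

# Run the compiled bytecode on the Python virtual machine.
namespace = {}
exec(code, namespace)

# Human-readable listing; the exact opcodes differ between Python versions.
dis.dis(code)
print(namespace["result"])
```

The listing printed by dis plays a similar role to what javap -c shows for a Java class file: instructions for a virtual machine rather than for the CPU.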
An assembler is a program that translates assembly code into machine code, and there are several assemblers built for different operating systems and processors.
Microsoft Macro Assembler (MASM) uses Intel syntax and targets Microsoft Windows
GNU Assembler (GAS) is the assembler used by the GNU project, and it defaults to AT&T syntax
Netwide Assembler (NASM) targets the x86 architecture and can be used to write 16-bit, 32-bit and 64-bit programs. This one is quite popular on Linux
Flat Assembler (FASM) is an x86 assembler that supports Intel-style assembly on IA-32 and x86-64.
For now, we will be focusing on NASM, which is one of the most commonly used.
When a source code file is assembled, the resulting file is called an object file. It's a binary representation of the program. Besides translating instructions, the assembler assigns memory locations to variables and instructions, and it resolves symbolic names (such as labels) to those locations.
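To make "assigning memory locations" and "resolving symbolic names" concrete, here is a minimal sketch of a two-pass toy assembler. It is not how NASM is implemented: the fixed 4-byte instruction size, the base address and the function names are all assumptions made for illustration.

```python
# Toy two-pass "assembler". Everything here (function names, the 4-byte
# instruction size, the base address) is a hypothetical simplification.

INSTRUCTION_SIZE = 4  # assumption: every instruction occupies 4 bytes


def assign_addresses(lines, base=0x1000):
    """Pass 1: give each instruction an address and record label addresses."""
    symbols = {}
    instructions = []
    address = base
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            # A label names the address of the next instruction.
            symbols[line[:-1]] = address
        else:
            instructions.append((address, line))
            address += INSTRUCTION_SIZE
    return symbols, instructions


def resolve_symbols(symbols, instructions):
    """Pass 2: replace label operands with the addresses found in pass 1."""
    resolved = []
    for address, text in instructions:
        mnemonic, _, operand = text.partition(" ")
        operand = operand.strip()
        if operand in symbols:
            operand = hex(symbols[operand])
        resolved.append((address, f"{mnemonic} {operand}".strip()))
    return resolved


program = [
    "start:",
    "mov eax, 1",
    "jmp done",
    "add eax, 2",
    "done:",
    "ret",
]

symbols, instructions = assign_addresses(program)
listing = resolve_symbols(symbols, instructions)
for address, text in listing:
    print(f"{address:#06x}  {text}")
```

Two passes are needed because jmp done appears before done: is defined, so the label's address is only known once the whole file has been scanned.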
The process is outlined below:
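Once the assembler creates the object file (.obj on Windows, .o on Linux), a linker is needed to create the executable file. Linkers take one or more object files, plus any libraries they depend on, and combine them together. On Windows, for example, a .exe can also depend on Dynamic Link Library (.dll) files, which are libraries loaded alongside the executable rather than object files themselves.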
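The combining step can be sketched as follows. This is a toy model, not a real linker: object files are represented as plain dictionaries, and the field names (size, defines, references) and the base address are assumptions that are far simpler than real formats such as ELF or PE/COFF.

```python
# Toy "linker". Object files are modelled as dicts with hypothetical fields:
#   size       - number of bytes the object occupies
#   defines    - symbol name -> offset inside this object
#   references - list of (offset, symbol name) that need patching


def link(objects, base=0x400000):
    """Lay objects out back to back, then resolve references to symbols."""
    symbol_table = {}
    placements = []
    address = base

    # Step 1: place each object and record where its symbols end up.
    for obj in objects:
        placements.append((address, obj))
        for name, offset in obj["defines"].items():
            if name in symbol_table:
                raise ValueError(f"duplicate symbol: {name}")
            symbol_table[name] = address + offset
        address += obj["size"]

    # Step 2: resolve every reference against the combined symbol table.
    relocations = []
    for obj_base, obj in placements:
        for offset, name in obj["references"]:
            if name not in symbol_table:
                raise ValueError(f"undefined symbol: {name}")
            relocations.append((obj_base + offset, symbol_table[name]))
    return symbol_table, relocations


main_obj = {"size": 0x20, "defines": {"main": 0x0}, "references": [(0x8, "print_hello")]}
lib_obj = {"size": 0x10, "defines": {"print_hello": 0x4}, "references": []}

symbol_table, relocations = link([main_obj, lib_obj])
print({name: hex(addr) for name, addr in symbol_table.items()})
print([(hex(site), hex(target)) for site, target in relocations])
```

Linking main_obj on its own raises the "undefined symbol" error, which mirrors the kind of error a real linker reports when an object file or library is missing.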
For debuggers, one that I recommend is Immunity Debugger for Windows. Do not run it on bare metal: executing and reverse engineering malicious files means turning Windows Defender off, so run it in a VM instead.
Debuggers are really useful because they let us view the values held in the registers, pause the execution flow at breakpoints, and so on.
I use Immunity Debugger a lot for vanilla buffer overflows (BOFs) for OSCP, and it's really handy to use in conjunction with mona.py.
Here's how Immunity Debugger would look when a binary is loaded.
The windows are as follows:
Disassembler Panel
This is the most important window, where the disassembled code of the program being debugged is shown
The instructions can be viewed and reviewed to find exploitable flaws
Register Panel
This shows the values currently held in the registers
As the program executes, the values in this window change frequently
Memory Dump Panel
Shows the memory location and contents in different formats.
In the image above, it is showing a hex dump of the binary
Thread Stack
This is the window where we can view the contents of the stack
Immunity Debugger attempts to annotate the contents so we can follow them more easily, as raw stack data is not exactly human-readable.
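The hex dump format mentioned for the Memory Dump Panel can be reproduced with a few lines of Python. This is a rough sketch of the general address / hex bytes / ASCII layout, not Immunity Debugger's exact output. The function name hex_dump, the sample bytes and the base address are made up for illustration.

```python
def hex_dump(data, base_address=0, width=16):
    """Return lines of 'address  hex bytes  ASCII' for the given bytes."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02X}" for byte in chunk)
        # Pad short final rows so the ASCII column stays aligned.
        hex_part = hex_part.ljust(width * 3 - 1)
        # Show printable ASCII as-is and everything else as '.'.
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{base_address + offset:08X}  {hex_part}  {ascii_part}")
    return lines


# Hypothetical sample bytes and base address.
sample = b"MZ\x90\x00Hello, debugger!\n"
dump_lines = hex_dump(sample, base_address=0x00400000)
for line in dump_lines:
    print(line)
```

Each row shows the address of its first byte, the bytes in hexadecimal, and the same bytes as text, which is why strings embedded in a binary are easy to spot in a dump.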
Decompilers are programs that attempt to translate a binary back into high-level code. I say attempt because, the majority of the time, they cannot recreate the exact original source. They do however provide an easier-to-read format for reverse engineering code. Otherwise, you would be staring at hex or assembly.
Some popular decompilers I use are Ghidra and dnSpy, on Linux and Windows respectively. On Linux distributions that package it (Kali, for example), you can install Ghidra using apt. It is commonly used for ELF files, and it also lists the functions present within the binary, which makes the code easier to follow.
Here's an example of Ghidra output:
Not exactly translated back into C, but very close.
dnSpy is used for decompiling .NET assemblies, such as programs written in C#. In other words, it works on .exe and .dll files built for .NET, not on native executables.
Here's another example of dnSpy decompiling code:
This tool tends to produce output that is very close to the code as originally written, since .NET assemblies keep far more metadata (such as type and method names) than native binaries do.