[C++] Simple functions in x64 assembly
To learn x64 assembly (asm) I'll document the disassembly of some simple C++ functions.
The examples were compiled on Godbolt with MSVC's latest version (v19.4) using O0 and O2.
Identity Function
auto identity(int x) {
return x;
}
A simple identity function.
x$ = 8
identity(int) PROC ; identity
mov DWORD PTR [rsp+8], ecx
mov eax, DWORD PTR x$[rsp]
ret 0
identity(int) ENDP ; identity
Let's go through it line by line.
The code uses MASM syntax, which takes the form instruction destination, source.
x$ = 8
x$ is a simple constant: the offset of x from the stack pointer, used further down as x$[rsp].
identity(int) PROC ; identity
This line marks the start of the function.
mov DWORD PTR [rsp+8], ecx
The mov instruction simply moves a value from one place to another.
The source and destination can either be a register or memory.
Square brackets denote accessing memory.
Here we move the value in register ecx into memory at the address 8 bytes above the stack pointer.
In C++ this would look roughly like:
*(rsp + 8) = ecx;
By convention, the first four integer or pointer arguments of a Windows x64 function are passed in registers rcx, rdx, r8, and r9.
In unoptimised builds these are then copied to stack memory when the function begins [1].
mov eax, DWORD PTR x$[rsp]
x is copied from memory to register eax for the return value.
Integer return values are stored in rax (eax is simply the lower 32 bits of the full 64 bit register).
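The relationship between rax and eax can be sketched in C++ with plain integer casts. This is only a model: read_eax and write_eax are names I made up for illustration, and the second function relies on the x64 rule that writing a 32-bit register also clears the upper half of the full 64-bit register.

```cpp
#include <cstdint>

// Hypothetical helper: reading eax keeps only the low 32 bits of rax.
constexpr std::uint32_t read_eax(std::uint64_t rax) {
    return static_cast<std::uint32_t>(rax);
}

// Hypothetical helper: writing eax replaces the low 32 bits and, on x64,
// zeroes the upper 32 bits, so the old value of rax does not matter.
constexpr std::uint64_t write_eax(std::uint64_t /*old_rax*/, std::uint32_t value) {
    return value;
}

static_assert(read_eax(0x1122334455667788) == 0x55667788, "low half only");
static_assert(write_eax(0xFFFFFFFFFFFFFFFF, 1) == 1, "upper half cleared");
```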
ret 0
The ret instruction returns from the function to the calling address.
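To make the walkthrough concrete, here is a rough C++ model of the unoptimised function, with one statement per instruction. The Stack struct, its size, its starting rsp value, and the helper names are all assumptions of mine; a real stack is just process memory, not a small array.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the stack: a byte buffer plus an index playing rsp.
struct Stack {
    unsigned char bytes[64]{};
    std::size_t rsp{16};  // arbitrary starting position, chosen for illustration
};

// Models: mov DWORD PTR [rsp+offset], value
void store_dword(Stack& s, std::size_t offset, std::uint32_t value) {
    assert(s.rsp + offset + sizeof value <= sizeof s.bytes);
    std::memcpy(&s.bytes[s.rsp + offset], &value, sizeof value);
}

// Models: mov reg, DWORD PTR [rsp+offset]
std::uint32_t load_dword(const Stack& s, std::size_t offset) {
    assert(s.rsp + offset + sizeof(std::uint32_t) <= sizeof s.bytes);
    std::uint32_t value;
    std::memcpy(&value, &s.bytes[s.rsp + offset], sizeof value);
    return value;
}

// The unoptimised identity function, instruction by instruction.
std::uint32_t identity_model(std::uint32_t ecx) {
    Stack s;
    constexpr std::size_t x_offset = 8;           // x$ = 8
    store_dword(s, x_offset, ecx);                // mov DWORD PTR [rsp+8], ecx
    std::uint32_t eax = load_dword(s, x_offset);  // mov eax, DWORD PTR x$[rsp]
    return eax;                                   // ret 0
}
```

Calling identity_model(42) writes 42 into the buffer, reads it straight back, and returns it, which is exactly the round trip the optimiser removes.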
Now let's look at the optimised version of the function.
x$ = 8
identity(int) PROC ; identity, COMDAT
mov eax, ecx
ret 0
identity(int) ENDP ; identity
x is just moved from ecx into eax. That's it.
+1 function
auto add1(int x) {
return x + 1;
}
A simple increment function.
x$ = 8
add1(int) PROC ; add1
mov DWORD PTR [rsp+8], ecx
mov eax, DWORD PTR x$[rsp]
inc eax
ret 0
add1(int) ENDP ; add1
The inc instruction adds 1 to its only operand.
x$ = 8
add1(int) PROC ; add1, COMDAT
lea eax, DWORD PTR [rcx+1]
ret 0
add1(int) ENDP
With optimisations we encounter the lea instruction.
It stands for "load effective address": it computes the address described by the source operand and stores that address in the destination, without actually accessing memory.
It's intended for calculating memory offsets but is often used for efficient arithmetic [2].
In this case we're storing rcx+1 in the eax register so we can return immediately.
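As a sketch of what lea computes, the address arithmetic can be written as an ordinary C++ function. The names lea and add1_model are my own, and the base + index*scale + displacement form is the general x64 addressing shape; the instruction above only uses a base (rcx) and a displacement (1).

```cpp
#include <cstdint>

// Hypothetical helper: the arithmetic lea performs, with no memory access.
constexpr std::uint64_t lea(std::uint64_t base, std::uint64_t index,
                            std::uint64_t scale, std::int64_t disp) {
    return base + index * scale + static_cast<std::uint64_t>(disp);
}

// Models: lea eax, DWORD PTR [rcx+1]
constexpr std::uint32_t add1_model(std::uint32_t ecx) {
    // The destination is eax, so only the low 32 bits of the result are kept.
    return static_cast<std::uint32_t>(lea(ecx, 0, 1, 1));
}

static_assert(add1_model(41) == 42, "rcx + 1");
```

Because the result is truncated to 32 bits, add1_model(0xFFFFFFFF) wraps around to 0 in this model.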
Integer multiplication
auto multiply(int x, int y) {
int z{x * y};
return z;
}
Integer multiplication with a stack variable.
z$ = 0
x$ = 32
y$ = 40
multiply(int,int) PROC ; multiply
$LN3:
mov DWORD PTR [rsp+16], edx
mov DWORD PTR [rsp+8], ecx
sub rsp, 24
mov eax, DWORD PTR x$[rsp]
imul eax, DWORD PTR y$[rsp]
mov DWORD PTR z$[rsp], eax
mov eax, DWORD PTR z$[rsp]
add rsp, 24
ret 0
multiply(int,int) ENDP ; multiply
This example has more to go through but it's still simple.
mov DWORD PTR [rsp+16], edx
mov DWORD PTR [rsp+8], ecx
sub rsp, 24
The function prolog.
Parameters x and y are stored in two of the four reserved registers and then moved to memory.
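The stack pointer is then reduced by 24 to make room for the function's local data.
Only z lives in this new space, at offset 0; x and y sit above the return address in the space the caller reserved for register parameters, which is why their offsets are now 32 and 40 (24+8 and 24+16).
The rest of the 24 bytes is padding, most likely there to keep the stack 16-byte aligned.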
All memory for the function is reserved up front.
Variables can then be accessed with offsets from rsp instead of having to move it about.
mov eax, DWORD PTR x$[rsp]
imul eax, DWORD PTR y$[rsp]
mov DWORD PTR z$[rsp], eax
mov eax, DWORD PTR z$[rsp]
x is moved from memory into the return register eax and then multiplied by y.
This result is moved into z's address in memory before being moved back to eax as the return value.
Somewhat wasteful.
add rsp, 24
ret 0
We reset the stack pointer to its original address and return.
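The whole unoptimised function can be modelled in C++ to check how the offsets line up. The Frame struct, its size, and the starting rsp are assumptions for illustration, and registers are modelled as unsigned 32-bit values so that overflow wraps the way imul's low 32 bits do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the stack: a byte buffer plus an index playing rsp.
struct Frame {
    unsigned char stack[128]{};
    std::size_t rsp{64};  // arbitrary value of rsp on entry
};

// Models: mov DWORD PTR [rsp+offset], value
void store(Frame& f, std::size_t offset, std::uint32_t value) {
    assert(f.rsp + offset + sizeof value <= sizeof f.stack);
    std::memcpy(&f.stack[f.rsp + offset], &value, sizeof value);
}

// Models: mov reg, DWORD PTR [rsp+offset]
std::uint32_t load(const Frame& f, std::size_t offset) {
    assert(f.rsp + offset + sizeof(std::uint32_t) <= sizeof f.stack);
    std::uint32_t value;
    std::memcpy(&value, &f.stack[f.rsp + offset], sizeof value);
    return value;
}

std::uint32_t multiply_model(std::uint32_t ecx, std::uint32_t edx) {
    Frame f;
    const std::size_t entry_rsp = f.rsp;
    constexpr std::size_t z = 0, x = 32, y = 40;  // z$ = 0, x$ = 32, y$ = 40

    store(f, 16, edx);               // mov DWORD PTR [rsp+16], edx
    store(f, 8, ecx);                // mov DWORD PTR [rsp+8], ecx
    f.rsp -= 24;                     // sub rsp, 24
    std::uint32_t eax = load(f, x);  // mov eax, DWORD PTR x$[rsp]
    eax *= load(f, y);               // imul eax, DWORD PTR y$[rsp]
    store(f, z, eax);                // mov DWORD PTR z$[rsp], eax
    eax = load(f, z);                // mov eax, DWORD PTR z$[rsp]
    f.rsp += 24;                     // add rsp, 24
    assert(f.rsp == entry_rsp);      // the stack pointer is back where it started
    return eax;                      // ret 0
}
```

In this model x is written at entry rsp + 8 and read back at (entry rsp - 24) + 32, which is the same address, so multiply_model(6, 7) returns 42.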
x$ = 8
y$ = 16
multiply(int,int) PROC ; multiply, COMDAT
imul ecx, edx
mov eax, ecx
ret 0
multiply(int,int) ENDP ; multiply
For the optimised version the two parameters are directly multiplied in the registers and moved to eax.