Assembly (Intel)

Compiling & Linking

Bash
# compiling
nasm -g -f elf64 src.asm  # -g adds debug info, -f specifies the 64bit ELF format

# linking
ld -o src src.o  # -o specifies the output name
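
As an illustration, a minimal src.asm that the two commands above should accept could look like the sketch below. It assumes Linux on x86-64, where system call number 60 is exit; the exit status of 0 is an arbitrary choice.

```nasm
; src.asm: a program that does nothing but exit (assumes Linux x86-64)
global _start          ; make the entry point visible to ld

section .text
_start:
    mov eax, 60        ; system call number for exit on Linux x86-64
    xor edi, edi       ; first argument: exit status 0
    syscall            ; hand control to the kernel; exit does not return
```

After linking, running ./src and then echo $? should print 0.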

Basics

Exports & Sections

Symbols can be exported so that the linker can resolve references to them, including references from other object files.

Every program in ELF has several sections:

  • data: global variables
  • rodata: global constants (read-only data)
  • bss: uninitialized data; the space is reserved (and zero-filled) at program startup
  • text: CPU instructions
GAS
; export the '_start' symbol for linking
global _start

; specify a section
section .data
section .text
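
To make the four sections concrete, here is a small sketch that places one item in each of them. The names counter, msg, msg_len and buffer are made up for this example, and the system call numbers (1 for write, 60 for exit) assume Linux on x86-64.

```nasm
; Sketch: one item per section (assumes Linux x86-64 for the system calls)
global _start

section .data
counter: dq 0                    ; global variable: initialized and writable

section .rodata
msg:     db "hello", 10          ; global constant: text followed by a newline
msg_len  equ $ - msg             ; assemble-time length of msg (6 bytes)

section .bss
buffer:  resb 64                 ; 64 bytes reserved, no bytes stored in the file

section .text
_start:
    mov eax, 1                   ; system call number for write
    mov edi, 1                   ; file descriptor 1: standard output
    mov rsi, msg                 ; address of the bytes to write
    mov edx, msg_len             ; number of bytes to write
    syscall

    mov eax, 60                  ; system call number for exit
    xor edi, edi                 ; exit status 0
    syscall
```

counter and buffer are not used by the code; they are only there to show where such data would live.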

Labels & Declarations

Anything that's at the beginning of a line and is followed by a colon (:) is a label. A label is a name for the address of whatever follows it.

Declaration instructions:

  • db: declare bytes
  • dw: declare word (2 bytes)
  • dd: declare double word (4 bytes)
  • dq: declare quad word (8 bytes)
  • equ: set a name to the value of an expression

Note: See the NASM manual, section 3.2.1 for the full list.
Note: multi-byte values (dw, dd, dq) are stored in little-endian order, i.e. least significant byte first.

GAS
arr: db 0x12,0x34,0x56,0x78,0x90
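
The sketch below shows how each declaration lays out its bytes in memory, and how equ can compute a length at assembly time. The label names val_b, val_w, val_d, val_q and arr_len are hypothetical and only serve the example.

```nasm
section .data
val_b: db 0x12                ; 1 byte in memory:  12
val_w: dw 0x1234              ; 2 bytes in memory: 34 12
val_d: dd 0x12345678          ; 4 bytes in memory: 78 56 34 12
val_q: dq 0x1122334455667788  ; 8 bytes in memory: 88 77 66 55 44 33 22 11

arr:   db 0x12,0x34,0x56,0x78,0x90  ; a db list keeps the written order: 12 34 56 78 90
arr_len equ $ - arr                 ; $ is the current address, so arr_len is the constant 5
```

Unlike the other declarations, equ does not emit any bytes; arr_len only exists while assembling.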

Registers

There are several registers available on x86_64.
Some serve a specific purpose (e.g. registers for storing floating-point numbers), while others are called "general purpose" registers.

There are 16 of them:

  • rax: accumulator
  • rbx: base
  • rcx: counter
  • rdx: data
  • rsp and rbp: stack pointer and base pointer
  • rsi and rdi: source and destination index
  • r8 through r15: lack of creativity

The prefix r means that instructions will use all 64 bits in the registers.

For all those registers, except r8 through r15 (which use the suffixes d, w and b instead, e.g. r8d, r8w, r8b), it's possible to access:

  • the lowest 32 bits with e prefix (e.g. eax, ebp)
  • the lowest 16 bits without any prefix (e.g. ax, si)

For registers rax through rdx, it's possible to access:

  • the lowest byte with the l suffix, replacing the trailing x (e.g. al)
  • the high byte of the lowest 16 bits with the h suffix, in the same way (e.g. ah)
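
As a sketch of how these names overlap, the program below loads one 64-bit value and reads part of it back through a narrower name. The constant is arbitrary, and the exit system call (number 60) assumes Linux on x86-64.

```nasm
global _start

section .text
_start:
    mov rax, 0x1122334455667788  ; fill all 64 bits of rax
    ; the same register, seen through its narrower names:
    ;   eax = 0x55667788  (lowest 32 bits)
    ;   ax  = 0x7788      (lowest 16 bits)
    ;   ah  = 0x77        (high byte of ax)
    ;   al  = 0x88        (lowest byte)

    movzx edi, ah                ; edi = 0x77, zero-extended; used as exit status
    mov eax, 60                  ; system call number for exit (Linux x86-64)
    syscall
```

Running the program and then echo $? should print 119, which is 0x77. Note that writing to a 32-bit name such as eax or edi clears the upper 32 bits of the full register, while writing to a 16-bit or 8-bit name leaves the remaining bits unchanged.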

Instructions

Instructions are operations that the CPU knows how to execute directly.
They are separated from their operands by whitespace, and the operands are separated from each other with commas.

GAS
<instr> <operand1>, <operand2>, ..., <operand_n>

; Intel syntax dictates the first operand is the destination, and the second is the source
<instr> DEST, SOURCE
GAS
mov eax, 0x12345678  ; copies 4 bytes to eax
inc rdi  ; INC: increment
dec rsi  ; DEC: decrement

add

Adds the two operands and stores the result in the destination.

GAS
add rdi, rbx  ; Equivalent to rdi += rbx

sub

Subtracts the source operand from the destination and stores the result in the destination.

GAS
sub rsi, rbx  ; Equivalent to rsi -= rbx
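
A short sketch combining add and sub, with the expected register values in the comments. The numbers are arbitrary, and the exit system call assumes Linux on x86-64.

```nasm
global _start

section .text
_start:
    mov rdi, 10
    mov rbx, 3
    add rdi, rbx            ; rdi = 10 + 3 = 13

    mov rsi, 20
    sub rsi, rbx            ; rsi = 20 - 3 = 17
    sub rsi, rdi            ; rsi = 17 - 13 = 4 (destination minus source)

    mov edi, esi            ; exit status = 4
    mov eax, 60             ; system call number for exit (Linux x86-64)
    syscall
```

The order matters for sub: sub rsi, rdi computes rsi - rdi, never rdi - rsi.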

mul, div, imul, idiv

mul and div interpret their operands as unsigned integers.
imul and idiv interpret their operands as signed integers in two's complement.

mul and div instructions take a single operand because they use fixed registers for the other number.

For mul, the result is rax * <operand>, and it's a 128-bit value stored in rdx:rax, meaning the 64 upper bits are stored in rdx, while the 64 lower bits are stored in rax.

For div, the operand is the divisor and the dividend is rdx:rax, meaning it's a 128-bit value whose 64 upper bits are in rdx and whose 64 lower bits are in rax.
The quotient is a 64-bit value stored in rax, and the remainder is also a 64-bit value, stored in rdx.
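
The fixed registers are easier to follow in a worked example. The sketch below multiplies 7 by 6 and then divides the result by 5; the numbers are arbitrary, and the exit system call assumes Linux on x86-64.

```nasm
global _start

section .text
_start:
    ; unsigned multiply: rdx:rax = rax * rbx
    mov rax, 7
    mov rbx, 6
    mul rbx                 ; rax = 42 (lower half), rdx = 0 (upper half)

    ; unsigned divide: rdx:rax / rcx
    xor edx, edx            ; clear the upper 64 bits of the dividend first
    mov rcx, 5
    div rcx                 ; rax = 8 (quotient), rdx = 2 (remainder)

    mov edi, edx            ; exit status = remainder, i.e. 2
    mov eax, 60             ; system call number for exit (Linux x86-64)
    syscall
```

Forgetting to set rdx before div is a common mistake, since whatever is left in rdx becomes the upper half of the dividend. For idiv, the usual counterpart to clearing rdx is cqo, which sign-extends rax into rdx:rax.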

and, or, xor

GAS
and rdi, rsi  ; bitwise AND
or rdi, rsi  ; bitwise OR
xor rdi, rsi  ; bitwise XOR
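
The following sketch applies the three operations to the same pair of 4-bit patterns, so the results can be compared bit by bit. The values are arbitrary, and the exit system call assumes Linux on x86-64.

```nasm
global _start

section .text
_start:
    mov rdi, 0b1100
    mov rsi, 0b1010

    mov rax, rdi
    and rax, rsi            ; rax = 0b1000: bits set in both operands

    mov rbx, rdi
    or  rbx, rsi            ; rbx = 0b1110: bits set in at least one operand

    xor rdi, rsi            ; rdi = 0b0110: bits set in exactly one operand

    mov eax, 60             ; system call number for exit (Linux x86-64)
    syscall                 ; exit status is rdi = 0b0110 = 6
```

A frequent idiom built on this is xor eax, eax, which sets a register to zero because every bit equals itself.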

shr, shl

GAS
shr rsi, 2  ; right (logical) bitshift: equivalent to rsi >> 2
shl rsi, 3  ; left (logical) bitshift: equivalent to rsi << 3

Note: there's sar for arithmetic right shift, which copies the sign bit into the vacated bits, and sal for arithmetic left shift, which behaves exactly like shl.
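
The difference between the logical and arithmetic right shift only shows up on negative values, as in this sketch. The values are arbitrary, and the exit system call assumes Linux on x86-64.

```nasm
global _start

section .text
_start:
    mov rax, -16            ; 0xFFFFFFFFFFFFFFF0
    sar rax, 2              ; arithmetic: the sign bit is copied in, rax = -4

    mov rbx, -16
    shr rbx, 2              ; logical: zeros are shifted in, rbx = 0x3FFFFFFFFFFFFFFC

    mov rdi, 5
    shl rdi, 3              ; rdi = 5 * 8 = 40, used as the exit status
    mov eax, 60             ; system call number for exit (Linux x86-64)
    syscall
```

Shifting left by n multiplies by 2^n, and sar by n divides a signed value by 2^n, rounding toward negative infinity.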