Confused about 64-bit registers - ASM

Question

I'm currently learning assembly, using NASM with Intel syntax on 64-bit Ubuntu.

I found two websites that list the syscall numbers:

This one for 32 bit registers (eax, ebx, ...): https://syscalls.kernelgrok.com

This one for 64 bits registers (rax, rbx, ...): https://blog.rchapman.org/posts/Linux_System_Call_Table_for_x86_64

The thing is that my code doesn't work when I use the 64-bit syscall numbers, but it works when I replace the 'e' in the 32-bit register names with an 'r'. For instance, in sys_write I use rbx to store the fd instead of rdi, and it works.

I'm quite lost right now. This code doesn't work:

message db 'Hello, World', 10

section .text
global _start
_start: mov rax,4
        mov rdi, 1
        mov rsi, message
        mov rdx, 13
        syscall
        mov rax, 1
        mov rdi, 0
        syscall

If you write 64 bit code, you should use 64 bit system calls. Note that the arguments go into different registers than with 32 bit system calls. Please post your code if you want debug help. — fuz, Mar 26 '20 at 18:21
Are you also changing your code from using `int 0x80` to `syscall`? — David Wohlferd, Mar 26 '20 at 18:25
```
section .data
message db 'Hello, World', 10
section .text
global _start
_start: mov rax,4
        mov rbx, 1
        mov rcx, message
        mov rdx, 13
        int 80h
        mov rax, 1
        mov rdi, 0
        int 80h
```
this code works – Ajvar, Mar 26 '20 at 18:27
Put the code you claim doesn't work into your question with an [edit]. `int 0x80` system calls only ever look at the low 32 bits of registers which is why you shouldn't use them in 64-bit code. But that means your code would work fine with `mov ecx, message` and so on, *if* it works with `mov rcx, message`. See [What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?](https://stackoverflow.com/q/46087730). Were you putting the pointer in `esi` or `rsi` before `int 0x80`? Of course that doesn't work, the calling convention and call numbers differ, too. — Peter Cordes, Mar 26 '20 at 18:31

score 4 · Accepted Answer · answered Mar 26 '20 at 18:35

Run `strace ./my_program`: you make a bogus `stat` system call, then a `write` that succeeds, then fall off the end and segfault.

$ strace ./foo 
execve("./foo", ["./foo"], 0x7ffe6b91aa00 /* 51 vars */) = 0
stat(0x1, 0x401000)                     = -1 EFAULT (Bad address)
write(0, "Hello, World\n", 13Hello, World
)          = 13
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0xd} ---
+++ killed by SIGSEGV (core dumped) +++
Segmentation fault (core dumped)

It's not register names that are your problem, it's call numbers. You're using 32-bit call numbers but calling the 64-bit `syscall` ABI, where 4 is `stat` and 1 is `write` (in the 32-bit `int 0x80` ABI they are `write` and `exit`).

Call numbers and calling convention both differ.
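As a minimal sketch of what the question's program could look like with the x86-64 numbers (1 for `write`, 60 for `exit`): the file name, the `msglen` constant, and the use of `lea` with `rel` are my additions, and `mov rsi, message` as in the question should also work in a static executable.

```nasm
; hello64.asm (hypothetical file name)
; Build: nasm -f elf64 hello64.asm && ld -o hello64 hello64.o

section .data
message db 'Hello, World', 10
msglen  equ $ - message         ; assembler computes the length (13 bytes)

section .text
global _start
_start:
        mov eax, 1              ; write is call number 1 in the 64-bit ABI
        mov edi, 1              ; 1st arg: fd 1 (stdout)
        lea rsi, [rel message]  ; 2nd arg: pointer to the buffer
        mov edx, msglen         ; 3rd arg: byte count
        syscall
        mov eax, 60             ; exit is call number 60 in the 64-bit ABI
        xor edi, edi            ; 1st arg: exit status 0
        syscall
```

Running it under `strace` should then show a `write(1, ...)` followed by `exit(0)` instead of the `stat` and the segfault.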

`int 0x80` system calls only ever look at the low 32 bits of registers, which is why you shouldn't use them in 64-bit code.

The code you posted in a comment with `mov rcx, message` would work fine with `mov ecx, message` and so on, if it works with `mov rcx, message`. See [What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?](https://stackoverflow.com/q/46087730).
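For comparison, here is a sketch of the same program written for the 32-bit `int 0x80` ABI and built as a genuine 32-bit executable. The file name and build commands are assumptions for the example. Under this ABI the arguments go in `ebx`, `ecx`, `edx`, so the exit status belongs in `ebx` rather than `edi`.

```nasm
; hello32.asm (hypothetical file name)
; Build: nasm -f elf32 hello32.asm && ld -m elf_i386 -o hello32 hello32.o

section .data
message db 'Hello, World', 10

section .text
global _start
_start:
        mov eax, 4              ; write is call number 4 in the 32-bit ABI
        mov ebx, 1              ; 1st arg: fd 1 (stdout)
        mov ecx, message        ; 2nd arg: pointer to the buffer
        mov edx, 13             ; 3rd arg: byte count
        int 0x80
        mov eax, 1              ; exit is call number 1 in the 32-bit ABI
        xor ebx, ebx            ; 1st arg: exit status 0
        int 0x80
```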

Note that writing a 32-bit register zero-extends into the full 64-bit register, so you should always use `mov edi, 1` instead of `mov rdi, 1`. (NASM will do this optimization for you to save code size; the two are so equivalent that some assemblers do it silently.)
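To make the code-size point concrete, this sketch lists the encodings I'd expect NASM to produce for the different ways of putting 1 into `rdi`. The `strict` keyword is only there to stop NASM from choosing the short form on its own. The file name and the suggestion to inspect the object file with `objdump -d` are assumptions for the example.

```nasm
; movsize.asm (hypothetical file name)
; Build: nasm -f elf64 movsize.asm, then inspect with: objdump -d -M intel movsize.o

section .text
global _start
_start:
        mov edi, 1                  ; BF 01 00 00 00 (5 bytes), upper 32 bits of rdi zeroed
        mov rdi, 1                  ; NASM normally emits the same 5 bytes as the line above
        mov rdi, strict dword 1     ; 48 C7 C7 01 00 00 00 (7 bytes), sign-extended imm32
        mov rdi, strict qword 1     ; 48 BF + 8 immediate bytes (10 bytes), full imm64
        mov eax, 60                 ; exit: rdi holds 1 whichever encoding was used
        syscall
```

All four instructions leave the same value in `rdi`; only the instruction length differs.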

Peter Cordes


Thanks mate, I got really confused with all the 32/64-bit stuff, you really helped me here! Also I think it's because I'm working on Mac AND Ubuntu, and the Mac OS 64-bit syscall numbers seem to be the same as Linux 32-bit – Ajvar Mar 26 '20 at 18:39
2

@Ajvar The system call numbers are different on each system. For historical reasons, some system call numbers are the same on many systems but you shouldn't rely on that. The best way to do system calls is to use the libc wrappers as those are portable, allowing your program to be run on multiple systems. – fuz Mar 26 '20 at 18:46
And for the zero-extend thing, shouldn't I use 8-bit registers then? – Ajvar Mar 26 '20 at 18:47
1

@Ajvar No, as writes to 8-bit registers are not automatically zero-extended to the whole 64 bits. This only happens when you write to a 32-bit register. – fuz Mar 26 '20 at 19:28
1

@Ajvar: [Why do x86-64 instructions on 32-bit registers zero the upper part of the full 64-bit register?](https://stackoverflow.com/q/11177137) Unfortunately no x86 extension has ever added an opcode for `mov r/m32, sign_extended_imm8` which would save 2 bytes on every mov-immediate to register, or more for negative values to 64-bit regs. And 3 bytes per instruction with a memory destination. AMD64 could have, since it freed up several opcodes, but AMD64 was very conservative perhaps because AMD weren't sure it would even catch on :( – Peter Cordes Mar 26 '20 at 19:58
@Ajvar: the base call numbers on MacOS are similar to Linux, but you need to set some high bits as flags for 64-bit MacOS syscalls. – Peter Cordes Mar 26 '20 at 20:02
