I'm beginning to learn x86_64 assembly programming with the NASM assembler on Ubuntu Linux. One of the things I'm having trouble with is figuring out which registers are magically used by the operations.

The book I'm reading has code samples like this:

mov    rdi, fmt1
mov    rsi, strng
mov    rax, 0
call   printf

; How am I supposed to know which registers are used by the call to printf? 
; The libc printf function supports an arbitrary number of parameters. 
; Clearly there aren't an unlimited number of registers in x86_64 so how does this work
; as the parameter list grows?

And another part of the code sample is this:

xor    rax, rax
mov    rbx, strng
mov    rcx, strLen
mov    r12, 0
pushLoop:
    mov    al, byte[rbx + r12]
    push   rax
    inc    r12
    loop   pushLoop
; It took me a few seconds to find out where the exit condition is. I realized that
; rcx is being compared to r12 in some way, but I'm not sure how. Is it explained anywhere?

I'm not sure where I should be looking for the answer to my first question. My hunch is that the answer to my second question is in the NASM documentation somewhere, but I'm not sure where to find it. I'm trying to relate these constructs to what I know from high-level languages, but I'm struggling.

Thank you!

Peter Cordes
  • See the [ABI](https://uclibc.org/docs/psABI-x86_64.pdf). Library functions generally use registers in the same way as any other. Para 3.2 describes calling conventions. – Gene Jan 22 '20 at 04:48

2 Answers

First part: all library functions follow the standard calling convention. On all x86-64 platforms other than Windows, that's the x86-64 System V ABI.

You can make up your own conventions when writing your own asm functions, like returning multiple different values in multiple registers instead of limiting yourself to only what you could get a C compiler to do.

(e.g. you could write a memcmp that returns the position of the first difference in RDI and the actual < = or > in FLAGS, e.g. from doing a cmp on the mismatching bytes.)

But compiler-generated functions you can call from asm (including C standard library functions) will always follow the ABI.
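
As a rough sketch of what that convention means for the printf call in the question once the argument list grows: under the x86-64 System V ABI the first six integer or pointer arguments go in RDI, RSI, RDX, RCX, R8 and R9, and any further ones go on the stack. For a variadic function, AL also carries the number of vector registers used, which is why the book's sample sets RAX to 0. The label `fmt`, the file name, and the build command in the first comment are assumptions made for illustration; they are not from the book.

```nasm
; Assumed build: nasm -felf64 args.asm && gcc -no-pie args.o -o args
default rel
extern  printf
global  main

section .rodata
fmt:    db "%d %d %d %d %d %d %d", 10, 0

section .text
main:
    push    rbp             ; one 8-byte push realigns RSP to 16 bytes

    ; 8 arguments in total: the format string plus 7 integers.
    ; Arguments 7 and 8 do not fit in registers, so they go on the
    ; stack, pushed right to left. Two pushes keep the 16-byte alignment.
    push    7               ; 8th argument
    push    6               ; 7th argument
    lea     rdi, [fmt]      ; 1st argument: format string
    mov     esi, 1          ; 2nd
    mov     edx, 2          ; 3rd
    mov     ecx, 3          ; 4th
    mov     r8d, 4          ; 5th
    mov     r9d, 5          ; 6th
    xor     eax, eax        ; AL = 0: no vector registers carry arguments
    call    printf          ; prints "1 2 3 4 5 6 7"
    add     rsp, 16         ; the caller removes its stack arguments

    xor     eax, eax        ; return 0 from main
    pop     rbp
    ret
```

The stack must be 16-byte aligned at the point of the call, which is why the sketch tracks every push.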


Second part: some instructions use registers implicitly. Check the ISA manual entry for the instruction in question; if you don't know what an instruction does, don't just guess from its name.

You can single-step in a debugger that highlights register-value changes to help you notice any case where a register changes that you weren't expecting at all.

Look instructions up in Intel's vol. 2 manual (or AMD's equivalent), e.g. in the HTML extract of Intel's PDF at https://www.felixcloutier.com/x86/, specifically the entry for loop. Also, How exactly does the x86 LOOP instruction work? explains that it's like dec rcx / jnz, except that it doesn't modify FLAGS.
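
To make that concrete, here is a small sketch that runs the same count once with loop and once with the dec / jnz pair. In the question's pushLoop, R12 is only the index into the string; the exit condition comes entirely from RCX. The local label names, the use of EDI as an iteration counter, and the build command are assumptions for illustration.

```nasm
; Assumed build: nasm -felf64 loop.asm && ld loop.o -o loop
; Run ./loop and then "echo $?" to see the exit status.
global  _start

section .text
_start:
    xor     edi, edi        ; EDI counts how many times a body runs

    mov     ecx, 5          ; loop count goes in RCX (writing ECX zero-extends)
.with_loop:
    inc     edi             ; loop body
    loop    .with_loop      ; RCX -= 1, jump if RCX != 0, FLAGS unmodified

    mov     ecx, 5
.with_dec:
    inc     edi             ; same body
    dec     rcx             ; RCX -= 1, sets ZF when the result is zero
    jnz     .with_dec       ; jump while RCX != 0

    mov     eax, 60         ; exit system call number on x86-64 Linux
    syscall                 ; exit status is EDI = 10 (5 + 5 iterations)
```

Note that both forms decrement before testing, so entering with RCX = 0 does not skip the loop; it wraps around. A jrcxz (or test / jz) before the loop guards against that.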

There aren't that many instructions with implicit operands. The most commonly used ones are stack instructions like push/pop implicitly using RSP in the obvious way.

The other notable ones include E/RAX and E/RDX being used by one-operand [i]mul and [i]div. (And cdq to sign-extend EAX into EDX:EAX to set up for idiv, or cqo to sign-extend RAX into RDX:RAX for the 64-bit form. cdqe sign-extends EAX into RAX.)
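
A minimal sketch of those implicit operands in a 32-bit signed division: the only register named in the idiv instruction is the divisor, while the dividend (EDX:EAX) and both results (EAX, EDX) are implied. The values and the build command are illustrative assumptions.

```nasm
; Assumed build: nasm -felf64 div.asm && ld div.o -o div
global  _start

section .text
_start:
    mov     eax, 47         ; dividend, low half
    cdq                     ; sign-extend EAX into EDX:EAX (EDX = 0 here)
    mov     ecx, 5          ; divisor, the only explicit operand
    idiv    ecx             ; EAX = quotient (9), EDX = remainder (2)

    mov     edi, eax        ; use the quotient as the exit status
    mov     eax, 60         ; exit system call number on x86-64 Linux
    syscall                 ; exit status 9
```

Forgetting the cdq (or xor edx, edx before an unsigned div) leaves stale data in EDX as the high half of the dividend, which is a common source of wrong results or divide faults.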

CL for variable shift counts is implicit in the machine code, but explicit in asm source (like shr rdx, cl).

rep-"string" instructions implicitly use RCX, plus RSI and/or RDI.
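
For example, a copy with rep movsb names no registers at all in the instruction itself. The sketch below assumes Linux system calls for output and exit; the labels `src`, `dst`, `len` and the build command are made up for illustration.

```nasm
; Assumed build: nasm -felf64 copy.asm && ld copy.o -o copy
global  _start

section .rodata
src:    db "hello", 10
len     equ $ - src         ; number of bytes in src

section .bss
dst:    resb len

section .text
_start:
    lea     rsi, [rel src]  ; implicit source pointer
    lea     rdi, [rel dst]  ; implicit destination pointer
    mov     ecx, len        ; implicit repeat count in RCX
    cld                     ; DF = 0 so RSI and RDI count upward
    rep movsb               ; copy RCX bytes from [RSI] to [RDI]

    mov     eax, 1          ; write system call number
    mov     edi, 1          ; file descriptor 1 (stdout)
    lea     rsi, [rel dst]  ; buffer to write
    mov     edx, len        ; byte count
    syscall                 ; prints "hello"

    mov     eax, 60         ; exit system call number
    xor     edi, edi        ; status 0
    syscall
```

After the rep movsb, RCX is 0 and RSI and RDI point just past the copied bytes, which is why the sketch reloads RSI before the write.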

Most of these implicit uses come from old 8086 history. See Why is there not a register that contains the higher bytes of EAX?. Instructions like loop and jrcxz aren't used by compilers because they're slow, and the 2-operand form of imul, like imul ecx, edx, is faster when you don't need the high-half result in EDX/RDX.

This is not an exhaustive list. cmpxchg / cmpxchg16b, xlat, cpuid, rdtsc, rdpmc, and many others have implicit operands, but only a few of the instructions that get used regularly by compilers do.

Note that FLAGS is an implicit input to many instructions, like adc and cmov.


NASM has an appendix that lists all instructions, but generally assemblers leave that up to CPU vendors. All x86-64 assemblers produce machine code for the same instructions. This bugfixed fork of an older version of that doc keeps English descriptions of instructions. (Mainline NASM removed that for space after adding SSE instructions; there are just too many to do more than list in one flat page these days, with AVX2 and especially AVX512.)

Peter Cordes
  • Note that insref.htm (your answer's last link) doesn't list amd64 instructions, operations, or registers. So it doesn't fit entirely here. – ecm Jan 25 '20 at 20:37
  • @ecm: oh right. Well that just reinforces my point that assembler manuals are not where you should look for instruction-set listings, at least not for x86. It's still useful if you know how x86-64 extended things, and which opcodes it removed... – Peter Cordes Jan 25 '20 at 20:43
  1. You're asking about the calling conventions used on Linux x86-64. These follow the System V ABI. That document explains all these details. Calling conventions are in Section 3.2 of the v1.0 document. The short and oversimplified answer to your specific question is that the first 6 integer or pointer arguments are passed in registers (rdi, rsi, rdx, rcx, r8, r9, in that order); if there are more than that, they are passed on the stack. (Life gets more complicated if some of the arguments have types other than integer or pointer.)

    This is also where you find details about which registers may or may not be modified by a called function. For example, the call to printf might modify the rdx register, but not rbx (or if it does, it will save the previous value and restore it before returning).

  2. The details of what instructions do are usually considered part of the documentation of the processor, not of the assembler. So the official source would be the software developer's manual from the processor vendor. Here is Intel and here is AMD (see the "AMD64 Architecture" documents). There are also many third-party manuals explaining the instruction set. felixcloutier.com is a popular one. Here is the loop instruction; you can see that it decrements rcx on each iteration and exits when it reaches zero.
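
To illustrate the point in item 1 about which registers survive a call, here is a hedged sketch that keeps a counter in rbx across repeated calls to printf. It assumes the System V ABI, linking through gcc, and a non-PIE build; the labels and the build command are illustrative.

```nasm
; Assumed build: nasm -felf64 keep.asm && gcc -no-pie keep.o -o keep
default rel
extern  printf
global  main

section .rodata
fmt:    db "i = %d", 10, 0

section .text
main:
    push    rbx             ; rbx is call-preserved, so save our caller's
                            ; value; the push also realigns RSP to 16
    mov     ebx, 3          ; counter lives in a call-preserved register
.again:
    lea     rdi, [fmt]      ; 1st argument: format string
    mov     esi, ebx        ; 2nd argument: current counter value
    xor     eax, eax        ; AL = 0: no vector registers carry arguments
    call    printf          ; may clobber rax, rcx, rdx, rsi, rdi, r8-r11;
                            ; preserves rbx, rbp, r12-r15
    dec     ebx
    jnz     .again          ; prints i = 3, i = 2, i = 1

    xor     eax, eax        ; return 0 from main
    pop     rbx             ; restore our caller's rbx
    ret
```

Had the counter been kept in rcx and driven by the loop instruction, the call to printf would have been free to overwrite it on every iteration.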

Nate Eldredge