
I'm trying to read the values of the registers rdi, rsi, rdx, rcx and r8 into C variables, but I'm getting wrong values. I don't know whether my code actually reads those registers or instead tells the compiler to write to them. If it's the latter, how can I achieve what I'm trying to do (put the values of the registers into C variables)?

When I compile this code (with gcc -S test.c):

#include <stdio.h>

void    beautiful_function(int a, int b, int c, int d, int e) {
    register long   rdi asm("rdi");
    register long   rsi asm("rsi");
    register long   rdx asm("rdx");
    register long   rcx asm("rcx");
    register long   r8 asm("r8");

    const long      save_rdi = rdi;
    const long      save_rsi = rsi;
    const long      save_rdx = rdx;
    const long      save_rcx = rcx;
    const long      save_r8 = r8;
    printf("%ld\n%ld\n%ld\n%ld\n%ld\n", save_rdi, save_rsi, save_rdx, save_rcx, save_r8);
}

int main(void) {
    beautiful_function(1, 2, 3, 4, 5);
}

it outputs the following assembly code (before the function call):

    movl    $1, %edi
    movl    $2, %esi
    movl    $3, %edx
    movl    $4, %ecx
    movl    $5, %r8d
    callq   _beautiful_function

When I compile and run it, it prints this:

0
0
4294967296
140732705630496
140732705630520
(some undefined values)

What did I do wrong, and how can I do this?

Peter Cordes
Fayeure
  • What does the assembly look like for your function? – Shawn Jun 06 '21 at 22:22
  • You can't *reliably* get the registers as they existed before the call. `asm("rdi")` etc are only hints to the compiler and it may not use that register. The `asm("rdi")` will be used if the variable appears as an input or output operand of inline assembly. If you created a naked function with all basic inline assembly you'd be able to do it I guess but you can't mix _C_ code or make references to parameters. – Michael Petch Jun 06 '21 at 22:25
  • Write some inline assembler to store the parameter registers into local variables. So, just declare the "save_rrr" variables followed by an asm block with "mov save_edi,edi" (in the correct form for compiler!). As Michael says, the "register" keyword is just a hint to the compiler. – Skizz Jun 06 '21 at 22:30
  • Is it possible that you compiled the runnable program with other options (like optimizations on with -O#) that you didn't pass to the compiler when you specified -S? – Michael Petch Jun 06 '21 at 22:40
  • cant reproduce https://godbolt.org/z/38vGK516e – 0___________ Jun 06 '21 at 23:18
  • @MichaelPetch you can in this case because it is defined by ABI. – 0___________ Jun 06 '21 at 23:23
  • According to the [docs](https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html) for using local register variables: *The only supported use for this feature is to specify registers for input and output operands when calling Extended asm.* Given that you have no extended asm in this function, I'm not sure what results you were expecting. If you use a feature in an unsupported fashion, undefined behavior is the inevitable result. – David Wohlferd Jun 07 '21 at 00:04
  • What's actually the goal here? Even if you're able to read the registers within `beautiful_function`, that won't guarantee you get the same values that were in those registers when the function was called; the compiler is free to insert other code in `beautiful_function` that overwrites the registers. So whatever your goal, this is probably not a good way to achieve it. – Nate Eldredge Jun 07 '21 at 00:20
  • If you want to see the register contents for debugging, use your debugger to insert breakpoints and dump registers. If you want to see how parameters are passed, read the ABI and the generated assembly. If you really want to use the register values as of the function call, write the entire function in raw assembly like Michael Petch suggests. Or, for the registers that are used for parameter passing, simply define parameters for your function and use them like normal C. – Nate Eldredge Jun 07 '21 at 00:21
  • @0___________ : The ABI says what registers get used for a standard ABI call, but with optimizations that could all change. The ABI also doesn't guarantee that the prologue code after the function starts and before you access the variables hasn't put the value in different registers or clobbered some of them. It isn't likely at all, BUT it is not behaviour you can *reliably* rely on. If you turn optimizations on this would almost certainly fail. – Michael Petch Jun 07 '21 at 00:32
  • @0___________: your Godbolt link uses separate compiler options for the asm view vs. the "execution" pane. You were looking at gcc11 -O0 execution output (which does happen to work; apparently https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html `register ..asm()` local vars in debug mode do something even when not using them with Extended Asm, the only *guaranteed* behaviour). https://godbolt.org/z/qqEdKadEe shows that after inlining, reading uninitialized register vars is just garbage. (I used the "run the executable" option for a single compiler pane) – Peter Cordes Jun 07 '21 at 00:36
  • @0___________ https://godbolt.org/z/TETWfrdsh shows that even `clang -O0` (with no inter-procedural optimizations) still prints garbage. Only GCC without inlining / inter-procedural optimization "works", and even then it's a happens-to-work undocumented behaviour. (Support for it was intentionally removed from documentation a few years ago.) – Peter Cordes Jun 07 '21 at 00:39
  • @MichaelPetch: re: *Is it possible that you compiled the runnable program with other options (like optimizations on* - This looks like MacOS, so `gcc` is actually clang, which breaks this code even at `-O0`. (`callq _beautiful_function` leading `_`, plus x86-64 SysV calling convention.) Clang treats the uninitialized `register long rdi asm("rdi");` as just an ordinary variable, and reads its value from its uninitialized stack memory, not from RDI like GCC happens to. – Peter Cordes Jun 07 '21 at 00:47

2 Answers


Your code didn't work because the GCC documentation, in Specifying Registers for Local Variables, explicitly tells you not to do what you did:

The only supported use for this feature is to specify registers for input and output operands when calling Extended asm (see Extended Asm).

Other than when invoking the Extended asm, the contents of the specified register are not guaranteed. For this reason, the following uses are explicitly not supported. If they appear to work, it is only happenstance, and may stop working as intended due to (seemingly) unrelated changes in surrounding code, or even minor changes in the optimization of a future version of gcc:

  • Passing parameters to or from Basic asm
  • Passing parameters to or from Extended asm without using input or output operands.
  • Passing parameters to or from routines written in assembler (or other languages) using non-standard calling conventions.

To put the value of registers in variables, you can use Extended asm, like this:

long rdi, rsi, rdx, rcx;
register long r8 asm("r8");
asm("" : "=D"(rdi), "=S"(rsi), "=d"(rdx), "=c"(rcx), "=r"(r8));

But note that even this might not do what you want: the compiler is within its rights to copy the function's parameters elsewhere and reuse the registers for something different before your Extended asm runs, or even to not pass the parameters at all if you never read them through the normal C variables. (And indeed, even what I posted doesn't work when optimizations are enabled.) You should strongly consider just writing your whole function in assembly instead of inline assembly inside of a C function if you want to do what you're doing.
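If the goal is only to get at the five values that were passed, the alternative suggested in the comments is to read the parameters as ordinary C variables. A minimal sketch of that idea follows; save_args and its out array are hypothetical names introduced here for illustration, not part of the original code:

```c
/* Hypothetical helper (not from the original post): stores the five
 * arguments in out[] using plain C. The compiler is responsible for
 * knowing which register or stack slot currently holds each one. */
void save_args(long out[5], int a, int b, int c, int d, int e) {
    out[0] = a;   /* each int is widened to long by the assignment */
    out[1] = b;
    out[2] = c;
    out[3] = d;
    out[4] = e;
}
```

This stays correct under inlining and at any optimization level, because it never depends on which register holds a value at a particular point in the function.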

  • Indeed, that does work to read registers, but clang will optimize away the code in `main` that sets those register values (assuming you enable optimization), because the actual `int a` C variable args are unread. Even with `__attribute__((noinline))`. I was working on an answer with an example when you posted this; I just got back to it and finished. – Peter Cordes Jun 07 '21 at 02:14

Even if you had a valid way of doing this (which this isn't), it probably only makes sense at the top of a function which isn't inlined. So you'd probably need __attribute__((noinline, noclone)). (noclone is a GCC attribute that clang will warn about not recognizing; it means not to make an alternate version of the function with fewer actual args, to be called in the case where some of them are known constants that can get propagated into the clone.)

register ... asm local vars aren't guaranteed to do anything except when used as operands to Extended Asm statements. GCC does sometimes still read the named register if you leave it uninitialized, but clang doesn't. (And it looks like you're on a Mac, where the gcc command is actually clang, because so many build scripts use gcc instead of cc.)
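For contrast, here is a minimal sketch of the one use the GCC documentation does support: a register-asm local that is initialized in C and then passed as an operand to an Extended asm statement. The function name through_rdi is hypothetical, the #if guard and its fallback branch are my own additions so the snippet still compiles on targets other than x86-64, and __asm__ is used instead of asm only so that it also builds with -std=c99.

```c
/* Hypothetical helper (not from the original post). */
long through_rdi(long value) {
#if defined(__x86_64__) && defined(__GNUC__)   /* GCC and clang both define __GNUC__ */
    register long rdi __asm__("rdi") = value;  /* initialized, so nothing uninitialized is read */
    /* Empty template. "+r" makes rdi an input/output operand, so the compiler
     * must have `value` in RDI here and must assume the asm may have changed it. */
    __asm__ volatile("" : "+r"(rdi));
    return rdi;
#else
    return value;   /* assumption: plain fallback on targets without RDI */
#endif
}
```

Note that this says nothing about what RDI held on entry to the function. It only pins the variable to RDI at the asm statement, which is exactly the guarantee the documentation describes.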

So even without optimization, the stand-alone non-inlined version of your beautiful_function is just reading uninitialized stack space when it reads your rdi C variable in const long save_rdi = rdi;. (GCC does happen to do what you wanted here, even at -Os, where it optimizes but chooses not to inline your function. See clang and GCC (targeting Linux) on Godbolt, with asm + program output.)


Using an asm statement to make register asm do something

(This does what you say you want (reading registers), but because of other optimizations, still doesn't produce 1 2 3 4 5 with clang when the caller can see the definition. Only with actual GCC. There might be a clang option to disable some relevant IPA / IPO optimization, but I didn't find one.)

You can use an asm volatile() statement with an empty template string to tell the compiler that the values in those registers are now the values of those C variables. (The register ... asm declarations force it to pick the right register for the right variable)

#include <stdlib.h> 
#include <stdio.h>

__attribute__((noinline,noclone))
void    beautiful_function(int a, int b, int c, int d, int e) {
    register long   rdi asm("rdi");
    register long   rsi asm("rsi");
    register long   rdx asm("rdx");
    register long   rcx asm("rcx");
    register long   r8 asm("r8");

    // "activate" the register-asm locals:
    // associate register values with C vars here, at this point
    asm volatile("nop  # asm statement here"        // can be empty, nop is just because Godbolt filters asm comments
        : "=r"(rdi), "=r"(rsi), "=r"(rdx), "=r"(rcx), "=r"(r8) );

    const long      save_rdi = rdi;
    const long      save_rsi = rsi;
    const long      save_rdx = rdx;
    const long      save_rcx = rcx;
    const long      save_r8 = r8;
    printf("%ld\n%ld\n%ld\n%ld\n%ld\n", save_rdi, save_rsi, save_rdx, save_rcx, save_r8);
}

int main(void) {
    beautiful_function(1, 2, 3, 4, 5);
}

This makes asm in your beautiful_function that does capture the incoming values of your registers. (It doesn't inline, and the compiler happens not to have used any instructions before the asm statement that steps on any of those registers. The latter is not guaranteed in general.)

On Godbolt with clang -O3 and gcc -O3

gcc -O3 does actually work, printing what you expect. clang still prints garbage, because the caller sees that the args are unused, and decides not to set those registers. (If you'd hidden the definition from the caller, e.g. in another file without LTO, that wouldn't happen.)

(With GCC, the noinline,noclone attributes are enough to disable this inter-procedural optimization, but not with clang. Not even compiling with -fPIC makes that possible. I guess the idea is that symbol-interposition to provide an alternate definition of beautiful_function that does use its args would violate the one definition rule in C. So if clang can see a definition for a function, it assumes that's how the function works, even if it isn't allowed to actually inline it.)

With clang:

main:
        pushq   %rax          # align the stack
     # arg-passing optimized away
        callq   beautiful_function@PLT
    # indirect through the PLT because I compiled for Linux with -fPIC, 
    # and the function isn't "static"
        xorl    %eax, %eax
        popq    %rcx
        retq

But the actual definition for beautiful_function does exactly what you want:

# clang -O3
beautiful_function:
        pushq   %r14
        pushq   %rbx
        nop     # asm statement here
        movq    %rdi, %r9             # copying all 5 register outputs to different regs
        movq    %rsi, %r10
        movq    %rdx, %r11
        movq    %rcx, %rbx
        movq    %r8, %r14
        leaq    .L.str(%rip), %rdi
        xorl    %eax, %eax
        movq    %r9, %rsi                # then copying them to printf args
        movq    %r10, %rdx
        movq    %r11, %rcx
        movq    %rbx, %r8
        movq    %r14, %r9
        popq    %rbx
        popq    %r14
        jmp     printf@PLT              # TAILCALL

GCC wastes fewer instructions. For example, it starts with movq %r8, %r9 to move your r8 C var into place as the 6th arg to printf, then uses movq %rcx, %r8 to set up the 5th arg, overwriting one of the output registers before it has read all of them. That is something clang was over-cautious about. However, GCC does still push/pop %r12 around the asm statement; I don't understand why. It ends by tailcalling printf, so it wasn't for alignment.


Peter Cordes