I am trying to learn how to use ptrace to trace all system calls and their arguments, and I am stuck on getting the arguments passed to a system call. After going through many online resources and SO questions, I figured out that on a 64-bit machine the system call number is stored in rax and the arguments are stored in rdi, rsi, rdx, r10, r8, r9, in that order. Check this website .

Just to confirm this I wrote a simple C program as follows

#include<stdio.h>
#include<fcntl.h>
int main() {
  printf("some print data");
  open("/tmp/sprintf.c", O_RDWR);
}

and generated assembly code for it using `gcc -S t.c`, but the generated assembly is as below:

    .file   "t.c"
    .section    .rodata
.LC0:
    .string "some print data"
.LC1:
    .string "/tmp/sprintf.c"
    .text
    .globl  main
    .type   main, @function
main:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    movl    $.LC0, %edi
    movl    $0, %eax
    call    printf
    movl    $2, %esi
    movl    $.LC1, %edi
    movl    $0, %eax
    call    open
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
    .size   main, .-main
    .ident  "GCC: (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4"
    .section    .note.GNU-stack,"",@progbits

As you can see, this code stores the parameters in esi and edi instead. Why is this happening?

Also, please guide me on the best way to access these arguments from the registers or memory locations in C code. How can I figure out whether the contents of a register are the argument itself or the address of a memory location where the actual argument is stored?

Thanks!

harrythomas
  • On x86 64-bit processors, if you change the contents of a 32-bit register, the **zero-extended** result is placed into the entire 64-bit register. Using the 32-bit registers can shorten the encoding of the instructions. _EDI_ is the lower 32 bits of the register _RDI_ and _ESI_ is the lower 32 bits of _RSI_. – Michael Petch Aug 19 '16 at 13:53
  • @MichaelPetch: I am trying to access these registers from ptrace call. So from your comment even if I try to access rdi it should give me correct data, am I correct? – harrythomas Aug 19 '16 at 14:01
  • If you place a value in a 32-bit register, the 64-bit register holds the same value, zero-extended to 64 bits. So looking at the 64-bit register is correct. Although _EDI_ was written to, the contents are viewable through the 64-bit register _RDI_. – Michael Petch Aug 19 '16 at 14:05
  • Thanks! In the case of a string, how are the values stored? Are they stored directly in a register, or does the register point to a memory location where the actual value is? – harrythomas Aug 19 '16 at 14:10
  • Assuming you are using AT&T syntax, which you are in this case: if you see a label with a `$` sign in front of it, like `$.LC0`, that tells the assembler to use the value of the label (which is its address), not what is stored at that address. You generally pass the pointer to a string. `movl $.LC0, %edi` moves the **address** of the string "some print data" into _EDI_. – Michael Petch Aug 19 '16 at 14:10
  • So if the first argument to the open call is stored in rdi, then the contents of rdi are actually the memory location where the string '/tmp/sprintf.c' is stored? – harrythomas Aug 19 '16 at 14:13
  • Also, one more thing: I can't see any int 80 call in the assembly code. Why is that? – harrythomas Aug 19 '16 at 14:46
  • Your code is using the _C_ library (glibc on Linux). `printf` is the _C_ function that displays your text. Under the hood, the `printf` function makes the low-level system call on your behalf. The same goes for `open`. When you use `-S` to get the assembly output, it will not show you the assembly code for the _C_ library functions. On a side note: on 64-bit Linux it is preferred that you use `syscall` rather than `int 0x80`. – Michael Petch Aug 19 '16 at 14:51
  • [Ryan Chapman's blog](http://blog.rchapman.org/post/36801038863/linux-system-call-table-for-x86-64) is a good source of the Linux 64-bit system calls and their parameters. – Michael Petch Aug 19 '16 at 14:59
  • `int 80` will segfault, IIRC. `int 0x80` calls the 32-bit ABI. `syscall` calls the 64-bit ABI. See [the x86 tag wiki](http://stackoverflow.com/tags/x86/info); there seems to be a lot of stuff you don't know, but that's explained by links in the tag wiki. – Peter Cordes Aug 19 '16 at 16:48

1 Answer

> this code is storing parameters on esi and edi

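Instructions with a 32-bit operand size have shorter encodings (they need no REX prefix), so compilers prefer them when possible. A write to a 32-bit register zero-extends into the full 64-bit register, so rdi and rsi still hold the correct values. See also "Why do most x64 instructions zero the upper part of a 32 bit register".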


> How can I figure out if the contents of register is the argument itself or is it a memory location where actual argument is stored?

The AMD64 SystemV calling convention never implicitly replaces a function arg with a hidden pointer. Integer / pointer args in the C prototype always go in the arg-passing registers directly.

Structs / unions passed by value go in one or more registers, or on the stack.

The full details are documented in the ABI. See more links in the tag wiki. http://www.x86-64.org/documentation.html is down right now, so I linked the current revision on github.

CL.
Peter Cordes
  • @CL.: thanks for the doc-links cleanups in my answers, especially taking care to replace the links with a similar Wikipedia link when appropriate :) – Peter Cordes Sep 21 '17 at 09:30