
I have a question on a homework assignment for an Assembly class. I'm not looking for an answer by any means, just some guidance with how it works. Based on my understanding, I can't determine what exactly is happening.

Consider a function P, which generates local values a-c by simple local computation 
and d-f by calling Q(), R(), and S(). 

long P(long x, long y, long z) {
    long a = ...;
    long b = ...;
    long c = ...;
    long d = ...;
    long e = ...;
    long f = ...;
    return d + e + f;
}

0000000000000022 <P>:
22: 55                      push   %rbp
23: 53                      push   %rbx
24: 48 83 ec 20             sub    $0x20,%rsp
28: 48 83 c7 01             add    $0x1,%rdi
2c: 48 89 7c 24 18          mov    %rdi,0x18(%rsp)
31: 48 83 c6 02             add    $0x2,%rsi
35: 48 89 74 24 10          mov    %rsi,0x10(%rsp)
3a: 48 83 c2 03             add    $0x3,%rdx
3e: 48 89 54 24 08          mov    %rdx,0x8(%rsp)
43: 48 8d 74 24 10          lea    0x10(%rsp),%rsi
48: 48 8d 7c 24 18          lea    0x18(%rsp),%rdi
4d: b8 00 00 00 00          mov    $0x0,%eax
52: e8 00 00 00 00          callq  57 
57: 48 89 c3                mov    %rax,%rbx
5a: 48 8d 74 24 08          lea    0x8(%rsp),%rsi
5f: 48 8d 7c 24 10          lea    0x10(%rsp),%rdi
64: b8 00 00 00 00          mov    $0x0,%eax
69: e8 00 00 00 00          callq  6e 
6e: 48 89 c5                mov    %rax,%rbp
71: 48 8d 74 24 18          lea    0x18(%rsp),%rsi
76: 48 8d 7c 24 08          lea    0x8(%rsp),%rdi
7b: b8 00 00 00 00          mov    $0x0,%eax
80: e8 00 00 00 00          callq  85 
85: 48 01 eb                add    %rbp,%rbx
88: 48 01 d8                add    %rbx,%rax
8b: 48 83 c4 20             add    $0x20,%rsp
8f: 5b                      pop    %rbx
90: 5d                      pop    %rbp
91: c3                      retq 
  • Find the size of the stack in bytes.

  • Identify the assembly statement(s) that allocate and free the local stack.

  • Identify which local values get stored in callee-saved registers.

  • Identify which local values get stored on the stack.

  • Explain why the program could not store all of the local values in callee-saved registers.

I understand the concept of pushing rbx and rbp on the stack to make room for other local variables later on. I understand how space on the stack is allocated on line 24. Then the arguments passed into P are altered and stored on the stack. My issue starts at line 43.

Lines 43 and 48 create pointers to positions on the stack, correct? Then line 4d sets eax (or rax) to 0. Then on line 57 we set rbx to rax (0), and the next 3 lines I'm completely confused about. We create more pointers into the stack and store the addresses in rsi and rdi. Wouldn't this overwrite what we did in lines 43 and 48? And then it sets eax (rax) to 0 again on line 64, but eax was already 0 and nothing changed with it.

This repeats with the next call on line 69 as mentioned above. By the time you get to line 85 and 88 to me it seems like it would just be 0 + 0 + 0.

On a side note, shouldn't each 'callq' end with a 'ret'? For example, shouldn't there be a 'ret' after line 64 and 7b?

I've reached the point where it feels like something is missing from the code, but I wanted to check first because it seems more likely that I'm not understanding some core fundamental principle.

Thank you in advance for any friendly nudge in the right direction to figure this out!

Peter Cordes
Sean Weiss
  • Have you tried using a debugger on the code when assembled into an executable? – mustachioed Mar 08 '18 at 22:16
  • If you're not looking for an answer you're on the wrong site and this is off-topic. Stack Overflow is specifically for technical problems with technical answers. – tadman Mar 08 '18 at 22:24
  • I simply meant that I'm not here to get a quick solution to the problem and turn in the assignment. I don't understand technically what is happening with the function calls written in assembly. I'm looking for an answer to exactly what's happening there to verify if I understand the assembly code or if there might be something missing in the assignment itself. – Sean Weiss Mar 08 '18 at 22:55
  • @MustacheMoses I have not. Is there a way to type in the assembly code to do that without actually having the correct 'C' code? I can try looking into that if there is. – Sean Weiss Mar 08 '18 at 22:57
  • @SeanWeiss yes, it's called an assembler. You may need to do some work to get it to compile as is though with entry points and everything. – mustachioed Mar 08 '18 at 23:03
  • That asm code doesn't look correct to me. As far as I can tell it pushes two values at the beginning, then subtracts 0x20 from `rsp` (that's -0x30 in total relative to the original return address), then it pushes three return addresses (-0x18 = -0x48), then it adds 0x20 (-0x28), pops two values (-0x18) and then it will use that as the return address for `retq`, which is the initial `rdi+1` I think (I didn't debug it, just ran it quickly in my head, and not even every instruction, so maybe some stack manipulation eludes me). Looks like a disassembly of an object file, without the relocation info applied, so the calls are wrongly `+0`. – Ped7g Mar 09 '18 at 00:15
  • I.e. the `call 85` should target some other address probably. Anyway, the pairing of `call` with `ret` is not mandatory, those are separate individual instructions. They are usually used in that "pair" way, but it's not mandatory, if you will prepare some return address in stack by other means than `call`, the `ret` will still use it (principle used in late "retpoline" fixes of spectre and meltdown vulnerabilities, instead of jumping to kernel code, the address is put into stack, and `ret` is used for jump, clearing some internal caches as it is not well paired with previous `call`), etc. – Ped7g Mar 09 '18 at 00:21
  • And the `call` is sometimes used in exploits to load the address of data in position-independent code (as the `call` will push the absolute address of the next byte after the `call` onto the stack, so `call execShell` `db "/bin/sh", 0` `execShell: pop esi` will load the absolute address of the string into `esi`, while at compilation time this code requires only relative offsets (the `call` target is encoded in a relative way), and there will be no `ret`, as that "return address" would make the CPU execute the string as code). The instruction does exactly what it does (see docs); each is a separate, independent CPU-state-changer. – Ped7g Mar 09 '18 at 00:26
  • Thank you to everyone that responded, it really helped piece together what I wasn't understanding and helped me obtain a stronger grasp of how it all works! – Sean Weiss Mar 09 '18 at 16:46

2 Answers


The code you're looking at is unlinked code. The three call instructions do not simply jump to the next instruction.* Each one will be filled in by the linker with the offset to an actual function, so you cannot ignore their behavior as you have been.

The behavior of a function call is dependent on the ABI, as Anders mentioned. In particular, RSI and RDI should be assumed to be overwritten, and RAX contains the return value of the function.
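
To make this concrete, here is one plausible C reading of the disassembly, offered as a sketch and not as the assignment's actual source. The names Q, R, and S come from the question; their parameter types and their bodies are assumptions, since that code is not shown (the stub bodies below are purely hypothetical). The `mov $0x0,%eax` before each call is consistent with calling a function that was declared without a prototype, but that is also a guess.

```c
/* Hypothetical stand-ins for the external functions; the real Q, R and S
 * are not shown in the question. Each takes two pointers because %rdi and
 * %rsi are loaded with stack addresses (lea) before each call. */
long Q(long *p, long *q) { return *p + *q; }
long R(long *p, long *q) { return *p + *q; }
long S(long *p, long *q) { return *p + *q; }

long P(long x, long y, long z) {
    long a = x + 1;      /* add $0x1,%rdi ; stored at 0x18(%rsp) */
    long b = y + 2;      /* add $0x2,%rsi ; stored at 0x10(%rsp) */
    long c = z + 3;      /* add $0x3,%rdx ; stored at 0x8(%rsp)  */
    long d = Q(&a, &b);  /* return value in %rax, copied to %rbx */
    long e = R(&b, &c);  /* return value in %rax, copied to %rbp */
    long f = S(&c, &a);  /* return value is still in %rax        */
    return d + e + f;    /* add %rbp,%rbx ; add %rbx,%rax        */
}
```

Because a, b, and c have their addresses taken, they need memory locations, which matches the stores to the stack. d and e have to survive later calls, which matches the copies into %rbx and %rbp.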

* The target of an x86 call instruction is encoded relative to the address of the next instruction. So an offset of 0 in a call instruction causes the disassembler to display the next instruction as the target. This is typical for unlinked code.
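
As a small illustration of that footnote, the arithmetic a disassembler performs for a 5-byte `e8 rel32` call can be sketched as below. The function name `call_target` is made up for this example.

```c
#include <stdint.h>

/* Target of a near relative call (opcode e8 followed by a 32-bit
 * displacement). The instruction is 5 bytes long, and the displacement
 * is added to the address of the instruction that follows it. */
uint64_t call_target(uint64_t insn_addr, int32_t rel32) {
    uint64_t next_insn = insn_addr + 5;          /* 1 opcode byte + 4 displacement bytes */
    return next_insn + (uint64_t)(int64_t)rel32; /* sign-extend, then add */
}
```

With the placeholder displacement of 0 from the listing, `call_target(0x52, 0)` gives 0x57, which is why the disassembler prints `callq 57`. Once the linker fills in the real displacement, the same formula yields the address of the actual function.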

prl
  • I think this is the correct answer right here. I'm thinking that the function calls go to code that's not provided and return to the next line, which explains why we immediately take the return value from rax and store it somewhere else. The confusion was thinking the call jumped to the address next to it (call 57 jumps straight to line 57), but I guess that's not necessarily true if it's _unlinked code_. Thank you for providing this answer, I've been scratching my head for hours trying to make sense of this. – Sean Weiss Mar 09 '18 at 16:44

You should be studying the ABI that describes the interface between the higher-level language and assembly. The link below is just an example; you need to find the ABI for the architecture and compiler you are using. https://gcc.gnu.org/onlinedocs/libstdc++/manual/abi.html

Anders Cedronius
  • See [Where is the x86-64 System V ABI documented?](https://stackoverflow.com/questions/18133812/where-is-the-x86-64-system-v-abi-documented) – Peter Cordes Mar 08 '18 at 23:22