
So I'm reading through "Modern Compiler Implementation in C", and one of the problems presents the following C function:

int f(int a) {int b; b = a+1; g(); h(b); return b+2;}

And it poses the question: "If local variable b is live across more than one procedure call is it kept in a callee-save register? Explain how doing this would speed up the program"

After writing the functions f, g, and h and compiling them to assembly with gcc -c -S simple-functions.c, I get the output:

f:
.LFB3:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    subq    $24, %rsp
    movl    %edi, -20(%rbp)  # Receive input 'a' in the edi register and save it to the stack
    movl    -20(%rbp), %eax  # Load 'a' into eax and add 1 to compute 'b = a+1'
    addl    $1, %eax         
    movl    %eax, -4(%rbp)   # Save the value of 'b' to the stack
    movl    $0, %eax
    call    g
    movl    -4(%rbp), %eax    # Reclaim the saved value of b
    movl    %eax, %edi        # Pass b to the function 'h' through the edi register
    call    h
    movl    -4(%rbp), %eax # Reclaim b again for return value
    addl    $2, %eax
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc

(Where I have included annotations based on my interpretation of what the code is doing)
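
For anyone who wants to reproduce the listing, a minimal sketch of what simple-functions.c might contain is below. The bodies of g and h are not shown in the question, so the ones here are assumptions, as are the two counter variables; `__attribute__((noinline))` is a GCC/Clang extension, used only so that the calls survive if optimization is turned on.

```c
/* simple-functions.c: hypothetical reconstruction of the test file.
   Only f comes from the book; g, h and the two globals are assumptions. */

static int calls_to_g = 0;   /* hypothetical: counts calls to g */
static int last_h_arg = 0;   /* hypothetical: records the argument passed to h */

/* noinline keeps g and h as real calls even at -O2 (GCC/Clang extension). */
__attribute__((noinline)) void g(void) { calls_to_g++; }
__attribute__((noinline)) void h(int x) { last_h_arg = x; }

/* The function from the exercise: b is live across both calls. */
int f(int a) { int b; b = a + 1; g(); h(b); return b + 2; }
```

Compiling this with gcc -S and no optimization flag should give a stack-based listing of the same general shape as the one above, although exact offsets and labels may differ between compiler versions.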

Looking at the above code, it seems to me the register being used to store b, 'eax', is a "caller-save" register, as the caller 'f' is saving its value before calling g.

But now I'm confused about the question "Explain how doing this would speed up the program". It doesn't seem to me that putting b in a callee-save register would speed things up. In fact, it seems like it would be slower. In the above code, 'b' is saved to the stack once but reclaimed twice. If 'b' were in a callee-save register, both 'g' and 'h' would need to go through the process of saving and then restoring the register's value, so the save/restore would happen twice and the program would actually be doing more work.

EDIT: My understanding is that it is currently doing 1 write and 2 reads. But using a callee-save register would cause it to do 2 writes and 2 reads, as both g and h would save and restore the register.

Does what I'm saying make sense? Can someone explain how using a callee-save register would be advantageous in this situation?

Onye
  • 4
    `g` and `h` need to preserve that register anyway so that's a fixed cost. Using a callee-saved register would mean `f` has to save the incoming value for its caller and restore it at the end, so that's 1 write and 1 read. Currently, it is using 1 write and 2 reads. Note that you did not enable optimizations so the compiler is not producing efficient code. – Jester Jan 21 '23 at 02:30
  • 2
    [Turn on optimizations](https://godbolt.org/z/nabW74Kda) to see a callee-saved register being used (`ebx` in that example). Interestingly the compiler chooses to save `a` instead of `b`, probably because `b` got optimized out. The code becomes the equivalent of `int f(int a) { g(); h(a+1); return a+3;}` – Nate Eldredge Jan 21 '23 at 03:06
  • 4
    Note there's a fair chance that `g` and/or `h` *don't* need to use that particular callee-saved register, in which case they will not save and restore it, so you will not pay that cost. – Nate Eldredge Jan 21 '23 at 03:08
  • Just count the reads and writes and instructions in general in the two versions. – Erik Eidt Jan 21 '23 at 05:14
  • 1
    Compilers use a call-preserved register even when there's only one function call; with `-Os` (optimize for size) it's a missed optimization to save/restore a call-preserved reg and mov to/from it in that case instead of just pushing and popping the incoming register arg. See [Why do compilers insist on using a callee-saved register here?](https://stackoverflow.com/q/61375336) (which of course compiles with optimization to make asm that's relevant.) – Peter Cordes Jan 21 '23 at 06:07
  • 2
    What are `g()` and `h()`? Note that if the compiler is any good at optimizing these functions will probably be inlined into `f()` so that all the calling convention restrictions become irrelevant and don't hamper optimization. – Brendan Jan 21 '23 at 08:02
  • 2
    See also [why callees don't use caller saved registers first?](https://stackoverflow.com/q/63164981) - very similar code to yours, looking at code with function calls, but mixed by the clunky and confusing terminology of "callee saved" instead of "call preserved". – Peter Cordes Jan 21 '23 at 11:39
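
The counting suggested in the comments can be sketched as a tiny cost model. This is only an illustration under simplifying assumptions: it counts just the memory accesses that f itself performs to keep b alive, it ignores whatever g and h do internally (which, as noted above, is a fixed cost and may be zero if they never touch the register), and the names `struct cost`, `spill_cost`, and `callee_saved_cost` are made up for this example.

```c
/* Memory accesses performed by f itself to keep one value alive across calls. */
struct cost { int writes; int reads; };

/* Strategy in the unoptimized listing: the value lives in a stack slot.
   It is stored once and reloaded every time it is needed after a call. */
struct cost spill_cost(int uses_after_calls) {
    struct cost c = { 1, uses_after_calls };
    return c;
}

/* Strategy with a call-preserved register: f saves the caller's old
   register value once on entry and restores it once before returning,
   no matter how often the value is used in between. */
struct cost callee_saved_cost(int uses_after_calls) {
    struct cost c = { 1, 1 };
    (void)uses_after_calls;  /* the cost does not depend on the number of uses */
    return c;
}
```

For the function in the question, b is used twice after a call (as the argument to h and in the return expression), so `spill_cost(2)` gives 1 write and 2 reads while `callee_saved_cost(2)` gives 1 write and 1 read, matching the counts in the first comment. Under this model the gap grows with every additional use of the value after a call.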
