Arm Assembly proper way to PUSH POP link register and pc in a subroutine and calling a subroutine within a subroutine

Question

For ARM assembly

I have been doing the following in my subroutines:

SubRoutine:
  PUSH {r1,r2,lr}
  //code that changes r1 and r2
  POP {r1,r2,lr}
  bx lr

Is this the correct way to return from a subroutine and continue with the code in your main function? I have seen people doing the following instead:

SubRoutine:
  PUSH {r1,r2,lr}
  //code that changes r1 and r2
  POP {r1,r2,pc}
  bx lr

But I don't know why you would POP the PC when you PUSHed the LR. Which is the correct way and why?

Also, if you call a subroutine within a subroutine, do you do the following:

SubRoutine:
   PUSH {r1,r2,lr}

  //code that changes r1 and r2
  PUSH {lr}
  bl AnotherRoutine (where bx lr will be used to return from it)
  POP {lr}

  POP {r1,r2,pc}
  bx lr

OR do you do it like this instead:

SubRoutine:
   PUSH {r1,r2,lr}

  //code that changes r1 and r2
  PUSH {lr}
  bl AnotherRoutine(where bx lr will be used to return from it)
  POP {pc}

  POP {r1,r2,pc}
  bx lr

If you had bothered to read the manual, you'd know that `pop` with `pc` in the list is an interworking branch since ARMv5T. The `bx lr` after it is never reached. — EOF, Nov 05 '16 at 07:15
"don't know why you would POP the PC when you PUSHed the LR." sounds like you're missing something fundamental here. You push a value *from the LR* and then pop that value *into the PC* (and thus perform the branch to that address). You do need to read some basics on asm programming; there's plenty of material on the net. — tum_, Nov 05 '16 at 09:45
You don't need to store `lr` before `bl AnotherRoutine`, as you already stored it at the start of SubRoutine. If you don't store it at the start, then you would have to save it with `push/pop lr` around the `bl`. If you don't modify `lr` at all (no `bl`), you don't need to store it at the beginning; just `bx lr` at the end of SubRoutine. I.e. just make sure the return address (initially in `lr`, unless you copy it elsewhere, like with `push`) gets into `pc` whenever you want to return to the caller (this can be from multiple places inside SubRoutine, if you need an early exit). — Ped7g, Nov 05 '16 at 11:01
What is there to clarify? The CPU is a state machine. It reads the next instruction from the address in `pc` and updates `pc` to point to the following instruction. If the executed instruction modifies `pc` in the meantime, the CPU will continue there; it has only the value in `pc` and no other idea where it is (or was!). — Ped7g, Nov 05 '16 at 11:04
related: [What registers to save in the ARM C calling convention?](https://stackoverflow.com/q/261419) — Peter Cordes, Sep 29 '20 at 00:19

score 7 · Answer 1 · edited May 23 '17 at 11:53

There are three cases that you should be aware of.

Leaf: void foo(void) {}
Tail call: int foo(void) { return bar(); }
Intermediate: int foo(void) { int i; i = bar() + 4; return i; }

There are many ways to implement these calls. The samples below are not the only way to implement the prologue and epilogue in ARM assembler.

LEAF functions

Many functions are leaf functions and do not need to save lr. You simply use bx lr to return. For example:

SubRoutine:
  PUSH {r1,r2}
  //code that changes r1 and r2
  POP {r1,r2}
  bx lr

Also, r1 and r2 (along with r0 and r3) are typically used to pass parameters, and a subroutine is free to use or destroy them (see the ARM calling conventions). This will be the case if you call a 'C' function from assembler. So typically no one would save r1 and r2, but as it is assembler you can do whatever you like (even if it is a bad idea). If you follow the standard, the example reduces to just bx lr.
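As a minimal sketch of a leaf routine that sticks to the standard convention: the name add_scaled and its arithmetic are made up for illustration, and it assumes arguments arrive in r0 and r1 with the result returned in r0.

```asm
// Hypothetical leaf routine: returns r0 + 2*r1 in r0.
// It only touches r0 and r1, which a callee is allowed to clobber,
// so nothing is pushed and LR is never modified.
add_scaled:
  add r0, r0, r1, lsl #1   // r0 = r0 + (r1 << 1)
  bx  lr                   // LR still holds the caller's return address
```

Because no bl is executed inside the routine, LR is never overwritten and there is nothing to save or restore.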

Tail Call

If your function is a leaf except for a final call to another function, you can use the following shortcut:

Sub_w_tail:
  // Save callee-saved regs (for whatever calling convention you need)
  // Leave LR as is.
  // ... do stuff
  // Restore anything saved above so SP is back at its entry value.
  B  tail_call

LR still holds the return address set by the caller of Sub_w_tail, so you just jump directly to tail_call, which then returns to the original caller.
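A concrete version of this pattern might look like the sketch below. The names counted_work, call_count, and real_work are hypothetical, and it assumes real_work takes at most two arguments (in r0 and r1), so r2 and r3 are free to use as scratch registers.

```asm
// Hypothetical shim: bump a call counter, then tail-call the real routine.
// call_count is assumed to be a word-sized variable defined elsewhere.
counted_work:
  ldr r2, =call_count   // r2 = address of the counter
  ldr r3, [r2]          // r3 = current count
  add r3, r3, #1        // increment it
  str r3, [r2]          // write it back
  b   real_work         // plain branch: LR and SP are unchanged,
                        // so real_work returns straight to our caller
```

Nothing is pushed here, so there is nothing to restore before the branch, and r0 and r1 reach real_work exactly as the caller passed them.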

Intermediate function

This is the most complex case. Here is a possible sequence:

SubRoutine:
  PUSH {r1,r2,lr}

  //code that changes r1 and r2
  bl AnotherRoutine   // AnotherRoutine returns with bx lr

  // more code
  POP {r1,r2,pc}   // returns to caller of 'SubRoutine'

Some details of an older calling convention are in the ARM Link and frame registers question. You can use this convention. There are many different ways to perform the epilogue and prologue in ARM assembler.
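For comparison, here is a sketch of the intermediate C example (`i = bar() + 4`) written against the current procedure call standard (AAPCS), which is an assumption about the convention in use. That standard expects SP to be 8-byte aligned at calls to public functions, so an even number of registers is pushed; a three-register push such as PUSH {r1,r2,lr} moves SP by 12 bytes. The label bar is a hypothetical callee.

```asm
// Hypothetical intermediate function: returns bar() + 4.
foo:
  push {r4, lr}     // save LR because bl overwrites it;
                    // r4 is pushed only to keep SP 8-byte aligned
  bl   bar          // sets LR to the next instruction; result comes back in r0
  add  r0, r0, #4   // i = bar() + 4
  pop  {r4, pc}     // restore r4 and load the saved LR value into PC to return
```

The saved return address goes directly from the stack into PC, so no separate bx lr is needed after the pop.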

The last is quite complex, or at least tedious to code. It is a lot better to let a compiler determine which registers to use and what to place on the stack. However, you usually only need to know how to code the first (LEAF function) when writing assembler. It is most productive to code only an optimized subroutine in assembler and call it from a higher-level language. It is still useful to know how all of them work in order to understand compiled code. You should also consider inline assembler so you don't have to deal with these nuances.

The tail call method can be quite useful if you are making an assembler *veneer* or *shim* that does some sort of bookkeeping before calling a *real* routine, for instance some `malloc` debug code. However, if you compile `void* malloc(size) { return real_malloc(size); };` you can get the same thing. It might be more useful when shared libraries or different memory spaces are active. — artless noise, Nov 09 '16 at 14:47
