Porting from 32 to 64-bit by just changing all the register names from eXX to rXX makes factorial return 0?

Question

How fortunate it is for all of us learning the art of computer programming to have access to a community such as Stack Overflow! I have made the decision to take up the task of learning how to program computers and I am doing so by the knowledge of an e-book called 'Programming From the Ground Up', which teaches the reader how to create programs in the assembly language within the GNU/Linux environment.

My progress in the book has come to the point of creating a program which computes the factorial of the integer 4 with a function, which I have made and done without any error caused by the assembler of GCC or caused by running the program. However, the function in my program does not return the right answer! The factorial of 4 is 24, but the program returns a value of 0! Rightly speaking, I do not know why this is!

Here is the code for your consideration:

.section .data

.section .text

.globl _start

.globl factorial

_start:

push $4                    #this is the function argument
call factorial             #the function is called
add $4, %rsp               #the stack is restored to its original 
                           #state before the function was called
mov %rax, %rbx             #this instruction will move the result 
                           #computed by the function into the rbx 
                           #register and will serve as the return 
                           #value 
mov $1, %rax               #1 must be placed inside this register for 
                           #the exit system call
int $0x80                  #exit interrupt

.type factorial, @function #defines the code below as being a function

factorial:                 #function label
push %rbp                  #saves the base-pointer
mov %rsp, %rbp             #moves the stack-pointer into the base-
                           #pointer register so that data in the stack 
                           #can be referenced as indexes of the base-
                           #pointer
mov $1, %rax               #the rax register will contain the product 
                           #of the factorial
mov 8(%rbp), %rcx          #moves the function argument into %rcx
start_loop:                #the process loop begins
cmp $1, %rcx               #this is the exit condition for the loop
je loop_exit               #if the value in %rcx reaches 1, exit loop
imul %rcx, %rax            #multiply the current integer of the 
                           #factorial by the value stored in %rax
dec %rcx                   #reduce the factorial integer by 1
jmp start_loop             #unconditional jump to the start of loop
loop_exit:                 #the loop exit begins
mov %rbp, %rsp             #restore the stack-pointer
pop %rbp                   #remove the saved base-pointer from stack
ret                        #return

How fortunate it is for all of you learning the art of computer programming to have access to a **DEBUGGER**. So go use it and find where your code goes wrong. Hint: you already failed at accessing the argument. PS: x86-64 conventions do not use the stack for passing arguments, but you still can do that, if you do it correctly. PPS: don't use `int 0x80` in 64 bit code. It happens to work here though. — Jester, Oct 16 '17 at 20:44
Can you add some comments in the code? What are the arguments the program expect? -- The factorial example in [the book](http://mirror.cedia.org.ec/nongnu/pgubook/ProgrammingGroundUp-1-0-booksize.pdf) uses part of registers instead of all the register (e.g. they uses %esp instead of %rsp). -- You may check, as another example, an iterative and a recursive [factorial implementation](https://github.com/bbyars/programming-from-the-ground-up/tree/master/ch4-functions) at Github. — Jaime, Oct 16 '17 at 20:58
@Jaime the example in book is for x86 32b assembly, looks like OP tried to convert it to x86-64 in naive way just by replacing name of registers with the 64b variants. So obviously that code has no chance to work correctly, as it's being run in wrong mode. OP: follow your book properly. And search SO for how to compile+debug+run 32b binary in 64b linux. — Ped7g, Oct 17 '17 at 00:00

Peter Cordes · Accepted Answer · 2017-10-17T09:34:29.450

TL;DR: the factorial of the return address overflowed %rax, leaving 0, because you ported it wrong.

Porting 32-bit code to 64-bit is not as simple as changing all the register names. That might get it to assemble, but as you found, even this simple program behaves differently. In x86-64, push %reg and call both push 64-bit values and modify rsp by 8. You would see this if you single-stepped your code with a debugger. (See the bottom of the x86 tag wiki for info on using gdb for asm.)

You're following a book that uses 32-bit examples, so you should probably just build them as 32-bit executables instead of trying to port them to 64-bit before you know how.

Your sys_exit() using the 32-bit int 0x80 ABI still works (see "What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?"), but you will run into trouble with system calls if you try to pass 64-bit pointers. Use the 64-bit ABI.

You will also run into problems if you want to call any library functions, because the standard function-calling convention is different, too. See "Why parameters stored in registers and not on the stack in x86-64 Assembly?", the 64-bit ABI link, and the other calling-convention docs in the x86 tag wiki.

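But you're not doing any of that, so the problem with your program simply comes down to not accounting for the doubled "stack width" in x86-64. After push $4, call, and push %rbp, each of which moves rsp by 8, the saved rbp is at 0(%rbp), the return address is at 8(%rbp), and your argument is at 16(%rbp). Your factorial function reads the return address as its argument.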

Here's your code, commented to explain what it actually does:

push $4                    # rsp-=8.  (rsp) = qword 4
                           # non-standard calling convention with args on the stack.
call factorial             # rsp-=8.  (rsp) = return address.  RIP=factorial
add $4, %rsp               # misalign the stack, so it's pointing to the top half of the 4 you pushed earlier.
# if this was in a function that wanted to return, you'd be screwed.

mov %rax, %rbx             # copy return value to first arg of system call
mov $1, %rax               # eax = __NR_exit from asm/unistd_32.h, wasting 2 bytes vs. mov $1, %eax
int $0x80                  # 32-bit ABI system call, eax=call number, ebx=first arg.  sys_exit(factorial(4))

So the caller is sort of fine (for the non-standard 64-bit calling convention you've invented that passes all args on the stack). You might as well omit the add to %rsp entirely, since you're about to exit without touching the stack any further.

.type factorial, @function #defines the code below as being a function

factorial:                 #function label
push %rbp                  #rsp-=8, (rsp) = rbp
mov %rsp, %rbp             # make a traditional stack frame

mov $1, %rax               #retval = 1.  (Wasting 2 bytes vs. the exactly equivalent mov $1, %eax)

mov 8(%rbp), %rcx          #load the return address into %rcx

... and calculate the factorial
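
To make the offsets in the comments above concrete, here is a toy Python model of the stack on entry to factorial. It is only a sketch, not real machine state: the starting rsp, the return address 0x4000c7, and the caller's rbp are made-up illustrative values, and the helper name frame_on_entry is hypothetical. It replays the three 8-byte pushes and shows which slot each %rbp offset refers to.

```python
SLOT = 8  # in 64-bit mode, every push and call stores one 8-byte slot


def frame_on_entry(rsp, arg, return_address, caller_rbp):
    """Replay push $arg / call / push %rbp / mov %rsp,%rbp on a dict-backed stack."""
    memory = {}
    for value in (arg, return_address, caller_rbp):
        rsp -= SLOT          # the stack grows downward by 8 each time
        memory[rsp] = value
    rbp = rsp                # mov %rsp, %rbp
    return rbp, memory


# Hypothetical values, chosen only for illustration.
rbp, memory = frame_on_entry(
    rsp=0x7FFF_FFFF_E000, arg=4, return_address=0x4000C7, caller_rbp=0
)

print(hex(memory[rbp + 8]))  # what `mov 8(%rbp), %rcx` actually loads
print(memory[rbp + 16])      # where the pushed argument really lives
```

In this model, the slot at rbp + 8 holds the return address and the 4 sits at rbp + 16, which is the offset the load would need under this stack-args convention.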

For static executables (and dynamically linked executables that aren't built as PIE), _start is normally at 0x4000c0, and the return address is a few bytes past that. Your program will still run nearly instantaneously on a modern CPU, because 0x4000c0 iterations * 3-cycle imul latency is still only about 12.6 million core clock cycles. On a 4GHz CPU, that's about 3 milliseconds of CPU time.

If you'd made a position-independent executable by linking with gcc foo.o on a recent distro, _start would have an address like 0x5555555545a0, and your function would have taken ~70368 seconds (about 19.5 hours) to run on a 4GHz CPU with 3-cycle imul latency.
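
The two run-time estimates above are simple arithmetic, sketched here in Python. The model assumes the loop is limited only by the dependent chain of imul instructions, using the 3-cycle latency and 4GHz clock quoted above. The function name loop_seconds is just for illustration.

```python
def loop_seconds(iterations, imul_latency_cycles=3, clock_hz=4e9):
    """Estimate run time when each iteration waits on one dependent imul."""
    return iterations * imul_latency_cycles / clock_hz


static_case = loop_seconds(0x4000C0)      # non-PIE address used as the loop count
pie_case = loop_seconds(0x5555555545A0)   # PIE-style address used as the loop count

print(f"{static_case * 1e3:.2f} ms")
print(f"{pie_case:.0f} s, about {pie_case / 3600:.1f} hours")
```

Real timings would differ somewhat (loop overhead, frequency scaling), so treat these as order-of-magnitude figures.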

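4194496! is a product that includes many even factors, and each one contributes at least one trailing zero bit to the binary representation. Once the product has 64 or more factors of 2, its low 64 bits are all zero, so the whole %rax will be zero long before you're done multiplying by every number from 0x4000c0 down to 1.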
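
As a rough check of that reasoning, the following Python sketch mimics the loop with 64-bit truncation and counts the factors of 2 using Legendre's formula. The function names are hypothetical, and the early exit once the product reaches 0 is a shortcut the real loop does not have (it keeps multiplying 0).

```python
MASK64 = (1 << 64) - 1


def factorial_loop_rax(n):
    """Mimic the loop: rax *= rcx for rcx = n down to 2, truncated to 64 bits."""
    rax = 1
    for rcx in range(n, 1, -1):
        rax = (rax * rcx) & MASK64
        if rax == 0:  # shortcut: once rax is 0 it stays 0
            break
    return rax


def twos_in_factorial(n):
    """Number of factors of 2 in n! (Legendre's formula: n minus popcount of n)."""
    return n - bin(n).count("1")


first_zero = next(n for n in range(1, 200) if factorial_loop_rax(n) == 0)
print(first_zero)                    # smallest n whose 64-bit factorial is 0
print(twos_in_factorial(0x4000C0))   # trailing zero bits in 4194496!
print(factorial_loop_rax(0x4000C0))  # what ends up in %rax
```

So a 64-bit register already holds 0 for any argument from 66 upward, and 4194496! has millions of trailing zero bits to spare.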

The exit status of a Linux process is only the low 8 bits of the integer you pass to sys_exit() (because the wstatus is only a 32-bit int and includes other stuff, like what signal ended the process. See wait4(2)). So even with small args, it doesn't take much for the visible exit status to wrap around or come out as 0.
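
A small Python sketch of that truncation, for a few small arguments. This is plain arithmetic on the low 8 bits and does not spawn a process; exit_status is a hypothetical helper name.

```python
from math import factorial


def exit_status(value):
    """Low 8 bits of the value passed to sys_exit, as the parent process sees it."""
    return value & 0xFF


for n in (4, 5, 6, 10):
    print(n, factorial(n), exit_status(factorial(n)))
```

4! and 5! still fit in 8 bits, 6! = 720 would be reported as 208, and 10! is a multiple of 256, so a correct factorial(10) would also show an exit status of 0.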
