Are a 64-bit register's upper 4 bytes set to zero if we only use a mov operation on its lower 4 bytes?

Question

I am learning x86-64 assembly from Computer Systems: A Programmer's Perspective and I came across an exercise which asks to translate a line of C code into (two) equivalent assembly instructions. The code is about copying a variable of one type into another using pointers.

The pointer variables are declared as follows:

src_t *sp; //src_t and dest_t are typedefs
dest_t *dp;

and the C code to be translated is:

*dp = (dest_t)*sp;

It is given that the pointers sp and dp are stored in registers %rdi and %rsi respectively, and that we should use the 'appropriate portion' of %rax (e.g. %eax, %ax or %al) for the intermediate data copy (as x86-64 doesn't allow both source and destination to be memory references).

Now when src_t is unsigned char and dest_t is long, I wrote the following assembly for it:

movzbq (%rdi), %rax //move a byte into %rax with zero extension
movq %rax, (%rsi) //move 8 bytes of 'long' data

But the book as well as Godbolt (using gcc with -O3) say it should be

movzbl  (%rdi), %eax
movq    %rax, (%rsi)

In this case, the byte is only(?) zero-extended to 4 bytes (%eax is 4 bytes long), but I read that if we do something like

movl %edx, %rax

then the upper 4 bytes of %rax will also be set to 0.

I have two questions:

1. Is movl %edx, %rax equivalent to movl %edx, %eax, that is, are the upper 4 bytes also set to 0 in the latter case?
2. Is movzbq (%rdi), %rax equivalent to movzbl (%rdi), %eax, i.e. does movzbl also set the higher 4 bytes to zero (like movl), even though we don't mention the full register (%rax) but only a part of it (%eax)?

(1) The former instruction doesn't exist. (2) Correct. All operations targeting dword-sized registers clear the upper 32 bits of the destination. — fuz, Apr 26 '21 at 17:24
@fuz for (1) this means that both source and destination registers have to match the instruction suffix..thanks for clarification — tf3, Apr 26 '21 at 17:28
@tf3: Yup, only a very few instructions can use different-sized operands. e.g. `shl %cl, %rax`, and movzx / movsx (AT&T movsXY / movzXY where XY are 2 different size codes). See also [MOVZX missing 32 bit register to 64 bit register](https://stackoverflow.com/q/51387571) and [Why do x86-64 instructions on 32-bit registers zero the upper part of the full 64-bit register?](https://stackoverflow.com/q/11177137), and yes, `movzbq` is encodeable but a waste of a REX prefix to explicitly widen instead of implicitly. — Peter Cordes, Apr 27 '21 at 02:17

Chris Dodd · Accepted Answer · 2021-04-26T18:56:15.303

In general, on x86_64, any instruction with a 32-bit general purpose register as its destination (any %eXX or %rNd register) will also set the upper 32 bits of the corresponding 64-bit register to 0. So every instruction with a 32-bit destination 0-extends that to 64 bits.
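As a concrete instance of this rule, here is a minimal C sketch of the exercise from the question, assuming src_t is unsigned char and dest_t is long. The function name copy_widen is hypothetical and only there for illustration. The question reports that gcc -O3 compiles this statement to movzbl (%rdi), %eax followed by movq %rax, (%rsi); the 32-bit load is enough because writing %eax already clears the upper half of %rax.

```c
/* Typedefs chosen to match the case asked about in the question. */
typedef unsigned char src_t;
typedef long dest_t;

/* Hypothetical wrapper around the statement from the exercise.
   Under the System V x86-64 calling convention, sp arrives in %rdi
   and dp in %rsi, as the exercise assumes. */
void copy_widen(src_t *sp, dest_t *dp)
{
    /* The unsigned byte is zero-extended to the width of long. */
    *dp = (dest_t)*sp;
}
```

Calling it with a source byte of 0xFF and a destination that previously held -1 leaves 255 in the destination, so none of the old upper bits survive.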

From the Intel 64 and IA-32 Architectures Software Developer's Manual (Volume 1, section 3.4.1.1):

When in 64-bit mode, operand size determines the number of valid bits in the destination general-purpose register:

64-bit operands generate a 64-bit result in the destination general-purpose register.

32-bit operands generate a 32-bit result, zero-extended to a 64-bit result in the destination general-purpose register.

8-bit and 16-bit operands generate an 8-bit or 16-bit result. The upper 56 bits or 48 bits (respectively) of the destination general-purpose register are not modified by the operation. If the result of an 8-bit or 16-bit operation is intended for 64-bit address calculation, explicitly sign-extend the register to the full 64-bits.
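The three cases quoted above can be modelled in plain C. This is only a sketch of the rule, not how the hardware is implemented: the helper names write64, write32, write16 and write8 are hypothetical, each takes the old register contents and the value being written, and returns the new 64-bit register contents. The 8-bit helper models a write to the low byte only (such as %al), not to a high-byte register like %ah.

```c
#include <stdint.h>

/* 64-bit operand: the whole register is replaced. */
uint64_t write64(uint64_t reg, uint64_t value)
{
    (void)reg;                 /* old contents are irrelevant */
    return value;
}

/* 32-bit operand: the result is zero-extended to 64 bits,
   so the old upper 32 bits are discarded. */
uint64_t write32(uint64_t reg, uint32_t value)
{
    (void)reg;                 /* old contents are irrelevant */
    return (uint64_t)value;
}

/* 16-bit operand: only the low 16 bits change,
   the upper 48 bits are preserved. */
uint64_t write16(uint64_t reg, uint16_t value)
{
    return (reg & ~UINT64_C(0xFFFF)) | value;
}

/* 8-bit operand (low byte): only the low 8 bits change,
   the upper 56 bits are preserved. */
uint64_t write8(uint64_t reg, uint8_t value)
{
    return (reg & ~UINT64_C(0xFF)) | value;
}
```

Starting from a register full of one bits, write32 with 0x12345678 yields 0x0000000012345678, while write16 with 0x1234 yields 0xFFFFFFFFFFFF1234. That difference is why movzbl into %eax is sufficient for the copy in the question, whereas a plain movb into %al would not be.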

Thanks for the crisp manual entry. — tf3, Apr 26 '21 at 18:53
