
I am learning x64 assembly and, more precisely, I am currently learning to use data structures. My book tells me that I must use a subregister (aka partial register) to load data from the structure into rdi, but it doesn't explain why.

Could someone please enlighten me?


I tried this piece of code without success (which was expected): MOV rdi, [my_struc+my_field]

This code worked as expected:

XOR rbx, rbx
MOV bx, [my_struc+my_field]
MOV rdi, rbx

After reading the comments (thank you so much, @Jester), I understand better now.

Here is my data structure:

section .bss
    STRUC sockaddr_in
        sin_family:     RESW 1 ; 1*2 bytes
        sin_port:       RESW 1 ; 1*2 bytes
        sin_address:    RESD 1 ; 1*4 bytes
        sin_zero:       RESD 2 ; 2*4 bytes
    ENDSTRUC


section .data
    sa: ISTRUC sockaddr_in
        AT sin_family,  DW  0x02        ; AF_INET, see "man socket"
        AT sin_port,    DW  0x5C11      ; 4444 (0x115C) in network byte order: bytes 0x11, 0x5C in memory
        AT sin_address, DD  0x0100007f  ; 127.0.0.1 in network byte order: bytes 7f 00 00 01 in memory
        AT sin_zero,    DD  0x0
    IEND

I thought that MOV would zero-extend automatically, but it seems that it does not.
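To make the difference concrete, here is a minimal sketch in NASM syntax that reuses the structure above. The file name `demo.asm`, the `_start` entry point, the `default rel` directive and the Linux `exit` system call are assumptions added for illustration; they are not part of the original post. The register values in the comments follow from the bytes stored in `sa`.

```nasm
; Sketch only. Assumed build on x86-64 Linux:
;   nasm -f elf64 demo.asm -o demo.o && ld demo.o -o demo
default rel

struc sockaddr_in
    sin_family:     resw 1          ; 2 bytes, offset 0
    sin_port:       resw 1          ; 2 bytes, offset 2
    sin_address:    resd 1          ; 4 bytes, offset 4
    sin_zero:       resd 2          ; 8 bytes, offset 8
endstruc

section .data
    sa: istruc sockaddr_in
        at sin_family,  dw 0x02
        at sin_port,    dw 0x5C11
        at sin_address, dd 0x0100007f
        at sin_zero,    dd 0x0
    iend

section .text
global _start
_start:
    ; A 16-bit write merges into the old value: the upper 48 bits survive.
    mov     rdi, -1                     ; rdi = 0xFFFFFFFFFFFFFFFF
    mov     di, [sa + sin_port]         ; rdi = 0xFFFFFFFFFFFF5C11

    ; A 32-bit destination is zero-extended into the full 64-bit register.
    movzx   edi, word [sa + sin_port]   ; rdi = 0x0000000000005C11

    ; A 64-bit load reads 8 bytes: sin_port, sin_address and
    ; the first 2 bytes of sin_zero end up in the same register.
    mov     rdi, [sa + sin_port]        ; rdi = 0x00000100007F5C11

    mov     eax, 60                     ; Linux exit system call (assumed platform)
    xor     edi, edi                    ; exit status 0
    syscall
```

Stepping through this in a debugger such as gdb and inspecting `rdi` after each instruction shows why the `XOR rbx, rbx` was needed in the working version: the 16-bit `MOV` only replaces the low 16 bits.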

Peter Cordes
Ben1782
  • You forgot to show your data structure. Anyway I assume it's a 16-bit field, so you need a 16-bit register or zero/sign extend as appropriate. I would just use `movzx edi, word [my_struc+my_field]`. Also if you want it in `rdi` you could just as well use `di` instead of `bx`. Since 32-bit ops zero-extend you can do `xor edi, edi; mov di, [my_struc+my_field]` – Jester Aug 17 '23 at 14:25
  • That `MOV rdi, [my_struc+my_field]` specifies a 64-bit memory access, which if done on an array holding 16-bit elements, would load 4 neighboring elements packed into a single register — that's not just overkill but unwanted extra data, since usually a program is only interested in seeing one element at a time. – Erik Eidt Aug 17 '23 at 15:02
  • As others are saying, there are several ways to indicate a 16-bit memory access, and they differ in whether the data should be treated as signed or unsigned if/when it is expanded to 32 bits or larger. On Intel, you might treat the data as 16-bit and forgo expansion, but if you want expansion to 32 bits or larger, clearing a register first and then loading into one of its subregisters has the effect of zero-extending the 16-bit item (i.e. treating the data as unsigned). This is a traditional approach for older processors, but of course today that can be done in one instruction (`movzx`). – Erik Eidt Aug 17 '23 at 15:07
  • Ok so using a 16-bit register would have been more relevant if I understand correctly. – Ben1782 Aug 17 '23 at 15:13
  • Yes, and `bx` or `di` are such registers. On Intel architectures you can stick with the 16-bit subregisters even as you do addition, comparison, etc., whereas on a typical RISC machine arithmetic is done at full register width, so you have to expand short data to full register width. However, when sticking with the subregisters on Intel, be more aware of overflow, since they work with fewer bits. In some sense, Intel offers too many ways to handle roughly the same thing. – Erik Eidt Aug 17 '23 at 15:50
  • Are you asking why writing BX doesn't already zero-extend into RBX? See [Why do x86-64 instructions on 32-bit registers zero the upper part of the full 64-bit register?](https://stackoverflow.com/q/11177137) which discusses the fact that only 32-bit registers do that, keeping 8086 / 386 legacy partial-register merging-with-old-value behaviour for the smaller sizes even in 64-bit mode. – Peter Cordes Aug 17 '23 at 15:58
  • Also [How do AX, AH, AL map onto EAX?](https://stackoverflow.com/q/15191178). You only need to use partial register names in your asm source when *storing* narrow values. Narrow loads should use `movsx` or `movzx` to a 32-bit register. (Or maybe to a 64-bit register for `movsx` if you need to sign-extend to 64-bit.) [How to load a single byte from address in assembly](https://stackoverflow.com/q/20727379) – Peter Cordes Aug 17 '23 at 17:14
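The advice in the comments (extend on load, use partial register names only for stores) can be sketched as follows. The label `field` and its value `0xFFFE` are hypothetical, chosen so that zero and sign extension give visibly different results; the `_start` entry point and the Linux `exit` system call are likewise assumptions for illustration.

```nasm
; Sketch only. Assumed build on x86-64 Linux:
;   nasm -f elf64 loads.asm -o loads.o && ld loads.o -o loads
default rel

section .data
    field:  dw 0xFFFE                   ; hypothetical 16-bit field: 65534 unsigned, -2 signed

section .text
global _start
_start:
    ; Narrow loads: pick the extension that matches the data's signedness.
    movzx   eax, word [field]           ; rax = 0x000000000000FFFE (unsigned 65534)
    movsx   ecx, word [field]           ; rcx = 0x00000000FFFFFFFE (ecx = -2 as a 32-bit value)
    movsx   rdx, word [field]           ; rdx = 0xFFFFFFFFFFFFFFFE (-2 as a 64-bit value)

    ; Narrow store: the 16-bit register name sets the operand size,
    ; so exactly 2 bytes are written back.
    mov     [field], ax

    mov     eax, 60                     ; Linux exit system call (assumed platform)
    xor     edi, edi                    ; exit status 0
    syscall
```

Writing `ecx` clears the upper half of `rcx`, so the 32-bit `movsx` sign-extends only up to bit 31; use the 64-bit destination form when the value is needed as a signed 64-bit quantity.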

0 Answers