How to BSWAP the lower 32 bits of a 64-bit register?

Question

I've been looking for a way to apply BSWAP to the lower 32-bit sub-register of a 64-bit register. For example, if RAX holds 0x0123456789abcdef, I want to change it to 0x01234567efcdab89 with a single instruction (for performance reasons).

So I tried the following macro:

#define BSWAP(T) {  \
    __asm__ __volatile__ (  \
            "bswap %k0" \
            : "=q" (T)  \
            : "q" (T)); \
}

The result was 0x00000000efcdab89. I don't understand why the compiler behaves this way. Does anybody know an efficient solution?

Replaced `64-bit` tag with `64bit`, because there are more questions tagged `64bit`. — Brad Gilbert, Oct 16 '08 at 23:41

score 5 · Accepted Answer · answered Oct 07 '08 at 13:40

Ah, yes, I understand the problem now:

x86-64 processors implicitly zero-extend the result into the full 64-bit register whenever an instruction writes a 32-bit register (%eax, %ebx, etc.). This is to maintain compatibility with legacy code that expects 32-bit semantics for these registers, as I understand it.
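The zero-extension can be observed directly. The following is a minimal sketch, assuming GCC or Clang; the function name `bswap32_in_reg64` is hypothetical, and the branch for other targets only models the same result in plain C with `__builtin_bswap32`.

```c
#include <stdint.h>

/* Hypothetical helper: applies a 32-bit BSWAP to a value held in a
 * 64-bit register and returns the whole 64-bit register afterwards. */
static inline uint64_t bswap32_in_reg64(uint64_t x)
{
#if defined(__x86_64__) && defined(__GNUC__)
    /* %k0 selects the 32-bit name of the register holding x (e.g. %eax).
     * "+r" marks x as both input and output. Writing the 32-bit
     * register clears bits 63..32 of the 64-bit register. */
    __asm__ ("bswap %k0" : "+r" (x));
    return x;
#else
    /* Assumption: on other targets, model the same result in C. */
    return (uint64_t)__builtin_bswap32((uint32_t)x);
#endif
}
```

With 0x0123456789abcdef as input, this returns 0x00000000efcdab89, the value reported in the question: the lower half is byte-swapped and the upper half is lost.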

So I'm afraid there is no way to do a bswap on just the lower 32 bits of a 64-bit register with a single instruction. You'll have to use a series of several instructions...
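One possible multi-step approach is to save the upper half, byte-swap the lower half separately, and recombine the two. This is only a sketch, assuming GCC or Clang for the `__builtin_bswap32` builtin; the name `bswap_low32` is hypothetical, and the exact instructions emitted are left to the compiler.

```c
#include <stdint.h>

/* Hypothetical helper: byte-swaps bits 31..0 of x and leaves
 * bits 63..32 unchanged. */
static inline uint64_t bswap_low32(uint64_t x)
{
    uint64_t hi = x & 0xFFFFFFFF00000000ULL;   /* keep the upper 32 bits */
    uint32_t lo = (uint32_t)x;                 /* extract the lower 32 bits */
    return hi | (uint64_t)__builtin_bswap32(lo);
}
```

For the value in the question, `bswap_low32(0x0123456789abcdef)` gives 0x01234567efcdab89. Writing this in C rather than inline asm also lets the compiler schedule and combine the operations with surrounding code.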

answered Oct 07 '08 at 13:40

Dan Lenski

[Why do most x64 instructions zero the upper part of a 32 bit register](http://stackoverflow.com/q/11177137/995714) – phuclv Apr 26 '15 at 08:19

score -1 · Answer 2 · answered Oct 07 '08 at 02:10

Check the assembly output generated by gcc! Compile with the gcc -S flag to generate asm output instead of an object file.

IIRC, x86-64 instructions default to a 32-bit operand size unless explicitly directed otherwise, so this may be (part of) the problem.

answered Oct 07 '08 at 02:10

Dan Lenski
