I'm new to assembly and to SSE/AVX instructions. I want to set every element of a 256-bit YMM register to a specific value, but I'm not sure whether the results I'm getting are correct.
- To set all bits of ymm0 to 0 or to 1:
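__asm__ __volatile__(
"vpxor %%ymm0, %%ymm0, %%ymm0\n\t" // all bits are 0
: : : "ymm0"); // clobber list: ymm0 is overwritten
// or
__asm__ __volatile__(
"vpcmpeqb %%ymm0, %%ymm0, %%ymm0\n\t" // all bits are 1
: : : "ymm0"); // clobber list: ymm0 is overwritten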
GDB result shows that:
// all are 0
ymm0
{v8_float = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
v4_double = {0x0, 0x0, 0x0, 0x0},
v32_int8 = {0x0 <repeats 32 times>},
v16_int16 = {0x0 <repeats 16 times>},
v8_int32 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
v4_int64 = {0x0, 0x0, 0x0, 0x0},
v2_int128 = {0x0, 0x0}}
// all are 1
ymm0
{v8_float = {0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff},
v4_double = {0x7fffffffffffffff, 0x7fffffffffffffff, 0x7fffffffffffffff, 0x7fffffffffffffff},
v32_int8 = {0xff <repeats 32 times>},
v16_int16 = {0xffff <repeats 16 times>},
v8_int32 = {0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff},
v4_int64 = {0xffffffffffffffff, 0xffffffffffffffff, 0xffffffffffffffff, 0xffffffffffffffff},
v2_int128 = {0xffffffffffffffffffffffffffffffff, 0xffffffffffffffffffffffffffffffff}}
- To set 0xA in every position (both the high and low 128-bit halves) of ymm0:
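__asm__ __volatile__(
"movq $0xaaaaaaaaaaaaaaaa, %%rcx\n\t" // 64-bit pattern in a general-purpose register
"vmovq %%rcx, %%xmm0\n\t" // copy it into the low 64 bits of xmm0
"vpbroadcastq %%xmm0, %%ymm0\n\t" // replicate it into all four 64-bit lanes
: : : "rcx", "ymm0"); // clobber list: both registers are overwritten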
GDB result shows that:
ymm0
{v8_float = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
v4_double = {0x0, 0x0, 0x0, 0x0},
v32_int8 = {0xaa <repeats 32 times>},
v16_int16 = {0xaaaa <repeats 16 times>},
v8_int32 = {0xaaaaaaaa, 0xaaaaaaaa, 0xaaaaaaaa, 0xaaaaaaaa, 0xaaaaaaaa, 0xaaaaaaaa, 0xaaaaaaaa, 0xaaaaaaaa},
v4_int64 = {0xaaaaaaaaaaaaaaaa, 0xaaaaaaaaaaaaaaaa, 0xaaaaaaaaaaaaaaaa, 0xaaaaaaaaaaaaaaaa},
v2_int128 = {0xaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa, 0xaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}}
Questions:
- What do the fields in the GDB output (v8_float, v4_double, v32_int8, etc.) mean?
- In the second case (0xA), why are v8_float and v4_double all 0?
- How can I assign a value (e.g., 'a') to every position in a YMM register, covering both the high and low 128-bit halves?