I have a value in a uint16x8_t (a Q register). In assembly I would simply add the two halves of the register: for Q0, the result I need is vadd_u16(d0, d1). The problem is that I don't see how to get this with NEON intrinsics, since there is no conversion from uint16x8_t to uint16x4x2_t that would let me pass the low and high parts to vadd_u16.
There are lots of vreinterpret_x_y macros, but not a single one converts from uint16x8_t to uint16x4x2_t. Am I missing something? How should this operation be done with NEON intrinsics?
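
To make the desired result concrete, here is a minimal scalar sketch in plain C of the operation I'm after. It is only a model, not NEON code: the function name add_halves_u16 is hypothetical, and it assumes lanes 0..3 of the array stand for the low half (d0) and lanes 4..7 for the high half (d1), with the usual wrap-around of unsigned 16-bit addition.

```c
#include <stdint.h>

/* Scalar model of vadd_u16(d0, d1) applied to the two halves of one
 * Q register. q[0..3] models the low half (d0), q[4..7] the high half (d1).
 * add_halves_u16 is a hypothetical name used only for this illustration. */
static void add_halves_u16(const uint16_t q[8], uint16_t out[4])
{
    for (int i = 0; i < 4; ++i) {
        /* Lane-wise add; the cast makes the modulo-2^16 wrap explicit. */
        out[i] = (uint16_t)(q[i] + q[i + 4]);
    }
}
```

For example, the input lanes {1, 2, 3, 4, 10, 20, 30, 40} should give {11, 22, 33, 44}. What I'm looking for is the intrinsic equivalent that produces the same uint16x4_t from a uint16x8_t.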