I am trying to mimic the functionality of an __int128_t B using a std::array<uint64_t, 2> A (where B = A[0] + 2^64 * A[1]).
How can I efficiently negate A (in two's complement)?
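To pin down the semantics I am after, here is a minimal portable sketch of two's complement negation on this layout: invert both words and add one, propagating the carry from the low word into the high word. The name negate128 is only illustrative, and this is meant as a reference for correctness, not as the efficient version I am asking about.

```cpp
#include <array>
#include <cstdint>

// Illustrative helper (hypothetical name): a[0] is the low word, a[1] the high word.
inline std::array<std::uint64_t, 2> negate128(std::array<std::uint64_t, 2> a) {
    // Two's complement: -x == ~x + 1, computed word by word.
    const std::uint64_t low = ~a[0] + 1;      // wraps modulo 2^64
    const std::uint64_t carry = (low == 0);   // the +1 carries out only when a[0] was 0
    const std::uint64_t high = ~a[1] + carry; // propagate the carry into the high word
    return {low, high};
}
```

For example, negating {1, 0} (the value 1) gives both words all ones, which is -1 in the 128-bit representation.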
========= Attempts =========
Negating an __int128_t produces the following assembly:
negq %r12
adcq $0, %r13
negq %r13
I do not know how to express negq, together with its effect on the carry flag, in C++.
Trying this as inline assembly:
asm (
"negq %[low];"
"adcq $0, %[high];"
"negq %[high];"
: [high] "+r"(A[1]), [low] "+r"(A[0])
);
This produces many redundant movs around my code (loading A[0] and A[1] into registers and storing them back). Also, trying
unsigned char carry = 0;
carry = _subborrow_u64(carry, 0, A[0], &(A[0]));
carry = _subborrow_u64(carry, 0, A[1], &(A[1]));
I get
movq 16(%rsp), %rcx
movq %rax, %rsi
subq %rcx, %rsi
movq %rax, %rcx
movq %rsi, (%r8)
sbbq 24(%rsp), %rcx
which is considerably slower.
What is the correct way to implement negation on amd64?