The simplest solution is to avoid passing an absolute address to ptr(). The reason is that x86/x86_64 requires a 32-bit displacement, which cannot always reach an arbitrary user address. The displacement is calculated from the current instruction pointer and the target address, and if the difference does not fit in a signed 32-bit integer the instruction is not encodable (this is an architecture constraint).
Example code:
#include <cstdint>
#include <asmjit/x86.h>

using namespace asmjit;
void setXmmVarViaAddressLocation(x86::Compiler& cc, x86::Xmm& v, const float* f)
{
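    // Load the absolute address into a virtual register first...
    x86::Gp tmpPtr = cc.newIntPtr("tmpPtr");
    cc.mov(tmpPtr, reinterpret_cast<std::uintptr_t>(f));
    // ...then address memory through that register instead of a displacement.
    cc.movq(v, x86::ptr(tmpPtr));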
}
If you want to optimize this code for 32-bit mode, which doesn't have the problem, you would have to check the target architecture first, something like:
using namespace asmjit;
void setXmmVarViaAddressLocation(x86::Compiler& cc, x86::Xmm& v, const float* f)
{
    // Ideally, abstract this out so the code doesn't repeat.
    x86::Mem m;
    if (cc.is32Bit() || reinterpret_cast<std::uintptr_t>(f) <= 0xFFFFFFFFu) {
        // The address fits in 32 bits, so it can be used directly.
        m = x86::ptr(reinterpret_cast<std::uintptr_t>(f));
    }
    else {
        // Otherwise go through a temporary register holding the address.
        x86::Gp tmpPtr = cc.newIntPtr("tmpPtr");
        cc.mov(tmpPtr, reinterpret_cast<std::uintptr_t>(f));
        m = x86::ptr(tmpPtr);
    }
    // Do the move; `m` is now either the absolute address or [tmpPtr].
    cc.movq(v, m);
}
This way you save one register in 32-bit mode, where registers are always precious.
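
To make the encodability constraint from the first paragraph concrete, here is a minimal sketch of the range check in plain C++. It does not use asmjit at all: fitsInRel32 is a hypothetical helper name, and nextIp stands for the instruction pointer value the displacement would be measured from, which is an assumption of this sketch rather than something asmjit exposes this way.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of asmjit): returns true when `target` can be
// reached from `nextIp` using a signed 32-bit displacement.
inline bool fitsInRel32(std::uint64_t nextIp, std::uint64_t target) {
    // Unsigned subtraction wraps modulo 2^64; reading the result as a signed
    // value gives the distance between the two addresses.
    const std::int64_t diff = static_cast<std::int64_t>(target - nextIp);
    return diff >= std::numeric_limits<std::int32_t>::min() &&
           diff <= std::numeric_limits<std::int32_t>::max();
}
```

A constant that lives far away from the JIT-generated code fails this check, which is the case where going through a temporary register is needed.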