For some more optimized memory functions I want to use the XMM8 to XMM15 registers, since that would let me move more data per loop iteration. However, Intel's documentation implies that these registers behave differently from XMM0 to XMM7. I have not found any bugs in my code related to this, but I am unsure whether there are dangers lurking around the corner. Intel states the following in its documentation:
XMM8 through XMM15 are available using REX.R in 64-bit mode.
To which I have only one question: what is REX.R?
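For context, here is a minimal sketch of the kind of loop I mean. It assumes a 64-bit build with SSE2 intrinsics. The name `copy_blocks_sse2` is made up for illustration, and whether the sixteen temporaries really end up in XMM0 to XMM15 is up to the compiler's register allocation.

```c
#include <emmintrin.h>  /* SSE2 intrinsics: __m128i, _mm_loadu_si128, _mm_storeu_si128 */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies n bytes, 256 bytes per loop iteration.
   Sixteen 128-bit temporaries give the compiler the chance to use all
   sixteen XMM registers in 64-bit mode; it is not guaranteed to do so. */
static void copy_blocks_sse2(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n >= 256) {
        __m128i r[16];

        /* Load sixteen 16-byte chunks (unaligned loads, so no alignment assumption). */
        for (int i = 0; i < 16; ++i)
            r[i] = _mm_loadu_si128((const __m128i *)(s + 16 * i));

        /* Store them back out in the same order. */
        for (int i = 0; i < 16; ++i)
            _mm_storeu_si128((__m128i *)(d + 16 * i), r[i]);

        s += 256;
        d += 256;
        n -= 256;
    }

    /* Copy whatever is left over (fewer than 256 bytes). */
    memcpy(d, s, n);
}
```

Written with intrinsics like this, the compiler picks the registers and emits the instruction encodings itself, so the question mainly matters for hand-written assembly.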