I've been trying to learn about SSE optimization on my own and I'm not quite getting it. I thought a simple function that just zeroes memory would be easy to implement, so I went ahead and tried to write it myself.
Here is the zero-memory function. It loops from the start of the buffer to the end and uses _mm_store_si128 to zero it out:
    bool zeromem( byte * _dest, uint _sz )
    {
        if ( _dest == nullptr )
            return false;

        __m128i zero = _mm_setzero_si128( );

        for ( auto i = rcast<__m128i*>( _dest ),
                   end = rcast<__m128i*>( _dest + _sz );
              i < end; ++i )
        {
            _mm_store_si128( i, zero );
        }

        return true;
    }
When I run it, I get "Exception thrown: Access Violation (0x00000)", even though the pointer I pass in is not 0x00000.
My test just allocates 1024 bytes of memory and then calls zeromem on it.
The exception is thrown on the first iteration.
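One possible cause, assuming the 1024-byte buffer does not start on a 16-byte boundary: _mm_store_si128 requires a 16-byte aligned destination, and a misaligned address can fault on the very first store even when the pointer is non-null. The sketch below is only an illustration under that assumption. It uses _mm_storeu_si128, which has no alignment requirement, and handles a size that is not a multiple of 16. The names is_aligned16 and zeromem_unaligned are hypothetical and not part of the code above.

```cpp
#include <cstddef>      // std::size_t
#include <cstdint>      // std::uintptr_t
#include <cstring>      // std::memset
#include <emmintrin.h>  // SSE2: __m128i, _mm_setzero_si128, _mm_storeu_si128

// Hypothetical helper: true if p sits on a 16-byte boundary,
// which is what _mm_store_si128 expects of its destination.
inline bool is_aligned16(const void* p)
{
    return (reinterpret_cast<std::uintptr_t>(p) & 15u) == 0;
}

// Hypothetical variant of zeromem that makes no alignment assumption.
bool zeromem_unaligned(unsigned char* dest, std::size_t size)
{
    if (dest == nullptr)
        return false;

    const __m128i zero = _mm_setzero_si128();
    const std::size_t blocks = size / 16;  // number of full 16-byte blocks

    // Unaligned stores: valid for any address.
    for (std::size_t b = 0; b < blocks; ++b)
        _mm_storeu_si128(reinterpret_cast<__m128i*>(dest + b * 16), zero);

    // Zero the remaining 0..15 bytes so we never write past the buffer.
    std::memset(dest + blocks * 16, 0, size % 16);
    return true;
}
```

Checking is_aligned16 on the pointer before calling the original zeromem would confirm or rule out alignment as the cause. If the aligned store is preferred, the buffer would have to come from an allocator that guarantees 16-byte alignment.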