
On ARM, the NEON registers q0~q7 alias the single-precision float registers s0~s31, so the code below has a bug:

float_t fHorRatio = (float_t)srcWidth / dstWidth;

// The NEON asm behind this pointer modifies q0~q7
MyNeonFunctionPtr1(pData, Stride, (int32_t)(fHorRatio * m_iHorScale));

// The following statement reads a wrong "fHorRatio":
// its register was overwritten by "MyNeonFunctionPtr1"

int32_t vertStepLuma = (int32_t)(fHorRatio * m_iVertScale);

On x86, emms solves this. How do I do the same on NEON? My temporary solution is to use volatile on vertStepLuma. Is there a better way? Thanks!

Mysticial
lcljesse

2 Answers


Are you using GCC inline assembly? Then use the clobber list. It tells GCC which registers the asm block modifies, so GCC will not keep live values in those registers across the block. Read here: http://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html#ss5.3
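A minimal sketch of the clobber-list approach, assuming GCC or Clang targeting 32-bit ARM with NEON enabled. The names `add4` and `vert_step` are hypothetical stand-ins for the asker's NEON routine and calling code, and the plain C fallback exists only so the snippet also builds on other targets:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical NEON routine: dst[i] = a[i] + b[i] for four floats. */
static void add4(float *dst, const float *a, const float *b)
{
    assert(dst && a && b);
#if defined(__arm__) && defined(__ARM_NEON)
    __asm__ volatile(
        "vld1.32  {q0}, [%[a]]\n\t"
        "vld1.32  {q1}, [%[b]]\n\t"
        "vadd.f32 q0, q0, q1\n\t"
        "vst1.32  {q0}, [%[dst]]\n\t"
        : /* no register outputs; the result goes through memory */
        : [dst] "r"(dst), [a] "r"(a), [b] "r"(b)
        /* Clobber list: q0 and q1 are overwritten, so the compiler must
           not keep a live float (s0~s7 alias them) there across the asm. */
        : "q0", "q1", "memory");
#else
    /* Portable fallback for non-ARM builds. */
    for (int i = 0; i < 4; ++i)
        dst[i] = a[i] + b[i];
#endif
}

/* Mirrors the question: `ratio` has to survive the NEON block. */
static int32_t vert_step(int32_t src_w, int32_t dst_w,
                         int32_t vert_scale, float *buf)
{
    float ratio = (float)src_w / (float)dst_w;
    add4(buf, buf, buf);                 /* doubles the four floats in buf */
    return (int32_t)(ratio * (float)vert_scale);
}
```

Because every register the asm touches is declared, the compiler is free to keep `ratio` in any other register or spill it around the block, and no `volatile` is needed.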

Otherwise, if it is an external function implemented elsewhere, the ABI dictates which registers it may corrupt: q0-q3 and q8-q15 can be clobbered freely, but q4-q7 (d8-d15, i.e. s16-s31) must be preserved across the call. See "ARM to C calling convention, NEON registers to save". Fix the function to preserve q4-q7, or wrap the call in inline assembly where you save these registers yourself.

Mārtiņš Možeiko

The callee (the function) must preserve only Q4-Q7: if it overwrites them, it has to save them first and restore them before returning. The caller therefore has no guarantee that any other register remains untouched. If it still needs values held in Q0-Q3 or Q8-Q15 after the call, it has to save them before the call and restore them upon return. (Compilers do this automatically.)
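The callee side of that rule can be sketched as follows, again assuming GCC or Clang on 32-bit ARM with NEON. The function `double4` is a hypothetical example, not the asker's routine. It deliberately works in q4 and saves and restores that register itself with `vpush`/`vpop`; a hand-written assembly function would place the same pair at its entry and exit:

```c
#include <assert.h>

/* Hypothetical callee that uses a callee-saved NEON register (q4). */
static void double4(float *data)
{
    assert(data);
#if defined(__arm__) && defined(__ARM_NEON)
    __asm__ volatile(
        "vpush    {d8-d9}\n\t"        /* save q4 (= d8:d9) before use   */
        "vld1.32  {q4}, [%[p]]\n\t"
        "vadd.f32 q4, q4, q4\n\t"     /* data[i] += data[i]             */
        "vst1.32  {q4}, [%[p]]\n\t"
        "vpop     {d8-d9}\n\t"        /* restore q4; sp is balanced     */
        :
        : [p] "r"(data)
        /* q4 is not listed as clobbered because it is restored above. */
        : "memory");
#else
    /* Portable fallback for non-ARM builds. */
    for (int i = 0; i < 4; ++i)
        data[i] += data[i];
#endif
}
```

A caller that holds a float in s16~s19 (the registers aliasing q4) sees it unchanged after `double4` returns, which is what the calling convention promises for Q4-Q7.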

Jake 'Alquimista' LEE