
I have a for loop in my C++ (.NET, Win32 console) code that has to run as fast as possible, so I need the compiler to keep the loop counter in a register instead of storing it in RAM.

MSDN says:

The register keyword specifies that the variable is to be stored in a machine register, if possible.

This is what I tried:

for(register int i = 0; i < Size; i++)

When I look at the disassembly the compiler generates, I see:

012D4484  mov         esi,dword ptr [std::_Facetptr<std::codecvt<char,char,int> >::_Psave+24h (12DC5E4h)]  
012D448A  xor         ecx,ecx  
012D448C  push        edi  
012D448D  mov         edi,dword ptr [std::_Facetptr<std::codecvt<char,char,int> >::_Psave+10h (12DC5D0h)]  
012D4493  mov         dword ptr [Size],ebx  
012D4496  test        ebx,ebx  
012D4498  jle         FindBestAdd+48h (12D44B8h)  //FindBestAdd is the function the loop is in
012D449A  lea         ebx,[ebx]  

I expected the generated assembly not to use a dword ptr where I used the register keyword.

So, how can I tell whether the compiler is able to use a register, and what should I do to force it to read from and write to registers directly?

Bizhan
  • Specify your compiler options and the declaration of `Size` and anything else that may be unusual. My `gcc` uses registers just fine, even without optimizations turned on. There's no reason VC should do worse. –  Dec 07 '12 at 13:35
  • Possible duplicate: http://stackoverflow.com/questions/578202/register-keyword-in-c – Lei Mou Dec 07 '12 at 13:37
  • I don't see how the disassembly matches the source code: perhaps you need to post more source code and more disassembly. The theoretical answers to your questions are: a) to know whether it's possible, you need to know the compiler implementation; and b) C++ doesn't let you force the compiler to enregister a variable; at best, the `register` keyword is only a hint. – ChrisW Dec 07 '12 at 13:37
  • Are you sure that the disassembly you posted actually is the for loop? Either way, the `register` keyword is just a hint to the compiler, and since compilers these days are much better than most humans at register allocation for the code they generate, they can choose to ignore that hint. I'd suggest using a profiler so that you can find the real bottlenecks in your program, where optimization actually would matter. – Michael Dec 07 '12 at 13:40
  • You're right. When I change the loop, sometimes i is in a register and sometimes it's a dword ptr. It seems impossible to force the compiler. – Bizhan Dec 07 '12 at 13:55
  • The easiest way to get a register is to switch to x64. x86 usually doesn't have enough registers. Just look at your code: it's already using EBX, ECX, ESI and EDI. That leaves EAX and EDX, but they're probably in use too. The `push edi` looks like a register being spilled to the stack, which means all registers were in use. – MSalters Dec 07 '12 at 14:05

2 Answers


The register keyword is only a hint and is ignored by most modern compilers. In essence, this is because the compiler is better at optimizing and figuring out what should be placed in a register than the programmer.

Thus, you cannot force the compiler to use registers, and you shouldn't even if you could. If you want optimum speed, turn on the maximum optimization level in your compiler settings.
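As an illustration of leaving register allocation to the optimizer, here is a minimal sketch of a loop written without the keyword. The function name `sum_values` and its parameter are hypothetical and not taken from the question. Note also that `register` was deprecated in C++11 and removed as a storage class specifier in C++17, so newer code should not use it at all.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the question): sums a buffer with a plain loop.
// There is no `register` keyword here. With optimizations enabled
// (for example /O2 in MSVC or -O2 in GCC/Clang), the compiler is free to keep
// `i` and `total` in registers if its own cost model says that is worthwhile.
long long sum_values(const std::vector<int>& values) {
    long long total = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        total += values[i];
    }
    return total;
}
```

To check what actually happened, compare the disassembly of an optimized build against a debug build rather than relying on the keyword.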

Agentlien
  • Definitely let the compiler do this for you. – D.Shawley Dec 07 '12 at 13:36
  • You're right. When I change the loop, sometimes i is in a register and sometimes it's a dword ptr. It seems impossible to force the compiler. – Bizhan Dec 07 '12 at 13:56
  • @Bizz, x86 has a very limited number of registers in 32-bit mode. If you have function calls inside the loop (and it looks like you do), it might not be possible to keep the value only in a register. Compilers usually apply special cost models to each operation in order to choose the proper storage. Blocking one register would severely constrain the optimiser and might result in other expensive loads/stores. So I would stick to compiler hints, especially for portable code. – Hristo Iliev Dec 07 '12 at 16:43
  • Wrong. You've never run into issues where C code did not work as intended on an embedded device, and a peripheral cannot work without a precise register access sequence. – yan bellavance Oct 21 '18 at 12:00

In your case the compiler will most likely use a register anyway if you supply the right optimization options.

In general, the only way to force a variable into a register is to use inline assembly.
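A rough sketch of what this can look like, assuming GCC or Clang rather than the MSVC compiler used in the question: their extended `asm` syntax accepts an `"r"` constraint, which requires the operand to sit in a general-purpose register at that statement. The function name `pin_in_register` is hypothetical, and the snippet will not compile with MSVC, which uses a different `__asm` block syntax on 32-bit x86.

```cpp
// Hypothetical helper (not from the answer). GCC/Clang extended asm only.
int pin_in_register(int value) {
    // The asm body is empty, so no instructions are added. The "+r" constraint
    // tells the compiler that `value` is read and written by this statement and
    // must be held in a general-purpose register at this exact point.
    __asm__ __volatile__("" : "+r"(value));
    return value;
}
```

This only constrains the variable at the asm statement itself; the compiler may still spill it to the stack before or after, so it does not guarantee a register for the whole loop.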

Nemanja Trifunovic