Why aren't compilers using registers for their intended purpose?

Question

It seems to be well known that x86 register names have a purpose and indicate how each register should be used (see this website for example, or this SO post). The registers' purposes are supposed to be:

* EAX - Accumulator Register
* EBX - Base Register
* ECX - Counter Register
* EDX - Data Register
* ESI - Source Index
* EDI - Destination Index
* EBP - Base Pointer
* ESP - Stack Pointer

(Please note that I have not found official information about this in Intel's documentation yet.)

According to this, ecx should be the register holding my i variable in the code below:

int main()
{
    register int i = 0;
    for(i = 0 ; i <= 10 ; i++){}
    return 0;
}

I compile it with gcc loop.c -o loop and disassemble it with objdump -D -M intel ./loop:


0000000000001138 <main>:
    1138:       f3 0f 1e fa             endbr64
    113c:       55                      push   rbp
    113d:       48 89 e5                mov    rbp,rsp
    1140:       53                      push   rbx
    1141:       bb 00 00 00 00          mov    ebx,0x0
    1146:       f3 0f 1e fa             endbr64
    114a:       bb 00 00 00 00          mov    ebx,0x0
    114f:       eb 03                   jmp    1154 <main+0x1c>
    1151:       83 c3 01                add    ebx,0x1
    1154:       83 fb 0a                cmp    ebx,0xa
    1157:       7e f8                   jle    1151 <main+0x19>
    1159:       b8 00 00 00 00          mov    eax,0x0
    115e:       5b                      pop    rbx
    115f:       5d                      pop    rbp
    1160:       c3                      ret

We clearly see that ebx is holding i, not ecx. Is there a historical reason for this? Did compilers follow the theoretical purpose of the registers back then, or was the naming only ever meant for humans?

@Progman ecx is the counter register responsible for holding counters. `loopX` instructions use `ecx` as the `i` counter for their operations. — Nark, Apr 18 '22 at 08:00
You could be using local variables that would be loaded into `ecx` for a `loopX` check. You would not be using registers for nested loops anyway due to limited registers you have. — Nark, Apr 18 '22 at 08:04
If a compiler wanted to make slower but smaller code, then sure it could use `loop`. [Why is the loop instruction slow? Couldn't Intel have implemented it efficiently?](https://stackoverflow.com/q/35742570). See also comments and duplicates on [What are the data registers in Assembly for?](https://stackoverflow.com/q/71543381) — Peter Cordes, Apr 18 '22 at 21:01
*You would not be using registers for nested loops anyway due to limited registers you have.* - That's too pessimistic. x86-64 has 15 general-purpose registers other than the stack pointer; that's enough to handle loop counters for nested loops that aren't too crazy. (Even more so in vectorized or FP code where your data will be in separate registers, xmm0..15). Especially if you compile with optimization enabled, instead of just using `register int` vars in a debug build like this! — Peter Cordes, Apr 19 '22 at 00:41
For example (https://godbolt.org/z/sxGxMnfas) this simplistic loop with a repeat-loop slapped around it doesn't even need to save/restore any call-preserved integer regs. (With `-O3`, it defeats the simplistic benchmark with vectorization of the tiny inner loop). Coincidentally, `gcc -O2` happens to pick ECX for the outer loop counter in that. — Peter Cordes, Apr 19 '22 at 00:42
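To make the nested-loop point concrete, here is a minimal sketch of that kind of loop. It is not the code behind the Godbolt link above; the function name `repeat_sum` and its parameters are made up for illustration. Compiling it with something like `gcc -O2 -S` and reading the output shows where the two counters end up. Which registers get picked depends on the compiler version and flags, so treat any particular assignment as incidental.

```c
#include <stddef.h>

/* Hypothetical example: sum an array `repeats` times.
 * Two loop counters, one accumulator and one pointer are live at once,
 * which is well within the integer registers x86-64 offers. */
long repeat_sum(const int *values, size_t count, int repeats)
{
    long total = 0;
    for (int r = 0; r < repeats; r++) {        /* outer counter */
        for (size_t i = 0; i < count; i++) {   /* inner counter */
            total += values[i];
        }
    }
    return total;
}
```

At higher optimization levels the compiler may vectorize the inner loop or restructure it entirely, so the generated code need not contain a visible counter for each source-level loop.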
The "intended purpose" of registers is pretty much irrelevant. The idea that "ax is accumulator, cx is count" and so on made a certain amount of sense on the original 8086, where a number of instructions or features were hardcoded to operate on certain registers, in a fashion that was sort of consistent with the naming. But the 80386 lifted many of these restrictions, especially with regard to addressing modes, and so these days the registers are mostly interchangeable. The naming is now effectively arbitrary. — Nate Eldredge, Apr 19 '22 at 15:16
Thanks @PeterCordes & Nate, your answers are very instructive. I didn't know that instructions were hardcoded to be used with specific registers back then. — Nark, Apr 19 '22 at 18:43
All those instructions with implicit operands still exist (except for the BCD instructions which are gone from 64-bit mode, only in 32 or 16-bit mode). You even mentioned `loop` yourself. It's just that there are more flexible forms of many instructions, like `imul reg, r/m` in 386 that obsoletes `mul r/m` with implicit accumulator for most purposes, and `movsx` that means `cbw` / `cwde` are just a code-size saving, not saving a pair of xchg to get the data in AL/AX. CL for shift counts is only partially obsoleted by BMI2 extensions (Haswell in 2013), still necessary for shld/shrd. — Peter Cordes, Apr 19 '22 at 20:30
[Why are rbp and rsp called general purpose registers?](https://stackoverflow.com/a/51347294) lists some of the implicit uses for each register — Peter Cordes, Apr 19 '22 at 20:33
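As a rough illustration of the implicit uses mentioned in the comments above, the C sketch below contains two operations whose usual x86 encodings still involve fixed registers. The function names are hypothetical, and the instruction notes in the comments describe what a compiler typically emits without BMI2 enabled, not a guarantee for every compiler or flag set.

```c
#include <stdint.h>

/* Variable shift count: the classic encoding is `shl r/m32, cl`,
 * so the count typically has to be moved into CL first.
 * With BMI2 enabled, a compiler may use `shlx` instead. */
uint32_t shift_left(uint32_t value, uint32_t count)
{
    return value << (count & 31);   /* mask keeps the shift defined in C */
}

/* Signed division by a variable: `idiv` takes its dividend in EDX:EAX
 * and leaves the quotient in EAX and the remainder in EDX. */
int32_t quotient(int32_t dividend, int32_t divisor)
{
    return dividend / divisor;      /* caller must ensure divisor != 0 */
}

int32_t remainder_of(int32_t dividend, int32_t divisor)
{
    return dividend % divisor;      /* caller must ensure divisor != 0 */
}
```

Outside cases like these, the register allocator is free to put a value in whichever register is convenient, which is why a plain loop counter has no particular reason to land in ecx.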
