CPUs have long kept integer and floating-point registers in separate banks, for several reasons: instructions can specify registers with fewer bits, each bank needs fewer read/write ports, and each bank can be physically located closer to its respective execution units.

This ended up holding true for vector registers as well: MMX initially reused the 8087 floating-point registers for short integer vectors, but a later x86 CPU added a second set of registers for the short integers, even though it had to go to some trouble to pretend they were the same registers in case there was code that accidentally depended on that.

What about modern GPUs? Do they have unified or separate vector register banks for integers and floating-point numbers? And if unified, why do they make a different choice compared to CPUs?

rwallace
  • *even though it had to go to some trouble to pretend they were the same registers in case there was code that accidentally depended on that.* - What are you talking about? It sounds like you're talking about the XMM registers added by SSE (originally only for FP, but SSE2 expanded that to integer SIMD). But those are separate architectural state; nothing ever pretends they're the same registers as the MMX `mm0..mm7` state. – Peter Cordes Nov 17 '20 at 06:21
  • GPUs don't have vector registers at all; each GPU "core" is like one element of a CPU-style short-vector SIMD vector. But yes, I think they're unified. – Peter Cordes Nov 17 '20 at 06:22
  • @PeterCordes Not XMM, I mean the MMX integer vector registers, which were initially physically the same as the 8087 floating-point registers, but later became physically separate, but for some compatibility purpose, I forget exactly what, the CPU had to pretend they were still the same. – rwallace Nov 17 '20 at 06:23
  • @PeterCordes Ah! Do you have any idea why the reasons CPUs separate them, don't apply to GPUs? – rwallace Nov 17 '20 at 06:24
  • GPUs basically do nothing but number crunching, in throughput-optimized cores. CPUs care much more about latency because some of their workloads have long serial dependency chains without massive amounts of data / thread-level parallelism. – Peter Cordes Nov 17 '20 at 06:40
  • As far as NVIDIA GPUs of the past dozen years are concerned, a register is a register is a register. It comprises 32 bits. So it can hold a 32-bit `int` or a 32-bit `float`. A pair of them (aligned to an even register number, e.g. R4,R5) can hold a 64-bit `double`. Among CPUs, the Motorola 88100 used the same register file for floating-point and integer operations. – njuffa Nov 17 '20 at 07:17
  • Come to think of it, the DEC VAX used the same registers for integer and floating-point operations. – njuffa Nov 17 '20 at 07:27
  • @njuffa: [Is there any architecture that uses the same register space for scalar integer and floating point operations?](https://stackoverflow.com/q/51471978) already covers CPUs with unified GP-integer / FP registers. But yes, good point that it's not entirely unique to GPUs. – Peter Cordes Nov 17 '20 at 14:59
  • @PeterCordes I was specifically addressing "... if unified, why do they make a different choice compared to CPUs ..." as an incorrect generalization. Admittedly I did not spend any time looking for related site content for what I consider an off-topic question about hardware architecture. – njuffa Nov 17 '20 at 16:20
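
To make njuffa's "a register is a register" comment concrete, here is a rough host-side C++ sketch of an untyped register file. It is only a toy model, not how any real GPU is implemented: the `RegisterFile` type, its eight-slot size, and the accessor names are all made up for illustration, and it assumes a 32-bit `float` and a 64-bit `double`. The point is just that the same 32-bit slot can hold an `int` or a `float`, and that a `double` takes an even-aligned pair of slots.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Assumptions of this sketch (true on common platforms, not guaranteed by C++).
static_assert(sizeof(float) == 4, "sketch assumes 32-bit float");
static_assert(sizeof(double) == 8, "sketch assumes 64-bit double");

// Hypothetical model of a unified register file: plain 32-bit slots, no type tag.
struct RegisterFile {
    std::array<std::uint32_t, 8> r{};  // R0..R7; the size is arbitrary

    // Integer and float accessors target the same storage; only the
    // interpretation of the 32 bits differs.
    void write_int(int n, std::int32_t v) { std::memcpy(&r[n], &v, sizeof v); }
    void write_float(int n, float v)      { std::memcpy(&r[n], &v, sizeof v); }

    std::int32_t read_int(int n) const {
        std::int32_t v;
        std::memcpy(&v, &r[n], sizeof v);
        return v;
    }
    float read_float(int n) const {
        float v;
        std::memcpy(&v, &r[n], sizeof v);
        return v;
    }

    // A double spans two adjacent slots. The caller is expected to pass an
    // even n (e.g. 4 for the pair R4,R5); this sketch does not check it.
    void write_double(int n, double v) { std::memcpy(&r[n], &v, sizeof v); }
    double read_double(int n) const {
        double v;
        std::memcpy(&v, &r[n], sizeof v);
        return v;
    }
};
```

Writing a `float` to slot 0 and then an `int` to the same slot simply overwrites the same 32 bits; nothing in the storage records which kind of value was there last. `std::memcpy` is used for the reinterpretation because it is well defined in C++, unlike casting pointers between unrelated types.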

0 Answers