Why are temporary registers and saved registers in RISC-V not numbered sequentially?

Question

Why not use x5~x11 as temporary registers? Is there any reason?

The registers seemed to have been renamed with the introduction of variable length compressed instructions https://stackoverflow.com/questions/30636566/abi-register-names-for-risc-v-calling-convention or is the manual linked in the question just a register naming convention in the scope of the Linux kernel? — Sebastian, Oct 19 '22 at 07:56

Paul Sherman · Answer 1 · 2023-04-16T12:38:49.573

In a nutshell, for simplicity.

The RVC instruction formats CIW, CL, CS, and CB provide only three bits for their register number fields. This simplifies the hardware design of the instruction decoder, because only three lines, not five, are needed. An arbitrary constant "8" is added to these three-bit numbers. This is indicated by the prime mark (′) shown in instruction descriptions (RISC-V greencard).
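The three-bit field plus the constant 8 can be sketched in a few lines of Python. This is only an illustration: the names `ABI_NAMES` and `decode_prime` are made up here, and the ABI names assume the standard RISC-V calling convention.

```python
# ABI names of the 32 integer registers, indexed by register number
# (assumed from the standard RISC-V calling convention).
ABI_NAMES = (
    ["zero", "ra", "sp", "gp", "tp", "t0", "t1", "t2", "s0", "s1"]
    + [f"a{i}" for i in range(8)]        # x10..x17
    + [f"s{i}" for i in range(2, 12)]    # x18..x27
    + [f"t{i}" for i in range(3, 7)]     # x28..x31
)


def decode_prime(field):
    """Map a 3-bit RVC register field (hypothetical helper) to a register number."""
    if not 0 <= field <= 7:
        raise ValueError("field must fit in three bits")
    return field + 8


for field in range(8):
    reg = decode_prime(field)
    print(f"field {field:03b} -> x{reg} ({ABI_NAMES[reg]})")
```

The loop lists the eight registers reachable through a primed field: s0, s1, and a0 through a5.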

With RVC, the three temporary registers t0, t1, and t2 (x5 to x7), although they are numbered below x16, are not among the registers listed in Table 1.2 of the RISC-V Compressed Instruction Set Specification and, thus, should not be used with these formats. They are unreachable as operands of format types CIW, CL, CS, and CB; if a register number were truncated to three bits, a different register would be clobbered, as shown below. For example, using register t0 inside a loop that is controlled by modifying and branching on register a3 would produce an extremely difficult run-time bug to find!

reg#  reg#' (used in CIW-, CL-, CS-, CB-Types)
x[5]   - (5 + 8 = x[13] = a3 (!)
x[6]   - (6 + 8 = x[14] = a4 (!)
x[7]   - (7 + 8 = x[15] = a5 (!)
x[8]   0    + 8 = x[8] = s0
x[9]   1    + 8 = x[9] = s1
x[10]  2    + 8 = x[10] = a0
x[11]  3    + 8 = x[11] = a1
x[12]  4    + 8 = x[12] = a2
x[13]  5    + 8 = x[13] = a3
x[14]  6    + 8 = x[14] = a4
x[15]  7    + 8 = x[15] = a5
------
x[16]  - (0 + 8 = x[8] = s0 (!)
x[17]  - (1 + 8 = x[9] = s1 (!)
...
x[31]  - (7 + 8 = x[15] = a5 (!)
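The mapping in the table can be reproduced with a short Python sketch. The helper names `encode_prime` and `truncated_alias` are hypothetical. The first rejects unreachable registers, similar in spirit to the "illegal operands" error reported for GAS in the comments below; the second only computes which register the low three bits would select. Neither is taken from any real assembler.

```python
def encode_prime(reg):
    """Return the 3-bit field for register x<reg>, or raise if it is not encodable."""
    if not 8 <= reg <= 15:
        raise ValueError(f"x{reg} cannot be encoded in a 3-bit RVC register field")
    return reg - 8


def truncated_alias(reg):
    """Register that would be selected if only the low three bits were kept."""
    return (reg & 0b111) + 8


for reg in (5, 6, 7, 16, 17, 31):
    print(f"x{reg}: low three bits would select x{truncated_alias(reg)}")
```

Raising an error in `encode_prime` rather than truncating is the design choice that keeps a mistake visible at assembly time instead of at run time.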

Instructions of the RVC extension which have, and don't have, this three-bit limitation are shown below.

r(d,s1,s2)	r = 8 + r'
add	-
addi	-
addi16sp	addi4spn
addiw	addw
-	and
-	andi
-	beqz
-	bnez
fldsp	fld
flwsp	flw
fsdsp	fsd
fswsp	fsw
jalr	-
jr	-
ldsp	ld
li	-
lui	-
lwsp	lw
mv	-
-	or
sdsp	sd
slli	-
-	srai
-	srli
-	sub
-	subw
swsp	sw
-	xor
call	-
csrr, csrc, csrci, csrrc	-

Here is a very simple example of the kind of run-time bug to watch for when RVC forms are in use, whether you request them explicitly or the assembler chooses them without your knowing.

# run-time bug demonstration with RV(32/64)C

.equ OFFSET, 0
.data
BASE:
.word 0xDEADBEEF

.text
#--------------------------------
    li a3, 100            # i = 100 (does not fit c.li's 6-bit immediate)
    la t2, BASE           # load the address of BASE
loop:
    beqz a3, loop_done    # while (i > 0)
    c.addi a3, -1         # decrement i

    lw t0, OFFSET(t2)     # would clobber a3 if t0 were truncated: x[5+8]=x[13]=a3
    andi t0, t0, 0x03     # would clobber a3 again
    bnez t0, loop         # would test a3, not t0 as expected
loop_done:
    snez a3, a3           # a3 = (a3 != 0)
    # a3=0 failure because timeout
    # a3>0 success
#--------------------------------

There are three ways to fix this bug.

One way is to use only registers s0 or s1 (saving and restoring them on the stack as appropriate) or a0 through a5, instead of using t0, t1, or t2.

Another way is to guard the code above so that the assembler cannot choose any RVC forms.

.option push
.option norvc   # tell assembler do not emit compressed instruction forms

# code with t0, t1, or t2 goes here
# don't forget to remove the c. prefixes

.option pop

A third way is to specify "-norvc" on the GNU as command line and then selectively and carefully enable compressed instructions where needed.

.option push
.option rvc   # tell assembler it is okay to emit compressed instruction forms

# code without t0, t1, or t2 goes here
# may want to specify c. prefixes

.option pop

Sounds like an assembler bug if it encodes `a3` instead of erroring on asm source that uses `t0` for an instruction that forces a compressed encoding, like `c.andi`. Truncating the register number and encoding the wrong register is not helpful, so presumably will be fixed (to error instead of mis-assemble) if it happens now, @Sebastian. I tried with `clang -target riscv32` but it doesn't support `c.` instructions at all. I don't have RISC-V GNU Binutils installed so I didn't try GAS. — Peter Cordes, Apr 16 '23 at 06:07
Oops. I may have misspoken. My Binutils does indeed show "Error: illegal operands `...`" when, for example, "c.slli t3, t3, 1" is specified. The output listing obtained from -objdump -D, however, doesn't show the "c." prefix when the assembler chooses an RVC form. That is not critical, but it is misleading to the eye if you only look at the mnemonics and not the opcodes. – Paul Sherman, Apr 16 '23 at 08:39
I use real ASIC hardware and see strange code hang-ups; some instruction boundary might be violated as RVC forms are chosen automatically. Judicious use of .balign directives and ".option norvc" might save the day. At first I thought the problem was a race condition with the 2-byte instructions when Rd and Rs1/Rs2 point to the same register. My tool version is: GNU assembler (SiFive Binutils-Metal 2.35.0-2020.12.8) 2.35 (edited typos in my original post) – Paul Sherman, Apr 16 '23 at 08:53

Erik Eidt · Answer 2 · 2022-10-18T18:19:05.997

I can't speak to why the designers chose that somewhat mixed-up ordering.

But I can say that it doesn't matter, in the sense that it makes no difference to the hardware or software. This is because there are no instructions that refer to multiple, sequential registers via one register number.

An example, which neither MIPS nor RISC-V has, would be a store-multiple-registers instruction, in which one register number, e.g. a low number, is specified explicitly and then sequential register numbers are implied for some count of registers.

In architectures that do that, it is important to have, for example, at least the call-preserved (also known as callee-saved) registers consecutively numbered, for optimal use of that instruction. Careful placement of the return address register within the register numbering helps, too.
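To make the point concrete, here is a small Python model of such an instruction. It is purely hypothetical: `store_multiple_regs` and `instructions_needed` are invented names, and no such instruction exists in RISC-V or MIPS. The only real data is the RISC-V numbering of the callee-saved registers s0 to s11 (x8, x9, and x18 to x27).

```python
def store_multiple_regs(first, count):
    """Registers implied by a hypothetical 'store multiple' instruction."""
    return list(range(first, first + count))


# RISC-V callee-saved registers s0..s11 are x8, x9 and x18..x27.
CALLEE_SAVED = [8, 9] + list(range(18, 28))


def instructions_needed(regs):
    """Count runs of consecutive register numbers (one instruction per run)."""
    regs = sorted(regs)
    runs = 1 if regs else 0
    for prev, cur in zip(regs, regs[1:]):
        if cur != prev + 1:
            runs += 1
    return runs


print(store_multiple_regs(18, 3))
print(instructions_needed(CALLEE_SAVED))
```

With the RISC-V numbering, saving every callee-saved register would take two such instructions rather than one, which is the kind of cost that consecutive numbering avoids on architectures that have this instruction.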

In RISC-V, register numbers are always explicit. The only exception is in the compressed instruction extension, where one register, namely sp, is implicit for manipulating local variables in memory, though never for a range of registers.

In this situation, there is neither advantage nor disadvantage to any other alternative register usage ordering.

edited Oct 18 '22 at 18:19

answered Oct 17 '22 at 15:53

Are RV32C compressed instructions limited in which register numbers they can access with a 16-bit instruction? Having a good mix of things in the low half or quarter might make sense then, otherwise yeah seems as weird as MIPS where the ABI names are also non-contiguous. (https://en.wikibooks.org/wiki/MIPS_Assembly/Register_File - makes me wonder if t8 and t9 were a later design change, if initially they were also reserved for async clobber by kernel interrupt handlers.) – Peter Cordes Oct 19 '22 at 07:46
@PeterCordes, yes, that's a good point, thx. RVC has 3 bit register fields, IIRC, so that adds to the equation. – Erik Eidt Oct 19 '22 at 16:41
@PeterCordes, in three bit register fields, they allow for access to x8-x15; these are s0, s1, a0, a1, a2, a3, a4, a5. So, two call-preserved and six parameter or scratch registers (a0,a1 are also return values). Perhaps this does partially explain the register ordering, at least that break between s0-1 and s2-11 -- but doesn't really explain the other discontinuities. – Erik Eidt Oct 23 '22 at 18:50