Proper way to manipulate registers (PUT32 vs GPIO->ODR)

Question

I'm learning how to use microcontrollers without a bunch of abstractions. I've read somewhere that it's better to use PUT32() and GET32() instead of volatile pointers and stuff. Why is that?

With a basic pin wiggle "benchmark," the performance of GPIO->ODR=0xFFFFFFFF seems to be about four times faster than PUT32(GPIO_ODR, 0xFFFFFFFF), as shown by the scope:

(The one with lower frequency is PUT32)

This is my code using PUT32:

PUT32(0x40021034, 0x00000002); // RCC IOPENR B 
PUT32(0x50000400, 0x00555555); // PB MODER
while (1) {
    PUT32(0x50000414, 0x0000FFFF); // PB ODR
    PUT32(0x50000414, 0x00000000); 
}

This is my code using the arrow (GPIOB->ODR) syntax:

* (volatile uint32_t *) 0x40021034 = 0x00000002; // RCC IOPENR B 
GPIOB->MODER = 0x00555555; // PB MODER
while (1) {
    GPIOB->ODR = 0x00000000; // PB ODR
    GPIOB->ODR = 0x0000FFFF; 
}

I shamelessly adapted the assembly for PUT32 from somewhere:

PUT32     PROC
    EXPORT  PUT32
    STR R1,[R0]     ; store the value in R1 at the address held in R0
    BX LR           ; return to the caller
    ENDP

My questions are:

Why is one method slower when it looks like they're doing the same thing?
What's the proper or best way to interact with GPIO? (Or rather what are the pros and cons of different methods?)

Additional information:

Chip is STM32G031G8Ux, using Keil uVision IDE.
I didn't configure the clock to go as fast as it can, but it should be consistent for the two tests.
Here's my hardware setup: (Scope probe connected to the LEDs. The extra wires should have no effect here)

Thank you for your time, sorry for any misunderstandings

There are countless reasons to use an abstraction; this is typical for drivers. The abstraction is going to be slower, of course. You can use the volatile pointer trick to inline the abstraction, but you can't go the other way later: once you use the pointers directly, you have to do extra work to create an abstraction for the various reasons you would want one. — old_timer, Mar 21 '21 at 02:40
The volatile pointer trick is not necessarily defined to be reliable within the language. volatile is arguably meant for hardware register access, but other use cases (such as sharing variables between foreground code and interrupt service routines) are not reliable. The current fad of misusing unions and structs makes the volatile approach significantly less reliable. Abstractions like readl() and writel() or similar avoid these problems. You can also ensure that the exact instruction you want is used, rather than hoping the compiler picks the correct one, which is a problem with a compiled approach. — old_timer, Mar 21 '21 at 02:42

score 2 · Answer 1 · answered Mar 20 '21 at 09:39

PUT32 is a totally non-standard method that the poster in that other question made up. They did this to avoid the complications and possible mistakes involved in defining register access methods.

When you use the standard CMSIS header files and assign to the registers in the standard way, then all the complication has already been taken care of for you by someone who has specific knowledge of the target that you are using. They have designed it in a way that makes it hard for you to make the mistakes that the PUT32 is trying to avoid, and in a way that makes the final syntax look cleaner.
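To illustrate what such a header does for you, here is a minimal sketch of a CMSIS-style register map. The names gpio_regs_t, MY_GPIOB, fake_gpiob, RUN_ON_TARGET, leds_on and leds_off are made up for this example; the real ST header defines its own types and covers every register. The base address 0x50000400 and the ODR offset 0x14 are taken from the code in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified register map in the style of a CMSIS device
   header. Only the first six GPIO registers are listed. */
typedef struct {
    volatile uint32_t MODER;   /* offset 0x00: pin mode */
    volatile uint32_t OTYPER;  /* offset 0x04: output type */
    volatile uint32_t OSPEEDR; /* offset 0x08: output speed */
    volatile uint32_t PUPDR;   /* offset 0x0C: pull-up / pull-down */
    volatile uint32_t IDR;     /* offset 0x10: input data */
    volatile uint32_t ODR;     /* offset 0x14: output data */
} gpio_regs_t;

/* The struct layout encodes the offsets, so a typo is caught at compile time. */
static_assert(offsetof(gpio_regs_t, ODR) == 0x14, "ODR must sit at offset 0x14");

#ifdef RUN_ON_TARGET
/* On the chip: overlay the struct on the port B base address from the question. */
#define MY_GPIOB ((gpio_regs_t *)0x50000400u)
#else
/* On a PC: use ordinary memory so the sketch can be compiled and run anywhere. */
static gpio_regs_t fake_gpiob;
#define MY_GPIOB (&fake_gpiob)
#endif

/* Each assignment compiles to a plain 32-bit store to the ODR address. */
static inline void leds_on(void)  { MY_GPIOB->ODR = 0x0000FFFFu; }
static inline void leds_off(void) { MY_GPIOB->ODR = 0x00000000u; }
```

With this in place, the register name and width travel with the type, so the calling code never has to spell out 0x50000414 by hand.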

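Writing to the register directly is quicker because the store itself can take as little as a single cycle of the processor clock, whereas calling a function, writing to the register, and then returning takes about four times longer in the context of your experiment. Because your PUT32 lives in a separate assembly file, the compiler cannot inline it: each call typically has to place the address and the value in R0 and R1, branch to the function, and branch back again, all to execute one STR instruction.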
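If you like the function-style interface, the call overhead can usually be avoided by writing the accessor in C where the compiler can see it. The following is only a sketch: put32, get32 and wiggle are hypothetical names (lowercase, to avoid clashing with the assembly PUT32), and whether the calls are actually inlined depends on your compiler and optimization settings.

```c
#include <assert.h>
#include <stdint.h>

/* Same idea as the assembly PUT32, but defined in a header-visible way
   so the compiler is free to inline it into the caller. */
static inline void put32(uintptr_t addr, uint32_t value)
{
    *(volatile uint32_t *)addr = value;
}

/* Matching read accessor. */
static inline uint32_t get32(uintptr_t addr)
{
    return *(volatile uint32_t *)addr;
}

/* Toggle all 16 output bits `count` times on the register at odr_addr.
   On the chip from the question this would be called with 0x50000414. */
static inline void wiggle(uintptr_t odr_addr, unsigned count)
{
    while (count--) {
        put32(odr_addr, 0x0000FFFFu);
        put32(odr_addr, 0x00000000u);
    }
}
```

When the calls are inlined, the loop body reduces to two store instructions, much like the GPIOB->ODR version; checking the disassembly in your toolchain is the way to confirm that.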

By using this generic access method you also risk introducing bugs that are not possible when you use the manufacturer-provided header files: for example, using a 32-bit access when the register is only 16 or 8 bits wide.
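A small model of that width problem, runnable on a PC. Everything here is hypothetical: demo_regs_t is not a real STM32 peripheral, and model_store32 only imitates what a 32-bit store at offset 0 does to two adjacent 16-bit registers on a little-endian core; it is not how you would write to hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical peripheral with two adjacent 16-bit registers. */
typedef struct {
    volatile uint16_t CTRL;   /* offset 0x00 */
    volatile uint16_t STATUS; /* offset 0x02 */
} demo_regs_t;

/* Typed access: the member type makes the compiler emit a 16-bit store,
   so only CTRL is touched. */
static inline void set_ctrl(demo_regs_t *regs, uint16_t value)
{
    regs->CTRL = value;
}

/* Model of a generic 32-bit store aimed at CTRL on a little-endian core:
   the low halfword lands in CTRL, the high halfword lands in STATUS. */
static inline void model_store32(demo_regs_t *regs, uint32_t value)
{
    regs->CTRL   = (uint16_t)(value & 0xFFFFu);
    regs->STATUS = (uint16_t)(value >> 16);
}
```

Calling set_ctrl leaves STATUS alone, while the modeled 32-bit store overwrites it with the upper half of the value. A typed register struct makes the first behaviour the default.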
