Even though signed overflow is undefined behavior, a compiled binary does have a definite behavior, so we can check why using register makes the code faster.
Relying on undefined behavior with optimizations disabled means that GCC can produce completely different assembly even for small, apparently insignificant, modifications.
register is such an insignificant modification, since it is a historical micro-optimization that GCC (apparently) doesn't honor anymore.
Quoting the GNU C Manual Reference:
20.10 auto and register
For historical reasons, you can write auto or register before a local variable
declaration. auto merely emphasizes that the variable isn’t static; it changes
nothing.
register suggests to the compiler storing this variable in a register. However,
GNU C ignores this suggestion, since it can choose the best variables
to store in registers without any hints.
However, the two binaries differ exactly in the fact that count is held in a register (ebx specifically) vs a local variable (and in the fact that a frame pointer is created).

So register does indeed make your code faster. You can see, on the left, that without it (at the -O0 optimization level) GCC generated add [rbp+count], 1 (this is a 32-bit increment; IDA doesn't show the operand size), while with the register modifier add ebx, 1 was generated.
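The roughly 5x slowdown is consistent with store-to-load forwarding latency: each iteration's add has to wait for the previous iteration's store to count to be forwarded to the next load, which costs several cycles, whereas a register add has a latency of one cycle.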
Note, however, that GCC may place count in a register or in memory as it sees fit (unless, perhaps, it is declared volatile). The choice can change if you change int to unsigned int or to unsigned long long, if too many other local variables are in use, or with any other compiler switch (such as specific optimizations).
register had the desired effect in this simple code because there were no other constraints in place from the compiler analysis.
It is interesting, however, to see that GCC doesn't completely ignore it as Stallman claims in his manual.
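
If you want to reproduce the comparison yourself, here is a minimal sketch. It is not the code from the question: the function names (count_plain, count_register, count_volatile, time_call) are made up for illustration, and the loop is bounded by n so that it never overflows. The idea is only that count is the sole loop-carried variable, so where GCC keeps it dominates the loop's latency. Whether count actually ends up in a register at -O0 depends on your GCC version and target, so check the disassembly rather than trusting the comments.

```c
#include <time.h>

/* Plain local: at -O0 GCC is expected to keep count on the stack,
 * so every iteration loads, increments and stores it. */
int count_plain(int n)
{
    int count = 0;
    while (count < n)
        count += 1;
    return count;
}

/* Same loop with the register keyword: at -O0 GCC may keep count
 * in a register instead (an assumption to verify in the assembly). */
int count_register(int n)
{
    register int count = 0;
    while (count < n)
        count += 1;
    return count;
}

/* volatile forces every read and write of count to go through memory,
 * regardless of the optimization level. */
int count_volatile(int n)
{
    volatile int count = 0;
    while (count < n)
        count += 1;
    return count;
}

/* Runs fn(n), stores its return value in *result and returns the
 * elapsed CPU time in seconds as measured by clock(). */
double time_call(int (*fn)(int), int n, int *result)
{
    clock_t start = clock();
    *result = fn(n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Call time_call on each function with a large n from a small main, compile with gcc -O0, and compare both the timings and the disassembly (objdump -d, or gcc -S) of the three loops.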