Obviously you should only consider this if you're already using all the other GP registers, including lr, and can't shift some of your work to NEON registers, e.g. using packed-integer instructions even if you only care about the low 32 bits.
(Using SIMD regs for more scalar integer is usually only useful if there's an isolated set of values that don't interact with the other values in your algorithm, and you don't need to branch on them or use them as pointers. Transfer between int and SIMD is slow on some ARM CPUs.)
This is very non-standard, and only even possibly safe in user-space, not in the kernel.
If you have any signal handlers installed, your stack pointer must be valid when one of those signals arrives. (And that's asynchronous.)
There's no other async usage of the user-space stack pointer in Linux beyond signal handlers. (Except if you're debugging with GDB and use print foo(123) where foo is a function in the target process.)
As mentioned in comments on Can I use rsp as a general purpose register (the x86-64 equivalent of this question), there's a workaround even for signals:
Use sigaltstack to set up an alternative stack, and specify SA_ONSTACK in the flags for sigaction when installing a handler.
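The workaround just described can be sketched in C as follows. This is a minimal illustration, not a drop-in implementation: the names `install_altstack_handler`, `on_signal`, and `alt_stack`, and the 64 KiB size, are assumptions made for the example. The handler records whether one of its locals lives inside the alternate stack, so you can check that the kernel really switched stacks.

```c
#define _XOPEN_SOURCE 700   /* exposes sigaltstack() and SA_ONSTACK on glibc */
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed size for this sketch; must be at least MINSIGSTKSZ on your target. */
#define ALT_STACK_SIZE (64 * 1024)

static unsigned char alt_stack[ALT_STACK_SIZE];
static volatile sig_atomic_t handler_ran = 0;
static volatile sig_atomic_t handler_on_altstack = 0;

/* Hypothetical handler: records whether its frame is inside alt_stack. */
static void on_signal(int sig)
{
    (void)sig;
    char local;
    uintptr_t p  = (uintptr_t)&local;
    uintptr_t lo = (uintptr_t)alt_stack;
    handler_on_altstack = (p >= lo && p < lo + ALT_STACK_SIZE);
    handler_ran = 1;
}

/* Register the alternate stack, then install a handler that runs on it.
 * Returns 0 on success, -1 on failure (errno set by the failing call). */
int install_altstack_handler(int signo)
{
    stack_t ss;
    memset(&ss, 0, sizeof ss);
    ss.ss_sp = alt_stack;
    ss.ss_size = ALT_STACK_SIZE;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) != 0)
        return -1;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sa.sa_flags = SA_ONSTACK;   /* deliver this signal on the alternate stack */
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}
```

Calling `install_altstack_handler(SIGUSR1)` and then `raise(SIGUSR1)` should leave both flags set. Every handler you install needs SA_ONSTACK; one handler without it will still be delivered on whatever SP happens to hold.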
As @Timothy points out, if your scratch value of SP could be an integer that happens to "point" into the alt stack, the signal dispatch mechanism will assume this is a nested signal and won't modify SP (because in an actual nested-signal case, switching to the top of the alt stack would overwrite the first signal handler's still-in-use stack). So you could be one push away from SP going into an unmapped page, unless you allocate twice as much as you need and only pass the top half to sigaltstack. (Maybe just 2k or 4k for simple signal handlers that return after not doing much.)
This should be safe even with nested signals: only the outermost signal handler can start near the bottom of the alt stack and use some of the allocated space beyond the actual altstack. Another signal will use space below that, if SP is still within the altstack, or it will use the top of the altstack if SP has gotten outside the altstack.
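The over-allocation trick might look like the sketch below. The names `alt_region`, `ALT_HALF`, and `install_guarded_altstack`, and the 32 KiB half size, are assumptions for illustration. Because the stack grows downward, the "top half" is the higher-addressed half, and the unregistered lower half is the slack a handler can spill into if it starts near the bottom of the registered range.

```c
#define _XOPEN_SOURCE 700   /* exposes sigaltstack() on glibc */
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Assumed size for this sketch: each half must be big enough for a full
 * signal frame plus whatever the handler itself uses. */
#define ALT_HALF (32 * 1024)

/* Twice what the handler needs. Only the upper half is registered. */
static unsigned char alt_region[2 * ALT_HALF];

/* Register only the higher-addressed half as the alternate stack.
 * If SP holds a scratch integer that happens to fall just above
 * alt_region + ALT_HALF, the kernel treats it as "already on the
 * altstack" and pushes the signal frame below SP, into the mapped
 * lower half instead of an unrelated or unmapped page.
 * Returns 0 on success, -1 on failure. */
int install_guarded_altstack(void)
{
    stack_t ss;
    memset(&ss, 0, sizeof ss);
    ss.ss_sp = alt_region + ALT_HALF;   /* lowest address of the registered range */
    ss.ss_size = ALT_HALF;
    ss.ss_flags = 0;
    return sigaltstack(&ss, NULL);
}
```

After this call, `sigaltstack(NULL, &cur)` should report the upper half only. Handlers still have to be installed with SA_ONSTACK for any of this to matter.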
Or you can avoid the need for this over-allocation by using SP to hold a pointer to something else that's definitely not the alt stack, if any of your GP registers need to be a pointer. Having it be a valid pointer opens you up to corruption instead of faults if a debugger uses the current SP for something, or if you get the altstack mechanism wrong. But that's just a difference in failure mode: either is catastrophic.
Hardware interrupts save state on the kernel stack, not the user-space stack. If they used the user stack:
- user-space could crash the OS by having an invalid SP.
- user-space could gain kernel privileges by having another user-space thread modify the kernel's stack data (including return addresses).
(All user-space threads of a process share the same page table, and can read/write each other's stack mappings.)
Linux/Android is very different from a lightweight RTOS without virtual memory or strict enforcement of privilege separation.