Differences:
sub reg,reg is documented to set AF=0 (AF is the BCD half-carry flag: carry out of bit 3 into bit 4). XOR leaves AF undefined. The architectural effect is otherwise identical, which leaves only possible performance differences. AF almost never matters; usually only if the next instruction is aaa or something similar that reads it.
sub-zeroing is slower than xor-zeroing on a few CPUs (e.g. Silvermont, as pointed out in my answer you linked), but performs the same on most. And of course both have the same 2-byte size.
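For concreteness, here are the two idioms side by side for a 32-bit register, in NASM syntax, with the flag results from the points above as comments. The bytes shown are one valid encoding of each instruction; an assembler could also choose the alternate opcode form, which is the same length.

```nasm
; Both instructions leave EAX = 0 and are 2 bytes long.
sub  eax, eax    ; 29 C0   CF=0 OF=0 SF=0 ZF=1 PF=1, AF=0 (documented)
xor  eax, eax    ; 31 C0   CF=0 OF=0 SF=0 ZF=1 PF=1, AF undefined
```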
I'd guess it's just different authors of hand-written asm, some of them preferring sub, probably without realizing that some CPUs only special-case xor. The exception would be cases where they want to guarantee that AF is cleared, where sub might be intentional: for example, when initializing things and wanting a fully known state for EFLAGS before something that might use pushf.
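As a sketch of that last case (hypothetical code, not taken from any particular codebase, and assuming 32-bit mode), this is the kind of sequence where the documented AF=0 from sub would actually be observable:

```nasm
; Hypothetical 32-bit example: make every status flag well-defined,
; then capture EFLAGS.
sub    eax, eax     ; EAX = 0; CF, PF, AF, ZF, SF, OF all have documented values
pushfd              ; push the EFLAGS image onto the stack
pop    edx          ; EDX = EFLAGS image
test   edx, 0x10    ; AF is bit 4 of EFLAGS
; After sub, bit 4 is documented to be 0, so this test sets ZF=1.
; With xor eax,eax in place of sub, the manuals make no promise about bit 4.
```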
XOR leaving AF undefined still means it will be either 0 or 1; you just don't know which. (This is not like undefined behaviour in C.) The actual result could depend on the CPU model, the input values, or possibly even some stray bits somewhere.
On modern CPUs that recognize sub as a zeroing idiom, AF after xor-zeroing will be zero as well, so the CPU can handle xor-zeroing and sub-zeroing exactly identically, including the FLAGS result.