In fact, traditional x86 opcodes carry both an operand size selection (sometimes as a specific instruction encoding, sometimes via prefix bytes) and register number selection bits. For register selection, there are always three bits in the instruction encoding, which allows for a total of eight registers.
For 16bit operands, those eight values select AX/CX/DX/BX/SP/BP/SI/DI, in that order.
For 8bit operands, the same three bits select AL/CL/DL/BL/AH/CH/DH/BH, so the low and high halves of AX/CX/DX/BX use up all eight values.
That exhausts the 3 register selection bits and makes it impossible to provide 8bit regs for SI/DI/BP/SP in the traditional encoding.
The way AMD64 64bit mode managed to double the register set is therefore by using prefix bytes (a "use the new regs" prefix, officially called REX), similar to how traditional x86 code chose between 16 and 32bit operations. The same method was used to provide 8bit registers where there were none "traditionally", i.e. for SP/BP/SI/DI.
To illustrate, here are the instruction encodings for the 8bit case:
0: 00 c0 add %al,%al
2: 00 c1 add %al,%cl
4: 00 c2 add %al,%dl
6: 00 c3 add %al,%bl
8: 00 c4 add %al,%ah
a: 00 c5 add %al,%ch
c: 00 c6 add %al,%dh
e: 00 c7 add %al,%bh
10: 40 00 c4 add %al,%spl
13: 40 00 c5 add %al,%bpl
16: 40 00 c6 add %al,%sil
19: 40 00 c7 add %al,%dil
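The listing above can be reproduced with a few lines of Python. This is only a minimal sketch, assuming 64bit mode (elsewhere the bytes 0x40..0x4f are not prefixes at all) and handling nothing but the register-to-register form of opcode 0x00; the function name `decode_add8` and the two tables are made up for illustration.

```python
# 3-bit register field -> 8bit register name, without any REX prefix.
LEGACY_8BIT = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]
# With a REX prefix, codes 4-7 mean spl/bpl/sil/dil, and the REX.R / REX.B
# bits supply a fourth register-number bit that reaches r8b..r15b.
REX_8BIT = ["al", "cl", "dl", "bl", "spl", "bpl", "sil", "dil"] + [
    f"r{n}b" for n in range(8, 16)
]


def decode_add8(code: bytes) -> str:
    """Decode a register-to-register `add` with opcode 0x00 into AT&T syntax."""
    pos = 0
    rex = 0
    if 0x40 <= code[pos] <= 0x4F:  # REX prefix (64bit mode assumed)
        rex = code[pos]
        pos += 1
    if code[pos] != 0x00:
        raise ValueError("only opcode 0x00 (add r/m8, r8) is handled")
    modrm = code[pos + 1]
    if modrm >> 6 != 0b11:
        raise ValueError("only register-direct operands are handled")
    src = (modrm >> 3) & 7  # ModRM.reg field: source register
    dst = modrm & 7         # ModRM.rm field: destination register
    if rex:
        src |= ((rex >> 2) & 1) << 3  # REX.R extends ModRM.reg
        dst |= (rex & 1) << 3         # REX.B extends ModRM.rm
        names = REX_8BIT
    else:
        names = LEGACY_8BIT
    return f"add %{names[src]},%{names[dst]}"


print(decode_add8(bytes.fromhex("00c4")))    # add %al,%ah
print(decode_add8(bytes.fromhex("4000c4")))  # add %al,%spl
```

Note that the ModRM byte is identical in both calls; only the presence of the prefix changes which register the value 4 refers to.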
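And for 16bit / 64bit vs. 32bit, side by side since it's so illustrative. The bracketed prefix byte is present for 16bit (66) or 64bit (48) and absent for 32bit, the offsets are given as 32bit / prefixed, and the ? stands for the register name prefix (none, e or r):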
0 : [66/48] 01 c0 add %?ax,%?ax
2/3 : [66/48] 01 c1 add %?ax,%?cx
4/6 : [66/48] 01 c2 add %?ax,%?dx
6/9 : [66/48] 01 c3 add %?ax,%?bx
8/c : [66/48] 01 c4 add %?ax,%?sp
a/f : [66/48] 01 c5 add %?ax,%?bp
c/12: [66/48] 01 c6 add %?ax,%?si
e/15: [66/48] 01 c7 add %?ax,%?di
The prefix 0x66 marks a 16bit operation, and 0x48 is one of the prefix bytes for a 64bit op (it'd be a different one if your target and/or source were one of the "new" high-numbered registers).
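As a rough sketch of how those prefix bytes come about, the following Python builds the encoding of a register-to-register `add` with opcode 0x01. It assumes a default operand size of 32bit (as in 32bit protected mode or 64bit mode), takes registers as plain numbers 0..15 (0 = ax, 4 = sp, 8 = r8 and so on), and the name `encode_add` is purely illustrative.

```python
def encode_add(src: int, dst: int, bits: int) -> bytes:
    """Encode `add %src,%dst` (opcode 0x01) for 16, 32 or 64bit operands."""
    if bits not in (16, 32, 64):
        raise ValueError("operand size must be 16, 32 or 64")
    if not (0 <= src <= 15 and 0 <= dst <= 15):
        raise ValueError("register numbers must be in 0..15")
    out = bytearray()
    if bits == 16:
        out.append(0x66)  # operand size override: 16bit instead of 32bit
    # REX is 0100WRXB: W selects 64bit operands, R and B carry the fourth
    # bit of the source and destination register numbers.
    rex = 0x40 | (int(bits == 64) << 3) | ((src >> 3) << 2) | (dst >> 3)
    if rex != 0x40:
        out.append(rex)  # only emitted when one of its bits is needed
    out.append(0x01)
    # ModRM: mod=11 (register direct), reg=source, rm=destination
    out.append(0xC0 | ((src & 7) << 3) | (dst & 7))
    return bytes(out)


print(encode_add(0, 4, 32).hex())  # 01c4    add %eax,%esp
print(encode_add(0, 4, 16).hex())  # 6601c4  add %ax,%sp
print(encode_add(0, 4, 64).hex())  # 4801c4  add %rax,%rsp
print(encode_add(8, 0, 64).hex())  # 4c01c0  add %r8,%rax
```

The last line shows the "different one" case: with a high-numbered source register the 64bit prefix becomes 0x4c instead of 0x48.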
To get back to your original question of how to access the high bits: newer CPUs have SSE instructions for the purpose. Every 8/16/32/64bit field of a vector register is separately accessible via e.g. shuffle instructions, and in fact a lot of the string manipulation code that Intel / AMD provide in their optimized libraries these days doesn't use the normal CPU registers anymore but the vector registers instead. If you need symmetry between upper / lower halves (or other fractions) of some larger value, use the vector registers.