I read that x86 CPUs have a variable instruction length of 1 to 15 bytes. On the other hand, it is also written that the x86 word size is 32 bits, which would mean that all registers, including the instruction register that holds the current instruction, are 32 bits (4 bytes) wide.

That means an instruction can be wider than the instruction register. How does this fit? Furthermore, I learned that after executing an instruction, without jumping, the instruction counter is incremented by 4. That assumes every instruction is 4 bytes long. How can this be right?

I hope that someone could clarify this for me.

Tung Nguyen
  • You're mixing MIPS (or some other nice RISC) concepts with x86 I think, there are probably some related answers – harold Jan 30 '18 at 17:19
  • It would be nice if you could link me to such answers^^. I tried to search on Google, but didn't know which term to search for. – Tung Nguyen Jan 30 '18 at 17:21
  • [Instruction decoding when instructions are length-variable](https://stackoverflow.com/q/8204086/555045) – harold Jan 30 '18 at 17:26
  • There is no "instruction register" in x86 that holds an instruction. – prl Jan 30 '18 at 18:29
  • Counterexamples for "after executing an instruction, without jumping, the instruction counter is incremented by 4": `push eax`, `inc ebx`, `nop`, `cwd`, `add dword ptr ds:[eax+8*ebx+12345678h], 9abcdefh` ... actually, trying to find an exact 4 byte instruction is harder. – Jongware Jan 30 '18 at 18:44
  • `and ax, 0x1234` in 16-bit-mode, `lock rep not eax` (cheating: unnecessary prefixes may be ignored), `bt eax, 1` (80386+) – sivizius Jan 30 '18 at 18:50
  • @sivizius: unnecessary `lock` prefixes usually fault. 4-byte instructions: `movdqa xmm0, xmm1` and *many* other SSE2 instructions, or SSE1 with `imm8`. In 64-bit mode, 4-byte instructions are common with a REX prefix + imm8: `add rsp, 8`. But in 32-bit mode: `mov eax, [esp+8]` (base=ESP requires a SIB byte). Or `lea eax, [ecx + edx*4 - 11]` (opcode + modrm + SIB + disp8). Or many instructions with a 2-byte opcode + disp8 or SIB, like `movzx eax, byte [ebx + esi]`. But yeah, 2, 3 and 5 byte instructions are more common. – Peter Cordes Jan 31 '18 at 01:10
  • _"the x86 word size is 32 bits"_ The x86 word size is 16 bits for historical reasons. See section _4.1 FUNDAMENTAL DATA TYPES_ in Intel's manual. – Michael Jan 31 '18 at 10:00
  • near-duplicate: [x86 registers: MBR/MDR and instruction registers](https://stackoverflow.com/q/51522368) explains that x86 doesn't have an "instruction register", and it wouldn't make sense for a variable-length ISA that needs complex decoding. – Peter Cordes Dec 03 '18 at 12:17

1 Answer

The x86 has a quite complex opcode parser with multiple states. First it looks for the legacy prefixes such as REP, LOCK, and the address-size and operand-size override prefixes, and probably just sets internal flags. Then it looks for mandatory prefixes and (in 64-bit mode) a REX prefix, and probably sets other internal flags. After this, the parser expects the actual opcode, or a 0x0F escape byte that selects further instructions. Even the opcode byte may contain other data, e.g. a register could be encoded there, so depending on the bit pattern of the opcode the parser has to decide whether it encodes a register (e.g. 0b000xx110: push sreg, where the two xx bits select es, cs, ss or ds) or not. Depending on the instruction, the parser then looks for a ModR/M byte and evaluates it. When the ModR/M byte indicates that there is a SIB byte too, it evaluates the SIB byte as well. Depending on the instruction, the ModR/M byte and the SIB byte, there can be a displacement and/or an immediate value at the end.
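The steps above (prefixes, optional 0x0F escape, opcode, ModR/M, SIB, displacement, immediate) can be sketched as a small length decoder. This is only an illustration, assuming 32-bit mode and a tiny hand-picked opcode table; the names `insn_length` and `walk` are made up for this example, and real hardware decodes many bytes in parallel rather than one at a time.

```python
# Legacy prefixes known to this sketch. 0x67 (address-size override) is
# deliberately left out, so ModR/M decoding below is 32-bit addressing only.
PREFIXES = {0xF0, 0xF2, 0xF3, 0x2E, 0x36, 0x3E, 0x26, 0x64, 0x65, 0x66}

# opcode -> (has_modrm, immediate kind): "b" = imm8, "v" = operand-size imm.
# Hypothetical, incomplete table covering only a few 32-bit-mode opcodes.
ONE_BYTE = {0x90: (False, None), 0x99: (False, None),   # nop, cdq/cwd
            0x89: (True, None), 0x8B: (True, None),     # mov
            0x8D: (True, None),                         # lea
            0x81: (True, "v"), 0x83: (True, "b")}       # add/or/... r/m, imm
for op in (0x06, 0x0E, 0x16, 0x1E):                     # push es/cs/ss/ds
    ONE_BYTE[op] = (False, None)
for op in range(0x40, 0x48):                            # inc r32
    ONE_BYTE[op] = (False, None)
for op in range(0x50, 0x58):                            # push r32
    ONE_BYTE[op] = (False, None)
for op in range(0xB8, 0xC0):                            # mov r32, imm32
    ONE_BYTE[op] = (False, "v")
TWO_BYTE = {0xB6: (True, None), 0xBA: (True, "b")}      # movzx, bt-group imm8


def insn_length(code, pos=0):
    """Return the length in bytes of the instruction starting at code[pos]."""
    start = pos
    opsize = 4                        # default operand size in 32-bit mode
    while code[pos] in PREFIXES:      # 1. legacy prefixes just set "flags"
        if code[pos] == 0x66:
            opsize = 2
        pos += 1
    table = ONE_BYTE
    if code[pos] == 0x0F:             # 2. escape byte: two-byte opcode map
        table = TWO_BYTE
        pos += 1
    opcode = code[pos]                # 3. the opcode itself
    pos += 1
    if opcode not in table:
        raise NotImplementedError(f"opcode {opcode:#04x} not in this sketch")
    has_modrm, imm = table[opcode]
    if has_modrm:                     # 4. ModR/M, then SIB and displacement
        modrm = code[pos]
        pos += 1
        mod, rm = modrm >> 6, modrm & 7
        if mod != 3:                  # mod == 3 means register operand
            if rm == 4:               # rm == 4 means a SIB byte follows
                sib = code[pos]
                pos += 1
                if mod == 0 and (sib & 7) == 5:
                    pos += 4          # no base register: disp32
            elif mod == 0 and rm == 5:
                pos += 4              # absolute disp32
            if mod == 1:
                pos += 1              # disp8
            elif mod == 2:
                pos += 4              # disp32
    if imm == "b":                    # 5. immediate at the end
        pos += 1
    elif imm == "v":
        pos += opsize
    if pos - start > 15:
        raise ValueError("x86 instructions are limited to 15 bytes")
    return pos - start


def walk(code):
    """List (offset, length) pairs; the offset advances by each length."""
    pos, out = 0, []
    while pos < len(code):
        n = insn_length(code, pos)
        out.append((pos, n))
        pos += n
    return out
```

For example, `walk(bytes.fromhex("90 50 8B442408 8184D878563412EFCDAB09"))` steps through `nop`, `push eax`, `mov eax, [esp+8]` and `add dword ptr [eax+8*ebx+12345678h], 9abcdefh`, so the instruction pointer advances by 1, 1, 4 and 11 bytes rather than by a fixed 4.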

I do not know how the processor actually stores all of this. Maybe there is a register for the opcode, a flag register, a register that holds the destination register number, a register for the immediate value, and some kind of representation of the addresses used.

Anyway, there is no instruction register of the kind you may have heard about for RISC processors, and even if there were one, the fact that the general-purpose registers are 64 bits wide does not mean that other registers have to be the same size. E.g. the Streaming SIMD Extensions provide xmm registers that are 128 bits wide. One of them could hold a complete, valid 15-byte x86 instruction.

You can find the structure of this parser on page 5 here.

sivizius
  • Thank you for the detailed answer, though some of the terminology is a bit over my head. So in short, there are 2 wrong assumptions in my question: 1. x86 has an instruction register (which is 32 bits wide). 2. The instruction counter is simply incremented by 4 after each instruction (x86 uses a different parsing method). I'm reading Essentials of Computer Architecture by Douglas Comer. My mistake was that I automatically assumed everything in the book applies to x86 CPUs. – Tung Nguyen Jan 30 '18 at 19:02