
I'm using i686 GCC on Windows. When I built the code with separate asm statements, it worked. However, when I try to combine them into one statement, it doesn't build and gives me an error: unsupported size for integer register.

Here's my code:

u8 lstatus;
u8 lsectors_read;
u8 data_buffer;

void operate(u8 opcode, u8 sector_size, u8 track, u8 sector, u8 head, u8 drive, u8* buffer, u8* status, u8* sectors_read)
{
    asm volatile("mov %3, %%ah;\n"
                "mov %4, %%al;\n"
                "mov %5, %%ch;\n"
                "mov %6, %%cl;\n"
                "mov %7, %%dh;\n"
                "mov %8, %%dl;\n"
                "int $0x13;\n"
                "mov %%ah, %0;\n"
                "mov %%al, %1;\n"
                "mov %%es:(%%bx), %2;\n"
                : "=r"(lstatus), "=r"(lsectors_read), "=r"(buffer)
                : "r"(opcode), "r"(sector_size), "r"(track), "r"(sector), "r"(head), "r"(drive)
                :);
    status = &lstatus;
    sectors_read = &lsectors_read;
    buffer = &data_buffer;
}
Peter Cordes
mike
  • This is 16-bit code. What options are you using to compile it? Maybe pointer width is a problem? If you remove some operands, which one causes the compile error? Also, you forgot a `"memory"` clobber, which is normally necessary for safety when passing pointers in registers into an inline asm statement that dereferences them. Also, why global variables as temporaries? And why are you assigning to by-value function args right before the end of a function? The caller won't see any effect of `status=lstatus`, that's not how C works. Did you maybe mean `*status = lstatus`? – Peter Cordes Jun 23 '20 at 19:04
  • Does gcc have an x86-16 code generator? I thought there isn’t one. – prl Jun 23 '20 at 19:05
  • @prl: mainstream i686 gcc `-m16` uses `.code16gcc` but doesn't change the ABI. So it uses 32-bit pointers and 32-bit int, so most instructions assemble with an operand-size prefix. That can maybe work, but I'm worried about what register size it's going to pick for a pointer. – Peter Cordes Jun 23 '20 at 19:06
  • Also this code still looks super broken. It doesn't set BX or ES ahead of the `int $0x13`, and it loads 2 or 4 bytes from `es:bx` after the int (instead of returning `bx` *as* a pointer; GCC doesn't support far pointers but you could ensure that ES=DS.) Fuz's answer on your previous question ([Cannot read sectors from disk in C](https://stackoverflow.com/q/62503703)) is still what you should be doing instead of this. Any time you have a mov as the first or last instruction in an asm template, you're probably doing it wrong and should use more specific constraints. – Peter Cordes Jun 23 '20 at 19:10
  • I'm confused. es:bx is set when I call the 0x13 interrupt? I don't think gcc has an x86-16 generator. – mike Jun 23 '20 at 19:16
  • Also, last time, gcc couldn't find the ES segment register even though it exists. – mike Jun 23 '20 at 19:18
  • No, ES:BX is the pointer *input* to `int 0x13` that tells it where to read into. Also, you forgot to declare clobbers so you could have the same problem as last time, with GCC picking registers you overwrite with your own asm instructions. That's another reason to use constraints to tell GCC where to put stuff in the first place: there aren't enough registers for GCC to have 8 variables in registers you don't need to overwrite inside the asm (if you do it this way). https://godbolt.org/z/oujNP7 shows what happens with the uses of `%7` and `%8` removed, avoiding this out-of-registers error. – Peter Cordes Jun 23 '20 at 19:22
  • Wait, doesn't int 0x13 write into ES:BX? When I put in clobbers, it gives me an impossible constraints error. – mike Jun 23 '20 at 19:27
  • If you are intent on trying to use GCC with 16-bit code, I might recommend that you consider getting hold of the forked ia16-gcc ELF cross compiler. It is a work in progress, but it handles 16-bit code much better than the regular GCC compiler. There is even a lib86 library that allows you to make BIOS interrupt calls through functions (although it seems to lack the ability to use 32-bit registers, which is required for a small number of BIOS services) – Michael Petch Jun 23 '20 at 19:36
  • Also, how come it works with separate asm statements? – mike Jun 23 '20 at 20:03
  • @mike: it seems the problem is actually having too many register operands for a single statement. x86's 8-bit registers overlap with full registers, so probably you're getting this error because the compiler can't find a free register that's wide enough to satisfy your input constraint. – Peter Cordes Jun 23 '20 at 20:06
  • @mike I showed you what proper code for this task looks like in your previous question. Why did you go back to this horribly broken implementation? – fuz Jun 23 '20 at 22:38
  • @mike It did not work with separate asm statements and if it appeared to work, it did so only out of coincidence. I believe you have not understood how gcc-style inline assembly actually works. – fuz Jun 23 '20 at 22:39
  • @mike Note that the BIOS call simply leaves `es:bx` untouched. You need to provide an address for the buffer in `es:bx` and the BIOS will load the data to that address. The address is still in there when the call returns which is why some resources document the call returning the address of the buffer. However, this applies to all registers except for `ax` in general. – fuz Jun 23 '20 at 22:41
  • @PeterCordes Gcc requires that `ss`, `es` and `ds` point to the same memory so no need to set `es` in the inline assembly. – Timothy Baldwin Jun 23 '20 at 22:48

1 Answer


The error message is a little misleading. It seems to be happening because GCC ran out of 8-bit registers.

Interestingly, it compiles without error messages if you just edit the template to remove references to the last 2 operands (https://godbolt.org/z/oujNP7), even without dropping them from the list of input constraints! (Trimming down your asm statement is a useful debugging technique to figure out which part of it GCC doesn't like, without caring for now if the asm will do anything useful.)

Removing 2 earlier operands and changing numbers shows that "r"(head), "r"(drive) weren't specifically a problem, just the combination of everything.

It looks like GCC is avoiding high-8 registers like AH as inputs, and x86-16 only has 4 low-8 registers but you have 6 u8 inputs. So I think GCC means it ran out of byte registers that it was willing to use.

(The 3 outputs aren't declared early-clobber so they're allowed to overlap the inputs.)


You could maybe work around this by using "rm" to give GCC the option of picking a memory input. (The x86-specific constraints like "Q" that are allowed to pick a high-8 register wouldn't help unless you require it to pick the correct one to get the compiler to emit a mov for you.) That would probably let your code compile, but the result would be totally broken.

You re-introduced basically the same bugs as before: not telling the compiler which registers you write, so for example your mov %4, %%al will overwrite one of the registers GCC picked as an input, before you actually read that operand.

Declaring clobbers on all the registers you use would not leave enough registers to hold all the input variables (unless you allow memory source operands). That could work but is very inefficient: if your asm template string starts or ends with mov, you're almost always doing it wrong.
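As a minimal sketch of letting a constraint do the work of a leading or trailing mov (this is not the disk-read code; swap_bytes is a hypothetical helper made up for illustration), the "+a" operand below asks the compiler to have the value in AX before the template runs and to read the result back from AX afterwards, so the template needs no mov at all. The x86 path is guarded by a preprocessor check, with a plain C fallback for other targets.

```c
#include <stdint.h>

/* Hypothetical helper: swap the two bytes of a 16-bit value. */
static uint16_t swap_bytes(uint16_t v)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* "+a" = read/write operand that must live in the A register (AX here).
       The compiler emits whatever moves are needed; the template has none.
       xchg does not modify flags and touches no other register or memory,
       so no clobbers are required. */
    __asm__("xchg %%al, %%ah" : "+a"(v));
    return v;
#else
    /* Portable fallback with the same result. */
    return (uint16_t)((v << 8) | (v >> 8));
#endif
}
```

Because the only register the template touches is the one named by the constraint, the compiler knows exactly what is read and written, which is the property the original statement lacked.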

Also, there are other serious bugs, apart from how you're using inline asm. You don't supply an input pointer to your buffer. int $0x13 doesn't allocate a new buffer for you; it needs a pointer in ES:BX (which it dereferences but leaves unmodified). GCC requires that ES=DS=SS, so you must already have set up segmentation properly before calling into your C code; it isn't something you have to do on every call.

Plus, even in C terms outside the inline asm, your function doesn't make sense. status = &lstatus; modifies the value of a function arg instead of dereferencing it to modify a pointed-to output variable. The variables written by those assignments die at the end of the function, so the caller never sees them. But the global temporaries do have to be updated because they're global and some other function could see their value. Perhaps you meant something like *status = lstatus; with different types for your vars?
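The difference between assigning to the pointer argument and storing through it can be seen in plain C, with no asm involved. This is only a sketch: report_broken and report_fixed are hypothetical names, and the static temporary stands in for the question's global lstatus.

```c
typedef unsigned char u8;

/* Broken: mirrors `status = &lstatus;`. Only this function's private copy
   of the pointer is changed, so the caller's variable is never written. */
static void report_broken(u8 *status, u8 value)
{
    static u8 temp;     /* stands in for the global temporary */
    temp = value;
    status = &temp;     /* reassigns the by-value argument */
    (void)status;       /* the new pointer value dies at function exit */
}

/* Fixed: mirrors `*status = lstatus;`. Stores through the pointer, so the
   caller's variable receives the value. */
static void report_fixed(u8 *status, u8 value)
{
    *status = value;
}
```

Calling report_broken(&s, 1) leaves s unchanged, while report_fixed(&s, 1) sets s to 1.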

If that C problem isn't obvious (at least once it's pointed out), you need some more practice with C before you're ready to try mixing C and asm, which requires you to understand both very well in order to correctly describe your asm to the compiler with accurate constraints.


A good and correct way to implement this is shown in @fuz's answer to your previous question. If you want to understand how the constraints can replace your mov instructions, compile it and look at the compiler-generated instructions. See https://stackoverflow.com/tags/inline-assembly/info for links to guides and docs. e.g. @fuz's version without the ES setup (because GCC needs you to have done that already before calling any C):

typedef unsigned char u8;
typedef unsigned short u16;

// Note the different signature, and using the output args correctly.
void read(u8 sector_size, u8 track, u8 sector, u8 head, u8 drive,
    u8 *buffer, u8 *status, u8 *sectors_read)
{
    u16 result;

    asm volatile("int $0x13"
        : "=a"(result)
        : "a"(0x200|sector_size), "b"(buffer),
          "c"(track<<8|sector), "d"(head<<8|drive)
        : "memory" );  // memory clobber was missing from @fuz's version

    *status = result >> 8;
    *sectors_read = result >> 0;
}

Compiles as follows, with GCC 10.1 -O2 -m16 on Godbolt:

read:
        pushl   %ebx
        movzbl  12(%esp), %ecx
        movzbl  16(%esp), %edx
        movzbl  24(%esp), %ebx      # load some stack args
        sall    $8, %ecx
        movzbl  8(%esp), %eax
        orl     %edx, %ecx          # shift and merge into CL,CH instead of writing partial regs
        movzbl  20(%esp), %edx
        orb     $2, %ah
        sall    $8, %edx
        orl     %ebx, %edx
        movl    28(%esp), %ebx     # the pointer arg
        int $0x13                  # from the inline asm statement
        movl    32(%esp), %edx     # load output pointer arg
        movl    %eax, %ecx
        shrw    $8, %cx
        movb    %cl, (%edx)
        movl    36(%esp), %edx
        movb    %al, (%edx)
        popl    %ebx
        ret

It might be possible to use register u8 track asm("ch") or something to get the compiler to just write partial regs instead of shift/OR.
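The shift/OR packing that feeds the "c" and "d" constraints, and the splitting of AX into status and sector count afterwards, can be checked in plain C without any BIOS call. This is a sketch only; pack_hi_lo, high_byte, and low_byte are hypothetical helper names, not part of the code above.

```c
#include <stdint.h>

/* Build a 16-bit register value from its high and low 8-bit halves,
   e.g. CX from CH=track and CL=sector. */
static uint16_t pack_hi_lo(uint8_t hi, uint8_t lo)
{
    return (uint16_t)((hi << 8) | lo);
}

/* Split a 16-bit result such as AX back into its halves (AH and AL). */
static uint8_t high_byte(uint16_t reg)
{
    return (uint8_t)(reg >> 8);
}

static uint8_t low_byte(uint16_t reg)
{
    return (uint8_t)(reg & 0xFF);
}
```

For example, pack_hi_lo(0x02, 0x01) gives 0x0201, which is the same value as 0x200|sector_size with sector_size equal to 1.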


If you don't want to understand how constraints work, don't use GNU C inline asm. You could instead write stand-alone functions that you call from C, which accept args according to the calling convention the compiler uses (e.g. gcc -mregparm=3, or just everything on the stack with the traditional inefficient calling convention.)

You could do a better job than GCC's above code-gen, but note that the inline asm could optimize into surrounding code and avoid some of the actual copying to memory for passing args via the stack.

Peter Cordes
  • Thanks! I tried out fuz's answer. GCC doesn't recognize the register es in the clobber list though. – mike Jun 24 '20 at 00:21
  • @mike: If that's the case for the GCC you're using, then it definitely requires ES=DS=SS anyway, and you should use the version I was already adding to this answer when you commented. But seriously, consider playing with something simpler and better supported than 16-bit x86 asm when you're still making plain-C beginner mistakes like `status = &lstatus;`. It's normal to make mistakes when learning a language, but also a good idea to limit complexity while you're learning. GNU C inline asm is very very hard to get right, you really have to understand asm *and* C, and is bad for learning either. – Peter Cordes Jun 24 '20 at 00:27
  • I have a more advanced version of that function in diskbios.h in this answer. It handles retries, returns the carry flag and status, handles some buggy BIOSes, and uses a data structure rather than passing a large number of parameters. https://stackoverflow.com/a/52047408/3857942 – Michael Petch Jun 24 '20 at 03:17
  • @mike If the code I post doesn't work, why haven't you told me? I wrote it assuming ia16-gcc because you didn't say which gcc version you were using. Receiving help is a two way street. By refusing to communicate, you only make this more difficult for yourself and waste a bunch of people's time. Don't do that. – fuz Jun 24 '20 at 16:54