Ok, so I've been experimenting with 32-bit protected mode, and am trying to get back to 16-bit real mode. Of course, when you clear the protected mode bit (PE) in the CR0 register, you end up in a state somewhere between 32-bit protected mode and 16-bit real mode; I think this is called 32-bit real mode. That state lasts until you run a far instruction that reloads CS (which puts the CPU back into 16-bit real mode). In my case, since I got into protected mode with a far CALL instruction, I should be returning with a RETF instruction. However, while the initial call from 16-bit real mode put 4 bytes on the stack (16-bit segment and 16-bit offset), the return to 16-bit real mode (because the RETF instruction runs in 32-bit real mode) should expect 6 bytes on the stack (16-bit segment and 32-bit offset). I've already compensated for this by adjusting the size of the values on the stack before the RETF instruction.
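To make that concrete, here's roughly what I mean (a NASM-style sketch, not my exact code; `pm_entry`, `REAL_SEG`, and `back_in_real` are placeholder names):

```nasm
; 16-bit real mode: far call into the protected-mode entry point.
; This pushes a 16-bit CS and a 16-bit IP -- 4 bytes total.
bits 16
    call far [pm_entry]        ; pm_entry: dw offset, dw selector

; ...later, with PE just cleared ("32-bit real mode"), I return with
; a 32-bit RETF. I expected it to pop a 32-bit offset and a 16-bit
; segment -- 6 bytes -- so I widened the frame to match:
bits 32
    push word REAL_SEG         ; 16-bit segment
    push dword back_in_real    ; 32-bit offset
    retf                       ; expected to consume 6 bytes
```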
Now I've tested my code in DosBox, but something strange is going on. Instead of using only 6 bytes for the RETF instruction, it seems to use 8 bytes (32-bit segment and 32-bit offset). The result is that it kept getting my stack out of sync, even though it put the instruction pointer at the right location after the RETF. It's like it popped 2 extra bytes off the stack that weren't even used for anything. This doesn't make any sense to me: the segment part of an address is never a 32-bit number. I managed to fix it by pushing the segment onto the stack as a 32-bit number (and of course keeping the offset as 32 bits as well). But from my understanding, this extra fix (using 8 bytes on the stack for a 32-bit RETF) shouldn't be needed.

Is this just a DosBox bug, or is this behavior correct on real hardware too? Does the fact that the RETF instruction is executed from this strange 32-bit real mode, while the destination is a different bitness (16-bit real mode), actually explain the behavior I'm seeing?
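For reference, the workaround that actually keeps my stack in sync looks like this (again just a sketch; `REAL_SEG` and `back_in_real` are placeholders):

```nasm
; 32-bit code, PE already cleared. Pushing the segment as a full
; dword (zero-extended) makes the frame 8 bytes, which is what the
; 32-bit RETF in DosBox actually consumes.
bits 32
    push dword REAL_SEG        ; segment, zero-extended to 32 bits
    push dword back_in_real    ; 32-bit offset
    retf                       ; pops 8 bytes: dword offset + dword segment
```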