x86 Call to Interrupt Procedure (INT imm8) has improper definition for 16-bit real mode #7342

Open
swine-flu opened this issue Jan 2, 2025 · 4 comments
Assignees
Labels
Feature: Processor/x86 Status: Triage Information is being gathered

Comments

@swine-flu

Describe the bug
The stack is not properly cleaned up after a call to this procedure in an x86-16 real-mode application. The improper definition of INT imm8 seems to be what is causing the issue:

:INT imm8 is vexMode=0 & byte=0xcd; imm8 { tmp:1 = imm8; intloc:$(SIZE) = swi(tmp); call [intloc]; }

To Reproduce
Placing an interrupt call near the beginning of a function that contains local variables makes this bug apparent. According to the decompiler output, arbitrary modifications are made to the local variables at the top of the stack, although nothing in the assembly code relates to these changes.

Expected behavior
I'm not expecting the INT imm8 definition to perfectly replicate its actual real-mode semantics, i.e. storing the flags on the stack before the call and using the IDT to resolve the address. Performing something semantically close to a far call should suffice, I presume. For the time being I ended up with the following band-aid fix, which saves SP before the call and restores it afterwards:

:INT imm8       is vexMode=0 & byte=0xcd; imm8                      { tmp:1 = imm8; intloc:$(SIZE) = swi(tmp); tmp2:2 = SP; call [intloc]; SP = tmp2; }

Screenshots
Here is the comparison of two decompilations - with and without a fix:
INT_imm8

Environment (please complete the following information):

  • OS: Windows 10
  • Java Version: 21.0.5
  • Ghidra Version: 11.3
  • Ghidra Origin: locally built
@GhidorahRex
Collaborator

The only change here is restoring the stack pointer. I don't know enough about how the stack pointer works with interrupts in x86; should this be done automatically? It might be worth taking a closer look at interrupt stack behavior.

We would almost certainly want to save and restore the entire stack pointer (e.g. RSP/ESP), and should definitely include the return address in the push.

@swine-flu
Author

I'm not proficient enough to give a qualified answer on that matter. I have only read a brief description of this instruction and haven't looked into a proper manual. My only concern is that the current p-code definition of this instruction corrupts the stack, according to the decompiler output.

Nonetheless, a proper implementation of this instruction for 16-bit real mode should follow this logic: a call to an interrupt procedure (INT imm8 in particular) is similar to a far call, i.e. it stores a 32-bit address consisting of the current code segment and the next instruction pointer (the CS:IP pair) on the stack; however, it also pushes the FLAGS register onto the stack beforehand. It is the IRET instruction's responsibility to perform the far return, restore the flags, and clean up the stack before code execution resumes. I've also been able to observe and confirm this exact behavior in the DOSBox debugger.
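The INT/IRET sequence described above can be modeled in a few lines of Python to see why SP ends up unchanged once the handler returns. This is only a sketch and not SLEIGH p-code: the `RealModeCpu` class, its initial register values, and the handler address are all hypothetical, and the handler lookup assumes the usual real-mode layout of 4-byte offset:segment entries starting at address 0.

```python
from dataclasses import dataclass, field

IF_FLAG = 0x0200  # interrupt enable flag
TF_FLAG = 0x0100  # trap flag


@dataclass
class RealModeCpu:
    """Toy model of the state touched by INT imm8 and IRET (hypothetical)."""
    cs: int = 0x1000
    ip: int = 0x0100
    ss: int = 0x2000
    sp: int = 0xFFFE
    flags: int = 0x0202
    mem: bytearray = field(default_factory=lambda: bytearray(1 << 20))

    def _linear(self, seg: int, off: int) -> int:
        # segment * 16 + offset, wrapped to the 1 MiB real-mode address space
        return ((seg << 4) + (off & 0xFFFF)) & 0xFFFFF

    def read16(self, seg: int, off: int) -> int:
        lo = self.mem[self._linear(seg, off)]
        hi = self.mem[self._linear(seg, off + 1)]
        return lo | (hi << 8)

    def write16(self, seg: int, off: int, value: int) -> None:
        self.mem[self._linear(seg, off)] = value & 0xFF
        self.mem[self._linear(seg, off + 1)] = (value >> 8) & 0xFF

    def push16(self, value: int) -> None:
        self.sp = (self.sp - 2) & 0xFFFF
        self.write16(self.ss, self.sp, value)

    def pop16(self) -> int:
        value = self.read16(self.ss, self.sp)
        self.sp = (self.sp + 2) & 0xFFFF
        return value

    def int_imm8(self, vector: int, insn_len: int = 2) -> None:
        """Push FLAGS, CS and the return IP, then jump through the vector."""
        return_ip = (self.ip + insn_len) & 0xFFFF
        self.push16(self.flags)
        self.push16(self.cs)
        self.push16(return_ip)
        self.flags &= ~(IF_FLAG | TF_FLAG) & 0xFFFF
        self.ip = self.read16(0, vector * 4)      # handler offset
        self.cs = self.read16(0, vector * 4 + 2)  # handler segment

    def iret(self) -> None:
        """Pop IP, CS and FLAGS in the reverse order of int_imm8."""
        self.ip = self.pop16()
        self.cs = self.pop16()
        self.flags = self.pop16()


if __name__ == "__main__":
    cpu = RealModeCpu()
    cpu.write16(0, 0x10 * 4, 0x1234)      # hypothetical handler offset
    cpu.write16(0, 0x10 * 4 + 2, 0xF000)  # hypothetical handler segment
    sp_before = cpu.sp
    cpu.int_imm8(0x10)
    assert cpu.sp == (sp_before - 6) & 0xFFFF  # three words pushed
    cpu.iret()
    assert cpu.sp == sp_before  # the caller sees SP unchanged
```

From the caller's point of view the three pushes and three pops cancel out, which is the net effect the `tmp2:2 = SP; ... SP = tmp2;` band-aid approximates without modeling the pushes themselves.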

Considering that this instruction behaves quite differently in the various CPU modes, it's probably worth adding a distinct definition for real mode and keeping the old definition intact for the time being:

:INT imm8       is protectedMode=0 & vexMode=0 & byte=0xcd; imm8                      { ... }
:INT imm8       is protectedMode=1 & vexMode=0 & byte=0xcd; imm8                      { tmp:1 = imm8; intloc:$(SIZE) = swi(tmp); call [intloc]; }

@ABratovic

ABratovic commented Jan 8, 2025

@swine-flu, I believe your band-aid to the INT p-code doesn't deliver the result you're actually looking for: your decompilation screenshots still show two calls to INT 0x10, but not the actual INT function and subfunctions in the AH and AL registers.
You could try to extend or modify the interrupt p-code further to do so, but I suggest first trying a syscall analyzer such as this Ghidra extension that I found at https://github.com/Gravelbones/GhidraDosToolbox, which does the following, and seeing how much better your assembler and decompiler output looks with and without your proposed band-aid.

  • This Analyzer finds x86 INT instructions (opcode CD) and then maps these instructions to function prototypes.
  • This was the way DOS programs and, to some extent, Win3 made system calls and interfaced with BIOS functions such as memory, disk, and screen functions.
  • Most commonly the AH register will contain the main function within the interrupt, and in some cases AL will contain a subfunction. In some cases other registers will contain the subfunction.
  • The analyzer will read a file with a mapping of interrupt, function and subfunction to a function name.

For example
image
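To make the mapping idea concrete, here is a minimal Python sketch of looking up a name from the interrupt number, AH and AL. It does not reflect how GhidraDosToolbox is actually implemented or what its mapping file looks like; the table layout, the `resolve_interrupt` helper, and the function names are all hypothetical. Only the interrupt/function numbers themselves (INT 10h AH=00h, INT 21h AH=09h, AH=4Ch, AH=44h AL=00h) are standard BIOS/DOS services.

```python
from typing import Dict, Optional, Tuple

# (interrupt, AH, AL); None means "any value" for that register
Key = Tuple[int, Optional[int], Optional[int]]

# Hypothetical mapping table; the names are illustrative labels only.
SYSCALL_NAMES: Dict[Key, str] = {
    (0x10, 0x00, None): "bios_set_video_mode",
    (0x21, 0x09, None): "dos_print_string",
    (0x21, 0x4C, None): "dos_terminate",
    (0x21, 0x44, 0x00): "dos_ioctl_get_device_info",
}


def resolve_interrupt(vector: int,
                      ah: Optional[int] = None,
                      al: Optional[int] = None) -> str:
    """Return the most specific name known for an INT call site."""
    # Try the exact subfunction first, then the AH-only entry,
    # then an entry for the interrupt as a whole.
    for key in ((vector, ah, al), (vector, ah, None), (vector, None, None)):
        if key in SYSCALL_NAMES:
            return SYSCALL_NAMES[key]
    # Nothing known: fall back to a generic label.
    return f"int_{vector:02x}h"


if __name__ == "__main__":
    print(resolve_interrupt(0x21, 0x09, 0x24))
    print(resolve_interrupt(0x21, 0x44, 0x00))
    print(resolve_interrupt(0x33))
```

Trying the most specific key first lets an AH-only entry cover every AL value while still allowing individual subfunctions to be named separately.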

@swine-flu
Author

@ABratovic, all the necessary information about interrupts is already displayed as comments in my project. As a matter of fact, the vast majority of them are hidden behind the Turbo Pascal API and its RTL, so not having them substituted with pseudofunctions doesn't bother me much.

Better support for BIOS and DOS interrupts has already been discussed, see #2266. Implementing such a feature is a tall order, and I'm not expecting to see anything like that any time soon.
