Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SZ_GetSpace: Fix reversing error & buffer overflow #1014

Closed
wants to merge 1 commit into from
Closed

SZ_GetSpace: Fix reversing error & buffer overflow #1014

wants to merge 1 commit into from

Conversation

anzz1
Copy link
Contributor

@anzz1 anzz1 commented Jan 19, 2024

edit: tl;dr; as it becomes clear in the posts further below, this PR doesn't actually fix the problem.


SZ_GetSpace: Fix reversing error & buffer overflow

This hopefully fixes the crash following soon after SZ_GetSpace: overflow on netchan->message and other SZ_GetSpace related errors.

Also I have a question about previous implementation,
where did these come from?

https://github.com/dreamstalker/rehlds/blob/90cb97ddae6cd7c66b08987d4d8049cb9fbcc76d/rehlds/engine/common.cpp#L1189-L1192

and

https://github.com/dreamstalker/rehlds/blob/90cb97ddae6cd7c66b08987d4d8049cb9fbcc76d/rehlds/engine/common.cpp#L1243

These are not present in the original engine, nor are they wrapped in #ifdef REHLDS_FIXES. It seems they appear out of thin air.

@anzz1
Copy link
Contributor Author

anzz1 commented Jan 19, 2024

It's pretty clear when you look at the previous (ReHLDS) func implementation that it's wrong.

Not only it does not match the original (HLDS) SZ_GetSpace function, but it doesn't function correctly.

The return value is same whether the sizebuf overflowed or not.
Ultimately the only thing that changes between the normal and overflowed codepaths are the prints.

The changes that are different end up doing nothing:

SZ_Clear unsets the SIZEBUF_OVERFLOWED flag, which is directly set again afterwards.
https://github.com/dreamstalker/rehlds/blob/90cb97ddae6cd7c66b08987d4d8049cb9fbcc76d/rehlds/engine/common.cpp#L1243-L1244

SZ_Clear also sets buf->cursize to 0, but it doesn't matter as its overwritten on the next line.
https://github.com/dreamstalker/rehlds/blob/90cb97ddae6cd7c66b08987d4d8049cb9fbcc76d/rehlds/engine/common.cpp#L1248

@s1lentq
Copy link
Collaborator

s1lentq commented Jan 19, 2024

There is no reverse mistake here, and the code behavior is exactly the same, except for the lack of REHLDS_FIXES to check negative length

@anzz1
You may be misled by looking at disassembly code as it, but when combined with DWARF and take into account the omitted code by optimizing the compiler, the code looks exactly as it should be

@anzz1
Copy link
Contributor Author

anzz1 commented Jan 19, 2024

I am sorry, but you are wrong.
Please check again the original engine_i486.so and you can confirm my findings are true.
I am using the DWARF debug info, and I am aware how compilers and optimizations work.
I know how to reverse, this is not copy-paste from a disassembler, I have corrected the function flow to what it should be so the resulting functionality is identical to original.

You can use either the build 6153 rehlds was originally based on, or the newer 8684 pre-anniversary one, since the SZ_GetSpace function is identical on both.

First of all, the negative check does not exist. In fact, the string "%s: %i negative length on %s" does not exist anywhere in the binary, you can confirm by searching for the word "negative" 6E 65 67 61 74 69 76 65. It is simply not there.

Secondly, the SZ_Clear function is never called anywhere in the SZ_GetSpace function, in any case.

Thirdly, the logic for print "full buffer size on %s" or "full buffer size on %s, ignoring" is wrong, one or the other or none must be printed but never both of them.

But the real problem here is that the check for if (cursize + length <= buf->maxsize) is completely missing, which cause the function to return out-of-bounds pointer.

To simplify, this is how function should work:

If overflowed:
Set buf->cursize to length.
Add flag SIZEBUF_OVERFLOWED to buf->flags
Return pointer to buf->data

If not overflowed:
Set buf->cursize to buf->cursize + length
Return pointer to buf->data + cursize (note: the previous cursize, not cursize+length !)

But currently it does not have the "If overflowed" part.
It always return the buf->data + cursize, even if overflowed. This is NOT correct, and returns OOB pointer.
This is also why the resulting crash is so random, and lead people to try wrong fixes for the cause like #907. Since the returned sizebuf_t data now points to OOB memory, it can corrupt whatever happens to be there and the problems it cause might be crash or something else, and hard to understand, like is usual with buffer overflows.

@s1lentq
Copy link
Collaborator

s1lentq commented Jan 19, 2024

First of all, the negative check does not exist. In fact, the string "%s: %i negative length on %s" does not exist anywhere in the binary, you can confirm by searching for the word "negative" 6E 65 67 61 74 69 76 65. It is simply not there.

I didn't contradict that, I know it doesn't exist and I mentioned it above, that it really isn't in the vanilla code and should have been wrapped in REHLDS_FIXES macro.

Secondly, the SZ_Clear function is never called anywhere in the SZ_GetSpace function, in any case.

You can check with dwarf again and use the right tools to fully analyze them, and you will notice that SZ_Clear was used in the vanilla code, and the fact that you don't see this function in the disassembly code means about compiler optimization, because GCC using more optimizations of compiler (like -O3, -ipo etc) by compared windows compiler VC++ (/O2 /Os /GL etc)

/* <11cdc> ../engine/common.cpp:1777 */
void *SZ_GetSpace(sizebuf_t *buf, int length)
{
//	void *data;
//	SZ_Clear(sizebuf_t *buf);
}

If you still have doubts, u can take a look at windows binaries any build, which uses less optimization and cleaner disassembler (pseudocode) code

SZ_GetSpace
void *__cdecl SZ_GetSpace(sizebuf_t *buf, int length)
{
	bool v3; // zf
	const char *v4; // eax
	const char *v5; // eax
	const char *v6; // eax
	int v7; // ecx
	void *result; // eax

	if ( length + buf->cursize > buf->maxsize )
	{
		if ( !(buf->flags & 1) )
		{
			v4 = buf->buffername;
			if ( buf->maxsize == 0 )
			{
				if ( !v4 )
					v4 = "???";

				Sys_Error("SZ_GetSpace:	Tried to write to an uninitialized sizebuf_t: %s", v4);
			}

			if ( !v4 )
				v4 = "???";

			Sys_Error("SZ_GetSpace: overflow without FSB_ALLOWOVERFLOW set on %s", v4);
		}

		if ( length > buf->maxsize )
		{
			v5 = buf->buffername;
			if ( !(buf->flags & 1) )
			{
				if ( !v5 )
					v5 = "???";

				Sys_Error("SZ_GetSpace: %i is > full buffer size on %s", length, v5);
			}

			if ( !v5 )
				v5 = "???";

			Con_DPrintf("SZ_GetSpace: %i is > full buffer size on %s, ignoring", length, v5);
		}

		v6 = buf->buffername;
		if ( !buf->buffername )
			v6 = "???";

		Con_Printf("SZ_GetSpace: overflow on %s\n", v6);
		SZ_Clear(buf);
		buf->flags |= 2u;
	}
	v7 = buf->cursize;
	result = &buf->data[v7];
	buf->cursize = length + v7;
	return result;
}

Thirdly, the logic for print "full buffer size on %s" or "full buffer size on %s, ignoring" is wrong, one or the other or none must be printed but never both of them.

This does not change anything, Sys_Error has the attribute noreturn, which means that we will never return to the function SZ_GetSpace after call (unreachable code), therefore logic same and there will never be a print of both at the same time.

Anyway, let's leave the point about what a function looks like in vanilla code and turn to the main question, the logic of the code.

But currently it does not have the "If overflowed" part.
It always return the buf->data + cursize, even if overflowed.

No, it is not

If overflowed:
Set buf->cursize to length.
Add flag SIZEBUF_OVERFLOWED to buf->flags
Return pointer to buf->data

If not overflowed:
Set buf->cursize to buf->cursize + length
Return pointer to buf->data + cursize (note: the previous cursize, not cursize+length !)

Following your point of view about the logic of the code, there are no contradictions in the current code, it still works as you expected, and for some reason, you ignore the fact that cursize is always 0 in overflow case after call SZ_Clear, therefore it is equivalent to returning data = &buf->data[0];

@anzz1
Copy link
Contributor Author

anzz1 commented Jan 19, 2024

Oh right, I don't have any DWARF comments only the function names, types and so.

Yes, I only checked the Linux version, indeed you are correct that SZ_Clear is called on the Windows swds.dll version.

So the only part that is wrong is the "%s: %i is > full buffer size on %s" / "%s: %i is > full buffer size on %s, ignoring" part as only one of them must be called, but this shouldn't cause any problems. I don't know which decompiler you are using, but at least IDA and Hex-Rays C decompiler does produce incorrect result here when looking at the Windows binary, while the Linux one produces correct result. The produced output from swds.dll looks what you posted above, but is incorrect. If you look at the graph view and then text view, you see it lose the jmp instruction after Sys_Error call since it thinks its a noreturn function. This produces wrong graph and wrong pseudocode.

Meanwhile the graph & pseudocode on the linux one is correct, and it's either/neither, but not both.
image

So this bug unfortunately still eludes us. What is clear though is that after a SZ_GetSpace overflow happens, crash or other undefined behaviour will happen later. The cause of it is just somewhere else, not here.

@anzz1 anzz1 closed this Jan 19, 2024
@anzz1
Copy link
Contributor Author

anzz1 commented Jan 19, 2024

I guess I could still reopen this since this way its more correct, with the Printf paths bifurcated.

But like you said, it ultimately only changes the print part and removes the unnecessary SZ_Clear doing &~ SIZEBUF_OVERFLOWED , | SIZEBUF_OVERFLOWED dance. Setting the length to 0 in SZ_Clear makes length+cursize == length+0, which I missed. So when it comes to the overflow bug, this PR has no real effect.

Was this bug ever fixed in the original HLDS? As the two possibilities are that either a similar reversing error due to incorrect output or whatever happened in ReHLDS, or that this overflow bug still exist in original HLDS too.

I try to save the stack information next time I encounter it if there's anything helpful there, but as the underlying error could have happened way earlier than the crash itself it might be of no help. We'll see, the bug leading to a crash is unfortunately pretty rare and thus hard to catch as the server could run for weeks without encountering it.

@anzz1 anzz1 reopened this Jan 19, 2024
@s1lentq
Copy link
Collaborator

s1lentq commented Jan 19, 2024

If you look at the graph view and then text view, you see it lose the jmp instruction after Sys_Error call since it thinks its a noreturn function

Because, if the function Sys_Error with this attribute noreturn actually returns then it is UB (behavior is undefined)
Therefore the compilers, such as VC++ and GCC, may handle such scenarios differently and own discretion, and that's why leading to variations in code instructions.

I'm still convinced that it doesn't matter which instructions follow the Sys_Error call if the function, marked as noreturn, guarantees the exit point of the app.

@anzz1
Copy link
Contributor Author

anzz1 commented Jan 19, 2024

Oh yeah, if I followed further, I would've seen that Sys_Error indeed is a noreturn function, instead of being errorneously marked as such. Well, there is a path from which it could return in theory, if (!bReentry_14510 && !svs.dll_initialized && !g_bIsDedicatedServer), but in reality this probably could never be the case.

God damn it, I thought that this issue would be finally solved.
I should've known that the fix couldn't have been this easy...

Exactly as you said, I was confused after all. 😅

Well, you live to learn, I guess.

@anzz1 anzz1 closed this Jan 19, 2024
@s1lentq s1lentq removed the Status: 🚧 in progress Task currently being worked on. label Jan 19, 2024
@lspublic
Copy link

lspublic commented Feb 4, 2024

Anzz1, this problem what you fixed for sz getspace where we can download beacuse i have problem in my server

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Engine: 📀 reHLDS Case refers to reHLDS engine.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants