-
Notifications
You must be signed in to change notification settings - Fork 550
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Possible UB when setting skill level #1677
Comments
Adding
should do the trick (not tested). Do you want to send a PR? |
I think the above code is wrong. A gameskill of 0 corresponds to sk_baby, which is already covered in the first if statement. Also, a strict equality is insufficient, because the player can pass any char below '1'. So something like this should be enough: else if (gameskill < sk_baby) // avoid undefined behavior (left shift by negative value) in the else branch bit = 0; Also note that the negative gameskill condition is not thoroughly tested in the code, so there may be some more UB lurking around. An even deeper problem is that original DOOM code also has this UB problem, so we must be sure that adding this check won't deviate too much from Chocolate DOOM philosophy of preserving the original behaviour. |
The original compiler probably just passed the value directly to the machine instruction which would have treated it as unsigned and only looked at the low bits. You'd have to check the original binary to make sure. If that is indeed the case then the 100% compatible thing would be to do the same, cast to unsigned, subtract 1 and use only the lowest bits to rotate. |
I have the original installer, which I use to test some use cases with DOSBox. Any tools I can use to check if the compiled code treat it as unsigned? |
Not really. You'd have to inspect the assembly which means either reverse engineering the original binary or finding matching version of the old Watcom compiler and writing some test programs. |
My guess is also that the |
Okay, here is what I did:
#include main() { int bit; int i; for (i = 1; i <= 10; i++) { bit = 1 << -i; printf("bit = %d\n", bit); } }
#include #include print_bin(int num) { int bits_size = 8 * sizeof(int); char *p = calloc(bits_size, 1); int i; for (i = 0; i < bits_size; i++) { p[i] = num & 0x1; num = num >> 1; } printf("0b"); for (i = bits_size - 1; i > -1; i--) { printf("%d", p[i]); } printf("\n"); free(p); } main() { int bit; int i; for (i = 1; i <= 10; i++) { bit = 1 << -i; print_bin(bit); } }
ConclusionConsidering the skill hack was meant to take place when the user run the program with else if (gameskill < sk_baby) // avoid undefined behavior (left shift by negative value) in the else branch bit = 0; |
What happens when you let |
Can you dump the generated assembly? I'm not sure if wcl386 has a switch for that but there should be a debugger or disassembler which should be able to get it from the object file. |
I wrote the program to loop from DOS/4GW Protected Mode Run-time Version 1.6 Copyright (c) Rational Systems, Inc. 1990-1992 0b10000000000000000000000000000000 0b01000000000000000000000000000000 0b00100000000000000000000000000000 0b00010000000000000000000000000000 0b00001000000000000000000000000000 0b00000100000000000000000000000000 0b00000010000000000000000000000000 0b00000001000000000000000000000000 0b00000000100000000000000000000000 0b00000000010000000000000000000000 0b00000000001000000000000000000000 0b00000000000100000000000000000000 0b00000000000010000000000000000000 0b00000000000001000000000000000000 0b00000000000000100000000000000000 0b00000000000000010000000000000000 0b00000000000000001000000000000000 0b00000000000000000100000000000000 0b00000000000000000010000000000000 0b00000000000000000001000000000000 0b00000000000000000000100000000000 0b00000000000000000000010000000000 0b00000000000000000000001000000000 0b00000000000000000000000100000000 0b00000000000000000000000010000000 0b00000000000000000000000001000000 0b00000000000000000000000000100000 0b00000000000000000000000000010000 0b00000000000000000000000000001000 0b00000000000000000000000000000100 0b00000000000000000000000000000010 0b00000000000000000000000000000001 0b10000000000000000000000000000000 0b01000000000000000000000000000000 0b00100000000000000000000000000000 0b00010000000000000000000000000000 0b00001000000000000000000000000000 0b00000100000000000000000000000000 0b00000010000000000000000000000000 0b00000001000000000000000000000000 0b00000000100000000000000000000000 0b00000000010000000000000000000000 0b00000000001000000000000000000000 0b00000000000100000000000000000000 0b00000000000010000000000000000000 0b00000000000001000000000000000000 0b00000000000000100000000000000000 0b00000000000000010000000000000000 0b00000000000000001000000000000000 0b00000000000000000100000000000000 0b00000000000000000010000000000000 0b00000000000000000001000000000000 0b00000000000000000000100000000000 0b00000000000000000000010000000000 0b00000000000000000000001000000000 0b00000000000000000000000100000000 0b00000000000000000000000010000000 0b00000000000000000000000001000000 0b00000000000000000000000000100000 0b00000000000000000000000000010000 0b00000000000000000000000000001000 0b00000000000000000000000000000100 0b00000000000000000000000000000010 0b00000000000000000000000000000001 As you can see, the left shift operator works |
I will check if there is a option to generate the associated assembly code. |
Okay, I managed to generate de assembly code running the following command - /l[=list_file] generate listing file - /s[=source_file] include source lines - /e generate list of externs - /p generate list of publics - /r registers in upper case Here is the assembly code: Module: ub.c Group: 'DGROUP' CONST,CONST2,_DATA,_BSS Segment: '_TEXT' BYTE USE32 00000096 bytes 0000 53 print_bin_ push EBX 0001 51 push ECX 0002 52 push EDX 0003 56 push ESI 0004 57 push EDI 0005 89 c3 mov EBX,EAX 0007 bf 20 00 00 00 mov EDI,00000020H 000c 89 f8 mov EAX,EDI 000e ba 01 00 00 00 mov EDX,00000001H 0013 e8 00 00 00 00 call calloc_ 0018 89 c1 mov ECX,EAX 001a 31 d2 xor EDX,EDX 001c eb 0c jmp L2 001e 88 d8 L1 mov AL,BL 0020 24 01 and AL,01H 0022 8d 34 11 lea ESI,[ECX+EDX] 0025 88 06 mov [ESI],AL 0027 d1 fb sar EBX,1 0029 42 inc EDX 002a 39 fa L2 cmp EDX,EDI 002c 7c f0 jl L1 002e 68 00 00 00 00 push offset L6 0033 e8 00 00 00 00 call printf_ 0038 83 c4 04 add ESP,00000004H 003b 8d 57 ff lea EDX,-1H[EDI] 003e eb 15 jmp L4 0040 8d 1c 11 L3 lea EBX,[ECX+EDX] 0043 0f b6 1b movzx EBX,byte ptr [EBX] 0046 53 push EBX 0047 68 03 00 00 00 push offset L7 004c e8 00 00 00 00 call printf_ 0051 83 c4 08 add ESP,00000008H 0054 4a dec EDX 0055 83 fa ff L4 cmp EDX,0ffffffffH 0058 7f e6 jg L3 005a 68 06 00 00 00 push offset L8 005f e8 00 00 00 00 call printf_ 0064 83 c4 04 add ESP,00000004H 0067 89 c8 mov EAX,ECX 0069 e8 00 00 00 00 call free_ 006e 5f pop EDI 006f 5e pop ESI 0070 5a pop EDX 0071 59 pop ECX 0072 5b pop EBX 0073 c3 ret 0074 51 main_ push ECX 0075 52 push EDX 0076 ba 01 00 00 00 mov EDX,00000001H 007b 89 d0 L5 mov EAX,EDX 007d f7 d8 neg EAX 007f 88 c1 mov CL,AL 0081 b8 01 00 00 00 mov EAX,00000001H 0086 d3 e0 shl EAX,CL 0088 e8 00 00 00 00 call print_bin_ 008d 42 inc EDX 008e 83 fa 40 cmp EDX,00000040H 0091 7e e8 jle L5 0093 5a pop EDX 0094 59 pop ECX 0095 c3 ret No disassembly errors List of external symbols Symbol ---------------- calloc_ 00000014 free_ 0000006a print_bin_ 00000089 printf_ 00000060 0000004d 00000034 ------------------------------------------------------------ Segment: 'CONST' DWORD USE32 00000008 bytes 0000 30 62 00 L6 - 0b. 0003 25 64 00 L7 - %d. 0006 0a 00 L8 - .. No disassembly errors ------------------------------------------------------------ List of public symbols SYMBOL GROUP SEGMENT ADDRESS --------------------------------------------------------- main_ _TEXT 00000074 print_bin_ _TEXT 00000000 ------------------------------------------------------------ Assembly is not my strong point, but the magic seems to take place from address 007b to 0091: it negates EAX (address 007d) and then copy the lower 8 bits of EAX into CL (address 007f). Then it moves the value 1 to EAX (address 0081) and performs the left bit shift (address 0086). |
Yes, that's the opreation I expected it to produce. I think this will accurately emulate it
|
I just discovered that if rhs is greater or equal the number of bits in the promoted lhs, then it is also considered UB. else if (gameskill < sk_baby) { // accurately emulate what doom.exe did while avoiding undefined behavior // (left shift by negative value and rhs too big) int bits_int = 8 * sizeof(int); int shift = ((gameskill - 1) & 0xFF) % bits_int; bit = 1 << shift; } The assembly code uses the 8-bit register called AL to select the lower 8 bits of EAX, so we should use 0xFF. Also, there is no need to cast to unsigned int, because Just to be sure, I tested what happens when I left shift an int with by 33 or more bits. The result is the following using the Watcom compiler: 0b00000000000000000000000000000010 0b00000000000000000000000000000100 0b00000000000000000000000000001000 0b00000000000000000000000000010000 0b00000000000000000000000000100000 0b00000000000000000000000001000000 0b00000000000000000000000010000000 0b00000000000000000000000100000000 0b00000000000000000000001000000000 0b00000000000000000000010000000000 0b00000000000000000000100000000000 0b00000000000000000001000000000000 0b00000000000000000010000000000000 0b00000000000000000100000000000000 0b00000000000000001000000000000000 0b00000000000000010000000000000000 0b00000000000000100000000000000000 0b00000000000001000000000000000000 0b00000000000010000000000000000000 0b00000000000100000000000000000000 0b00000000001000000000000000000000 0b00000000010000000000000000000000 0b00000000100000000000000000000000 0b00000001000000000000000000000000 0b00000010000000000000000000000000 0b00000100000000000000000000000000 0b00001000000000000000000000000000 0b00010000000000000000000000000000 0b00100000000000000000000000000000 0b01000000000000000000000000000000 0b10000000000000000000000000000000 0b00000000000000000000000000000001 0b00000000000000000000000000000010 0b00000000000000000000000000000100 0b00000000000000000000000000001000 0b00000000000000000000000000010000 0b00000000000000000000000000100000 0b00000000000000000000000001000000 0b00000000000000000000000010000000 0b00000000000000000000000100000000 0b00000000000000000000001000000000 0b00000000000000000000010000000000 0b00000000000000000000100000000000 0b00000000000000000001000000000000 0b00000000000000000010000000000000 0b00000000000000000100000000000000 0b00000000000000001000000000000000 0b00000000000000010000000000000000 0b00000000000000100000000000000000 0b00000000000001000000000000000000 0b00000000000010000000000000000000 0b00000000000100000000000000000000 0b00000000001000000000000000000000 0b00000000010000000000000000000000 0b00000000100000000000000000000000 0b00000001000000000000000000000000 0b00000010000000000000000000000000 0b00000100000000000000000000000000 0b00001000000000000000000000000000 0b00010000000000000000000000000000 0b00100000000000000000000000000000 0b01000000000000000000000000000000 0b10000000000000000000000000000000 As in the negative case, it works |
The After that the modulo is redundant since the value is already in range 0-31. The |
With each passing day, UB seems to become less of a bug and more of a language feature.😅 So I tested the following code with Watcom compiler: #include #include print_bin(int num) { int bits_size = 8 * sizeof(int); char *p = calloc(bits_size, 1); int i; for (i = 0; i < bits_size; i++) { p[i] = num & 0x1; num = num >> 1; } printf("0b"); for (i = bits_size - 1; i > -1; i--) { printf("%d", p[i]); } printf("\n"); free(p); } main() { int bit; int i; int shift; for (i = 1; i <= 64; i++) { bit = 1 << -i; print_bin(bit); shift = (-i & 0xFF) % (8 * sizeof(int)); bit = 1U << shift; print_bin(bit); printf("\n"); } } It emulates perfectly the behavior of negative shift, while avoiding (luckily) all UB: DOS/4GW Protected Mode Run-time Version 1.6 Copyright (c) Rational Systems, Inc. 1990-1992 0b10000000000000000000000000000000 0b10000000000000000000000000000000 0b01000000000000000000000000000000 0b01000000000000000000000000000000 0b00100000000000000000000000000000 0b00100000000000000000000000000000 0b00010000000000000000000000000000 0b00010000000000000000000000000000 0b00001000000000000000000000000000 0b00001000000000000000000000000000 0b00000100000000000000000000000000 0b00000100000000000000000000000000 0b00000010000000000000000000000000 0b00000010000000000000000000000000 0b00000001000000000000000000000000 0b00000001000000000000000000000000 0b00000000100000000000000000000000 0b00000000100000000000000000000000 0b00000000010000000000000000000000 0b00000000010000000000000000000000 0b00000000001000000000000000000000 0b00000000001000000000000000000000 0b00000000000100000000000000000000 0b00000000000100000000000000000000 0b00000000000010000000000000000000 0b00000000000010000000000000000000 0b00000000000001000000000000000000 0b00000000000001000000000000000000 0b00000000000000100000000000000000 0b00000000000000100000000000000000 0b00000000000000010000000000000000 0b00000000000000010000000000000000 0b00000000000000001000000000000000 0b00000000000000001000000000000000 0b00000000000000000100000000000000 0b00000000000000000100000000000000 0b00000000000000000010000000000000 0b00000000000000000010000000000000 0b00000000000000000001000000000000 0b00000000000000000001000000000000 0b00000000000000000000100000000000 0b00000000000000000000100000000000 0b00000000000000000000010000000000 0b00000000000000000000010000000000 0b00000000000000000000001000000000 0b00000000000000000000001000000000 0b00000000000000000000000100000000 0b00000000000000000000000100000000 0b00000000000000000000000010000000 0b00000000000000000000000010000000 0b00000000000000000000000001000000 0b00000000000000000000000001000000 0b00000000000000000000000000100000 0b00000000000000000000000000100000 0b00000000000000000000000000010000 0b00000000000000000000000000010000 0b00000000000000000000000000001000 0b00000000000000000000000000001000 0b00000000000000000000000000000100 0b00000000000000000000000000000100 0b00000000000000000000000000000010 0b00000000000000000000000000000010 0b00000000000000000000000000000001 0b00000000000000000000000000000001 0b10000000000000000000000000000000 0b10000000000000000000000000000000 0b01000000000000000000000000000000 0b01000000000000000000000000000000 0b00100000000000000000000000000000 0b00100000000000000000000000000000 0b00010000000000000000000000000000 0b00010000000000000000000000000000 0b00001000000000000000000000000000 0b00001000000000000000000000000000 0b00000100000000000000000000000000 0b00000100000000000000000000000000 0b00000010000000000000000000000000 0b00000010000000000000000000000000 0b00000001000000000000000000000000 0b00000001000000000000000000000000 0b00000000100000000000000000000000 0b00000000100000000000000000000000 0b00000000010000000000000000000000 0b00000000010000000000000000000000 0b00000000001000000000000000000000 0b00000000001000000000000000000000 0b00000000000100000000000000000000 0b00000000000100000000000000000000 0b00000000000010000000000000000000 0b00000000000010000000000000000000 0b00000000000001000000000000000000 0b00000000000001000000000000000000 0b00000000000000100000000000000000 0b00000000000000100000000000000000 0b00000000000000010000000000000000 0b00000000000000010000000000000000 0b00000000000000001000000000000000 0b00000000000000001000000000000000 0b00000000000000000100000000000000 0b00000000000000000100000000000000 0b00000000000000000010000000000000 0b00000000000000000010000000000000 0b00000000000000000001000000000000 0b00000000000000000001000000000000 0b00000000000000000000100000000000 0b00000000000000000000100000000000 0b00000000000000000000010000000000 0b00000000000000000000010000000000 0b00000000000000000000001000000000 0b00000000000000000000001000000000 0b00000000000000000000000100000000 0b00000000000000000000000100000000 0b00000000000000000000000010000000 0b00000000000000000000000010000000 0b00000000000000000000000001000000 0b00000000000000000000000001000000 0b00000000000000000000000000100000 0b00000000000000000000000000100000 0b00000000000000000000000000010000 0b00000000000000000000000000010000 0b00000000000000000000000000001000 0b00000000000000000000000000001000 0b00000000000000000000000000000100 0b00000000000000000000000000000100 0b00000000000000000000000000000010 0b00000000000000000000000000000010 0b00000000000000000000000000000001 0b00000000000000000000000000000001 So the final code should be: else if (gameskill < sk_baby) { // accurately emulate what doom.exe did while avoiding undefined behavior // (left shift by negative value and rhs too big) int bits_int = 8 * sizeof(int); int shift = ((gameskill - 1) & 0xFF) % bits_int; bit = 1U << shift; } We could also use your idea of using 0x1F instead of taking the remainder, although it is slightly less portable, because we are assuming 32-bits integers. |
I think even SDL2 assumes 32-bit integers. |
Not exactly, SDL2 also uses fixed width integer types. |
It does not matter what size an integer is since the values we're playing with get truncated to 5 bits (the |
You are right, I forgot C specification guarantees integers to be at least 16 bits. else if (gameskill < sk_baby) { // avoid undefined behavior (left shift by negative value and rhs too big) by // accurately emulating what doom.exe did: select lower 8 bits and reduce mod 32. // For more details check: https://github.com/chocolate-doom/chocolate-doom/issues/1677 bit = 1U << ((gameskill - 1) & 0x1F); } |
Actually, I gave a second thought and the above code is also UB if |
I don't think Doom works on anything where |
Here is the PR: #1678 |
Background
Version of Chocolate Doom: 3.0.1
Operating System and version: Any
Game: (Doom/Heretic/Hexen/Strife/other): Any
Any loaded WADs and mods (please include full command line): No
Bug description
Observed behavior:
It is a well known hack that selecting a skill level of 0 makes all items absent. The problem lies in function P_SpawnMapThing: if gameskill is negative (which happens when skill level selected in cmd is 0), then line 807 executes a left bit shift with negative rhs, which is considered UB.
Expected behavior:
The solution is not so straightforward, because original DOOM also has this problem (although less explicitly because skill_t has no constant for "no items" hack). Maybe a if statement checking directly for the hack may works, but it could make the code behaviour different than original DOOM.
The text was updated successfully, but these errors were encountered: