pltStubSymbols's treatment of the .plt.got section miscalculates the number of PLT stubs #375

Open
RyanGlScott opened this issue Apr 4, 2024 · 2 comments

@RyanGlScott
Contributor

If you compile this simple C program using clang:

int main(void) {
    return 0;
}

You'll see that it has exactly one PLT stub in its .plt.got section:

$ clang test.c -o test
$ objdump -d -j .plt.got test

test:     file format elf64-x86-64


Disassembly of section .plt.got:

0000000000001030 <__cxa_finalize@plt>:
    1030:	ff 25 c2 2f 00 00    	jmp    *0x2fc2(%rip)        # 3ff8 <__cxa_finalize@GLIBC_2.2.5>
    1036:	66 90                	xchg   %ax,%ax

However, pltStubSymbols claims that it has more PLT stubs than this! Here is what you see if you debug-print the output of pltStubSymbols on this program:

fromList [(0x1040,""),(0x1048,""),(0x1050,""),(0x1058,"__libc_start_main"),(0x1060,"_ITM_deregisterTMCloneTable"),(0x1068,"__gmon_start__"),(0x1070,"_ITM_registerTMCloneTable"),(0x1078,"__cxa_finalize")]

What is going on here?

Ultimately, pltStubSymbols consults the .rela.dyn section to figure out what the contents of .plt.got are. In this case, .rela.dyn contains five entries, all of which are relocations against slots in .got:

$ objdump -dzR -j .got test

test:     file format elf64-x86-64


Disassembly of section .got:

0000000000003fd8 <.got>:
    3fd8:	00 00                	add    %al,(%rax)
			3fd8: R_X86_64_GLOB_DAT	__libc_start_main@GLIBC_2.34
    3fda:	00 00                	add    %al,(%rax)
    3fdc:	00 00                	add    %al,(%rax)
    3fde:	00 00                	add    %al,(%rax)
    3fe0:	00 00                	add    %al,(%rax)
			3fe0: R_X86_64_GLOB_DAT	_ITM_deregisterTMCloneTable@Base
    3fe2:	00 00                	add    %al,(%rax)
    3fe4:	00 00                	add    %al,(%rax)
    3fe6:	00 00                	add    %al,(%rax)
    3fe8:	00 00                	add    %al,(%rax)
			3fe8: R_X86_64_GLOB_DAT	__gmon_start__@Base
    3fea:	00 00                	add    %al,(%rax)
    3fec:	00 00                	add    %al,(%rax)
    3fee:	00 00                	add    %al,(%rax)
    3ff0:	00 00                	add    %al,(%rax)
			3ff0: R_X86_64_GLOB_DAT	_ITM_registerTMCloneTable@Base
    3ff2:	00 00                	add    %al,(%rax)
    3ff4:	00 00                	add    %al,(%rax)
    3ff6:	00 00                	add    %al,(%rax)
    3ff8:	00 00                	add    %al,(%rax)
			3ff8: *unknown*	__cxa_finalize@GLIBC_2.2.5
    3ffa:	00 00                	add    %al,(%rax)
    3ffc:	00 00                	add    %al,(%rax)
    3ffe:	00 00                	add    %al,(%rax)

But only one of them (__cxa_finalize) actually corresponds to a PLT stub. The other four relocations have no stub in .plt.got at all, yet their presence throws off the heuristics that pltStubSymbols uses.

I'm not quite sure what to do about this. It would be nice if there were a convenient mechanism to distinguish __cxa_finalize from the other entries in .got, but I'm not sure what that would be. My first inclination was to filter out any symbols that aren't function symbols, but even that isn't enough, as __libc_start_main is also a function symbol:

$ readelf -W --dyn-syms test

Symbol table '.dynsym' contains 6 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main@GLIBC_2.34 (2)
     2: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_deregisterTMCloneTable
     3: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__
     4: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_registerTMCloneTable
     5: 0000000000000000     0 FUNC    WEAK   DEFAULT  UND __cxa_finalize@GLIBC_2.2.5 (3)

It's also tempting to think that the combination of FUNC and WEAK would uniquely identify PLT stubs, but that is also not true. If you call a function via a function pointer, e.g.,

#include <stdlib.h>
void* (*m)(size_t) = &malloc;

Then malloc will also be called via a PLT stub, but its symbol will have type FUNC and binding GLOBAL.
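To make the two failed filters concrete, here is a small Python sketch that replays them over the .dynsym table shown above. The `DynSym` type, the `select` helper, and the extra `malloc` row are hypothetical names made up for illustration; the `malloc` row simply encodes the FUNC/GLOBAL claim from the function-pointer example and was not read from any binary.

```python
from typing import NamedTuple, Optional


class DynSym(NamedTuple):
    name: str
    kind: str  # the "Type" column of readelf --dyn-syms
    bind: str  # the "Bind" column


# Rows 1-5 of the readelf output above (the null symbol is omitted).
DYNSYMS = [
    DynSym("__libc_start_main", "FUNC", "GLOBAL"),
    DynSym("_ITM_deregisterTMCloneTable", "NOTYPE", "WEAK"),
    DynSym("__gmon_start__", "NOTYPE", "WEAK"),
    DynSym("_ITM_registerTMCloneTable", "NOTYPE", "WEAK"),
    DynSym("__cxa_finalize", "FUNC", "WEAK"),
]

# Hypothetical row for the function-pointer scenario: an assumption
# based on the claim above, not taken from the test binary.
HYPOTHETICAL_MALLOC = DynSym("malloc", "FUNC", "GLOBAL")


def select(syms, kind: Optional[str] = None, bind: Optional[str] = None):
    """Return the names of symbols matching the given type and/or binding."""
    return [
        s.name
        for s in syms
        if (kind is None or s.kind == kind) and (bind is None or s.bind == bind)
    ]


if __name__ == "__main__":
    # Filter 1: FUNC only. Still lets __libc_start_main through.
    print(select(DYNSYMS, kind="FUNC"))
    # Filter 2: FUNC and WEAK. Right for this binary, but it would drop
    # a stub for a FUNC/GLOBAL symbol such as the hypothetical malloc row.
    print(select(DYNSYMS + [HYPOTHETICAL_MALLOC], kind="FUNC", bind="WEAK"))
```

Neither filter separates "has a PLT stub" from "does not": the first is too permissive and the second is too strict.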

@RyanGlScott RyanGlScott added the bug label Apr 4, 2024
@langston-barrett
Contributor

langston-barrett commented Apr 16, 2024

> Ultimately, pltStubSymbols consults the .rela.dyn section to figure out what the contents of the .plt.got are.

Perhaps a silly question: Why doesn't pltStubSymbols consult the .plt.got section itself to figure out what the contents of .plt.got are?

[EDIT]: Perhaps this is just a hard problem, as indicated by this comment in the angr source code

@RyanGlScott
Contributor Author

Your EDIT hits the nail on the head: the .plt.got section, like its cousins .plt and .plt.sec, is really just an unorganized collection of instructions, with no function symbols to demarcate the start of each PLT stub. In general, you have to reverse engineer the instructions to know where each PLT stub begins and ends.

The only reason our heuristics for detecting PLT stubs in the .plt section work as well as they do is that the relocations contained in the .rela.plt section are only related to the .plt section. This isn't always the case for the .rela.dyn section, however. In addition to containing relocations for the PLT stubs in the .plt.got section, it can also contain relocations for things like global variables defined in shared libraries (e.g., _ITM_deregisterTMCloneTable). As such, our heuristics aren't terribly reliable for the .plt.got section.
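For this one test binary, the kind of reverse engineering described above can be sketched in a few lines of Python. This is only an illustration, not what pltStubSymbols does and not a proposed fix: it assumes every stub is a bare 6-byte `jmp *disp32(%rip)` (opcode bytes `ff 25`), scans for that pattern byte by byte, and keeps only the relocations whose .got slot such a jump actually targets. The name `find_stubs` and the constants are made up for the example; the bytes and addresses are copied from the objdump listings earlier in the thread.

```python
import struct

# .plt.got contents, copied from the `objdump -d -j .plt.got` output.
PLT_GOT_ADDR = 0x1030
PLT_GOT_BYTES = bytes.fromhex("ff25c22f0000" "6690")

# Relocated .got slots, copied from the `objdump -dzR -j .got` output.
GOT_RELOCS = {
    0x3FD8: "__libc_start_main",
    0x3FE0: "_ITM_deregisterTMCloneTable",
    0x3FE8: "__gmon_start__",
    0x3FF0: "_ITM_registerTMCloneTable",
    0x3FF8: "__cxa_finalize",
}


def find_stubs(base, code, relocs):
    """Map stub address -> symbol name for every `jmp *disp32(%rip)`
    in `code` whose target slot carries a relocation.

    Assumption: a naive byte scan is good enough. Real code would need
    a proper disassembler, since `ff 25` can also occur inside other
    instructions, and other stub layouts (for example ones that begin
    with endbr64) are not handled here.
    """
    stubs = {}
    for off in range(len(code) - 5):
        if code[off:off + 2] == b"\xff\x25":
            (disp,) = struct.unpack_from("<i", code, off + 2)
            # RIP-relative: the displacement is added to the address of
            # the next instruction, i.e. 6 bytes past the jmp.
            slot = base + off + 6 + disp
            if slot in relocs:
                stubs[base + off] = relocs[slot]
    return stubs


if __name__ == "__main__":
    for addr, name in find_stubs(PLT_GOT_ADDR, PLT_GOT_BYTES, GOT_RELOCS).items():
        print(hex(addr), name)
```

Here the jump at 0x1030 resolves to 0x1030 + 6 + 0x2fc2 = 0x3ff8, the slot relocated against __cxa_finalize, and the other four .rela.dyn entries are dropped because no stub references their slots. Whether a scan like this is robust enough across compilers and linkers is exactly the hard part.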
