Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement a global fixed-point analysis #298

Open
travitch opened this issue Jun 28, 2022 · 0 comments
Open

Implement a global fixed-point analysis #298

travitch opened this issue Jun 28, 2022 · 0 comments
Labels
discovery Issues related to the code discovery logic enhancement

Comments

@travitch
Copy link
Contributor

travitch commented Jun 28, 2022

Currently, macaw is designed to do code discovery in a single pass: it explores the function frontier (in roughly breadth-first order) without ever revisiting functions when more information is learned. This also applies locally: once a block is discovered, it is never re-evaluated (even when new information can improve results). tl;dr - we should add an option for computing well-motivated fixed-points to improve results.

In the global case, there are a number of scenarios where learning information later can improve previously-visited functions.

  • If a tail call is mis-classified as a jump, that could be fixed if the function is re-analyzed after the target of the jump is identified as a function via another call site
  • If multiple functions include the same suffix of code, it can be turned into a tail call once the overlap is recognized
  • Right now, we make very liberal assumptions about what registers are preserved across calls. This is problematic for functions that are not ABI-compliant. If we did a global analysis, we could identify call targets, analyze them to determine the set of registers they actually preserve, then re-analyze callers to provide much more precise results.
  • macaw-refinement could be enhanced with this information to determine what local state (e.g., on the stack) is not clobbered by callees, which would enable it to preserve more constraints across calls (and work in more situations)

In the function local case, we have a scenario that we don't handle very well.

  1. Assume macaw discovers a block from 0x10-0x50
  2. Later in the function, macaw discovers a backedge to 0x20

Right now, macaw creates an overlapping block that runs from 0x20-0x50. If we re-analyzed overlapping blocks, we could systematically split blocks to eliminate almost all overlap. Note that overlapping instructions can still cause blocks to overlap in ways we cannot split, but re-analysis would significantly improve the results.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discovery Issues related to the code discovery logic enhancement
Projects
None yet
Development

No branches or pull requests

1 participant