Implement a global fixed-point analysis #298

travitch · 2022-06-28T03:57:10Z

Currently, macaw is designed to do code discovery in a single pass: it explores the function frontier (in roughly breadth-first order) without ever revisiting functions when more information is learned. This also applies locally: once a block is discovered, it is never re-evaluated (even when new information can improve results). tl;dr - we should add an option for computing well-motivated fixed-points to improve results.

In the global case, there are a number of scenarios where learning information later can improve previously-visited functions.

- If a tail call is misclassified as a jump, that could be fixed by re-analyzing the function after the target of the jump is identified as a function via another call site.
- If multiple functions include the same suffix of code, it can be turned into a tail call once the overlap is recognized.
- Right now, we make very liberal assumptions about what registers are preserved across calls. This is problematic for functions that are not ABI-compliant. If we did a global analysis, we could identify call targets, analyze them to determine the set of registers they actually preserve, then re-analyze callers to provide much more precise results.
- macaw-refinement could be enhanced with this information to determine what local state (e.g., on the stack) is not clobbered by callees, which would enable it to preserve more constraints across calls (and work in more situations).
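
The register-preservation item above can be sketched as a worklist fixed point over the call graph. This is only an illustration under stated assumptions, not macaw's API: it is written in Python to stay language-neutral, and the names `ALL_REGS`, `preserved_registers`, `local_clobbers`, and `calls` are all hypothetical. It assumes each function's locally clobbered registers are already known and that any callee we have not analyzed clobbers everything.

```python
from collections import deque

# Hypothetical register universe; a real ABI has many more registers.
ALL_REGS = frozenset({"rax", "rbx", "rcx", "rdx"})


def preserved_registers(local_clobbers, calls):
    """Compute, per function, the registers preserved across a call to it.

    local_clobbers: function name -> registers written by its own body
    calls: function name -> names of the functions it calls
    """
    # Start from what each function clobbers on its own.
    clobbers = {f: set(regs) for f, regs in local_clobbers.items()}

    # Reverse call graph, so a change in a callee re-queues its callers.
    callers = {f: set() for f in clobbers}
    for f in clobbers:
        for g in calls.get(f, ()):
            if g in callers:
                callers[g].add(f)

    worklist = deque(clobbers)
    while worklist:
        f = worklist.popleft()
        new = set(local_clobbers[f])
        for g in calls.get(f, ()):
            # Assumption: unknown callees clobber every register.
            new |= clobbers.get(g, ALL_REGS)
        if new != clobbers[f]:
            # The summary for f grew, so its callers must be revisited.
            clobbers[f] = new
            worklist.extend(callers[f])

    return {f: ALL_REGS - regs for f, regs in clobbers.items()}
```

Clobber sets only ever grow and the register universe is finite, so the loop terminates even for recursive call graphs. For example, if `main` calls `helper`, `helper` calls `leaf`, and they locally clobber `rax`, `rcx`, and `rdx` respectively, then only `rbx` is preserved across a call to `main`.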

In the function-local case, we have a scenario that we don't handle very well.

- Assume macaw discovers a block from 0x10-0x50.
- Later in the function, macaw discovers a backedge to 0x20.

Right now, macaw creates an overlapping block that runs from 0x20-0x50. If we re-analyzed overlapping blocks, we could systematically split blocks to eliminate almost all overlap. Note that overlapping instructions can still cause blocks to overlap in ways we cannot split, but re-analysis would significantly improve the results.
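
The splitting step can be sketched roughly as follows. This is a hedged illustration in Python, not macaw's implementation: `split_at_targets` and its parameters are hypothetical, and it assumes block ranges are half-open (the end address is one past the last byte) and that the instruction start addresses of the original block are available.

```python
def split_at_targets(insn_addrs, end, targets):
    """Split one block into non-overlapping pieces.

    insn_addrs: sorted start addresses of the block's instructions
    end: address one past the block's last byte (assumed half-open range)
    targets: jump targets discovered after the block was first decoded
    Returns a list of half-open (start, end) ranges.
    """
    boundaries = set(insn_addrs)
    start = insn_addrs[0]
    # Only targets that land on an instruction boundary strictly inside the
    # block can be split. A target in the middle of an instruction implies
    # overlapping instructions, which splitting cannot resolve.
    cuts = sorted({t for t in targets if t in boundaries and t != start})
    starts = [start] + cuts
    ends = cuts + [end]
    return list(zip(starts, ends))
```

With hypothetical instruction starts `[0x10, 0x18, 0x20, 0x30, 0x40]`, an end of `0x50`, and a backedge to `0x20`, this yields the two ranges `(0x10, 0x20)` and `(0x20, 0x50)` instead of two overlapping blocks. A target such as `0x22`, which falls inside an instruction, leaves the block unsplit.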

travitch added the enhancement label Jun 28, 2022

RyanGlScott mentioned this issue Jun 28, 2022

Improve the tail call identification logic to prefer local jumps unless the target is a known function entry #296

Merged

travitch mentioned this issue Aug 1, 2022

Implement parallel code discovery #307

Open

travitch added the discovery label (Issues related to the code discovery logic) Aug 1, 2022

This was referenced Aug 1, 2022

Implement an analysis to identify no-return functions #308

Open

Improve function pointer identification #309

Open
