Skip to content

blucia0a/Legerdemain

Repository files navigation

Legerdemain - Brandon Lucia - 2011 - University of Washington

Legerdemain (LDM) is a program instrumentation framework.  The purpose of LDM
is to provide the ability to instrument C and C++ programs with flexibility,
simplicity, and low overhead.  

Here is a list of things that can be done easily using LDM:

  *replace/wrap functions
  *monitor program events
  *collecting information
  *support multithreaded programs

The main design goals of LDM are simplicity, extensibility, and performance.
LDM is a framework for building instrumentation tools.  A user can write a
plugin for LDM that replaces functions, monitors certain events, or collects
information.  Which functions, events, and information are collected is up to
the discretion of the user.  

The main use case of LDM is to create plugins that implement program analysis
tools.  LDM is a good platform for writing program analysis tools because it
can collect information without requiring kernel-level access or changes to the program being instrumented.  

The Core Functionality section talks about how LDM works, describing Plugins
and Thread Constructor Plugins.   


Core Functionality:
-------------------
LDM is centered around the idea of Transparent Analysis Plugins or TAPs. TAPs
are shared libraries that are built to be loaded at runtime by LDM.  A TAP can
implement custom instrumentation and analysis.  

LDM can be run by running "LDM_TAPS=<ldm taps list> ldm.sh <your program>".
This command runs your program under a default configuration.  It will load
TAPs listed in LDM_TAPS.  Optionally LDM will provide a debug shell on crashes,
if run with LDM_DEBUG=1.


Basic Plugins
.............
TAPs extend LDM's core functionality.  The simplest kind of plugin is one that
replaces all calls to one function with a function in the plugin.  To do so,
you can use the LDM_REG, LDM_ORIG, and LDM_ORIG_DECL macros.  These are just
wrappers to make this use of LD_PRELOAD/dlopen/dlsym easier and more concise.
To wrap a function (malloc, say):


Pseudo-Code Example 1: Instrumenting calls to malloc.  MallocWatcher.c
=========================================
#include <malloc.h>
#include <stdio.h>

#include "legerdemain.h"

static int inited;
static int malloccount;
LDM_ORIG_DECL(void *,malloc,size_t);
void *malloc(size_t sz){
  malloccount++;
  if( !inited ){
    LDM_REG(malloc);
  }
  return LDM_ORIG(malloc)(sz);
}

static void deinit(){
  ldmmsg(stderr,"[MALLOCWATCHER] Unloading fancy malloc plugin\n");
  ldmmsg(stderr,"[MALLOCWATCHER] Malloc was called %d times\n",malloccount);
}

void LDM_PLUGIN_INIT(){
  ldmmsg(stderr,"[MALLOCWATCHER] Loading fancy malloc plugin\n");
  LDM_REG(malloc);
  inited = 1;
  LDM_PLUGIN_DEINIT(deinit)
}

=========================================

The plugin can be built as a .so, by running

gcc -fPIC -shared -o MallocWatcher.so MallocWatcher.c -ldl

That will produce MallocWatcher.so.    
The plugin is initialized by the special function LDM_PLUGIN_INIT.  This
function will be called when LDM is registering the plugin.  In
LDM_PLUGIN_INIT, LDM_PLUGIN_DEINIT(deinit) is called to register a function
that should be called when the plugin is unloaded.

This plugin uses LDM_ORIG_DECL to declare that the function malloc is being
replaced.  Then, in the plugin's version of malloc(), the original version of
malloc() is called using LDM_ORIG(malloc).  Finally, LDM_REG is called in
LDM_PLUGIN_INIT() to register the replacement with LDM.  

In some cases -- like this one -- the function you are replacing may be called
during your plugin's initialization.  You need to be sure it isn't being called
before LDM_REG() is called.  The "if( !inited )" guard in malloc() in the above
example illustrates this.  ldmmsg(f,s) prints the string s to the FILE *f with
[LDM] before it, so you know where your messages are originating, and they
don't get mixed up with program output.  

To cause LDM to load MallocWatcher.so, run

LDM_TAPS=MallocWatcher.so ldm.sh <any program>

Generally, the LDM_TAPS variable loads LDM plugins at runtime, within
ldm.sh.  Note that plugins are *not* thread safe, unless you make them thread
safe.  For example, if this plugin is run on a multithreaded program,
malloccount would need to be protected with a lock. 


Thread-Aware Plugins
....................
A TAP may also define a thread constructor and destructor.  

A thread constructor is called by every thread just before entering its
start_routine.  Thread constructors are declared like this:

void LDM_PLUGIN_THD_INIT(void *arg,void *(*start_routine)(void *));

The "arg" argument is the void * argument passed to pthread_create when the
program created the thread.  The "start_routine" argument is the void *(*)(void
*) argument passed to pthread_create.  The start_routine is exposed to the TAP
so a program may selectively apply instrumentation to only threads executing a
certain function.  Many thread constructors can be applied to a running program
simultaneously, but only one may be specified per TAP.

A TAP may also specify thread destructor routines.  There are several things
that are required to ensure thread destructors are correctly handled.  First,
the thread destructor must be declared, and associated with a special
destructor identifier using LDM_THD_DTR_DECL(id,func).  Second, in the plugin
initializer (the function declared with LDM_PLUGIN_INIT), you must call
LDM_REGISTER_THD_DTR(id,func).  Finally, in the init_thread function, each
thread must call LDM_INSTALL_THD_DTR(id), to enable the destructor.

An example of a thread-aware TAP is in clientlibs/NullThreadWatcher.  It
illustrates a thread constructor and destructor function.


Sampled TAPs (sTAPs)
....................
LDM provides support to temporally sample events in your program.  There is a
simple tutorial example illustrating how to use this feature in
clientlibs/NullSampledWatcher.  There are several steps:

1)First write a TAP (see previous section).  

2)Call sampled_thread_monitor_init() to initialize the sampling subsystem when
  you intialize your plugin (in your LDM_PLUGIN_INIT function).  

3)Register threads with the sampling system as they are created.  Do this using
  sampled_thread_monitor_thread_init(), which should be called from your 
  plugin's thread initialization function.  

4)Create a destructor in your TAP (see previous section).  In your plugin's
  thread destructor, you need to call sampled_thread_monitor_thread_deinit().
  This is important, as if you do not, your program will crash.

5)In order to do anything in your sTAP, you need to register a
  function that gets called when the sampling subsystem 'poke's your threads.  
  To do so, you need to write a function that gets passed as the second 
  argument to sampled_thread_monitor_init(). The returns void, and is passed two
  arguments by the sampling subsystem when it is called: a SamplingState
  argument, that says whether sampling is being turned on (SAMPLE_ON) or off
  (SAMPLE_OFF); and a ucontext_t * argument, which is the architectural context
  of the thread at the point just before it was poked. 

Refer to clientlibs/NullSampledWatcher/NullSampledWatcher.c for an illustrative 
example of a sTAP.


DEBUGGING SUPPORT:
------------------
LDM will catch fatal signals and drop to a shell if you specify LDM_DEBUG=1 in
the environment.  The shell has a few useful commands, to look at memory
addresses, query for live variables, etc. 'help' in the shell will overview the
commands.  To force the debug shell at startup, you can specify LDM_DEBUG=1 in
your environment before running ldm.sh.


ELF/DWARF Support:
------------------
LDM has rudimentary support for looking at dwarf/elf information.  Running ldm,
you can type 'help'.  This will present you with a list of possible commands,
most of which related to debug info.

libdwarfclient is used by libldm to access dwarf information in elf program
binaries.  dwarfinfo is a standalone program that can be used to access
dwarf debugging information from the the command line.  It is primarily 
useful in 3 ways right now.
1)using options -g -e<executable> -f<source file> -l<source line> it will show the 
  instruction address of the specified file and line
2)adding the -c option to the command line in 1), information about the compilation
  unit in which the code point resides will be printed
3)using the -p option dumps all the dwarf debug information in the binary


APPENDIX A: Design Notes
------------------------
Driven by performance, LDM does not use the dynamic compilation strategy used
by Dynamo/DynamoRIO/Pin. LDM is different in that there is no fixed overhead to
running a program with LDM, like there is with Pin (due to dynamic compilation,
code caching, etc). Instead, LDM runs programs with zero overhead, unless
instructed otherwise by a program analysis plugin.  The cost of this design
decision is that many fine-grained events that are accessible to Pin are
fundamentally uninstrumentable by LDM.


APPENDIX B: TODO
----------------
Add integrated support for user defined watchpoints, via drprobe (github.com/blucia0a/drprobe).


About

A program instrumentation library for replacing and wrapping functions, and monitoring program behavior.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published