Core API¶
New in version 1.0.
This section documents the FIDL core API. It is intended for developers of IDA plugins.
API Overview¶
-
class
decompiler_utils.BBGraph(f_ea)¶ Representation of the assembly CFG for a function
-
find_connected_paths(bb_start, bb_end, co=10)¶ Leverages NetworkX to find all connected paths
Parameters: - bb_start (Basic block) – Initial basic block
- bb_end (Basic block) – Final basic block
- co (int, optional) – Cutoff parameter
NOTE: the cutoff parameter in nx.all_simple_paths serves two purposes:
- reduce the chances of CPU melting (the algorithm is O(n!))
- nobody will manually inspect monstrous paths
Returns: generator of lists or None
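To illustrate why the cutoff matters, here is a minimal, dependency-free sketch of cutoff-bounded simple-path enumeration. It is a stand-in for what nx.all_simple_paths does, not FIDL's code: the function name bounded_simple_paths and the adjacency-dict CFG are assumptions made for this example.

```python
def bounded_simple_paths(graph, start, end, cutoff):
    """Yield every simple path from start to end with at most `cutoff` edges.

    `graph` is a hypothetical adjacency dict: node -> list of successors.
    """
    path = [start]
    visited = {start}

    def walk(node):
        if node == end:
            yield list(path)
            return
        # Stop extending once the path already uses `cutoff` edges.
        if len(path) - 1 >= cutoff:
            return
        for succ in graph.get(node, ()):
            if succ in visited:
                continue  # simple paths never revisit a node
            visited.add(succ)
            path.append(succ)
            yield from walk(succ)
            path.pop()
            visited.remove(succ)

    yield from walk(start)


# Hypothetical CFG: a diamond (0 -> 1 -> 3, 0 -> 2 -> 3) plus a longer detour.
cfg = {0: [1, 2], 1: [3], 2: [3, 4], 4: [5], 5: [3], 3: []}

short_paths = sorted(bounded_simple_paths(cfg, 0, 3, cutoff=2))
all_paths = sorted(bounded_simple_paths(cfg, 0, 3, cutoff=10))
```

With a small cutoff the long detour through blocks 4 and 5 is never explored, which is what keeps the enumeration tractable on large functions.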
-
get_node(addr)¶ Given an address within a function, returns the basic block (address) that contains it (or None)
Parameters: addr (int) – address within a function
Returns: Address of the node containing the input address
Return type: int
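The lookup described above can be sketched as a range search. This is only an illustration of the idea and not how BBGraph is implemented: find_block and the (start, end) tuples with an exclusive end are assumptions for this example.

```python
import bisect


def find_block(block_ranges, addr):
    """Return the start address of the block containing `addr`, or None.

    `block_ranges` is a hypothetical list of (start, end) tuples, sorted by
    start address, with `end` exclusive.
    """
    starts = [start for start, _ in block_ranges]
    # Index of the last block whose start is <= addr.
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None
    start, end = block_ranges[i]
    return start if start <= addr < end else None


# Hypothetical basic blocks; note the gap between 0x1024 and 0x1030.
blocks = [(0x1000, 0x1010), (0x1010, 0x1024), (0x1030, 0x1040)]

inside = find_block(blocks, 0x1012)
in_gap = find_block(blocks, 0x1028)
```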
-
-
decompiler_utils.NonLibFunctions(start_ea=None, min_size=0)¶ Generator yielding only non-lib functions
Parameters: - start_ea (int, optional) – Address to start looking for non-library functions.
- min_size (int, optional) – Minimum function size. Useful to filter small, uninteresting functions.
-
decompiler_utils.all_paths_between(c, start_node=None, end_node=None, co=40)¶ Calculates all paths between start_node and end_node
Calculating paths is one of those things that is better done with the parallel index graph (c.i_cfg). It goes haywire when done with complex elements.
FIXME: the co (cutoff) param is necessary to avoid complexity explosion. However, there is a problem if it’s reached…
Parameters: - c (controlFlowinator) – a controlFlowinator object
- start_node (cexpr_t) – a controlFlowinator node
- end_node – a controlFlowinator node
- co (int, optional) – the cutoff value controls the maximum path length.
Returns: it yields a list of nodes for each path
Return type: list
-
decompiler_utils.assigns_to_var(cex)¶ Does this cexpr_t assign a value to any variable?
TODO: this is limited for now to expressions of the type: v1 = something something
Parameters: cex (cexpr_t) – a cexpr_t object
Returns: the assigned var index (to cf.lvars array) or -1 if the cexpr_t does not assign to any variable
Return type: int
-
decompiler_utils.blowup_expression(cex, final_operands=None)¶ Extracts all elements of an expression
Ex: x + 1 < y -> {x, 1, y}
Parameters: cex (cexpr_t) – a cexpr_t object
Returns: a set of elements (the final_operands)
Return type: set
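The x + 1 < y example can be reproduced with a small recursive walk. This is a sketch under stated assumptions, not the FIDL implementation: expressions are modelled as nested tuples of the form (operator, operand, ...) instead of cexpr_t objects, and blow_up is a hypothetical name.

```python
def blow_up(expr, final_operands=None):
    """Collect the leaf operands of a nested-tuple expression into a set.

    A tuple is treated as (operator, operand, ...); anything else is a leaf.
    The optional accumulator mirrors the final_operands=None parameter.
    """
    if final_operands is None:
        final_operands = set()
    if isinstance(expr, tuple):
        _operator, *operands = expr
        for operand in operands:
            blow_up(operand, final_operands)
    else:
        final_operands.add(expr)
    return final_operands


# x + 1 < y, written as a hypothetical expression tree.
expression = ("<", ("+", "x", 1), "y")
operands = blow_up(expression)
```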
-
class
decompiler_utils.cImporter¶ Collect import information
This is mainly to work around the fact that get_func_name does not resolve imports…
-
get_imports_info()¶
-
-
class
decompiler_utils.callObj(c=None, name='', node=None, expr=None)¶ Auxiliary object for code clarity.
It represents the occurrence of a call expression.
Parameters: - name (string, optional) – name of the function called
- node (controlFlowinator) – a controlFlowinator node containing the call expression
- expr (cexpr_t) – the call expression element
-
decompiler_utils.citem2higher(citem)¶ This gets the higher representation of a given citem, that is, a cinsn_t or cexpr_t
Parameters: citem (citem) – a citem object
-
class
decompiler_utils.controlFlowinator(ea=None, fast=True)¶ This is the main object of FIDL’s API.
It finds all decompiled code “blocks” and recreates a CFG based on this information.
This gives us the best of both worlds: the possibility to analyze a graph (like in disassembly mode) and the power of citem-based analysis.
Some analysis is performed after the CFG has been constructed. These passes are rather costly, so they are turned off by default. Use fast=False to apply them and get a better CFG.
Parameters: - ea (int) – address of the function to analyze
- fast (bool) – Set to False for an object with richer information
-
dump_cfg(out_dir)¶ Dump the CFG for debugging purposes
This dumps a representation of the CFG in DOT format. To generate an image:
dot.exe -Tpng decompiled.dot -o decompiled.png
-
dump_i_cfg()¶ Dump interim CFG for debugging purposes
-
decompiler_utils.create_comment(c=None, ea=0, comment='')¶ Displays a comment at the line corresponding to ea
TODO: avoid creating orphan comment in case the mapping from ea to decompiled code fails
Parameters: - c (controlFlowinator) – a controlFlowinator object
- ea (int) – address for the comment
- comment (string) – the comment to add
-
decompiler_utils.debug_blownup_expressions(c=None)¶ Debugging helper.
Show all blown up expressions for this function.
Parameters: c (controlFlowinator) – a controlFlowinator object
-
decompiler_utils.debug_get_break_statements(c)¶
-
decompiler_utils.debug_stahp()¶ Toggles the DEBUG value, useful for testing
-
decompiler_utils.decast(ins)¶ Remove the cast, returning the cast element
-
decompiler_utils.display_all_calls_to(func_name)¶ Wrapping display_line_at() since this is the most common use of this API
Parameters: func_name (string) – name of the function whose references are searched
-
decompiler_utils.display_line_at(ea, silent=False)¶ Displays the line of pseudocode corresponding to ea
This is useful to quickly answer questions like:
- “Is this function always called with its first parameter being a constant?”
- “I want to see all the error messages displayed by this function”
- etc.
Parameters: - ea (int) – address of an element contained within the line to display
- silent (bool) – flag controlling verbose output
-
decompiler_utils.display_node(c=None, node=None, color=None)¶ Displays a given node in the pseudoviewer
Parameters: - c (controlFlowinator) – a controlFlowinator object
- node (cexpr_t) – a controlFlowinator node
- color (int, optional) – color to mark the line of code corresponding to node
-
decompiler_utils.display_path(cf=None, path=None, color=None)¶ Shows a path’s code and colors its lines.
Parameters: - cf (a cfunc_t object, optional) – a decompilation object
- path (list) – a list of controlFlowinator nodes
- color (int, optional) – color to mark the lines of code corresponding to path
Returns: a list of function lines (path nodes)
Return type: list
-
decompiler_utils.do_for_all_funcs(func, fast=True, start_ea=None, blacklist=None, min_size=100, **kwargs)¶ This is a generic wrapper for all kinds of logic that we want to apply to all the functions in the binary.
Parameters: - func (function) – function “pointer” performing the analysis. Its only mandatory argument is a controlFlowinator object.
- fast (boolean, optional) – parameter fast for the controlFlowinator object.
- start_ea (int, optional) – Address to start looking for non-library functions.
- blacklist (function, optional) – a function determining whether to process a function. Implemented via dependency injection.
Returns: A list of JSON-like messages (individual function results)
Return type: list
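The wrapper pattern described here (an injected analysis callback plus an injected blacklist predicate) can be sketched without IDA. Everything below is an assumption for illustration: apply_to_all, the (name, size) tuples standing in for the database's functions, and the convention that the blacklist returns True for functions to skip.

```python
def apply_to_all(functions, analysis, blacklist=None, min_size=100, **kwargs):
    """Run `analysis` on every function that passes the filters.

    `functions` is a hypothetical list of (name, size) tuples.
    `blacklist` is an injected predicate; here it returns True to skip.
    Extra keyword arguments are forwarded to `analysis`.
    """
    results = []
    for name, size in functions:
        if size < min_size:
            continue  # too small to be interesting
        if blacklist is not None and blacklist(name):
            continue  # the caller decided to skip this one
        result = analysis(name, **kwargs)
        if result:
            results.append(result)
    return results


def report_if_named(name, needle=""):
    """Toy analysis: emit a JSON-like message when the name contains needle."""
    return {"function": name, "hit": needle} if needle in name else None


funcs = [("parse_header", 420), ("tiny_stub", 12), ("sub_parse_body", 900),
         ("log_parse_error", 300)]

messages = apply_to_all(funcs, report_if_named,
                        blacklist=lambda n: n.startswith("log_"),
                        needle="parse")
```

Passing the predicate in as an argument keeps the filtering policy out of the wrapper, so each analysis script can supply its own.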
-
decompiler_utils.does_constrain(node)¶ This tries to answer the question: “Does this node constrain variables in any way?”
Essentially it is looking for the occurrence of variables within known constrainer constructs, e.g. inside an if condition.
TODO: many more heuristics can be included here
Parameters: node (cinsn_t or cexpr_t) – typically a controlFlowinator node
Returns: a set of variable indexes (to cf.lvars array)
Return type: set
-
decompiler_utils.dprint(s='')¶ This will print a debug message only if debugging is active
Parameters: s (str, optional) – The debug message
-
decompiler_utils.dump_lvars(ea=0)¶ Debugging helper.
-
decompiler_utils.dump_pseudocode(ea=0)¶ Debugging helper.
-
decompiler_utils.find_all_calls_to(f_name)¶ Finds all calls to a function with the given name
Note that the string comparison is relaxed to find variants of it, that is, searching for malloc will also match _malloc, malloc_0, etc.
Parameters: f_name (string) – the function name to search for
Returns: a list of callObj
Return type: list
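One way to get the relaxed comparison described above is a case-insensitive substring test. The exact rule FIDL applies is not stated here, so treat this as an assumption; matches_relaxed is a hypothetical helper.

```python
def matches_relaxed(candidate, wanted):
    """Assumed rule: case-insensitive substring match."""
    return wanted.lower() in candidate.lower()


# Hypothetical function names as they might appear in a database.
names = ["malloc", "_malloc", "malloc_0", "calloc", "free"]
hits = [name for name in names if matches_relaxed(name, "malloc")]
```

A substring rule trades precision for recall: it also matches unrelated names that merely contain the search term, so results are worth a quick manual check.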
-
decompiler_utils.find_all_calls_to_within(f_name, ea)¶ Finds all calls to a function with the given name within the function containing the ea address.
Note that the string comparison is relaxed to find variants of it, that is, searching for malloc will also match _malloc, malloc_0, etc.
Parameters: - f_name (string) – the function name to search for
- ea (int) – any address within the function that may contain the calls
Returns: a list of callObj
Return type: list
-
decompiler_utils.find_elements_of_type(cex, element_type, elements=None)¶ Recursively extracts expression elements until a cexpr_t from a specific group is found
Parameters: - cex (cexpr_t) – a cexpr_t object
- element_type (a cot_xxx value (e.g. cot_add)) – the type of element we are looking for (as a cot_xxx value, see compiler_consts.py)
Returns: a set of cexpr_t of the specified type
Return type: set
-
decompiler_utils.get_all_vars_in_node(cex)¶ Extracts all variables involved in an expression.
Parameters: cex (cexpr_t) – typically a controlFlowinator node
Returns: list of var_t indexes (to cf.lvars)
Return type: list
-
decompiler_utils.get_cfg_for_ea(ea, dot_exe, out_dir)¶ Debugging helper.
Uses DOT to create a .PNG graphic of the controlFlowinator CFG and displays it.
Parameters: - ea (int) – address of the function to analyze
- dot_exe (string) – path to the DOT binary
- out_dir (string) – directory to write the .DOT file
-
decompiler_utils.get_cond_from_statement(ins)¶ Given a cinsn_t representing a control flow structure (do, while, for, etc.), it returns the corresponding cexpr_t representing the condition/argument for that code construct.
This is useful since we usually want to peek into conditional statements…
Parameters: ins (cinsn_t) – the cinsn_t associated with a control flow structure
Returns: the condition or argument within that control flow structure
Return type: cexpr_t
-
decompiler_utils.get_function_vars(c=None, ea=0, only_args=False, only_locals=False)¶ Populates a dict of my_var_t for the function containing the specified ea
Parameters: - c (controlFlowinator) – a controlFlowinator object, optional
- ea (int) – the function address
- only_args (bool, optional) – extract only function arguments
- only_locals (bool, optional) – extract only local variables
Returns: A dictionary of my_var_t, indexed by their index
-
decompiler_utils.get_interesting_calls(c, user_defined=[])¶ Not all functions are created equal. We are interested in functions with certain names or substrings in them.
Parameters: - c (controlFlowinator) – a controlFlowinator object
- user_defined (list, optional) – a list of names (or substrings). If not supplied, a hard-coded default list is used.
Returns: a list of callObj
Return type: list
-
decompiler_utils.get_return_type(cf=None)¶ Hack to get the return type of a function.
Parameters: cf (ida_hexrays.cfuncptr_t) – the result of decompile()
Returns: Type information for the return value
Return type: tinfo_t
-
decompiler_utils.is_arithmetic_expression(cex, only_these=[])¶ Checks whether this is an arithmetic expression.
Parameters: - cex (cexpr_t) – expression, usually this is a node.
- only_these (a list of cot_* constants, e.g. cot_add) – a list of arithmetic expressions to look for. These are defined in ida_hexrays
Returns: True or False
Return type: bool
-
decompiler_utils.is_array_indexing(ins)¶
-
decompiler_utils.is_asg(ins)¶
-
decompiler_utils.is_binary_truncation(cex)¶ Looking for expressions truncating a number
These expressions are of the form v1 & 0xFFFF or similar
Parameters: cex (cexpr_t) – an expression
Returns: True or False
Return type: bool
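For context, a mask such as 0xFFFF truncates a value because it is a run of contiguous low bits, i.e. a number of the form 2^n - 1. The check below sketches that arithmetic only. It is an assumption about what counts as a truncation mask, not FIDL's actual heuristic, and is_truncation_mask is a hypothetical name.

```python
def is_truncation_mask(mask):
    """True if `mask` is a run of contiguous low bits (2**n - 1, n >= 1)."""
    # Adding 1 to 0b0111...1 gives 0b1000...0, which shares no bits with it.
    return mask > 0 and (mask & (mask + 1)) == 0


value = 0x12345678
truncated = value & 0xFFFF  # keeps only the low 16 bits
```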
-
decompiler_utils.is_call(ins)¶
-
decompiler_utils.is_cast(ins)¶
-
decompiler_utils.is_final_expr(cex)¶ Helper for internal functions.
A final expression will be defined as one that can not be further decomposed, eg. number, var, string, etc.
Normally, you should not need to use this.
Parameters: cex (cexpr_t) – a cexpr_t object
Returns: True or False
Return type: bool
-
decompiler_utils.is_global_var(ins)¶ Tells whether ins is a global variable
TODO: enhance this heuristic
Parameters: ins – cexpr_t or insn_t
Returns: True or False
Return type: bool
-
decompiler_utils.is_if(ins)¶
-
decompiler_utils.is_number(ins)¶ Convenience wrapper
-
decompiler_utils.is_ptr(ins)¶
-
decompiler_utils.is_read(ins)¶ Try to find read primitives.
Looking for things like:
v3 = *(_DWORD *)(v5 + 784)
NOTE: this will find expressions that are read && write, since they are not mutually exclusive
TODO: Rather rough, it is a first version…
Parameters: node (cinsn_t or cexpr_t) – a controlFlowinator node
Returns: True or False
Return type: bool
-
decompiler_utils.is_ref(ins)¶
-
decompiler_utils.is_string(ins)¶ Convenience wrapper
-
decompiler_utils.is_var(ins)¶ Whether this ins corresponds to a variable
Remember that if this evaluates to True, we are dealing with an object of type var_ref_t, which is pretty much useless on its own. We may want to convert it to an lvar_t, and even better to a my_var_t afterwards.
ref2var() is a simple wrapper to perform the conversion between reference and variable
-
decompiler_utils.is_write(node)¶ Try to find write primitives.
Looking for things like:
*(_DWORD *)(something) = v38
arr[i] = v21
TODO: Rather rough, it is a first version…
Parameters: node (cinsn_t or cexpr_t) – a controlFlowinator node
Returns: True or False
Return type: bool
-
decompiler_utils.lex_citem_indexes(line)¶ Part of Lighthouse plugin
Lex all ctree item indexes from a given line of text. The HexRays decompiler output contains invisible text tokens that can be used to attribute spans of text to the ctree items that produced them.
-
decompiler_utils.lines_and_code(cf=None, ea=0)¶ Mapping of line numbers and code
Parameters: - cf (a cfunc_t object, optional) – a decompilation object
- ea (int, optional) – Address within the function to decompile, if no cf is provided
Returns: a dictionary of lines of code, indexed by line number
Return type: dict
-
decompiler_utils.main()¶
-
decompiler_utils.map_citem2line(line2citem)¶ Part of Lighthouse plugin
Creates a mapping of citem indexes to lines of code
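Conceptually this is a dictionary inversion. The sketch below assumes line2citem maps each line number to a collection of citem indexes and that each citem index belongs to a single line. Both are assumptions for illustration, and invert_line_map is a hypothetical name, not the Lighthouse code.

```python
def invert_line_map(line2citem):
    """Invert {line_number: citem_indexes} into {citem_index: line_number}."""
    citem2line = {}
    for line_number, citem_indexes in line2citem.items():
        for citem_index in citem_indexes:
            citem2line[citem_index] = line_number
    return citem2line


# Hypothetical mapping: line 0 has no items, line 1 has two, line 2 has one.
line2citem = {0: [], 1: [4, 7], 2: [9]}
citem2line = invert_line_map(line2citem)
```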
-
decompiler_utils.map_line2citem(decompilation_text)¶ Part of Lighthouse plugin
Map decompilation line numbers to citems. This function allows us to build a relationship between citems in the ctree and specific lines in the hexrays decompilation text.
-
decompiler_utils.map_line2node(cfunc, line2citem)¶ Part of Lighthouse plugin
Map decompilation line numbers to node (basic blocks) addresses. This function allows us to build a relationship between graph nodes (basic blocks) and specific lines in the hexrays decompilation text.
-
decompiler_utils.map_node2lines(line2node)¶ Part of Lighthouse plugin
Creates a mapping of nodes to lines of code
-
decompiler_utils.my_decompile(ea=None)¶ This is a workaround for the cache lifecycle problem.
It calls the pseudoViewer API if the function is not in the cache, in order to plug it in.
Parameters: ea (int) – Address within the function to decompile
Returns: decompilation object
Return type: a cfunc_t
-
decompiler_utils.my_get_func_name(ea)¶ Wrapper for get_func_name handling some corner cases.
Parameters: ea (int) – Address of the function whose name is to be resolved
-
class
decompiler_utils.my_var_t(var)¶ This wraps the lvar_t nicely into a more usable data structure.
It aggregates several interesting pieces of information in one place, e.g. is_arg, is_constrained, is_initialized, etc.
The most commonly used attributes for this class are:
- name
- type_name
- size
- is_arg
- is_pointer
- is_array
- is_signed
Parameters: var (lvar_t) – an object representing a local variable or function argument
-
decompiler_utils.num_value(ins)¶ Returns the numerical value of ins
Parameters: ins – cexpr_t or insn_t
-
decompiler_utils.points_to(ins)¶
-
class
decompiler_utils.pseudoViewer¶ This wraps the pseudoViewer API neatly.
We need it because some things don’t work unless you previously visited (or are currently visiting) the function whose decompiled form you want to analyze. Thus, we are forced to “Hack like in the movies”
NOTE: the performance penalty is negligible
-
OPEN_NEW= 1¶
-
REUSE_IF_PSEUDOCODE= -1¶
-
USE_EXISTING= 0¶
-
close()¶ Closes the pseudoviewer widget
-
show(ea=0, reuse=-1)¶ Displays the pseudoviewer widget
Parameters: - ea (int, optional) – address of the function to display
- reuse (int, optional) – how to reuse an existing pseudocode display, if any
-
-
decompiler_utils.ref2var(ref, c=None, cf=None)¶ Convenient wrapper to streamline the conversions between var_ref_t and lvar_t
Parameters: - c (controlFlowinator) – a controlFlowinator object, optional
- cf (a cfunc_t object) – a decompilation object (usually the result of decompile), optional
- ref (var_ref_t) – a reference to a variable in the pseudocode
Returns: an lvar_t object
Return type: lvar_t
-
decompiler_utils.ref_to(ins)¶
-
decompiler_utils.string_value(ins)¶ Gets the string corresponding to ins
Works with C-str and Unicode
Parameters: ins – cexpr_t or insn_t
Returns: string for this ins
Return type: string
-
decompiler_utils.value_of_global(ins)¶ Returns the value of a global variable