Core API

New in version 1.0.

This section documents the FIDL core API. It is intended for developers of IDA plugins.

API Overview

class decompiler_utils.BBGraph(f_ea)

Representation of the assembly CFG for a function

find_connected_paths(bb_start, bb_end, co=10)

Leverages NetworkX to find all connected paths

Parameters:
  • bb_start (Basic block) – Initial basic block
  • bb_end (Basic block) – Final basic block
  • co (int, optional) – Cutoff parameter

NOTE: the cutoff parameter in nx.all_simple_paths serves two purposes:

  • it reduces the chances of melting the CPU (the algorithm is O(n!))
  • nobody will manually inspect monstrous paths anyway
Returns:generator of lists or None
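
To make the effect of the cutoff concrete, here is a minimal pure-Python sketch of bounded simple-path enumeration. It is a stand-in for the idea behind nx.all_simple_paths, not FIDL's or NetworkX's code. The function name, the dict-based graph and the reading of the cutoff as a maximum number of edges are assumptions made for illustration.

```python
def all_simple_paths(graph, start, end, cutoff):
    """Yield every simple path from start to end using at most `cutoff` edges.

    `graph` is a plain dict mapping a node to an iterable of its successors
    (a hypothetical stand-in for a real CFG object).
    """
    path = [start]
    on_path = {start}

    def walk(node):
        if node == end:
            yield list(path)
            return
        # Stop descending once the path already uses `cutoff` edges.
        if len(path) - 1 >= cutoff:
            return
        for succ in graph.get(node, ()):
            if succ in on_path:  # a simple path never revisits a node
                continue
            path.append(succ)
            on_path.add(succ)
            yield from walk(succ)
            path.pop()
            on_path.remove(succ)

    yield from walk(start)


# A small diamond (1 -> 2|3 -> 4) plus one long detour (1 -> 5 -> 6 -> 7 -> 4).
cfg = {1: [2, 3, 5], 2: [4], 3: [4], 5: [6], 6: [7], 7: [4], 4: []}
short_paths = list(all_simple_paths(cfg, 1, 4, cutoff=2))
every_path = list(all_simple_paths(cfg, 1, 4, cutoff=10))
```

With a small cutoff the long detour is never reported, which keeps both the running time and the amount of output manageable.
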
get_node(addr)

Given an address within a function, returns the address of the basic block that contains it (or None)

Parameters:addr (int) – address within a function
Returns:Address of the node containing the input address
Return type:int
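
The lookup described above can be pictured with a short sketch. This is only an illustration under stated assumptions: basic blocks are modelled as sorted, half-open (start, end) address ranges, and block_containing is a hypothetical helper, not part of FIDL.

```python
from bisect import bisect_right


def block_containing(blocks, addr):
    """Return the start address of the block containing addr, or None.

    `blocks` is a list of (start, end) half-open ranges sorted by start.
    """
    starts = [start for start, _ in blocks]
    i = bisect_right(starts, addr) - 1  # last block starting at or before addr
    if i >= 0 and blocks[i][0] <= addr < blocks[i][1]:
        return blocks[i][0]
    return None


blocks = [(0x1000, 0x1010), (0x1010, 0x1024), (0x1030, 0x1040)]
inside = block_containing(blocks, 0x1012)   # falls in the second block
in_gap = block_containing(blocks, 0x1028)   # between two blocks
```
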
decompiler_utils.NonLibFunctions(start_ea=None, min_size=0)

Generator yielding only non-lib functions

Parameters:
  • start_ea (int, optional) – Address to start looking for non-library functions.
  • min_size (int, optional) – Minimum function size. Useful to filter small, uninteresting functions.
decompiler_utils.all_paths_between(c, start_node=None, end_node=None, co=40)

Calculates all paths between start_node and end_node

Calculating paths is one of those things that is better done with the parallel index graph (c.i_cfg). It goes haywire when done with complex elements.

FIXME: the co (cutoff) param is necessary to avoid complexity explosion. However, there is a problem if it’s reached…

Returns:it yields a list of nodes for each path
Return type:list

decompiler_utils.assigns_to_var(cex)

Does this cexpr_t assign a value to any variable?

TODO: this is limited for now to expressions of the type:

v1 = something something
Parameters:cex (cexpr_t) – a cexpr_t object
Returns:the assigned var index (to cf.lvars array) or -1 if the cexpr_t does not assign to any variable
Return type:int
decompiler_utils.blowup_expression(cex, final_operands=None)

Extracts all elements of an expression

Ex: x + 1 < y -> {x, 1, y}

Parameters:cex (cexpr_t) – a cexpr_t object
Returns:a set of elements (the final_operands)
Return type:set
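
The x + 1 < y example can be reproduced with a toy expression tree. The sketch below is not built on cexpr_t: the Expr class and the leaves function are hypothetical stand-ins that only mirror the recursive accumulation suggested by the final_operands argument.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Expr:
    """Toy binary expression node (a stand-in for a real cexpr_t)."""
    op: str
    left: Union["Expr", str, int]
    right: Union["Expr", str, int]


def leaves(node, found=None):
    """Collect every operand that cannot be decomposed any further."""
    if found is None:
        found = set()
    if isinstance(node, Expr):
        leaves(node.left, found)
        leaves(node.right, found)
    else:
        found.add(node)  # a variable name or a number
    return found


# x + 1 < y
condition = Expr("<", Expr("+", "x", 1), "y")
operands = leaves(condition)
```

Passing the accumulator down the recursion, instead of merging sets on the way back up, keeps the traversal linear in the size of the expression.
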
class decompiler_utils.cImporter

Collect import information

This is mainly to work around the fact that get_func_name does not resolve imports…

get_imports_info()
class decompiler_utils.callObj(c=None, name='', node=None, expr=None)

Auxiliary object for code clarity.

It represents the occurrence of a call expression.

Parameters:
  • name (string, optional) – name of the function called
  • node (controlFlowinator) – a controlFlowinator node containing the call expression
  • expr (cexpr_t) – the call expression element
decompiler_utils.citem2higher(citem)

This gets the higher representation of a given citem, that is, a cinsn_t or cexpr_t.

Parameters:citem (citem) – a citem object
class decompiler_utils.controlFlowinator(ea=None, fast=True)

This is the main object of FIDL’s API.

It finds all decompiled code “blocks” and recreates a CFG based on this information.

This gives us the best of both worlds: the possibility to analyze a graph (like in disassembly mode) and the power of citem-based analysis.

Some analyses are performed after the CFG has been constructed. They are rather costly, so they are turned off by default (fast=True). Use fast=False to apply them and get a better CFG.

Parameters:
  • ea (int) – address of the function to analyze
  • fast (bool) – Set to False for an object with richer information
dump_cfg(out_dir)

Dump the CFG for debugging purposes

This dumps a representation of the CFG in DOT format. To generate an image:

dot.exe -Tpng decompiled.dot -o decompiled.png

dump_i_cfg()

Dump interim CFG for debugging purposes
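
A typical session might combine the constructor and dump_cfg() as sketched below. Treat it as an outline under assumptions rather than canonical usage: the import path FIDL.decompiler_utils is assumed (this page only shows decompiler_utils), the code only does real work inside IDA Pro, and the api argument is a hypothetical hook that lets the helper be exercised without IDA.

```python
try:
    # ASSUMPTION: FIDL is installed as a package; only importable inside IDA Pro.
    import FIDL.decompiler_utils as du
except ImportError:
    du = None


def dump_function_cfg(ea, out_dir, api=None):
    """Build a rich controlFlowinator for `ea` and dump its CFG in DOT format."""
    api = api if api is not None else du
    if api is None:
        raise RuntimeError("FIDL is only available inside IDA Pro")
    # fast=False enables the costlier post-processing described above.
    c = api.controlFlowinator(ea=ea, fast=False)
    c.dump_cfg(out_dir)
    return c
```

The resulting DOT file can then be rendered with the dot command shown above.
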

decompiler_utils.create_comment(c=None, ea=0, comment='')

Displays a comment at the line corresponding to ea

Returns:True if the comment was successfully created
Return type:bool

decompiler_utils.debug_blownup_expressions(c=None, node=None)

Debugging helper.

Show all blown up expressions for this function.

Parameters:c (controlFlowinator) – a controlFlowinator object
decompiler_utils.debug_get_break_statements(c)
decompiler_utils.debug_stahp()

Toggles DEBUG value, useful for testing

decompiler_utils.decast(ins)

Remove the cast, returning the casted element

decompiler_utils.display_all_calls_to(func_name)

Wraps display_line_at(), since this is the most common use of this API.

Parameters:func_name (string) – name of the function whose references are searched
decompiler_utils.display_line_at(ea, silent=False)

Displays the line of pseudocode corresponding to ea

This is useful to quickly answer questions like:

  • “Is this function always called with its first parameter being a constant?”
  • “I want to see all the error messages displayed by this function”
  • etc.
Parameters:
  • ea (int) – address of an element contained within the line to display
  • silent (bool) – flag controlling verbose output
decompiler_utils.display_node(c=None, node=None, color=None)

Displays a given node in the pseudoviewer

decompiler_utils.display_path(cf=None, path=None, color=None)

Shows a path’s code and colors its lines.

Parameters:
  • cf (a cfunc_t object, optional) – a decompilation object
  • path (list) – a list of controlFlowinator nodes
  • color (int, optional) – color to mark the lines of code corresponding to path
Returns:

a list of function lines (path nodes)

Return type:

list

decompiler_utils.do_for_all_funcs(func, fast=True, start_ea=None, blacklist=None, min_size=100, **kwargs)

This is a generic wrapper for all kinds of logic that we want to apply to all the functions in the binary.

Parameters:
  • func (function) – function “pointer” performing the analysis. Its only mandatory argument is a controlFlowinator object.
  • fast (boolean, optional) – parameter fast for the controlFlowinator object.
  • start_ea (int, optional) – Address to start looking for non-library functions.
  • blacklist (function, optional) – a function determining whether to process a function. Implemented via dependency injection.
Returns:

A list of JSON-like messages (individual function results)

Return type:

list
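
The callback-plus-blacklist design can be illustrated without IDA. The driver below is a hypothetical stand-in, not FIDL's implementation. It assumes that a truthy blacklist result means "skip this function", that only non-empty results are collected, and it replaces the controlFlowinator with a plain dict built by make_cfg.

```python
def for_each_function(func, function_eas, make_cfg, blacklist=None, **kwargs):
    """Apply `func` to every function address that the blacklist lets through."""
    results = []
    for ea in function_eas:
        # ASSUMPTION: a truthy return value from the blacklist means "skip".
        if blacklist is not None and blacklist(ea):
            continue
        cfg = make_cfg(ea)            # stand-in for building a controlFlowinator
        result = func(cfg, **kwargs)  # the callback's only mandatory argument
        if result:                    # ASSUMPTION: empty results are dropped
            results.append(result)
    return results


def report_size(cfg, label="size"):
    """Example analysis callback returning a JSON-like message."""
    return {"ea": cfg["ea"], label: len(cfg["nodes"])}


fake_cfgs = {0x1000: ["a", "b"], 0x2000: ["a"], 0x3000: ["a", "b", "c"]}
messages = for_each_function(
    report_size,
    sorted(fake_cfgs),
    make_cfg=lambda ea: {"ea": ea, "nodes": fake_cfgs[ea]},
    blacklist=lambda ea: ea == 0x2000,
    label="node_count",
)
```

Because the filter is injected as a function, callers can change which functions are processed without touching the driver.
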

decompiler_utils.does_constrain(node)

This tries to answer the question: “Does this node constrain variables in any way?”

Essentially it is looking for the occurrence of variables within known constrainer constructs, eg. inside an if condition.

TODO: many more heuristics can be included here

Parameters:node (cinsn_t or cexpr_t) – typically a controlFlowinator node
Returns:a set of variable indexes (to cf.lvars array)
Return type:set
decompiler_utils.dprint(s='')

This will print a debug message only if debugging is active

Parameters:s (str, optional) – The debug message
decompiler_utils.dump_lvars(ea=0)

Debugging helper.

decompiler_utils.dump_pseudocode(ea=0)

Debugging helper.

decompiler_utils.find_all_calls_to(f_name, bruteforce=True)

Finds all calls to a function with the given name

Note that the string comparison is relaxed to find variants of the name, that is, searching for malloc will also match _malloc, malloc_0, etc.

Parameters:
  • f_name (string) – the function name to search for
  • bruteforce (bool, optional) – fallback to bruteforce (search all functions)
Returns:

a list of callObj

Return type:

list
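
One plausible reading of the relaxed comparison is a plain substring test, sketched below. This is an assumption made for illustration, and FIDL's actual matching rule may differ. relaxed_name_match is a hypothetical helper.

```python
def relaxed_name_match(wanted, candidate):
    """ASSUMPTION: 'relaxed' is modelled here as a plain substring test."""
    return wanted in candidate


names = ["malloc", "_malloc", "malloc_0", "calloc", "xmalloc_wrapper"]
matches = [name for name in names if relaxed_name_match("malloc", name)]
```

Under this reading a wrapper such as xmalloc_wrapper also matches, so the results of a relaxed search are worth reviewing before acting on them.
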

decompiler_utils.find_all_calls_to_within(f_name, ea=0, c=None)

Finds all calls to a function with the given name within the function containing the ea address.

Note that the string comparison is relaxed to find variants of the name, that is, searching for malloc will also match _malloc, malloc_0, etc.

Parameters:
  • f_name (string) – the function name to search for
  • ea (int) – any address within the function that may contain the calls
  • c (controlFlowinator, optional) – if specified, work on this controlFlowinator object
Returns:

a list of callObj

Return type:

list

decompiler_utils.find_elements_of_type(cex, element_type, elements=None)

Recursively extracts expression elements until a cexpr_t from a specific group is found

Parameters:
  • cex (cexpr_t) – a cexpr_t object
  • element_type (a cot_xxx value (eg. cot_add)) – the type of element we are looking for (as a cot_xxx value, see compiler_consts.py)
Returns:

a set of cexpr_t of the specified type

Return type:

set

decompiler_utils.get_all_vars_in_node(cex)

Extracts all variables involved in an expression.

Parameters:cex (cexpr_t) – typically a controlFlowinator node
Returns:list of var_t indexes (to cf.lvars)
Return type:list
decompiler_utils.get_cfg_for_ea(ea, dot_exe, out_dir)

Debugging helper.

Uses DOT to create a .PNG graphic of the ControlFlowinator CFG and displays it.

Parameters:
  • ea (int) – address of the function to analyze
  • dot_exe (string) – path to the DOT binary
  • out_dir (string) – directory to write the .DOT file
decompiler_utils.get_cond_from_statement(ins)

Given a cinsn_t representing a control flow structure (do, while, for, etc.), it returns the corresponding cexpr_t representing the condition/argument for that code construct.

This is useful since we usually want to peek into conditional statements…

Parameters:ins (cinsn_t) – the cinsn_t associated with a control flow structure
Returns:the condition or argument within that control flow structure
Return type:cexpr_t
decompiler_utils.get_expr(n)

Returns the corresponding cexpr_t in case n is of type cinsn_t. Returns n unchanged otherwise.

decompiler_utils.get_function_vars(c=None, ea=0, only_args=False, only_locals=False)

Populates a dict of my_var_t for the function containing the specified ea

Parameters:
  • c (controlFlowinator) – a controlFlowinator object, optional
  • ea (int) – the function address
  • only_args (bool, optional) – extract only function arguments
  • only_locals (bool, optional) – extract only local variables
Returns:

A dictionary of my_var_t, indexed by their index

decompiler_utils.get_interesting_calls(c, user_defined=[])

Not all functions are created equal. We are interested in functions with certain names or substrings in them.

Parameters:
  • c (controlFlowinator) – a controlFlowinator object
  • user_defined (list, optional) – a list of names (or substrings). If not supplied, a hard-coded default list is used.
Returns:

a list of callObj

Return type:

list

decompiler_utils.get_return_type(cf=None)

Hack to get the return type of a function.

Parameters:cf (ida_hexrays.cfuncptr_t) – the result of decompile()
Returns:Type information for the return value
Return type:tinfo_t
decompiler_utils.is_arithmetic_expression(cex, only_these=[])

Checks whether this is an arithmetic expression.

Parameters:
  • cex (cexpr_t) – expression, usually this is a node.
  • only_these (a list of cot_* constants, e.g. cot_add) – a list of arithmetic expression types to look for. These are defined in ida_hexrays.
Returns:

True or False

Return type:

bool

decompiler_utils.is_array_indexing(ins)
decompiler_utils.is_asg(ins)
decompiler_utils.is_binary_truncation(cex)

Looking for expressions truncating a number

These expressions are of the form v1 & 0xFFFF or similar

Parameters:cex (cexpr_t) – an expression
Returns:True or False
Return type:bool
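
The kind of mask involved in v1 & 0xFFFF can be recognised with a small bit trick. The sketch assumes that a truncating mask is a run of low-order one bits, which is one interpretation of the description above and not necessarily the heuristic FIDL applies. is_low_bit_mask is a hypothetical helper.

```python
def is_low_bit_mask(value):
    """True if value looks like 0b0...011...1, e.g. 0xFF or 0xFFFF.

    Adding 1 to such a number clears every set bit and sets the next
    higher one, so the bitwise AND of the two is zero.
    """
    return value > 0 and (value & (value + 1)) == 0


keeps_low_word = is_low_bit_mask(0xFFFF)   # v1 & 0xFFFF truncates to 16 bits
selects_high_byte = is_low_bit_mask(0xFF00)  # selects bits 8..15 instead
```
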
decompiler_utils.is_call(ins)
decompiler_utils.is_cast(ins)
decompiler_utils.is_final_expr(cex)

Helper for internal functions.

A final expression is defined as one that cannot be further decomposed, e.g. number, var, string, etc.

Normally, you should not need to use this.

Parameters:cex (cexpr_t) – a cexpr_t object
Returns:True or False
Return type:bool
decompiler_utils.is_global_var(ins)

Tells whether ins is a global variable

TODO: enhance this heuristic

Parameters:ins (cexpr_t or insn_t)
Returns:True or False
Return type:bool
decompiler_utils.is_helper(ins)

Helpers are IDA macros, e.g. __ROR__ or LOBYTE

decompiler_utils.is_if(ins)
decompiler_utils.is_member_pointer(ins)

Convenience wrapper

decompiler_utils.is_number(ins)

Convenience wrapper

decompiler_utils.is_ptr(ins)
decompiler_utils.is_read(ins)

Try to find read primitives.

Looking for things like:

v3 = *(_DWORD *)(v5 + 784)

NOTE: this will find expressions that are read && write, since they are not mutually exclusive

TODO: Rather rough, it is a first version…

Parameters:ins (cinsn_t or cexpr_t) – a controlFlowinator node
Returns:True or False
Return type:bool
decompiler_utils.is_ref(ins)
decompiler_utils.is_return(ins)
decompiler_utils.is_string(ins)

Convenience wrapper

decompiler_utils.is_struct_member(ins)

Convenience wrapper

decompiler_utils.is_var(ins)

Whether this ins corresponds to a variable

Remember that if this evaluates to True, we are dealing with an object of type var_ref_t, which is pretty much useless on its own. We may want to convert it to an lvar_t and, even better, to a my_var_t afterwards.

ref2var() is a simple wrapper to perform the conversion between reference and variable.

decompiler_utils.is_write(node)

Try to find write primitives.

Looking for things like:

*(_DWORD *)(something) = v38
arr[i] = v21

TODO: Rather rough, it is a first version…

Parameters:node (cinsn_t or cexpr_t) – a controlFlowinator node
Returns:True or False
Return type:bool
decompiler_utils.lex_citem_indexes(line)

Part of Lighthouse plugin

Lex all ctree item indexes from a given line of text. The HexRays decompiler output contains invisible text tokens that can be used to attribute spans of text to the ctree items that produced them.

decompiler_utils.lines_and_code(cf=None, ea=0)

Mapping of line numbers and code

Parameters:
  • cf (a cfunc_t object, optional) – a decompilation object
  • ea (int, optional) – Address within the function to decompile, if no cf is provided
Returns:

a dictionary of lines of code, indexed by line number

Return type:

dict

decompiler_utils.main()
decompiler_utils.map_citem2line(line2citem)

Part of Lighthouse plugin

Creates a mapping of citem indexes to lines of code

decompiler_utils.map_line2citem(decompilation_text)

Part of Lighthouse plugin

Map decompilation line numbers to citems. This function allows us to build a relationship between citems in the ctree and specific lines in the hexrays decompilation text.

decompiler_utils.map_line2node(cfunc, line2citem)

Part of Lighthouse plugin

Map decompilation line numbers to node (basic blocks) addresses. This function allows us to build a relationship between graph nodes (basic blocks) and specific lines in the hexrays decompilation text.

decompiler_utils.map_node2lines(line2node)

Part of Lighthouse plugin

Creates a mapping of nodes to lines of code
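
Inverting a line-to-node mapping into a node-to-lines mapping can be sketched as follows. The data shapes are assumptions: line2node is taken to map a line number to a collection of node addresses, and invert_line2node is a hypothetical name, not the Lighthouse or FIDL implementation.

```python
from collections import defaultdict


def invert_line2node(line2node):
    """Turn {line: nodes} into {node: set of lines}."""
    node2lines = defaultdict(set)
    for line, nodes in line2node.items():
        for node in nodes:
            node2lines[node].add(line)
    return dict(node2lines)


# Line 2 of the pseudocode is produced by two different nodes.
line2node = {1: {0x10}, 2: {0x10, 0x20}, 3: {0x20}}
node2lines = invert_line2node(line2node)
```
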

decompiler_utils.member_info(ins)

Returns info about a structure member or a pointer to it

Parameters:ins (cexpr_t or insn_t)
decompiler_utils.my_decompile(ea=None)

This sets flags necessary to use this programmatically.

Parameters:ea (int) – Address within the function to decompile
Returns:decompilation object
Return type:a cfunc_t
decompiler_utils.my_get_func_name(ea)

Wrapper for get_func_name handling some corner cases.

Parameters:ea (int) – address of the function whose name is to be resolved
class decompiler_utils.my_var_t(var)

This wraps the lvar_t nicely into a more usable data structure.

It aggregates several interesting pieces of information in one place, e.g. is_arg, is_constrained, is_initialized, etc.

The most commonly used attributes for this class are:

  • name
  • type_name
  • size
  • is_arg
  • is_pointer
  • is_array
  • is_signed
Parameters:var (lvar_t) – an object representing a local variable or function argument
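
As an illustration of these attributes, the sketch below lists the pointer arguments of a function via get_function_vars(). It rests on assumptions: the import path FIDL.decompiler_utils is assumed, the code only does real work inside IDA Pro, and the api argument is a hypothetical hook for running the helper without IDA.

```python
try:
    # ASSUMPTION: FIDL is installed as a package; only importable inside IDA Pro.
    import FIDL.decompiler_utils as du
except ImportError:
    du = None


def pointer_arguments(ea, api=None):
    """Return the sorted names of the pointer-typed arguments of the function at ea."""
    api = api if api is not None else du
    if api is None:
        raise RuntimeError("FIDL is only available inside IDA Pro")
    # get_function_vars() returns a dictionary of my_var_t, indexed by their index.
    variables = api.get_function_vars(ea=ea, only_args=True)
    return sorted(v.name for v in variables.values() if v.is_pointer)
```
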
decompiler_utils.num_value(ins)

Returns the numerical value of ins

Parameters:ins (cexpr_t or insn_t)
decompiler_utils.points_to(ins)
class decompiler_utils.pseudoViewer

This wraps the pseudoViewer API neatly.

We need it because some things don’t work unless you previously visited (or are currently visiting) the function whose decompiled form you want to analyze. Thus, we are forced to “Hack like in the movies”.

TODO: probably deprecate this after IDA 7.5 changes

NOTE: the performance penalty is negligible

close()

Closes the pseudoviewer widget

show(ea=0, flags=8)

Displays the pseudoviewer widget

Parameters:
  • ea (int, optional) – address of the function to display
  • flags (int, optional) – how to flag an existing pseudocode display, if any
silent_flags = 8
decompiler_utils.ref2var(ref, c=None, cf=None)

Convenient wrapper to streamline the conversions between var_ref_t and lvar_t

Parameters:
  • c (controlFlowinator) – a controlFlowinator object, optional
  • cf (a cfunc_t object) – a decompilation object (usually the result of decompile), optional
  • ref (var_ref_t) – a reference to a variable in the pseudocode
Returns:

an lvar_t object

Return type:

lvar_t

decompiler_utils.ref_to(ins)
decompiler_utils.string_value(ins)

Gets the string corresponding to ins

Works with C-str and Unicode

Parameters:ins (cexpr_t or insn_t)
Returns:string for this ins
Return type:string
decompiler_utils.value_of_global(ins)

Returns the value of a global variable