Related: ASM introduction, ASM Dictionary
Quick Start
Let's assume we are at ROM:FF123456.
Python>ea = ScreenEA()     # electronic arts?
Python>print ea
4279383126                 # nope...
Python>print "%x" % ea
ff123456                   # bingo!
So, that EA ("effective address") is just a 32-bit address (an unsigned integer). Here it is the address at the cursor position in IDA.
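Since the address is a plain integer, the conversions from the console session can be reproduced outside IDA too. A minimal sketch in plain Python; the helper name `format_ea` and the default segment label are made up for illustration (inside IDA you would ask IDA for the segment name instead).

```python
def format_ea(ea, segment="ROM"):
    """Render an address in the SEGMENT:ADDRESS style, e.g. ROM:FF123456.

    Hypothetical helper: the segment name is just a label supplied by
    the caller, it is not looked up anywhere.
    """
    # Mask to 32 bits so a negative (signed) value also prints as unsigned.
    return "%s:%08X" % (segment, ea & 0xFFFFFFFF)


print(format_ea(4279383126))   # ROM:FF123456
print(int("ff123456", 16))     # 4279383126
```

The mask only matters if some API hands back the address as a signed number; for values like the one above it changes nothing.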
Decoding ASM instructions
Okay, let's decode the current instruction.
Python>print idc.GetDisasm(ea)
LDRCC R3, [R0]
Python>print idc.GetMnem(ea)
LDR
Python>print idc.GetOpnd(ea,0)
R3
Python>print idc.GetOpnd(ea,1)
[R0]
Python>print idc.GetOpType(ea,0)
1
Python>print idc.GetOpType(ea,1)
3
Python>print idc.GetOperandValue(ea,0)
3
Python>print idc.GetOperandValue(ea,1)
0
Here we used the 'idc' module. The IDA console has it imported by default. Some useful functions for retrieving an ASM instruction:
- idc.GetDisasm(ea) => a string containing the human-readable ASM line
- idc.GetMnem(ea) => just the mnemonic of the instruction, without the conditional suffix and other stuff
- idc.GetOpnd(ea, i) => the i-th operand, as string
- idc.GetOpType(ea, i) => type of the i-th operand:
o_void   = 0  # No Operand
o_reg    = 1  # General Register (al,ax,es,ds...)                 reg
o_mem    = 2  # Direct Memory Reference (DATA)                    addr
o_phrase = 3  # Memory Ref [Base Reg + Index Reg]                 phrase
o_displ  = 4  # Memory Reg [Base Reg + Index Reg + Displacement]  phrase+addr
o_imm    = 5  # Immediate Value                                   value
o_far    = 6  # Immediate Far Address (CODE)                      addr
o_near   = 7  # Immediate Near Address (CODE)                     addr
8: I don't remember what... something related to MOVs (IDA names this value o_idpspec0, the first processor-specific operand type)
- idc.GetOperandValue(ea, i) => value of the i-th operand:
operand is an immediate value  => immediate value
operand has a displacement     => displacement
operand is a direct memory ref => memory address
operand is a register          => register number
operand is a register phrase   => phrase number
otherwise                      => -1
- s = GetString(GetOperandValue(ea, 1), -1, ASCSTR_C) => this is for strings... like ADR R0, aBlahBlah ; "Blah Blah"
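To make the raw numbers from GetOpType and GetOperandValue easier to read, the two lists above can be folded into a small lookup. This is only a sketch in plain Python: the names `OPERAND_TYPE_NAMES`, `VALUE_MEANING` and `describe_operand` are hypothetical, the numbers are copied from the list above, and type 8 is left out on purpose.

```python
# Operand type codes, copied from the list above (8 is deliberately omitted).
OPERAND_TYPE_NAMES = {
    0: "o_void",
    1: "o_reg",
    2: "o_mem",
    3: "o_phrase",
    4: "o_displ",
    5: "o_imm",
    6: "o_far",
    7: "o_near",
}

# What the operand value stands for, per operand type, as listed above.
VALUE_MEANING = {
    "o_imm": "immediate value",
    "o_displ": "displacement",
    "o_mem": "memory address",
    "o_reg": "register number",
    "o_phrase": "phrase number",
}


def describe_operand(op_type, op_value):
    """Turn an (operand type, operand value) pair into a readable string."""
    name = OPERAND_TYPE_NAMES.get(op_type, "type %d" % op_type)
    meaning = VALUE_MEANING.get(name, "value")
    return "%s (%s %d)" % (name, meaning, op_value)


# The two operands of LDRCC R3, [R0] from the console session above:
print(describe_operand(1, 3))   # o_reg (register number 3)
print(describe_operand(3, 0))   # o_phrase (phrase number 0)
```

Inside IDA you would feed it with something like describe_operand(idc.GetOpType(ea, 0), idc.GetOperandValue(ea, 0)).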
Converting operands
AJ said:
I use the 'S' key to change wherever there is a [SP,#48,arg76] -> [SP,#0x04]
To do this in IDC or IDAPython, use:
OpSeg(ea, i)
Related: OpAlt, OpBinary, OpChr, OpDecimal, OpEnumEx, OpFloat, OpHex, OpHigh, OpNot, OpNumber, OpOctal, OpOffEx, OpOff, OpSeg, OpSign, OpStkvar, OpStroffEx [1].
ASM Functions
At the beginning of the script, import idautils and idc (plus idaapi, which some of the snippets here call explicitly):
from idautils import *
from idc import *
import idaapi
Get the current function (from cursor):
ea = ScreenEA()
func = idaapi.get_func(ea)
funcname = GetFunctionName(func.startEA)
List all functions in current segment:
for funcea in Functions(SegStart(ea), SegEnd(ea)):
    name = GetFunctionName(funcea)
    print name
Find a function with a known name, let's say "call" or "TH_call":
for funcea in Functions(SegStart(ea), SegEnd(ea)):
    name = GetFunctionName(funcea)
    if name in ["call", "TH_call"]:
        ourfunc = funcea
        break
print "found at %X" % ourfunc # if not found, an error is raised (ourfunc not defined)
Iterating through the instructions of a function:
E = list(FuncItems(ea))
for e in E:
    print "%X" % e, GetDisasm(e)
Find references to some function:
for ref in CodeRefsTo(ourfunc, 1):
    print "  called from %s (0x%x)" % (GetFunctionName(ref), ref)
There is also CodeRefsFrom.
Subtle stuff: CodeRefsTo may return an address of an instruction which does not belong to any function. Here's how I handle this situation:
for ref in CodeRefsTo(ourfunc, 1):
    print "  called from %s(0x%x)" % (GetFunctionName(ref), ref)
    E = list(FuncItems(ref))
    if len(E) == 0:
        print "ORPHAN CALL (NOT IN A FUNCTION)!!!!"
        print "  at %X " % ref
        continue
    ...
Set the signature of current function:
SetType(func.startEA, "int foo(int a, int b, int c)")
Wiki tables
I might create some reusable APIs for wiki tables. Until then, look at this:
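# Funcs and CallFuncs are dictionaries built earlier in the script (not shown here):
# CallFuncs[f] is a list of strings describing the callers of f.
tab = open("table.txt", "w")
tab.write("""{| border="1"
! scope="col"|Function name
! scope="col"|Called from: func (addr)
""")
for f in sorted(Funcs.keys()):
    tab.write("""|-
|%s
|%s
""" % (f, ",".join(CallFuncs[f])))
tab.write("|}\n")
tab.close()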
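Until a real API exists, the same table layout can be wrapped in a small function. This is only a sketch, not the author's API: `wiki_table` is a hypothetical name, and the function names and addresses in the sample data are made up.

```python
def wiki_table(headers, rows):
    """Build a wiki table (same markup as the snippet above) as one string.

    headers: list of column titles
    rows:    list of rows, each row being a list/tuple of cell strings
    """
    lines = ['{| border="1"']
    for header in headers:
        lines.append('! scope="col"|%s' % header)
    for row in rows:
        lines.append("|-")           # row separator
        for cell in row:
            lines.append("|%s" % cell)
    lines.append("|}")
    return "\n".join(lines) + "\n"


# Made-up sample data, only to show the output format.
callers = {"TH_call": ["foo (0xFF000010)", "bar (0xFF000020)"]}
rows = [(name, ",".join(callers[name])) for name in sorted(callers)]
print(wiki_table(["Function name", "Called from: func (addr)"], rows))
```

Returning a string instead of writing straight to a file keeps the helper usable both from the IDA console (just print it) and from scripts that save the table to disk.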
Questions
(stuff I don't know, but I'd like to):
- How to get the conditional suffix (or whatever it is called) of an instruction? [I used some regexes for this]
- How to get the list of ALL functions from the dump (not just the current segment)? [IDA bug?]
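For the first question, the regex approach could look roughly like the sketch below. It is plain Python and makes two assumptions: the condition code directly follows the base mnemonic (as in LDRCC), and the base mnemonic is the string returned by GetMnem. The name `condition_suffix` is hypothetical, and syntaxes that put other suffixes before the condition are not handled.

```python
import re

# The standard ARM condition codes (HS/LO are aliases of CS/CC).
ARM_CONDITIONS = ("EQ", "NE", "CS", "HS", "CC", "LO", "MI", "PL",
                  "VS", "VC", "HI", "LS", "GE", "LT", "GT", "LE", "AL")


def condition_suffix(disasm, mnem):
    """Return the condition code that follows `mnem` in `disasm`.

    Returns "" for an unconditional instruction and None when the
    disassembly line does not start with the given mnemonic.
    """
    pattern = r"^\s*%s(%s)?" % (re.escape(mnem), "|".join(ARM_CONDITIONS))
    match = re.match(pattern, disasm, re.IGNORECASE)
    if match is None:
        return None
    return (match.group(1) or "").upper()


print(condition_suffix("LDRCC R3, [R0]", "LDR"))   # CC
print(condition_suffix("LDR R3, [R0]", "LDR"))     # (empty line)
```

In IDA this would be called as condition_suffix(idc.GetDisasm(ea), idc.GetMnem(ea)). Passing the base mnemonic in is what keeps cases like BLE apart: with mnemonic "B" the suffix is "LE", with mnemonic "BL" there is none.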
Bugs and workarounds
Python caches imported modules, and IDAPython apparently keeps that cache alive between script runs. So, if you import foo in your program, run it, then edit "foo.py", then run the program again, it WILL NOT LOAD YOUR CHANGES. You will have to either reload the firmware, restart IDA or, better, use the workaround below.
Workaround: include your files with the ugly:
import inspect, os
scriptdir = os.path.dirname(inspect.getfile(inspect.currentframe()))
execfile(os.path.join(scriptdir, "foo.py"))
and make sure there are no other side effects.
Since IDA 6.5, you can also use
idaapi.require("foo")
instead.
TODO: update all scripts with this method.
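To see where the caching comes from: Python stores every imported module in sys.modules, and a later import of the same name just returns that stored object. The sketch below forces a fresh import by dropping the cached entry first. It is a plain-Python illustration under that assumption, not a description of how idaapi.require works, and `fresh_import` is a hypothetical name.

```python
import importlib
import sys


def fresh_import(name):
    """Import module `name` from disk again, ignoring the cached copy."""
    # Forget the cached module object, if there is one.
    sys.modules.pop(name, None)
    # Newer Python versions also cache directory listings; older ones
    # do not have this function, hence the getattr.
    invalidate = getattr(importlib, "invalidate_caches", None)
    if invalidate is not None:
        invalidate()
    return importlib.import_module(name)


# Usage (foo.py must be somewhere on sys.path):
#   foo = fresh_import("foo")
```

One caveat: other modules that already hold a reference to the old module object keep using the old code, so this works best for modules imported only from the main script.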
Next steps
Start writing some scripts and share them. You may also check the IDAPython/Tracing calls tutorial.