API reference for ARM firmware analysis console
Examples are given for the disassembly of autoexec.bin (550D, 1.0.8) and for the 550D dump, which can be obtained by running Magic Lantern on your camera.
Loading some dumps and some names
In [1]: D = load_dumps("(autoexec|108)")
Did not find the matching bin for mynames.idc. Ignoring this file.
Did not find the matching bin for mychanges.idc. Ignoring this file.
Input files:
===============================================================================
Binary dump (*.bin) LoadAddr IDC database (*.idc)
===============================================================================
550d.108.0xff010000.bin FF010000 550d.108.20101116_indy_ROM0.idc
autoexec.0x8A000.bin 8A000 n/a
===============================================================================
Disassembling 550d.108.0xff010000.bin <ff010000>... ok
Disassembling autoexec.0x8A000.bin <8a000>... ok
Parsing 550d.108.20101116_indy_ROM0.idc... found 56768 MakeName's and 18053 MakeFunction's
Parsing disassembly of 550d.108.0xff010000.bin... found 1395198 lines
Indexing references...
Parsing disassembly of autoexec.0x8A000.bin... found 19564 lines
Indexing references...
In [2]: t2i,ml = D # D is sorted by bin file name
In [3]: ml.load_names("stubs-550d.108.S")
Found 80 stubs in stubs-550d.108.S.
In [4]: ml.load_names("autoexec.S")
Found 8 stubs in autoexec.S.
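The input table above suggests that the load address of each dump is encoded in its file name (for example, autoexec.0x8A000.bin loads at 8A000). Here is a minimal standalone sketch of that naming convention in Python 3. The helper parse_load_addr is hypothetical and is not part of the console; how load_dumps really finds the address is not shown here.

```python
import re

def parse_load_addr(bin_name):
    """Return the load address embedded in a dump file name, or None.

    Hypothetical helper: assumes names of the form <label>.0x<hex>.bin,
    like the ones listed in the input table.
    """
    m = re.search(r"\.0x([0-9a-fA-F]+)\.bin$", bin_name)
    return int(m.group(1), 16) if m else None

print(hex(parse_load_addr("550d.108.0xff010000.bin")))
print(hex(parse_load_addr("autoexec.0x8A000.bin")))
```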
Magic commands
Shortcuts to misc stuff.
They operate on the currently selected dump, so first you have to select one:
In [5]: sel ml
Go to address or name:
In [6]: g 8B194
8b194: e59fb19c ldr r11, [pc, #412] ; 0x8b338: pointer to 0xc2b8c
8b198: e59fc19c ldr r12, [pc, #412] ; 0x8b33c: pointer to 0x8B7F4 (bmp_printf)
8b19c: e58b0000 str r0, [r11]
8b1a0: e1a01006 mov r1, r6
8b1a4: e3a02028 mov r2, #40 ; 0x28
8b1a8: e3a00802 mov r0, #131072 ; 0x20000
8b1ac: e59f318c ldr r3, [pc, #396] ; **'ML v. %s (%s)\nBuilt on %s by %s\n'
8b1b0: e88d0090 stm sp, {r4, r7}
8b1b4: e58dc010 str r12, [sp, #16]
8b1b8: e58d8008 str r8, [sp, #8]
In [7]: g fprintf
NSTUB(fprintf, 8eec0):
8eec0: e92d000e push {r1, r2, r3}
8eec4: e92d4070 push {r4, r5, r6, lr}
8eec8: e24ddf41 sub sp, sp, #260 ; 0x104
8eecc: e28dcf46 add r12, sp, #280 ; 0x118
8eed0: e1a05000 mov r5, r0
8eed4: e1a0300c mov r3, r12
8eed8: e59d2114 ldr r2, [sp, #276]
8eedc: e58dc100 str r12, [sp, #256]
8eee0: e1a0000d mov r0, sp
8eee4: e59fc03c ldr r12, [pc, #60] ; 0x8ef28: pointer to 0xFF1D6638 (vsnprintf)
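The "pointer to" comments in these listings come from PC-relative loads. In ARM (32-bit) state the PC reads as the address of the current instruction plus 8, so the address read by ldr rX, [pc, #offset] can be checked by hand. A small standalone sketch (the function name is made up for this example):

```python
def pc_relative_target(insn_addr, offset):
    """Address read by 'ldr rX, [pc, #offset]' in ARM (32-bit) state.

    The PC value seen by the instruction is its own address plus 8.
    """
    return (insn_addr + 8 + offset) & 0xFFFFFFFF

# Values taken from the listings above.
print(hex(pc_relative_target(0x8B194, 412)))  # annotated as 0x8b338
print(hex(pc_relative_target(0x8B198, 412)))  # annotated as 0x8b33c
print(hex(pc_relative_target(0x8EEE4, 60)))   # annotated as 0x8ef28
```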
Search for strings:
In [8]: s magic
finding strings...
91b1c: 'A:/magic.cfg'
916b8: 'magic lantern init done'
91628: 'B:/magic.cfg'
91ff8: 'Magic Lantern install'
917d4: '# Magic Lantern %s (%s)\n# Build on %s by %s\n'
915fc: 'Magic Lantern %s (%s)'
915f0: '[MAGIC] '
Search for references:
In [9]: r additional_version
ROMBASE+0x10f8:
8b0f8: e59f3220 ldr r3, [pc, #544] ; 0x8b320: pointer to 0x15094 (additional_version)
0x15094 (additional_version)
In [10]: r 0x1ED0
ROMBASE+0x5b54:
8fb54: e59f40bc ldr r4, [pc, #188] ; 0x8fc18: pointer to 0x1ED0 (sounddev)
0x1ED0 (sounddev)
Classes/objects
Dump
Contains all info about a dump.
In [11]: ml.bin
Out[11]: autoexec.0x8A000.bin
In [12]: hex(ml.loadaddr), hex(ml.minaddr), hex(ml.maxaddr)
Out[12]: ('8A000', '8A000', 'B30BC')
In [13]: t2i.funcs(r"Flavor[C|S]") # when ratio=1, it uses a regex search
ff205b24: FlavorSharpness
ff205c14: FlavorContrast
ff205dac: FlavorSaturation
ff205e9c: FlavorColorTone
In [14]: t2i.funcs("DebugMsg", 0.5) # when ratio < 1, this is the min. allowed ratio for fuzzy search
ff0673ec: DebugMsg
ff2da3e0: _DebugSignal
ff08b350: EnableDebugMon
ff0cbac8: DpSetDebugMode
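The two search modes of funcs (regex when ratio=1, fuzzy when ratio < 1) can be imitated with the standard library. This is only a sketch: I am assuming a difflib-style similarity ratio here, the matcher actually used by the console may differ, and find_funcs is a made-up name.

```python
import difflib
import re

def find_funcs(names, pattern, ratio=1):
    """Return the names matching pattern.

    ratio == 1: regular-expression search.
    ratio < 1: fuzzy search; keep names whose similarity to pattern is
    at least ratio, best matches first (difflib is an assumption).
    """
    if ratio >= 1:
        return [n for n in names if re.search(pattern, n)]
    scored = [(difflib.SequenceMatcher(None, pattern, n).ratio(), n)
              for n in names]
    kept = [(score, n) for score, n in scored if score >= ratio]
    kept.sort(key=lambda item: item[0], reverse=True)
    return [n for _, n in kept]

names = ["FlavorSharpness", "FlavorContrast", "DebugMsg",
         "_DebugSignal", "EnableDebugMon", "SetFilterOff"]
print(find_funcs(names, r"Flavor[C|S]"))
print(find_funcs(names, "DebugMsg", 0.5))
```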
In [15]: t2i.refs("sounddev")
...
sounddev_task+28:
ff053488: e51f426c ldr r4, [pc, #-620] ; 0xff053224: pointer to 0x1ED0 (sounddev)
0x1ED0 (sounddev)
...
In [16]: t2i.refs("sounddev", context=2)
...
sounddev_task+28:
ff053480: e3a01000 mov r1, #0 ; 0x0
ff053484: eb006b5e bl @create_binary_semaphore
ff053488: e51f426c ldr r4, [pc, #-620] ; 0xff053224: pointer to 0x1ED0 (sounddev)
ff05348c: e3a02000 mov r2, #0 ; 0x0
ff053490: e5840058 str r0, [r4, #88]
0x1ED0 (sounddev)
...
In [17]: t2i.strings("AudioLevel")
finding strings...
ff544628: 'AudioLevelStateSignature'
ff1ab830: 'SoundDevice\\AudioLevel.c'
ff544670: 'AudioLevel'
In [18]: t2i.strrefs("^CreateTask$")
String references to ff06e2c8 'CreateTask':
createTask_maybe+68:
ff06e158: 028f0f5a addeq r0, pc, #360 ; *'CreateTask'
'CreateTask'
In [19]: t2i.disasm(0xff010000, 0xff01000f)
ff010000: e59ff0bc ldr pc, [pc, #188] ; 0xff0100c4: pointer to 0xff01000c
NSTUB($ fr603e.var_4C, ff010004):
ff010004: 6e6f6167 powvsez f6, f7, f7
ff010008: 796f7369 stmdbvc pc!, {r0, r3, r5, r6, r8, r9, r12, sp, lr}^
ff01000c: e3a00103 mov r0, #-1073741824 ; 0xc0000000
In [20]: ml.MakeName(0x8D070, "config_parse_file")
In [21]: ml.MakeFunction(0x8D070) # will guess the end address
Size: 108
In [22]: ml.MakeFunction(0x8D070, 0x8D0DC) # explicit end address
In [23]: ml.MakeFunction(0x8D070, 0x8D0DC, "config_parse_file") # explicit end address and name
Overwriting name config_parse_file
In [24]: ml.MakeFunction(0x8D070, name="config_parse_file") # will guess the end address and set the given name
Overwriting name config_parse_file
Fun
ASM function. Constructor:
- dump.Fun(name_or_addr)
In [25]: t2i.Fun(0xFF1C924C)
Out[25]: ASM function: dispcheck at 0xff1c924c in 550d.108.0xff010000.bin
In [26]: f = t2i.Fun("setFiltreOff")
Unknown function: setFiltreOff. Using closest match: SetFilterOff.
In [27]: f
Out[27]: ASM function: SetFilterOff at 0xff064e98 in 550d.108.0xff010000.bin
In [28]: f.name
Out[28]: SetFilterOff
In [29]: "%x"%f.addr
Out[29]: ff064e98
In [30]: "%x"%f.end
Out[30]: ff064eb8
In [31]: f.size
Out[31]: 32
In [32]: f.called_by()
ff064d30: UnpowerMicAmp+40
ff064120: sub_FF064114+12
In [33]: f.calls()
ff064ea8: eb00094f bl @DebugMsg
ff064eb4: eafffb4e b @audio_ic_write
// End of function: SetFilterOff
In [34]: f.disasm()
// Start of function: SetFilterOff
NSTUB(SetFilterOff, ff064e98):
ff064e98: e92d4010 push {r4, lr}
ff064e9c: e28f20f4 add r2, pc, #244 ; *'SetFilterOff'
ff064ea0: e3a01003 mov r1, #3 ; 0x3
ff064ea4: e3a00014 mov r0, #20 ; 0x14
ff064ea8: eb00094f bl @DebugMsg
ff064eac: e8bd4010 pop {r4, lr}
ff064eb0: e3a00c31 mov r0, #12544 ; 0x3100
ff064eb4: eafffb4e b @audio_ic_write
// End of function: SetFilterOff
In [35]: f.sig
Out[35]: push add mov mov bl pop mov b
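Judging by the output above, the signature looks like the sequence of mnemonics of the function. A standalone sketch that rebuilds it from listing text (the regular expression and the function name are assumptions made for this example, not the console's code):

```python
import re

# Matches 'addr: opcode mnemonic ...' lines; labels and comments are skipped.
INSN = re.compile(r"^[0-9a-f]+:\s+[0-9a-f]{8}\s+(\S+)")

def signature(disasm_text):
    """Join the mnemonics of all instruction lines with spaces."""
    mnemonics = []
    for line in disasm_text.splitlines():
        m = INSN.match(line.strip())
        if m:
            mnemonics.append(m.group(1))
    return " ".join(mnemonics)

listing = """\
// Start of function: SetFilterOff
NSTUB(SetFilterOff, ff064e98):
ff064e98: e92d4010 push {r4, lr}
ff064e9c: e28f20f4 add r2, pc, #244 ; *'SetFilterOff'
ff064ea0: e3a01003 mov r1, #3 ; 0x3
ff064ea4: e3a00014 mov r0, #20 ; 0x14
ff064ea8: eb00094f bl @DebugMsg
ff064eac: e8bd4010 pop {r4, lr}
ff064eb0: e3a00c31 mov r0, #12544 ; 0x3100
ff064eb4: eafffb4e b @audio_ic_write
// End of function: SetFilterOff
"""
print(signature(listing))
```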
In [36]: f.refs()
SetFilterOff+4:
ff064e9c: e28f20f4 add r2, pc, #244 ; *'SetFilterOff'
'SetFilterOff'
...
In [37]: f.strings()
SetFilterOff
Modules
match
Matches functions and addresses between different versions of firmware, or different camera models.
See GPL Tools/match.py.
guessfunc
This guesses function locations from calls (BL, BX), PUSH instructions, and loaded names which are not marked as functions. Function size is guessed with emusym.GuessFunction.
This is the main function. I prefer to run this before generating the HTML output.
In [38]: guessfunc.run(ml)
Function 0xff82399c is outside ROM => skipping
Function 0xff8922a4 is outside ROM => skipping
...
Those are little workers:
In [39]: analyze_names(dump)
In [40]: analyze_push(dump)
In [41]: analyze_bl(dump)
In [42]: analyze_bx(dump)
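To illustrate how a BL instruction points at a function start, here is a standalone sketch that decodes the branch target straight from the opcode word. It is not the guessfunc code (which presumably works on the parsed disassembly); branch_target is a made-up name, and only plain conditional B/BL encodings are handled.

```python
def branch_target(insn_addr, opcode):
    """Target of an ARM B/BL instruction, or None if opcode is not one.

    Bits 27..25 equal to 0b101 mark B/BL. The low 24 bits are a signed
    word offset relative to the instruction address plus 8.
    """
    if (opcode >> 25) & 0b111 != 0b101:
        return None
    if (opcode >> 28) == 0xF:
        return None  # BLX <imm> lives here; not handled in this sketch
    imm24 = opcode & 0xFFFFFF
    if imm24 & 0x800000:        # sign-extend the 24-bit offset
        imm24 -= 0x1000000
    return (insn_addr + 8 + imm24 * 4) & 0xFFFFFFFF

# 'ff064ea8: eb00094f bl @DebugMsg' from the SetFilterOff listing
print(hex(branch_target(0xFF064EA8, 0xEB00094F)))
# 'ff064eb4: eafffb4e b @audio_ic_write' (backward branch)
print(hex(branch_target(0xFF064EB4, 0xEAFFFB4E)))
```

The first target is the DebugMsg address (ff0673ec) that shows up elsewhere on this page.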
html
Exports the disassembly to a (huge) set of HTML files, which can be browsed offline (like this example).
It can also do some automated firmware analysis.
In [43]: html.quick(ml) # quick disassembly of dump ml
Disassembling autoexec.0x8A000.bin...
...
Raw disassembly... [19% done, ETA 0:00:05]...
In [44]: html.full(ml) # quick(...) + symbolic function analysis
Disassembling autoexec.0x8A000.bin...
...
Running symbolic analysis for autoexec.0x8A000.bin...
...
Function analysis... [91% done, ETA 0:00:19]...
In [45]: html.quick(D) # process all dumps from list D
In [46]: html.full(D)
In [47]: html.update(ml) # only re-generate modified pages (experimental).
Some small workers:
In [48]: html.link2addr(0x8B194)
Out[48]: <a href="0008a000.htm#_8B194">8B194</a>
In [49]: html.link2func(ml.Fun("fprintf"))
Out[49]: <a href="sub_0008eec0.htm">fprintf</a>
In [50]: html.link2funcoff(0x8eee0)
disasm
Disassembly and code browsing.
In [51]: hex(BYTE(ml, 0x916b8))
Out[51]: 6D
In [52]: hex(UINT32(ml, 0x916b8))
Out[52]: 6967616D
In [53]: GuessString(ml, 0x916b8)
Out[53]: magic lantern init done
In [54]: funcname(ml, 0x8eec0)
Out[54]: fprintf
In [55]: funcname(ml, 0x8b000)
Out[55]: sub_8B000
In [56]: dataname(ml, 0x1ED0)
Out[56]: 0x1ED0 (sounddev)
In [57]: guess_data(ml,0x8c228)
Out[57]: 0x8c228: pointer to 0x2E5B0 (bmp_vram_info)
In [58]: guess_data(ml,0x8e748)
Out[58]: 0x8e748: pointer to 'They try to set code to %d'
In [59]: guess_data(ml,0x8dca8)
Out[59]: @menu_add
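The BYTE, UINT32 and GuessString results above fit together: the string 'magic lantern init done' starts with the byte 0x6D ('m'), and its first four bytes read as a little-endian word give 0x6967616D. A standalone sketch with a fake dump object (TinyDump and the read_* helpers are hypothetical stand-ins, not the real Dump class):

```python
import struct

class TinyDump:
    """Hypothetical stand-in for a dump: raw bytes plus a load address."""
    def __init__(self, data, loadaddr):
        self.data = data
        self.loadaddr = loadaddr

def read_byte(dump, addr):
    return dump.data[addr - dump.loadaddr]

def read_uint32(dump, addr):
    # '<I' means little-endian unsigned 32-bit
    return struct.unpack_from("<I", dump.data, addr - dump.loadaddr)[0]

def read_cstring(dump, addr):
    # Read up to (not including) the terminating NUL byte.
    start = addr - dump.loadaddr
    end = dump.data.index(b"\x00", start)
    return dump.data[start:end].decode("ascii")

demo = TinyDump(b"magic lantern init done\x00", 0x916B8)
print("%X" % read_byte(demo, 0x916B8))
print("%X" % read_uint32(demo, 0x916B8))
print(read_cstring(demo, 0x916B8))
```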
fileutil
Utils for working with files.
In [60]: change_ext("foo.py", ".jpg")
Out[60]: foo.jpg
s = capture(func, *args, **kwargs) # capture output and result of func
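For readers who want something similar outside the console, output capturing can be sketched with the Python 3 standard library. The name capture_sketch and the (text, result) return shape are my assumptions; the real capture may return something different.

```python
import contextlib
import io

def capture_sketch(func, *args, **kwargs):
    """Run func and return (printed_text, return_value)."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        result = func(*args, **kwargs)
    return buf.getvalue(), result

def greet(name):
    print("hello " + name)
    return len(name)

text, value = capture_sketch(greet, "ml")
print(repr(text), value)
```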
stats
Sorts functions by number of calls to them:
In [61]: stats.calls_to(t2i)
...
1753: dialog_label_item
2555: assert_0
16694: DebugMsg
Or by how many other functions they call:
In [62]: stats.calls_from(t2i)
...
103: sub_FF300BA0 sub_FF2FF400,JudgeBottomInfoDispT ...
133: sub_FF0340B4 release_mem,sub_FF031E08,LVCAF_Re ...
215: IDLEHandler release_mem,EndMovieRecSequence_N ...
The most called functions seem to be DebugMsg and assert.
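The idea behind both rankings can be sketched on a toy call graph. The (caller, callee) pairs below are made up, and the two functions only mimic the ascending order of the output above; they are not the stats module.

```python
from collections import Counter

# Made-up call graph for illustration: (caller, callee) pairs.
calls = [
    ("task_a", "log"),
    ("task_a", "helper"),
    ("task_b", "log"),
    ("helper", "log"),
]

def calls_to(edges):
    """Count call sites per callee, least called first."""
    counts = Counter(callee for _, callee in edges)
    return sorted(counts.items(), key=lambda kv: kv[1])

def calls_from(edges):
    """Count distinct callees per caller, fewest first."""
    callees = {}
    for caller, callee in edges:
        callees.setdefault(caller, set()).add(callee)
    return sorted(((f, len(s)) for f, s in callees.items()),
                  key=lambda kv: kv[1])

for name, n in calls_to(calls):
    print("%d: %s" % (n, name))
```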
idapy
IDAPython compatibility layer. Also contains functions from IDAPython/utils.py and other small stuff.
These functions are NOT fully compatible with IDAPython! I only wrote this layer to help port my existing scripts.
The first thing you have to do before using any of those functions is:
In [63]: select_dump(dump)
where dump is a Dump object; for example:
In [64]: select_dump(t2i)
Or use the magic command sel, which does the same thing.
In [65]: GetFirstWord("abc def")
Out[65]: abc
In [66]: getRegs03("R1, R2 and maybe R5")
Out[66]: [1, 2]
In [67]: getRegsS("R1, R2 and maybe R5")
Out[67]: ['R1', 'R2', 'R5']
In [68]: filter_non_printable("\x07buzz!")
Out[68]: buzz!
In [69]: print GetDisasm(0xff064bbc)
subeq r1, pc, #2944 ; 0xb80
In [70]: GetMnem(0xff064bbc)
Out[70]: SUB
In [71]: GetMnef(0xff064bbc)
Out[71]: SUBEQ
In [72]: GetOpnd(0xff064bbc, 0)
Out[72]: R1
In [73]: GetOpnd(0xff064bbc, 1)
Out[73]: PC
In [74]: GetOpnd(0xff064bbc, 2)
Out[74]: #2944
In [75]: GetOpType(0xff064bbc, 0)
Out[75]: 1
In [76]: GetOpType(0xff064bbc, 1)
Out[76]: 1
In [77]: GetOpType(0xff064bbc, 2)
Out[77]: 5
In [78]: GetOperandValue(0xff064bbc, 0)
Out[78]: 1
In [79]: GetOperandValue(0xff064bbc, 1)
Out[79]: 15
In [80]: GetOperandValue(0xff064bbc, 2)
Out[80]: 2944
In [81]: print GetDisasm(0xff48e5a4)
stmiane r5!, {r6, r10, r12, lr}
In [82]: GetModeSuffix(0xff48e5a4)
Out[82]: IA
In [83]: GetCondSuffix(0xff48e5a4)
Out[83]: NE
In [84]: OppositeSuffix("EQ")
Out[84]: NE
In [85]: print GetDisasm(0xff064b90)
ldrb r0, [r0, #16]
In [86]: GetExtraSuffixes(0xff064b90)
Out[86]: B
In [87]: GetByteSuffix(0xff064b90)
Out[87]: B
In [88]: GetHalfwordSuffix(0xff064b90)
In [89]: GetFlagSuffix(0xff064b90)
In [90]: print GetDisasm(0xff064bb4)
cmpne r0, #5 ; 0x5
In [91]: ChangesFlags(0xff064bb4)
Out[91]: True
In [92]: GetString(0xff011e1c)
Out[92]: akashimorino
In [93]: isFuncStart(0xff064e98)
Out[93]: True
In [94]: GetFunctionName(0xff064e98)
Out[94]: SetFilterOff
In [95]: GetFuncOffset(0xff064e98)
Out[95]: SetFilterOff
In [96]: FuncItems(0xff064e98)
Out[96]: [4278603416L, 4278603420L, 4278603424L, 4278603428L, 4278603432L, 4278603436L, 4278603440L, 4278603444L]
In [97]: SegStart(0xff064e98)
Out[97]: 4278255616
In [98]: SegEnd(0xff064e98)
Out[98]: 4283881244
In [99]: CodeRefsTo(0xff064e98)
Out[99]: [4278603056L, 4278599968L]
In [100]: DataRefsTo(0x1ED0)
...
In [101]: CodeRefsFrom(0xff064e98)
Out[101]: [4278612972L, 4278598644L]
In [102]: DataRefsFrom(0xff064e98)
Out[102]: [4278603672L, 1182033235, 3, 20, 3912040463L, 12544, 3912056945L]
bunch
Something like a struct.
In [103]: b = Bunch(foo=5, baz="hello")
In [104]: b.foo
Out[104]: 5
In [105]: b.baz
Out[105]: hello
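A struct-like class of this kind is usually only a few lines long. The following is the well-known "Bunch" recipe, given as a sketch; the class shipped with the console may differ in details.

```python
class Bunch:
    """Struct-like container: keyword arguments become attributes."""
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

    def __repr__(self):
        fields = ", ".join("%s=%r" % kv for kv in sorted(self.__dict__.items()))
        return "Bunch(%s)" % fields

b = Bunch(foo=5, baz="hello")
b.extra = [1, 2]          # attributes can also be added later
print(b.foo, b.baz, b)
```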
emusym
Symbolic emulation of ASM code, based on SymPy. Theory is here: IDAPython/Static_analysis
These functions operate on the currently selected dump, therefore you have to select one first.
In [106]: sel ml
In [107]: CP = emusym.find_code_paths(0x8dac8) # extract code path by doing a recursive branch analysis
In [108]: emusym.create_graph(CP, "myfunc.svg") # generate a graph of code flow, with graphviz
In [109]: cp = CP[0] # select first code path
In [110]: emusym.resetArm() # reset the symbolic ARM registers and memory contents
In [111]: emusym.emusym_code_path(cp) # run one code path under symbolic emulation
In [112]: emusym.emusym_code_path(cp, codetree=True) # return a code tree, useful for decompiling
Out[112]: SEQ(IF(MEM(8 + MEM(unk_R3)), IFB(NE, SEQ(IF(MEM(4 + MEM(unk_R3)), IFB(TRUE, SEQ(MEMWRITE( ...
In [113]: emusym.split_code_path_ea_cond(cp) # split the code path in two plain lists
Out[113]: ([580296, 580300, 580308, 580312, 580316, 580320, 580328, 580332, 580336, 580344, 580344], [[], [], ['NE'], ['NE'], ['NE'], [], ['EQ'], ['EQ'], ['EQ'], ['EQ'], ['EQ']])
In [114]: emusym.GuessFunction(0x8ea48) # guesses the end of the function, when you know the start
Size: 424
Out[114]: 584684
bkt
Backtracing: solves the contents of ARM registers / memory addresses by symbolic emulation of ASM code, going backwards until all requested unknowns are found. Theory is here: IDAPython/Backtracing.
In [115]: ea = 0x8CE98
In [116]: g ea-28
8ce7c: e1a0700a mov r7, r10
8ce80: e3a00032 mov r0, #50 ; 0x32
8ce84: e3a01003 mov r1, #3 ; 0x3
8ce88: e59f2090 ldr r2, [pc, #144] ; **'%s: ERROR Deleting config'
8ce8c: e59f3074 ldr r3, [pc, #116] ; **'config_parse'
8ce90: e59fa074 ldr r10, [pc, #116] ; 0x8cf0c: pointer to 0xFF0673EC (DebugMsg)
8ce94: e1a0e00f mov lr, pc
8ce98: e12fff1a bx r10
8ce9c: e3570000 cmp r7, #0 ; 0x0
8cea0: 159f507c ldrne r5, [pc, #124] ; 0x8cf24: pointer to 0xFF0182CC (free)
In [117]: bkt.back_solve(ea, ['ARM.R0','ARM.R1']) # a faster heuristic
Out[117]: [50, 3]
In [118]: bkt.back_solve_slow(ea, ['ARM.R0','ARM.R1']) # slow, dumb, but reliable
Out[118]: [50, 3]
In [119]: bkt.go_back([ea])
Out[119]: [577172, 577176]
In [120]: bkt.go_back([ea-4,ea])
Out[120]: [577168, 577172, 577176]
In [121]: bkt.find_func_call(ea, 4)
Out[121]: (4278612972L, 'DebugMsg', "(0x32, 3, '%s: ERROR Deleting config', 'config_parse')")
In [122]: bkt.trace_calls_to("DebugMsg", 4)
Function found at ff0673ec
Out[122]: {570628: (4278612972L, 'DebugMsg', "(0x12, 3, '%s created (and exiting)', 'null_task')"), ...
...
These functions find the function which is called at a given address, using backtracing if needed.
In [123]: bkt.subaddr_bl(0x8CD74)
Out[123]: 575428
In [124]: bkt.subaddr_bx(0x8CE98)
Out[124]: 4278612972
In [125]: bkt.subaddr_mov(addr) # for MOV PC, ...
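To give a feel for what backtracing does, here is a deliberately tiny standalone imitation: walk backwards from an address and pick up registers that were last set by mov rN, #imm. It is NOT the SymPy-based method used by bkt; it only handles straight-line code, treats the first operand of every instruction as its destination, and returns None for anything it cannot resolve. All names are made up for this sketch.

```python
import re

# Instructions before the 'bx r10' at 0x8CE98 (comments stripped).
LISTING = [
    (0x8CE7C, "mov r7, r10"),
    (0x8CE80, "mov r0, #50"),
    (0x8CE84, "mov r1, #3"),
    (0x8CE88, "ldr r2, [pc, #144]"),
    (0x8CE8C, "ldr r3, [pc, #116]"),
    (0x8CE90, "ldr r10, [pc, #116]"),
    (0x8CE94, "mov lr, pc"),
    (0x8CE98, "bx r10"),
]

MOV_IMM = re.compile(r"^mov\s+(r\d+),\s*#(-?\d+)$")
DEST = re.compile(r"^\w+\s+(r\d+|lr|sp|pc),")

def back_solve_toy(listing, ea, wanted):
    """Resolve the wanted registers at ea by scanning backwards."""
    values = {}
    unresolved = set(wanted)
    for addr, text in reversed(listing):
        if addr >= ea or not unresolved:
            continue
        dest = DEST.match(text)
        if not dest or dest.group(1) not in unresolved:
            continue
        reg = dest.group(1)
        imm = MOV_IMM.match(text)
        # Most recent write wins; non-immediate writes stay unknown.
        values[reg] = int(imm.group(2)) if imm else None
        unresolved.discard(reg)
    return [values.get(reg) for reg in wanted]

print(back_solve_toy(LISTING, 0x8CE98, ["r0", "r1"]))
```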
deco
Experimental decompiler based on SymPy (see module emusym). Only works for functions without loops.
In [126]: print deco.decompile(0x8D070)
*(-4 + sp0) = lr0
*(-8 + sp0) = unk_R6
*(-12 + sp0) = unk_R5
*(-16 + sp0) = unk_R4
FIO_Open(arg0, 0x1000, arg2, ...) => ret_FIO_Open_8D084
strcpy(0xb2818, arg0, arg2, ...) => ret_strcpy_8D0A0
if ret_FIO_Open_8D084 == -1:
*(0xb27d4) = 0xb2854
return 0xb27d4
config_parse(ret_FIO_Open_8D084, arg0, arg2, strcpy) => ret_config_parse_8D0B8
FIO_CloseFile(handle=ret_FIO_Open_8D084) => ret_FIO_CloseFile_8D0CC
*(0xb27d4) = ret_config_parse_8D0B8
return 0xb27d4
!end
cache
Stores the result of lengthy computations.
In [127]: cache.access("my_computation", func_which_takes_ages_to_run)
In [128]: cache.disable()
In [129]: cache.enable()
In [130]: cache.clear()
There is also some experimental support for persistent cache, but it's disabled by default.
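The access / enable / disable / clear interface can be pictured as a small in-memory memoizer. This is a sketch under my own assumptions (a dict keyed by name, recomputing on every call while disabled); SimpleCache is a made-up class, not the cache module itself.

```python
class SimpleCache:
    """In-memory cache keyed by a user-chosen name (illustrative only)."""
    def __init__(self):
        self._store = {}
        self._enabled = True

    def access(self, key, func):
        # Return the stored value, or compute and remember it.
        if self._enabled and key in self._store:
            return self._store[key]
        value = func()
        if self._enabled:
            self._store[key] = value
        return value

    def enable(self):
        self._enabled = True

    def disable(self):
        self._enabled = False

    def clear(self):
        self._store.clear()

cache = SimpleCache()
print(cache.access("answer", lambda: 6 * 7))   # computed
print(cache.access("answer", lambda: 0))       # served from the cache
```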
progress
Simple progress indicator; also computes ETA.
In [131]: progress("Doing some useless stuff...")
In [132]: time.sleep(1)
In [133]: progress(0.25)
Doing some useless stuff... [25% done, ETA 0:00:03]...
In [134]: time.sleep(2)
In [135]: progress(0.75)
Doing some useless stuff... [75% done, ETA 0:00:01]...
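The ETA values printed above are consistent with a plain linear extrapolation: remaining time = elapsed * (1 - fraction) / fraction. A standalone sketch of that formula (I am assuming this is how the ETA is computed; the function names are made up, and elapsed time is passed in explicitly so the example is deterministic):

```python
import datetime

def eta_seconds(elapsed, fraction):
    """Estimated remaining seconds, assuming a constant work rate."""
    return elapsed * (1.0 - fraction) / fraction

def format_progress(message, elapsed, fraction):
    eta = datetime.timedelta(seconds=round(eta_seconds(elapsed, fraction)))
    return "%s [%d%% done, ETA %s]..." % (message, round(fraction * 100), eta)

# 1 second elapsed at 25%, then 3 seconds elapsed at 75%
print(format_progress("Doing some useless stuff...", 1, 0.25))
print(format_progress("Doing some useless stuff...", 3, 0.75))
```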
doc
Make Wiki docs from IPython commands included in main text. Something like auto-generating docs from docstrings.
You write:
This is an example:
    2+2
and you get:
This is an example:
In [1]: 2+2
Out[1]: 4
All the code to be executed in IPython is prefixed by:
- four spaces: normal code, i.e. show input command, run it and show output.
- three spaces and %: show input command, run it, but do not show output.
- three spaces and ~: show input command, but do not run it.
- three spaces and #: run the command, but do not show anything.
- three spaces and [list of indices]: display only those lines of output; use a string instead of index to display anything you like.
e.g.:
   [1,2,"...",-1]ml.strings
displays the first line, second line, ellipsis and then the last line. Also, in this mode, max line length is limited to 100.
Spacing is critical, so pay attention! Also, 4 spaces and % is a magic command.
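The prefix rules above can be written down as a small classifier. This is my reading of the rules as a standalone sketch, not the parser from the doc module; the mode names and the classify function are made up.

```python
import ast

def classify(line):
    """Return (mode, command, selector) for one line of a .wikipy file."""
    if line.startswith("    %"):
        # four spaces and %: an IPython magic command
        return ("magic", line[4:], None)
    if line.startswith("    "):
        return ("show_run_output", line[4:], None)
    if line.startswith("   "):
        marker, rest = line[3:4], line[4:]
        if marker == "%":
            return ("show_run_hide_output", rest, None)
        if marker == "~":
            return ("show_only", rest, None)
        if marker == "#":
            return ("run_silently", rest, None)
        close = line.find("]", 3)
        if marker == "[" and close != -1:
            # e.g. [1,2,"...",-1]: which output lines to display
            selector = ast.literal_eval(line[3:close + 1])
            return ("show_run_filtered", line[close + 1:], selector)
    return ("text", line, None)

for sample in ["    2+2", "   ~sel ml", '   [1,2,"...",-1]ml.strings', "Plain text"]:
    print(classify(sample))
```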
To make a doc:
doc.run("scripts/doc/api-ref.wikipy")
Result is in scripts/doc/api-ref.wiki, with Wiki markup codes.
Enjoy!
--Alexdu 16:56, December 1, 2010 (UTC)