API reference for ARM firmware analysis console

The examples use the disassembly of autoexec.bin (550D, firmware 1.0.8) and the 550D ROM dump, which can be obtained by running Magic Lantern on your camera.

Loading some dumps and some names

In [1]: D = load_dumps("(autoexec|108)")
Did not find the matching bin for mynames.idc. Ignoring this file.
Did not find the matching bin for mychanges.idc. Ignoring this file.
Input files:
           Binary dump (*.bin)     LoadAddr     IDC database (*.idc)    
       550d.108.0xff010000.bin     FF010000     550d.108.20101116_indy_ROM0.idc
          autoexec.0x8A000.bin        8A000     n/a
Disassembling 550d.108.0xff010000.bin <ff010000>... ok
Disassembling autoexec.0x8A000.bin <8a000>... ok
Parsing 550d.108.20101116_indy_ROM0.idc... found 56768 MakeName's and 18053 MakeFunction's
Parsing disassembly of 550d.108.0xff010000.bin...  found 1395198 lines
Indexing references...
Parsing disassembly of autoexec.0x8A000.bin...  found 19564 lines
Indexing references...
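The LoadAddr column above appears to mirror the hexadecimal field embedded in each dump's file name. Here is a rough sketch of that naming convention. The helper name parse_loadaddr and the regular expression are assumptions made for illustration, not part of the console's API, and the code is written for Python 3.

```python
import re

# Hypothetical helper: pull the load address out of names such as
# "550d.108.0xff010000.bin" or "autoexec.0x8A000.bin".
_LOADADDR_RE = re.compile(r"\.0x([0-9A-Fa-f]+)\.bin$")


def parse_loadaddr(filename):
    """Return the load address encoded in the file name, or None if absent."""
    match = _LOADADDR_RE.search(filename)
    return int(match.group(1), 16) if match else None


print("%X" % parse_loadaddr("autoexec.0x8A000.bin"))
print("%X" % parse_loadaddr("550d.108.0xff010000.bin"))
```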

In [2]: t2i,ml = D                                          # D is sorted by bin file name

In [3]: ml.load_names("stubs-550d.108.S")
Found 80 stubs in stubs-550d.108.S.

In [4]: ml.load_names("autoexec.S")
Found 8 stubs in autoexec.S.

Magic commands

Shortcuts for common tasks.

They operate on the currently selected dump, so first you have to select one:

In [5]: sel ml

Go to address or name:

In [6]: g 8B194
   8b194:	e59fb19c 	ldr	r11, [pc, #412]	; 0x8b338: pointer to 0xc2b8c
   8b198:	e59fc19c 	ldr	r12, [pc, #412]	; 0x8b33c: pointer to 0x8B7F4 (bmp_printf)
   8b19c:	e58b0000 	str	r0, [r11]
   8b1a0:	e1a01006 	mov	r1, r6
   8b1a4:	e3a02028 	mov	r2, #40	; 0x28
   8b1a8:	e3a00802 	mov	r0, #131072	; 0x20000
   8b1ac:	e59f318c 	ldr	r3, [pc, #396]	; **'ML v. %s (%s)\nBuilt on %s by %s\n'
   8b1b0:	e88d0090 	stm	sp, {r4, r7}
   8b1b4:	e58dc010 	str	r12, [sp, #16]
   8b1b8:	e58d8008 	str	r8, [sp, #8]
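The "0x8b338: pointer to ..." annotations follow from how ARM-mode PC-relative loads work: the pc register reads as the instruction address plus 8. Below is a minimal sketch of that arithmetic. The function name ldr_literal_address is hypothetical, the code is Python 3, and it only handles the plain immediate-offset LDR form seen in the listing above.

```python
def ldr_literal_address(addr, word):
    """Address read by an ARM-mode 'ldr Rd, [pc, #imm]' located at addr.

    Returns None when the word is not a word-sized, immediate-offset LDR
    that uses pc as its base register.
    """
    # Bits 27..24 = 0101 (immediate offset, pre-indexed),
    # bits 22..20 = 001 (word access, no write-back, load).
    is_ldr_imm = (word & 0x0F700000) == 0x05100000
    base_reg = (word >> 16) & 0xF
    if not is_ldr_imm or base_reg != 15:
        return None
    offset = word & 0xFFF
    if not word & (1 << 23):        # U bit clear: the offset is subtracted
        offset = -offset
    return addr + 8 + offset        # pc reads as instruction address + 8


print(hex(ldr_literal_address(0x8B194, 0xE59FB19C)))   # 0x8b338
print(hex(ldr_literal_address(0x8B198, 0xE59FC19C)))   # 0x8b33c
```

Once that address is known, the word stored there is the pointer shown in the comment column.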

In [7]: g fprintf
NSTUB(fprintf, 8eec0):
   8eec0:	e92d000e 	push	{r1, r2, r3}
   8eec4:	e92d4070 	push	{r4, r5, r6, lr}
   8eec8:	e24ddf41 	sub	sp, sp, #260	; 0x104
   8eecc:	e28dcf46 	add	r12, sp, #280	; 0x118
   8eed0:	e1a05000 	mov	r5, r0
   8eed4:	e1a0300c 	mov	r3, r12
   8eed8:	e59d2114 	ldr	r2, [sp, #276]
   8eedc:	e58dc100 	str	r12, [sp, #256]
   8eee0:	e1a0000d 	mov	r0, sp
   8eee4:	e59fc03c 	ldr	r12, [pc, #60]	; 0x8ef28: pointer to 0xFF1D6638 (vsnprintf)

Search for strings:

In [8]: s magic
finding strings...
91b1c: 'A:/magic.cfg'
916b8: 'magic lantern init done'
91628: 'B:/magic.cfg'
91ff8: 'Magic Lantern install'
917d4: '# Magic Lantern %s (%s)\n# Build on %s by %s\n'
915fc: 'Magic Lantern %s (%s)'
915f0: '[MAGIC] '

Search for references:

In [9]: r additional_version
   8b0f8:	e59f3220 	ldr	r3, [pc, #544]	; 0x8b320: pointer to 0x15094 (additional_version)
0x15094 (additional_version)

In [10]: r 0x1ED0
   8fb54:	e59f40bc 	ldr	r4, [pc, #188]	; 0x8fc18: pointer to 0x1ED0 (sounddev)
0x1ED0 (sounddev)

Classes/objects

Dump

Contains all info about a dump.

In [11]: ml.bin
Out[11]: autoexec.0x8A000.bin

In [12]: hex(ml.loadaddr), hex(ml.minaddr), hex(ml.maxaddr)
Out[12]: ('8A000', '8A000', 'B30BC')

In [13]: t2i.funcs(r"Flavor[CS]")                            # when ratio=1, it uses a regex search
ff205b24: FlavorSharpness
ff205c14: FlavorContrast
ff205dac: FlavorSaturation
ff205e9c: FlavorColorTone

In [14]: t2i.funcs("DebugMsg", 0.5)                          # when ratio < 1, this is the min. allowed ratio for fuzzy search
ff0673ec: DebugMsg
ff2da3e0: _DebugSignal
ff08b350: EnableDebugMon
ff0cbac8: DpSetDebugMode
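To illustrate what a minimum-ratio fuzzy search can look like, here is a small sketch based on difflib from the Python standard library. This is only an assumption about the general technique: the similarity metric actually used by funcs() is not documented here, and fuzzy_funcs is a hypothetical name. The code is Python 3.

```python
import difflib


def fuzzy_funcs(names, query, min_ratio):
    """Return (ratio, name) pairs whose similarity to query is >= min_ratio."""
    wanted = query.lower()
    scored = []
    for name in names:
        ratio = difflib.SequenceMatcher(None, wanted, name.lower()).ratio()
        if ratio >= min_ratio:
            scored.append((ratio, name))
    return sorted(scored, reverse=True)      # best match first


names = ["DebugMsg", "EnableDebugMon", "FlavorContrast"]
for ratio, name in fuzzy_funcs(names, "DebugMsg", 0.5):
    print("%.2f %s" % (ratio, name))
```

Lowering min_ratio admits looser matches; a value close to 1 keeps only near-identical names.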

In [15]: t2i.refs("sounddev")
ff053488:	e51f426c 	ldr	r4, [pc, #-620]	; 0xff053224: pointer to 0x1ED0 (sounddev)
0x1ED0 (sounddev)

In [16]: t2i.refs("sounddev", context=2)
ff053480:	e3a01000 	mov	r1, #0	; 0x0
ff053484:	eb006b5e 	bl	@create_binary_semaphore	
ff053488:	e51f426c 	ldr	r4, [pc, #-620]	; 0xff053224: pointer to 0x1ED0 (sounddev)
ff05348c:	e3a02000 	mov	r2, #0	; 0x0
ff053490:	e5840058 	str	r0, [r4, #88]
0x1ED0 (sounddev)

In [17]: t2i.strings("AudioLevel")
finding strings...
ff544628: 'AudioLevelStateSignature'
ff1ab830: 'SoundDevice\\AudioLevel.c'
ff544670: 'AudioLevel'

In [18]: t2i.strrefs("^CreateTask$")
String references to ff06e2c8 'CreateTask':

ff06e158:	028f0f5a 	addeq	r0, pc, #360	; *'CreateTask'

In [19]: t2i.disasm(0xff010000, 0xff01000f)
ff010000:	e59ff0bc 	ldr	pc, [pc, #188]	; 0xff0100c4: pointer to 0xff01000c
NSTUB($ fr603e.var_4C, ff010004):
ff010004:	6e6f6167 	powvsez	f6, f7, f7
ff010008:	796f7369 	stmdbvc	pc!, {r0, r3, r5, r6, r8, r9, r12, sp, lr}^
ff01000c:	e3a00103 	mov	r0, #-1073741824	; 0xc0000000

In [20]: ml.MakeName(0x8D070, "config_parse_file")

In [21]: ml.MakeFunction(0x8D070)                            # will guess the end address
Size: 108

In [22]: ml.MakeFunction(0x8D070, 0x8D0DC)                   # explicit end address

In [23]: ml.MakeFunction(0x8D070, 0x8D0DC, "config_parse_file")  # explicit end address and name
Overwriting name config_parse_file

In [24]: ml.MakeFunction(0x8D070, name="config_parse_file")  # will guess the end address and set the given name
Overwriting name config_parse_file

Fun

ASM function. Constructor:

  • dump.Fun(name_or_addr)

In [25]: t2i.Fun(0xFF1C924C)
Out[25]: ASM function: dispcheck at 0xff1c924c in 550d.108.0xff010000.bin

In [26]: f = t2i.Fun("setFiltreOff")
Unknown function: setFiltreOff. Using closest match: SetFilterOff.

In [27]: f
Out[27]: ASM function: SetFilterOff at 0xff064e98 in 550d.108.0xff010000.bin

In [28]: f.name
Out[28]: SetFilterOff

In [29]: "%x"%f.addr
Out[29]: ff064e98

In [30]: "%x"%f.end
Out[30]: ff064eb8

In [31]: f.size
Out[31]: 32

In [32]: f.called_by()
ff064d30: UnpowerMicAmp+40
ff064120: sub_FF064114+12

In [33]: f.calls()
ff064ea8:	eb00094f 	bl	@DebugMsg	
ff064eb4:	eafffb4e 	b	@audio_ic_write	
// End of function: SetFilterOff
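The destinations shown for bl and b can be checked by hand: an ARM-mode branch stores a signed 24-bit word offset relative to the instruction address plus 8. Below is a short sketch of that decoding. The name branch_target is hypothetical and the code is Python 3; it deliberately ignores BLX and Thumb encodings.

```python
def branch_target(addr, word):
    """Destination of an ARM-mode B or BL instruction located at addr, else None."""
    if (word >> 28) == 0xF:                     # unconditional space (BLX), not handled
        return None
    if (word & 0x0E000000) != 0x0A000000:       # bits 27..25 must be 101
        return None
    imm24 = word & 0x00FFFFFF
    if imm24 & 0x00800000:                      # sign-extend the 24-bit field
        imm24 -= 0x01000000
    return (addr + 8 + (imm24 << 2)) & 0xFFFFFFFF


print(hex(branch_target(0xFF064EA8, 0xEB00094F)))   # 0xff0673ec, i.e. DebugMsg
print(hex(branch_target(0xFF064EB4, 0xEAFFFB4E)))   # 0xff063bf4
```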

In [34]: f.disasm()
// Start of function: SetFilterOff
NSTUB(SetFilterOff, ff064e98):
ff064e98:	e92d4010 	push	{r4, lr}
ff064e9c:	e28f20f4 	add	r2, pc, #244	; *'SetFilterOff'
ff064ea0:	e3a01003 	mov	r1, #3	; 0x3
ff064ea4:	e3a00014 	mov	r0, #20	; 0x14
ff064ea8:	eb00094f 	bl	@DebugMsg	
ff064eac:	e8bd4010 	pop	{r4, lr}
ff064eb0:	e3a00c31 	mov	r0, #12544	; 0x3100
ff064eb4:	eafffb4e 	b	@audio_ic_write	
// End of function: SetFilterOff

In [35]: f.sig
Out[35]: push add mov mov bl pop mov b 
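The signature looks like the mnemonic column of the disassembly, joined with spaces. The sketch below rebuilds such a string from tab-separated listing lines. This is a guess at the idea, not the actual implementation: the function name signature is hypothetical, and whether the real sig strips condition suffixes is not known. The code is Python 3.

```python
def signature(disasm_lines):
    """Join the mnemonic column of tab-separated disassembly lines.

    Expected line layout: address with a trailing colon, instruction word,
    mnemonic, operands. Lines without that layout (labels, comments) are skipped.
    """
    mnemonics = []
    for line in disasm_lines:
        fields = line.split("\t")
        if len(fields) >= 3 and fields[0].rstrip().endswith(":"):
            mnemonics.append(fields[2].strip())
    return " ".join(mnemonics)


listing = [
    "// Start of function: SetFilterOff",
    "NSTUB(SetFilterOff, ff064e98):",
    "ff064e98:\te92d4010 \tpush\t{r4, lr}",
    "ff064ea8:\teb00094f \tbl\t@DebugMsg\t",
]
print(signature(listing))   # push bl
```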

In [36]: f.refs()
ff064e9c:	e28f20f4 	add	r2, pc, #244	; *'SetFilterOff'


In [37]: f.strings()

Modules

match

Matches functions and addresses between different versions of firmware, or different camera models.

See GPL Tools/match.py.

guessfunc

This guesses function locations from calls (BL, BX), PUSH instructions, and loaded names which are not marked as functions. Function size is guessed with emusym.GuessFunction.

The main entry point is guessfunc.run. I prefer to run it before generating the HTML output.

In [38]: guessfunc.run(ml)
Function 0xff82399c is outside ROM => skipping
Function 0xff8922a4 is outside ROM => skipping

These are the smaller worker functions:

In [39]: analyze_names(dump)

In [40]: analyze_push(dump)

In [41]: analyze_bl(dump)

In [42]: analyze_bx(dump)

html

Exports the disassembly to a (huge) set of HTML files, which can be browsed offline.

It can also do some automated firmware analysis.

In [43]: html.quick(ml)                                      # quick disassembly of dump ml
Disassembling autoexec.0x8A000.bin...
Raw disassembly... [19% done, ETA 0:00:05]...

In [44]: html.full(ml)                                       # quick(...) + symbolic function analysis
Disassembling autoexec.0x8A000.bin...
Running symbolic analysis for autoexec.0x8A000.bin...
Function analysis... [91% done, ETA 0:00:19]...

In [45]: html.quick(D)                                       # process all dumps from list D

In [46]: html.full(D)

In [47]: html.update(ml)                                     # only re-generate modified pages (experimental).

Some small helper functions:

In [48]: html.link2addr(0x8B194)
Out[48]: <a href="0008a000.htm#_8B194">8B194</a>

In [49]: html.link2func(ml.Fun("fprintf"))
Out[49]: <a href="sub_0008eec0.htm">fprintf</a>

In [50]: html.link2funcoff(0x8eee0)

disasm

Disassembly and code browsing.

In [51]: hex(BYTE(ml, 0x916b8))
Out[51]: 6D

In [52]: hex(UINT32(ml, 0x916b8))
Out[52]: 6967616D
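The two reads above are consistent with little-endian byte order: the bytes 'm', 'a', 'g', 'i' (6D 61 67 69) become 6967616D when read as one 32-bit word. Here is a self-contained sketch of such accessors. TinyDump, byte_at, uint32_at and cstring_at are hypothetical stand-ins written in Python 3, not the console's own BYTE, UINT32 or GuessString.

```python
import struct


class TinyDump(object):
    """Hypothetical stand-in for a dump: raw bytes plus the address they load at."""

    def __init__(self, data, loadaddr):
        self.data = bytes(data)
        self.loadaddr = loadaddr

    def offset(self, addr):
        """Translate a memory address into an index into the raw bytes."""
        off = addr - self.loadaddr
        if not 0 <= off < len(self.data):
            raise ValueError("address %X is outside the dump" % addr)
        return off


def byte_at(dump, addr):
    return dump.data[dump.offset(addr)]          # indexing bytes yields an int


def uint32_at(dump, addr):
    # "<I" means little-endian unsigned 32-bit
    return struct.unpack_from("<I", dump.data, dump.offset(addr))[0]


def cstring_at(dump, addr):
    start = dump.offset(addr)
    end = dump.data.find(b"\x00", start)
    if end < 0:                                  # no terminator: read to the end
        end = len(dump.data)
    return dump.data[start:end].decode("ascii", "replace")


demo = TinyDump(b"magic lantern init done\x00", 0x916B8)
print("%X" % byte_at(demo, 0x916B8))      # 6D
print("%X" % uint32_at(demo, 0x916B8))    # 6967616D
print(cstring_at(demo, 0x916B8))          # magic lantern init done
```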

In [53]: GuessString(ml, 0x916b8)
Out[53]: magic lantern init done

In [54]: funcname(ml, 0x8eec0)
Out[54]: fprintf

In [55]: funcname(ml, 0x8b000)
Out[55]: sub_8B000

In [56]: dataname(ml, 0x1ED0)
Out[56]: 0x1ED0 (sounddev)

In [57]: guess_data(ml,0x8c228)
Out[57]: 0x8c228: pointer to 0x2E5B0 (bmp_vram_info)

In [58]: guess_data(ml,0x8e748)
Out[58]: 0x8e748: pointer to 'They try to set code to %d'

In [59]: guess_data(ml,0x8dca8)
Out[59]: @menu_add

fileutil

Utils for working with files.

In [60]: change_ext("foo.py", ".jpg")
Out[60]: foo.jpg

s = capture(func, *args, **kwargs) # capture output and result of func
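For readers without the module at hand, both helpers are easy to approximate with the standard library. The sketch below is Python 3 and makes one loud assumption: that capture returns the printed text together with the function's result as a pair. The real return shape is not documented here.

```python
import contextlib
import io
import os


def change_ext(path, new_ext):
    """Replace the extension of path with new_ext (which includes the dot)."""
    root, _old_ext = os.path.splitext(path)
    return root + new_ext


def capture(func, *args, **kwargs):
    """Run func and return (text it printed to stdout, its return value).

    The (output, result) pair is an assumed return shape.
    """
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        result = func(*args, **kwargs)
    return buf.getvalue(), result


print(change_ext("foo.py", ".jpg"))     # foo.jpg
output, result = capture(print, "hi")
print(repr(output), result)             # 'hi\n' None
```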

stats

Sorts functions by number of calls to them:

In [61]: stats.calls_to(t2i)
 1753: dialog_label_item
 2555: assert_0
16694: DebugMsg

Or by how many other functions they call:

In [62]: stats.calls_from(t2i)
  103:                                       sub_FF300BA0         sub_FF2FF400,JudgeBottomInfoDispT ...
  133:                                       sub_FF0340B4         release_mem,sub_FF031E08,LVCAF_Re ...
  215:                                        IDLEHandler         release_mem,EndMovieRecSequence_N ...

The most called functions seem to be DebugMsg and assert_0.
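Both rankings boil down to counting edges in the call graph. Here is a minimal sketch, assuming the call graph is available as (caller, callee) pairs. That input format, the function names, and the choice to count distinct callees are all assumptions for illustration (Python 3).

```python
from collections import Counter


def calls_to(call_pairs):
    """Rank callees by how often they are called, least called first."""
    counts = Counter(callee for _caller, callee in call_pairs)
    return sorted((n, name) for name, n in counts.items())


def calls_from(call_pairs):
    """Rank callers by how many distinct functions they call, fewest first."""
    callees = {}
    for caller, callee in call_pairs:
        callees.setdefault(caller, set()).add(callee)
    return sorted((len(targets), caller) for caller, targets in callees.items())


pairs = [("task_a", "DebugMsg"), ("task_b", "DebugMsg"), ("task_a", "assert_0")]
for count, name in calls_to(pairs):
    print("%5d: %s" % (count, name))
```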

idapy

IDAPython compatibility layer. Also contains functions from IDAPython/utils.py and other small stuff.

These functions are NOT fully compatible with IDAPython! I only wrote this layer to help port my existing scripts.

The first thing you have to do before using any of those functions is:

In [63]: select_dump(dump)

where dump is a Dump object; for example:

In [64]: select_dump(t2i)

Or use the magic command sel, which does the same thing.

In [65]: GetFirstWord("abc def")
Out[65]: abc

In [66]: getRegs03("R1, R2 and maybe R5")
Out[66]: [1, 2]

In [67]: getRegsS("R1, R2 and maybe R5")
Out[67]: ['R1', 'R2', 'R5']

In [68]: filter_non_printable("\x07buzz!")
Out[68]: buzz!
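The small text helpers above can be approximated in a few lines. The versions below are independent Python 3 sketches with hypothetical names; they reproduce the outputs shown here but are not the module's code.

```python
import re
import string

_PRINTABLE = set(string.printable)


def first_word(text):
    """Return the first whitespace-delimited word, or '' for empty input."""
    parts = text.split()
    return parts[0] if parts else ""


def regs_as_strings(text):
    """Find register names of the form R<number>."""
    return re.findall(r"\bR\d+\b", text)


def regs_0_to_3(text):
    """Return the numbers of registers R0..R3 mentioned in text."""
    numbers = [int(reg[1:]) for reg in regs_as_strings(text)]
    return [n for n in numbers if n <= 3]


def filter_non_printable(text):
    """Drop characters that are not in string.printable."""
    return "".join(ch for ch in text if ch in _PRINTABLE)


print(regs_as_strings("R1, R2 and maybe R5"))   # ['R1', 'R2', 'R5']
print(regs_0_to_3("R1, R2 and maybe R5"))       # [1, 2]
print(filter_non_printable("\x07buzz!"))        # buzz!
```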

In [69]: print GetDisasm(0xff064bbc)
subeq   r1, pc, #2944   ; 0xb80

In [70]: GetMnem(0xff064bbc)
Out[70]: SUB

In [71]: GetMnef(0xff064bbc)
Out[71]: SUBEQ

In [72]: GetOpnd(0xff064bbc, 0)
Out[72]: R1

In [73]: GetOpnd(0xff064bbc, 1)
Out[73]: PC

In [74]: GetOpnd(0xff064bbc, 2)
Out[74]: #2944

In [75]: GetOpType(0xff064bbc, 0)
Out[75]: 1

In [76]: GetOpType(0xff064bbc, 1)
Out[76]: 1

In [77]: GetOpType(0xff064bbc, 2)
Out[77]: 5

In [78]: GetOperandValue(0xff064bbc, 0)
Out[78]: 1

In [79]: GetOperandValue(0xff064bbc, 1)
Out[79]: 15

In [80]: GetOperandValue(0xff064bbc, 2)
Out[80]: 2944

In [81]: print GetDisasm(0xff48e5a4)
stmiane   r5!, {r6, r10, r12, lr}

In [82]: GetModeSuffix(0xff48e5a4)
Out[82]: IA

In [83]: GetCondSuffix(0xff48e5a4)
Out[83]: NE

In [84]: OppositeSuffix("EQ")
Out[84]: NE
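Inverting a condition suffix is a table lookup over the ARM condition-code pairs. Here is a sketch in Python 3. The name opposite_suffix is hypothetical, and AL (always) is left out on purpose because it has no usable opposite.

```python
# ARM condition codes come in complementary pairs.
_PAIRS = [("EQ", "NE"), ("CS", "CC"), ("HS", "LO"), ("MI", "PL"),
          ("VS", "VC"), ("HI", "LS"), ("GE", "LT"), ("GT", "LE")]

_OPPOSITE = {}
for first, second in _PAIRS:
    _OPPOSITE[first] = second
    _OPPOSITE[second] = first


def opposite_suffix(cond):
    """Return the complementary condition suffix; raises KeyError if unknown."""
    return _OPPOSITE[cond.upper()]


print(opposite_suffix("EQ"))   # NE
```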
In [85]: print GetDisasm(0xff064b90)
ldrb   r0, [r0, #16]

In [86]: GetExtraSuffixes(0xff064b90)
Out[86]: B

In [87]: GetByteSuffix(0xff064b90)
Out[87]: B

In [88]: GetHalfwordSuffix(0xff064b90)

In [89]: GetFlagSuffix(0xff064b90)

In [90]: print GetDisasm(0xff064bb4)
cmpne   r0, #5   ; 0x5

In [91]: ChangesFlags(0xff064bb4)
Out[91]: True

In [92]: GetString(0xff011e1c)
Out[92]: akashimorino

In [93]: isFuncStart(0xff064e98)
Out[93]: True

In [94]: GetFunctionName(0xff064e98)
Out[94]: SetFilterOff

In [95]: GetFuncOffset(0xff064e98)
Out[95]: SetFilterOff

In [96]: FuncItems(0xff064e98)
Out[96]: [4278603416L, 4278603420L, 4278603424L, 4278603428L, 4278603432L, 4278603436L, 4278603440L, 4278603444L]

In [97]: SegStart(0xff064e98)
Out[97]: 4278255616

In [98]: SegEnd(0xff064e98)
Out[98]: 4283881244

In [99]: CodeRefsTo(0xff064e98)
Out[99]: [4278603056L, 4278599968L]

In [100]: DataRefsTo(0x1ED0)

In [101]: CodeRefsFrom(0xff064e98)
Out[101]: [4278612972L, 4278598644L]

In [102]: DataRefsFrom(0xff064e98)
Out[102]: [4278603672L, 1182033235, 3, 20, 3912040463L, 12544, 3912056945L]

bunch

Something like a struct.

In [103]: b = Bunch(foo=5, baz="hello")

In [104]: b.foo
Out[104]: 5

In [105]: b.baz
Out[105]: hello
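The classic way to get such a struct-like object in Python is to copy keyword arguments into the instance dictionary. Here is a sketch under the hypothetical name SimpleBunch (Python 3); the module's own Bunch may differ in details.

```python
class SimpleBunch(object):
    """Struct-like container: keyword arguments become attributes."""

    def __init__(self, **fields):
        self.__dict__.update(fields)

    def __repr__(self):
        items = ", ".join("%s=%r" % pair for pair in sorted(self.__dict__.items()))
        return "SimpleBunch(%s)" % items


b = SimpleBunch(foo=5, baz="hello")
print(b.foo, b.baz)   # 5 hello
b.extra = [1, 2]      # attributes can be added later, as with any object
```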

emusym

Symbolic emulation of ASM code, based on SymPy. Theory is here: IDAPython/Static_analysis

These functions operate on the currently selected dump, therefore you have to select one first.

In [106]: sel ml

In [107]: CP = emusym.find_code_paths(0x8dac8)                # extract code paths by doing a recursive branch analysis

In [108]: emusym.create_graph(CP, "myfunc.svg")               # generate a graph of code flow, with graphviz

In [109]: cp = CP[0]                                          # select first code path

In [110]: emusym.resetArm()                                   # reset the symbolic ARM registers and memory contents

In [111]: emusym.emusym_code_path(cp)                         # run one code path under symbolic emulation

In [112]: emusym.emusym_code_path(cp, codetree=True)          # return a code tree, useful for decompiling
Out[112]: SEQ(IF(MEM(8 + MEM(unk_R3)), IFB(NE, SEQ(IF(MEM(4 + MEM(unk_R3)), IFB(TRUE, SEQ(MEMWRITE( ...

In [113]: emusym.split_code_path_ea_cond(cp)                  # split the code path in two plain lists
Out[113]: ([580296, 580300, 580308, 580312, 580316, 580320, 580328, 580332, 580336, 580344, 580344], 
[[], [], ['NE'], ['NE'], ['NE'], [], ['EQ'], ['EQ'], ['EQ'], ['EQ'], ['EQ']])

In [114]: emusym.GuessFunction(0x8ea48)                       # guesses the end of the function, when you know the start
Size: 424
Out[114]: 584684

bkt

Backtracing: solves the contents of ARM registers / memory addresses by symbolic emulation of ASM code, going backwards until all requested unknowns are found. Theory is here: IDAPython/Backtracing.

In [115]: ea = 0x8CE98

In [116]: g ea-28
   8ce7c:	e1a0700a 	mov	r7, r10
   8ce80:	e3a00032 	mov	r0, #50	; 0x32
   8ce84:	e3a01003 	mov	r1, #3	; 0x3
   8ce88:	e59f2090 	ldr	r2, [pc, #144]	; **'%s: ERROR Deleting config'
   8ce8c:	e59f3074 	ldr	r3, [pc, #116]	; **'config_parse'
   8ce90:	e59fa074 	ldr	r10, [pc, #116]	; 0x8cf0c: pointer to 0xFF0673EC (DebugMsg)
   8ce94:	e1a0e00f 	mov	lr, pc
   8ce98:	e12fff1a 	bx	r10
   8ce9c:	e3570000 	cmp	r7, #0	; 0x0
   8cea0:	159f507c 	ldrne	r5, [pc, #124]	; 0x8cf24: pointer to 0xFF0182CC (free)

In [117]: bkt.back_solve(ea, ['ARM.R0','ARM.R1'])             # a faster heuristic
Out[117]: [50, 3]

In [118]: bkt.back_solve_slow(ea, ['ARM.R0','ARM.R1'])        # slow, dumb, but reliable
Out[118]: [50, 3]

In [119]: bkt.go_back([ea])
Out[119]: [577172, 577176]

In [120]: bkt.go_back([ea-4,ea])
Out[120]: [577168, 577172, 577176]

In [121]: bkt.find_func_call(ea, 4)
Out[121]: (4278612972L, 'DebugMsg', "(0x32, 3, '%s: ERROR Deleting config', 'config_parse')")

In [122]: bkt.trace_calls_to("DebugMsg", 4)
Function found at ff0673ec
Out[122]: {570628: (4278612972L, 'DebugMsg', "(0x12, 3, '%s created (and exiting)', 'null_task')"), ...

These functions find the function which is called at a given address, using backtracing if needed.

In [123]: bkt.subaddr_bl(0x8CD74)
Out[123]: 575428

In [124]: bkt.subaddr_bx(0x8CE98)
Out[124]: 4278612972

In [125]: bkt.subaddr_mov(addr)                               # for MOV PC, ...

deco

Experimental decompiler based on SymPy (see module emusym). Only works for functions without loops.

In [126]: print deco.decompile(0x8D070)
*(-4 + sp0) = lr0
*(-8 + sp0) = unk_R6
*(-12 + sp0) = unk_R5
*(-16 + sp0) = unk_R4
FIO_Open(arg0, 0x1000, arg2, ...) => ret_FIO_Open_8D084
strcpy(0xb2818, arg0, arg2, ...) => ret_strcpy_8D0A0
if ret_FIO_Open_8D084 == -1:
    *(0xb27d4) = 0xb2854
    return 0xb27d4
config_parse(ret_FIO_Open_8D084, arg0, arg2, strcpy) => ret_config_parse_8D0B8
FIO_CloseFile(handle=ret_FIO_Open_8D084) => ret_FIO_CloseFile_8D0CC
*(0xb27d4) = ret_config_parse_8D0B8
return 0xb27d4

cache

Stores the result of lengthy computations.

In [127]: cache.access("my_computation", func_which_takes_ages_to_run)

In [128]: cache.disable()

In [129]: cache.enable()

In [130]: cache.clear()

There is also some experimental support for persistent cache, but it's disabled by default.
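The access/enable/disable/clear calls suggest a keyed, in-memory memoization store. Here is a minimal sketch of that pattern in Python 3. The class name ComputationCache and the behaviour while disabled (always recompute, store nothing) are assumptions, not a description of the actual module.

```python
class ComputationCache(object):
    """Keyed store: compute a value once, then return the saved result."""

    def __init__(self):
        self._store = {}
        self._enabled = True

    def access(self, key, func):
        if not self._enabled:          # assumed: bypass the store entirely
            return func()
        if key not in self._store:
            self._store[key] = func()  # first request: run the computation
        return self._store[key]

    def enable(self):
        self._enabled = True

    def disable(self):
        self._enabled = False

    def clear(self):
        self._store.clear()


cache_demo = ComputationCache()
print(cache_demo.access("answer", lambda: 6 * 7))    # computed: 42
print(cache_demo.access("answer", lambda: 0))        # served from the store: 42
```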

progress

Simple progress indicator; also computes ETA.

In [131]: progress("Doing some useless stuff...")
In [132]: time.sleep(1)
In [133]: progress(0.25)
Doing some useless stuff... [25% done, ETA 0:00:03]...
In [134]: time.sleep(2)
In [135]: progress(0.75)
Doing some useless stuff... [75% done, ETA 0:00:01]...
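The ETA values above match a simple linear extrapolation: if a fraction f of the work took t seconds, the rest should take about t * (1 - f) / f. Here is a sketch of that formula in Python 3. The function names are hypothetical, and the real indicator may smooth or round differently.

```python
import datetime


def eta_seconds(elapsed, fraction):
    """Estimate remaining seconds, assuming a constant rate of progress."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return elapsed * (1.0 - fraction) / fraction


def format_progress(message, elapsed, fraction):
    """Render a status line in the same style as the session above."""
    eta = datetime.timedelta(seconds=int(round(eta_seconds(elapsed, fraction))))
    percent = int(round(fraction * 100))
    return "%s [%d%% done, ETA %s]..." % (message, percent, eta)


# 1 second elapsed at 25% leaves about 3 seconds; 3 seconds at 75% leaves about 1.
print(format_progress("Doing some useless stuff...", 1.0, 0.25))
print(format_progress("Doing some useless stuff...", 3.0, 0.75))
```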

doc

Generates Wiki docs from IPython commands embedded in the main text, much like auto-generating docs from docstrings.

You write:

This is an example:
    2+2

and you get:

This is an example:
 In [1]: 2+2
 
 Out[1]: 4

All the code to be executed in IPython is prefixed by:

  • four spaces: normal code, i.e. show input command, run it and show output.
  • three spaces and %: show input command, run it, but do not show output.
  • three spaces and ~: show input command, but do not run it.
  • three spaces and #: run the command, but do not show anything.
  • three spaces and [list of indices]: display only those lines of output; use a string instead of index to display anything you like.

For example, an index list naming the first line, the second line, an ellipsis string and the last line displays exactly those four items, in that order. In this mode, the maximum line length is also limited to 100.

Spacing is critical, so pay attention! Also note that four spaces followed by % is a magic command.
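The prefix rules can be sketched as a small classifier. This is an illustration only, in Python 3: the mode names, classify and select_lines are all hypothetical, and the exact syntax of the index-list prefix is not modelled, only the line selection it describes.

```python
_MODES = {"%": "run_hide_output", "~": "show_only", "#": "run_silent"}


def classify(line):
    """Map a source line to (mode, command); mode is None for ordinary prose."""
    if line.startswith("    "):
        # Four spaces: normal code. This also covers magic commands,
        # which arrive here as commands that start with '%'.
        return "run_show", line[4:]
    if line.startswith("   ") and len(line) > 3 and line[3] in _MODES:
        return _MODES[line[3]], line[4:]
    return None, line


def select_lines(output_lines, selector):
    """Integers pick output lines by index; strings are emitted verbatim."""
    return [item if isinstance(item, str) else output_lines[item]
            for item in selector]


print(classify("    2+2"))                                    # ('run_show', '2+2')
print(select_lines(["a", "b", "c", "d"], [0, 1, "...", -1]))  # ['a', 'b', '...', 'd']
```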

To make a doc:


The result is written to scripts/doc/api-ref.wiki, with Wiki markup codes.


--Alexdu 16:56, December 1, 2010 (UTC)
