Magic Lantern Firmware Wiki
Revision as of 23:18, 19 November 2010

This is the script for matching functions and data addresses between a bunch of IDC databases.

Theory (how it works): IDAPython/Firmware matching

Usage

[code will be uploaded soon]

Dependencies

  • Python (I use 2.6 under Linux). Just to be sure, install numpy/scipy and ipython.
  • arm-elf-gcc in your PATH (see Build instructions/550D for how to do that)
  • It requires neither IDAPython nor IDA, only the IDC files.

Preparing input files

Prepare a working directory where you will put the input files. You will need:

  • Some firmware dumps, with the .bin extension. Include the load address in the file name (for example 0xff010000).
  • Some IDC files. Give them names similar to the matching dumps, to help the autodetection pair them up.
  • The script (currently called match.py), in the same folder (or in PATH, if you like)

For example, these names are valid:

5D_204_06_0xff810000.bin
550D_108_05_0xff010000.bin
500D_0xff010000.bin
5D 204 AJROM0.idc
550D_108_20101116_indy_ROM0.idc
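
The load address is read from the dump name. The script's own code is not published yet, so the following is only a minimal sketch of how that could be done; the helper name guess_loadaddr and the regular expression are assumptions, not the script's actual implementation.

```python
import re

# Hypothetical helper (not from match.py): pull the load address out of a
# dump file name. Assumes it appears as a 0x-prefixed hex number, as in the
# example names above.
LOADADDR_RE = re.compile(r"0x([0-9a-fA-F]{1,8})")

def guess_loadaddr(filename):
    """Return the load address encoded in filename, or None if there is none."""
    m = LOADADDR_RE.search(filename)
    return int(m.group(1), 16) if m else None

for name in ["5D_204_06_0xff810000.bin",
             "550D_108_05_0xff010000.bin",
             "500D_0xff010000.bin"]:
    print("%-30s %08X" % (name, guess_loadaddr(name)))
```

A name without a 0x... part (such as the IDC names above) gives None, so a caller can tell that the address is missing.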

Running

Then run:

python match.py

and you should get something like this:

Input files:
===============================================================================
           Binary dump (*.bin)     LoadAddr     IDC database (*.idc)    
===============================================================================
      5D_204_06_0xff810000.bin     FF810000     5D 204 AJROM0.idc
           500D_0xff010000.bin     FF010000     n/a
    550D_108_05_0xff010000.bin     FF010000     550D_108_20101116_indy_ROM0.idc
===============================================================================
Disassembling 5D_204_06_0xff810000.bin <ff810000>... ok
Disassembling 500D_0xff010000.bin <ff010000>... ok
Disassembling 550D_108_05_0xff010000.bin <ff010000>... ok
Parsing 5D 204 AJROM0.idc... found 40692 MakeName's and 19191 MakeFunction's
Parsing 550D_108_20101116_indy_ROM0.idc... found 56768 MakeName's and 18053 MakeFunction's
Parsing disassembly of 5D_204_06_0xff810000.bin...
   found 1263894 lines
Parsing disassembly of 500D_0xff010000.bin...
   found 1171162 lines
Parsing disassembly of 550D_108_05_0xff010000.bin...
   found 1395198 lines
Creating codesigs for 5D_204_06_0xff810000.bin...
Creating codesigs for 550D_108_05_0xff010000.bin...
saving cache... ok
Found 6623 raw code matches between 550D_108_05_0xff010000.bin and 5D_204_06_0xff810000.bin.
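
The "Parsing ... .idc" lines in the log above count MakeName and MakeFunction statements. As a rough illustration of why IDA itself is not needed, here is a sketch that extracts both with regular expressions. The exact statement layout (0X-prefixed addresses, optional whitespace before the parenthesis) and the function name parse_idc are assumptions; the sample statements are made up from the SetFilterOff addresses shown further down this page.

```python
import re

# Assumed IDC statement shapes, e.g.:
#   MakeName     (0XFF064E98, "SetFilterOff");
#   MakeFunction (0XFF064E98,0XFF064EB8);
HEX = r"(0[xX][0-9a-fA-F]+)"
MAKENAME_RE = re.compile(r'MakeName\s*\(\s*' + HEX + r'\s*,\s*"([^"]*)"\s*\)')
MAKEFUNC_RE = re.compile(r'MakeFunction\s*\(\s*' + HEX + r'\s*,\s*' + HEX + r'\s*\)')

def parse_idc(text):
    """Return (names, funcs): addr -> name, and function start -> end."""
    names = {}
    funcs = {}
    for addr, name in MAKENAME_RE.findall(text):
        names[int(addr, 16)] = name
    for start, end in MAKEFUNC_RE.findall(text):
        funcs[int(start, 16)] = int(end, 16)
    return names, funcs

# Made-up sample, not copied from a real IDC file.
sample = '''
    MakeName\t(0XFF064E98,\t"SetFilterOff");
    MakeFunction\t(0XFF064E98,0XFF064EB8);
'''
names, funcs = parse_idc(sample)
print("found %d MakeName's and %d MakeFunction's" % (len(names), len(funcs)))
```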

Results

To find the results, just sort the working directory by modification date.

  • match-log.txt: shows detailed info about the matching process, for each pair of functions.


Advanced use and debugging

Interactive console

The script invokes IPython at the end of the automatic initial analysis; here you can browse the dump, find/verify matches between firmware versions, and lots of other cool stuff.

$ python main.py

... lots of messages ...
ARM firmware analysis console ready.
In [1]:

Internals

[rewriting...]

Global variables

  • D: dictionary containing Dump objects, indexed by their file name.
In [1]: D
Out[1]:
{'550D_108_05_0xff010000.bin': Dump of 550D_108_05_0xff010000.bin,
 '5D_204_06_0xff810000.bin': Dump of 5D_204_06_0xff810000.bin}
In [2]: D.values()
Out[2]: [Dump of 5D_204_06_0xff810000.bin, Dump of 550D_108_05_0xff010000.bin]
In [3]: mk2,t2i = D.values()
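
Note that D is a plain dictionary, and in Python 2 the order of D.values() is arbitrary (compare Out[1] and Out[2] above), so unpacking it positionally can swap the two dumps. A small sketch of picking dumps by file name instead; pick is a hypothetical helper and the dictionary below is a stand-in that holds strings rather than real Dump objects.

```python
def pick(dumps, prefix):
    """Return the single value in dumps whose key starts with prefix."""
    hits = [v for k, v in dumps.items() if k.startswith(prefix)]
    if len(hits) != 1:
        raise KeyError("%r matches %d dumps" % (prefix, len(hits)))
    return hits[0]

# Stand-in for the real D: plain strings instead of Dump objects.
D = {'550D_108_05_0xff010000.bin': 'Dump of 550D_108_05_0xff010000.bin',
     '5D_204_06_0xff810000.bin': 'Dump of 5D_204_06_0xff810000.bin'}

t2i = pick(D, "550D_")
mk2 = pick(D, "5D_")
print(mk2)
print(t2i)
```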

Classes/objects

Dump

Contains all info about a dump.

Fields: the uppercase ones are dictionaries.

In [4]: t2i. <TAB>
t2i.A2N                t2i.Fun                t2i.__class__          t2i.__str__            t2i.loadaddr
t2i.ARGS               t2i.MNEF               t2i.__doc__            t2i._get_strings       t2i.refs
t2i.DATA               t2i.N2A                t2i.__init__           t2i._get_strings_work  t2i.strings
t2i.DISASM             t2i.RAWASM             t2i.__module__         t2i.bin                t2i.strrefs
t2i.FUNCS              t2i.ROM                t2i.__repr__           t2i.funcs              
In [10]: t2i.bin, t2i.loadaddr
Out[10]: ('550D_108_05_0xff010000.bin', 4278255616L)
In [11]: t2i.funcs("japan")
ff20cf64: get_JapanLang_struct_14c48_2a0
ff4369ac: StopMnLanguageJapanApp
ff436d30: StartMnLanguageJapanApp
ff0978bc: GUI_LimitLangJapan
ff4369e4: MnLanguageJapan_handler
ff436de0: language_japan_something
In [12]: t2i.refs("sounddev")
SoundDevStartOut+32:
ff053d70:	e51f4b54 	ldr	r4, [pc, #-2900]	; ff053224 <_binary_550D_108_05_0xff010000_bin_start+0x43224>
pointer to 0x1ED0 (sounddev)
...etc...
In [13]: t2i.strrefs("^CreateTask$")
String references to ff06e2c8 'CreateTask':
createTask_maybe+68:
ff06e158:	028f0f5a 	addeq	r0, pc, #360	; *'CreateTask'
'CreateTask'


Fun

An ASM function. It can be constructed in two ways:

  • Fun(dump, name_or_addr)
  • dump.Fun(name_or_addr)
In [50]: f = Fun(t2i,"setFiltreOff")
Unknown function: setFiltreOff. Using closest match: SetFilterOff.

In [51]: f
Out[51]: SetFilterOff at 0xff064e98 in 550D_108_05_0xff010000.bin

Fields:

In [52]: f. <TAB>
f.__class__   f.__init__    f.__repr__    f._get_end    f._get_size   f.called_by   f.disasm      f.end         f.sig
f.__doc__     f.__module__  f.__str__     f._get_sig    f.addr        f.calls       f.dump        f.refs        f.size
In [53]: "%x"%f.addr
Out[53]: 'ff064e98'

In [54]: "%x"%f.end
Out[54]: 'ff064eb8'
In [55]: f.size
Out[55]: 32
In [60]: f.called_by()
ff064120: sub_FF064114+12
ff064d30: UnpowerMicAmp+40
In [61]: f.calls()
ff064ea8:	eb00094f 	bl	@DebugMsg	
ff064eb4:	eafffb4e 	b	@audio_ic_write	
In [62]: f.disasm()
// Start of function: SetFilterOff
NSTUB(SetFilterOff, ff064e98):
ff064e98:	e92d4010 	push	{r4, r14}
ff064e9c:	e28f20f4 	add	r2, pc, #244	; *SetFilterOff
ff064ea0:	e3a01003 	mov	r1, #3	; 0x3
ff064ea4:	e3a00014 	mov	r0, #20	; 0x14
ff064ea8:	eb00094f 	bl	@DebugMsg	
ff064eac:	e8bd4010 	pop	{r4, r14}
ff064eb0:	e3a00c31 	mov	r0, #12544	; 0x3100
ff064eb4:	eafffb4e 	b	@audio_ic_write	
// End of function: sub_FF064EB4
In [63]: f.sig
'push add mov mov bl pop mov b '
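
Judging from the output above, f.sig is the sequence of instruction mnemonics of the function, each followed by a space. The sketch below rebuilds that string from the disassembly listing; codesig and the line-matching regular expression are assumptions for illustration, not the script's actual code.

```python
import re

# An instruction line looks like: "ff064e98:<TAB>e92d4010 <TAB>push<TAB>{r4, r14}".
# Comment lines and the NSTUB(...) header do not match and are skipped.
INSN_RE = re.compile(r"^([0-9a-f]{8}):\s+([0-9a-f]{8})\s+(\S+)")

def codesig(disasm_text):
    """Concatenate the mnemonics of all instruction lines, space-terminated."""
    sig = ""
    for line in disasm_text.splitlines():
        m = INSN_RE.match(line)
        if m:
            sig += m.group(3) + " "
    return sig

listing = "\n".join([
    "// Start of function: SetFilterOff",
    "NSTUB(SetFilterOff, ff064e98):",
    "ff064e98:\te92d4010 \tpush\t{r4, r14}",
    "ff064e9c:\te28f20f4 \tadd\tr2, pc, #244\t; *SetFilterOff",
    "ff064ea0:\te3a01003 \tmov\tr1, #3\t; 0x3",
    "ff064ea4:\te3a00014 \tmov\tr0, #20\t; 0x14",
    "ff064ea8:\teb00094f \tbl\t@DebugMsg\t",
    "ff064eac:\te8bd4010 \tpop\t{r4, r14}",
    "ff064eb0:\te3a00c31 \tmov\tr0, #12544\t; 0x3100",
    "ff064eb4:\teafffb4e \tb\t@audio_ic_write\t",
    "// End of function: sub_FF064EB4",
])
print(repr(codesig(listing)))  # 'push add mov mov bl pop mov b '
```

Because operands are dropped, such a signature stays the same when addresses and offsets differ between firmware versions, which is presumably what makes it useful for matching.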


Functions

(These should perhaps be moved into the Dump class.)

  • BYTE(dump, addr): read a byte from the ROM, from the dump object dump
In [5]: hex(BYTE(t2i, 0xff011e1c))
Out[5]: '0x61'
  • INT32(dump, addr): read a 32-bit word from the ROM; shortcut for D[bin].ROM[addr]
In [6]: hex(INT32(t2i, 0xff011e1c))
Out[6]: '0x73616b61'
  • GuessString(dump, addr): detect a string starting from addr
In [7]: GuessString(t2i, 0xff011e1c)
Out[7]: 'akashimorino'
  • funcname(dump, addr): function name extracted from IDC, or sub_ABCD1234 if it's not found
In [8]: funcname(t2i, 0xFF28AA58)
Out[8]: 'GetJpegInfo'
  • getname(dump, addr): similar to funcname, but used for other names (not functions).
In [10]: getname(t2i, 0x26284)
Out[10]: '0x26284 (sd_device)'
  • guess_data(dump, value): return a friendly name for value. It detects whether value is a function address, a pointer to a string or a pointer to some other value in ROM (or just a plain number).
In [1]: print guess_data(t2i,0xff05de04)
pointer to 0x2e5b0

In [2]: print guess_data(t2i,0xff011e1c)
'akashimorino'

In [3]: print guess_data(t2i,0xff0673ec)
@DebugMsg
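
To make the relation between the first three helpers concrete, here is a self-contained sketch of how a byte, a little-endian 32-bit word and a NUL-terminated string can be read at a ROM address. FakeDump, byte_at, int32_at and guess_string are hypothetical names used only here; the real functions work on Dump objects and may behave differently (for instance in how they decide that something is a string).

```python
import struct

class FakeDump(object):
    """Hypothetical stand-in for Dump: raw bytes plus the load address."""
    def __init__(self, raw, loadaddr):
        self.raw = bytearray(raw)
        self.loadaddr = loadaddr

def byte_at(dump, addr):
    return dump.raw[addr - dump.loadaddr]

def int32_at(dump, addr):
    # ARM firmware here is little-endian, hence "<I".
    return struct.unpack_from("<I", dump.raw, addr - dump.loadaddr)[0]

def guess_string(dump, addr):
    """Return the printable ASCII run at addr if it is NUL-terminated, else None."""
    start = addr - dump.loadaddr
    end = start
    while end < len(dump.raw) and 0x20 <= dump.raw[end] < 0x7f:
        end += 1
    if end == start or end >= len(dump.raw) or dump.raw[end] != 0:
        return None
    return dump.raw[start:end].decode("ascii")

# Tiny fake ROM that starts right at the string used in the examples above.
rom = FakeDump(b"akashimorino\x00", 0xff011e1c)
print(hex(byte_at(rom, 0xff011e1c)))    # 0x61
print(hex(int32_at(rom, 0xff011e1c)))   # 0x73616b61
print(guess_string(rom, 0xff011e1c))    # akashimorino
```

The word 0x73616b61 is simply the first four characters 'a', 'k', 'a', 's' read in little-endian order, which is why BYTE, INT32 and GuessString agree in the session above.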

Functions for interactive use

They neither return nor change anything; they only print their results to the console.

  • find_funcs(bin, regex, ratio=1, num=10): Find functions using either a regex search, or a fuzzy string match
In [1]: find_funcs(t2i, r"Flavor[C|S]")             # when ratio=1, it uses a regex search
ff205b24: FlavorSharpness
ff205c14: FlavorContrast
ff205dac: FlavorSaturation
ff205e9c: FlavorColorTone
In [2]: find_funcs(mk2, "DebugMsg", 0.5)            # when ratio < 1, this is the min. allowed ratio for fuzzy search      
ff86af48: TH_DebugMsg
ff9b7660: AJ_called_by_DebugMsg
ff86b22c: AJ_DbgMgr.c
  • find_refs(bin, value): look for references to a given name or value.
In [1]: find_refs(t2i,0x2b74)

DebugMsg+112:
ff067458:	2a000003 	bcs	ff06746c <_binary_550D_108_05_0xff010000_bin_start+0x5746c>
ff06745c:	e59f00f4 	ldr	r0, [pc, #244]	; ff067558 <_binary_550D_108_05_0xff010000_bin_start+0x57558>
ff067460:	e7901101 	ldr	r1, [r0, r1, lsl #2]
pointer to 0x2b74

... etc ...
In [2]: find_refs(t2i,"sounddev")
... lots of entries ...
  • show_diasam(bin, start, end): displays disassembly of the code between start and end address.
  • show_func(bin, f): displays disassembly of a function, given by name or address.
In [1]: show_func(t2i, "SetFilterOff")

// Start of function: SetFilterOff
NSTUB(SetFilterOff, ff064e98):
ff064e98:	e92d4010 	push	{r4, r14}
ff064e9c:	e28f20f4 	add	r2, pc, #244	; *SetFilterOff
ff064ea0:	e3a01003 	mov	r1, #3	; 0x3
ff064ea4:	e3a00014 	mov	r0, #20	; 0x14
ff064ea8:	eb00094f 	bl	@DebugMsg	
ff064eac:	e8bd4010 	pop	{r4, r14}
ff064eb0:	e3a00c31 	mov	r0, #12544	; 0x3100
ff064eb4:	eafffb4e 	b	@audio_ic_write	
// End of function: sub_FF064EB4
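
The two search modes of find_funcs can be sketched with the standard library as follows. This is not the script's code: find_funcs_sketch is a hypothetical name, it takes a plain addr -> name dictionary instead of a dump, it returns the hits instead of printing them, and both the case-insensitive regex and the use of difflib for the fuzzy ratio are assumptions.

```python
import re
import difflib

def find_funcs_sketch(names, pattern, ratio=1, num=10):
    """names: dict addr -> name. Returns up to num (addr, name) pairs."""
    if ratio >= 1:
        # Regex mode: keep every name the pattern matches, ordered by address.
        rx = re.compile(pattern, re.IGNORECASE)
        hits = sorted((a, n) for a, n in names.items() if rx.search(n))
    else:
        # Fuzzy mode: keep names whose similarity is at least ratio, best first.
        scored = []
        for a, n in names.items():
            r = difflib.SequenceMatcher(None, pattern.lower(), n.lower()).ratio()
            if r >= ratio:
                scored.append((r, a, n))
        scored.sort(reverse=True)
        hits = [(a, n) for _, a, n in scored]
    return hits[:num]

# Names and addresses taken from the sessions above.
t2i_names = {0xff205b24: "FlavorSharpness", 0xff205c14: "FlavorContrast",
             0xff205dac: "FlavorSaturation", 0xff205e9c: "FlavorColorTone"}
mk2_names = {0xff86af48: "TH_DebugMsg", 0xff9b7660: "AJ_called_by_DebugMsg",
             0xff86b22c: "AJ_DbgMgr.c"}

for addr, name in find_funcs_sketch(t2i_names, r"Flavor[CS]"):
    print("%x: %s" % (addr, name))
for addr, name in find_funcs_sketch(mk2_names, "DebugMsg", 0.5):
    print("%x: %s" % (addr, name))
```

Inside a character class the | is a literal character, so Flavor[CS] is enough to match names continuing with C or S; Flavor[C|S] from the session above matches the same names plus any that continue with a literal |.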