Magic Lantern Firmware Wiki

Revision as of 14:23, 18 November 2010

This script matches functions and data addresses across several IDC databases.

Usage

[code will be uploaded soon]

Dependencies

  • Python (I use 2.6 under Linux). Just to be sure, install numpy/scipy and ipython.
  • arm-elf-gcc in your PATH (see Build instructions/550D for how to do that)
  • It doesn't require IDAPython nor IDA, just the IDC files.

Preparing input files

Prepare a working directory where you will put the input files. You will need:

  • Some dumps, with the .bin extension. Include the load address in the dump name.
  • Some IDC files. Try to give them names somewhat similar to the dumps, to help the autodetection.
  • The script (currently called match.py), in the same folder (or in your PATH, if you like)

For example, these names are valid:

5D_204_06_0xff810000.bin
550D_108_05_0xff010000.bin
500D_0xff010000.bin
5D 204 AJROM0.idc
550D_108_20101116_indy_ROM0.idc
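
As a rough illustration of why the load address belongs in the dump name, here is a minimal sketch of how it could be recovered from the file name. The helper guess_load_addr is hypothetical and is not taken from match.py; it only assumes the address appears as 0x followed by eight hex digits, as in the names above.

```python
import re

def guess_load_addr(dump_name):
    """Hypothetical helper: return the load address embedded in a
    dump file name (0x + 8 hex digits), or None if there is none."""
    m = re.search(r"0x([0-9a-fA-F]{8})", dump_name)
    return int(m.group(1), 16) if m else None

for name in ["5D_204_06_0xff810000.bin",
             "550D_108_05_0xff010000.bin",
             "500D_0xff010000.bin"]:
    # Print the name and the address in the same style as the LoadAddr column
    print("%-30s %08X" % (name, guess_load_addr(name)))
```

A name without such a field (for instance an IDC file name) simply yields None in this sketch.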

Running

Then just say:

python match.py

and you should get something like this:

Input files:
===============================================================================
           Binary dump (*.bin)     LoadAddr     IDC database (*.idc)    
===============================================================================
      5D_204_06_0xff810000.bin     FF810000     5D 204 AJROM0.idc
           500D_0xff010000.bin     FF010000     n/a
    550D_108_05_0xff010000.bin     FF010000     550D_108_20101116_indy_ROM0.idc
===============================================================================
Disassembling 5D_204_06_0xff810000.bin <ff810000>... ok
Disassembling 500D_0xff010000.bin <ff010000>... ok
Disassembling 550D_108_05_0xff010000.bin <ff010000>... ok
Parsing 5D 204 AJROM0.idc... found 40692 MakeName's and 19191 MakeFunction's
Parsing 550D_108_20101116_indy_ROM0.idc... found 56768 MakeName's and 18053 MakeFunction's
Parsing disassembly of 5D_204_06_0xff810000.bin...
   found 1263894 lines
Parsing disassembly of 500D_0xff010000.bin...
   found 1171162 lines
Parsing disassembly of 550D_108_05_0xff010000.bin...
   found 1395198 lines
Creating codesigs for 5D_204_06_0xff810000.bin...
Creating codesigs for 550D_108_05_0xff010000.bin...
saving cache... ok
Found 6623 raw code matches between 550D_108_05_0xff010000.bin and 5D_204_06_0xff810000.bin.

Results

To find the results, just sort the working directory by modification date.

  • match-log.txt: shows detailed info about the matching process, for each pair of functions.


Advanced use and debugging

Interactive console

You can run it in IPython; after the script finishes, you can poke around and make various queries.

$ ipython
In [1]: run match.py
...


Internals

A dump is identified by its file name, which serves as the key into the script's various dictionaries.

Global variables

  • bins: list of dumps (i.e. file names with .bin extension)
In [2]: bins                                  # which dumps are loaded?
Out[2]: ['550D_108_05_0xff010000.bin', '5D_204_06_0xff810000.bin']

In [3]: t2i, mk2 = bins                       # give a short name to each one
  • loadaddrs: dictionary of load addresses for each dump
  • idcs: dictionary of idc file names for each dump
  • D: dictionary containing lots of info about dumps: ROM contents, IDC names, functions, signatures...
In [4]: D[t2i].ROM[0xff011e1c]
Out[4]: 1935764321


Functions

  • BYTE(bin, addr): read a byte from the ROM, from the dump whose file name is bin
In [5]: hex(BYTE(t2i, 0xff011e1c))
Out[5]: '0x61'
  • INT32(bin, addr): shortcut for D[bin].ROM[addr]
In [6]: hex(INT32(t2i, 0xff011e1c))
Out[6]: '0x73616b61'
  • GuessString(bin, addr): detect a string starting from addr
In [7]: GuessString(t2i, 0xff011e1c)
Out[7]: 'akashimorino'
  • funcname(bin, addr): function name extracted from IDC, or sub_ABCD1234 if it's not found
In [8]: funcname(t2i, 0xFF28AA58)
Out[8]: 'GetJpegInfo'
  • getname(bin, addr): similar to funcname, but used for other names (not functions).
In [10]: getname(t2i, 0x26284)
Out[10]: '0x26284 (sd_device)'
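
The BYTE, INT32 and GuessString outputs above are three views of the same ROM bytes. The following is a minimal sketch of that relationship on a toy buffer, not the script's own code: the names byte_at, int32_at and c_string_at are hypothetical, the buffer holds only the string from the example, and little-endian byte order and NUL termination are assumptions (consistent with the outputs shown).

```python
import struct

# Toy "ROM": only the bytes of the example string, assumed NUL-terminated,
# placed at the address used in the sessions above.
BASE = 0xFF011E1C
ROM = b"akashimorino\x00"

def byte_at(addr):
    # One unsigned byte at addr
    return struct.unpack_from("<B", ROM, addr - BASE)[0]

def int32_at(addr):
    # One unsigned 32-bit little-endian word at addr (assumed byte order)
    return struct.unpack_from("<I", ROM, addr - BASE)[0]

def c_string_at(addr):
    # Bytes from addr up to (not including) the next NUL
    start = addr - BASE
    end = ROM.index(b"\x00", start)
    return ROM[start:end].decode("ascii")

print(hex(byte_at(BASE)))    # low byte of the word: 'a'
print(hex(int32_at(BASE)))   # the four bytes 'a','k','a','s' read as one word
print(c_string_at(BASE))
```

Read as a little-endian word, the first four characters "akas" give 0x73616b61, which is the value 1935764321 shown for D[t2i].ROM[0xff011e1c].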

Functions for interactive use

  • find_funcs(bin, regex, ratio=1, num=10): Find functions using either a regex search, or a fuzzy string match
In [1]: find_funcs(t2i, r"Flavor[CS]")              # when ratio=1, it uses a regex search
ff205b24: FlavorSharpness
ff205c14: FlavorContrast
ff205dac: FlavorSaturation
ff205e9c: FlavorColorTone
In [2]: find_funcs(mk2, "DebugMsg", 0.5)            # when ratio < 1, it is the minimum similarity ratio for a fuzzy match
ff86af48: TH_DebugMsg
ff9b7660: AJ_called_by_DebugMsg
ff86b22c: AJ_DbgMgr.c
  • find_refs(bin, value): look for references to a given name or value.
In [1]: find_refs(t2i,0x2b74)

DebugMsg+112:
ff067458:	2a000003 	bcs	ff06746c <_binary_550D_108_05_0xff010000_bin_start+0x5746c>
ff06745c:	e59f00f4 	ldr	r0, [pc, #244]	; ff067558 <_binary_550D_108_05_0xff010000_bin_start+0x57558>
ff067460:	e7901101 	ldr	r1, [r0, r1, lsl #2]
pointer to 0x2b74

... etc ...
In [2]: find_refs(t2i,"sounddev")
...


  • guess_data(bin, value): return a friendly name for value. It detects whether value is a function address, a pointer to a string or a pointer to some other value in ROM (or just a plain number).
In [1]: print guess_data(t2i,0xff05de04)
pointer to 0x2e5b0

In [2]: print guess_data(t2i,0xff011e1c)
'akashimorino'

In [3]: print guess_data(t2i,0xff0673ec)
@DebugMsg
  • show_diasam(bin, start, end): displays disassembly of the code between start and end address.
  • show_func(bin, f): displays disassembly of a function, given by name or address.
In [1]: show_func(t2i, "SetFilterOff")

// Start of function: SetFilterOff
NSTUB(SetFilterOff, ff064e98):
ff064e98:	e92d4010 	push	{r4, r14}
ff064e9c:	e28f20f4 	add	r2, pc, #244	; *SetFilterOff
ff064ea0:	e3a01003 	mov	r1, #3	; 0x3
ff064ea4:	e3a00014 	mov	r0, #20	; 0x14
ff064ea8:	eb00094f 	bl	@DebugMsg	
ff064eac:	e8bd4010 	pop	{r4, r14}
ff064eb0:	e3a00c31 	mov	r0, #12544	; 0x3100
ff064eb4:	eafffb4e 	b	@audio_ic_write	
// End of function: sub_FF064EB4
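
The two search modes of find_funcs described above (regex when ratio=1, fuzzy match when ratio < 1) can be sketched as follows. This is not the script's implementation: search_names is a hypothetical stand-in, the fuzzy score uses difflib.SequenceMatcher as an assumed similarity measure, and the name table is a handful of t2i entries copied from the sessions above.

```python
import re
from difflib import SequenceMatcher

# A few (address, name) pairs copied from the example sessions.
NAMES = {
    0xff205b24: "FlavorSharpness",
    0xff205c14: "FlavorContrast",
    0xff205dac: "FlavorSaturation",
    0xff205e9c: "FlavorColorTone",
    0xff0673ec: "DebugMsg",
    0xff064e98: "SetFilterOff",
}

def search_names(names, pattern, ratio=1, num=10):
    """Hypothetical stand-in for find_funcs: regex search when ratio >= 1,
    otherwise keep names whose similarity to pattern is at least ratio."""
    if ratio >= 1:
        hits = [(a, n) for a, n in sorted(names.items())
                if re.search(pattern, n)]
    else:
        scored = []
        for a, n in names.items():
            score = SequenceMatcher(None, pattern.lower(), n.lower()).ratio()
            if score >= ratio:
                scored.append((score, a, n))
        scored.sort(reverse=True)          # best matches first
        hits = [(a, n) for _, a, n in scored]
    return hits[:num]                      # at most num results

for addr, name in search_names(NAMES, r"Flavor[CS]"):
    print("%08x: %s" % (addr, name))
for addr, name in search_names(NAMES, "DebugMesg", 0.5):   # misspelled on purpose
    print("%08x: %s" % (addr, name))
```

The fuzzy branch is what makes a misspelled or partial query usable; the threshold trades recall against noise, so a lower ratio returns more, less similar names.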