Magic Lantern Firmware Wiki

This script matches function and data addresses across multiple firmware dumps, using the names stored in their IDC databases.

Usage

[code will be uploaded soon]

Dependencies

  • Python (I use 2.6 under Linux). To be safe, also install numpy/scipy and ipython.
  • arm-elf-gcc in your PATH (see Build instructions/550D for how to do that)
  • Neither IDA nor IDAPython is required, only the IDC files.

Preparing input files

Prepare a working directory where you will put the input files. You will need:

  • One or more dumps with the .bin extension. Include the load address in the file name (for example 0xff010000).
  • One or more IDC files. Give them names similar to the dumps, so that autodetection can pair each IDC with its dump.
  • The script (currently called match.py), in the same folder (or in PATH, if you like)

For example, those names are valid:

5D_204_06_0xff810000.bin
550D_108_05_0xff010000.bin
500D_0xff010000.bin
5D 204 AJROM0.idc
550D_108_20101116_indy_ROM0.idc
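
The load address shown later in the "LoadAddr" column is presumably recovered from the dump's file name. Here is a minimal sketch of how that could work, assuming the address always appears as a 0x-prefixed, 8-digit hex number. The helper parse_load_addr is hypothetical and is not taken from match.py:

```python
import re

def parse_load_addr(filename):
    """Return the load address embedded in a dump file name, or None.

    Assumption: the address is written as 0x followed by 8 hex digits.
    """
    m = re.search(r"0x([0-9a-fA-F]{8})", filename)
    return int(m.group(1), 16) if m else None

for name in ["5D_204_06_0xff810000.bin", "500D_0xff010000.bin"]:
    print("%-28s %08X" % (name, parse_load_addr(name)))
```

A name without such a pattern (for example an IDC file name) yields None in this sketch, so the caller can decide how to report it.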

Running

Then just say:

python match.py

and you should get something like this:

Input files:
===============================================================================
           Binary dump (*.bin)     LoadAddr     IDC database (*.idc)    
===============================================================================
      5D_204_06_0xff810000.bin     FF810000     5D 204 AJROM0.idc
           500D_0xff010000.bin     FF010000     n/a
    550D_108_05_0xff010000.bin     FF010000     550D_108_20101116_indy_ROM0.idc
===============================================================================
Disassembling 5D_204_06_0xff810000.bin <ff810000>... ok
Disassembling 500D_0xff010000.bin <ff010000>... ok
Disassembling 550D_108_05_0xff010000.bin <ff010000>... ok
Parsing 5D 204 AJROM0.idc... found 40692 MakeName's and 19191 MakeFunction's
Parsing 550D_108_20101116_indy_ROM0.idc... found 56768 MakeName's and 18053 MakeFunction's
Parsing disassembly of 5D_204_06_0xff810000.bin...
   found 1263894 lines
Parsing disassembly of 500D_0xff010000.bin...
   found 1171162 lines
Parsing disassembly of 550D_108_05_0xff010000.bin...
   found 1395198 lines
Creating codesigs for 5D_204_06_0xff810000.bin...
Creating codesigs for 550D_108_05_0xff010000.bin...
saving cache... ok
Found 6623 raw code matches between 550D_108_05_0xff010000.bin and 5D_204_06_0xff810000.bin.
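
The log reports how many MakeName and MakeFunction statements were found in each IDC file. The sketch below shows one way such statements could be collected with regular expressions. It assumes the statement layout that IDA typically writes when dumping a database to IDC; the helper parse_idc_text and the sample addresses are illustrative, not taken from match.py:

```python
import re

# Assumed statement shapes (illustrative):
#   MakeName      (0XFF810000, "reset");
#   MakeFunction  (0XFF81000C,0XFF810058);
MAKE_NAME = re.compile(r'MakeName\s*\(\s*(0[xX][0-9a-fA-F]+)\s*,\s*"([^"]*)"\s*\)')
MAKE_FUNC = re.compile(r'MakeFunction\s*\(\s*(0[xX][0-9a-fA-F]+)\s*,\s*(0[xX][0-9a-fA-F]+)\s*\)')

def parse_idc_text(text):
    """Return (names, funcs): address -> name, and function start -> end."""
    names = {}
    for addr, name in MAKE_NAME.findall(text):
        names[int(addr, 16)] = name
    funcs = {}
    for start, end in MAKE_FUNC.findall(text):
        funcs[int(start, 16)] = int(end, 16)
    return names, funcs

sample = ('MakeName\t(0XFF810000,\t"reset");\n'
          'MakeFunction\t(0XFF81000C,0XFF810058);\n')
names, funcs = parse_idc_text(sample)
print("found %d MakeName's and %d MakeFunction's" % (len(names), len(funcs)))
```

Because only the text of the IDC file is inspected, this approach needs neither IDA nor IDAPython.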

Results

To find the results, just sort the working directory by modification date.

  • match-log.txt: shows detailed info about the matching process, for each pair of functions.


Advanced use and debugging

Interactive console

You can run it in IPython; after the script finishes, you can poke around and make various queries.

$ ipython
In [1]: run match.py
...
In [2]: bins                                  # which dumps are loaded?
Out[2]: ['550D_108_05_0xff010000.bin', '5D_204_06_0xff810000.bin']

In [3]: t2i, mk2 = bins                       # give a short name to each one
In [4]: hex(D[t2i].ROM[0xff011e1c])           # read from ROM; only multiples of 4 allowed here
Out[4]: '0x73616b61'
In [5]: BYTE(t2i, 0xff011bde)                 # this is for any address; reads a single byte from ROM
Out[5]: 143
In [6]: GuessString?                          # how to get help for a function
...
Definition:    GuessString(ROM, a)
...
In [7]: GuessString(t2i, 0xff011e1c)          # find a string starting from a known address
Out[7]: 'akashimorino'
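
The session above suggests that the ROM is indexed by 32-bit little-endian words: 0x73616b61 holds the bytes 'a', 'k', 'a', 's' in memory order. As a rough sketch of how a per-byte read and a string guess could be layered on top of such a word map (byte_at, guess_string and words_from_bytes are hypothetical names, not the script's own functions):

```python
import struct

def words_from_bytes(data, load_addr):
    """Build {address: 32-bit little-endian word} from raw dump bytes."""
    return dict((load_addr + off, struct.unpack_from("<I", data, off)[0])
                for off in range(0, len(data) - 3, 4))

def byte_at(rom, addr):
    """Read one byte at any address from a word-indexed ROM map."""
    word = rom[addr & ~3]                  # containing aligned word
    return (word >> (8 * (addr & 3))) & 0xFF

def guess_string(rom, addr, max_len=64):
    """Collect printable ASCII characters starting at addr."""
    chars = []
    for a in range(addr, addr + max_len):
        b = byte_at(rom, a)
        if not 32 <= b < 127:              # stop at NUL or non-printable
            break
        chars.append(chr(b))
    return "".join(chars)

rom = words_from_bytes(b"akashimorino\0\0\0\0", 0xff011e1c)
print(hex(rom[0xff011e1c]))
print(guess_string(rom, 0xff011e1c))
```

This sketch raises KeyError if a string runs past the end of the loaded data; a real tool would need to handle that case.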

Internals

Each dump is identified by its file name, which serves as the key into the script's dictionaries.

Global variables

  • bins: list of dumps (i.e. file names with .bin extension)
  • loadaddrs: dictionary of load addresses for each dump
  • idcs: dictionary of idc file names for each dump
  • D: dictionary containing lots of info about dumps: ROM contents, IDC names, functions, signatures...


Functions

  • BYTE(bin, addr): read a byte from the ROM, from the dump whose file name is bin
  • GuessString(bin, addr): detect a string starting from addr
  • funcname(bin, addr): function name extracted from IDC, or sub_ABCD1234 if it's not found
  • getname(bin, addr): similar to funcname, but used for other names (not functions).
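
The fallback naming described for funcname can be illustrated with a plain dictionary lookup. This is only a sketch: funcname_sketch and the func_names dictionary are hypothetical stand-ins for whatever the script stores per dump.

```python
def funcname_sketch(func_names, addr):
    """Return the IDC name for addr, or an IDA-style sub_XXXXXXXX placeholder."""
    return func_names.get(addr, "sub_%08X" % addr)

func_names = {0xff0673ec: "DebugMsg"}      # address -> name, as parsed from an IDC
print(funcname_sketch(func_names, 0xff0673ec))
print(funcname_sketch(func_names, 0xABCD1234))
```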


Functions for interactive use

  • find_funcs(bin, regex): find functions using a regex string
In [1]: find_funcs(t2i, r"Flavor[CS]")
ff205b24: FlavorSharpness
ff205c14: FlavorContrast
ff205dac: FlavorSaturation
ff205e9c: FlavorColorTone
  • find_data_ref(bin, value): look for references to a given value.
In [1]: find_data_ref(t2i,0x2b74)

DebugMsg+112:
ff067458:	2a000003 	bcs	ff06746c <_binary_550D_108_05_0xff010000_bin_start+0x5746c>
ff06745c:	e59f00f4 	ldr	r0, [pc, #244]	; ff067558 <_binary_550D_108_05_0xff010000_bin_start+0x57558>
ff067460:	e7901101 	ldr	r1, [r0, r1, lsl #2]
pointer to 0x2b74

... etc ...
  • guess_data(bin, value): return a friendly name for value. It detects whether value is a function address, a pointer to a string or a pointer to some other value in ROM (or just a plain number).
In [1]: print guess_data(t2i,0xff05de04)
pointer to 0x2e5b0

In [2]: print guess_data(t2i,0xff011e1c)
'akashimorino'

In [3]: print guess_data(t2i,0xff0673ec)
@DebugMsg
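
The classification that guess_data performs can be sketched roughly as follows. The order of the checks, the minimum string length and the name describe_value are assumptions made for illustration; the real script may decide differently.

```python
import struct

def cstring_at(rom, off, min_len=4):
    """Return the NUL-terminated printable string at rom[off:], or None."""
    end = off
    while end < len(rom) and 32 <= rom[end] < 127:
        end += 1
    if end - off >= min_len and end < len(rom) and rom[end] == 0:
        return rom[off:end].decode("ascii")
    return None

def describe_value(value, rom, load_addr, func_names):
    """Give a friendly description of value (hypothetical guess_data analogue)."""
    if value in func_names:                        # known function address
        return "@" + func_names[value]
    off = value - load_addr
    if not 0 <= off < len(rom):                    # outside the dump: plain number
        return hex(value)
    s = cstring_at(rom, off)
    if s is not None:                              # pointer to a string
        return repr(s)
    if off % 4 == 0 and off + 4 <= len(rom):       # pointer to some other word
        return "pointer to 0x%x" % struct.unpack_from("<I", rom, off)[0]
    return hex(value)

load_addr = 0xff010000
rom = bytearray(struct.pack("<I", 0x2e5b0) + b"akashimorino\0")   # tiny fake dump
funcs = {0xff0673ec: "DebugMsg"}
for v in (0xff010000, 0xff010004, 0xff0673ec, 0x2b74):
    print(describe_value(v, rom, load_addr, funcs))
```

Checking the function table first keeps code addresses from being misread as data, which is a design choice of this sketch rather than a documented behaviour.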