Types
Radare2 supports describing data types in C syntax. These types are parsed by a C11-compatible parser and stored in the internal SDB, so they can be inspected with the k command.
Most of the related commands live in the t namespace:
[0x000051c0]> t?
Usage: t # cparse types commands
| t List all loaded types
| tj List all loaded types as json
| t <type> Show type in 'pf' syntax
| t* List types info in r2 commands
| t- <name> Delete types by its name
| t-* Remove all types
| tail [filename] Output the last part of files
| tc[type.name] List all/given types in C output format
| te[?] List all loaded enums
| td[?] <string> Load types from string
| tf List all loaded functions signatures
| tk <sdb-query> Perform sdb query
| tl[?] Show/Link type to an address
| tn[?] [-][addr] manage noreturn function attributes and marks
| to - Open cfg.editor to load types
| to <path> Load types from C header file
| toe[type.name] Open cfg.editor to edit types
| tos <path> Load types from parsed Sdb database
| tp <type> [addr|varname] cast data at <address> to <type> and print it
| tpx <type> <hexpairs> Show value for type with specified byte sequence
| ts[?] Print loaded struct types
| tu[?] Print loaded union types
| tx[f?] Type xrefs
| tt[?] List all loaded typedefs
Note that the basic (atomic) types are not those of the C standard: there is no char, _Bool, or short. Because those types can differ from one platform to another, radare2 uses types of definite size such as int8_t or uint64_t, and converts int to int32_t or int64_t depending on the platform and compiler of the binary or debuggee.
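To see why fixed-width names are the safer choice, here is a minimal sketch using Python's ctypes module. It is only an illustration of the platform dependence described above and has nothing to do with how radare2 itself resolves sizes.

```python
import ctypes

# Plain C types: their width depends on the platform and compiler ABI.
platform_sized = {
    "int": ctypes.sizeof(ctypes.c_int),
    "long": ctypes.sizeof(ctypes.c_long),
    "void *": ctypes.sizeof(ctypes.c_void_p),
}

# Fixed-width types: the same width on every platform.
fixed_width = {
    "int8_t": ctypes.sizeof(ctypes.c_int8),
    "int32_t": ctypes.sizeof(ctypes.c_int32),
    "uint64_t": ctypes.sizeof(ctypes.c_uint64),
}

# Print every size in bits; only the first group may vary between machines.
for name, size in {**platform_sized, **fixed_width}.items():
    print(f"{name:>8}: {size * 8} bits")
```

On a typical 64-bit Linux system long is 64 bits, while on 64-bit Windows it is 32 bits; the fixed-width group prints the same values everywhere.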
Basic types can be listed with the t command. For structured types, use ts (structs), tu (unions), or te (enums).
Loading types
There are three easy ways to define a new type:
- Directly from a string using the td command
- From a file using the to <filename> command
- Open an $EDITOR to type the definitions in place using to -
Also note that there is a config option to specify include directories for type parsing.
Printing types
The ts command converts the C type description (or, to be precise, its SDB representation) into a sequence of pf commands. See more about print format.
The tp command uses the pf string to print all the members of a type at the current offset or at a given address.
You can also supply your own data for the struct and print it using the tpx command.
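The idea of casting a byte sequence to a type can be sketched outside radare2 with Python's struct module. The struct `pair` and the helper `cast_bytes` below are hypothetical names made up for this illustration; the snippet shows the concept only and is not how tp or tpx are implemented.

```python
import struct

# Hypothetical type: struct pair { int32_t a; int32_t b; };
# "<" means little-endian with no padding, "i" is a 4-byte signed integer.
PAIR_FORMAT = "<ii"
PAIR_FIELDS = ("a", "b")

def cast_bytes(hexpairs: str) -> dict:
    """Decode a string of hex pairs as a `pair` and return its members by name."""
    raw = bytes.fromhex(hexpairs)
    values = struct.unpack(PAIR_FORMAT, raw)
    return dict(zip(PAIR_FIELDS, values))

print(cast_bytes("0100000002000000"))  # {'a': 1, 'b': 2}
```

The format string plays the same role as a pf string: it tells the decoder how to slice raw bytes into named members.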
Linking Types
The tp command only performs a temporary cast. To link an address or a variable with the chosen type, use the tl command, which stores the relationship in SDB.
Moreover, the link will be shown in the disassembly output or visual mode:
Once the struct is linked, radare2 tries to propagate the structure offsets within the function at the current offset. To run this analysis on the whole program, or on selected functions, after all structs are linked, use the aat command:
Note that the emulation is not always accurate. For example, the return value of malloc may differ between two emulations, so you have to set the return value hint manually with the ahr command, and then run tl or aat once the hint is in place.
Structure Immediates
There is one more important aspect of using types in radare2: with aht you can change the immediate in an opcode to a structure offset. Let's look at a simple example of [R]SI-relative addressing.
Here 8 is an offset in memory, and rsi probably holds a pointer to some structure. Imagine that we have the following structures:
Now we need to set the proper structure member offset instead of 8 in this instruction. First, we list the available types that have a member at this offset:
Note that ms2 is not listed, because it has no member at offset 8. After listing the available options, we can link the chosen one to the offset at the current address:
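The matching step can be sketched as a simple lookup. The layouts below are hypothetical: only the names ms, ms1 and ms2 come from the text, while the members and offsets are invented for illustration and assume packed structs with no padding. This is a model of the idea, not radare2's implementation.

```python
# Hypothetical layouts: member name -> byte offset (assumed packed, no padding).
LAYOUTS = {
    "ms":  {"b": 0, "member1": 8, "member2": 12},
    "ms1": {"a": 0, "member1": 8},
    "ms2": {"a": 0, "b": 2, "member1": 10},
}

def members_at(offset: int) -> list:
    """Return every struct.member whose offset equals the given immediate."""
    return [
        f"{struct_name}.{member}"
        for struct_name, members in LAYOUTS.items()
        for member, member_offset in members.items()
        if member_offset == offset
    ]

print(members_at(8))  # ['ms.member1', 'ms1.member1']
```

In this model ms2 does not appear for offset 8 for the same reason as in the text: none of its members starts there.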
Managing enums
- Printing all fields of an enum using the te command
- Finding the matching enum member for a given bitfield and vice versa
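Both lookups can be modelled with Python's enum module. The enum `Foo` and its members are hypothetical; the snippet only mirrors the two directions of the lookup (value to name, name to value) and does not reproduce radare2 output.

```python
from enum import IntEnum

class Foo(IntEnum):
    """Hypothetical enum used only for this illustration."""
    COW = 1
    BAR = 2

# All fields of the enum, as (name, value) pairs.
fields = [(member.name, member.value) for member in Foo]
print(fields)             # [('COW', 1), ('BAR', 2)]

# Value -> member name, and member name -> value.
print(Foo(1).name)        # COW
print(Foo["BAR"].value)   # 2
```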
Internal representation
To see the internal representation of the types, use the tk command:
Defining primitive types requires an understanding of the basic pf formats. You can find the whole list of format specifiers in pf??.
There are three mandatory keys for defining basic data types:
- X=type
- type.X=format_specifier
- type.X.size=size_in_bits
For example, let's define UINT. According to the Microsoft documentation, UINT is equivalent to the standard C unsigned int (or uint32_t in terms of the TCC engine). It will be defined as:
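As a sketch of that definition, the snippet below builds the three mandatory keys for UINT. The helper `basic_type_keys` is a hypothetical name, and the format specifier `d` is an assumption (a 4-byte integer format); check pf?? for the specifier that fits your type. The 32-bit size follows from the uint32_t equivalence stated above.

```python
def basic_type_keys(name: str, fmt: str, size_bits: int) -> list:
    """Build the three mandatory SDB keys for a basic type."""
    return [
        f"{name}=type",                    # X=type
        f"type.{name}={fmt}",              # type.X=format_specifier
        f"type.{name}.size={size_bits}",   # type.X.size=size_in_bits
    ]

# "d" is an assumed format specifier, 32 is the size in bits.
uint_keys = basic_type_keys("UINT", "d", 32)
print("\n".join(uint_keys))
# UINT=type
# type.UINT=d
# type.UINT.size=32
```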
Now there is an optional entry:
type.X.pointto=Y
This one may only be used for a pointer type, that is, when type.X=p. A good example is the LPFILETIME definition: it is a pointer to _FILETIME, which happens to be a structure. Assuming that we are targeting only 32-bit Windows machines, it will be defined as follows:
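A sketch of the resulting entries, written as a Python dictionary so the keys can be checked programmatically. It assumes the optional key uses the same `type.<name>.` prefix as the mandatory size key; verify the exact spelling against your own database with tk before relying on it.

```python
# Assumed key layout for a pointer type on a 32-bit Windows target.
lpfiletime = {
    "LPFILETIME": "type",
    "type.LPFILETIME": "p",                  # pointer format specifier, as in the text
    "type.LPFILETIME.size": "32",            # pointer size in bits on a 32-bit target
    "type.LPFILETIME.pointto": "_FILETIME",  # assumed spelling of the optional key
}

# Print the entries in key=value form.
for key, value in lpfiletime.items():
    print(f"{key}={value}")
```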
This last field is not mandatory because sometimes the data structure internals will be proprietary, and we will not have a clean representation for it.
There is also one more optional entry:
This entry is for integration with C parser and carries the type class information: integer size, signed/unsigned, etc.
Structures
Those are the basic keys for structs (with just two elements):
The first line is used to define a structure called X, the second line defines the elements of X as comma separated values. After that, we just define each element info.
For example, we can have a struct like this one:
Assuming DWORD is defined, the struct will look like this:
Note that the number-of-elements field is only used for arrays, where it gives the number of elements in the array. Otherwise it is zero by default.
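The key layout described above can be sketched with a small generator. The helper `struct_keys` is hypothetical, and the per-member value order (type, offset, number of elements) is an assumption; only the number-of-elements field is confirmed by the text. The example uses the Windows _FILETIME structure, whose two DWORD members are dwLowDateTime and dwHighDateTime.

```python
def struct_keys(name, members, kind="struct"):
    """Build SDB-style keys for a struct.

    members: list of (member_name, type_name, byte_offset, array_count).
    The value order type,offset,count is an assumption of this sketch.
    """
    lines = [
        f"{name}={kind}",                                     # declares the struct
        f"{kind}.{name}=" + ",".join(m[0] for m in members),  # comma-separated members
    ]
    for member, type_name, offset, count in members:
        lines.append(f"{kind}.{name}.{member}={type_name},{offset},{count}")
    return lines

filetime_keys = struct_keys("_FILETIME", [
    ("dwLowDateTime", "DWORD", 0, 0),   # not an array, so the count is 0
    ("dwHighDateTime", "DWORD", 4, 0),
])
print("\n".join(filetime_keys))
# _FILETIME=struct
# struct._FILETIME=dwLowDateTime,dwHighDateTime
# struct._FILETIME.dwLowDateTime=DWORD,0,0
# struct._FILETIME.dwHighDateTime=DWORD,4,0
```

Passing kind="union" produces the same layout with the word struct replaced by union.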
Unions
Unions are defined exactly like structs. The only difference is that the word struct is replaced with the word union.
Function prototypes
The function prototype representation is the most detail-oriented and the most important of them all. It is the one used directly for type matching.
It should be self-explanatory. Let's do strncasecmp as an example for x86 arch for Linux machines. According to man pages, strncasecmp is defined as the following:
When converting it into its sdb representation it will look like the following:
Note that the .cc part is optional; if it is absent, the default calling convention for your target architecture is used instead. There is one extra optional key:
This key is used to mark functions that will not return once called, such as exit and _exit.
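The following sketch generates a prototype entry for strncasecmp, whose C declaration is `int strncasecmp(const char *s1, const char *s2, size_t n)`. The helper `func_keys` is hypothetical, and the key names (`func.X.args`, `func.X.argN`, `func.X.ret`, `func.X.noreturn`) and the `cdecl` value are assumptions of this sketch; only the `.cc` suffix and the existence of a no-return key are stated in the text. Compare with the output of tk on a real session.

```python
def func_keys(name, ret, args, cc=None, noreturn=None):
    """Build SDB-style keys for a function prototype.

    args: list of (arg_type, arg_name). cc and noreturn are optional,
    so their keys are emitted only when a value is given.
    """
    lines = [f"{name}=func", f"func.{name}.args={len(args)}"]
    for index, (arg_type, arg_name) in enumerate(args):
        lines.append(f"func.{name}.arg{index}={arg_type},{arg_name}")
    lines.append(f"func.{name}.ret={ret}")
    if cc is not None:
        lines.append(f"func.{name}.cc={cc}")
    if noreturn is not None:
        lines.append(f"func.{name}.noreturn={'true' if noreturn else 'false'}")
    return lines

strncasecmp_keys = func_keys(
    "strncasecmp",
    "int",
    [("const char *", "s1"), ("const char *", "s2"), ("size_t", "n")],
    cc="cdecl",  # assumed calling convention for x86 Linux
)
print("\n".join(strncasecmp_keys))
```

Calling `func_keys("exit", "void", [("int", "status")], noreturn=True)` adds the no-return marker and omits the calling convention, so the target's default would apply.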