Code analysis
Code analysis is a common technique used to extract information from assembly code.
Radare2 has different code analysis techniques implemented in the core and available in different commands.
All of r2's functionality is available through the API as well as through commands. This gives you the ability to implement your own analysis loops in any programming language, or even with r2 one-liners, shell scripts, or native analysis or core plugins.
The analysis exposes the internal data structures that identify basic blocks and function trees, and it extracts opcode-level information.
The most common radare2 analysis command sequence is aa, which stands for "analyze all". Here "all" refers to all symbols and entry points. If your binary is stripped, you will need to use other commands such as aaa, aab, aar, aac, and so on.
Take some time to understand what each command does and the results after running them to find the best one for your needs.
A typical session analyzes the whole file (aa) and then prints the disassembly of the main() function (pdf). The aa command belongs to the family of auto-analysis commands and performs only the most basic auto-analysis steps. Radare2 has many auto-analysis commands with different analysis depths, including partial emulation: aa, aaa, aab, aaaa, ... These commands also map to r2 CLI options: r2 -A, r2 -AA, and so on.
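The aa-then-pdf sequence described above can be scripted. The following is a minimal sketch, assuming a session object `r2` that exposes a `cmd()` method the way r2pipe sessions do; the function name and the list of accepted levels are illustrative choices, not part of radare2.

```python
def analyze_and_disassemble(r2, level="aa", target="main"):
    """Run one auto-analysis command, then return the disassembly of `target`."""
    allowed = ("aa", "aaa", "aaaa")  # levels named in the text; deeper levels take longer
    if level not in allowed:
        raise ValueError(f"unsupported analysis level: {level!r}")
    r2.cmd(level)                     # e.g. "aa": analyze symbols and entry points
    return r2.cmd(f"pdf @ {target}")  # print the disassembly of the function at `target`
```

With r2pipe installed, a session created by `r2pipe.open("/bin/ls")` should be usable as `r2` here. On a stripped binary the main flag may not exist, in which case a different target has to be passed.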
It is common knowledge that completely automated analysis can produce nonsensical results, so radare2 provides separate commands for particular stages of the analysis, allowing fine-grained control of the process. Moreover, there is a treasure trove of configuration variables for controlling the analysis outcomes. You can find them in the anal.* and emu.* configuration namespaces.
One of the most important "basic" analysis commands is the set of af subcommands. af means "analyze function". Using this command you can either let radare2 analyze a particular function automatically or perform a completely manual analysis.
Some of the most challenging tasks in function analysis are merging, cropping, and resizing. As with other analysis commands, you have two modes: semi-automatic and manual. In semi-automatic mode, you can use afm <function name> to merge the current function with the one named in the argument, aff to readjust the function after analysis changes or function edits, and afu <address> to resize and analyze the current function up to the specified address.
Apart from those semi-automatic ways to edit and analyze a function, you can handcraft it in manual mode with the af+ command and edit its basic blocks using the afb commands. Before changing the basic blocks of a function, it is recommended to check the existing ones.
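Checking the existing basic blocks can also be done from a script. This is a minimal sketch, assuming an r2pipe-style session with a `cmdj()` method and assuming that the JSON variant afbj returns a list of objects with `addr` and `size` fields; the function name is hypothetical.

```python
def list_basic_blocks(r2, function="main"):
    """Return (address, size) pairs for the basic blocks of an analyzed function."""
    # afbj is the JSON form of afb; an unanalyzed function may yield nothing.
    blocks = r2.cmdj(f"afbj @ {function}") or []
    # Field names "addr" and "size" are assumptions about the JSON layout.
    return [(b.get("addr"), b.get("size")) for b in blocks]
```

The function has to be analyzed first (for example with af or aa), otherwise the list is expected to be empty.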
There are two very important commands for this: afc and afB. The latter is a must-know command for some platforms like ARM. It provides a way to change the "bitness" of a particular function, which basically lets you select between ARM and Thumb modes.
Recursive analysis
There are five important program-wide semi-automated analysis commands:
aab - perform basic-block analysis ("Nucleus" algorithm)
aac - analyze function calls from one (selected or current) function
aaf - analyze all function calls
aar - analyze data references
aad - analyze pointers to pointers references
Those are only generic semi-automated reference searching algorithms. Radare2 also lets you create references of any kind manually. For this fine-grained control you can use the ax commands.
The most commonly used ax commands are axt and axf, especially as part of various r2pipe scripts. Let's say we see a string in a data or code section and want to find all the places it is referenced from. For that we use axt.
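Since axt is often used from r2pipe scripts, here is a minimal sketch of such a lookup. It assumes a session object with a `cmdj()` method and assumes that the JSON variant axtj returns a list of objects carrying a `from` field with the referencing address; the function name is hypothetical.

```python
def xrefs_to(r2, address):
    """Return the addresses that reference `address`, as reported by axtj."""
    refs = r2.cmdj(f"axtj @ {address:#x}") or []  # no xrefs may come back as None
    # The "from" key is an assumption about the axtj JSON layout.
    return [ref["from"] for ref in refs if "from" in ref]
```

References only exist after some analysis has run, so a call like this is normally preceded by aa or one of the reference-searching commands.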
Apart from the predefined algorithms for identifying functions, you can specify a function prelude with the configuration option anal.prelude. For example, e anal.prelude = 0x554889e5 means push rbp; mov rbp, rsp on the x86_64 platform. It should be specified before running any analysis commands.
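To illustrate what a prelude value stands for, the sketch below scans a byte buffer for the byte sequence encoded by such a value. It is a plain Python illustration of the idea, not radare2's implementation, and the function name is hypothetical.

```python
def find_preludes(code, prelude):
    """Return every offset in `code` (bytes) where the prelude byte pattern starts."""
    if prelude <= 0:
        raise ValueError("prelude must be a positive integer")
    # 0x554889e5 -> b"\x55\x48\x89\xe5"; leading zero bytes cannot be expressed this way.
    pattern = prelude.to_bytes((prelude.bit_length() + 7) // 8, "big")
    offsets = []
    pos = code.find(pattern)
    while pos != -1:
        offsets.append(pos)
        pos = code.find(pattern, pos + 1)  # continue right after the previous hit
    return offsets
```

For example, `find_preludes(b"\x90\x55\x48\x89\xe5\xc3", 0x554889e5)` reports a single candidate function start at offset 1.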
Configuration
Radare2 lets you change the behavior of almost any analysis stage or command. There are several kinds of configuration options:
Flow control
Basic blocks control
References control
IO/Ranges
Jump tables analysis control
Platform/target specific options
Control flow configuration
The two most commonly used options for changing the behavior of control flow analysis in radare2 are anal.hasnext and anal.afterjump. The first forces radare2 to continue the analysis after the end of a function, even if the next chunk of code is not called from anywhere, thus analyzing all of the available functions. The second forces radare2 to continue the analysis even after unconditional jumps.
In addition to those, we can set anal.ijmp to follow indirect jumps and continue the analysis, anal.pushret to analyze a push ...; ret sequence as a jump, and anal.nopskip to skip NOP sequences at the beginning of a function.
For now, radare2 also allows you to change the maximum basic block size with the anal.bb.maxsize option. The default value works in most use cases, but it is useful to increase it, for example, when dealing with obfuscated code. Beware that some of the basic block control options may disappear in the future in favor of more automated ways to set them.
For some unusual binaries or targets, there is the anal.noncode option. By default, radare2 does not try to analyze data sections as code. But in some cases (malware, packed binaries, binaries for embedded systems) code often does live in such sections, and this option exists for those cases.
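The options above are set with the e command before any analysis runs. The following sketch turns a dictionary of settings into e commands; the helper name is hypothetical, and the values in the usage note are illustrative rather than recommended.

```python
def config_commands(options):
    """Build one `e key=value` command per option, in the order given."""
    commands = []
    for key, value in options.items():
        if isinstance(value, bool):
            value = "true" if value else "false"  # r2 spells booleans in lowercase
        commands.append(f"e {key}={value}")
    return commands
```

For instance, `config_commands({"anal.hasnext": True, "anal.bb.maxsize": 1024})` yields `e anal.hasnext=true` and `e anal.bb.maxsize=1024`, which can be sent to a session one by one before issuing aa.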
Reference control
These are the most crucial options, and they change the analysis results drastically. Some of them can be disabled to save time and memory when analyzing big binaries.
anal.jmpref - allow creating references for unconditional jumps
anal.cjmpref - same, but for conditional jumps
anal.datarefs - follow the data references in code
anal.refstr - search for strings in data references
anal.strings - search for strings and create references
Note that string reference control is disabled by default because it increases the analysis time.
Analysis ranges
There are a few options for this:
anal.limits - enables the range limits for analysis operations
anal.from - starting address of the limit range
anal.to - the corresponding end of the limit range
anal.in - specify search boundaries for analysis (io.maps, io.sections.exec, dbg.maps and many more; see e anal.in=? for the complete list)
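The range options listed above could be combined as follows. This is a minimal sketch, assuming an r2pipe-style session with a `cmd()` method; the option names are taken from the list above and may not exist in every radare2 version, and the function name is hypothetical.

```python
def analyze_range(r2, start, end, command="aa"):
    """Restrict analysis to [start, end] and then run one analysis command."""
    if start >= end:
        raise ValueError("start must be lower than end")
    r2.cmd(f"e anal.from={start:#x}")  # lower bound of the limit range
    r2.cmd(f"e anal.to={end:#x}")      # upper bound of the limit range
    r2.cmd("e anal.limits=true")       # enable the limits set above
    r2.cmd(command)                    # analysis runs only after the limits are set
```

Setting the bounds before enabling the limits keeps the session from briefly running with a stale range.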
Jump tables
Jump tables are one of the trickiest targets in binary reverse engineering. There are hundreds of different types, and the end result depends on the compiler/linker and the LTO stages of optimization. Thus radare2 allows enabling some experimental jump table detection algorithms using the anal.jmptbl option. Eventually, these algorithms are moved into the default analysis loops once they work on every supported platform/target/testcase. Two more options can affect the jump table analysis results too:
anal.ijmp - follow the indirect jumps, some jump tables rely on them
anal.datarefs - follow the data references, some jump tables use those
Platform specific controls
There are two common problems when analyzing embedded targets: ARM/Thumb detection and the MIPS GP value. For ARM binaries, radare2 supports some auto-detection of ARM/Thumb mode switches, but beware that it uses partial ESIL emulation, which slows down the analysis. If you do not like the results, the mode of a particular function can be overridden with the afB command.
The MIPS GP problem is even trickier. It is well known that the GP value can differ not only per program but also per function. To partially solve that, there are the options anal.gp and anal.gp2. The first one sets the GP value for the whole program or for a particular function. The latter allows you to "constantify" the GP value if some code tries to change it, always resetting it when that happens. Both are heavily experimental and might change in the future in favor of more automated analysis.
Visuals
One of the easiest ways to see and check the effects of analysis commands and variables is to scroll through the special Vv visual mode, which offers a preview of functions.
When we want to check how analysis changes affect the result for big functions, we can use the minimap instead, which shows a bigger flow graph on the same screen size. To get into minimap mode, type VV and then press p twice.
This mode lets you see the disassembly of each node separately. Just navigate between them using the Tab key.
Analysis hints
It is not uncommon for analysis results to be imperfect even after you have tried every configuration option. This is where radare2's "analysis hints" mechanism comes in. It lets you override some basic opcode or meta-information properties, or even rewrite the whole opcode string. These commands are located under the ah namespace.
One of the most common cases is setting a particular numeric base for immediates with the ahi command.
Note that some analysis stages or commands add internal analysis hints, which can be checked with the ah command.
Sometimes we need to override a jump or call address, for example in the case of a tricky relocation that is unknown to radare2, so we change the value manually. The current analysis information about a particular opcode can be checked with the ao command. We can use the ahc command to perform such a change.
After the change, the disassembly view stays the same, but the jump address stored for the opcode is different (the jump field in the ao output).
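The ahc and ao steps described above can be combined in a script. This is a minimal sketch, assuming an r2pipe-style session with `cmd()` and `cmdj()` methods and assuming that the JSON variant aoj returns a list whose first entry carries a `jump` field; the function name is hypothetical.

```python
def override_jump_target(r2, address, new_target):
    """Set a jump/call target hint at `address` and return the target ao reports."""
    r2.cmd(f"ahc {new_target:#x} @ {address:#x}")   # add the analysis hint
    ops = r2.cmdj(f"aoj @ {address:#x}") or []      # re-read the opcode information
    # The "jump" key is an assumption about the aoj JSON layout.
    return ops[0].get("jump") if ops else None
```

Comparing the returned value with `new_target` gives a quick check that the hint was picked up.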
If none of the above helps, you can simply override the shown disassembly with anything you like using the ahd command.