Android Debugging
From OMAPpedia
There are many different ways of debugging the various parts of the Android software stack (e.g., bootloader, kernel, applications). This page covers a few tools that we have used. Please feel free to update this list or to add information about other available methods.
Eclipse ADT
Note on Installing Eclipse Plugins
Before installing the Android Development Tools (ADT), enter your proxy server (if applicable) in the General preferences under Network Connections, and add the software update site that matches your version of Eclipse (for Eclipse 3.5 it is "http://download.eclipse.org/releases/galileo/").
You will likely have to install several required plugins from Eclipse before ADT will install (possibly GEF and WST plugins).
Debugging on Zoom2 with Eclipse ADT
The Android Development Tools (ADT) plugin for Eclipse adds powerful extensions to the Eclipse integrated development environment. It lets you create and debug Android applications more easily and quickly. Details on ADT can be obtained from http://developer.android.com/guide/developing/eclipse-adt.html.
It is assumed that the ADT plugin has already been set up to work with the Eclipse environment, as described at http://developer.android.com/sdk/1.1_r1/installing.html#installingplugin.
Step 1: Installing the ADT plugin for Eclipse should also have set up the Dalvik Debug Monitor Service (DDMS). Change the DDMS configuration as follows:
Click on Window -> Preferences
Select Android -> DDMS
Change ADB debugger base port: 8700; Logging Level: Verbose
Click on Apply
Step 2: The DDMS perspective can now be opened from the Eclipse menu via:
Window -> Open Perspective -> Other -> DDMS; Click on OK
Step 3: Get Eclipse to attach to your Zoom2 board.
Boot the Zoom2 board and find its IP address. If you haven't added ip=dhcp to the bootargs, you can bring up Ethernet and obtain an IP address via DHCP with the following commands:
# netcfg eth0 up
# netcfg eth0 dhcp
Using the command below you can verify that the board did obtain an IP address
# netcfg
NOTE: If you boot via NFS, U-Boot will typically print the board's IP address to the console.
On the host machine, run the following commands from a terminal shell:
$ export ADBHOST=<IP address of the board>
$ adb kill-server
$ adb start-server
Check if you are now connected to the Zoom2 device by running the following command on the Host Terminal console:
$ adb devices
It should output something like:
emulator-5554 device
This confirms that Zoom2 board is connected. With this setup, you should be able to use Android Debug Bridge, Logcat, DDMS and other tools directly from Eclipse ADT environment for creating your applications for Android on Zoom2.
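The check in Step 3 can also be scripted. The following is a minimal Python sketch, assuming the usual `adb devices` layout of a "List of devices attached" header followed by one "serial, state" row per device. The function name connected_devices is hypothetical and is not part of ADT or adb.

```python
def connected_devices(adb_devices_output):
    """Return the serials that `adb devices` reports in the "device" state."""
    serials = []
    for line in adb_devices_output.splitlines():
        fields = line.split()
        # Device rows have exactly two fields: <serial> <state>.
        # The "List of devices attached" header and blank lines do not.
        if len(fields) == 2 and fields[1] == "device":
            serials.append(fields[0])
    return serials


# Sample text modelled on the output shown above (assumed layout).
sample = "List of devices attached\nemulator-5554\tdevice\n"
print(connected_devices(sample))  # prints ['emulator-5554']
```

An empty list means that no board is attached in the "device" state, in which case the Troubleshooting notes below are the next place to look.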
Troubleshooting
Issue: ADB is not in the path, where should I find it?
Resolution: ADB command line tool is found at:
Issue: ADB is having a problem connecting over Ethernet.
Resolution: This is because the ADB stub (adbd) on the target defaults to USB. To fix this, run the following in the Zoom2 console:
# setprop service.adb.tcp.port 5555
This keeps adbd from defaulting to the USB transport. Restart adbd on the Zoom2 so that the changed setting takes effect:
# stop adbd
# start adbd
Alternatively, the setprop command can be added to init.rc so that the system property is set at startup, before the ADB stub is started.
Debugging with GDB and DDD
User space programs can be debugged with various debugging tools. Here are some GNU applications that ease the debugging of binaries on the Android platform. GDB lets you see what is going on inside another program while it executes, or what another program was doing at the moment it crashed.
GDB (the GNU Debugger)
Following are the instructions to enable GDB on Android:
1. Obtain the IP address of the target. This can be found by adding “ip=dhcp” in the bootargs, which will obtain and print the IP automatically during boot. Alternatively if you have the busybox command line tools available on the target you can type "ifconfig eth0" to obtain the IP address of the target.
2. On the host, perform the following (once per new console window): go to the mydroid directory and run
source build/envsetup.sh
setpaths
export ADBHOST=<IP address of the target>
Ensure that above setup works by running
adb kill-server ; adb shell
You should see a command prompt of the target on your host. Verify this by running "ps" or similar commands. Exit the adb shell by typing “exit”
3. Start GDB using the following command
gdbclient <executable name> <port number> <task name>
executable name: file name in the system/bin directory
port number: default is :5039 (the colon before the number is required)
task name: obtained by running "ps" on the target. GDB uses it to identify the PID internally.
E.g. for video playback, use (note the space after mediaserver and colon):
gdbclient mediaserver :5039 mediaserver
Then you can run commands like “info threads”, “break”, “step” etc.
For a full listing of GDB commands refer to: http://www.yolinux.com/TUTORIALS/GDB-Commands.html
You may have to run the following after each target reboot:
adb kill-server
DDD (Data Display Debugger)
DDD is a graphical front-end for GDB and other command-line debuggers.
The following instructions enable DDD on Android. The steps are almost the same as for GDB:
1. Obtain the IP address of the target. This can be found by adding "ip=dhcp" in the bootargs, which will obtain and print the IP automatically during boot. Alternatively if you have the busybox command line tools available on the target you can type "ifconfig eth0" to obtain the IP address of the target.
2. Install DDD: in the shell run:
sudo apt-get install ddd
3. Add the following function to build/envsetup.sh:
function dddclient()
{
    local OUT_ROOT=$(get_abs_build_var PRODUCT_OUT)
    local OUT_SYMBOLS=$(get_abs_build_var TARGET_OUT_UNSTRIPPED)
    local OUT_SO_SYMBOLS=$(get_abs_build_var TARGET_OUT_SHARED_LIBRARIES_UNSTRIPPED)
    local OUT_EXE_SYMBOLS=$(get_abs_build_var TARGET_OUT_EXECUTABLES_UNSTRIPPED)
    local PREBUILTS=$(get_abs_build_var ANDROID_PREBUILTS)
    if [ "$OUT_ROOT" -a "$PREBUILTS" ]; then
        local EXE="$1"
        if [ "$EXE" ] ; then
            EXE=$1
        else
            EXE="app_process"
        fi
        local PORT="$2"
        if [ "$PORT" ] ; then
            PORT=$2
        else
            PORT=":5039"
        fi
        local PID
        local PROG="$3"
        if [ "$PROG" ] ; then
            PID=`pid $3`
            adb forward "tcp$PORT" "tcp$PORT"
            adb shell gdbserver $PORT --attach $PID &
            sleep 2
        else
            echo ""
            echo "If you haven't done so already, do this first on the device:"
            echo " gdbserver $PORT /system/bin/$EXE"
            echo " or"
            echo " gdbserver $PORT --attach $PID"
            echo ""
        fi
        echo >|"$OUT_ROOT/gdbclient.cmds" "set solib-absolute-prefix $OUT_SYMBOLS"
        echo >>"$OUT_ROOT/gdbclient.cmds" "set solib-search-path $OUT_SO_SYMBOLS"
        echo >>"$OUT_ROOT/gdbclient.cmds" "target remote $PORT"
        echo >>"$OUT_ROOT/gdbclient.cmds" ""
        ddd --debugger arm-eabi-gdb -x "$OUT_ROOT/gdbclient.cmds" "$OUT_EXE_SYMBOLS/$EXE"
    else
        echo "Unable to determine build system output dir."
    fi
}
4. On the host, perform the following (once per new console window): go to the mydroid directory and run
source build/envsetup.sh
setpaths
export ADBHOST=<IP address of the target>
Ensure that above setup works by running
adb kill-server ; adb shell
You should see a command prompt of the target on your host. Verify this by running "ps" or similar commands. Exit the adb shell by typing “exit”
5. Start DDD using the following command
dddclient <executable name> <port number> <task name>
executable name: file name in the system/bin directory
port number: default is :5039 (the colon before the number is required)
task name: obtained by running "ps" on the target. GDB uses it to identify the PID internally.
E.g. for video playback, use (note the space after mediaserver and colon):
dddclient mediaserver :5039 mediaserver
For the DDD manual, refer to: http://www.gnu.org/manual/ddd/html_mono/ddd.html
You may have to run the following after each target reboot:
adb kill-server
Lauterbach TRACE32
Lauterbach TRACE32 can be used to debug bootloaders, the kernel and user space.
Instructions on using Lauterbach TRACE32 for debugging on Zoom2:
Install the Lauterbach TRACE32 software on your PC (the screenshot below is from the Oct 10 2008 release).
Connect the emulator cable to J5 (20-pin header) on the Zoom2 debug board and power the emulator.
Connect the USB cable from the emulator to the PC.
Run the zoom2_startup.cmm script (File -> Run Batchfile) to select OMAP3430 as your target and attach. If the script is not run, some of the settings have to be selected manually from CPU -> System Settings.
Ensure that the emulator is “running”, as shown by the green status indicator (seen at the bottom of the screenshot below), before exercising any use cases that need to be debugged.
Run the use case (e.g., audio/video playback).
Halt the processor by clicking on the “pause” button, then view registers (View -> Registers), list source (View -> List Source), etc.
Make sure to load the symbols for the files that you’re interested in debugging and to set the source path so that source code correlation works correctly. You may also have to ensure that options such as -g are passed to the compiler so that it generates symbolic debugging information. In some instances, consider reducing the optimization level, because the compiler rearranges instructions, which can make it difficult to match the order of execution to the source code.
Examples of setting the source search path and loading symbols:
sYmbol.SourcePATH.Set "V:\mydroid\kernel\"
data.load.elf V:\mydroid\kernel\vmlinux /nocode /strippart "kernel"
These commands can be directly entered from either the debugger command prompt or by using a *.cmm script.
Use the "/strippart" option to adapt to changed base directories. Do not use recursive directory search, both for performance reasons and because different directories can contain source files with the same name.
For user space debugging, TRACE32 needs some help as it needs to be told where some of the modules you're interested in debugging are loaded. To do this you will have to run "ps" on the target and get PIDs for the application.
Then run "cat /proc/PID/maps > logfile" where PID is the process ID retrieved from "ps" in the above step. There is an avplayback_symbols.cmm file attached that exhibits how to do this. Below screenshot demonstrates being halted in user space during running of an AV playback use case.
zoom2_startup.cmm
avplayback_symbols.cmm
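The attached avplayback_symbols.cmm remains the reference for loading user space symbols. As a rough illustration of the step before it, the Python sketch below pulls the load address range of one module out of the logfile produced by "cat /proc/PID/maps > logfile". It assumes the standard Linux maps layout (address range, permissions, offset, device, inode, path). The function name module_load_addresses and the library name in the sample are hypothetical.

```python
def module_load_addresses(maps_text, module_name):
    """Return (start, end) for each executable mapping of module_name."""
    ranges = []
    for line in maps_text.splitlines():
        fields = line.split()
        # Mapped files have six fields; anonymous mappings have only five.
        if len(fields) < 6 or not fields[5].endswith(module_name):
            continue
        # Keep only mappings with execute permission (the code segment).
        if "x" not in fields[1]:
            continue
        start, end = (int(value, 16) for value in fields[0].split("-"))
        ranges.append((start, end))
    return ranges


# Made-up sample in /proc/PID/maps format, for illustration only.
sample = (
    "40000000-40008000 r-xp 00000000 b3:02 100 /system/lib/libexample.so\n"
    "40008000-40009000 rw-p 00008000 b3:02 100 /system/lib/libexample.so\n"
    "be800000-be821000 rw-p 00000000 00:00 0 [stack]\n"
)
for start, end in module_load_addresses(sample, "libexample.so"):
    print("0x%08x-0x%08x" % (start, end))  # prints 0x40000000-0x40008000
```

The start address of the executable mapping is the value that the debugger needs when it is told where the module is loaded.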
CodeComposer
Code Composer Studio (CCS) can be used to debug bootloaders. Previous versions of CCS (v3.3 and older) did not contain Linux awareness, but it is currently being added to CCSv4. It should be possible to debug the kernel and user space once CCSv4 is released. See Linux_Aware_Debug for more information.
Selectively Enable Opencore Debug Print
To use the existing log statements without rebuilding the whole PV library, do the following:
1. At the beginning of the file, after the last "#include" line, add the following:
#include
#undef LOG_TAG
#define LOG_TAG "YOUR_MODULE_NAME"
#undef PVLOGGER_LOGMSG
#define PVLOGGER_LOGMSG(IL, LOGGER, LEVEL, MESSAGE) JJLOGE MESSAGE
#define JJLOGE(id, ...) LOGE(__VA_ARGS__)
2. At the end of the file, add these:
#undef PVLOGGER_LOGMSG
#define PVLOGGER_LOGMSG(IL, LOGGER, LEVEL, MESSAGE) OSCL_UNUSED_ARG(LOGGER);
You can play with the macro to filter based on level too.
Experimenting with CpuFreq Governors to profile at different ARM MHz
The governors below are functional in the L27x K35 kernel.
Performance governor (system in turbo mode: 1008MHz ARM, 200MHz L3)
echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo 1 > /sys/devices/system/cpu/cpu1/online
Ondemand governor (Cpu MHz as per system load)
echo ondemand > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
Hotplug (ondemand+cpu1 hotplug handling based on cpu0 MHz. enabled by default in L27x)
echo hotplug > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
UserSpace governor (can be used to profile usecase in different ARM MHz)
echo userspace > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
In userspace governor, useful commands to switch MPU MHz
# Check available frequencies
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies
300000 600000 800000 1008000
# e.g. switch to 800MHz
echo 800000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed
# e.g. check CPU0 frequency
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
800000
To control CPU1 manually,
echo 0 > /sys/devices/system/cpu/cpu1/online
echo 1 > /sys/devices/system/cpu/cpu1/online
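When profiling a use case at several frequencies in turn, it helps to check each requested frequency against the list that the kernel advertises before writing it. The Python sketch below does that for the userspace governor. The sysfs path is the one used above. The function name setspeed_command is hypothetical, and the sketch only builds the command string rather than touching the target.

```python
CPUFREQ_DIR = "/sys/devices/system/cpu/cpu0/cpufreq"


def setspeed_command(available_frequencies, target_khz):
    """Build the scaling_setspeed command for a frequency the CPU advertises.

    available_frequencies is the text read from scaling_available_frequencies.
    """
    supported = [int(freq) for freq in available_frequencies.split()]
    if target_khz not in supported:
        raise ValueError("%d kHz is not one of %s" % (target_khz, supported))
    return "echo %d > %s/scaling_setspeed" % (target_khz, CPUFREQ_DIR)


# Frequency list taken from the example output above.
available = "300000 600000 800000 1008000"
for khz in (300000, 800000):
    print(setspeed_command(available, khz))
```

A request such as 700000 raises ValueError instead of producing a command for a frequency that is not in the advertised list.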
Profiling
The kernel implements a simple profiling mechanism that stores the current instruction pointer at each clock tick.
To enable profiling, pass the boot argument
profile=N
where N is a number that determines the granularity of profiling. The smaller the number, the finer the granularity.
A busybox utility named 'readprofile' is available to process the profiled data. 'readprofile' requires the kernel symbol table file 'System.map' to resolve the symbols.
To clear the profiled data:
$ readprofile -r
To display the profiled data:
$ readprofile -m /System.map | sort -nr
You should see output similar to the following:
  1415 total                        0.0003
  1153 omap3_enter_idle             3.3132
    28 schedule                     0.0304
    15 omap_i2c_isr                 0.0179
    12 v7wbi_flush_user_tlb_range   0.1579
    11 copy_page                    0.1146
    10 __memzero                    0.0781
     8 __copy_to_user               0.0085
     6 update_mmu_cache             0.0341
     5 sub_preempt_count            0.0260
     5 mmc_queue_map_sg             0.0305
     5 handle_IRQ_event             0.0431
     4 unmap_vmas                   0.0027
     4 filemap_fault                0.0038
     4 __do_fault                   0.0042
     3 vsnprintf                    0.0013
     3 up_read                      0.1500
     3 omap_hsmmc_enable_clks       0.0069
     3 kmem_cache_alloc             0.0208
     3 get_page_from_freelist       0.0025
....
The first column gives the number of ticks and the last column gives the number of ticks divided by function size.
This profiler covers only the kernel. For system wide profiling and advanced options, OProfile can be used.
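The tick counts are easier to compare as a share of the total. The Python sketch below is a small post-processing step for saved readprofile output. It assumes the three-column layout described above (ticks, symbol, ticks divided by function size). The function name top_functions is hypothetical.

```python
def top_functions(readprofile_output, limit=5):
    """Return (symbol, ticks, percent of total) for the busiest symbols."""
    total = None
    rows = []
    for line in readprofile_output.splitlines():
        fields = line.split()
        # Skip anything that is not "<ticks> <symbol> <ticks/size>".
        if len(fields) != 3 or not fields[0].isdigit():
            continue
        ticks, symbol = int(fields[0]), fields[1]
        if symbol == "total":
            total = ticks
        else:
            rows.append((symbol, ticks))
    if not total:
        return []
    rows.sort(key=lambda row: row[1], reverse=True)
    return [(symbol, ticks, round(100.0 * ticks / total, 1))
            for symbol, ticks in rows[:limit]]


# First rows of the sample output shown above.
sample = """1415 total 0.0003
1153 omap3_enter_idle 3.3132
28 schedule 0.0304
"""
for symbol, ticks, percent in top_functions(sample):
    print("%-20s %6d %5.1f%%" % (symbol, ticks, percent))
```

For the sample above this reports omap3_enter_idle at 81.5% of all ticks, which shows that the system was mostly idle while the profile was taken.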
OProfile on OMAP3
OProfile is a system-wide profiler for Linux systems, capable of profiling all running code at low overhead. It consists of a kernel driver and a daemon for collecting sample data, and several post-profiling tools for turning data into information.
OProfile is an optional component of the kernel build. It may have been enabled by default. You can confirm that the kernel has OProfile support by looking for the following lines in the kernel configuration:
CONFIG_OPROFILE_OMAP_GPTIMER=y
CONFIG_OPROFILE=y
CONFIG_HAVE_OPROFILE=y
Hardware Configuration
The Hardware Configuration required to execute the test cases includes:
Linux machine (any distribution)
TCP/IP configuration on the Zoom2 board
Zoom2 board
Software Configuration
The Software Configuration required to execute the test cases includes:
Tera Term (or any terminal program)
Graphviz on the Linux machine. Use this command in a host terminal:
$ sudo apt-get install graphviz
GPROF2DOT python script. Copy the script to any location in your path (e.g. ~/bin on your Linux machine), ensure that ~/bin is exported in the PATH, and run the following command:
$ cd ~/bin && chmod 777 gprof2dot.py
Installation
This step should be done after the android file system has been built.
$MYDROID is the location where the android SDK is installed. eg: export MYDROID=/home/$user/Lxx.x/mydroid
Edit the $MYDROID/external/oprofile/opimport_pull script as follows:
Remove the python version number from the first line, e.g. change #!/usr/bin/python2.4 -E to #!/usr/bin/python -E
Append the following lines at the end of the file to generate cpuloads.txt and callgraph.png for further analysis:
os.system(oprofile_event_dir + "/bin/opreport --session-dir=. >> cpuloads.txt")
os.system(oprofile_event_dir + "/bin/opreport --session-dir=. -p $OUT/symbols -l -t 0.1 >> cpuloads.txt")
os.system(oprofile_event_dir + "/bin/opreport -cg --session-dir=. -p $OUT/symbols > callgraph.txt")
os.system("cat callgraph.txt | gprof2dot.py -s -w -f oprofile -n 0.1 -e 0.1 | dot -Tpng -o callgraph.png")
On Eclair we have seen that the Android tools opannotate, oparchive, opimport and opreport in the prebuilt/linux-x86/oprofile/bin/ folder do not work properly.
The same binaries from Donut are available as a tarball, Oprofile.tar.gz. Download this .gz file to the $MYDROID folder:
$ cd $MYDROID
$ tar xvf Oprofile.tar.gz
Since we perform the post-processing on host, we don't need the actual vmlinux file (~40 MB) on target. Make sure that you create a dummy file named "vmlinux" in the root directory to satisfy opcontrol arguments.
# echo 0 > /vmlinux
Execution
Set-up OProfile directories
Make sure that you have created an empty file and named it vmlinux as described in above section. Run the following command on the target
# opcontrol --setup
By default there should be no output.
If you see "Cannot create directory /dev/oprofile: File exists do_setup failed#", OProfile is not built into the kernel. Verify that you selected OProfile in the make menuconfig step of the kernel build (refer to the Configuration section).
Initialize the OProfile daemon
The kernel range start and end addresses need to be verified on the setup for each release using:
# grep " _text" /proc/kallsyms
c0030000 T _text
# grep " _etext" /proc/kallsyms
c03e1000 A _etext
Note: You need busybox installed for this command to work. Refer here if you haven't set-up busybox.
Using the above addresses, run the following command
# opcontrol --vmlinux=/vmlinux --kernel-range=0xC0030000,0xC03e1000 --event=CPU_CYCLES:64
You should see the following output on your terminal
Cannot open /dev/oprofile/1/enabled: No such file or directory
Cannot open /dev/oprofile/2/enabled: No such file or directory
Using 2.6+ OProfile kernel interface.
Reading module info.
Using log file /data/oprofile/samples/oprofiled.log
# init: untracked pid 914 exited
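Because the kernel range has to be checked again for each release, it can be convenient to derive the opcontrol command line from the two grep results instead of retyping the addresses. The Python sketch below does this, assuming the "address type symbol" layout of /proc/kallsyms shown above. The function names kernel_range and opcontrol_command are hypothetical, and the event and count defaults simply repeat the values used in this section.

```python
import re

# One /proc/kallsyms row for _text or _etext: "<hex address> <type> <symbol>".
KALLSYMS_ROW = re.compile(r"^([0-9a-fA-F]+)\s+\w\s+(_text|_etext)$")


def kernel_range(kallsyms_text):
    """Return (start, end) taken from the _text and _etext rows."""
    addresses = {}
    for line in kallsyms_text.splitlines():
        match = KALLSYMS_ROW.match(line.strip())
        if match:
            addresses[match.group(2)] = int(match.group(1), 16)
    return addresses["_text"], addresses["_etext"]


def opcontrol_command(kallsyms_text, event="CPU_CYCLES", count=64):
    """Build the opcontrol command line for the given kernel range."""
    start, end = kernel_range(kallsyms_text)
    return ("opcontrol --vmlinux=/vmlinux --kernel-range=0x%x,0x%x --event=%s:%d"
            % (start, end, event, count))


# Rows copied from the grep output above.
sample = "c0030000 T _text\nc03e1000 A _etext\n"
print(opcontrol_command(sample))
```

If either symbol is missing from the input, kernel_range raises KeyError, which is a hint that the grep commands did not return what was expected.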
Increase the Back trace depth, so that more details can be captured in the log
# echo 16 > /dev/oprofile/backtrace_depth
To ensure that everything is ready, you can run the following command
# opcontrol --status
The following output should be seen. Note that the PID will change depending on your system.
Driver directory: /dev/oprofile
Session directory: /data/oprofile
Counter 0:
    name: CPU_CYCLES
    count: 64
Counter 1 disabled
Counter 2 disabled
oprofiled pid: 915
profiler is not running
0 samples received
0 samples lost overflow
Starting and Stopping the profiler
Run the following command to start the profiler
# opcontrol --start
and use the command below to stop the profiler
# opcontrol --stop
Generating the Results
We need to run the following steps on the Host machine (that has android SDK/build) to generate the results.
On command prompt of Host machine (that has android SDK/build), do the following
$ cd $MYDROID
$ source build/envsetup.sh
$ setpaths
$ export ADBHOST=<IP address of the target>
Note: This must be done at the $MYDROID level (where the build was set up); otherwise it will not work.
If ADB over Ethernet is not working refer to Troubleshooting here
You can use the following commands to start DHCP on Zoom board and set-up ADB over ethernet
# netcfg eth0 up ; sleep 3 ; netcfg eth0 dhcp ; sleep 2 ; netcfg
# setprop service.adb.tcp.port 5555 ; stop adbd ; start adbd
Post-process OProfile results
This needs to be done from the PC where Android SDK is installed. Go to the terminal on host PC and do the following:
If you are using the OProfile package with pre-built binaries, symbol files and vmlinux, you can follow the steps below. If you are using OProfile binaries that were not built on your machine, you might have to create a symbolic link to the zoom2 folder, since OProfile looks there.
$ mkdir ~/oprofilepackage && cd ~/oprofilepackage
$ tar xvjf
$ cd mydroid
$ mv ../kernel .
$ source build/envsetup.sh
$ sed -i -e 's_$(call inherit-product, frameworks/base/data/sounds/OriginalAudio.mk)_#$(call inherit-product, frameworks/base/data/sounds/OriginalAudio.mk)_g' build/target/product/full.mk
$ setpaths
$ export MYDROID=${PWD}
$ ln -s $MYDROID/out/target/product/zoom2 $MYDROID/out/target/product/generic
NOTE: The Kernel path needs to be updated for your KERNEL build
$ cp $MYDROID/kernel/android-2.6.32/vmlinux $OUT/symbols/vmlinux
Generate the OPROFILE results using the command below
$ opimport_pull
The following files and the Callgraph image can be referred for OProfile results. They will be generated in the
Note: If some of the binaries linked into your build were compiled on Windows, you will see the message below
Traceback (most recent call last):
  File "/home/user/bin/gprof2dot.py", line 1965, in <module>
    Main().main()
  File "/home/user/bin/gprof2dot.py", line 1890, in main
    self.profile = parser.parse()
  File "/home/user/bin/gprof2dot.py", line 1062, in parse
    self.parse_entry()
  File "/home/user/bin/gprof2dot.py", line 1112, in parse_entry
    function = self.parse_subentry()
  File "/home/user/bin/gprof2dot.py", line 1136, in parse_subentry
    filename, lineno = source.split(':')
ValueError: too many values to unpack
cat: write error: Broken pipe
In this case, you can open callgraph.txt in
For example, if you have E:\workspaces\, change it to \workspaces\ (that is, remove the drive letter and its colon).
Now, cd to
# cd
# cat callgraph.txt | gprof2dot.py -s -w -f oprofile -n 0.1 -e 0.1 | dot -Tpng -o callgraph.png
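The manual edit of callgraph.txt can also be done with a few lines of Python. The traceback shows gprof2dot.py splitting each source location on ':', so a Windows drive prefix such as E: adds one colon too many. The sketch below removes only a single letter and colon that are directly followed by a path separator, which leaves the file:line colons alone. The name strip_drive_letters is hypothetical, and the pattern is an assumption about how such paths appear in callgraph.txt.

```python
import re

# A drive prefix: one letter and a colon, not preceded by a letter or digit,
# and followed by a backslash or a forward slash.
DRIVE_PREFIX = re.compile(r"(?<![A-Za-z0-9])[A-Za-z]:(?=[\\/])")


def strip_drive_letters(text):
    """Remove Windows drive prefixes such as 'E:' from path strings."""
    return DRIVE_PREFIX.sub("", text)


# Made-up path for illustration; the file:line colon is preserved.
print(strip_drive_letters(r"E:\workspaces\codec\decoder.c:42"))
```

To apply it, read callgraph.txt, pass its contents through strip_drive_letters, write the result back, and then rerun the gprof2dot.py command above.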
Guidelines and caveats for interpreting OProfile results are available on the OProfile SourceForge wiki.
OProfile on OMAP4 K35 (pre-ICS)
We forward-ported the K32 patches for OProfile to K35. Please pull in the patches 8192 (for the kernel) and 8193 (for Android), per the instructions in the following sections.
Note that the instructions for using OProfile on OMAP4 with a K35 release (pre-ICS) and a K3.0 release (ICS) are slightly different due to the patches needed. Thus, please follow the instructions in the appropriate section below.
Note for OProfile on Pandaboard
To use OProfile on Pandaboard, you must use a kernel with CONFIG_PM enabled. Otherwise, any access to the CTI unit will hang the system. The root cause is that prcm_setup_regs, which enables autoidle and autogating on all the DPLLs, is called when PM is enabled; CTI can then be initialized normally.
The kernel source is at TI Ubuntu git tree:, ti-ubuntu-2.6.35-ti903.8+ti+release3.pm or newer. To get the patch please check topic "Oprofile on Pandaboard / Omap4" in Pandaboard google group.
Changing to Performance Governor
It is recommended that you use the performance governor to ensure that the OPP doesn't change during profiling.
To check the current governor, type the following command on the target console:
# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
If the output of the above command is "ondemand" or "hotplug", you need to change to the performance governor. Use the commands below:
# echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
# echo 1 > /sys/devices/system/cpu/cpu1/online
Steps to rebuild Kernel
Step 1: Apply the patch 8192
Step 2: Clean up the kernel build
~/omap4/mydroid/kernel/android-2.6.35$ make ARCH=arm CROSS_COMPILE=arm-none-linux-gnueabi- distclean
~/omap4/mydroid/kernel/android-2.6.35$ make ARCH=arm CROSS_COMPILE=arm-none-linux-gnueabi- android_4430_defconfig
Step 3: Enable these options in the .config file (via menuconfig) to enable OProfile:
CONFIG_PROFILING=y
CONFIG_OPROFILE=y
Step 4: Rebuild the kernel
~/omap4/mydroid/kernel/android-2.6.35$ make ARCH=arm CROSS_COMPILE=arm-none-linux-gnueabi- uImage
Steps to rebuild USERSPACE Component
Step 1: Apply the patch 8193
Step 2: Source the build environment (assuming that you have built Android previously, go to your MYDROID folder)
~/omap4/mydroid$ cd $MYDROID
~/omap4/mydroid$ source build/envsetup.sh
~/omap4/mydroid$ setpaths
Step 3: Build the opcontrol source
~/omap4/mydroid$ cd $MYDROID/external/oprofile
~/omap4/mydroid/external/oprofile$ mm
Step 4: Push the built binaries to the file system
~/omap4/mydroid$ adb push $OUT/system/xbin/oprofiled system/xbin/
~/omap4/mydroid$ adb push $OUT/system/xbin/opcontrol system/xbin/
Note: If the system partition is read-only (for eMMC builds), you can remount it read-write with the command below on the console
# mount -o rw,remount -t ext3 /dev/block/mmcblk0p1 /system/
Work-around for recognizing CPU
The post-processing utils don't recognize the CPU for some reason, so work around this as follows (one-time).
Step 1: Extract the unit_masks.gz and events.gz files into the "invalid cpu type" directory. Please copy-paste "as-is"
~/omap4/mydroid$ mkdir $MYDROID/prebuilt/linux-x86/oprofile/invalid\ cpu\ type/
~/omap4/mydroid$ cd $MYDROID/prebuilt/linux-x86/oprofile/invalid\ cpu\ type/
~/omap4/mydroid/prebuilt/linux-x86/oprofile/invalid cpu type$ wget http://omappedia.org/images/b/b6/Unit_masks.gz
~/omap4/mydroid/prebuilt/linux-x86/oprofile/invalid cpu type$ wget http://omappedia.org/images/d/d7/Events.gz
~/omap4/mydroid/prebuilt/linux-x86/oprofile/invalid cpu type$ gzip -d Events.gz
~/omap4/mydroid/prebuilt/linux-x86/oprofile/invalid cpu type$ gzip -d Unit_masks.gz
~/omap4/mydroid/prebuilt/linux-x86/oprofile/invalid cpu type$ mv Unit_masks unit_masks
~/omap4/mydroid/prebuilt/linux-x86/oprofile/invalid cpu type$ mv Events events
Step 2: Create a folder path and softlink on the target
# mkdir -p /system/usr/local/share/oprofile/arm/armv7
If you are using an eMMC file system, this step needs to be done every time the board is rebooted:
# ln -s /system/usr /usr
Step 3: Push the events and unit_masks files to the target
~/omap4/mydroid/prebuilt/linux-x86/oprofile/invalid cpu type$ adb push events /system/usr/local/share/oprofile/arm/armv7/
~/omap4/mydroid/prebuilt/linux-x86/oprofile/invalid cpu type$ adb push unit_masks /system/usr/local/share/oprofile/arm/armv7/
Installation
These steps should be done after the Android file system has been built.
Step 1: Edit the $MYDROID/external/oprofile/opimport_pull script as follows:
Remove the python version number from the first line eg. change
#!/usr/bin/python2.4 -E
to
#!/usr/bin/python -E
Replace
stream = os.popen("find raw_samples -type f -name \*all")
by
stream = os.popen("find raw_samples -type f -name \*\.\*\.\*")
Replace
os.system(oprofile_event_dir + arch_path + "/bin/opreport --session-dir=.")
by
os.system(oprofile_event_dir + arch_path + "/bin/opreport --session-dir=.")
os.system(oprofile_event_dir + arch_path + "/bin/opreport --session-dir=. -m tgid >> cpuloads.txt")
os.system(oprofile_event_dir + arch_path + "/bin/opreport --session-dir=. -m tgid -p $OUT/symbols -lg -t 0.1 >> cpuloads.txt")
os.system(oprofile_event_dir + "/bin/opreport -c --session-dir=. -m tgid -p $OUT/symbols > callgraph.txt")
os.system("cat callgraph.txt | gprof2dot.py -s -w -f oprofile -n 1 -e 1 | dot -Tpng -o callgraph.png")
Step 2: Install the package 'graphviz' needed by above script 'opimport_pull'.
$ sudo apt-get install graphviz
Step 3: Save the python script gprof2dot.py to any location in your path (e.g. in ~/bin of your Linux machine). Ensure that ~/bin is exported in the PATH. Make the script executable.
$ chmod 777 ~/bin/gprof2dot.py
Step 4: We have seen that the Android tools opannotate, oparchive, opimport and opreport in the prebuilt/linux-x86/oprofile/bin/ folder do not work properly. The same binaries from Android Donut work properly and are available as Oprofile.tar.gz. Download this .gz file to the $MYDROID folder:
~/omap4/mydroid$ cd $MYDROID
~/omap4/mydroid$ wget http://omappedia.org/images/3/32/Oprofile.tar.gz
~/omap4/mydroid$ tar xvf Oprofile.tar.gz
Execution
Step 1: Set-up OProfile directories
# opcontrol --setup
You should see the output below
Unable to open cpu_type file for reading
Make sure you have done opcontrol --init
Please ignore the above error if running opcontrol --setup
Step 2: Ensure that the EVENTS can be listed
# opcontrol --list-events
You should see the output below. If you don't, the patches in Step 1 were not applied correctly.
CPU Type: ARM V7 PMNC
name : meaning
------------------------------------------------------------------------------
PMNC_SW_INCR : Software increment of PMNC registers
IFETCH_MISS : Instruction fetch misses from cache or normal cacheable memory
ITLB_MISS : Instruction fetch misses from TLB
DCACHE_REFILL : Data R/W operation that causes a refill from cache or normal cacheable memory
DCACHE_ACCESS : Data R/W from cache
DTLB_REFILL : Data R/W that causes a TLB refill
DREAD : Data read architecturally executed (note: architecturally executed = for instructions that are unconditional or that pass the condition code)
DWRITE : Data write architecturally executed
INSTR_EXECUTED : All executed instructions
EXC_TAKEN : Exception taken
EXC_EXECUTED : Exception return architecturally executed
CID_WRITE : Instruction that writes to the Context ID Register architecturally executed
PC_WRITE : SW change of PC, architecturally executed (not by exceptions)
PC_IMM_BRANCH : Immediate branch instruction executed (taken or not)
PC_PROC_RETURN : Procedure return architecturally executed (not by exceptions)
UNALIGNED_ACCESS : Unaligned access architecturally executed
PC_BRANCH_MIS_PRED : Branch mispredicted or not predicted. Counts pipeline flushes because of misprediction
PC_BRANCH_MIS_USED : Branch or change in program flow that could have been predicted
CPU_CYCLES : Number of CPU cycles
JAVA_BC_EXEC : Number of Java bytecodes decoded, including speculative ones
JAVA_SFTBC_EXEC : Number of software Java bytecodes decoded, including speculative ones
JAVA_BB_EXEC : Number of Jazelle taken branches executed, including those flushed due to a previous load/store which aborts late
CO_LF_MISS : Number of coherent linefill requests which miss in all other CPUs, meaning that the request is sent to external memory
CO_LF_HIT : Number of coherent linefill requests which hit in another CPU, meaning that the linefill data is fetched directly from the relevant cache
IC_DEP_STALL : Number of cycles where CPU is ready to accept new instructions but does not receive any because of the instruction side not being able to provide any and the instruction cache is currently performing at least one linefill
DC_DEP_STALL : Number of cycles where CPU has some instructions that it cannot issue to any pipeline and the LSU has at least one pending linefill request but no pending TLB requests
STREX_PASS : Number of STREX instructions architecturally executed and passed
STREX_FAILS : Number of STREX instructions architecturally executed and failed
DATA_EVICT : Number of eviction requests due to a linefill in the data cache
ISS_NO_DISP : Number of cycles where the issue stage does not dispatch any instruction
ISS_EMPTY : Number of cycles where the issue stage is empty
INS_RENAME : Number of instructions going through the Register Renaming stage
PRD_FN_RET : Number of procedure returns whose condition codes do not fail, excluding all exception returns
INS_MAIN_EXEC : Number of instructions being executed in main execution pipeline of the CPU, the multiply pipeline and the ALU pipeline
INS_SND_EXEC : Number of instructions being executed in the second execution pipeline (ALU) of the CPU
INS_LSU : Number of instructions being executed in the Load/Store unit
INS_FP_RR : Number of floating-point instructions going through the Register Rename stage
INS_NEON_RR : Number of NEON instructions going through the Register Rename stage
STALL_PLD : Number of cycles where CPU is stalled because PLD slots are all full
STALL_WRITE : Number of cycles where CPU is stalled because data side is full and executing writes to external memory
STALL_INS_TLB : Number of cycles where CPU is stalled because of main TLB misses on requests issued by the instruction side
STALL_DATA_TLB : Number of cycles where CPU is stalled because of main TLB misses on requests issued by the data side
STALL_INS_UTLB : Number of cycles where CPU is stalled because of micro TLB misses on the instruction side
STALL_DATA_ULTB : Number of cycles where CPU is stalled because of micro TLB misses on the data side
STALL_DMB : Number of cycles where CPU is stalled due to executed of a DMB memory barrier
CLK_INT_EN : Number of cycles during which the integer core clock is enabled
CLK_DE_EN : Number of cycles during which the Data Engine clock is enabled
INS_ISB : Number of ISB instructions architecturally executed
INS_DSB : Number of DSB instructions architecturally executed
INS_DMB : Number of DMB instructions speculatively executed
EXT_IRQ : Number of external interrupts executed by the processor
PLE_CL_REQ_CMP : PLE cache line request completed
PLE_CL_REQ_SKP : PLE cache line request skipped
PLE_FIFO_FLSH : PLE FIFO flush
PLE_REQ_COMP : PLE request completed
PLE_FIFO_OF : PLE FIFO overflow
PLE_REQ_PRG : PLE request programmed
Step 3: The kernel range start and end addresses need to be verified on the setup for each release using:
# grep " _text" /proc/kallsyms
c0043000 T _text
# grep " _etext" /proc/kallsyms
c05da000 A _etext
Using the above addresses, run the following command. Note that the events and their associated counts can be specified on this command line. Reduce the event count to get more samples.
# opcontrol --vmlinux=/vmlinux --kernel-range 0xc0043000,0xc05da000 --event CPU_CYCLES:25000000
You should see the following output on your terminal
CPU Type: ARM V7 PMNC
Using 2.6+ OProfile kernel interface
Using log file /data/oprofile init: untracked pid 1868 exited file/samples/oprofiled.log
Increase the backtrace depth so that more detail is captured in the log:
# echo 16 > /dev/oprofile/backtrace_depth
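The kernel-range lookup in Step 3 can also be scripted. The following is a minimal Python sketch, not part of the OProfile tools: it assumes the contents of /proc/kallsyms have already been captured as text (for example via adb shell), and the helper names kernel_range and opcontrol_setup_args are hypothetical. It only builds the opcontrol argument list shown above; it does not run anything on the target.

```python
def kernel_range(kallsyms_text):
    """Return the (_text, _etext) addresses as hex strings from /proc/kallsyms content."""
    addrs = {}
    for line in kallsyms_text.splitlines():
        parts = line.split()
        # Each line looks like: "<address> <type> <symbol> [module]"
        if len(parts) >= 3 and parts[2] in ("_text", "_etext"):
            addrs.setdefault(parts[2], parts[0])
    return addrs["_text"], addrs["_etext"]


def opcontrol_setup_args(start, end, event="CPU_CYCLES", count=25000000):
    """Build the opcontrol setup command line used in this section (hypothetical helper)."""
    return [
        "opcontrol",
        "--vmlinux=/vmlinux",
        "--kernel-range", "0x%s,0x%s" % (start, end),
        "--event", "%s:%d" % (event, count),
    ]


# Sample input taken from the grep output shown in Step 3.
sample_kallsyms = "c0043000 T _text\nc0043000 T stext\nc05da000 A _etext\n"
start, end = kernel_range(sample_kallsyms)
command = " ".join(opcontrol_setup_args(start, end))
print(command)
```

Lowering the count argument gives more samples, as noted above.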
[edit] Starting and Stopping the profiler
Step 1: Run the following command to start the profiler
# opcontrol --start
You should see output below
CPU Type: ARM V7 PMNC
PMNC registers dump CPU 0:
PMNC =0x41093000
CNTENS=0x80000003
INTENS=0x80000003
FLAGS =0x00000000
SELECT=0x00000001
CCNT =0xfa0a1f00
CNT[0] count =0xfffb6c20
CNT[0] evtsel=0x00000060
CNT[1] count =0xfffb6c20
CNT[1] evtsel=0x00000061
CNT[2] count =0x00000000
CNT[2] evtsel=0x000000d1
CNT[3] count =0x00000000
CNT[3] evtsel=0x00000061
CNT[4] count =0x00000000
CNT[4] evtsel=0x000000e5
CNT[5] count =0x00000000
CNT[5] evtsel=0x000000a3
PMNC registers dump CPU 1:
PMNC =0x41093000
CNTENS=0x80000003
INTENS=0x80000003
FLAGS =0x00000000
SELECT=0x00000001
CCNT =0xfa0a1f00
CNT[0] count =0xfffb6c20
CNT[0] evtsel=0x00000060
CNT[1] count =0xfffb6c20
CNT[1] evtsel=0x00000061
CNT[2] count =0x00000000
CNT[2] evtsel=0x000000ba
CNT[3] count =0x00000000
CNT[3] evtsel=0x000000a5
CNT[4] count =0x00000000
CNT[4] evtsel=0x0000001e
CNT[5] count =0x00000000
CNT[5] evtsel=0x000000ec
Step 2: Check the status of profiler
# opcontrol --status
You should see output below:
CPU Type: ARM V7 PMNC
Driver directory: /dev/oprofile
Session directory: /data/oprofile
Counter 0:
    name: CPU_CYCLES
    count: 100000000
Counter 1 disabled
Counter 2 disabled
Counter 3 disabled
Counter 4 disabled
Counter 5 disabled
Counter 6 disabled
oprofiled pid: 1869
profiler is running
4 samples received
0 samples lost overflow
Note: Repeat this step several times, and make sure you are receiving samples. The samples received count should continue to increase until profiling is stopped
Step 3: Run the following command to stop the profiler (after accumulating around 1k samples or 1-2 minutes)
# opcontrol --stop
You should see output below:
CPU Type: ARM V7 PMNC
PMNC registers dump CPU 1:
PMNC =0x41093001
CNTENS=0x80000003
INTENS=0x80000003
FLAGS =0x00000000
SELECT=0x00000000
CCNT =0xfad259d9
CNT[0] count =0x11bb2d59
CNT[0] evtsel=0x00000060
CNT[1] count =0x132dab33
CNT[1] evtsel=0x00000061
CNT[2] count =0x00000000
CNT[2] evtsel=0x000000ba
CNT[3] count =0x00000000
CNT[3] evtsel=0x000000a5
CNT[4] count =0x00000000
CNT[4] evtsel=0x0000001e
CNT[5] count =0x00000000
CNT[5] evtsel=0x000000ec
PMNC registers dump CPU 0:
PMNC =0x41093001
CNTENS=0x80000003
INTENS=0x80000003
FLAGS =0x00000000
SELECT=0x00000000
CCNT =0xfbcb5c68
CNT[0] count =0x13326d28
CNT[0] evtsel=0x00000060
CNT[1] count =0x1e33060c
CNT[1] evtsel=0x00000061
CNT[2] count =0x00000000
CNT[2] evtsel=0x000000d1
CNT[3] count =0x00000000
CNT[3] evtsel=0x00000061
CNT[4] count =0x00000000
CNT[4] evtsel=0x000000e5
CNT[5] count =0x00000000
CNT[5] evtsel=0x000000a3
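When repeating the status check from Step 2, it can help to extract the sample count automatically. The sketch below is an illustration only: it assumes the text of "opcontrol --status" has been captured on the host, and parse_status is a hypothetical helper name. It handles both the single-counter form shown above and the per-CPU form ("cpu0 ... samples received") printed by newer builds.

```python
import re


def parse_status(status_text):
    """Return (running, samples_received) parsed from `opcontrol --status` output."""
    # "profiler is not running" does not contain the substring "profiler is running".
    running = "profiler is running" in status_text
    # Sum every "<N> samples received" entry (one per CPU on per-CPU builds).
    counts = re.findall(r"(\d+)\s+samples received", status_text)
    return running, sum(int(c) for c in counts)


# Sample text taken from the status output shown in Step 2.
status_running = (
    "oprofiled pid: 1869\n"
    "profiler is running\n"
    "4 samples received\n"
    "0 samples lost overflow\n"
)
# Per-CPU form, as printed before profiling has started.
status_stopped = (
    "profiler is not running\n"
    "cpu1 0 samples received\n"
    "cpu0 0 samples received\n"
)
print(parse_status(status_running))
print(parse_status(status_stopped))
```

Calling this on successive status captures and checking that the count keeps increasing is one way to confirm that samples are being received.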
[edit] Generating the Results
Run the following steps at the command prompt of the host machine (the one that has the Android SDK/build) to generate the results:
$ cd $MYDROID
$ source build/envsetup.sh
$ setpaths
$ export ADBHOST=
Note: This must be done at the $MYDROID level (where the build was set up); otherwise it will not work.
If ADB over Ethernet is not working, refer to Troubleshooting here
You can use the following commands to start DHCP on Zoom board and set-up ADB over ethernet:
# netcfg eth0 up ; sleep 3 ; netcfg eth0 dhcp ; sleep 2 ; netcfg
# setprop service.adb.tcp.port 5555 ; stop adbd ; start adbd
[edit] Post-process OProfile results
This needs to be done from the PC where Android SDK is installed. Go to the terminal on host PC and do the following:
If you are using the OProfile package with pre-built binaries, symbol files and vmlinux, follow the steps below. If you are using OProfile binaries that were not built on your machine, you might have to create a symbolic link to the zoom2 folder, since OProfile looks there.
NOTE: The Kernel path needs to be updated for your KERNEL build
$ cp $MYDROID/kernel/android-2.6.35/vmlinux $OUT/symbols/vmlinux
'opimport_pull' script expects following environment variable to be defined.
$ export OPROFILE_EVENTS_DIR=$MYDROID/prebuilt/linux-x86/oprofile
Generate the OPROFILE results using the command below
$ opimport_pull -r
FIXME: The above command might hang, hit "Ctrl+C" to end the command above and manually post-process the results.
Note: If you see the message below, it is because you have not downloaded the Android Donut oprofile tools (refer to [Installation http://omappedia.org/wiki/Android_Debugging#Installation]):
~/omap4/mydroid/prebuilt/linux-x86/oprofile/bin/opreport: error while loading shared libraries: libbfd-2.18.0.20080103.so: cannot open shared object file: No such file or directory
~/omap4/mydroid/prebuilt/linux-x86/oprofile/bin/opreport: error while loading shared libraries: libbfd-2.18.0.20080103.so: cannot open shared object file: No such file or directory
~/omap4/mydroid/prebuilt/linux-x86/oprofile/bin/opreport: error while loading shared libraries: libbfd-2.18.0.20080103.so: cannot open shared object file: No such file or directory
Traceback (most recent call last):
[edit] Steps to Re-run OProfile
In order to capture another OProfile log in the same session (i.e. without rebooting the board), perform the following:
opcontrol --shutdown
opcontrol --reset
After this, repeat all the steps described in the "Execution" section above.
[edit] Other Pointers - OProfile in OMAP4
1. OProfile on OMAP4 is slightly different from OMAP3 in that it profiles only CPU usage; CPU idle time is not considered. This leads to minimal logs that focus on CPU utilization.
2. In order to take more samples the CPU_CYCLES field in opcontrol configuration can be modified (like CPU_CYCLES:10000000) but we need to be careful as GP Timer is not being used now.
[edit] IDLE MHz Calculation
NOTE: The above analysis is valid only for the time during which the CPU is active. The cycle counter should not be used to calculate IDLE MHz, since it is paused during idle. Use the cpuidle statistics to compute the idle time for your tests.
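As a rough illustration of using the cpuidle statistics, the Python sketch below computes the idle fraction of one CPU over a measurement window. It assumes the kernel exposes the standard cpuidle sysfs counters (/sys/devices/system/cpu/cpuN/cpuidle/stateM/time, cumulative microseconds per idle state), which may not be the case on every build; the function names are hypothetical. The reader function would run on the target (or its output be collected via adb), while idle_fraction is plain arithmetic.

```python
import glob


def read_idle_us(cpu=0, base="/sys/devices/system/cpu"):
    """Sum the cumulative idle time (microseconds) over all cpuidle states of one CPU.

    Assumes the standard cpuidle sysfs layout; returns 0 if no state files exist.
    """
    total = 0
    for path in glob.glob("%s/cpu%d/cpuidle/state*/time" % (base, cpu)):
        with open(path) as f:
            total += int(f.read().strip())
    return total


def idle_fraction(idle_us_before, idle_us_after, elapsed_s):
    """Fraction of the window spent idle, given two cumulative idle readings."""
    idle_s = (idle_us_after - idle_us_before) / 1e6
    # Clamp to [0, 1] to absorb small timing skew between the two readings.
    return min(max(idle_s / elapsed_s, 0.0), 1.0)


# Hypothetical readings: 12 s of idle time accumulated over a 20 s test.
example = idle_fraction(1000000, 13000000, 20.0)
print(example)
```

Taking one reading before the test and one after, with the elapsed wall-clock time, gives the idle share that the cycle counter cannot provide.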
[edit] Manually Post-process the results (for advanced users)
To capture additional events, or to separate the samples across cores, configure opcontrol similar to the following:
opcontrol --kernel-range 0xc0043000,0xc05da000 --event CPU_CYCLES:10000000 --event IC_DEP_STALL:30000 --event DC_DEP_STALL:30000 --separate-cpu=1
To display consolidated samples for all captured events
~/omap4/mydroid$ cd
~/omap4/mydroid$ $MYDROID/prebuilt/linux-x86/oprofile/bin/opreport --session-dir=. -m all
The output should look like the following (assuming that you captured at least 1k samples):
CPU: invalid cpu type, speed 0 MHz (estimated)
Counted CPU_CYCLES events (Number of CPU cycles) with a unit mask of 0x00 (No unit mask) count 10000000
Counted DC_DEP_STALL events (Number of cycles where CPU has some instructions that it cannot issue to any pipeline and LSU has at least one pending linefill request but no pending TLB requests) with a unit mask of 0x00 (No unit mask) count 30000
Counted IC_DEP_STALL events (Number of cycles where CPU is ready to accept new instructions but does not receive any because of the instruction side not being able to provide any and the instruction cache is currently performing at least one linefill) with a unit mask of 0x00 (No unit mask) count 30000
CPU_CYCLES:100...|DC_DEP_STALL:3...|IC_DEP_STALL:3...|
  samples|      %|  samples|      %|  samples|      %|
------------------------------------------------------
     1584 75.0000        16 32.6531        10 29.4118 no-vmlinux
      169  8.0019        12 24.4898         3  8.8235 libdvm.so
      136  6.4394        13 26.5306         9 26.4706 dalvik-jit-code-cache
       86  4.0720         4  8.1633         7 20.5882 libGLESv1_CM_POWERVR_SGX540_120.so.1.1.16.3924
       61  2.8883         2  4.0816         1  2.9412 libc.so
       12  0.5682         0       0         0       0 libcutils.so
       12  0.5682         0       0         1  2.9412 libutils.so
        8  0.3788         0       0         0       0 libsrv_um.so.1.1.16.3924
        6  0.2841         0       0         0       0 libandroid_runtime.so
        6  0.2841         0       0         0       0 libbinder.so
        5  0.2367         0       0         0       0 libhardware_legacy.so
        4  0.1894         0       0         0       0 libui.so
        3  0.1420         0       0         0       0 sensors.omap4.so
        3  0.1420         0       0         1  2.9412 libIMGegl.so.1.1.16.3924
        3  0.1420         0       0         0       0 libpvrANDROID_WSEGL.so.1.1.16.3924
        3  0.1420         0       0         0       0 libsurfaceflinger.so
        2  0.0947         0       0         0       0 libm.so
        2  0.0947         0       0         0       0 libskia.so
        2  0.0947         0       0         0       0 libsurfaceflinger_client.so
        1  0.0473         0       0         0       0 busybox
        1  0.0473         0       0         0       0 linker
        1  0.0473         0       0         0       0 libandroid_servers.so
        1  0.0473         0       0         0       0 libnativehelper.so
        1  0.0473         0       0         0       0 libz.so
        0       0         2  4.0816         1  2.9412 gralloc.omap4430.so.1.1.16.3924
        0       0         0       0         1  2.9412 libGLESv1_CM.so
To display the samples for the CPU_CYCLES event only:
~/omap4/mydroid$ cd
~/omap4/mydroid$ $MYDROID/prebuilt/linux-x86/oprofile/bin/opreport --session-dir=. event:CPU_CYCLES
The output should look like the following (assuming that you captured at least 1k samples):
CPU: invalid cpu type, speed 0 MHz (estimated)
Counted CPU_CYCLES events (Number of CPU cycles) with a unit mask of 0x00 (No unit mask) count 10000000
Samples on CPU 0
Samples on CPU 1
    cpu:0|            cpu:1|
  samples|      %|  samples|      %|
------------------------------------
      814 76.5757       770 73.4032 no-vmlinux
       71  6.6792        98  9.3422 libdvm.so
       67  6.3029        69  6.5777 dalvik-jit-code-cache
       42  3.9511        44  4.1945 libGLESv1_CM_POWERVR_SGX540_120.so.1.1.16.3924
       29  2.7281        32  3.0505 libc.so
       10  0.9407         2  0.1907 libutils.so
        6  0.5644         2  0.1907 libsrv_um.so.1.1.16.3924
        4  0.3763         2  0.1907 libbinder.so
        3  0.2822         9  0.8580 libcutils.so
        3  0.2822         0       0 libsurfaceflinger.so
        2  0.1881         1  0.0953 libIMGegl.so.1.1.16.3924
        2  0.1881         4  0.3813 libandroid_runtime.so
        2  0.1881         1  0.0953 libpvrANDROID_WSEGL.so.1.1.16.3924
        2  0.1881         0       0 libsurfaceflinger_client.so
        2  0.1881         2  0.1907 libui.so
        1  0.0941         2  0.1907 sensors.omap4.so
        1  0.0941         1  0.0953 libm.so
        1  0.0941         0       0 libnativehelper.so
        1  0.0941         1  0.0953 libskia.so
        0       0         1  0.0953 busybox
        0       0         1  0.0953 linker
        0       0         1  0.0953 libandroid_servers.so
        0       0         5  0.4766 libhardware_legacy.so
        0       0         1  0.0953 libz.so
[edit] OProfile on OMAP4 ICS (K3.0)
[edit] Kernel Build Changes
1. Enable these options in the .config file (via menuconfig) to enable oprofile:
CONFIG_PROFILING=y
CONFIG_OPROFILE=y
2. Apply this patch for oprofile support on kernel 3.0 [1] and build the kernel.
3. Flash the device with new boot.img
[edit] Linux Host Setup
Note: There are some issues using the oprofile binaries that are part of the current Android release by default. Thus, you should pull in an older version of the oprofile binaries and a modified version of the opimport_pull script that are known to work with ICS on OMAP4.
1. Pull the oprofile binaries and copy them to $MYDROID
$ cd ~/tmp
$ wget http://omapedia.org/images/3/32/Oprofile.tar.gz
$ tar xzvf Oprofile.tar.gz
$ cp -rf ~/tmp/prebuilt/linux-x86/oprofile/bin/* $MYDROID/out/host/linux-x86/bin
The oprofile binaries are opimport, opannotate, oparchive, and opreport.
2. Replace the $MYDROID/external/oprofile/opimport_pull script with this: [2].
Then add these two lines at the end of the opimport_pull script for call graph support:
os.system(oprofile_bin_dir + "/opreport -c --session-dir=. -m tgid -p $OUT/symbols > callgraph.txt")
os.system("cat callgraph.txt | gprof2dot.py -s -w -f oprofile -n 1 -e 1 | dot -Tpng -o callgraph.png")
Install the graphviz package:
$ sudo apt-get install graphviz
[edit] Setup Oprofile on the OMAP4 Target Device
1. Use adb to push the busybox binaries to /data/local in the target device. The busybox binaries can be found here: [3]. You will need to run adb as root:
$ cd $MYDROID/out/host/linux-x86/bin
$ sudo ./adb root
$ sudo ./adb remount
$ sudo ./adb push busybox /data/local
$ sudo ./adb shell
root@android: # export PATH=/data/busybox/bin:/data/busybox/sbin:/data/sbin:$PATH
2. Use adb to push the oprofile binaries to the target device (if they are not already present).
$ cd $MYDROID/out/target/product/blaze_tablet/system/xbin
$ sudo ./adb push opcontrol /system/xbin
$ sudo ./adb push oprofiled /system/xbin
If the system is read-only (such as for an eMMC boot), you can first remount with the commands:
$ sudo ./adb shell
root@android: # mount -o remount rw /
root@android: # mount -o remount rw /system
root@android: # ln -s /proc/mounts /etc/mtab
3. Follow the same steps for OMAP4 pre-K3.0 for pushing the events and unit_masks files to the target: [4]
[edit] Run Oprofile on the OMAP4 Target Device
1. Verify the kernel range start and end addresses:
root@android:/data/local # KERNEL_BEG=`/data/busybox/bin/grep " _text" /proc/kallsyms | /data/busybox/bin/awk '{print $1}'`
root@android:/data/local # echo $KERNEL_BEG
c004f000
root@android:/data/local # KERNEL_END=`/data/busybox/bin/grep " _etext" /proc/kallsyms | /data/busybox/bin/awk '{print $1}'`
root@android:/data/local # echo $KERNEL_END
c07e4000
2. Reset oprofile:
root@android:/data/local # opcontrol --shutdown
root@android:/data/local # opcontrol --reset
root@android:/data/local # opcontrol --setup
Note: Run the next command using the kernel range start and end addresses that you found earlier. You can reduce the event count to get more samples.
root@android:/data/local # opcontrol --vmlinux=/vmlinux --callgraph=16 --kernel-range=0x$KERNEL_BEG,0x$KERNEL_END --event=CPU_CYCLES:100000
Cannot open /dev/oprofile/1/enabled: No such file or directory
Cannot open /dev/oprofile/2/enabled: No such file or directory
Starting oprofiled...
Using 2.6+ OProfile kernel interface.
Reading module info.
Using log file /data/oprofile/samples/oprofiled.log
Ready
3. Check opcontrol status:
root@android:/data/local # opcontrol --status
Driver directory: /dev/oprofile
Session directory: /data/oprofile
Counter 0:
    name: CPU_CYCLES
    count: 100000
Counter 1 disabled
Counter 2 disabled
oprofiled pid: 1414
profiler is not running
cpu1 0 samples received
cpu1 0 samples lost overflow
cpu1 0 samples invalid eip
cpu1 0 backtrace aborted
cpu0 0 samples received
cpu0 0 samples lost overflow
cpu0 0 samples invalid eip
cpu0 0 backtrace aborted
4. Ensure that the events can be listed:
root@android:/data/local # opcontrol --list-events
name : meaning
------------------------------------------------------------------------------
IFU_IFETCH_MISS : number of instruction fetch misses
CYCLES_IFU_MEM_STALL: cycles instruction fetch pipe is stalled
CYCLES_DATA_STALL : cycles stall occurs for due to data dependency
ITLB_MISS : number of Instruction MicroTLB misses
DTLB_MISS : number of Data MicroTLB misses
BR_INST_EXECUTED : branch instruction executed w/ or w/o program flow change
BR_INST_MISS_PRED : branch mispredicted
INSN_EXECUTED : instructions executed
DCACHE_ACCESS : data cache access, cacheable locations
DCACHE_ACCESS_ALL : data cache access, all locations
DCACHE_MISS : data cache miss
DCACHE_WB : data cache writeback, 1 event for every half cacheline
PC_CHANGE : number of times the program counter was changed without a mode switch
TLB_MISS : Main TLB miss
EXP_EXTERNAL : Explicit external data access
LSU_STALL : cycles stalled because Load Store request queue is full
WRITE_DRAIN : Times write buffer was drained
CPU_CYCLES : clock cycles counter
5. Run start and stop to receive samples for approximately 30 seconds of profiling:
root@android:/data/local # opcontrol --start;sleep 30;opcontrol --stop
[edit] Pull the opreport to the Linux Host
1. Set the paths that are used by opimport:
$ export ANDROID_HOST_OUT=$MYDROID/out/host/linux-x86
$ export OUT=$MYDROID/out/target/product/blaze_tablet
2. Run the opimport_pull script, which then calls opreport:
$ $MYDROID/external/oprofile/opimport_pull -r ./test_output
CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
          TIMER:0|
  samples|      %|
------------------
     7671 99.8308 vmlinux
        4  0.0521 oprofiled
        3  0.0390 dalvik-jit-code-cache
        3  0.0390 libdvm.so
        2  0.0260 libc.so
        1  0.0130 libandroid_runtime.so
The oprofile binaries opimport, opannotate, oparchive, and opreport available at http://omapedia.org/images/3/32/Oprofile.tar.gz are 32-bit binaries. If you are using a 64-bit machine, install the 32-bit libraries that they need (e.g. libpopt.so.0):
$ sudo aptitude install ia32-libs
$ wget http://ftp.us.debian.org/debian/pool/main/p/popt/libpopt0_1.16-1_i386.deb #(Note: this link is for Intel x86 machines)
# Extract it into your 32-bit library tree
$ sudo dpkg -X libpopt0_1.16-1_i386.deb ~/tmp/lib
[edit] Strace
Recommended Usage on target
# strace -ff -F -tt -s 200 -o /sqlite_stmt_journals/strace -p
Note
-ff (together with -o) makes strace follow fork() and write the trace of each forked process to a separate file
-F means strace also tries to follow any vfork() calls
-tt prefixes each system call with the time of day, including microseconds
-s 200 raises the maximum printed string length to 200 characters, so that more detail is visible in any strings that are used
[edit] Using INST2 for Performance Measurement on DSP
INST2 is a tool that helps us measure DSP MHz used for a particular use case e.g. Video playback or record.
It is recommended to lock the VDD1 OPP before starting this tool, in order to obtain accurate results.
To lock the OPP use the following commands:
# echo n > /sys/power/vdd1_lock
(where n stands for the OPTIMAL OPP the use case should run at)
Refer to this omap3-opp.h file for the OPP table corresponding to your chip
After executing your use case, make sure that the OPP lock is removed using the command below:
# echo 0 > /sys/power/vdd1_lock
Step 1: Install busybox on the target filesystem.
Copy the pre-built busybox to /data/busybox/
On the target, do the following:
# cd /data/busybox/
# chmod 777 busybox
# busybox --install
# export PATH=/data/busybox/:$PATH
In case "busybox --install" fails with a "Busybox not found" error, use the ls command to confirm that busybox is actually present, and then try "./busybox --install"
PS: This step is optional for FROYO (2.6.32 Kernel)
Step 2: Download the Dsp_load_measurement_tool.tar.gz file to your host machine if you are on eclair (2.6.29 Kernel) or Dsp_load_measurement_tool_froyo.tar.gz for Froyo (2.6.32 Kernel)
Untar this file using the command below
$ tar xvf Dsp_load_measurement_tool.tar.gz
$ cd dsp_load_measurement_tool
$ tar xvf inst2.tar
Copy the /dsp_load_measurement_tool/inst_log file to the root directory of your file system, using either the SD card or adb push:
# cp inst_log .
OR
$ adb push inst_log .
Step 3: Give permissions to the files copied
# cd /
# chmod 777 inst_log
Step 4: Now run the use case i.e. start playback or record
Step 5: Run the instrumentation
# ./inst_log
Following messages should appear
DSP device detected !! DSPProcessor_Attach succeeded.
Step 6: Once the use case is complete (i.e. playback or record is done), wait for "INST: Log file written. Waiting for INST DSP side cleanup" message
Step 7: Now wait for the "INST: DSP Cleanup done. Exiting" message. If this message does not appear, start the use case once more and wait again for the "INST: DSP Cleanup done. Exiting" message. This flushes out the previous results; you do not need to run the full use case again.
Step 8: Copy "log.bin" to the host PC. It is generated in the /sqlite* folder of the target file system on Eclair (2.6.29 kernel) and in the /mnt folder on Froyo (2.6.32 kernel).
Important Note: busybox should be installed prior to executing the cp command. If you get the message "cp: not found" while copying the log, you should use the adb pull command before unmounting the SD card.
For Eclair (2.6.29 kernel)
# cp /sqlite*/*.bin /
# cp log.bin /sdcard OR
# sync
For Froyo (2.6.32 kernel)
# cp /mnt/*.bin /
# cp log.bin /sdcard OR
# sync
Step 9: On the HOST PC use the following command line to generate the results.
Go to following folder:
$ cd dsp_load_measurement_tool/inst2/tools
$ perl inst_load.pl -clog.bin
For example (here -c660 passes a clock frequency of 660 MHz):
$ perl inst2/tools/inst_load.pl -c660 log.bin
On the screen you will see information about the file for example:
Number of records: 40731
splice() offset past end of array at inst2/tools/inst_load.pl line 123.
Range #0 - beg: 0, end: 40731, length: 40731
Range #1 - beg: 0, end: 40731, length: 40731
Ignoring IDLE traces 0-22 and 40714-40730
Clock Freq: 660 MHz
Total Cycles: 25383675998
Event | Handle_Event  | Cycles    | Evts/s | ms/Evt | MHz   |30 Evt/s MHz
------+---------------+-----------+--------+--------+-------+------------
unkwn | 00000000_0000 | 11568     | 0      | 0      | 0     |
IDLE  | 00000000_2001 | 1.396e+10 | 48.99  | 11.23  | 220   |
UALG  | 2025fad4_2009 | 70257552  | 39     | 0.07   | 1.1   | 0.85
CTRL  | 2025fad4_2011 | 40768     | 0.19   | 0.01   | 0     |
RESET | 2025fad4_2021 | 762453    | 0.3    | 0.1    | 0     |
USN   | 2025fad4_2041 | 47083524  | 50.17  | 0.04   | 0.7   |
ALGO  | 2025fad4_2101 | 560448611 | 19.68  | 1.12   | 8.8   | 13.46
UALG  | 20260ee4_2009 | 2.320e+09 | 25.56  | 3.58   | 36.6  | 42.91
CTRL  | 20260ee4_2011 | 21475     | 0.13   | 0.01   | 0     |
RESET | 20260ee4_2021 | 286269    | 0.05   | 0.24   | 0     |
USN   | 20260ee4_2041 | 116914279 | 76.21  | 0.06   | 1.8   |
ALGO  | 20260ee4_2101 | 8.305e+09 | 25.54  | 12.81  | 130.9 | 153.7
Look at the last row of the results above. In this case the algorithm consumed 130.9 MHz while running at 25.54 fps. Scaled linearly to 30 fps, the effective load would be 153.7 MHz.
Step 10: The final step is to apply the following formula:
DSP MHz consumption = (Clock Freq - IDLE MHz).
For example, in the case above the DSP load is 400 - 220 = 180 MHz. Note that the MHz column of the sample table is consistent with a 400 MHz DSP clock, even though the "Clock Freq" line of that sample reports 660 MHz.
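The two calculations above (scaling a measured load to 30 events per second, and subtracting IDLE MHz from the clock frequency) can be written out as a short Python sketch. The function names are hypothetical and the linear scaling is the same simple interpolation described in the text, not something computed by inst_load.pl itself.

```python
def scale_to_rate(mhz, measured_rate, target_rate=30.0):
    """Linearly scale a measured MHz figure from measured_rate to target_rate events/s."""
    return mhz * target_rate / measured_rate


def dsp_load_mhz(clock_mhz, idle_mhz):
    """DSP MHz consumption = Clock Freq - IDLE MHz."""
    return clock_mhz - idle_mhz


# Values from the last ALGO row of the sample table: 130.9 MHz at 25.54 events/s.
algo_at_30fps = scale_to_rate(130.9, 25.54)
# Values from the worked example: 400 MHz clock, 220 MHz idle.
total_load = dsp_load_mhz(400, 220)
print(round(algo_at_30fps, 1), total_load)
```

The scaled figure agrees with the "30 Evt/s MHz" column to within rounding.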
[edit] Debugging segmentation fault
Type 1: The seg fault has a backtrace of shared objects (this is the case we face most of the time)
The PC holds an offset, which has to be looked up in the topmost 'so' file present in the backtrace. This can be done using the addr2line or objdump tool.
a. Using addr2line:
Syntax:-f -e
Example:
#00 pc 0000e234 /system/lib/libc.so
./prebuilt/linux-x86/toolchain/arm-eabi-4.3.1/bin/arm-eabi-addr2line -f -e ./out/target/product/generic/symbols/system/lib/libc.so 0x0000e234
Returns:
strlen
bionic/libc/arch-arm/bionic/strlen.c:62
b. Using objdump:
Syntax:-S
Example:
prebuilt/linux-x86/toolchain/arm-eabi-4.3.1/bin/arm-eabi-objdump -S out/target/product/zoom2/symbols/system/lib/libOMX.TI.Video.encoder.so > ObjDumpFile.txt
The output is redirected to ObjDumpFile.txt.
ObjDumpFile.txt now contains the source code intermixed with the disassembly. Search it for the PC offset to find the exact source line.
Tip: Make sure to use the debug version of 'so' files which are within the symbols dir in out folder otherwise we would not get the source code symbols in obj dump
Type 2: The seg fault print has the thread name along with register values. The backtrace of shared objects is not present.
1. The PC holds a virtual address
2. The first '3' digits of PC provide info on the virtual memory mapping where the shared object('so') is loaded and remaining digits provide the offset within the 'so'
3. Use the 'ps' call to get the Process ID of the thread where it fails.
4. Get the memory map of the process using the following:
In the device console or adb shell, print the 'maps' file of the process ID.
Syntax: cat proc/ /maps
It is recommended to capture the memory map during normal execution, before the failure occurs; otherwise the process may already have been killed, which can lead to a mismatch.
5. Use the first '3' digits of PC value to identify the shared object which has caused the failure(using memory map)
6. Get the obj dump of the 'so' and search with the offset to get the exact line of failure. (this can be done in the similar way as in Type 1 seg fault debugging mentioned above)
Example:
Seg Fault Message:
PV author: unhandled page fault (11) at 0x00000004, code 0x017
pgd = ccfc8000
[00000004] *pgd=8cbe1031, *pte=00000000, *ppte=00000000
Pid: 9691, comm: PV author
CPU: 0 Not tainted (2.6.29-omap1 #20)
PC is at 0x81b23140
LR is at 0x81b23049
...
Based on PC, the shared object is in range of 0x81bxxxxx and offset within that is 0x23140
PV author runs in the context of the Media server process (PID: 944). Using cat /proc/944/maps we can identify the 'so' loaded in this region:
81b00000-81b2a000 r-xp 00000000 b3:02 34574 /system/lib/libOMX.TI.Video.encoder.so
So the fault appears to be in the OMX TI Video encoder.
Now with the offset we can use the addr2line or objdump to get the line causing this issue
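The lookup in steps 4 and 5 can be automated once the memory map has been saved. The following Python sketch is only an illustration: it assumes the 'maps' content is available as text on the host, the function name find_mapping is hypothetical, and it adds the mapping's file offset to the result (zero in the example above) on the assumption that this matches the addresses used by addr2line and objdump for the library.

```python
def find_mapping(maps_text, pc):
    """Return (path, offset) for the mapping that contains pc, or None if not found.

    Each maps line looks like:
    start-end perms file_offset dev inode [path]
    """
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue
        start_s, end_s = fields[0].split("-")
        start, end = int(start_s, 16), int(end_s, 16)
        if start <= pc < end:
            path = fields[5] if len(fields) > 5 else ""
            # Offset within the file: distance from the mapping start plus file offset.
            return path, pc - start + int(fields[2], 16)
    return None


# Mapping line and PC value taken from the example above.
sample_maps = (
    "81b00000-81b2a000 r-xp 00000000 b3:02 34574 "
    "/system/lib/libOMX.TI.Video.encoder.so\n"
)
path, offset = find_mapping(sample_maps, 0x81b23140)
print(path, hex(offset))
```

The resulting offset is what would be passed to addr2line, or searched for in the objdump output, as in the Type 1 case.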
[edit] OPP Level Measurement
We can measure the OPP level(VDD1) at which the use case is running easily using the script available in Mydroid (FroYo codebase).
Script path:
Or you can access it from GIT Web interface
Pre-requisite
a. The busybox binary is present in the /data/busybox folder (for instructions, refer to Step 1 under the section "Using INST2 for Performance Measurement on DSP" above)
b. Copy this script file(get_opp_levels.sh) to the file system
Steps for measurement: (in device console)
a. Change the permissions of the script file
b. Start the desired use case
c. Execute the script
chmod 777 get_opp_levels.sh
./get_opp_levels.sh
d. Output would look like this:
OPP Level,Initial Time(cs),Second Time(cs),Time Spent in OPP(cs),Measurement Time(s),Percentage Time in OPP
OPP1G,2332,2332,0,20,0
OPP130,131,131,0,20,0
OPP100,410,410,0,20,0
OPP50,43145,45151,2006,20,100
Based on the percentage time we can decide the OPP at which use case has run. In the above example, it has run at OPP50 for 100% of the time for 20 second window.
Other pointers:
a. The script by default runs for 20secs and prints the results.
b. It can be tweaked based on our requirements to perform it for shorter/longer duration etc. using an additional argument.
For example, the command below will measure OPP transitions over a 100-second window:
./get_opp_levels.sh 100
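To post-process the script output on the host, the CSV lines shown in step d can be parsed directly. The Python sketch below is an assumption about how the percentage relates to the other columns (time spent in centiseconds divided by the measurement window), not a description of what get_opp_levels.sh does internally; opp_residency is a hypothetical name.

```python
import csv
import io


def opp_residency(csv_text):
    """Return {OPP level: percentage of the measurement window spent in that OPP}."""
    result = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        spent_cs = int(row["Second Time(cs)"]) - int(row["Initial Time(cs)"])
        window_cs = int(row["Measurement Time(s)"]) * 100  # seconds to centiseconds
        result[row["OPP Level"]] = 100.0 * spent_cs / window_cs
    return result


# Sample output copied from step d above.
sample_csv = (
    "OPP Level,Initial Time(cs),Second Time(cs),Time Spent in OPP(cs),"
    "Measurement Time(s),Percentage Time in OPP\n"
    "OPP1G,2332,2332,0,20,0\n"
    "OPP130,131,131,0,20,0\n"
    "OPP100,410,410,0,20,0\n"
    "OPP50,43145,45151,2006,20,100\n"
)
residency = opp_residency(sample_csv)
busiest = max(residency, key=residency.get)
print(busiest, residency[busiest])
```

The computed value for OPP50 comes out slightly above 100 because the script's actual window was a little longer than the nominal 20 seconds.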