Introduction
Many of us first heard about native crashes while studying engineering; one well-known example is the segmentation fault. When working on native code written in C or C++, we have many tools to debug such crashes, and one of the most popular is GDB. I am sure many of us have used it many times. On the Android platform, however, we typically rely on a different set of tools to analyze and debug native crashes. In this blog I want to discuss the following items, in this order.
- What does an Android native crash look like, and what are tombstones?
- What does each term in the stack trace mean?
- How to know exactly which line of the code has the issue, using a debug tool?
- How to proceed further if the tool could not help, and what to do next?
- Conclusion
What does an Android native crash look like, and what are tombstones?
Where do we see a native crash trace? When a native process crashes, the Android system generates a short trace with the basic details, which can be seen in the adb logs. A more detailed report, with the full memory map and stack traces, is written to the /data/tombstones folder on the device; this file is called a tombstone. In this blog I shall focus on the trace from the adb logs. Below is a detailed example of what a native crash looks like.
*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
Build fingerprint: 'Android/aosp_angler/angler:7.1.1/NYC/enh12211018:eng/test-keys'
Revision: '0'
ABI: 'arm'
pid: 17946, tid: 17949, name: crasher >>> crasher <<<
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0xc
r0 0000000c r1 00000000 r2 00000000 r3 00000000
r4 00000000 r5 0000000c r6 eccdd920 r7 00000078
r8 0000461a r9 ffc78c19 sl ab209441 fp fffff924
ip ed01b834 sp eccdd800 lr ecfa9a1f pc ecfd693e cpsr 600e0030
backtrace:
#00 pc 0004793e /system/lib/libc.so (pthread_mutex_lock+1)
#01 pc 0001aa1b /system/lib/libc.so (readdir+10)
#02 pc 00001b91 /system/xbin/crasher (readdir_null+20)
#03 pc 0000184b /system/xbin/crasher (do_action+978)
#04 pc 00001459 /system/xbin/crasher (thread_callback+24)
#05 pc 00047317 /system/lib/libc.so (_ZL15__pthread_startPv+22)
#06 pc 0001a7e5 /system/lib/libc.so (__start_thread+34)
Tombstone written to: /data/tombstones/tombstone_06
Credits: Android Developer documents
What does each term in the stack trace mean?
The trace above looks complex, but it is not if you focus on the core details, which are listed below.
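pid: Process ID
tid: ID of the thread that crashed
name: Name of the thread that just crashed; the process name appears between >>> and <<<
Signal/code: The signal that caused the crash and its code. In this case the signal is SIGSEGV (signal 11); the code SEGV_MAPERR means the faulting address was not mapped, and fault addr is the address that was accessed.
Registers/SP/PC: The r* entries are the register contents; sp is the stack pointer and pc is the program counter at the moment the process crashed.
backtrace: This contains the most critical details. Each line that starts with # is a backtrace frame, containing, in order: #FrameNumber | PC address (an offset within the mapped file) | dynamic library/bin path | function name plus offset.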
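To make the fields above concrete, here is a minimal sketch that pulls them out of a crash trace with regular expressions. The helper name parse_crash and the regex patterns are my own assumptions, written only against the sample trace in this blog; other Android versions may format these lines differently.

```python
import re

# Patterns written against the sample trace in this blog (an assumption,
# not an official format specification).
HEADER_RE = re.compile(r"pid: (\d+), tid: (\d+), name: (\S+)")
SIGNAL_RE = re.compile(
    r"signal (\d+) \((\w+)\), code (-?\d+) \((\w+)\), fault addr (\S+)"
)
FRAME_RE = re.compile(r"#(\d+) pc ([0-9a-f]+)\s+(\S+)(?:\s+\((.+)\+(\d+)\))?")


def parse_crash(text):
    """Return a dict with pid, tid, name, signal, code, fault_addr and frames."""
    crash = {"frames": []}
    for line in text.splitlines():
        m = HEADER_RE.search(line)
        if m:
            crash.update(pid=int(m.group(1)), tid=int(m.group(2)), name=m.group(3))
            continue
        m = SIGNAL_RE.search(line)
        if m:
            crash.update(signal=m.group(2), code=m.group(4), fault_addr=m.group(5))
            continue
        m = FRAME_RE.search(line)
        if m:
            crash["frames"].append({
                "index": int(m.group(1)),
                "pc": int(m.group(2), 16),      # PC as an integer offset
                "path": m.group(3),             # library or binary path
                "function": m.group(4),         # None if no symbol was printed
                "offset": int(m.group(5)) if m.group(5) else None,
            })
    return crash


SAMPLE = """\
pid: 17946, tid: 17949, name: crasher >>> crasher <<<
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0xc
backtrace:
#00 pc 0004793e /system/lib/libc.so (pthread_mutex_lock+1)
#01 pc 0001aa1b /system/lib/libc.so (readdir+10)
#02 pc 00001b91 /system/xbin/crasher (readdir_null+20)
"""

crash = parse_crash(SAMPLE)
print(crash["name"], crash["tid"], crash["signal"], crash["fault_addr"])
for frame in crash["frames"]:
    print(frame["index"], hex(frame["pc"]), frame["path"], frame["function"])
```

Having the frames as structured data makes it easy to pick out the frames that belong to your own binary, which is what the next section does by hand.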
How to exactly know which line of the code has an issue with the debug tool?
Still here? Let's see how we can make sense of the backtrace we just saw. Before that, let me introduce the tool I use most frequently: addr2line. To use it, first look at the backtrace section and find the first frame that belongs to your own process. Why? Because you should always assume the issue is in your module rather than in a core system library, such as libc in this case. In the backtrace below, frame #02 is the first frame that belongs to the crasher binary, which is where everything started. So let's collect the information that the tool is going to need.
PC address: 00001b91
library/bin name: crasher
backtrace:
#00 pc 0004793e /system/lib/libc.so (pthread_mutex_lock+1)
#01 pc 0001aa1b /system/lib/libc.so (readdir+10)
#02 pc 00001b91 /system/xbin/crasher (readdir_null+20)
Now we have our details. Next, find the symbol file for the library or binary. It can be found in the compiled build under out/target/product/<targetName>/symbols/<path mentioned in backtrace frame>. Once you have the symbol file, you are mostly done. Use the command below to find the exact line of code where the issue occurred, then go to that file and line number, check the code, and fix it.
addr2line -e absolute_path_to_symbolfile 00001b91
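The lookup above can also be scripted. The following is a minimal sketch that builds the addr2line command line for one backtrace frame. The helper name addr2line_command, the target name "angler", and the relative symbols path are assumptions for illustration. The -f and -C flags (print the function name, demangle C++ symbols) are optional additions to the plain -e form shown above. Depending on your setup you may need the toolchain's own binary (for example llvm-addr2line) instead of the host addr2line, which is why the tool name is a parameter.

```python
import posixpath
import shlex


def addr2line_command(symbols_root, frame_path, pc, tool="addr2line"):
    """Build the addr2line argument list for one backtrace frame."""
    # The symbol file sits under symbols/ at the same path the frame reports.
    symbol_file = posixpath.join(symbols_root, frame_path.lstrip("/"))
    # -f: print the function name, -C: demangle C++ names, -e: file to inspect.
    return [tool, "-f", "-C", "-e", symbol_file, pc]


# "angler" is a hypothetical <targetName>; use the one from your own build.
cmd = addr2line_command(
    "out/target/product/angler/symbols",
    "/system/xbin/crasher",
    "00001b91",
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run on a machine that has the build output available; the sketch only prints the command so it can be checked first.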
How to proceed further if the tool could not help, and what to do next?
There are cases where the problem under the hood is more complex, such as a double free, unreleased locks, or memory leaks. The following things can be tried.
- Get the thread ID (tid) of the crashing thread from the crash trace and check the adb logs. Look at the last thing that thread did and see whether it gives any clue about the crash. If not, add more logging to get a better understanding.
- You can also use Valgrind or AddressSanitizer (ASan) to get more insight into the process, such as memory leaks.
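The first suggestion, following the crashing thread through the adb logs, can be sketched as a small filter. This assumes the log was captured in logcat's threadtime layout (date, time, pid, tid, level, tag). The helper name lines_for_thread and the log lines in LOG are made up for illustration and are not real device output.

```python
import re

# date, time, pid, tid, level: the layout assumed for each log line.
LINE_RE = re.compile(r"^\S+\s+\S+\s+(\d+)\s+(\d+)\s+[VDIWEF]\s")


def lines_for_thread(log_text, pid, tid):
    """Return the log lines written by the given pid/tid, in order."""
    matches = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if m and int(m.group(1)) == pid and int(m.group(2)) == tid:
            matches.append(line)
    return matches


# Invented log lines, only to exercise the filter.
LOG = """\
01-01 10:00:00.100 17946 17946 I crasher : starting worker thread
01-01 10:00:00.120 17946 17949 D crasher : worker picked action readdir_null
01-01 10:00:00.121   512   530 I OtherTag: unrelated message
01-01 10:00:00.125 17946 17949 D crasher : calling readdir
"""

thread_lines = lines_for_thread(LOG, pid=17946, tid=17949)
for line in thread_lines:
    print(line)
```

The last few lines returned are usually the most interesting ones, since they show what the thread was doing right before the crash.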
Conclusion
In this blog we covered the basics of debugging a native Android crash and suggested more advanced tools that can help catch such crashes before they reach production. I must say crash debugging is an art: the more you do it, the better you get at it. I am leaving a few really nice documents below for a more detailed understanding. Bye bye, happy coding!
References
https://source.android.com/docs/core/tests/debug/native-crash
https://linux.die.net/man/1/addr2line
https://valgrind.org/docs/manual/quick-start.html
https://learn.microsoft.com/en-us/cpp/sanitizers/asan?view=msvc-170