
Android native process crash debugging

Introduction

Many of us first heard about native crashes while studying engineering; one well-known example is the segmentation fault. When working on native code written in C or C++, there are many tools for debugging such crashes, and one popular choice is GDB, which many of us have used often. On the Android platform, however, we need different tools to analyze and debug native crashes. In this blog I want to discuss the following items, in this order.

  • What does an Android native crash look like, and what are tombstones?
  • What does each term mean in the stack trace?
  • How to find exactly which line of code has the issue, using a debug tool?
  • How to proceed further if the tool could not help, and what to do next?
  • Conclusion

What does an Android native crash look like, and what are tombstones?

Below is a detailed example of what a native crash looks like. Where do we see this crash trace? When a native process crashes, the Android system writes a short trace with the basic details to the device log, which can be seen in the adb logs. A more detailed report, including the full memory map and stack trace, is written to the /data/tombstones folder on the device; this report is called a tombstone file. In this blog I shall focus on the trace from the adb logs.

*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
Build fingerprint: 'Android/aosp_angler/angler:7.1.1/NYC/enh12211018:eng/test-keys'
Revision: '0'
ABI: 'arm'
pid: 17946, tid: 17949, name: crasher  >>> crasher <<<
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0xc
    r0 0000000c  r1 00000000  r2 00000000  r3 00000000
    r4 00000000  r5 0000000c  r6 eccdd920  r7 00000078
    r8 0000461a  r9 ffc78c19  sl ab209441  fp fffff924
    ip ed01b834  sp eccdd800  lr ecfa9a1f  pc ecfd693e  cpsr 600e0030
backtrace:
    #00 pc 0004793e  /system/lib/libc.so (pthread_mutex_lock+1)
    #01 pc 0001aa1b  /system/lib/libc.so (readdir+10)
    #02 pc 00001b91  /system/xbin/crasher (readdir_null+20)
    #03 pc 0000184b  /system/xbin/crasher (do_action+978)
    #04 pc 00001459  /system/xbin/crasher (thread_callback+24)
    #05 pc 00047317  /system/lib/libc.so (_ZL15__pthread_startPv+22)
    #06 pc 0001a7e5  /system/lib/libc.so (__start_thread+34)
Tombstone written to: /data/tombstones/tombstone_06
Credits: Android Developer documents 

What does each term mean in the stack trace?

The trace above looks complex, but it is not if you focus on the core details, listed below.

Pid: Process ID

Tid: Thread ID

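name: Name of the thread that crashed, followed by the name of the process between >>> and <<< (both are crasher here)

Signal/code: The signal that caused the crash, in this case signal 11 (SIGSEGV) with code 1 (SEGV_MAPERR). The fault addr field (0xc here) is the memory address whose access triggered the signal.

Registers/SP/PC: r* are the register values, sp is the stack pointer, and pc is the program counter at the moment the process crashed.

backtrace: This contains the most critical details. Each line that starts with # is one backtrace frame and contains, in order: frame number | PC address | dynamic library/binary path | function name plus offset. Note that the PC address in a frame is relative to the start of the library or binary, which is why it differs from the absolute pc value in the register dump.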
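To make the frame layout concrete, here is a minimal Python sketch that splits one backtrace line into its fields. The names FRAME_RE, Frame and parse_frame are my own and are not part of any Android tool, and the pattern is a simplification that only covers frames shaped like the ones in the trace above.

```python
import re
from typing import NamedTuple, Optional

# Matches one frame, e.g.
#   "#02 pc 00001b91  /system/xbin/crasher (readdir_null+20)"
# Simplification: assumes the symbol name itself contains no "+" or ")".
FRAME_RE = re.compile(
    r"#(?P<index>\d+)\s+pc\s+(?P<pc>[0-9a-fA-F]+)\s+(?P<path>\S+)"
    r"(?:\s+\((?P<symbol>[^)+]+)(?:\+(?P<offset>\d+))?\))?"
)


class Frame(NamedTuple):
    index: int             # frame number, 0 is the innermost call
    pc: int                # PC address relative to the binary
    path: str              # library or binary path on the device
    symbol: Optional[str]  # function name, if the trace has one
    offset: Optional[int]  # offset inside that function


def parse_frame(line: str) -> Optional[Frame]:
    """Return the parsed frame, or None if the line is not a frame."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    offset = m.group("offset")
    return Frame(
        index=int(m.group("index")),
        pc=int(m.group("pc"), 16),
        path=m.group("path"),
        symbol=m.group("symbol"),
        offset=int(offset) if offset is not None else None,
    )


frame = parse_frame("    #02 pc 00001b91  /system/xbin/crasher (readdir_null+20)")
print(frame)
```

Lines that are not frames, such as the "backtrace:" header, simply return None, so the same function can be run over a whole log.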

How to find exactly which line of code has the issue, using a debug tool?

Still here? Let's see how to make sense of the backtrace we just saw. First, let me introduce the tool that I use most often: addr2line. To use it, start with the backtrace section and find the frame that belongs to your own process. Why? Because it is best to assume the issue is in your module rather than in a core system library such as libc in this case. In the backtrace below, frame #02 is the first one that belongs to our process, and it is where the failing call chain entered libc. So let's collect the info that the tool is going to need.

PC address: 00001b91

library/bin name: crasher

backtrace:    
#00 pc 0004793e  /system/lib/libc.so (pthread_mutex_lock+1)
#01 pc 0001aa1b  /system/lib/libc.so (readdir+10)   
#02 pc 00001b91  /system/xbin/crasher (readdir_null+20)    

Now we have our details. Next, find the symbol file for the library or binary. It can be found in the compiled build under out/target/product/<targetName>/symbols/<path mentioned in backtrace frame>. Once you have the symbol file, we are mostly done. Use the command below to find the exact line of code where the issue occurred, then go to that file and line number, check the code, and fix it.

addr2line -e absolute_path_to_symbolfile 00001b91
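If you do this often, the path lookup and the command can be assembled by a small helper. The Python sketch below only builds the command and does not run it. The function names, the build root /home/me/aosp and the target name angler are assumptions for illustration; use the values from your own build. The extra -f and -C flags are optional addr2line options that print the function name and demangle C++ symbols.

```python
from pathlib import PurePosixPath
from typing import List


def symbol_file(build_root: str, target_name: str, device_path: str) -> str:
    """Map a path from a backtrace frame to its unstripped copy in the build tree."""
    # Drop the leading "/" so the device path can be appended to the symbols dir.
    relative = PurePosixPath(device_path).relative_to("/")
    return str(
        PurePosixPath(build_root)
        / "out/target/product"
        / target_name
        / "symbols"
        / relative
    )


def addr2line_command(symbol_path: str, pc: str) -> List[str]:
    """Build the addr2line invocation for one PC address (hex string)."""
    # -e selects the binary, -f prints the function name, -C demangles C++ names.
    return ["addr2line", "-f", "-C", "-e", symbol_path, pc]


# Hypothetical build root and target name, chosen only for this example.
path = symbol_file("/home/me/aosp", "angler", "/system/xbin/crasher")
command = addr2line_command(path, "00001b91")
print(" ".join(command))
```

The resulting list can be passed to subprocess.run on a machine that has the build tree and addr2line available.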

How to proceed further if the tool could not help, and what to do next?

There are cases where the underlying problem is more complex, such as a double free, unreleased locks, or memory leaks. In those cases the following things can be tried.

  • Get the thread ID of the process from the backtrace and check the adb logs for the last thing that thread did, to see if there is any clue about the crash. If not, add more logging to get a better understanding.
  • You can also use tools such as Valgrind or AddressSanitizer to get more insight into the process, such as memory leaks.
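For the first suggestion, a short Python sketch can pull out only the log lines written by the crashing thread. It assumes the log was captured in logcat's threadtime layout (date, time, pid, tid, level, tag, message). The sample lines and their messages are made up for illustration, and lines_for_thread is a hypothetical helper name.

```python
import re
from typing import Iterable, List

# One line in threadtime layout, e.g.
#   "06-15 10:12:34.200 17946 17949 D crasher : opening directory"
LOG_RE = re.compile(
    r"^(?P<date>\d{2}-\d{2})\s+(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<pid>\d+)\s+(?P<tid>\d+)\s+(?P<level>[VDIWEF])\s+"
    r"(?P<tag>[^:]*):\s?(?P<msg>.*)$"
)


def lines_for_thread(lines: Iterable[str], tid: int) -> List[str]:
    """Keep only the log lines emitted by the given thread ID."""
    kept = []
    for line in lines:
        m = LOG_RE.match(line)
        if m is not None and int(m.group("tid")) == tid:
            kept.append(line)
    return kept


# Hypothetical log lines; only the pid and tid values come from the trace above.
SAMPLE_LOG = [
    "06-15 10:12:34.100 17946 17946 I crasher : main thread started",
    "06-15 10:12:34.200 17946 17949 D crasher : opening directory",
    "06-15 10:12:34.250 17946 17949 D crasher : calling readdir",
    "not a log line",
]

for line in lines_for_thread(SAMPLE_LOG, 17949):
    print(line)
```

The last line printed is the last thing the thread logged before the crash, which is usually the best place to start reading.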

Conclusion

In this blog we covered the basics of debugging a native Android crash and pointed to more advanced tools that can help catch such crashes before they reach production. I must say crash debugging is an art: the more you do it, the better you get at it. I am leaving a few really nice documents below for a more detailed understanding. Bye bye, happy coding!

References

https://source.android.com/docs/core/tests/debug/native-crash

https://linux.die.net/man/1/addr2line

https://valgrind.org/docs/manual/quick-start.html

https://learn.microsoft.com/en-us/cpp/sanitizers/asan?view=msvc-170
