The application generating the segmentation fault must be configured to produce a coredump or, if the segmentation fault can be reproduced on demand, it can be run manually under gdb.
Let us pick Apache to go through the debugging steps with a real-world example. In order to produce a coredump, apache2 needs to be configured:
$ sudo vi /etc/apache2/apache2.conf
...
CoreDumpDirectory /tmp/apache-coredumps
...

Also, the OS must not impose a limit on the size of the core file, since we don't know how big it will be:
$ sudo bash
# ulimit -c unlimited

Note that ulimit only affects this shell and the processes started from it. In addition, the core dump directory must exist and must be owned by the apache user (www-data in this case):
$ sudo mkdir /tmp/apache-coredumps
$ sudo chown www-data:www-data /tmp/apache-coredumps

Make sure to stop and start apache as two separate steps:
$ sudo apachectl stop
$ sudo apachectl start

Look at the running processes:
$ ps -ef | grep apache2

Confirm that the processes are running with "Max core file size" set to unlimited:
$ cat /proc/${pid}/limits
...
Max core file size        unlimited            unlimited            bytes
...

Here is a quick way to do it with a one-liner. It lists the max core file size for the parent apache2 process and all its children:
$ ps -ef | grep 'sbin/apache2' | grep -v grep | awk '{print $2}' | while read -r pid; do grep core "/proc/$pid/limits"; done

To test that your configuration works, just force a coredump. This can be achieved by sending a SIGABRT signal to the process:
$ kill -SIGABRT ${pid}

Analyze the coredump file with gdb. In this case it confirms that the SIGABRT signal was used:
$ gdb core /tmp/apache-coredumps/core
...
Core was generated by `/usr/sbin/apache2 -k start'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f1bc74073bd in ?? ()
...

Leave apache running, and when it fails with a segmentation fault you can confirm the reason:
$ gdb core /tmp/apache-coredumps/core
...
Core was generated by `/usr/sbin/apache2 -k start'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fc4d7ef6a11 in ?? ()
...

Then dig deeper to find out what the problem really is. Note that we now pass the apache2 binary to gdb along with the coredump, so that gdb can resolve symbols and give more detail about the crash:
$ gdb /usr/sbin/apache2 /tmp/apache-coredumps/core
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fc4d7ef6a11 in apr_brigade_cleanup (data=0x7fc4d8760e68) at /build/buildd/apr-util-1.5.3/buckets/apr_brigade.c:44
44      /build/buildd/apr-util-1.5.3/buckets/apr_brigade.c: No such file or directory.

Using the bt command from the gdb prompt gives us more:
Core was generated by `/usr/sbin/apache2 -k start'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fc4d7ef6a11 in apr_brigade_cleanup (data=0x7fc4d8760e68) at /build/buildd/apr-util-1.5.3/buckets/apr_brigade.c:44
44      /build/buildd/apr-util-1.5.3/buckets/apr_brigade.c: No such file or directory.
(gdb) bt
#0  0x00007fc4d7ef6a11 in apr_brigade_cleanup (data=0x7fc4d8760e68) at /build/buildd/apr-util-1.5.3/buckets/apr_brigade.c:44
#1  0x00007fc4d7cd79ce in run_cleanups (cref=<optimized out>) at /build/buildd/apr-1.5.0/memory/unix/apr_pools.c:2352
#2  apr_pool_destroy (pool=0x7fc4d875f028) at /build/buildd/apr-1.5.0/memory/unix/apr_pools.c:814
#3  0x00007fc4d357de00 in ssl_io_filter_output (f=0x7fc4d876a8e0, bb=0x7fc4d8767ab8) at ssl_engine_io.c:1659
#4  0x00007fc4d357abaa in ssl_io_filter_coalesce (f=0x7fc4d876a8b8, bb=0x7fc4d8767ab8) at ssl_engine_io.c:1558
#5  0x00007fc4d87f1d2d in ap_process_request_after_handler (r=r@entry=0x7fc4d875f0a0) at http_request.c:256
#6  0x00007fc4d87f262a in ap_process_async_request (r=r@entry=0x7fc4d875f0a0) at http_request.c:353
#7  0x00007fc4d87ef500 in ap_process_http_async_connection (c=0x7fc4d876a330) at http_core.c:143
#8  ap_process_http_connection (c=0x7fc4d876a330) at http_core.c:228
#9  0x00007fc4d87e6220 in ap_run_process_connection (c=0x7fc4d876a330) at connection.c:41
#10 0x00007fc4d47fe81b in process_socket (my_thread_num=19, my_child_num=<optimized out>, cs=0x7fc4d876a2b8, sock=<optimized out>, p=<optimized out>, thd=<optimized out>) at event.c:970
#11 worker_thread (thd=<optimized out>, dummy=<optimized out>) at event.c:1815
#12 0x00007fc4d7aa7184 in start_thread (arg=0x7fc4c3fef700) at pthread_create.c:312
#13 0x00007fc4d77d437d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

I was misled by the messages above until I started changing apache configurations and re-examining the generated coredumps. I realized Apache would fail at any line of any source file, complaining about "No such file or directory", a clear consequence of its inability to access resources that did exist:

Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/sbin/apache2 -k start'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fb44328d992 in ap_save_brigade (f=f@entry=0x7fb4432e5148, saveto=saveto@entry=0x7fb4432e59c8, b=b@entry=0x7fb435ffa568, p=0x7fb443229028) at util_filter.c:648
648     util_filter.c: No such file or directory.

By disabling modules I was able to get to the bottom of it: an apparent bug in mod_proxy, where a misconfiguration causes recursive pulling of unavailable resources from the proxied target. Once the issue is found, comment out or remove the coredump configuration from apache:
$ sudo vi /etc/apache2/apache2.conf
...
# CoreDumpDirectory /tmp/apache-coredumps
...
$ sudo apachectl graceful
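
For the next debugging session, the "Max core file size" check used earlier in this post could be wrapped in a couple of small shell functions. This is only a sketch: the function names core_soft_limit and report_core_limits are made up for illustration, and it assumes a Linux /proc filesystem and that the processes are named exactly apache2.

```shell
#!/usr/bin/env bash

# Print the soft "Max core file size" value from a /proc/<pid>/limits-style file.
# (Hypothetical helper name, not part of any Apache or system tooling.)
core_soft_limit() {
    # Line layout: Max core file size <soft> <hard> <units>, so the soft limit is field 5.
    awk '/^Max core file size/ {print $5}' "$1"
}

# Print "<pid> <soft core limit>" for every process whose name matches exactly.
# Defaults to apache2, the process name assumed throughout this post.
report_core_limits() {
    local name="${1:-apache2}" pid
    for pid in $(pgrep -x "$name"); do
        printf '%s %s\n' "$pid" "$(core_soft_limit "/proc/$pid/limits")"
    done
}
```

After sourcing the file, running report_core_limits as root should print one line per apache2 process. Any process that does not show unlimited was probably started before the ulimit change took effect.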