#NoProjects: projects are not needed for software development. Prioritize minimal marketable features (MMF) based on multi-dimensional analysis (value, risk, and cost of delay) instead.
I was tempted to share why I think that projects in software development kill value rather than bring value, but before posting I did my homework and found that somebody had written about it before.
Prioritization should be about selecting the minimal marketable features (MMF) that bring the biggest value, minimize the cost of delay, and/or reduce the most risk (what I call multi-dimensional analysis). It should be done continuously, just as software should be shipped continuously. Accepting a big effort as more profitable than focusing on MMF delivery is a mistake: the devil is *always* in the details.
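The multi-dimensional analysis above can be sketched mechanically. Below is a toy example (the backlog, the scores, and the simple additive scoring formula are all invented for illustration) that ranks candidate MMFs by value, risk reduction and cost of delay relative to effort:

```shell
# Hypothetical backlog: feature,value,risk_reduction,cost_of_delay,effort
# (scores on a 1-10 scale; every number here is made up for this example)
cat > backlog.csv <<'EOF'
checkout-flow,8,3,9,5
sso-login,6,7,4,3
audit-log,4,9,2,2
dark-mode,3,1,1,1
EOF

# Rank by a simple score: (value + risk_reduction + cost_of_delay) / effort
awk -F, '{ printf "%.2f %s\n", ($2+$3+$4)/$5, $1 }' backlog.csv | sort -rn
```

A real analysis would weight the dimensions and revisit the scores continuously, but even a crude ranking like this beats committing to one big project up front.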
Monday, June 13, 2016
Saturday, June 04, 2016
AH00051: child pid ? exit signal Segmentation fault (11), possible coredump in ?
I ran into the following segmentation fault in Apache, accompanied by high CPU usage on the server:
[Fri Jun 03 06:00:01.270303 2016] [core:notice] [pid 29628:tid 140412199950208] AH00051: child pid 2343 exit signal Segmentation fault (11), possible coredump in /tmp/apache-coredumps

After debugging the problem I concluded that mod_proxy was the culprit, apparently because of a recursion bug. I decided not to report it because it was hard to replicate (it happened only late at night) and there was a workaround I could use. The error happened on the first request for a resource that the proxied worker could not serve: Apache tried to use a specific document for the error, but a configuration issue made it pull that document from the worker as well. Since the document did not exist on the remote server either, mod_proxy entered an infinite loop, resulting in an Apache crash. Clearly, resolving the configuration issue so that the document is pulled from the local Apache hides this potential mod_proxy bug.
[Fri Jun 03 05:56:40.831871 2016] [proxy:error] [pid 2343:tid 140411834115840] (502)Unknown error 502: [client 192.168.0.180:58819] AH01084: pass request body failed to 172.16.2.2:8443 (sample.com)
[Fri Jun 03 05:56:40.831946 2016] [proxy:error] [pid 2343:tid 140411834115840] [client 192.168.0.180:58819] AH00898: Error during SSL Handshake with remote server returned by /sdk
[Fri Jun 03 05:56:40.831953 2016] [proxy_http:error] [pid 2343:tid 140411834115840] [client 192.168.0.180:58819] AH01097: pass request body failed to 172.16.2.2:8443 (sample.com) from 192.168.0.180 ()
[Fri Jun 03 05:56:40.844138 2016] [proxy:error] [pid 2343:tid 140411834115840] (502)Unknown error 502: [client 192.168.0.180:58819] AH01084: pass request body failed to 172.16.2.2:8443 (sample.com)
[Fri Jun 03 05:56:40.844177 2016] [proxy:error] [pid 2343:tid 140411834115840] [client 192.168.0.180:58819] AH00898: Error during SSL Handshake with remote server returned by /html/error/503.html
[Fri Jun 03 05:56:40.844185 2016] [proxy_http:error] [pid 2343:tid 140411834115840] [client 192.168.0.180:58819] AH01097: pass request body failed to 172.16.2.2:8443 (sample.com) from 192.168.0.180 ()
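The workaround mentioned above is to make sure the error document is served by the local Apache rather than pulled through the proxy. A sketch of such a configuration follows, based on the paths and backend address visible in the logs; treat the exact directives and layout as assumptions about this setup, not the actual configuration that was used:

```apache
# Serve the error page from the local Apache document root.
ErrorDocument 503 /html/error/503.html

# Exclude the error directory from proxying (the `!` target) so a request
# for the error page never goes back to the unreachable backend.
# Exclusions must appear before the general ProxyPass rule.
ProxyPass /html/error/ !
ProxyPass / https://172.16.2.2:8443/
ProxyPassReverse / https://172.16.2.2:8443/
```

With the exclusion in place, a backend failure renders the local 503 page instead of recursing into another proxied request.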
Why do I get a Linux SIGSEGV / segfault / Segmentation fault ?
To answer this question you will need to get a coredump and use the GNU debugger (gdb).
The application generating the segmentation fault must be configured to produce a coredump or should be manually run with gdb in case a segmentation fault can be manually replicated.
Let us pick Apache to go through the debugging steps with a real world example. In order to produce a coredump apache2 needs to be configured:
$ vi /etc/apache2/apache2.conf
...
CoreDumpDirectory /tmp/apache-coredumps
...

The OS must also not impose limits on the size of the core file, since we don't know in advance how big it will be:
$ sudo bash
# ulimit -c unlimited

In addition, the core dump directory must exist and must be owned by the Apache user (www-data in this case):
$ sudo mkdir /tmp/apache-coredumps
$ sudo chown www-data:www-data /tmp/apache-coredumps

Make sure to stop and start Apache separately, from the shell where the ulimit was raised, so the processes inherit the new limit:
$ sudo apachectl stop
$ sudo apachectl start

Look into the running processes:
$ ps -ef | grep apache2

Confirm that the processes are running with "Max core file size" unlimited:
$ cat /proc/${pid}/limits
...
Max core file size        unlimited            unlimited            bytes
...

Here is a quick way to do it with a one-liner. It lists the max core file size for the parent apache2 process and all its children:
$ ps -ef | grep 'sbin/apache2' | grep -v grep | awk '{print $2}' | while read -r pid; do cat /proc/$pid/limits | grep core; done

To test that your configuration works, just force a coredump. This can be achieved by sending a SIGABRT signal to the process:
$ kill -SIGABRT ${pid}

Analyze the coredump file with gdb. In this case it confirms that the SIGABRT signal was used:
$ gdb core /tmp/apache-coredumps/core
...
Core was generated by `/usr/sbin/apache2 -k start'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f1bc74073bd in ?? ()
...

Leave Apache running, and when it fails with a segmentation fault you can confirm the reason:
$ gdb core /tmp/apache-coredumps/core
...
Core was generated by `/usr/sbin/apache2 -k start'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fc4d7ef6a11 in ?? ()
...

Then dig deeper to find out what the problem really is. Note that we now pass the apache2 binary to gdb to get richer information from the coredump:
$ gdb /usr/sbin/apache2 /tmp/apache-coredumps/core
...
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fc4d7ef6a11 in apr_brigade_cleanup (data=0x7fc4d8760e68) at /build/buildd/apr-util-1.5.3/buckets/apr_brigade.c:44
44      /build/buildd/apr-util-1.5.3/buckets/apr_brigade.c: No such file or directory.

Using the bt command from the gdb prompt gives us more:
Core was generated by `/usr/sbin/apache2 -k start'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fc4d7ef6a11 in apr_brigade_cleanup (data=0x7fc4d8760e68) at /build/buildd/apr-util-1.5.3/buckets/apr_brigade.c:44
44      /build/buildd/apr-util-1.5.3/buckets/apr_brigade.c: No such file or directory.
(gdb) bt
#0  0x00007fc4d7ef6a11 in apr_brigade_cleanup (data=0x7fc4d8760e68) at /build/buildd/apr-util-1.5.3/buckets/apr_brigade.c:44
#1  0x00007fc4d7cd79ce in run_cleanups (cref=<optimized out>) at /build/buildd/apr-1.5.0/memory/unix/apr_pools.c:2352
#2  apr_pool_destroy (pool=0x7fc4d875f028) at /build/buildd/apr-1.5.0/memory/unix/apr_pools.c:814
#3  0x00007fc4d357de00 in ssl_io_filter_output (f=0x7fc4d876a8e0, bb=0x7fc4d8767ab8) at ssl_engine_io.c:1659
#4  0x00007fc4d357abaa in ssl_io_filter_coalesce (f=0x7fc4d876a8b8, bb=0x7fc4d8767ab8) at ssl_engine_io.c:1558
#5  0x00007fc4d87f1d2d in ap_process_request_after_handler (r=r@entry=0x7fc4d875f0a0) at http_request.c:256
#6  0x00007fc4d87f262a in ap_process_async_request (r=r@entry=0x7fc4d875f0a0) at http_request.c:353
#7  0x00007fc4d87ef500 in ap_process_http_async_connection (c=0x7fc4d876a330) at http_core.c:143
#8  ap_process_http_connection (c=0x7fc4d876a330) at http_core.c:228
#9  0x00007fc4d87e6220 in ap_run_process_connection (c=0x7fc4d876a330) at connection.c:41
#10 0x00007fc4d47fe81b in process_socket (my_thread_num=19, my_child_num=<optimized out>, cs=0x7fc4d876a2b8, sock=<optimized out>, p=<optimized out>, thd=<optimized out>) at event.c:970
#11 worker_thread (thd=<optimized out>, dummy=<optimized out>) at event.c:1815
#12 0x00007fc4d7aa7184 in start_thread (arg=0x7fc4c3fef700) at pthread_create.c:312
#13 0x00007fc4d77d437d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

I was misled by the messages above until I started changing Apache configurations and re-examining the generated coredumps. I realized Apache would fail at any line of any source file, complaining about "No such file or directory", a clear consequence of its inability to access resources that did exist:
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/sbin/apache2 -k start'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fb44328d992 in ap_save_brigade (f=f@entry=0x7fb4432e5148, saveto=saveto@entry=0x7fb4432e59c8, b=b@entry=0x7fb435ffa568, p=0x7fb443229028) at util_filter.c:648
648     util_filter.c: No such file or directory.

By disabling modules one at a time I was able to get to the bottom of it: an apparent bug in mod_proxy, triggered when a misconfiguration causes recursive pulling of unavailable resources from the proxied target. Once the investigation is over, comment out or remove the coredump configuration from Apache:
$ sudo vi /etc/apache2/apache2.conf
...
# CoreDumpDirectory /tmp/apache-coredumps
...
$ sudo apachectl graceful
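As a side note, stopping and starting Apache is needed because the ulimit applies only to processes started from that shell. Assuming util-linux's prlimit is available, the core file size limit of an already running process can be inspected, and raised in place by root, without a restart (${pid} is the target process id):

```shell
# Show the current soft/hard core file size limits of a running process.
prlimit --pid ${pid} --core

# Raise them in place (requires root); new coredumps honor the new limit.
sudo prlimit --pid ${pid} --core=unlimited
```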
Increasing swap size in Ubuntu Linux
Here is how to depart with parted (no pun intended) from a swap partition and move to a swap file in Ubuntu Linux. When you need more swap, it is far easier to increase the size of a swap file than to resize a partition. Starting with the 2.6 Linux kernel, swap files are just as fast as swap partitions.
Remove the current swap partition:
$ sudo parted
...
(parted) print
...
Number  Start   End     Size    Type      File system     Flags
 1      1049kB  40.8GB  40.8GB  primary   ext4            boot
 2      40.8GB  42.9GB  2145MB  extended
 5      40.8GB  42.9GB  2145MB  logical   linux-swap(v1)
...
(parted) rm 2
(parted) print
...
Number  Start   End     Size    Type      File system     Flags
 1      1049kB  40.8GB  40.8GB  primary   ext4            boot
...

Delete the corresponding entry from /etc/fstab:
# swap was on /dev/sda5 during installation
# UUID=544f0d91-d3db-4301-8d1b-f6bfb2fdee5b none            swap    sw              0       0

Disable all swapping devices:
$ sudo swapoff -a

Create a swap file with the correct permissions; for example, the below creates a 4GB one:
$ sudo fallocate -l 4G /swapfile
$ sudo chmod 600 /swapfile

Set up the Linux swap area:
$ sudo mkswap /swapfile

Enable the swap device:
$ sudo swapon /swapfile

Confirm that it was created:
$ sudo swapon -s
...
Filename    Type    Size     Used  Priority
/swapfile   file    4194300  0     -1
...

Add the entry to /etc/fstab:
/swapfile none swap sw 0 0

Restart:
$ sudo reboot now

Confirm that the swap is active:
$ free
             total       used       free     shared    buffers     cached
Mem:       4048220    1651032    2397188        536      66648     336164
-/+ buffers/cache:    1248220    2800000
Swap:      4194300          0    4194300
$ sudo swapon -s
...
Filename    Type    Size     Used  Priority
/swapfile   file    4194300  0     -1
...