Tuesday, August 28, 2012

Recovering from a Mysql crash

Our Bugzilla and Mediawiki was down today due to MySQL corruption. Mysql daemon was not running and attempting to connect would give us:
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
The error log showed:
120828 14:55:33 InnoDB: Page checksum 3094081080, prior-to-4.0.14-form checksum 1158860183 InnoDB: stored checksum 841888054, prior-to-4.0.14-form stored checksum 808333620 InnoDB: Page lsn 1529887023 1298231855, low 4 bytes of lsn at page end 824192288 InnoDB: Page number (if stored to page already) 942551086, InnoDB: space id (if created with >= MySQL-4.1.1 and stored already) 976564768 InnoDB: Database page corruption on disk or a failed InnoDB: file read of page 34110.InnoDB: You may have to recover from a backup. InnoDB: It is also possible that your operating InnoDB: system has corrupted its own file cache InnoDB: and rebooting your computer removes the InnoDB: error. InnoDB: If the corrupt page is an index page InnoDB: you can also try to fix the corruption InnoDB: by dumping, dropping, and reimporting InnoDB: the corrupt table. You can use CHECK InnoDB: TABLE to scan your table for corruption. InnoDB: See also http://dev.mysql.com/doc/refman/5.1/en/forcing-recovery.html InnoDB: about forcing recovery. InnoDB: Ending processing because of a corrupt database page. 120828 14:55:33 [Note] Plugin 'FEDERATED' is disabled. InnoDB: Log scan progressed past the checkpoint lsn 0 2062642898 120828 14:55:33 InnoDB: Database was not shut down normally! InnoDB: Starting crash recovery. InnoDB: Reading tablespace information from the .ibd files... InnoDB: Restoring possible half-written data pages from the doublewrite InnoDB: buffer... InnoDB: Doing recovery: scanned up to log sequence number 0 2062644241 120828 14:55:33 InnoDB: Starting an apply batch of log records to the database... InnoDB: Progress in percents: 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 6 4 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 InnoDB: Apply batch completed 120828 14:55:33 InnoDB: Started; log sequence number 0 2062644241 120828 14:55:33 [Note] Event Scheduler: Loaded 0 events 120828 14:55:33 [Note] /usr/sbin/mysqld: ready for connections. Version: '5.1.49-1ubuntu8.1' socket: '/var/run/mysqld/mysqld.sock' port: 3306 (Ubuntu)
So visiting the page as suggested gave us the hint we needed.

Solution 1

There is no way you can guarantee no disruption. The database will need to go down for repair and no process should be able to interfere with that so I recommend to listen to a different port as you ca see below:
  1. Communicate there will be disruption to the proper channels. The database will be unavailable
  2. Edit the mysql configuration file to force a MySQL InnoDB recovery and start mysql in a different port:
    $ sudo vi /etc/mysql/my.cnf
    port            = 13306
    innodb_force_recovery = 2
  3. Restart MySQL
  4. Backup the databases:
    mysqldump --routines -P 13306 -u root -p --databases bugzilla wiki > ~/db-backup.sql
  5. Comment out the "innodb_force_recovery" line from /etc/mysql/my.cnf
  6. Restart MySQL
  7. Restore from backup
    mysql -P 13306 -u root -p < ~/db-backup.sql 
  8. Change back the port to the original
  9. Restart MySQL

Solution 2

If the above does nnot fix the issues you can probably try a couple of other recipes, however the one that should not fail is reinstalling mysql:
  1. Communicate there will be disruption to the proper channels. The database will be unavailable
  2. Edit the mysql configuration file to force MySQL listening in a different port:
    $ sudo vi /etc/mysql/my.cnf
    port = 13306
  3. Restart MySQL
  4. Make sure you have a file (like security.sql) with all permissions saved in a secure place. You can extract such information from the current db using show-grants-mysql.sh
  5. Backup the databases:
    $ databases="mydb"
    $ mysqldump --routines -P 13306 -u root -p --databases $databases > ~/db-backup.sql
  6. Backup my.cnf in a safe place (You should really keep this in a version control repository and use POB recipes to deploy it in the server instead of touching the file directly.
  7. Reinstall mysql using a POB recipe but from the local machine:
    $ wget https://raw.github.com/nestoru/pob-recipes/master/common/mysql/mysql-reinstall.sh
    $ chmod +x mysql-reinstall.sh
    $ ./mysql-reinstall.sh
  8. Restore the my.cnf file from your version control or (less ideal as said above) from the backup you have made
  9. Change the port to 13306 as explained above
  10. Restart mysql (service mysql restart)
  11. Restore mysql databases from the backup
    $ mysql -P 13306 -u root -p < ~/db-backup.sql
  12. Run security.sql script to set the correct permissions. Be sure to delete that file after
  13. Change port to original 3306 a explained before
  14. Restart mysql (service mysql restart)

Friday, August 03, 2012

Technology, Art and Leadership

Technology is like art, you don't need to be an artist to understand it or appreciate it. You need to be an artist to create it.

This was my answer today to some good feedback I got about the way I explained a technical problem. It is usually the case where we, Engineers, ignore the audience. I believe that Parenthood helps, how would you explain your kid a physical phenomenon without having to talk about Gravity which of course he still (5 years old) does not get?

The level of understanding is different and Abstraction plays a unique role towards the goal of teaching. When observing a Painting you probably start with colors and textures, to move soon into style recognition, then Art History, then (why not?) you probably attempt to paint as well. The appreciation will be different but the level of enjoyment will probably be the same as ultimately each brain will be satisfied by different amount of understanding which is always correlated to the current amount of knowledge.

Technology goes beyond Programming Paradigms and Languages. Dividing and conquer will never fail as the preferred strategy to resolve complex problems. With Separation of Concerns in place there will always be a place for everyone but it is up to the team leader to make sure the work is correctly divided and distributed. The Chief must be then like an artist because creativity is her(his) responsibility but in addition he(she) must think as an outsider as well.

Correctly articulating to different audiences the same problem is important to be able to lead teams no matter the size they are.

Wednesday, August 01, 2012

Install Talend Open Studio in Ubuntu Desktop from command line locally or remotely using a POB recipe

What happens when you need to recreate your development environment after a crash? At that point you learn The palest ink is better than the best memory.

And of course Plain Old Bash can help you not only with automating your installations locally but even remotely in the case your development machines are offered as a Service. DevOps movement is about automation and the Developers machines should be consider part of the Infrastructure. You can leverage to tools like Remoto-IT to easily build developer machines and achieve an important DevOps goal: Infrastructure As A Service ( IaaS )

This POB recipe works using CIFS shares thanks to some recipes I have posted before. You can easily make it work for NFS shares or to download Talend directly from the Web:
common/talend-gui-install.sh //windows/unc/path/to/TalendOpenStudio /mnt/TalendOpenStudio WindowsDomain `logname` `logname` TOS-All-r63143-V4.2.2.zip TOS-All-r63143-V4.2.2
Which uses:
#!/bin/bash -e
# talend-gui-install.sh

USAGE="Usage: `basename $0` <cifsDirectory> <localDirectory> <windowsDomain> <linuxUser> <linuxGroup> <distributionFileName> <distributionVersion>"



if [ $# -ne "7" ] 
 echo $USAGE
  exit 1 

mkdir -p "$localDirectory"
common/umount-path.sh "$localDirectory"
common/mount-cifs.sh "$cifsDirectory" "$localDirectory" $CREDENTIALS_PATH $windowsDomain $linuxUser $linuxGroup
echo "Copying distribution file. Might take a while depending on your connection ..."
cp "$localDirectory/$distributionFileName" /opt/
cd /opt/
dst=$distributionVersion-`date "+%Y%m%d"`
if [ -d "$src" ]; then
  echo "Backing up previous Talend installation ..."
  rm -fr $dst
  mv $src $dst
unzip $distributionFileName
chown -R $linuxUser:$linuxUser /opt/$distributionVersion
cp /opt/TOS/plugins/*/icons/appli_48x48.xpm /usr/share/pixmaps/talend_48x48.xpm
common/gnome-application-launcher.sh Talend /opt/TOS/TalendOpenStudio-linux-gtk-x86_64 talend_48x48.xpm
Note that we are using the gnome-application-launcher.sh recipe to create a Desktop launcher for Talend. Note we pass just talend_48x48.xpm as we previously copied to /usr/share/pixmaps/.

Gnome application launcher from command line

Here is a Plain Old Bash ( POB ) recipe that allows you to create a Desktop Gnome Application Launcher:
#!/bin/bash -e
# gnome-application-launcher.sh

USAGE="Usage: `basename $0` <name> <executablePath> <iconPath>"

if [ $# -ne "3" ] 
 echo $USAGE
  exit 1 



echo "[Desktop Entry]" > $launcherPath
echo "Version=1.0" >> $launcherPath
echo "Name=$name" >> $launcherPath
echo "Exec=$executablePath" >> $launcherPath
echo "Icon=$iconPath" >> $launcherPath
echo "Type=Application" >> $launcherPath
echo "Terminal=false" >> $launcherPath
chmod +x $launcherPath
chown $linuxUser:$linuxUser $launcherPath
Call it like this to create a Firefox launcher:
common/gnome-application-launcher.sh Firefox /usr/bin/firefox firefox-icon.png
Note we do not need to provide a full path for the Firefox application icon. You can also just say firefox.png as that one is included in /usr/share/pixmaps.