Wednesday, June 17, 2015

Scatter Diagrams from any two columns in Excel 2010

In Excel 2010, to plot a scatter diagram from two columns that appear in any position and in any order, follow these steps:

  1. Click on an empty cell, then select Menu | Insert | Scatter and pick the first chart type
  2. Right click on the chart area: click “Select Data” | Add legend entries (series) | Pick X and Y values | Pick a name for the series, for example “Comparison of size and rental cost of apartments” or in general “Comparison of X and Y” | click OK
  3. From "Chart Tools | Design | Chart Layouts" pick the first one (Layout 1), which adds the axis labels
  4. Remove the legend on the right, which contains the name of the series. It is redundant because the title already states the same
  5. Click on each axis title to select it, then click again inside it to change the text to the real name of X or Y

Thursday, May 07, 2015

Talend OutOfMemoryError: Java heap space because of many files in a directory

I have blogged in the past about how to debug OutOfMemoryError in Talend jobs.

At least one official Talend component generates these errors when pointed at a directory containing a very large number of files. The reason is that some code builds an array of strings holding every file name, which clearly will not scale. I figured this out by following the steps in that previous post: in Eclipse Memory Analyzer, the cause of the high memory consumption turned out to be an array of strings that matched the file names.

Of course, it is bad practice to store all files in a single root directory; one should use a temporary directory per run, so the solution is actually simple. Nevertheless, keeping such an array of strings is a waste of resources and should be avoided as well.
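
As a rough illustration of the per-run directory idea, here is a minimal bash sketch. It is not taken from the Talend job itself: the function names and the `talend-run` prefix are made up for this example, and it assumes `mktemp -d` is available, as it is on Linux and Mac OS X.

```shell
#!/usr/bin/env bash

# Create a fresh working directory for a single job run and print its path.
# base_dir is a hypothetical parent location; it defaults to the system temp dir.
make_run_dir() {
  local base_dir="${1:-${TMPDIR:-/tmp}}"
  mktemp -d "${base_dir%/}/talend-run.XXXXXX"
}

# Remove a run directory once the job has finished with it.
cleanup_run_dir() {
  local run_dir="$1"
  # Refuse to act on an empty path.
  if [ -z "$run_dir" ]; then
    return 1
  fi
  rm -rf -- "$run_dir"
}
```

A job wrapper could call `run_dir="$(make_run_dir)"` at the start, point the file components at `$run_dir`, and call `cleanup_run_dir "$run_dir"` at the end, so the directory being listed only ever holds the files of one run.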

The bottom line is that automatically increasing memory whenever JVM code throws OutOfMemoryError is not an option. Instead, the engineer should investigate and get to the bottom of why the process is inefficient. Failure to do so only postpones the inevitable, because underperforming jobs simply won't scale. In the case of Talend, as in any Java application, the JVM provides the tools to understand what happened when a memory leak caused a crash.

Saturday, May 02, 2015

Fastest idempotent way to install nodejs in linux or MAC OSX

Simply install the binary from a plain old bash (POB) recipe ;-)

Thursday, April 30, 2015

Fastest idempotent way to install nodejs in Ubuntu

Originally I created a gist tailored to Ubuntu; however, a faster way is to use a Plain Old Bash script to install the binaries, as presented here. This works on any Linux distribution and on Mac OS X.
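
The idempotent part of such a script can be as small as a version check before doing any work. The following is a minimal sketch under assumptions, not the script referred to above: the function name, the optional second argument and the version number in the comment are all made up for illustration.

```shell
#!/usr/bin/env bash

# Succeeds (exit status 0) when an install is still needed, that is, when
# the node binary is missing or reports a different version than wanted.
# The second argument only exists so the check can be tried without node.
needs_node_install() {
  local wanted="$1"            # hypothetical example: "0.12.2", no leading "v"
  local node_cmd="${2:-node}"
  local current

  # No binary on the PATH: an install is needed.
  command -v "$node_cmd" >/dev/null 2>&1 || return 0

  # node prints its version as "v<major>.<minor>.<patch>".
  current="$("$node_cmd" --version 2>/dev/null)"
  [ "$current" != "v$wanted" ]
}
```

An install script would run its download and unpack steps only when `needs_node_install "$version"` succeeds, which turns repeated runs into no-ops once the wanted version is in place.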

Friday, April 17, 2015

run cygwin sshd under SYSTEM user - The CYGWIN sshd service on Local Computer started and then stopped. Some services stop automatically if they are not in use by other services or programs

I got this error when trying to switch cygwin sshd to run as the Windows SYSTEM user. This is needed if you want to allow cygwin to interact with graphical applications.
The CYGWIN sshd service on Local Computer started and then stopped. Some services stop automatically if they are not in use by other services or programs
The first thing to do is to look into the sshd logs:
$ tail -10 /var/log/sshd.log
/var/empty must be owned by root and not group or world-writable.
The reason for this error is that the SYSTEM user (note the upper case) is not the owner of /var/empty. The exact reason why this happens is explained here. So the solution is simple:
$ chown SYSTEM /var/empty
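
If this fix ends up in a provisioning script, the chown can be guarded so that it only runs when needed. This is a minimal sketch, assuming GNU `stat` (which is what cygwin ships) and a hypothetical helper name, `ensure_owner`:

```shell
#!/usr/bin/env bash

# Change the owner of a path only when it differs from the wanted user.
# Assumes GNU stat, where `-c %U` prints the owner's user name.
ensure_owner() {
  local path="$1" user="$2"
  local current

  # Fail early if the path cannot be inspected (for example, it is missing).
  current="$(stat -c '%U' "$path")" || return 1

  if [ "$current" != "$user" ]; then
    chown "$user" "$path"
  fi
}

# Example matching the fix above: ensure_owner /var/empty SYSTEM
```

Running it a second time does nothing, because the owner already matches.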
Even though I got past this issue, I later realized there was no way to run desktop applications remotely using this method, as explained here and here.