Thinking In Software: June 2012

Friday, June 29, 2012

Microsoft EWS Managed Java API Installation from a POB recipe

It's been a year since I posted about the EWS Exchange API and I wanted to give the managed API a try again.

For some reason Microsoft is still reluctant to put their EWS Managed Java API in a repository from which we can check out the source code. The ant script is buggy (it won't build out of the box from the command line) and, even though the version is 1.1.5 at the time of this writing, it still names the resulting jar file 1.1.0.

In the spirit of automation I am providing here a Plain Old Bash Recipe to build the project. The only thing you have to do manually is download the zip file, as Microsoft enforces a license agreement every time you try to download it (again, no real versioning repository exists with the project content). After running this recipe you will end up with the correct version number in the jar file. No need for Eclipse, just a command prompt, and you are ready to upload the artifact to your Maven repo.

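#!/usr/bin/env bash
#
# @author Nestor Urquiza
#
# Builds Microsoft EWS Managed Java API
#
# Options are set here rather than in the shebang: on Linux, env receives
# "bash -ex" as a single argument and fails to find the interpreter.
set -ex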

EWSJavaAPI_BIN=$1
: ${EWSJavaAPI_BIN:?"Usage: $(basename $0) </path/to/EWSJavaAPI.zip>"}

# Variables are quoted so that paths containing spaces keep working
cp -f "$EWSJavaAPI_BIN" .
fileNameAndExtension=$(basename "$EWSJavaAPI_BIN")
extension="${fileNameAndExtension##*.}"
fileName="${fileNameAndExtension%.*}"
rm -fR "$fileName"
unzip "$fileNameAndExtension"

cd "$fileName"
# Note: sed -i '' is the BSD/OS X form; GNU sed takes -i without the empty argument
sed -i '' 's!<javac srcdir="${src}" destdir="${build}"/>! \
              <javac srcdir="${src}" destdir="${build}"> \
                <compilerarg value="-Xlint"/><classpath> \
                <fileset dir="${basedir}/lib"> \
                    <include name="**/*.jar" /> \
                </fileset> \
                </classpath> \
              </javac>!g' build.xml


cd lib
# Maven Central no longer serves plain HTTP, so the downloads use https
curl -O https://repo1.maven.org/maven2/commons-codec/commons-codec/1.4/commons-codec-1.4.jar
curl -O https://repo1.maven.org/maven2/commons-httpclient/commons-httpclient/3.1/commons-httpclient-3.1.jar
curl -O https://repo1.maven.org/maven2/commons-logging/commons-logging/1.1.1/commons-logging-1.1.1.jar
curl -O https://repo1.maven.org/maven2/jcifs/jcifs/1.3.17/jcifs-1.3.17.jar
cd ../
ant
mv lib/EWSAPI-1.1.0.jar lib/$fileName.jar

I have taken the opportunity to update my project in Google Code repository which now shows a simple SendMail class that sends the members of a given distribution list.

Thursday, June 28, 2012

Test Email over TLS with a POB Recipe thanks to openssl and expect

We already saw how to test connectivity and send an email from a Plain Old Bash script (POB recipe). Here is a recipe to send it over TLS (ideal to use remotely from Remoto-IT to make sure SMTP is working properly in the remote server).

Here is how to invoke the script using Gmail SMTP:

./smtp-tls-test.sh smtp.gmail.com 587  gmailUser@gmail.com gmailUser@gmail.com "From Openssl using authentication over TLS" "Just a test" gmailUser@gmail.com gmailUserPassword

Of course this is offered just as a curiosity, if you wish, because mailx will do the work for us:

$ sudo apt install mailutils
$ echo "body" | mailx  -r "sender@sample.com" -s "subject" -S smtp="host:port" -S smtp-use-starttls -S smtp-auth=login -S smtp-auth-user="username" -S smtp-auth-password="password" -S ssl-verify=ignore "receiver@sample.com"

Send Email using TLS from Windows scripting like VBS

The "CDO.Message" object used in VBS does not work for TLS. So if neither SSL nor anonymous access is allowed within your protected network you will be forced to use an external command. This is where the SendEmail Open Source Project will help. Just download the executable and you will be able to use it to send emails over TLS. Below is an example of how to use it with Gmail over TLS:

C:\>c:\scripts\sendEmail.exe -f gmailUserAddress -t gmailUserAddress -u "from sendEmail using TLS" -m "Just Testing" -s smtp.gmail.com:587 -o tls=yes -xu gmailUserAddress -xp gmailUserPassword

This is handy if you are trying to monitor your event logs like I have posted before.

encode from echo

Every once in a while I get caught forgetting that the echo command appends a trailing newline character. Use the "-n" switch whenever the output of echo is piped as input data to openssl for encoding. Here, for example, is the word 'hello' base64-encoded without the trailing newline:

$ echo -n "hello" | openssl enc -base64
aGVsbG8=

However, when decoding you must not use the '-n' switch:

$ echo "aGVsbG8=" | openssl base64 -d
hello
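The difference is easy to reproduce. Below is a minimal sketch, assuming the coreutils base64 command as a stand-in for openssl enc -base64 (both emit standard base64); the function names are made up for illustration, and printf is used instead of echo -n because echo's flags vary between shells.

```shell
#!/bin/sh
# Encode a string exactly as given, with no trailing newline.
encode_exact() {
    printf '%s' "$1" | base64
}

# Encode a string the way a plain `echo` would send it: with a trailing newline.
encode_with_newline() {
    printf '%s\n' "$1" | base64
}

encode_exact "hello"          # prints aGVsbG8=
encode_with_newline "hello"   # prints aGVsbG8K (the final K comes from the newline)
```

If a password encoded for something like SMTP authentication is rejected for no obvious reason, a stray newline inside the encoded value is one of the first things worth checking.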

Wednesday, June 27, 2012

Send Email from Windows using VBS and optional SSL Authentication

As I have posted before you can configure Windows Servers to send alerts when they detect errors (commonly logged in the Event Log). If your internal SMTP enforces authentication (I hope using SSL) then the scripts will need to be changed a little bit just to add support for it.

Rather than modifying previously posted scripts, this time I am sharing a script that will just send an email from VBS. Optionally you can use authentication if you supply the credentials. Here are a couple of examples showing how to use it to send internal emails without SSL authentication (sampleDomain) and external SSL-authenticated ones (Gmail):

C:\>cscript c:\scripts\events\sendEmail.vbs smtp.company.com 25 corporateUser@company.com anotherCorporateUser@company.com "testing smtp from windows vbs" "Just a test"
C:\>cscript c:\scripts\events\sendEmail.vbs smtp.gmail.com 465 gmailUser@gmail.com corporateUser@company.com "testing smtp from windows vbs" "Just a test" gmailUser@gmail.com gmailUserPassword

Monday, June 25, 2012

SQL Optimization: Warehouse versus Realtime Reports versus Listing Pages

Sometimes a Real Time Report is assumed to be the equivalent of a listing page. However, while they both consume live data, a Real Time Report can take longer to run than a listing page. This is because a user will happily wait for a report scheduled to run at a certain time, but waiting a minute for a web page to render is not a nice experience at all.

On the other hand, Warehouse Reports are fast, and in many cases even faster than listing pages, because they run against data sources that have been optimized for data retrieval (reads) instead of the regular data sources, which are used for both reads and writes. However, Warehouse Reports are valid only if a snapshot of the past is acceptable.

Of course what is fast, or what is old is relative and it is easy to mix Warehouse Reports, Real Time Reports and Web Listing Pages concepts.

From time to time I need to explain the difference to Business people, so I thought of writing it down to avoid repeating myself.

There is a reason why we cannot have a transmitter and a receiver that will work in all electromagnetic frequencies and it is the metric called Gain–bandwidth product.

In short, if you design for higher gain (power to amplify) you have to compromise on bandwidth (the spectrum of frequencies you will be able to amplify). If you want to build a device that is an AM/FM radio, a TV receiver, a cellular phone, a WIFI transceiver, a satellite multi-frequency transceiver and more, you will need multiple circuits (and multiple antennas, more power dissipation etc.), which means bigger equipment. At least that rule is still there 25 years after I learned about the "Gain–bandwidth product". Technology advances and circuit integration gets better, so the space you need gets smaller, but the amount of spectrum in use also grows and so does the demand for different services. No, you cannot have it all.

Knowing the limitations is important so you can make wise decisions in your architecture. It does not matter whether it is hardware or software based.

Architecture is the art of designing a future product. For the product to be real the Architect must design under constraints, making sure the client understands the cost involved in the construction. Usability, expected life, maintainability, and other metrics will decide the final cost of the product. As an architect you are the Chief of the Building Process.

How does this apply to databases? Let us pick MySQL and think about a typical request: Allow me to search for X in [table A].[field1] where [table B].[field 1] is Y but order by [table C].[field 1]. Of course the three tables are related by keys.

Pretty quickly you will find yourself inspecting the query plan (EXPLAIN), just to realize that you can cut query execution times in half or more just by adding some indexes. If you are lucky you can use composite indexes that cover both the filter (WHERE) and the sorting (ORDER BY). However, even if you are lucky, your query might still not satisfy the demands for speed in some cases.

Here is where the Gain-bandwidth product concept helps. Do we really want this listing page in our web interface? Does this long-running, query-based report have to run in real time, or are we actually OK with 1 hour old data? Can we at least cache it? Or is it OK to use a typical data warehouse? Is this response time so crucial that we should forget about normalization for some fields and accept the impact on persistence? Can the query ask for more in the WHERE clause? Can the query be split into one for regular employees, who are commonly supposed to query for more specific data, and a 1 day delayed executive report? Too many questions, and depending on the answers to these and more the Architect should come up with a proposal that makes sense budget and user experience wise.

Business just needs to understand the cost and the Architect for sure will deploy the solution that makes sense for the Enterprise.

As with the "Gain–bandwidth product" there is a "Speed-Normalization product" and depending on the allowed compromise you could even end up deciding for a NoSQL Database approach. So in specifics let us analyze what we are trying to do with the attached query:

Test Email Server POB Recipe

Checking if the email server is working from a particular box is a useful test we need to do from time to time. It makes sense to include that test as part of deployment, monitoring, server configuration and more.

Here is a POB script that can be run locally (or remotely from Remoto-IT for example)

Here is how to call it:

./smtp-test.sh "mail.sample.com" 25 "nurquiza@sample.com" "nurquiza@sample.com" "test from bash" "This is just a test"

In case you need to use TLS check this out.
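The smtp-test.sh script itself is not reproduced on this page. As a rough idea of what such a test can look like, here is a minimal sketch (not the original script) that prints the client side of a plain SMTP session so it can be piped into a tool like nc. The function name, the delay parameter and the use of nc are all assumptions; the pauses are there because SMTP expects the client to wait for each server reply.

```shell
#!/bin/sh
# Print the client side of a minimal SMTP session, pausing between commands
# so the server has time to answer each one.
# Usage: smtp_dialogue <delay-seconds> <from> <to> <subject> <body>
smtp_dialogue() {
    delay=$1 from=$2 to=$3 subject=$4 body=$5
    for line in \
        "HELO $(uname -n)" \
        "MAIL FROM:<$from>" \
        "RCPT TO:<$to>" \
        "DATA"
    do
        printf '%s\r\n' "$line"
        sleep "$delay"
    done
    # Headers, a blank line, the body, and the lone dot that terminates DATA.
    printf 'From: %s\r\nTo: %s\r\nSubject: %s\r\n\r\n%s\r\n.\r\n' \
        "$from" "$to" "$subject" "$body"
    sleep "$delay"
    printf 'QUIT\r\n'
}
```

With nc installed it could be used as `smtp_dialogue 1 "nurquiza@sample.com" "nurquiza@sample.com" "test from bash" "This is just a test" | nc mail.sample.com 25`, reading the server replies on the terminal to see whether the message was accepted. The sketch does not check reply codes, so it is a connectivity smoke test and nothing more.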

Friday, June 22, 2012

Install SSL certificates in your java keystore from POB recipes

When certificates change for domains your Java application uses, most likely you will need to troubleshoot, you will delete by mistake what you did not want to, and what not. It is wise to automate everything that can be automated, of course.

Here is a Plain Old Bash (POB) script to reinstall certificates in the keystore. You can automate the setup using Remoto-IT of course.

#!/bin/bash -e
# certs.sh

USAGE="Usage: $(basename "$0") <domain> <port> <alias> <keystore> <storepass>"

if [ $# -ne 5 ]
then
  echo "$USAGE"
  exit 1
fi

domain=$1
port=$2
alias=$3
keystore=$4
storepass=$5

# The alias may not exist yet, so a failed delete must not abort the script
set +e; "$JAVA_HOME/bin/keytool" -delete -alias "$alias" -keystore "$keystore" -storepass "$storepass" -noprompt; set -e
# Fetch the server certificate and keep only the PEM block
echo | openssl s_client -connect "$domain:$port" 2>/dev/null | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > "/tmp/$domain.cer"
"$JAVA_HOME/bin/keytool" -import -keystore "$keystore" -file "/tmp/$domain.cer" -alias "$alias" -storepass "$storepass" -noprompt

Of course this script can be run locally as well. Here is an example of how to call it on OS X:

./common/certs.sh sample.com 443 sample.com /Library/Java/Home/lib/security/cacerts changeit
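The part of the recipe that tends to look like magic is the sed range expression that keeps only the certificate from the openssl s_client output. Here is a small sketch that isolates it so it can be tried without a network connection; the extract_pem name and the sample input are made up for illustration, and the certificate payload is a dummy.

```shell
#!/bin/sh
# Keep only the PEM certificate block(s) from whatever arrives on stdin.
# -n suppresses default printing; p prints the lines from BEGIN through END.
extract_pem() {
    sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p'
}

# Illustrative input shaped like openssl s_client output.
sample='CONNECTED(00000003)
depth=0 CN = sample.com
-----BEGIN CERTIFICATE-----
MIIBdummy
-----END CERTIFICATE-----
SSL handshake has read 1234 bytes'

printf '%s\n' "$sample" | extract_pem
```

Only the three lines from BEGIN CERTIFICATE to END CERTIFICATE come out, which is the form keytool -import accepts.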

Wednesday, June 20, 2012

Jasper Reports Images in HTML PDF and XLS

One of the quickest options to get images rendered in Jasper Reports, whether it is HTML, PDF or XLS, is to use the "isLazy" image attribute combined with an absolute URL in the imageExpression node.

The report will render images as usual if run as XLS or PDF (embedded images) but when run as HTML the image will render from the absolute URL which is used of course in the "src" attribute.

So consider the below simple JRXML:

<?xml version="1.0" encoding="UTF-8"?>
<jasperReport xmlns="http://jasperreports.sourceforge.net/jasperreports" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://jasperreports.sourceforge.net/jasperreports http://jasperreports.sourceforge.net/xsd/jasperreport.xsd" name="remoteImage" language="groovy" pageWidth="595" pageHeight="842" whenNoDataType="AllSectionsNoDetail" columnWidth="555" leftMargin="20" rightMargin="20" topMargin="20" bottomMargin="20">
 <property name="ireport.zoom" value="1.0"/>
 <property name="ireport.x" value="0"/>
 <property name="ireport.y" value="0"/>
 <background>
  <band splitType="Stretch"/>
 </background>
 <title>
  <band height="79" splitType="Stretch">
   <image isLazy="true">
    <reportElement x="314" y="27" width="62" height="50"/>
    <imageExpression><![CDATA["http://www.nestorurquiza.com/img/nu.png"]]></imageExpression>
   </image>
   <staticText>
    <reportElement x="106" y="51" width="151" height="20"/>
    <textElement/>
    <text><![CDATA[Here is my Logo:]]></text>
   </staticText>
  </band>
 </title>
 <pageHeader>
  <band height="35" splitType="Stretch"/>
 </pageHeader>
 <columnHeader>
  <band height="61" splitType="Stretch"/>
 </columnHeader>
 <detail>
        <band height="125" splitType="Stretch"/>
 </detail>
 <columnFooter>
  <band height="45" splitType="Stretch"/>
 </columnFooter>
 <pageFooter>
  <band height="54" splitType="Stretch"/>
 </pageFooter>
 <summary>
  <band height="42" splitType="Stretch"/>
 </summary>
</jasperReport>

It will correctly render the logo from the URL in any of the formats. For HTML specifically you will get something like:

<img src="http://www.nestorurquiza.com/img/nu.png" style="position:absolute;left:0px;top:0px;height:50px" alt=""/>

The advantage of this method besides simplicity is that static images can be kept in a central repository server which means an image change can be distributed to Jasper Reports or other pages in the web application. The typical example would be skinning a client website for which you need the logo in web pages but also in Jasper Reports.

The main disadvantage is conceptual: A report should be a snapshot and clearly a link is dynamic (the image can change any time in the future). In addition the images server has to be available for debugging purposes.

The alternative method is not to use lazy image loading, in which case the image is pulled from a local path for XLS and PDF, while for HTML a local servlet takes care of rendering the images, which are previously stored in the user session. Here is what is needed in web.xml:

<!-- JasperReports Servlet to render local images for HTML EXporter -->
    <servlet>
        <servlet-name>JasperReportsImageServlet</servlet-name>
        <servlet-class>net.sf.jasperreports.j2ee.servlets.ImageServlet</servlet-class>
    </servlet>
    <servlet-mapping>
        <servlet-name>JasperReportsImageServlet</servlet-name>
        <url-pattern>/jriservlet/image</url-pattern>
    </servlet-mapping>

Here is what is needed in your Controller when rendering with the XHTML/HTML Exporters:

request.getSession().setAttribute(ImageServlet.DEFAULT_JASPER_PRINT_SESSION_ATTRIBUTE, jasperPrint);
exporter.setParameter(JRHtmlExporterParameter.IMAGES_URI, "/jriservlet/image?image=");

Here is what the HTML source code will look like. The numbers refer to the index in the images collection saved in the user session:

<img src="/jriservlet/image?image=img_0_0_0" style="position:absolute;left:0px;top:0px;height:50px" alt=""/>

Wednesday, June 13, 2012

Hadoop, POB and Error: JAVA_HOME is not set

As I recently posted the importance of knowing Plain Old Bash (POB) shell scripting is huge, way more than what many IT Ops and Devs think.

Hadoop, which has been making history lately in pretty much any article you read about Big Data is heavily based on simple concepts and at the same time tightly tied to Bash. Yes POB scripts are the ones responsible for part of the Hadoop implementation Magic (Besides Java of course).

So I see some people complaining about the below error:

Error: JAVA_HOME is not set

And then some people suggesting to hardcode the $JAVA_HOME variable in $HADOOP_INSTALL/conf/hadoop-env.sh besides exporting $JAVA_HOME as a "shell environment variable".

No need for duplication really.

In Debian/Ubuntu for example, if you use pseudo-distributed mode you just have to set your $JAVA_HOME in ~/.bashrc (instead of just in ~/.profile or ~/.bash_profile). The reason is that Hadoop runs commands on remote nodes through a non-login, non-interactive shell (like ssh user@host /path/to/command). Such a shell never reads ~/.profile or ~/.bash_profile, while bash does read ~/.bashrc when it is started by sshd.

If you are running a fully distributed cluster then set JAVA_HOME in /etc/environment
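Here is a minimal sketch of how the ~/.bashrc setting could be applied from a recipe without duplicating the line on every run. The ensure_first_line helper and the JAVA_HOME path are assumptions for illustration. The line goes at the top of the file because, as far as I can tell, the stock Ubuntu ~/.bashrc returns early for non-interactive shells, so anything appended at the bottom would not be seen by ssh user@host command.

```shell
#!/bin/sh
# Make sure <line> is present in <file>, inserting it at the top if missing.
# Running it again is a no-op, so it is safe to call from a provisioning recipe.
ensure_first_line() {
    file=$1 line=$2
    [ -f "$file" ] || : > "$file"
    # -x: whole-line match, -F: fixed string (no regex surprises in paths)
    if grep -qxF -- "$line" "$file"; then
        return 0
    fi
    tmp=$(mktemp) || return 1
    # Write through cat so the original file keeps its owner and permissions.
    { printf '%s\n' "$line"; cat "$file"; } > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

It could be called as `ensure_first_line ~/.bashrc 'export JAVA_HOME=/usr/lib/jvm/default-java'` (adjust the path to your JDK) and verified with `ssh localhost 'echo $JAVA_HOME'`.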

Enable SSH in OSX from command line

Depending on your OS X version one or both of the below commands should be enough to allow SSH access to your Mac:

sudo systemsetup -setremotelogin on
sudo launchctl load -w /System/Library/LaunchDaemons/ssh.plist

Tuesday, June 12, 2012

Debugging Talend Jobs

Today I was debugging a problem with Talend tJSONInput component for which of course debugging is a must have. I noticed I never posted before Talend debugging instructions so here they go:

Click on Designer|Run|Debug Run
Click on the arrow near "Traces Debug" so "Java Debug" shows up.
Click on "Java Debug" and say OK to switch to the Debug perspective. The program will stop in the main method of the Job class.
Place a break point double clicking on the left side of the line number. For example for the error below I searched for "The Json resource datas maybe have some problems..." inside tFileInputJSON_1Process method.
Resume using the green arrow
You can add new code and next time you run the debugger it will be executed (Use the bug symbol to run last debug configuration)

Sunday, June 10, 2012

JRebel for Remote debugging from Eclipse

As I have posted before there are ways to do agile Java Web development through hot code replacement so you do not have to say "Java sucks because you need to recompile to see your changes, that is why I code in you-name-it language". However if you are serious about agility in Java as I said you must consider JRebel. The JVM Hot Swap misses a lot of features provided by JRebel. Sure JRebel is not free but for a buck a day you can get a floating license for 10 team members, or for a buck per three days get your personal license, or if you do not have a company and are trying to get in business with zero capital there is also an option for you: Just open source your project and get JRebel for free, then provide consulting services, make money and pay for JRebel: It is worth it.

I personally do not like servers running in IDEs. I prefer my local environment to be as close as possible to the production environment, so I run SSL, Apache with mod-jk, different IPs for different website projects etc. I also debug all of them, in some cases at the same time, without an IDE crashing or giving me extra trouble. A second advantage of this method is that when you need to debug your real deployment environment you follow the same procedure: just prepare your server to listen on a debug port and connect to it from Eclipse.

Here are the steps I followed to get JRebel to replace the built-in JVM HotSwap capability and allow remote hot code replacement. This was tested with JRebel 4.6.2 + Eclipse Helios + Tomcat 7.0.22 + Maven 1.6.0-31.

Hit Eclipse "Help-Eclipse Marketplace" menu option. Look for JRebel and install it
From Eclipse Preferences choose JRebel and launch the configuration wizard (In version 5 "JRebel Config Centre" link will open all you need within Eclipse interface while previous versions would open a separate java app for configurations). Click on the link so you register and get your license key. Get back and paste it.
You are presented with some options like running your tomcat locally in Eclipse. I use an external tomcat instance so I unchecked that option, then I ignored the instructions to create a new startup and catalina shell scripts. I want to use the original server scripts.
Look at the "Embedded JRebel plug-in" link to find out where jrebel.jar is in your system

Add jrebel.jar and necessary options to setenv.sh:

JAVA_OPTS="-javaagent:/Users/nestor/eclipse-helios/plugins/org.zeroturnaround.eclipse.embedder_4.6.2.201205021255/jrebel/jrebel.jar -Drebel.remoting_plugin=true -Drebel.hotswap=false $JAVA_OPTS"

From Eclipse Preferences - Java - Debug uncheck all "Hot Code Replace" options
Right click on Project and select properties, locate "Maven-Lifecycle Mapping" and insert "clean" as the first goal to "Goals to invoke after project clean"
Make sure Eclipse will build your project automatically: Check "Project-Build Automatically" menu option
Right click the project and choose "JRebel - Generate rebel.xml" It will generate the file inside the main resources folder
If your server JVM is running in a different machine than your Eclipse IDE then right click the project and choose "JRebel - Generate rebel-remote.xml" It will prompt for the url you use to reach your project. Like above the file gets generated by default in main resources which means they both will end up in the classpath. Review the files for correct URLs and id. I use as id the name of the project. If you change id please look into my previous post.
Recompile and redeploy your project so the xml file(s) get to the server
Exclude rebel.xml (and rebel-remote.xml if it applies) from your versioning control. You do not want to mess with other developers settings unless you all use the same IDE, OS etc.
If your server JVM is running in a different machine than your Eclipse IDE then right click the project and choose "JRebel - JRebel Remoting: Automatic Sync". You should get a "Jrebel-Remoting uploaded changes successfully, have fun!" message
Connect to the remote debug port from Eclipse (From Java Perspective, Menu | Run | Debug Configurations | Remote Java Application | Right click and select New | give it a name like "Local" | set Host like localhost | Port like 8000 | click Debug. From that moment on the option will be available when clicking the bug icon arrow), set your break point and hit the page associated to it. You will be able to debug just as you are used to
Now overload the method holding the breakpoint or create a new one that gets called as well; that way you can see that adding a new method does work. Both the old and the new methods will be accepted by the server and available locally for debugging. No more redeployments.
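The step about connecting to the remote debug port assumes the server JVM is already listening on it. Here is a minimal sketch of how setenv.sh could add that, assuming the standard JDWP agent options and the port 8000 used in the example above; with_jdwp is a made-up helper name, not part of Tomcat or JRebel.

```shell
#!/bin/sh
# Return the given JVM options with the JDWP remote-debug agent added once.
# Usage: with_jdwp "<existing options>" [port]   (port defaults to 8000)
with_jdwp() {
    opts=$1 port=${2:-8000}
    jdwp="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=$port"
    case " $opts " in
        *" $jdwp "*) printf '%s\n' "$opts" ;;             # already there
        *) printf '%s\n' "$jdwp${opts:+ $opts}" ;;        # put it in front
    esac
}

# In setenv.sh this would come after the JRebel -javaagent line.
JAVA_OPTS=$(with_jdwp "$JAVA_OPTS")
export JAVA_OPTS
```

suspend=n lets the server start without waiting for Eclipse to attach; use suspend=y only when you need to debug startup code.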

If you (like I do) use external maven to build then you will need to make sure you either refresh the Eclipse project to get latest from target folder or you will have to use clean option *only* from Eclipse.

Notice that you need to install the rebel xml files in every project you want to hot deploy. For instance if you divide your project in a WAR and a JAR then both will need to be configured to use jrebel properly.

Finally, it is always better to make your changes when a breakpoint has already been reached by the debugger. That way the change will trigger the reload, and you can even tell from the movement of the current-line pointer in the Eclipse debugger. However this option might be broken in version 4 as reflected in this "apparent" bug. Apparently it has been fixed in version 5 as I documented there.

Buying JRebel

A license key will be sent to you and from Eclipse Preferences, JRebel, open the link to configure. In the latest JRebel 5 licences are activated from the "JRebel Config Centre" which is the only link. In previous versions there were a couple of links, one of which would open a custom Applet. Now the whole process is integrated inside the Eclipse GUI. In order to activate the new license in the server click "Add External Server" or "Add remote Server" depending if you are running the external server locally or remotely then the JAVA_OPT options will show up. Note that I had to tweak the hints as explained in this possible bug:

-javaagent:"/Users/nestor/eclipse-helios/plugins/org.zeroturnaround.eclipse.embedder_5.0.0.RELEASE-201206201947/jrebel/jrebel.jar" -Drebel.workspace.path="/Users/nestor/eclipse-workspace-test" -Drebel.log.file="/Users/nestor/.jrebel/jrebel.log" -Drebel.properties="/Users/nestor/.jrebel/jrebel.properties"

Updating JRebel for Eclipse

In Eclipse go to "Menu|Help|Eclipse Marketplace" and click the "Update" button for JRebel. An Eclipse restart will be needed. I have tested this going from JRebel 4 to 5. Do not forget to update the JAVA_OPTS to point to the newest version of the library. For example below is the path for the new JRebel5 in my local OSX Eclipse dev environment:

/Users/nestor/eclipse-helios/plugins/org.zeroturnaround.eclipse.embedder_5.0.0.RELEASE-201206201947/jrebel/jrebel.jar

Saturday, June 09, 2012

JRebel: Server responded with an error: ERR_UNKNOWN_REBEL_ID

After changing the rebel-remote.xml id I was getting the below error in Eclipse when trying to sync JRebel.

Server responded with an error: ERR_UNKNOWN_REBEL_ID

It looks like the same problem will occur if you forget to deploy your app containing the xml, like someone posted before, but in my case this was related to a cache issue that I resolved by deleting the cache with the command below and restarting tomcat:

rm -fr /Users/nestor/.jrebel/cache

Sunday, June 03, 2012

Server Header - Security through obscurity

Security through obscurity is sometimes generalized to the extremes: "You are vulnerable unless you hide" OR "You are vulnerable anyway so do not bother hiding".

You should not assume hiding will save you; however, be convinced that hiding is an addition to core security best practices that will ultimately help against some enemies (not all, of course). You need to follow best practices: you will not reinvent a secure protocol between your server and client just because you feel that TLS, being used by everybody, is better understood by attackers and therefore makes you vulnerable. At the same time you will not disclose a stack trace of your backend application, because it could reveal private information.

I stumbled upon a question and a couple of answers. I believe it is not OK to give away all the information about your Web Server, but let us face it, in most cases it takes less than a minute to figure out what the HTTP Connector is. On the other hand, making life easier for the inexperienced attacker who is after his first target does not make any sense either.

But if you ask me whether I prefer to send back "Apache" or "Thor" as my Server header, I do prefer "Apache". There is after all a discipline called Psychology for a reason. Do you really think a real hacker likes easy stuff? Challenge is probably the highest incentive for the most creative (good or bad) ideas. I have hardened servers for some industries where by regulation the "Server" header must not be present. Interestingly enough, just Googling the company with the right keywords and options returns more information than the headers would provide, and then a simple SSL test shows their websites exhibit serious TLS weaknesses.

So obscure to gain extra peace of mind but secure your system following best well known practices.

Saturday, June 02, 2012

DevOps and remote IT provisioning through POB / POS

I just released Remoto-IT, a Plain Old Bash (POB) script to perform Remote IT Management.

There has been a lot of talk in the last couple of years about the DevOps movement. To achieve agility in a software development team it is crucial to have an agile infrastructure team. Collaboration towards Automation is the answer.

A wiki can help and should be used for collaboration, but a scripting language is needed for automation. Developers should be able to write scripts to configure and deploy applications; sysadmins should be able to write them to manage the OS. Ultimately both sides meet and share code that contributes to continuous automation delivery. In terms of security alone, DevOps allows a second pair of eyes on what the server should host, as everything ultimately resides in scripts (recipes). Continuous delivery is impossible without a great level of automation.

There are many products out there which standardize the way you organize server scripts to do remote provisioning. I personally like POB or, more generally speaking, Plain Old Shell (POS) provisioning. Yes, the kind ISPs have been doing for more than a decade. After all, how can you be a Linux sysadmin without knowing bash? I would rather not ask for ksh and csh experience, right ;-) ?

The most important quality of good provisioning scripts is that they should be idempotent, a difficult property to achieve but yet not impossible to get accomplished in any programming language because after all it all comes down to the algorithm (system stories if you like).

Idempotence is not easily achievable in bash but I do not believe it is impossible either. Especially when you want an all or nothing outcome, and you are not interested in "starting from where I left it", POB can definitely be used to provide automated provisioning.

This is of course not a claim to ignore really great tools like Chef or Puppet. POS will always be useful in any case. Just be aware there are commands in bash that are not idempotent so you will need to do checks and cleanups by yourself. For example curl and tar are idempotent while mkdir is not.
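As a small illustration of those checks, here is a sketch of two helpers a recipe could use. The function names are made up, and the symlink case is just one example of the "check the desired state first" pattern, not something taken from Remoto-IT.

```shell
#!/bin/sh
# Plain mkdir fails on the second run; mkdir -p succeeds whether or not
# the directory already exists, so the recipe can be rerun safely.
ensure_dir() {
    mkdir -p -- "$1"
}

# Guard pattern for commands without an "only if needed" switch:
# look at the current state and act only when it differs from the goal.
ensure_symlink() {
    target=$1 link=$2
    if [ "$(readlink "$link" 2>/dev/null)" = "$target" ]; then
        return 0    # already in the desired state
    fi
    ln -sfn -- "$target" "$link"
}
```

Calling either helper once or ten times leaves the server in the same state, which is all idempotence asks for.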

Anything you manually do with computers should be scriptable and, by applying bootstrapping, further automated. That includes building a VM, starting it, configuring networking, installing packages, starting services etc.

When a sysadmin is at the beginning of the journey to configure machines ( especially in Unix/Linux but increasingly in Windows ) the interactions with the shell are the very first step. I have been trying to educate teams about the fact that we do not need to write a paragraph or a statement per command unless we get cryptic. It looks to me that POB can be clear enough to not only do the manual configuration but also to provide automation and documentation together in the form of a script.

Nowadays scripts are called recipes, which form part of some kind of DevOps framework that your developers or sysadmins will use.

Friday, June 01, 2012

HTTP 304 Not Modified Bad Privacy compromise Security

For software to be vulnerable there must be a bug, but we all know bugs are inevitable, so vulnerabilities will always exist.

However something might be vulnerable and yet no major impact will be received from an exploit if the system does not have what the attacker is looking for. Think now of a "honeypot".

Users make decisions in their local Browser that affect their privacy and without knowing it they become vulnerable to an exploit that otherwise won't cause much damage.

Software Engineers can do a lot more than just applying patches when vulnerabilities are discovered. Think about the pill for cholesterol versus changing the lifestyle. Software Engineers can design systems that use caching correctly (for increased performance but with minimum impact on security).

If your application is generating too many HTTP 304 codes (Not Modified) it might mean browsers are asking for resources that should be cached. If on the contrary you do not get 304 ever it might actually mean browsers are caching forever! That might be really bad news for the security of your users.

Take a look at your Server headers and make sure "Expires" and "Cache-Control" directives are used effectively. Get rid of those 304 ensuring your customers will not keep resources locally. The Browser cache is a high Privacy liability, use it wisely to increase the Security of your customers and yours (Remember your application is as weak as the weakest component and your customers/users are indeed part of it).
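A quick way to take that look from the command line is to filter the response headers down to the caching-related ones. The sketch below assumes curl is available to fetch the headers; the two function names and the example URL are made up for illustration.

```shell
#!/bin/sh
# Read raw HTTP response headers on stdin and print only the caching-related ones.
cache_headers() {
    grep -iE '^(cache-control|expires|etag|last-modified|pragma):' | tr -d '\r'
}

# Succeed only if the headers on stdin carry an explicit Cache-Control or Expires.
has_cache_policy() {
    grep -iqE '^(cache-control|expires):'
}
```

For example `curl -sI https://www.sample.com/css/site.css | cache_headers` lists what the server tells the browser to do, and `curl -sI https://www.sample.com/account | has_cache_policy || echo "no explicit cache policy"` can flag pages that leave the decision to the browser.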

I have blogged in the past how to achieve correct caching with tomcat and today I can confirm tomcat 7 does a great job after using that ExpiresFilter. Static content BTW should be ideally served from your HTTP Connector (which must not be tomcat for sure). Apache mod_expires is a great caching implementation.
