Thinking In Software: October 2012

Tuesday, October 30, 2012

Re Brew OSX

cd /usr/local && sudo git reset --hard FETCH_HEAD

Monday, October 29, 2012

JSON viewer and editor from a local HTML

I have looked at several JSON tools out there. So far I like the simplicity of json-editor: JSON is, after all, JavaScript Object Notation, so it makes sense to just have an HTML page which uses JavaScript to manipulate the JSON structure you want to visualize or edit.

Just in case you are afraid of using GitHub, not clear on front-end web development, or simply want to quickly try one more JSON editing/viewing tool, here are the steps to follow:

1. Save the zip file https://github.com/knv/json-editor/zipball/master as "json-editor". Alternatively you can clone the project, but these instructions are meant for those not using git who are still interested in trying yet another JSON editor.
2. Uncompress the zip and double click on index.html.
3. Paste your JSON in "Value" and hit "Save" (no worries, it is just "saving" the JSON in memory).
4. On the right you get the JSON tree representation. As you click on different components you get different paths in "label".
5. You can manipulate the JSON structure by adding or deleting child or sibling elements in any existing element. You can edit an element, and pressing save will again update the local (in-memory) representation of the object. Clicking on the root node shows the whole JSON string again.

Handy, right? I have asked one of the developers what he thinks about supporting JsonPath.

Saturday, October 27, 2012

Talend Component Creation Tutorial

Building custom ETL components is a necessity in any ETL suite. With Talend it is not difficult to create your own components, although it is not straightforward either.

I have written a tutorial that I just released on GitHub, together with a hopefully useful component (tFileInputCSVFilter).

The tFileInputCSVFilter component is just a second step after the initial approach of running the code out of a tJavaFlex component.

So you could basically try to develop your code using a tJavaFlex and once happy you can move to the custom Talend component creation. Of course you can jump right away into the component creation as well.

Thursday, October 25, 2012

Automate security patching in Ubuntu

Important: This worked for Ubuntu 10.10. Here is a POB recipe you can use with Remoto-IT to deploy cron-apt on your servers, correctly configured to install security updates automatically.

While you need to be careful with automated updates and especially upgrades in Ubuntu, security updates should be performed ASAP and they are *most of the time* safe.

If you want to get notifications only when a security upgrade is performed (recommended) then use:

common/debian/cron-apt-security-update-reinstall.sh nurquiza@sample.com upgrade

Test the installation

You can force the process to run right away by changing the cron expression in /etc/cron.d/cron-apt, then inspect the log file in /var/log/cron-apt/log. You should get something like:

CRON-APT RUN [/etc/cron-apt/config]: Mon Oct 2 04:00:01 EDT 2012
CRON-APT SLEEP: 3322, Mon Oct 2 04:55:23 EDT 2012
CRON-APT ACTION: 0-update
CRON-APT LINE: /usr/bin/apt-get update -o quiet=2
CRON-APT ACTION: 3-download
CRON-APT LINE: /usr/bin/apt-get autoclean -y
Reading package lists...
Building dependency tree...
Reading state information...
CRON-APT LINE: /usr/bin/apt-get upgrade -u -y -o APT::Get::Show-Upgraded=true
Reading package lists...
Building dependency tree...
Reading state information...
The following packages have been kept back:
  linux-headers-server linux-image-server linux-server
0 upgraded, 0 newly installed, 0 to remove and 3 not upgraded.

Friday, October 19, 2012

Notes on SpringOne 2012

SpringOne 2012 was full of good information, as usual. The sad part of 2011 was the absence of Rod Johnson, who was supposed to give the keynote but was apparently sick and could not attend. SpringOne 2012 was no different in this regard: as we already know, Rod departed from VMware in July. Even though this is supposed to be a straight technical note, I could not help writing some words of gratitude to the great engineer who saved us from the complexities of J2EE and took us on a journey of IoC that we of course still enjoy. Thank you Rod, we owe you a lot!

Back to the event, here are some of my notes about it.

The classical Spring Triangle (DI/AOP/Portable Service Abstractions) is being re@annotated (Injection Annotations/Composable Stereotypes/Service Oriented Annotations). All this can be defined with two simple words "Annotated Components".

A stereotype defines a noun, for example @Service. Stereotypes can be composed, meaning you can build an annotation that groups several others. A composable stereotype is then just an annotation definition (@interface) which itself carries other annotations. An injection annotation defines a need, for example @Autowired. Service-oriented annotations define a capability, like @Transactional, @Scheduled or @Cacheable.
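
To make the idea of a composed stereotype concrete, here is a minimal plain-Java sketch. It does not use Spring at all: the @Service and @Transactional annotations below are hypothetical stand-ins declared locally, and TransactionalService, OrderService and StereotypeSketch are names I made up for illustration. It only shows that an annotation can carry other annotations and that they can be found through reflection, one level deep.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical local stand-ins, NOT the real Spring annotations.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.ANNOTATION_TYPE})
@interface Service { }

@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.ANNOTATION_TYPE})
@interface Transactional { }

// A composed stereotype: one annotation that groups two others.
@Service
@Transactional
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface TransactionalService { }

// A class only needs the composed annotation.
@TransactionalService
class OrderService { }

class StereotypeSketch {
    // True when the type carries the annotation directly or through one
    // level of meta-annotation (deeper nesting is not followed here).
    static boolean carries(Class<?> type, Class<? extends Annotation> wanted) {
        if (type.isAnnotationPresent(wanted)) {
            return true;
        }
        for (Annotation declared : type.getAnnotations()) {
            if (declared.annotationType().isAnnotationPresent(wanted)) {
                return true;
            }
        }
        return false;
    }
}
```

With this in place, StereotypeSketch.carries(OrderService.class, Transactional.class) finds the capability even though OrderService never mentions @Transactional itself.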

Spring supports a programmatic way of configuring applications. While this can be handy, it looks to me like a potential problem for teams lacking good architectural direction: separation of concerns could easily be violated if it is not used with care. Spring 3.1 already takes advantage of Servlet 3.0: WebApplicationInitializer replaces a lot of XML with Java code. There is a composition model, so some properties can be in web.xml while others are in the Java initializer. There is no overriding mechanism. The application can now be initialized without the help of web.xml. A couple of methods worth mentioning: scan() and register(). A couple of annotations worth mentioning: @Configuration, @Bean.

Spring 3.2, now on GitHub, is more open for contributions. It is expected to be released in December 2012 and features a Gradle-based build.

Spring 3.3 is expected by the end of December next year. It is based on JDK 8, so it will support and use JDK 8 closures (lambda expressions), the Date and Time API (JSR 310), NIO-based HTTP client APIs (getting rid, as we all know, of Jakarta Commons HttpClient), parameter name discovery and the java.util.concurrent enhancements.

Support for XML free JPA setup is now available.

MVC in the browser is becoming more popular, and Spring is promoting it by proposing architectures that put not only the View but also the Controller in the client (in fact even a local Model). This is of course not new; look at Gmail for one popular example, although I would expect Google to have an MVC on the server side of Gmail as well. Spring proposes that only the service and data access layers live on the server side. I personally have advocated for years that the MVC pattern does not necessarily mean you must have all layers on the server. That is now a reality with frameworks like Backbone and Angular, which deploy Controllers on the front end and support local storage in browsers. However, IMO the interaction between back-end and front-end MVCs might suffer from a serious lack of DRY. The proposal to eliminate the Controller completely from the back end could IMO make security an even bigger pain than it is today, not to mention that just moving the Controller to the browser already introduces additional security threats. The reality is that rich applications are demanding more and more, and it looks unavoidable to follow at least *part* of such advice.

Programmatic configuration goes further with MVC Java Config, which can be used instead of the MVC namespace. I would say that Architects now have bigger concerns about the classical "Where did my Architecture go", so my advice would be "Do review the code!".

The current @Cacheable is proprietary, but as soon as Java has the implementation available (JCache, JSR-107) Spring will integrate with it. We personally have been enjoying caching for a while, but it is good to go with the standards (whenever they are kept simple, of course).

ASM and CGLIB are included in the Spring module jars.

Async MVC processing: This uses the Servlet 3 async thread model. There are different approaches depending on the use case: Callable (hands the request over from the servlet container thread to an application thread; the servlet thread is released and the response is resumed when processing finishes), DeferredResult (the result is produced outside of Spring, by some other thread or process, and set on a DeferredResult) and AsyncTask (which wraps a Callable to add features like a timeout). @RequestMapping methods can return any of these objects. Look at the async tab in the Spring MVC showcase. There is a chat sample using Redis, showing a distributed chat application which uses the async concepts. Long polling is supported through these async approaches. All Spring Filters have been updated to support the async features of the specification.
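
The Callable hand-off can be pictured with plain java.util.concurrent, without any Spring or Servlet classes. This is only a sketch of the idea under my own assumptions: AsyncHandoffSketch, Outcome and the "order-" payload are made-up names, the executor plays the role of the application thread pool, and the calling thread blocks on the Future only to keep the example self-contained (a real container thread would be released instead).

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class AsyncHandoffSketch {
    // The produced body plus the name of the thread that produced it.
    static final class Outcome {
        final String body;
        final String producedBy;

        Outcome(String body, String producedBy) {
            this.body = body;
            this.producedBy = producedBy;
        }
    }

    // The caller stands in for the servlet container thread.
    static Outcome handle(final int orderId) {
        ExecutorService applicationThreads = Executors.newSingleThreadExecutor();
        try {
            // What a controller method would return instead of the final value.
            Callable<Outcome> controllerResult = new Callable<Outcome>() {
                @Override
                public Outcome call() {
                    return new Outcome("order-" + orderId,
                            Thread.currentThread().getName());
                }
            };
            Future<Outcome> pending = applicationThreads.submit(controllerResult);
            // Blocking here is a simplification of this sketch only.
            return pending.get();
        } catch (Exception e) {
            throw new IllegalStateException("Async processing failed", e);
        } finally {
            applicationThreads.shutdown();
        }
    }
}
```

The point to notice is that the work runs on a different thread from the one that accepted the request, which is what frees the container thread for other requests.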

There is support for an annotation-driven JMS endpoint model.

Spring will not stop supporting Java 5 and Java 6.

To find out what is new in Spring 3.2, look at Stoyanchev's presentation, accessible from the spring-mvc-32 update GitHub project.

Here is a handy class for building URLs: UriComponentsBuilder.

If you are using Servlet 3 multipart requests, simplify your life using @RequestPart.

Do better exception handling in your applications: @ControllerAdvice lets you define an @ExceptionHandler globally, across all controllers. The same applies to @InitBinder for global binding and to @ModelAttribute. Global exception handling is presented in the Spring MVC showcase project as well.

ContentNegotiatingViewResolver by default looks first at the extension of the URL to determine the View, then at a request parameter (format, which in my own implementations I have been calling ert, or Expected Response Type) and then at the Accept header.

For error handling now we can rely on Custom Error Page in Servlet 3 (error-page in web.xml).

Path segment name-value pairs are supported through @MatrixVariable.

Spring Mobile is a server-side extension to Spring MVC. It complements client-side mobile frameworks. It does device detection, site preference management (a storage engine for user preferences, cookies by default) and site switching (to switch between mobile and desktop sites). LiteDeviceResolver is the default implementation. Other classes to take a look at: DeviceWebArgumentResolver and DeviceResolverHandlerInterceptor. Support for Java configuration will be added soon. The site switcher approach, which redirects to different sites for different devices, does not look very DRY to me.

Platform-targeted sites can be developed using technologies like Lumbar and Thorax. I am not a big fan of this, to be honest. I already explained my position about MVC, DRY and the need for a Business Hub when building web applications. In terms of selecting which front-end pieces should be included in one project or the other, we have been using Maven overlays with success, so I personally am not embarking on this.

CouchDB is still not supported in Spring Data. I suggested looking at the Ektorp project for this. The familiar Spring Template pattern is used to access the supported NoSQL DBs. CrudRepository and PagingAndSortingRepository are some of the interfaces worth mentioning here. By implementing CrudRepository, a JPA entity can be directly exposed via REST, with JSON as the first-class universal format. Even for JPA you can use CrudRepository, which exposes basic operations on entities, and the entities can be exported automatically using REST semantics. I suggested looking at the jpasecurity project for some enhancements to the Spring Data project.

QueryDSL is less verbose than JPA2 Criteria API and Spring has embraced it for their data project.

The Spring projects are on GitHub, which as we already know is great for code collaboration as well as code review. The Spring team calls the code review part "Jurgenization", praising Jurgen's contributions to coding conventions. They are migrating everything they can to Gradle.

WebSockets support is not fully standardized yet, but Spring has been working on it. WebSockets try to solve issues like too many connections, too much overhead and too much burden on the client side. Trading, chat, gaming, collaboration and applications visualizing a lot of data are good candidates to take advantage of WebSockets. The WebSocket Protocol (RFC 6455) uses HTTP to bootstrap but then runs directly on TCP, which makes it a low-overhead solution. The client sends a simple header, "Upgrade: websocket", and the server replies with a 101 "Switching Protocols" status code and an "Upgrade: websocket" header.

d3.js is a good library for visualizing data. The guys at SpringSource have combined it with the vert.x library to rewrite the http://www.bitcoinmonitor.com/ application (which uses long polling) using WebSockets (in https://github.com/cbeams/bitcoin-rt there are several implementations using different technologies). The WebSocket protocol frames can be inspected from Chrome.

The technology is still new and a lot of users are still on browsers which do not support WebSockets. Existing proxies become a problem for WebSocket support; encrypted (wss:) traffic gives better chances of getting around this issue. Some manual configuration could be needed in browsers and server-side proxies. Keeping connections alive is a problem when using WebSockets. Out of the box there is no confirmation of message delivery (even though there is "ping-pong", which can be used to provide "keepalive" and "heartbeat"). In Java, the Java API for WebSockets (JSR-356) is still evolving and most likely will not be tied to the Servlet specification. Spring plus vert.x can be used to develop applications based on WebSockets. SockJS is a great client-side library for implementing WebSocket applications: it falls back to other push mechanisms when the client does not support WebSockets, so it is definitely an excellent API for asynchronous messaging between client and server.

The message from the Spring team: WebSockets is a promising technology as a complement but it is not a silver bullet, and the need for fallback options will be there for a long time. Backward protocol support is an important niche for frameworks, which Spring will probably address in future versions.
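
The HTTP bootstrap of the WebSocket protocol can be sketched with the JDK alone. Besides the "Upgrade: websocket" header, RFC 6455 has the client send a Sec-WebSocket-Key header and the server answer with a Sec-WebSocket-Accept value derived from it (SHA-1 over the key plus a fixed GUID, Base64 encoded). The class and method names below are mine, and this is just the server-side computation, not a working WebSocket server.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

class WebSocketHandshakeSketch {
    // Fixed GUID defined by RFC 6455 for the opening handshake.
    private static final String GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

    // Computes the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key.
    static String acceptFor(String secWebSocketKey) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] digest = sha1.digest(
                    (secWebSocketKey + GUID).getBytes(StandardCharsets.US_ASCII));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }

    // Builds the "101 Switching Protocols" reply the server sends back.
    static String switchingProtocolsResponse(String secWebSocketKey) {
        return "HTTP/1.1 101 Switching Protocols\r\n"
                + "Upgrade: websocket\r\n"
                + "Connection: Upgrade\r\n"
                + "Sec-WebSocket-Accept: " + acceptFor(secWebSocketKey) + "\r\n"
                + "\r\n";
    }
}
```

For the sample key used in RFC 6455, "dGhlIHNhbXBsZSBub25jZQ==", acceptFor returns "s3pPLMBiTxaQ9kYGzzhZRbK+xOo=". After this exchange the connection stops speaking HTTP and carries WebSocket frames over the same TCP socket.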

SpringSource is promoting "Spring MVC used less for page rendering, more for REST API". We saw that in the second keynote and also in several other presentations. Hence the search for a good front-end framework starts, and the options are really huge. Mustache is one good template library, just to mention one; look for alternatives to see the rest. Backbone is one of the most popular MVC JavaScript frameworks.

REST is fully supported in Spring; for example @PathVariable binds a method parameter to a variable segment of the request path. Spring (Data) REST uses Spring HATEOAS. Wikipedia says "The HATEOAS constraint serves to decouple client and server in a way that allows the server to evolve functionality independently", which I still have to see to believe. Design by Contract cannot be ignored, and developing a client that adapts on the fly to new server needs is not something we mere mortals can do in 2012. Just the fact that the available resources can be listed does not guarantee the above, not to mention the concerns about security.

While REST might be the way to go when all your clients speak REST, the reality is that it does not comply with my idea of a Business Hub (BHUB). As a reminder, this idea is what allows me to serve the resources from one entry point and let a View resolver determine what type of response the client needs. This is not just about a JavaScript client framework but also about reports rendered in PDF or Excel, for example. Of course it is kind of impossible to get both a real REST approach and a Business Hub approach, because you cannot force non-rich web clients to use JSON posts. On the other hand your REST applications need as much documentation as a proprietary BHUB approach. With BHUB you play with just normal POST and GET parameters and you certainly get JSON (but not only JSON) back.

Spring Mobile has cool projects like Urban Airship, which abstracts the way you use push notifications for different platforms like the Apple iPhone.

A JBoss presentation discussed the coexistence of Spring and JEE. There will always be room for service simplification. In particular JPA (and JNDI), JTA, JMS, JCA, EJB, Cache (JSR-107), WebSockets (JSR-356), CDI and Bean Validation 1.1 are services that the JEE application server already provides, and Spring will transparently use them (if available) provided the correct configuration exists. So the point is that Spring does support JEE-provided services and not just plain servlet containers like Tomcat, where none of these services are available or deployed by default (there certainly are ways, but with Spring you rarely use those in non-JEE containers). Other capabilities like JSONP are simple enough that Spring will not introduce any simplifications. Multithreading and Spring Batch are still preferred to the vendor-specific JSR 237 (WorkManager), which was withdrawn anyway in favor of the Java Concurrent API (JSR 236), dormant since 2003 but recently announced to come to life in Q1 2013. The use of Arquillian is again proposed to test JEE-based code; I think I already blogged my opinion on this when I posted about JavaOne. The Seam framework/CDI (JSR-299) extensions have been donated to DeltaSpike, and CDI beans can be used from Spring. In fact there is work in progress on bidirectional injection between CDI and Spring. In JEE 7 that bidirectional relation is getting tighter. The integration with JEE is a high priority for Spring.

IoC in JavaScript looked appealing for those not happy with functional programming. IMO separation of concerns should reign, and a front-end engineer cannot say functional programming is not a perfect paradigm for the event-driven nature of UIs.

A migration to a JSR-352 approach was demonstrated, showing how easily Spring adapts, as in fact many ideas of the specification come from the Spring Batch implementation. I have to say this again this year: I will not comment on Spring Batch versus ETL tools, because I believe it is a matter of how you structure your team and probably a subject for a lot of complex considerations that go beyond simple software development. For now I am not planning on using Spring Batch.

Some interesting notes on testing: MockMvc lets you test Controllers. I personally think your behavior tests (with Selenium WebDriver) should catch anything wrong with Controllers; however, there is more to Spring Test MVC. Spring 3.2 adds @WebAppConfiguration (which defaults to src/main/webapp), and @ContextConfiguration defaults to a local file named after the test class followed by "-context.xml" in the same path. Using EasyMock and a factory method, Spring manages to inject mocked objects for testing. Mockito is also supported, through a constructor that takes the to-be-mocked class. In both cases a factory is in charge of generating the mock. MockServletContext, MockHttpSession, MockFilterChain, MockClientHttpRequest and MockClientHttpResponse have been introduced. Here is another concern for Architects: I only hope developers will not put servlet-scope objects (request, context, response, session) in services now that you can mock those from Spring. An ApplicationContextInitializer can be used to avoid annotations or XML when initializing the Spring context for testing.

Spring Test MVC is an independent project which depends on Spring 3.2. Some limitations: no forward or redirect and no JSP rendering; other rendering technologies do work, as they do not depend on a real servlet container. I still believe Selenium WebDriver is the best way to test that tier in any case (granted, the problem is with side effects), but of course there is value in Controller unit tests. IMO this framework creates interesting possibilities for automated security tests, like XSS attacks for example; however, as noted before, JSP won't be supported. You can check not only status, headers and content but also flash attributes, the handler and model content from the Spring context. The method alwaysDo(print()) is used to provide information about the "perform" action. The method andReturn() returns all the main servlet context objects, in case we want to assert more specific data not yet available from the framework. Testing filters is powerful, for example for Spring Security filter testing. HtmlUnit enables using Selenium tests, but again that is not available for JSP.

Learn more about this from the spring-32-test-webapps GitHub project. In addition there are client and server tests in the Spring Framework itself, so get the source code from GitHub and start your own journey of fresh, simple Spring coding.

Notes on JavaOne 2012

Make the Future Java was the slogan for JavaOne 2012.

After spending 5 days x 12 hours in different training sessions at JavaOne, I couldn't help blogging about my impressions on the purely technical side of the event. Please note that a significant portion of the presentations were given by community-driven projects and not by Oracle itself, so some of my notes below reflect information acquired from entities external to Oracle.

Garbage collection optimization is still a time-consuming and complex process that demands a lot of trial and error. The hope is that G1 will come to the rescue of mere mortal programmers.

Troubleshooting JVM performance issues: Oracle is working on bringing all the features of Flight Recorder plus JRockit Mission Control into HotSpot Mission Control. JRockit will be deprecated in 2013. For commercial reasons JVisualVM will not be integrated with Mission Control. Java Mission Control is a graphical tool that provides information about the JVM: the client side of Java Flight Recorder, if you will.

From the origins of RMI all the way to WebSockets we are still trying to get distributed computing right. WebSockets is pure TCP the same way REST is pure HTTP, and, each with its pros and cons, it looks like the community will keep using both in the coming years.

Use the new generics enhancements to make sure a specific class is returned by methods which operate only on interfaces. Guava and Goldman Sachs collections are recognized as great enhancements to the JDK library.

Use command line tools (and of course script them) to know more about your JVMs: jps finds the JVMs running in the system (see the -m and -v options); jcmd is similar to jps for listing, but it can also send commands to the JVM, so it can be used to diagnose it (jcmd <pid> VM.version, for example). A list of commands can be passed in a file via the -f flag. Out of the box it allows for deadlock detection, as it can pull stack traces from the application (this creates a possibility for some interesting monitoring, right?); jstat lists counters from inside the JVM. JVisualVM can be used to take JVM core dump files and analyze them, but jstack will do the same from the command line (you get more power, we would agree). Again jcmd is useful here (jcmd <pid> GC.heap_dump file.dump).
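
The deadlock detection mentioned above is also reachable from inside a Java process through the standard java.lang.management API, which is one way to build that kind of monitoring. The sketch below is my own illustration (DeadlockProbe is a made-up name, and it is not what jcmd runs internally): it asks the JVM's ThreadMXBean for deadlocked threads and returns their names.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

class DeadlockProbe {
    // Names of the threads currently deadlocked in this JVM (empty if none).
    static List<String> deadlockedThreadNames() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // findDeadlockedThreads() returns null when there is no deadlock.
        long[] ids = threads.findDeadlockedThreads();
        List<String> names = new ArrayList<>();
        if (ids == null) {
            return names;
        }
        for (ThreadInfo info : threads.getThreadInfo(ids)) {
            // A thread may have terminated between the two calls.
            if (info != null) {
                names.add(info.getThreadName());
            }
        }
        return names;
    }
}
```

A scheduled task could call deadlockedThreadNames() periodically and raise an alert when the list is not empty.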

Inspecting the JVM can be done through several methods: JMX (jvisualvm, like jconsole, uses JMX for remote access), daemon (jstatd is a daemon that can be run on the server, and jstat can then connect to it; there are no permissions here, so be careful where you run it) and attach (used by jmap, jcmd and lps; only available locally and for the same user). The file /tmp/hsperfdata has a lot of JVM runtime information which is constantly updated by the JVM. Use the jstack command for core files or a non-responsive JVM, and only as a last resort, as it uses the debugger to pull information. The JVM's built-in profiler and tracer use a circular buffer with low overhead to collect information from the JVM.

In the future jcmd should replace jstack, jmap and jinfo. On logging: garbage collector logs have rotation but the rest of the JVM logs do not, so the plan is to unify them.

The Java Discovery Protocol (JEP 168) will be used to broadcast information from the JVM so tools like jvisualvm can be notified.

JRockit Mission Control can be used to find duplicated Strings, for example (candidates for interning, right? :). We know we can use tools like Eclipse Memory Analyzer (MAT) for that, but it would certainly be nice if the JDK itself came with the tools we need as developers (the one-stop shop concept saves time, of course).

Intel presented their SPECjbb2012 results for JDK7. They found no issues with most APIs: New I/O, JAXB, try-with-resources, catching multiple exception types, type inference for generic instance creation, underscores in numeric literals and the concurrent utilities. However, the Fork/Join pool was found to be the big problem: contention and network throughput issues. According to Oracle, JDK8 is simplifying this API, so they will probably correct the detected issues.

The need to move to Java 7 is clear. Just to mention one fact: even though there is a commitment to patch security bugs for JDK6, support for it will cease in 2013. But there is more.

Find out what is new in JDK7 from the JDK7 release notes website. Here are some of those features: strings in switch, the diamond operator, simplified exception handling (multi-catch and try-with-resources) and better garbage collection (the basis of G1).
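
Here is a small sketch that puts the Java 7 language features listed above side by side. The class, the method names and the month abbreviations are made up for illustration; only the syntax is the point.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class Java7FeaturesSketch {
    // Strings in switch: before Java 7 this needed an if/else chain.
    static int daysIn(String month) {
        switch (month) {
            case "feb":
                return 28;
            case "apr":
            case "jun":
            case "sep":
            case "nov":
                return 30;
            default:
                return 31;
        }
    }

    // Parses one integer per line; returns an empty list on any failure.
    static List<Integer> parseLines(String text) {
        // Diamond operator: the type argument is inferred.
        List<Integer> numbers = new ArrayList<>();
        // try-with-resources: the reader is closed automatically.
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                numbers.add(Integer.parseInt(line.trim()));
            }
        } catch (IOException | NumberFormatException e) {
            // Multi-catch: one handler for two unrelated exception types.
            numbers.clear();
        }
        return numbers;
    }
}
```

Calling parseLines with "1", "2" and "3" on separate lines gives the three numbers back, while any non-numeric line makes the method return an empty list through the multi-catch branch.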

G1 is specially designed for big heaps (above 6GB). It is better than CMS, especially for fragmented heaps. It works by dividing the memory into regions which are heuristically selected for garbage collection (divide and conquer, right? ;-). It can be tested in JDK7u4+ with "-XX:+UseG1GC". CMS GC will be deprecated soon.

Contention is avoided in Date (from Hashtable to ConcurrentHashMap). There are BigDecimal improvements and String-to-byte conversion improvements.

Java upgrades are now supposed to come every two years. Skipping versions means a bigger update gap. IntelliJ was presented as the only tool capable of automated refactoring for the migration to Java 7. Personally I recommend looking into the currently open bugs before deciding on an upgrade to Java 7, of course, but you should be planning for it.

There are performance improvements in JDK7, specifically in JDBC, JAX-WS, JAXB, java.io and async I/O.

Here is a recipe for your JDK7 migration: Compile to java 6 using jdk7 first. Test for some time, then upgrade the Runtime to Java 7. Finally migrate code to Java 7.

Java 7 has a stricter API, so expect some incorrect assumptions you have made to break parts of your code; for example SortedKey must take as input an Object which implements the Comparable interface.

OpenCL is used today in JDK7; the integration will continue in JDK8 and be finalized in JDK9, bringing full abstraction to the developer while the JVM takes advantage of GPU computing. Project Sumatra promises to bridge the gap between Java and GPUs. MMUs can allow sharing the virtual address space (heterogeneous computing). There is important collaboration with Intel that might even lead to shipping Java in hardware in the near future ;-)
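
I do not know of a JDK class called SortedKey, so the sketch below assumes the note refers to the sorted collections such as TreeMap. From Java 7 on, TreeMap checks the very first key as well, so putting a key that is not Comparable (and without supplying a Comparator) fails immediately with a ClassCastException. StrictSortedKeys and acceptsKey are names I made up.

```java
import java.util.Map;
import java.util.TreeMap;

class StrictSortedKeys {
    // True when a fresh TreeMap with natural ordering accepts the key.
    static boolean acceptsKey(Object key) {
        Map<Object, String> sorted = new TreeMap<>();
        try {
            // TreeMap compares the key with itself on the first put,
            // which requires the key to implement Comparable.
            sorted.put(key, "value");
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }
}
```

A String key is accepted because String implements Comparable, while a plain new Object() is rejected. Code that relied on a single non-comparable key slipping through is the kind of incorrect assumption that breaks after the upgrade.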

JDK8 will remove PermGen space with the data going to the heap and native memory.

The London Java User Group has been praised for their work on 'adopt a JSR' and their contributions to the JCP. The message is: Oracle is taking the interaction with the community very seriously, so they are asking us to contribute.

Java embedded had a Perrone Robotics presentation (the demo failed as expected: Murphy's Law).

The best moment of Java embedded IMO was when Liquid Robotics (presented by Gosling) showed how they can control thousands of little ships which move using wave energy in a mechanical fashion. They generate their own energy to communicate data from their sensors via GSM or satellite, depending on how close they are to GSM networks. A piece of engineering.

Use the process to change the process is what Oracle is expecting from the JCP. We certainly are looking forward to it.

A considerable part of the JavaOne talks was dedicated to the promise of lambdas (closures) in Java 8. Lambdas are nothing more than anonymous functions, but they are about more than new syntax: look at the use cases they cover inside the JDK code itself. Java has been behind most programming languages in this regard, BTW. You can learn more about Project Lambda on the lambda-dev mailing list. I heard more than once the statement: developers are looking into Scala for features that are not supported in Java. Oracle is listening to the community and we can expect Java to get richer. Lambdas abstract behavior just like generics abstract type. Code is treated as data (behavior can be stored in variables). Lambdas are more about the what and less about the how: for example, instead of the client being in charge of managing the loop, the library is in charge of the internal iteration. To support this in existing libraries, interfaces can now provide default methods. This rapidly brings up a lot of questions about multiple inheritance, and here is the explanation from the JDK team: interfaces already provide a multiple inheritance mechanism for types; default methods extend multiple inheritance to behavior BUT not to state, which is the real problem in C++.
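
A short sketch of those ideas, written against the syntax that eventually shipped in Java 8 (which may differ from what was shown at the conference). All names are made up for illustration. It contrasts a client-managed loop with library-managed iteration, stores behavior in a variable, and shows a class inheriting behavior, but no state, from two interfaces through default methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

class LambdaSketch {
    // Two interfaces contributing behavior (but no state) via default methods.
    interface Greeter {
        default String greet(String name) {
            return "Hello, " + name;
        }
    }

    interface Shouter {
        default String shout(String text) {
            return text.toUpperCase(Locale.ROOT) + "!";
        }
    }

    // Multiple inheritance of behavior: nothing to implement here.
    static class Announcer implements Greeter, Shouter { }

    // Behavior stored in a variable, like any other piece of data.
    static final Function<String, Integer> LENGTH = String::length;

    // External iteration: the client manages the loop (the "how").
    static List<Integer> lengthsWithLoop(List<String> words) {
        List<Integer> lengths = new ArrayList<>();
        for (String word : words) {
            lengths.add(word.length());
        }
        return lengths;
    }

    // Internal iteration: the library drives the loop, the client only
    // says what to do with each element.
    static List<Integer> lengthsWithLambda(List<String> words) {
        List<Integer> lengths = new ArrayList<>();
        words.forEach(word -> lengths.add(LENGTH.apply(word)));
        return lengths;
    }
}
```

Note that forEach is itself a default method on the Iterable interface, which is how existing collections gained it without breaking their implementations.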

JavaFX is the de facto standard for building native applications. AWT, while providing OS-specific native components, lacks a lot of the bells and whistles that Swing came with; on the other hand Swing's lack of support for OS-specific native UI features is calling for its end of life.

The JavaFX web view and JFX panel create a good opportunity to construct hybrid applications (native UI with JavaFX + HTML5 + JavaScript). A clone of jVisualVM built with JavaFX was presented.

Use Solaris truss and Unix/Linux strace to debug database performance issues.

Nashorn (Naz-horn is the right pronunciation) brings JavaScript inside the JVM. A demo was presented using Mustache as the JavaScript templating engine. It scales well and runs on small devices like the Raspberry Pi. The engine is 20 times faster than Rhino. The Nashorn implementation relies heavily on invokedynamic. Of course shebang is supported, so Nashorn can be run from the command line as well. There is node.jar, which is a port of the Node.js API. This is interesting news: Node could benefit from existing Java services, and Java itself from the power of Node.js.

JEE 7 includes in the platform key features "to avoid the use of proprietary frameworks" and I quote it. I will be posting soon my notes about SpringOne BTW ;-)

Cleaner API is a mission for JEE. JMS is so simplified that it looked to me like Apache Camel code.

I heard the word POJO a lot, and not just Beans ;-)

The web socket API looks really clean, same for batch with annotations and Java Temporary caching.

DI is heavily used across the whole JDK.

There was a demo on web sockets called Angry Bids. All built on top of JEE using a REST approach.

Doing a remotely retrospective we reviewed DCE, COM, CORBA, RMI, RMI/IOP, SOAP, REST, Websockets (web sockets is just plain TCP)

Look into JDK secure coding guidelines

Software Archeology is unfortunately a common challenge, especially when you are a consultant or simply switching jobs. The amount of legacy and undocumented code makes your life difficult and we discussed how to mitigate this reality. Finding behavior is reduced to documenting using activity and sequence diagrams. Finding structure is about deployment, component and class diagrams. I would add to the equation (if not favoring it as highest priority) User Stories. Some tools can help here like trace based analysis using byte codes or aspects. Tools like mission control can help but it does not provide the order of method calls. There are tools that allow to have an output like the one from strace but from the JVM. We can generate system dumps and then analyze them after. IBM hosts in their website some of these fee tools (JVM trace for example)

Codename One presented how to build iPhone Applications from Java. I share Martin Fowler's opinion on this issue

Verisign presented JEE security in practice. You can find information they maintain in here. They discussed the use of HttpServletRequest#authenticate(), @ServletSecurity, HttpServletRequest#logout() and HttpServletRequest#[getRemoteUser(), getUserPrincipal(), isUserInRole()]. It is not always possible, but whitelisting is always preferred. Use declarative security first and then, as needed, programmatic security. It was clear to me how far ahead Spring Framework is in terms of security in comparison with JEE in the web tier. IMO avoiding vendor lock-in is precisely where Spring excels, so claiming that Spring Security cannot be compared with official JEE just because the latter is a standard is not a wise statement as far as I can tell. For one, all vendors end up adding their own security features to their application servers, so you will be locked in anyway.

In Java 8/9 we can expect a Modular Java Platform (Project Jigsaw). It allows packaging ME and SE together, and it will try to resolve the jar hell problem and scalability (down to small devices like the Raspberry Pi and up to the cloud like Oracle Exalogic Elastic Cloud T3-1B). Performance is expected to improve in terms of both download and startup time. A couple of comments on language keywords: the module keyword allows for organization of Java packages, and the public keyword loses its meaning, as a type is no longer public to the outside unless exported.

One presentation promoted agile JEE development using JBoss, IntelliJ and JSF, including the use of Arquillian for testing, which basically had to restart a servlet container every time a JUnit test was triggered. IMO that was really not that agile.

JDK Enhancement Proposals (JEP) promise to bring more community participation to Java as an open standard. The OpenJDK project is after all the incubator for new features of the Oracle JDK (HotSpot). As an example of one of the current JEPs, boxing will be removed at some point. Just search for JEP to get an idea of the new features and enhancement proposals.

JEP 159: Enhanced Class Redefinition is in the implementation phase. This will allow HotSpot to support redefinition of classes, method signatures and more.

A clarification for the meaning of @deprecated within JDK code: For the JDK team it does not mean it will disappear from the source code. The reason is backward compatibility.

Check that your application is correctly using all cores. Modern computers use NUMA, so use the optimization flag -XX:+UseNUMA to allow a more optimal usage of memory. A bunch of other flags for you to look at: -XX:StringTableSize (interned Strings) and -XX:+UnlockExperimentalVMOptions, which allows using -XX:+UseG1GC even in JDK 6u21+, among others.
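
As a quick sanity check for the flags above, here is a minimal sketch that prints how many cores the JVM sees and which arguments it was started with. The class name JvmFlagsReport and the helper hasFlag are made up for this example. Note that it only reports what the JVM was given; it does not prove the application actually keeps all cores busy.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

public class JvmFlagsReport {

    /** Returns true if the given flag appears verbatim among the JVM arguments. */
    static boolean hasFlag(List<String> jvmArgs, String flag) {
        return jvmArgs.contains(flag);
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself (not to the main class).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();

        System.out.println("Available cores: " + Runtime.getRuntime().availableProcessors());
        System.out.println("JVM arguments:   " + jvmArgs);
        System.out.println("UseNUMA set:     " + hasFlag(jvmArgs, "-XX:+UseNUMA"));
    }
}
```

Run it for example with "java -XX:+UseNUMA JvmFlagsReport" and compare the output with a run that omits the flag.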

Tuesday, October 16, 2012

Install a Custom Talend Component

Talend custom components are a nice way to work around limitations like bugs and missing features. Here is how you install them (tested in version 4.2.3):

Download the component (most likely a zipped file) from a provider (most likely from Talend Exchange)
Uncompress the zip making sure it contains all files inside of the root of the resulting directory
Copy the directory to plugins/org.talend.designer.components.localprovider_$TALEND_VERSION/components/
Restart Talend and access your component

Some components could fail to be recognized due to issues with the XML declaration schema, which you can find with a command similar to:

$ find $TALEND_HOME/ -name "Component.xsd"
/Users/nestor/Downloads/TOS-All-r67267-V4.2.3//configuration/org.eclipse.osgi/bundles/2247/1/.cp/model/Component.xsd

You can validate the component xml against the schema using an online service like http://xsdvalidation.utilities-online.info. That is how I found for example the tFTPGetFile was missing the node. As a side note I also had to repoint to the module edtftpj-1.5.6.jar from the GUI for this component.
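
If you prefer not to paste files into an online service, the same check can be done locally with the javax.xml.validation API that ships with the JDK. This is only a sketch: the class name ComponentXmlValidator is made up, and the two paths are assumed to be passed on the command line (the Component.xsd found above and the XML descriptor of the component you are checking).

```java
import java.io.File;
import java.io.IOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class ComponentXmlValidator {

    /** Returns null when the XML is valid against the XSD, otherwise the parser's error message. */
    static String validate(File xsd, File xml) throws IOException {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(xsd);
            Validator validator = schema.newValidator();
            // With no custom error handler, the first validation error is thrown as a SAXException.
            validator.validate(new StreamSource(xml));
            return null;
        } catch (SAXException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to Component.xsd, args[1]: path to the component's XML descriptor
        String error = validate(new File(args[0]), new File(args[1]));
        System.out.println(error == null ? "Valid" : "Invalid: " + error);
    }
}
```

The error message usually names the element the schema expected, which is the kind of hint that points at a missing node.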

Note that you can avoid restarting Talend to get your components recognized and ready to be used. The Generation Engine initialization is responsible for recompiling the javajet templates. This is triggered when you first load Talend, but it can also be triggered by pressing shift+ctrl+f3 (add fn if using a Mac).

If you find out the component is unable to load the jar files it needs, or you see any other weird behavior, consider clearing the cache by deleting the file \configuration\ComponentCache.javacache and restarting Talend afterwards.

Friday, October 12, 2012

Skipping lines on top of a file with Talend

This is straightforward in Talend. Just use the tFileInputFullRow "header" setting, which as per the help defines the "Number of rows to be skipped at the beginning of a file".

I hurried too much in providing a solution using the tJavaFlex component, which I no longer need thanks to the answer in the Talend forums.

CSV Splitter or Filter with Talend Java

The Data Team was in need of a nonexistent Talend behavior.

Something that we could call the CSVSplitter or the CSVFilter: a component that would take a CSV file and output a row only if a lookup column matches certain content.

Of course you might think at first of a combination of tFileInputDelimited and tFilterRow, but that would not work if you do not know the schema. We need some schemaless or dynamic schema component for this use case.

Jump directly to learn how to get this done from a component or read below to understand how you can do this from tJavaFlex and later build your own component with similar code.

Here is a project that shows this proof of concept. It parses an inputFile using a delimiter, and outputs only the lines where the lookupColumn has a specific lookupValue (four parameters). Below is a screenshot of the POC. I needed to use the tFilterRow because tJavaFlex will output blank lines when there is no output from code:

This approach has a big advantage. Instead of having to create a job, subjob or project per schema to parse, a single job can take care of all your CSV splitting or filtering needs.

You can test the project with the below file:

person| city
Paul| Miami
John| Boston
Mathew| San Francisco
Craig| Miami

Change the lookupColumn between person and city and change the lookupValue to see how it filters the rows. Change the delimiter to test that as well.

Below is the code for the import, begin, main and end methods, with the addition of a new requirement: start parsing the file at a given row (starting at 0) where the header is expected to be.

Import:

import com.csvreader.CsvReader;
import java.io.ByteArrayInputStream;
import java.io.BufferedReader;
import java.io.ByteArrayOutputStream;
import java.io.FileReader;
import java.io.InputStream;
import java.io.InputStreamReader;

Begin:

BufferedReader reader = new BufferedReader(new FileReader(context.inputFile));
ByteArrayOutputStream out = new ByteArrayOutputStream();
int rowNumber = 0;
String line = null;
while ((line = reader.readLine()) != null) {
  if(rowNumber >= context.headerRowNumber) {
    out.write((line + "\n").getBytes());
  }
  rowNumber++;
}
        
InputStream is = new ByteArrayInputStream(out.toByteArray());
CsvReader csvReader = new CsvReader(new InputStreamReader(is));
char delimiter = context.delimiter.charAt(0);
char textQualifier = csvReader.getTextQualifier();
csvReader.setDelimiter(delimiter);
csvReader.readHeaders();

String[] headers = csvReader.getHeaders();
StringBuffer sb = new StringBuffer();
for(int i = 0; i < headers.length; i++ ) {
  String header = headers[i];
  sb.append(textQualifier + header + textQualifier);
  if( i != headers.length - 1 ) {
    sb.append(delimiter);
  }
}
//System.out.println(sb);
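// Tracks whether the header line has already been emitted together with a matching row.
boolean headerWritten = false;
while (csvReader.readRecord()) {

Main:

String lookupValue = csvReader.get(context.lookupColumn);
//System.out.println("'" + context.lookupColumn + "'|'" + context.lookupValue + "'|'" + lookupValue + "'");
if(lookupValue.equals(context.lookupValue)) {
  //System.out.println(csvReader.getRawRecord());
  if( !headerWritten ) {
    // Prepend the header to the first matching row, even when that is not the first record of the file.
    row2.line = sb.toString() + "\n" + csvReader.getRawRecord();
    headerWritten = true;
  } else {
    row2.line = csvReader.getRawRecord();
  }
}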

End:

}
csvReader.close(); 
out.close();
reader.close();
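
For readers without Talend at hand, the same filtering idea can be sketched as a standalone Java class. This is not the tJavaFlex code above: it uses only the JDK, splits on the delimiter with String#split instead of CsvReader (so text qualifiers are not handled), and it emits the header right before the first matching row. The class name CsvFilterSketch and the method signature are made up for this example; the parameters mirror the four context variables plus headerRowNumber.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class CsvFilterSketch {

    /** Returns the header line followed by every row whose lookupColumn equals lookupValue. */
    static List<String> filter(BufferedReader reader, String delimiter, int headerRowNumber,
                               String lookupColumn, String lookupValue) throws IOException {
        List<String> out = new ArrayList<String>();

        // Skip everything above the header row; the last line read is the header itself.
        String headerLine = null;
        for (int row = 0; row <= headerRowNumber; row++) {
            headerLine = reader.readLine();
            if (headerLine == null) {
                return out; // file is shorter than headerRowNumber
            }
        }

        // Locate the lookup column by name, ignoring surrounding white space.
        String splitRegex = Pattern.quote(delimiter);
        String[] headers = headerLine.split(splitRegex, -1);
        int lookupIndex = -1;
        for (int i = 0; i < headers.length; i++) {
            if (headers[i].trim().equals(lookupColumn)) {
                lookupIndex = i;
                break;
            }
        }
        if (lookupIndex < 0) {
            return out; // unknown column: nothing can match
        }

        boolean headerWritten = false;
        String line;
        while ((line = reader.readLine()) != null) {
            String[] fields = line.split(splitRegex, -1);
            if (lookupIndex < fields.length && fields[lookupIndex].trim().equals(lookupValue)) {
                if (!headerWritten) {
                    out.add(headerLine);
                    headerWritten = true;
                }
                out.add(line);
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        String sample = "person| city\nPaul| Miami\nJohn| Boston\nMathew| San Francisco\nCraig| Miami\n";
        BufferedReader reader = new BufferedReader(new StringReader(sample));
        for (String row : filter(reader, "|", 0, "city", "Miami")) {
            System.out.println(row);
        }
    }
}
```

With the sample file from above, looking up city = Miami keeps the header plus the rows for Paul and Craig.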

Putting it all in a Talend Component

I have built a Talend component that encapsulates the logic presented here. It is included in a github project which contains a tutorial on how to build Talend custom components.

To use it you just need to configure the component to parse a file like the above. Look at the picture below for a usage example:

Tuesday, October 09, 2012

iReport Attribute 'uuid' is not allowed to appear in element

Some renegades still refuse to use Linux as a development environment when working with Java, JasperReports, Talend etc. OSX so far looks OK, but Windows one way or the other is always bringing issues.

Today I had to spend some time with the iReport Designer tool on Windows. We wanted to upgrade from version 4.1.3 to 4.7.1, so we opened the old reports in 4.7.1, compiled them and ran them. Everything seemed to be perfect until a report was modified, in which case version 4.7.1 would behave like version 4.1.3: basically it did not understand the new XML schema:

Error loading the report template: org.xml.sax.SAXParseException: cvc-complex-type.3.2.2: Attribute 'uuid' is not allowed to appear in element 'jasperReport'

I could not find a way to make 4.7.1 import settings from 4.1.3 without breaking its ability to correctly parse the JRXML, which in newer versions contains the uuid attribute in multiple nodes.

So my only option was to tell the renegades to:

Close iReport
Delete the 4.7.1 settings directory. If you installed iReport on the C drive, here is the command to use. Otherwise locate the directory and delete it:
```
rmdir /s "c:%HOMEPATH%\.ireport\4.7.1"
```
Start iReport, canceling the import of settings from 4.1.3

Bottom line is it looks like in Windows an iReport upgrade will result in losing your previous settings.

Delete old files except for certain directories and files with one liner bash

DISCLAIMER: Do understand what you are doing before proceeding. I am not responsible for your own actions. I just make public useful code which might become harmful in the wrong hands.

Here is a one liner "find" command that allows you to iterate through all files inside a given directory providing exceptions for certain files and directories. With the result you can run any command.

Let me read the example below for you: starting at the /home/ directory, find all files (-type f) older than 30 days (-mtime +30) and print their full path names followed by a NUL character so white spaces are correctly interpreted (-print0), ignoring (-prune) the directory "nestor" and any hidden files (.*). Then list (ls) the NUL-terminated items, not running the command at all if there are none (xargs -0 -r). The "-o" switch says "Do not evaluate the next expression if the previous is true", which is why the exceptions go first.

$ find /home/ -type d -name "nestor" -prune -o -name ".*" -prune -o -type f -mtime +30 -print0  | xargs -0 -r ls -al

Clearly you can replace "ls -al" with "rm" once you are confident all those files can go away.

On a related issue, it is common to forget the "-mindepth 1" option, which basically tells find "do not list in your 'findings' the start directory". Imagine you want to delete anything below /opt/tmp/realtime_temp. If you do not use the "-mindepth 1" option, the directory itself will be deleted as well. So do use the flag if you are not trying to delete the start directory. I have seen some Linux installations delete the dir while others won't ... go figure.

find /opt/tmp/realtime_temp/ -mindepth 1 -mtime +5 -exec rm -Rf {} \; > /dev/null

Another related issue is that 'find' needs the '-depth' flag in order to correctly remove all files and directories matching certain rules when using -exec or piping to xargs (if you use -delete, that option implies -depth as per the man pages). Not using '-depth' results in errors like 'No such file or directory', as 'find' tries to execute the remove command for files contained in directories that it already removed.

find /opt/tmp/realtime_temp/ -depth -mindepth 1 -mtime +5 -exec rm -Rf {} \; > /dev/null
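
To make the children-before-parents order and the "skip the start directory" rule more tangible, here is a rough Java sketch of the same two ideas. It is not a port of the find command: it deletes files older than a cutoff, then removes any directory that ends up empty (whatever its age), and never removes the start directory. The class name OldFileCleaner and its methods are made up for this example, and the same disclaimer applies: understand it before pointing it at real data.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.concurrent.TimeUnit;

public class OldFileCleaner {

    /** Deletes files last modified before cutoffMillis, then directories left empty, keeping start itself. */
    static void clean(final Path start, final long cutoffMillis) throws IOException {
        Files.walkFileTree(start, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                if (attrs.lastModifiedTime().toMillis() < cutoffMillis) {
                    Files.delete(file);
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
                if (exc != null) {
                    throw exc;
                }
                // Called after the directory's contents were visited (the "-depth" order).
                // The start directory is never removed (the "-mindepth 1" rule).
                if (!dir.equals(start) && isEmpty(dir)) {
                    Files.delete(dir);
                }
                return FileVisitResult.CONTINUE;
            }
        });
    }

    static boolean isEmpty(Path dir) throws IOException {
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            return !entries.iterator().hasNext();
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0]: start directory; the 5 days match the -mtime +5 used above only approximately.
        long fiveDaysAgo = System.currentTimeMillis() - TimeUnit.DAYS.toMillis(5);
        clean(Paths.get(args[0]), fiveDaysAgo);
    }
}
```

Because a directory is only handled after everything inside it, there is no attempt to touch files in a directory that is already gone, which is the situation behind the 'No such file or directory' errors.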
