Thursday, December 15, 2011

Two hours with Rasmus Lerdorf, the creator of PHP

I had the pleasure of meeting Rasmus Lerdorf, the creator of PHP, in a two-hour session organized by the South Florida PHP Users Group at Nova Southeastern University.

As one of the earliest PHP adopters I have a lot of affection for the language, and it was great to see so many young people in the room together with the not so young like me ;-) and some older than me. It is great to see a language going on for so long. I would say it was a celebration of more than 15 years of PHP coding.

Even though I have been working more with Python, Ruby and Java for the last 5 years, PHP always comes back: I need to patch some software built in PHP, configure it, or even grab a quick snippet and modify it to get my work done.

So, with a lot of quotations, here is an extract of the introductory ideas that this MIT-awarded "top 100 innovators in the world" honoree shared with us.

The idea behind PHP was always "the need for speed and performance". Rasmus "did not worry about correctness but about resolving the problem". He is a guy who "does not love to program", and that is precisely why you want to use PHP: what he actually loves is "to solve problems".

"You have to add non scalable features in order to make PHP not scalable". The "documentation was always done before the function was implemented". He constantly "insisted on having lots of examples as part of the documentation".

PHP was born while "thinking in the ecosystem". Basically, shared hosting demanded control over memory and CPU (QoS) in order to adapt existing servers to the needs of multiple applications.

PHP "runs crappy code extremely fast" so it is better for startups than other languages like Java. It simply scales well.

I asked why he considered Java not scalable and he clarified:
-It does scale, but out of the box it comes with features that allow request state to stick in memory, something PHP does not allow by default. Because of this, going beyond one server is more difficult in Java.

Of course one can argue that sticky sessions are enough for most projects. Yes, if a server goes away the user session will be lost and the user will need to log in again, but things like remember-me will help (not that I really like it). No big deal considering how rarely this actually happens. I would say nobody has complained to me about the clusters I have set up, where a shared clustered session was used only in really unusual circumstances.

He described some features that make PHP even faster today:
  1. libevent allows event-driven programming à la Node.js.
  2. ZeroMQ for better and simpler socket programming. I have to add that messaging is something I would say will gain more and more momentum within the software community. The world is asynchronous, as the creator of the Erlang language said.
  3. FastCGI Process Manager (FPM) allows dividing the workload while sending jobs to workers. This is the PHP way to say no to threads.

Rasmus talked about performance and brought some examples providing some guidance to improve it:
  1. Use your datastores wisely. Do not try to replace SQL with NoSQL; it simply does not work that way.
  2. Turn down error reporting in production (error_reporting(-1), which reports every error level, is expensive).
  3. Use strace: Look for ENOENT, excessive stats (lstat), check your realpath_cache_size.
  4. Use a profiler: callgrind, xdebug, xhprof, xhgui.
  5. Set default timezone (look at phpinfo).
  6. Watch for inclusions. Too many inclusions degrade performance. An example of a bad application is Magento, which includes thousands of files.
  7. HipHop-PHP can be used to do static analysis. Its authors need to parse PHP better than the PHP parser itself because their goal is to translate PHP to native code. Be prepared to wait for the compiler, but this is a good exercise to find problems in your code.
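
A few of the tips above can be checked from PHP itself. This is only a minimal sketch of a hypothetical production bootstrap: the timezone and the chosen error levels are my own assumptions, not values Rasmus gave.

```php
<?php
// Hypothetical bootstrap; every value here is an illustrative assumption.

// Set the timezone explicitly so PHP does not have to guess it.
date_default_timezone_set('America/New_York');

// Report only fatal and parse errors instead of every level (-1),
// and never display errors to visitors.
error_reporting(E_ERROR | E_PARSE);
ini_set('display_errors', '0');

// Bytes currently used by the realpath cache versus the configured limit.
// Usage close to the limit suggests raising realpath_cache_size in php.ini.
$cacheUsed  = realpath_cache_size();
$cacheLimit = ini_get('realpath_cache_size');

// Number of files pulled in so far by include/require.
$includedCount = count(get_included_files());

printf(
    "realpath cache: %d bytes used (limit %s), %d files included\n",
    $cacheUsed,
    $cacheLimit,
    $includedCount
);
```

Run it at the end of a typical request to see how many files your application includes and how full the realpath cache is.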

Here are some architecture reminders/suggestions from Rasmus:
  1. Use different domain/subdomain for static assets.
  2. Keep cookies short (MTU size matters: a request that no longer fits in one packet translates into more roundtrips).
  3. Multiply the number of cores by 5 and that is the number of real concurrent users. I have to say I have been calculating allowed concurrent users for years, and a metric based only on cores is not that simple. It also depends on what the application does, of course, so my advice would be to look at your system metrics, for example with vmstat, top and other system commands. Automated stress tests, driven for example by JMeter, are a must in my opinion.
  4. Use out-of-band processing with Gearman, PHP-FPM or custom via ZeroMQ.
  5. Tweak ORM and caching.
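
The cores rule and the cookie advice are simple arithmetic, so here is a small sketch. The function names are hypothetical, and the 1460-byte budget assumes a typical 1500-byte Ethernet MTU minus 40 bytes of IPv4 and TCP headers; your network may differ.

```php
<?php
// Rule of thumb from the talk: real concurrent users = cores x 5.
// Treat the result as a starting point for load tests, not a guarantee.
function estimateConcurrentUsers($cores, $usersPerCore = 5)
{
    return $cores * $usersPerCore;
}

// Size in bytes of the Cookie request header for a set of cookies.
function cookieHeaderBytes(array $cookies)
{
    $pairs = array();
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . rawurlencode($value);
    }
    return strlen('Cookie: ' . implode('; ', $pairs));
}

// Assumed payload of a single packet (1500-byte MTU minus 40 header bytes).
$packetPayload = 1460;

$users = estimateConcurrentUsers(8);
$bytes = cookieHeaderBytes(array('sid' => 'abc'));

printf("8 cores: about %d concurrent users\n", $users);
printf("Cookie header uses %d of %d bytes\n", $bytes, $packetPayload);
```

The request line and the other headers share the same packet, so the real budget for cookies is smaller than the number printed here.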

Finally he talked about the PHP 5.3 features, how PHP 6 development was stopped, the problems inherent in supporting Unicode, and the efforts still being made in the current version 5 to support more Unicode functionality little by little. Here are some of those new features:
  1. Better performance through a lot of code optimization.
  2. Support for closures.
  3. Namespaces. There is a big debate around the decision to use a backslash instead of a dot for namespaces. To be honest I do find it awkward, but as Rasmus said "we will need to get used to it".
  4. Late Static Binding support.
  5. Garbage collector, which you should not need in web apps, but if you have long-running scripts then it will definitely help.
  6. NOWDOC, which is like HEREDOC but does not perform any parsing inside the string block.
  7. Remember "Go To Statement Considered Harmful", the famous letter from Edsger Dijkstra? The controversy is still on, as PHP now allows "goto" to eliminate verbosity or the use of break. In my opinion the discussion about the name is secondary. You can break or continue to a label in Java, for example, so if a goto just means breaking out of the loop to a label, I think it is awkward but I am fine with it. However, allowing goto to jump to any part of the code block is not precisely a multi-level break or continue; it is something more than that. Whether that is harmful, confusing or misleading I leave to the discussion. Perhaps just adjusting PHP's multi-level break syntax to accept a label instead of a "magic" number would be a better approach.
  8. DateInterval/DatePeriod classes.
  9. date_create_from_format
  10. FastCGI Process manager (FPM)
  11. For people moving from Apache to nginx, per-directory .user.ini files allow custom PHP directives, much like .htaccess does for Apache.
  12. Traits (arriving in PHP 5.4): compiler-assisted copy and paste to make up for the lack of multiple inheritance.
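
To make some of these features concrete, here is a short sketch I put together that should run on PHP 5.3 or later. The class names, values and dates are made up for illustration and are not from the talk.

```php
<?php
// Closure: an anonymous function that captures $factor with "use".
$factor = 3;
$triple = function ($n) use ($factor) {
    return $n * $factor;
};
$tripled = array_map($triple, array(1, 2, 3));

// Late static binding: "static" resolves to the class the call was made on.
class Model
{
    public static function create()
    {
        return new static();
    }
}
class User extends Model
{
}
$user = User::create(); // a User, not a Model

// Nowdoc: quoted identifier, nothing inside the block is interpolated.
$name = 'Rasmus';
$raw = <<<'TXT'
Hello $name
TXT;

// date_create_from_format plus DateInterval. The leading "!" resets the
// fields that are not part of the format (the time of day) to zero.
$start = date_create_from_format('!Y-m-d', '2011-12-15', new DateTimeZone('UTC'));
$end = clone $start;
$end->add(new DateInterval('P1M'));

// Multi-level break: the "magic" number says how many loops to leave.
$grid = array(array(1, 2), array(3, 4));
$found = null;
foreach ($grid as $row) {
    foreach ($row as $cell) {
        if ($cell === 3) {
            $found = $cell;
            break 2;
        }
    }
}

// The same exit written with goto and a named label.
$foundWithGoto = null;
foreach ($grid as $row) {
    foreach ($row as $cell) {
        if ($cell === 3) {
            $foundWithGoto = $cell;
            goto done;
        }
    }
}
done:
printf("%s | %s | %s\n", implode(',', $tripled), $raw, $end->format('Y-m-d'));
```

The last two loops do the same job, which is why I see the label as the readable part of goto and the free jump as the debatable part.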

Thank you Rasmus for PHP. Let's celebrate how lucky we are that no single language could ever win all battles. Different languages exist to resolve similar problems under different scenarios. This is no different from us human beings. We need different skills in a team as much as we need different languages in modern computing.
