Archive for April, 2008

Right Tool for the Job

By Stanley | Wednesday, April 23rd, 2008

In one of our projects, the client wanted an auto-complete function in one of the search forms. At first we implemented it with SQL queries because that was the simplest and most logical solution. It worked well with a small data set of about a hundred records. The response was fairly snappy: it took less than a second to fetch the data and display the results. However, this solution did not scale to a large data set of several hundred thousand records. The response time was about half a minute, clearly not acceptable for an auto-complete widget.

The problem had to be solved from a different angle. Rather than querying the database with SELECT statements, we used a free-text search engine called Lucene via the sfLucenePlugin. Execution time dropped substantially, but it was still too slow. Since Lucene was originally written in Java, we did a quick test to see how well the Java implementation performed. To our delight, it performed very well: results were returned in less than a second.

The challenge was how to make Java and PHP talk to each other. We experimented with a number of alternatives. In the end we chose XML-RPC over HTTP to bridge the language gap. The protocol is well supported by both languages. It only takes a few lines of PHP to create an XML-RPC request object, send it to the Java XML-RPC server and get back an array of PHP objects. No home-brewed, string-delimited request/response protocol! The architecture also allows the two processes to run on separate servers, or one Lucene server to serve multiple PHP servers, for better scalability and hardware utilization.
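
To make the bridge concrete, here is a minimal sketch of the same request/response pattern. It is not the code from the project: Python's standard library plays both roles (the Java search server and the PHP client), and the names `suggest`, `TERMS` and `demo` are made up for illustration. A plain sorted list stands in for the Lucene index.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

# Hypothetical stand-in for the Lucene index: a small sorted list of terms.
TERMS = sorted(["vancouver", "vancouver island", "victoria", "whistler"])


def suggest(prefix, limit=10):
    """Return up to `limit` terms that start with `prefix` (case-insensitive)."""
    prefix = prefix.lower()
    return [term for term in TERMS if term.startswith(prefix)][:limit]


def start_server():
    """Start the search server (the Java side in the post) in a background thread."""
    # Port 0 asks the operating system for any free port.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(suggest, "suggest")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def demo():
    """Play the client (the PHP side in the post): one remote call, one list back."""
    server = start_server()
    host, port = server.server_address
    client = ServerProxy(f"http://{host}:{port}/")
    try:
        return client.suggest("van", 5)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Because the client only knows a URL and a method name, the search process can move to another machine, or serve several web servers, without any change to the calling code.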

What was the lesson we learned? Use the right tool for the job: if a library sucks in one language, try libraries written in other languages (or write one yourself in another language).

Open Web Conference. Hello Vancouver!

By isim | Wednesday, April 16th, 2008

The Thirdi team just came back from the 2008 Open Web Conference. It was my first time attending this type of conference. I really enjoyed most of the sessions as they allowed me to unleash the geek within. It was definitely a refreshing chance to bring myself up to date. I picked up some important new lessons and cool tools, and was reminded of best practices in web software development.

The conference also helped to raise my personal awareness of the open source community. There were definitely some very bright minds at the conference. Listening to presentations and case studies where these people tackled and solved hard problems in elegant and practical ways was inspiring. The impact and influence that this community will continue to have in shaping our society is unfathomable.

I’d like to share some of the important lessons I picked up at the conference.

  1. Security on web applications (by Damien Seguy). Security is a key requirement in all web applications, yet it is often overlooked (or at least not dealt with until it is too late). In his presentation, Damien demonstrated how a lot of security weaknesses in the code can be found and avoided by using a very simple tool such as grep. Best practices include:
    1. getting rid of unneeded files (.doc, .zip, .old) from the web directory,
    2. moving configuration files out of the web directory,
    3. paying special attention to code such as var_dump, print_r, mysqli_query etc., and making sure their presence doesn’t open up any potential security breach.

    The idea is to avoid exposure of confidential information to unauthorized personnel.

  2. Deployment (by Chris Hartjes). This was one of my favourite talks. In his presentation, Chris shared with us his 6 rules of deployment. The one that stood out most for me can be paraphrased as “if a deployment process is not repeatable and not automated, mistakes will happen”. Chris went on to say “manual deployment is for suckers”. These are strong but truthful words. We might be able to get away with manual deployment for small projects. But if these projects ever grow and scale across multiple servers, manual deployment is out of the question. Another insightful statement by Chris was “a good deployment process is a one-step process; not a two-, three-step process”.
  3. Engineering of internal process (by Ronn Abueg and James Andres). Ronn and James implemented a social intranet tool using Drupal. Interestingly, there wasn’t much CMS-related content in this presentation. They were able to implement a solution that helps transfer internal knowledge and information without re-engineering people, for example by forcing the CEO to turn on his laptop, open up his browser, fill in a form etc. It was a solution that fit seamlessly and painlessly into their existing workflow. Thirdi has been searching for a simple, accurate and cheap (in terms of human effort) means to track the number of development hours per project. Perhaps we could tweak our IDE to keep track of how many hours we spend on a project, or write a script to keep track of the number of commits per day, or of the time between the early-morning update and the end-of-day commit etc. Little things like these can be integrated seamlessly into our existing dev process without imposing additional overhead on our team.
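
The grep-style audit from the security talk (item 1 above) can be approximated in a few lines. This is only a sketch under my own assumptions, not something shown at the conference: the function names `audit` and `demo`, the file suffixes and the regular expression are illustrative, and it covers only the first and third practices (stray files and suspicious calls), not the location of configuration files.

```python
import re
import tempfile
from pathlib import Path

# Assumed patterns, taken from the practices listed in item 1.
STRAY_SUFFIXES = {".doc", ".zip", ".old"}
SUSPECT_CALLS = re.compile(r"\b(var_dump|print_r|mysqli_query)\s*\(")


def audit(web_root):
    """Return sorted (relative_path, line_number, reason) findings under web_root."""
    root = Path(web_root)
    findings = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        suffix = path.suffix.lower()
        if suffix in STRAY_SUFFIXES:
            # Line number 0 marks a finding about the file itself.
            findings.append((rel, 0, "stray file"))
        elif suffix == ".php":
            text = path.read_text(encoding="utf-8", errors="replace")
            for number, line in enumerate(text.splitlines(), start=1):
                match = SUSPECT_CALLS.search(line)
                if match:
                    findings.append((rel, number, match.group(1)))
    return sorted(findings)


def demo():
    """Run the audit against a throwaway directory containing two known problems."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "backup.old").write_text("x")
        (root / "index.php").write_text("<?php\nvar_dump($user);\necho 'ok';\n")
        return audit(root)


if __name__ == "__main__":
    for rel, number, reason in demo():
        print(f"{rel}:{number}: {reason}")
```

A match is only a prompt for a human to look at the line; a call such as mysqli_query is not a problem in itself, so the script reports rather than judges.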

I also picked up some interesting web tools (Google Gears, phpUnderControl, Capistrano, Maven, APC) from the conference. But I won’t talk about them here as all of them can easily be found on Google.

Overall, I enjoyed my time at the conference and was able to walk away with a lot of useful tips and tools.