pytst 1.18
I’ve just released pytst 1.18, which fixes a bug reported by Keith Davidson. The bug was in the scan method and appeared in some very specific yet totally mundane circumstances : if you scanned an input string of a very specific length with regard to the content of the tree, then the first scan would be OK, but the following ones would not. This case was not covered by the unit tests, and it managed to survive many years before being reported.
The code for the scan method (in tst.h) was one of the most complicated pieces of code in this project, and I had quite a hard time figuring out what was happening. In the process, I began to realise that a rewrite would be necessary, so I wiped the slate clean and entirely re-designed the algorithm, this time with careful planning.
While doing this, I realised that the old algorithm was much more wrong than I thought. There is a catch with the new algorithm, however : it has to do some backtracking. That is to say, under some conditions, the algorithm has to go back and re-read some part of the input string.
The old algorithm never had to look back – but the old algorithm would miss some matches, if you had dictionary entries that included multiple smaller dictionary entries. For example, in a dictionary with {'111','222','111222111'}, if you provided the input string '11122211' (notice the missing 1 at the end), then you would have a match on '111', then a non match on '22211'. The match on '222' would be missed.
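To make the example concrete, here is a minimal sketch in Python. It is not the algorithm used in tst.h : it is a brute-force longest-match scanner, and the name naive_scan and its output format (a list of (text, matched) pairs) are made up for this illustration. It only shows which matches a correct scan should report, and where input gets re-read.

```python
def naive_scan(text, dictionary):
    """Brute-force longest-match scan (illustrative, not pytst's algorithm).

    Returns a list of (substring, matched) pairs covering the whole input.
    """
    max_len = max(len(word) for word in dictionary)
    result = []
    unmatched = ""
    i = 0
    while i < len(text):
        # Try the longest candidate first; every shorter attempt re-reads
        # characters already examined, which is the backtracking.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in dictionary:
                if unmatched:
                    result.append((unmatched, False))
                    unmatched = ""
                result.append((candidate, True))
                i += length
                break
        else:
            # No dictionary entry starts here: accumulate a non-match.
            unmatched += text[i]
            i += 1
    if unmatched:
        result.append((unmatched, False))
    return result


entries = {"111", "222", "111222111"}
print(naive_scan("11122211", entries))
# [('111', True), ('222', True), ('11', False)]
```

With the truncated input, the scanner first fails on the long entry, falls back to '111', and then finds '222', which is the match the old algorithm missed.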
For the moment, I prefer to release pytst 1.18 with an algorithm that is correct yet not optimal, and try to find a non-backtracking solution later. From my measurements, the impact on performance is around 10% but YMMV.
For the record, the new algorithm looks like this :
There is backtracking in two places : (J) and (K), where the input string index i is modified. To prevent backtracking, I have to find a way to analyse the tree so that I emit any match included in the current state, then change to a proper state. It’s what Aho-Corasick is supposed to do, but I need to have a better look at it since I obviously got it wrong the first time.
You can download pytst 1.18 here.
A case against Twitter
OK, I’ve put up with enough BS about Twitter. This thing has been growing on me for months. Usually I can vent it off, but this time I can’t ; a tiny thread was the last straw for me.
Short version of this post : Twitter basically provides a crappy infrastructure for broadcasting short messages on the Web, leaving it to everybody else, be it their users, people with too much time on their hands or even companies, to invent their own usage patterns and implement features by themselves. Meanwhile, Twitter sits and watches the world revolve around them, occasionally trying to cope with the unbearable mess that their infrastructure seems to be. The worst thing is that for each usage pattern that someone else is kind enough to add to Twitter, there already exists a better alternative outside of Twitter, sometimes years or decades older. Yet all this doesn’t seem to slow Twitter down a bit in passively stealing wind from innovative companies.
There, I said it.
Now, for the long version.
1) Twitter’s innovative model is ultimately awkward for social networking
The core of Twitter is : « say whatever you want to the world, in less than 140 characters ». In the standard configuration, everything you publish can be seen by anybody. On top of that, you can follow people, that is to say that you can asymmetrically « tune in » to what a Twitter user says and receive it on your Twitter home page, so that what they say is not lost amongst the blabber of everybody else.
I know that you can create protected profiles, I know you can send direct messages (DM), I know you can block people, all that stuff. Those are not core features, in the sense that they were added to address shortcomings of the initial offering, rather than to innovate along the lines of the core features. In a world without Twitter, you want to DM ? Try e-mail. You want protected profiles ? Try Facebook, Google Groups, Friendfeed groups, etc.
( As an aside, to an old, born-in-the-70s guy like me, Twitter sounds a lot like the CB radio of the web. It’s a medium, and like any medium it needs some content, some usage patterns to have some interest. In France at least, the killer app for CB Radio was an informal, real-time, dynamic community dedicated to reporting police speed traps. Now it has been re-invented with GPS and GPRS. )
Granted, this combination of short message, public broadcasting and asymmetric subscription is an original creation from Twitter, coming just at the right time when the blog craze seemed to fade away (remember the blog craze ? back in 2004 ? one blog for each human being and all that stuff ?). If people can’t be bothered creating true blogs, if people suffer from the empty text area syndrome (the modern version of the blank page one), well, let’s take the drama out of this life streaming stuff and just let people tell what they just ate in a few words.
This was the « What are you doing ? » period of Twitter. It was a meme, it was fun, but once you’ve read your 10th « cleaning the bathroom » tweet, the excitement disappears. The point is that Twitter was pretty fun in the sense that you could (and still can) reach random people on the Web and find out what they are doing… but except for a few outliers which ensure that Twitter gets some media coverage (like Twitter CEO’s pregnant wife who tweeted during labour), you soon discover that a random person on Twitter is, well, random, and you wish you could get a little bit more feedback from this…
…then you turn to Facebook because, well, that’s the place where your friends/colleagues/future-ex-boss-that-you-forgot-you-befriended are, where you can trade pictures, say bad jokes and naughty things and try to outsmart each other on crappy IQ tests. Or you turn to Myspace, Orkut, whatever : it’s not the brand, it’s the fact that a richer social network exists, complete with more or less useful apps that you can use INSIDE the network, that it allows for true conversation with more or less privacy depending on the context, that it’s rich with images, songs, videos, that it’s so much MORE interesting than the awkward, public, text-only chat you can have on Twitter. Hey, even if all you need is a good old fashioned text-only chat, there are far better alternatives to Twitter, beginning with the chat systems that were introduced in Facebook or Myspace.
So at the end of the day, this whole « public broadcasting your thoughts » seems a bit awkward. Of course some people are perfectly happy to whistle in a hurricane and expose everything about their life, and I’m fine with them, but Psychology 101 says most people want to be heard, and answered (if you don’t agree with this, imagine closing the comments and removing your Google Analytics tag on your blog, for instance). Twitter has an informal social network, of course, but it is quite specific (asymmetric following) and much less engaging than the competition out there. This is true to the point that you have to turn to third party applications like Friend or Follow (once again showing the total apathy of Twitter, letting other people do the job) to get some sense of what’s going on in your trust network. Not good.
One symptom of this problem is the fact that teens don’t tweet. If teens don’t see any social value in Twitter, it’s a pretty good indicator that something is wrong, and that with time things will look even worse for Twitter.
Next in line, the 140 character limit. You know what ? In your favorite social network, I’d bet that most of the time people already write messages that are 140 characters or less, except nobody poses about this, because this is how people communicate anyway. What was a kind of artificial constraint providing a creative twist, an urge for conciseness in a blogging context, becomes totally uninteresting in a conversational context, where people have already internalized the 160-character limit of SMS. The problem is that this limit is actually preventing people from easily sharing interesting things, more on that later.
So, to sum up, Twitter was initially built as an alternative to blogs, grew some kind of original social network features, but in the end I can’t help but think that the result is awkward and that there are far better mainstream offerings out there.
2) Twitter is also awkward for sharing stuff on the web
Another popular usage pattern built around Twitter is the « share stuff on the Web » pattern : share URLs, share pictures, share videos… Well, guess what, you just CANNOT do that in Twitter per se, not easily at least. You have to use URL shorteners, third party web applications like Twitpic, etc.
OK, Twitter made bit.ly its default URL shortener last May, so that pretty much solves the issue of sharing URLs. Except that it’s a wart on the system and it hides the destination URL so that you don’t know whether you are going to land on an NSFW page, all this because of the 140-character limitation. And I still can’t easily share pictures or video ; they stay out of the loop, on an external system. Compare with this variation on the theme of « I just ate », powered by the micro-blogging platform tumblr.
The net result is that you are using a patchwork of interdependent patchy services with unknown viability to share things with people. You are subject to the infrastructure or financial problems of all those services (think about what happens with tr.im). Then, if you want to get some feedback about what you shared, you’re on your own trying to build a conversation (more on that later) and see if some people retweet what you shared.
Once again, there are perfectly valid alternatives out there if you want to share stuff : Digg, Google Reader + Google Notes, FriendFeed, del.icio.us, StumbleUpon, Yoono, and countless others. Each one of these services is dedicated to the task of reviewing what some people have discovered and giving your opinion about it (« I like it », « Favorite », rank it). Each one is perfectly aware that something other than bursts of 140 characters exists, and handles image and video sharing in a handy way. Each one of them allows for some easy discussion of the subject at hand. In short, those services were properly engineered for sharing stuff on the Web. I’m sorry, but Twitter is not.
A fine example of the inadequacy of Twitter here is the retweet (RT) usage pattern, which tries to emulate features like « I like » that you can find in other applications, except that those applications handle it way better (I won’t bother you with the details of why RT fails ; I’m sure you have already experienced the awkward feeling of retweeting something that was already retweeted).
Once again, we see people struggling to get around the limitations of Twitter by inventing usage patterns. This is very cool if we see all this from a social science experimental point of view, I guess some people could get a PhD out of this, and we all feel like pioneers exploring the Wild West while doing that. The sad truth is that this Wild West is nothing but an artificial sandbox which happens to be under the scrutiny of mass media. Meanwhile, some real innovation and progress takes place somewhere else, out in the real world (FriendFeed, Google Wave, PubSubHubbub to name a few interesting spots).
By the way, the funny thing is that I actually tried to fix RTs by working on a collaborative ranking system for Twitter : Tweetraise. It was very fun to write, I learned a lot about Twitter and Google App Engine, but my end conclusion is : why bother ? Why would I spend some time trying to fix Twitter for free whereas some other very good services already exist ?
3) Twitter is awkward for conversations
Twitter doesn’t have any proper conversation system. It doesn’t thread tweets, even though there is a trace of threading attributes hidden in the Twitter API (see the in_reply_to_status_id element there). You end up either DMing or publicly replying to tweets with an @name mention, and let the readers find out what you are replying to. You get absolutely no help from Twitter here ; initially there wasn’t even a reply button in the web interface (third party applications actually boasted about this feature !).
So once again, people try to work around this with mentions, hashtags, and in the latest installment of The Twitter Circus, a tiny thread. Once again, it kinda works and people are delighted to see « innovation » in Twitter-land for features that have existed for decades in Usenet or IRC.
You know, it’s pretty sad to watch people struggle to single-handedly re-invent Google/Yahoo Groups, FriendFeed, Usenet or IRC within the limitations of Twitter. What’s the point ? OK, it’s some good hacking fun (like I wrote, I went through this and it was nice), but shouldn’t we try to actually build some open source based innovative infrastructure instead of trying to incrementally improve a system for the benefit of a private corporation ?
4) Twitter’s infrastructure is crappy by design
OK. I know « public broadcast with asymmetrical following » is not that simple to handle. I know that usage patterns leading to people with millions of followers are not really the easiest problem to address. I know that the rate of growth of Twitter is putting a big stress on their infrastructure. Plus, it’s not easy to withstand sustained growth, to survive DDoSes, and big pieces of news like the Iranian elections or MJ’s death. Most of all, I don’t really know what kind of money Twitter can mobilise to solve these issues, but it surely isn’t in the same league as Google.
But come on… I don’t need to remind you of the loooong track record of outages and performance issues of Twitter. You can follow nearly everything in their status blog, and it’s not pretty.
The problem is not that Twitter has infrastructure issues which could prevent people from telling the world whether they like spinach or not. It is that there is a whole lot of people who invest time and money in this, trying to provide innovative services on top of what is becoming a carrier business. They are ultimately relying on the Twitter infrastructure for their projects / business to work properly. Twitter is at the core of an ecosystem which depends on their capacity to run the basic infrastructure seamlessly, a single point of failure in a very centralized system. What if Twitter doesn’t have the resources to fix their issues, reach and maintain a high quality of service ? Will everybody suffer from this after having invested so much time and money in Twitter, all the while bringing it incremental improvements and valuation ?
Have a look at FriendFeed. Those guys managed to deliver an almost flawless experience from day one. They have the experience, knowledge and brightness required to build this kind of service. FriendFeed proves that you don’t need to have huge, Google-like teams and resources to build an innovative, scalable, real-time service. Yet Twitter could not deliver. Guess which service out-numbers the other 50 fold ? Twitter wins. Go figure. On the other hand, guess who was bought by Facebook ?
Of course, there are two solutions to this infrastructure problem : either Twitter vastly improves their infrastructure and monetization scheme, and tells the world how they feel they might be able to technically and financially reach the 100 million, 200 million, 500 million users bar, or they just let go of the infrastructure problem and turn Twitter into a decentralized protocol for sharing short bursts of text.
( To be frank, there is in fact a third scenario, which is that everybody keeps on blindly trusting Twitter to fix their problem and blissfully continues bringing value and users to the platform for the ultimate benefit of Twitter. )
What would Twitter become in a decentralized protocol scenario ? Well, it would be a bit like Hotmail, Yahoo Mail or Gmail in the e-mail business. A brand, an access portal competing with others on usability and features in the context of a common protocol.
But of course, this supposes that the world really needs something like a decentralized Twitter, at a time when far richer and open initiatives like Google Wave or PubSubHubbub are building momentum…
Conclusion
Well, if you managed to reach that part (or skipped directly to it), congratulations and thank you.
My conclusion is pretty much the same as the short version of the post (hopefully). I’m worried that Twitter is actually hurting innovation by distracting funds and good-willed people, while actually failing to provide a useful user experience or giving any guarantee as to whether their infrastructure is reliable or not. Let’s hope that the hype around Twitter dissipates far enough to prevent further damage. The only other way out for Twitter is to actually reinvent themselves and start truly innovating instead of relying on the goodwill of API users. This is a tough endeavour given the potential horsepower behind big actors like Google Wave and Facebook/FriendFeed.
FriendFeed might actually benefit from Google Wave
With the current buzz about Google Wave, it is difficult not to feel sorry for FriendFeed. The common ground between Wave and FriendFeed is indeed pretty rich : real time communication, threaded conversations, lightweight social networking… Some people even suggest that there is some irony in seeing Google rip off good ideas from FriendFeed, the latter being founded by ex-googlers.
However, there is something peculiar about Wave : apart from the very Googly UI, the infrastructure is meant to be open and decentralized. Wave can be seen as a way to collaboratively edit an XML document (the wavelets, which are the basic communication nodes found in waves). The point is that this XML document can be replicated on many providers’ infrastructures, and updates are propagated through extensions of the XMPP protocol (XMPP, AKA Jabber, is an instant messaging protocol that Google Talk also uses with extensions).
Check out this paper about Google Wave Federation Infrastructure.
So, we are not really in a winner-takes-all scenario. Now, if I were FriendFeed, I’d have two choices.
1) Decide this is a Fire and Motion move from Google, aimed at first at Microsoft (to take the wind out of Bing), then Facebook, then FriendFeed. Keep away from this game, and slowly become an awesome app no one uses, dedicated to geeks and power users, and watch people have all the fun out there on Wave or Facebook.
2) Embrace Google Wave Federation Protocol and build gateways and proxies that enable FriendFeed users to participate in Google Wave discussions without leaving their favorite UI. Of course this means that FriendFeed might become a victim of the Fire and Motion scheme : Google adds new stuff into Google Wave, FriendFeed runs to support it, then Google adds new stuff, and on and on, which means that FF could lose its technical leadership on the market. However, there is certainly room for more than one Wave provider, and if this new way of communication becomes mainstream, this could mean that FriendFeed would at last find a way out of Geekland and reach a wider audience than it currently has. Then launch a feature war with Google on the social networking or real-time search front, both fronts on which they have already proven that they can do very well.
So, depending on the choice they make now, I think Google Wave could actually be good news for FriendFeed, turning their awesome power-app into the second player in a potentially huge new market.
The difference between pytst and ctst
Putting aside the difference in languages (C++ for pytst, pure C for ctst), the two projects are quite different with regard to the way data is stored in the tree.
pytst implements a text-book ternary search tree with an AVL balancing algorithm. While I was working on this project, I designed a scanning algorithm that I was particularly proud of, until I found out that it was already known as the Aho-Corasick algorithm. This scanning algorithm is handy when you want to match an input text against a huge quantity of search strings (which was the initial use case that led me to develop this).
The problem with pytst is that it is a very straightforward ternary search tree implementation, in that it requires at least a tree node per character per unique prefix in the input strings. Given all the meta data that can reside in a tree node (at minimum, we need the character plus up to three pointers to other nodes, plus one pointer for Aho-Corasick, plus anything required by the memory allocator), there is a huge memory overhead. The algorithms pytst implements may be pretty efficient, but the trade off in memory is quite taxing.
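To get a feel for this overhead, here is a back-of-the-envelope sketch in Python. The node layout is an assumption based on the minimum described above (one character plus four pointers), not the actual structure declared in tst.h, and it ignores the stored value and anything the memory allocator adds, so the result is only a lower bound.

```python
import struct

# Hypothetical node layout: one character, three child pointers and one
# Aho-Corasick link, laid out with the platform's native alignment.
NODE_FORMAT = "c4P"
NODE_BYTES = struct.calcsize(NODE_FORMAT)


def estimated_bytes(node_count):
    """Lower bound on tree memory: node fields only, no allocator overhead."""
    return node_count * NODE_BYTES


print(NODE_BYTES, "bytes per node,", estimated_bytes(1000000), "bytes for a million nodes")
```

On a typical 64-bit build the pointers and the alignment padding dwarf the single byte of actual payload, which is the whole problem.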
While I was working for Yoono, we decided to store huge quantities of URLs in memory using a TST, since the hierarchical nature of URLs from the same domain was a good fit for this structure. This time, the implementation language was Java, but we developed some pretty nifty tricks to optimize memory usage ; basically I reimplemented malloc/free running on int[], char[] etc. This way we would reduce the load on the Java memory allocation, object header overhead, GC, etc. We could then store millions of nodes in memory. This was cool, and it worked well, but it was a bit stupid of me to optimize the allocation issues before optimizing what was allocated. Fortunately, a clever colleague from Yoono (Yann Landrin-Schweitzer) took over my work and went further, optimizing the data structure itself based on some discussion he, Laurent Quérel and I had together.
ctst started as a simple rewrite in pure C of pytst, motivated by the difficulties I encountered with compiling the C++ code of pytst with different compilers, and building bindings for Ruby (which I had started using at that time). I also wanted to separate as much as possible the code implementing the algorithms from the code implementing the storage, which was something we managed to do quite cleanly in Java. The ideas about optimizing the structure came up during the rewrite, so I finally decided to jump ship and implement them in ctst as well.
So how does ctst store data in the tree ? We took some of the ideas from Patricia trees and mixed them with the B-Tree principles. The result is a pretty compact and efficient structure. To illustrate it, let’s use a little example.
Suppose we want to store this list of words in ctst :
compacity compact compacted community commuter commuters compute
ctst builds this tree :
( This drawing was done automagically using the dump method, which generates a Graphviz .dot file. )
This gives 11 nodes for 7 strings. I won’t waste time drawing the ternary search tree equivalent, so you’ll have to trust me, but you would need at least 25 nodes. Of course the nodes don’t contain the same things, but the payload/overhead ratio is highly in favor of ctst.
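If you would rather check the 25 than trust me, here is a small Python sketch. It relies on the rule stated earlier, namely that a plain ternary search tree needs one node per character per unique prefix, so counting the distinct non-empty prefixes of the word list gives the minimum node count. The function name is made up for this illustration.

```python
WORDS = ["compacity", "compact", "compacted", "community",
         "commuter", "commuters", "compute"]


def tst_node_lower_bound(words):
    """Count distinct non-empty prefixes: one TST node per unique prefix."""
    prefixes = {word[:n] for word in words for n in range(1, len(word) + 1)}
    return len(prefixes)


print(tst_node_lower_bound(WORDS))  # 25
```

The shared prefix « com » costs 3 nodes, the « comp… » words add 12 more and the « comm… » words another 10.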
Of course I haven’t re-implemented all pytst algorithms in ctst, but now that I « finally » managed to correct a long-standing bug in the codebase, I’ll be able to get back to it. So stay tuned !
Mapping my personal web
Everything starts with a slice of the Web, in the guise of a bunch of RSS feeds. Currently my blog roll hovers a little above 280 feeds. There are the broad blockbusters like Techcrunch, ReadWriteWeb, Techmeme, Presse-Citron (a French blog), the mainstream media (e.g. French newspapers), but there are also a lot of small, specialized, « vertical » blogs from developers, entrepreneurs, competitors and/or friends.
Of course I cannot cope with the quantity of content that those ~280 feeds fetch. Some people are very anxious about reading everything in their blog roll, which for instance led to the introduction of a new anti-feature in Google Reader : « hide unread count ». I was a bit like that back in pre-history (circa 2004) but nowadays I just tell myself that I won’t be able to read everything, c’est la vie.
One thing is sure, I won’t miss anything important thanks to the « echo chamber » effect. As soon as big news hits the interpipes, I see it multiplied a few dozen times on many feeds, and even if I missed it during the first 24 hours, I get a second chance with the French mainstream media which are often 48 hours late.
So each day I get the big news fast and I can quickly skip over the echoes. Then I get a bunch of small news, « vertical » articles, things that border between work, geeky things, technical articles, friends updates and so on.
My main tool for reading all this is Google Reader. More precisely, I mainly use the iPhone version of Google Reader during my daily subway trips, which gives me around 60 to 90 minutes a day (morning + evening) to do some triage and select interesting items to review later at my home or office (I star those items in Google Reader), or items to share right away.
The iPhone version of Google Reader is especially well designed for triage : the app has a mode in which it only displays 15 unread items at a time, and there is a « mark as read » link which marks all those items as read and loads 15 new items right away. So that’s what I do in the subway : just skip items whose title doesn’t seem interesting, read interesting items, star those I want to come back to later on a bigger screen or share them right away. Believe me, since I began doing this, I’ve never been bored while in the subway or while taking a cab. Google Reader tells me that in the last 30 days, I’ve read 5,384 items, which gives nearly 12 screens of 15 items a day.
BTW, I guess the batch loading mode of the iPhone app was designed to address connection problems, and it’s especially effective in the subway. For example, on my daily subway trip, there is a weak spot in my operator’s coverage for one Metro station, so I take care of loading a batch just before entering the station. This way, I don’t need the connection while in the weak spot. I sometimes get frustrated when I go on Metro lines that I don’t know as well, though.
The items that I share are reviewed by friends directly on Google Reader, but they can also be found on a public RSS feed that is injected into FriendFeed for comments there, too. The RSS feed is also injected in one of our corporate blogs powered by WordPress (+ FeedWordPress) so that my colleagues and I can discuss it far from the prying eyes of our competitors. Things could change now that Google Reader has a true commenting system with some privacy features, but having a private blog hosted by ourselves is much more reassuring for now.
For fun, I’ve got a few more services aggregated into FriendFeed : my Amazon wish list (feel free), my public activity feed from github (I’ve got two projects there, though I’m not very active), LinkedIn feeds, etc.
Finally, FriendFeed posts everything to Twitter (except what’s coming from Twitter, of course, the echo chamber metaphor should remain one), so that I don’t have to do it myself. To be frank, the reason I made FriendFeed push to Twitter is that for now, I don’t really use Twitter, but it seems to be quite important nowadays… So this is a cheap way to push updates on Twitter without going all the way into micro-blogging mode.
So, does anybody have some more tricks to share about how to manage the gazillion tons of information the Web throws at us each day ?
pytst 1.17
pytst 1.17 is out !
The source code can be fetched using git. If you prefer tarballs, head to the downloads page, where you’ll also find Win32 binaries.
Included in this release :
- Support for 64-bit architectures (tested under Linux only) – thanks to Thomas Brox Røst for the patch.
- The test script (in python/test/test.py) is now self-contained, no more nasty references to a « tcc » module which was used at my company. Thanks to Paul Harrington for the prodding about this, and for our first fork / merge test.
BTW, github is really, really cool. The only thing missing is an issue tracker. Maybe in a future release ?
pytst now hosted on github
I’ve created an account on github and ported my private SVN repository to a public git repository. Now you can fork the project as you want and eventually ask me to pull your modifications (using the « send pull request » functionality). The wiki will be a good place to document the library.
Hopefully this is the beginning of a new life cycle for pytst : I keep getting bug reports and enhancement requests from time to time, but I don’t have enough spare time to address them.
Here is the repository home page : http://github.com/nlehuen/pytst/
pytst 1.16
I’ve just released pytst 1.16, you can download it there.
This release fixes an annoying bug that occurred if you used CallableAction or CallableFilter. If your Python callback functions raised a Python exception, the whole process crashed. This meant that you had to catch all Python exceptions in the callback, which was not always handy. The exceptions now behave as expected, that is to say they are passed from the callback to the Python calling code through the C++ layer.
Thanks to Keith Davidson who reported this bug. Keith also reported a problem with NULL characters inside the keys : keys seem to be handled as NULL-terminated strings. It looks like there is a problem in the SWIG layer, since the C++ code doesn’t assume strings are NULL-terminated. I’ll have a look at this problem ASAP.
pytst 1.15
I’ve just released pytst 1.15, you can download it there.
This release fixes a quite important problem introduced on 2006/05/05… For a reason I don’t quite remember now, I decided to make the TST C++ template specialization dedicated to the Python bindings privately inherit from the more generic tst<...> template. I guess this was related to some problems I had with SWIG.
The problem is that this broke something in the way SWIG handled inheritance between the two types, the net result being that the Python TST class lost its prefix_match methods and other methods that were declared in the parent class but not in the specialization.
I’ve reverted the inheritance between the two C++ classes back to public inheritance, and everything is back to normal now.
I really wish I could spend more time working on ctst, since it features theoretically better structures and algorithms. Plus, to be frank, I’m quite fed up with C++, and I’d be more than happy to go back to C.
Also, I must say that the native API for Ruby is very impressive. I’ve been developing the Ruby bindings for ctst by hand, not using SWIG, and I’ve spent substantially less time than it took to have SWIG properly handle my C++ code from pytst…
A tribute to Groovy
My life as a developer :
Update : it’s my life as a professional developer, since my life as a programmer began in 1983 on a Sinclair ZX81…
- 1995 : Java 1.0 beta is released. I begin playing with it. Hotjava browser, applets are all the rage. Soon after, the infamous « Loading Java » status bar message in Netscape is the signal that you’ll lose control of your browser for at least 15 seconds while the JVM loads an applet.
- 1996 : Academic and toy projects in Java. Since JDBC is not out yet, I decide to write my own DBMS in Java. It works but of course it’s crappy. Cloudscape, Hypersonic and other 100% pure Java DBMSs are safe.
- 1997 : Wrote an HTTP 1.0 server in Java, with pluggable controller modules named « lumps » (Java <-> coffee <-> sugar lumps, nudge nudge wink wink). Damn, I should have thought about « Servlets ». Wrote a lump which interprets a server page language not unlike ColdFusion (that is, as crappy, but at least at that time it already ran on the JVM). Wrote my first server pages which could access an Access database through the freshly released JDBC-ODBC bridge and display data on my browser. As a trainee, I have to work on a project built with Borland Intrabuilder. That was both interesting and very scary, since the thing was drowning in bugs.
- 1998-1999 : Had to work, first part time, then full time. Of course, it’s all about EJB 1.0 beta and my lead developer Laurent Quérel and I try to build something around EJBs and Persistence Powertier which would not look totally unlike Hibernate today. I begin to get the abstraction addiction. It’ll take me a few years to get rid of this addiction.
- 2000-2001 : the future is on the mobile phone, and WAP is going to be the next web. In full abstraction addiction mode, my team and I develop our own specialized application server, so that you can write a web application once and browse it from any web-enabled device, the presentation being abstracted and adapted to the zillion different devices that would soon access our apps. We develop half a dozen applications with this application server before the net bubble bursts and we realise that it will take a few more years before the Internet on a mobile phone becomes a Big Thing. One good thing, though : when we began this application server business, we dropped all the EJB nonsense. We built our own IoC container, much like Spring. And we didn’t need an ORM, because all we needed to map to was XML, dude. This was the time when I read & wrote a lot on the xml-dev mailing list.
- 2002-2003 : I’m done with the abstraction addiction. I’ve got a challenging job, lots of work to do, and not much time to think and ponder about abstracting anything. The code I write must be straightforward and must get things done. I’m feeling more and more irritated by the verbosity of Java and the zillion XML configuration files I’m forced to write. An IoC container is a nice thing, but honestly, I sometimes feel that a simple scripting language would be much better suited to configuring my apps. I begin experimenting with Jython. I also try Groovy for the first time. It is still under development, and the language seems a bit goofy at the time, a patchwork of good ideas but nothing really as coherent as Python.
- 2004 : OK, I finally get it. I need a dynamic language. I hesitate between Python and Ruby ; Ruby is more interesting but less polished, and at the time, a lot of documentation is badly translated from Japanese. I choose to invest in Python. I fall in love with Python. I use it first for any kind of data massaging task, then progressively for web development. I begin tinkering with mod_python, which will eventually lead to my active participation in the project. At the same time, I experiment with Zope, find it fascinating and decide to use it for a classic web application project. Big mistake, one that I’m still paying for four years later.
- 2005 : Ruby on Rails appears. People marvel at the thing. I don’t get that the scaffolding demos that begin to invade the blogosphere have nothing to do with the real power of RoR. Watching those demos, I tell myself « OK, that’s really cool, but real apps are much more complicated than just CRUD screens ». Of course, Sherlock, but do you really expect things to be brighter in any other framework you can choose if even the basic stuff like CRUD is complicated ? Oh, and on August 23rd, my daughter Violette was born.
- 2006 : it’s time for a comeback to Java (nothing to add to what I wrote at the time). I also work 50% of my time for Yoono, once again with Laurent Quérel. One day, walking back from lunch in a Parisian street, I recognize Guillaume Laforge, who I knew had taken over the leadership of the Groovy project. I ask him « Excuse me, but aren’t you Guillaume Laforge ? » Yes indeed – I think he found it a little creepy to have reached celebrity to the point of being recognized in the street by total strangers. Anyway, right there in the street in front of our offices, I naively offer him a job at Yoono, as we were looking for top-notch Java developers. Of course he declined politely, but now that I know what Guillaume was working on, I’m glad he did !
- 2007 : for a new big project, I hesitate for quite a long time between Java + Spring + Hibernate + Struts, Python + Django and Ruby on Rails. I finally decide to give RoR a go. It’s awesome. Nothing to regret as of today, except maybe my choice of MySQL + InnoDB tables, which causes strange problems with table locks. Apart from that, I get the same rush as anyone else trying RoR : freedom and power ! However, the deployment model sucks. What, a single-threaded application server ? You have to launch multiple servers and load-balance between them using Apache 2.2 mod_proxy_balancer, lighttpd or Pound ? OK, I’ve done it with mod_proxy_balancer and it was no rocket science, but seriously, it’s a bit of a hassle to manage. And the performance is not so great : I squeeze 60 reqs/sec per instance, which is totally sufficient for the app we built, but come on ! I easily get 4 or 5 times more reqs/sec with any badly configured Python or Java stack (please don’t flame me for this obviously inane benchmark).
- 2008 : Grails 1.0 is out. I take a new look at Groovy. Under the lead of Guillaume Laforge, it has totally morphed into one of the coolest dynamic languages out there. Groovy is now a true dynamic language, with a seriously good feature set. All my favorites from Python and Ruby are there. And it runs, fast, on the JVM. What can I say ? It looks almost perfect. Of course, writing such a language is a difficult task, and things could get ruined by weird lexical scoping interactions with closures or things like that (no, I’m not thinking of any particular language, Ruby). But at first sight, it looks very, very good. Could I ask for more ? A nice web framework running on top of the JVM ? Well, Grails sure looks promising, I can’t wait to try it on a real application ! Performance-wise, my first benchmarks tell me we’re far above Ruby on Rails. Oh, and IntelliJ supports Groovy and Grails with a pretty cool plugin, too. So I guess life as a web developer is now perfect, end of story.
And now for the punch line : I have now reached a position at work where I’m supposed not to code any more, at all.
WHAT ? All those years of sweating, fighting and yearning, and now that a good web development stack is out in the wild, I’m supposed to retire as a developer and become a friggin’ MANAGER ? I tell ya, there’s no justice anymore in this world.
Oh, and 13 years later, my browser still hangs for more than 15 seconds as soon as I try to load a page with an applet. Meanwhile, Flash objects load almost instantly, without any problem…


