What’s new in Java 1.6.0_21 ?

Why, a new Oracle-branded Java Web Start splash screen, of course !

Before (up until 1.6.0_20) :

After :

It takes time to get used to it…

To be fair, this update contains truly interesting things, like a new version of the HotSpot VM (17.0), a better-performing VisualVM and tons of bug fixes.

Check the release notes (notice the improvements in the Java web site since Oracle took over). You can compare them with the previous ones and derive whatever conclusions you want :)

Pure bliss with MongoDB

I’ve been playing with MongoDB lately, and this morning I came across this blog post from Eliot Horowitz, showing how you can stream Twitter into MongoDB in a single command. How cool is that ?

What’s interesting here is that you get an awesome testbed for MongoDB, since you get a stream of around 30 highly structured items per second (this is the sample stream ; to get the true Twitter stream, which apparently is 20 times faster, you need special credentials). A few minutes after your first download of MongoDB, you have enough data to experiment with various things.

The first thing I played with was replication. Eliot shows how to set up a master instance which is dedicated to soaking up the Twitter stream, while a slave instance asynchronously receives the tweets from the master and processes them.

Master-slave replication in MongoDB is very simple to set up, as demonstrated by Eliot : you start the master server with the --master command line switch, and the slave server with --slave --source <master address>. That’s all, the slave will eventually replicate all the data from the master. For large databases, you can preload the slave with a dump of the master and it will only replicate what’s new since the dump (given the right command line switch).

To use MongoDB, you can write code in your language of choice (provided there is an implementation of the MongoDB driver for it). But you can also use the MongoDB JavaScript shell, which puts a lot of other databases’ CLIs to shame.

Next, of course, I played with queries and updates. Queries are pretty rich, with lots of useful JSON-aware operators (more on that later). MongoDB has some update facilities that look extremely useful, like atomically incrementing a number or adding an item into a set.

The only thing with queries and updates is that the query expressions may look and feel a little weird at first. But there is a solid and coherent design which ensures that, once you’ve understood its principles, it all feels very simple and natural.

Things become very interesting when you delve into the indexing features. For example, in a few minutes of reading the documentation, I could build a geospatial index of the tweets and query them :

// ... the first time, wait ~6 seconds for 475,000 tweets ...
db.twitter.ensureIndex({"geo.coordinates":"2d"});

// get the 3 nearest tweets from my house
db.runCommand( { geoNear : "twitter" , near : [48.858009,2.451625], num : 3} );

MongoDB knows how to index structured values like arrays or maps. For instance, the index stores each value in an array so that you can query for the presence of a value, or any or all values from a list. This means that you can easily build your own full text index :

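// the tokenizer ; in a real-world text index you would be much more clever
tokenize = function(text) {
  // match() returns null when the text has no word characters, so fall back to an empty array
  return text.toLowerCase().match(/\w+/g) || [];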
};

// a query that gets all tweets with some text (yes, there are tweets without text)
tweets_with_text = function() {
  return db.twitter.find({text:{$exists:true}});
}

// this function stores all tokens extracted from the text into a document field
indexTweet = function(tweet) {
  tweet.tokens = tokenize(tweet.text);
  db.twitter.save(tweet);
}

// Let's go ! This may take a little while...
tweets_with_text().forEach(indexTweet);

// We build the token index
db.twitter.ensureIndex({tokens:1});

// And now we can query the index !

// This returns the text from 5 tweets containing the word "nicolas"
db.twitter.find({tokens:"nicolas"},{text:1}).limit(5)

// This returns the full data from 5 tweets containing the words "mongodb" or "nosql"
db.twitter.find({tokens:{$in:["mongodb","nosql"]}}).limit(5)

// Returns tweets with "mongodb" AND "nosql"
db.twitter.find({tokens:{$all:["mongodb","nosql"]}}).limit(5)

// Later, we can incrementally update the index :
// we only need to index the documents without tokens.
db.twitter.find({text:{$exists:true},tokens:{$exists:false}}).forEach(indexTweet);

// Of course we can mix & match queries :
// Get all tweets containing "paris" near my home
db.twitter.find({tokens:"paris","geo.coordinates":{$near:[48.858009,2.451625]}},{text:1}).limit(5)
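To make the idea of indexing each value of an array more concrete, here is a minimal sketch in plain JavaScript, runnable outside the MongoDB shell. It is only a mental model, not how MongoDB implements its indexes : the sample tweets and the buildIndex, findAny and findAll helpers are all made up for the illustration, with findAny playing the role of $in and findAll the role of $all.

```javascript
// Same naive tokenizer as in the shell session above.
function tokenize(text) {
  return text.toLowerCase().match(/\w+/g) || [];
}

// Hypothetical sample documents, standing in for the twitter collection.
const tweets = [
  { _id: 1, text: 'MongoDB is a NoSQL database' },
  { _id: 2, text: 'Playing with mongodb this morning' },
  { _id: 3, text: 'NoSQL meetup tonight' },
];

// One index entry per token : token -> set of document ids containing it.
function buildIndex(docs) {
  const index = new Map();
  for (const doc of docs) {
    for (const token of tokenize(doc.text)) {
      if (!index.has(token)) index.set(token, new Set());
      index.get(token).add(doc._id);
    }
  }
  return index;
}

// Ids of documents containing ANY of the tokens (what $in expresses).
function findAny(index, tokens) {
  const ids = new Set();
  for (const token of tokens) {
    for (const id of index.get(token) || []) ids.add(id);
  }
  return [...ids].sort((a, b) => a - b);
}

// Ids of documents containing ALL of the tokens (what $all expresses).
function findAll(index, tokens) {
  if (tokens.length === 0) return [];
  const [first, ...rest] = tokens.map((t) => index.get(t) || new Set());
  return [...first].filter((id) => rest.every((s) => s.has(id))).sort((a, b) => a - b);
}

const index = buildIndex(tweets);
console.log(findAny(index, ['mongodb', 'nosql'])); // ids 1, 2 and 3
console.log(findAll(index, ['mongodb', 'nosql'])); // id 1 only
```

The point of the sketch is that both kinds of query only touch the entries for the requested tokens, never the text of the documents themselves.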

But wait, there’s more ! MongoDB implements MapReduce and supports sharding. So this could pave the way for a new world of scalable yet easy computing. I’ll get back to this in a future post.

A lot of features in MongoDB are still a work in progress. Lots of improvements are expected in concurrency, replication, features like full text search (because the above hack is, well, a hack), and performance.

For instance, the embedded JavaScript interpreter is not thread safe, which means concurrent server-side JavaScript execution is not possible yet (it seems there are plans to support V8, which would be awesome). As a consequence, the current MapReduce implementation doesn’t support parallelism on a single machine, which prevents the efficient use of multiple CPU cores. But the project is in active development, so watch the roadmap for future improvements.

To conclude, don’t be shy, download MongoDB now, use Eliot’s Twitter trick to fill a database and have fun experimenting with one of the most interesting NoSQL databases out there !

pytst 1.18

I’ve just released pytst 1.18 which fixes a bug reported by Keith Davidson. The bug was in the scan method and appeared in some very specific yet totally mundane circumstances : if you scanned an input string of a very specific length with regard to the content of the tree, then the first scan would be ok, but the next ones not. This was something that was not in the unit tests and that managed to survive many years before being reported.

The code for the scan method (in tst.h) was one of the most complicated pieces of code in this project, and I had quite a hard time figuring out what was happening. In the process, I began to realise that a rewrite would be necessary, so I wiped the slate clean and entirely re-designed the algorithm, this time with careful planning.

While doing this, I realised that the old algorithm was much more wrong than I thought. There is a catch with the new algorithm, however : it has to do some backtracking. That is to say, under some conditions, the algorithm has to go back and re-read part of the input string.

The old algorithm never had to look back – but the old algorithm would miss some matches if you had dictionary entries that included multiple smaller dictionary entries. For example, in a dictionary with {'111','222','111222111'}, if you provided the input string '11122211' (notice the missing 1 at the end), then you would have a match on '111', then a non-match on '22211'. The match on '222' would be missed.
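To show what a correct scan should report on this example, here is a small reference scanner in JavaScript. This is not the pytst code : it uses a plain character trie instead of a ternary search tree, the buildTrie and scan names are made up, and it simply takes the longest dictionary entry at each position. It does show where the backtracking comes from, though : the walk runs ahead hoping for a longer entry, then has to return to the end of the last complete one.

```javascript
// Plain character trie, for illustration only (pytst uses a ternary search tree).
function buildTrie(words) {
  const root = { children: new Map(), word: null };
  for (const w of words) {
    let node = root;
    for (let k = 0; k < w.length; k += 1) {
      const ch = w[k];
      if (!node.children.has(ch)) {
        node.children.set(ch, { children: new Map(), word: null });
      }
      node = node.children.get(ch);
    }
    node.word = w; // marks the end of a dictionary entry
  }
  return root;
}

// Returns a list of [text, isMatch] pairs covering the whole input.
function scan(root, input) {
  const out = [];
  let pending = ''; // characters that belong to no match so far
  let i = 0;
  while (i < input.length) {
    let node = root;
    let j = i;
    let lastEnd = -1; // end of the longest entry seen during this walk
    while (j < input.length && node.children.has(input[j])) {
      node = node.children.get(input[j]);
      j += 1;
      if (node.word !== null) lastEnd = j;
    }
    if (lastEnd !== -1) {
      if (pending !== '') {
        out.push([pending, false]);
        pending = '';
      }
      out.push([input.slice(i, lastEnd), true]);
      i = lastEnd; // backtracking : characters in [lastEnd, j) will be read again
    } else {
      pending += input[i];
      i += 1; // also backtracking whenever the walk went further than i + 1
    }
  }
  if (pending !== '') out.push([pending, false]);
  return out;
}

const dictionary = buildTrie(['111', '222', '111222111']);
console.log(JSON.stringify(scan(dictionary, '11122211')));
// prints [["111",true],["222",true],["11",false]]
```

On the first walk the scanner reads all 8 characters while chasing '111222111', then falls back to position 3, which is exactly the re-reading described above.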

For the moment, I prefer to release pytst 1.18 with an algorithm that is correct yet not optimal, and try to find a non-backtracking solution later. From my measurements, the impact on performance is around 10% but YMMV.

For the record, the new algorithm looks like this :

There is backtracking in two places : (J) and (K), where the input string index i is modified. To prevent backtracking, I have to find a way to analyse the tree so that I emit any match included in the current state, then change to a proper state. It’s what Aho-Corasick is supposed to do, but I need to have a better look at it since I obviously got it wrong the first time.

You can download pytst 1.18 here.

A case against Twitter

OK, I’ve put up with enough BS about Twitter. This thing has been growing on me for months ; usually I can vent it off, but this time I can’t : a tiny thread was the last straw for me.

Short version of this post : Twitter basically provides a crappy infrastructure for broadcasting short messages on the Web, leaving it to everybody else, be it their users, people with too much time on their hands or even companies, to invent their own usage patterns and implement features by themselves. Meanwhile, Twitter sit and watch the world revolve around them, occasionally trying to cope with the unbearable mess that their infrastructure seems to be. The worst thing is that for each usage pattern that someone else is kind enough to add to Twitter, there already exists a better alternative outside Twitter, sometimes years or decades older. Yet all this doesn’t seem to slow Twitter down a bit from passively stealing wind from innovative companies.

There, I said it.

Now, for the long version.

1) Twitter’s innovative model is ultimately awkward for social networking

The core of Twitter is : « say whatever you want to the world, in less than 140 characters ». In the standard configuration, everything you publish can be seen by anybody. On top of that, you can follow people, that is to say that you can asymmetrically « tune in » to what a Twitter user says and receive it on your Twitter home page, so that what they say is not lost amongst the blabber of everybody else.

I know that you can create protected profiles, I know you can send direct messages (DM), I know you can block people, all that stuff. Those are not core features, in the sense that they were added to address shortcomings of the initial offering, rather than to innovate along the lines of the core features. In a world without Twitter, you want to DM ? Try e-mail. You want protected profiles ? Try Facebook, Google Groups, Friendfeed groups, etc.

( As an aside, to an old, born-in-the-70s guy like me, Twitter sounds a lot like the CB radio of the web. It’s a medium, and like any medium it needs some content, some usage patterns to have some interest. In France at least, the killer app for CB Radio was an informal, real-time, dynamic community dedicated to reporting police speed traps. Now it has been re-invented with GPS and GPRS. )

Granted, this combination of short messages, public broadcasting and asymmetric subscription is an original creation from Twitter, coming just at the right time, when the blog craze seemed to fade away (remember the blog craze ? back in 2004 ? one blog for each human being and all that stuff ?). If people can’t be bothered creating true blogs, if people suffer from the empty text area syndrome (the modern version of the blank page one), well, let’s take the drama out of this life streaming stuff and just let people tell what they just ate in a few words.

This was the « What are you doing ? » period of Twitter. It was a meme, it was fun, but once you’ve read your 10th cleaning the bathroom tweet, the excitement disappeared. The point is that Twitter was pretty fun in the sense that you could (and still can) reach random people on the Web and find out what they are doing… but except for a few outliers which ensure that Twitter gets some media coverage (like Twitter CEO’s pregnant wife who tweeted during labour), you soon discover that a random person on Twitter is, well, random, and you wish you could get a little bit more feedback from this…

…then you turn to Facebook because, well, that’s the place where your friends/colleagues/future-ex-boss-that-you-forgot-you-befriended are, where you can trade pictures, say bad jokes and naughty things and try to outsmart each other on crappy IQ tests. Or you turn to Myspace, Orkut, whatever : it’s not the brand, it’s the fact that a richer social network exists, complete with more or less useful apps that you can use INSIDE the network, that it allows for true conversation with more or less privacy depending on the context, that it’s rich with images, songs, videos, that it’s so much MORE interesting than the awkward, public, text-only chat you can have on Twitter. Hey, even if all you need is a good old fashioned text-only chat, there are far better alternatives to Twitter, beginning with the chat systems that were introduced in Facebook or Myspace.

So at the end of the day, this whole « public broadcasting your thoughts » thing seems a bit awkward. Of course some people are perfectly happy to whistle in a hurricane and expose everything about their life, and I’m fine with them, but Psychology 101 says most people want to be heard, and answered (if you don’t agree with this, imagine closing the comments and removing your Google Analytics tag on your blog, for instance). Twitter has an informal social network, of course, but it is quite specific (asymmetric following) and much less engaging than the competition out there. This is true to the point that you have to turn to third party applications like Friend or Follow (once again showing the total apathy of Twitter, letting other people do the job) to get some sense of what’s going on in your trust network. Not good.

One symptom of this problem is the fact that teens don’t tweet. If teens don’t see any social value in Twitter, it’s a pretty good indicator that something is wrong, and that with time things will look even worse for Twitter.

Next in line, the 140 character limit. You know what ? In your favorite social network, I’d bet that most of the time people already write messages that are 140 characters or less, except nobody brags about it, because this is how people communicate anyway. What was a kind of artificial constraint providing a creative twist, an urge for conciseness in a blogging context, becomes totally uninteresting in a conversational context, where people have already internalized the 160-character limit of SMS. The problem is that this limit actually prevents people from easily sharing interesting things, more on that later.

So, to sum up, Twitter was initially built as an alternative to blogs, grew some kind of original social network features, but in the end I can’t help but think that the result is awkward and that there are far better mainstream offerings out there.

2) Twitter is also awkward for sharing stuff on the web

Another popular usage pattern built around Twitter is the « share stuff on the Web » pattern : share URLs, share pictures, share videos… Well, guess what, you just CANNOT do that in Twitter per se, not easily at least. You have to use URL shorteners, third party web applications like Twitpic, etc.

OK, Twitter made bit.ly its default URL shortener last May, so that pretty much solves the issue of sharing URLs. Except that it’s a wart on the system and it hides the destination URL so that you don’t know whether you are going to land on a NSFW page, all this because of the 140 chars limitation. And I still can’t easily share pictures or videos, they stay out of the loop, on an external system. Compare with this variation on the theme of « I just ate », powered by the micro-blogging platform tumblr.

The net result is that you are using a patchwork of interdependent patchy services with unknown viability to share things with people. You are subject to the infrastructure or financial problems of all those services (think about what happened with tr.im). And if you want to get some feedback about what you shared, you’re on your own trying to build a conversation (more on that later) and see if some people retweet what you shared.

Once again, there are perfectly valid alternatives out there if you want to share stuff : Digg, Google Reader + Google Notes, FriendFeed, del.icio.us, StumbleUpon, Yoono, and countless others. Each one of these services is dedicated to the task of reviewing what some people have discovered and giving your opinion about it (« I like it », « Favorite », rank it). Each one is perfectly aware that something other than bursts of 140 characters exists, and handles image and video sharing in a handy way. Each one of them allows for some easy discussion of the subject at hand. In short, those services were properly engineered for sharing stuff on the Web. I’m sorry, but Twitter is not.

A fine example of the inadequacy of Twitter here is the retweet (RT) usage pattern, which tries to emulate features like « I like » that you can find in other applications, except that those applications handle it way better (I won’t bother you with the details of why RT fails, I’m sure you have already experienced the awkward feeling of retweeting something that was already retweeted).

Once again, we see people struggling to work around the limitations of Twitter by inventing usage patterns. This is very cool if we see all this from a social science experimental point of view, I guess some people could get a PhD out of this, and we all feel like pioneers exploring the Wild West while doing that. The sad truth is that this Wild West is nothing but an artificial sandbox which happens to be under the scrutiny of mass media. Meanwhile, some real innovation and progress takes place somewhere else, out in the real world (FriendFeed, Google Wave, PubSubHubbub to name a few interesting spots).

By the way, the funny thing is that I actually tried to fix RTs by working on a collaborative ranking system for Twitter : Tweetraise. It was very fun to write, I learned a lot about Twitter and Google App Engine, but my end conclusion is : why bother ? Why would I spend time trying to fix Twitter for free when other very good services already exist ?

3) Twitter is awkward for conversations

Twitter doesn’t have any proper conversation system. It doesn’t thread tweets, even though there is a trace of threading attributes hidden in the Twitter API (see the in_reply_to_status_id element there). You end up either DMing or publicly replying to tweets with an @name mention and letting the readers find out what you are replying to. You get absolutely no help from Twitter here ; initially there wasn’t even a reply button in the web interface (this feature was actually touted by third party applications !).

So once again, people try to work around this with mentions, hashtags, and in the latest installment of The Twitter Circus, a tiny thread. Once again, it kinda works and people are delighted to see « innovation » in Twitter-land for features that have existed for decades in Usenet or IRC.

You know, it’s pretty sad to watch people struggle to single-handedly re-invent Google/Yahoo Groups, FriendFeed, Usenet or IRC within the limitations of Twitter. What’s the point ? OK, it’s some good hacking fun (like I wrote, I went through this and it was nice), but shouldn’t we try to actually build some open source based innovative infrastructure instead of trying to incrementally improve a system for the benefit of a private corporation ?

4) Twitter’s infrastructure is crappy by design

OK. I know « public broadcast with asymmetrical following » is not that simple to handle. I know that usage patterns leading to people with millions of followers are not really the easiest problem to address. I know that the rate of growth of Twitter is putting a big stress on their infrastructure. Plus, it’s not easy to withstand sustained growth, to survive DDoSes, and big pieces of news like the Iranian elections or MJ’s death. Most of all, I don’t really know what kind of money Twitter can mobilise to solve these issues, but it surely isn’t in the same league as Google.

But come on… I don’t need to remind you of the loooong track record of outages and performance issues of Twitter. You can follow nearly everything in their status blog, and it’s not pretty.

The problem is not that Twitter has infrastructure issues which could prevent people from telling the world whether they like spinach or not. It is that there are a whole lot of people who invest time and money in this, trying to provide innovative services on top of what is becoming a carrier business. They are ultimately relying on the Twitter infrastructure for their projects / business to work properly. Twitter is at the core of an ecosystem which depends on their capacity to run the basic infrastructure seamlessly, a single point of failure in a very centralized system. What if Twitter doesn’t have the resources to fix their issues, and to reach and maintain a high quality of service ? Will everybody suffer from this after having invested so much time and money in Twitter, all the while bringing it incremental improvements and valuation ?

Have a look at FriendFeed. Those guys managed to deliver an almost flawless experience from day one. They have the experience, knowledge and brightness required to build this kind of service. FriendFeed proves that you don’t need to have huge, Google-like teams and resources to build an innovative, scalable, real-time service. Yet Twitter could not deliver. Guess which service out-numbers the other 50 fold ? Twitter wins. Go figure. On the other hand, guess who was bought by Facebook ?

Of course, there are two solutions to this infrastructure problem : either Twitter vastly improve their infrastructure and monetization scheme, and tell the world how they feel they might be able to technically and financially reach the 100 million, 200 million, 500 million users bar, or they just let go of the infrastructure problem and turn Twitter into a decentralized protocol for sharing short bursts of text.

( To be frank, there is in fact a third scenario, which is that everybody keeps on blindly trusting Twitter to fix their problems and blissfully continues bringing value and users to the platform for the ultimate benefit of Twitter. )

What would Twitter become in a decentralized protocol scenario ? Well, it would be a bit like Hotmail, Yahoo Mail or Gmail in the e-mail business. A brand, an access portal competing with others on usability and features in the context of a common protocol.

But of course, this supposes that the world really needs something like a decentralized Twitter, at a time when far richer and more open initiatives like Google Wave or PubSubHubbub are building momentum…

Conclusion

Well, if you managed to reach that part (or skipped directly to it), congratulations and thank you :)

My conclusion is pretty much the same as the short version of the post (hopefully). I’m worried that Twitter is actually hurting innovation by diverting funds and well-meaning people, while actually failing to provide a useful user experience or to give any guarantee as to whether their infrastructure is reliable or not. Let’s hope that the hype around Twitter dissipates far enough to prevent further damage. The only other way out for Twitter is to actually reinvent themselves and start truly innovating instead of relying on the goodwill of API users. This is a tough endeavour given the potential horsepower behind big actors like Google Wave and Facebook/FriendFeed.

FriendFeed might actually benefit from Google Wave

With the current buzz about Google Wave, it is difficult not to feel sorry for FriendFeed. The common ground between Wave and FriendFeed is indeed pretty rich : real time communication, threaded conversations, lightweight social networking… Some people even suggest that there is some irony in seeing Google rip off good ideas from FriendFeed, the latter having been founded by ex-Googlers.

However, there is something peculiar about Wave : apart from the very Googly UI, the infrastructure is meant to be open and decentralized. Wave can be seen as a way to collaboratively edit an XML document (the wavelets, which are the basic communication nodes found in waves). The point is that this XML document can be replicated on many providers’ infrastructures, and updates are propagated through extensions of the XMPP protocol (XMPP is AKA Jabber, it’s an instant messaging protocol that Google Talk also uses with extensions).

Check out this paper about Google Wave Federation Infrastructure.

So, we are not really in a winner-takes-all scenario. Now, if I were FriendFeed, I’d have two choices.

1) Decide this is a Fire and Motion move from Google, aimed at first at Microsoft (to take the wind out of Bing), then Facebook, then FriendFeed. Keep away from this game, and slowly become an awesome app no one uses, dedicated to geeks and power users, and watch people have all the fun out there on Wave or Facebook.

2) Embrace the Google Wave Federation Protocol and build gateways and proxies that enable FriendFeed users to participate in Google Wave discussions without leaving their favorite UI. Of course this means that FriendFeed might become a victim of the Fire and Motion scheme : Google adds new stuff into Google Wave, FriendFeed runs to support it, then Google adds new stuff, and on and on, which means that FF could lose its technical leadership on the market. However, there is certainly room for more than one Wave provider, and if this new way of communication becomes mainstream, this could mean that FriendFeed would at last find a way out of Geekland and reach a wider audience than it currently has. Then launch a feature war with Google on the social networking or real-time search front, both fronts on which they have already proven that they could do very well.

So, depending on the choice they make now, I think Google Wave could actually be good news for FriendFeed, turning their awesome power-app into the second player in a potentially huge new market.

The difference between pytst and ctst

Putting away the difference in languages (C++ for pytst, pure C for ctst), the two projects are quite different with regards to the way data is stored in the tree.

pytst implements a textbook ternary search tree with an AVL balancing algorithm. While I was working on this project, I designed a scanning algorithm that I was particularly proud of, until I found out that it was already known as the Aho-Corasick algorithm. This scanning algorithm is handy when you want to match an input text against a huge quantity of search strings (which was the initial use case that led me to develop this).

The problem with pytst is that it is a very straightforward ternary search tree implementation, in that it requires at least a tree node per character per unique prefix in the input strings. Given all the metadata that can reside in a tree node (at minimum, we need the character plus up to three pointers to other nodes, plus one pointer for Aho-Corasick, plus anything required by the memory allocator), there is a huge memory overhead. The algorithms pytst implements may be pretty efficient, but the trade-off in memory is quite taxing.

While I was working for Yoono, we decided to store huge quantities of URLs in memory using a TST, since the hierarchical nature of URLs from the same domain was a good fit for this structure. This time, the implementation language was Java, but we developed some pretty nifty tricks to optimize memory usage ; basically I reimplemented malloc/free running on int[], char[] etc. This way we would reduce the load on the Java memory allocation, object header overhead, GC, etc. We could then store millions of nodes in memory. This was cool, and it worked well, but it was a bit stupid of me to optimize the allocation issues before optimizing what was allocated. Fortunately, a clever colleague from Yoono (Yann Landrin-Schweitzer) took over my work and went further, optimizing the data structure itself based on some discussions he, Laurent Quérel and I had together.
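For readers wondering what « malloc/free running on int[], char[] » can look like, here is a minimal sketch of the idea, written in JavaScript with typed arrays rather than Java. It is not the Yoono code, which is not shown here : the NodePool class and its field names are hypothetical. Nodes are just integer ids, their fields live in parallel arrays, and freed slots are chained in a free list so that they can be reused without involving the garbage collector.

```javascript
// A pool of ternary search tree nodes stored in parallel typed arrays.
// A node is an integer id ; -1 plays the role of a null pointer.
class NodePool {
  constructor(capacity) {
    this.chars = new Uint16Array(capacity); // the character of each node
    this.left = new Int32Array(capacity).fill(-1);
    this.middle = new Int32Array(capacity).fill(-1);
    this.right = new Int32Array(capacity).fill(-1);
    this.used = 0; // number of slots handed out at least once
    this.freeHead = -1; // head of the free list, chained through `left`
  }

  // The "malloc" : reuse a freed slot if there is one, else take a fresh one.
  alloc(ch) {
    let id;
    if (this.freeHead !== -1) {
      id = this.freeHead;
      this.freeHead = this.left[id];
    } else {
      if (this.used === this.chars.length) throw new Error('node pool exhausted');
      id = this.used;
      this.used += 1;
    }
    this.chars[id] = ch.charCodeAt(0);
    this.left[id] = -1;
    this.middle[id] = -1;
    this.right[id] = -1;
    return id;
  }

  // The "free" : push the slot on the free list.
  free(id) {
    this.left[id] = this.freeHead;
    this.freeHead = id;
  }
}

const pool = new NodePool(4);
const a = pool.alloc('a');
const b = pool.alloc('b');
pool.free(b);
const c = pool.alloc('c'); // reuses the slot that b occupied
console.log(a, b, c, pool.used);
```

With this layout a node costs a fixed number of bytes in a handful of big arrays, with no per-object header, which is the kind of saving described above.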

ctst started as a simple rewrite in pure C of pytst, motivated by the difficulties I encountered with compiling the C++ code of pytst with different compilers, and building bindings for Ruby (which I had started using at that time). I also wanted to separate as much as possible the code implementing the algorithms from the code implementing the storage, which was something we managed to do quite cleanly in Java. The ideas about optimizing the structure came up during the rewrite, so I finally decided to jump ship and implement them in ctst as well.

So how does ctst store data in the tree ? We took some of the ideas from Patricia trees and mixed them with the B-Tree principles. The result is a pretty compact and efficient structure. To illustrate it, let’s use a little example.

Suppose we want to store this list of words in ctst :

compacity
compact
compacted
community
commuter
commuters
compute

ctst builds this tree :

Compacity test

( This drawing was done automagically using the dump method, which generates a Graphviz .dot file. )

This gives 11 nodes for 7 strings. I won’t waste time drawing the ternary search tree equivalent, so you’ll have to trust me, but you would need at least 25 nodes. Of course the nodes don’t contain the same things, but the payload/overhead ratio is highly in favor of ctst.
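If you would rather check than trust me, the two figures can be recomputed with a few lines of JavaScript. This is only a counting sketch, not ctst’s actual layout : it builds a plain trie with one node per character, then counts how many nodes would remain if every chain of single-child nodes were merged, Patricia-style. The buildTrie and countNodes names are made up for the occasion.

```javascript
const words = [
  'compacity', 'compact', 'compacted',
  'community', 'commuter', 'commuters', 'compute',
];

// Plain trie : one node per character per unique prefix.
function buildTrie(list) {
  const root = { children: new Map(), terminal: false };
  for (const w of list) {
    let node = root;
    for (let k = 0; k < w.length; k += 1) {
      const ch = w[k];
      if (!node.children.has(ch)) {
        node.children.set(ch, { children: new Map(), terminal: false });
      }
      node = node.children.get(ch);
    }
    node.terminal = true; // a stored string ends here
  }
  return root;
}

// perChar : every non-root node, i.e. one per unique prefix.
// compressed : nodes left once single-child chains are merged, i.e. the
// nodes that end a stored string or where the tree branches.
function countNodes(root) {
  let perChar = 0;
  let compressed = 0;
  const stack = [...root.children.values()];
  while (stack.length > 0) {
    const node = stack.pop();
    perChar += 1;
    if (node.terminal || node.children.size !== 1) compressed += 1;
    stack.push(...node.children.values());
  }
  return { perChar, compressed };
}

console.log(countNodes(buildTrie(words))); // { perChar: 25, compressed: 11 }
```

The 25 is the number of unique prefixes, hence the lower bound for a one-character-per-node tree ; the 11 happens to coincide with the node count of the ctst tree for this word list, although ctst’s nodes hold more than a simple merged label.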

Of course I haven’t re-implemented all pytst algorithms in ctst, but now that I « finally » managed to correct a long-standing bug in the codebase, I’ll be able to get back to it. So stay tuned !

Mapping my personal web

My personal web

Everything starts with a slice of the Web, in the guise of a bunch of RSS feeds. Currently my blog roll hovers a little above 280 feeds. There are the broad blockbusters like Techcrunch, ReadWriteWeb, Techmeme, Presse-Citron (a French blog), the mainstream media (e.g. French newspapers), but there are also a lot of small, specialized, « vertical » blogs from developers, entrepreneurs, competitors and/or friends.

Of course I cannot cope with the quantity of content that those ~280 feeds fetch. Some people are very anxious about reading everything in their blog roll, which for instance led to the introduction of a new anti-feature in Google Reader : « hide unread count ». I was a bit like that back in pre-history (circa 2004) but nowadays I just tell myself that I won’t be able to read everything, c’est la vie.

One thing is sure, I won’t miss anything important thanks to the « echo chamber » effect. As soon as big news hits the interpipes, I see it multiplied a few dozen times on many feeds, and even if I missed it during the first 24 hours, I get a second chance with the French mainstream media which are often 48 hours late.

So each day I get the big news fast and I can quickly skip over the echoes. Then I get a bunch of small news, « vertical » articles, things that sit on the border between work, geeky things, technical articles, friends’ updates and so on.

My main tool for reading all this is Google Reader. More precisely, I mainly use the iPhone version of Google Reader during my daily subway trips, which gives me around 60 to 90 minutes a day (morning + evening) to do some triage and select interesting items to review later at my home or office (I star those items in Google Reader), or items to share right away.

The iPhone version of Google Reader is especially well designed for triage : the app has a mode in which it only displays 15 unread items at a time, and there is a « mark as read » link which marks all those items as read and loads 15 new items right away. So that’s what I do in the subway : just skip items whose title doesn’t seem interesting, read interesting items, star those I want to come back to later on a bigger screen or share them right away. Believe me, since I began doing this, I’ve never been bored while in the subway or while taking a cab :) . Google Reader tells me that in the last 30 days, I’ve read 5,384 items, which gives nearly 12 screens of 15 items a day.

BTW, I guess the batch loading mode of the iPhone app was designed to address connection problems, and it’s especially effective in the subway. For example, on my daily subway trip, there is a weak spot in my operator’s coverage for one Metro station, so I take care of loading a batch just before entering the station. This way, I don’t need the connection while in the weak spot. I sometimes get frustrated when I go on Metro lines that I don’t know as well, though :)

The items that I share are reviewed by friends directly on Google Reader, but they can also be found on a public RSS feed that is injected into FriendFeed for comments there, too. The RSS feed is also injected into one of our corporate blogs powered by WordPress (+ FeedWordPress) so that my colleagues and I can discuss it far from the prying eyes of our competitors :) . Things could change now that Google Reader has a true commenting system with some privacy features, but having a private blog hosted by ourselves is much more reassuring for now.

For fun, I’ve got a few more services aggregated into FriendFeed : my Amazon wish list (feel free :) ), my public activity feed from github (I’ve got two projects there, though I’m not very active), LinkedIn feeds, etc.

Finally, FriendFeed posts everything to Twitter (except what’s coming from Twitter, of course, the echo chamber metaphor should remain one), so that I don’t have to do it myself :) . To be frank, the reason I made FriendFeed push to Twitter is that for now, I don’t really use Twitter, but it seems to be quite important nowadays… So this is a cheap way to push updates on Twitter without going all the way into micro-blogging mode.

So, does anybody have some more tricks to share about how to manage the gazillion tons of information the Web throws at us each day ?

pytst 1.17

pytst 1.17 is out !

The source code can be fetched using git. If you prefer tarballs, head to the downloads page, where you’ll also find Win32 binaries.

Included in this release :

  • Support for 64-bit architectures (tested under Linux only) – thanks to Thomas Brox Røst for the patch.
  • The test script (in python/test/test.py) is now self-contained, no more nasty references to a « tcc » module which was used at my company. Thanks to Paul Harrington for the prodding about this :) , and for our first fork / merge test.

BTW, github is really, really cool. The only thing missing is an issue tracker. Maybe in a future release ?

pytst now hosted on github

I’ve created an account on github and ported my private SVN repository to a public git repository. Now you can fork the project as you want and eventually ask me to pull your modifications (using the « send pull request » functionality). The wiki will be a good place to document the library.

Hopefully this is the beginning of a new life cycle for pytst : I keep getting bug reports and enhancement requests from time to time, but I don’t have enough spare time to address them.

Here is the repository home page : http://github.com/nlehuen/pytst/

pytst 1.16

I’ve just released pytst 1.16, you can download it there.

This release fixes an annoying bug that occurred if you used CallableAction or CallableFilter. If your Python callback functions raised a Python exception, the whole process crashed. This meant that you had to catch all Python exceptions in the callback, which was not always handy. The exceptions now behave as expected, that is to say they are passed from the callback to the Python calling code through the C++ layer.

Thanks to Keith Davidson who reported this bug. Keith also reported a problem with NULL characters inside the keys : keys seem to be handled as NULL-terminated strings. It looks like there is a problem in the SWIG layer, since the C++ code doesn’t assume strings are NULL-terminated. I’ll have a look at this problem ASAP.