Hello again, Java!

NOTE: This is a restored version from this archive.

For my biggest current project at work, I’ve decided to get back in the flow and write the code in Java. These are a few reasons why I chose to use Java instead of Python for this particular occasion:

  • Performance. I literally have to parse gigabytes of fixed-width formatted data, perform business logic and computation on it, then spit out millions of rows into output files and RDBMSes. Even when restricting the scope of my benchmark to the first part (data parsing), I get at least 5 times and up to 20 times better performance using Java versus Python + Psyco.
  • Productivity. If performance were all that mattered, I would have done it in C++, but hey, I really need to focus on the task at hand and not fight with the compiler. Plus, in C++ it’s difficult to find, test and integrate third party libraries. In Java I could find and test two RETE engines (Jess and Drools) in less time than I would have spent just figuring out how to build or integrate an obscure open source library or a very expensive commercial one.
  • Of course, Python is highly productive, but in this corner case it’s not performing well enough. Plus, I have to confess with a shudder that it’s not enterprisey enough for the context of this project, meaning that if I tell my client that the project is implemented in Python, I’ll get a lot of worried faces. Some people will be worried about how they will be able to maintain a project developed in a language they don’t know (yet), and some other people will just be worried because Enterprise©™ means Java and nothing else.

Of course, after having spent two years coding exclusively in Python and C++, I was quite happy to come back to my former favorite programming language. I wrote my first applet for the beta version of HotJava, running on an SGI workstation at my school back in 1995, and was thrilled to see the same bytecode run on a Sun workstation, then later on my own PC. Write once, run anywhere, yay! I then proceeded to learn and use this language for 8 years, until I decided to have a taste of Python in 2003 and subsequently stopped writing Java code :).

These two years of using Python and C++ in professional projects have been quite instructive, to say the least. Coming back to Java, I can’t help but notice huge limitations and design issues that plague the language and the platform. As long as I drank the Kool-Aid, I didn’t see all this, but coming back from abroad allows me to make a few remarks on Java, for what it’s worth.

The good

First, the good things.

Performance. Java was already darn fast, and it feels like it is even faster now. Granted, the JVM initialization time is huge, but once it gains momentum, it really flies. I’ve been doing some small benchmarks with my friend Laurent Quérel, and Java can now clearly compete with C++, and even surpass it, since its memory management system is highly optimized.

The language is simple. The Java language is clean and simple, even with the generics and annotation system that Java 1.5 brought. Java and Python are equally simple to me, at least from the perspective of a wannabe C++ hacker. I’d even go further and say that Python may be more complex and sometimes ugly looking, with things like the __init__ constructor name, subtleties with __new__, metaclasses and so on. Note that I’m writing about the language, not the libraries; we’ll get to those later.

Tools. As soon as I decided to use Java once again, I bought a license of IntelliJ IDEA. This IDE is simply amazing. There are pretty good open source IDEs out there; Eclipse and NetBeans come to mind. I’ve tried them both, and they are good. But IDEA just blows them away.

Anyway, I don’t care which IDE is the best. My point is that the simplicity of Java and the fact that it is a statically typed language give birth to really helpful IDEs, with top notch code completion, really helpful refactoring support, and so on and so forth. This is not the case for Python, because with a dynamically typed language it is very hard for IDEs to understand the code being written and to assist the developer (and yeah, I know about the hypothesis that Python is such a good language that refactoring isn’t needed. Well, duh!). C++ has better IDE support (I’ve heard that MS Visual Studio 2005 has a vastly improved Intellisense system that fully groks templates) but it seems that mainstream refactoring support is scarce. So nowadays, only Java allows you to fully stretch and mold your code without breaking anything. Plus, having good IDE support is really helpful for newbies.

Libraries. There are more Java libraries out there than you can shake a stick at. Which could be a good thing, if they weren’t so darn complicated, but I’ll come to that in the “bad” part. Meanwhile, let’s enjoy the fact that almost anything you can think of is already implemented once or twice in Java, usually in an open source fashion.

Centralized and steady evolution of the standard. Some may despise this, but the fact that Sun is still holding the reins of the development of Java and its peripheral APIs means that the language still evolves way, way more rapidly than C++ does. Remember this article about C++0x? One version every ten years isn’t going to help C++ survive in the current frenzy. And it’s not only a matter of survival: C++ could really use a little bit of improvement here and there.

The bad

Everything has to be an interface or a factory. Really, be it in the standard library or in third party APIs, everything has to be an interface or a factory. It looks like, to the Java developer, concrete code is bad, abstraction is goood.

Heck, I was easily doing what needed to be done in a few lines of hyper-concrete Python code, but now that I’ve come back to Java, I’ve gone back to putting interfaces everywhere and feeling bad as soon as I need to give a real class name to instantiate instead of going through a series of factories to do the same thing. Which is stupid anyway, because without a good configuration system (which takes a lot of time to write or integrate, believe me, I’ve done both), you HAVE to instantiate concrete classes anyway if you want to get something done.

To add insult to injury, factories are usually implemented with Singletons or static methods, which are among the most static constructs of a program… What? You don’t abstract around your factory class? What if you want to change it? Maybe you should provide a factory for your factories? Actually, IIRC a situation like that exists in the JNDI API…

I guess that this problem tends to show that Java developers secretly want to code in Python and benefit from the freedom that multiple inheritance and dynamic typing provide. Alas, we are stuck in Java, and we need to make sure we have interfaces everywhere, in the hope that this will help us change anything in our code.

This is really ridiculous. I was doing a few tests with the Google Adwords API, just have a look at the code needed to make a simple API call. There is something very weird in having to build a factory instance (no singleton / static method this time) to build an instance that implements an interface for which we know there’s only one implementation ! A single constructor with the proper connection parameters would suffice, but no, that would be waaaaay too concrete.
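To make the complaint concrete, here is a minimal sketch of the pattern. Every name in it (ReportService, HttpReportService, ReportServiceFactory, the endpoint URL) is made up for illustration; this is not the AdWords client code, just the general shape of a factory instance wrapped around the only implementation of an interface.

```java
// Hypothetical names throughout; this is NOT the AdWords client API.
interface ReportService {
    String fetchReport(String campaignId);
}

// The only implementation that will ever exist.
class HttpReportService implements ReportService {
    private final String endpoint;
    private final String token;

    HttpReportService(String endpoint, String token) {
        this.endpoint = endpoint;
        this.token = token;
    }

    public String fetchReport(String campaignId) {
        // A real client would perform a remote call here.
        return "report for " + campaignId + " from " + endpoint;
    }
}

// The factory adds a layer without adding any choice.
class ReportServiceFactory {
    private String endpoint;
    private String token;

    void setEndpoint(String endpoint) { this.endpoint = endpoint; }
    void setToken(String token) { this.token = token; }

    ReportService create() {
        return new HttpReportService(endpoint, token);
    }
}

class FactoryDemo {
    // The ceremony: build a factory, configure it, ask it for the service.
    static String viaFactory() {
        ReportServiceFactory factory = new ReportServiceFactory();
        factory.setEndpoint("https://example.invalid/api");
        factory.setToken("secret");
        ReportService service = factory.create();
        return service.fetchReport("42");
    }

    // The concrete version: one constructor call with the connection parameters.
    static String viaConstructor() {
        return new HttpReportService("https://example.invalid/api", "secret")
                .fetchReport("42");
    }

    public static void main(String[] args) {
        System.out.println(viaFactory());
        System.out.println(viaConstructor());
    }
}
```

Both paths end up with the same object doing the same work; the factory only buys something on the day a second implementation shows up.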

I don’t have anything against the Google Adwords API per se, it just happened to be the latest example that I’ve seen of this widespread disease. For another symptomatic example, see this documentation about the Jakarta connection pool library and cry (there’s even a bit of IoC through an XML document, because you know, instantiating concrete classes in Java code is baaaaaaad). Once you’ve regained a bit of eyesight, have a look at an equivalent in Python. “But it’s not as configurable!” I hear you yell. Yes it is, just read the documentation. It just so happens that the defaults the code provides are sensible, so 80% of the time you don’t have to worry about them unless you find out that a configuration issue in your connection pooling is really the source of your problems.

No syntax sugar for tuples, lists, sets and maps. In Java, if you want to build a list with 5 elements, you have to instantiate the list and add the 5 elements. You cannot even chain the 5 add calls. Likewise for sets or maps. I understand that in light of the previous items, introducing some syntax sugar to build lists would force the compiler to choose - aaaarghh - a concrete implementation of the java.util.List interface, and that would be taboo. But come on, there must exist clever ways to have a nice syntax for lists and maps all the while retaining the sacrosanct implementation independence!
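For readers who have not written Java in a while, this is roughly what it looks like in Java 5-style code. The class and method names (CollectionLiterals, verboseList and so on) are invented for this sketch. It also shows the closest thing to a list literal I know of in the standard library, Arrays.asList, which has no counterpart for sets or maps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CollectionLiterals {
    // No literal syntax: instantiate, then add one element per statement.
    // List.add returns a boolean, so names.add("a").add("b") does not compile.
    static List<String> verboseList() {
        List<String> names = new ArrayList<String>();
        names.add("a");
        names.add("b");
        names.add("c");
        names.add("d");
        names.add("e");
        return names;
    }

    // Partial workaround: Arrays.asList takes varargs since Java 5.
    // On its own it returns a fixed-size list backed by an array,
    // hence the copy into an ArrayList to get a growable one.
    static List<String> shorterList() {
        return new ArrayList<String>(Arrays.asList("a", "b", "c", "d", "e"));
    }

    // Maps have no such helper: one put call per entry.
    static Map<String, Integer> verboseMap() {
        Map<String, Integer> ages = new HashMap<String, Integer>();
        ages.put("alice", 30);
        ages.put("bob", 25);
        return ages;
    }

    public static void main(String[] args) {
        System.out.println(verboseList());
        System.out.println(shorterList());
        System.out.println(verboseMap());
    }
}
```

The Python counterparts are the literals ["a", "b", "c", "d", "e"] and {"alice": 30, "bob": 25}.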

Everything must be an object. Code without a class does not exist. You want to implement a callback function, bam, you have to implement an interface. Granted, you can do it anonymously, but this uses an awkward syntax, and generates yet another class that the JVM must load and instantiate just to execute one function. Please, I need first-class function objects!
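Here is a small sketch of what a callback costs in Java 5-style code. LineHandler and CallbackDemo are hypothetical names chosen for the example; the point is only the shape: an interface, an anonymous class implementing it, and a workaround to get a result out of the callback.

```java
// Hypothetical callback interface; the name is illustrative only.
interface LineHandler {
    void handle(String line);
}

class CallbackDemo {
    // Calls the handler once per line.
    static void eachLine(String[] lines, LineHandler handler) {
        for (String line : lines) {
            handler.handle(line);
        }
    }

    static int totalLength(String[] lines) {
        // An anonymous class can only capture final local variables,
        // so a one-element array is used as a mutable holder.
        final int[] total = {0};
        // The compiler emits a separate class file for this anonymous class
        // (typically named CallbackDemo$1).
        eachLine(lines, new LineHandler() {
            public void handle(String line) {
                total[0] += line.length();
            }
        });
        return total[0];
    }

    public static void main(String[] args) {
        System.out.println(totalLength(new String[] {"ab", "cde"}));
    }
}
```

Five lines of ceremony wrap the one line that does the work, which is the awkwardness the paragraph above is complaining about.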

As a consequence of the two previous items, there is no functional programming support in Java. Even if it could be done with an interface defining a one parameter function returning an object, there is no way to map a function over a collection other than iterating over it manually. It’s a good thing we have new-style for loops in JDK 1.5, then. What is a one-liner in Python, for example map(string.capitalize,["nIcOlAs","LeHuEn"]), becomes a chore in Java. Functional programming in Python is not perfect (lambda expressions have been controversial for years now), but it works, and it’s a real time saver when writing, testing and even reading the code.

It’s pretty symptomatic that there isn’t any standard interface to represent a single parameter function object. This pretty basic building block is not abstracted in the land of abstraction. It’s because Java doesn’t like functional programming, even in an OOP disguise.
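As a rough illustration, this is what the Python one-liner turns into when written by hand in Java 5-style code. Function1, MapDemo and CAPITALIZE are names invented for this sketch, precisely because the standard library offers nothing to reuse here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hand-rolled single-parameter function interface (hypothetical name).
interface Function1<A, R> {
    R apply(A arg);
}

class MapDemo {
    // A manual "map": iterate, apply, collect.
    static <A, R> List<R> map(Function1<A, R> f, List<A> items) {
        List<R> out = new ArrayList<R>(items.size());
        for (A item : items) {
            out.add(f.apply(item));
        }
        return out;
    }

    // Rough equivalent of Python's string.capitalize:
    // first character upper-cased, the rest lower-cased.
    static final Function1<String, String> CAPITALIZE = new Function1<String, String>() {
        public String apply(String s) {
            if (s.length() == 0) {
                return s;
            }
            return s.substring(0, 1).toUpperCase() + s.substring(1).toLowerCase();
        }
    };

    static List<String> capitalizeAll(List<String> names) {
        return map(CAPITALIZE, names);
    }

    public static void main(String[] args) {
        System.out.println(capitalizeAll(Arrays.asList("nIcOlAs", "LeHuEn")));
    }
}
```

It works, but the interface, the helper and the anonymous class all have to be written (or imported from a third party library) before the actual one-line intent can be expressed.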

Everything is not an object. You cannot store primitive types like int or long in containers. You previously had to wrap them in an Integer or Long instance, but now with the JDK 1.5 you don’t have to do it by hand, thanks to the new “autoboxing” feature. Of course wrapping an int in an Integer instance requires several times more memory (a full object plus a reference to it, instead of 4 bytes), but who cares? Well, people like the ones who wrote FastUtil do, and I’d like to thank them, because I do too. But wait, maybe the new generics feature could help us there? Well, unfortunately…

Generics suck badly. Really. For instance, you cannot use primitive types as type arguments. And even if you could, the result would NOT be an optimized ArrayList<int> class that would store ints and only ints, like you could expect from your C++ background. The generics implementation is merely a bit of syntactic sugar, and under the hood it’s always the same classes which store Object references. So you can get your ArrayList<Integer> and cry when watching your memory usage reach unforeseen heights, or turn to FastUtil.
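A short sketch of what “the same classes under the hood” means in practice. ErasureDemo and its method names are made up for the example; the behaviour shown (type erasure and autoboxing) is standard Java 5.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // After compilation both lists are plain ArrayLists holding Object
    // references: the type arguments do not exist at runtime.
    static boolean sameRuntimeClass() {
        List<Integer> ints = new ArrayList<Integer>();
        List<String> strings = new ArrayList<String>();
        return ints.getClass() == strings.getClass();
    }

    static int sumBoxed(int n) {
        // List<int> does not compile: primitives cannot be type arguments.
        List<Integer> values = new ArrayList<Integer>();
        for (int i = 0; i < n; i++) {
            values.add(i); // autoboxing: the int is wrapped in an Integer object
        }
        int sum = 0;
        for (int v : values) { // unboxing on every read
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        System.out.println(sumBoxed(5));
    }
}
```

The list in sumBoxed never stores an int directly; every element is a reference to an Integer object, which is where the extra memory goes.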

Not only that, but the generics implementation is actually dangerous. It gives you a false sense of security, a security you do not actually have, because the current compiler cannot prevent you from getting a ClassCastException at runtime if you don’t pay attention (and preventing that was what the generics system was all about in the first place). I’ll give examples in future posts.
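In the meantime, here is one well-known way this can happen, as a minimal sketch rather than the examples promised above. HeapPollutionDemo and addAnything are hypothetical names; the trick is simply to pass a List<String> through a raw List parameter, which the compiler accepts with nothing more than an “unchecked” warning.

```java
import java.util.ArrayList;
import java.util.List;

class HeapPollutionDemo {
    // A raw List parameter accepts any parameterized list.
    // The add call compiles with only an "unchecked" warning.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static void addAnything(List raw, Object value) {
        raw.add(value);
    }

    static boolean failsAtRuntime() {
        List<String> names = new ArrayList<String>();
        addAnything(names, Integer.valueOf(42)); // an Integer now sits in a List<String>
        try {
            // The compiler inserts a cast to String here; it fails at runtime.
            String first = names.get(0);
            return first == null; // not reached
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsAtRuntime());
    }
}
```

Note that the exception is thrown where the element is read, not where the wrong value was inserted, which makes this kind of bug unpleasant to track down.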

C++ templates have nothing in common with Java generics, except maybe the < > syntax. C++ templates are way more powerful than Java generics, and I really don’t understand why Sun has decided not to go this way. They had the opportunity of a real breakthrough, here, because even if C++ templates are powerful, they are also horribly complicated. Sun could have pushed templates within Java without breaking the bytecode, all the while giving us a simpler and thus better templating system than the one found in C++. So much for the opportunity.

Default parameters and named parameters are real time savers… Yet Java does not provide those features. As a result, you sometimes see tens of overloaded versions of the same method with different parameters. Of course, the particular version you need is often the one that is not implemented (there is a combinatorial explosion of the number of possible overloads), so you have to use the most complete version and enter the default values for the parameters you don’t care about by yourself. Take the above example about the connection pool configuration in Python: everything is done through one single function, and if you want to change something, you just use a named parameter.
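A sketch of the overload chain this leads to. PoolConfig, its parameters and its default values are all invented for the example and do not describe any real connection pool library.

```java
// Hypothetical pool configuration class; names and defaults are illustrative.
class PoolConfig {
    final String url;
    final int minSize;
    final int maxSize;
    final int timeoutSeconds;

    // Telescoping constructors standing in for default parameter values.
    PoolConfig(String url) {
        this(url, 1);
    }

    PoolConfig(String url, int minSize) {
        this(url, minSize, 10);
    }

    PoolConfig(String url, int minSize, int maxSize) {
        this(url, minSize, maxSize, 30);
    }

    PoolConfig(String url, int minSize, int maxSize, int timeoutSeconds) {
        this.url = url;
        this.minSize = minSize;
        this.maxSize = maxSize;
        this.timeoutSeconds = timeoutSeconds;
    }
}

class OverloadDemo {
    // Only the timeout should change, but there is no (url, timeout) overload,
    // and none can be added: it would clash with (String url, int minSize).
    // So the defaults for minSize and maxSize are repeated by hand.
    static PoolConfig withLongTimeout(String url) {
        return new PoolConfig(url, 1, 10, 120);
    }

    public static void main(String[] args) {
        PoolConfig config = withLongTimeout("jdbc:example:db");
        System.out.println(config.minSize + " " + config.maxSize + " " + config.timeoutSeconds);
    }
}
```

With named parameters the same call would only mention the URL and the timeout, and the defaults would live in exactly one place.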

Managing the JDK / JRE installation is a nightmare. I think I’ll leave this one for later, because it deserves a blog entry of its own. To sum it up, downloading, installing and upgrading the JDK / JRE is a nightmare, and you often end up with tens of megabytes of duplicate files.

Ten years later, Java applets are still a joke. Java applets are progressively disappearing. The best hint that a web page contains a Java applet, nowadays, is that the web browser pauses for up to ten seconds while loading the JVM and the application from the web. The situation has not improved since the first infamous “Loading Java” status bar message issued by Netscape around 1996! The only time applets ran smoothly was when Microsoft implemented its own JVM and integrated it in Internet Explorer. Since they removed it, it’s been a nightmare.

“No one cares about Java applets”, you may tell me. Oh really ? So I guess all those Flash thingies that you can see pretty much anywhere on the web don’t matter ? The virtual machine running Flash movies should be the JVM, not a proprietary VM from Macromedia. Flash movies could have been implemented in applets, it was just a matter of providing the appropriate development environment. But legal issues with Microsoft and Sun’s utter lack of competence in the desktop environment have just doomed the platform.

Java on the desktop is dead, and there is no replacement yet. The net result of the last two remarks is that Java as a distributed environment for desktop applications is a lost cause. There are a lot of things going on right now (AJAX et al) that would never have seen the light of day if Java had been successful on the desktop. One could imagine that today’s desktop application landscape would be very different, if deploying Java desktop applications were possible. But it’s not, and Sun is the only one to blame here.

What’s sad is that there is no alternative. A true, standardized, portable, high performance desktop platform is nowhere to be found. Nowadays, the only safe target platform is the web browser, with AJAX providing a little bit more interactivity than the request/response scheme previously allowed. But the web browser is far from being as powerful and integrated with the OS as a native GUI application could be, and writing portable code for the many browsers out there is a PITA.