Friday, February 15, 2013

Coding on Crete: An Interview with Java Specialist Heinz Kabutz

Dr. Heinz Kabutz is well known for his Java Specialists' Newsletter, initiated in November 2000, in which he displays his acute grasp of the intricacies of the Java platform for an estimated 70,000 readers. He is also well known for his work as a consultant and for workshops and training sessions at his home on the island of Crete where he has lived since 2006—and where he is known to curl up on the beach with his laptop to hack away, in between dips in the Mediterranean Sea.

Kabutz was born of German parents and raised in Cape Town, South Africa, where he developed a love of programming in junior high school through his explorations on a ZX Spectrum computer. He received a BS from the University of Cape Town and, at age 25, a PhD, both in computer science. Kabutz was named a Java Champion in 2005.

We previously interviewed Kabutz in 2007 and decided it was time for an update. 

Oracle Technology Network: In our 2007 interview, you were critical of Java developers for not engaging in unit testing. Have you noticed any changes in recent years?

Kabutz: Yes, continuous integration has pushed us in the right direction. Programmers seem to be thinking more about how they can test their code. There are some great tools for measuring the coverage of the test cases.

The one place where unit testing is sorely lacking is with concurrent code. There are some tools that help find race conditions and deadlocks, but they typically find about a dozen faults per line of code. With so many false positives, discovering a real problem is impossible.

Did you know that there is not a single—not even one—unit test for the Java Memory Model (JMM)? We have to just accept that it works on the Java Virtual Machine (JVM) we are running on. The theory is that if we write our Java code according to the JMM, the code will run correctly on any certified JVM. Unfortunately, the certification does not test the JMM thoroughly. Apparently, there are some tests for the java.util.concurrent classes, and so they assume that if these work, then the JMM must also be correct for that JVM.

While I was writing some code to do high-speed communication between threads, I discovered perfectly legal code that will livelock a JVM. The JVM hangs so badly that you cannot even get a thread dump from it. You have to use kill -9 to kill it. Being a good Java citizen, two years ago I reported this quietly to some of the engineers who could do something about it. I even determined the version of the JDK where this first occurred (1.6.0_14). I raised the issue again recently, but even version 1.7.0_07 is still broken. A malicious coder could insert a time bomb into a system that would bring down the server and make it very hard to find where the problem came from.

There are bugs in the JVM. Some get fixed quickly. Others, like my bug, do not. Mine is relatively benign, but it still is breaking the JMM. Without any form of tests, what confidence can we have that the strange behavior in our application is a coding bug and not perhaps a bug in the runtime?

I think the reason that there are no tests at all is that the JDK team wants a binary test: it must either fail every time or pass every time. With concurrency, such tests are difficult or impossible to write. We should rather work out statistically what the probability is that the behavior is according to the specification.
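The statistical idea can be sketched in a few lines. This is only an illustration, not a test of the JMM itself: the class name StatisticalCheck, the helper passRate, and the example trial are all hypothetical, and the trial merely checks that no increment of an AtomicInteger is lost when two threads update it.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

final class StatisticalCheck {
    // Runs the trial repeatedly and returns the fraction of runs that
    // behaved according to the expectation encoded in the trial.
    static double passRate(BooleanSupplier trial, int runs) {
        int passed = 0;
        for (int i = 0; i < runs; i++) {
            if (trial.getAsBoolean()) {
                passed++;
            }
        }
        return (double) passed / runs;
    }

    // Hypothetical trial: two threads each increment a shared counter
    // n times; the expectation is that no increment is lost.
    static boolean atomicIncrementTrial(int n) {
        AtomicInteger counter = new AtomicInteger();
        Runnable work = () -> {
            for (int i = 0; i < n; i++) {
                counter.incrementAndGet();
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false; // an interrupted run counts as a failure
        }
        return counter.get() == 2 * n;
    }

    public static void main(String[] args) {
        // Run counts are arbitrary; more runs give a tighter estimate.
        double rate = passRate(() -> atomicIncrementTrial(10_000), 200);
        System.out.println("fraction of conforming runs: " + rate);
    }
}
```

A rate below 1.0 would point to a violation somewhere, while a rate of 1.0 only says that none was observed in this many runs.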

Even with that, I admit that it would be quite hard, if not impossible, to test effects such as instruction reordering, since these effects can even happen at the CPU level.

Oracle Technology Network: Is the principle that you must measure what you wish to improve any less important today than it was a few years ago?

Kabutz: It's definitely important. In our first 2007 interview, I described some code that appended strings together. I demonstrated that this code:
  public static String concat5(String s1, String s2, String s3,
                               String s4, String s5, String s6) {
    return new StringBuilder(
        s1.length() + s2.length() + s3.length() +
        s4.length() + s5.length() + s6.length())
      .append(s1).append(s2).append(s3)
      .append(s4).append(s5).append(s6)
      .toString();
  }

was faster than this code:
  public static String concat4(String s1, String s2, String s3,
                               String s4, String s5, String s6) {
    return s1 + s2 + s3 + s4 + s5 + s6;
  }

I also ended with this conclusion: "However, doing this prevents future versions of the Java platform from automatically speeding up the system, and again, it makes the code more difficult to read."

In Java 7, the two code snippets seem to be the same speed. String concatenation has been sped up significantly. I have not looked at how this is done internally, but if I were doing it, I would give every thread its own chunk of char[] memory in native heap to play with. Since we have so much memory available nowadays, this array could be fairly large, for example, 1 megabyte. I would then construct the StringBuilder for a thread pointing to its special memory location. This would mean that instead of constructing a char[] and then throwing it away as we run out of space, we would have one permanent space to work in.
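The comparison is easy to repeat on whatever JVM is at hand. The sketch below is a rough illustration only: the class name ConcatTiming, the time helper, the sample strings, and the iteration count are assumptions, not the benchmark from the 2007 interview, and a dedicated harness such as JMH would give more trustworthy numbers than a hand-rolled System.nanoTime() loop.

```java
import java.util.function.Supplier;

final class ConcatTiming {
    // Plain concatenation; the compiler and JVM are free to optimize it.
    static String concat4(String s1, String s2, String s3,
                          String s4, String s5, String s6) {
        return s1 + s2 + s3 + s4 + s5 + s6;
    }

    // Hand-tuned version with a presized StringBuilder.
    static String concat5(String s1, String s2, String s3,
                          String s4, String s5, String s6) {
        return new StringBuilder(
                s1.length() + s2.length() + s3.length()
                + s4.length() + s5.length() + s6.length())
            .append(s1).append(s2).append(s3)
            .append(s4).append(s5).append(s6)
            .toString();
    }

    // Runs the task repeatedly and returns the elapsed nanoseconds.
    // The result lengths are accumulated and checked to discourage
    // the JIT from discarding the work as dead code.
    static long time(Supplier<String> task, int iterations) {
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += task.get().length();
        }
        long elapsed = System.nanoTime() - start;
        if (sink < 0) {
            throw new IllegalStateException("length sum overflowed");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int iterations = 1_000_000; // arbitrary choice for illustration
        Supplier<String> plain =
            () -> concat4("a", "bc", "def", "ghij", "klmno", "pqrstu");
        Supplier<String> sized =
            () -> concat5("a", "bc", "def", "ghij", "klmno", "pqrstu");
        // Warm up both paths so they are likely compiled before timing.
        time(plain, iterations);
        time(sized, iterations);
        System.out.println("concat4 ns: " + time(plain, iterations));
        System.out.println("concat5 ns: " + time(sized, iterations));
    }
}
```

The absolute numbers depend on the JVM version and hardware, which is the point: measure on the platform you actually run on before keeping the less readable version.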


Java Performance Issues 

Oracle Technology Network: What are some of the biggest Java performance issues today?

Kabutz: The biggest performance issue today is still that we often cannot pinpoint the bottlenecks. Customers usually approach us with problems that they have not been able to solve, no matter how many man-months they've thrown at them. The most recent issue I looked at boiled down to a simple race condition. If two threads insert an entry into a shared HashMap at the same time, and the keys' hash codes map to the same bucket in the table, then the HashMap can be corrupted and you might get two entries pointing to each other. This means that whenever you try to call containsKey() on the map, you risk getting an infinite loop.

This manifested itself in the serialization. They were getting an OutOfMemoryError even though memory was not full. The caching mechanism was serializing the HashMap into a ByteArrayOutputStream. Since HashMap has a custom writeObject() method that does not take into account endless loops in the table, it continued running until the maximum size of a byte[] was exceeded. The problem was solved by using a ConcurrentHashMap instead.
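A minimal sketch of that fix, assuming a shared cache that several threads populate at once. The class name SharedCacheDemo, the fill helper, the thread count, and the key scheme are made up for illustration; the customer's actual code is not shown in this interview.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class SharedCacheDemo {
    // Fills the given map from several threads at once, each thread
    // writing its own range of keys, and returns the final size.
    static int fill(Map<Integer, String> map, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int offset = t * perThread;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.put(offset + i, "value-" + (offset + i));
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while filling", e);
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        // ConcurrentHashMap is designed for concurrent writers.
        Map<Integer, String> cache = new ConcurrentHashMap<>();
        System.out.println(fill(cache, 4, 10_000)); // prints 40000
    }
}
```

Passing a plain HashMap to the same method would be a data race: entries may be lost or the internal table corrupted, and the failure may not show up on every run.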

Quite a few of the issues that I encounter have to do with threading in one form or another. The system might be performing poorly due to lock contention or context switching. Or it might have some data race that corrupts a data structure and ends up in an infinite loop. Fortunately, data races are usually catastrophic and yield completely wrong results. If they are "almost" correct, they might remain undiscovered for years.

However, I'm not sure whether this is a general trend in performance or we are just seeing a lot of this type of problem because these problems are so hard to solve. 

Oracle Technology Network: Are there any good books on Java that you would care to recommend? 

Kabutz: I can write, edit, publish, and send out a newsletter in one day. A magazine takes about 6 weeks between writing and when it hits the shelf in the bookshop. Books take about 6 months. Books are nice. You usually can afford to spend more time copyediting and editing the prose for clarity. Thus, my recommendation is that if you are interested in furthering your Java education, please read my newsletter. It won't cost you anything and I publish new ideas as I discover them.

That said, I was invited to write the foreword to The Well-Grounded Java Developer by Benjamin Evans and Martijn Verburg. The book looks at some of the Java 7 features, but it also shows what a well-grounded developer should know. The writing is very good, except for the foreword. I decided to write in my characteristic nonserious style, and by some error of judgment, they kept it "as is." I expected at least a little bit of push-back from the editors.

Charlie Hunt and Binu John also wrote an interesting book called Java Performance that I am currently trying to read. It contains some good information, but it could use some copyediting. Also, it is not for beginners. These guys are smart and you had better be, too, if you want to understand what's going on in the book.

Coding on Crete


Oracle Technology Network: Tell us about your life programming and teaching classes on Crete. 

Kabutz: When, in 2006, I moved to Crete, an island in the middle of the Mediterranean Sea, I wondered whether it would be possible for a Java developer to live and survive in such a remote location. After all, we need electricity. We need a good internet connection. Was this experiment flawed?

It turns out that moving to Greece was an excellent career decision. Programming and creativity are closely linked. I always get inspired when I work in an awesome place. Struggling with difficult coding problems goes very well with the ocean surroundings; it's a perfect combination. Some of my readers might be surprised to hear that I also struggle to solve problems. Even though I have worked with Java for a very long time, I still get caught by issues that should actually be very easy to figure out. I recall struggling with some code to display a funny-looking "r" that occurs in the Czech language. If I had been in an office, instead of outside next to my pool, I perhaps might have figured out how to do it more quickly, but I would have been a lot more frustrated in the process. Most programming work consists of maintenance rather than creating new code.

Incidentally, I had an experience that showcased the power of JavaOne. At JavaOne 2012, I happened upon the developer from iText who wrote the library that I had battled with at home. He showed me a single method call that solved all the problems by finding the correct fonts. It was really cool bumping into him in the exhibition section.

I make most of my income from helping companies that want to give their Java programmers an elite education. No matter how experienced programmers are, they will always be challenged and learn something in my courses. The reason is the ecosystem that exists between my newsletter, my training, and my coding. I spend a lot of time researching Java. This research feeds into my courses and helps me discover topics for my newsletter. During my courses, I teach advanced Java programmers who come to me with very interesting questions. I usually manage to resolve these for the students, which then leads to more newsletters.

Kirk Pepperdine and I run about five courses on Crete per year. Students come mostly from Europe. Our classes are small. Can you imagine asking your boss if you can attend a course on a beautiful island? However, I also present all my courses via video conferencing, usually in-house for a company. This reduces risk and saves a lot of money. Plus I have a team of hand-picked instructors who can present my courses anywhere in the world.

I live in a remote area and do not get a good internet connection here. So I have to fork over $2000 a month just for an E1 line (2 megabits per second, upload and download). Sounds like a lot of money, but traveling all over the world is more costly and a lot more risky.

Electricity can be cut off for half a day. This is no big deal if your industry is herding sheep (literally), but it's a bit of a showstopper when you are connected to a client 10,000 miles away. We, thus, have to have a failover system—in our case, our own diesel generator that powers everything.

In Greek, patience is ipomoni and perseverance is epimoni. When you live here, you need both. It is a wonderful place to live, but impatient people—like me—have to adjust their way of thinking. 

Oracle Technology Network: Tell us about the Java Specialists Symposium "unconference" you hold on Crete. 

Kabutz: We kept on having HR departments veto people who wanted to come join one of our classes on Crete, even when we could prove that it would be cheaper than attending the exact same course in a more boring location, such as London or Munich. So now, we don't pretend to work. We have sessions from 9:00 until 12:30, which are held in an Open Space Technology format. You can move between sessions if one bores you. After that, we are free to do whatever we like. Usually, we go on a hike to some hidden beaches or we have a nice extended lunch somewhere. Since we run this during the school holidays, people bring their families. It's the perfect geek vacation. By the time their families wake up, the sessions are already done. We all know that the best part of a conference usually happens between sessions. We simply amplify the best part.

[Image: one of the beaches we went to]

The format lends itself to interesting discussions. I have learned more during the last two Java Specialist Symposiums than during any other single traditional conference. Also, I was able to organize the conference and still have a lot of fun during the week. Attendance is free, but you need to be invited. Please contact me at heinzATjavaspecialistsDOTeu if you would like to join us, and tell us what your areas of Java expertise are. We are looking for intermediate to advanced Java programmers. We aim to keep numbers small (below 70), so that I can still make a BBQ for the entire conference at my house.

Janice J. Heiss


 Source: Oracle
