Monday, December 13, 2010

How to rebuild a large legacy system

Nice and catchy title, isn't it? First a disclaimer: I've never gotten the chance to try this in real life.

I was having lunch with a colleague who works in projects orbiting an ancient mainframe system originally written in the 1980s (we were having Nepalese food btw, my favorite). The problems his company is having are very common. The legacy system was originally written by a small team - probably fewer than 10 people - and through the years the size and role of the system grew into something like a CRM. Yes, this must sound very familiar.

Now the people who wrote the system are approaching 60 and planning their retirement. The system will probably die as the last original developer leaves the company. How do you cope with a situation like this?

The most common solution to this problem is to buy very expensive CRM software and start a small integration project that aims to replace the old system as-is. Usually, about two years later, you have an enormous, failing integration project on your hands, and instead of one you now have two legacy systems, and even the original one is burning more man-hours than before because "slight" modifications were necessary.

So how would I do it? Here's how:

I'd assemble a small team with the best developers I could get my hands on. I'd provide them with the best tools available and seat them in a comfortable office with free drinks and, most importantly, the original developers and experts who know the legacy system. I'd start small, with something like reproducing one core function of the old system. The developers of the new system could ask a hundred times a day how some minor detail works, and the new team could write acceptance tests based on what they've learned of the old system.

Then, maybe a year later, the old and the new system could start running side by side, with some of the old system's transactions routed to the new one. This is what is done with plants (hardening), so why not with software? One function at a time, the new system would replace the old one and grow into a complete system with all the functionality of the legacy software working as it did before. The difference is that the new one is built on (hopefully) a modern, flexible and light architecture that enables it to grow along with new business requirements. And what's even more important is that you now have a new team of people who know the new system inside out.
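
Just to make the routing idea a bit more concrete, here's a rough sketch of the kind of facade I have in mind. The interfaces and the idea of keying on a "transaction type" are made up for illustration; the real integration points would of course depend on the legacy system:

import java.util.HashSet;
import java.util.Set;

// A sketch of the "hardening" idea: one thin facade in front of both systems,
// routing a growing whitelist of transaction types to the new implementation
// while everything else still goes to the legacy system.
public class TransactionRouter {

    // Hypothetical stand-ins for the real integration points.
    public interface TransactionHandler {
        String handle(String transactionType, String payload);
    }

    private final TransactionHandler legacySystem;
    private final TransactionHandler newSystem;
    private final Set<String> migratedTypes = new HashSet<String>();

    public TransactionRouter(TransactionHandler legacySystem, TransactionHandler newSystem) {
        this.legacySystem = legacySystem;
        this.newSystem = newSystem;
    }

    // Called whenever one more function has been reproduced and verified in the new system.
    public void migrate(String transactionType) {
        migratedTypes.add(transactionType);
    }

    public String handle(String transactionType, String payload) {
        if (migratedTypes.contains(transactionType)) {
            return newSystem.handle(transactionType, payload);
        }
        return legacySystem.handle(transactionType, payload);
    }
}

The whitelist grows one transaction type at a time, which is exactly the hardening described above.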

I wonder if it was Brooks again who said that there are only two kinds of software projects: ones that fail and legacy nightmares. As bleak as that sounds, in my opinion a legacy system usually becomes a nightmare only after the original developers are no longer available. So after the system is done, make sure the developers have no reason to change jobs! Do not disband the team!

The lesson here, in my opinion, is that large systems cannot be built; they are grown by a long-lived team. Anything else and you're probably going to fail.

My next blog will not have anything to do with legacy architecture, I promise :)

Friday, December 3, 2010

Back To The Future

No, this blog entry isn't about the new game based on the old movie with car-enabled time travel. Instead, try to think back to the late 1970s and early 1980s. Ready? Good.

Banks and other businesses with an intrinsic need for computers were, at the latest, building their information systems then. CICS systems with a whopping 64 kilobytes of memory were pretty much the standard fare. Debugging meant printing out the core dump (after all, it was ~64 000 characters, not that much). The language used was COBOL, with static memory allocation. It's amazing to think it was possible to create large banking systems with that technology, isn't it?

What's more amazing is that many of these systems are still in use. If you use a cash dispenser (at least here in Finland), it's very likely that the transaction is eventually run on a CICS system. Only companies established after the 1990s or so have more modern systems.

The question to ask, of course, is why these antique systems haven't been rewritten in Java, .NET or even Haskell. It's not for lack of trying, I assure you. I've personally witnessed a few very large projects and heard of many others; they were all failures by most standards. There is never one simple answer as to why a software project fails, but one question in particular has haunted me for a while now: why can't we seem to succeed in rewriting software that was originally built with such limited tools?

My theory is that back then, developers were able to concentrate better on the essential complexity of the software they were creating.

Commercial vendors have clouded the essential complexity of any given business domain with process engines, portals, executable models, frameworks, predefined domains, predefined development processes...on and on. And the problem isn't just commercial vendors. Let's say you're going to use the very popular open source tool Grails for your project (note: I have nothing against Grails, I actually like it very much). Here's an incomplete list of technologies and frameworks you'll need to master in order to get a large project done:

Groovy, Java, JavaScript, XML, Ant, Gant, GORM, GSP, Spring core, Spring MVC, Hibernate, Log4j, HTML, jQuery, YUI, HQL, SQL, Ivy, Maven (repositories).

I used Grails as an example here because it's easy and productive compared to many other options. Still, the sad fact is that we're drowning in tools and frameworks. And as developers and architects, we've been conditioned into thinking we're helpless without them.

What many don't seem to grasp is that every framework and every library you add to a software stack adds accidental complexity, which makes it more difficult to concentrate on the essential complexity. And no, the correct solution is NOT to start rolling your own framework. The solution I'm proposing is to minimize the use of frameworks, and especially commercial products, in an architecture.

I'll give an example: about a year ago I created a small ERP solution from scratch. Sounds crazy, right? ERPs are monstrously complex, and even rolling one out in a company takes a huge amount of effort. Well, you know what, it's not that complex when you can concentrate on the essential parts of it - the parts that matter to the problem at hand.

I ended up creating a very light (micro?) architecture where the domain ran on JPA-annotated classes in an embedded Jetty server, exposing REST/XML services through a small tool I wrote for building XML messages imperatively. The point is that the following code snippet (a whole service for returning all persons in the database) only requires a Java 6 runtime, nothing else. No frameworks to learn or debug, no added complexity.

public void all() {
    ensureLogin();

    // Start from a standard "ok" response message.
    Message m = okResult();

    Query q = jpa().createQuery("select p from Person p");
    @SuppressWarnings("unchecked")
    List<Person> persons = q.getResultList();

    // Append one <person> element per row: <person name="..." id="..."/>
    for (Person person : persons) {
        m.set("person/@name", person.getWholeName());
        m.set("@id", person.getId());
        m.parent();
    }

    write(m);
}
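
For the curious, the wiring around a service like that could look roughly like the sketch below. This is not the actual code: it uses the standard embedded Jetty servlet API (current package names), and a dummy servlet stands in for the real service class and my XML message tool:

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.servlet.ServletContextHandler;
import org.eclipse.jetty.servlet.ServletHolder;

// A rough sketch of the embedded Jetty wiring: one plain servlet per resource,
// no web.xml, no container to install, just a main() you can run anywhere
// a Java runtime and a couple of jars are available.
public class PersonServer {

    public static void main(String[] args) throws Exception {
        Server server = new Server(8080);

        ServletContextHandler context = new ServletContextHandler();
        context.setContextPath("/");
        context.addServlet(new ServletHolder(new PersonsServlet()), "/persons/*");

        server.setHandler(context);
        server.start();
        server.join();
    }

    // Stands in for the real service; the actual implementation would call
    // something like the all() method shown above.
    static class PersonsServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest request, HttpServletResponse response)
                throws java.io.IOException {
            response.setContentType("text/xml");
            response.getWriter().write("<persons/>");
        }
    }
}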

I'm not suggesting you stop using frameworks. I'm suggesting you evaluate very carefully whether they're worth the accidental complexity they bring. Do everything possible to minimize accidental complexity while giving developers the tools to express the essential complexity!

And learn to tell the difference between a tool and a framework. I will blog about tools vs. frameworks at some point. However, I think my next blog will be about how (in my opinion) legacy systems could be rewritten successfully.

Friday, November 26, 2010

Why should I care about complexity?

While driving to work one morning, I remembered a discussion I had with a colleague maybe a year ago about complexity in software. Our discussion was about Fred Brooks' terms accidental complexity and essential complexity.

Accidental complexity is something we don't want. It's all the "fluff" that creeps into software projects: overly complex architectures, unneeded frameworks, "cool" code written by programmers; the list goes on and on. You'll notice that my definition is different from Brooks' original. That's because the original accidental complexity has mostly been solved (we don't write assembly anymore, nor do we read core dumps printed on paper). Instead, modern accidental complexity comes from the bazillion tools, frameworks and platforms all claiming to solve every software problem ever known.

Essential complexity is, well, essential. It is something that has to be solved in order to solve the problem at hand. It cannot be reduced, but it can be tackled. More than that, it needs to be clearly visible so people can concentrate on solving it with minimal wasted effort.

I have a whole blog entry waiting to be written about how recreating old mainframe systems these days seems impossible and how that relates to the growth of accidental complexity in software production, but today I'll address the connection between these two types of complexity and the role of a software architect.

Back to my old French car and my thoughts about that discussion. In a short moment of clarity, I realized the most important thing a (good) architect tries to do in every project:


Minimize accidental complexity and maximize the visibility of the essential complexity.


This might sound obvious, or like consultant jargon, so let me explain. One very common reason why software projects fail (aside from plain bad salesmanship and contract negotiation) is that the essential complexity of the problem domain is either hidden completely or overshadowed by accidental complexity. Here's an example: a customer has bought a very expensive <insert your favorite commercial bloatware here> platform that will solve all their business needs - all that needs to be done is some customization, configuration and such. What follows is, more often than not, a failed project where all the time that should be spent tackling the essential complexity is instead spent swimming in the accidental complexity of the given platform.

The tools to make essential complexity visible are well known. The most important is good domain modeling of the problem domain. Another is projecting use cases onto the domain model to make it respond better to concrete usage scenarios. Nothing new or exciting here.
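
As a trivial, made-up example of what I mean by projecting a use case onto the domain model, the use case "suspend deliveries for a customer with overdue invoices" becomes plain behaviour on the model itself instead of plumbing inside some framework artifact (the classes below are purely hypothetical):

import java.util.ArrayList;
import java.util.List;

// Hypothetical domain classes, only to illustrate the idea.
public class Customer {

    private final List<Invoice> invoices = new ArrayList<Invoice>();
    private boolean deliveriesSuspended;

    public void add(Invoice invoice) {
        invoices.add(invoice);
    }

    // The use case projected onto the model as plain behaviour.
    public void suspendDeliveriesIfOverdue() {
        for (Invoice invoice : invoices) {
            if (invoice.isOverdue()) {
                deliveriesSuspended = true;
                return;
            }
        }
    }

    public boolean deliveriesSuspended() {
        return deliveriesSuspended;
    }

    public static class Invoice {
        private final long dueDateMillis;

        public Invoice(long dueDateMillis) {
            this.dueDateMillis = dueDateMillis;
        }

        boolean isOverdue() {
            return dueDateMillis < System.currentTimeMillis();
        }
    }
}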

What's much more challenging is fighting the accidental complexity. This is where the value of an architect is weighed. Do you have the balls to say: "This framework/product is unnecessary and we won't include it in the architecture"? Then you need the skills to keep programmers from producing code that is not essential to the problem at hand. This is also where it's easiest to see the difference between a junior programmer and a seasoned one. Seasoned programmers (if they've learned anything) solve the given problem with a smaller amount of code - code that is clean and simple and doesn't use unnecessary tools or frameworks.

Keeping accidental complexity at bay is often a matter of life and death for the software being created!

Friday, November 12, 2010

Liferay functional testing

Yesterday and today we've been having our first Fedex Day at Ambientia. The way we did it was to list ideas beforehand and then, when the day started, group around them freely.

I had submitted an idea regarding functional testing on Liferay and also ended up working on it. So what to do with such a large topic in 24 hours? Well, first I googled it. Everyone, including Liferay itself, is using Selenium, but I wanted something lighter, smaller and not dependent on an external browser.

Initially I installed Canoo WebTest and started playing with it, but it felt a bit clunky and old-fashioned with all the XML files and the rather weird runtime built on top of Ant. I had already decided to use Groovy for the tests, since Canoo also supports it, but the documentation just wasn't there and the syntax was too Ant-like as well.

I've been toying with the Grails Functional Testing Plugin recently and I've really enjoyed its simplicity and elegance. So I decided, what the hell, I still had almost half of our Fedex Day left, so I rolled up my sleeves and started porting the Grails plugin over to my Liferay project.

After a while I had a stripped-down version of the Grails plugin with all references to Grails removed. Then I made a very simple Ant script that compiles the Groovy files (we already have groovy-all.jar in our Liferay project template, so this was easy to do). Then I set up a directory called "functional-tests" in the Liferay Plugins SDK project and created the simplest possible test I could think of:

class SimpleTest extends functionaltestplugin.FunctionalTestCase {
  void testHelloWorld() {
    get('http://localhost:8080/')

    assertContentContains 'Hello World'
  }
}

Naturally it took some figuring out to get rid of weird errors, but soon it was working! Call me a nerd, but I think that's pretty cool.


Future plans


This is of course just a beginning. One big question is how to sandbox the functional tests or the environment itself. Luckily I have a colleague here at Ambientia who is working on a small tool that can create pages, portlets and other data based on a simple XML format. Using that, we can set up a sandboxed environment where portlets can be tested functionally without ruining the actual portal you're building.

I'm also thinking of trying to run Liferay in embedded Jetty. This would make it possible (in theory at least) to control the whole environment while running the functional tests. I will try it anyway and report success or failure.
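
If it works at all, the setup could end up being as simple as something like this. This is a completely untested sketch, and the WAR path is just a placeholder:

import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.webapp.WebAppContext;

// Untested sketch: boot the portal WAR inside embedded Jetty so the functional
// test run can start and stop the whole environment by itself.
public class EmbeddedPortal {

    public static void main(String[] args) throws Exception {
        Server server = new Server(8080);

        // Placeholder path to whatever bundle or WAR turns out to be deployable this way.
        server.setHandler(new WebAppContext("path/to/liferay-portal.war", "/"));

        server.start();
        server.join();
    }
}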

Another issue is controlling the state of the portlets themselves. What if you have a complex portlet built with Liferay's Service Builder that has its own database, and you want to run functional tests on it? I'm thinking the edit mode of portlets might help here. You could set up things like "create test data" and "destroy database" as simple buttons in the edit mode, and these wouldn't be too hard to call from the functional test. It could look something like this:

class ComplexTest extends functionaltestplugin.FunctionalTestCase {
  void testEditMode() {
    get('http://localhost:8080/home/portletXTestPage')

    click 'edit portlet x'

    click 'create test data'

    //run the actual test
  }
}

There is still lots to be done, but I'm quite optimistic about this.