Tuesday, December 30, 2008

Gifted developers are made, not born

Have you ever wondered why some programmers or architects are so much better than others (or you)? Why is it that some developers seem to be able to remember, execute, perform, win, learn, and teach better than others? Is it natural talent? Better training? Better equipment? Youth? Why are some simply more successful than others?

Malcolm Gladwell recently published a book called Outliers that speaks directly to that question: why are some people more successful than others? He takes a fairly scientific approach to gathering existing statistics and applies them in a straightforward manner to arrive at some interesting, if not obvious, explanations. For instance, if you were to look at the Junior Hockey Leagues in Canada (hockey is their life!), you would find out that the kids who excel in this sport are disproportionately born in the first quarter of the year (i.e. January, February, March). It's not just a little disproportionate either - it is wildly disproportionate.

What is it about winter babies that makes them better at hockey than autumn infants? As young boys become eligible for the advanced teams (you know - where they'll get better coaching and more opportunity), the cutoff date for selecting who makes the teams is January 1. Let's assume there are two boys who will turn ten this year - the perfect age for advanced hockey training. One turns 10 in January, the other in November. As of January 1, both will be nine, but let's say that the league allows both boys to sign up. The January boy is actually a year older (give or take a few months) than the November boy. At age 10, a year of physical development is a huge differentiator. The January boy will likely perform better simply because he has had more time on the ice to develop coordination skills, strategy, relationship/team-building skills, and maturity.

In other words, the boy born in January will likely be a better player simply because he's been doing it longer. Allow me a few caveats here: you cannot apply broad statistics to any specific case. Driving on ice causes accidents - we know that, but it doesn't mean that the next time you are on ice you are guaranteed to wreck. In the case of the Canadian Junior Hockey leagues, however, the numbers are impressive. Forty percent of the best players are born in the first quarter of the year, 30% between April and June. Only 10% of the top players are born in the last quarter of the year. This is not a one-time, one-year anomaly; it is a pattern that repeats year after year.

Gladwell goes on to cite many other interesting cases, such as Bill Gates and the Beatles, to illustrate that time is a great differentiator (note that I'm not saying 'age', although with junior hockey players 'age' and 'time' surely match up). Bill Gates, and others like him such as Bill Joy, were able to practice their craft (computer programming) for 10,000 hours before their mid-twenties. The Beatles (a very interesting story that was news to me) had well over 10,000 hours of performance time before they hit it big in 1963. In fact, that number - 10,000 hours - seems to be a threshold of excellence.

Gladwell refers to studies that have been conducted to determine why certain musicians are successful; and others, while good, never seem to rise to the top. In the 1990s a study was conducted at Berlin's elite School of Music. The violinists were divided up into three groups: the stars, the pretty-goods, and the unlikely-to-ever-play-professionally. The age at which the members across all three groups began playing was roughly the same, around five years old (no one had a head start). But at age 8, those that would become stars began practicing more: six hours per week by age nine, eight hours by age 12, 16 hours per week by age 14, and by age 20 the stars were actively practicing 30+ hours per week. By the time they reached 22, they had practiced over 10,000 hours in total.

The psychologists who performed that study then turned toward piano players and found an almost identical pattern. But here is what was even more interesting - in all of the studies they could not find any 'naturally gifted' students who floated to the top of their discipline without 10,000 hours of practice, nor could they find anyone who practiced at that level who was not a top performer. The correlation is clear: regardless of your natural abilities, income, size, weight, or other factors, those that put in the practice time (the more the better) achieve higher levels of performance. To be truly great, again regardless of natural gifts, 10,000 hours of active practicing (not just showing up) seems to be the key.

So, you want to be good at your craft? Practice. Write some code, and then write it again (refactor it). Look at other people's code, look at it again from an architecture perspective. Can you find a way to re-write it so that it works more efficiently? If the code you're reviewing works, can it work better, faster, with fewer lines of code, with fewer exceptions? Can you defer operations to the containing environment (Web server, Database, Operating System)? Write, write, and write again. Design, design, and design again. Read best practices, employ them, understand them - don't just invent code, review code; yours and others.
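
Here is a toy illustration of the "write it, then write it again" exercise, in Java. The example and all the names in it are mine, invented for this post, not a prescription: the point is just that the second version does the same job by deferring the work to the standard library.

```java
// Two behaviorally identical versions of the same lookup. Practicing
// means writing the first draft, then asking: fewer lines? fewer
// chances for an off-by-one? can the platform do this for me?
import java.util.List;

public class RefactorDemo {
    // First draft: hand-rolled loop with index bookkeeping.
    static boolean containsVerbose(List<String> names, String target) {
        for (int i = 0; i < names.size(); i++) {
            if (names.get(i).equals(target)) {
                return true;
            }
        }
        return false;
    }

    // After review: same behavior, one line, work deferred to List.
    static boolean containsRefactored(List<String> names, String target) {
        return names.contains(target);
    }

    public static void main(String[] args) {
        List<String> names = List.of("ada", "grace", "edsger");
        System.out.println(containsVerbose(names, "grace"));     // true
        System.out.println(containsRefactored(names, "grace"));  // true
    }
}
```

The interesting part of the exercise isn't the final one-liner; it's having written both and being able to explain why the second is better.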

In Bill Buxton's book, Sketching User Experiences: Getting the Design Right and the Right Design, he relays a story of a college ceramics instructor who divided his class into two groups. One group was told that their grade would be based on the quantity of the artifacts they produced, and the other group would be graded on the quality of their products. Interestingly enough, the group that focused on quantity ended up producing better quality. There is just no substitute for practice.

Friday, December 5, 2008

Being right is not enough

It was so blindingly simple, so obvious that a brick could get it. The idea was the prototype for no-brainer. It was so easy to understand that a housebroken puppy on a waxed floor would stop and take notice. To be honest, it has been so long that I don't actually remember what the architectural idea was, only that it was so elegantly simple it could not fail - and even a project manager could understand it.

There was a competing idea as well, which would work - in the sense that knotted ropes will work in the place of elevators, that sledgehammers work as back-up house keys, and hot air balloons work as public transportation. <sarcasm>The alternative was not brilliant</sarcasm>. To quote from The Hitchhiker's Guide to the Galaxy, "its fundamental design flaws were completely hidden by its superficial design flaws." In the end my idea (whatever it was) did not get implemented and the alternative (did I mention that it would work?) was. Sigh. Eventually the predictable problems with the alternative solution surfaced in a variety of ways, most leading to some production issue, late/early night meetings, dissatisfied customers, and the "how did we get here?" conversations.

So, how do we get there? How does it happen that clearly superior technical solutions get bypassed for inferior solutions? Talk about history repeating itself. Talk about history repeating itself. OK, that was cheap - but why do the best technical answers not always rise to the top of the priority cream? The answer is definitively non-technical.

We've talked about trust in past blogs here. Trust is a hard-to-define commodity that is as difficult to build as it is easy to lose. Trust is based on relationships, and relationships are the key to almost every successful recommendation. As I think back on where, when, and why my early career recommendations were (and sometimes were not) accepted, I can pretty much point to the relationship I had with the decision maker to understand which ones were followed and which were not.

So here is the tough message that you may not want to hear. Being right is not sufficient. Being smart, being technical, industrious, hard-working, tenacious, innovative, or even brilliant is not enough. In fact, you can forgo many of those attributes, if you replace them (or better yet add to them) with knowing how to build and maintain a relationship. Here are some keys:

  • Spend more time listening and less time talking to your customers/managers/users (did he say managers?)

  • Before ending a conversation, repeat back what you thought you heard and make sure they confirm your understanding.

  • Stay in touch.

  • Use a variety of communication mediums (telephone, voicemail, email, instant messaging, and personal contact).

  • Never, never, never deal with conflict via email!

  • Always, always, always be receptive to criticism.

These last two bullets require a little more discussion, as getting them wrong will destroy trust and relationships very fast. I cannot count the number of email chains I have seen that contained deep technical conversations pitting one idea against another. Mistake; don't do it. No one in the history of recorded time has ever changed their mind after reading a counter-argument to their own suggestion in a technical email. I actually checked this out on Snopes.com. If your recommendation conflicts with someone else's, you need to gather the relevant facts and your best non-emotional professional demeanor, and have a face-to-face conversation. You say you don't like confrontations of this sort; well then, let me write you a strong argument as to why I'm correct. Or better yet... let's talk one-on-one, maybe we can learn from each other (get it, get it?)

No one likes to be criticized, and to be sure there are those that are better at delivering unpopular messages than others. Being receptive to criticism doesn't mean you have to like it, or even agree with it. Being receptive means listening to the underlying message of the criticism and trying to glean something useful out of it. Remember, most people suck at criticizing - you probably do too, so don't hold your critics to a higher standard than you hold yourself. You will build tremendous trust if you are receptive to disagreements, especially if you respond positively.

The whole point of this post is that good architectural designs can get passed over for inferior ones because the inferior designs are proffered by individuals who have a stronger relationship with the decision makers. Don't let your great ideas get lost because you failed to build the necessary relationship with the key stakeholders. If you know your design/solution/correction/suggestion is correct and feel you're losing the debate, the issue is likely that your voice is not carrying the weight it should with the right people. Being right is not enough. You need to be right, heard, and trusted by the right people.

Monday, November 24, 2008

Architecture, Alliteration, and Anti-freeze

As we approach the end of the calendar year, we enter a time of festive activities emanating from a plethora of social, cultural, and religious traditions. The once vivid memories of over-crowded parking lots, picked-over store shelves, and tired, underweight, uninterested and uninteresting Santas have faded from a mere twelve months (i.e. 15 production releases) ago. We are left with the thought of mustering up the better angels of our nature to prepare for the end-of-year freeze, where simultaneously work must slow down because production changes are persona non grata, and yet history reminds us that the 35 days of Christmas beginning mid-December and ending mid-January seem to be the busiest of the year. I keep waiting for this end-of-year lull I've heard so much about.

Ah yes, the End-Of-Year Freeze. A magical time in the life of IT professionals the world over. Where Santa has only 24 hours to accomplish miracles, we are given a month to plan, to think, to investigate, problem solve, diagnose, organize, reorganize, refactor, import, export, format, reformat, do, undo, collaborate, isolate, communicate, vacate, rest, test, invest, speed up, slow down, stop, start, restart, code, decode, encode, encrypt, decrypt, analyze, synthesize, scrutinize, sanitize, realize, open eyes, open hearts, open minds, and close the books. Clearly, Santa has the better deal.

The purpose for the end-of-year freeze is to maintain a known, predictable environment where the routine business of processing transactions can occur without the uncertainty and trepidation that comes with surprises; the kind of surprises that come with change. Change to logic, change to configurations, to work flow, appearance, authentication, authorization, or automation. Change must be managed, monitored, measured, minimized, manufactured, and matriculated through a defined process that purports to ensure limited disruption, down time, degradations, and undiscovered devilish details. And the best way to manage change while eliminating surprises is to freeze every single solitary system, application, OS, device driver, utility, suite, protocol, and user interface. Except, of course, for that which we don't, won't, or can't. Ah the EOYF, as we'd SMS to our BFF.

In a perfect world the end-of-year freeze would naturally be followed by the begin-of-year thaw. This would be a time where new features and functions would slowly begin to emerge; first with cautious, careful steps, followed by more predictable replications which would be allowed to mature in a nurturing environment for many months before being exposed to the harsh realities of the real world (a.k.a. users).

Information Technology Architects long for the day when we can retire from sev #1 tickets, problem determination, change control, and anomalous application aberrations (i.e. non-reproducible errors). We would love to have the 35 days of Christmas (or Hanukkah, or Kwanzaa, or December-and-a-couple) to ponder loose coupling, high cohesion, abstraction, encapsulation, or even (I can hardly even imagine the day) polymorphism! It's not likely to happen this year. I suspect that I'll be able to republish this blog (and that last sentence) next year. Of course next year, the previous sentences will be referring to that year's freeze and the year that follows, so I've now started an infinite loop in the freeze-time continuum. Maybe this will show up as a bug some November and require an emergency code push during the freeze. My head hurts.

Architects rarely get to just architect; we tend to be entrenched in high-level business requirements down through to implementation-level details. As such, the end-of-year freeze promises to exercise our context switching capabilities like an octopus on ice skates. We'll be thinking about next year's features, functions, and fortifications, while pondering problematic production problems, anticipating technical training, and explaining emergency exceptions to the gods and goddesses of governance. To be sure, the end-of-year freeze should indicate, as the word freeze does in every other context, motionlessness. For us and our IT brothers and sisters, it has not.

Friends, the word freeze is in trouble and it needs our help. It calls to us from the distant past, it calls from the data center, it calls from the confined cubicles of our competent coders, and it calls from our memories. I recall freeze tag as a child; the 'it' person touched you and you had to freeze, stop, lie still, cease to move, or be expelled. Freeze meant the absence of movement. Once touched, nary a neuron would navigate, unless you were called by mom. We have the power, the right, and the responsibility to re-establish the meaning and value of the word freeze. Freeze should not simply mean that a change, which during any other part of the year would be considered normal, is merely, simply, and inexcusably tagged as an emergency. Indicating that a normal change request is now suddenly an emergency both misrepresents the importance of the change in question and dilutes the value of real emergencies - it desensitizes the change management system to the critical, highly visible, and truly emergent modifications.

Production pushes will need to occur during the end-of-year freeze this year ("Danger Will Robinson - the time loop continues!") in support of our businesses. But that is not an excuse for not thinking, not explaining, or not maintaining the most stable, predictable, unchanging production environment possible. Only promote those changes which cannot wait until the begin-of-year thaw of mid-January. Fix what you must, test and test again, prepare for the changes you need to implement in the new year, but most of all - let us end the year with the most predictable production processing environment possible.

Monday, November 3, 2008

Who do you trust?

Do you trust me? (Do you even know me?) Do you trust the last person with whom you spoke? If you said yes, or no, could you quantify what you mean? What exactly is trust?

A colleague of mine recently shared an article that was produced by the Fraunhofer Center for Experimental Software Engineering and published on the computer.org web site. The paper, called How Do We Build Trust into E-commerce Web Sites?, states, "Trust is a subjective, user-centric, context-dependent concept, and is thus difficult to define universally. On the Internet, several factors make trust more difficult to build, explaining why some successful brick-and-mortar retail chains have been unable to translate their reputation to the virtual platform the Web offers."

The paper describes trust, relative to eCommerce web sites, as the outcome of seven elements, each affecting the others: Ease of use, Usefulness, Benevolence, Competence, Integrity, Risk, and Reputation. As an architect working in the financial services industry, I've often stated that trust is the only thing we have to sell. If our customers don't or won't buy our trust (i.e. they lose faith in us), nothing else we do will gain their business.

Our web sites, it would seem, need to be trusted by our customers in order to gain or maintain their business. So how do we do that; how do we gain their trust through a web interface? Remember that the IP layer underneath our reliable protocols guarantees that every message (email, HTTP packet, FTP, AJAX call) will be delivered zero or more times. In that environment, how do we build trust?

Consider the seven attributes of trust: Ease of use, Usefulness, Benevolence, Competence, Integrity, Risk, and Reputation. Here are some things to ask yourself about your business web site.

Ease of use - of course you think it's easy; you built it and use it every day. You are blind to the idiosyncrasies, the bugs, and the blind alleys. Get someone else to use your site and watch them closely. Your 'intuitive' click streams may be ambiguous, erroneous, and require a psychic divining mouse.

Usefulness - does your web site actually perform the functions your customers want? Far too many provide 'low hanging fruit' functionality that was easy and quick to build, but frankly not the transactions your customers want.

Benevolence - does your site do what it says it is going to do and nothing more, nothing less? If the user clicks "Display my Account," that should not trigger the "Spam my inbox with offers for free-but-not-really vacations" function.

Competence - are the web site owners recognized as experts in the transaction being offered? Would potential customers trust Ford Motor Company to sell condominium timeshares in Orlando? Probably not, but they would trust PNC Bank to offer checking account services.

Integrity - does your web site use the collected data in the manner predicted and in no other way? Your web site should provide assurances, verification, policies, and protocols that provide confidence that user information is secure.

Risk - although not easily observed by users, risky behavior could be conducting transactions on behalf of your customer with third parties whose trust is dubious. This is the only element that is inversely proportional to trust. The riskier the web site appears, the less trust you engender.

Reputation - this is the one attribute that can increase the perception of all the others, and it can be circular. Customers will trust a web site that is trusted. If a web site was trusted in the past, and now offers a new feature, users will transfer their previous trust to the new offering.

The one attribute I did not see referenced in the paper was predictability. Maybe all of these other elements lead to predictability, but I've always associated trust with being able to predict behavior. I've known people who I could easily predict would drive faster than the speed limit. It's a lock, bet on it. I can trust they will speed.

One of the key tenets of user interface design that came out of the Client/Server days was that fancy dialog boxes introduced unpredictability at the cost of usability and trust.

So, do you trust me now? Do you have examples of web sites that make it hard for users to trust?

Monday, October 27, 2008

Stop Fixing Bugs

Imagine if the next system you installed just never failed. Ever. It just worked again and again and again. Whether you deploy servers, network systems, or application software - is this not your dream? To be able to deploy solutions that users use without fail. It is possible; but not by insanely doing the same things we do today and expecting a different result.

I just finished a book by Gartner Research Analyst Kenneth McGee called Heads Up that makes a strong case that for every disaster - be it a business disaster, natural disaster, economic collapse, or even a terrorist attack - there are always warning signs. After the fact we are always able to point out the predictors, the warnings, and the telegraphs that signaled impending catastrophe. Far too frequently we ignore these signs, or worse yet we can't pick them out amongst the chatter of static and noise we call data.

The first thing we do after every calamity is ask how we can prevent a recurrence. McGee cites many instances of disasters after which we have changed our mode of operations, focused on the relevant data, and learned to avoid a repeat. The 1900 hurricane that leveled Galveston, Texas, the stuck valve that almost caused a meltdown at the Three Mile Island Nuclear Facility, and the O-rings that ruptured in cold weather and destroyed the Space Shuttle Challenger all led to investigations and changes to prevent another disaster (OK, so Galveston took a while).

He methodically challenges the assumptions that there is too much data to analyze, that life is just too unpredictable, and that surprises are a natural part of the business world. If you ever get a chance, it is fascinating to read the press releases distributed by the National Oceanic and Atmospheric Administration (NOAA) prior to the arrival of Hurricane Katrina. Even though NOAA issued dire and specific warnings, many people claim that they were not warned of the severity of the storm.

Think about the world of Information Technology and how many times we reboot servers or restart applications as a proactive intervention to avoid a system crash, an application freeze, and the ire of our user communities. I've even seen cases where a scheduled application restart is described as a fix. Um, er, restarting an application is not an acceptable fix - it is at best a stop-gap measure and at worst the deliberate denial of a problem.

The reality is that we should be able to predict the majority of our application failures. But the first step is to stop accepting stupid IT tricks as remedies. The next time you have to fix a bug, take the time to collect data and symptoms about the problem, develop a hypothesis, formulate a response, and test thoroughly. But don't put your fix into production just yet. You need to understand how the bug got past your certification process the first time.
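
One concrete way to practice the "understand how it slipped through" step is to encode every escaped bug as a permanent regression check, so the certification process that missed it cannot miss that class of bug again. A small Java sketch follows; the `parseAmount` function and its negative-deposit bug are invented purely for illustration:

```java
// Sketch: a defect once escaped because nothing rejected negative
// deposits. The fix below is paired with a regression check that
// stays in the suite forever - fixing the process, not just the code.
public class RegressionDemo {
    // Corrected version: the first draft silently accepted negatives.
    static double parseAmount(String raw) {
        double value = Double.parseDouble(raw.trim());
        if (value < 0) {
            throw new IllegalArgumentException("deposit must be positive");
        }
        return value;
    }

    public static void main(String[] args) {
        // The regression check that certification now runs every time.
        boolean rejected = false;
        try {
            parseAmount("-50.00");
        } catch (IllegalArgumentException e) {
            rejected = true;
        }
        System.out.println(rejected ? "regression check passed"
                                    : "regression check FAILED");
    }
}
```

The check itself is trivial; the process change is that writing one becomes a mandatory part of closing any production bug.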

Consider the programmers who work at the Johnson Space Center in Texas writing code for the Space Shuttle. They know how to write code, remarkably well. Here is an excerpt from an article about this team (Full copy here):

What makes it remarkable is how well the software works. This software never crashes. It never needs to be re-booted. This software is bug-free. It is perfect, as perfect as human beings have achieved. Consider these stats: the last three versions of the program - each 420,000 lines long - had just one error each. The last 11 versions of this software had a total of 17 errors. Commercial programs of equivalent complexity would have 5,000 errors.

How do you get to a point where your code has only one bug per 400,000 lines? Process. When you find a bug, you don't just fix the broken code - you fix the process that led to the broken code. Think about your current situation. You have a process that generated a bug, and another process that allowed the bug to slip into production. Aren't the errors in those two processes more important than the software bug itself? Shouldn't more important bugs be addressed before lesser bugs?

So, stop fixing bugs. Instead fix the process problems that allow bugs to be created. Fix the process errors that allow bugs to slip by QA testing and into production. Before you know it, you will have systems that just never fail.

Monday, October 20, 2008

Programmers and Common Tools

Years ago I used a programming editor from IBM called EPM; I don't even remember what the letters stood for, but I liked the fact that my editor's acronym matched the initials of my name. I used a version that worked on PC DOS and IBM OS/2. I became so good at this editor, and its commands so ingrained in my neural pathways, that I believe my children were proficient in EPM from birth.

I worked on a team where there were constant debates and competitions to see whose editor could accomplish the most arcane tasks with the fewest keystrokes, because after all - this was a sign of superior software quality! Daily EPM versus vi versus emacs versus Notepad versus Wordstar versus godknowswhat took place - it's a wonder we delivered any useful applications at all.

Most of the time we were writing C and/or C++ code, so we believed that as long as the code compiled and ran, the editor used in the construction phase was of no particular importance. Every programmer could select their tool of choice, thus maximizing their personal productivity. So long as each member of the team was productive, then it followed that the team and (by extrapolation) the company were productive.

As time went by, a few problems began to surface that caused some of us to rethink this approach. First, the tools became more complex and did more things for us. For instance, some of the tools performed on-the-fly syntax checking and compilation so we could execute the application without having to wait for a compile step. Different tools had slight variations in their compilers, which caused different performance characteristics that we would spend days trying to resolve.

The tools became more complex and included wizards to automate the generation of code and of course the different tools generated different solutions. This yielded another round of debates and competitions which delayed actual work.

The purists maintained that these complex tools had the effect of dumbing down the programming staff, a claim that others thought to be entirely impossible (if you catch my drift). I always felt that dumb programmers were dumb and smart ones were smart; the tools themselves did not cause that distinction. Personally, I always examined auto-generated code to see how it worked and frequently stepped through the code with a debugger. Some of us believed that auto-generated code had the desirable effect of code consistency; different developers using the same tool would deliver consistent code if much of it was auto-generated by the tool.

After years of debate, working with teams, and managing teams, I've come to the position that the more you can establish a standard set of tools and uniformly employ those tools across the enterprise, the better the organization will be.

First, let's assume that a wise tool choice has been made: one with extensibility, lots of functionality, a good reputation in the industry, and ample technical and third-party support. The use of a common tool across the enterprise will yield many positive benefits. Chief among these is control over costs and quality. When every developer is allowed to select their own personal tool of choice, the enterprise has an enormous challenge in leveraging the tool vendors to realize lower costs or manage technical support.

Developer tools today are more than just text editors; these Integrated Development Environments (IDEs) perform complex tasks to generate code and stubs, manage configurations, and integrate with requirements, project management, and source control repositories. It is not in the enterprise's best interest, nor a good use of programmer time, to develop these functions, features, and integration points on their own - uniquely for each programmer's favorite IDE.

Quality in application development is a descendant of predictability and consistency. That a specific developer is slowed down by the nuances of having to use a less-than-perfect tool (for them) is not significant in the larger goal of consistently building applications in a predictable way. Developers who insist on using vi, emacs, or Notepad in an Eclipse shop (for example) typify the notion that Pony Express riders, with the wind in their faces, are more productive than locomotives with their overhead of rails, tunnels, bridges, and water towers. Ahh, no.

Staff mobility is another positive outcome of tool standardization. Being able to move resources around based on needs, due dates, problem determination activities, and business cycles can more easily be achieved if everyone is using the same tools. We've seen this again and again with productivity software such as word processors and spreadsheets; we saw it again with email solutions and calendar software. Developer tools are no different.

Still, there are diehards who think programmers are special and should be able to choose whatever tools they want so long as the output (source code or binaries) is consistent. If you examine their work closely you will find that they have actually built their own IDE, specifically tailored for the enterprise in which they work, and then spend a significant amount of time maintaining their custom-home-grown IDE. Is that what they get paid to do, really?

Monday, October 13, 2008

Improving time-to-market with loose coupling and high cohesion

What a gripping title. I considered others such as Cohesion is the Viagra of loose couples, and Get High with Cohesion - Get the Low-down with coupling. My daughter is much better at the marketing side of life, yet (amazingly) she's not all that interested in coupling and cohesion. Go figure.

Eleanor Roosevelt once said, "The things you refuse to meet today always come back at you later on, usually under circumstances which make the decision twice as difficult as it originally was." This is never more true than in programming where the mistakes made early on will always require you to revisit them later - often at a higher price.

Briefly stated, loose coupling describes a condition where changes in one module (object, component, procedure, class) of an application seldom cause changes to others. While the various modules of an application may (and need to) communicate with each other through function calls, methods, APIs, and interfaces, the fact that a programmer has to change one module should not necessitate changes to another.

I've heard some noobs (newbies) claim that loose coupling is a red herring. They'll cite the example of adding a new data item to the user interface (such as a web page). This would require changes to the web page, the servlet, the communication objects, the SQL, and the database. Well, yes and no. A user request such as this may require all of those changes, but the changes are the result of the user request, not poorly designed code. In a well-designed system, none of the function/method calls would need to be changed and the APIs would remain the same. What we mean by loose coupling is that changes to the internals of one module should not cause changes to the internals of another.
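
The "internals can change, the API cannot" idea can be sketched in a few lines of Java. Everything here - the `CustomerStore` interface, the in-memory implementation, the names - is a hypothetical example of mine, not a prescribed design:

```java
// Callers depend only on this interface, so the storage internals can
// change freely behind it without rippling into other modules.
import java.util.HashMap;
import java.util.Map;

interface CustomerStore {
    String findName(int customerId);
}

// Today the data lives in a map; tomorrow it could be a SQL table or a
// web service. Either way, the servlet and UI modules that call
// findName() never have to change.
class InMemoryStore implements CustomerStore {
    private final Map<Integer, String> rows = new HashMap<>();

    InMemoryStore() {
        rows.put(42, "Ada"); // stand-in for whatever the real source is
    }

    @Override
    public String findName(int customerId) {
        return rows.getOrDefault(customerId, "unknown");
    }
}

public class CouplingDemo {
    public static void main(String[] args) {
        CustomerStore store = new InMemoryStore();
        System.out.println(store.findName(42)); // Ada
    }
}
```

Swapping `InMemoryStore` for a database-backed class is the kind of internal change that, in a loosely coupled system, touches exactly one module.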

Cohesion is a bit different and deals with object responsibility. If an application has an object that represents a Bank Teller, that object should be highly cohesive in that it performs only those tasks associated with a teller. For instance, you would not expect to see a method call such as: Teller.openBranch() as a means of preparing the system to do business. Possibly the "openBranch" method would be appropriate for the BranchManager object. You'll sometimes hear the phrase misplaced responsibility used between developers discussing an application's cohesion.
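A minimal sketch of the Teller/BranchManager split (every method name besides openBranch is invented for illustration): each class keeps only the responsibilities of its role.

```python
# Hypothetical sketch of high cohesion: each class performs only the
# tasks associated with its role.

class Teller:
    """Handles teller-level tasks only."""
    def process_deposit(self, account, amount):
        account["balance"] += amount
        return account["balance"]

class BranchManager:
    """Opening the branch is a manager's responsibility, not a teller's."""
    def open_branch(self):
        return "branch open for business"

account = {"balance": 100}
print(Teller().process_deposit(account, 50))  # 150
print(BranchManager().open_branch())
```

Putting openBranch on Teller would still run, but it would be the "misplaced responsibility" developers complain about.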

A loosely coupled system is one where it is easy to make changes in modules of the code without having to make changes in others. A highly cohesive system is one in which the purpose/responsibilities of the various modules are clearly and cleanly delineated.

So what on our little green Earth does any of this have to do with improving time-to-market? I've seen statistics that indicate 70% of an application's life is spent in maintenance mode, i.e. after the initial development. We have applications that are over 20 years old where the original development spanned less than 18 months. These applications have spent 92% of their life in maintenance mode - and trust me, these apps have been constantly updated.

Recognizing that applications undergo change, and exist for most of their life in maintenance mode, the important question becomes: how can we facilitate changes to reduce the time and costs associated with them? Enter coupling and cohesion.

Applications with loose coupling allow for changes in one module/class/object/procedure to occur without necessitating changes in another. Highly coupled applications (high = bad) often cause programmers, users, and customers headaches when changes in one part of the code cause unintended problems in another. The real problem, of course, with highly coupled code is that there are no warning signs, no breadcrumbs, no indicators that the change a programmer is about to make is going to cause an issue somewhere else. So, you are left with each and every change requiring substantial desk checking and regression testing to certify the application. This takes time.

Highly cohesive applications enable developers to quickly determine where functions are (for modifications) and where they ought to be (for new function). Additionally, highly cohesive systems tend to also be loosely coupled. I'll not go into the math of this here, but experience shows that coupling and cohesion tend to travel together, for both good and bad (i.e. high coupling (bad) tends to be found with low cohesion (also bad)). Low cohesion (bad) makes programmers work harder to identify problems in the code and/or add new function. This takes time.

As directors, managers, and developers we should strive for applications which are loosely coupled and have high cohesion for many reasons; the primary one being a reduction in time-to-market for modifications, bug fixes, and enhancements.

Saturday, September 20, 2008

My First Program

Do you remember your first program? I don't mean the first programming assignment in school, I mean the first application you ever wrote that accomplished some real goal. I am not an IT guy who started out as a geek in high school working in the school's computer lab. To be fair, I may have been a geek (I'd like to think not, but there were an awful lot of girls that would have nothing to do with me), but at the time I was in high school we, um, er, well.. didn't have a computer lab. Those came later; after electricity.

No, in those days I was already working as a magician, looking forward to fame and fortune. Since I am not metabolically built to handle the whole starving artist thing, I did go to college and get a job running retail stores. After a while I noticed that the computer printouts we received from the home office were, at best, a month old. There had to be a better way, so in 1978 I found myself a Radio Shack TRS-80 Micro-computer and began learning TRS-DOS and Basic.

What I needed was a summary of sales results by employee compared with their quota; who was over, under, and on target. I needed to see this for the week, the month, and the quarter. And, without even knowing what was possible, I established the requirement that all the information had to fit on one screen, graphically. I only had three obstacles:
  • I didn't know what an operating system was
  • I didn't know Basic, or programming, or logic
  • TRS-80 Micro-computers didn't do graphics

Other than that, I was good to go! A few months later I did have a sales management system, complete with character-based bar charts which used asterisks (*) for the chart symbols. Any story that would suggest this first effort was the model of object oriented programming, well documented, loosely coupled, and developed using a structured methodology would have to start with "Once upon a time." The most and only positive attribute of the code was that it more or less worked. Mostly less. I used it for two more years, during which time I went back to school for Computer Science.

To this day, I still view every IT request through the eyes of a business person trying to accomplish some business goal, with an emphasis on the usability of the solution. I still favor graphic displays of information rather than data-laden reports, only now I am somewhat neurotic about encapsulation, abstraction, inheritance, and loose coupling. Call me an OO bigot if you must, a purist, old school, or just experienced.

What about you? Do you remember your first? What elements of the first application do you still carry with you today? Where have you grown? What still bugs you?

Thursday, September 11, 2008

Are You Building a Real-Time Enterprise

I'm currently reading Michael Hugos' book titled Building the Real-Time Enterprise and it made me wonder about my own company and how ready we are for Real-Time. Hugos defines Real-Time Enterprise as:

"A real-time organization is a company that has learned to operate in a continuously changing and coordinated manner with its suppliers and customers. A real-time organization receives, analyzes, and acts on a steady stream of information that describes its market environment and its internal operations."

I work in the banking and finance industry which has relied on automation and Information Technology (IT) for as long as computers have been available. In fact, the finance industry spends more on IT, as a percent of revenue, than any other industry according to Forrester Research. Yet, a substantial amount of our internal IT solution space is geared around time-based or batch processes.

As we move forward in time, especially as on-line banking and customer growth penetrate more and more time zones, the fundamental concepts that led banking 40 years ago into 'nightly processes' will have to give way. Our industry will need to operate in real-time, because phrases like 'end of day', and 'nightly cycle', and 'maintenance window' simply have no meaning in a 24 hour world.

The finance industry is not alone in coming to grips with real-time processing. I was at a conference this year and heard Dean Hager, CIO for Lawson Software tell a pertinent story contrasting the police departments in St. Louis and London. Hager and family live temporarily in London, although he is originally from St. Louis. He was anxious to visit friends and family on a recent trip to the States and on his return to the airport was apparently traveling in excess of the posted speed limit. His high velocity excursion was interrupted by the flash of colorful lights, quasi melodic tones, and a local constable.

Most of us know the drill from here, with the exchange of government documents, the quiet non-verbal pleas for lenience, and the unavoidable "thanks Officer", delivered with appropriate sincerity. In this case, however, the process actually ended quite well for Hager, as the officer allowed him to proceed on his way, albeit at a slower pace, without a citation. Before leaving, Hager decided to check his mobile phone for any email messages from his airline to see if his flight was delayed.

Among his new mail was a notification from the British Motor Vehicle agency that his other car (driven by his wife) was just photographed running a red light in London. The email was accompanied by a picture, a court summons, and a bill for the offense.

Consider the contrast between the scenario in the US, where the police officer had to wait in hiding for a speeder, pull them over, take all the risks associated with a traffic stop (videos abound on YouTube), and then, after an hour of work, end the event without a citation, i.e. without collecting any fees for the state to offset the trouble/expense of maintaining speed control. Meanwhile in Britain, the government was able to issue a bill for services rendered (traffic enforcement) with zero risk to an officer, in real time, with sufficient data to counter any arguments.

I'll leave the whole big brother, cameras in the sky debate for another time. Additionally, the roles could be reversed as there are cities in the US that use cameras and places in Britain that do not - so this is not a US versus Britain conversation. For now let's concentrate on the two business models. Which one is more effective as a revenue generation solution? Which one is safer? Imagine how many lost opportunities happen when an officer is processing a single speeder.

In Hager's real world example we get to see the impact of a conventional business model compared to a real-time model.

Even though our banking systems still need to rely on batch processing for many of our functions, we should be thinking about solutions that operate continuously. So, for instance, let's say you are building, buying, or deploying a system that receives data from a batch process. Ask yourself - even though the upstream process is batch, does your new process have to be batch? Could the new solution be deployed to operate in real time, even though initially it will only get used in spurts, i.e. when the batches are ready? Later, when the upstream batch process changes to a continuous flow, your system will already be able to handle it. In other words, don't propagate time-based/batch architecture unless it is necessary.
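One way to picture this (a hedged sketch; the function names and record shapes are invented, not from any banking system): write the processing logic to consume records one at a time, so the very same code works whether its input arrives as a nightly batch dump or as a continuous feed.

```python
# Hypothetical sketch: record-at-a-time processing that is indifferent
# to whether its input came from a batch file or a live stream.

def process_record(record):
    # Stand-in for real business logic.
    return record["amount"] * 2

def handle(source):
    """Accepts any iterable of records: a loaded batch or a live feed."""
    return [process_record(r) for r in source]

nightly_batch = [{"amount": 10}, {"amount": 20}]
print(handle(nightly_batch))        # fed by a batch: [20, 40]
print(handle(iter(nightly_batch)))  # fed by a stream-like iterator: [20, 40]
```

Nothing in handle assumes a batch window; when the upstream source becomes continuous, only the plumbing that feeds it changes.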

What examples of real time systems do you have?

Monday, September 1, 2008

How to create unmaintainable systems

Several years ago, an article appeared entitled "How to write unmaintainable code." It is a very funny look at ourselves, because we have all used some of the techniques outlined in the piece. My personal favorite is:

"Exceptions are a pain in the behind. Properly-written code never fails, so exceptions are actually unnecessary. Don't waste time on them. Subclassing exceptions is for incompetents who know their code will fail. You can greatly simplify your program by having only a single try/catch in the entire application (in main) that calls System.exit(). Just stick a perfectly standard set of throws on every method header whether they could throw any exceptions or not."

I'm not going to revisit unmaintainable code, but I'd like to discuss building systems in an enterprise, and how "programmers in a box" struggle with the transition to corporate development. So here are my top picks for how to create systems that are unsupportable and unmaintainable in an enterprise. Of course, if you start with unmaintainable code, then delivering an unmaintainable system should be child's play.

Use whatever programming language you want:
Is Java the corporate standard for distributed development? Sure, but some .Net programming will pad your personal resume. What the heck, might as well write a .Net application using both VB.Net and C#, then you can throw your code into Visual SourceSafe even though there is no corporate support, no way to get your code to production (legitimately), and of course, SourceSafe doesn't always lose your code. Then for burps and giggles you can test your application using the embedded Microsoft test suite, even though the corporation has paid (significantly) for an enterprise testing solution, built a centralized testing Center of Excellence, and established a professional testing governance program - but what the hey, your personal reputation is more important than all that.

Join the Scripting Language of the Month Club:
Scripting languages are notoriously easy to use, and equally difficult to debug or maintain. Each contains some nuance or specialty that makes it ideally suited for a specific task, but generally good for little else. They tend to have no formal structure, use cryptic syntax, and sprout up as frequently as spam. So, first of all, use a scripting language even when a more structured language is available. Secondly, take the time (at work) to become a certified script kiddie in JavaScript, ECMAScript, Groovy, Python, Jython, JRuby, ANT, Perl, PHP, Rexx, Kixtart, DOS Batch, UNIX Script, TCL, Pike, and VBScript so that no one can ever predict what language you'll use next, what VM needs to be installed on the servers, what version of the VM, or who you are (because you'll never include an Author tag in the source file). Lastly, make sure to call scripts in one language from scripts written in others. Oh, and lest we forget, breed lots of confusion and dissent by blithering on about how your new script du jour is better than yesterday's legacy hack, how out of touch the company is, how there's never any training, and what a bunch of morons you work for.

Assume you'll be the only person who will ever look at the code:
Hey, at the moment you wrote the code (or downloaded it) you kind of, more or less, understood it - so there's no need to insert any commentary about what it does or why. Besides, the code is self-documenting to any quasi-literate programmer with half of your underpaid, overworked skill. After all, it's not like you hack code like the last doofus whose program you had to maintain. If only that guy had taken the time to insert a few meaningful comments. No, your code is different, it is simple, except where it isn't. It is bullet proof and will never need to be modified. Ever. By anyone. Even you. Ever.

Assume you're the first programmer that ever existed:
Every time you're handed an assignment, whether it's related to the human interface, the database, security, or business logic, assume that no one has ever been given this or any similar assignment and start coding from scratch. Furthermore, assume your code will be the end of this requirement for all time. No one will ever think to reuse it, so there's no need to document it, or to, gasp, make it abstract! No, make sure that the inner workings of the code have no interfaces, everything is 'final' and for heaven's sake do not package it up for export. Don't even think about a web service.

This cannot be the end-all on this topic, so I'd like to hear your thoughts on developing unmaintainable systems in a corporate enterprise.

Sunday, August 24, 2008

You get what you incent

Just to be clear, humans are incented by more than just money.  So the phrase, you get what you incent, does not necessarily mean you get what you pay for - although that may also be a true statement.  A friend of mine once said, "I can go a month on a good thank you."  Incentives come in all shapes and sizes.  If you are trying to deliver quality software, think about incenting the activities that will yield proven, sound techniques.

Naturally, some managers believe that their staffs are professionals and that beyond the compensation package and each individual's desire to deliver high quality products, the manager does not need to provide additional incentives.  I tend to agree with most of that perspective, up until the 'don't need' part.

Consider this.  The Federal Aviation Administration conducts tests to better plan for and handle airline emergencies.  Understanding how people will react in a crisis situation can help the FAA create new procedures and safety equipment from which we will all benefit.  So... they simulate airline emergencies using professional actors who get paid $11.00 per hour for their work.  These actors are given specific roles to play and specific seats in which to sit.  The FAA then creates the faux incident to video and monitor the actors' reactions.

But those actors, with all of their Stanislavski training, just don't respond in the same way as real, honest-to-goodness, terrified-for-my-life-get-out-of-my-way real passengers.  Passengers who, by the way, are not paid anything to act hysterically.  Confronted with this obstacle, the FAA was perplexed as to how to get the actors to behave more irrationally.  You get what you incent.  So they told the actors that the first 25% off the airplane would be paid double.  It worked, with knees, elbows, profanity, and me-first behavior running rampant.

As an interesting aside, how many times a day do you think these emergency procedures are employed?  You've heard the flight attendants talking about finding the closest exit, which may be behind you yada, yada, yada.  But of course, any airline emergencies that might require your actual attention are so very rare, that well, watching baggage fall off the conveyor belt is a much better use of your time.  Someone once commented that in the event of an emergency landing they will not be looking for an exit; instead they leave via the large gaping hole in the aircraft.  Eleven.  According to the FAA, airlines utilize their emergency evacuation procedures 11 times a day in the US. 

So let's say you want to make sure that the code being developed is appropriately documented.  First decide what that means; does it mean you have a code-to-comment ratio greater than 20%?  How are you going to measure it: with tools or by hand?  Is it just raw commentary, or meaningful commentary?  Then create an incentive for getting there.  Any developer who delivers a code-to-comment ratio between 20% and 25% gets extended lunch for a week.  Or spend $25 on a trophy and whoever delivers gets to keep the cup in their cube for a month.  It won't take much - consider the actors 'pretending' to be scared to death.  At the same time, if the prize is too cheesy, it will diminish the value of the very goal you're trying to achieve.  Be realistic.
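As an illustration of how simple the 'with tools' option can be (this is a naive sketch that only counts whole-line '#'-style comments; a real measurement would also need to handle block comments and doc strings):

```python
# Naive code-to-comment ratio: comment lines divided by code lines.
# Blank lines are ignored; only whole-line '#' comments count.

def comment_ratio(source_lines):
    code = comments = 0
    for line in source_lines:
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("#"):
            comments += 1
        else:
            code += 1
    return comments / code if code else 0.0

sample = ["# add two numbers", "def add(a, b):", "    return a + b"]
print(comment_ratio(sample))  # 0.5 - one comment line per two code lines
```

Crude as it is, even a script like this makes the incentive measurable instead of arguable.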

Let's say your team is moving office locations.  Make one of the window seats in the new location an incentive for something you need the team to rally around.  I know people who will run over their grandmother for a window seat.  Why their grandmothers are working here, I don't know, but still!  Here are some other incentives that might just cause some useful excitement:

  • A meeting-free week for the first person to accomplish 'X'.  Do you know what I would do for a meeting free week?
  • Upgraded computer.  Let's say you have ten personal computers in your team, each with a different refresh date.  Whoever reaches goal 'XYZ' gets the next refreshed computer - and time to set it up 'just so'.
  • All jeans all week.  Allow the winner of a contest to wear jeans all week, instead of just Friday.

These are but a few off-the-top-of-my-head ideas.  Think about what is meaningful to your team.  I know of one conference room where there is only one decent chair.  Whoever meets the challenge first gets that chair, guaranteed, in every meeting for a month.  Of course for longer-term goals, you'll want something repetitive and meaningful, but the idea is that humans react to incentives other than US American Dollars.  Of course, we tend to react to them too.

What are some of your incentive ideas?

Tuesday, August 12, 2008

The Art in Architecture

I was once asked during a job interview if I had ever done anything creative. I asked in reply - "you mean other than programming?" I've always felt that programmers, DBAs, Systems Architects, Network Analysts, and others of our ilk - the good ones, anyway - use both the left and right sides of their brains. What we do is art. It is also science to be sure, but if it were only nuts, bolts, glue, bailing wire, and spit there could be no enjoyment, no thrill, no satisfaction.

The Roman architect Vitruvius stated that good architecture must possess and balance three attributes: function, structure, and beauty. Vitruvius only had stone, clay, and wood with which to work (these pre-date Assembler) and yet his fundamentals can be applied to modern IT architecture.

Function: This refers to how a thing is used; i.e. it must accommodate practical requirements for every purpose within its domain. A building without function may be beautiful, but it's sculpture, not architecture.

Structure: This refers to how a thing stands up: its strength, flexibility, and resiliency. Whether it consists of steel, wood, brick, or binary digits (bits), the framework must resist the loads placed upon it. But to be architecture, it must do more.

Beauty: This refers to sensory appeal and elegance. It is what Vitruvius called "delight." Architectural delight can be found in a curved stairwell, a vaulted stone ceiling, or an object so simple and elegant that it cannot fail. Beauty is the ultimate test of good architecture.

In our day-in, day-out, cost-driven, date-driven world of corporate computing we tend to focus on function, only sometimes considering structure, and rarely embracing beauty. But an elegantly designed object, service, or application is the one that tends to work without incident over long periods of time. These are the components that can be easily extended by third and fourth generation developers, long after the original author has been promoted to Project Manager.

IT designers looking to build high function, structurally sound, beautiful/elegant computer systems are often thwarted and discouraged by 'pragmatists' who talk about the real world, the trenches, and who have no time for 'purists.' Imagine what the world would be like if IT pragmatists inflicted their myopic standards outside of the digital domain. How dreary life would be.

Consider the project you are working on right now. Is there any part of it that you feel functions (does its job), has good structure (is resilient and flexible), and is beautiful (simple, elegant, pretty)? Try not to confuse beautiful with ornate or ostentatious. Elegant computer systems are simple. As Einstein put it, "as simple as possible, but no simpler." Your project could be a new user interface, a new database design, an indexing scheme, a network topology, a security solution, business logic, a messaging transport layer, or a COBOL batch job. Beauty can exist anywhere, but only if the designer cares.

Here's a test for your next problem - do you tend to implement the first solution that pops into your mind? Do you stop thinking after finding the first workable answer to a problem, or do you continue to think it through, considering other possible angles, approaches, and ideas? David Brenner, the comedian, once commented that of course you find things in the last place you look - who would find something and then keep looking for it? But in the arena of IT solution design, the first answer you invent is often just the core of a good idea - its function - with structure and beauty a few more thoughts away.

You are an artist that must balance function, structure, and beauty to be truly great.

Tuesday, July 15, 2008

You WANT to Program the Mainframe!

I started out in the IT field learning the fundamentals of application design and development on a Radio Shack TRS-80 Microcomputer with a cassette tape drive - Woo Hoo! My next hands-on experience was as a student, learning the basics of business computing on a Sperry-Univac 90/30. This mainframe system ran at the blazing speed of 0.5 MHz and had 8K of RAM. That's 'K' as in 1024 bytes. We typically buy RAM in Gigabytes, 'G', as in 130,000 times more. My current laptop operates at 2 GHz; 4,000 times faster. So my laptop, built in 2008, is significantly more powerful than my 1973-era mainframe.

Clearly the future belongs to smaller personal computers and their bigger cousins, servers; right? Programmers graduating from two and four-year programs want to develop web services, service-oriented architectures, object-oriented applications, applications with cool user interfaces, and, of course, work on meaningful projects - so the mainframe is magnus-calculus-non-grata; right?

Ask any new programmer if they would like to work on the 'big iron' and very few, if any, will assert to the positive. You are more likely to get a snicker and an eye roll than any form of civil communication. And yet, there are some real reasons why programmers entering the workforce, or even experienced developers, should consider a mainframe-based career.

First, the mainframe is not going away. Rumors have abounded for years on this topic and in fact, these same prognostications led me to enter the world of distributed computing 27 years ago. Talk about dinosaurs, stone knives, and bear skin rugs! We've come a long way with personal computers, servers, and mid-range systems, but remember, mainframes had a 40 year head start - and they haven't stagnated lo these past few decades. So the mainframe is a vibrant, growing technology stack and represents a solid career path for as many years into the future as anyone can predict.

So instead of asking "Do you want to program the mainframe?" ask a question like, "Do you want to work on the most important applications requiring the highest level of reliability, performance, security, and visibility?" "Do you want to be able to analyze, process, and mine more data, faster, and more often than ever thought possible?" Consider this, with Parallel Sysplex technology, you can combine the processes of up to 32 z/OS systems, yet make these systems behave like a single, logical computing facility. What's more, the underlying structure of the Parallel Sysplex remains virtually transparent to users, networks, applications, and even operations. Let me be clear - this capability is even transparent to the developers. Imagine writing an application that could run on/across 32 mainframe systems as if it were one super large system running billions of instructions per second.

You may say, "We can do that with clusters of servers and a grid computing model," and I'd say - well, partially. You cannot do it transparently to the application; you have to specifically design the application to take advantage of the grid - no trivial feat. And even if you did, you'd still underperform the mainframe's input/output processing. No, the reality is, if you want/need raw horsepower, including I/O that is well managed, you cannot beat the mainframe.

How about this - do you care about the environment? I don't mean are you a tree-hugging, business-hating, log cabin living eco-focused tofu eater. I simply mean, do you care at all? David Anderson, an IBM green consultant says "A single mainframe running Linux may be able to perform the same amount of work as approximately 250 x86 processors while using as little as two to ten percent of the amount of energy." IBM's new z10 is the equivalent of nearly 1,500 x86 servers, with up to an 85% smaller footprint, and using 85% less energy. According to Morgan Stanley, energy now represents 44% of the total cost of operating a data center.

So if the raw power impresses you and the Green IT message resonates, all that is left is the visceral degradation you associate with ... here it comes, say it with me... COBOL. The Monty Python of programming languages (I'm not dead yet, I think I'll go for a walk). Each and every day there are more COBOL programs executed than there are web pages viewed on the entire Internet. COBOL is here, but that doesn't mean a mainframe career is necessarily entrenched in MOVE, PERFORM, Working-Storage, and Filler Pic X(9). How about architecting, building, and managing the company's hundreds of service-oriented CICS-based transactions all exposed as real honest-to-goodness web services? All exposed and available at the highest speeds, in the most reliable environment, eco-friendly, and time-tested. These are the routines upon which the company, our customers, and your paycheck depend.

I'm not suggesting that there aren't any distributed applications of importance - surely there are. Most of them happen to be fed from information and transactions running on the mainframe. Here's a killer on the topic of straight performance: according to Forrester, COBOL still outperforms J2EE in a direct function-point versus function-point comparison.

Today's mainframes require as many diverse skills as you would find in the distributed environment. In fact, with WebSphere running on the 'z', all of the distributed development technologies including Java, J2EE, HTML, CSS, Javascript, and XML are in play. But so are CICS, DB2, IMS, VSAM, and a host (pardon the pun) of other technologies upon which one could build a long, dynamic, lucrative career. And given the maturity of the environment, it's a career where you'd be inventing things, not re-inventing them.

Sunday, June 29, 2008

The value of SOA

There is a lot of confusion around SOA - Service Oriented Architecture. How will it help? Is it software, hardware, both? Is it something you buy, build, or is it an architectural concept relying on fundamental principles of OOAD such as abstraction, inheritance, encapsulation, and polymorphism?

Yeah, you get the picture. So, meet Greg, the Architect - he has a similar problem in trying to understand just what SOA is.

Greg is driven by his CIO's need to improve revenue, decrease costs, and get new product function and feature to market faster. Greg goes looking for ways to deliver SOA from various sources, only to discover there is a maze of content to consider, some of it coming from, shall we say - special interest groups (Vendors?).

It's not really possible to define all of what SOA is and is not in one short blog, so instead I'll provide some relevant links and discuss it in PNC terms. SOA is about assembling (buying or building) applications and application components that can be reused in ways which were not originally known. You might buy a solution for Customer Relationship Management (CRM) and later reuse the part of the solution that looks up a customer's account balance. Much of the SOA conversation surrounds how to reuse the get-the-customer-account-balance information.

So first, SOA is about reusing applications and/or parts of applications in new ways. This is good, because any piece of program code you reuse is a piece of code that doesn't have to be paid for, is available immediately, and is probably pretty high in quality (since you're already using it). Now, programmers have been reusing code since Grace Hopper cut her first nanosecond. Problem is, they've been copying and pasting the reusable code, which leads to all sorts of issues when a bug is found.

So, SOA is also about how to reuse code. In its simplest form, the parts of applications that are to be reused are made available as Web Services, a specially designed way of finding, accessing, and using reusable code. The simple term "Web Service" actually has a lot of stuff behind it that allows for securing, monitoring, controlling, combining, and aggregating business functions into powerful capabilities which were not envisioned when the original code was acquired. That is the power of SOA - it provides a capability to be more flexible as time moves on. With SOA, speed to market can improve over time, even though systems become more complicated. Applications can become more resilient, more reliable, and more controlled, because SOA was designed around business concepts, not computer concepts.
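To make the get-the-customer-account-balance example concrete (a hedged sketch: there is no real HTTP plumbing here, and all names and data are invented), the key idea is a reusable business function sitting behind a thin service-style contract:

```python
# Hypothetical sketch of the SOA idea: write the balance lookup once,
# then expose it through a service facade so new consumers reuse it
# instead of copy-pasting the logic.

import json

_balances = {"12345": 250.00}  # stand-in for the CRM system's data

def get_account_balance(account_id):
    """The reusable business function."""
    return _balances.get(account_id, 0.0)

def balance_service(request_json):
    """Service facade: the contract other applications couple to."""
    account_id = json.loads(request_json)["account_id"]
    return json.dumps({"account_id": account_id,
                       "balance": get_account_balance(account_id)})

print(balance_service('{"account_id": "12345"}'))
```

A real deployment would put an actual web service (SOAP or REST) in front of balance_service, but the reuse story is the same: one implementation, many consumers, and one place to fix a bug.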

Want to learn more? Here are some useful links for SOA:

Friday, June 13, 2008

Time to Market

It's baaaaack! Every so often the Time to Market monster raises its head and the village chiefs start to point and run. "We need to improve our time to market/profit/customer/engagement/yada/yada/yada!" I'm not suggesting that we don't need to decrease the time it takes to bring new features, products, or services to our customers; quite the reverse - we absolutely need to deliver more, faster.

My concern is twofold. First, Time to Market is addressed as an afterthought. After we've determined the requirements, the resources, the tools, and the cost, we decide to add (like it's an ingredient) development speed. (We also do this with quality - "Hey, let's make it high quality while we're at it"; but that's a subject for another time.) Development speed is not something that can be *added* to a project, a team, or an organization. It is something that happens when the project, the team, or the organization has acquired predictability in its development process. More on that later.

The second issue with the cyclical Time to Market mantra is that we already know what it takes, yet every couple o' years we do the "focus group" thing to discover how to improve our development speeds. Sigh. Imagine mail delivery in the American West of the 1860s, i.e. the Pony Express riders. These people transferred mail from town to town, like a giant relay race on horseback, riding as fast as they could. Improving delivery times generally meant beating the horses more, getting bigger, stronger horses, and riding them longer. Even with all of that, you could only eke out so much speed.

Along comes a visionary who says we could transfer the mail faster if we built a railroad. The Pony Express riders have their doubts because to them, with the wind in their faces, the mail is moving as fast as it can, and besides, it will take too much time to build a railroad. Railroads require too much construction, too much structure, AND you lose flexibility. Trains can only go where the tracks are. Jump ahead a hundred years and airplanes are used to move the mail. Consider FedEx.

The system used by FedEx is a marvel to see. Every night hundreds of airplanes, big ones like Boeing 747s, fly into Memphis, Tennessee, where all of the packages are offloaded, sorted, reloaded, and flown out like a giant lung inhaling and exhaling. On an average day FedEx brings into and out of its Memphis SuperHub three million packages; on December 22, upwards of five million. All of these will be delivered within twenty-four hours. The processing facility is capable of handling 500,000 packages per hour: 325,000 small packages, 160,000 boxes, and the remainder large or oddly shaped.

Here is a graphic that shows the FedEx aircraft over a 24 hour period flying into and out of the Memphis SuperHub:

Now let's consider those statistics in relation to software development. FedEx is arguably one of the most efficient organizations in the world, as evidenced by the number of packages they deliver overnight. You can't get much better "time to market" than overnight! And yet less than 3% of what they handle is "special cases" - large or oddly shaped. They have created an infrastructure whereby 97%+ of their requests can be handled by the infrastructure with little to no special handling.

This is the secret (not that it is a secret) to improving time to market. Create an environment where only a small fraction of your user's requests need to be treated as special cases. How? Just follow these rules:
  • Know the business - putting the developers, or at least the designers and testers inside the business is a significant improvement.
  • Use Agile/Iterative/Scrum/eXtreme/shorter methodologies and cycles - don't even plan for a 6 month development project - you'll never get it right (see the Standish Group CHAOS Report)
  • Limit the solution sets - have predefined architectures, server configurations, hardware choices, UI standards, etc.
  • Build, maintain, and enforce reusable libraries - dedicate a very small team to identifying code modules, objects, and patterns and do not let newbies re-invent the wheel.
  • Tools - integration is more important than functionality; spend the time and the money to allow work to flow seamlessly from one set of tools to another.
I'd like to say that one of these is key, sort of a Golden Rule, but each brings separate and distinct value. Do them all. I would add that these are also key ingredients (did I just say that?) to improving software quality.

Monday, June 2, 2008

Know your test data

There has been a lot of attention given lately to the processes we go through to test our applications in preparation for a new release. Of course, an application must have data, and that data must reflect the realities of the business. This leads us to an interesting problem: how do we ensure that our test data provides the full range of possible situations our applications will find in production? Easy, we'll just copy our production data to a test database or file.

Wait, not so fast. First, we have to be careful that our test data does not reveal information to people who don't already have a business need to know. ('Testing' is not a business need to know!) So, unless our programmers and QA staff have an existing need to know John Q Public's account balance, then letting them see the balance in the test data violates our privacy principles. So copying production data to test should, at best, only be a first step. The next thing we need to do is obfuscate the data, altering identifying information such as ID numbers, names, addresses, and such. But wait, you say!

If I start obfuscating ID numbers, then I'll break the referential integrity between my files. I won't be able to correlate the data in one file with the related data in another. Yep, that's right. Unless you are careful and change the ID number in all files the same way. This is non-trivial, and will likely take another program, which will itself need to be tested with test data. So this copy-from-production approach takes some work. Even then, it is fundamentally flawed, and you have to work to minimize the impact of the problem.
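One common way to change an ID the same way in every file is deterministic pseudonymization: derive the fake ID from the real one with a keyed hash, so joins between files still line up but the original value can't be read back. A small sketch, where the key, field names, and sample rows are all hypothetical:

```python
import hashlib
import hmac

SECRET_KEY = b"rotate-me-and-keep-out-of-source-control"  # hypothetical key

def pseudonymize(real_id, key=SECRET_KEY):
    """Map a real ID to a stable fake one: the same input always yields
    the same output, so referential integrity across files is preserved."""
    digest = hmac.new(key, str(real_id).encode(), hashlib.sha256).hexdigest()
    return digest[:10]  # shortened for readability; ample for test-data scale

# Apply the same mapping to every file that carries the ID.
customers = [{"id": "111-22-3333", "name": "John Q Public"}]
balances  = [{"id": "111-22-3333", "balance": 42.00}]

masked_customers = [
    {**row, "id": pseudonymize(row["id"]), "name": "Customer A"}
    for row in customers
]
masked_balances = [{**row, "id": pseudonymize(row["id"])} for row in balances]
```

Because the mapping is a function of the value rather than a lookup table, each extract program can mask its own file independently and the joins still work.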

When copying data from production, one has to be careful not to assume that all reasonable data possibilities are necessarily present. Just because a certain data configuration can exist doesn't mean that it must exist in every rendition of production data. Just because a situation existed last month doesn't mean it will exist that way next month. But it might the month after that. You must therefore validate, with every production download, that all data permutations are present.

The better solution is to create the data you need to fully and completely validate every function and process in your application. This process will take some time at first, but eventually you'll have a known, predictable test data set and can validate your application with the mathematical certainty usually reserved for GPS systems. Occasionally a situation will arise in production that is not represented in your test data set; you'll have to add that condition in. You may even need to write code to 'refresh' your test data to update dates or other permutation-sensitive elements.
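As a sketch of the create-your-own-data approach, you can enumerate the business permutations directly so every combination is guaranteed to be present, rather than hoping production happened to contain it. The dimensions below are invented for illustration:

```python
from itertools import product

# Hypothetical business dimensions we want exercised in every test run.
ACCOUNT_TYPES = ["checking", "savings"]
STATUSES = ["open", "frozen", "closed"]
BALANCE_CASES = [-50.00, 0.00, 9999.99]  # overdrawn, empty, large

def build_test_accounts():
    """Generate one account per permutation, so every combination is
    guaranteed present regardless of what production looked like."""
    return [
        {"id": i, "type": t, "status": s, "balance": b}
        for i, (t, s, b) in enumerate(product(ACCOUNT_TYPES, STATUSES, BALANCE_CASES))
    ]

accounts = build_test_accounts()
```

Adding a newly discovered production condition then means adding one value to one list, and the generator guarantees it is combined with everything else.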

The fundamental principle is this: you must know your test data or you cannot claim to have exercised the application. Copying from production opens the organization up to privacy issues and does not guarantee you've tested every condition. Furthermore, if you have an uncaught error in your existing version of the code, and you rely on the "well, the new code matches production" argument, you serve to perpetuate the error - with confidence!

The best practice is to create your own test data, making sure that every business permutation is fully exercised. Then, and only then, can you state with full confidence that you have tested the application. Now, let's talk about code coverage!

Friday, May 23, 2008

Advance yourself

There's an old joke that goes, "What do you call a medical student that only gets a 'C' on his Board Examinations?" And the answer is "Doctor." How many of us wonder if our Doctor did well in school, or more to the point, has stayed abreast of new thinking, treatments, and prescriptions. We all kind of assume that our medicos read, go to conferences, and learn from their peers throughout their professional life.

Well, we're professionals. Yet, how many books have you read about your craft in the last year? I don't mean reference manuals, syntax bibles, or error logs. I mean books, articles, web sites, blogs (I guess this one would count - huh?), and other sources. Books about security practices, or development patterns, or performance tuning?

If you're thinking that the company hasn't sent you to training since.... stop! How would it sound if your doctor said they hadn't read a medical book in two years 'cause the insurance company wouldn't pay for it? You'd find another doctor. Your key to success is based on experience and knowledge. Experience comes with time, but you can accelerate learning and knowledge. It's your career, your profession. Advance yourself.

Create a plan to read three books this year relative to your career, or the career you want. Then, walk on down to your local Barnes & Noble, or check out Amazon.com, and pick up a book on an IT-related topic - something that will expand your thinking. If you aren't using Patterns in your development, that's a really great place to start. Patterns are the architecture of code. Here is a free on-line book about patterns in Java. Do you know what Model Driven Development is? What about the Scrum methodology? Can you recall the forms of database normalization? These are but a few topics - there are about a billion more.

Here are some examples:
Use the comments section here to tell us about the last book you read. Right now I'm reading "Groundswell - winning in a world transformed by social technologies."

Thursday, May 1, 2008

EA's power comes from the message

Oftentimes I'll hear an enterprise architect discuss their concern over how much support they get from their company's CIO or CTO. "If they'd only back me more often, I'm sure we could get more things done." Comments like that usually follow a scenario where an architect proposed, recommended, or demanded a particular solution, only to be overridden by a business decision.

We've all been there, and know the feeling. First things first: as an architect, don't put yourself in a position where you can lose. That doesn't mean you can't stake a position or demonstrate conviction. You just don't want to say "No"; rather, you want to say "There are risks/costs/concerns with the solution that my approach addresses." (This implies, of course, that you have a viable solution for anything to which you might say "No.")

Secondly, put yourself in the position of the CIO/CTO/CEO and imagine two of your brightest people have come to you with differing opinions. One, a business manager, is saying "I need to do 'X' to advance my business" (with some implication of a technology that is inconsistent with EA's goals/vision). The other bright person (you?) says to the CxO that the proposed solution 'X' is inconsistent with stated goals, is technically inferior, and will ultimately cost more and perform less.

Now, of the two of you, only one is offering increased revenue, market share, penetration, or some other goal which is likely tied to an incentive package. I'm not suggesting that the motives here are anything other than pure and in the best interest of the company. Rather, the business goals are typically incented and aligned with customer and shareholder expectations. You can count on the CxO to support the business every time.

The power of the EA group will ultimately derive from the EA message, which should be aligned with customer and shareholder objectives and expectations. You will begin to win support of your business units/customer before ever talking to the CxO when your message of EA value is expressed in terms which support the business goals. Asking the CxO to overrule a business decision in favor of an EA recommendation is a bad move; the conflict should never reach his/her desk.

Tuesday, April 15, 2008

Purpose(s) for Architecture Reviews

Large corporations, like ours, conduct architecture reviews of major technology initiatives to determine their fit with the organization's operating practices. Exactly what comes out of a review should depend on when the review is performed and the state of the project. By our analysis, there are seven kinds of reviews, each with a different purpose, outcome, and timing.

Reviews performed before the business has a specific project in mind are useful for setting strategic direction, possibly capitalizing on entrenched or emerging solutions. These projects have the benefit of having nearly perfect architectures and can demonstrate that EA is a proactive element of a total business strategy. Reviews performed in this scenario are known as Architecture Planning.

Once a business unit or team has a decent set of requirements, but before anyone has technical specifications, it is a good time for Architecture Alternatives to be discussed. These can be high level or fairly detailed, and have the benefit of being the most useful of all reviews.

If an architecture team is asked to review a project where the business and technical designs are proposed, but final funding approval is still pending, then the review is actually an Architecture Approval. It's still early enough that changes can be made without affecting budgets, but expectations are usually already set.

Once project funding has been approved, the options for changing the architecture of a proposed solution get narrower. It can (and should!) still be done, but after funding, reviews result in Architecture Recommendations (albeit strong ones if necessary), more so than architecture mandates. Whenever possible, try to get the reviews done before funding is approved.

Being asked to review the architecture of a solution after project execution (typically this means the coding has started) is a lot like being asked what you think of your manager's spouse. You better hope nothing bad comes out. Architects still and always have the responsibility to be honest, but the opportunities for suggesting change are more painful and expensive now. These reviews are really requests for Architecture Validation.

On a few occasions we've been asked to weigh in on a project just before the application was to be deployed. What can you say? These Pre-deployment Reviews are more about getting everybody aligned as to what is about to occur. If an architecture change is made now, the costs and timelines of the project are all but impossible to hit.

Generally, the later an architecture review takes place the worse it is for everybody. However, a Post-mortem Review is a good idea, and should be performed frequently to review lessons learned and prepare for the next Architecture Alternatives conversation.

Tuesday, March 18, 2008

Technology Surveys

We recently had a discussion about the value of surveys. You know - those postcards or web sites that ask you to rate the quality of the product or service you've just received. Typically the questions are framed so that providing any negative feedback is difficult, you always wonder if they are truly anonymous, and you figure that no one looks at them anyway.

I did stand-up training for several years, and we surveyed the students after every session - typically four hours. These numbers were crunched into spreadsheets that showed our classes were rated 4.85 on a 5-point scale. After a while, the number of respondents was so high that a negative review did nothing to the overall average. Combine this with the fact that most people don't think much about their answers (they just want to get back to their lives), and you begin to realize that the surveys don't really mean much.
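The arithmetic behind that observation is simple enough to show. With a couple hundred near-perfect ratings (the numbers here are invented), one deeply unhappy respondent barely dents the average:

```python
# 200 ratings of 4.9 on a 5-point scale, and a single unhappy 1.0
ratings = [4.9] * 200 + [1.0]
mean = sum(ratings) / len(ratings)
# mean is roughly 4.88 -- the outlier is statistically invisible,
# even though it may be the only response worth reading.
```

Which is exactly why averaging is the wrong lens for feedback whose value lives in the tail.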

Don't get me wrong, I'm a proponent of user feedback, customer satisfaction, and reader-centric writing. It's just that most surveys, and the survey processes behind them, don't yield much, especially if bad reviews are 'erased' or skipped because they fall outside the norms. And yet... these are the surveys that provide the most value and should receive the most attention. We had a case in the past month where one of the technology teams sent an automatic survey out after a service call, and the numbers came back well within the norms, which is to say very high. However, the customer included a remark that they wished the service could have started sooner.

One of our executives asked what the technology team was doing about the remark, and received a diatribe about how this is not the norm, nobody else says this, defense, defense, defense. In other words, the surveys are worthless: as long as the numbers are high, we'll use them to promote our quality; when we get something negative, we'll spend our time developing all the reasons to ignore it.

In another case, a technology team collects a 10 question survey after each event, and is currently seeing that 90% of the respondents rate the service as either Excellent or Perfect (7 or 8 on an 8-point scale). This team received one survey recently with numbers like 2, 3, and 4. Immediately an analysis was performed and the team concluded that the root cause of the poor evaluation was outside-the-process communication that took place after the event. In other words, the event being rated was fine, per se, but back-channel sniping between participants had left a sour taste in the mouth of the customer.

In this case the technology team had a focused conversation to remind the service providers to stick with the defined process and not to engage in extra-process communication and to leverage the defined process to communicate necessary information. In short, the survey data was used to improve the situation, at least for the next event.

The primary driver for collecting survey data should be to improve the process, not to provide marketing material for how great you're doing. The comments and negative reviews are more valuable than hundreds of smiley faces!

Friday, March 14, 2008

Architecture Reviews: It's all about preparation

Last year, my team performed a little under 40 formal Architecture Reviews for both application and infrastructure projects in our corporation. If there was one element that determined whether a review went well or not, it was the preparation the project team put into it upfront.

We typically counsel teams that it will only take a couple of hours to prepare for the ARB meeting, but that they should start two weeks before. We then provide a PowerPoint template file that contains all of the discussion topics to be covered, with little text inserts so that the preparer knows exactly what we're looking for. We also require a set of Blueprints, specifically formatted architecture diagrams; there is a set of four we need which closely matches the content depicted in four cells of the Zachman Model for Enterprise Architecture.

With this level of preparation, the actual meetings go very smoothly, and we have been able to review three different projects in a single two-hour session.

Conversely, when project teams either don't prepare, or bring their own agenda and diagrams, we spend an inordinate amount of time jumping between topics, trying to decipher the pictures, and usually end up missing something important. In a few cases, when the project timeline allowed, we canceled the ARB meeting because the project team failed to provide the meeting materials in the proper format two days in advance. This is unfortunate when it happens, because the ultimate beneficiary of the ARB is the project team.

No one gets more value, confidence, or satisfaction from a successful project than the team that implements it... and a sound architecture is the backbone of a successful system (according to the Software Engineering Institute). So take the time to prepare for your next ARB in the manner prescribed by the Architecture team. The meeting will go better, and your project will benefit.

Tuesday, March 4, 2008

EA and the physical world

I saw an interesting example the other day of where the world of physical construction (i.e. building a building) and the digital world of enterprise architecture diverge.  Normally, we use the physical world as an example when describing the merits of enterprise architecture.  We equate EA to city planning, and "building architecture" to applications.  This helps us bridge the gap that our business users and some of our technology partners find difficult to cross. 

Right next door to my office building, a new skyscraper is going up.  We've been watching the demolition, the site prep, and the steel girders being placed with childlike anticipation.  You'd think we were 8 years old playing with a giant Erector Set.  Well, one part of the new building will have a large ballroom, and this has caused an interesting problem for the architect.  Imagine a steel skeleton for a tall building with floor upon floor of vertical posts supporting horizontal I-beams upon which the floors and ceilings will rest.  Now imagine that you want to remove one of the vertical posts, so that you could construct a larger-than-normal room, such as an auditorium or ballroom.

Of course, you wouldn't remove the vertical posts in that one spot for all floors; you just do it for the one floor that has the ballroom.  Therefore the next floor up would have a vertical post in the location where you had removed it on the floor below.  This causes a problem.  By removing the vertical post on (let's just say) floor 4, there is no support under the vertical post on floor 5.  The architects for the building solved this problem by placing a VERY LARGE horizontal I-beam (longer, thicker, wider, and stronger) under the vertical post that has no support.  While this may be difficult to visualize through narrative description, the point is that the architect designed something bigger and stronger in a critical support place to ensure the stability of the entire structure.

Our team began comparing and contrasting similar strategies in the digital or application architecture space and found an interesting non-parallel.  Oftentimes, a component of an application must support more than its share of the load - a controller servlet, for example.  In these cases we tend not to create something bigger, longer, or heavier; rather, we do the opposite: we create the lightest-weight component we can.  In these cases we create code that quickly determines what to do, and then hands off the work as fast as possible.  At some abstract level we imagine that both the bigger, badder I-beam and the tight controller code could be described as distributing load, but superficially it would seem that the solution in the physical world is oppositional to the digital environment.
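For contrast with the I-beam, here's a tiny sketch of the lightest-weight-component idea: a controller that decides, delegates, and gets out of the way. The handlers and route names are made up for illustration:

```python
# Hypothetical handlers; each one does the real work.
def show_balance(request):
    return f"balance for {request['account']}"

def transfer_funds(request):
    return f"transferred {request['amount']}"

# Routing table mapping an action name to its handler.
ROUTES = {"balance": show_balance, "transfer": transfer_funds}

def controller(request):
    """Deliberately thin: look up the handler, hand off, return."""
    handler = ROUTES.get(request.get("action"))
    if handler is None:
        return "404"
    return handler(request)
```

All the weight lives in the handlers; the controller itself carries almost no load, which is exactly the inversion of the physical-world solution.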

Can you think of other examples like this?

Wednesday, February 27, 2008

"No architectural implications"

I was reviewing the corporation's project portfolio last week and identified a number of technology initiatives that seemed like candidates for a formal Architecture Review (for the record, this was over and above the formal project selection process we have in place). I emailed the director for one application team and indicated that we wanted some additional information about one of their projects: scope, changes, and impacts. The response from the director was (essentially), "The project will cost 1.7 million dollars and there are no architectural implications." Really? How does someone spend that much money on an application development effort, even on an existing system, and not have any architecture implications?

I was a programmer for 20 years, and I'm not sure I ever worked on an upgrade that didn't have SOME architecture implications. Maybe, I thought, this was a data conversion project, or maybe an effort to comment code, or check the code into a new source repository, or add better testing. For $1,700,000 something had to be taking place, and if there were no architectural implications, that would mean the code was staying the same (sans comments). The Director later explained that there will be no changes to the application interfaces, but that significant modifications will be made internally to the code. To make a long story short, there are architecture changes going on and we are going to be reviewing the application, beginning with a new set of before/after blueprints.

Wednesday, February 13, 2008

Security diagrams anyone?

Help me out, I need a way to illustrate the security of an application, or system, in a truly meaningful way.  Typically we see diagrams that show boxes representing workstations, PCs, desktops and servers, all connected with lines that are labeled Firewall, SSL or HTTPS, and I find these a little less than worthless.  In many cases, after a little digging, you find out that there are gaping holes somewhere in the solution - which makes the whole 'SSL' thing disinformation.  These labels tend to lull us into a false sense of, well, security.

I'm looking for a diagramming style, technique, method, structure, or tool that lets me show the various security elements of a solution, such that one could step through the picture and know where vulnerabilities exist.  I'm thinking the resultant illustration won't look like typical architecture pictures any more than a wiring schematic looks like a house.  That's OK, just so long as technology professionals can create, edit, and understand the meaning of the symbols, lines, and notations.

One last thing, and this is vital.  The solution cannot depend on the reader having to create mental models to see the security.  A line labeled SSL requires just that; you have to know what SSL means and that it begins and ends with the line.  I'm looking for something that shows the gaps, holes, and vulnerabilities.  If you have any ideas, drop me a line.

Tuesday, February 5, 2008

Reuse is not an EA metric

Enterprise Architects are always on the prowl for metrics which can be used to validate their existence.  Given that EA is more similar to Strategic Planning and/or Audit than, say, Sales, IT Operations, or even Project Management, this can be a considerable challenge.  Frequently, we hear that "reuse" is a measure of EA effectiveness.

We all know that reuse saves time and money, and some can even show that a strategy of reuse will improve quality and time to market.  Reuse is a good thing; most of us would hold this as an axiom, but it's not a measure of architecture.  Don't get me wrong: measuring reuse, assuming you could actually do it, speaks more to the effectiveness of a Shared Services Team.  If the Shared Services Team is part of the EA Group, well then, I suppose it could be a sub-measure of EA.  But on the whole, reuse doesn't measure the value of architecture.

Architecture is more about quality attributes such as reliability, performance, throughput, responsiveness, to some degree functionality, and cost (which implies time), all of which could be achieved without a scrap of reuse.  Again, again, again: reuse is a good thing and is rightfully desired.  Reusing existing components, however, doesn't mean those components are loosely coupled, properly granular, or even well designed.  More importantly, reuse doesn't speak to the effectiveness of the EA program.

Reuse doesn't produce organizational flexibility, adaptability, or resiliency.  Reuse doesn't indicate an alignment between the business and the technology segments of the organization.  So, measure reuse because it's a good thing to do for many quantifiable and anecdotal reasons, but it doesn't express the value of a good architecture or architecture program.

Monday, February 4, 2008

Who really controls architecture?

So here is a scary thought: architecture is controlled by the one who last writes the code.  In the physical world, an architect designs a solution, and as the product (tool, car, house, nuclear power plant) is being constructed, any number of audits are completed to ensure that what was designed is what is actually being built.

Financial institutions often provide funding for new office buildings sometimes costing hundreds of millions of dollars.  The developer does not receive a check for the full amount, rather he gets an amount sufficient for digging a hole.  The bank then sends smart people to determine if the hole is suitable for the proposed building, and only if so is the next round of funding/approval given. 

This is not so true in the digital world, at least not in the majority of corporate development centers.  A well-architected solution is given to a developer (a topic in and of itself!) who then proceeds to write code.  Now, if the digital world mimicked the physical world, we'd ask the developers to construct an object model which could be compared to the original architecture for validation.  Then, the developer(s) would have to construct, in the case of Java, Java Interfaces for all of the to-be-derived classes, and again, there'd be an audit. Digitally, we let the developer go until they have a functional system, albeit incomplete.  We then test the functionality to determine progress, and rarely if ever re-examine the actual architecture as coded.  Therefore, the architecture is under the control of the person who last wrote the code.
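One lightweight way to approximate that interface-first audit is to write the architect's contract as code that incomplete implementations cannot even instantiate. The post describes this with Java Interfaces; the same idea is sketched here with a Python abstract base class, and the gateway names are hypothetical:

```python
from abc import ABC, abstractmethod

class PaymentGateway(ABC):
    """The architect's contract: written and reviewed before implementation."""

    @abstractmethod
    def authorize(self, account_id: str, amount: float) -> bool:
        ...

    @abstractmethod
    def settle(self, authorization_id: str) -> None:
        ...

class WirePaymentGateway(PaymentGateway):
    """A developer's implementation, forced to honor the full contract."""

    def authorize(self, account_id, amount):
        return amount > 0  # stand-in logic for illustration

    def settle(self, authorization_id):
        pass
```

A developer who omits a method gets a `TypeError` at instantiation, a small, automatic audit point between the design and the running code.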

What techniques do you employ to ensure that developers are coding as intended by the architect?  Code reviews are seldom used consistently, and corporate timelines are so tight that adding additional delay into a project doesn't seem acceptable.  How do you validate and verify?
