Creative Chaos: October 2008

Thursday, October 30, 2008

When should a test run unattended? - III

First off, I've revised the title of the series. I'm all for automating work that can be described and /precisely/ evaluated.

For example, let's say you have a PowerOf function. To test it, you could write a harness that takes input from the keyboard and prints the results, or you could write something like this:

is(PowerOf(2,1),2, "Two to the first is two");
is(PowerOf(2,2),4, "Two to the second is four");
is(PowerOf(3,3),27, "Three to the third is twenty-seven");
is(PowerOf(2,-1),undef,"PowerOf doesn't handle negative exponents yet");
is(PowerOf(2,2.5),undef,"PowerOf doesn't handle fractional exponents yet");

And so on.

When you add fractional or negative exponents, you can add new tests and re-run all the old tests, in order.

That is to say, this test can now run unattended and it will be very similar to what you would do manually. Not completely - because if the PowerOf function takes 30 seconds to calculate the answer, which is unacceptable, it will still eventually "Green Bar" - but hopefully, when you run it by hand, you notice this problem. (And if you are concerned about speed, you could wrap the tests in timer-based tests.)
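Here is a rough sketch of what a timer-based wrapper could look like. It is in Python rather than the Perl-style harness above, and everything in it is an assumption for illustration: power_of is a stand-in for PowerOf, and timed_is, slow_power_of and the time budgets are made-up names and numbers, not part of any real framework.

```python
import time


def power_of(base, exponent):
    """Stand-in for PowerOf: handles non-negative integer exponents only."""
    if not isinstance(exponent, int) or exponent < 0:
        return None  # negative and fractional exponents are not handled yet
    result = 1
    for _ in range(exponent):
        result *= base
    return result


def slow_power_of(base, exponent):
    """Same answers as power_of, but simulates an unacceptably slow version."""
    time.sleep(0.05)
    return power_of(base, exponent)


def timed_is(func, args, expected, budget_seconds, name):
    """Pass only if the answer is right AND it arrived within the budget."""
    start = time.perf_counter()
    actual = func(*args)
    elapsed = time.perf_counter() - start
    passed = (actual == expected) and (elapsed <= budget_seconds)
    print(("ok" if passed else "not ok") + " - " + name)
    return passed


timed_is(power_of, (2, 2), 4, 1.0, "Two to the second is four, quickly")
timed_is(slow_power_of, (2, 2), 4, 0.01, "Right answer, but over budget")
```

The second check fails even though the answer is correct, which is the point: the slowness a human would notice at the keyboard has to be written down as an explicit expectation before an unattended run will catch it.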

Enter The GUI

As soon as we start talking about interactive screens, the number of things the human eye evaluates goes up. Wayyy up. Which brings us back to the keyword or screen capture problem - either the software will only look for problems I specify, or it will look for everything.

Let's talk about a real bug in the field

The environment: Software as a service web-based application that supports IE6, IE7, Firefox 2, Firefox 3, and Safari. To find examples, I searched bugzilla for "IE6 transparent", where we've had a few recently. (I do not mean to pick on IE; I could have searched for Safari or FF and got a similar list.) That does bring up an interesting problem: Most of the bugs below looked just fine in other browsers.

Here are some snippets from actual bug reports.

1) To reproduce, just go to IE6 and resize your browser window to take up about half your screen. Then log into dashboard, and see "(redacted element name)" appear too low and extra whitespace in some widget frames.

2) Page includes shows missing image in place of "Edit" button in IE6 and IE7

3) In IE6 only, upload light box shows up partly hidden when browser is not maximized.

4) In IE6 and IE7, comment's editor has long vertical and horizontal scroll bar.

5) In IE6 at editor UI, there is a thick blue spaces between the buttons and rest of the editor tools

6) To reproduce, in IE6, create some (redacted), then check out the left-most tab of (redacted 2). The icons for type of even are not the same background color as the widget itself. (see attachment)

All of these bugs were caught by actual testers prior to ship. I do not think it is reasonable to expect these tests to be automated unless you were doing record/playback testing. Now, if you were doing record/playback testing, you'd have to run the tests manually first, in every browser combination, and they'd fail, so you'd have to run them again and again until the entire sub-section of the application passed. Then you'd have a very brittle test that worked under one browser and one operating system.

That leaves writing the test after the fact, and, again, you'll get no help from keyword-driven frameworks like Selenium - "Whitespace is less than half an inch between elements X and Y" simply isn't built into the tool, and the effort to add it would be prohibitive. If you wanted to write automated tests after the bugs were found, you'd have to use a traditional record/playback tool and now have two sets of tests.

That brings up a third option - slideshow tests that are watched by a human being, or that record periodic screen captures that a human can compare, side-by-side, with yesterday's run. We do this every iteration at Socialtext to good effect, but those tests aren't run /unattended/. Thus I change the name of this series.

I should also add, that problems like "too much whitespace" or "a button is missing but there is a big red X you can push" are fundamentally different from a crash or timeout. So if you have a big application to test, it might be a perfectly reasonable strategy to make hundreds of thousands of keyword-driven tests that make sure the basic happy-path of the application returns correct results (of the results you can think of when you write the tests.)

To Recap

We have discussed unit and developer-facing test automation along with three different GUI-test driving strategies. We found that the GUI-driving, unattended strategies are really only good for regression - making sure what worked yesterday still works today. I've covered some pros and cons for each, and found a half-dozen real bugs from the field that we wouldn't reasonably expect these tests to cover.

This brings up a question: What percentage of bugs are in this category, and how bad are they, and how often do we have regressions, anyway?

More to come.

Monday, October 27, 2008

Programming Parables

There are certain stories that should simply be a part of every technologist's background - they explain a kind of thinking about the world. Most of them, like the story of Mel or Winston Royce's "Waterfall" Paper, are very old and pre-date the internet.

Most of them are collected in "Wicked Problems, Righteous Solutions", a wonderful primer on system effects in software development and arguably a major pre-cursor to "agile" literature.

One little piece I found absolutely wonderful in Wicked Problems is "The Parable of Two Programmers."

And, today, thanks to a guy named Mark Pearce, I found it on the intarwebs.

Here it is.

Enjoy.

Thursday, October 23, 2008

Sometimes, words aren't enough

(Sidebar: More coming in the test automation series. Really. Just not today.)

Some people learn through explanation. Some have emotional reactions and enjoy anecdotes. Some like statistics, and others go for "the boss said so" or appeal to authority. A good journalist knows this and weaves statistics, anecdotes, logic and interesting quotes from influential people to make an article.

But some people like to learn through experience. Indeed, in many activities (skiing, golf and writing come to mind), actually doing the work and active observation will get you farther, faster, than reading a book.

This raises the question - if you do any training of anyone (even the guy in the next cubicle), how do you reach folks who like to experience?

Believe it or not, there's a website, TastyCupCakes.com, that lists a series of games devoted to simulating development and understanding the dynamics of software projects.

Go check it out.

What are your favorite testing simulation games, and do you think we should start a wiki? :-)

Wednesday, October 22, 2008

On Being An "Old Dude"

Some thirty-year-old just put a post up on the Joel On Software Forum: Should I get out of tech while I'm reasonably young?

Now, our youth-obsessed North American Culture bugs me more than a little, and I took the time to reply. I'd like to share a bit of that reply here:

When Steve Yegge is on Stack Overflow, complaining that he is old because he saw the original "transformers" cartoon movie in the theater, we've got a problem as an industry. The fact is, we've got a job where you sit at a desk and your experience grows with age. People continue to be Olympic athletes, and competitive at it, well into their forties. "It's a young man's game" is something people should be saying at seventy-two, not thirty-two.

All this reminds me of the Muppets. Yes, the Muppets. My family is currently watching the first season of the Muppets via NetFlix. Each show has all the muppets plus one guest. The guest is actually /established/ in the entertainment field.

Of course, back then, you couldn't really get started until you were early 20's, so all of the entertainers are in their 30's at least, with the occasional Bob Hope who was in his 50's. Not an "over the hill" joke in the bunch, these folks were finally 'making it' when they hit the muppet show, just beginning to get to the top of the ladder at 35.

Because it took them ten years to have done anything of substance and be recognized for it.

It's taken me something like that long to be recognized in the field. (Yes, last month "Creative Chaos" made the top 100 blogs for dev managers.)

Those first ten years are the beginning of the story, not the end. And the reality is that I'm not an old dude; it's only a bizarre culture change that said so. It is also a recent culture change - the muppets certainly didn't feel that way in the 1970's.

Now compare that to the britney spears/ justin timberlake / christina aguilera / 20-is-over-the-hill culture we have today.

The problem with the 15-year-old teen idol, is, well, they haven't really done anything yet. And that's the problem with the 15-year-old coding genius. Sure, one in a million is Shawn Fanning.

The rest ... aren't.

The Mickey Mouse Club is getting old too, gosh, they must be 24 now. hmmm ... Taylor Swift?

I think you get the point.

Monday, October 20, 2008

Asperger's Syndrome

A friend of mine recently sent me this article, with a subject line something like "An Aspie comes out of the closet."

The reason? I'm one of a small community of software testers who are diagnosed (or self-diagnosed) with Asperger's Syndrome, an autistic-spectrum disorder.

The way I explain Asperger's is this: My brain chemistry is a little different than most people. Growing up, I had problems dealing with people: They lied. They said things they did not mean; what kind of shoes you wore mattered more than any of your ideas. Far from a meritocracy, it really mattered in school how far you could kick a ball, how good you looked, and how quickly you could respond to a put-down. In short, I was a nerd.

As such, I turned to computers for escape. Computers made sense. If the computer did something wrong, it was because I screwed up the programming.

I earned a degree in Mathematics, which is the most objective field I know of - answers are right or wrong. Period. Even if you don't bathe for a week and have no social skills, if you're smart, you can do well in Math. (1)

Eventually, later in life, I realized that, well, people matter more than things. Having a room full of toys and no one to share them with is no fun. To be successful in any relationship, including the workplace, you need to understand people. So I got into people, psychology, and relationships. I forced myself to learn.

I used to feel bad about this -- right up until I read the very same story in a book by Jerry Weinberg called Quality Software Management. And I do mean the very same story.

It's a true story for both of us -- and, I suspect, for my friend who sent me the link that started all this. This means I have problems reading people, understanding social cues, and responding quickly with words - in the moment. (Such as: Off the cuff jokes) Ironically, that's part of why I got into public speaking -- in public speaking, you pick your words in advance, and you can practice them over and over. Ditto for writing.

So I wasn't surprised in 2001 when I read a description of Asperger's in Wired Magazine and said "that's me." Yes, there is more to the diagnosis than that, but I'd prefer to keep that part private.

The classic description of an Aspie is a "little professor" - someone who is seriously, seriously involved in a particular subject area and (sometimes) has problems relating outside of that subject area. As a young person, I craved a structured environment that made sense; one of the reasons I loved playing cadet was that I knew who to salute and how to march and how to wear a uniform - the rules were explicit and written.

And those who knew me as a cadet also knew that I ... needed a little help socially.

In the 1940's, someone with Asperger's might collect stamps, or coins, or have a model train collection, or maybe know every single baseball stat for a particular team. Today, Aspies are more likely to write code, test software, play with CSS style sheets or design aircraft.

The end result of all this is that I'm not typical. Duh. No one is really average in every way. If I had my choice, at the beginning of my life, to be an empty suit with great social skills or someone who was able to generalize, abstract, create, and do wonderful things ... I don't think it'd be a tough choice.

For centuries, it has been ok for artists to be a little bit weird - In fact, I remember one graphic designer who used to wear flip-flops to work (that he promptly took off) combined with some sort of odd faux-prisoner outfit.

That little spark of oddity about the creative person on a bad day is the same thing we credit as the spark of genius on a good one. News Flash: Techie folks can be creative too.

What I'm trying to say here is - I may have Asperger's Syndrome, or it may be what they called "Dysgraphia" when I was in Grade School - or it may be something else.

I don't believe it's the kind of thing to be hidden, but I've never made a post on it.

And when I got that email, well, the time seemed right.

Regards,

--Matt Heusser
Footnotes:
(1) Yes, I bathed. Gosh, it's an expression!

Cloud Computing is the new XML

In April of 2000 I took a development course and got a free copy of XML Magazine. I didn't get what the actual value of the technology was. A few months later, I realized that there was nothing to get - if XML had value, the guys hyping the magazine didn't know it yet.

And if you don't get the same feeling listening to the gurus talk about cloud computing, I suspect you haven't been listening closely enough.

But first, the Good News
Just like XML, Cloud Computing does have some places where it can add value. I argue that, in the next few years, we will see some generally useful applications and niches for cloud computing. Eventually, over time, some organizations will be able to 'give up' their data centers and turn on web servers like we turn on electricity or tap water - but anyone familiar with virtual web hosting already has a deal like that. Eventually, it might be possible to place our servers and databases in the grid and turn up the number of servers when demand spikes. Still, Tim Chou wrote "The End Of Software" four years ago, and the sad fact is that that capability is still years out. What do I believe about cloud computing today?

The Bad News
- Cloud Computing will have limited applications compared to the over-hype it has now,
- It will take years to realize those applications,
- Exactly what you should be doing with cloud computing will probably be very different from what today's 'visionaries' are telling you,
- The 'visionaries' who are currently hyping cloud computing will probably shake out of the market before the true value and applicability of cloud computing is realized

It would be nice to be proven wrong. For the time being, I believe the smart money is against the hype machine.

Cloud Computing is the new XML.

Remember: You heard it here first.

Friday, October 17, 2008

Blank Sheet of Paper Syndrome

(No, I haven't forgotten about Test Automation. I'm just trying to leverage my time in the best possible way. Most of this post came out of a recent discussion on the SW-IMPROVE discussion list ...)

One thing that I found helpful when gathering requirements is something I call "blank sheet of paper syndrome." That is to say, if you give a (brand new) analyst a template, they will often react with something like this:

"oh, easy, I can do this. Project Name, Project Manager, Executive Sponsor, Initiated Date, Today's Date, Desired Date ... I can fill this out." (... 30 minutes pass ...) "Here's your requirements doc."

On the other hand, if you give them a blank sheet of paper, you are more likely to have a reaction like this:

"But ... well ... this is blank! I have no idea what to write! I don't really know what the customer desires! I had better go find out!"

In my experience, this second reaction is much more likely to result in figuring out the /essential/ requirements of a software system, instead of gathering requirements that "all look like each other."

The argument /for/ requirements templates is that without the template you would forget something. My personal conclusion is that, for the vast majority of projects I have worked on, I will gladly run the risk of forgetting something on the template, in trade for the benefit of, hopefully, capturing something more essential on a personally-written document.

Now, I would be remiss if I did not add that the Planning Game in Extreme Programming - where you have no person in the middle and negotiate the requirements - is one valid implementation of the blank sheet of paper concept. Your cards could even have a slot for title, points, priority, and description, and I wouldn't be offended. :-)

In other words, Jerry Weinberg's Rule of three, in that you should strive for at least three options for every problem, still applies. "No Template/Blank Sheet" and "All Template/No Customization Sheet" is a false dichotomy - you can always do more or less, allow some customization, have a half-dozen questions instead of a three-page template, and so on.

All that I am saying here is that in the push for stable/predictable/repeatable, some organizations try to make all projects look the same. Sometimes, for the project team, that can do more harm than good.

Thursday, October 16, 2008

So what's a Privateer Scholar, anyway?

A few months back, I changed my LinkedIn title from dev/tester (or whatever it was) to "Privateer Scholar."

The title is derived from James Bach, who took it from Buckminster Fuller's Operating Manual for Spaceship Earth.

Here's the basic idea in Bach's words:

* A buccaneer-scholar is anyone whose love of learning is not muzzled, yoked or shackled by any institution or authority; whose mind is driven to wander and find its own voice and place in the world.

* This way of being has sometimes been called autodidact, individualist, anarchist, non-conformist, contrarian, bohemian, skeptic, hacker, hippie, slacker, seeker, philosoph, or free thinker. None of those fit for me.

Now, Bach refers to himself as a buccaneer because he is an independent contractor/consultant - he's really on his own.

I, on the other hand, like to try to work within organizations. I work within a specific authority (Socialtext, or Calvin College, or maybe BZMedia) against its enemies (competition - and at Calvin, the competition is called "Ignorance"). In piratical terms, for the time being - I seek a letter of marque.

For more detail, see Bach's presentation on the subject.

Monday, October 13, 2008

When should a test be automated - II

Before we can dive in, let's take a step back.

When people talk about automation, they typically mean replacing a job done by a human by a machine - think of an assembly line, where the entire job is to turn a wrench, and a robot that can do that work.

Oh, at the unit level, where you are passing variables around, this makes a lot of sense. You don't need a human to run the test to evaluate a Fahrenheit-to-Celsius conversion function, unless you are worried about performance, and even then you can just put in some timers and maybe a loop.

But at the visual level, you've got a problem. Automated Test Execution comes in two popular flavors - record/playback and keyword driven.

Record/Playback does exactly what you tell it to (even at the mouseover level) and does a screen or window capture at the end, comparing that to your pre-defined "correct" image. First off, that means it has to work in the first place in order to define that image, so the only things you can record/playback are the ones that aren't bugs to start with - but more importantly, that means if you change the icon set, or if your image contains the date of the transaction, or if you resize the windows - or screen - you could have a test that should pass but that the computer tells you has failed.
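To make that failure mode concrete, here is a small Python sketch. It is a toy model under stated assumptions, not how any particular record/playback tool works: the "screen capture" is just a list of text rows standing in for pixels, and render_receipt and captures_match are hypothetical names invented for the illustration.

```python
def render_receipt(customer, date):
    """Hypothetical 'screen': a list of text rows standing in for pixels."""
    return [
        "Order confirmed",
        "Customer: " + customer,
        "Date: " + date,
    ]


def captures_match(baseline, current):
    """Record/playback style check: the capture must equal the baseline exactly."""
    return baseline == current


# Recorded once, on a day when the screen was known to be correct.
baseline = render_receipt("matthew heusser", "2008-10-13")

# Played back the same day: identical capture, so the test passes.
same_day = render_receipt("matthew heusser", "2008-10-13")
print(captures_match(baseline, same_day))

# Played back the next day: nothing is broken, only the date moved,
# yet the exact comparison reports a failure.
next_day = render_receipt("matthew heusser", "2008-10-14")
print(captures_match(baseline, next_day))
```

The comparison has no way to tell a harmless difference (the date) from a harmful one (a missing button); any change at all is reported the same way.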

To fix that, we created keyword-driven frameworks, where you drive the GUI by the unique ID's of the components. A keyword-driven test might look like this:

click_ok, field_last_name
type_ok, field_last_name, heusser
click_ok, field_first_name
type_ok, field_first_name, matthew
click_ok, submit_button
wait_for_element_present_ok, //body, hello matthew heusser, 30000

Keyword-driven tests only look at the elements you tell them to. So, if the text appears on the screen, but the font is the wrong size -- you don't know. If the icons are wrong, you don't know. In fact, the code only checks the exact things you tell it to.
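A toy runner shows the blind spot. This is a Python sketch under assumptions: the page is a plain dictionary rather than a real browser, type_ok borrows its name from the example above, and text_like, make_page and run_keyword_test are hypothetical names made up for the illustration.

```python
def make_page(greeting_font_px):
    """Hypothetical page model: element id -> properties."""
    return {
        "field_first_name": {"text": "", "font_px": 12},
        "greeting": {"text": "hello matthew heusser", "font_px": greeting_font_px},
    }


def run_keyword_test(page, steps):
    """Run (keyword, target, value) steps; return True if every check passes."""
    for keyword, target, value in steps:
        if keyword == "type_ok":
            page[target]["text"] = value          # simulate typing into a field
        elif keyword == "text_like":
            if value not in page[target]["text"]:  # the only property ever checked
                return False
        else:
            raise ValueError("unknown keyword: " + keyword)
    return True


steps = [
    ("type_ok", "field_first_name", "matthew"),
    ("text_like", "greeting", "hello matthew"),
]

print(run_keyword_test(make_page(12), steps))  # normal font size
print(run_keyword_test(make_page(72), steps))  # absurd font size, same result
```

Both runs pass, because nothing in the steps ever asks about font_px. The runner is doing exactly what it was told, and nothing more.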

At the end of every manual test case is a hidden assertion - 'and nothing else odd happened.'

Keyword-driven tests fail to check that assertion. Record/Playback tests try, but fail to have any discernment, to know if the change is good or bad.

But that might be just fine. Keyword-driven might be good enough for some applications. In others, we expect the image to never change. We can use automated tests as part of a balanced breakfast to eliminate some brain-dead work so we have more time for critical-thinking work.

The question is what, when, and how much.

Stay tuned.

Friday, October 10, 2008

When should a test be automated - I

I stand behind my last post on the Holy Grail, but it was often mis-interpreted as "no test automation."

Now, certainly, that's silly. At Socialtext, we use all kinds of tools to assist in our testing, some of them traditional run-capture-compare tools, others setup, bug tracking, grep, analysis, summary ... the list goes on.

That leads to the question "when should a test be automated" - of course that's a little bit silly, as having a computer look at one field, and having a human look at an entire window are two different things, but I do believe that it might be more helpful if I actually explored the area and provided some tangible advice.

To start with, let's take a look at Brian Marick's Article:

"When Should A Test Be Automated", that he wrote in 1998

I will use that as a jumping-off point. So take a good look, and tell me what you think.

More to come.

Wednesday, October 08, 2008

Complete test automation is the Holy Grail?

Recently, Phil Kirkham mentioned a comment he'd heard that he was puzzling over:

"Of course, complete test automation is the Holy Grail of software development."

The speaker was talking about interactive (user-driven) systems, and probably meant that automated test execution (do-this-compare-to-that), giving complete confidence in the push of a button, was this thing people searched for that had the magical ability to solve all your problems.

As I am a bit of an Arthurian Legend buff, this intrigued me.

But what is the Holy Grail, really? Wikipedia says it is the cup (or maybe plate) used at the last supper - the one dipped in by both Christ and Judas. Or maybe it was a cup that held Jesus's blood. In whatever case, it's magical, and can heal people. Maybe. We think. Sorta.

As a Catholic, I do believe in the possibility of relics and sacred tradition, so I looked it up on Catholic Encyclopedia. You can read the entire article here, but just check out the summary:

A word as to the attitude of the Church towards the legend. It would seem that a legend so distinctively Christian would find favour with the Church. Yet this was not the case. Excepting Helinandus, clerical writers do not mention the Grail, and the Church ignored the legend completely. After all, the legend contained the elements of which the Church could not approve. Its sources are in apocryphal, not in canonical, scripture, and the claims of sanctity made for the Grail were refuted by their very extravagance. Moreover, the legend claimed for the Church in Britain an origin well nigh as illustrious as that of the Church of Rome, and independent of Rome. It was thus calculated to encourage and to foster any separatist tendencies that might exist in Britain. As we have seen, the whole tradition concerning the Grail is of late origin and on many points at variance with historical truth.

Let me put that in testing terms:

"As we have seen, the whole tradition concerning the test automation legend is of late origin and on many points at variance with historical truth."

In other words, the Holy Grail is the stuff of fairy tales, said to have mystical powers but never actually seen by anyone. King Arthur's legend was an interesting story from my youth, a fun little adventure to pretend as a child - but when I became a man, I put away childish things.

Perhaps you could say that test automation is the Holy Grail of software development, after all.

Monday, October 06, 2008

New Announcements up

(See below the "Creative Chaos" banner and description. You know, the text that's always the same that your brain filters out? It's different this time. Really.)

Working at an innovative software product company is simply amazing, and teaching information systems at night is one of the great honors of my life. At the same time, I've been able to teach religious education on Sunday, keep the monthly column going in Software Test & Performance Magazine, and even do a little speaking and, I had hoped, get some work on a book.

And I'll be coaching soccer in the spring. The problem is: How do I find the time to do a /good/ job as a coach? Because to do all this, I've been shorting the people I claim are the most important in my life: My Family.

Something's gotta give.

I have to learn to say "No." Now, I love conference invites. I'm honored, and I work really hard to make things work. But please, don't invite me to speak on a different continent with two months' notice. That just isn't going to work. Odds are, if you lead a professional group, I'd love to speak in front of you, but please, let's talk June 2009, ok?

Of course, the first thing to go will be the blog. I'll be around, but not as much as I might hope. Expounding on how 'agile' is an /attitude/, or writing that treatise on test estimation that really hits the nail on the head -- that might have to wait.

Do you ever have to make tradeoffs between the good and the best? If you do, please tell me about it. I'd love to hear.

Warm Regards,

--matt heusser
