Tag Archives: Software Development

Random thoughts on “Random”

Humans have a hard time understanding the concept of ‘random’. A great example that I love to use is to ask someone to quickly pick the first ‘random’ number they think of between 1 and 100 (you can do this right now). If the number were truly random, a pick of 2 would be just as likely as a pick of 97. In reality, humans are really bad at being random number generators. This becomes even more evident if you ask someone to pick 2 or 3 numbers. Most likely they will not pick numbers close together but will instead pick a few nicely spaced numbers.

When you tell someone to pick a random number, their brain automatically tries to create a normalized set of numbers. Computers are also bad at picking random numbers, but for completely different reasons.

When it comes to software development, you may require a feature with random elements to it. The classic example of this is ‘shuffle’ in a music player such as iTunes. If the player picked a ‘random’ song every time it needed a new one, you might find yourself listening to the same song a few times in a row or to songs from the same album back to back. The typical user reaction is ‘This shuffle is not very random’. We know this to be absurd. In fact, the song selection is very random and it is the human who is unable to understand what random means. What the user actually wants is a more normalized distribution of songs rather than a truly random selection each time.

Many music players solve this problem by randomizing the order of all of the songs in the playlist once instead of picking a random song each time. This produces a playlist where each song is played exactly once. To a human this feels more ‘random’ but is actually just more normalized. There are also other tricks you can use, such as ensuring that “like” songs do not occur back to back, for example by keeping the same artist from playing twice in a row. There are lots of other ways you might be able to give a better user experience by making the “random” feature less random.
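The two tricks above can be sketched in a few lines of Python. This is only an illustration, not how any particular music player does it: the function names, the (title, artist) tuple layout and the best-effort swap strategy are all assumptions made for the example.

```python
import random


def shuffle_playlist(songs, rng=None):
    """Return a new list with every song exactly once, in random order."""
    rng = rng or random.Random()
    order = list(songs)
    rng.shuffle(order)  # the standard library shuffles the list in place
    return order


def separate_artists(order, artist_of):
    """Best-effort pass that avoids the same artist playing back to back.

    When a song has the same artist as the one before it, swap it with the
    next later song by a different artist, if there is one.
    """
    result = list(order)
    for i in range(1, len(result)):
        if artist_of(result[i]) == artist_of(result[i - 1]):
            for j in range(i + 1, len(result)):
                if artist_of(result[j]) != artist_of(result[i - 1]):
                    result[i], result[j] = result[j], result[i]
                    break
    return result


# Hypothetical playlist of (title, artist) pairs
playlist = [("a1", "A"), ("a2", "A"), ("b1", "B"), ("c1", "C")]
shuffled = shuffle_playlist(playlist)
friendly = separate_artists(shuffled, artist_of=lambda song: song[1])
```

The second pass deliberately makes the result less random. It cannot always succeed (a playlist with a single artist has no valid ordering), which is why it is written as best effort.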

Never take what a customer says initially at face value. You often need to dig at what they really want. Though the customer claims they want ‘random’, they probably do not. Software Development is all about trying to find out what the customer is really looking for when they ask for something completely different.

Write as Little Code as Possible

The Zune Bug

A few weeks ago I heard of an issue where the Microsoft Zune crashed and would not start up on December 31, 2008. The reason for this? A bug in the software that handled leap years. There are lots of articles on the original issue.

When I heard of this issue, I originally thought ‘How could that happen?’. Sure I could understand not handling things correctly when it is a leap year but causing a crash?

Well, if you, like me, were wondering how that could happen, you can now find out for yourself. This is a post someone made of the actual code that runs on the Zune. Look at line 249 and on.

If you missed it, there is a bug (obviously) where if the number of days passed in is 366, which is the case for December 31, 2008, the loop never terminates. The code loops while days is greater than 365. Inside the loop, for a leap year, it handles the case where days is greater than 366 but never handles the case where days is equal to 366. Nothing changes on that pass, so the loop condition stays true forever.
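To make the failure concrete, here is a reconstruction of the loop's logic in Python. It is a sketch of the behaviour described above, not the actual Zune source: the function names, the 1980 origin year and the iteration cap (added so the example stops instead of hanging) are assumptions.

```python
import calendar


def year_from_days_buggy(days, origin_year=1980, max_iterations=10_000):
    """Mimics the faulty loop. days is 1-based: day 1 is January 1 of origin_year."""
    year = origin_year
    iterations = 0
    while days > 365:
        iterations += 1
        if iterations > max_iterations:
            # The real device has no such guard; it simply hangs.
            raise RuntimeError("loop did not terminate")
        if calendar.isleap(year):
            if days > 366:
                days -= 366
                year += 1
            # days == 366 falls through: nothing changes, so the loop repeats
        else:
            days -= 365
            year += 1
    return year, days


def year_from_days_fixed(days, origin_year=1980):
    """Same calculation, but the last day of a leap year ends the loop."""
    year = origin_year
    while True:
        days_in_year = 366 if calendar.isleap(year) else 365
        if days <= days_in_year:
            return year, days
        days -= days_in_year
        year += 1
```

With these assumptions, day 10593 counted from January 1, 1980 is December 31, 2008. The fixed version returns (2008, 366) for it, while the buggy version spins until the cap raises an error.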

Write as Little Code as Possible

The less code you write, the fewer bugs you will have in it. Most languages today have built-in support for common tasks. One example of this is the Java Calendar object, which would allow a developer to do what the Zune code does using the platform APIs. Unfortunately, high-level languages are usually not an option when writing code for a small embedded device.
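As an illustration of leaning on the platform, the same conversion can be done with Python's standard datetime module instead of a hand-written loop. The text above mentions Java's Calendar; Python is used here only for brevity, and the 1980 origin and the function name are assumptions for the example.

```python
from datetime import date, timedelta

ORIGIN = date(1980, 1, 1)  # assumed origin date for the day counter


def date_from_day_count(days):
    """Convert a 1-based day count (day 1 is the origin date) into a calendar date."""
    return ORIGIN + timedelta(days=days - 1)


last_day_of_2008 = date_from_day_count(10593)
```

There is no leap year logic to get wrong here because the library already handles it.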

Tips for writing as little code as possible:

  • Start with a high-level language. Preferably use an agile and dynamic language if possible (Groovy, Ruby, Python, etc.). This of course depends largely on the requirements of the application.
  • Where possible, use the built-in platform APIs to do what you need.
  • Use open source software to fill in the gaps of platform APIs. Apache and Codehaus are great sources of open source software that is commercially friendly to ship.

You Don’t Have to Use it

This past week was the Electronic Entertainment Expo (E3 for short). Most of the major video game console vendors, publishers and developers get together and show the press some of the new stuff happening this year.

This year Microsoft unveiled their new and improved Dashboard. The Dashboard is the main Xbox interface that is used to navigate the downloads store, games, achievements, friends and media. Initial reactions to it were mixed. Some people felt it tries to dumb things down.

New XBox 360 Dashboard

As a consequence of this new press, Microsoft’s Nelson indicated that you don’t have to use the new interface (http://xboxfocus.com/news/618-using-new-dashboard-optionable/index.html). The entire old interface is still present and can be accessed with the press of a button.

There are two fundamental problems I have with how they are approaching this problem. The first is that this is a violation of the DRY principle. Don’t Repeat Yourself. Most of the time we talk about not repeating code sections but I like to apply this to interfaces as well. In most cases, I cannot see a reason to create two interfaces that do the exact same thing. Users can be confused when they are presented with multiple ways to do the same task. From a user’s perspective they want to be instructed how to use the feature correctly and they expect a single answer for this.

The other problem with Microsoft’s approach is that they created multiple ways to do the same thing. I have worked on projects where new interfaces or features were proposed to deal with specific problems. These features were clear improvements over the old way. Inevitably someone asks “Is it possible to keep the old way of doing things as well as the new?” or “Can’t we have a button to enable the old interface?”. These comments should cause you to rethink how good the change is. The new feature is always one of the following: better than the old feature, worse than the old feature, or neither better nor worse than the old feature. If it is better, adopt the better feature. If it is worse or not any better, revisit why the new approach was taken. Maybe a tweak of the feature could be better. The best option is rarely to keep both ways of doing things.

It is always best to provide a single way to do a single task. This creates a clear and consistent interface for the users. It also creates less confusion for users and makes it clear what is the proper way to use the product.

Customer Issues that are not Reproducible

A few weeks ago my Honda Civic was experiencing an intermittent clunking noise. I am by no means an expert in automobiles but I am fairly certain this noise was not “normal”. I proceeded to let the dealer know about this problem when I had it in for maintenance. I explained how it seems to make this noise sometimes when accelerating or decelerating.

Were they able to fix my problem? Of course not. They were not able to reproduce the problem and consequently did not fix anything. For car problems, we accept this and as a user of the car I am supposed to ignore this until it is either reproducible all the time or something major breaks. They did assure me that they did not see anything major wrong so there was no danger in driving the car.

In the world of software development, this just does not fly. If the user has a problem, they expect it to be fixed regardless of whether they can reproduce the problem or even adequately explain it. For example, a support technician brought a customer issue to me where a server product sometimes hangs for no apparent reason. It only happens on one machine and it works fine everywhere else.

The customer wants two things. An explanation as to what is causing the problem and a fix if one is possible. This is not an unreasonable request. I essentially want the same thing for my car. But how can this problem be fixed if it is not reproducible?

There are no coincidences in software. If it happens once, it can happen again. I do not care if the problem was caused by planetary alignment and cosmic rays… the planets can align again and there are always cosmic rays.

Your job as a software developer or support technician is to gather as much information as possible to narrow down the problem and to try to find its root cause. Find out when it is likely to happen. Try to come up with a reproducible set of steps that will cause it again. With enough work you should be able to figure out what caused the initial issue. Once you find the cause, you can fix it like any reproducible issue.

Software Development Philosophy and Performance

Recently I have been working a lot with some people I would call “Old School Developers”. They were brought up in the “glory days” of the pre-internet era of development. I have been tasked with teaching an old dog new tricks. In this case, I am teaching Web Services development using Apache CXF and the Spring Framework to a few C++ developers of 20+ years. These people are really smart and have a good understanding of software development. What they need is a paradigm shift.

When working with these people, I constantly hear concerns about performance. This seems to be the number 1 issue on their mind. They need to be careful that they always achieve the best performance possible no matter how much extra complexity it adds to the code. These constraints even influence the design of the code where fundamental design changes are made for the sake of performance alone.

“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.” (Knuth, Donald. Structured Programming with go to Statements, ACM Journal Computing Surveys, Vol 6, No. 4, Dec. 1974. p.268.)

Programmers are notoriously bad at knowing when to optimize code. We end up focusing on the little things when we really should be concerned with broader performance issues. Is shaving off a few milliseconds a priority when you are dealing with a remote web service call over the internet?
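One way to keep this honest is to measure before optimizing. The sketch below times a simulated remote call against a bit of local computation. The function names are made up for the example and the 50 millisecond latency is an assumed stand-in for a real network round trip.

```python
import time


def timed(fn, *args):
    """Run fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def fake_remote_call(latency_seconds=0.05):
    """Stand-in for a web service call; the sleep simulates network latency."""
    time.sleep(latency_seconds)
    return "response"


def local_work(n=1000):
    """The kind of small in-process computation people are tempted to tune."""
    return sum(i * i for i in range(n))


_, remote_seconds = timed(fake_remote_call)
_, local_seconds = timed(local_work)
print(f"remote: {remote_seconds:.4f}s, local: {local_seconds:.6f}s")
```

If the remote call dominates the total, as it does here, tuning the local code buys almost nothing.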

My philosophy on software development is to focus on the most important aspects first. As you might have guessed, performance does not make that list. The most important criteria are:

  • Readability – Write Code that is readable. You or someone else will appreciate it when you have to come back to it at a later point in time.
  • Simplicity – Keep it simple. Only build in complexity where required and only to the degree required. You can always refactor it later.
  • Say Less – Less code is more. If you can leverage something that exists, do so.

Do not take this the wrong way. I am not trying to say here that performance is not to be considered or that it is not important. I am simply saying that it is not the most important issue. Once you have good readable and simple code, you can go back and optimize it where it needs it. If you start out with performance as a goal, you will never end up with readable and simple code. No one goes back over working and fast code with the intent to make it slower and more readable.

Commercial Software and Open Source Licensing

There are many different types of open source software licenses. These range from the Apache License, which is very commercial friendly, to the GPL License, which is not. In a nutshell, the reason the GPL is not friendly towards commercial software is the requirement that anything using the GPL code must in turn be provided under the GPL license and also be open source. It is understandable that a company creating commercial software would not want their source code to be open.

In many cases the commercial software just wants to use an existing open source library within the product. For this case, the Lesser GPL License (LGPL) exists. This is a lesser form of the GPL that is supposedly commercial friendly. It does not require the source code of the commercial application to be opened nor does it restrict the licensing.

Even though the LGPL was intended for this type of use, many companies are required to stay away from any LGPL libraries. This is because of the following clause in the LGPL:

You may convey a Combined Work under terms of your choice that, taken together, effectively do not restrict modification of the portions of the Library contained in the Combined Work and reverse engineering for debugging such modifications, if you also do each of the following…

http://www.gnu.org/licenses/lgpl.html

The license clearly indicates that reverse engineering is only permitted for debugging the LGPL licensed library. The real problem is the definition of “reverse engineering for debugging”. Many companies do not want to put themselves in a situation where a customer could reverse engineer their code for “debugging” and also manage to get any trade secrets from that process. That type of legal battle would be difficult and costly to fight. If someone maliciously reverse engineers your code and you bring legal action against them, you do not want to give them any resources they could use as a defense. It is far simpler to avoid the LGPL license altogether; then you never have to worry about this reverse engineering clause.

That clause was added to protect the integrity of the LGPL library but it has the effect of turning many commercial companies off the LGPL totally.

Software Licensing (Part 2)

In Software Licensing (Part 1), I wrote about the issue concerning PC game piracy. This is not only an issue for the gaming market, but also for commercial software. Consider Microsoft Windows. Microsoft took a lot of flak over their Windows Genuine Advantage system when they first rolled it out. The system took a “guilty until proven innocent” approach where it suspected all users of being pirates until they “validated” that their copy of Windows was genuine. Microsoft took this system one step further with Windows Vista. After your Windows CD Key has been used twice (once to install the first time and once to allow for a reinstall), the key is locked and will not allow further validation against Windows Genuine Advantage. If you are in the habit of regularly reinstalling your operating system, you can still do this but it requires an extra step to call Microsoft and request that they reactivate your key.

At first, this validation mechanism seems similar to what Valve Software has done with Steam (discussed in part 1). The key difference here is that the software validation only works one way. Microsoft can verify that each CD Key is used once and only once for authorizing a copy of Windows Vista. What it does not do is authentication. The user has no way of proving to the Windows Genuine Advantage servers that they own the CD key entered. Instead, the first person who happens to come along with that key is accepted as the rightful owner with no questions asked. If for some reason you need to reinstall Windows Vista (more than once), Microsoft has to allow your CD Key to be used on additional computers. Since this is a common task for many users, this type of procedure is a common request of Microsoft Technical Support.

This type of request has become so common that you can even get a CD key reset without even giving your name. A friend of mine was telling me that it is possible to get Windows Vista Ultimate that is Windows Genuine Advantage validated without even owning a copy of the software.

  1. Download Windows Vista. This should be pretty easy to come by. I do not think Microsoft cares too much if you pirate the CD because you cannot use it without your copy being validated by Windows Genuine Advantage.
  2. Download a CD Key generator for Vista OR borrow a CD Key from a friend
  3. Call Microsoft support and indicate that you need to reinstall your computer but your CD key did not work. Microsoft will unlock this key to allow it to be used on more computers.

For all of the work Microsoft put into their new Anti-piracy system, a pirate can now easily get a “genuine” copy of their product simply by calling their tech support. I doubt this was what Microsoft had in mind when they implemented this new security scheme.

Licensing in Software Development

Make licensing easy. Do not treat your customers like criminals. Instead make the licensing process simple. For a long time PC games have required the physical CD to be in the drive in order to play the game even if the entire game contents are on the hard drive. This sort of thing is more of an annoyance to paying customers than a deterrent against piracy. A pirate will crack the software so that no CD is required. Requiring the CD restricts the paying user more and actually makes the pirated version the better experience. I had a friend who usually purchased games legally and used cracks to allow him to play the games without the CD in the drive.

Long product keys are not the most friendly form of licensing. I work with business software and prefer using license files as opposed to a simple key. I also believe that these license files should be in plain text with a hash signature. The benefit to this is that you can store lots of information about the customer inside the file. The hash protects the file from tampering and the file can be read by a user.

For example, consider the following license file format:

<?xml version="1.0" encoding="UTF-8"?>
<license>
   <product>Product XYZ</product>
   <version>1.0</version>
   <customer-name>Jane Smith</customer-name>
   <key>647608973E40E3D2A31A886DC1AE3092</key>
</license>

A simple utility can be created to generate this license XML file and its “key”. The key can simply be a hash of the content of the license file with a little salt thrown in. The salt can be a secret, predetermined random string that is added to the content before hashing. Unless the secret value of the salt is known, the hash cannot be recreated with new values for the content. This protects the license from tampering.

To use this license file, the key can be checked to verify that the license has not been tampered with. After that the XML can be read with a standard XML parsing library to extract the license data. The software can store whatever information required here with no restrictions on length or type of content.
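A minimal sketch of such a utility in Python follows. Several things here are assumptions rather than part of the format above: the secret value, the function names, and the use of HMAC-SHA256 from the standard library in place of a hand-rolled “salt plus hash” (HMAC is the standard way to build a keyed hash, so it is swapped in here deliberately).

```python
import hashlib
import hmac
import xml.etree.ElementTree as ET

# Hypothetical secret; in a real product this would be long, random and private.
SECRET_SALT = b"replace-with-a-long-random-secret"

# Fields covered by the key, in a fixed order so hashing is repeatable.
FIELDS = ("product", "version", "customer-name")


def compute_key(fields):
    """Keyed hash over the license fields."""
    content = "\n".join(f"{name}={fields[name]}" for name in FIELDS)
    digest = hmac.new(SECRET_SALT, content.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest().upper()


def create_license(fields):
    """Build the license XML, including the computed key."""
    root = ET.Element("license")
    for name in FIELDS:
        ET.SubElement(root, name).text = fields[name]
    ET.SubElement(root, "key").text = compute_key(fields)
    return ET.tostring(root, encoding="unicode")


def read_license(xml_text):
    """Parse the license and reject it if the key does not match the content."""
    root = ET.fromstring(xml_text)
    fields = {name: root.findtext(name, default="") for name in FIELDS}
    key = root.findtext("key", default="")
    if not hmac.compare_digest(key, compute_key(fields)):
        raise ValueError("license file has been tampered with")
    return fields


license_xml = create_license(
    {"product": "Product XYZ", "version": "1.0", "customer-name": "Jane Smith"}
)
print(read_license(license_xml))
```

Note that the software has to carry the secret in order to check the key, so a determined attacker can extract it. Like the rest of this scheme, it is a deterrent rather than a guarantee.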

Piracy Protection

Licenses do not guarantee that the software will not be pirated. They provide a deterrent so that it is not as easy to pirate the software. So what should a software developer do to protect their software from piracy?

The simplest and probably best solution is to provide a service that accompanies your software. As discussed in part 1, Valve Software only allows users access to their online multi-player if they have an authorized account. If you are in a situation where you can provide services along with your software, it may provide an incentive for a would-be pirate to purchase your software.

When it comes right down to it, if your users want to pirate your software, they will find a way. You can take whatever measures you want to make that harder for them, but they will inevitably find a way around them. Look at things like DVD encryption. Broken. HD-DVD and Blu-ray were said to be impossible to break within the lifespan of the media. Also broken. Providing security mechanisms is a good deterrent to casual pirates but even the best security can eventually be countered. The key is to not make the security so strict that it creates a hassle for paying customers.

Developer Testing

Testing is a hard problem because there is no way to guarantee that a certain product or piece of code is 100% bug free. Many organizations have testing or “quality assurance” departments who are responsible for doing the majority of product testing before software goes to the customer. Even with a dedicated testing department, developers still have a role to play in testing their code. This article describes the developer testing philosophies used on the project I work on. The project is a server application that mainly manipulates and creates documents.

After the developer testing is complete, the build should be in a good state to enter testing by the quality assurance group. The beauty of this is that it all can happen automatically overnight.

Nightly Builds
A full build of the project is automatically run each night after the developers go home. This is not a testing method by itself but it provides a process for further testing. Developers know that they cannot check code into version control that will not build that evening. After a new feature has been committed, all members of the team have access to it the next morning and can ensure it works properly. If required, this build can be given to others outside the development team. This is a fully working version of the product that may even go to the customer.

A report of the build is emailed to the team indicating if the build passed or failed. Each build has a unique build number that is the same as the Subversion revision number. This way the build number is unique and the developers can get that exact build code from the source code repository if needed.

Nightly Unit Testing
After the nightly build is done, all of the unit tests are run for the whole project. We try to restrict the unit tests to testing true “units”, that is to say, a single class or a very small number of classes. All of the unit tests are either Java JUnit tests or Groovy unit tests. A report of the unit tests is generated using the junitreport Ant task. This indicates which tests passed and failed, with information on any errors.

Nightly Integration Testing
After the unit tests are complete, a set of integration tests is run. These integration tests test the entire product. The integration test suite installs the product, runs the product and then performs a series of tests to ensure that the basic end-to-end functionality of the product works. On this product a bunch of test input files are processed through this running server. The outputs of these are validated to ensure everything works properly. A lightweight test harness was created in Java and Groovy to run the integration tests and perform this validation. This framework was created from scratch rather than basing it on JUnit as these tests are specific to the application domain. A report for these tests is generated.

Performance Testing
Performance is an important part of this application. The same test harness that runs the integration tests can be used to run the performance tests. Instead of validating if the software works, the speed in which the software performs its tasks is measured. This can be compared to previous versions of the software. This is not run with every nightly build because it takes more than 24 hours to run. It is instead run on occasion over the weekend.
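The idea of one harness serving both the integration and the performance runs can be sketched as follows. The real harness on this project was written in Java and Groovy; this Python version is only an illustration, and every name in it (CaseResult, run_cases, format_report, the process callable) is hypothetical.

```python
import time
from dataclasses import dataclass


@dataclass
class CaseResult:
    name: str
    passed: bool
    seconds: float
    detail: str = ""


def run_cases(process, cases):
    """Run each (input, expected output) pair through process and time it.

    process stands in for "send this input to the running server";
    cases maps a test name to an (input_text, expected_output) tuple.
    """
    results = []
    for name, (input_text, expected) in cases.items():
        start = time.perf_counter()
        try:
            actual = process(input_text)
            passed = actual == expected
            detail = "" if passed else f"expected {expected!r}, got {actual!r}"
        except Exception as exc:  # a crash is reported as a failed case
            passed, detail = False, f"raised {exc!r}"
        results.append(CaseResult(name, passed, time.perf_counter() - start, detail))
    return results


def format_report(results):
    """Plain-text report: one line per case plus a summary line."""
    lines = [
        f"{'PASS' if r.passed else 'FAIL'} {r.name} ({r.seconds:.3f}s) {r.detail}".rstrip()
        for r in results
    ]
    failed = sum(not r.passed for r in results)
    lines.append(f"{len(results) - failed} passed, {failed} failed")
    return "\n".join(lines)


# Toy usage: str.upper plays the role of the server under test.
report = format_report(run_cases(str.upper, {"simple": ("abc", "ABC")}))
print(report)
```

For a performance run, the same results can be compared on the seconds field against numbers saved from a previous version instead of on passed.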

Just like Taking out the Garbage (With Version Control)

I had a Mathematics teacher in High School who used to get very excited over factoring problems where you could simplify expressions by canceling out terms. He used to say it was just like “taking out the garbage”. Taking out the garbage was not usually a fun task but it is surprisingly satisfying to get rid of stuff that is not necessary and is just clutter. I run into this same sort of thing when developing software. I just love to “take out the garbage”.

Version Control software is essential to any project, no matter how small (That in itself is a topic for another day). Version Control software gives you the ability to take out the garbage all of the time without having to worry about losing something that is important. You can always go back to old versions of a file if you need them at a later point in time.

Delete Unused Classes

Often when you are refactoring a component or adding something, you end up with a class that was used before but is no longer relevant. Delete it. Remember that you can always get it back later through your Version Control system if you need it again. It does not matter how ‘useful’ this code was, how ‘nice’ it looks, or how ‘cool’ it is. It may have been a useful utility that you might need again. You have to resist the urge to keep it around. Unless you know for a fact that you will use it again, you should get rid of it. It is not gone forever and you can get it back if you need it. This will help reduce the complexity of your code and consequently improve its readability.

Delete Unused Code, Don’t Comment It Out

I see a lot of developers who take a piece of code that is no longer used and comment out the entire section. Resist the urge to keep it inside the code. You can always get the code back through version control, so why do you want to keep a long comment block somewhere where it needs to be maintained? Another danger to this is that if you do want to use this code later on, you will likely end up removing comments around code that no longer compiles. Things change and this once working code may no longer work after you uncomment it.

The following code is a good example of this. The line that was commented out called a function with 2 parameters. Notice that the current version of the function takes only one. Code blocks that are commented out are not compiled, so they are not kept up to date with the rest of the code.

// We don’t need to do this anymore
// variable = someFunction(a, b)

someFunction(a) {

}

It is simply better to just delete the code block. If you need it at some point in time later, the version control software can resurrect your lost code.

Bug #12324 Add a heading here
Bug reference numbers are not necessary

I have seen lots of code with comments around a changed block indicating the change was for a particular bug. Before long, the code has a mess of comments all over the place indicating bugs that were fixed and where. This is another case where version control software can eliminate comments that unnecessarily clutter your code. This one does require a bit more discipline though.

Always indicate the bug number being fixed and a small description when checking code into Version Control. For example, you could add a comment like “Bug 12324: Added heading to section”. Then when you look through the Version Control log, you can easily see where changes were made and what they were made for.

If you are looking at a particular line of code and you need to know where it came from and why, you can use the “blame” feature. That will give you the person who last changed the line of code you are looking at along with the comment (which should include the bug number and fix description). For more information on this feature of version control systems, check out the “Who wrote this Crap?” (http://www.codinghorror.com/blog/archives/000992.html) article on Coding Horror.

Taking Out the Garbage

So next time you are confronted with “stuff” that is no longer used, delete it and let your version control do the work of remembering it. This will keep your code much cleaner and easier to read. It is just like taking out the garbage.

From Waterfall to Agile

I read a great article today on the future of software development. It gives a great summary of what is wrong with the waterfall method of development and how agile methodologies try to solve this problem.

The problem was that the Waterfall Model was arrogant. The arrogance came from the fact that we believed that we could always engineer the perfect system on the first try. The second problem with it was that in nature, dynamic systems are not engineered, they evolve. It is the evolutionary idea that lead to the development of agile methods. (article)

Management is often slow to learn of new trends in software development. The fact that agile has been around since the nineties and many institutions still have not heard of it proves this point. I believe the best way to bring agile methodologies into an existing organization that uses the waterfall or another approach is to bring them in slowly. This is what we have done in my organization and it has been very successful. Here are some good starting points for change.

  1. Change how requirements are viewed. If you are handed a set of requirements for an entire product release, plan for them to change no matter how “final” you are told they are.
  2. Split the development cycle into iterations. At the end of each iteration reprioritize your tasks for the next iteration. 3-4 weeks is a good length for an iteration. Create a full release of your product at this point whether someone will actually use it or not. Hopefully you will have a QA team to test it at this point but it is important to create a fully working version of the product anyway.
  3. Leave the design of system components to the iteration they will be developed in.

Pretty soon you will find that your Waterfall approach is a lot more agile than it used to be. I encourage you to keep changing your approach as you find what works best for your organization.