I wrote a simple program for work a week ago: a webMethods Java service that deletes a directory and recursively deletes all of its files and sub-directories. I put together some unit tests, and it ran great. Then I was told that when it was invoked from another service (one that creates the directory for zip/unzip), half the time the directory would not be deleted (though the children usually were), unless the program was run in debug mode. Of course, when the program ran stand-alone, it was flawless.
Suspecting a race condition, I played with some delays (for analysis only) between the child and parent deletions, as well as between this service and the service invoking it. No luck with that. Perhaps the invoking service was taking time to release the directory resource, so I tried a 15-second delay to rule that out. No luck there either.
It didn’t make sense for such a simple program. My guess was that another process was sometimes not releasing the resource, and I got stuck down that path for a bit. After a while, I decided to create a mind map of what was going on and what I was observing, to see what it would reveal. As suggested in the Pragmatic Wetware book by Andy Hunt that I am finishing up, after a little bit of time the R-mode of my brain took over and I found a number of things I could try.
One of those things was checking whether the directory still had any children after all of them had been deleted. This seemed silly, because the files appeared to be gone and only the empty directory remained, and I almost skipped trying it. Much to my surprise, the directory was not empty. Playing with WinSCP, I discovered that unexpected .nfs files were showing up as files were deleted. Deleting the normal children mysteriously caused these files to be created, and deleting those files caused other .nfs files to suddenly spawn in the directory. Thus, the directories were no longer empty and could not be deleted.
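The check described above can be sketched in a few lines of Java. This is only an illustration, not the code from the actual service: the class and method names (`LeftoverCheck`, `remainingEntries`, `deleteAndReport`) are made up for the example. It relies on `File.list()` returning dot-files as well, so entries such as the `.nfs` files show up even if a file browser hides them. The `.nfs` names are consistent with how NFS clients treat files that are deleted while still open, though that is an assumption here and not something I confirmed.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical diagnostic helper; names are illustrative only.
class LeftoverCheck {

    // Returns the names of everything still inside dir, dot-files included.
    // Returns an empty list when dir is not a directory or cannot be read.
    static List<String> remainingEntries(File dir) {
        String[] names = dir.list(); // null if dir is not a directory or on I/O error
        if (names == null) {
            return new ArrayList<>();
        }
        Arrays.sort(names); // stable order makes the log output easier to compare
        return new ArrayList<>(Arrays.asList(names));
    }

    // Attempts the delete and, if it fails, reports what is still in the way
    // instead of silently returning false.
    static boolean deleteAndReport(File dir) {
        if (dir.delete()) {
            return true;
        }
        System.err.println("Could not delete " + dir
                + "; still contains: " + remainingEntries(dir));
        return false;
    }

    public static void main(String[] args) {
        // Usage: java LeftoverCheck /path/to/directory
        if (args.length == 1) {
            deleteAndReport(new File(args[0]));
        }
    }
}
```

Logging the leftover names at the moment the delete fails would have pointed at the unexpected files right away, without needing a second tool to look inside the directory.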
I drove home thinking about this debugging incident and how to make it better and more efficient. Here is what came to mind:
- The use of mind maps was certainly effective and something I want to continue
- It’s important to challenge your assumptions and not get locked into them early. Yes, it could be that another process did not release it, but there are other possibilities as well. Using a mind map earlier would have helped, but more helpful would have been to assess how locked into my assumptions I was
- Debugging is a technical and creative endeavor. Studying L-mode facts about the situation, and then employing R-mode techniques earlier on would have helped
- You don’t need to stay locked on the problem until it is solved. I should have moved on to other work and let the R-mode side of my brain work on the problem in the background
- Sitting down and trying to think of all the “evil” ways that the system could be messing me up was also helpful
- Continue playing and trying things that should not happen – I am really glad I did that!
In the end, I coded my program to delete the children before the parent directory because File.delete does not get rid of non-empty directories. Because I saw the children gone, the obvious possibility that the directory was still not empty eluded me for a bit. That is what I am thinking about for the future.
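For reference, here is a minimal sketch of a children-first delete that also says why a path could not be removed. It is not the code from the webMethods service: the class and method names (`RecursiveDelete`, `deleteTree`) are hypothetical, and it swaps `File.delete` for `java.nio.file.Files.delete`, which throws an exception (for example `DirectoryNotEmptyException`) where `File.delete` only returns `false`.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; names are illustrative only.
class RecursiveDelete {

    // Deletes target and everything beneath it, children first.
    // Returns one message per path that could not be removed;
    // an empty list means the whole tree is gone.
    static List<String> deleteTree(File target) {
        List<String> failures = new ArrayList<>();
        deleteInto(target, failures);
        return failures;
    }

    private static void deleteInto(File target, List<String> failures) {
        // Do not descend through symbolic links; only the link itself is removed.
        if (!Files.isSymbolicLink(target.toPath())) {
            File[] children = target.listFiles(); // null for plain files
            if (children != null) {
                for (File child : children) {
                    deleteInto(child, failures);
                }
            }
        }
        try {
            // Throws instead of returning false, so the reason is preserved.
            Files.delete(target.toPath());
        } catch (IOException e) {
            failures.add(target + ": " + e);
        }
    }
}
```

A caller could log the returned list when it is not empty. A parent that fails because new entries appeared after its children were listed would then be reported as not empty, which is exactly the case that was hard to see here.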