Archive for April, 2011
So RDoc is an absolutely terrible documentation generator from both the usage and the coding perspective. Although yard is not yet quite as stable as I’d like, it is everything RDoc is not: it’s pretty, well written, and provides numerous well-thought-out extension points. So as an exercise I decided to fix one of the more annoying issues with yard by writing an extension.
So recently, dhh tweeted that he doesn’t really get why cucumber and rspec are so popular, and questioned their benefit outside of a niche audience given the high cost of writing with them compared to xUnit-style tests. I’ve used ruby test, rspec, and cucumber extensively and I’m finding it hard to see that point of view.
I think the disconnect may be related to your perspective on what a unit test is for. While I think most developers would agree that unit tests are for validating and developing code, if that is all you see them for, it is essentially like saying that good code is code that performs a task correctly. In both cases there are more factors to consider.
First, tests have to be maintained just like any other code. Unmaintainable tests slow down refactoring rather than supporting it, and as such will generally be discarded. The most useless tests are the ones that don’t get run. So at the very least you need to apply the same concepts of readability, maintainability, etc. to your tests that you (hopefully) bring to the rest of your code.
A second, and to my mind equally important, use of tests is to document what your system is supposed to do. Large projects with many maintainers and users may not have as much of an issue here. If your framework is covered by thousands of blog articles and books, and examples are widely available, then feel free to disregard this advice. For the rest of us, getting as much self-documentation out of a code base as possible should be a major goal.
This is really where both cucumber and rspec outshine xUnit-style test frameworks. A cucumber test is a human-readable description of a feature of your system that a new coder can read without having to understand the underlying code. Even better, because each statement in the description links to code, it provides a direct link between the feature documentation and the code required to configure, execute, and validate that feature. Finally, error messages have the same level of descriptive context, making errors that much easier to understand.
The one point I will concede is that cucumber is not one size fits all. If you are validating an API (as opposed to a product interface like a web page or GUI), cucumber is less beneficial because it obscures the API itself and replaces it with human language. In these cases rspec is clearly preferred.
In cases where you are testing the system, however, cucumber has some clear benefits over rspec:
- It forces you to think in terms of how the product will be used rather than how it is implemented.
- It forces you to describe that behavior concisely.
- It documents the behavior of the system clearly in a way even a non-programmer can understand.
- By doing all of this, it makes maintaining tests much easier even as feature implementations change, because the implementation is clearly abstracted from the test script.
I’m surprised how often people in our industry inform me that picking up a new programming language should take a programmer a day or two, or maybe a week. I disagree, and I think that the assertion really betrays a very limited perspective on what it is to learn a new programming language. Take, for instance, the transition from C++ to Java.
In many ways this transition is easy because the two languages have a lot in common: both are procedural, object oriented, and statically typed. I would generally agree that a C++ programmer can write a class in Java in a day and start reading Java code fairly quickly. Unfortunately, learning the syntax is not learning a programming language, and in many cases similarities are more of a problem than a benefit.
There are any number of features in Java and C++ that are beguilingly similar, yet have important and even dangerous differences: exceptions and exception specifications, interfaces/multiple inheritance, templates/generics, type conversions/boxing, automatic garbage collection/smart pointers, packages/namespaces, etc. These similarities lead, in the best case, to terrible design and, more generally, to subtle coding bugs that then lie in wait for the unsuspecting code maintainer.
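As a small illustration of the templates/generics and boxing items in that list, here is a minimal Java sketch. The class and method names are made up for this example. It shows two behaviors that tend to surprise someone arriving from C++: generic type arguments are erased at runtime, so a list of strings and a list of integers share one runtime class (unlike two C++ template instantiations, which are distinct types), and `remove` on a `List<Integer>` picks the index overload rather than boxing an `int` argument.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class GenericsPitfalls {
    // Type arguments are erased, so both lists report the same runtime class.
    static boolean sameRuntimeClass() {
        List<String> names = new ArrayList<String>();
        List<Integer> counts = new ArrayList<Integer>();
        return names.getClass() == counts.getClass();
    }

    // remove(int) is chosen for an int argument: it removes by index.
    static List<Integer> removeAtIndexOne() {
        List<Integer> values = new ArrayList<Integer>(Arrays.asList(10, 20, 30));
        values.remove(1); // drops the element at index 1, which is 20
        return values;
    }

    // Passing an Integer selects remove(Object): it removes by value.
    static List<Integer> removeValueTen() {
        List<Integer> values = new ArrayList<Integer>(Arrays.asList(10, 20, 30));
        values.remove(Integer.valueOf(10)); // drops the element equal to 10
        return values;
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        System.out.println(removeAtIndexOne());
        System.out.println(removeValueTen());
    }
}
```

Neither behavior is a bug in Java, but both read like something else to a programmer whose instincts were formed on C++ templates and implicit conversions.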
Beyond the basics, these differences in languages often lead to completely different idioms, which a programmer may not have to completely understand to read or write some code, but which are crucial to writing code that other coders will understand and maintain. In C++, templates and operator overloading lead to the STL, Boost, and any number of DSL-style libraries which provide usability (to a point) and performance.
In Java, DSLs are, as much as any language designer can make them, forbidden. Performance is driven by just-in-time compilation rather than dynamic vs. static trade-offs, and collections libraries are driven by interfaces and anonymous classes with a syntactic sprinkling of generics rather than meta-programming concepts.
As a result, writing a Java-style library in C++ or the reverse tends to create code that may work, but will provide a continuous maintenance challenge to any future coder in terms of readability, refactoring, and integration with other standard language libraries and frameworks.
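To make the "interfaces and anonymous classes with a sprinkling of generics" idiom concrete, here is a minimal sketch using only the standard collections library. The class name `LengthSort` and the method `sortByLength` are invented for this example. The ordering is supplied as an anonymous implementation of the `Comparator` interface, where a C++ programmer might reach for a function object or an overloaded `operator<`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class LengthSort {
    // Returns a sorted copy of the input, shortest string first.
    static List<String> sortByLength(List<String> words) {
        List<String> sorted = new ArrayList<String>(words);
        // The ordering is an anonymous class implementing Comparator<String>.
        Collections.sort(sorted, new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                return Integer.compare(a.length(), b.length());
            }
        });
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(sortByLength(Arrays.asList("boost", "STL", "java")));
    }
}
```

The call site is wordier than an equivalent C++ template-based sort, but the behavior is expressed entirely through an interface, which is the trade-off described above.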
But these problems are really the tip of the iceberg, because the real challenge to effectively using a modern programming language is not in writing large libraries to parse custom languages, store data, sort collections, etc. The challenge is in understanding which technologies in the broader language ecosystem best accomplish these tasks for you. The easiest code to maintain is that which someone else is responsible for.
Oracle’s Java alone comes with a massive library of objects for logging, GUIs, serialization, database integration, testing, reporting, and so on. This discounts the vast array of solutions from other providers, all of which have different benefits, integrate with different tool sets, and have different licensing requirements and costs, stability considerations, etc.
C++ comes with the STL (which a surprising number of C++ coders seem to consider an advanced topic rather than a fundamental) but then splinters into many different incompatible ecosystems depending on platform or licensing preference.
In order to write, maintain, or especially design software in a language, it is crucial to understand and leverage that ecosystem, which brings me back to my original point: no coder even scratches the surface of one of these languages in a week. Many coders do not really learn a language even over the course of years, and this may be the cause of the myth. If you don’t understand what it takes to program effectively in the language you are coding in now, learning how to do so in another may not seem as challenging.