Why documents shouldn’t be the basis of a domain analysis
DSL design requires that you
first understand the domain for which you want to build the language, so the
analysis of how stuff works is a huge part of my work. And of course this
“understanding” part is not limited to building DSLs; you have to do this to
build any kind of software that is specific to a particular problem set.
In many cases, customers give
me documents, existing software and other artifacts they have available to help
me build an understanding: “Here, read this, and you’ll know what you have to
build!” This is generally a bad idea, and in this article I will explain why.
A disclaimer: it is
absolutely useful to look at documents and other artifacts to get a rough
overview of a domain. After all, we all read books and presentations and
stuff all the time to learn about things. But getting a rough overview of
something is quite different from building a detailed, structured and formal
understanding of how something should work. And books are usually written with
the express purpose of teaching, and aren’t “random documents” people wrote for
some reason or another.
This brings me to the first
reason why domain analysis from documents is a bad idea: Most documents (and
yes, there are exceptions) are not written with a lot of love; somebody had to
write the documentation, and they tried to do that as fast as possible so they
could get back to something that is more fun. This shows in the documents and
makes them hard and unpleasant to read.
Second, company-internal
documents are usually written for insiders. They use a lot of jargon. Now,
jargon is fine, and it is often the basis of a DSL, but if you are an external
language designer new to a domain or organization, you cannot just read a
jargon-rich document as an introduction to the domain. You have to learn the
jargon first — usually not through such documents.
A more fundamental problem is
that documents are often outdated and/or incomplete. Or there are multiple
documents that disagree. It’s the usual problem with documentation: because it
doesn’t “run”, there is not much incentive to keep it current. And so usually
it isn’t. You’ll get a not-so-useful perspective on the domain from reading it.
There is an even more
fundamental problem with documents: even if they are up-to-date and well
written, they describe the current state of the domain, warts and all. When
designing a new process/tool/language/DSL, you often want to clean things up,
you want to refactor, optimize, and get rid of “historical accidents”. So even
if you understood everything correctly from the documents, you’d just replicate
the status quo in a new tool. That’s often not what you want to do.
Lastly, you cannot interact
with a document. You cannot ask questions if you don’t understand something.
You cannot judge the relative importance of things described in the documents.
The aforementioned warts aren’t obvious. Different parts of a document won’t
suddenly start talking to each other, disagreeing about some aspect of what is
written there.
So what else should you do?
It’s probably obvious from that last sentence: talk to people. Find experts in
the domain and let them explain what they do. Build strawmen, mental models
(and ultimately, prototype tools), challenge them and potentially tear them
down again, replacing them with something better. Talk to different people and
let them disagree. If things look fishy, challenge them. Often, this happens in
the form of analysis workshops; I have written before about how to run those.
Documents can play a role in
this context: you can use them as completeness checks, and as reminders of
which things to talk about in a workshop. Going through some of them with the
domain experts is sometimes a good exercise. But making documents the primary
source — without access to people — doesn’t work.
The drawback? You gotta find
those people. They do exist in all the organizations I have ever worked with,
but there usually aren’t many people who really fully grok how a domain works
in total and in detail. And often — because they are experts — these folks are
busy. So it does make sense to organize the overall process in a way where
these people are not unnecessarily burdened. But if you want to build a DSL
that really captures the domain, you gotta get at the brains of these people. And
that requires their time. There’s no way around it.