Domain-specific languages usually evolve much quicker than
programming languages. Because they are tailored to a particular domain, they
have to evolve with the domain. And they are developed together with users, so
you will try things out and then potentially change something that didn't quite
work. This is in contrast to general-purpose programming languages: because
they are so universal in their abstractions, there are no external factors that
effectively force an evolution of the language. You still might want to evolve
it, based on new ideas or hypes or fads in programming language design, but you
can do this much more slowly.
So if you evolve a language, what do you do with existing
models (aka programs)? It is usually not an option to just break them. You have
to keep them valid, somehow. In this article I want to discuss the
"somehow" a little bit.
The simplest way of ensuring that existing models continue
to work is to make sure that your language evolves in a backward-compatible
way. In other words, any new version of the language is (syntactically and
semantically) a superset of the previously existing one. If you can do this,
then the whole problem goes away. However, it also severely limits your degrees
of freedom in how you evolve your language: you can't ever remove anything, and
you cannot change the semantics of any existing construct. In practice this is
not a great idea.
The next-best option is to deprecate things you no
longer want to support -- and at the same time add something that is better in
some respect. Deprecated language concepts can still be used for a while, but
will eventually be removed. Depending on your tool you can also prevent the
creation of new instances of the deprecated concept while keeping the existing
ones valid, for a time.
For general-purpose languages this is hard, often impossible,
because you can never reach all users of the language to tell them to move
away from the deprecated concept. Java is a good example of very slow removal
of deprecated things. For DSLs, however, this is more realistic, since most are
used in some bounded scope -- you can often reach all users. So you can
inform, ask, bribe or threaten them so that they stop using the deprecated concept.
You can even have your domain-specific IDE report back to the mothership who
still uses the old stuff after some deadline. In addition, you can make
migrating away from deprecated constructs easier by providing quick fixes or
refactorings that semi-automatically change existing programs into
equivalent versions that do not use deprecated concepts.
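To make this concrete, here is a minimal sketch in Kotlin of what such a checker plus quick fix could look like. All node types and names are invented for illustration; real workbenches (MPS included) have their own APIs for checking rules and quick fixes.

```kotlin
// Hypothetical AST node types; a real language workbench provides its own.
sealed interface Node
data class LegacyLoop(val times: Int, val body: List<Node>) : Node   // deprecated concept
data class CountedFor(val count: Int, val body: List<Node>) : Node   // its replacement
data class Print(val text: String) : Node

// Checker: report uses of the deprecated concept instead of breaking the model.
fun checkDeprecations(node: Node): List<String> = when (node) {
    is LegacyLoop -> listOf("LegacyLoop is deprecated, use CountedFor instead") +
        node.body.flatMap(::checkDeprecations)
    is CountedFor -> node.body.flatMap(::checkDeprecations)
    is Print -> emptyList()
}

// Quick fix: rewrite a deprecated node into its equivalent replacement.
fun quickFix(node: Node): Node = when (node) {
    is LegacyLoop -> CountedFor(node.times, node.body.map(::quickFix))
    is CountedFor -> node.copy(body = node.body.map(::quickFix))
    is Print -> node
}

fun main() {
    val model: Node = LegacyLoop(3, listOf(Print("hi")))
    checkDeprecations(model).forEach(::println)  // warns, but the model still works
    println(quickFix(model))                     // CountedFor(count=3, body=[Print(text=hi)])
}
```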
This approach can work in practice. But you are still forced
into stepwise backward compatibility.
Let's now move into the space where you can make genuinely
incompatible language changes. Assume you live in a closed world where
you, as the person who changes the language, have all instances within reach.
Repository-based modeling tools can use this approach. You can also use it
with git if you can make everybody check in everything by a deadline, on main,
and then jointly touch all models (not terribly realistic; we'll get back to
this).
In such a closed world, you can pull along all models in
real time, as you make the (incompatible) language change. If the change breaks
only a few models, you can change them manually. More realistically, you might
want to write some kind of script that algorithmically changes all the
instances of the concepts you change. Even more conveniently, your language
development tool can potentially observe the changes you make to the language,
automatically generate the migration scripts from them, and in this way pull
along all models fully automatically.
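Here is a sketch of what such a migration script could look like, with hypothetical node types (the details depend entirely on your language): a migration is just a rewrite applied to every model you can reach.

```kotlin
// A migration is just a node rewrite applied to every model within reach.
typealias Migration = (Node) -> Node

// Hypothetical concepts: OldRange(from, to) is replaced by NewRange(start, length).
sealed interface Node
data class OldRange(val from: Int, val to: Int) : Node
data class NewRange(val start: Int, val length: Int) : Node

val rangeMigration: Migration = { node ->
    when (node) {
        is OldRange -> NewRange(start = node.from, length = node.to - node.from)
        else -> node
    }
}

fun main() {
    // Closed world: all models are right here, so we can migrate them eagerly.
    val allModels = listOf<Node>(OldRange(2, 7), NewRange(0, 3))
    println(allModels.map(rangeMigration))
    // [NewRange(start=2, length=5), NewRange(start=0, length=3)]
}
```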
This approach works, although the assumption of a closed
world in which you have access to all models ever written is a strong one,
both organisationally and from a scalability perspective. So let's get rid of
that constraint.
This is, by the way, really the same problem as in database schema
migration. There, too, you have a closed world, because all the data is
accessible -- it's right in the database you want to migrate.
In a distributed, decentralized world -- think: git -- you
cannot assume that you have access to all instances of your language at any
time. They live in different repos, on different branches. So you have to
migrate them to the new version whenever you see them. Or better: whenever these
models see the new language.
You need a bit of infrastructure. The language has to
declare which version it is at. Each language version ships with migrations
that can pull programs from the previous version over to the current one. Just
as in the previous case, these can be manually written or automatically recorded
from the language change. Each model also has to specify its current version.
When the user opens a model in the IDE, that IDE automatically executes the
chain of migrations to bring that existing model up to date with the current
language version.
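Here is the gist of that machinery as a Kotlin sketch; all types, version numbers and migration steps are invented for illustration.

```kotlin
// Sketch of version-chained migrations over a simple document-like model.
data class Model(val version: Int, val content: Map<String, Any>)

// One migration per version step: entry n lifts a model from version n-1 to n.
val migrations: Map<Int, (Model) -> Model> = mapOf(
    2 to { m -> Model(2, m.content + ("unit" to "seconds")) },  // v1 -> v2: add a default
    3 to { m -> Model(3, m.content - "legacyFlag") }            // v2 -> v3: drop a field
)

const val CURRENT_LANGUAGE_VERSION = 3

// What the IDE does when a model is opened: run the chain from the
// model's version up to the current language version.
fun migrateToCurrent(model: Model): Model {
    var m = model
    while (m.version < CURRENT_LANGUAGE_VERSION) {
        val step = migrations[m.version + 1]
            ?: error("No migration from version ${m.version} to ${m.version + 1}")
        m = step(m)
    }
    return m
}

fun main() {
    val old = Model(version = 1, mapOf("legacyFlag" to true, "duration" to 10))
    println(migrateToCurrent(old))
    // Model(version=3, content={duration=10, unit=seconds})
}
```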
There's another requirement for this to work: the migration
script must be able to access the data in a model whose structure potentially
doesn't conform to the current language. The specific implications of this
requirement depend on the implementation technology. Usually it means that the
access to the data from the old version goes through reflection or through an
M3 API. It is also cool if the tool supports writing test cases for your
migrations, to make sure they'll work correctly when they are executed
automatically.
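As an illustration, here is a sketch of a migration that reads the old model through a generic, untyped representation -- a stand-in for the reflective or M3-style access just mentioned -- together with a tiny test for it. The concept and property names are hypothetical.

```kotlin
// The old model may not conform to the current language, so the migration
// reads it through a generic representation rather than typed classes.
data class RawNode(val concept: String, val properties: Map<String, String>)

// Hypothetical change: "Timeout" used to store milliseconds in "value";
// the new language version stores seconds in "seconds".
fun migrateTimeout(node: RawNode): RawNode =
    if (node.concept == "Timeout") {
        val ms = node.properties.getValue("value").toInt()
        RawNode("Timeout", mapOf("seconds" to (ms / 1000).toString()))
    } else node

// A test case for the migration, so it is safe to run automatically later.
fun testMigrateTimeout() {
    val old = RawNode("Timeout", mapOf("value" to "5000"))
    val migrated = migrateTimeout(old)
    check(migrated.properties["seconds"] == "5") { "migration produced $migrated" }
}

fun main() {
    testMigrateTimeout()
    println("migration test passed")
}
```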
This last approach, based on a combination of generated and
manually created migration scripts, is what MPS uses. If you are careful and
thorough with your migration scripts, the approach works well and can help
scale out your DSL to a large number of users.
There's one more caveat I have to add: migration scripts are
automated. In other words, there must be a way to algorithmically decide how an
old program should be moved to the new version. Consider the situation where
you "split" the semantics of an existing concept: you can go left or
right from the old program. In this case you cannot automate the migration; the
user has to decide. I guess this doesn't happen too often, but in this case
you're back to deprecation, a warning message and ideally a quick fix.
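Here is a sketch of such a split, with invented concepts: the migration cannot know which of the two new concepts a given usage intended, so the best it can do is flag the node for a user decision.

```kotlin
// Hypothetical split: the old "Div" concept is replaced by two concepts
// with different semantics, and only the user knows which one was meant.
sealed interface Expr
data class Div(val lhs: Int, val rhs: Int) : Expr        // old, now ambiguous
data class IntDiv(val lhs: Int, val rhs: Int) : Expr     // new: truncating division
data class FloatDiv(val lhs: Int, val rhs: Int) : Expr   // new: floating-point division

sealed interface MigrationResult
data class Migrated(val expr: Expr) : MigrationResult
data class NeedsUserDecision(val expr: Expr, val reason: String) : MigrationResult

fun migrate(expr: Expr): MigrationResult = when (expr) {
    is Div -> NeedsUserDecision(expr, "Div was split into IntDiv and FloatDiv; pick one")
    else -> Migrated(expr)
}

fun main() {
    println(migrate(Div(7, 2))) // flagged for the user, not silently rewritten
}
```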
So which one do you use? In my practical work with MPS I
proceed as follows:
In the very early stages of language development, when only
very few models exist (and all are under my control), I just break things and
fix them by recreating the parts of the models I broke. This works because MPS
doesn't "kill" the whole model if your language has an incompatible
change; it only "breaks" the concepts that changed. You can see that
in the editor, and you can just delete the broken parts and recreate them correctly.
As the language grows, usually a few select people start
creating models in order to validate the language. Just breaking their work,
even if it is just prototypical, isn't a great way of keeping them engaged. So
I use the eager-pull-along approach. It's feasible to tell them to check things
in and let me migrate everything at 10 pm, especially since lots of language
changes are additive and therefore backward compatible. Breaking changes are
comparatively rare.
And then, as the language is rolled out to more people, into
more repositories and branches, and at the latest when it gets used in production,
I use the on-demand, distributed approach with migration scripts. The reason I
don't use this approach right from the beginning is that with MPS you have to
write migration scripts manually, and this is of course effort -- it makes
language evolution slower. So I like to push it out as far as possible.
It's funny. People always confront me with supposed
disadvantages of DSLs, compared to all kinds of alternatives. One point people
make in this context is that there's supposedly this problem with language
evolution. Let me ask you: when was the last time you wrote migrations
for your library that automatically pull along client code? Exactly. It's not
possible with any of the mainstream languages and IDEs. With DSLs, if you use a
decent tool, it is. Disadvantage? I don't think so.
Thanks to Niko Stotz for a couple of really good additions based on his reading of an earlier draft of this article.