The Terms "Model-Based" and "Model-Driven" Considered Harmful

I truly believe that abstraction is at the core of computer science and software engineering. Abstraction is essential for coping with complexity and allows us to build the tremendously complex systems we build today. Yet, it is hard to clearly define what abstraction is. Florian Deissenboeck and I attempted this almost 10 years ago in an article that was published at a workshop on the role of abstraction in software engineering. I think we found some interesting ways to think about it, but it was extremely hard to wrap our heads around the concept as such.

Similarly, models, which I would define in our context as abstractions of systems that have a specific purpose, are essential for software engineering, maybe for engineering in general. Especially in software engineering, almost everything we deal with is a model. Requirements written down are a model of a part of a system to be. Even the short sentence in a typical user story is a model. The sketch with boxes and arrows on the whiteboard indicating an aspect of a software architecture is a model of the software. The Java source code is a model of the software abstracting e.g. the details of the execution on the machine. I could go on and on. Models are essential for software engineering.

So what does it mean to call something "model-based software engineering" (MBSE) or "model-driven software engineering" (MDSE)? I would argue that without models there is no software engineering. Yet, there is a large research community working on MBSE/MDSE. To this day, I have not fully understood what that is, although I have worked on things that were called "model-based" myself.

My first research project in 2002 was a collaboration with BMW in which we tried model-based testing of their MOST network master. We used the approach my valued colleague Alex Pretschner, now professor at TU Munich, built in his PhD project. We invested a lot of time discussing with the BMW engineers to build a detailed model in AutoFOCUS. This model was then quite suitable for generating test cases to be run against the actual network master. Interestingly, many of the defects we found surfaced already during the modelling. My personal observation was that the model became so detailed that it was almost a reimplementation of the network master software. Was there really a conceptual difference between our "model" and the code?

Over the years, I have thought about this a lot. This post sums up my current thinking on MBSE/MDSE. I do not want to convey that everyone working in that field is stupid and does nonsense research. Quite to the contrary, I do think there is a lot of interesting work going on. My hypothesis, however, is the following:

Using the terms "model-based" or "model-driven" in combination with software engineering or software engineering techniques obscures what these techniques are actually about and what they are actually capable of. Progress in software engineering is hindered by this division into "model-based" and "code-based".

I sincerely hope that this might start a discussion in our community. To substantiate why this hypothesis may be true, I collected the following five observations:

Nobody knows what MBSE/MDSE really is. There is great confusion in research, and especially in practice, about what MBSE or MDSE is supposed to be. For many, it is working with diagrams instead of textual code. For example, a UML sequence diagram would be a model-based technique but text in a programming language describing the sequence of messages might not. For others, it needs to be mathematically formal. For others still, it is the name for using Simulink to program. A recent study by Andreas Vogelsang et al. showed this perfectly.

Practitioners don't adopt MBSE/MDSE. This point is a bit hard to discuss given the first one. Oftentimes, if practitioners state that they do MBSE/MDSE, they apply a certain tool such as Simulink. Or they have some UML diagrams of the system lying around (usually out of date). In the same study, Vogelsang et al. investigated drivers of and barriers to the adoption of MBSE. They found that "Forces that prevent MBSE adoption mainly relate to immature tooling, uncertainty about the return-on-investment, and fears on migrating existing data and processes. On the other hand, MBSE adoption also has strong drivers and participants have high expectations mainly with respect to managing complexity, adhering to new regulations, and reducing costs." So practitioners think the tools are immature and the whole thing might have a negative ROI. But there is the hope that it might help to manage complexity and reduce costs. This does not seem like a methodology I would like to invest in.

MBSE/MDSE is Formal Methods 2.0. I don't think formalisation and formal analysis are useless. They have their benefits and surely a place in the development of software systems. Yet, they are not the holy grail they were often sold to be. If I play the devil's advocate, it feels like after formal methods failed to be broadly adopted in industry, they are now simply renamed as model-based methods. Yet, putting some nice diagrams on top of a formal analysis most of the time won't make it easier to understand. In contrast, I love the work by people like Daniel Ratiu or Markus Völter on integrating formal verification in DSLs or common programming languages (see e.g. here). Is this "model-based"? Does it really matter whether it is?

Models are positioned as replacing code. I hear that less nowadays, but it is still out there. The story line was that the success of software engineering has always been based on reaching higher levels of abstraction. We went from machine code to assembly to C and Java. Models are supposedly then the next step in which we can abstract away all these technical details and everything will be easier. I don't believe that at all, except for very specific instances of "model". Abstraction comes at a cost. What we abstract away can come back and haunt us. For example, although Java hides null pointers from us most of the time, sometimes we suddenly see a null pointer exception pop up. It breaks the abstraction and suddenly makes the underlying details visible. As we are not used to dealing with null pointers in Java, this might be even worse than dealing with them directly in the first place. Furthermore, a huge tool chain supports us in dealing with source code, and much of it is not directly available for various kinds of models. Finally, I fail to see how it is easier to work with graphical diagrams than with plain text.
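To make the null pointer example a bit more concrete, here is a minimal sketch of one way such a leak can show up in Java. The class and method names (LeakyAbstraction, totalStock, totalStockSafe) are made up for illustration only. The first method reads like plain arithmetic on numbers, yet a missing map entry turns into a null reference that is silently unboxed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example class; not taken from any real project.
class LeakyAbstraction {

    // Reads like simple integer arithmetic; no pointer is visible anywhere.
    static int totalStock(Map<String, Integer> stock, String first, String second) {
        // Map.get returns null for a missing key. Auto-unboxing that null
        // Integer into an int throws a NullPointerException on this line.
        return stock.get(first) + stock.get(second);
    }

    // Same computation, but the hidden detail is handled explicitly.
    static int totalStockSafe(Map<String, Integer> stock, String first, String second) {
        return stock.getOrDefault(first, 0) + stock.getOrDefault(second, 0);
    }

    public static void main(String[] args) {
        Map<String, Integer> stock = new HashMap<>();
        stock.put("bolts", 40);

        // "nuts" is not in the map, so the safe variant counts it as 0.
        System.out.println(totalStockSafe(stock, "bolts", "nuts"));

        try {
            System.out.println(totalStock(stock, "bolts", "nuts"));
        } catch (NullPointerException e) {
            // The detail the language normally hides has become visible.
            System.out.println("abstraction leaked: " + e.getClass().getSimpleName());
        }
    }
}
```

Running main prints 40 for the safe variant and then reports the NullPointerException for the other one. Nothing in the source of totalStock mentions a pointer, which is exactly why the failure is surprising when it happens.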

The research community is split (at least) in two. Again, I will play the devil's advocate and exaggerate to make the point: On the one side, we have the MBSE/MDSE people claiming that models are the future and everybody still working on code just hasn't realised that. On the other side, we have the code people (e.g. in the maintenance community) who only analyse source code and ignore what these MBSE/MDSE people are doing for the most part. In the end, I believe that this leads to duplication of effort, misunderstandings in reviews of papers and project proposals, and a very confusing picture for practitioners. I don't think we project a consistent vision of how software engineering should be done in the future.

In summary, I believe that the extensive use of the terms "model-based" and "model-driven" in the software engineering community has done more harm than good. When I read that a method is model-based, I don't know whether that means it uses graphical diagrams, whether it just uses additional artefacts, or whether it is based on formalisations. I would be happy if my little rant here serves as the starting point in our community to bring the two factions together so that we can work jointly on the next generation of software engineering methods, techniques and tools.