I first met the concept of physical design almost a decade ago, while reading the book “Large-Scale C++ Software Design”, by John Lakos. The book is now dated in places, and some of its material is out of date (e.g. package prefixes versus C++ namespaces). Nonetheless, these are details with respect to the overall methodology, which I continue to consider very important and current, even for today's model-based development environments. Moreover, it is probably the only (C++) book that seriously addresses the notion of physical design in contrast to the more popular topic of logical design. Although physical design is less popular than its counterpart, it is essential for building large-scale, critical software systems. According to John Lakos:
The physical architecture is the skeleton of the system – if it is malformed, there is no cosmetic remedy for alleviating its unpleasant symptoms. The quality of the physical design of a large system will dictate the cost of its maintenance and the potential it has for the independent reuse of its subsystems.
This claim touches on several important points. First, it emphasizes the need for up-front design when developing large and complex software architectures. In other words, don’t believe that your architecture will “emerge” simply as a side effect of continuous refactoring. Refactoring simply does not scale well enough to be a serious “large-scale” architectural design tool. Second, physical design provides some interesting metrics that can be very effective for evaluating the maintenance cost of an architecture. The same metrics can also be used to compare alternative designs. Finally, working with physical design lets the architect address many testability issues (such as testing in isolation, testing hierarchically, and ensuring high-quality software as it is being developed). I will discuss each of these aspects separately in future posts, especially in the context of modern, model-driven development environments.
Before delving into technical details, however, I briefly recap the difference between logical and physical design. The dichotomy between these two concepts is similar to that between classes and components. Logical design addresses issues such as the interaction of classes and functions, the (public) interface of a class, the signature of a method, and so on. In an object-oriented system, the class is the smallest fundamental unit of logical design. Physical design, on the other hand, takes into account physical entities (e.g. components, files, and libraries), compile-time coupling, and link-time dependencies. The component is the smallest fundamental unit of physical design. Using both physical and logical design aspects, it is possible to automatically analyze a software architecture in order to evaluate several characteristics closely tied to software quality, such as understandability, maintainability, testability, and reusability.
To do this, it would be desirable to have a single notation for talking about both logical and physical design. It is also desirable that such a notation be supported by modern CASE tools, so that it is really applicable. When he wrote the book, Lakos proposed a visual language for logical and physical architectures that was based on the Booch notation. Unfortunately, no tool ever supported this endeavour. What I propose here is an adaptation of the Lakos notation to current UML. In this way, we can perform the same type of analysis proposed by Lakos, but apply it to UML diagrams using current state-of-the-art CASE tools, whereas Lakos performed his automatic analysis (only) on C++ source code using a custom analyzer, despite his more general, language-independent method.
To set the stage, I choose a simple example consisting of three concrete classes forming a hierarchy of shapes. To describe the logical design of this small system in UML, we would probably draw a class diagram similar to the following:
Three types of shapes, Circle, Square, and Triangle, derive from the abstract class Shape. Logically, the interfaces of Circle, Square, and Triangle depend on the interface of Shape (this dependency is expressed by the generalization relation). This diagram represents a logical architecture because it describes logical entities (classes) and their relations. Suppose now that each class is implemented as a single component (for example, in C++ this component consists of both a header file and an implementation file; in Java we will have a single physical .java file, and so on). In UML we can use a component diagram to describe each shape component. A dependency between classes in the logical design is then propagated to a dependency between components in the physical design, as illustrated by the figure below:
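Alongside the diagrams, the same propagation can be sketched in C++. This is only a minimal illustration under my own assumptions: the file names in the comments, the `area()` method, and the `total_area()` helper are hypothetical and do not come from Lakos. A single file stands in for what would really be four header/implementation pairs.

```cpp
// Single-file sketch: the comments mark where component boundaries would fall.
#include <cassert>
#include <memory>
#include <vector>

// ---- component "shape" (hypothetical shape.h / shape.cpp) ----
class Shape {
public:
    virtual ~Shape() = default;
    virtual double area() const = 0;  // hypothetical operation
};

// ---- component "circle" (circle.h would #include "shape.h") ----
// The generalization Circle -> Shape forces a compile-time dependency
// of the circle component on the shape component.
class Circle : public Shape {
public:
    explicit Circle(double radius) : radius_(radius) { assert(radius >= 0.0); }
    double area() const override { return 3.141592653589793 * radius_ * radius_; }
private:
    double radius_;
};

// ---- component "square" (square.h would #include "shape.h") ----
class Square : public Shape {
public:
    explicit Square(double side) : side_(side) { assert(side >= 0.0); }
    double area() const override { return side_ * side_; }
private:
    double side_;
};

// ---- component "triangle" (triangle.h would #include "shape.h") ----
class Triangle : public Shape {
public:
    Triangle(double base, double height) : base_(base), height_(height) {
        assert(base >= 0.0 && height >= 0.0);
    }
    double area() const override { return 0.5 * base_ * height_; }
private:
    double base_;
    double height_;
};

// A client written against Shape alone depends physically only on the
// shape component, not on circle, square, or triangle.
double total_area(const std::vector<std::unique_ptr<Shape>>& shapes) {
    double sum = 0.0;
    for (const auto& shape : shapes) {
        sum += shape->area();
    }
    return sum;
}
```

Each concrete component depends on the shape component and on nothing else, so each one could be compiled and tested together with shape alone.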
I will use these types of diagrams extensively to discuss design-for-testability issues. What the diagrams have in common is the notion of dependency. After all, designing an object-oriented system is mainly a question of managing dependencies. Not surprisingly, then, we can rely on a very simple notation to reason about many aspects of the complexity of a software architecture. More specifically, concerning the physical design, I will use classes, components, and dependencies. These few elements tell us a lot about the overall architectural quality, especially when analysing the maintenance cost or the degree of testability (I will examine suitable metrics in a later post). Concerning the logical design, I again adhere to the minimal notation used by Lakos for designing a logical system, based on classes and relations. More specifically, in UML I will use generalization (for hierarchies) and three types of (stereotyped) dependencies: «Uses-In-The-Interface», «Uses-In-The-Implementation», and «Uses-In-Name-Only». These dependencies replace, to some extent, the more specific association, aggregation, and composition relations in UML. The semantics of these relations is quite straightforward:
- «Uses-In-The-Interface»: a type T is used-in-the-interface of a component C if the type is used in the public interface of any class defined in that component. In this case, we say that C depends on the type T in its interface. If T is used in the interface of C, it can also be used in any implementation of C. Therefore, the «Uses-In-The-Interface» dependency subsumes the «Uses-In-The-Implementation» one, which will not be drawn, in order to keep the diagrams simple.
- «Uses-In-The-Implementation»: a type T is used-in-the-implementation of a component C if that type is referred to by name anywhere in the definition of the component. In this case, the component C depends on the type T in its implementation. This dependency is not propagated to the interface of C. (In other words, the dependency is logically encapsulated in the implementation of C, so T is not visible or accessible from outside C.)
- «Uses-In-Name-Only»: a component C uses a type T in name only if compiling C and any of the components on which C may depend does not require having first seen the definition of T. This is the case, for example, for a C++ type T that appears in a component C only in the form T* (i.e. a pointer). In that case, the compiler does not need to see the layout of T in order to compile C (it only reserves space for a memory address, the T* pointer, wherever C refers to T).
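To make the three stereotypes concrete, here is a small C++ sketch. All the names in it (`Polygon`, `Point`, `Logger`, and the file names in the comments) are hypothetical and chosen only for illustration. The single file stands in for a header and an implementation file, and the mapping from code to stereotype is my reading of the definitions above, not an excerpt from Lakos.

```cpp
// ---- hypothetical polygon.h ----
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Point {  // would normally live in its own component, point.h
    double x;
    double y;
};

class Logger;  // forward declaration: Logger is never defined in this file

class Polygon {
public:
    // Point appears in a public signature: «Uses-In-The-Interface».
    void add_vertex(const Point& p);
    std::size_t vertex_count() const;
    std::string describe() const;

    // Logger appears only as Logger*: «Uses-In-Name-Only».
    // The compiler needs the name, not the layout, of Logger.
    void attach_logger(Logger* logger);
    bool has_logger() const;

private:
    std::vector<Point> vertices_;
    Logger* logger_ = nullptr;
};

// ---- hypothetical polygon.cpp ----
#include <sstream>  // needed by the implementation only

void Polygon::add_vertex(const Point& p) { vertices_.push_back(p); }

std::size_t Polygon::vertex_count() const { return vertices_.size(); }

std::string Polygon::describe() const {
    // std::ostringstream is named only inside this function body:
    // «Uses-In-The-Implementation». Clients of polygon.h never see it.
    std::ostringstream out;
    out << "polygon with " << vertices_.size() << " vertices";
    return out.str();
}

void Polygon::attach_logger(Logger* logger) {
    assert(logger != nullptr);
    logger_ = logger;  // storing the pointer does not require Logger's definition
}

bool Polygon::has_logger() const { return logger_ != nullptr; }
```

The sketch compiles although `Logger` is never defined, which is exactly what the name-only dependency allows. As soon as the polygon component called a member function of `Logger`, it would need the definition, and the dependency would become a «Uses-In-The-Implementation» one.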
In the second part of this post, I will provide further examples that clarify the use of these dependencies. Before concluding, I have to address a question that may arise: why do we have to “re-invent” new relations by means of stereotyped dependencies when they are similar to the association, aggregation, and composition relations already present in UML? The answer is simple: the traditional UML relations used in class diagrams do not differentiate between the interface and the implementation of a class. This is natural, because class diagrams are typically concerned with logical issues. From a logical point of view, what is and what is not used in the implementation of a component is unimportant, because it is an encapsulated detail. From a physical point of view, by contrast, such usage can imply physical dependencies on other components. It is these physical dependencies that affect maintainability, reusability, and testability in large systems. Good design requires that the developer understand the issues involved in both logical and physical design. Thus, we need more expressive dependencies.