Monday, August 18, 2014

An abstract means for dealing with service and microservice componentization

Some months ago I read this article about microservices and I found it very interesting because of the experience we have had so far in the development of Jolie language. Jolie was born as a language for crystalizing SOA principles in a specific domain language but, day by day, we were becoming more and more confident that the boundaries of SOA were not enough for dealing with some issues that we had encountered during programming with Jolie. Thus, I was very surprised and excited when I read about microservices because I think they enlarge the domain  of service oriented programming in a way which is coherent with the results we have had so far. This is why I would like to share our experience with the microservices community. On the one hand my aim is to promote the usage of Jolie as one of the technologies which could help in the design and the development of microservices, and on the other hand I hope to give my contribution in the definition of concepts and practices for microservices. In particular, in this post, I would like to introduce a simple concept called Service Threshold which could be of help when reasoning about service and microservice componentization. The main idea behind the service threshold is to provide an abstract boundary which allows for the separation among a world of services and a world of components, where a component is a general piece of software.

The Service Threshold
The Service Threshold is the architectural line which separates services from other components (that are not services) as shown In Fig 1. where I simply represent services as hexagons and other components as squares.

Fig 1. The Service Threshold abstractly separates services (hexagons) from 
components that are not services (squares).

It is a very simple concept but some questions need to be addressed:

Why do I need to introduce it?
A line always represents a division, in this case a logical division between two kinds of things that we consider different. I introduce it because it helps me to change my perspective about distributed software architectures. Indeed, I need to think of them in a way which allows me to design, deploy and manage them quickly and easily. Summarizing, I need that line because I am looking for a simplification of distributed and heterogeneous systems engineering. 

Are there different properties above and under the service threshold?
Clearly, there are differences between the top plane and the bottom one. If I operate "above" the line I would like to be in a world where a service is the basic element which can be composed and managed by following some specific rules related to services, no other kind of software components exist. Otherwise, if I am "under" the line, the service is just something I have to create, it is the target of my work.

Which are the differences between a service and component?
A service is an autonomous running piece of software able to provide its functions when invoked. Its execution does not depend on components out of those strictly required.  The service functions should be always available for invocations, if not the service should reply with a specific fault message. Invocations are always achieved by means of message exchanges which could be synchronous or asynchronous. It does not matter the transport protocol used for the message passing. A service can be stateless or stateful.
A component is a piece of software which cannot be considered as a service.

Which is the right place to put the threshold line?
This is the core question every distributed software architect should deal with. Depending on the answer, I would design a different system with different properties. If I put this line very high I will have few services and a lot of components, if I put this line at a low level I will have a lot of services and few components. So, the question could be changed in: how many services do I need? Again, which are the basic properties that I need to take into account in order to identify all the required services?

The service threshold is not a line but a boundary
A distributed system is a system where software artifacts are separated, thus even though I have discussed the service threshold as a horizontal line, it fits better to think about it as a boundary which allows me to isolate and identify the different services involved into a system.


Fig 2. The Service Threshold is a boundary which allows me to isolate and identify the services.

Once defined, the service  threshold allows me to focus on the single service without being worried about its connections too much. I can just limit my effort to the development of the service as an autonomous specific provider of functions. The final target is to provide a basic set of services which represent the core functions of the application I am interested in. Some of them could be bought from a third party supplier, the others could be developed from scratch. The most important think is that each service function must be well separated from the others and, at the same time, it must be strictly necessary to the overall system. As an example, let me consider the human body where each organ supplies a well defined function which is fundamental for life. They are precisely separated by means of different tissues and they are characterized by different biological properties. 

All of them are needed, because the functions that they supply are needed, nothing is too much, nothing is too less. They are connected in a perfect and inexpensive way, because they all participate to form the human body, a more complex and completely different organism. 

Service connections and dependencies
Once all the services have been identified, I can deal with service communication and connections. A communication is performed when a message is sent by a service and received by another one, where a message is a limited set of data transmitted in a limited interval of time. A connection represents the possibility to perform a communication between two services: a sender and a receiver. Usually a service receiver is not aware of the sender (this is not true in the case of a stateful long running transaction). A connection becomes a dependency when the sender needs to invoke another service in order to accomplish its own tasks.
Fig 3. Service A has a dependency on service B.
Service dependencies allow me to divide services in two main categories: 
  • Autonomous services
  • Depending services
Autonomous services usually control resources such as databases, electronic devices, computational resources, legacy systems, etc. These kind of services are the software artifacts which enable the integration of  the underlying resource with a system of services and the APIs that they provide represent the only way for interacting with it. They are usually designed to be independent from the context they will be inserted in and they are strongly loosely coupled.

Depending services are built upon other services because they require to call other services to accomplish their tasks. They could be:
  • Mixed services
  • Pure coordination services
Mixed services control resources as the autonomous ones (e.g. a database) but they also need other information from other services for providing all their functions.
Pure coordination services do not control any data resource but they just call other services for achieving their tasks (orchestrators are usually pure coordination services).
 

Fig.4 Autonomous services, Mixed services and Pure coordination services
By componing these kind of services it is possible to obtain a complex distributed system which provides a new set of functions different from those of the constituent ones. I call service complex system a distributed system whose components are all connected services.

Fig. 5  A service complex system is a distributed system whose components are all connected services.


The service threshold can be applied over a set of services
A service complex system could be formed by tens or hundreds of services. Clearly, when the number of services is very high it could be difficult to manage and maintain the system because of its complexity. In this case, in order to simplify the management, I exploit another interesting property of the service threshold: I apply it over a set of existing services by grouping them. The services grouped by a service threshold can be seen and treated as a single service. In this case all the inner connections are hidden for an external observer and the only ones that are relevant for us are those that can be observed outside the service threshold boundary. 
Fig 5. The service threshold can be applied over a set of services in order to group them
and identify them as a single service.
There are no limits to the application of the service threshold over a set of services, thus I could apply it again and again in order to reduce the system to a limited set of services. Since a service is usually stateless and always ready to serve different clients concurrently, it is possible that the same service could be enclosed by two (or more) different service thresholds as in the following picture:

Fig 6. A service could be enclosed by two different service thresholds. 

In this case the resulting services both contain the same service inside. Is this possible? Yes, in the following section I will answer to this question.

Primary thresholds and abstract thresholds
As I said before a service is always a running piece of code. Thus, a set of files in our file system is not a service, it is just a set of file which could contain executable code but they are not a service. They start to become a service when they run. This is a fundamental point because a service exists only if it is able to respond to our requests. This is why it is possible to double a service by wrapping it into two different service thresholds as represented in Fig 6. It is possible because I am grouping the service functions, not the piece of code. I call abstract service threshold all the service thresholds which group existing services. When I apply the abstract service threshold to a set of services I loose the deployment details of a service (where it is, how it is composed, which technology it uses, etc) because I am focusing only on its functions. On the contrary, the primary service threshold is applied to a set of components that, when executed, they become a service. By definition. the primary service threshold has a completely different nature with respect to the abstract one because it will be applied to components for tting services. Nevertheless, I both call them service threshold because they both allow me to identify a new service even if they are applied to different domains.

Fig 7. The Abstract Service Threshold (on the left) groups existing services, the primary service threshold groups a set of components that, when executed, they become a service.

The primary threshold is a technological threshold
The Primary Threshold is a technological threshold because it clearly defines the boundary between service composition mechanisms and other technologies mechanisms. It does not matter which kind of technology I select, but it is important that I will be able to satisfy some minimum requirements that allow me to create a service (I suppose you would ask me which are the minimum requirements for creating a service, wait wait wait I will discuss them in another post). Within the primary threshold I could exploit any kind of technology and any kind of communication approach (such as in memory data exchanges, file exchange, etc.).  I can also use specific applications such as web servers, application servers, etc. The final target is to prepare the basic software layer which allow me to jump into the service world. Everything happens inside this threshold is a matter strictly related to the technologies I chose for building the service. This point is important because allow me to state that there exists a trade off between the benefits obtained by the introduction of a service and the technological overhead I have to pay. Is it simple to maintain a service developed with a specific technology? Which skills I require for achieving the development? Does the technology easily scale with the service load? Moreover, do I need to adopt the same technology for all the services of our system or do I need to change it? When? Will I be able to maintain the required skills for managing all the existing services I have? These are not trivial issues because more is the number of technologies I adopt, more is the human resource knowledge of my team I have to manage. Indeed, if I adopt a technology I need someone who is able to manage it. 

Conclusions
In this post I discussed service and microservices componentization by introducing the concept of service threshold which is a useful tool for approaching distributed systems based on services and microservices. I hope this post could be useful for those people that are trying to approaching service system design in a simple and intuitive way. Service thresholds could be a useful means for structuring the design and defining the development phases of a service system.