Unravel asynchronous inter-service communication in microservices

See why microservices need to communicate with each other asynchronously. Then, see how to surmount the challenges of inter-service communication in microservices architecture.

Asynchronous communication is the preferred method of messaging in distributed architectures. In an asynchronous communication pattern, a service sends a message but doesn't need to wait for the response. Instead, it places the request message into an ordered message queue. Even if the receiver is unavailable, the message remains in the queue to be processed later.
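
As a rough illustration of this pattern, the sketch below uses Python's standard `queue.Queue` as an in-process stand-in for a real message broker. The `send` and `drain` functions and the loan message fields are hypothetical names chosen for the example, not part of any particular messaging product.

```python
import queue

# Hypothetical in-process stand-in for a broker-managed message queue.
message_queue = queue.Queue()


def send(message):
    """Sender: enqueue the message and return at once, without waiting for a reply."""
    message_queue.put(message)


def drain(handler):
    """Receiver: process every message currently in the queue, in arrival order."""
    results = []
    while True:
        try:
            message = message_queue.get_nowait()
        except queue.Empty:
            break
        results.append(handler(message))
    return results


# The receiver is not running yet, but the sender is not blocked.
send({"loan_id": 1, "stage": "credit_check"})
send({"loan_id": 2, "stage": "credit_check"})

# Later, the receiver comes online and works through the backlog.
processed = drain(lambda message: message["loan_id"])
```

In a real deployment, the queue would live in a separate broker process so that messages survive while the receiver is down. The in-memory queue here only shows the shape of the interaction.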

Asynchronous communication handles requests that must move through several services, as in a loan processing application that needs approval at various stages. It's an appropriate choice when many scattered services need to communicate with myriad data sources and other services. In a synchronous setup, by contrast, services that must wait for responses can end up blocking the function of the entire service collection.

But while there are obvious benefits to asynchronous inter-service communication for microservices, there are some real drawbacks to watch out for as well.

Advantages of asynchronous communication

Asynchronous communication has two notable benefits. The first is reduced coupling. The sender doesn't need to have any knowledge of the message's consumer, since it only needs to place the request in the queuing system. Those messages stored in the queue are processed -- often in batches -- whenever the receiver is ready to access them.

If the requesting service needs information immediately, but the information doesn't need to be current, the receiver can transmit a package of critical archived data to the requester until it is ready to provide the new data. The requester can use this data to at least keep the service running while it waits for the new information.

A second benefit is fault isolation. Asynchronous messaging can handle regular or intermittent downtimes. If the receiving system fails for any reason, the sender can still send messages to the queue without interruption. The receiving service can pick up and process the messages as soon as it's ready to do so. This is a big change from synchronous inter-service communication, where both the sender and the receiver must be available and directly communicate for the operation to succeed.
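
The contrast can be sketched in a few lines. This is a simplified model under stated assumptions: the `Receiver` class, its `available` flag, and the `send_sync`, `send_async` and `deliver_pending` helpers are all hypothetical, and a `deque` stands in for a durable queue.

```python
from collections import deque


class Receiver:
    """Hypothetical receiving service that can be switched on and off."""

    def __init__(self):
        self.available = False
        self.handled = []

    def handle(self, message):
        if not self.available:
            raise ConnectionError("receiver is down")
        self.handled.append(message)


def send_sync(receiver, message):
    """Synchronous call: fails immediately if the receiver is unavailable."""
    receiver.handle(message)


def send_async(pending, message):
    """Asynchronous send: only the queue has to be reachable."""
    pending.append(message)


def deliver_pending(pending, receiver):
    """Hand queued messages to the receiver once it is available again."""
    while pending and receiver.available:
        receiver.handle(pending.popleft())
```

With the receiver down, `send_sync` raises `ConnectionError`, while `send_async` succeeds and the messages wait in the queue until `deliver_pending` runs after recovery.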

Challenges and concerns with asynchronous messaging

Application developers deal with much more complexity when they choose asynchronous communication between microservices over synchronous designs. It's not easy to implement request-response semantics using asynchronous messaging.
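
One common way to layer request-response semantics on top of queues is to tag each request with a correlation ID and match replies against it. The sketch below assumes two in-process queues and hypothetical helper names (`send_request`, `serve_one`, `collect_replies`). It hints at the extra bookkeeping the requester takes on.

```python
import queue
import uuid

# Hypothetical request and reply channels; a real system would use broker queues.
request_queue = queue.Queue()
reply_queue = queue.Queue()


def send_request(payload):
    """Requester: enqueue a request and return its correlation ID without waiting."""
    correlation_id = str(uuid.uuid4())
    request_queue.put({"correlation_id": correlation_id, "payload": payload})
    return correlation_id


def serve_one():
    """Responder: take one request and publish a reply carrying the same ID."""
    request = request_queue.get_nowait()
    reply_queue.put({
        "correlation_id": request["correlation_id"],
        "result": request["payload"].upper(),  # placeholder for real work
    })


def collect_replies():
    """Requester: gather available replies, keyed by correlation ID."""
    replies = {}
    while not reply_queue.empty():
        reply = reply_queue.get_nowait()
        replies[reply["correlation_id"]] = reply["result"]
    return replies
```

The requester has to remember which correlation IDs are still outstanding and decide what to do when a reply never arrives. A blocking synchronous call gives you that part for free.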

First, throughput can be a problem. If messages must wait on specific queue semantics, such as strict ordering, before they can pass, the queue itself becomes a bottleneck. Each message needs a mechanism to manage its place in the workflow queue. You might even need to lock the messaging infrastructure, so to speak, which might restrict throughput if the queue is a managed service. Some organizations also find the cost of managing an asynchronous inter-service messaging infrastructure for microservices to be significant.

[Figure: Asynchronous vs. synchronous communication]

Another major problem with asynchronous communication is that the services are tightly coupled with the message broker. Making a change to the message broker could severely impact message flow and bring communication to a halt.

On top of these two problems, don't let asynchronous communication hinder other microservices principles. The ability for an application to recover from failures and remain available and functional remains vital, so prioritize resilient principles. Resilient microservices need resilient inter-service communication.

When building microservices, consider the failures of services, components, networks and other resources. For resilient microservices applications, a service mesh abstraction layer at the infrastructure level adds circuit breakers, load balancing, failover, timeouts and retries. The service mesh should also capture inter-service call metrics -- such as latency, errors and response sizes -- and enable distributed tracing. Since a single transaction can span several services, it is difficult to monitor the overall performance and health of the application, which makes distributed tracing worthwhile.

The retry method and the circuit breaker approach both make inter-service communications fault tolerant. The former retries a failed operation up to a set number of times, while the latter stops a service from repeatedly calling an operation that is likely to fail.
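
Both approaches can be sketched in plain Python. This is a minimal illustration and not how any particular service mesh implements them. The `retry` function, the `CircuitBreaker` class and its `failure_threshold` and `reset_after` parameters are assumptions made for the example.

```python
import time


def retry(operation, attempts=3, delay=0.0):
    """Call operation up to `attempts` times (attempts >= 1); re-raise the last error."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(delay)


class CircuitOpenError(RuntimeError):
    """Raised when the breaker skips a call instead of attempting it."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, operation):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; call skipped")
            # Cool-down has elapsed: let one trial call through.
        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

The two are often combined. Retries absorb brief glitches, while the breaker keeps a struggling service from being hit with calls that would most likely fail anyway.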

And lastly, conduct proper versioning. When a new version of a service deploys, it shouldn't break any other services that depend on it. However, in some cases, you have to run multiple versions of a service side-by-side to avoid problems.
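
With messaging, one way to keep old and new versions working side by side is to carry a version number in each message and route on it. The sketch below assumes a hypothetical `version` field and two made-up message shapes. Real schemas and handlers will differ.

```python
def handle_v1(message):
    # Assumed v1 shape: a single "name" field.
    return {"customer": message["name"]}


def handle_v2(message):
    # Assumed v2 shape: the name split into two fields.
    return {"customer": f'{message["first_name"]} {message["last_name"]}'}


# Both handlers stay registered until no sender emits v1 messages any more.
HANDLERS = {1: handle_v1, 2: handle_v2}


def dispatch(message):
    """Route a message to the handler for its version; unversioned means v1."""
    version = message.get("version", 1)
    handler = HANDLERS.get(version)
    if handler is None:
        raise ValueError(f"unsupported message version: {version}")
    return handler(message)
```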

Overcome the asynchronous challenges

A couple of significant workarounds exist for these inter-service messaging challenges.

As mentioned, a service mesh can alleviate most of these challenges through the use of circuit breakers, load balancing, failover, timeouts and automatic retries for failed requests.

If one service depends on another one only for data, then consider replicating the data. Data replication lets a workflow in the application use the data pertaining to another microservice while minimizing dependency between services. It helps the services scale and also solves data ownership issues in a microservices-based application.

If you can replicate copies of customer data to an order service, for example, it diminishes the order microservice's dependency on the customer service microservice. However, only use this approach in tandem with a strategy to update the replicated data, since that data will grow stale.
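
As a sketch of what such an update strategy might look like, the order service below keeps a local replica of customer records and refreshes it from change events published by the customer service. The `CustomerReplica` class, the event fields and the per-record `version` counter are all assumptions for illustration.

```python
class CustomerReplica:
    """Hypothetical local copy of customer data held by the order service."""

    def __init__(self):
        self._rows = {}

    def apply(self, event):
        """Apply a change event, ignoring any that are older than the stored copy."""
        current = self._rows.get(event["customer_id"])
        if current is None or event["version"] > current["version"]:
            self._rows[event["customer_id"]] = event

    def get(self, customer_id):
        """Read from the replica without calling the customer service."""
        return self._rows.get(customer_id)


replica = CustomerReplica()
replica.apply({"customer_id": 7, "version": 1, "email": "old@example.com"})
replica.apply({"customer_id": 7, "version": 2, "email": "new@example.com"})
# A delayed, out-of-order event must not overwrite newer data.
replica.apply({"customer_id": 7, "version": 1, "email": "old@example.com"})
```

The version check matters because asynchronous delivery can reorder or repeat events. Without it, a late event could quietly put stale data back into the replica.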
