Messaging is one of the most important aspects of modern programming techniques. The majority of today’s systems consist of several modules and external dependencies. If they weren’t able to communicate with each other in an efficient manner, they wouldn’t be very effective in carrying out their intended functions. In this blog, we’ll be looking at the different types of messaging systems available to us today, like JMS and AMQP, with the pros and cons of each.
Before the advent of standard messaging protocols like JMS (Java Message Service) and, later, AMQP (Advanced Message Queuing Protocol), messaging between applications was extremely basic and simply implied predictable static interaction between well-known end points. This system was completely synchronous in nature, which made it very brittle and highly prone to failure. Each application had to be programmed to talk to the other directly in a very specific way. This is referred to as tightly-coupled communication (usually done via TCP network sockets or Java RMI).
If even one of the components in this synchronous message chain failed for some reason, it would adversely affect all the other dependents. Also, since state is maintained across each component, the system can get complex to reset in case of failure.
To address these challenges, the idea of having a Message Oriented Middleware was conceived.
Message Oriented Middleware
MOM (Message Oriented Middleware) is software or hardware infrastructure that supports sending and receiving messages between distributed systems. Data is exchanged by message passing and/or message queuing in either synchronous or asynchronous fashion as needed. Messages are sent from one application to the other with a queue as an intermediary. Messages in a queue remain there until they are retrieved by the intended recipient.
The obvious benefits of a MOM approach are:
- The receiver application does not need to be available when the message is sent. Instead, the receiver can retrieve the message at any time.
- Messages can be retrieved off the queue in any order.
- Messages can be retrieved using priority or load-balancing schemes.
- Persistent queues provide a level of fault tolerance by allowing messages to be recovered when the system fails.
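The first benefit in the list above can be illustrated with a minimal sketch. It uses Python's standard `queue` module as an in-memory stand-in for a MOM queue; the names `mailbox`, `send` and `receive_all` are hypothetical, and a real broker would also persist messages and work across processes.

```python
import queue

# In-memory stand-in for a MOM queue (assumption: a real broker would
# persist messages and be reachable over the network).
mailbox = queue.Queue()

def send(message):
    """Producer side: enqueue and return immediately, no receiver needed."""
    mailbox.put(message)

def receive_all():
    """Consumer side: drain whatever has accumulated since the last visit."""
    received = []
    while True:
        try:
            received.append(mailbox.get_nowait())
        except queue.Empty:
            return received

# The producer sends while no consumer is running.
send("order-1")
send("order-2")

# Later, the consumer comes online and picks up the backlog.
backlog = receive_all()
print(backlog)
```

The producer never blocks on the consumer, which is the decoupling that the tightly-coupled approach lacked.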
Java Message Service
JMS is a MOM specification that describes a common way for programs to create, send, receive and read distributed enterprise messages. As the name suggests, it has its roots in the Java language. It can, however, be utilized by other languages like Ruby or Python to interface with Java applications, with a few caveats mentioned later.
The basic elements of a JMS-compliant system are:
- Provider: An implementation of the JMS interface for a Message Oriented Middleware. Providers are implemented as either a Java JMS implementation or an adapter to a non-Java MOM.
- Client: An application or process that produces and/or receives messages.
- Producer/Publisher: A client that creates and sends messages.
- Consumer/Subscriber: A client that receives messages.
- Message: An object that contains the data being transferred between clients.
- Queue: A staging area that contains messages that have been sent and are waiting to be read (by only one consumer).
- Topic: A distribution mechanism for publishing messages that are delivered to multiple subscribers.
JMS Messaging Models
Messages can be sent using the JMS API in the following two ways:
The Point-to-Point messaging model (P2P model) allows JMS clients to send and receive messages both synchronously and asynchronously via virtual channels known as queues. This model has traditionally been a pull- or polling-based model, where messages are requested from the queue instead of being pushed to the client automatically. As seen in Figure 1, a queue may have multiple receivers, but only one receiver may receive each message. The JMS provider will take care of doling out the messages among JMS clients, ensuring that each message is consumed by only one JMS client.
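As a rough illustration of the point-to-point behaviour described above, the following sketch models a queue with two polling receivers. It is plain Python, not the JMS API; `PointToPointQueue` and the worker names are hypothetical, and the strict turn-taking is only one possible way a provider might distribute messages.

```python
from collections import deque

class PointToPointQueue:
    """Toy queue: each message is handed to exactly one receiver."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Pull model: the receiver asks for the next message. popleft()
        # removes it, so no other receiver can see the same message.
        return self._messages.popleft() if self._messages else None

orders = PointToPointQueue()
for n in range(4):
    orders.send(f"order-{n}")

# Two receivers take turns polling the same queue.
receivers = {"worker-a": [], "worker-b": []}
names = list(receivers)
turn = 0
while (message := orders.receive()) is not None:
    receivers[names[turn % len(names)]].append(message)
    turn += 1

print(receivers)
```

Every message ends up with exactly one of the two workers, and none is delivered twice.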
In the Publish-and-Subscribe (Pub/Sub) messaging model, a producer can send a message to many consumers through a channel called a topic. Consumers can choose to subscribe to a topic. Any messages addressed to a topic are delivered to all of the topic’s subscribed consumers. Every consumer receives a copy of the message. The pub/sub messaging model is a push-based model, where messages are broadcast to consumers automatically without them having to request or poll the topic for new messages.
In the pub/sub messaging model, the producer sending the message is not dependent on the consumers receiving the message. Optionally, JMS clients that use pub/sub can establish durable subscriptions that allow consumers to disconnect and later reconnect and collect messages that were published while they were disconnected.
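The difference between ordinary and durable subscriptions can be sketched as follows. This is a simplified in-memory model, not the JMS API; the `Topic` class, its method names and the subscriber names are all assumptions made for illustration.

```python
class Topic:
    """Toy topic: pushes each message to every connected subscriber and
    keeps a backlog for durable subscribers that are disconnected."""

    def __init__(self):
        self._live = {}     # subscriber name -> callback (connected)
        self._durable = {}  # subscriber name -> messages missed while away

    def subscribe(self, name, callback, durable=False):
        self._live[name] = callback
        if durable:
            # On (re)connect, replay anything stored while disconnected.
            for message in self._durable.get(name, []):
                callback(message)
            self._durable[name] = []

    def disconnect(self, name):
        self._live.pop(name, None)

    def publish(self, message):
        # Push model: subscribers do not poll, the topic calls them.
        for callback in list(self._live.values()):
            callback(message)
        for name, backlog in self._durable.items():
            if name not in self._live:
                backlog.append(message)

prices = Topic()
dashboard, audit = [], []
prices.subscribe("dashboard", dashboard.append)
prices.subscribe("audit", audit.append, durable=True)

prices.publish("EURUSD 1.08")   # both subscribers are connected
prices.disconnect("dashboard")
prices.disconnect("audit")
prices.publish("EURUSD 1.09")   # nobody is connected

prices.subscribe("dashboard", dashboard.append)
prices.subscribe("audit", audit.append, durable=True)
print(dashboard, audit)
```

Only the durable subscriber recovers the message that was published while it was away; the non-durable one simply misses it.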
Benefits of JMS
Prior to JMS, programmers had to go through a steep learning curve to learn the complex proprietary APIs of the specific messaging server—this made writing messaging applications difficult and resulted in minimal portability. One of the main objectives of JMS was to minimize the learning curve for writing messaging applications and to maximize their portability. The high adoption of JMS in several enterprises can be attributed to the following factors:
- Wide Industry Support: The JMS specification is very easy to implement in existing messaging servers. JMS was the first enterprise messaging API that garnered wide industry support and, as a result, became the messaging standard.
- Simple & Standard Messaging API: A developer only has to learn the JMS API and can then write portable messaging enterprise applications easily and quickly. By defining standard messaging concepts and conventions supported across different vendor messaging systems, JMS has simplified client application development and addressed portability issues.
Drawbacks of JMS
Despite the benefits mentioned above, JMS still has certain challenges that make it unsuitable in some use cases. The high dependency on the Java language can be a hindrance, especially in multi-tiered applications (using microservices) where a variety of languages/frameworks are used. It’s much easier for a Java program to interface with the JMS API than it is for programs in other languages like Ruby or Python. A further limitation of JMS is that the APIs are specified, but the message format is not. JMS has no requirement for how messages are formed and transmitted. Essentially, every JMS broker can implement the messages in a different format. They just have to use the same API.
Consider the case where you want a Java app to send a message to a .NET app. Since .NET cannot use JMS natively, there has to be a message broker present which converts Java’s JMS API calls, carried over a wire protocol such as OpenWire, to a protocol supported by .NET, such as MSMQ (Microsoft Message Queuing). There are several readily available brokers that do this, for example Apache ActiveMQ and Kafka. This issue is further exacerbated when apps developed in other languages are thrown into the mix. Having multiple brokers for converting one protocol to another results in a very large and unnecessary overhead.
A standard protocol needed to be defined for all languages/frameworks. The AMQP protocol was developed to solve this problem of interoperability.
Advanced Message Queuing Protocol
AMQP (Advanced Message Queuing Protocol) is an openly published wire specification for asynchronous messaging. Every byte of transmitted data is specified. This characteristic allows libraries to be written in many languages, and to run on multiple operating systems and CPU architectures, which makes for a truly inter-operable, cross-platform messaging standard.
The basic components of AMQP are very similar to those of JMS. Publishers, subscribers and queues are all available. In addition, there are two extra components.
- Exchanges: AMQP entities where messages are sent. Exchanges take a message and route it into zero or more queues. The routing algorithm used depends on the exchange type and on rules called routes or bindings, both explained in detail later.
- Bindings: Bindings or routes define which queue(s) a message is piped to. Consumers subscribed to that queue can then receive the message. Think of a binding as the link between an exchange and a queue. Messages that are sent through exchanges carry a parameter called the routing key, which determines their destination queue.
Networks are very often unreliable and applications may sometimes fail to process messages, so the AMQP model has a notion of message acknowledgements: when a message is delivered to a consumer, the consumer notifies the broker, either automatically or as soon as the application developer chooses. When message acknowledgements are in use, a broker will only completely remove a message from a queue when it receives a notification for that message (or group of messages).
When a message cannot be routed, it can be returned to the publisher, dropped, or, if the broker implements an extension, placed into a so-called “dead letter queue”. Publishers can choose how to handle situations like this by sending messages using certain parameters.
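The acknowledgement behaviour can be sketched with a small in-memory model. This is not an AMQP client; `AckQueue`, its delivery tags and `requeue_unacked` are hypothetical names, and the redelivery policy shown (put unacknowledged messages back at the front) is an assumption about how a broker might react to a failed consumer.

```python
from collections import deque

class AckQueue:
    """Toy queue that forgets a message only after it is acknowledged."""

    def __init__(self):
        self._ready = deque()  # waiting to be delivered
        self._unacked = {}     # delivery tag -> delivered but not yet acked
        self._next_tag = 1

    def publish(self, body):
        self._ready.append(body)

    def deliver(self):
        """Hand out the next message together with a delivery tag."""
        if not self._ready:
            return None
        tag = self._next_tag
        self._next_tag += 1
        self._unacked[tag] = self._ready.popleft()
        return tag, self._unacked[tag]

    def ack(self, tag):
        # Only now is the message completely removed.
        del self._unacked[tag]

    def requeue_unacked(self):
        """Simulate a consumer failure: unacked messages become ready again."""
        for tag in sorted(self._unacked, reverse=True):
            self._ready.appendleft(self._unacked.pop(tag))

    def __len__(self):
        return len(self._ready) + len(self._unacked)

jobs = AckQueue()
jobs.publish("resize image 1")
jobs.publish("resize image 2")

tag1, body1 = jobs.deliver()
jobs.ack(tag1)                  # processed successfully, message is gone

tag2, body2 = jobs.deliver()
jobs.requeue_unacked()          # consumer died before acknowledging

tag3, body3 = jobs.deliver()    # the same message is delivered again
print(body3, len(jobs))
```

Because the second message was never acknowledged, it stays in the queue and is handed out again under a new delivery tag.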
AMQP brokers generally provide four exchange types:
- Direct Exchange: This exchange type is similar to the P2P model in JMS. An exchange that is bound to a queue requires a direct match between the routing and binding key for the message to be delivered to the consumer. However, unlike the P2P model, it is possible to send messages to multiple bound queues thereby allowing it to be sent to more than one recipient. Refer to the illustration of Direct Exchange in Figure 4 where each arrow represents a message with a specific routing key.
- Fanout Exchange: In this exchange type, the routing key is ignored and the messages are routed to all bound queues. If N queues are bound to a fanout exchange, when a new message is published to that exchange a copy of the message is delivered to all N queues. Fanout exchanges are ideal for the broadcast routing of messages. It is quite similar to the Pub/Sub model in JMS.
- Topic Exchange: In this exchange type, queues bind to the exchange just like in a Direct Exchange, but with a wildcard pattern (* or #), so that the message is sent only to the bound queues whose pattern matches the routing key. Topic exchanges have a very broad set of use cases. Whenever a problem involves multiple consumers/applications that selectively choose which type of messages they want to receive, the use of topic exchanges should be considered.
- Headers Exchange: Similar to Topic Exchange, only instead of routing keys, the message header values are used to detect a match.
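The routing rules of the first three exchange types can be sketched in a few lines. This is an illustrative model rather than broker code: `route` and `topic_matches` are hypothetical helpers, bindings are represented as `(binding_key, queue_name)` pairs, the headers exchange is left out, and the wildcard semantics assumed are the usual ones (`*` matches exactly one dot-separated word, `#` matches zero or more).

```python
def topic_matches(pattern, routing_key):
    """'*' matches exactly one dot-separated word, '#' zero or more."""
    def match(p, k):
        if not p:
            return not k
        if p[0] == "#":
            # '#' absorbs zero words (drop it) or one more word (keep it).
            return match(p[1:], k) or (bool(k) and match(p, k[1:]))
        if k and p[0] in ("*", k[0]):
            return match(p[1:], k[1:])
        return False
    return match(pattern.split("."), routing_key.split("."))

def route(exchange_type, bindings, routing_key):
    """Return the names of the queues that receive the message."""
    if exchange_type == "fanout":
        matched = [q for _, q in bindings]  # routing key is ignored
    elif exchange_type == "direct":
        matched = [q for key, q in bindings if key == routing_key]
    elif exchange_type == "topic":
        matched = [q for key, q in bindings if topic_matches(key, routing_key)]
    else:
        raise ValueError(f"unsupported exchange type: {exchange_type}")
    return sorted(set(matched))

bindings = [("error", "alerts"), ("error", "archive"), ("info", "archive")]
direct_hit = route("direct", bindings, "error")
direct_miss = route("direct", bindings, "debug")  # empty: unroutable
fanout_all = route("fanout", bindings, "ignored")

topic_bindings = [("logs.*.error", "alerts"), ("logs.#", "archive")]
topic_error = route("topic", topic_bindings, "logs.payments.error")
topic_info = route("topic", topic_bindings, "logs.payments.info")

print(direct_hit, direct_miss, fanout_all, topic_error, topic_info)
```

An empty result corresponds to a message that cannot be routed, which is where the returned, dropped or dead-lettered handling comes into play.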
Apart from the exchange types mentioned above, exchanges are also declared using a number of attributes. The most important attributes are:
- Durability (exchanges survive broker restart)
- Auto-delete (exchange is deleted when all queues have finished using it)
- Arguments (these are broker-dependent)
All the other elements, like consumers, publishers and queues, work similarly to their JMS counterparts.
The table below summarises the key differences between AMQP and JMS.
AMQP vs JMS
To summarise, AMQP-based messaging is definitely a better choice because it is platform independent. It is a necessity in modern polyglot systems where multiple components need to communicate. Having said that, JMS still has its place in systems where Java is extensively and exclusively used. Several brokers such as RabbitMQ, ActiveMQ, Qpid & Kafka also facilitate communication between AMQP and JMS clients.