Thu Apr 27 2023
Event Sourcing: What Is It and Why Does It Matter?
Modern business software is composed of many interdependent parts, any number of which may need to read or change the system’s state. As software became more complex, developers frequently ran into the limitations of traditional relational databases for distributed state management. Event sourcing is an alternative state management technique that improves performance, scaling, and reliability while providing a comprehensive audit trail for future reference and analysis.
In this article, we’ll delve into the fundamentals of event sourcing, explore its benefits, and examine how it compares to traditional CRUD-based persistence.
What is Event Sourcing?
Event sourcing considers each state change within a system to be an event. For instance, an e-commerce shopper adding an item to their cart is an event. Traditionally, that action would have prompted a state change in a database’s order table. In contrast, a system built around event sourcing adds the event to an append-only store along with contextual data such as the order number or customer ID number.
Other software components can then access this stream of events to read and “replay” them, consistently arriving at the same conclusion about the system’s state. In our e-commerce platform example, the service responsible for building the shopping cart interface can reconstruct the events for a particular order and display them to the customer.
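As a rough illustration of the idea, here is a minimal in-memory sketch of an append-only store. The `Event` and `EventStore` names, the field names, and the sample IDs are all hypothetical and chosen for this example; a production system would use a durable, distributed log instead of a Python list.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Event:
    """An immutable record of something that happened (hypothetical shape)."""
    stream_id: str        # e.g. an order number
    type: str             # e.g. "addedItemToCart"
    data: dict[str, Any]  # contextual data such as the customer ID


class EventStore:
    """In-memory append-only store: events can be added and read, never edited."""

    def __init__(self) -> None:
        self._events: list[Event] = []

    def append(self, event: Event) -> int:
        self._events.append(event)
        return len(self._events) - 1  # position of the event in the log

    def read(self, stream_id: str) -> list[Event]:
        # Return one stream's events in the order they were appended.
        return [e for e in self._events if e.stream_id == stream_id]


store = EventStore()
store.append(Event("order-1", "addedItemToCart", {"customer_id": "c-42", "item_id": "sku-1"}))
store.append(Event("order-2", "addedItemToCart", {"customer_id": "c-7", "item_id": "sku-9"}))
store.append(Event("order-1", "addedItemToCart", {"customer_id": "c-42", "item_id": "sku-2"}))

# Any consumer that reads the stream sees the same events in the same order.
items = [e.data["item_id"] for e in store.read("order-1")]
print(items)  # ['sku-1', 'sku-2']
```

Because the store exposes only `append` and `read`, every reader that replays a stream works from the same history.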
Event Sourcing Step-By-Step
Event sourcing is used in business software across many industries, including fintech, social media, gaming, and healthcare. Many of these systems also need synchronous operations, for example for certain banking or stock trading functions, but their asynchronous portions are integral to overall performance and reliability.
Let's stick with e-commerce to walk through how event sourcing might work in a typical application:
- A customer creates a shopping cart and a createdCart event is generated with the cart ID and customer ID.
- This and subsequent events are appended to the event store and published, so other services can access them.
- Next, the customer adds items to their cart and a service emits an addedItemToCart event for each one. The addedItemToCart event includes relevant metadata such as cart ID, item IDs, and quantities.
- If the customer removes an item from their cart, the system generates an itemRemovedFromCart event. Note that events are never removed from the event store: the addedItemToCart event isn’t deleted; instead, an itemRemovedFromCart event is added.
- Finally, the customer checks out and the system generates a cartCheckedOut event after validation.
Note that while this walkthrough highlights the asynchronous events and services, a synchronous service is very likely keeping track of the cart state as well. Because every change is recorded in the event store, if the service managing the cart contents crashes, it can rebuild the cart by replaying that cart's events in order, additions and removals alike.
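The walkthrough above can be sketched as a small replay function. This is only an illustration: the event type names come from the steps above, while the payload field names (`cart_id`, `customer_id`, `item_id`, `quantity`) and the `rebuild_cart` helper are assumptions made for the example.

```python
from collections import Counter


def rebuild_cart(events):
    """Fold a cart's events, in order, into its current state."""
    cart = {"cart_id": None, "customer_id": None, "items": Counter(), "checked_out": False}
    for event in events:
        kind, data = event["type"], event["data"]
        if kind == "createdCart":
            cart["cart_id"] = data["cart_id"]
            cart["customer_id"] = data["customer_id"]
        elif kind == "addedItemToCart":
            cart["items"][data["item_id"]] += data["quantity"]
        elif kind == "itemRemovedFromCart":
            cart["items"][data["item_id"]] -= data["quantity"]
            if cart["items"][data["item_id"]] <= 0:
                del cart["items"][data["item_id"]]  # nothing left of this item
        elif kind == "cartCheckedOut":
            cart["checked_out"] = True
    return cart


# The log keeps the add AND the remove; nothing is deleted from it.
events = [
    {"type": "createdCart", "data": {"cart_id": "cart-1", "customer_id": "c-42"}},
    {"type": "addedItemToCart", "data": {"cart_id": "cart-1", "item_id": "sku-1", "quantity": 2}},
    {"type": "addedItemToCart", "data": {"cart_id": "cart-1", "item_id": "sku-2", "quantity": 1}},
    {"type": "itemRemovedFromCart", "data": {"cart_id": "cart-1", "item_id": "sku-2", "quantity": 1}},
]

cart = rebuild_cart(events)
print(dict(cart["items"]))  # {'sku-1': 2}
```

A service that crashes and restarts can call a function like this on the stored events and arrive at the same cart it had before.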
What Are the Benefits of Event Sourcing?
One of the biggest advantages of event sourcing is the maintenance of a complete log of everything that’s happened on the system. That’s hugely beneficial for debugging, monitoring, and observability. You can reconstitute and replay events, observing their impact and the system’s behavior. With the right data monitoring tools for your event-driven systems, you can see precisely why incidents occurred, implement changes, and ensure better reliability as features and data needs grow.
That’s not easily achievable with CRUD-based state management, where the database holds only the state at a single point in time. The history of changes isn’t accessible without significant additional infrastructure to capture events as they occur, such as an event store, software to generate events, and a reconstructable audit log.
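To make the contrast concrete, the sketch below replays a log only up to a chosen moment, which answers "what did the cart look like at 9:07?", a question a current-state table cannot answer on its own. The timestamps, the tuple layout, and the `items_as_of` helper are hypothetical.

```python
from datetime import datetime

# (timestamp, event type, item ID), ordered by time. Sample data for illustration.
events = [
    (datetime(2023, 4, 27, 9, 0), "addedItemToCart", "sku-1"),
    (datetime(2023, 4, 27, 9, 5), "addedItemToCart", "sku-2"),
    (datetime(2023, 4, 27, 9, 10), "itemRemovedFromCart", "sku-1"),
]


def items_as_of(events, as_of):
    """Replay events up to and including `as_of` and return the items in the cart."""
    items = set()
    for timestamp, kind, item_id in events:
        if timestamp > as_of:
            break  # the log is time-ordered, so no later event applies
        if kind == "addedItemToCart":
            items.add(item_id)
        elif kind == "itemRemovedFromCart":
            items.discard(item_id)
    return items


print(sorted(items_as_of(events, datetime(2023, 4, 27, 9, 7))))   # ['sku-1', 'sku-2']
print(sorted(items_as_of(events, datetime(2023, 4, 27, 9, 30))))  # ['sku-2']
```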
Other advantages of event sourcing include:
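- Scalability and resilience. An append-only event log can be partitioned (for example, by cart or customer ID) and distributed across nodes, and services can rebuild their state from it after a failure.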
- Integration with external systems. Event sourcing uses events as a common language and a source of truth for different services and components. This streamlines communication and synchronization between systems.
- Potential for analytics and machine learning. Event sourcing collects detailed event-level data. The event data set can be used to power machine learning systems, such as recommendation engines, fraud detection, and user behavior prediction.
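As one illustration of the partitioning point above, events can be routed to a partition by hashing a key such as the cart ID, so that all events for one cart stay together and in order. This is a generic sketch, not the partitioner of any particular platform; `NUM_PARTITIONS`, `partition_for`, and the sample IDs are assumptions.

```python
import hashlib

NUM_PARTITIONS = 4  # assumed partition count for this example


def partition_for(stream_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a stream ID to a partition with a stable hash.

    Python's built-in hash() is randomized per process for strings,
    so a cryptographic digest is used to get a repeatable result.
    """
    digest = hashlib.sha256(stream_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


partitions = {p: [] for p in range(NUM_PARTITIONS)}
incoming = [
    ("cart-1", "createdCart"),
    ("cart-2", "createdCart"),
    ("cart-1", "addedItemToCart"),
    ("cart-1", "cartCheckedOut"),
]
for stream_id, event_type in incoming:
    partitions[partition_for(stream_id)].append((stream_id, event_type))

# All cart-1 events land in one partition, in their original order.
cart_1_events = [t for sid, t in partitions[partition_for("cart-1")] if sid == "cart-1"]
print(cart_1_events)  # ['createdCart', 'addedItemToCart', 'cartCheckedOut']
```

Keeping one cart's events in a single partition preserves their relative order, which replay depends on.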
Common Tools and Data Formats Used in Event Sourcing Systems
There is no shortage of tools available to support event sourcing. In general, you can use any streaming or messaging platform, but it’s advisable to use solutions that are distributed by default and can therefore support highly available, fault-tolerant systems. Distributed event-sourcing tools include:
- Apache Kafka - A distributed streaming platform for building real-time data pipelines and applications, capable of handling trillions of events per day.
- RabbitMQ - A message broker offering robust messaging between applications and services over multiple protocols.
- Apache Pulsar - A cloud-native distributed messaging and streaming platform that supports multiple tenants and namespaces.
Data formats commonly used in event sourcing systems include JSON, Avro, Protobuf, and MessagePack.
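For example, an event might be encoded as JSON before it is written to the store and decoded again by a consumer. The field names below are assumptions made for illustration, not a standard schema.

```python
import json

# A hypothetical event payload.
event = {
    "type": "addedItemToCart",
    "cart_id": "cart-1",
    "item_id": "sku-1",
    "quantity": 2,
}

# Encode to bytes for storage or transport; sort_keys makes the output stable.
payload = json.dumps(event, sort_keys=True).encode("utf-8")

# A consumer decodes the bytes back into the same structure.
decoded = json.loads(payload)
print(decoded == event)  # True
```

Binary formats such as Avro, Protobuf, and MessagePack follow the same encode and decode pattern, typically with smaller payloads than JSON.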
Streamdal and Event Sourcing
Event sourcing is a powerful technique for managing the state of modern business software. By recording each state change within a system as an event in an append-only store, event sourcing improves performance, scaling, and reliability while also providing a comprehensive audit trail for future analysis.
Streamdal was initially built to make event sourcing easier. All data we ingest is logically indexed into a data-science-friendly format (Parquet) and stored in the cloud, on-prem, or anywhere you need it.
Along with this native functionality, you can expect your event source to also have:
- A Smart Dead-Letter Queue
- Monitors & Alerts
- Search and observability as JSON, regardless of binary or other encodings
With Streamdal, you can substantially reduce the time and effort required to build and maintain an event source.
You can get started today!
Dan is the co-founder and CTO of Streamdal. Dan is a tech industry veteran with 20+ years of experience working as a principal engineer at companies like New Relic, InVision, DigitalOcean, and various data centers. He is passionate about distributed systems, software architecture, and enabling observability in next-gen systems.