Event-Driven Architecture: Patterns and Trade-offs

#architecture #event-driven #kafka #messaging

Event-driven architecture (EDA) decouples producers from consumers, which can make systems more scalable, resilient, and adaptable to change. It also introduces complexity of its own: eventual consistency, ordering challenges, and harder debugging.

Core Patterns

Pub/Sub (Publish-Subscribe)

Producers publish events to a topic; multiple consumers subscribe independently.

  • Loose coupling between services
  • Easy to add new consumers without modifying producers
  • No ordering guarantee across consumers: each subscriber processes events at its own pace
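
To make the fan-out concrete, here is a minimal in-memory sketch. `InMemoryBroker`, the `order.placed` topic, and the two handlers are hypothetical names chosen for illustration; a real broker would add persistence, network delivery, and acknowledgements.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[dict], None]


class InMemoryBroker:
    """Toy broker: each topic fans out to every subscribed handler."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        # The producer does not know who is listening, or whether anyone is.
        for handler in self._subscribers[topic]:
            handler(event)


broker = InMemoryBroker()
emails: List[str] = []
invoices: List[str] = []

# Two independent consumers of the same topic (hypothetical handlers).
broker.subscribe("order.placed", lambda e: emails.append(e["order_id"]))
broker.subscribe("order.placed", lambda e: invoices.append(e["order_id"]))

broker.publish("order.placed", {"order_id": "o-1"})
```

Adding the second subscriber required no change to the `publish` call, which is the decoupling the pattern is meant to provide.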

Event Sourcing

Instead of storing current state, store the sequence of events that led to it.

  • Complete audit trail by design
  • Ability to rebuild state at any point in time
  • Enables temporal queries ("what was the state on March 1st?")
  • Trade-off: read models must be projected from the event log, and storage grows over time
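
The idea of rebuilding state from events, including a temporal query, can be sketched as a simple replay. The account example, the `Event` type, and the `balance` function are assumptions made for illustration, not part of any particular framework.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass(frozen=True)
class Event:
    occurred_on: date
    kind: str  # "deposited" or "withdrawn"
    amount: int


def balance(events: Iterable[Event], as_of: Optional[date] = None) -> int:
    """Rebuild the balance by replaying events, optionally up to a date."""
    total = 0
    for event in events:
        if as_of is not None and event.occurred_on > as_of:
            continue  # ignore events after the requested point in time
        if event.kind == "deposited":
            total += event.amount
        elif event.kind == "withdrawn":
            total -= event.amount
    return total


log = [
    Event(date(2024, 2, 10), "deposited", 100),
    Event(date(2024, 2, 25), "withdrawn", 30),
    Event(date(2024, 3, 5), "deposited", 50),
]

current = balance(log)                       # 120
on_march_1 = balance(log, date(2024, 3, 1))  # 70
```

Because the log is never overwritten, the "state on March 1st" question is answered by the same replay with an earlier cutoff.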

CQRS (Command Query Responsibility Segregation)

Separate the write model (commands) from the read model (queries).

  • Write model optimized for consistency and validation
  • Read model optimized for query performance
  • Often combined with event sourcing
  • Trade-off: two models to maintain, eventual consistency between them
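
A rough sketch of the split, assuming a hypothetical order domain: `OrderWriteModel` validates commands and records events, while `SpendByCustomerReadModel` is a projection shaped for one query. The names and the in-process event list are illustrative only.

```python
from typing import Dict, List


class OrderWriteModel:
    """Command side: validates input and records events."""

    def __init__(self) -> None:
        self.events: List[dict] = []

    def place_order(self, order_id: str, customer: str, total: int) -> dict:
        if total <= 0:
            raise ValueError("order total must be positive")
        event = {"type": "order_placed", "order_id": order_id,
                 "customer": customer, "total": total}
        self.events.append(event)
        return event


class SpendByCustomerReadModel:
    """Query side: a denormalized view shaped for one question."""

    def __init__(self) -> None:
        self.totals: Dict[str, int] = {}

    def apply(self, event: dict) -> None:
        if event["type"] == "order_placed":
            customer = event["customer"]
            self.totals[customer] = self.totals.get(customer, 0) + event["total"]

    def spend(self, customer: str) -> int:
        return self.totals.get(customer, 0)


write_model = OrderWriteModel()
read_model = SpendByCustomerReadModel()

write_model.place_order("o-1", "alice", 40)
write_model.place_order("o-2", "alice", 60)

# Until the projection runs, the read side lags behind the write side.
stale = read_model.spend("alice")  # 0
for event in write_model.events:
    read_model.apply(event)
fresh = read_model.spend("alice")  # 100
```

The gap between `stale` and `fresh` is the eventual consistency named in the trade-off above.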

Message Broker Comparison

| Broker | Model | Throughput | Ordering | Retention | Best For |
|---|---|---|---|---|---|
| Apache Kafka | Distributed log | Very high | Per partition | Days to forever | Stream processing, event sourcing |
| RabbitMQ | Message queue | High | Per queue | Until consumed | Task queues, RPC patterns |
| AWS SNS/SQS | Pub/Sub + Queue | High | Optional (FIFO) | Up to 14 days (SQS) | AWS-native event routing |
| GCP Pub/Sub | Pub/Sub | High | Per ordering key | Up to 31 days | GCP-native event pipelines |
| Redis Streams | Append-only log | Very high | Per stream | Configurable | Low-latency, simpler use cases |
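
"Per partition" ordering usually means that events sharing a key are routed to the same partition, so their relative order is preserved even though there is no global order. The sketch below illustrates that routing. It uses CRC32 as a stand-in hash, which is an assumption for determinism; real clients ship their own partitioners.

```python
import zlib
from collections import defaultdict
from typing import DefaultDict, List, Tuple

NUM_PARTITIONS = 3  # arbitrary value for the example


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # CRC32 is a stand-in hash chosen for determinism, not what any
    # specific broker client uses.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


partitions: DefaultDict[int, List[Tuple[str, str]]] = defaultdict(list)

events = [
    ("order-1", "created"),
    ("order-2", "created"),
    ("order-1", "paid"),
    ("order-2", "cancelled"),
    ("order-1", "shipped"),
]

for key, status in events:
    partitions[partition_for(key)].append((key, status))


def history(key: str) -> List[str]:
    """Statuses for one key, in the order its partition stored them."""
    stored = partitions[partition_for(key)]
    return [status for k, status in stored if k == key]
```

Here `history("order-1")` returns the statuses in publish order, while nothing can be said about how `order-1` and `order-2` events interleave if they land in different partitions.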

Delivery Guarantees

| Guarantee | Description | Complexity | Use Case |
|---|---|---|---|
| At-most-once | Fire and forget, may lose messages | Low | Metrics, non-critical logs |
| At-least-once | Retry until acknowledged, may duplicate | Medium | Most business events |
| Exactly-once | Each message processed exactly once | High | Financial transactions |

True exactly-once delivery is rarely available end to end. In practice, exactly-once processing is usually achieved by combining at-least-once delivery with idempotent consumers, so design consumers to handle duplicates safely.
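
One common way to make a consumer idempotent is to record the IDs of events it has already applied. The sketch below assumes each event carries a unique `event_id` (a hypothetical field name) and keeps the seen set in memory. In a real system the seen set and the state change would need to be persisted together, ideally in the same transaction.

```python
from typing import Dict, Set


class IdempotentPaymentConsumer:
    """Applies each event at most once, keyed by a unique event ID."""

    def __init__(self) -> None:
        self.balances: Dict[str, int] = {}
        self._seen: Set[str] = set()

    def handle(self, event: dict) -> bool:
        """Return True if the event was applied, False if it was a duplicate."""
        if event["event_id"] in self._seen:
            return False
        account = event["account"]
        self.balances[account] = self.balances.get(account, 0) + event["amount"]
        self._seen.add(event["event_id"])
        return True


consumer = IdempotentPaymentConsumer()
payment = {"event_id": "evt-1", "account": "acc-9", "amount": 25}

first = consumer.handle(payment)   # True: applied
second = consumer.handle(payment)  # False: redelivery ignored
```

After both deliveries the balance for `acc-9` is 25, not 50, which is the behavior an at-least-once broker requires from its consumers.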

Event Schema Evolution

As systems evolve, event schemas change. Strategies to manage this:

  • Schema registry (Confluent, AWS Glue) -- centralized schema management with compatibility checks
  • Backward compatibility -- new consumers can read old events
  • Forward compatibility -- old consumers can read new events
  • Versioned events -- include version field, consumers handle multiple versions
  • Upcasting -- transform old events to new format at read time
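
Versioned events and upcasting can be combined: each stored event carries a version, and a chain of small functions lifts old events to the current shape at read time. The `user_registered` schema history below is invented for illustration, as are the function names.

```python
from typing import Callable, Dict

# Hypothetical schema history for a "user_registered" event:
#   v1: {"version": 1, "name": "Ada Lovelace"}
#   v2: {"version": 2, "first_name": "Ada", "last_name": "Lovelace"}
CURRENT_VERSION = 2


def upcast_v1_to_v2(event: dict) -> dict:
    # Split on the first space; everything after it becomes the last name.
    first, _, last = event["name"].partition(" ")
    return {"version": 2, "first_name": first, "last_name": last}


UPCASTERS: Dict[int, Callable[[dict], dict]] = {1: upcast_v1_to_v2}


def upcast(event: dict) -> dict:
    """Apply upcasters step by step until the event is at CURRENT_VERSION."""
    event = dict(event)  # never mutate the stored event
    # Assumption: events written before versioning existed are treated as v1.
    event.setdefault("version", 1)
    while event["version"] < CURRENT_VERSION:
        event = UPCASTERS[event["version"]](event)
    return event


old = {"version": 1, "name": "Ada Lovelace"}
new = upcast(old)
```

Consumers then only need to understand the current version, and the stored events stay untouched.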

Key Design Decisions

| Decision | Option A | Option B | Guidance |
|---|---|---|---|
| Thin vs fat events | ID + reference only | Full payload included | Fat events reduce coupling but increase payload size |
| Event vs command | Notification of what happened | Instruction to do something | Events for decoupling, commands for explicit orchestration |
| Shared vs dedicated topics | All events on one topic | One topic per event type | Dedicated topics for high-volume or sensitive events |
| Sync vs async | Request-response | Fire-and-forget | Async by default, sync only when immediate response required |
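
The thin-versus-fat decision is easiest to see from the consumer's side. In this sketch, `ORDERS` and `fetch_order` stand in for the producer's database or API and are purely hypothetical. The thin event forces a lookup back to the producer, while the fat event carries what the consumer needs.

```python
from typing import Dict, List

# Stand-in for the producer's database or API (hypothetical data).
ORDERS: Dict[str, dict] = {"o-1": {"customer": "alice", "total": 40}}
lookup_log: List[str] = []  # records every callback to the producer


def fetch_order(order_id: str) -> dict:
    """Simulates the callback a thin event forces on the consumer."""
    lookup_log.append(order_id)
    return ORDERS[order_id]


thin_event = {"type": "order_placed", "order_id": "o-1"}
fat_event = {"type": "order_placed", "order_id": "o-1",
             "customer": "alice", "total": 40}


def total_from_thin(event: dict) -> int:
    # Requires the producer to be reachable and the order to still exist.
    return fetch_order(event["order_id"])["total"]


def total_from_fat(event: dict) -> int:
    # Self-contained: no call back to the producer.
    return event["total"]
```

The thin variant keeps payloads small but ties the consumer to the producer's availability and current data, while the fat variant trades payload size for independence.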

Common Pitfalls

  • Event storms -- cascading events overload the system; use circuit breakers and backpressure
  • Ordering assumptions -- distributed systems do not guarantee global order; design for out-of-order delivery
  • Ghost events -- events referencing data that no longer exists; include sufficient context in events
  • Debug hell -- tracing a request across 10 services requires correlation IDs and centralized logging
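
For the ordering pitfall, one defensive approach is to have the producer stamp each event with a per-entity sequence number and have consumers ignore anything that is not newer than what they already hold. The `sequence`, `entity_id`, and `status` fields here are assumed names, and this sketch covers only "latest state wins" projections.

```python
from typing import Dict


class LatestStateProjection:
    """Keeps the newest state per entity, ignoring stale or repeated events."""

    def __init__(self) -> None:
        self.state: Dict[str, dict] = {}

    def apply(self, event: dict) -> bool:
        """Return True if the event advanced the state, False if ignored."""
        current = self.state.get(event["entity_id"])
        if current is not None and event["sequence"] <= current["sequence"]:
            return False  # older than, or the same as, what we already have
        self.state[event["entity_id"]] = {
            "sequence": event["sequence"],
            "status": event["status"],
        }
        return True


projection = LatestStateProjection()

# "shipped" (sequence 2) arrives before "paid" (sequence 1).
applied_shipped = projection.apply(
    {"entity_id": "o-1", "sequence": 2, "status": "shipped"})
applied_paid = projection.apply(
    {"entity_id": "o-1", "sequence": 1, "status": "paid"})
```

The late `paid` event is dropped, so the projection still reports `shipped`. Workflows that must see every intermediate step would need buffering or reordering instead of this drop-stale rule.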

Resources