
Inter-Service Communication

How services talk to each other. The event store is the central nervous system.

The Pattern: Log-Based Architecture

For writes, services don't call each other directly. They communicate through a shared, append-only event log. Read-only queries are the one exception (see Pattern 2).

Event-Driven Communication

Each service has a clear role:

Service        | Reads           | Writes               | Role
Book-E         | -               | Proposals            | Proposes actions based on user input
Accounting API | Approved events | Execution results    | Executes approved actions against Folio/Fiken
Review-E       | Proposals       | Approvals/Rejections | Reviews proposals, approves or rejects
CronJob        | -               | Check results        | Triggers periodic compliance checks

Why Not Direct HTTP Calls?

Direct coupling

Book-E → Review-E → Accounting API → Folio

  • If Review-E is down, Book-E fails
  • If Accounting API is slow, everything is slow
  • Adding a new service means changing existing services
  • No audit trail of what happened

Event-driven

Book-E → Event Store ← Review-E / Accounting API

  • Services are independent — restart one without affecting others
  • Natural audit trail (the event log IS the history)
  • Add new services by subscribing to events
  • Retry failed operations by replaying events

Communication Patterns

Pattern 1: Proposal → Review → Execute

The primary pattern for write operations.

Proposal (Book-E) → Review (Review-E) → Execute (Accounting API)

Each step is a separate event. Each service reads only the events it cares about.
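The flow above can be sketched with an in-memory stand-in for the event store. This is a minimal illustration, not the actual implementation: the `EventLog` class, the per-service cursors, and the `APPROVED` status are assumptions made for the example (the audit output elsewhere on this page only shows `PROPOSED`, `AUTO_APPROVED`, and `EXECUTED`).

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Event:
    seq: int
    type: str
    actor: str
    status: str
    correlation_id: str


class EventLog:
    """Hypothetical in-memory stand-in for the shared append-only event store."""

    def __init__(self):
        self._events = []
        self._seq = count(1)

    def append(self, type, actor, status, correlation_id):
        event = Event(next(self._seq), type, actor, status, correlation_id)
        self._events.append(event)  # events are only ever appended, never changed
        return event

    def read(self, status, after_seq=0):
        # Each service asks only for the events it cares about.
        return [e for e in self._events if e.status == status and e.seq > after_seq]

    def history(self, correlation_id):
        return [e for e in self._events if e.correlation_id == correlation_id]


def book_e_propose(log, correlation_id):
    log.append("ReceiptAttachmentProposed", "book-e", "PROPOSED", correlation_id)


def review_e_step(log, cursor):
    """Approve every proposal newer than the cursor; return the new cursor."""
    for e in log.read("PROPOSED", after_seq=cursor):
        log.append("ReceiptAttachmentApproved", "review-e", "APPROVED", e.correlation_id)
        cursor = e.seq
    return cursor


def accounting_api_step(log, cursor):
    """Execute every approval newer than the cursor; return the new cursor."""
    for e in log.read("APPROVED", after_seq=cursor):
        log.append("ReceiptAttachmentExecuted", "system", "EXECUTED", e.correlation_id)
        cursor = e.seq
    return cursor


log = EventLog()
book_e_propose(log, "abc-123")
review_cursor = review_e_step(log, 0)
exec_cursor = accounting_api_step(log, 0)
trail = [e.type for e in log.history("abc-123")]
```

No function here calls another service. Each one reads from the log and appends to it, which is why any of them can be restarted independently.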

Pattern 2: Query (Direct HTTP)

Read-only queries bypass the event store. No event needed for reading data.

Book-E → GET /folio/balance → Accounting API → Folio API → response

Why direct HTTP for reads? No side effects. Latency matters. No review needed.

Pattern 3: Cron Check → Notification

Proactive checks run daily. The CronJob calls the Accounting API, which queries Folio/Fiken and returns the results. Book-E posts a summary to Discord.

Event Store as Integration Layer

1. Communication Bus

Services write and read events. No direct service-to-service calls for writes.

2. Audit Log

Every action is recorded with: who, what, when, why, and the result.

SELECT type, actor, status, created_at
FROM events
WHERE correlation_id = 'abc-123'
ORDER BY created_at;

-- ReceiptAttachmentProposed  | book-e      | PROPOSED      | 10:00:00
-- ReceiptAttachmentApproved  | system:auto | AUTO_APPROVED | 10:00:01
-- ReceiptAttachmentExecuted  | system      | EXECUTED      | 10:00:02
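As a rough sketch, the same audit query can be run from application code. The example below uses an in-memory SQLite database in place of Postgres so it runs without setup. The table layout and the `audit_trail` helper are assumptions, inferred only from the columns the query above uses.

```python
import sqlite3

# In-memory SQLite stands in for the Postgres event store (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE events (
        id             INTEGER PRIMARY KEY AUTOINCREMENT,
        type           TEXT NOT NULL,
        actor          TEXT NOT NULL,
        status         TEXT NOT NULL,
        correlation_id TEXT NOT NULL,
        created_at     TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO events (type, actor, status, correlation_id, created_at) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("ReceiptAttachmentProposed", "book-e", "PROPOSED", "abc-123", "10:00:00"),
        ("ReceiptAttachmentApproved", "system:auto", "AUTO_APPROVED", "abc-123", "10:00:01"),
        ("ReceiptAttachmentExecuted", "system", "EXECUTED", "abc-123", "10:00:02"),
        ("ReceiptAttachmentProposed", "book-e", "PROPOSED", "def-456", "10:05:00"),
    ],
)


def audit_trail(conn, correlation_id):
    """Return every event for one correlation id, oldest first."""
    sql = (
        "SELECT type, actor, status, created_at "
        "FROM events WHERE correlation_id = ? ORDER BY created_at"
    )
    # The correlation id is passed as a bound parameter, never interpolated.
    return conn.execute(sql, (correlation_id,)).fetchall()


for row in audit_trail(conn, "abc-123"):
    print(" | ".join(row))
```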

3. State Store

Current state is derived from events, not stored separately.

-- What's pending review?
SELECT * FROM events WHERE status = 'PROPOSED';

-- What has Book-E done in the last 24 hours?
SELECT * FROM events WHERE actor = 'book-e' AND created_at > NOW() - INTERVAL '1 day';

-- Approval rate
SELECT
    COUNT(*) FILTER (WHERE type LIKE '%Approved') as approved,
    COUNT(*) FILTER (WHERE type LIKE '%Rejected') as rejected
FROM events;
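If events are strictly append-only and a proposal row keeps its `PROPOSED` status after a decision is recorded, "pending" has to be derived as "proposed, with no later approval or rejection for the same correlation id". The sketch below shows one way to do that under that assumption. If the store updates `status` in place instead, the simpler status filter is enough. SQLite stands in for Postgres, and the schema, the `InvoiceRegistrationProposed` type, and the helper names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE events (
        id             INTEGER PRIMARY KEY AUTOINCREMENT,
        type           TEXT NOT NULL,
        actor          TEXT NOT NULL,
        correlation_id TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO events (type, actor, correlation_id) VALUES (?, ?, ?)",
    [
        ("ReceiptAttachmentProposed", "book-e", "abc-123"),
        ("ReceiptAttachmentApproved", "system:auto", "abc-123"),
        ("InvoiceRegistrationProposed", "book-e", "def-456"),  # still undecided
        ("ReceiptAttachmentProposed", "book-e", "ghi-789"),
        ("ReceiptAttachmentRejected", "review-e", "ghi-789"),
    ],
)

# A proposal is pending while no decision event shares its correlation id.
PENDING_SQL = """
SELECT p.correlation_id
FROM events p
WHERE p.type LIKE '%Proposed'
  AND NOT EXISTS (
      SELECT 1 FROM events d
      WHERE d.correlation_id = p.correlation_id
        AND (d.type LIKE '%Approved' OR d.type LIKE '%Rejected')
  )
ORDER BY p.id
"""


def pending_review(conn):
    """Correlation ids of proposals that have not been decided yet."""
    return [row[0] for row in conn.execute(PENDING_SQL)]


def decision_counts(conn):
    """Return (approved, rejected) counts derived from the log."""
    return conn.execute(
        """
        SELECT
            COALESCE(SUM(CASE WHEN type LIKE '%Approved' THEN 1 ELSE 0 END), 0),
            COALESCE(SUM(CASE WHEN type LIKE '%Rejected' THEN 1 ELSE 0 END), 0)
        FROM events
        """
    ).fetchone()
```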

Adding a New Service

  1. Define what events it reads — subscribe to event types
  2. Define what events it writes — new event types if needed
  3. No changes to existing services — they don't know the new service exists

Adding a Cost API

Reads: ReceiptAttachmentExecuted, InvoiceRegistrationExecuted

Writes: MonthlyCostReport

Existing services don't change at all.
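A minimal sketch of what such a Cost API consumer might look like. Only the three event type names come from this page. The event shape (a dict with `type`, `created_at`, and a `payload` holding an integer `amount` in minor currency units) and the `cost-api` actor name are assumptions made for illustration.

```python
from collections import defaultdict

# The two event types the Cost API subscribes to (from the description above).
SUBSCRIBED_TYPES = {"ReceiptAttachmentExecuted", "InvoiceRegistrationExecuted"}


def build_monthly_cost_reports(events):
    """Fold executed events into one MonthlyCostReport event per month."""
    totals = defaultdict(int)
    for event in events:
        if event["type"] not in SUBSCRIBED_TYPES:
            continue  # ignore everything the Cost API did not subscribe to
        month = event["created_at"][:7]  # "YYYY-MM" prefix of an ISO timestamp
        totals[month] += event["payload"]["amount"]  # hypothetical field
    return [
        {
            "type": "MonthlyCostReport",
            "actor": "cost-api",  # hypothetical actor name
            "payload": {"month": month, "total": total},
        }
        for month, total in sorted(totals.items())
    ]


sample_events = [
    {"type": "ReceiptAttachmentExecuted", "created_at": "2024-01-10T10:00:02",
     "payload": {"amount": 12500}},
    {"type": "ReceiptAttachmentProposed", "created_at": "2024-01-11T09:00:00",
     "payload": {"amount": 99900}},
    {"type": "InvoiceRegistrationExecuted", "created_at": "2024-01-20T14:30:00",
     "payload": {"amount": 40000}},
    {"type": "ReceiptAttachmentExecuted", "created_at": "2024-02-01T08:15:00",
     "payload": {"amount": 7000}},
]

reports = build_monthly_cost_reports(sample_events)
```

The consumer only reads existing events and appends its own, so Book-E, Review-E, and the Accounting API need no changes.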

Sync vs Async Decision Matrix

Scenario                           | Pattern       | Why
User asks for balance              | Sync (HTTP)   | User is waiting, read-only
Book-E proposes receipt attachment | Async (event) | Goes through review
Review-E approves proposal         | Async (event) | Triggers execution asynchronously
CronJob checks missing receipts    | Sync (HTTP)   | Reads data, writes event
New service reacts to approvals    | Async (event) | Subscribe to events
Emergency: stop pending actions    | Sync (HTTP)   | POST /review/{id}/reject

Failure Handling

Failure                 | What Happens
Book-E pod restarts     | Events already in Postgres. Nothing lost.
Accounting API is down  | Events queue up. Executed when API is back.
Review-E is slow        | Proposals stay as PROPOSED. No timeout pressure.
Folio API returns error | Execution event records the error. Retry later.
Postgres is down        | Everything stops. Mitigated by k8s StatefulSet + PVC.
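The "record the error, retry later" behaviour can be sketched as an executor pass that is safe to run repeatedly. This is an illustration under assumptions, not the actual executor: the `FAILED` status, the `FolioError` exception, the dict-shaped events, and the `call_folio` callable are all hypothetical.

```python
class FolioError(Exception):
    """Hypothetical error raised when the Folio API call fails."""


def execute_pending(log, call_folio):
    """Run every approved action that has no EXECUTED event yet.

    Failures are appended to the log as events, so the next pass retries them.
    """
    executed = {e["correlation_id"] for e in log if e["status"] == "EXECUTED"}
    todo = [
        e for e in log
        if e["status"] in ("APPROVED", "AUTO_APPROVED")
        and e["correlation_id"] not in executed
    ]
    for approval in todo:
        cid = approval["correlation_id"]
        try:
            result = call_folio(cid)
        except FolioError as exc:
            # Record the error; the approval stays unexecuted and is retried later.
            log.append({"correlation_id": cid, "status": "FAILED", "error": str(exc)})
        else:
            log.append({"correlation_id": cid, "status": "EXECUTED", "result": result})


# Simulated Folio client that fails on the first call and succeeds afterwards.
attempts = {"n": 0}


def flaky_folio(correlation_id):
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise FolioError("503 from Folio")
    return "ok"


log = [
    {"correlation_id": "abc-123", "status": "PROPOSED"},
    {"correlation_id": "abc-123", "status": "AUTO_APPROVED"},
]

execute_pending(log, flaky_folio)   # first pass: error is recorded
status_after_first = log[-1]["status"]
execute_pending(log, flaky_folio)   # second pass: retry succeeds
status_after_second = log[-1]["status"]
execute_pending(log, flaky_folio)   # third pass: nothing left to do
```

Because the executor derives its work list from the log on every pass, downtime of the Accounting API only delays execution. Approved events wait in the store until the next pass picks them up.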