Inter-Service Communication
How services talk to each other. The event store is the central nervous system.
The Pattern: Log-Based Architecture
For writes, services don't call each other directly. They communicate through a shared, append-only event log; only read-only queries use direct HTTP.
Each service has a clear role:
| Service | Reads | Writes | Role |
|---|---|---|---|
| Book-E | — | Proposals | Proposes actions based on user input |
| Accounting API | Approved events | Execution results | Executes approved actions against Folio/Fiken |
| Review-E | Proposals | Approvals/Rejections | Reviews proposals, approves or rejects |
| CronJob | — | Check results | Triggers periodic compliance checks |
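The roles in the table can be illustrated with a minimal in-memory sketch of an append-only log. This is not the actual store, which lives in Postgres. The `EventLog` class and its `append`/`read` methods are hypothetical names for illustration; the field names mirror the columns used in the SQL examples in this document.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from itertools import count


@dataclass(frozen=True)
class Event:
    """One immutable row in the log (fields mirror the SQL examples)."""
    id: int
    type: str
    actor: str
    status: str
    correlation_id: str
    created_at: datetime


class EventLog:
    """In-memory stand-in for the Postgres-backed event store (hypothetical API)."""

    def __init__(self) -> None:
        self._events: list[Event] = []
        self._ids = count(1)

    def append(self, type: str, actor: str, status: str, correlation_id: str) -> Event:
        event = Event(next(self._ids), type, actor, status, correlation_id,
                      datetime.now(timezone.utc))
        self._events.append(event)  # append only: there is no update or delete
        return event

    def read(self, after_id: int = 0, suffix: str = "") -> list[Event]:
        """Return events newer than `after_id` whose type ends with `suffix`."""
        return [e for e in self._events if e.id > after_id and e.type.endswith(suffix)]


log = EventLog()
log.append("ReceiptAttachmentProposed", "book-e", "PROPOSED", "abc-123")
log.append("ReceiptAttachmentApproved", "system:auto", "AUTO_APPROVED", "abc-123")

proposals = log.read(suffix="Proposed")  # what Review-E would read
approved = log.read(suffix="Approved")   # what the Accounting API would read
```

Each reader filters for the event types it cares about and can track the last `id` it has seen, so writers never need to know who is reading.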
Why Not Direct HTTP Calls?
Direct coupling
Book-E → Review-E → Accounting API → Folio
- If Review-E is down, Book-E fails
- If Accounting API is slow, everything is slow
- Adding a new service means changing existing services
- No audit trail of what happened
Event-driven
Book-E → Event Store ← Review-E / Accounting API
- Services are independent — restart one without affecting others
- Natural audit trail (the event log IS the history)
- Add new services by subscribing to events
- Retry failed operations by replaying events
Communication Patterns
Pattern 1: Proposal → Review → Execute
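The primary pattern for write operations: Book-E writes a proposal, Review-E writes an approval or rejection, and the Accounting API writes the execution result.
Each step is a separate event. Each service reads only the events it cares about.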
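A minimal sketch of this flow, using a plain list in place of the event store. The function names, the `review-e` actor, and the `APPROVED`/`REJECTED` statuses are assumptions for illustration; the event type names and the `system` actor follow the audit example in this document.

```python
import uuid

events: list[dict] = []  # stand-in for the shared event log


def emit(type_: str, actor: str, status: str, correlation_id: str) -> None:
    events.append({"type": type_, "actor": actor, "status": status,
                   "correlation_id": correlation_id})


def handled(correlation_id: str, suffixes: tuple) -> bool:
    """True if a follow-up event already exists for this correlation."""
    return any(e["correlation_id"] == correlation_id and e["type"].endswith(suffixes)
               for e in events)


def book_e_propose(action: str) -> str:
    correlation_id = str(uuid.uuid4())
    emit(f"{action}Proposed", "book-e", "PROPOSED", correlation_id)
    return correlation_id


def review_e_step(approve) -> None:
    """Review-E reads only proposals and writes an approval or a rejection."""
    for e in list(events):
        cid = e["correlation_id"]
        if e["type"].endswith("Proposed") and not handled(cid, ("Approved", "Rejected")):
            action = e["type"].removesuffix("Proposed")
            if approve(e):
                emit(f"{action}Approved", "review-e", "APPROVED", cid)  # assumed status
            else:
                emit(f"{action}Rejected", "review-e", "REJECTED", cid)  # assumed status


def accounting_api_step(execute) -> None:
    """The Accounting API reads only approvals and writes execution results."""
    for e in list(events):
        cid = e["correlation_id"]
        if e["type"].endswith("Approved") and not handled(cid, ("Executed",)):
            execute(e)
            emit(e["type"].removesuffix("Approved") + "Executed", "system", "EXECUTED", cid)


cid = book_e_propose("ReceiptAttachment")
review_e_step(approve=lambda event: True)
accounting_api_step(execute=lambda event: None)
trail = [e["type"] for e in events if e["correlation_id"] == cid]
```

Because each step checks whether a follow-up event already exists, running a step twice does not produce duplicate approvals or executions.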
Pattern 2: Query (Direct HTTP)
Read-only queries bypass the event store. No event needed for reading data.
Book-E → GET /folio/balance → Accounting API → Folio API → response
Why direct HTTP for reads? No side effects. Latency matters. No review needed.
Pattern 3: Cron Check → Notification
Daily proactive checks. The CronJob calls the Accounting API, which queries Folio/Fiken and returns the results. The check results are written to the event store, and Book-E posts a summary to Discord.
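The check can be sketched as one function with its three side effects injected as callables, which keeps it testable without network access. Everything named here is hypothetical: `run_daily_check`, the `MissingReceiptsChecked` event type, the `cronjob` actor, and the summary wording. In the real system the Discord post is made by Book-E, not by the CronJob itself.

```python
from typing import Callable


def run_daily_check(
    fetch_missing_receipts: Callable[[], list],
    write_event: Callable[[dict], None],
    post_to_discord: Callable[[str], None],
) -> str:
    # Synchronous read: stands in for the HTTP call to the Accounting API.
    missing = fetch_missing_receipts()

    # Record the check result in the event store (hypothetical event type).
    write_event({"type": "MissingReceiptsChecked", "actor": "cronjob", "missing": missing})

    if missing:
        summary = f"{len(missing)} transaction(s) are missing receipts: " + ", ".join(missing)
    else:
        summary = "All transactions have receipts."

    # Stands in for Book-E posting the summary to Discord.
    post_to_discord(summary)
    return summary


written, posted = [], []
summary = run_daily_check(lambda: ["tx-17", "tx-42"], written.append, posted.append)
```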
Event Store as Integration Layer
1. Communication Bus
Services write and read events. No direct service-to-service calls for writes.
2. Audit Log
Every action is recorded with: who, what, when, why, and the result.
SELECT type, actor, status, created_at
FROM events
WHERE correlation_id = 'abc-123'
ORDER BY created_at;
-- ReceiptAttachmentProposed | book-e | PROPOSED | 10:00:00
-- ReceiptAttachmentApproved | system:auto | AUTO_APPROVED | 10:00:01
-- ReceiptAttachmentExecuted | system | EXECUTED | 10:00:02
3. State Store
Current state is derived from events, not stored separately.
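-- What's pending review? Proposals with no later event for the same correlation_id
-- (rows in an append-only log are never updated, so status alone is not enough)
SELECT p.*
FROM events p
WHERE p.status = 'PROPOSED'
  AND NOT EXISTS (
    SELECT 1 FROM events e
    WHERE e.correlation_id = p.correlation_id
      AND e.created_at > p.created_at
  );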
-- What has Book-E done in the last 24 hours?
SELECT * FROM events WHERE actor = 'book-e' AND created_at > NOW() - INTERVAL '1 day';
-- Approved vs. rejected counts (the inputs to an approval rate)
SELECT
  COUNT(*) FILTER (WHERE type LIKE '%Approved') AS approved,
  COUNT(*) FILTER (WHERE type LIKE '%Rejected') AS rejected
FROM events;
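The idea of deriving state from events can also be sketched outside SQL as a fold over the log: the current status of a correlation is the status of its most recent event. This is an illustrative sketch, not the project's code. The `def-456` correlation and the `InvoiceRegistrationProposed` type are made up for the example.

```python
def current_status(events: list) -> dict:
    """Fold the log into {correlation_id: latest status}. Events must be in log order."""
    state = {}
    for e in events:
        state[e["correlation_id"]] = e["status"]  # later events overwrite earlier ones
    return state


def pending_review(events: list) -> list:
    """Correlations whose latest event is still a proposal."""
    return [cid for cid, status in current_status(events).items() if status == "PROPOSED"]


log = [
    {"correlation_id": "abc-123", "type": "ReceiptAttachmentProposed", "status": "PROPOSED"},
    {"correlation_id": "abc-123", "type": "ReceiptAttachmentApproved", "status": "AUTO_APPROVED"},
    {"correlation_id": "abc-123", "type": "ReceiptAttachmentExecuted", "status": "EXECUTED"},
    {"correlation_id": "def-456", "type": "InvoiceRegistrationProposed", "status": "PROPOSED"},
]
state = current_status(log)
```

Nothing is stored besides the events themselves; replaying the same log always yields the same state.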
Adding a New Service
- Define what events it reads — subscribe to event types
- Define what events it writes — new event types if needed
- No changes to existing services — they don't know the new service exists
Adding a Cost API
Reads: ReceiptAttachmentExecuted, InvoiceRegistrationExecuted
Writes: MonthlyCostReport
Existing services don't change at all.
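A rough sketch of what such a Cost API could look like as a pure function over the log. The two subscribed event types and the `MonthlyCostReport` type come from the example above; the `amount` field, the ISO-formatted `created_at` string, and the `cost-api` actor are assumptions, since the event payload is not specified here.

```python
from collections import defaultdict
from decimal import Decimal

SUBSCRIBED = {"ReceiptAttachmentExecuted", "InvoiceRegistrationExecuted"}


def build_monthly_cost_reports(events: list) -> list:
    """Sum the (assumed) amount of subscribed events per calendar month."""
    totals = defaultdict(Decimal)  # Decimal() starts at 0
    for e in events:
        if e["type"] in SUBSCRIBED:
            month = e["created_at"][:7]  # "2024-05-03T10:00:02" -> "2024-05"
            totals[month] += Decimal(e["amount"])
    return [
        {"type": "MonthlyCostReport", "actor": "cost-api", "month": month, "total": str(total)}
        for month, total in sorted(totals.items())
    ]


log = [
    {"type": "ReceiptAttachmentExecuted", "created_at": "2024-05-03T10:00:02", "amount": "120.50"},
    {"type": "InvoiceRegistrationExecuted", "created_at": "2024-05-20T09:30:00", "amount": "79.50"},
    {"type": "ReceiptAttachmentProposed", "created_at": "2024-05-21T08:00:00", "amount": "999.00"},
]
reports = build_monthly_cost_reports(log)
```

The new service only reads existing event types and appends its own, so Book-E, Review-E, and the Accounting API need no changes.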
Sync vs Async Decision Matrix
| Scenario | Pattern | Why |
|---|---|---|
| User asks for balance | Sync (HTTP) | User is waiting, read-only |
| Book-E proposes receipt attachment | Async (event) | Goes through review |
| Review-E approves proposal | Async (event) | Triggers execution asynchronously |
| CronJob checks missing receipts | Sync (HTTP) | Reads data, writes event |
| New service reacts to approvals | Async (event) | Subscribe to events |
| Emergency: stop pending actions | Sync (HTTP) | Must take effect immediately (POST /review/{id}/reject) |
Failure Handling
| Failure | What Happens |
|---|---|
| Book-E pod restarts | Events already in Postgres. Nothing lost. |
| Accounting API is down | Events queue up. Executed when API is back. |
| Review-E is slow | Proposals stay as PROPOSED. No timeout pressure. |
| Folio API returns error | Execution event records the error. Retry later. |
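| Postgres is down | All event-driven flows stop, since every write goes through the event store. A k8s StatefulSet with a PVC preserves the data across pod restarts but does not by itself provide failover. |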
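The "record the error, retry later" row can be sketched as an executor that is safe to run repeatedly: it skips correlations that already have a successful execution and appends a new event for every attempt. The `FAILED` status, the `ExecutionFailed` type suffix, and the `error` field are assumptions for illustration.

```python
def execute_approved(events: list, call_folio) -> None:
    """Run every approved action that has no successful execution yet."""
    done = {e["correlation_id"] for e in events if e["status"] == "EXECUTED"}
    for e in list(events):
        cid = e["correlation_id"]
        if not e["type"].endswith("Approved") or cid in done:
            continue
        action = e["type"].removesuffix("Approved")
        try:
            call_folio(e)
        except Exception as exc:
            # Record the failure as an event; a later run will retry (assumed names).
            events.append({"type": action + "ExecutionFailed", "actor": "system",
                           "status": "FAILED", "correlation_id": cid, "error": str(exc)})
        else:
            events.append({"type": action + "Executed", "actor": "system",
                           "status": "EXECUTED", "correlation_id": cid, "error": None})
            done.add(cid)


attempts = {"n": 0}


def flaky_folio(event) -> None:
    """Fails on the first call only, to simulate a temporary Folio outage."""
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise RuntimeError("Folio API returned 503")


log = [{"type": "ReceiptAttachmentApproved", "actor": "system:auto",
        "status": "AUTO_APPROVED", "correlation_id": "abc-123"}]
execute_approved(log, flaky_folio)  # first run: the failure is recorded
execute_approved(log, flaky_folio)  # later run: the retry succeeds
execute_approved(log, flaky_folio)  # no-op: already executed
statuses = [e["status"] for e in log]
```

The same loop covers the "Accounting API is down" row: approved events simply wait in the log until the executor runs again.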