This content originally appeared on DEV Community and was authored by Narednra Reddy Yadama
Everyone says: "WhatsApp runs on Erlang."
True. For the core.
But let's flip it.
If you had to build a WhatsApp-class messaging system on Java today, how would you do it?
Here's the playbook I'd reach for.
The path to sub-100ms chat
Transport & fan-out
• Netty for long-lived TCP/WebSocket/MQTT connections. Zero-copy, epoll/kqueue.
• Protobuf frames. Tiny. Predictable.
• gRPC for service-to-service RPC with deadlines + retries.
• Aeron (optional) for ultra-low-latency one-way streams.
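The framing idea behind the Netty and Protobuf bullets can be sketched with the JDK alone. This is a minimal sketch, assuming a 4-byte big-endian length prefix in front of each payload; the wire format, the `FrameCodec` name, and the 64 KiB cap are illustrative assumptions, not anything WhatsApp or Netty prescribes. In a Netty pipeline the same job is usually handed to a length-field decoder such as `LengthFieldBasedFrameDecoder`.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical length-prefixed frame codec: [4-byte big-endian length][payload]. */
final class FrameCodec {
    static final int MAX_FRAME_BYTES = 64 * 1024; // assumed cap so a bad prefix cannot exhaust memory

    /** Wraps one payload (for example a serialized Protobuf message) in a frame. */
    static byte[] encode(byte[] payload) {
        if (payload.length > MAX_FRAME_BYTES) {
            throw new IllegalArgumentException("frame too large: " + payload.length);
        }
        return ByteBuffer.allocate(4 + payload.length)
                .putInt(payload.length)
                .put(payload)
                .array();
    }

    /**
     * Drains every complete frame from a buffer that is in read mode.
     * A trailing partial frame is left unread so the caller can append more bytes later.
     */
    static List<byte[]> decode(ByteBuffer in) {
        List<byte[]> frames = new ArrayList<>();
        while (in.remaining() >= 4) {
            in.mark();
            int length = in.getInt();
            if (length < 0 || length > MAX_FRAME_BYTES) {
                throw new IllegalStateException("bad frame length: " + length);
            }
            if (in.remaining() < length) {
                in.reset(); // partial frame: rewind to the length prefix and wait for more data
                break;
            }
            byte[] payload = new byte[length];
            in.get(payload);
            frames.add(payload);
        }
        return frames;
    }
}
```

The decoder never assumes that one socket read equals one message, which is the property that matters on long-lived TCP connections.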
Broker & persistence
• Kafka for durable event pipelines (chat events, receipts, presence).
• Kafka Streams / Flink for real-time fan-out & stateful ops.
• RocksDB as local state store for fast lookups.
• Cassandra (or Scylla) for message history & wide-row access patterns.
• Redis for hot presence, routing tables, rate limits.
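To make the "wide-row access pattern" for message history concrete, here is a minimal in-memory sketch. It assumes the conversation id acts as the partition key and a per-conversation sequence number acts as the clustering key; `TimelineStore` and its method names are hypothetical, and a sorted map stands in for what Cassandra or Scylla would keep on disk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;

/** In-memory stand-in for a wide row: partition key = conversation, clustering key = sequence. */
final class TimelineStore {
    private final Map<String, ConcurrentSkipListMap<Long, String>> partitions = new ConcurrentHashMap<>();

    /** Append-style write; writing the same (conversation, sequence) twice just overwrites. */
    void append(String conversationId, long sequence, String body) {
        partitions.computeIfAbsent(conversationId, id -> new ConcurrentSkipListMap<>())
                  .put(sequence, body);
    }

    /** Newest-first page of at most `limit` messages with sequence below `beforeSequence`. */
    List<String> latest(String conversationId, long beforeSequence, int limit) {
        List<String> page = new ArrayList<>();
        ConcurrentSkipListMap<Long, String> row = partitions.get(conversationId);
        if (row == null) {
            return page;
        }
        for (String body : row.headMap(beforeSequence, false).descendingMap().values()) {
            if (page.size() == limit) {
                break;
            }
            page.add(body);
        }
        return page;
    }
}
```

Every read touches a single partition and walks it in clustering order, which is why "load the last N messages of this chat" stays cheap as history grows.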
Gateways & services
• Micronaut / Quarkus for tiny memory footprints and fast cold starts.
• Vert.x for reactive, back-pressure aware services.
• CQRS split: write path optimized for append, read path for conversation timelines.
• Idempotency keys everywhere (message resends happen).
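The idempotency-key bullet can be illustrated with a small sketch. It assumes the client attaches a unique key to each logical message and reuses it on every resend; `IdempotentSender` and `persist` are hypothetical names, and the in-memory map stands in for a durable store with expiry.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical write-path guard: at most one stored message per client-supplied key. */
final class IdempotentSender {
    private final Map<String, Long> assignedIds = new ConcurrentHashMap<>();
    private final AtomicLong nextServerId = new AtomicLong(1);

    /** Returns the server message id; a resend with the same key gets the original id back. */
    long send(String idempotencyKey, String body) {
        // computeIfAbsent runs the mapping function at most once per key, even under contention.
        return assignedIds.computeIfAbsent(idempotencyKey, key -> persist(body));
    }

    /** Stand-in for the real append (for example a log write). */
    private long persist(String body) {
        return nextServerId.getAndIncrement();
    }
}
```

A retry after a timeout then becomes a lookup instead of a second write, so the client can resend freely.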
Delivery semantics
• At-least-once on the wire, exactly-once UX via de-dup DB keys.
• Ack/receipt model: sent → delivered → read.
• Store-and-forward for offline clients; resend on reconnect.
• Sticky routing to keep a user on the same edge node when possible.
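The receipt model and store-and-forward behaviour above can be sketched together. This is a simplified illustration under stated assumptions: one pending queue per recipient, receipts that only move forward, and a `Mailbox` class whose names are hypothetical. A real system would persist both maps.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical per-recipient mailbox with forward-only receipts. */
final class Mailbox {
    enum Receipt { SENT, DELIVERED, READ } // declaration order doubles as progress order

    private final Map<String, Deque<String>> pending = new HashMap<>();
    private final Map<String, Receipt> receipts = new HashMap<>();

    /** Stores a message for a recipient; a duplicate message id is ignored. */
    synchronized void accept(String recipient, String messageId) {
        if (receipts.putIfAbsent(messageId, Receipt.SENT) == null) {
            pending.computeIfAbsent(recipient, r -> new ArrayDeque<>()).addLast(messageId);
        }
    }

    /** On reconnect everything unacknowledged is sent again, so the client must de-duplicate. */
    synchronized List<String> onReconnect(String recipient) {
        Deque<String> queue = pending.get(recipient);
        return queue == null ? new ArrayList<>() : new ArrayList<>(queue);
    }

    /** Applies DELIVERED or READ; a late DELIVERED never downgrades READ. */
    synchronized void ack(String recipient, String messageId, Receipt receipt) {
        if (receipt == Receipt.SENT || !receipts.containsKey(messageId)) {
            return;
        }
        receipts.merge(messageId, receipt, (old, now) -> now.compareTo(old) > 0 ? now : old);
        Deque<String> queue = pending.get(recipient);
        if (queue != null) {
            queue.remove(messageId);
        }
    }

    synchronized Receipt receiptOf(String messageId) {
        return receipts.get(messageId);
    }
}
```

Because acks can arrive late or out of order on an at-least-once wire, taking the maximum of the old and new state keeps the receipt monotonic.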
Latency budget (single region)
• Serialization + network hop: ~2–5ms
• Gateway enqueue: ~1–3ms
• Broker to fan-out: ~5–15ms
• Fan-out to recipient gateway: ~5–15ms
• Device push/write: ~10–30ms
Target: <60–80ms P50, <150ms P95 end-to-end.
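As a quick check on the arithmetic, the per-hop ranges can simply be added up. The sketch below uses only the estimates quoted in the budget (they are the article's figures, not measurements); `LatencyBudget` is an illustrative name.

```java
/** Sums the per-hop estimates from the single-region budget. */
final class LatencyBudget {
    // {low, high} in milliseconds for each hop.
    static final int[][] HOPS_MS = {
        {2, 5},   // serialization + network hop
        {1, 3},   // gateway enqueue
        {5, 15},  // broker to fan-out
        {5, 15},  // fan-out to recipient gateway
        {10, 30}, // device push/write
    };

    /** column 0 sums the low ends, column 1 sums the high ends. */
    static int total(int column) {
        int sum = 0;
        for (int[] hop : HOPS_MS) {
            sum += hop[column];
        }
        return sum;
    }
}
```

The hops sum to roughly 23 ms at the low end and 68 ms at the high end, so the upper end already brushes the P50 target and leaves little slack for queueing.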
JVM tuning (the unsexy part that wins)
• ZGC/G1 with small regions; avoid massive heaps.
• Thread pinning for IO vs compute; prefer virtual threads for RPC fan-out.
• Off-heap buffers (Netty/Aeron) + pooled allocators.
• NUMA awareness on big boxes; pin brokers separately.
• SLOs + load-shedding when queues grow (don't let the tail kill you).
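Load-shedding is the easiest of these to show in code. A minimal sketch, assuming a fixed-capacity queue in front of a worker and a counter that a metrics system would scrape; `SheddingQueue` is a hypothetical name, and the capacity would in practice be derived from the latency SLO.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical admission gate: rejects work when full instead of letting the queue grow. */
final class SheddingQueue<T> {
    private final BlockingQueue<T> queue;
    private final AtomicLong shed = new AtomicLong();

    SheddingQueue(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Non-blocking enqueue; returns false and counts the drop when the queue is full. */
    boolean offer(T item) {
        boolean accepted = queue.offer(item);
        if (!accepted) {
            shed.incrementAndGet();
        }
        return accepted;
    }

    /** Returns the next item, or null when the queue is empty. */
    T poll() {
        return queue.poll();
    }

    long shedCount() {
        return shed.get();
    }
}
```

Rejecting early keeps queueing delay bounded for the requests that are accepted; the caller can answer a rejected request with a retry hint rather than let it time out.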
Reliability
• Quorum/Raft for metadata (topic maps, routing).
Privacy & safety
• E2E encryption at the edge; servers carry opaque blobs + metadata envelope.
• Abuse detection runs on metadata & user-reported content only.
• Key transparency to prevent MITM on device keys.
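The "opaque blob + metadata envelope" split can be sketched with the JDK's AES-GCM support. This is a deliberately simplified illustration: it assumes both devices already share a symmetric key, which a real end-to-end protocol would derive through a key agreement and ratchet that are out of scope here. `SealedMessage` and its fields are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

/** Hypothetical envelope: routable metadata in the clear, message body as ciphertext. */
final class SealedMessage {
    final String recipientId; // metadata the server may route on
    final byte[] nonce;       // 12-byte GCM nonce, fresh for every message
    final byte[] ciphertext;  // opaque to the server (includes the 16-byte auth tag)

    private SealedMessage(String recipientId, byte[] nonce, byte[] ciphertext) {
        this.recipientId = recipientId;
        this.nonce = nonce;
        this.ciphertext = ciphertext;
    }

    /** Runs on the sending device. */
    static SealedMessage seal(SecretKey key, String recipientId, String plaintext) {
        try {
            byte[] nonce = new byte[12];
            new SecureRandom().nextBytes(nonce);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, nonce));
            // Authenticating the recipient id binds the envelope metadata to the body.
            cipher.updateAAD(recipientId.getBytes(StandardCharsets.UTF_8));
            byte[] ct = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            return new SealedMessage(recipientId, nonce, ct);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("seal failed", e);
        }
    }

    /** Runs on the receiving device; fails if the body or the recipient id was altered. */
    String open(SecretKey key) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, nonce));
            cipher.updateAAD(recipientId.getBytes(StandardCharsets.UTF_8));
            return new String(cipher.doFinal(ciphertext), StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("open failed: message rejected", e);
        }
    }
}
```

The server only ever handles `recipientId`, `nonce`, and `ciphertext`, which is what limits server-side abuse detection to metadata and user reports.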
Why this matters
Java isn't "too slow." With the right stack, it's a latency weapon: mature tooling, world-class profilers, rock-solid GC, and libraries built for billions of sockets. The craft is in back-pressure, memory locality, and aggressive simplification.
Want me to drop a tiny Netty + WebSocket fan-out skeleton and a Kafka + Cassandra timeline demo? I can share a repo next.