This content originally appeared on DEV Community and was authored by jefferson otoni lima
I’m very excited to share more improvements implemented in the Quick Framework, developed by @jeffotoni. These updates focus on robustness, performance, and support for modern protocols.
Early mornings improving Quick
As we’re using Quick for an AI (Artificial Intelligence) communication project, several needs have emerged, leading to new implementations and improvements in Quick. Below, I’ll present some of the improvements made this week.
There’s nothing better than putting a framework to the test in practice, and it’s fascinating! I’m using it in AI projects, developing native servers to orchestrate flows with LLMs, creating custom connectors, and continuously building RAG solutions. It’s been a challenging yet very exciting experience.
Reflections from a curious developer
The path is arduous, long, but incredibly enjoyable. I’m driven by curiosity, and sometimes I can’t contain myself. I need to study again, revisit something that was right there with me all along, but only now makes sense, you know? It wasn’t the right moment before, perhaps, but it was there, so close and yet so distant. And this belated discovery intrigues me even more.
Despite years of practice and development, I often feel as if I’m seeing the world of technology through a crack, like someone peeking at the “whole” through a keyhole. Reality seems fragmented; we only access bits and pieces, flashes, never the complete whole, and that’s intriguing. And perhaps we’ll never truly see it.
That’s why I decided to create these posts, not only to show in practice what was done for the Quick framework, but also to share some sincere reflections from a developer constantly learning.
What was implemented?
1. Duplicate WriteHeader Protection
We implemented a custom wrapper (responseWriter) that prevents “superfluous response.WriteHeader” errors. Now, multiple calls to WriteHeader are handled silently, avoiding crashes in complex middleware chains.
2. Hijacker Support
The responseWriter now implements the http.Hijacker interface, allowing connection upgrades natively. This enables real-time bidirectional communication directly through Quick.
3. HTTP/2 Server Push
We added support for the http.Pusher interface with the c.Pusher() method, enabling HTTP/2 Server Push to improve performance by proactively sending resources to the client before they are even requested, reducing latency and round-trips.
Example:
q.Get("/", func(c *quick.Ctx) error {
	pusher, ok := c.Pusher()
	if ok {
		// Proactively push static assets over HTTP/2
		pusher.Push("/static/style.css", nil)
		pusher.Push("/static/app.js", nil)
	}
	return c.Status(200).SendString("<html>...</html>")
})
4. Simplified Server-Sent Events (SSE) for Streaming LLMs
We implemented the c.Flusher() method, which facilitates real-time data streaming, essential for modern applications with Large Language Models (LLMs). It allows you to progressively send tokens as they are generated, creating interactive experiences in both the browser and CLIs.
Real-world use cases:
Streaming responses from ChatGPT, Claude, and Gemini
Real-time deployment logs
Progressive dashboard updates
Server push notifications
Example 1:
q.Post("/ai/chat", func(c *quick.Ctx) error {
	c.Set("Content-Type", "text/event-stream")
	c.Set("Cache-Control", "no-cache")
	c.Set("Connection", "keep-alive")

	flusher, ok := c.Flusher()
	if !ok {
		return c.Status(500).SendString("Streaming not supported")
	}

	// Simulate streaming of tokens from the LLM
	tokens := []string{"Hello", " this", " is", " a", " streaming", " response", " from", " AI"}
	for _, token := range tokens {
		// Standard SSE format
		fmt.Fprintf(c.Response, "data: %s\n\n", token)
		flusher.Flush()                    // Sends the chunk to the client immediately
		time.Sleep(100 * time.Millisecond) // Simulates LLM latency
	}

	// Signals end of stream
	fmt.Fprintf(c.Response, "data: [DONE]\n\n")
	flusher.Flush()
	return nil
})
Example 2:
q.Get("/events", func(c *quick.Ctx) error {
	c.Set("Content-Type", "text/event-stream")
	c.Set("Cache-Control", "no-cache")

	flusher, ok := c.Flusher()
	if !ok {
		return c.Status(500).SendString("Streaming not supported")
	}

	for i := 0; i < 10; i++ {
		fmt.Fprintf(c.Response, "data: Message %d\n\n", i)
		flusher.Flush()
		time.Sleep(time.Second)
	}
	return nil
})
Client JavaScript (Browser):
// EventSource always issues GET requests, so it pairs with the GET /events route above
const eventSource = new EventSource('/events');

eventSource.onmessage = (event) => {
  if (event.data === '[DONE]') { // close if the server signals end of stream
    eventSource.close();
    return;
  }
  document.getElementById('response').innerText += event.data;
};
Client CLI (Go):
resp, err := http.Post("http://localhost:8080/ai/chat", "application/json", body)
if err != nil {
	log.Fatal(err)
}
defer resp.Body.Close()

reader := bufio.NewReader(resp.Body)
for {
	line, err := reader.ReadString('\n')
	if err != nil || strings.Contains(line, "[DONE]") {
		break
	}
	fmt.Print(strings.TrimPrefix(line, "data: "))
}
It works perfectly with HTTP/2 multiplexing, allowing multiple simultaneous streams on the same connection.
5. Pooling Optimization
The Reset() method has been optimized to reuse the existing wrapper instead of recreating it with each request. This reduces memory allocations in the hot path, improving throughput.
6. Memory Leak Prevention
We implemented full context cleanup in releaseCtx(), including Context and wroteHeader fields, ensuring that no residual state remains between requests.
Impact
Greater robustness in high-concurrency scenarios
Native HTTP/2 support
Server-Sent Events (SSE) made easy with c.Flusher()
Reduced allocations per request
Zero breaking changes and 100% backward compatibility
All changes maintain full compatibility with existing code.
SSE vs WebSocket
Aspect | SSE | WebSocket |
---|---|---|
Server CPU | Lower (plain HTTP) | Higher (frame handling) |
Server Memory | Lower | Higher |
Bandwidth | Efficient for one-way streams | Efficient for two-way traffic |
Latency | Low | Very low |
Implementation | Simple | More complex |
Debugging | Easy (plain text) | Harder (binary frames) |
Firewall/Proxy | Friendly (regular HTTP) | Sometimes blocked |
Bidirectional | No (server → client only) | Yes |
Protocol | HTTP/1.1 or HTTP/2 | WebSocket (RFC 6455) |
Parser | Plain text | Binary frames |
Handshake | Normal HTTP request | HTTP Upgrade |
Automatic Reconnect | Built-in (EventSource) | Manual |
Browser Support | Native (EventSource API) | Native (WebSocket API) |
Message Overhead | ~45 bytes | ~50+ bytes |
Ideal for | Notifications, feeds, logs | Chat, games, collaboration |
Scalability | Excellent with HTTP/2 multiplexing | Requires connection management |
CDN Compatible | Yes | Limited |
Backend Complexity | Low | Higher |
Performance Comparison – SSE Writing Methods
This benchmark was run to identify the most efficient method for writing SSE events to http.ResponseWriter. Tests were performed on an Apple M3 Max with Go 1.x, measuring nanoseconds per operation (ns/op), bytes allocated (B/op), and number of allocations (allocs/op).
Benchmark Results
The w.Write([]byte()) method performs best at 13.56 ns/op with zero allocations, approximately 4x faster than fmt.Fprint() and 9x faster than io.WriteString().
For large messages (>1 KB), it is recommended to use sync.Pool to reuse buffers, reducing allocations and pressure on the garbage collector.
Recommendations
Development/Debugging: Use fmt.Fprint() for simplicity.
Production (small messages): Use w.Write([]byte()) for maximum performance.
Production (large messages): Use sync.Pool with reused buffers.
High performance (>10,000 requests/s): Combine w.Write() with a buffer pool.
The full benchmark is available in /bench.
Method | Performance | Allocations | Complexity | Recommendation |
---|---|---|---|---|
fmt.Fprint() | Moderate | 3 allocations | Low | Development/debugging |
io.WriteString() | Slow | 1 allocation | Low | Not recommended (slowest tier) |
w.Write([]byte) | Fastest | 0 allocations | Low | Production, small messages |
strings.Builder | Slow | 1-2 allocations | Medium | Only when composing complex events |
Multiple Writes | Very fast | 0 allocations | Low | Production, fixed framing |
sync.Pool | Fast (best for large messages) | 0 allocations | Medium | Production, large messages |
Unsafe | Very fast | 0 allocations | High | Avoid; gains don't justify the risk |
Running the Benchmark
To reproduce the performance tests and validate the results on your machine, run:
go test -bench=. -benchtime=1s -benchmem ctx_bench_test.go
Benchmark Parameters
-bench=. – Run all benchmarks
-benchtime=1s – Run each benchmark for 1 second
-benchmem – Include memory allocation statistics
Benchmark Results
Test Environment:
OS: macOS (darwin)
Architecture: ARM64
CPU: Apple M3 Max
Go Version: 1.x
Method | ns/op | ops/sec | B/op | allocs/op |
---|---|---|---|---|
WriteBytes | 13.32 | 75.1M | 0 | 0 |
MultipleWrites | 20.68 | 48.4M | 0 | 0 |
Unsafe | 20.98 | 47.7M | 0 | 0 |
Pooled | 39.00 | 25.6M | 0 | 0 |
FmtFprint | 52.69 | 19.0M | 16 | 1 |
FmtFprintf | 62.61 | 16.0M | 16 | 1 |
IoWriteString | 111.6 | 9.0M | 1024 | 1 |
Optimized | 119.7 | 8.4M | 1024 | 1 |
StringsBuilder | 122.7 | 8.2M | 1032 | 2 |
Large messages (>1 KB):
Method | ns/op | Speedup | B/op | allocs/op |
---|---|---|---|---|
PooledLarge | 234.5 | Baseline | 0 | 0 |
WriteBytesLarge | 811.2 | 3.46x slower | 9472 | 1 |
Examples and Source Code
All examples can be accessed here -> Quick SSE
The benchmark source code is available here -> bench
Contributions
Quick is an open-source project in constant evolution. Feedback and contributions are always welcome!
GitHub: Quick
#golang #webframework #performance #opensource #go #quick