This content originally appeared on DEV Community and was authored by Simplr
Large Language Models (LLMs) are increasingly central to modern applications, powering features from content generation to complex reasoning. However, relying on a single provider (like OpenAI, Anthropic, Google) introduces risks: API downtime, rate limits, capacity issues, or transient errors can disrupt your service, degrade user experience, and impact business continuity.
How can we build more robust AI-powered features? While custom logic is an option, it adds complexity. A simpler, more elegant solution is `ai-fallback`.
Introducing `ai-fallback`: Simple, Automatic LLM Resilience
`ai-fallback` is a lightweight, zero-dependency npm package specifically designed to provide automatic fallback between different AI models. It integrates seamlessly with the popular Vercel AI SDK (the `ai` package).
The core idea is simple:
- You define an ordered list of AI models using the `ai` SDK's provider functions (e.g., `anthropic()`, `openai()`).
- You create a fallback model instance using `createFallback`.
- You use this fallback model instance directly with `ai` SDK functions like `generateText`, `streamText`, or `streamObject`.
- If your primary model fails, `ai-fallback` automatically retries the request with the next model in your list.
This significantly boosts your application’s resilience with minimal code changes.
How It Works: A Practical Example with the Vercel AI SDK
Integrating `ai-fallback` is straightforward, especially if you're already using the `ai` package.
```typescript
import { createFallback } from "ai-fallback";
import { anthropic } from "@ai-sdk/anthropic";
import { openai } from "@ai-sdk/openai";
import { generateText, streamText, streamObject } from "ai";
import { z } from "zod";

// 1. Create the fallback model instance
const model = createFallback({
  // Define models in preferred order using ai SDK functions
  models: [
    anthropic("claude-3-haiku-20240307"), // Try Claude 3 Haiku first
    openai("gpt-3.5-turbo"), // Fallback to GPT-3.5 Turbo
    // Add more models if needed
  ],

  // Optional: Log errors when a fallback occurs
  onError: (error, modelId) => {
    console.warn(`Error with model ${modelId}: ${error.message}. Attempting fallback.`);
  },

  // Optional: Automatically try switching back to the primary model
  // after a specified interval (e.g., 5 minutes) following an error.
  modelResetInterval: 5 * 60 * 1000, // 5 minutes in milliseconds

  // Optional: For streaming, decide if retrying should happen even
  // if some output was already sent. Set to true to restart generation
  // on the fallback model from scratch if an error occurs mid-stream.
  // retryAfterOutput: true,
});

// --- Usage Examples ---
// 2. Use the fallback 'model' directly with Vercel AI SDK functions

// Example 1: Generate Text
async function generate(prompt: string) {
  try {
    const { text } = await generateText({
      model: model, // Pass the fallback model instance
      system: "You are a helpful assistant.",
      prompt: prompt,
    });
    console.log("Generated Text:", text);
  } catch (error) {
    console.error("All AI fallbacks failed for generateText:", error);
  }
}

// Example 2: Stream Text
async function stream(prompt: string) {
  try {
    const { textStream } = await streamText({
      model: model, // Pass the fallback model instance
      system: "You are a helpful assistant.",
      prompt: prompt,
    });
    console.log("Streaming Text:");
    for await (const chunk of textStream) {
      process.stdout.write(chunk);
    }
    console.log(); // Newline after stream
  } catch (error) {
    console.error("All AI fallbacks failed for streamText:", error);
  }
}

// Example 3: Stream Structured Object (using Zod)
async function generateStructured() {
  try {
    const { partialObjectStream } = await streamObject({
      model: model, // Pass the fallback model instance
      system: "You are a helpful assistant.",
      prompt: "Generate a person object with name and age.",
      schema: z.object({
        name: z.string(),
        age: z.number(),
      }),
    });
    console.log("Streaming Object:");
    for await (const partialObject of partialObjectStream) {
      console.log(partialObject);
    }
  } catch (error) {
    console.error("All AI fallbacks failed for streamObject:", error);
  }
}

// --- Run Examples ---
generate("Explain the concept of idempotency in APIs.");
stream("Write a short story about a curious robot.");
generateStructured();
```
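If you enable the commented-out `retryAfterOutput: true` option above, a mid-stream failure restarts generation from scratch on the next model, which means the raw stream can contain output from the aborted attempt followed by the full regenerated text. Below is a minimal defensive sketch, assuming that flag is set: it buffers the stream and commits only the completed text, leaving any prefix-trimming to your application logic.

```typescript
// A minimal sketch, assuming `retryAfterOutput: true` is set on the model above.
// (Reuses `model` and `streamText` from the example above.)
async function streamWithBuffer(prompt: string) {
  const { textStream } = await streamText({
    model, // the fallback model instance
    system: "You are a helpful assistant.",
    prompt,
  });

  let buffered = "";
  for await (const chunk of textStream) {
    buffered += chunk; // accumulate instead of rendering each chunk immediately
  }

  // Commit only the completed text. How (or whether) you trim a duplicated
  // prefix left over from an aborted first attempt is application-specific.
  console.log("Final text:", buffered);
}
```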
Key Features for Production Reliability
- Seamless Integration: Works directly with the `ai` SDK's core functions.
- Automatic Switching: Handles errors and provider downtime transparently.
- Configurable Reset: The `modelResetInterval` option allows the system to automatically attempt switching back to your primary (often preferred or cheaper) model after a cooldown period, ensuring you don't stay on a potentially more expensive fallback longer than necessary.
- Streaming Resilience: The `retryAfterOutput` option provides control over mid-stream failures. Setting it to `true` ensures that if an error occurs after streaming has begun, the entire generation process restarts from scratch on the next available model, preventing incomplete or corrupted outputs. You'll need to handle potential duplicate content in your application logic if using this.
- Error Monitoring: The `onError` callback provides visibility into fallback events for logging and monitoring (see the sketch below).
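In production, the `onError` callback is a natural hook for structured logs or metrics. Here is a minimal sketch; `recordFallback` is a hypothetical helper standing in for whatever logger or metrics client you already use:

```typescript
import { createFallback } from "ai-fallback";
import { anthropic } from "@ai-sdk/anthropic";
import { openai } from "@ai-sdk/openai";

// Hypothetical helper: replace with your own logger/metrics client.
function recordFallback(modelId: string, error: Error) {
  console.warn(
    JSON.stringify({
      event: "ai_fallback",
      modelId,
      error: error.message,
      timestamp: new Date().toISOString(),
    })
  );
}

const monitoredModel = createFallback({
  models: [
    anthropic("claude-3-haiku-20240307"),
    openai("gpt-3.5-turbo"),
  ],
  // Called each time a model fails and the next one in the list is tried.
  onError: (error, modelId) => recordFallback(modelId, error),
  // Retry the primary model again after a 5-minute cooldown.
  modelResetInterval: 5 * 60 * 1000,
});
```

Tracking failures per `modelId` makes it easy to alert when a provider is failing consistently rather than transiently.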
Why This Matters for Production Applications
- Enhanced Reliability: Directly mitigates the risk of single-provider issues.
- Improved User Experience: Shields users from backend failures, providing smoother interactions.
- Simplified Operations: Reduces the need for complex, custom error handling for provider switching.
- Increased Confidence: Deploy AI features knowing you have a robust fallback mechanism.
Choosing Your Fallback Strategy
Order your `models` array based on:
- Capability/Performance: Start with the best model for the task.
- Cost: Fall back to cheaper alternatives.
- Speed: Prioritize faster models if latency is key.
- Feature Compatibility: Ensure fallbacks support necessary features (e.g., function calling, or the specific schemas you pass to `streamObject`). A capability-first ordering is sketched below.
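As a concrete illustration, here is a minimal sketch of a capability-first chain; the specific model choices are assumptions, so substitute whichever models fit your task, budget, and latency targets:

```typescript
import { createFallback } from "ai-fallback";
import { anthropic } from "@ai-sdk/anthropic";
import { openai } from "@ai-sdk/openai";

// Ordered by capability first, with cheaper/faster models as fallbacks.
const strategyModel = createFallback({
  models: [
    anthropic("claude-3-5-sonnet-20240620"), // most capable, highest cost
    openai("gpt-4o-mini"), // cheaper and fast, still broadly capable
    openai("gpt-3.5-turbo"), // last resort: lowest cost
  ],
});
```

If you rely on structured output via `streamObject`, confirm that every model in the chain can honor your schema before adding it as a fallback.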
Get Started Today
Application resilience is crucial, especially for AI-dependent features. `ai-fallback` offers a simple, powerful way to safeguard against provider instability.
Stop letting provider downtime dictate your application's uptime. Add `ai-fallback` to your project:
```bash
npm install ai-fallback @ai-sdk/anthropic @ai-sdk/openai ai zod
# or using yarn, pnpm, bun
```
Check out the package on npm: https://www.npmjs.com/package/ai-fallback
Integrate it into your application using the Vercel AI SDK. It’s a small change that delivers a significant improvement in production stability.