Stop Worrying About LLM Downtime: Build Resilient AI Apps with ai-fallback




Large Language Models (LLMs) are increasingly central to modern applications, powering features from content generation to complex reasoning. However, relying on a single provider (like OpenAI, Anthropic, or Google) introduces risk: API downtime, rate limits, capacity issues, or transient errors can disrupt your service, degrade the user experience, and threaten business continuity.

How can we build more robust AI-powered features? Writing custom retry-and-switch logic is one option, but it adds complexity. A simpler, more elegant solution is ai-fallback.

Introducing ai-fallback: Simple, Automatic LLM Resilience

ai-fallback is a lightweight, zero-dependency npm package designed specifically for automatic fallback between AI models. It integrates seamlessly with the popular Vercel AI SDK (the ai package).

The core idea is simple:

  1. You define an ordered list of AI models using the ai SDK’s provider functions (e.g., anthropic(), openai()).
  2. You create a fallback model instance using createFallback.
  3. You use this fallback model instance directly with ai SDK functions like generateText, streamText, or streamObject.
  4. If your primary model fails, ai-fallback automatically retries the request with the next model in your list.

This significantly boosts your application’s resilience with minimal code changes.

How It Works: A Practical Example with the Vercel AI SDK

Integrating ai-fallback is straightforward, especially if you’re already using the ai package.

import { createFallback } from "ai-fallback";
import { anthropic } from "@ai-sdk/anthropic";
import { openai } from "@ai-sdk/openai";
import { generateText, streamText, streamObject } from "ai";
import { z } from "zod";

// 1. Create the fallback model instance
const model = createFallback({
  // Define models in preferred order using ai SDK functions
  models: [
    anthropic("claude-3-haiku-20240307"), // Try Claude 3 Haiku first
    openai("gpt-3.5-turbo"), // Fallback to GPT-3.5 Turbo
    // Add more models if needed
  ],
  // Optional: Log errors when a fallback occurs
  onError: (error, modelId) => {
    console.warn(`Error with model ${modelId}: ${error.message}. Attempting fallback.`);
  },
  // Optional: Automatically try switching back to the primary model
  // after a specified interval (e.g., 5 minutes) following an error.
  modelResetInterval: 5 * 60 * 1000, // 5 minutes in milliseconds

  // Optional: For streaming, decide if retrying should happen even
  // if some output was already sent. Set to true to restart generation
  // on the fallback model from scratch if an error occurs mid-stream.
  // retryAfterOutput: true,
});

// --- Usage Examples ---

// 2. Use the fallback 'model' directly with Vercel AI SDK functions

// Example 1: Generate Text
async function generate(prompt: string) {
  try {
    const { text } = await generateText({
      model: model, // Pass the fallback model instance
      system: "You are a helpful assistant.",
      prompt: prompt,
    });
    console.log("Generated Text:", text);
  } catch (error) {
    console.error("All AI fallbacks failed for generateText:", error);
  }
}

// Example 2: Stream Text
async function stream(prompt: string) {
  try {
    const { textStream } = await streamText({
      model: model, // Pass the fallback model instance
      system: "You are a helpful assistant.",
      prompt: prompt,
    });

    console.log("Streaming Text:");
    for await (const chunk of textStream) {
      process.stdout.write(chunk);
    }
    console.log(); // Newline after stream
  } catch (error) {
    console.error("All AI fallbacks failed for streamText:", error);
  }
}

// Example 3: Stream Structured Object (using Zod)
async function generateStructured() {
  try {
    const { partialObjectStream } = await streamObject({
      model: model, // Pass the fallback model instance
      system: "You are a helpful assistant.",
      prompt: "Generate a person object with name and age.",
      schema: z.object({
        name: z.string(),
        age: z.number(),
      }),
    });

    console.log("Streaming Object:");
    for await (const partialObject of partialObjectStream) {
      console.log(partialObject);
    }
  } catch (error) {
    console.error("All AI fallbacks failed for streamObject:", error);
  }
}

// --- Run Examples (sequentially, so the streamed output doesn't interleave) ---
async function main() {
  await generate("Explain the concept of idempotency in APIs.");
  await stream("Write a short story about a curious robot.");
  await generateStructured();
}

main().catch(console.error);

Key Features for Production Reliability

  • Seamless Integration: Works directly with the ai SDK’s core functions.
  • Automatic Switching: Handles errors and provider downtime transparently.
  • Configurable Reset: The modelResetInterval option automatically attempts to switch back to your primary (often preferred or cheaper) model after a cooldown period, so you don’t stay on a more expensive fallback longer than necessary.
  • Streaming Resilience: The retryAfterOutput option controls how mid-stream failures are handled. Setting it to true restarts the entire generation from scratch on the next available model if an error occurs after streaming has begun, preventing incomplete or corrupted outputs. Your application logic must then handle the potential duplicate content; a buffering sketch follows this list.
  • Error Monitoring: The onError callback provides visibility into fallback events for logging and monitoring.
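
For retryAfterOutput in particular, a simple way to avoid showing duplicated text is to buffer the stream and clear the buffer whenever a fallback fires, so a mid-stream restart replaces the partial output instead of appending to it. A minimal sketch, reusing the imports from the example above (the reset-in-onError pattern shown here is one approach, not the only one):

// Buffer streamed text; clear it when a fallback occurs so a restarted
// generation replaces (rather than appends to) the partial output.
async function streamWithRestart(prompt: string): Promise<string> {
  let fullText = "";

  const { textStream } = await streamText({
    model: createFallback({
      models: [
        anthropic("claude-3-haiku-20240307"),
        openai("gpt-3.5-turbo"),
      ],
      retryAfterOutput: true, // retry on the next model, even mid-stream
      onError: () => {
        fullText = ""; // discard partial output from the failed model
      },
    }),
    prompt,
  });

  for await (const chunk of textStream) {
    fullText += chunk;
  }

  // Only render fullText once the stream completes successfully.
  return fullText;
}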

Why This Matters for Production Applications

  • Enhanced Reliability: Directly mitigates the risk of single-provider issues.
  • Improved User Experience: Shields users from backend failures, providing smoother interactions.
  • Simplified Operations: Reduces the need for complex, custom error handling for provider switching.
  • Increased Confidence: Deploy AI features knowing you have a robust fallback mechanism.

Choosing Your Fallback Strategy

Order your models array based on:

  • Capability/Performance: Start with the best model for the task.
  • Cost: Fall back to cheaper alternatives.
  • Speed: Prioritize faster models if latency is key.
  • Feature Compatibility: Ensure fallbacks support the features you need (e.g., function calling, or the schemas you pass to streamObject). An ordering sketch follows this list.
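
For instance, a strategy that balances capability against cost might look like this (the specific model IDs are illustrative, not recommendations):

// Strongest model first, then cheaper and faster alternatives.
const tieredModel = createFallback({
  models: [
    anthropic("claude-3-5-sonnet-20240620"), // capability: best quality first
    openai("gpt-4o-mini"), // cost: cheaper fallback
    openai("gpt-3.5-turbo"), // speed: fast last resort
  ],
});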

Get Started Today

Application resilience is crucial, especially for AI-dependent features. ai-fallback offers a simple, powerful way to safeguard against provider instability.

Stop letting provider downtime dictate your application’s uptime. Add ai-fallback to your project:

npm install ai-fallback @ai-sdk/anthropic @ai-sdk/openai ai zod
# or using yarn, pnpm, bun

Check out the package on npm: https://www.npmjs.com/package/ai-fallback

Integrate it into your application wherever you currently pass a single model to the Vercel AI SDK, as in the sketch below. It’s a small change that delivers a significant improvement in production stability.
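
For example, in a Next.js route handler the fallback model drops in anywhere a single model would go. A minimal sketch (the route path and request shape here are assumptions, not part of the package):

// app/api/completion/route.ts (hypothetical route)
import { createFallback } from "ai-fallback";
import { anthropic } from "@ai-sdk/anthropic";
import { openai } from "@ai-sdk/openai";
import { streamText } from "ai";

const model = createFallback({
  models: [anthropic("claude-3-haiku-20240307"), openai("gpt-3.5-turbo")],
});

export async function POST(req: Request) {
  const { prompt } = await req.json();
  const result = await streamText({ model, prompt });
  // Stream the generated text back to the client as a plain text response.
  return result.toTextStreamResponse();
}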

