Integrating Apple’s On-Device LLM: A Step-by-Step Guide to Foundation Models



This content originally appeared on Level Up Coding – Medium and was authored by Saharsh Vedi

For the first time in recent memory, Apple has launched a developer feature that Android simply doesn’t have, and it’s a big one. With the release of the Foundation Models framework in iOS 26, Apple is giving third-party developers direct access to powerful on-device large language models (LLMs). These models can understand and generate natural language, perform reasoning, and even call your app’s custom tools.

In this article, we’ll walk through exactly how to integrate Apple’s on-device LLMs into your own iOS app. You’ll learn how to hook a simple chat interface up to the on-device LLM, register custom tools, and connect the model to system frameworks like EventKit to create calendar events, all while taking full advantage of Apple’s privacy-first approach to AI.

🛠 Prerequisites

Before diving into code, it’s important to make sure you’re set up with the right device and software. Apple’s on-device LLMs have specific hardware and software requirements.

✅ Device Requirements

Apple’s on-device language models require recent Apple Silicon chips and are only available on the following devices:

  • iPhone 16 (any variant)
  • iPhone 15 Pro / 15 Pro Max
  • Any iPad with an Apple Silicon M-series chip (e.g., M1, M2)
  • Any MacBook with an Apple Silicon M-series chip

💻 Software Requirements

To build and test apps with the Foundation Models framework, you’ll need:

  • macOS 26 (Tahoe)
  • Xcode 26

🔧 You can install the macOS and Xcode betas through Apple’s Developer Program: developer.apple.com

🧠 Starter Code for the Chat App

The starter Chat App

Before we dive into using Apple’s new on-device Foundation Models, let’s start with a lightweight chat app as our base. This simple chat UI will serve as the frontend for our on-device AI assistant. You can find the starter code here: https://github.com/Saharshv/Foundation-Model-Tutorial/tree/starter-code

The app is a single-view iOS application built using SwiftUI and MVVM architecture. The interface is a normal chat with an input field and a list of messages.

This setup gives us the perfect canvas for integrating a chat-based interaction loop with an LLM.
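
For reference, the Message model used throughout the snippets below looks roughly like this; the starter repo is the source of truth for the exact definition, but the field names match how Message is used later in this article:

import Foundation

// Approximation of the starter project's Message model. Identifiable is
// assumed because the chat view iterates messages with ForEach.
struct Message: Identifiable {
    let id = UUID()
    let content: String
    let isFromUser: Bool
    let timestamp: Date
}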

🧱 App Architecture Overview

To keep things clean and scalable, we’re following the Model-View-ViewModel (MVVM) pattern. Here’s a quick breakdown of the architecture:

  • ChatView: The SwiftUI view displaying the chat interface
  • ChatViewModel: Manages UI state, user input, and updates to the message list
  • ChatRepository: Stores and manages all Message objects — acts as our local data store
  • OnDeviceLLMManager: Where all the Foundation Models API logic will live

This modular design allows us to separate concerns cleanly — keeping our UI, business logic, and LLM integration well-isolated.

💡 Tip: If you’re building along, you can mirror this architecture or adapt it to your own app’s structure. The key is to keep LLM integration isolated so you can swap in different models or tools later.
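
For example, one lightweight way to keep that isolation is to hide the manager behind a small protocol, so the chat layer never depends on FoundationModels directly. The protocol below is illustrative and not part of the starter code:

// Hypothetical abstraction: the chat layer talks to this protocol, and
// OnDeviceLLMManager (or a future cloud-backed manager) conforms to it.
protocol LLMProviding {
    func generateResponse(for prompt: String) async throws -> String
}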

🔧 Step 1: Creating and Validating the On-Device LLM Instance

To start integrating Apple’s on-device Foundation Model into our chat app, we’ll first create a dedicated OnDeviceLLMManager. This manager will act as the bridge between our app and Apple’s new FoundationModels framework.

The first task inside this manager is to create an instance of the language model and verify that it’s available on the device.

Apple’s model might not be accessible for a variety of reasons, so it’s recommended to always check that the language model is available before using it.

import Foundation
import FoundationModels

class OnDeviceLLMManager {
    private let model = SystemLanguageModel.default

    private var isModelAvailable: Bool { model.availability == .available }

    func checkModelAvailability() -> (Bool, String) {
        switch model.availability {
        case .available:
            return (true, "On-device model is available.")
        case .unavailable(.deviceNotEligible):
            return (false, "On-device model is not available: Device not eligible.")
        case .unavailable(.appleIntelligenceNotEnabled):
            return (false, "On-device model is not available: Apple Intelligence not enabled.")
        case .unavailable(.modelNotReady):
            return (false, "On-device model is not available: Model not ready.")
        case .unavailable(let other):
            return (false, "On-device model is not available: \(other)")
        }
    }
}

🔗 Step 2: Checking Model Availability and Notifying the User

Message confirming that the model is available

With our OnDeviceLLMManager in place, the next step is to connect it to the rest of the app and inform the user whether the on-device LLM is available.

We achieve this by injecting the OnDeviceLLMManager into the ChatRepository, which manages all the messages shown in the chat view. This way, we can append a system message to the conversation as soon as the app launches: either confirming the model is ready or explaining why it’s unavailable.

final class ChatRepository {
    ...

    private let onDeviceLLMManager: OnDeviceLLMManager

    init(onDeviceLLMManager: OnDeviceLLMManager) {
        self.onDeviceLLMManager = onDeviceLLMManager
        checkModelAvailabilityAndTellUser()
    }

    ...

    // MARK: - Private Methods

    /// Check if the on-device model is available and send a message about the availability
    private func checkModelAvailabilityAndTellUser() {
        let (_, content) = onDeviceLLMManager.checkModelAvailability()
        sendMessage(content, isFromUser: false)
    }
}

🗣 Step 3: Prompting the On-Device LLM and Receiving Responses

With model availability confirmed, we’re now ready to have a conversation with Apple’s on-device LLM.

To do this, we’ll add a new method in our OnDeviceLLMManager that:

  1. Creates a LanguageModelSession backed by the system language model
  2. Generates a response based on a user-provided prompt

This method will be the core of our chat loop. Every time a user sends a message, this method will be called to get the LLM’s reply.

class OnDeviceLLMManager {
    ...

    func generateResponse(for prompt: String) async throws -> String {
        guard isModelAvailable else {
            throw OnDeviceLLMManagerError.modelNotAvailable
        }

        let session = LanguageModelSession()
        let response = try await session.respond(to: prompt)
        return response.content
    }

    ...
}
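
The snippets in this and the following steps throw OnDeviceLLMManagerError. If your project doesn’t already define it, a minimal version covering the cases used in this article could look like this:

/// Errors thrown by OnDeviceLLMManager. A minimal sketch matching the cases
/// used in this article; adapt it to your own error-handling style.
enum OnDeviceLLMManagerError: Error {
    case modelNotAvailable
    case sessionStillResponding
}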

Now, let’s connect this method to our ChatRepository:

final class ChatRepository {
    ...

    /// Send a message to the chat
    func sendMessage(_ content: String, isFromUser: Bool = true) {
        let message = Message(content: content, isFromUser: isFromUser, timestamp: Date())
        messages.append(message)

        if isFromUser {
            Task {
                let llmReply = try await onDeviceLLMManager.generateResponse(for: content)
                sendMessage(llmReply, isFromUser: false)
            }
        }
    }

    ...
}
Conversing with the On-Device LLM
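
One thing to note: errors thrown inside the Task above are silently discarded. If you’d rather surface failures (for example, the model being unavailable) in the chat itself, you could wrap the call in a do/catch, roughly like this:

if isFromUser {
    Task {
        do {
            let llmReply = try await onDeviceLLMManager.generateResponse(for: content)
            sendMessage(llmReply, isFromUser: false)
        } catch {
            // Show the failure to the user instead of dropping it silently.
            sendMessage("Sorry, I couldn't generate a response: \(error.localizedDescription)", isFromUser: false)
        }
    }
}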

🧠 Step 4: Reusing the LLM Session for Context and Managing Input State

Now that we’ve successfully connected to the on-device model and can generate responses, it’s time to optimize.

In a chat-based interface, it makes the most sense to re-use the same LLM session, so the model can retain context across multiple turns in the conversation. This is especially useful when building assistant-like flows where memory and coherence matter.

Additionally, we’ll add logic to track whether the model is currently generating a response, and if so, we’ll temporarily disable the input field to prevent the user from submitting another prompt.

In OnDeviceLLMManager, we will:

  • Store a persistent session property
  • Reuse that session across multiple prompts
  • Track whether the session is still generating a response

class OnDeviceLLMManager {
    @Published var isResponding: Bool = false
    private var session: LanguageModelSession?

    ...

    func generateResponse(for prompt: String) async throws -> String {
        guard isModelAvailable else {
            throw OnDeviceLLMManagerError.modelNotAvailable
        }

        let currentSession: LanguageModelSession
        if let session {
            if session.isResponding {
                throw OnDeviceLLMManagerError.sessionStillResponding
            } else {
                currentSession = session
            }
        } else {
            currentSession = LanguageModelSession()
            self.session = currentSession
        }

        self.isResponding = true
        // Make sure the flag resets even if respond(to:) throws.
        defer { self.isResponding = false }
        let response = try await currentSession.respond(to: prompt)
        return response.content
    }
}

I created a separate isResponding property in OnDeviceLLMManager because I wanted the ViewModel to simply subscribe to $isResponding and tell the view when to disable the input field and button. I could have used session.isResponding directly, but I opted for convenience.

final class ChatViewModel: ObservableObject {
    ...
    @Published var isResponding = false

    private let repository: ChatRepository
    private let onDeviceLLMManager: OnDeviceLLMManager

    init(repository: ChatRepository, onDeviceLLMManager: OnDeviceLLMManager) {
        self.repository = repository
        self.onDeviceLLMManager = onDeviceLLMManager
        self.messages = repository.messages

        self.subscribeToMessages()
        self.subscribeToResponding()
    }

    ...

    private func subscribeToResponding() {
        onDeviceLLMManager.$isResponding
            .receive(on: DispatchQueue.main)
            .assign(to: &$isResponding)
    }
}

Let’s also update the view to use viewModel.isResponding so that we can disable the input field and show a loader in place of the model’s message while a response is being generated.

struct ChatView: View {
    ...

    private var chatMessages: some View {
        ScrollView {
            ScrollViewReader { proxy in
                LazyVStack(spacing: 12) {
                    ForEach(viewModel.messages) { message in
                        messageView(for: message)
                    }
                    if viewModel.isResponding {
                        HStack {
                            ProgressView()
                            Spacer()
                        }
                    }
                }
                .onAppear {
                    scrollToLastMessage(with: proxy)
                }
            }
        }
        .scrollBounceBehavior(.basedOnSize)
        .scrollIndicators(.hidden)
        .padding(16)
    }

    private var inputRow: some View {
        HStack(spacing: 12) {
            TextField("Enter a message", text: $viewModel.input)
                .focused($isInputFocused)
                .submitLabel(.send)
                .onSubmit {
                    viewModel.onSendTap()
                    isInputFocused = true
                }
                .padding(12)
                .background(Color.gray.opacity(0.3))
                .clipShape(RoundedRectangle(cornerRadius: 20))

            Button(action: viewModel.onSendTap) {
                Image(systemName: "arrow.right.circle.fill")
                    .resizable()
                    .frame(width: 32, height: 32)
            }
        }
        .padding(16)
        .background(Color(uiColor: .systemBackground))
        .disabled(viewModel.isResponding)
    }
}
Maintaining session context

💪 Tools: Giving More Power to the On-Device LLM

Before we move on to connecting our LLM with EventKit, let’s take a moment to understand one of the most powerful features Apple provides in its on-device language model framework: Tool Calling.

If you’re familiar with the Model Context Protocol (MCP), you can skip this section, because tools here are a chapter straight out of MCP.

In many real-world use cases, a language model alone may not have the data or permissions required to take meaningful actions or provide accurate answers. This is where Tools come in.

What are Tools?

Tools are Swift types that conform to the Tool protocol and represent discrete capabilities your app exposes to the model. They allow the LLM to:

  • Perform actions (e.g., create a calendar event or change a setting)
  • Query local or remote data sources (e.g., database lookups)
  • Integrate with Apple frameworks (e.g., Contacts, HealthKit, WeatherKit)

How Tool Calling Works

When you provide a tool to a LanguageModelSession, the model can decide whether a given prompt requires help from one of the tools. If so, it:

  1. Chooses the appropriate tool.
  2. Generates arguments for it based on your prompt.
  3. Executes the tool by calling its call(arguments:) method.
  4. Receives output from the tool.
  5. Produces a final response incorporating the result.

This tool-calling pipeline is automatic and mirrors how the LLM might “reason and act” in a real-world assistant.
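
At a high level, a tool is just a type with a name, a description, a @Generable arguments struct, and a call method. Here is a bare-bones, illustrative skeleton; Step 6 below shows a complete, working tool:

import FoundationModels

// Bare-bones shape of a tool. The name and description help the model decide
// when to call it; the @Generable Arguments struct tells it what to pass.
struct EchoTool: Tool {
    let name = "echo"
    let description = "Echoes back whatever text the user provides"

    @Generable
    struct Arguments {
        @Guide(description: "The text to echo back")
        var text: String
    }

    func call(arguments: Arguments) async throws -> String {
        "You said: \(arguments.text)"
    }
}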

📅 Step 5: Creating Events with EventKit

Before we teach the on-device LLM how to manage our calendar, we’ll create a dedicated EventsManager to handle all calendar-related tasks, starting with a basic event creation method.

Remember to add the NSCalendarsFullAccessUsageDescription key (with a usage description string) to your Info.plist before requesting calendar access.

import EventKit

class EventsManager {
    private let eventStore = EKEventStore()

    init() {
        requestAccessIfNeeded()
    }

    private func requestAccessIfNeeded() {
        Task {
            do {
                let success = try await eventStore.requestFullAccessToEvents()
                if !success {
                    print("Access to calendar events was not granted.")
                }
            } catch {
                print("Error requesting full access to events: \(error)")
            }
        }
    }

    func createEvent(title: String, startDate: Date, endDate: Date) -> Bool {
        let event = EKEvent(eventStore: eventStore)
        event.title = title
        event.startDate = startDate
        event.endDate = endDate
        event.calendar = eventStore.defaultCalendarForNewEvents

        do {
            try eventStore.save(event, span: .thisEvent)
            print("✅ Event created: \(title)")
            return true
        } catch {
            print("❌ Failed to save event: \(error)")
            return false
        }
    }
}
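
If you want to sanity-check EventsManager before wiring it to the model, a quick manual call might look like this (the title and dates are just illustrative):

let eventsManager = EventsManager()
let start = Date().addingTimeInterval(60 * 60)   // one hour from now
let end = start.addingTimeInterval(30 * 60)      // a 30-minute event
let created = eventsManager.createEvent(title: "Coffee with Alex", startDate: start, endDate: end)
print(created ? "Saved" : "Not saved")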

🔧 Step 6: Creating a Tool for the createEvent Method

Now that we have an EventsManager with a simple createEvent() method, let’s expose it to the on-device LLM as a Tool so the model can use it during generation. This will allow the model to call your method when it detects a user prompt that implies creating an event.

import FoundationModels

struct CreateEventTool: Tool {
    let name = "createEvent"
    let description = "Creates a calendar event using EventKit"

    @Generable
    struct Arguments {
        @Guide(description: "The title of the event")
        var title: String

        @Guide(description: "The start date and time of the event in yyyy-MM-ddTHH:mm:ss format")
        var startDate: String

        @Guide(description: "The end date and time of the event in yyyy-MM-ddTHH:mm:ss format")
        var endDate: String
    }

    let eventsManager: EventsManager

    func call(arguments: Arguments) async throws -> String {
        // Parse date strings into Date objects
        let formatter = DateFormatter()
        formatter.dateFormat = "yyyy-MM-dd'T'HH:mm:ss"
        guard let start = formatter.date(from: arguments.startDate),
              let end = formatter.date(from: arguments.endDate) else {
            return "Failed to parse the event start or end time."
        }

        let success = eventsManager.createEvent(title: arguments.title, startDate: start, endDate: end)
        if success {
            return "Successfully created the event '\(arguments.title)' from \(arguments.startDate) to \(arguments.endDate)."
        } else {
            return "Failed to create the event '\(arguments.title)'."
        }
    }
}

🏁 Final Step: Providing the CreateEventTool to the Session

When initializing a LanguageModelSession, Apple’s framework allows you to pass in a list of tools the model can access. This is where our new tool comes in.

By including the CreateEventTool during session initialization, we’re telling the model:

“You now have the ability to create calendar events. If a user prompt sounds like a scheduling request, feel free to call this tool.”

class OnDeviceLLMManager {
    private let eventsManager: EventsManager

    init(eventsManager: EventsManager) {
        self.eventsManager = eventsManager
    }

    ...

    func generateResponse(for prompt: String) async throws -> String {
        guard isModelAvailable else {
            throw OnDeviceLLMManagerError.modelNotAvailable
        }

        let currentSession: LanguageModelSession
        if let session {
            if session.isResponding {
                throw OnDeviceLLMManagerError.sessionStillResponding
            } else {
                currentSession = session
            }
        } else {
            // Register the tool when the session is first created.
            currentSession = LanguageModelSession(tools: [CreateEventTool(eventsManager: eventsManager)])
            self.session = currentSession
        }

        self.isResponding = true
        // Make sure the flag resets even if respond(to:) throws.
        defer { self.isResponding = false }
        let response = try await currentSession.respond(to: prompt)
        return response.content
    }

    ...
}
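
For reference, here is one way the pieces might be wired together at app launch. The entry-point type and the ChatView initializer shown here are illustrative; the starter repo may wire things up slightly differently.

import SwiftUI

@main
struct FoundationModelTutorialApp: App {
    // Illustrative composition root: build the dependency chain once and
    // hand it to the chat screen.
    @StateObject private var viewModel: ChatViewModel

    init() {
        let eventsManager = EventsManager()
        let llmManager = OnDeviceLLMManager(eventsManager: eventsManager)
        let repository = ChatRepository(onDeviceLLMManager: llmManager)
        _viewModel = StateObject(wrappedValue: ChatViewModel(repository: repository,
                                                             onDeviceLLMManager: llmManager))
    }

    var body: some Scene {
        WindowGroup {
            ChatView(viewModel: viewModel)
        }
    }
}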

With everything set up, it's time to see the full system working inside our chat app.

Creating an event using the LLM

I had to give the LLM today’s date because otherwise it kept trying to create events in 2023, which suggests the model’s training data dates from around 2023.
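
One way to handle this is to include the current date in the session’s instructions when the session is created. Assuming the LanguageModelSession initializer accepts an instructions string alongside tools, as Apple’s tool-calling examples show, that might look roughly like this:

// Sketch: seed the session with today's date so the model doesn't guess the year.
let today = Date().formatted(date: .complete, time: .omitted)
let session = LanguageModelSession(
    tools: [CreateEventTool(eventsManager: eventsManager)],
    instructions: "You are a helpful assistant. Today's date is \(today)."
)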

🚀 Final Thoughts: The Power and Promise of On-Device LLMs

With the introduction of Foundation Models and the ability to run LLMs directly on Apple devices, Apple has taken a bold and strategic step into the future of AI, a future that prioritizes privacy, speed, and a seamless user experience. By bringing this capability to developers, Apple is signaling that intelligent, context-aware apps no longer have to rely solely on the cloud.

While today’s on-device models are optimized for efficiency and privacy, their capabilities are still catching up to the bleeding-edge cloud-based LLMs. The real question will be how Apple continues to evolve these models over time and whether they can scale up their capabilities without sacrificing the performance or power efficiency that iOS users expect.

That said, the foundation has been laid. We now have the tools to build smarter, more private, and more responsive apps right from the device. It’s an exciting time to be an iOS developer, and this is just the beginning of what on-device intelligence can enable.

You can find the complete code here. Thank you for reading!
