I Built a WebSocket Server to Stream iPhone LiDAR and IMU Data



This content originally appeared on DEV Community and was authored by Jaskirat Singh

The Hardware Setup Problem

I kept having the same experience. Some idea would come up that needed depth data or IMU readings. I’d think about it for a few minutes, then realize I’d have to order an Intel RealSense or some IMU breakout board, wait for shipping, figure out the drivers, wire everything up. The idea would just die there.

This wasn’t occasional. It happened enough times that I decided to look at what hardware I already had.

Your Phone Has Better Sensors Than You Think

I was looking at specs for various LiDAR sensors when I thought to check what my iPhone actually had. Turns out the iPhone 12 Pro and newer Pro models have:

  • Time-of-flight LiDAR measuring 0.5 to 5 meters at 10fps
  • 1080p camera running at 30fps
  • IMU sampling at 200Hz
  • GPS
  • ARKit running visual-inertial odometry continuously in the background

Pretty much everything you’d wire up to a Raspberry Pi for a robotics project, already integrated and calibrated.

The problem is iOS locks all of this down. You can use it for AR apps through Apple’s frameworks, but you can’t just stream the raw data out to process however you want. You can’t SSH in. You can’t pipe data to your laptop.

That’s what Arvos fixes.

How the System Works

Arvos runs a WebSocket server directly on the iPhone. Your phone becomes the server, your laptop connects as a client. This is backwards from how most mobile apps work, but it makes sense here since the sensors are on the phone.

The structure is pretty straightforward:

iPhone runs the server
    ↓
Captures from ARKit, AVFoundation, CoreMotion, CoreLocation
    ↓
Timestamps everything with nanosecond precision
    ↓
Streams over WebSocket to whatever's connected
    ↓
Your laptop/code processes the data

When you use the camera with LiDAR together, you get an RGB-D depth camera. Synchronized color and depth, like a RealSense but without buying or setting up anything. You can also stream individual sensors or any combination you need.

The big difference from traditional robotics setups: instead of spending a weekend mounting sensors on a Pi or Jetson, dealing with GPIO pins and I2C buses and power distribution, you just open the app. Five minutes instead of two days.

Why I Chose WebSocket

iOS doesn’t include a WebSocket server, so I had to build one using Apple’s Network framework. The choice came down to a few things:

WebSocket gives you bidirectional communication with low latency. It works in browsers without any plugins. Pretty much every language has libraries for it. Python, JavaScript, C++, ROS all have good support.

I considered other protocols but WebSocket hit the right balance of performance and compatibility.
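
To give a sense of the shape of that server, here is a minimal sketch of a WebSocket listener built on the Network framework. The port number and handler bodies are placeholder assumptions for illustration, not the actual Arvos code.

import Network

// Minimal sketch: accept WebSocket connections on the phone itself.
// Port 8765 and the queue choices are assumptions, not the real config.
func startWebSocketServer() throws -> NWListener {
    let wsOptions = NWProtocolWebSocket.Options()
    wsOptions.autoReplyPing = true

    // Layer the WebSocket protocol on top of plain TCP.
    let parameters = NWParameters.tcp
    parameters.defaultProtocolStack.applicationProtocols.insert(wsOptions, at: 0)

    let listener = try NWListener(using: parameters, on: 8765)
    listener.newConnectionHandler = { connection in
        // Each laptop, script, or browser client gets its own NWConnection.
        connection.start(queue: .global(qos: .userInitiated))
    }
    listener.start(queue: .main)
    return listener
}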

The Timestamp Problem

This took me a while to solve properly. Each sensor on iOS uses a different clock:

  • Camera uses CMTime
  • IMU uses Date
  • ARKit uses CFAbsoluteTime

If you’re doing any kind of sensor fusion, you need all your data on the same timeline. I solved this by converting everything to mach_absolute_time(), which is monotonic and never jumps backward like system time can.

func now() -> UInt64 {
    // Monotonic tick count since boot; never jumps backward like wall-clock time can
    let machTime = mach_absolute_time()
    // The timebase ratio converts Mach ticks to nanoseconds
    var timebase = mach_timebase_info()
    mach_timebase_info(&timebase)
    let nanos = machTime * UInt64(timebase.numer) / UInt64(timebase.denom)
    return nanos
}

Everything gets a nanosecond timestamp. The sync error across all sensors is under a millisecond now.
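
For the individual conversions, here is a rough sketch of how the different clock domains can be mapped onto that single nanosecond timeline. It assumes camera CMTime values come from the host time clock (which shares the mach_absolute_time base) and anchors wall-clock Dates against a reference pair captured at startup; this is illustrative, not the actual Arvos code.

import CoreMedia
import Foundation

// Sketch: map each sensor's clock onto the mach_absolute_time timeline.
struct TimestampMapper {
    // Reference pair: wall clock and mach-based clock sampled together at startup.
    private let refDate = Date()
    private let refNanos = now()   // the nanosecond clock defined above

    // Camera: a host-clock CMTime is already on the mach timeline.
    func nanos(from presentationTime: CMTime) -> UInt64 {
        UInt64(CMTimeGetSeconds(presentationTime) * 1_000_000_000)
    }

    // IMU: shift a wall-clock Date onto the mach timeline via the reference pair.
    func nanos(from date: Date) -> UInt64 {
        let deltaNs = date.timeIntervalSince(refDate) * 1_000_000_000
        return UInt64(Int64(refNanos) + Int64(deltaNs))
    }
}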

ARKit Memory Issues

This one was frustrating to debug. ARKit hands you ARFrames that contain depth maps, camera images, pose data, all kinds of stuff. These frames are large. If you keep references to them, iOS just kills your app after 30 seconds or so.

The fix is extracting exactly what you need immediately and dropping the frame:

func session(_ session: ARSession, didUpdate frame: ARFrame) {
    guard let depthMap = frame.sceneDepth?.depthMap else { return }

    // Pull out the point cloud right now
    let points = convertDepthToPointCloud(depthMap, frame.camera)

    // Store the processed data, never the ARFrame
    let depthFrame = DepthFrame(
        points: points,
        timestamp: TimestampManager.now()
    )

    delegate?.didReceiveDepth(depthFrame)
    // Frame gets released when we leave this function
}

I added retain count monitoring in debug builds. If it goes above 10, something’s holding references it shouldn’t.
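
The post doesn't show convertDepthToPointCloud, so here is a minimal sketch of what that unprojection might look like: a pinhole back-projection using the camera intrinsics, scaled from the RGB resolution down to the depth map resolution. Treat the details, especially the coordinate convention, as assumptions.

import ARKit
import simd

// Sketch: back-project a Float32 depth map into camera-space points
// (x right, y down, z forward in this image-aligned convention).
func convertDepthToPointCloud(_ depthMap: CVPixelBuffer,
                              _ camera: ARCamera) -> [simd_float3] {
    CVPixelBufferLockBaseAddress(depthMap, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(depthMap, .readOnly) }

    let width = CVPixelBufferGetWidth(depthMap)
    let height = CVPixelBufferGetHeight(depthMap)
    guard let base = CVPixelBufferGetBaseAddress(depthMap) else { return [] }
    let rowStride = CVPixelBufferGetBytesPerRow(depthMap) / MemoryLayout<Float32>.stride
    let depths = base.assumingMemoryBound(to: Float32.self)

    // Intrinsics are given for the RGB image; scale them to the depth resolution.
    let scaleX = Float(width) / Float(camera.imageResolution.width)
    let scaleY = Float(height) / Float(camera.imageResolution.height)
    let fx = camera.intrinsics[0][0] * scaleX
    let fy = camera.intrinsics[1][1] * scaleY
    let cx = camera.intrinsics[2][0] * scaleX
    let cy = camera.intrinsics[2][1] * scaleY

    var points: [simd_float3] = []
    points.reserveCapacity(width * height)
    for v in 0..<height {
        for u in 0..<width {
            let z = depths[v * rowStride + u]
            guard z > 0, z.isFinite else { continue }
            // Pinhole model: pixel coordinates plus depth give a 3D point.
            points.append(simd_float3((Float(u) - cx) * z / fx,
                                      (Float(v) - cy) * z / fy,
                                      z))
        }
    }
    return points
}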

Two Camera Systems

Arvos actually switches between two different camera implementations depending on what you’re doing:

For RGB-D mode where you need synchronized color and depth, it uses the ARKit camera. This gives you frames that already have both RGB and depth data aligned.

For modes where you just need camera without depth, it uses AVFoundation directly. This has better performance since ARKit isn’t running, and you get more control over camera settings.

Preset Modes

Rather than making people configure each sensor individually, I set up modes for common use cases:

RGBD Camera runs camera and depth at 30fps. Acts like an Intel RealSense.

Visual-Inertial gives you 30fps camera with 200Hz IMU. Good for developing VIO algorithms.

LiDAR Scanner is just depth and pose, no camera. Pure 3D scanning.

IMU Only streams just the IMU at 200Hz. Battery lasts about 6 hours in this mode. Also enables the Apple Watch IMU if you have one paired.

Full Sensor turns everything on. Camera, depth, IMU, pose, GPS, watch if available. Eats battery in about 2 hours but you get complete data.

Custom lets you toggle whatever sensors you actually need.

This removes the configuration step. Pick what you’re doing, start streaming.
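
Internally you can think of each preset as a bundle of sensor toggles. The sketch below shows that shape; the SensorConfig type and its fields are assumptions for illustration, not the actual Arvos code.

// Sketch: preset modes as predefined sensor configurations.
struct SensorConfig {
    var camera = false, depth = false, imu = false
    var pose = false, gps = false, watch = false
    var cameraFPS = 30, imuHz = 200
}

enum PresetMode {
    case rgbdCamera, visualInertial, lidarScanner, imuOnly, fullSensor
    case custom(SensorConfig)

    var config: SensorConfig {
        switch self {
        case .rgbdCamera:     return SensorConfig(camera: true, depth: true)
        case .visualInertial: return SensorConfig(camera: true, imu: true)
        case .lidarScanner:   return SensorConfig(depth: true, pose: true)
        case .imuOnly:        return SensorConfig(imu: true, watch: true)
        case .fullSensor:     return SensorConfig(camera: true, depth: true, imu: true,
                                                  pose: true, gps: true, watch: true)
        case .custom(let config):
            return config
        }
    }
}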

Data Flow

The path from sensor to your code looks like:

Hardware triggers iOS callback
    ↓
Immediate nanosecond timestamp
    ↓
Extract and process (depth to point cloud, JPEG compression, etc)
    ↓
Check if enabled and within rate limits
    ↓
Send to the network stream, and the recording file if active
    ↓
Serialize to JSON or binary
    ↓
Broadcast to all connected clients

Controlling Frame Rate

ARKit can spit out frames faster than you asked for. Without controlling this, you’d waste battery and overwhelm the network. The fix is simple throttling:

let now = Date()
let elapsed = now.timeIntervalSince(lastFrameTime)
if elapsed < 1.0 / Double(targetFPS) {
    return // Skip this frame
}
lastFrameTime = now

Keeps the output rate consistent no matter what the sensor does.

Message Format

Small sensor data goes as JSON:

{
  "type": "imu",
  "timestamp_ns": 1703001234567890123,
  "linear_acceleration": {"x": 0.15, "y": 0.02, "z": -0.08},
  "angular_velocity": {"x": 0.05, "y": -0.12, "z": 0.03},
  "gravity": {"x": 0.0, "y": -9.81, "z": 0.0},
  "attitude_quaternion": {"w": 0.98, "x": 0.01, "y": 0.15, "z": 0.02}
}

Large data like images and point clouds goes as binary with a type byte prefix:

[TYPE: 1 byte][PAYLOAD: variable bytes]

0x01 = JSON
0x02 = JPEG camera frame
0x03 = PLY point cloud
0x04 = H.264 video chunk

Camera frames get JPEG compressed at 80% quality. Point clouds use PLY binary format. Way smaller than base64 encoding everything.
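
Packing that binary layout is a one-liner on the phone side. Here is a minimal sketch for a camera frame, assuming a UIImage is already in hand; it compresses to JPEG at 80% quality and prepends the 0x02 type byte before the message goes out over the WebSocket.

import UIKit

// Sketch: [TYPE: 1 byte][PAYLOAD] framing for a JPEG camera frame.
func packCameraFrame(_ image: UIImage) -> Data? {
    guard let jpeg = image.jpegData(compressionQuality: 0.8) else { return nil }
    var message = Data([0x02])   // 0x02 = JPEG camera frame
    message.append(jpeg)
    return message
}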

Recording

Arvos records to MCAP, which is what ROS 2 and Foxglove use. This means recordings work directly with existing robotics tools. No conversion needed.

Recording runs on a background thread so it doesn’t block sensor capture. Camera frames get H.264 encoded in parallel. The MCAP file includes schemas for each sensor type, making it self-describing.
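
The important part is that capture callbacks never wait on disk I/O. Here is a minimal sketch of that hand-off, with a hypothetical RecordingWriter protocol standing in for the actual MCAP writer.

import Foundation

// Hypothetical writer interface; in Arvos the real writer targets MCAP.
protocol RecordingWriter {
    func write(channel: String, timestampNs: UInt64, payload: Data)
}

// Sketch: a serial background queue keeps recording off the capture path.
final class AsyncRecorder {
    private let queue = DispatchQueue(label: "recording", qos: .utility)
    private let writer: RecordingWriter

    init(writer: RecordingWriter) { self.writer = writer }

    func record(channel: String, timestampNs: UInt64, payload: Data) {
        // The sensor callback returns immediately; the file write happens later.
        queue.async {
            self.writer.write(channel: channel, timestampNs: timestampNs, payload: payload)
        }
    }
}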

Apple Watch Support

I added Watch support kind of experimentally. The watch runs a companion app that streams its IMU at 50Hz back to the iPhone through Watch Connectivity.

This lets you do dual-device tracking. Phone in your pocket, watch on your wrist, both sending IMU data. Potentially useful for activity recognition or gesture stuff.

Getting the timestamps synchronized between devices was tricky. I measure round-trip time and adjust watch timestamps to match the iPhone timeline. Usually within 5ms.
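
The adjustment is the usual half-round-trip trick. Here is a sketch of the math, assuming the watch's own timestamp rides along in the Watch Connectivity reply; the type and field names are mine, not the actual Arvos code.

// Sketch: estimate the watch-to-phone clock offset from a round trip.
struct WatchClockSync {
    // Offset to add to watch timestamps to land them on the iPhone timeline.
    private(set) var offsetNs: Int64 = 0

    mutating func update(requestSentNs: UInt64,      // phone clock, ping sent
                         watchTimestampNs: UInt64,   // watch clock, reply generated
                         responseReceivedNs: UInt64) // phone clock, reply received
    {
        let rtt = responseReceivedNs - requestSentNs
        // Assume the reply was generated halfway through the round trip.
        let phoneTimeAtReply = requestSentNs + rtt / 2
        offsetNs = Int64(phoneTimeAtReply) - Int64(watchTimestampNs)
    }

    func toPhoneTimeline(_ watchNs: UInt64) -> UInt64 {
        UInt64(Int64(watchNs) + offsetNs)
    }
}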

Network Usage

In Full Sensor mode, bandwidth looks like:

  • Camera at 30fps: 5-8 Mbps
  • Depth at 10fps: 2-4 Mbps
  • IMU at 200Hz: 0.05 Mbps
  • Pose at 60Hz: 0.02 Mbps
  • GPS at 1Hz: 0.001 Mbps
  • Watch at 50Hz: 0.03 Mbps

Total around 7-12 Mbps. WiFi handles this fine.

Three ways to connect:

Direct WiFi is lowest latency at 5-10ms. Best for lab work.

iPhone Hotspot works anywhere, adds a bit more latency around 15-30ms.

Cloud Relay works over internet with 50-200ms latency. Good for demos but not real-time control.

Battery Life

With all sensors active, battery lasts about 2 hours. This is just physics. LiDAR plus camera plus ARKit plus WiFi all burn power.

Different modes change this:

  • Full Sensor: 2 hours
  • RGBD Camera: 2.5 hours
  • Visual-Inertial: 3 hours
  • LiDAR Scanner: 3.5 hours
  • IMU Only: 6 hours

ARKit handles thermal management automatically. If the phone gets too hot, it throttles to prevent damage.

How I Actually Use This

I work on SLAM and autonomous navigation. Arvos lets me test algorithms without dealing with hardware setup. No mounting sensors on a Pi, no driver configuration, no power management headaches. Open the app, start streaming, iterate on code.

The time difference is significant. What used to be a weekend of hardware work is now done in five minutes.

Students use it for computer vision classes. Instant RGB-D data without buying RealSense cameras or hunting for datasets.

Someone on GitHub mounted an iPhone to a mobile robot for SLAM testing. Zero hardware cost since they already had the phone. Another person mapped their apartment by walking around recording LiDAR and converting it to a 3D mesh. Someone else built gesture recognition using the 200Hz IMU stream.

Cost Comparison

A typical robotics sensor setup might need:

  • Intel RealSense D435i: $300-500
  • IMU breakout board: $50-200
  • Camera module: $100-300
  • GPS module: $50
  • Raspberry Pi or Jetson: $50-200

The bigger issue isn’t money though. It’s time. Wiring sensors together, writing drivers, synchronizing different clock systems, calibrating coordinate frames, managing power. This takes hours or days.

Arvos skips all of that. The sensors are already integrated and calibrated. You just stream.

The iPhone’s LiDAR isn’t 360 degrees like the expensive Velodyne scanners in autonomous cars. But for most prototyping and research, you don’t need that. You need quick access to synchronized depth, camera, and IMU data.

What Doesn’t Work

Range: LiDAR only goes 0.5 to 5 meters. Nothing beyond that.

Field of View: Matches camera FOV at roughly 69 degrees horizontal. Not omnidirectional.

Lighting: LiDAR has trouble in direct sunlight. Works best indoors or in shade.

Battery: Two hours with everything on. Keep it charged or plan around this.

iOS Limits: Can’t run in background. Screen must stay on, though you can minimize brightness.

Network: Needs WiFi or hotspot. No cellular streaming due to bandwidth and battery constraints.

Adding New Sensors

The architecture makes adding sensors fairly straightforward:

Write a service class that handles the sensor callbacks. Add it to SensorManager. Update the mode configuration. Handle the data in NetworkManager. That’s basically it.

I built a protocol abstraction layer too, so supporting gRPC or MQTT or whatever else is just implementing the protocol interface.
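
As a concrete (hypothetical) example, here is roughly what a barometer service could look like, since the barometer is on the future-additions list. The SensorService and SensorDataDelegate protocols are my sketch of the shape described above, not the project's actual interfaces.

import CoreMotion
import Foundation

// Hypothetical interfaces sketching the "service class + manager" shape.
protocol SensorDataDelegate: AnyObject {
    func didProduce(channel: String, timestampNs: UInt64, payload: Data)
}

protocol SensorService: AnyObject {
    func start()
    func stop()
}

final class BarometerService: SensorService {
    weak var delegate: SensorDataDelegate?
    private let altimeter = CMAltimeter()

    func start() {
        guard CMAltimeter.isRelativeAltitudeAvailable() else { return }
        altimeter.startRelativeAltitudeUpdates(to: .main) { [weak self] data, _ in
            guard let data, let self else { return }
            // Encode the pressure sample and hand it off; from here it would be
            // routed through SensorManager/NetworkManager like any other message.
            let json = #"{"type":"barometer","pressure_kpa":\#(data.pressure.doubleValue)}"#
            self.delegate?.didProduce(channel: "barometer",
                                      timestampNs: TimestampManager.now(),
                                      payload: Data(json.utf8))
        }
    }

    func stop() { altimeter.stopRelativeAltitudeUpdates() }
}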

For Developers

GitHub Repository: github.com/jaskirat1616/arvos

Python SDK: pypi.org/project/arvos-sdk

Web Studio: arvos-studio.vercel.app

Build it in Xcode 14+. Needs iOS 16+ on the phone. Works on iPhone 12 Pro or a newer Pro model for LiDAR support.

Project layout:

arvos/
├── Managers/      SensorManager, NetworkManager, RecordingManager
├── Services/      ARKitService, CameraService, IMUService, etc
├── Models/        Data structures
├── Views/         SwiftUI UI
└── Utilities/     Helper code

Python SDK:

pip install arvos-sdk

from arvos_sdk import ArvosClient

client = ArvosClient("192.168.1.100:8765")
client.connect()

for data in client.stream():
    if data["type"] == "imu":
        print(data["linear_acceleration"])
    elif data["type"] == "camera":
        # JPEG frame in data["image_jpeg"]
        pass
    elif data["type"] == "depth":
        # Point cloud in data["points"]
        pass

Why Open Source

The barrier to professional sensors shouldn’t be cost or locked-down software. Your phone already has this hardware. Making it accessible seemed worth doing.

Repository: github.com/jaskirat1616/arvos

The code’s there if you want to fork it or modify it. I tried to keep it readable.

GPL v3 license. Use it however you need.

What I Learned Building This

Memory management on iOS will bite you if you’re sloppy. ARFrames and large buffers need careful handling or iOS kills your app.

Threading matters for high-frequency sensors. Don’t block the main thread with IMU processing.

Sensors fail in the real world. Camera permissions get denied, GPS drops out. Handle these cases or your app becomes frustrating to use.

Using existing standards (WebSocket, JSON, PLY, MCAP) meant instant compatibility with tons of tools. Inventing proprietary formats is rarely worth it.

Write documentation while you remember why you made decisions. Six months later you won’t.

Current State

The project works. I’ve been using it for months. Memory issues are fixed, timestamp sync is solid, streaming is reliable.

Possible future additions:

  • Magnetometer and barometer
  • More protocol adapters
  • Multi-phone synchronization
  • On-device ML

But the core functionality is stable and usable now.

If hardware setup has been stopping you from testing ideas, this might help. The sensors are already there.

