Part-19: GCE Ops Agent: Logging & Monitoring in Google Cloud Platform (GCP)



This content originally appeared on DEV Community and was authored by Latchu@DevOps

When running workloads on Google Compute Engine (GCE), monitoring and logging are critical to keeping your systems healthy and your applications reliable. Google now recommends using the Ops Agent — a modern, unified solution for collecting logs, metrics, and traces from your VMs.

Let’s break it down. 👇

Why Ops Agent?

Google still ships its legacy Logging agent (google-fluentd) and Monitoring agent (stackdriver-collectd), but they come with drawbacks:

  • ❌ No new feature development
  • ❌ No support for newer OS versions
  • ⚠ Maintenance-only mode

That’s why Ops Agent is the recommended choice for all new workloads. If you’re still running the old agents, it’s time to migrate.

What is Ops Agent?

Ops Agent is a single agent that runs on Compute Engine VMs to:

  • 📜 Collect logs → send to Cloud Logging
  • 📊 Collect metrics → send to Cloud Monitoring
  • 🔎 Collect traces → send to Cloud Trace
  • 🛠 Uses Fluent Bit for logs
  • 🛠 Uses OpenTelemetry Collector for metrics & traces

It’s designed for both Linux and Windows VMs, with flexible installation options.


Key Features

🔧 Installation & Management

You can deploy Ops Agent in multiple ways:

  • Auto-install during VM creation
  • Fleet installation using gcloud or automation tools like Ansible, Chef, Puppet, Terraform
  • Agent policies that install and maintain the agent across many VMs, managed through the gcloud CLI
  • Manual install on individual VMs

📝 YAML-based Configuration

  • Simple and flexible config files
  • Easy customization for log collection, parsing, and filtering
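To make the YAML structure concrete, here is a minimal Python sketch that renders a logging config for tailing a custom log file. The receiver name `myapp` and the path `/var/log/myapp/*.log` are made-up placeholders, and the layout (a `files` receiver wired into a pipeline under `logging.service`) follows the documented config shape as I understand it. Check it against the Ops Agent docs for your agent version before relying on it.

```python
from textwrap import dedent


def render_ops_agent_config(receiver_id: str, include_path: str) -> str:
    """Return Ops Agent YAML that tails files matching include_path.

    receiver_id and include_path are illustrative inputs, not agent defaults.
    """
    return dedent(f"""\
        logging:
          receivers:
            {receiver_id}:
              type: files
              include_paths:
                - {include_path}
          service:
            pipelines:
              {receiver_id}_pipeline:
                receivers: [{receiver_id}]
        """)


if __name__ == "__main__":
    # Placeholder values; replace with your own application's log path.
    print(render_ops_agent_config("myapp", "/var/log/myapp/*.log"))
```

On Linux the agent typically reads user config from /etc/google-cloud-ops-agent/config.yaml, and changes take effect after restarting the google-cloud-ops-agent service.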

Logging Features

🚀 Better performance than the legacy logging agent

📂 Collects logs from:

  • System logs (/var/log/syslog, /var/log/messages)
  • File-based logs (customizable paths)
  • TCP protocol streams
  • Forward protocol (Fluent Bit/Fluentd)
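As a rough illustration of the TCP option, the sketch below sends one JSON log record from an application to a TCP listener on the same VM. It assumes you have configured a `tcp` receiver with JSON format in the agent, listening on 127.0.0.1:5170. That host and port are assumptions here, so match them to your own config. The function names are hypothetical helpers, not part of any Google library.

```python
import json
import socket


def encode_record(record: dict) -> bytes:
    """Serialize one log record as a single newline-terminated JSON line."""
    return (json.dumps(record, separators=(",", ":")) + "\n").encode("utf-8")


def send_record(record: dict, host: str = "127.0.0.1", port: int = 5170) -> None:
    """Send one record to a TCP listener.

    host and port are assumed values; use whatever your receiver listens on.
    """
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(encode_record(record))
```

Calling `send_record({"severity": "INFO", "message": "order created"})` would push a single structured entry, provided something is listening on that port.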

🛠 Flexible processing:

  • Parse unstructured logs into structured JSON
  • Regex-based parsing
  • Exclude logs by matching fields or labels against regular expressions
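The idea behind regex parsing and exclusion can be sketched in plain Python. This only imitates what the agent's processors do conceptually and is not the agent's implementation. The log line format (timestamp, severity, message) and the "healthcheck" noise pattern are invented for the example. Note that Python writes named groups as `(?P<name>...)`, while Ops Agent configs use the `(?<name>...)` form.

```python
import re

# Hypothetical line format: "<time> <SEVERITY> <message>".
LINE_RE = re.compile(r"^(?P<time>\S+) (?P<severity>[A-Z]+) (?P<message>.*)$")

# Hypothetical noise pattern to drop before shipping.
NOISE_RE = re.compile(r"healthcheck")


def parse_line(line: str) -> dict:
    """Turn an unstructured line into a structured record.

    Lines that do not match are kept as a bare message.
    """
    match = LINE_RE.match(line)
    if match is None:
        return {"message": line}
    return match.groupdict()


def keep(record: dict) -> bool:
    """Return False for records whose message matches the noise pattern."""
    return NOISE_RE.search(record.get("message", "")) is None


def process(lines):
    """Parse every line, then drop the excluded records."""
    records = (parse_line(line.rstrip("\n")) for line in lines)
    return [record for record in records if keep(record)]
```

Parsing first and filtering second means the exclusion rule can target a specific field, such as the message, instead of the raw line.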

🔌 Third-party app support: Apache Kafka, Nginx, Hadoop, MongoDB, MySQL, Redis, Oracle DB, SAP HANA, and more.

The full list is in the Ops Agent documentation on third-party application integrations.

Monitoring Features

📊 System metrics out of the box:

  • CPU, disk, memory, processes, networking, swap
  • GPU (Linux)
  • IIS, MSSQL, Pagefile (Windows)

🔌 Third-party app integrations (Kafka, Nginx, MariaDB, MongoDB, Redis, WildFly, etc.)

📡 Prometheus metrics collection for apps running on Compute Engine
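For Prometheus collection to work, the app has to expose metrics in the Prometheus text format on an HTTP endpoint the agent can scrape. Below is a minimal standard-library sketch of such an endpoint. The metric name `app_requests_total`, the port 9100, and the helper names are all made up for illustration, and a real service would normally use a Prometheus client library instead.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical metric registry: name -> (type, help text, current value).
METRICS = {"app_requests_total": ("counter", "Requests handled.", 0)}


def render_prometheus(metrics: dict) -> str:
    """Render metrics in the Prometheus text exposition format."""
    lines = []
    for name, (mtype, help_text, value) in sorted(metrics.items()):
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} {mtype}")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serve the current metrics on GET /metrics and 404 elsewhere."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_prometheus(METRICS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 9100) -> None:
    """Block forever serving metrics on localhost; the port is an assumption."""
    HTTPServer(("127.0.0.1", port), MetricsHandler).serve_forever()
```

On the agent side, you would then add a Prometheus receiver under the metrics section of the config with a scrape target pointing at this endpoint.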

🎮 NVIDIA GPU monitoring with DCGM integration

Final Thoughts

If you’re running workloads on GCE, adopting Ops Agent is a no-brainer:

✅ One agent for both logs & metrics
✅ Actively developed and future-proof
✅ Better performance & third-party support
✅ Flexible deployment at scale

Google has made it clear: transition your workloads to Ops Agent now and unlock better observability for your infrastructure.

👉 Have you already migrated from the legacy agents? What was your experience with Ops Agent so far?

