Terraform Fundamentals: Config



This content originally appeared on DEV Community and was authored by DevOps Fundamental

Terraform Config: A Deep Dive into Dynamic Configuration Management

Infrastructure often depends on configuration data that should not be hardcoded in Terraform modules. The data may be environment-specific, application-specific, or simply too large and complex to manage comfortably in HCL. Traditionally, this led to elaborate templating, ad hoc external lookups, or brittle workarounds. The "Terraform Config" pattern described here uses the terraform_remote_state data source, optionally combined with the external data source, to manage this dynamic configuration in a way that fits modern IaC pipelines and platform engineering stacks. It helps keep infrastructure modules reusable and adaptable.

What is “Config” in Terraform Context?

“Config” isn’t a single Terraform resource, but rather a pattern built around the terraform_remote_state data source and, increasingly, the external data source. terraform_remote_state reads the state of another Terraform configuration and exposes that configuration’s root-module outputs, so they can be used as inputs to your current configuration. The external data source runs an external program and consumes the JSON object it prints to stdout as data within Terraform.

The core idea is to decouple configuration data from infrastructure definition. A dedicated “config” Terraform configuration manages the data (often stored in a remote backend like S3, Azure Blob Storage, or GCS), and other infrastructure configurations consume that data via terraform_remote_state.
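As a minimal sketch of this decoupling, the config configuration might publish a map as a root-module output, and a consumer might read it through the data source. The output name app_settings and its keys are hypothetical, and the bucket, key, and region reuse the placeholder values from the other examples in this article.

```hcl
# --- config/outputs.tf (producer; output name and keys are illustrative) ---
output "app_settings" {
  description = "Settings shared with downstream configurations"
  value = {
    api_url       = "https://api.example.com"
    replica_count = 2
  }
}

# --- infra/main.tf (consumer, a separate configuration) ---
data "terraform_remote_state" "config" {
  backend = "s3"
  config = {
    bucket = "my-terraform-config-bucket"
    key    = "config/terraform.tfstate"
    region = "us-east-1"
  }
}

locals {
  # Only root-module outputs of the producer are visible here.
  api_url       = data.terraform_remote_state.config.outputs.app_settings.api_url
  replica_count = data.terraform_remote_state.config.outputs.app_settings.replica_count
}
```

Publishing a structured value keeps the types intact on the consumer side, which avoids decoding strings later.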

Registry/Module References: While there isn’t a dedicated “Config” module in the Terraform Registry, many modules are designed to consume configuration data from a remote state. Look for modules that accept input variables expecting data from a terraform_remote_state data source.

Terraform-Specific Behavior: terraform_remote_state introduces a dependency on the remote state file. Terraform must successfully read the remote state before proceeding, so an incorrectly configured backend or missing access permissions will cause errors. The external data source introduces a dependency on the external program’s availability and execution time. The data source has no built-in timeout, so time limits and error handling (a non-zero exit code with a message on stderr) must be implemented in the program itself.
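The following sketch shows how the external data source might be wired up, assuming a hypothetical script get_app_config.sh that reads a JSON object on stdin and prints a JSON object of string values on stdout. The query key and the feature_flag field are illustrative assumptions.

```hcl
data "external" "app_config" {
  # Hypothetical script; it must exit non-zero and write to stderr on failure.
  program = ["/bin/bash", "${path.module}/get_app_config.sh"]

  # Passed to the program as a JSON object on stdin; values must be strings.
  query = {
    environment = "staging"
  }

  # One way to bound execution time, assuming GNU coreutils is installed on
  # the machine running Terraform, is to wrap the command instead:
  # program = ["timeout", "10", "/bin/bash", "${path.module}/get_app_config.sh"]
}

locals {
  # result is a map of strings, so convert explicitly where needed.
  feature_flag_enabled = data.external.app_config.result["feature_flag"] == "true"
}
```

Because the program runs on every plan, it should be fast and free of side effects.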

Use Cases and When to Use

  1. Environment-Specific Configuration: Managing different database connection strings, API keys, or feature flags across development, staging, and production environments. DevOps teams can update the config without modifying the core infrastructure code.
  2. Application-Specific Configuration: Providing application-level settings (e.g., memory allocation, number of replicas) to infrastructure modules. Application teams can control their environment without needing direct infrastructure access.
  3. Complex Data Structures: Handling large, nested JSON or YAML configuration files that are impractical to embed directly in Terraform. SREs can manage complex routing rules or security policies centrally.
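  4. Centralized Secrets Management: Integrating with secrets managers (HashiCorp Vault, AWS Secrets Manager, Azure Key Vault) and exposing secrets as inputs to infrastructure modules. Security teams can enforce consistent secrets management practices. Note that any value passed through outputs is stored in plain text in the state file, so access to the state backend must be restricted accordingly.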
  5. Dynamic Resource Sizing: Determining the size of compute instances or storage volumes based on external metrics or business logic. Platform engineers can automate scaling based on real-time demand.

Key Terraform Resources

  1. terraform_remote_state: Reads the state of another Terraform configuration.
data "terraform_remote_state" "config" {
  backend = "s3"
  config = {
    bucket = "my-terraform-config-bucket"
    key    = "config/terraform.tfstate"
    region = "us-east-1"
  }
}
  2. external: Executes an external program and captures its output.
data "external" "get_app_config" {
  program = ["/bin/bash", "${path.module}/get_app_config.sh"]
}
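  3. aws_s3_bucket: Stores the state file that terraform_remote_state reads on AWS.
resource "aws_s3_bucket" "config_bucket" {
  # New buckets are private by default; the inline acl argument is deprecated
  # in AWS provider v4 and later in favor of aws_s3_bucket_acl.
  bucket = "my-terraform-config-bucket"
}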
  4. azurerm_storage_account: Stores the state file that terraform_remote_state reads on Azure.
resource "azurerm_storage_account" "config_account" {
  name                     = "myterraformconfigsa"
  resource_group_name      = "my-resource-group"
  location                 = "eastus"
  account_kind             = "StorageV2"
  account_tier             = "Standard" # required argument
  account_replication_type = "LRS"      # required argument
}
  5. google_storage_bucket: Used as a backend for terraform_remote_state on GCP.
resource "google_storage_bucket" "config_bucket" {
  name                        = "my-terraform-config-bucket"
  location                    = "US"
  storage_class               = "STANDARD"
}
  6. aws_iam_policy: Controls access to the S3 bucket storing the remote state.
resource "aws_iam_policy" "config_bucket_policy" {
  name        = "config-bucket-policy"
  description = "Read-only policy for accessing the config bucket"
  policy      = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        # The S3 backend also lists the bucket when locating the state object.
        Action   = ["s3:ListBucket"]
        Effect   = "Allow"
        Resource = "arn:aws:s3:::my-terraform-config-bucket"
      },
      {
        Action   = ["s3:GetObject"]
        Effect   = "Allow"
        Resource = "arn:aws:s3:::my-terraform-config-bucket/config/*"
      }
    ]
  })
}
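  7. azurerm_role_assignment: Controls access to the Azure Storage Account.
resource "azurerm_role_assignment" "config_storage_account_reader" {
  # References the storage account resource defined above (not a data source).
  scope                = azurerm_storage_account.config_account.id
  # "Reader" covers only the management plane; reading blobs with Azure AD
  # authentication needs a data-plane role such as this one.
  role_definition_name = "Storage Blob Data Reader"
  principal_id         = "your-service-principal-id"
}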
  8. local_file: Can generate the script that the external data source runs.
resource "local_file" "app_config_script" {
  # The external data source requires every value in the JSON object to be a
  # string, so the flag is emitted as "true" rather than a bare boolean.
  content         = "echo '{\"api_key\": \"YOUR_API_KEY\", \"feature_flag\": \"true\"}'"
  filename        = "${path.module}/get_app_config.sh"
  file_permission = "0755"
}

Common Patterns & Modules

  • Remote Backend with State Locking: Always use a remote backend (S3, Azure Blob Storage, GCS) for terraform_remote_state to ensure state consistency and prevent concurrent modifications. Enable state locking to prevent race conditions.
  • Iteration and Dynamic Blocks: Use for_each or count on resources to create multiple instances from configuration data retrieved via terraform_remote_state, and use dynamic blocks to generate repeated nested blocks inside a single resource.
  • Monorepo Structure: A monorepo can house both the “config” Terraform configuration and the infrastructure configurations that consume it, simplifying dependency management.
  • Layered Approach: Separate configuration data into layers (e.g., base, environment-specific) to promote reusability and reduce duplication.
  • Env-Based Configuration: Organize configuration data by environment (dev, staging, prod) to ensure environment-specific settings are applied correctly.
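The layered and iteration patterns above can be sketched together. This assumes the config configuration publishes two hypothetical outputs, base and prod, each a flat map of settings; the SSM parameter naming scheme is also an illustrative assumption.

```hcl
data "terraform_remote_state" "config" {
  backend = "s3"
  config = {
    bucket = "my-terraform-config-bucket"
    key    = "config/terraform.tfstate"
    region = "us-east-1"
  }
}

locals {
  # Hypothetical outputs; later arguments to merge() win on key conflicts,
  # so environment-specific values override the base layer.
  base     = data.terraform_remote_state.config.outputs.base
  env      = data.terraform_remote_state.config.outputs.prod
  settings = merge(local.base, local.env)
}

# One parameter per merged setting.
resource "aws_ssm_parameter" "setting" {
  for_each = local.settings

  name  = "/app/${each.key}"
  type  = "String"
  value = tostring(each.value)
}
```

Keying for_each on the setting name means adding or removing one setting affects only that one resource instance.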

Hands-On Tutorial

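This example demonstrates reading a simple configuration from a remote S3 backend. It assumes the bucket my-terraform-config-bucket already exists and that AWS credentials are configured for both configurations.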

1. Config Configuration (config/main.tf):

terraform {
  backend "s3" {
    bucket = "my-terraform-config-bucket"
    key    = "config/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_s3_object" "app_config" {
  bucket = "my-terraform-config-bucket"
  key    = "config/app_config.json"
  content = jsonencode({
    "api_url" = "https://api.example.com"
    "replica_count" = 2
  })
}

output "app_config_json" {
  value = aws_s3_object.app_config.content
}

2. Infrastructure Configuration (infra/main.tf):

data "terraform_remote_state" "config" {
  backend = "s3"
  config = {
    bucket = "my-terraform-config-bucket"
    key    = "config/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_instance" "example" {
  ami           = "ami-0c55b2ab99196932a"
  instance_type = "t2.micro"

  tags = {
    Name = "Example Instance"
  }

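  # The remote output is a JSON string, so decode it before indexing.
  # templatefile() returns plain text, which user_data accepts directly.
  user_data = templatefile("${path.module}/user_data.tpl", {
    api_url       = jsondecode(data.terraform_remote_state.config.outputs.app_config_json)["api_url"]
    replica_count = jsondecode(data.terraform_remote_state.config.outputs.app_config_json)["replica_count"]
  })
}

# user_data.tpl (separate file; templatefile uses ${...} interpolation)
# ${api_url}
# ${replica_count}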

3. Apply & Destroy:

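# Apply the config configuration first so its state and outputs exist
cd config && terraform init && terraform apply
# Then plan and apply the infrastructure that reads it
cd ../infra && terraform init && terraform plan && terraform apply
# Tear down in reverse order when finished
terraform destroy
cd ../config && terraform destroy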

Once the config configuration has been applied, terraform plan in infra/ will show the instance being created with user data rendered from the API URL and replica count in the remote state.

Enterprise Considerations

Large organizations leverage Terraform Cloud/Enterprise for centralized state management, remote runs, and policy enforcement. Sentinel policies can be used to validate configuration data retrieved from terraform_remote_state, ensuring compliance with security and governance standards. IAM design must carefully control access to the remote state backend, using least privilege principles. State locking is critical in multi-user environments. Costs are associated with the remote backend storage and network egress. Multi-region deployments require careful consideration of data replication and latency.
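For teams on the S3 backend rather than Terraform Cloud, state locking could be sketched as follows. The table name terraform-locks is a hypothetical choice, and the table would normally be created in a separate bootstrap configuration, not alongside the backend block that uses it. Recent Terraform releases also offer S3-native lock files, so check the backend documentation for the version in use.

```hcl
# Backend of the config configuration, with encryption and locking enabled.
terraform {
  backend "s3" {
    bucket         = "my-terraform-config-bucket"
    key            = "config/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks" # hypothetical table name
  }
}

# Lock table (bootstrap configuration). The S3 backend expects a string
# partition key named LockID.
resource "aws_dynamodb_table" "terraform_locks" {
  name         = "terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```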

Security and Compliance

Enforce least privilege by granting only necessary permissions to access the remote state backend. Use IAM policies (e.g., aws_iam_policy) or role assignments (e.g., azurerm_role_assignment) to restrict access. Implement drift detection to identify unauthorized changes to the configuration data. Tagging policies can enforce consistent metadata labeling. Audit logs should be enabled to track access and modifications to the remote state.
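As one possible hardening baseline for an S3 state bucket, the sketch below enables versioning, default encryption, and a public access block. It assumes AWS provider v4 or later and reuses the placeholder bucket name from this article; the choice of aws:kms with the default key is an assumption, not a requirement.

```hcl
# Keep prior versions of the state so changes can be audited and rolled back.
resource "aws_s3_bucket_versioning" "config" {
  bucket = "my-terraform-config-bucket"

  versioning_configuration {
    status = "Enabled"
  }
}

# Encrypt state objects at rest by default.
resource "aws_s3_bucket_server_side_encryption_configuration" "config" {
  bucket = "my-terraform-config-bucket"

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}

# State files often contain sensitive values; block every form of public access.
resource "aws_s3_bucket_public_access_block" "config" {
  bucket                  = "my-terraform-config-bucket"
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```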

Integration with Other Services

graph LR
    A[Terraform Config] --> B["S3/Azure Blob/GCS"];
    A --> C["Secrets Manager (Vault/AWS Secrets Manager)"];
    A --> D["CI/CD Pipeline (GitHub Actions/GitLab CI)"];
    A --> E["Monitoring (Prometheus/CloudWatch)"];
    A --> F["Notification (Slack/PagerDuty)"];
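  1. Secrets Manager: Retrieve secrets from a secrets manager and expose them to consumers. Any secret published as an output is stored in plain text in the remote state, so prefer passing a reference (such as a secret name or ARN) and restrict state access tightly.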
  2. CI/CD Pipeline: Trigger a Terraform apply after configuration data is updated in the remote state.
  3. Monitoring: Monitor the health of the remote state backend and alert on any issues.
  4. Notification: Send notifications when configuration data is updated.
  5. Database: Use the external data source to query a database for configuration data.
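For the secrets manager integration, one alternative is to have each consumer read the secret directly instead of routing the value through the shared config state. The sketch below assumes AWS Secrets Manager and a hypothetical secret name app/api-key.

```hcl
# Read the current version of a secret by name (hypothetical identifier).
data "aws_secretsmanager_secret_version" "api_key" {
  secret_id = "app/api-key"
}

locals {
  # The value is still recorded in this configuration's own state, so that
  # state needs the same access restrictions as any other secret store.
  api_key = data.aws_secretsmanager_secret_version.api_key.secret_string
}
```

This keeps the secret out of the shared config state, which narrows the set of principals able to read it.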

Module Design Best Practices

  • Abstraction: Encapsulate the terraform_remote_state data source within a module to hide the implementation details.
  • Input/Output Variables: Define clear input variables for the remote state backend configuration and output variables for the configuration data.
  • Locals: Use locals to simplify complex expressions and improve readability.
  • Backends: Support multiple remote backend types (S3, Azure Blob Storage, GCS) through conditional logic.
  • Documentation: Provide comprehensive documentation for the module, including examples and usage instructions.
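A wrapper module following these practices might look like the sketch below. The variable names and the api_url output are hypothetical, and only the S3 backend is shown.

```hcl
# modules/shared-config/main.tf (hypothetical module layout)

variable "state_bucket" {
  type        = string
  description = "S3 bucket holding the config state"
}

variable "state_key" {
  type        = string
  description = "Object key of the config state file"
}

variable "state_region" {
  type        = string
  description = "AWS region of the state bucket"
}

data "terraform_remote_state" "config" {
  backend = "s3"
  config = {
    bucket = var.state_bucket
    key    = var.state_key
    region = var.state_region
  }
}

locals {
  # Single place to reshape or rename values before exposing them.
  raw = data.terraform_remote_state.config.outputs
}

output "api_url" {
  description = "API endpoint published by the config configuration"
  value       = local.raw.api_url # assumes the producer defines this output
}
```

Callers then depend on the module's outputs and do not need to know where the state lives.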

CI/CD Automation

# .github/workflows/deploy.yml

name: Deploy Infrastructure

on:
  push:
    branches:
      - main

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
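      - uses: actions/checkout@v3
      - uses: hashicorp/setup-terraform@v2
      # Cloud credentials for the backend and provider must be supplied to the
      # job (for example through repository secrets); they are omitted here.
      - run: terraform fmt -check
      - run: terraform init
      - run: terraform validate
      - run: terraform plan -out=tfplan
      - run: terraform apply tfplan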

Pitfalls & Troubleshooting

  1. Incorrect Backend Configuration: Double-check the bucket name, key, and region in the terraform_remote_state block.
  2. Access Permissions: Ensure the Terraform service account has the necessary permissions to access the remote state backend.
  3. State Locking Conflicts: Ensure only one Terraform process modifies a given state at a time. If a crashed run leaves a lock behind, release it with terraform force-unlock after confirming that no other run is active.
  4. Data Type Mismatches: Verify that the data types of the configuration data retrieved from terraform_remote_state match the expected input types of the infrastructure resources.
  5. External Program Errors: Check the logs of the external program executed by the external data source for errors.
  6. Remote State Corruption: Enable object versioning on the state bucket and back up the remote state regularly so that a corrupted or deleted state can be restored.
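Data type mismatches in particular can be reduced by normalizing values where they enter the consuming configuration. In this sketch the output name replica_count and its fallback value of 1 are assumptions made for illustration.

```hcl
data "terraform_remote_state" "config" {
  backend = "s3"
  config = {
    bucket = "my-terraform-config-bucket"
    key    = "config/terraform.tfstate"
    region = "us-east-1"
  }
}

locals {
  outputs = data.terraform_remote_state.config.outputs

  # try() falls back to 1 if the output is missing; tonumber() accepts both
  # numbers and numeric strings and fails the plan on anything else.
  replica_count = tonumber(try(local.outputs.replica_count, 1))
}
```

Failing at plan time with a conversion error is usually easier to diagnose than a provider error raised during apply.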

Pros and Cons

Pros:

  • Decoupling: Separates configuration data from infrastructure definition.
  • Reusability: Enables reusable infrastructure modules that can be adapted to different environments.
  • Scalability: Supports large and complex configuration datasets.
  • Centralized Management: Provides a central location for managing configuration data.

Cons:

  • Complexity: Introduces additional complexity to the infrastructure deployment process.
  • Dependencies: Creates dependencies on the remote state backend and external programs.
  • Latency: Retrieving configuration data from a remote backend can introduce latency.
  • Security: Requires careful attention to security to protect the remote state backend.

Conclusion

Terraform Config, built around terraform_remote_state and external data sources, is a powerful technique for managing dynamic configuration data in complex infrastructure environments. It enables decoupling, reusability, and scalability, making it a critical component of modern IaC pipelines and platform engineering stacks. Engineers should prioritize implementing this pattern in their next Proof-of-Concept, evaluate existing modules that leverage remote state, and establish a robust CI/CD pipeline to automate configuration updates.

