AWS Cost Optimization: Identify Stale Resources Using a Lambda Function



This content originally appeared on DEV Community and was authored by Abhishek Korde

What is a Lambda function in AWS?
AWS Lambda is a serverless compute service that allows you to run code without provisioning or managing servers. It executes code in response to events and automatically manages the computing resources. You upload your code as a Lambda function and it runs only when triggered by an event, scaling automatically.
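
As a minimal sketch of what such a function looks like in Python: Lambda calls a handler with the triggering event and a context object. The returned fields below are illustrative assumptions and are not part of the original walkthrough.

```python
import json


def lambda_handler(event, context):
    # 'event' carries the trigger payload; 'context' carries runtime metadata.
    # This sketch only echoes the event back and does not use 'context'.
    return {
        "statusCode": 200,
        "body": json.dumps({"received": event}),
    }
```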

Problem Statement: When a user creates an EC2 instance, an associated EBS volume is also created. The user typically takes snapshots of this volume for backup purposes. After a few months, the user deletes the EC2 instance and the volume but forgets to delete the associated snapshots. The unused snapshots continue to incur storage costs, and the bill keeps growing over time. As a DevOps engineer, it is essential to address this by identifying and cleaning up unused snapshots to reduce AWS costs.

Solution: To solve this problem, we use a Lambda function that fetches all EBS snapshots owned by the account and also retrieves a list of active (running) EC2 instances. For each snapshot, it checks whether the associated volume (if it exists) is attached to any active instance. If it is not, the snapshot is stale and the function deletes it, reducing storage costs.

  1. Fetch all the EBS snapshots.
  2. Identify the snapshots that are stale.
  3. Delete the stale snapshots.
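
The decision in step 2 can be sketched as a pure function with no AWS calls, which makes the rule easy to check in isolation. The function name, argument shapes, and sample IDs are hypothetical and chosen only for illustration.

```python
def find_stale_snapshots(snapshots, volume_attachments, active_instance_ids):
    """Return IDs of snapshots whose volume is gone or not used by an active instance.

    snapshots: list of dicts with 'SnapshotId' and optionally 'VolumeId'
    volume_attachments: dict mapping each existing volume ID to the instance IDs it is attached to
    active_instance_ids: set of running instance IDs
    """
    stale = []
    for snapshot in snapshots:
        volume_id = snapshot.get("VolumeId")
        if not volume_id or volume_id not in volume_attachments:
            # No source volume recorded, or the volume no longer exists
            stale.append(snapshot["SnapshotId"])
        elif not set(volume_attachments[volume_id]) & active_instance_ids:
            # The volume exists but no running instance uses it
            stale.append(snapshot["SnapshotId"])
    return stale


# Hypothetical sample data
snapshots = [
    {"SnapshotId": "snap-a", "VolumeId": "vol-1"},  # volume attached to a running instance
    {"SnapshotId": "snap-b", "VolumeId": "vol-2"},  # volume has been deleted
    {"SnapshotId": "snap-c", "VolumeId": "vol-3"},  # volume exists but is detached
]
volume_attachments = {"vol-1": ["i-123"], "vol-3": []}
print(find_stale_snapshots(snapshots, volume_attachments, {"i-123"}))
# prints ['snap-b', 'snap-c']
```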

The first step is to create an EC2 instance; a volume is created along with it.
After the EC2 instance has been created successfully, we have to manually create a snapshot of the volume.
A snapshot is a point-in-time copy of your volume.
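The walkthrough creates the snapshot in the console. For readers who prefer code, the same step can be sketched with boto3's `create_snapshot` call. The helper name, the default description, and the volume ID in the usage comment are assumptions for illustration.

```python
def create_volume_snapshot(ec2, volume_id, description="Manual backup"):
    # 'ec2' is expected to be a boto3 EC2 client, e.g. boto3.client('ec2').
    # The helper name and default description are illustrative assumptions.
    response = ec2.create_snapshot(VolumeId=volume_id, Description=description)
    return response["SnapshotId"]


# Example usage (needs AWS credentials; the volume ID is a placeholder):
# import boto3
# snapshot_id = create_volume_snapshot(boto3.client("ec2"), "vol-0123456789abcdef0")
```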
Next, we create the Lambda function.
After creating the Lambda function, go to the Code section and enter the code below.

import boto3

def lambda_handler(event, context):
    ec2 = boto3.client('ec2')

    # Get all EBS snapshots
    response = ec2.describe_snapshots(OwnerIds=['self'])

    # Get all active EC2 instance IDs
    instances_response = ec2.describe_instances(Filters=[{'Name': 'instance-state-name', 'Values': ['running']}])
    active_instance_ids = set()

    for reservation in instances_response['Reservations']:
        for instance in reservation['Instances']:
            active_instance_ids.add(instance['InstanceId'])

    # Iterate through each snapshot and delete if it's not attached to any volume or the volume is not attached to a running instance
    for snapshot in response['Snapshots']:
        snapshot_id = snapshot['SnapshotId']
        volume_id = snapshot.get('VolumeId')

        if not volume_id:
            # Delete the snapshot if it's not attached to any volume
            ec2.delete_snapshot(SnapshotId=snapshot_id)
            print(f"Deleted EBS snapshot {snapshot_id} as it was not attached to any volume.")
        else:
            # Check whether the volume still exists and is attached to a running instance
            try:
                volume_response = ec2.describe_volumes(VolumeIds=[volume_id])
                attachments = volume_response['Volumes'][0]['Attachments']
                attached_instance_ids = {a['InstanceId'] for a in attachments}
                if not attached_instance_ids & active_instance_ids:
                    ec2.delete_snapshot(SnapshotId=snapshot_id)
                    print(f"Deleted EBS snapshot {snapshot_id} as it was taken from a volume not attached to any running instance.")
            except ec2.exceptions.ClientError as e:
                if e.response['Error']['Code'] == 'InvalidVolume.NotFound':
                    # The volume associated with the snapshot is not found (it might have been deleted)
                    ec2.delete_snapshot(SnapshotId=snapshot_id)
                    print(f"Deleted EBS snapshot {snapshot_id} as its associated volume was not found.")
                else:
                    # Surface unexpected errors (for example, missing permissions) instead of hiding them
                    raise
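
The function above reads snapshots from a single `describe_snapshots` response. Accounts with many snapshots may benefit from paginating that call. The sketch below uses boto3's paginator for that; the helper name is an assumption and is not part of the original function.

```python
def list_owned_snapshots(ec2):
    # 'ec2' is expected to be a boto3 EC2 client, e.g. boto3.client('ec2').
    # Assumption: this helper is illustrative and not part of the original code.
    snapshots = []
    paginator = ec2.get_paginator("describe_snapshots")
    for page in paginator.paginate(OwnerIds=["self"]):
        # Each page is a describe_snapshots response with its own 'Snapshots' list
        snapshots.extend(page["Snapshots"])
    return snapshots
```

The returned list could then stand in for `response['Snapshots']` in the loop above.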

After writing the code above, click Deploy, then configure a test event.
Next, click Test. The first run will fail.
It fails because the default Lambda execution timeout is 3 seconds and because the function's role lacks the required permissions.
We will solve these issues one by one.
First, go to the Configuration tab and raise the Lambda timeout to 10 seconds.
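The timeout change is made in the console here, but it can also be sketched with boto3's `update_function_configuration` call. The helper name and the function name in the usage comment are placeholders, not values from the original post.

```python
def set_lambda_timeout(lambda_client, function_name, timeout_seconds=10):
    # 'lambda_client' is expected to be a boto3 Lambda client, e.g. boto3.client('lambda').
    # The helper name is an illustrative assumption.
    response = lambda_client.update_function_configuration(
        FunctionName=function_name,
        Timeout=timeout_seconds,
    )
    # The response echoes the updated configuration, including the new timeout
    return response["Timeout"]


# Example usage (needs AWS credentials; the function name is a placeholder):
# import boto3
# set_lambda_timeout(boto3.client("lambda"), "my-snapshot-cleanup-function")
```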
Second, open the IAM role attached to the function and attach a policy that allows it to describe snapshots, instances, and volumes and to delete snapshots.
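The exact policy from the original post is not reproduced here. As a hedged sketch, the policy document below grants only the four EC2 actions that the function's code calls. The variable name is an assumption, and the policy is inferred from the code rather than copied from the author.

```python
import json

# Assumption: inferred from the boto3 calls in the function, not the author's exact policy
snapshot_cleanup_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeSnapshots",
                "ec2:DescribeInstances",
                "ec2:DescribeVolumes",
                "ec2:DeleteSnapshot",
            ],
            "Resource": "*",
        }
    ],
}

# Print the JSON that could be pasted into the IAM policy editor
print(json.dumps(snapshot_cleanup_policy, indent=2))
```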
After attaching the policy, come back and execute the Lambda function again.
This time the snapshot is not deleted, because its volume still exists and is attached to the EC2 instance.
Now manually delete (terminate) the EC2 instance, which deletes its volume as well.
Then execute the Lambda function again.
The output shows: Deleted EBS snapshot snap-02c1afa1d20b2c81b as its associated volume was not found.
As you can see, once the EC2 instance is deleted, the snapshot is deleted as well.

Summary: Using a Lambda function to delete stale snapshots, we can reduce an organization's AWS costs.

