How to auto-deploy Puppeteer in AWS Lambda using GitHub Actions



This content originally appeared on DEV Community and was authored by Ivan Muñoz

In this article, we will explore how to set up a Puppeteer project that can be automatically deployed to AWS Lambda using GitHub Actions.

The article comes with a repository that serves as a base template for deploying a Puppeteer project to AWS Lambda using GitHub Actions. The template turns your Puppeteer project into an API that can be invoked over HTTP, so you can run Puppeteer scripts in a serverless environment.

You can find the complete code in the GitHub repository.

Table of Contents

  • Development
    1. Cloning the repository
    2. Installing dependencies
    3. Creating the .env file
    4. Running the development server
    5. Testing the API
  • Deployment
    1. Creating the Lambda function
    2. Creating the S3 bucket
    3. Creating an IAM user and access keys
    4. Configuring your Lambda function
    5. Configuring the GitHub Actions workflow
    6. Testing the deployment
  • Troubleshooting
  • Conclusion
  • Additional Resources

Development

1. Cloning the repository

Clone the repository to your local machine:

git clone https://github.com/ivanalemunioz/puppeteer-lambda-auto-deploy.git
cd puppeteer-lambda-auto-deploy

2. Installing dependencies

Install the required dependencies using npm:

npm install

3. Creating the .env file

Create a .env file in the root of the project based on the .env.example file and fill in the required values.
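
As a minimal sketch, this step could be done from the shell. The only thing taken from the project is the .env.example file name; init_env is a hypothetical helper that is not part of the repository, and it deliberately refuses to overwrite an existing .env.

```shell
# init_env: copy .env.example to .env unless .env already exists.
# Hypothetical helper for illustration, not part of the repository.
init_env() {
  if [ -e .env ]; then
    echo ".env already exists, leaving it untouched"
    return 0
  fi
  if [ ! -f .env.example ]; then
    echo ".env.example not found; run this from the project root" >&2
    return 1
  fi
  cp .env.example .env
  echo "Created .env; open it and fill in the required values"
}

# Usage (from the project root):
#   init_env
```

The values themselves still have to be filled in by hand, following the comments in .env.example.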

4. Running the development server

Run the development server:

npm run dev

This command starts the server on http://localhost:5123 and watches the code for changes.

5. Testing the API

You can test the API by sending a POST request to http://localhost:5123/v1/run using a tool like Postman or curl. You should see the Puppeteer script running and returning a response.

curl --location 'http://localhost:5123/v1/run' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer YOUR_GENERATED_AUTH_TOKEN' \
--data '{
    "action": "scrape-pptr-docs",
    "params": {}
}'
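
If you call the endpoint often, the request above can be wrapped in a small helper that also prints the HTTP status code, which makes authorization failures easier to spot. This is only a sketch: build_payload and call_run are hypothetical names, and BASE_URL and AUTH_TOKEN are assumed environment variables rather than anything the project defines.

```shell
# build_payload ACTION: print the JSON body for the given action.
# "params" is left empty, matching the request shown above.
build_payload() {
  printf '{"action":"%s","params":{}}' "$1"
}

# call_run ACTION: POST the action to /v1/run and print the body,
# followed by the HTTP status code.
# BASE_URL defaults to the local development server.
call_run() {
  curl --silent --show-error \
    --request POST "${BASE_URL:-http://localhost:5123}/v1/run" \
    --header 'Content-Type: application/json' \
    --header "Authorization: Bearer ${AUTH_TOKEN}" \
    --data "$(build_payload "$1")" \
    --write-out '\nHTTP status: %{http_code}\n'
}

# Usage:
#   AUTH_TOKEN=YOUR_GENERATED_AUTH_TOKEN call_run scrape-pptr-docs
```

Setting BASE_URL to the deployed function URL lets the same helper be reused after deployment.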

Deployment

You can jump to step 4 if you already have a Lambda function, an S3 bucket, and an IAM user with access keys set up.

1. Creating the Lambda function

  1. Go to the AWS Lambda console.
  2. Click on “Create function”.
  3. Choose “Author from scratch”.
  4. Enter a name for your function, e.g., puppeteer-lambda-auto-deploy.
  5. Select Node.js 22.x as the runtime.
  6. Select x86_64 as the architecture. This setup does not work on arm64.
  7. Under “Additional configurations”, enable “Function URL” to allow HTTP access to your Lambda function.
    • For “Auth type”, select NONE.
    • For “Invoke mode”, select “BUFFERED (default)”.
  8. Click on “Create function”.

2. Creating the S3 bucket

  1. Go to the S3 console.
  2. Click on “Create bucket”.
  3. For the bucket type, select “General purpose”.
  4. Enter a unique name for your bucket, e.g., puppeteer-lambda-auto-deploy-bucket.
  5. Leave the rest of the settings at their defaults.
  6. Click on “Create bucket”.
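
For readers who prefer the command line, the same bucket could be created with the AWS CLI. The sketch below assumes the AWS CLI is installed and already authenticated; create_bucket is a hypothetical helper name. It handles one detail the console hides: outside us-east-1, the region has to be passed as a LocationConstraint.

```shell
# create_bucket BUCKET REGION: create a general purpose bucket with
# default settings. Hypothetical helper; assumes a configured AWS CLI.
create_bucket() {
  if [ "$2" = "us-east-1" ]; then
    # us-east-1 is the default location and takes no LocationConstraint
    aws s3api create-bucket --bucket "$1" --region "$2"
  else
    aws s3api create-bucket --bucket "$1" --region "$2" \
      --create-bucket-configuration "LocationConstraint=$2"
  fi
}

# Usage:
#   create_bucket puppeteer-lambda-auto-deploy-bucket us-east-1
```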

3. Creating an IAM user and access keys

  1. Go to the IAM console.
  2. Click on “Users” in the left sidebar.
  3. Click on “Create user”.
  4. Enter a username, e.g., puppeteer-lambda-auto-deploy-user.
  5. Click on “Next”.
  6. Click on “Attach policies directly” and select the following policies:
    • AWSLambda_FullAccess
    • AmazonS3FullAccess
  7. Click on “Create user”.
  8. Click on the user you just created.
  9. Under the “Security credentials” tab, click on “Create access key”.
  10. For the use case, select “Other”.
  11. Click on “Next”.
  12. Click on “Create access key”.
  13. Copy the Access key ID and Secret access key and store them securely.
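
The console steps above map onto a handful of AWS CLI calls. This is a sketch under the assumption that you run it with credentials that are allowed to manage IAM; create_deploy_user is a hypothetical helper name, while the two policy ARNs are the AWS managed policies named in the list.

```shell
# create_deploy_user USER_NAME: create the user, attach both managed
# policies, then create an access key.
# The last command prints the AccessKeyId and SecretAccessKey once,
# so copy them and store them securely.
create_deploy_user() {
  aws iam create-user --user-name "$1" &&
  aws iam attach-user-policy --user-name "$1" \
    --policy-arn arn:aws:iam::aws:policy/AWSLambda_FullAccess &&
  aws iam attach-user-policy --user-name "$1" \
    --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess &&
  aws iam create-access-key --user-name "$1"
}

# Usage:
#   create_deploy_user puppeteer-lambda-auto-deploy-user
```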

4. Configuring your Lambda function

  1. Go to the AWS Lambda console.
  2. Click on the function you created earlier.
  3. Under the “Configuration” tab, click on “Environment variables”.
  4. Click “Edit” and add the BUGLESSTACK_ACCESS_TOKEN and BROWSER_AUTOMATIONS_ACCESS_TOKEN variables. The .env.example file explains how to generate these tokens.
  5. Under the “Configuration” tab, click on “General configuration” and “Edit”.
  6. Increase the timeout to 2 minutes and memory to 1024MB (recommended) and click on “Save”.
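
The same configuration could be applied in one AWS CLI call. The sketch below assumes a configured AWS CLI; configure_function is a hypothetical helper name, and the timeout is given in seconds (120 for the 2 minutes recommended above). Note that --environment replaces the function's whole set of environment variables, so any variable not listed in the call is removed.

```shell
# configure_function NAME BUGLESSTACK_TOKEN AUTOMATIONS_TOKEN:
# set timeout (seconds), memory (MB) and both environment variables.
# Hypothetical helper; token values containing commas or quotes would
# need the JSON form of --environment instead of this shorthand.
configure_function() {
  aws lambda update-function-configuration \
    --function-name "$1" \
    --timeout 120 \
    --memory-size 1024 \
    --environment "Variables={BUGLESSTACK_ACCESS_TOKEN=$2,BROWSER_AUTOMATIONS_ACCESS_TOKEN=$3}"
}

# Usage:
#   configure_function puppeteer-lambda-auto-deploy "$BUGLESSTACK_ACCESS_TOKEN" "$BROWSER_AUTOMATIONS_ACCESS_TOKEN"
```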

5. Configuring the GitHub Actions workflow

  1. Go to your GitHub repository.
  2. Click on “Settings” in the top menu.
  3. In the left sidebar, click on “Secrets and variables” and then “Actions”.
  4. Add the following secrets:
    • AWS_ACCESS_KEY_ID: Your AWS access key ID generated in step 3.
    • AWS_SECRET_ACCESS_KEY: Your AWS secret access key generated in step 3.
    • AWS_REGION: The AWS region where your Lambda function is deployed (e.g., us-east-1).
  5. In the same section, click on “Variables” and add the following variables:
    • S3_BUCKET: The name of the S3 bucket where the Lambda function package will be uploaded.
    • S3_KEY: The key (path) in the S3 bucket where the Lambda function package will be stored (e.g., lambda/puppeteer.zip).
    • S3_LAYER_BUCKET: The name of the S3 bucket where the Lambda layer package will be uploaded.
    • S3_LAYER_KEY: The key (path) in the S3 bucket where the Lambda layer package will be stored (e.g., layers/puppeteer.zip).
    • LAYER_NAME: The name of the Lambda layer to be created or updated (e.g., puppeteer-lambda-auto-deploy-layer).
    • LAMBDA_FUNCTION_NAME: The name of the Lambda function to be updated (e.g., puppeteer-lambda-auto-deploy).
  6. In “Settings > Actions > General”, ensure that “Allow all actions and reusable workflows” is selected under “Actions permissions”.
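
The secrets and variables listed above can also be set with the GitHub CLI instead of the web UI. This is a sketch that assumes gh is installed, authenticated, and run from inside a clone of your repository; set_repo_config is a hypothetical helper name, and it reads every value from an environment variable of the same name.

```shell
# set_repo_config: push the three secrets and six variables to the
# current repository. Hypothetical helper; values come from identically
# named environment variables so they stay out of the command line.
set_repo_config() {
  gh secret set AWS_ACCESS_KEY_ID --body "$AWS_ACCESS_KEY_ID" &&
  gh secret set AWS_SECRET_ACCESS_KEY --body "$AWS_SECRET_ACCESS_KEY" &&
  gh secret set AWS_REGION --body "$AWS_REGION" &&
  gh variable set S3_BUCKET --body "$S3_BUCKET" &&
  gh variable set S3_KEY --body "$S3_KEY" &&
  gh variable set S3_LAYER_BUCKET --body "$S3_LAYER_BUCKET" &&
  gh variable set S3_LAYER_KEY --body "$S3_LAYER_KEY" &&
  gh variable set LAYER_NAME --body "$LAYER_NAME" &&
  gh variable set LAMBDA_FUNCTION_NAME --body "$LAMBDA_FUNCTION_NAME"
}

# Usage (after exporting the nine values in your shell):
#   set_repo_config
```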

6. Testing the deployment

  1. Push your changes to the main branch of your GitHub repository.
  2. The GitHub Actions workflow will automatically build and deploy your Puppeteer project to AWS Lambda.
  3. Once the workflow is complete, you can test the API by sending a POST request to https://YOUR_LAMBDA_URL/v1/run (you can get your Lambda URL in the Lambda details) using a tool like Postman or curl. You should see the Puppeteer script running and returning a response.

    curl --location 'https://YOUR_LAMBDA_URL/v1/run' \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer YOUR_GENERATED_AUTH_TOKEN' \
    --data '{
        "action": "scrape-pptr-docs",
        "params": {}
    }'
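
If you would rather not look up the function URL in the console, the AWS CLI can print it. The sketch below assumes a configured AWS CLI and a function that has a function URL enabled; function_url is a hypothetical helper name. The returned URL ends with a slash, so the helper strips it before a path is appended.

```shell
# function_url NAME: print the function URL without its trailing slash.
# Hypothetical helper; assumes a configured AWS CLI.
function_url() {
  url="$(aws lambda get-function-url-config \
    --function-name "$1" \
    --query FunctionUrl --output text)" || return 1
  printf '%s\n' "${url%/}"
}

# Usage:
#   curl --location "$(function_url puppeteer-lambda-auto-deploy)/v1/run" \
#     --header 'Content-Type: application/json' \
#     --header 'Authorization: Bearer YOUR_GENERATED_AUTH_TOKEN' \
#     --data '{"action": "scrape-pptr-docs", "params": {}}'
```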
    

Troubleshooting

  • If you encounter issues with Puppeteer not launching or crashing, ensure that the Lambda function has sufficient memory allocated (at least 1024 MB is recommended).
  • Set the Lambda function timeout high enough for Puppeteer to finish (the default of 3 seconds is too short; 2 minutes is the value recommended in the configuration step).
  • Ensure your Lambda architecture is set to x86_64; this setup does not work on arm64.
  • Ensure your Lambda function and S3 bucket are in the same region.
  • Check the permissions of the IAM user to ensure it has access to S3 and Lambda.
  • Check the AWS Lambda logs in CloudWatch Logs for any errors or issues during execution.
  • If you encounter issues with the GitHub Actions workflow, check the workflow logs for any errors or issues during the build and deployment process.
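
Several of these checks can be run from a terminal. The sketch below assumes AWS CLI v2 (for aws logs tail) and that the function writes to the default log group /aws/lambda/<function name>; show_config and tail_logs are hypothetical helper names.

```shell
# show_config NAME: print the memory, timeout and architecture settings
# that the checks above refer to.
show_config() {
  aws lambda get-function-configuration --function-name "$1" \
    --query '{MemoryMB: MemorySize, TimeoutSeconds: Timeout, Architectures: Architectures}'
}

# tail_logs NAME: stream the last 10 minutes of CloudWatch logs and
# keep following new entries (stop with Ctrl+C).
tail_logs() {
  aws logs tail "/aws/lambda/$1" --since 10m --follow
}

# Usage:
#   show_config puppeteer-lambda-auto-deploy
#   tail_logs puppeteer-lambda-auto-deploy
```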

Conclusion

In this guide, you learned how to set up a CI/CD pipeline for your Puppeteer project using GitHub Actions and AWS Lambda. By following these steps, you can automate the deployment of your Puppeteer scripts to AWS Lambda, making it easier to run headless browser tasks in the cloud. This setup not only streamlines your development workflow but also leverages the scalability and reliability of AWS Lambda for running your Puppeteer scripts.

You can find the complete code in the GitHub repository.

You can also explore the Buglesstack integration for error tracking and monitoring, which is included in the project.

You can also check the GitHub Actions workflow file.

Additional Resources

