AWS Database Blog

Schedule jobs in Amazon RDS for PostgreSQL using AWS CodeBuild and Amazon EventBridge

When you want to migrate on-premises database workloads with jobs to AWS, you need to select the right AWS services to schedule the jobs. Database administrators traditionally schedule scripts to run against databases using the system cron on the host where the database is running. When you migrate such workloads from on premises to a managed database service like Amazon Relational Database Service (Amazon RDS), you lose the ability to log in to the host instance to schedule cron jobs.

Newer releases of Amazon RDS for PostgreSQL starting with version 12.5 enable you to schedule PostgreSQL commands within your database. However, if you need to orchestrate the Amazon RDS jobs and integrate with native AWS services such as Amazon Simple Notification Service (Amazon SNS) to send notifications, this post provides an alternate way to schedule and run Amazon RDS jobs using AWS CodeBuild and Amazon EventBridge rules.

Why AWS CodeBuild?

An Amazon RDS job can run stored procedures and functions, extract the data, purge the data, or do any other SQL operations. You can run these jobs with different AWS services, such as AWS Lambda, AWS Batch, Amazon Elastic Container Service (Amazon ECS), and AWS Step Functions. As your data grows, jobs might take more than 15 minutes to run. CodeBuild can natively handle workloads that run for a longer time, in comparison to services like Lambda, which are designed to finish within 15 minutes.

If you want to use AWS Batch or Amazon ECS, you need to package your code as a Docker container. CodeBuild is a good option when your job takes more than 15 minutes or you can’t (or don’t want to) package your code as a Docker container, or you want to use different programming language runtimes for your jobs.

This post demonstrates how to use the combination of CodeBuild and EventBridge rules to schedule and run functions or stored procedures on an RDS for PostgreSQL database instance.

Overview of solution

The following diagram illustrates the architecture of the solution.

Architecture diagram with numbers defining a worklofw.

The workflow includes the following steps:

  1. EventBridge triggers a CodeBuild project environment based on your schedule.
  2. CodeBuild retrieves the source code from an Amazon Simple Storage Service (Amazon S3) bucket.
  3. CodeBuild retrieves the database credentials from AWS Secrets Manager.
  4. CodeBuild initiates the database connection and runs the PostgreSQL stored procedure or function.
  5. CodeBuild events are published to EventBridge.
  6. Amazon SNS notifies subscribed users of the job status.

Prerequisites

Before you begin, you need to complete the following prerequisites:

Walkthrough overview

The following steps provide a high-level overview of the walkthrough:

  1. Clone the project from the AWS code samples repository.
  2. Deploy the AWS CloudFormation template to create the required services.
  3. Upload the source code to the S3 bucket.
  4. Run database scripts and create the SQL function.
  5. Run the CodeBuild project manually.
  6. Verify if the batch job is running successfully based on the EventBridge rule.

Clone the source code

Download the files required to set up the environment. See the following code:

$ git clone https://github.com/aws-samples/aws-codebuild-rds-job-scheduling 
$ cd aws-codebuild-rds-job-scheduling

You use the following files:

  • CreateFunct.sql – Has a code to create a sample SQL function. The CodeBuild project is configured to run this SQL function.
  • Jobschedulingcft.yml – Defines all the AWS resources required for this solution.
  • invokepostgresqldbpy.zip – Contains buildspec.yml and a Python script. CodeBuild installs the libraries such as boto3 and psycopg2 defined in the buildspec.yml and invokes the Python script, which has a code to connect to PostgreSQL database and run the SQL function.

Deploying the CloudFormation template

To deploy the CloudFormation template, complete the following steps:

  1. Update the email address parameter in the CloudFormation template Jobschedulingcft.yml. Amazon SNS uses this email address to send a notification about the job status.
  2. Run the CloudFormation template to provision the required services. See the following code for macOS or Linux:
    $ aws cloudformation create-stack --stack-name codebuildjob --template-body file://Jobschedulingcft.yml --capabilities CAPABILITY_NAMED_IAM --region us-east-1

You receive the following output:

{
    "StackId": "arn:aws:cloudformation:us-east-1:xxxxxxxx:stack/codebuildjob/aade45d0-0415-11eb-9c12-0ed4f058f52d"
 }

The template creates the following resources:

  • A CodeBuild project
  • A PostgreSQL instance
  • An S3 bucket
  • Secrets Manager with PostgreSQL database login credentials
  • An EventBridge rule to trigger the job every day at 10:00 AM (UTC)
  • AWS Identity and Access Management (IAM) service roles for CodeBuild, EventBridge, and Amazon SNS
  • An SNS topic to send notifications to the provided email address with the status of the job

Upload the source code to the S3 bucket

On the AWS CloudFormation console, choose the Outputs tab. Make a note of the S3 bucket name.

Upload aws-codebuild-rds-job-scheduling/src/invokepostgresqldbpy.zip to the S3 bucket with the following CLI command:

$ aws s3 cp ./src/invokepostgresqldbpy.zip s3://{your S3 bucket name}

Running database scripts

In this post, I create a job that runs a simple PostgreSQL function.

To retrieve PostgeSQL database login credentials from Secrets Manager and create the PostgreSQL function, complete the following steps:

  1. On the Secrets Manager console, choose the secret name (stack name-secret).
  2. Choose Retrieve secret value.
    Screenshot of AWS Secrets Manager Console. Highlights the menu "Retrieve secret value"
  3. Record the PostgreSQL credentials.
    Screenshot of AWS Secrets Manager Console. Shows the stored secret information.
  4. Connect to the PostgreSQL database that the CloudFormation template provisioned.Screenshot of DBeaver connection configuration.
  5. Run the following SQL script to create the PostgreSQL function.

The function concat_lower_or_upper has two mandatory parameters, a and b. It also has an optional parameter, uppercase, which defaults to false. The a and b inputs are concatenated, and forced to either upper or lowercase depending on the uppercase parameter.

DROP FUNCTION  IF EXISTS concat_lower_or_upper(text,text,bool);

CREATE FUNCTION concat_lower_or_upper(a text, b text, uppercase boolean DEFAULT false)
RETURNS text
AS
$$
 SELECT CASE
        WHEN uppercase THEN UPPER(a || ' ' || b)
        ELSE LOWER(a || ' ' || b)
        END;
$$
LANGUAGE SQL IMMUTABLE STRICT;

     SELECT concat_lower_or_upper('Hello', 'World', 'true');

Screenshot of DBeaver showing the previous function and the output of a test execution.

Run the CodeBuild Batch job manually

For on-demand jobs, complete the following steps to run the CodeBuild batch job manually:

  1. On the CodeBuild console, in the navigation pane, under Build projects, choose Build project.
  2. Choose the project and choose Start build.Screenshot of the CodeBuild Console showing the created project.
  3. On the Phase details tab, verify the status.Screenshot of CodeBuild Console showing the "Phase details" tab.

The CodeBuild job is configured to run the following PostgreSQL function in Python script. This function concatenates two strings, ‘hello’ and ‘world’, converts to an uppercase, and returns ‘HELLO WORLD’:

# Call PostgreSQL function
callproc('concat_lower_or_upper', ('hello','world','true'))

You can verify the job’s output in Amazon CloudWatch Logs, which CodeBuild generated.

screenshot of Amazon CloudWatch logs highlighting the output "HELLO WORLD"

CloudWatch also sends the SNS notification to the email address configured in the CloudFormation template.

screenshot of a notification email from SNS highlighting the status of "SUCCEEDED".

Verify the scheduled run of CodeBuild jobs

The CodeBuild job is scheduled using EventBridge cron scheduler. The CloudFormation template created an EventBridge rule to run this job every day at 10:00 AM (UTC). You can change the schedule by editing the rule.

  1. On the EventBridge console, under Events, choose Rules.screenshot of Amazon EventBridge Console showing the rule created by the CloudFormation stack.
  2. Select your rule and choose Edit.
  3. Based on your requirement, change the schedule by updating the CRON expression.screenshot of Amazon EventBridge console showing the Cron expression.
  4. Verify the CodeBuild ARN and role exist.screenshot of Amazon EventBridge console showing the "Select targets" screen.
  5. Choose Update.

Clean up

To avoid incurring future changes, clean up the resources you created.

  1. Delete the S3 objects:
    $ aws s3 rm s3://{your s3 bucket name} --recursive
  2. Delete the CloudFormation stack:
    $ aws cloudformation delete-stack --stack-name codebuildjob

Alternatively, delete the stack on the AWS CloudFormation console. To troubleshoot if stack deletion fails, see Delete stack fails.

Conclusion

This post demonstrated how to use CodeBuild and EventBridge to schedule and run jobs on a PostgreSQL database. You can run jobs on any RDS database with the same solution by including the compatible Python adapter in your Python script.

Additionally, this solution helps you to orchestrate the job workflows by integrating CodeBuild with Step Functions for more advanced dependency management.

If you have any questions or suggestions about this post, feel free to leave a comment.


About the author

Suresh Moolya is a Cloud Application Architect with Amazon Web Services. He works with customers to architect, design, and automate business softwares at scale on AWS cloud.