Anthony Chu

Running a Playwright scheduled job with Azure Container Apps

Monday, June 10, 2024

In my last post on how to deploy a serverless Playwright app to Azure Container Apps, I promised that I'd write a follow-up article "soon" on how to run Playwright in Azure Container Apps using a scheduled job. Well, it's been almost nine months, but here we are!

Azure Container Apps jobs

There are two types of resources in Azure Container Apps: apps and jobs. Apps are long-running services that respond to HTTP requests or events. Jobs are tasks that run to completion and can be triggered by a schedule or an event.

It's easy to build a job. It can be any process that you can put into a container image. Today, we'll use Node.js to create a Playwright job that runs every five minutes. Each time it runs, it'll do the following:

  1. Open a browser
  2. Navigate to Playwright's home page
  3. Navigate to a few other pages
  4. Output these metrics to Application Insights:
    • The time it took to navigate to each page
    • The Playwright website's availability, as determined by whether all the pages loaded successfully
  5. Save a video recording of the browser session to Azure Storage

Example of video recording of Playwright session

Building the job

You can find the source code in this GitHub repository.

The job depends on a couple of packages:

npm install playwright
npm install applicationinsights

Here's the code for the job:

const appInsights = require('applicationinsights');
const { chromium } = require('playwright');

(async () => {
  const client = new appInsights.TelemetryClient(process.env.APPINSIGHTS_CONNECTION_STRING);
  const availabilityTelemetryBase = {
    name: process.env.TEST_NAME || "test",
    runLocation: process.env.TEST_REGION,
  };
  let browser;

  try {
    browser = await chromium.launch();
    const page = await browser.newPage({
      recordVideo: {
        dir: './videos',
      }
    });

    performance.mark('start');

    await page.goto('https://playwright.dev/');
    performance.mark('homePageLoaded');

    await page.locator('text=Get Started').click();
    performance.mark('getStartedPageLoaded');

    await page.getByRole('link', { name: 'Library' }).click();
    performance.mark('libraryPageLoaded');

    await page.getByRole('link', { name: 'API', exact: true }).click();
    performance.mark('apiPageLoaded');

    await page.getByLabel('Docs sidebar').getByRole('link', { name: 'Selectors' }).click();
    performance.mark('selectorsPageLoaded');

    const marks = performance.getEntriesByType('mark');
    marks.reduce((previousMark, currentMark) => {
      performance.measure(currentMark.name, previousMark.name, currentMark.name);
      return currentMark;
    });

    const measures = performance.getEntriesByType('measure');

    const availabilityProperties = measures.reduce((measurements, measure) => {
      measurements[measure.name] = measure.duration;
      return measurements;
    }, {});

    console.log(availabilityProperties);

    const totalDuration = marks[marks.length - 1].startTime - marks[0].startTime;
    console.log(`Total duration: ${totalDuration}ms`);

    client.trackAvailability({
      ...availabilityTelemetryBase,
      duration: totalDuration,
      success: true,
      properties: availabilityProperties,
    });

  } catch (error) {
    console.error(error);
    client.trackException({ exception: error });
    client.trackAvailability({
      ...availabilityTelemetryBase,
      duration: 0,
      success: false,
    });

  } finally {
    await client.flush();
    // browser is still undefined if chromium.launch() threw, so guard the call
    if (browser) await browser.close();
  }

})();

There are a few things to note about this code:

  • When we create a new page, we enable video recording. The videos are saved to the videos directory. Later on, we'll mount a volume to this directory to save the videos to an Azure Storage file share.
  • We use Node.js's built-in performance API to measure the time it takes to navigate to each page.
  • We use Azure Application Insights to track the availability of the Playwright website. We send the total duration of the test and the time it took to navigate to each page as metrics. We also mark the test as a success or failure based on whether all the pages loaded successfully.
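
To see the timing technique on its own, here is a minimal sketch without Playwright. The stepDurations helper is hypothetical and not part of the job; it assumes the marks arrive in chronological order and does by hand what the job does with performance.measure: each step's duration is the gap between a mark and the one before it, keyed by the later mark's name.

```javascript
const { performance } = require('node:perf_hooks');

// Hypothetical helper (not part of the job): given marks ordered by time,
// return { markName: millisecondsSincePreviousMark }. The first mark only
// serves as the starting point, so it gets no entry of its own.
function stepDurations(marks) {
  const durations = {};
  for (let i = 1; i < marks.length; i++) {
    durations[marks[i].name] = marks[i].startTime - marks[i - 1].startTime;
  }
  return durations;
}

// In the job, each mark follows a navigation such as page.goto(...).
performance.mark('start');
performance.mark('homePageLoaded');
performance.mark('getStartedPageLoaded');

// Logs an object with one duration per step, keyed by mark name.
console.log(stepDurations(performance.getEntriesByType('mark')));
```

The keys of that object are the same names the job sends as availability properties, which is why each page shows up under its mark name in Application Insights.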

To build the job into a container image, we create a Dockerfile with the following content:

FROM mcr.microsoft.com/playwright:v1.44.1-jammy
WORKDIR /usr/src/app
COPY ["package.json", "package-lock.json*", "npm-shrinkwrap.json*", "./"]
RUN npm install --production --silent && mv node_modules ../
COPY . .

CMD ["npm", "start"]

Deploying the job

Create Azure resources

Before we create the Container Apps job, we need to create a few Azure resources:

  • Azure Container Registry: stores our job's container image
    • Create an Azure Container Registry (ACR) in the Azure portal. We can use the Basic or Standard tier.
  • Azure Log Analytics workspace: needed by Application Insights to store telemetry data
    • Create an Azure Log Analytics workspace in the Azure portal. We can use the Free tier.
  • Azure Application Insights: stores the availability data measured by the job
    • Create an Azure Application Insights resource in the Azure portal. Select the same region as the Log Analytics workspace, and select the Log Analytics workspace we created.

Build and push the container image

Clone the job's source code from GitHub:

git clone https://github.com/anthonychu/20230610-aca-playwright-puppeteer.git
cd 20230610-aca-playwright-puppeteer
cd playwright-job

Run the following command to build the container image in the ACR:

az acr build --registry <acr-name> --image playwright-job .

Create the Container Apps job

In the Azure portal, search for "Container App jobs" and click "Add" to create a new job. Fill in the required fields:

  • Basics
    • Name: playwright-job
    • Region: The region where we created the ACR, Log Analytics workspace, and Application Insights
    • Image: <acr-name>.azurecr.io/playwright-job
    • Trigger type: Scheduled
    • Cron expression: */5 * * * * (runs every 5 minutes)
  • Container
    • Name: playwright-job
    • Image source: Azure Container Registry
    • Registry: Select the ACR the image was built in
    • Image: playwright-job
    • Image tag: latest
    • Workload profile: Consumption
    • CPU and memory: 1 vCPU, 2 GiB
    • Environment variables:
      • APPINSIGHTS_CONNECTION_STRING: connection string from the Application Insights instance
      • TEST_REGION: the region our job is created in

After we create the job, it should start running every five minutes. We can see the job's execution history in the Azure portal.
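
If the cron expression looks cryptic, this small sketch may help. The expandMinuteField helper is hypothetical and only understands the three simplest forms of the minute field ("*", "*/n", and a single number); it is not a full cron parser. It lists the minutes of each hour on which the field fires.

```javascript
// Hypothetical helper: expand a cron minute field into the minutes (0-59)
// it matches. Supports only "*", "*/n", and a single number.
function expandMinuteField(field) {
  const minutes = [];
  for (let minute = 0; minute < 60; minute++) {
    let matches;
    if (field === '*') {
      matches = true;
    } else if (field.startsWith('*/')) {
      matches = minute % Number(field.slice(2)) === 0;
    } else {
      matches = Number(field) === minute;
    }
    if (matches) minutes.push(minute);
  }
  return minutes;
}

const minutesPerHour = expandMinuteField('*/5');
console.log(minutesPerHour);             // [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55]
console.log(minutesPerHour.length * 24); // 288 runs per day
```

The other four fields (hour, day of month, month, day of week) are all * in our expression, so the job runs at those minutes of every hour, every day.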

Screenshot of job execution history

Viewing the results in Application Insights

In the Application Insights resource, we can see the availability data as a scatter plot. We can also click any of the dots to see the details of that specific Playwright session.

Screenshot of availability test results

To create a chart that shows the duration of each page load over time, we can create a custom log query:

availabilityResults 
| project
    timestamp,
    homePageLoaded=todecimal(customDimensions['homePageLoaded']),
    getStartedPageLoaded=todecimal(customDimensions['getStartedPageLoaded']),
    libraryPageLoaded=todecimal(customDimensions['libraryPageLoaded']),
    apiPageLoaded=todecimal(customDimensions['apiPageLoaded']),
    selectorsPageLoaded=todecimal(customDimensions['selectorsPageLoaded'])
| render areachart

Screenshot of page load duration chart

Mounting a file share volume

To save the video recordings to Azure Storage, we need to create an Azure Storage account and a file share.

Then in the Azure portal, go to the Container Apps environment and add the file share. Then go to the job and add a volume. Lastly, update the job's container definition to mount the volume to the /usr/src/app/videos directory.

Screenshot of volume mount configuration

After a job runs, we can see the video recordings in the Azure Storage file share.

Screenshot of file listing

How much does it cost?

In the Azure Container Apps Consumption plan, we're only charged for the CPU and memory used by our job when it's running. Often, most or all of the cost is covered by the monthly free grant. Check out pricing to learn more.
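
As a rough back-of-the-envelope check, the sketch below estimates a month of usage for this job. Everything numeric here is an assumption: the 20-second run time is made up for illustration, and the free grant figures (180,000 vCPU-seconds and 360,000 GiB-seconds per month) should be verified against the pricing page. The estimateMonthlyUsage helper is hypothetical and only multiplies resource-seconds; it ignores any other billing rules.

```javascript
// Hypothetical helper: estimate a job's monthly resource-seconds.
function estimateMonthlyUsage({ runsPerDay, secondsPerRun, vcpu, memoryGiB, days = 30 }) {
  const runs = runsPerDay * days;
  return {
    runs,
    vcpuSeconds: runs * secondsPerRun * vcpu,
    gibSeconds: runs * secondsPerRun * memoryGiB,
  };
}

// Assumed free grant figures; check the pricing page before relying on them.
const freeGrant = { vcpuSeconds: 180000, gibSeconds: 360000 };

const usage = estimateMonthlyUsage({
  runsPerDay: 288,   // every 5 minutes: 12 runs per hour * 24 hours
  secondsPerRun: 20, // assumption; measure your own runs
  vcpu: 1,           // matches the job's 1 vCPU
  memoryGiB: 2,      // matches the job's 2 GiB
});

console.log(usage); // { runs: 8640, vcpuSeconds: 172800, gibSeconds: 345600 }

const withinGrant =
  usage.vcpuSeconds <= freeGrant.vcpuSeconds &&
  usage.gibSeconds <= freeGrant.gibSeconds;
console.log(withinGrant); // true under these assumptions
```

Longer runs or a bigger container would push the totals past the grant, so it's worth plugging in the durations reported in the job's execution history.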