Running a Playwright scheduled job with Azure Container Apps
Monday, June 10, 2024
In my last post on how to deploy a serverless Playwright app to Azure Container Apps, I promised that I'd write a follow-up article "soon" on how to run Playwright in Azure Container Apps using a scheduled job. Well, it's been almost nine months, but here we are!
Azure Container Apps jobs
There are two types of resources in Azure Container Apps: apps and jobs. Apps are long-running services that respond to HTTP requests or events. Jobs are tasks that run to completion and can be triggered by a schedule or an event.
It's easy to build a job. It can be any process that you can put into a container image. Today, we'll use Node.js to create a Playwright job that runs every five minutes. Each time it runs, it'll do the following:
- Open a browser
- Navigate to Playwright's home page
- Navigate to a few other pages
- Output these metrics to Application Insights:
  - The time it took to navigate to each page
  - The Playwright website's availability, as determined by whether all the pages loaded successfully
- Save a video recording of the browser session to Azure Storage
Building the job
You can find the source code in this GitHub repository.
The job depends on a couple of packages:
npm install playwright
npm install applicationinsights
Here's the code for the job:
const appInsights = require('applicationinsights');
const { chromium } = require('playwright');

(async () => {
  const client = new appInsights.TelemetryClient(process.env.APPINSIGHTS_CONNECTION_STRING);
  const availabilityTelemetryBase = {
    name: process.env.TEST_NAME || "test",
    runLocation: process.env.TEST_REGION,
  };
  let browser;

  try {
    browser = await chromium.launch();
    // Record a video of the session into ./videos
    const page = await browser.newPage({
      recordVideo: {
        dir: './videos',
      }
    });

    // Place a mark after each navigation step
    performance.mark('start');
    await page.goto('https://playwright.dev/');
    performance.mark('homePageLoaded');
    await page.locator('text=Get Started').click();
    performance.mark('getStartedPageLoaded');
    await page.getByRole('link', { name: 'Library' }).click();
    performance.mark('libraryPageLoaded');
    await page.getByRole('link', { name: 'API', exact: true }).click();
    performance.mark('apiPageLoaded');
    await page.getByLabel('Docs sidebar').getByRole('link', { name: 'Selectors' }).click();
    performance.mark('selectorsPageLoaded');

    // Create a measure between each pair of consecutive marks,
    // named after the mark that ends the step
    const marks = performance.getEntriesByType('mark');
    marks.reduce((previousMark, currentMark) => {
      performance.measure(currentMark.name, previousMark.name, currentMark.name);
      return currentMark;
    });

    // Collect the per-step durations into a single object
    const measures = performance.getEntriesByType('measure');
    const availabilityProperties = measures.reduce((measurements, measure) => {
      measurements[measure.name] = measure.duration;
      return measurements;
    }, {});
    console.log(availabilityProperties);

    const totalDuration = marks[marks.length - 1].startTime - marks[0].startTime;
    console.log(`Total duration: ${totalDuration}ms`);

    client.trackAvailability({
      ...availabilityTelemetryBase,
      duration: totalDuration,
      success: true,
      properties: availabilityProperties,
    });
  } catch (error) {
    console.error(error);
    client.trackException({ exception: error });
    client.trackAvailability({
      ...availabilityTelemetryBase,
      duration: 0,
      success: false,
    });
  } finally {
    await client.flush();
    // browser is undefined if chromium.launch() itself failed
    await browser?.close();
  }
})();
There are a few things to note about this code:
- When we create a new page, we enable video recording. The videos are saved to the videos directory. Later on, we'll mount a volume to this directory to save the videos to an Azure Storage file share.
- We use Node.js's built-in performance API to measure the time it takes to navigate to each page.
- We use Azure Application Insights to track the availability of the Playwright website. We send the total duration of the test and the time it took to navigate to each page as metrics. We also mark the test as a success or failure based on whether all the pages loaded successfully.
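The mark-to-measure step can be tried on its own, without a browser. Here's a minimal sketch, assuming a hypothetical helper named durationsBetweenMarks and plain objects shaped like performance marks (a name and a startTime). It isn't part of the job's source; it only mirrors what the two reduce() calls produce.

```javascript
// Hypothetical helper (not in the job's source): turns an ordered list of
// marks into per-step durations. Each duration is named after the mark that
// ends the step, which matches how the job names its measures.
function durationsBetweenMarks(marks) {
  const durations = {};
  for (let i = 1; i < marks.length; i++) {
    durations[marks[i].name] = marks[i].startTime - marks[i - 1].startTime;
  }
  return durations;
}

// Made-up timestamps in milliseconds, for illustration only.
const sampleMarks = [
  { name: 'start', startTime: 0 },
  { name: 'homePageLoaded', startTime: 1200 },
  { name: 'getStartedPageLoaded', startTime: 1500 },
];

const sampleDurations = durationsBetweenMarks(sampleMarks);
console.log(sampleDurations);
```

The first mark ('start') never appears as a key because it only serves as the starting point for the first step.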
To build the job into a container image, we create a Dockerfile with the following content:
FROM mcr.microsoft.com/playwright:v1.44.1-jammy
WORKDIR /usr/src/app
COPY ["package.json", "package-lock.json*", "npm-shrinkwrap.json*", "./"]
RUN npm install --production --silent && mv node_modules ../
COPY . .
CMD ["npm", "start"]
Deploying the job
Create Azure resources
Before we create the Container Apps job, we need to create a few Azure resources:
- Azure Container Registry: stores our job's container image
  - Create an Azure Container Registry (ACR) in the Azure portal. We can use the Basic or Standard tier.
- Azure Log Analytics workspace: needed by Application Insights to store telemetry data
  - Create an Azure Log Analytics workspace in the Azure portal. We can use the Free tier.
- Azure Application Insights: stores the availability data measured by the job
  - Create an Azure Application Insights resource in the Azure portal. Select the same region as the Log Analytics workspace, and select the Log Analytics workspace we created.
Build and push the container image
Clone the job's source code from GitHub:
git clone https://github.com/anthonychu/20230610-aca-playwright-puppeteer.git
cd 20230610-aca-playwright-puppeteer
cd playwright-job
Run the following command to build the container image in the ACR:
az acr build --registry <acr-name> --image playwright-job .
Create the Container Apps job
In the Azure portal, search for "Container App jobs" and click "Add" to create a new job. Fill in the required fields:
- Basics
  - Name: playwright-job
  - Region: The region where we created the ACR, Log Analytics workspace, and Application Insights
  - Image: <acr-name>.azurecr.io/playwright-job
  - Trigger type: Scheduled
  - Cron expression: */5 * * * * (runs every 5 minutes)
- Container
  - Name: playwright-job
  - Image source: Azure Container Registry
  - Registry: Select the ACR the image was built in
  - Image: playwright-job
  - Image tag: latest
  - Workload profile: Consumption
  - CPU and memory: 1 vCPU, 2 GiB
  - Environment variables:
    - APPINSIGHTS_CONNECTION_STRING: connection string from the Application Insights instance
    - TEST_REGION: the region our job is created in
After we create the job, it should start running every five minutes. We can see the job's execution history in the Azure portal.
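To see which minutes of the hour the */5 schedule lands on, here's a rough sketch. The matchesCronField helper is hypothetical and only understands the "*" and "*/n" forms, which is all this schedule needs; it's not how Container Apps evaluates cron expressions internally.

```javascript
// Hypothetical helper covering only "*" and "*/n" fields. Real cron syntax
// has more forms (lists, ranges) that this sketch does not handle.
function matchesCronField(field, value) {
  if (field === '*') return true;
  const step = /^\*\/(\d+)$/.exec(field);
  if (!step) throw new Error(`Unsupported cron field: ${field}`);
  return value % Number(step[1]) === 0;
}

// The minute field is the first of the five space-separated fields.
const minuteField = '*/5 * * * *'.split(' ')[0];

const matchingMinutes = [];
for (let minute = 0; minute < 60; minute++) {
  if (matchesCronField(minuteField, minute)) matchingMinutes.push(minute);
}
console.log(matchingMinutes.join(', '));
```

The schedule fires on minutes divisible by five (12 times an hour), so the first run happens at the next such minute rather than five minutes after the job is created.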
Viewing the results in Application Insights
In the Application Insights resource, we can see the availability data as a scatter plot. We can also click on any of the dots to see the details of that specific Playwright session.
To create a chart that shows the duration of each page load over time, we can create a custom log query:
availabilityResults
| project
timestamp,
homePageLoaded=todecimal(customDimensions['homePageLoaded']),
getStartedPageLoaded=todecimal(customDimensions['getStartedPageLoaded']),
libraryPageLoaded=todecimal(customDimensions['libraryPageLoaded']),
apiPageLoaded=todecimal(customDimensions['apiPageLoaded']),
selectorsPageLoaded=todecimal(customDimensions['selectorsPageLoaded'])
| render areachart
Mounting a file share volume
To save the video recordings to Azure Storage, we need to create an Azure Storage account and a file share.
Then in the Azure portal, go to the Container Apps environment and add the file share. Then go to the job and add a volume. Lastly, update the job's container definition to mount the volume to the /usr/src/app/videos directory.
After a job runs, we can see the video recordings in the Azure Storage file share.
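The mount path comes from how the relative recordVideo directory resolves inside the container. A small sketch, assuming the process is started from the Dockerfile's WORKDIR (which is what I'd expect with npm start, but treat it as an assumption):

```javascript
const path = require('node:path');

// The Dockerfile sets WORKDIR to /usr/src/app, so the job's relative
// './videos' directory resolves against it. path.posix is used so the
// result is the same when this snippet is run on Windows.
const workdir = '/usr/src/app';
const videoDir = path.posix.resolve(workdir, './videos');
console.log(videoDir);
```

If the WORKDIR or the recordVideo dir changes, the volume's mount path needs to change with it, or the videos will be written to the container's own file system and lost when the execution ends.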
How much does it cost?
In the Azure Container Apps Consumption plan, we're only charged for the CPU and memory used by our job when it's running. Often, most or all of the cost is covered by the monthly free grant. Check out pricing to learn more.
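For a rough sense of scale, here's a back-of-the-envelope sketch. Every input is an assumption: the 20-second run time is made up, and the free grant figures are the values I believe apply to the Consumption plan, so check them against the pricing page and measure the real run time in the job's execution history.

```javascript
// All inputs are assumptions for illustration only.
const assumptions = {
  runsPerHour: 12,          // from the */5 schedule
  daysPerMonth: 30,
  secondsPerRun: 20,        // hypothetical run time
  vcpu: 1,                  // from the job's container settings
  memoryGiB: 2,             // from the job's container settings
  freeVcpuSeconds: 180000,  // assumed monthly free grant
  freeGiBSeconds: 360000,   // assumed monthly free grant
};

// Estimates monthly resource consumption and compares it to the assumed grant.
function estimateMonthlyUsage(a) {
  const runs = a.runsPerHour * 24 * a.daysPerMonth;
  const vcpuSeconds = runs * a.secondsPerRun * a.vcpu;
  const gibSeconds = runs * a.secondsPerRun * a.memoryGiB;
  return {
    runs,
    vcpuSeconds,
    gibSeconds,
    withinFreeGrant:
      vcpuSeconds <= a.freeVcpuSeconds && gibSeconds <= a.freeGiBSeconds,
  };
}

const usage = estimateMonthlyUsage(assumptions);
console.log(usage);
```

With these inputs, the job runs 8,640 times a month and stays just under both assumed grant limits. A longer run time or a larger container would push it over.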