Anthony Chu

Changing Cosmos DB Write Locations on a Schedule with Azure Functions and Managed Service Identity

Sunday, September 24, 2017

Azure Cosmos DB can replicate a single account's data to as many of Azure's 30+ regions as we want. Applications around the world can read from the closest location to minimize latency. Write operations, however, are still sent to a single designated write region.

Update: July 19, 2018 - Cosmos DB now has multi-master support to allow writing to the closest region!

Thankfully, the write region can be changed. A read region can be designated as the new write region in a matter of seconds, and this change can be initiated programmatically via a REST API. If our application's dominant workload changes regions based on time of day, we can respond by changing the write region to the one with the most activity. Sometimes this strategy is called "follow the sun" or "follow the clock".

Today, we'll look at how to use Azure Functions to automatically change a Cosmos DB account's write region on a schedule. We'll do this using Azure's newly announced Managed Service Identity feature.

Azure REST API - Cosmos DB resource provider

Azure has a comprehensive REST API for managing resources. Cosmos DB's resource provider allows us to perform operations such as listing database accounts, querying a database account's properties, and changing a database account's failover priorities.

The Failover Priority Change endpoint allows us to change the write region.

Managed Service Identity

Managed Service Identity (MSI) is a new feature that automatically gives Azure services an identity in Azure Active Directory. This means we no longer have to manually create and manage service principals if we want our Azure services to access other Azure resources. MSI also manages the service principal's secrets on our behalf, so we no longer have to deal with secrets in our application.

MSI is currently available for Azure virtual machines and Azure App Service (including Azure Functions).

We will use MSI in our Azure Function to retrieve a token that we'll use to call the Azure REST API.

MSI can be enabled in a Function App under Platform Features.

Enable MSI

Cosmos DB Account with two regions

We'll start by creating a Cosmos DB account with 2 regions. In this example, we're using the West US 2 and East US locations, but this will work with any combination of regions.

Cosmos DB regions

Our Function App needs permissions to update our Cosmos DB account. Because MSI is enabled in the Function App, it should appear in Azure Active Directory. We can then add it as a "DocumentDB Account Contributor" to our Cosmos DB account.

Cosmos DB permissions

Timer Function to change Cosmos DB regions

To change the write region of our database on a schedule, we'll create a timer-triggered function. We'll write it in C# Script, but the same approach works in compiled C# functions and Node.js.

function.json

We'll start with a very straightforward function.json for a timer trigger set to fire every hour.

{
  "bindings": [
    {
      "name": "myTimer",
      "type": "timerTrigger",
      "direction": "in",
      "schedule": "0 0 */1 * * *"
    }
  ],
  "disabled": false
}
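The schedule value is a six-field NCRONTAB expression: {second} {minute} {hour} {day} {month} {day-of-week}, evaluated in UTC by default. As a rough sketch of a "follow the clock" setup, a trigger that fires twice a day instead of every hour might look like the following. The 00:00 and 12:00 UTC switch times are only illustrative assumptions, not values from this example app.

```json
{
  "bindings": [
    {
      "name": "myTimer",
      "type": "timerTrigger",
      "direction": "in",
      "schedule": "0 0 0,12 * * *"
    }
  ],
  "disabled": false
}
```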

Get the Azure Active Directory token from MSI

For .NET, there's a library (Microsoft.Azure.Services.AppAuthentication) on NuGet that makes it really simple to get a token from MSI. It's highly recommended to use a library to do this; however, today we'll get the token manually to see how it works and how to use it from any language.

Update: July 19, 2018 - There's also a JavaScript library: ms-rest-azure

When MSI is enabled on an Azure Function app, the app has access to a couple of environment variables: MSI_ENDPOINT and MSI_SECRET. We can make a REST call to the endpoint using the secret to obtain an Azure AD token for accessing the Azure REST API.

var endpoint = System.Environment.GetEnvironmentVariable("MSI_ENDPOINT", EnvironmentVariableTarget.Process);
var secret = System.Environment.GetEnvironmentVariable("MSI_SECRET", EnvironmentVariableTarget.Process);

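// get token from MSI
// (httpClient is assumed to be a shared HttpClient instance declared elsewhere in the script)
var tokenRequest = new HttpRequestMessage()
{
    RequestUri = new Uri(endpoint + "?resource=https://management.azure.com/&api-version=2017-09-01"),
    Method = HttpMethod.Get
};
// the MSI endpoint authenticates the caller with the "secret" header
tokenRequest.Headers.Add("secret", secret);

var tokenResp = await httpClient.SendAsync(tokenRequest);
tokenResp.EnsureSuccessStatusCode();
// force the content type to JSON so ReadAsAsync selects the JSON formatter
tokenResp.Content.Headers.ContentType = new MediaTypeHeaderValue("application/json");
dynamic result = await tokenResp.Content.ReadAsAsync<object>();

string token = result.access_token;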
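The resource query parameter names the API the token is issued for, so the same request shape can serve other Azure AD-protected resources (Key Vault's https://vault.azure.net, for example). Below is a minimal sketch of a helper that builds the token URL for any resource. The MsiTokenRequest class and BuildUrl method are hypothetical names introduced for illustration, and URL-encoding the resource is a defensive choice rather than something the snippet above requires.

```csharp
using System;

public static class MsiTokenRequest
{
    // Hypothetical helper: builds the token request URL for the MSI endpoint
    // using the same api-version as the snippet above (2017-09-01).
    // The resource is URL-encoded so it stays a single query parameter.
    public static string BuildUrl(string msiEndpoint, string resource)
    {
        return msiEndpoint
            + "?resource=" + Uri.EscapeDataString(resource)
            + "&api-version=2017-09-01";
    }
}
```

With this in place, the RequestUri above could be written as new Uri(MsiTokenRequest.BuildUrl(endpoint, "https://management.azure.com/")).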

Get the current failover region priorities

Now that we have a token, we can use it to call the Azure REST API. The first call we'll make is to get the current state of our Cosmos DB account. We are interested in what regions our account is in, and which of those regions is the write region.

var cosmosDbResourceId = System.Environment.GetEnvironmentVariable("COSMOS_DB_RESOURCE_ID", EnvironmentVariableTarget.Process);

// get current failover policies
var getDbAccountRequest = new HttpRequestMessage
{
    RequestUri = new Uri($"https://management.azure.com{cosmosDbResourceId}?api-version=2015-04-08"),
    Method = HttpMethod.Get
};
getDbAccountRequest.Headers.Add("Authorization", "Bearer " + token);

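var dbAccountResp = await httpClient.SendAsync(getDbAccountRequest);
dbAccountResp.EnsureSuccessStatusCode();
dynamic dbAccountResult = await dbAccountResp.Content.ReadAsAsync<object>();
// each entry carries a locationName and a failoverPriority; priority 0 is the write region
IEnumerable<dynamic> locations = dbAccountResult.properties.readLocations;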

var locationNames = locations
    .OrderBy(l => (int)l.failoverPriority)
    .Select(l => (string)l.locationName);

log.LogInformation("Current locations: " + string.Join(",", locationNames));

At this point, locationNames holds the names of our database regions. The first one is the current write region.

Change Cosmos DB's current write region

The final step is to change the write region. In this simple example, we'll just make the other region the write region (i.e., reverse the failover priorities).

// update write location
var reversedLocationNames = locationNames.Reverse();

var requestBody = new
{
    failoverPolicies = reversedLocationNames
        .Select((n, i) => new
        {
            locationName = n,
            failoverPriority = i
        })
};
var requestBodyJson = JsonConvert.SerializeObject(requestBody);

var changeWriteLocationRequest = new HttpRequestMessage
{
    RequestUri = new Uri($"https://management.azure.com{cosmosDbResourceId}/failoverPriorityChange?api-version=2015-04-08"),
    Method = HttpMethod.Post,
    Content = new StringContent(requestBodyJson, Encoding.UTF8, "application/json")
};
changeWriteLocationRequest.Headers.Add("Authorization", "Bearer " + token);

var changeWriteLocationResp = await httpClient.SendAsync(changeWriteLocationRequest);
log.LogInformation("Response status: " + changeWriteLocationResp.StatusCode);
changeWriteLocationResp.EnsureSuccessStatusCode();

If a database write region change is successfully initiated, the API will return a status code of 202 (Accepted).
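Reversing the list works for two regions, but a real "follow the clock" setup would more likely pick a target region from the time of day and promote only that one. Here is a minimal sketch of that idea. The FollowTheClock class, both method names, and the noon UTC cutover are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FollowTheClock
{
    // Hypothetical schedule: which region should take writes at a given UTC hour.
    public static string PickWriteRegion(int utcHour) =>
        utcHour >= 12 ? "West US 2" : "East US";

    // Moves the desired region to priority 0 and keeps the relative order of the rest.
    // Returns null when the desired region is already the write region,
    // so the caller can skip a redundant API call.
    public static List<KeyValuePair<string, int>> BuildFailoverPolicies(
        IEnumerable<string> currentOrder, string desiredWriteRegion)
    {
        var current = currentOrder.ToList();
        if (!current.Contains(desiredWriteRegion))
        {
            throw new ArgumentException("Region is not part of the account: " + desiredWriteRegion);
        }
        if (current[0] == desiredWriteRegion)
        {
            return null;
        }
        return new[] { desiredWriteRegion }
            .Concat(current.Where(n => n != desiredWriteRegion))
            .Select((n, i) => new KeyValuePair<string, int>(n, i))
            .ToList();
    }
}
```

Given locationNames from the earlier step, calling FollowTheClock.BuildFailoverPolicies(locationNames, FollowTheClock.PickWriteRegion(DateTime.UtcNow.Hour)) yields the ordered pairs to map onto locationName and failoverPriority in the request body, or null when no change is needed.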

Monitoring in Application Insights

If Application Insights is enabled in the app, we'll be able to see the traces we emitted during execution.

One way to view these traces is running a custom query in Application Insights Analytics:

Query Application Insights

Notice that every hour, the failover priorities are swapped.

Source code

https://github.com/anthonychu/cosmos-db-change-locations-function