Setting up Azure CDN to deliver your Sitecore Media

At SUGCON EU 2016 I presented on the different options for using Content Delivery Networks with Sitecore. At the time, I had been working on a particular task to offload large media items into Azure Blob storage and serve them via Azure CDN, and I wrote a number of posts detailing how I achieved this.

One of the options that I presented was using Azure CDN to serve your media. This lets you benefit from Azure’s geo-located Edge Servers, so assets are served from locations closer to your users. Your own servers can focus on just delivering content (possibly meaning fewer content delivery servers and lower licensing costs), and browser response times improve because the requests are domain sharded.

Use of Azure CDN will work with any version of Sitecore, and is not specific to Sitecore 8.2 Update-1 which added Azure Web Apps support. In fact, you don’t even need to be hosting your servers in Azure to utilise the CDN service.

I’ve been asked by several Sitecorians about configuring CDN, so I thought I would share a step-by-step guide to setting up Azure CDN with Sitecore.

Sitecore with Azure CDN Walkthrough

Add a CDN profile to your Azure account:

[Screenshot: cdn-01-create-profile]

And configure the CDN profile by giving it a name and selecting a pricing tier:

[Screenshot: cdn-02-configure-profile]

It’s worth noting that although a Resource Group Location is required, CDN by its very nature is not tied to a specific datacentre:

The Azure CDN service is global and not bound to a location. However, you must specify a location for the resource group where the metadata associated with the CDN profile will reside. This location will have no impact on the runtime availability of your profile.

Wait for the CDN profile to be created; it should take a couple of minutes. Once that is done, you’re ready to set up some endpoints to tell the CDN where your site is and where to fetch data from:

[Screenshot: cdn-03-configure-endpoint]

Add a unique name for your Endpoint; this will form the URL for your CDN in the format endpoint-name.azureedge.net.

Ensure you set Origin type as Custom Origin.

Then add in the details of your live Sitecore site. This should be the publicly accessible site, e.g. the CD server or the load balancer. The site doesn’t actually need to be hosted on Azure.

Once the endpoint is created you’ll be able to find the Endpoint hostname listed. We will need this later for configuration in Sitecore.

[Screenshot: cdn-04-endpoint-address]

Note, it can take up to 90 minutes for the endpoint to be created and propagate to the Edge Servers.

Finally, set the Cache to Cache every unique URL:

[Screenshot: cdn-05-cache-settings]

This will ensure that query parameters are taken into account. This is good, especially for resized images where height and width parameters are supplied (as well as the media hash). We’ll also make use of this feature a little later in our Sitecore set up.

Don’t like that Azure domain name? There’s an option to add your own custom domain, so you could map a sub-domain like media.yoursite.com to point to this Azure CDN Endpoint instead.

That’s all that’s needed in Azure. Simple, eh?

Sitecore Changes to Support CDN

This follows some advice from my previous article about domain sharding in Sitecore, but now we have Custom Origin support in Azure…

<setting name="Media.MediaLinkServerUrl">
  <patch:attribute name="value">https://jammycdn.azureedge.net</patch:attribute>
</setting>

Set the Media Server URL to the Azure endpoint from earlier.

<setting name="Media.AlwaysIncludeServerUrl">
  <patch:attribute name="value">true</patch:attribute>
</setting>

This will cause the Media Link Provider to include the above URL in the media links that are rendered. You only need this on the CD servers; leaving it as false on CM means the assets on that server are served from the database as normal.

<!--  MEDIA RESPONSE - CACHEABILITY
        The <see cref="HttpCacheability">cacheability</see> to use in media response headers.
        Possible values: NoCache, Private, Public, Server, ServerAndNoCache, ServerAndPrivate
        Default value: private
  -->
<setting name="MediaResponse.Cacheability">
  <patch:attribute name="value">public</patch:attribute>
</setting>

We need to set the cache-control headers to public, otherwise the CDN will not cache our media, making the exercise rather pointless.

In case you’re wondering about those cache settings, most articles only seem to talk about no-cache, private and public. The setting is actually a wrapper for the HttpCacheability enumeration in the System.Web namespace. You can read more about this setting in the official Microsoft documentation.
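
Taken together, the three settings above could live in a single patch file on the CD servers. This is only a sketch: the file name and location (for example somewhere under App_Config\Include) are assumptions, and jammycdn.azureedge.net stands in for your own endpoint hostname.

```xml
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <settings>
      <!-- Endpoint hostname from the Azure portal, including the https:// scheme -->
      <setting name="Media.MediaLinkServerUrl">
        <patch:attribute name="value">https://jammycdn.azureedge.net</patch:attribute>
      </setting>
      <!-- Render media links with the server URL above (CD servers only) -->
      <setting name="Media.AlwaysIncludeServerUrl">
        <patch:attribute name="value">true</patch:attribute>
      </setting>
      <!-- Allow the CDN to cache media responses -->
      <setting name="MediaResponse.Cacheability">
        <patch:attribute name="value">public</patch:attribute>
      </setting>
    </settings>
  </sitecore>
</configuration>
```

Keeping these in one CD-only patch file makes it easy to leave the CM server untouched.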

How are requests served?

So how does this all work? When Sitecore renders the link to the media on your CD with these settings it will look like this:

<img src="https://jammycdn.azureedge.net/-/media/path-to/image.png?la=en&h=123&w=123&hash=1a2b3c4d5e6f7g8h9i0" height="123px" width="123px" />
  1. Your browser requests the image from the CDN Endpoint
  2. The endpoint does not have this image in cache
    • CDN calls your website (that you specified in Azure as the origin host)
  3. Website returns the image, CDN adds it to cache
  4. CDN returns image to user, everyone is happy
  5. Subsequent requests for the same asset with the same parameters find the image already in the CDN cache, so the CDN returns it to the user without calling your website.

The typical journey looks something like this:

[Diagram: cdn-06-overview]

  • So the first request for an image on a fresh cache always makes a request to your CD server. Any subsequent ones do not.
  • CDNs can be distributed anywhere globally
  • The HTML of your site is always served from your actual CD server, although there is no reason why the HTML could not be cached if the correct cache header is sent
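
The request flow described in this section can be modelled as a toy pull-through cache. The sketch below is not how Azure CDN is implemented; EdgeCache and its members are hypothetical names used only to show that, with "Cache every unique URL", the cache key is the path plus the full query string, so only the first request for a given URL reaches the origin.

```csharp
using System;
using System.Collections.Generic;

// Toy model of one CDN edge node doing origin pull (hypothetical, for illustration).
public class EdgeCache
{
    // Cache key is the path AND query string ("Cache every unique URL").
    private readonly Dictionary<string, string> _cache = new Dictionary<string, string>();
    private readonly Func<string, string> _fetchFromOrigin;

    // Number of times the edge had to call the origin (your CD server).
    public int OriginHits { get; private set; }

    public EdgeCache(Func<string, string> fetchFromOrigin)
    {
        _fetchFromOrigin = fetchFromOrigin;
    }

    public string Get(string pathAndQuery)
    {
        string body;
        if (_cache.TryGetValue(pathAndQuery, out body))
        {
            return body; // already cached: served from the edge
        }

        // Not cached: pull from the origin, store, then return.
        OriginHits++;
        body = _fetchFromOrigin(pathAndQuery);
        _cache[pathAndQuery] = body;
        return body;
    }
}

public static class Program
{
    public static void Main()
    {
        // Stand-in for the Sitecore CD server returning media bytes.
        var edge = new EdgeCache(url => "bytes for " + url);

        edge.Get("/-/media/path-to/image.png?la=en&h=123&w=123"); // miss, origin called
        edge.Get("/-/media/path-to/image.png?la=en&h=123&w=123"); // hit, origin not called
        edge.Get("/-/media/path-to/image.png?la=en&h=50&w=50");   // different URL, miss

        Console.WriteLine(edge.OriginHits); // prints 2
    }
}
```

The third call shows why changing any query parameter produces a fresh fetch from the origin.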

Invalidating Caches

The problem now is caching on the CDN itself.

If you change an image for another, the image size/parameters may stay the same (on a scaled image for example) and the path may not change. So you need to somehow tell the CDN to remove its cache.

Azure CDN TTL / cache expiry is 7 days by default. I couldn’t find a way of setting this in the portal. I think it either needs to be set on upload (into blob storage), which we are not using here, or it is possibly taken from the cache header of the original image from Sitecore.
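
One origin-side setting that may be relevant here is Sitecore’s MediaResponse.MaxAge, which controls the max-age sent in media response headers and defaults to seven days. Whether your CDN tier honours the origin’s header is an assumption you should verify; the one-day value below is only an example.

```xml
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <settings>
      <!-- Example only: shorten the max-age on media responses from 7 days to 1 day -->
      <setting name="MediaResponse.MaxAge">
        <patch:attribute name="value">1.00:00:00</patch:attribute>
      </setting>
    </settings>
  </sitecore>
</configuration>
```

A shorter TTL only limits how long stale media lingers; to remove an image that is already cached you still need to clear the CDN cache.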

You can clear the CDN cache by manually purging:

[Screenshot: cdn-07-purge-cache]

Or you can hook up a publish:end processor and call the REST API to clear the CDN cache. You need to write this yourself, but it is a pretty simple REST call using WebClient.

This is pretty aggressive though, because it means EVERYTHING is purged, not just the media that has changed. It may be fine for your requirements; it just means that everything will get automatically re-fetched again as required.
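
A publish:end handler that purges everything might look roughly like the sketch below. It assumes the Azure Resource Manager purge operation for CDN endpoints and needs a reference to Sitecore.Kernel. The class name, the placeholder identifiers, the API version and the GetAccessToken helper are all hypothetical and must be replaced with your own values and authentication code.

```csharp
using System;
using System.Net;
using Sitecore.Diagnostics;

namespace MyProject.CMS.Custom.Events
{
    // Hypothetical publish:end handler that asks Azure to purge the whole endpoint.
    public class PurgeCdnOnPublishEnd
    {
        // Placeholders: substitute your own Azure identifiers.
        private const string SubscriptionId = "<subscription-id>";
        private const string ResourceGroup = "<resource-group>";
        private const string ProfileName = "<cdn-profile-name>";
        private const string EndpointName = "jammycdn";

        // Assumed API version; check the Azure CDN REST reference for a current one.
        private const string ApiVersion = "2019-04-15";

        public void OnPublishEnd(object sender, EventArgs args)
        {
            string url = string.Format(
                "https://management.azure.com/subscriptions/{0}/resourceGroups/{1}" +
                "/providers/Microsoft.Cdn/profiles/{2}/endpoints/{3}/purge?api-version={4}",
                SubscriptionId, ResourceGroup, ProfileName, EndpointName, ApiVersion);

            // "/*" requests a purge of every path on the endpoint.
            const string body = "{\"contentPaths\":[\"/*\"]}";

            try
            {
                using (var client = new WebClient())
                {
                    client.Headers[HttpRequestHeader.Authorization] = "Bearer " + GetAccessToken();
                    client.Headers[HttpRequestHeader.ContentType] = "application/json";
                    client.UploadString(url, "POST", body);
                }

                Log.Info("CDN purge requested for endpoint " + EndpointName, this);
            }
            catch (WebException ex)
            {
                // A failed purge should not break publishing; log it and carry on.
                Log.Error("CDN purge request failed", ex, this);
            }
        }

        // Hypothetical helper: must return an Azure AD access token that is
        // allowed to manage the CDN profile. Not implemented in this sketch.
        private static string GetAccessToken()
        {
            throw new NotImplementedException("Supply an Azure AD access token here.");
        }
    }
}
```

To use it, register the class as a handler on the publish:end event in a patch file, with the method set to OnPublishEnd.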

An alternative is to make use of the “Cache every unique URL” feature we enabled earlier, and update the Media Link Provider to also append either the Updated Date or the Revision Field of the media item.

This will cause the URLs to look like:

https://jammycdn.azureedge.net/-/media/path-to/image.png?la=en&h=123&w=123&hash=1a2b3c4d5e6f7g8h9i0&modified=20160126120123
https://jammycdn.azureedge.net/-/media/path-to/image.png?la=en&h=123&w=123&hash=1a2b3c4d5e6f7g8h9i0&revision=guid

Either way will do, since these values change when the image is updated or a new version added. All good.

Updating Media Provider

Create a new class, inheriting from the default MediaProvider and append the revision or the modified date:

using Sitecore.Data.Items;
using Sitecore.Diagnostics;
using Sitecore.Resources.Media;

namespace MyProject.CMS.Custom.Media
{
    public class MediaProvider: Sitecore.Resources.Media.MediaProvider
    {
        public override string GetMediaUrl(MediaItem item)
        {
            Assert.ArgumentNotNull((object)item, "item");
            return this.GetMediaUrl(item, MediaUrlOptions.Empty);
        }
 
        public override string GetMediaUrl(MediaItem item, MediaUrlOptions options)
        {
            Assert.ArgumentNotNull((object) item, "item");
            Assert.ArgumentNotNull((object) options, "options");
 
            string mediaURL = base.GetMediaUrl(item, options);
 
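            // Option 1: append the revision, which changes when the item is updated or a new version is added
            mediaURL = Sitecore.Web.WebUtil.AddQueryString(mediaURL, new string[] {"revision", ((Item)item).Statistics.Revision });
            // Option 2 (use instead of option 1, not as well as): append the last-updated timestamp
            // mediaURL = Sitecore.Web.WebUtil.AddQueryString(mediaURL, new string[] {"modified", ((Item)item).Statistics.Updated.ToString("yyyyMMddHHmmss") });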
 
            return mediaURL;
        }
    }
}

And update the config to point to your new class.

<mediaLibrary>
  <mediaProvider>
    <patch:attribute name="type">MyProject.CMS.Custom.Media.MediaProvider, MyProject.CMS.Custom</patch:attribute>
  </mediaProvider>
</mediaLibrary>

Cache busting in Sitecore 8.2 Update 5+

Sitecore have made some changes to the default LinkProvider and added a new setting to enable it in Sitecore 8.2 update 5 onwards:

<!--  MEDIA ALWAYS APPEND REVISION
      If true, Sitecore will append media item revision when it uses the MediaProvider API and/or the link provider to render media URLs.
      Default value: false (do not append media item revision)
      -->
<setting name="Media.AlwaysAppendRevision" value="false" />

This means that you no longer require a custom provider for cache busting; simply set the above setting to true. Sitecore 9.1+ also includes an example patch config with all the settings needed to enable CDN support in \App_Config\Include\Examples\CDN.config.example
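
Enabling the setting through a patch file, rather than editing the default config, could look like this. It is a minimal sketch; the patch file name and location are up to you.

```xml
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <settings>
      <!-- Append the media item revision to rendered media URLs for cache busting -->
      <setting name="Media.AlwaysAppendRevision">
        <patch:attribute name="value">true</patch:attribute>
      </setting>
    </settings>
  </sitecore>
</configuration>
```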

As you can see, it’s very simple to set up. Apologies if my previous blog post caused confusion and made it seem difficult, but that was solving a very specific problem – off-loading very large files into Azure Blob Storage (currently over 60GB and counting!).

As always, feel free to reach out if you have any questions.

SUGCON Presentation

You can watch my presentation from SUGCON 2016 here:

There are several other presentations on that playlist which I highly recommend watching.

You can also download the slides from my presentation.

20 comments

  1. FAIYAZ (@faiyazulnoor) · February 14, 2017

    You are amazing JammyKam ! Insightful explanation

  2. Dan Cruickshank · February 16, 2017

    This post is prettay prettay good. *finger guns*

  3. Stijn Planckaert · February 17, 2017

    Nice post.

    Using the ‘Purge’ options doesn’t seem like a usable option to me because, in addition to the things you already mentioned, the browser will still serve the old image from its cache until cache expiration.

    We’re using azure cdn for a while in combination with sitecore, and always use the patched media provider.

  4. Jarmo Jarvi · May 2, 2017

    This is excellent! Trying to decide between Akamai and Azure CDN.

  5. Pingback: Sitecore Media Library integration with Azure CDN using origin pull | Brian Pedersen's Sitecore and .NET Blog
  6. Anders Gjelstrup · September 27, 2017

    The updated/revision stamp is definitely good for serving images, but what about files for file download? Typically you would not add parameters to these? Or would you?

    • jammykam · September 27, 2017

      No reason you can’t add parameters, Sitecore does this by default anyway (language and possibly protection hash). All media would be served via the CDN and if they are changed (either due to versioning or a simple file replacement) then you still need to cache-break. Adding the parameter will not cause any issues.

      • Anders Gjelstrup · September 27, 2017

        My comment was a bit too “simple”.
        Was thinking about following scenario:
        An editor sends a mail to a list of users via EXM containing a “download pdf” link which links directly to a pdf file in Sitecore media library (with above parameters).
        After a day or two the editor realizes that the file is missing some vital information. The editor uploads a new pdf with the same name and publishes. A new filename would not be a viable solution, as the path is now static because it was placed in the mail.
        If the pdf was shown on the website, the path/revision would be updated and the CDN would just cache the new file.
        The pdf with the old path/revision is still cached in the CDN, and people accessing the pdf from the link in the mail would still download the old file (until the cache is purged on the CDN). The editor has no clue about this scenario and cannot solve it by purging the cache on the CDN.
        If this scenario could happen, which I believe it could, then the publish:end hook seems like a better all-round solution, even though it is somewhat harder to implement.
        I am not too big of a fan of all those parameters on media urls in the first place, so introducing more makes my OCD spike a little :oD
        Actually I am still wondering if parameters instead could be added to path somehow and mediahash avoided strictly by defining valid size parameters that are only allowed to be returned for a given solution.

      • jammykam · September 27, 2017

        I don’t know how EXM works to generate the links, but sure, I can see that causing an issue here. This is not a catch-all solution, and if you have this scenario then maybe you should consider a publish:end handler. If you are going this route then I suggest using the Publishing Service; aside from faster publishing, it also provides a manifest of items that have been published, so you can selectively clear cache for specific items rather than blowing the entire cache away. An alternative may be to configure Azure not to cache PDFs, but then you’re starting to negate the benefits. The cache clearer is a more logical choice.

        In terms of the OCD-ness… Sitecore already has a bunch of parameters, what’s one more between friends 😄 Besides, it’s only the developers that care about this – the end users of the site couldn’t care less, and neither could the robot overlords like Google.

        You could probably rewrite the Media Link Provider to add the parameters to the path, you’d also need a matching Media Handler to decode this instead of the default one. All well and good, until Sitecore decides to change something (i.e. DI in 8.2) and you find yourself in upgrade hell. Personally, I try to keep the changes as light as possible unless absolutely necessary.

  7. thebeardedguy · January 29, 2018

    Hi Jammy, I am trying to set up an Azure CDN for my WP site. I have followed everything in this https://ppolyzos.com/2016/05/16/speed-up-your-wordpress-site-using-azure-cdn-for-static-content/ and checked for steps on your blog but I am still not able to do it. Am I doing something wrong? Here’s my website: husbandwiferelationship.com. I can see images are not loading, and when I view the source the website is indeed requesting static content from hwr.azureedge.net. Can you please tell me what I am not doing correctly? Is it that this will only work when I am hosting my website with Azure? Currently I am on a VPS with KnownHost.

    • jammykam · January 29, 2018

      I guess you resolved the issue, images seem to be loading fine and referencing the http://hwr.azureedge.net Azure CDN domain name. The site does not need to be on Azure to use CDN with Custom Origin support.

      • thebeardedguy · January 30, 2018

        Yes, Jammykam. Thank you so much!

  8. Rik · March 27, 2018

    Nice article, but how do you set this up if you have a multi domain setup?
    Do you need to create a CDN for every domain?
    And how do you configure the MediaLinkServerUrl?

  9. Pingback: SUG Ottawa – August 2018 – the experience platform blog
  10. Joel · September 15, 2018

    Thanks for the detailed post! I have however just experienced an issue with Sitecore 9 with SXA installed and configuring the CDN as above. Basically the MediaRequest Handler was throwing “ERROR MediaRequestProtection: An invalid/missing hash value was encountered….” exceptions from all the images coming in via the CDN. After spending some time, I narrowed it down to the fact that I needed to include the request scheme in the CDN url, so I needed to have https://xyz.azureedge.net/ instead of //xyz.azureedge.net/.
    The scheme being in the CDN url was ensuring the hash was the same from when it is generated, to when it was then validated.

    • jammykam · September 17, 2018

      Great, thanks for the feedback. I’ll update the article to reflect this, everything should always be run under https anyway 🙂

      • Glenn · December 27, 2018

        Was facing this issue too. Don’t forget to update the article 🙂

      • jammykam · January 10, 2019

        Thanks. I’ve updated the article! Seems like something changed in the hash check in SC9 that now takes the scheme into account for some silly reason… 🤦‍♂️

  11. unboxingweb · July 24, 2020

    Hey JammyKam, Thank you for the great article. It helped me figure out the custom origin issue.

    My sandbox website is whitelisted for selected IP’s. I want to test CDN integration on my sandbox, but since site is not public, i cannot do it. Do you know if you can whitelist specific set of IP’s of Azure to make this work? Since this is geo location based media serving, i am not sure whether it does have very specific IP’s to whitelist.

    Thank you!
