Storing Files in Azure Cloud Storage through the Sitecore Media Library – Part Two – Link Management

This is a continuation of my previous blog post in which we uploaded files into Azure Blob Storage through the Sitecore Media Library. If you haven’t read it yet then first go and do that since this may not make much sense otherwise…

The previous post showed how to upload media files into Azure Blob Storage using a custom pipeline. I started that post by stating that the requirement was to store only large files in Azure. That was the primary focus, but some small tweaks made it possible to store all media in cloud storage.

We still need to make some small updates to the code so that we can correctly link to the media in Azure, so I’ll present a couple of options for doing that.

4. Linking to Media

Now we just link to the media item; nothing changes here from default Sitecore functionality. You can even upload images and link to them in the Rich Text Editor, where the size parameters still work. Bear in mind that there is no true image resizing, since the image is served directly from cloud storage and therefore does not run through Sitecore’s image processor pipelines. The size parameters simply set the height/width attributes for client-side scaling:

<img src="https://jammykam.blob.core.windows.net/test/files/software/dsc_1155.jpg" height="300px" width="200px" />

In order to generate the link to the media you have 2 options: direct linking or using redirects.

4.1. Direct Links to Azure Hosted Media

Directly linking to images is pretty straightforward: we can create a custom Media Link Provider by inheriting from the default Sitecore.Resources.Media.MediaProvider and overriding the GetMediaUrl() methods to generate the correct URLs to the final cloud storage location.

The full URLs would be rendered in the final HTML output. This has the advantage that the media is loaded from a different domain, so we benefit from domain sharding:

<img src="https://jammykam.blob.core.windows.net/test/files/software/dsc_1155.jpg" height="300px" width="200px" />
<a href="https://jammykam.blob.core.windows.net/test/software/githubsetup.exe">Download Github Setup</a>

The updated Media Link Provider is straightforward:

using Sitecore.Custom.Helpers;
using Sitecore.Data.Items;
using Sitecore.Diagnostics;
using Sitecore.Resources.Media;

namespace Sitecore.Custom.Media
{
    public class MediaProvider: Sitecore.Resources.Media.MediaProvider
    {
        public override string GetMediaUrl(MediaItem item)
        {
            Assert.ArgumentNotNull((object)item, "item");
            return this.GetMediaUrl(item, MediaUrlOptions.Empty);
        }

        public override string GetMediaUrl(MediaItem item, MediaUrlOptions options)
        {
            Assert.ArgumentNotNull((object) item, "item");
            Assert.ArgumentNotNull((object) options, "options");

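            // Database-stored media and thumbnail requests keep the default Sitecore URL
            if (!item.FileBased || options.Thumbnail)
                return base.GetMediaUrl(item, options);

            // File-based media links directly to its location in cloud storage
            var helper = new MediaHelper();
            return helper.GetCloudBasedMediaUrl(item);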
        }
    }
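}

The MediaHelper class simply appends the file path to the storage URL configured in AppSettings (it lives in the Sitecore.Custom.Helpers namespace imported above):

using System.Configuration;
using Sitecore.Data.Items;

namespace Sitecore.Custom.Helpers
{
    public class MediaHelper
    {
        public string GetCloudBasedMediaUrl(MediaItem mediaItem)
        {
            // Base URL of the blob container (or CDN endpoint), always ending in '/'
            string cloudUrl = Sitecore.StringUtil.EnsurePostfix('/',
                ConfigurationManager.AppSettings[Constants.AppSettings.AzureStorageUrl]);

            return cloudUrl + mediaItem.FilePath;
        }
    }
}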
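
One thing to watch for with plain concatenation: if the stored file path ever begins with a slash, the resulting URL contains a double slash. Here is a minimal sketch of a more defensive join, written as plain C# with no Sitecore dependency. CloudUrl and Combine are hypothetical names for illustration only and are not part of the code above:

```csharp
using System;

// Hypothetical helper, not part of Sitecore or the original solution.
public static class CloudUrl
{
    // Joins a storage base URL and a file path with exactly one '/' between them.
    public static string Combine(string baseUrl, string filePath)
    {
        if (string.IsNullOrEmpty(baseUrl))
            throw new ArgumentException("Base URL is required", nameof(baseUrl));

        // Strip any leading slashes from the path and trailing slashes from the base
        string path = (filePath ?? string.Empty).TrimStart('/');
        return baseUrl.TrimEnd('/') + "/" + path;
    }
}
```

Whether you need this depends on how the file path was stored during upload, so treat it as optional hardening rather than a required change.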

And we can simply patch the default provider with our implementation:

<mediaLibrary>
  <mediaProvider>
    <patch:attribute name="type">Sitecore.Custom.Media.MediaProvider, Sitecore.Custom</patch:attribute>
  </mediaProvider>
</mediaLibrary>

4.2. Intercepting Media Request

One of the final pieces of the puzzle is to intercept any media request for file based media and instead redirect the user to the Azure hosted file.

If you don’t implement the media provider from 4.1, the media item URL will be rendered like that of any other media item stored in Sitecore:

<img src="/-/media/files/software/dsc_1155.jpg" height="300px" width="200px" />
<a href="/-/media/files/software/githubsetup.exe">Download Github Setup</a>

The item is then served by intercepting the media request and then using a 302 redirect to the item in cloud storage.

using System.Web;
using Sitecore.Custom.Helpers;
using Sitecore.Diagnostics;
using Sitecore.Resources.Media;

namespace Sitecore.Custom.Media
{
    public class MediaRequestHandler : Sitecore.Resources.Media.MediaRequestHandler
    {
        protected override bool DoProcessRequest(HttpContext context)
        {
            Assert.ArgumentNotNull((object)context, "context");
            MediaRequest request = MediaManager.ParseMediaRequest(context.Request);
            if (request == null)
                return false;
            Sitecore.Resources.Media.Media media = MediaManager.GetMedia(request.MediaUri);

            if (!request.Options.Thumbnail && IsCdnMedia(media))
            {
                return this.DoProcessRequest(context, media);
            }

            return base.DoProcessRequest(context);
        }

        private bool DoProcessRequest(HttpContext context, Sitecore.Resources.Media.Media media)
        {
            var helper = new MediaHelper();
            string redirectUrl = helper.GetCloudBasedMediaUrl(media.MediaData.MediaItem);
            // Response.Redirect issues a 302 (temporary) redirect to the cloud-hosted file
            context.Response.Redirect(redirectUrl);
            return true;
        }

        private bool IsCdnMedia(Sitecore.Resources.Media.Media media)
        {   
            return (media != null && media.MediaData.MediaItem.FileBased);
        }
    }
}

We have to make a direct modification to web.config to swap out the MediaRequestHandler. I couldn’t find any entry point in the Sitecore config or events to hook into, so as far as I can tell the direct modification is the only way.

<!--
  Override default MediaRequestHandler with custom implementation to support Cloud based storage
  <add verb="*" path="sitecore_media.ashx" type="Sitecore.Resources.Media.MediaRequestHandler, Sitecore.Kernel" name="Sitecore.MediaRequestHandler" />
-->
<add verb="*" path="sitecore_media.ashx" type="Sitecore.Custom.Media.MediaRequestHandler, Sitecore.Custom" name="Sitecore.MediaRequestHandler" />

You can see the redirect in the network activity tab:

Media-in-azure-download-redirect

Be sure to keep the media interceptor code in place if you want to host images in Azure and still provide Rich Text Editor support. Since media links are not expanded in editor mode, inserted images would not render while editing without the redirect, which does not provide a good experience for your users.

Media in RTE with no redirects

Media Linked in RTE with Redirects in place

As I mentioned, you don’t strictly need to override the MediaProvider from 4.1, in which case ALL media requests would be served using a 302 redirect. This is useful if you have additional business logic to carry out, e.g. gathering click statistics, dynamically changing the download location, adding additional parameters etc.

There’s no hiding the actual endpoint though; it’s right there in the download history. We were using time-limited download links to stop hotlinking, so this worked for us:

Media-Chrome-Download-History

Sitecore is also kind enough to provide default icons within the Media Library. This is why all the code above checks if the request is for a thumbnail. This is important, since without this the full scale image would be served to the browser. If the images are large then this would mean unnecessary bandwidth usage and client slow down.

Media-in-azure-default-icons

5. Using CDN Endpoints

As you can see, the files are linked or served directly from the blob storage container. Depending on your requirements that may be acceptable, for example if most of your traffic is located within a certain region. If you need to reach a global market then you may wish to switch to using a CDN endpoint.

Setting up the endpoint is pretty straightforward in Azure: just create the endpoint and set the origin to the blob storage URL.

Then all you need to do is update the Azure.StorageUrl in AppSettings with the URL of the CDN endpoint:

<appSettings>
  ...
  <add key="Azure.StorageUrl" value="http://az123456.vo.msecnd.net/test" />
</appSettings>

6. Enabling Upload of Large Files

Just one last tweak is required: in order to be able to upload large files to your server, you must update settings in your config to allow files over a certain limit.

By default, any files over 500 MB will get rejected by the server due to the limits specified in the security request filtering in web.config. You can find more information in this article on SDN.

<security>
  <requestFiltering>
    <requestLimits maxAllowedContentLength="2097152000" />
  </requestFiltering>
</security>
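
The value of maxAllowedContentLength is expressed in bytes, so the figure above works out to 2000 MB. A small sketch of the conversion, in case you need a different limit. UploadLimits is a hypothetical helper name, not something provided by Sitecore or IIS:

```csharp
// Hypothetical helper for working out the byte value to put in web.config.
public static class UploadLimits
{
    // Converts a size in megabytes to bytes (1 MB = 1024 * 1024 bytes).
    public static long MegabytesToBytes(long megabytes)
    {
        return megabytes * 1024L * 1024L;
    }
}
```

UploadLimits.MegabytesToBytes(2000) gives 2097152000, the value used above. The attribute is an unsigned 32-bit integer, so the ceiling is just under 4 GB.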

We are also going to patch a Sitecore setting:

<setting name="Media.MaxSizeInDatabase">
  <patch:attribute name="value">2000MB</patch:attribute>
</setting>

This setting interferes with the one we specified in Step 1 of the previous post, since the Flash uploader checks the file size before any upload begins, and the only way round that is to use the Advanced Upload Dialog. For now we will increase the Sitecore limit; I’ll expand on this in a future blog post to handle it in a better way.

7. What about that Uploaded To Cloud field?

You may have noticed that the template included an Uploaded to Cloud boolean field, but it has not been utilised anywhere. This was taken from Tim Braga’s post, the reasoning being that if the item has not been uploaded into cloud storage yet then we would want to generate a regular link. Since we are uploading as files, a publish will not copy the actual media file to the CD server, so a bit of additional logic would need to be carried out:

  • In a CM environment the default Sitecore logic can be used since the media is initially on disk.
  • In a CD environment maybe a 404 should be thrown while the media item is processed?
  • But this may lead to a problem. What if the media is being processed, in the meantime a publish occurs. The web database would not have the flag set until the next publish from master…
  • So maybe we should always link to Azure since the processing should be finishing soon…?
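
One way the first and last options above could be combined is sketched below. This is only an illustration of the decision, not code we ever shipped: ServerRole, CloudLinkPolicy and ChooseUrl are hypothetical names, and the two URLs are assumed to have been generated already by the default provider and the cloud helper respectively.

```csharp
// Hypothetical types for illustration; nothing here is part of the Sitecore API.
public enum ServerRole { ContentManagement, ContentDelivery }

public static class CloudLinkPolicy
{
    // Picks which URL to render for a file-based media item.
    public static string ChooseUrl(ServerRole role, bool uploadedToCloud,
                                   string sitecoreUrl, string cloudUrl)
    {
        // On CM the file is still on disk until the upload completes,
        // so the regular Sitecore link works in the meantime.
        if (role == ServerRole.ContentManagement && !uploadedToCloud)
            return sitecoreUrl;

        // On CD the file is never on disk, so always link to cloud storage
        // and accept a short window where the file may not exist yet.
        return cloudUrl;
    }
}
```

The trade-off is the one described in the list: a CD link may briefly 404 while the copy to Azure is still in progress.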

We were using Virtual Machines in Azure and, since our Blob Storage Container was in the same datacenter, the copy of the file did not take much time. With images and small files the copy time is negligible, a few seconds only. Testing with large files of approx 1GB, it took approx 5 minutes for the file to copy and be available in Blob Storage.

These were all acceptable for us (we’re not using this for general media remember) and so we never ended up using this anywhere but it’s there if we want to extend this logic in the future.

Media In Cloud!

media-yuno-store-in-cloud

And that’s it, files in Azure! All in all, the individual pieces are small and simple, but they make an interesting puzzle to put together: a DAM for Sitecore stored in Azure, managed through the standard Media Library interface, and without adding any bloat to the database!

Let me know what you think, if I’ve missed anything or you know of a better technique.

Additional Comments / Thoughts

  • The code does not deal with item moves. If an item is moved and another item is subsequently created with the same name, then the existing file in Azure will be overwritten.
  • The code does not deal with attach/detach handlers.
  • The code does not deal with any image manipulation or resizing through the Sitecore pipelines.
  • The code only deals with unversioned media. For versioned media you may want to prepend the language/version as a folder path.
  • If you’re using a CDN, then be careful about cache expiration. The default TTL is 7 days when using Azure CDN Endpoints.
    • One way round this is to always append the item modified date as a URL parameter.
  • Since the push is into Azure Blob Storage, everything is essentially published and available immediately.
  • It may be interesting to integrate the Dianoga Image Optimization module, which should be possible to tap into our custom pipeline!

You may want to address these if they are likely to cause issues in your solution.
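
Two of the points above, versioned media paths and CDN cache expiration, can be sketched as small string helpers. This is a rough illustration only: CloudUrlExtras, AppendRevision, VersionedPath and the "rev" parameter name are all hypothetical, and the modified date is assumed to be passed in from the item by the caller.

```csharp
using System;
using System.Globalization;

// Hypothetical helpers; names and the "rev" parameter are illustrative only.
public static class CloudUrlExtras
{
    // Appends the modified timestamp as a query parameter so that each
    // revision of a file is requested under a distinct URL.
    public static string AppendRevision(string url, DateTime modified)
    {
        string stamp = modified.ToString("yyyyMMddHHmmss", CultureInfo.InvariantCulture);
        string separator = url.Contains("?") ? "&" : "?";
        return url + separator + "rev=" + stamp;
    }

    // Prefixes the file path with language and version folders for versioned media.
    public static string VersionedPath(string language, int version, string filePath)
    {
        return language + "/"
            + version.ToString(CultureInfo.InvariantCulture) + "/"
            + filePath.TrimStart('/');
    }
}
```

Note that the revision parameter only helps if the CDN endpoint is configured to treat the query string as part of the cache key; if query strings are ignored, the cached copy is still served until it expires.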
