SEO Friendly URLs in Sitecore – Prevention is Better than Cure

I’ve seen a fair number of posts already on SEO Friendly URLs. Search engines apparently love them, and as humans we love them too, although it could be argued that this is mainly for vanity reasons (go check out the product URLs on Amazon quickly…)

Whatever the argument, we agree that Semantic URLs are a good thing for us mere mortals and the vast majority of sites.

EncodeNameReplacements

The simplest way of achieving friendly URLs is by using a combination of settings. First, set the LinkManager (from Sitecore 6.6 onwards) to use lowercase URLs and drop the .aspx extension:

<linkManager defaultProvider="sitecore">
  <providers>
    <add name="sitecore" addAspxExtension="false" encodeNames="true" lowercaseUrls="true" ... />
  </providers>
</linkManager>

And then set encodeNameReplacements to replace spaces with dashes:

<encodeNameReplacements>
  ...
  <replace mode="on" find=" " replaceWith="-" />
</encodeNameReplacements>

The LinkManager will now give us some better looking URLs:
/Blog Posts/Posts About Sitecore/Article About SEO Friendly URLs.aspx
=> /blog-posts/posts-about-sitecore/article-about-seo-friendly-urls

When an Item is requested, the ItemResolver will now decode the incoming request URL, reversing any encodeNameReplacements using the same settings, and Sitecore will resolve the Item as if nothing had happened.

What’s the problem with this?

Quite a few issues 😦

Due to the replacement the following happens:

Original | Encoded | Decoded
/Blog Posts/All About Sitecore | /blog-posts/all-about-sitecore | /blog posts/all about sitecore
/Blog Posts/SEO - Some Approaches | /blog-posts/seo---some-approaches | /blog posts/seo   some approaches

The encoded Item URL looks a bit strange with the triple dashes, but the real problem is the decoding process: reversing the replacements leaves us with 3 spaces, so the Item does not resolve at all since an exact match does not exist in the content tree. Not what we were expecting at all 😥
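
To make the failure concrete, here is a minimal, Sitecore-free sketch of the round trip. It is not how Sitecore implements it internally; EncodeNameRoundTrip and its methods are made-up names that assume a single replacement of space with dash plus lowercasing:

```csharp
using System;

public static class EncodeNameRoundTrip
{
    // Hypothetical stand-in for the single configured replacement: " " => "-"
    private const string Find = " ";
    private const string ReplaceWith = "-";

    // Roughly what link generation does: lowercase, then swap spaces for dashes.
    public static string Encode(string path)
    {
        return path.ToLowerInvariant().Replace(Find, ReplaceWith);
    }

    // Roughly what item resolution does: blindly reverse the replacement.
    public static string Decode(string url)
    {
        return url.Replace(ReplaceWith, Find);
    }

    // True when the decoded URL still matches the original item path (ignoring case).
    public static bool RoundTrips(string path)
    {
        return string.Equals(Decode(Encode(path)), path, StringComparison.OrdinalIgnoreCase);
    }
}
```

A path made up only of letters and spaces survives the trip, whereas any item name that already contains a dash does not, because the decode step cannot tell an original dash from a replaced space.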

To get around this issue, we can stop the editors putting dashes in Item names.

<!--  INVALID CHARS
        Characters that are invalid in an item name
-->
<setting name="InvalidItemNameChars" value="\/:?&quot;&lt;&gt;|[]-" />

This solves the “issue”, albeit with a slightly annoying warning. It’s not exactly quick or user friendly, but it does the job:

[Image: seo-invalid-characters]

Except it doesn’t fully solve the problem. Apply the above setting change and now go and install the WFFM module. Go ahead, I’ll wait…

[Image: seo-wffm-install]

Then there are the issues you hit when you try to replace multiple different characters with dashes, or when a parent item also includes invalid characters.

Changes to Media URLs in Sitecore 7.1

A change in Sitecore 7.1 means that the encodeNameReplacements are now also applied to the URLs for media items. Previously you would get those “ugly” URLs, but no one cared because they were not directly exposed to users. Well, that changed and caught a few of us out when we upgraded and wondered why our media items with dashes in their names suddenly started throwing a 404. You can revert to the “old way of doing things” by following this knowledge base article.

One of the recommended practices is to change the handler from ~/media to -/media, since the tilde causes performance issues on some systems; this is a known issue logged in the Sitecore Knowledge Base. I’ve also seen some instances where overly aggressive firewall rules disallowed or stripped out the tilde:

<sitecore>
  <settings>
    <setting name="Media.RequestExtension" value="" />
    <setting name="Media.MediaLinkPrefix" value="-/media" />
  </settings>

  <customHandlers>
    <handler patch:before="*[@trigger='~/media/']" trigger="-/media/" handler="sitecore_media.ashx" />
  </customHandlers>

  <mediaLibrary>
    <mediaPrefixes>
      <prefix value="-/media"/>
      <prefix value="~/media"/>
    </mediaPrefixes>
  </mediaLibrary>
</sitecore>

Due to this same issue, you can no longer use hyphens if the hyphen is also specified in your encodeNameReplacements.

Spaces are for people…

So clearly we need to do something a little more custom which is a little more clever. One option is to hook into the item:saved and item:renamed events, intercept any changes the user makes and then ensure that item names are automatically “fixed” to meet the SEO needs:

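using System;
using Sitecore.Data.Items;
using Sitecore.Events;

public class ItemEventHandler
{
    // Hooked up to the item:saved and item:renamed events
    protected void HandleItemName(object sender, EventArgs args)
    {
        var item = Event.ExtractParameter(args, 0) as Item;
        if (item == null)
        {
            return;
        }

        // Lowercase the name and swap spaces for dashes
        string friendlyName = item.Name.ToLowerInvariant().Replace(' ', '-');

        // Only touch content items in master whose name actually needs fixing.
        // The name check also stops the save below from re-triggering this handler endlessly.
        if (item.Database.Name != "master"
            || !item.Paths.Path.StartsWith("/sitecore/content/", StringComparison.Ordinal)
            || item.Name == friendlyName)
        {
            return;
        }

        item.Editing.BeginEdit();
        item.Appearance.DisplayName = item.Name; // keep the original text for editors
        item.Name = friendlyName;
        item.Editing.EndEdit();
    }
}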

There is a great blog post by fellow Sitecore MVP Adam Najmanowicz about using this technique: http://www.cognifide.com/blogs/sitecore/sitecore-best-practice-9/

You can also find a more “configurable” implementation of this in the LaunchSitecore project. You can view the source of these handlers in the LaunchSitecore GitHub repo.

[Image: seo-launchsitecore]

A step in the right direction, but clearly this needs to be extended for all the other cases of invalid characters.
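
As a rough idea of what “extended” could look like, here is a small sketch that handles more than just spaces. FriendlyName is a made-up helper, not something from Sitecore or LaunchSitecore, and it assumes you only want a-z, 0-9 and dashes in item names (so accented and non-Latin characters get replaced too, which may not suit every site):

```csharp
using System.Text.RegularExpressions;

public static class FriendlyName
{
    // Any run of characters other than a-z and 0-9 collapses into a single dash.
    private static readonly Regex InvalidRun = new Regex("[^a-z0-9]+", RegexOptions.Compiled);

    public static string From(string itemName)
    {
        string lowered = itemName.Trim().ToLowerInvariant();
        string dashed = InvalidRun.Replace(lowered, "-");

        // Drop dashes left dangling at either end, e.g. from a trailing "?"
        return dashed.Trim('-');
    }
}
```

Collapsing runs means a name like "SEO - Some Approaches" becomes seo-some-approaches rather than picking up triple dashes. A name made up entirely of invalid characters comes back empty, so a caller would need to decide what to do in that case.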

Rules Engine to the Rescue

An alternative option is to utilise the Rules Engine and let it do the heavy lifting for you!

There is already a great post by John West on Using the Sitecore Rules Engine to Control Item Names, and the code is available on the Sitecore Marketplace. I’m actually quite surprised more people are not using this, or that it has not been integrated into Sitecore already…

Some advantages of using the Rules Engine rather than the event handlers:

  • Logic is controlled via an easy to use browser-based interface rather than hardcoded
  • Rules are easy to extend and change
  • Multiple rules can be created that apply to different parts of the content tree
  • Rules only run in the database in which they are created (i.e. these will not run in the core database)
  • Separation of concerns – rule checks are written into the condition of the rule

This keeps the rules for SEO super flexible and very easy to change.

The module is marked as compatible with Sitecore 6.1, but it is actually just a zip file containing a set of C# files which you will need to add to your project and compile. You still need to create the rules yourself though, and there were some fairly major changes to the Rules Engine in Sitecore 7.1. There is a great walkthrough on creating rule conditions, actions and tag groups so that these are correctly shown in the ruleset editor: https://www.sitecore.net/learn/blogs/technical-blogs/getting-to-know-sitecore/posts/2013/11/limiting-conditions-and-actions-with-sitecore-71.aspx

So here’s the walkthrough of how to integrate this into Sitecore 7.1+

  • Add the code into a project, compile it and add the compiled assembly to your solution

  • Create a tag for our conditions and actions. We’ll call it Item Naming:

[Image: seo-tags]

  • Create a new Element Folder. Call it whatever you like, we’ll stick with Item Naming. Select the tag we created in the previous step in the Taxonomy section:

[Image: seo-element-tag]

  • Now add in the Action rules using the /sitecore/templates/System/Rules/Action template

[Image: seo-actions]

  • We don’t need to worry about the Has Layout Details condition since we can use the “where the item has a layout” condition from /sitecore/system/Settings/Rules/Definitions/Elements/Item Version CM/Layout. There’s an additional HasLayoutDetailsForAnyDevice condition. Add this if you are using multiple devices.

  • Add the Item Naming tag to the Item Saved Rule Context Folder and then create your Rule. Create a new tag item rather than adding it to the existing Default tag: if you’re using TDS or Unicorn we only want to include our own changes and not modify any Sitecore Items, in case they are updated at some future date

[Image: seo-action-tag]

  • We’re now ready to use the Ruleset Editor to define our rule and actions:

[Image: seo-launchsitecore]

[Image: seo-item-name-rule]

Rule Text

The following are the Rule names, text and type we needed to add:

Rule Name: Ensure Maximum Length of Item Name
Text: ensure item name does not exceed [MaxLength,int,,maximum length] characters
Type: Sitecore.Sharedsource.ItemNamingRules.Actions.EnsureMaximumLength, Sitecore.Sharedsource.ItemNamingRules

Rule Name: Ensure Minimum Length of Item Name
Text: ensure a minimum name length by appending characters from [DefaultName,Text,,index]
Type: Sitecore.Sharedsource.ItemNamingRules.Actions.EnsureMinimumLength, Sitecore.Sharedsource.ItemNamingRules

Rule Name: Ensure Name is Unique
Text: ensure item name is unique
Type: Sitecore.Sharedsource.ItemNamingRules.Actions.EnsureUnique, Sitecore.Sharedsource.ItemNamingRules

Rule Name: Lowercase Item Name
Text: convert to lowercase letters
Type: Sitecore.Sharedsource.ItemNamingRules.Actions.Lowercase, Sitecore.Sharedsource.ItemNamingRules

Rule Name: Replace Invalid Characters in Item Name
Text: replace characters in the item name that do not match the regular expression [MatchPattern,Text,,pattern] with [ReplaceWith,Text,,this character sequence]
Type: Sitecore.Sharedsource.ItemNamingRules.Actions.ReplaceInvalidCharacters, Sitecore.Sharedsource.ItemNamingRules

Rule Name: Store Pretty Name in Display Name
Text: store the pretty name in the display name
Type: Sitecore.Sharedsource.ItemNamingRules.Actions.StorePrettyNameInDisplayName, Sitecore.Sharedsource.ItemNamingRules

Rule Name: Save Name Changes
Text: save the changes to the item name
Type: Sitecore.Sharedsource.ItemNamingRules.Actions.SaveNameChanges, Sitecore.Sharedsource.ItemNamingRules
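
To give a feel for what a few of these actions boil down to, here is a plain C# sketch of the string handling only. It is not the module’s code: NamingActionsSketch and its method signatures are invented for illustration, the parameters simply mirror the MaxLength, MatchPattern and ReplaceWith macros in the rule text, and the counter suffix used for uniqueness is purely my assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

public static class NamingActionsSketch
{
    // Keep each character that matches the pattern; swap every other one for the replacement.
    public static string ReplaceInvalidCharacters(string name, string matchPattern, string replaceWith)
    {
        var valid = new Regex(matchPattern);
        var result = new StringBuilder();
        foreach (char c in name)
        {
            string s = c.ToString();
            result.Append(valid.IsMatch(s) ? s : replaceWith);
        }
        return result.ToString();
    }

    // Truncate names that exceed the configured maximum length.
    public static string EnsureMaximumLength(string name, int maxLength)
    {
        return name.Length <= maxLength ? name : name.Substring(0, maxLength);
    }

    // Assumption: a clash with a sibling is resolved by appending -1, -2, ...
    public static string EnsureUnique(string name, IEnumerable<string> siblingNames)
    {
        var taken = siblingNames.ToList();
        string candidate = name;
        int suffix = 1;
        while (taken.Contains(candidate, StringComparer.OrdinalIgnoreCase))
        {
            candidate = name + "-" + suffix++;
        }
        return candidate;
    }
}
```

Note that a per-character replacement like this turns "seo & urls" into seo---urls when given the pattern [a-z0-9] and a dash, so the pattern and replacement sequence you configure in the rule matter.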

What about Media Items?

We’re using the Rules Engine, remember, so we can add whatever we want without much trouble. Add a rule to rename media items as well, all without writing any additional code.

[Image: seo-media-item-name]

The advantage here is that we can easily target specific folders: if you create a /sitecore/media library/[project] folder then no other content is affected, especially if you install other modules. You can also easily add multiple folders into the conditions. In theory, items should be linked by Item ID, but I’d rather not mess with Items which I do not own…

Code Changes

I made a number of changes to the original code that John West wrote. You can find the updated code and Sitecore packages in this GitHub fork, which builds on the updates that Sean Kearney made.

Summary of the updates:

  • Store the user-entered text in the Display Name field so item names still look pretty to the users
  • Updated code to check Item Name is unique
  • Split out minimum and maximum length actions
  • Simplified logic in the Replace Invalid Characters action

The main change I made was the addition of the SaveNameChanges action. The original code called the RenamingAction at the end of every action. Whilst debugging I noticed that this resulted in several recursive calls to the item:saved event, and therefore these rules would get called several times until all the actions had eventually been applied.

The save action uses the following code:

using (new Sitecore.SecurityModel.SecurityDisabler())
{
    using (new Sitecore.Data.Items.EditContext(item))
    {
        using (new Sitecore.Data.Events.EventDisabler())
        {
            item.Name = newName;
        }
    }
}

Which should disable events and stop these recursive calls. Except it wasn’t quite doing that, most likely because the EventDisabler is created inside the EditContext block. Changing the order of the blocks meant that the first action would get called, save the changes and then exit without running any of the further actions. Obviously not quite what we want!

By explicitly adding a separate Save Name Changes action we can avoid both the multiple saves and the recursive calls.

using (new Sitecore.Data.Items.EditContext(item, Sitecore.SecurityModel.SecurityCheck.Disable))
{
    // modifications have been made to the item already
    // this call will commit those changes
}

We still end up with a single call back to the item:saved event since we are not using the EventDisabler. I initially tried using the EventDisabler and this worked fine for new items, but when renaming items the change wasn’t reflected in the UI. Clearing the cache manually solved the problem, but that is not very user friendly, so there must be some other events which fire further down the line (I couldn’t track them down). I eventually settled on the fact that a single additional event was acceptable: http://blog.coates.dk/2014/10/03/sitecore-save-event/

To accommodate this, a check is made in the action and the rule is aborted if there are no changes:

if (ruleContext.Item.Name == ruleContext.Item.InnerData.Definition.Name)
{
    ruleContext.Abort();
}
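
Here is a rough, Sitecore-free simulation of why that check limits us to one extra event. FakeItem, StoredName and SaveGuardDemo are all made up for illustration: StoredName stands in for InnerData.Definition.Name and the Saved event stands in for item:saved.

```csharp
using System;

public class FakeItem
{
    public string Name;        // in-memory value, possibly changed by the naming actions
    public string StoredName;  // last committed value
    public event Action<FakeItem> Saved;

    // Commit the in-memory name and raise the saved event, as a real save would.
    public void Save()
    {
        StoredName = Name;
        if (Saved != null)
        {
            Saved(this);
        }
    }
}

public static class SaveGuardDemo
{
    // Returns how many times the saved handler ran for one user-initiated save.
    public static int Run(string initialName)
    {
        int handlerCalls = 0;
        var item = new FakeItem { Name = initialName, StoredName = initialName };

        item.Saved += i =>
        {
            handlerCalls++;

            // The naming actions change the in-memory name only.
            i.Name = i.Name.ToLowerInvariant().Replace(' ', '-');

            // Equivalent of the abort check: nothing changed, so do not save again.
            if (i.Name == i.StoredName)
            {
                return;
            }

            // Equivalent of Save Name Changes: commit, which raises the event once more.
            i.Save();
        };

        item.Save();
        return handlerCalls;
    }
}
```

In this simulation a name that needs fixing runs the handler twice (the original save plus one follow-up), and a name that is already friendly runs it once.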

Publishing Items

I mentioned that one of the advantages of using the rules engine is that they only run in the database in which they are created, i.e. these will not run in core database.

Unfortunately, since the rules are published to the web database and the item:saved:remote event also calls the RunItemSavedRules method, they also fire for each item published.

The fix is very easy though, simply make sure that the rules are set to not be publishable 🙂

[Image: seo-publish-settings]

How are the Rules called?

The Item Saved rules are not actually that different from the item:saved event earlier, and are in fact called from the very same handlers:

<event name="item:saved">
  <handler type="Sitecore.Rules.ItemEventHandler, Sitecore.Kernel" method="OnItemSaved"/>
</event>
<event name="item:saved:remote">
  <handler type="Sitecore.Rules.ItemEventHandler, Sitecore.Kernel" method="OnItemSavedRemote" />
</event>

Internally, the method simply applies all rules specified in /sitecore/system/Settings/Rules/Item Saved/Rules:

private void RunItemSavedRules(Item item)
{
    ItemEventHandler.RunRules(item, RuleIds.ItemSavedRules);
}

It may seem like a lot of parts to pull together to essentially end up in the same position as just using an event handler directly, but hopefully you can already see the flexibility in being able to configure the rules.

But I already have existing content!

It’s best to get these rules configured at the start of the project, but if you need to retrospectively apply the rules to existing items then you need to trigger a save on the items. This will force the save events to fire again, thus applying your naming rules.

Alternatively, we can just call the rules on the items again:

// Required namespaces
using System;
using System.Reflection;
using Sitecore.Data;
using Sitecore.Data.Items;

private void ApplyRulesToItems()
{
    Database master = Sitecore.Configuration.Factory.GetDatabase("master");
    Item startItem = master.GetItem("/sitecore/content");
    SaveRecursively(startItem);
}

// Walks the tree depth-first, running the Item Saved rules on every item
private void SaveRecursively(Item item)
{
    if (item == null)
        return;

    RunRules(item);

    if (item.HasChildren)
    {
        foreach (Item child in item.GetChildren())
        {
            SaveRecursively(child);
        }
    }
}

private void RunRules(Item item)
{
    // Use reflection to invoke private method RunItemSavedRules in the ItemEventHandler
    Type t = typeof(Sitecore.Rules.ItemEventHandler);
    t.InvokeMember("RunItemSavedRules",
                        BindingFlags.InvokeMethod | BindingFlags.Instance | BindingFlags.NonPublic,
                        null,
                        new Sitecore.Rules.ItemEventHandler(),
                        new object[] { item });
}

It should also be possible to re-save an item with Sitecore Powershell Extensions:
http://sitecorejunkie.com/2014/06/02/make-bulk-item-updates-using-sitecore-powershell-extensions/

How am I going to deal with redirecting all the old URLs?

No one wants to lose the SEO goodness that has been built up over time, or have potential customers landing on a 404 page 😦 I previously blogged about how to deal with a mass re-organisation or rename of Sitecore items:

https://jammykam.wordpress.com/2015/01/19/redirecting-urls-after-major-content-restructure-in-sitecore/

Item Listing in the Content Editor tree

In order to still keep things friendly for the users you can specify in Application Options how the item names should be displayed in the content tree:

[Image: seo-application-options]

[Image: seo-application-options-menu]

This affects both the content tree:

[Image: seo-content-tree]

But also the navigation bar in the Experience Editor:

[Image: seo-editor-nav-bar]

What’s the right way?

You could just rely on content editors to be smart and teach them to put in SEO friendly item names.

I find it best to stamp out the problem to begin with and using the Rules Engine makes it super flexible should those requirements change over time. And best of all, no runtime performance hit of having to try and resolve in the pipelines. Sure there is some performance impact, but it’s a one-time hit at Item creation/rename instead of every time an Item is requested.

There is no right or wrong way. I’m no SEO expert, but I’m pretty sure Google has figured out the difference between spaces and dashes. Less weight is put on the URL and more on the content of your site, so it’s certainly not a magic bullet, but every bit of best practice helps. Besides, the URLs DO look better, and copy+pasting links into emails/documents doesn’t break them due to spaces in the URLs.

This has worked pretty well for me in most scenarios, use whatever works for you. But I’m always interested to hear how other people are tackling the same problem so get in touch.

Code and Packages

The code and packages can be found in this GitHub repo: https://github.com/jammykam/Sitecore-ItemNamingRules
