Updating Sitecore Telerik RTE to disable XHTML and strip MS-Word Formatting on Paste

The RTE field in Sitecore is a little old-fashioned and a little stricter than most of us would like, especially when you are working with tech-savvy editors who may prefer to work directly in the HTML. The current project I am working on is based on the Foundation framework and, like most frameworks, a lot of its features are enabled using data-* attributes.

Unfortunately, the Telerik controls used in Sitecore try to be super helpful and strip out anything that is not valid XHTML. Also unfortunately, data-* attributes are HTML5, not XHTML 😦

And while we’re at it, let’s automatically strip the MS-Word formatting when users copy/paste from that amazing editor we all love :p

It’s possible to do this via the toolbar configuration, but then the user must manually choose to run it every time. Editors are a bit funny about remembering these things, and we should automate what we can.

Fortunately, it’s quite easy to extend Sitecore’s configuration of the Telerik controls to fix both these issues.

Extending Telerik Editor Settings

The code to achieve both is minimal:

using Sitecore.Data.Items;
using Telerik.Web.UI;

namespace MyProject.CMS.Custom.Controls
{
    public class RteConfiguration : Sitecore.Shell.Controls.RichTextEditor.EditorConfiguration
    {
        public RteConfiguration(Item profile) : base(profile)
        {
        }

        protected override void SetupFilters()
        {
            // Keep Sitecore's default filter setup, then adjust it.
            base.SetupFilters();

            // Stop the editor converting content to XHTML (this is what removes data-* attributes).
            Editor.DisableFilter(EditorFilters.ConvertToXhtml);

            // Indent the markup so it is easier to read in HTML view.
            Editor.EnableFilter(EditorFilters.IndentHTMLContent);

            // Clean pasted content automatically; the options are combined with bitwise OR.
            Editor.StripFormattingOptions = EditorStripFormattingOptions.MSWordRemoveAll
                                            | EditorStripFormattingOptions.ConvertWordLists
                                            | EditorStripFormattingOptions.Css
                                            | EditorStripFormattingOptions.Font
                                            | EditorStripFormattingOptions.Span;
        }
    }
}

We’ve found the above setup to be a good combination for real world use on Sitecore 8+.

To use the new configuration, update the HtmlEditor.DefaultConfigurationType setting to your own class:

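<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <settings>
      <setting name="HtmlEditor.DefaultConfigurationType">
        <patch:attribute name="value">MyProject.CMS.Custom.Controls.RteConfiguration, MyProject.CMS.Custom</patch:attribute>
      </setting>
    </settings>
  </sitecore>
</configuration>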

If your HTML Editor Profile specifies a Configuration Type directly, you will need to either remove it, so that the profile falls back to the default specified in config, or update the Type setting in the item itself. For example, for the Rich Text Default profile in the Core database this is specified in /sitecore/system/Settings/Html Editor Profiles/Rich Text Default/Configuration Type.

Other Editor Options

The possible formatting options are:

  • NoneSupressCleanMessage: doesn’t strip anything on paste and does not ask the user whether MS Word formatting should be cleaned.
  • None: if no MS Word formatting is detected, content is pasted as is. If MS Word formatting exists, the user is prompted to clean it.
  • MSWord: strips Word-specific tags, preserving fonts and text sizes.
  • MSWordNoFonts: strips Word-specific tags, preserving text sizes.
  • MSWordRemoveAll: strips Word-specific tags, fonts and text sizes.
  • Css: strips CSS styles.
  • Font: strips Font tags.
  • Span: strips Span tags.
  • ConvertWordLists: converts Word ordered/unordered lists to HTML tags.
  • AllExceptNewLines: clears all tags except “<br />” and new lines (\n).
  • All: strips all HTML formatting and pastes plain text.
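
As a rough sketch of how these options combine, a gentler variant of the configuration class above could keep fonts and text sizes while still cleaning up Word markup. The class name RteConfigurationKeepFonts is hypothetical, and the snippet assumes the same Sitecore and Telerik assembly references as the original class. Treat it as a starting point rather than a tested setup.

```csharp
using Sitecore.Data.Items;
using Telerik.Web.UI;

namespace MyProject.CMS.Custom.Controls
{
    // Hypothetical variant: the class name is illustrative only.
    public class RteConfigurationKeepFonts : Sitecore.Shell.Controls.RichTextEditor.EditorConfiguration
    {
        public RteConfigurationKeepFonts(Item profile) : base(profile)
        {
        }

        protected override void SetupFilters()
        {
            base.SetupFilters();

            // MSWord strips Word-specific tags but preserves fonts and text sizes;
            // ConvertWordLists turns Word lists into HTML list tags.
            // The two options are combined with bitwise OR.
            Editor.StripFormattingOptions = EditorStripFormattingOptions.MSWord
                                            | EditorStripFormattingOptions.ConvertWordLists;
        }
    }
}
```

Because the assignment replaces whatever value was set before, any option you want applied has to be included in the same OR expression.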

There are also a whole bunch of EditorFilters options, which can be found in the documentation for the Telerik Editor.

Further Reading:
1. Pasting Content Overview
2. Clean MS Word Formatting
3. Editor – Cleaning MS Word Formatting

6 comments

  1. Thad · April 11, 2016

    Could you confirm whether or not this works for IE11? I’m on Sitecore 8.0 Update-5 and the code is working for Firefox, Chrome, and IE8/9/10, but using IE11 all of the ugly Word formatting is preserved in the RTE on paste. Unfortunately for me, IE11 is the default browser here, and the authors who don’t know better than to use a different browser are the same authors who will never paste as plain text.

    • jammykam · April 11, 2016

      I hadn’t tried it before now, as we’re using Chrome internally, but I am able to reproduce the issue. Unfortunately it appears to be an issue with the Telerik controls themselves, e.g. try it on this demo page in IE11

      The Telerik support forums also don’t seem like much help on the matter 😦

      The Strip Formatting options in the RTE toolbar seem to work, although that’s not ideal.

    • Thad · April 11, 2016

      Sitecore includes a 2012 version of the Telerik controls (including RadEditor), which does not support IE11. An updated version should fix this issue, but I’m awaiting response from Sitecore Support to verify.

      • jammykam · June 15

        Hi Thad, I’m in the process of upgrading to SC8.1 and the release notes of the initial release mention “A number of issues in the Rich Text Editor control have been closed by updating the Telerik component to the latest version.”

        So it looks like the latest version of Telerik (as of SC8.1 release) is now in Sitecore!

        Hope that helps.

  2. Raymond Goh · June 15

    Hi Jammykam, have you noticed that with this implementation, when you enter a data attribute, the RTE still adds an ="" to the end of the data attribute? Is there a way to prevent this from happening?

    • jammykam · June 15

      I don’t know, all our attributes have values set. But my understanding is that an empty value means the attribute is still set and is valid (just that it has an empty value set, instead of no value set).

      You should still be able to test for the presence of the data attribute in code though: https://jsfiddle.net/jammykam/hso6hg73/
