October 23, 2023

Sitecore XM Cloud

Sitecore released XM Cloud sometime last year, and our first project soon followed. The objective was a lift&shift of a solution that was already running a React-based frontend on a custom JSS-like backend, getting it running on the lovely combination of XM Cloud and Vercel. That essentially meant removing most of the custom stuff and relying on OOTB functionality as offered by Sitecore and NextJS/Vercel, with the end goal of using Pages as the new editing solution.

Data migration

The first part of the project was to get all the customer's existing Sitecore content into the database, and this is where the challenges really manifested themselves. There were a LOT of items in 35-40 languages. In XM Cloud, unfortunately, you can't simply restore a database like in the on-prem days (it IS SaaS), so we tried several different approaches:

  1. Sitecore packages - to me this was the obvious choice, as it allowed us to package media and parts of the content tree as we saw fit. We quickly stopped creating packages, though, as the XM Cloud instance would take forever to install one, and sometimes even crashed in the attempt.
  2. Sitecore CLI, by serializing the production content to the filesystem of a developer machine and attempting to push it to the XM Cloud instance. This also failed, probably due to the same capacity problems as with Sitecore packages.
  3. The Razl tool has a scripting option, which we got working by sneaking the Razl web service endpoint onto the XM Cloud instance. This worked, but had a few downsides: network connectivity was unstable and, worst of all, it was SLOW for such an amount of items (approx. 100 hours).
  4. As a last resort (and with a sort-of approval from Sitecore Support), we began experimenting with making database connections from the XMC instance to a database hosted in Azure SQL in our own Azure tenant. This approach allowed us to do very fast transfers of raw data and reduced the transfer time to 4 hours. The downside of this approach is that we're modifying the Sitecore master database directly rather than going via the Sitecore API, which also means the link database and indexes have to be rebuilt afterwards.

I have heard that an official migration tool is in the works at Sitecore, so I am looking forward to seeing that!

Monorepo vs multiple repos

Due to the way we've typically worked with detached frontends, we chose to have separate git repos for the Sitecore backend and the different frontends. In the beginning, we saw no benefit in using the Rendering Host provided by the XM Cloud tenants. Simply switching to a preview branch hosted by Vercel to have the customer check something rendered in Pages worked pretty well.
At some point, we began getting 413 (Payload Too Large) errors in Vercel's log and complaints about some (large) pages not being editable in Pages. Unfortunately, this was due to our configuration and the active choice not to use the Rendering Host for Pages.

Fortunately, the quick fix for this problem is to use git subtree to pull the frontend git repo into the backend repo. To do so, you set up a subtree like this:

$ git subtree add --prefix src/some-frontend https://github.com/some-frontend-repo.git main --squash

Don't forget to push after doing this (git subtree add creates the commit for you)! Now, to update the backend repo with changes in the frontend repo, run this command:

$ git subtree pull --prefix src/some-frontend https://github.com/some-frontend-repo.git main --squash

After adding the repo and creating corresponding configuration in the xmcloud.build.json file and adjusting the rendering host endpoint in Sitecore, 413 errors were no longer a problem.

One thing worth mentioning is that after the above change, the custom fonts on the website no longer loaded in Pages. Adam Brauer helped us with this NextJS config plugin:

const config = require('../../../temp/config');
/**
 * @param {import('next').NextConfig} nextConfig
 */
const corsHeaderPlugin = (nextConfig = {}) => {
  return Object.assign({}, nextConfig, {
    async headers() {
      return [
        {
          source: '/_next/:path*',
          headers: [
            { key: 'Access-Control-Allow-Origin', value: config.sitecoreApiHost },
            { key: 'Access-Control-Allow-Methods', value: 'GET,OPTIONS' }
          ],
        },
      ];
    },
  });
};

module.exports = corsHeaderPlugin;

The change did not seem to help much at first. The Sitecore XM Cloud interfaces (Pages, XM instance, Rendering Hosts) appear to sit behind Cloudflare, and changes to headers seem to be cached quite heavily, so it may take a while before a new header takes effect.
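To illustrate how a plugin like the one above ends up in the final NextJS config, here is a minimal sketch that assumes plugins are plain functions applied one after another to a base config. The applyPlugins helper, the hard-coded host and the base config are made up for the example, and unlike the original plugin this variant also keeps any headers that were already defined.

```javascript
// Hypothetical stand-in for the generated config file (temp/config in the original).
const config = { sitecoreApiHost: 'https://xmc-example.sitecorecloud.io' };

// Same shape as the CORS plugin above: takes a NextJS config, returns a new one.
const corsHeaderPlugin = (nextConfig = {}) =>
  Object.assign({}, nextConfig, {
    async headers() {
      // Assumption: keep header rules defined by the base config or earlier plugins.
      const existing = nextConfig.headers ? await nextConfig.headers() : [];
      return [
        ...existing,
        {
          source: '/_next/:path*',
          headers: [
            { key: 'Access-Control-Allow-Origin', value: config.sitecoreApiHost },
            { key: 'Access-Control-Allow-Methods', value: 'GET,OPTIONS' },
          ],
        },
      ];
    },
  });

// Hypothetical helper: apply each plugin in turn to the base config.
const applyPlugins = (baseConfig, plugins) =>
  plugins.reduce((acc, plugin) => plugin(acc), baseConfig);

// In next.config.js, the result of this call is what would be exported.
const finalConfig = applyPlugins({ reactStrictMode: true }, [corsHeaderPlugin]);
```

Calling finalConfig.headers() then returns the rule for /_next/:path*, which is where NextJS serves its static assets from, including the font files that Pages was failing to load.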

Sitecore magnetism

With XM Cloud, it is my clear understanding that Sitecore wants us to rely less on Sitecore for running things that don't necessarily need to run on a CMS. Doing exactly that seems to have been a broadly adopted approach in the industry for years: hey, we need something on the website and hey, we have a few CMS servers exposed already; let's just deploy this thing to the CMS servers -> hence the Sitecore Magnetism :)

As a consequence of this, XM Cloud essentially provides a CM instance and the Experience Edge - and no CD servers. This requires us to put logic somewhere else, which in our case could be:

  • NextJS APIs
  • Azure Functions
  • Azure WebApps for more advanced stuff

In terms of going all-in on the composable architecture, separating things like this is really good, as it's much easier to scale individual components when you know which component is responsible for what.

In our case, we had functionality that sent information from the frontend, which ultimately ended up in a database somewhere. Having separated the frontend into a NextJS app running on Vercel, using a message queue for shipping such messages from the frontend to the aforementioned database was a no-brainer. Still, a little processing had to be done before submitting the data to the database, and as this essentially was a lift&shift migration, rewriting the logic on the CM was not initially part of the scope.
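A minimal sketch of what the frontend side of this could look like as a NextJS API route, not our actual implementation. The createSubmitHandler factory, the email/message fields and the enqueue function are all hypothetical; enqueue stands for whatever thin wrapper sends a string to your message queue.

```javascript
// Hypothetical factory: `enqueue` is any async function that sends a string
// to the message queue (for example a wrapper around a queue SDK client).
const createSubmitHandler = (enqueue) => async (req, res) => {
  if (req.method !== 'POST') {
    res.status(405).json({ error: 'Method not allowed' });
    return;
  }

  // Assumed payload shape; the real fields depend on the form being submitted.
  const { email, message } = req.body || {};
  if (typeof email !== 'string' || typeof message !== 'string') {
    res.status(400).json({ error: 'email and message are required' });
    return;
  }

  // Hand the payload to the queue and return right away;
  // the actual processing happens elsewhere.
  await enqueue(
    JSON.stringify({ email, message, receivedAt: new Date().toISOString() })
  );
  res.status(202).json({ status: 'queued' });
};
```

In a pages/api file, the function returned by createSubmitHandler would be the route's default export. Keeping the queue call injected like this means the route can be tested without a real queue.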

The solution design ended up using Azure Functions, configured to be triggered either by a timer/scheduler or when a message from the frontend appears in the message queue.
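As a rough sketch of the queue-triggered part, assuming a JSON message with an email and a message field (both made up for the example), the processing step can be kept as a plain function and the database write injected. processQueueMessage, createQueueHandler and saveRecord are hypothetical names, not part of our solution.

```javascript
// Hypothetical processing step: parse, validate and normalise a queue message
// before it is written to the database. The message shape is an assumption.
const processQueueMessage = (rawMessage) => {
  // The runtime may hand over either a raw string or an already parsed object.
  const data = typeof rawMessage === 'string' ? JSON.parse(rawMessage) : rawMessage;
  if (!data || typeof data.email !== 'string') {
    throw new Error('Invalid message: email is missing');
  }
  return {
    email: data.email.trim().toLowerCase(),
    message: String(data.message || '').trim(),
    receivedAt: data.receivedAt || null,
  };
};

// Handler with the database write injected so the logic stays testable.
// `saveRecord` is a placeholder for the real database call.
const createQueueHandler = (saveRecord) => async (queueItem, context) => {
  const record = processQueueMessage(queueItem);
  await saveRecord(record);
  context.log(`Stored message from ${record.email}`);
};
```

With the Node.js v4 programming model for Azure Functions, a handler like this would typically be registered through the app object from @azure/functions, using whichever queue trigger matches your queue, and the scheduled variant through a timer trigger.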

The developer documentation for XM Cloud states that you can still do customizations, but we've concluded that this really is a SaaS product now and that some modifications are better off being moved outside of the CMS.
Another point worth mentioning is that we had jobs/agents running that turned out to require a lot of resources and could end up interrupting the publishing process, for instance.

Future

Over the years, many people have been bashing Sitecore for not supplying a native .NET Core version of the CMS. As a former employee of Sitecore Corporation, I can understand the sheer magnitude of work hours needed to rewrite the CMS for more modern .NET.

However, with the introduction of Pages, Components, Explorer, etc., all hosted outside of the CM server and only communicating with the CM via APIs, we're now seeing editing apps being built separately from the CM. I believe that one day, another piece of software will fulfill those API requests, essentially replacing the CM as we know it with a new CMS core. Who wants to make a bet?