Managing Shared Configuration Part 2: Configuration Snapshots

This is the second installment in a series presenting configuration management work that comes out of the Drutopia initiative.

In part 1 of this series, Configuration Providers, we introduced the problem of managing shared configuration, starting with how to determine what configuration is available.

But knowing what configuration is available is only part of the picture. To import configuration updates, we also need to answer the question: what's changed since the configuration was originally installed? That's where snapshotting comes in.

Comparing states

To break the problem down a bit, when we update configuration that's provided by extensions (modules, themes, or the install profile), we have three different states to consider:

  • The configuration as it was provided by an extension when we first installed or last updated.
  • The configuration as it's currently provided by the extension.
  • The configuration as it's currently saved in the site's 'active' storage.

The second and third states are easy--we just read what's there. But the first state is tricky. When we update to a new release of a module or theme, we lose the previous version of the configuration. Not only that, but there may have been many intermediate versions between when we last installed or updated and now. So even if we knew what the version was right before we updated, that may or may not be the same as the version we last installed or updated to.

What we need is a snapshot: a record of the state of the configuration at the time it was installed or updated.
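To make the three states concrete, here is a minimal sketch in plain PHP, assuming each state is simply an array keyed by configuration item name. The function name, the status labels, and the sample data are all hypothetical and not part of any Drupal API.

```php
<?php

/**
 * Classifies each item an extension provides by comparing the three states.
 * Hypothetical helper for illustration; not part of any Drupal API.
 */
function classify_changes(array $snapshot, array $provided, array $active): array {
  $result = [];
  foreach ($provided as $name => $data) {
    if (!array_key_exists($name, $snapshot)) {
      // No baseline: the extension added this item after the snapshot was taken.
      $result[$name] = 'new in extension';
      continue;
    }
    // Did the extension change the item since the snapshot?
    $extension_changed = $snapshot[$name] !== $data;
    // Did the site change (or delete) the item since the snapshot?
    $site_changed = !array_key_exists($name, $active) || $snapshot[$name] !== $active[$name];

    if ($extension_changed && $site_changed) {
      $result[$name] = 'changed in both';
    }
    elseif ($extension_changed) {
      $result[$name] = 'update available';
    }
    elseif ($site_changed) {
      $result[$name] = 'customized on site';
    }
    else {
      $result[$name] = 'unchanged';
    }
  }
  return $result;
}

// State 1: what the extension provided when we installed or last updated.
$snapshot = [
  'node.type.article' => ['name' => 'Article'],
  'views.view.blog' => ['label' => 'Blog'],
  'user.role.editor' => ['label' => 'Editor'],
];
// State 2: what the extension provides now.
$provided = [
  'node.type.article' => ['name' => 'Article', 'description' => 'News'],
  'views.view.blog' => ['label' => 'Blog'],
  'user.role.editor' => ['label' => 'Editor'],
];
// State 3: what is in the site's active storage.
$active = [
  'node.type.article' => ['name' => 'Article'],
  'views.view.blog' => ['label' => 'Our blog'],
  'user.role.editor' => ['label' => 'Editor'],
];

$status = classify_changes($snapshot, $provided, $active);
```

Without the first array there would be no way to tell whether a difference between the second and third came from the extension or from the site.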

And if we're searching for a model of configuration snapshots, we don't have far to look. Drupal core uses snapshotting as part of its solution for staging configuration.

Snapshots in core

Using Drupal core, you can stage configuration between different environments or versions of a given site--say, from a development environment to a testing one. For details of how this works, see the Configuration Management section of the Drupal 8 handbook. The Synchronizing Configuration Versions section of the Drupal 8 User Guide also reviews this functionality.

When you're staging configuration, you might see this message:

The following items in your active configuration have changes since the last import that may be lost on the next import.

How does Drupal know there are changes? Because it has a snapshot of configuration as last imported.

Core uses a service, config.storage.snapshot, to provide a storage for configuration snapshotting. Snapshots are stored in a database table.

One of the components Drupal 8 gets from its use of Symfony is a system for creating and responding to specific events.

Core uses an event to create snapshots. When configuration is staged (imported), the ConfigImporter dispatches an event, ConfigEvents::IMPORT. The config_snapshot_subscriber service subscribes to that event and, in its onConfigImporterImport() method, saves a snapshot. So the next time configuration is staged, there's a prior baseline to compare to.
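The flow can be sketched in plain PHP. This is only a toy model, assuming a tiny stand-in dispatcher (TinyDispatcher is hypothetical, not Symfony's or Drupal's class) and arrays in place of real storages. The event name string is used here only as a label.

```php
<?php

/** Minimal stand-in for an event dispatcher (hypothetical, not Symfony's). */
class TinyDispatcher {
  private array $listeners = [];

  public function addListener(string $event, callable $listener): void {
    $this->listeners[$event][] = $listener;
  }

  public function dispatch(string $event): void {
    foreach ($this->listeners[$event] ?? [] as $listener) {
      $listener();
    }
  }
}

$active = ['system.site' => ['name' => 'Old name']];
$snapshot = [];

$dispatcher = new TinyDispatcher();
// After each import, copy the active configuration into the snapshot.
$dispatcher->addListener('config.importer.import', function () use (&$active, &$snapshot): void {
  $snapshot = $active;
});

// Simulate an import: the active configuration is replaced, then the event fires.
$active = ['system.site' => ['name' => 'New name']];
$dispatcher->dispatch('config.importer.import');

// A later edit on the site now differs from the snapshot.
$active['system.site']['name'] = 'Edited name';
$changed = array_keys(array_filter(
  $active,
  fn($data, $name) => ($snapshot[$name] ?? null) !== $data,
  ARRAY_FILTER_USE_BOTH
));
```

Here $changed lists the items that have been edited since the last import, which is the information behind the warning message quoted above.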

Differences from the core use case

What core does for staging configuration is very similar to what we need. But there are important differences.

For one thing, core needs only a single storage. A site's configuration is staged as a whole, so all configuration can be snapshotted together.

But when we're bringing in configuration updates from modules and themes, we have many different sets of configuration. If we import configuration updates from a single extension, we want to compare against and then refresh the snapshot for just that extension. So, rather than a single snapshot for the whole site, we need multiple snapshots, one per extension that provides configuration.
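As a rough sketch of the idea, assume snapshots live in an array keyed by extension type and name. That key format and the function name are hypothetical choices for illustration only.

```php
<?php

// One snapshot per extension, keyed by extension type and name (hypothetical layout).
$snapshots = [
  'module.blog' => ['views.view.blog' => ['label' => 'Blog']],
  'module.events' => ['node.type.event' => ['name' => 'Event']],
];

/**
 * Refreshes the snapshot for a single extension, leaving the others alone.
 */
function refresh_snapshot(array $snapshots, string $extension, array $provided): array {
  $snapshots[$extension] = $provided;
  return $snapshots;
}

// Importing updates from the blog module touches only that module's snapshot.
$updated = refresh_snapshot($snapshots, 'module.blog', [
  'views.view.blog' => ['label' => 'Blog', 'description' => 'Recent posts'],
]);
```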

For another thing, in core the snapshot happens after staging configuration. Because the snapshot captures the result of staging, there's no need to stage the snapshot itself.

When importing configuration updates from extensions, however, the snapshot typically happens before staging, and what we stage includes the result of the update operation. So we do need to be concerned about staging the snapshots. That means, rather than being stored just in a database table, the snapshots themselves need to be stored as configuration that can be exported and staged.

Configuration Snapshot

The Configuration Snapshot module addresses the need for creating and using multiple configuration snapshots that themselves can be staged.

In Drupal 8, configuration can be stored using Drupal's entity system. To do so, you create a custom configuration entity type.

So the first step is to create a custom configuration_snapshot entity type. This allows us to store snapshots.

But reading and writing directly to an entity isn't going to work for our purposes. As noted in part 1 of this series, the standard ways to read, write, and compare configuration use configuration storages. So we provide a custom ConfigSnapshotStorage that provides all the standard configuration storage functionality, such as the ability to read from and write to a storage. Only in this case the storage reads from and writes to a snapshot configuration entity. Modules using configuration snapshots don't have to directly work with the snapshot entities. Instead, they can use storages that take care of the details.
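Here is a simplified sketch of that arrangement in plain PHP. The class names are hypothetical, the entity is reduced to a bare object holding an array, and only a handful of methods are shown. Their names mirror a few of the methods on core's StorageInterface, but this is not the module's actual code.

```php
<?php

/** Hypothetical stand-in for a snapshot configuration entity. */
class SnapshotEntity {
  /** Snapshotted items, keyed by configuration name. */
  public array $items = [];
}

/**
 * Storage-style wrapper around a snapshot entity.
 * The method names mirror a few methods of core's StorageInterface.
 */
class EntityBackedStorage {
  private SnapshotEntity $entity;

  public function __construct(SnapshotEntity $entity) {
    $this->entity = $entity;
  }

  public function exists(string $name): bool {
    return array_key_exists($name, $this->entity->items);
  }

  /** Returns the item's data, or FALSE if it is not in the snapshot. */
  public function read(string $name) {
    return $this->entity->items[$name] ?? false;
  }

  public function write(string $name, array $data): bool {
    $this->entity->items[$name] = $data;
    return true;
  }

  public function delete(string $name): bool {
    if (!$this->exists($name)) {
      return false;
    }
    unset($this->entity->items[$name]);
    return true;
  }

  /** Lists item names, optionally limited to those starting with a prefix. */
  public function listAll(string $prefix = ''): array {
    $names = array_keys($this->entity->items);
    return array_values(array_filter($names, function (string $name) use ($prefix): bool {
      return $prefix === '' || strpos($name, $prefix) === 0;
    }));
  }
}

// Calling code talks to the storage and never touches the entity directly.
$entity = new SnapshotEntity();
$storage = new EntityBackedStorage($entity);
$storage->write('views.view.blog', ['label' => 'Blog']);
$storage->write('node.type.article', ['name' => 'Article']);
```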

The Nitty Gritty

For those interested in the messy details, here they are!

To facilitate working with the configuration snapshot storages, we register a service for each storage.

The usual way to register a service is by adding it to a simple *.services.yml text file; see relevant documentation. In our case, though, we can't know in advance which services are needed on a given site since we need to register one service per installed module or theme that provides configuration. So we use a slightly more complex pattern to provide dynamic services.

An analogous requirement in core is for multilingual-related services. These services are needed only if a site is multilingual. Determining whether a site is multilingual, in turn, depends on the number of languages installed and enabled on the site. In other words, it depends on the current state of configuration.

Core handles this case by providing a service provider class, LanguageServiceProvider. Using the ::register() method, that class tests the installed languages and, if appropriate, dynamically registers the relevant services.

We do pretty much the same thing in ConfigSnapshotServiceProvider. We load all snapshot entities and register a service for each one.
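The pattern can be illustrated with a toy container in plain PHP. TinyContainer, the service ID format, and the use of ArrayObject as a placeholder storage are all assumptions made for this sketch. The real service provider works with Drupal's container builder instead.

```php
<?php

/** Minimal stand-in for a service container (hypothetical, not Symfony's). */
class TinyContainer {
  private array $factories = [];

  public function register(string $id, callable $factory): void {
    $this->factories[$id] = $factory;
  }

  public function has(string $id): bool {
    return isset($this->factories[$id]);
  }

  public function get(string $id) {
    return ($this->factories[$id])();
  }
}

/**
 * Registers one storage service per snapshot found on the site.
 */
function register_snapshot_services(TinyContainer $container, array $snapshots): void {
  foreach ($snapshots as $key => $items) {
    // The service ID is derived from the snapshot, so it can't be listed in advance.
    $container->register("snapshot_storage.$key", function () use ($items) {
      return new ArrayObject($items);
    });
  }
}

$container = new TinyContainer();
register_snapshot_services($container, [
  'module.blog' => ['views.view.blog' => ['label' => 'Blog']],
  'theme.bartik' => ['block.block.bartik_footer' => ['region' => 'footer']],
]);
```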

Often you need to call the same snippets of code in multiple contexts. One pattern for doing this is to use a trait. Traits allow code reuse. More technically, as explained in the PHP documentation, traits

[enable] a developer to reuse sets of methods freely in several independent classes living in different class hierarchies.

Traits are used extensively in Drupal core. One of the most commonly used traits provides text (string) translation. It's called, appropriately enough, StringTranslationTrait.

There's a bit of complexity to using snapshot configuration storages. If there's an existing service we want to use it, but if not we need to spin up a new storage. So we make this happen in a trait, ConfigSnapshotStorageTrait.
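In outline, the trait's job looks something like the following. This is a sketch under stated assumptions: the trait, method, and class names are hypothetical, registered services are modelled as a plain array, and ArrayObject stands in for a snapshot storage.

```php
<?php

trait SnapshotStorageLookupTrait {

  /**
   * Returns the registered storage if there is one, otherwise a new one.
   */
  protected function getSnapshotStorage(array $services, string $extension): ArrayObject {
    $id = "snapshot_storage.$extension";
    return $services[$id] ?? new ArrayObject();
  }

}

/** Any class that needs snapshot storages can pull in the trait. */
class UpdateLister {
  use SnapshotStorageLookupTrait;

  public function storageFor(array $services, string $extension): ArrayObject {
    return $this->getSnapshotStorage($services, $extension);
  }
}

$existing = new ArrayObject(['views.view.blog' => ['label' => 'Blog']]);
$services = ['snapshot_storage.module.blog' => $existing];

$lister = new UpdateLister();
$found = $lister->storageFor($services, 'module.blog');
$fresh = $lister->storageFor($services, 'module.events');
```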

A final issue addressed in Configuration Snapshot has to do with deciding what differences are actually significant.

Sometimes what shows up as a difference can turn out to be irrelevant. For example, a user role in Drupal can have multiple permissions assigned to it. The order of those permissions in a piece of configuration makes no difference to what permissions the role will have. So, if the order of permissions is different between a prior snapshot and the current configuration state, there's no actual update to import.

To avoid spurious updates, we normalize the data by sorting in a predictable way. Here we base our solution on relevant code from what is by far the most-used Drupal 8 contributed module for configuration management: Configuration Update Manager. We crib from the ConfigDiffer::normalizeArray() method in that module that normalizes configuration.
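To show the general idea, here is a simplified normalizer in plain PHP. It is not the code of ConfigDiffer::normalizeArray(). The function names are hypothetical, and the rule it applies (sort keyed arrays by key, sort plain lists by value) is an assumption chosen to fit the permissions example.

```php
<?php

/** Returns TRUE if the array is a plain list with keys 0, 1, 2, and so on. */
function is_sequential(array $data): bool {
  return $data === [] || array_keys($data) === range(0, count($data) - 1);
}

/**
 * Recursively sorts configuration data into a predictable order.
 * Hypothetical helper; lists of scalars are the case it handles sensibly.
 */
function normalize_config(array $data): array {
  foreach ($data as $key => $value) {
    if (is_array($value)) {
      $data[$key] = normalize_config($value);
    }
  }
  if (is_sequential($data)) {
    // A list such as permissions: order by value.
    sort($data);
  }
  else {
    // A keyed array: order by key.
    ksort($data);
  }
  return $data;
}

$snapshot_role = [
  'id' => 'editor',
  'permissions' => ['edit content', 'access content'],
];
$current_role = [
  'permissions' => ['access content', 'edit content'],
  'id' => 'editor',
];

$differs_raw = $snapshot_role !== $current_role;
$differs_normalized = normalize_config($snapshot_role) !== normalize_config($current_role);
```

Before normalizing, the two roles look different. After normalizing, they compare as identical, so no update would be reported.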

The StorageComparer class in core is used to compare one storage to another to determine differences (items to create, update, delete, or rename). To test for updates, StorageComparer::addChangeListUpdate() uses a strict comparison between a given item as read from two different storages. Because PHP treats two arrays as identical only if their keys and values appear in the same order, differences in ordering will indeed show up as available updates. This is the relevant line:

if ($source_data !== $target_data) {
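A quick standalone check in plain PHP shows why ordering matters to this comparison. The sample arrays are made up for illustration.

```php
<?php

// Same key/value pairs, different key order.
$source_data = ['id' => 'editor', 'label' => 'Editor'];
$target_data = ['label' => 'Editor', 'id' => 'editor'];

// Loose equality ignores key order in keyed arrays.
$same_pairs = ($source_data == $target_data);
// Strict identity also requires the same order.
$identical = ($source_data === $target_data);

// For plain lists, a different order fails even loose equality,
// because the values sit under different numeric keys.
$lists_equal = (['access content', 'edit content'] == ['edit content', 'access content']);
```

Here $same_pairs is TRUE while $identical and $lists_equal are both FALSE.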

To change the behaviour of the storage comparer, we need to alter the way configuration is read from the storage. In that alter, we'll add normalization. That way, when the storage comparer does its work, insignificant differences won't show up.

Importantly, we don't want to change our storage so it always normalizes. Doing so could have unintended consequences. Rather, we want to provide the option of normalizing on read, and trigger that option only when we're reading for the purposes of comparison. To do so, we need to change a standard configuration storage into one that will optionally normalize data on read.

We have an interface to work with, core's StorageInterface. We extend this interface to add our own methods. Our custom interface, NormalizableStorageInterface, defines methods that relate to whether data should be normalized on read.

Then, in our ConfigSnapshotStorage, we implement NormalizableStorageInterface by providing all the public methods required by core's StorageInterface and our custom ones defined in NormalizableStorageInterface. That way, we can invoke normalization when needed while leaving the default read behaviour for all other cases.
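The shape of this arrangement can be sketched in plain PHP. The interface and method names below are hypothetical stand-ins, not the actual signatures of NormalizableStorageInterface, and the normalization is reduced to a top-level key sort for brevity.

```php
<?php

/** Stand-in for the read side of a standard storage interface. */
interface ReadableStorage {
  public function read(string $name);
}

/** Adds an opt-in switch for normalizing data on read (hypothetical names). */
interface NormalizingStorage extends ReadableStorage {
  public function setNormalizeOnRead(bool $normalize): void;
  public function isNormalizingOnRead(): bool;
}

class ArrayStorage implements NormalizingStorage {
  private array $items;
  private bool $normalize = false;

  public function __construct(array $items) {
    $this->items = $items;
  }

  public function setNormalizeOnRead(bool $normalize): void {
    $this->normalize = $normalize;
  }

  public function isNormalizingOnRead(): bool {
    return $this->normalize;
  }

  public function read(string $name) {
    if (!isset($this->items[$name])) {
      return false;
    }
    $data = $this->items[$name];
    if ($this->normalize) {
      // Top-level key sort only, to keep the sketch short.
      ksort($data);
    }
    return $data;
  }
}

$storage = new ArrayStorage(['user.role.editor' => ['label' => 'Editor', 'id' => 'editor']]);

// Default behaviour: data comes back exactly as stored.
$raw = $storage->read('user.role.editor');

// Switch normalization on only for the comparison, then switch it back off.
$storage->setNormalizeOnRead(true);
$normalized = $storage->read('user.role.editor');
$storage->setNormalizeOnRead(false);
```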

Potential enhancements

The normalization we're doing is needed to minimize spurious differences showing up, but it cuts a lot of corners. For one, not all differences in ordering can be safely ignored. Some may in fact include important data that we're incorrectly filtering out when we update. Beyond that, a module focused on snapshotting is not a great place to put a general-use interface for normalizing config data. The whole point of normalizing is to facilitate comparing different storages, and in our case that means comparing a snapshot to another storage where we will also need to normalize.

There are a couple of issues on Configuration Update Manager to try to sort these questions out.

Related core issue

Next up

Stay tuned for the next post in this series: Respecting Customizations.
