
Add lost configuration checking to deploy command. #5718

Open
wants to merge 3 commits into base: 12.x

Conversation

adamzimmermann
Contributor

@adamzimmermann adamzimmermann commented Jul 25, 2023

This replaces #5713


Motivation

Current State

I added an update hook that simply did this and it failed as expected. ✅

/**
 * Force a failure in the drush deploy command.
 */
function example_update_8004() {
  \Drupal::service('module_installer')->uninstall(['some_module']);
}
(Screenshot: failing drush deploy output, 2023-07-24 8:07 PM)

I tested this code on the 11.x branch and then ported it to 12.x. I wasn't able to fully test it there, as I don't have time to scaffold an environment that is 12.x friendly, but it looks like it will work.

If someone else could test this and help fully validate it, that would be great.

I also added logging to give visibility into the config items affected. Hopefully we are all on board with that and find it valuable. Happy to adjust as needed though.

@bircher
Contributor

bircher commented Aug 1, 2023

Hi, thanks for the PR, but I think this is the wrong approach.
Update hooks (hook_update_N) should definitely be allowed to change configuration.
The workflow should be:

  1. run composer update to get new versions of the module
  2. run drush updb to run update hooks
  3. run config export to export the changed config to the files
  4. commit everything and push

on the other system

  1. run composer install (which updates the module)
  2. run drush deploy (which runs update hooks and config import)

So this PR would break the recommended workflow. How else are contrib modules supposed to install other modules that they now depend on in a new version? This hasn't fundamentally changed since my presentation at DevDays in Seville (slides 39 and 40: https://www.slideshare.net/nuvoleweb/advanced-configuration-management-with-config-split-et-al )

@bircher
Contributor

bircher commented Aug 1, 2023

I had a small chat with @weitzman and I think it would be helpful to illustrate why I think failing on a perfectly ok scenario is not the right solution.

Imagine I maintain a contrib module and your site is using it.
Now in the next version I refactor my module to separate the UI into a _ui submodule, because I want to be as cool as Views or Fields. But I know that just making the UI disappear for you would upset you, so I install the UI module in an update hook.
You followed the best practice: in your development environment you ran composer update and drush updb, and exported the config.
But when you deploy this change, then before the update hooks run, the config status will show the module as not installed, and after the update hooks it shows no diff because the module is already installed.
With this PR your deployment would now fail.
But it is not simply a matter of comparing the wrong thing here. Because what if, after you ran the update hooks, you decided to uninstall the new UI module because you don't want to have it in production? Then the deployment would also fail, with there not being a diff before the update hooks but one afterwards.

@markdorison

@bircher I totally get the scenario you describe. Maybe "fail" is the wrong approach, but ideally we want to somehow notify or flag users, via automated checks and workflows, that an update hook has produced config changes that need to be exported.

D.O. #3110362: If an update hook modifies configuration, then old configuration is imported, the changes made by the update hook are forever lost

This issue recently affected us with the release of Drupal 10.1. #1845004: Replace custom password hashing library with PHP password_hash() included an update hook to enable the new phpass core module. Once that update hook ran, the module was enabled only to be disabled when config:import was subsequently run.

We have automated tests that check the status of Drupal config state, but these did not fail because once config:import had run (via drush deploy), the configuration on disk matched the active configuration as far as Drupal was concerned.

If we are allowing folks to believe that configuration on disk is their source of truth and then modifying active configuration with an update hook, this seems like it will inevitably cause issues like this.

@weitzman
Member

weitzman commented Aug 1, 2023

Yeah, I think this is still valuable - we need to decide on behavior. My initial thought is to add an option --ignore-mismatched-config which logs a warning instead of throwing an Exception. I don't think many sites will choose to use it but it is there for the scenario described by @bircher.

@adamzimmermann
Contributor Author

I'm late to the party, but wanted to chime in quick too. Thank you all for keeping this moving and the good discussion. 🙏🏼

@bircher those are both valid scenarios that concern me too. In fact it is scenarios and workflows just like the ones you described that this is meant to help address.

The problem is that whatever state we end up with after database updates won't matter, as we will lose whatever (config) changes were made in the update hooks when config is imported. So the net result is that the new UI module will be enabled and then subsequently disabled. This check would simply flag that and allow the user to react to the issue, which would likely entail following the exact workflow you outlined above.


Yeah, I think this is still valuable - we need to decide on behavior. My initial thought is to add an option --ignore-mismatched-config which logs a warning instead of throwing an Exception. I don't think many sites will choose to use it but it is there for the scenario described by @bircher.

That sounds like a good plan to me. Should I add that to this PR?


But it is not simply a matter of comparing the wrong thing here. Because what if after you ran the update hooks, you decided to uninstall the new UI module because you don't want to have it in production? Then the deployment would also fail with there not being a diff before the update hooks but one afterwards.

You are correct that the check assumes we would want to export/commit any config changes made in an update hook into code. When that is not the case, the scenario you outlined is an issue. However, I would assume that most of the time we do want to commit configuration changes made by an update hook to code, so adding a flag that allows this to succeed in those cases seems like a good compromise to me. If this or similar scenarios occur more often than I think they do, then we need to rethink this, though.


But when you deploy this change then before update hooks ran the config status will show the module not installed and after the update hooks it shows no diff because the module is already installed.

I'm not sure I'm following this scenario. 🤔 If update hooks are the first step in any deployment, how would the module already be enabled?

@weitzman
Member

I added a new ignore-mismatched-config option. Is that good enough, @bircher? Any thoughts, @alexpott?

@adamzimmermann
Contributor Author

@weitzman did you forget to push that up to this branch? I don't see any new commits. 👀

@weitzman
Member

Pushed. Had some trouble with github CLI.

@adamzimmermann
Contributor Author

The new changes look great to me FWIW. ✅

Thank you for adding that. 🙏🏼

@alexpott
Contributor

I think this would be a tricky change to make: every deployment that changes config outside of a hook_update_N or post-update hook would need to use the new flag. I'm not sure that this makes sense to me. In my experience deployments are not purely module and core updates, but often contain new features built from config, like new fields/content types, views, etc.

@markdorison

In my experience deployments not purely module and core updates but often contain new features built from config like new fields / content types, views etc..

I believe this is one of the scenarios that this change intends to highlight so that these changes can be exported to configuration. Is there a scenario where configuration changes are made via an update hook that you don't want to export to config on disk? Maybe I do not understand.

@benvoynick

Saw this come up on Drupal Slack. I am also concerned about this change. Even if individual commits don't mix exported database-update config with other config changes, at my workplace we often deploy multiple commits/concerns at once to at least some environments, such as production, because we deploy to them less often than to QA/lower environments. Having the deploy behavior simply change, particularly in the middle of the 12.x lifecycle, would be highly disruptive.

I get that the "phpass scenario" is a big problem. (And I've already seen it happen at my own workplace.) However, putting overall constraints on what kind of configuration package can be deployed up seems problematic for all the reasons already outlined.

If nothing else, it seems to me that it would be appropriate to turn the option the other way around: default to a warning, and add a flag to opt in to throwing an exception.

I believe this is one of the scenarios that this change intends to highlight so that these changes can be exported to configuration. Is there a scenario where configuration changes are made via an update hook that you don't want to export to config on disk? Maybe I do not understand.

A good example is the Scheduler module. There were changes between the 1.x and 2.x modules to form widget display. The update hook has some logic to automatically add or update form displays accordingly. That hook outputs a note that it is making its automated best judgment regarding these form display changes, and I believe it in fact encourages the reader to double check its work and make changes if necessary, e.g. unhiding or hiding Scheduler widgets if one disagrees with the update hook's decision.

@adamzimmermann
Contributor Author

If nothing else, seems to me it would be appropriate to turn the option the other way around: default to a warning and have a flag to opt in to throwing an exception.

That sounds good to me.


A good example is the Scheduler module. There were changes between the 1.x and 2.x modules to form widget display. The update hook has some logic to automatically add or update form displays accordingly. That hook outputs a note that it is making its automated best judgment regarding these form display changes, and I believe it in fact encourages the reader to double check its work and make changes if necessary, e.g. unhiding or hiding Scheduler widgets if one disagrees with the update hook's decision.

This really helped me understand the concern better. Thank you for sharing this. I have not encountered an update hook like that before.

@markdorison

I believe this is one of the scenarios that this change intends to highlight so that these changes can be exported to configuration. Is there a scenario where configuration changes are made via an update hook that you don't want to export to config on disk? Maybe I do not understand.

A good example is the Scheduler module. There were changes between the 1.x and 2.x modules to form widget display. The update hook has some logic to automatically add or update form displays accordingly. That hook outputs a note that it is making its automated best judgment regarding these form display changes, and I believe it in fact encourages the reader to double check its work and make changes if necessary, e.g. unhiding or hiding Scheduler widgets if one disagrees with the update hook's decision.

If Scheduler is "updating form displays" in an update hook and then config-import is run subsequently by drush deploy, won't those changes be overwritten by the configuration on disk? This sounds like a scenario that this PR is trying to solve for.

@benvoynick

If Scheduler is "updating form displays" in an update hook and then config-import is run subsequently by drush deploy, won't those changes be overwritten by the configuration on disk? This sounds like a scenario that this PR is trying to solve for.

I'm thinking of the following scenario:

  1. On a local site instance, I update the Scheduler module code (presumably via Composer).
  2. I run database updates.
  3. I make further changes to some of the same config Scheduler changed in its database updates. It so happens that some of these changes make it so that a config file goes back to exactly the state it was in before the database update. (Real example: Scheduler resets form display weights back to their default, even if they were already customized under 1.x. Maybe I don't like that, and maybe the update hook didn't need to make any other changes.)
  4. I export my configuration changes, which are now a mixture of database update changes and my own changes.

If I then commit the above and run a deployment process that uses drush deploy, under this PR Drush would flag my commit/deployment for "reverting" the effects of the database update, right? Because it'll see these form display configs are slated to be altered by config import.

@adamzimmermann
Contributor Author

adamzimmermann commented Sep 26, 2023

If I then commit the above and run a deployment process that uses drush deploy, under this PR Drush would flag my commit/deployment for "reverting" the effects of the database update, right? Because it'll see these form display configs are slated to be altered by config import.

These changes would not flag that. Let me try to explain below.


This PR does the following:

  1. Uses drush config:status to retrieve the diff between active configuration and tracked configuration before any deployment commands and then stores that in a variable.
  2. Runs update hooks and config import as we always have.
  3. Uses drush config:status to retrieve the diff between active configuration and tracked configuration after deployment commands and then stores that in a variable.
  4. Uses the two stored differential lists at those two points/states in time to see if there is a relative diff between them. Keep in mind we are comparing the relative differences between active/tracked configuration at those two points in time, not the diff created by importing configuration.

For the scenario above, you wouldn't encounter this at all because you exported your configuration, which would then be imported, and then there wouldn't be any additional subsequent diff between the active configuration state and the tracked configuration state.


I export my configuration changes, which are now a mixture of database update changes and my own changes.

This PR is seeking to catch scenarios where you did not export your changes and flag for you that a relative diff exists in active vs tracked configuration which can often mean that configuration was not exported after an update hook. It doesn't care where the changes came from as long as you export configuration to remove the diff between active and tracked configuration.
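In Python terms, the check described in steps 1-4 boils down to a set difference over the config item names reported at the two points in time. This is only an illustrative sketch, not the PR's actual PHP; the dict shape assumes drush config:status --format=json returns an object keyed by config name:

```python
def find_lost_config(original_diff: dict, new_diff: dict) -> list:
    """Config items that differ from tracked config after update hooks
    but did not differ before them. These are the changes that the
    subsequent config import would silently revert. Mirrors the PR's
    array_diff() over the array keys of the two config:status results."""
    return sorted(set(new_diff) - set(original_diff))

# Before update hooks: one pre-existing (tolerated) diff.
original = {"system.site": "Different"}
# After update hooks: a hook enabled a module, so core.extension now differs too.
new = {"system.site": "Different", "core.extension": "Different"}

print(find_lost_config(original, new))       # ['core.extension'] -> deploy flags this
print(find_lost_config(original, original))  # [] -> nothing new, deploy proceeds
```

Pre-existing diffs (the 🟡 in the example below) pass through untouched; only items that newly appear in the diff after the update hooks are reported.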


Perhaps some illustrated examples would help us all understand this more clearly. 🤞🏼

For the sake of this example, each color could be a line in a config file or just some discrete piece of config.

Failure:

  1. Tracked Configuration: 🔴 🟠
  2. Active Configuration: 🔴 🟠 🟡
  3. Record Diff (🟡)
  4. Run Update Hook (🟢)
  5. Tracked Configuration: 🔴 🟠
  6. Active Configuration: 🔴 🟠 🟡 🟢
  7. Record Diff (🟡,🟢)
  8. Command Fails as 🟡,🟢 != 🟡 (new relative diff)

Success:

  1. Tracked Configuration: 🔴 🟠
  2. Active Configuration: 🔴 🟠 🟡
  3. Record Diff (🟡)
  4. Run Update Hook (🟢)
  5. Export Configuration
  6. Tracked Configuration: 🔴 🟠 🟢
  7. Active Configuration: 🔴 🟠 🟡 🟢
  8. Record Diff (🟡)
  9. Command Succeeds as 🟡 = 🟡 (no new relative diff)

Ideally there is no 🟡 diff either, but we often don't work in ideal environments. So this change allows for those differences to continue to exist without causing issues.
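The two colour walkthroughs above can be simulated directly. A minimal Python sketch, with the emoji standing in for discrete config items exactly as in the example (the function name is mine, not from the PR):

```python
def new_relative_diff(tracked_before, active_before, tracked_after, active_after):
    """Items in the post-update diff that were not in the pre-update diff,
    i.e. the 'new relative diff' the proposed check fails on."""
    diff_before = active_before - tracked_before
    diff_after = active_after - tracked_after
    return diff_after - diff_before

# Failure case: update hook adds 🟢 to active config, nothing was exported.
fail = new_relative_diff({"🔴", "🟠"}, {"🔴", "🟠", "🟡"},
                         {"🔴", "🟠"}, {"🔴", "🟠", "🟡", "🟢"})
print(fail)  # {'🟢'} -> command fails

# Success case: 🟢 was exported, so tracked config contains it too.
ok = new_relative_diff({"🔴", "🟠"}, {"🔴", "🟠", "🟡"},
                       {"🔴", "🟠", "🟢"}, {"🔴", "🟠", "🟡", "🟢"})
print(ok)  # set() -> command succeeds
```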


Does that help clear this up?

@adamzimmermann
Contributor Author

I just added an explanation to my comment above clarifying my example, as I realized it might not be as clear as it was in my head. 🙈

Comment on lines +52 to 74
// Record the state of configuration after database updates.
$process = $manager->drush($self, ConfigCommands::STATUS, [], $configStatusOptions);
$process->mustRun();
$newConfigDiff = $process->getOutputAsJson();

// Check for new changes to active configuration that would be lost
// during the subsequent configuration import.
if ($originalConfigDiff !== $newConfigDiff) {
  $configDiff = array_diff(
    array_keys($newConfigDiff),
    array_keys($originalConfigDiff),
  );
  $this->logger()->warning(dt('The following config changes will be lost during config import: :config', [
    ':config' => implode(', ', $configDiff),
  ]));
  if (!$options['ignore-mismatched-config']) {
    throw new \RuntimeException('Update hooks altered config that is about to be reverted during config import. Use --ignore-mismatched-config to bypass this error. Aborting.');
  }
}

$this->logger()->success("Config import start.");
$process = $manager->drush($self, ConfigImportCommands::IMPORT, [], $redispatchOptions);
$process->mustRun($process->showRealtime());

This does not track what you explained in your last comment.
Here the second diff is created after the update hook has run but before the config import.

@bircher
Contributor

bircher commented Sep 27, 2023

The pull request does not do what you describe here, because you say "2. Runs update hooks and config import as we always have." But in the pull request the config import comes after the second config status.

Your example with the colours is also not entirely correct, because you cannot do a config export during the deployment.

Success would be:

  • Tracked Configuration: 🔴 🟠 🟢
  • Active Configuration: 🔴 🟠 🟡
  • Record Diff (🟡, 🟢)
  • Run Update Hook (🟢)
  • Tracked Configuration: 🔴 🟠 🟢
  • Active Configuration: 🔴 🟠 🟡 🟢
  • Record Diff (🟡)
  • Command Succeeds as 🟡, 🟢 includes 🟡 (no new relative diff)

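bircher's "includes" criterion is a subset check rather than an equality check: the deploy may proceed whenever everything still differing after the update hooks already differed before them. A sketch in Python (hypothetical naming; the emoji stand in for config items as above):

```python
def deploy_ok(diff_before: set, diff_after: set) -> bool:
    """Succeed when every item still differing after update hooks already
    differed before them, i.e. the hooks introduced no new relative diff."""
    return diff_after <= diff_before  # subset test, not equality

print(deploy_ok({"🟡", "🟢"}, {"🟡"}))  # True: the hook reconciled 🟢
print(deploy_ok({"🟡"}, {"🟡", "🟢"}))  # False: the hook introduced a new diff
```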
@bircher
Contributor

bircher commented Sep 27, 2023

But I am arguing that this is still wrong!

Because I might have removed 🟢 on purpose from the "tracked config" during site building, after I ran the update hooks locally.
But I cannot prevent the update hook from running (other than patching it); update hooks are not optional, and there is no UI to choose which update hooks to run. Update hooks are meant to fix the database so that it works with the newly deployed codebase.

@AlexSkrypnyk

Hi all

There are workflows that use tools for automated dependency updates (like Renovate bot) that automatically update modules. While those module updates still run a set of automated tests on the consumer projects, not having an explicit failure would not notify the dev team. As a result, those module updates will automatically go into production (in cases where an auto-patching workflow is used). With a hard fail, the dev team would not need to read auto-update logs, but would instead see the update failures.

So, whatever the outcome of this discussion is, whether to have a flag or not, could you please consider using a non-zero exit code on failure.

Thank you
