[Exherbo-dev] Config management ideas

Ilya A. Volynets-Evenbakh ilya at total-knowledge.com
Mon Nov 23 13:18:44 GMT 2009

Hello everyone.

I already mentioned my ideas about proper config file
management on IRC. Now I'd like to start a real discussion.

Latest document is at http://www.theilya.com/docs/config-system.txt

Below is the text, for convenience. Please comment.

Managing configuration using etc-update (or any similar tool)
is a nightmare for any complex system. This proposal tries to
come up with a better solution.

This is a very rough draft at this point. I'll appreciate any
comments. Well, almost any. Send them to ilya at theilya.com.
Put [config-system] in the subject.


Goals

1. Single desktop scenario should be no more of a burden on
   end-user than it is now.
2. It should be easy to see history of configuration file changes
3. It should be easy to roll back
4. It should be easy to manage common configuration between multiple machines


Use SCM with proper branching and merging capabilities as back end
for storing configuration. I will use git as an example, since it
has very good support for such things. Subversion, at the time of
this writing, seems to be incapable of doing it well enough. Other
SCMs might work, though. Requirements for an SCM are:
- Atomic commits
- Proper tracking of branch merge states
- Ability to push commits between repositories
- Ability to track sync state of multiple repositories

Scenarios

1. Single user, single desktop
2. Organization with groups of similar desktops
   (think beancounter machines, developers' machines, etc.)
3. Few HA or HPC clusters

Primary difference between scenarios 2 and 3 is that machines
in a cluster get upgraded/reconfigured almost simultaneously,
while in case #2 each individual machine can be upgraded at any
time.


If the same version of same package is compiled on two different
machines with identical package manager configuration, all config
files generated by installing such package will be identical.
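
If this assumption holds, a package state can be named by package,
version and options alone, which is what the tags used later in this
proposal rely on. A minimal Python sketch follows; the function name
config_tag, the "pkgmgr/" prefix and the hashing of enabled options are
all my own assumptions for illustration, not part of any package manager.

```python
import hashlib
from typing import Mapping


def config_tag(package: str, version: str, options: Mapping[str, bool]) -> str:
    """Return a deterministic tag name for one installed package state."""
    # Only enabled options are hashed, in sorted order, so the result does
    # not depend on the order in which the options were listed.
    enabled = ",".join(sorted(name for name, on in options.items() if on))
    digest = hashlib.sha256(enabled.encode("utf-8")).hexdigest()[:12]
    # Hypothetical naming scheme: prefix/package/version/options-digest.
    return f"pkgmgr/{package}/{version}/{digest}"


if __name__ == "__main__":
    print(config_tag("app-editors/vim", "7.2", {"python": True, "gtk": False}))
```

Two machines in the same group would compute the same name for the same
build, so the config files only need to be committed once.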

"Machine group" - set of machines which share master config tree,
and will have identical package manager configuration.

1. Should we stick with CONFIG_PROTECT concept, or make ebuilds
   specify each config file explicitly?
 A: No, drop that. Package manager owned files can be defined
    as configuration by ebuild authors. Package manager will
    provide sane defaults (like /etc) to be config-managed,
    but an ebuild can remove paths from them.
    Non package manager owned files can be marked for config
    management by user. Optionally, user may associate custom
    config files with specific atoms, so that they get a
    reminder when corresponding packages are updated or removed.
2. What is the best way to track ownership and permissions of
   config files?
 A: package manager will provide it to the config system, and
    internally we could just keep a file in mkdevs format somewhere.
3. What is the best way to recover from inconsistencies between
   config files generated by installs of the same package?
4. How do we handle reinstalls and use flag changes?
5. What other factors can legitimately affect generated config
   files?
6. What information should package manager itself be aware of?
   (E.g. what paths do we record?)

Config repository organization.
 master tree - one where package manager keeps all the changes.
	It is never directly modified. May be non-local.
 secondary tree - local copy of master tree. Optional.
	Can be pulled from, but never pushed into. Updated by
	pulling from master tree. Can be promoted to become master.
 root tree - one checked out into /. End user may modify files there.
 temp tree - one where all the work is done. It is clone of the master
	tree, created to perform some action, and then destroyed.
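
The four tree roles can be summed up as a small table of permitted
commit flows. This is only a sketch of how I read the description above.
The names Tree, ALLOWED_FLOWS and may_flow are made up for illustration,
and the secondary-to-root flow is my assumption, since the text does not
say who pulls from a secondary tree.

```python
from enum import Enum


class Tree(Enum):
    MASTER = "master"
    SECONDARY = "secondary"
    ROOT = "root"
    TEMP = "temp"


# (source, destination) pairs: commits may flow from source to destination.
ALLOWED_FLOWS = {
    (Tree.MASTER, Tree.TEMP),       # temp tree is a clone of master
    (Tree.TEMP, Tree.MASTER),       # finished work is pushed back to master
    (Tree.MASTER, Tree.SECONDARY),  # secondary is updated by pulling from master
    (Tree.MASTER, Tree.ROOT),       # root tree pulls from master
    (Tree.SECONDARY, Tree.ROOT),    # assumption: root may pull from a secondary
}

# Only the root tree is edited by hand; master is never modified directly.
USER_EDITABLE = {Tree.ROOT}


def may_flow(source: Tree, destination: Tree) -> bool:
    """Return True if commits may move from source to destination."""
    return (source, destination) in ALLOWED_FLOWS


if __name__ == "__main__":
    print(may_flow(Tree.TEMP, Tree.MASTER))
    print(may_flow(Tree.ROOT, Tree.SECONDARY))
```

Note that edits made in the root tree reach the master only by way of
a temp tree, so there is no direct root-to-master entry.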

There will be the following branches:
 - pkgmgr - branch where all the config files are generated by package manager
 - "group" - branches which have all the changes, common to some group of machines
 - "machine" - branch where the final machine-specific changes are made
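
The order in which these branches feed each other can be written down
as a list of merges. The Python sketch below assumes the branches form
a single chain (pkgmgr, then each group level, then the machine); the
function name merge_chain and the branch names in the example are
hypothetical.

```python
from typing import List, Sequence, Tuple


def merge_chain(group_branches: Sequence[str], machine_branch: str) -> List[Tuple[str, str]]:
    """Return (source, target) merges, in the order they should be performed."""
    # Assumption: changes flow pkgmgr -> group level 1 -> ... -> machine.
    chain = ["pkgmgr", *group_branches, machine_branch]
    return list(zip(chain, chain[1:]))


if __name__ == "__main__":
    for source, target in merge_chain(["group/developers"], "machine/host1"):
        print(f"merge {source} into {target}")
```

With no group branches configured, the chain collapses to a single
merge of pkgmgr into the machine branch, which is the single desktop case.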

Steps during package management ((re)install, upgrade, remove):

0. package manager checks that root tree has no changes
   which weren't pushed to the master tree.
1. package manager saves list of config files from
   previously installed version
2. package manager creates a tree containing new
   configuration files
3. Package manager lets configurator know which packages
   have been changed (reinstalled, upgraded, or removed)
4. configurator client is called by end user.
5. pkgmgr branch is checked out into the temp tree
6. configurator iterates over changed packages, skipping a package
   if a tag with the corresponding package version and options
   is already present, and otherwise committing its config changes
   to the pkgmgr branch and tagging the commit with version info.
7. pkgmgr branch is merged into group branch(es), and
   then into the branch for local machine.
   All merges get tagged with installed package state
8. All changes are pushed back into the master tree
9. root tree pulls from the master.
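
The step that iterates over changed packages is the core of the scheme:
a package state that already has a tag is skipped, everything else is
committed and tagged. Here is a toy Python model of just that step. It
keeps commits and tags in memory instead of calling git, and the class
PkgmgrBranch, the function record_changed_packages and the tag names
are all invented for this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass
class PkgmgrBranch:
    """In-memory stand-in for the pkgmgr branch: commits plus tags."""
    commits: List[Tuple[str, Dict[str, str]]] = field(default_factory=list)
    tags: Dict[str, int] = field(default_factory=dict)  # tag name -> commit index

    def commit(self, message: str, files: Dict[str, str]) -> int:
        self.commits.append((message, dict(files)))
        return len(self.commits) - 1


def record_changed_packages(
    branch: PkgmgrBranch,
    changed: Iterable[Tuple[str, Dict[str, str]]],
) -> List[str]:
    """Commit and tag config files for each package state not seen before."""
    recorded = []
    for tag, files in changed:
        if tag in branch.tags:
            # Same version and options were committed earlier: skip.
            continue
        branch.tags[tag] = branch.commit(f"config files for {tag}", files)
        recorded.append(tag)
    return recorded


if __name__ == "__main__":
    branch = PkgmgrBranch()
    changed = [("vim-7.2", {"/etc/vim/vimrc": "set nocompatible\n"})]
    print(record_changed_packages(branch, changed))
    print(record_changed_packages(branch, changed))
```

Running it twice with the same input commits nothing the second time,
which is what lets a second machine in the group reuse the first
machine's work.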

Special case for a package downgrade:

If a package downgrade is performed, and configurator
sees a tag in the pkgmgr branch with the newly installed
version and options, the user is offered the option to roll back
the corresponding config files to the version on the machine
branch right before the update to the next version of the same
package.
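
Finding the commit to roll back to could look roughly like the Python
sketch below. It assumes the machine branch history has already been
filtered down to one package and is given oldest first, with merges
carrying their package state tag and user commits carrying None. The
function name rollback_target and the commit ids are hypothetical.

```python
from typing import List, Optional, Tuple


def rollback_target(
    machine_history: List[Tuple[str, Optional[str]]],
    downgraded_tag: str,
) -> Optional[str]:
    """Return the commit right before the update away from downgraded_tag."""
    seen = False
    previous = None
    for commit_id, tag in machine_history:
        if seen and tag is not None:
            # First tagged merge after the target: the update to the next
            # version. The commit just before it is the rollback point.
            return previous
        if tag == downgraded_tag:
            seen = True
        previous = commit_id
    # Target state was never merged, or nothing was installed after it.
    return None


if __name__ == "__main__":
    history = [("c1", "vim-7.1"), ("c2", None), ("c3", "vim-7.2"), ("c4", None)]
    print(rollback_target(history, "vim-7.1"))
```

In the example, c2 is a user change made while 7.1 was installed, so
rolling back to it keeps that change instead of returning to the
pristine 7.1 files.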

Steps for making changes to configuration:

1. The user makes modifications to config files
2. The user runs configurator in commit mode
3. Configurator checks out next level of branch
   into the temp tree.
4. Configurator checks if user has a next level
   group branch configured. If not, skip to step 7.
5. Configurator iterates over changes in the root
   tree, prompting user to decide which of them belong
   to the group. Each change selected as group is
   applied to the temp tree. Once done, temp tree changes
   are committed.
6. Configurator merges changes from current branch
   into the next branch.
7. Configurator applies remaining changes to the temp tree.
8. Configurator commits the temp tree, and pushes it
   to master tree.
9. Configurator resets root tree, then pulls master tree.
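
The part of commit mode that decides where each change goes can be
sketched in a few lines of Python. The callback stands in for the
interactive prompt. The names split_changes and is_group_change are
assumptions of this sketch, and only one group level is modelled.

```python
from typing import Callable, Dict, Optional, Tuple

Changes = Dict[str, str]  # path -> new file content


def split_changes(
    changes: Changes,
    group_branch: Optional[str],
    is_group_change: Callable[[str], bool],
) -> Tuple[Changes, Changes]:
    """Return (group_changes, machine_changes) for one commit run."""
    if group_branch is None:
        # No group branch configured: everything stays machine-specific.
        return {}, dict(changes)
    group: Changes = {}
    machine: Changes = {}
    for path, content in changes.items():
        # is_group_change stands in for asking the user about each change.
        target = group if is_group_change(path) else machine
        target[path] = content
    return group, machine


if __name__ == "__main__":
    pending = {"/etc/resolv.conf": "nameserver 10.0.0.1\n",
               "/etc/hostname": "host1\n"}
    group, machine = split_changes(
        pending, "group/developers", lambda path: path == "/etc/resolv.conf")
    print(sorted(group), sorted(machine))
```

The group part would be committed on the group branch and merged
onward, and the machine part committed on the machine branch, before
the temp tree is pushed to master.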

Goals vs. the proposal

If we make it possible to configure package manager to
run configurator in commit mode automatically before
merging any new packages, single desktop case has no
extra administrative burden compared to the etc-update way.
At the same time, good UI tools for examining changes
at all levels are available, which can make the end user's
life a lot easier.
This satisfies goal 1.

Any git history browsing tool allows navigating the root
tree history.
This satisfies goal 2.

Group branches make management of multiple machines
easy. Goal 4.

Random thoughts.

I am not sure what the use case is for more than one
level of group branches. Or even for more than one
group branch per master tree. Cases I can imagine
(multiple clusters with identical packages but different
/etc/passwd? Sure, but what for?) are not likely to
happen in real life. OTOH, it doesn't add
any complexity, either to configurator logic or
for the end user, so let it be there.

We need means of distributing information about which tree
is the master between machines in a group.

It would be nice to provide a way for a new package to be installed
into / with already merged configuration files. This way
service interruption is minimal, especially when config file formats
or locations change considerably.

It would be nice to have a way to record config file location
changes between package versions. Implementation described above
doesn't let us do it. Probably ebuilds would have to supply
the information to make it possible.

It might make sense to do group->machine merge for all
available machine branches, and then just pull from master
to root on each machine after upgrade, if installed package
state tag is present.

Ilya A. Volynets-Evenbakh
