[Exherbo-dev] Config management ideas
Ilya A. Volynets-Evenbakh
ilya at total-knowledge.com
Mon Nov 23 13:18:44 GMT 2009
I already mentioned my ideas about proper config file
management on IRC. Now I'd like to start a real discussion.
The latest document is at http://www.theilya.com/docs/config-system.txt
Below is the text, for convenience. Please comment.
Managing configuration using etc-update (or any similar tool)
is a nightmare for any complex system. This proposal tries to
come up with a better solution.
This is a very rough draft at this point. I'll appreciate any
comments. Well, almost any. Send them to ilya at theilya.com.
Put [config-system] in the subject.
1. The single desktop scenario should be no more of a burden on
the end user than it is now.
2. It should be easy to see history of configuration file changes
3. It should be easy to roll back
4. It should be easy to manage common configuration between multiple machines
Use an SCM with proper branching and merging capabilities as the back end
for storing configuration. I will use git as an example, since it
has very good support for such things. Subversion, at the time of
this writing, seems to be incapable of doing it well enough. Other
SCMs might work, though. Requirements for an SCM are:
- Atomic commits
- Proper tracking of branch merge states
- Ability to push commits between repositories
- Ability to track sync state of multiple repositories
1. Single user, single desktop
2. Organization with groups of similar desktops
(think beancounter machines, developers' machines, etc.)
3. A few HA or HPC clusters
The primary difference between scenarios 2 and 3 is that machines
in a cluster get upgraded/reconfigured almost simultaneously,
while in case #2 each individual machine can be upgraded at any time.
If the same version of the same package is compiled on two different
machines with identical package manager configuration, all config
files generated by installing that package will be identical.
"Machine group" - a set of machines which share a master config tree.
Machines in a group will have identical package manager configuration.
1. Should we stick with CONFIG_PROTECT concept, or make ebuilds
specify each config file explicitly?
A: No, drop that. Package manager owned files can be defined
as configuration by ebuild authors. The package manager will
provide sane defaults (like /etc) to be config-managed,
but an ebuild can remove paths from them.
Non package manager owned files can be marked for config
management by user. Optionally, user may associate custom
config files with specific atoms, so that they get a
reminder when corresponding packages are updated or removed.
2. What is the best way to track ownership and permissions of
config files?
A: The package manager will provide it to the config system, and
internally we could just keep a file in mkdevs format somewhere.
3. What is the best way to recover from inconsistencies between
config files generated by installs of the same package?
4. How do we handle reinstalls and use flag changes?
5. What other factors can legitimately affect generated config
files?
6. What information should package manager itself be aware of?
(E.g. what paths do we record?)
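On question 2: git itself records only the executable bit, so ownership and full permissions have to live in a side file. Below is a minimal Python sketch of such a file. The line format ("mode uid gid path") and the function names are assumptions made for illustration only; this is not the mkdevs format mentioned above.

```python
import os
import stat


def collect_metadata(root, rel_paths):
    """Return {relative path: (mode, uid, gid)} for files under root."""
    entries = {}
    for rel in rel_paths:
        st = os.lstat(os.path.join(root, rel))
        entries[rel] = (stat.S_IMODE(st.st_mode), st.st_uid, st.st_gid)
    return entries


def format_manifest(entries):
    """Serialize as one 'mode uid gid path' line per file, sorted by path.

    The path comes last so that paths containing spaces still parse.
    """
    return "".join(
        f"{mode:04o} {uid} {gid} {path}\n"
        for path, (mode, uid, gid) in sorted(entries.items())
    )


def parse_manifest(text):
    """Inverse of format_manifest."""
    entries = {}
    for line in text.splitlines():
        mode, uid, gid, path = line.split(" ", 3)
        entries[path] = (int(mode, 8), int(uid), int(gid))
    return entries
```

A manifest like this could be committed next to the config files, so that a change of owner or mode shows up in history like any other change.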
Config repository organization.
master tree - one where package manager keeps all the changes.
It is never directly modified. May be non-local.
secondary tree - local copy of master tree. Optional.
Can be pulled from, but never pushed into. Updated by
pulling from master tree. Can be promoted to become master.
root tree - one checked out into /. End user may modify files there.
temp tree - one where all the work is done. It is clone of the master
tree, created to perform some action, and then destroyed.
There will be the following branches:
- pkgmgr - branch where all the config files are generated by package manager
- "group" - branches which have all the changes, common to some group of machines
- "machine" - branch where the final machine-specific changes are made
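The branch layout implies a fixed direction of flow: pkgmgr into the group branch(es), then into the machine branch. Here is a rough Python sketch of that chain. The branch names "group/dev" and "machine/box1" are made up for the example (the proposal only names pkgmgr), and the functions only build git command lists; nothing is executed.

```python
def merge_chain(groups, machine):
    """Order in which package manager changes flow through branches."""
    return ["pkgmgr", *groups, machine]


def plan_merges(groups, machine):
    """Git commands that merge each branch into the next one in the chain.

    Meant to be run inside the temp tree; conflict handling is left out.
    """
    chain = merge_chain(groups, machine)
    commands = []
    for upstream, downstream in zip(chain, chain[1:]):
        commands.append(["git", "checkout", downstream])
        commands.append(["git", "merge", "--no-edit", upstream])
    return commands


if __name__ == "__main__":
    for cmd in plan_merges(["group/dev"], "machine/box1"):
        print(" ".join(cmd))
```

With no group branches configured the chain degenerates to pkgmgr -> machine, which is the single desktop case.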
Steps during package management ((re)install, upgrade, remove):
0. Package manager checks that root tree has no changes
which weren't pushed to the master tree.
1. Package manager saves list of config files from
previously installed version.
2. Package manager creates a tree containing new
config files.
3. Package manager lets configurator know which packages
have been changed (reinstalled, upgraded, or removed).
4. Configurator client is called by end user.
5. pkgmgr branch is checked out into the temp tree.
6. Configurator iterates over changed packages. A package is
skipped if a tag with the corresponding package version and
options is already present. Otherwise its config changes are
committed to the pkgmgr branch, and the commit is tagged
with version info.
7. pkgmgr branch is merged into group branch(es), and
then into the branch for local machine.
All merges get tagged with installed package state.
8. All changes are pushed back into the master tree.
9. Root tree pulls from the master.
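The step that iterates over changed packages depends on being able to name a package state as a tag. One possible scheme, sketched in Python: the tag layout, the hashing of options, and the function names are all assumptions for illustration, not something the proposal specifies.

```python
import hashlib


def state_tag(atom, version, options):
    """Tag name for one installed package state.

    Options are sorted and hashed so the tag does not depend on their
    order and contains no characters git forbids in ref names.
    Atom and version are used as-is and assumed to be ref-safe.
    """
    digest = hashlib.sha1(" ".join(sorted(options)).encode()).hexdigest()[:12]
    return f"pkgmgr/{atom}/{version}/{digest}"


def packages_to_commit(changed, existing_tags):
    """Keep only packages whose state has no tag on the pkgmgr branch yet."""
    return [pkg for pkg in changed if state_tag(*pkg) not in existing_tags]
```

Here `changed` is a list of (atom, version, options) tuples reported by the package manager, and `existing_tags` would come from something like `git tag --list` in the temp tree.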
Special case for a package downgrade:
If a package downgrade is performed, and configurator
sees a tag in the pkgmgr branch with the newly installed
version and options, the user is offered to roll back the
corresponding config files to the version on the machine
branch right before the update to the next version of the same
package.
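A rough Python sketch of how the rollback point for a downgrade might be located. It assumes the configurator can list, oldest first, which package version was installed at each tagged commit on the machine branch; that `history` list and both function names are hypothetical.

```python
def rollback_point(history, target_version):
    """Return the last machine-branch commit made while target_version
    was installed, or None if that version was never seen.

    history: oldest-first list of (installed_version, commit_id) pairs.
    """
    candidate = None
    for version, commit in history:
        if version == target_version:
            candidate = commit
    return candidate


def plan_rollback(commit, paths):
    """Git command restoring the given config files from that commit."""
    return ["git", "checkout", commit, "--", *paths]
```

For a history of 1.0, 1.0, 1.1 and a downgrade to 1.0 this picks the second commit, that is, the state right before the update to 1.1.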
Steps for making changes to configuration:
1. The user makes modifications to config files
2. The user runs configurator in commit mode
3. Configurator checks out next level of branch
into the temp tree.
4. Configurator checks if user has a next level
group branch configured. If not, skip to step 7.
5. Configurator iterates over changes in the root
tree, prompting user to decide which of them belong
to the group. Each change selected as group is
applied to the temp tree. Once done, temp tree changes
are committed.
6. Configurator merges changes from the current branch
into the next branch.
7. Configurator applies remaining changes to the temp tree.
8. Configurator commits the temp tree, and pushes it
to master tree.
9. Configurator resets root tree, then pulls master tree.
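The core of commit mode is sorting each change into "group" or "machine". A minimal Python sketch of that decision follows. The function and parameter names are made up for the example, and the `belongs_to_group` callback stands in for the interactive prompt.

```python
def split_changes(changed_paths, group_branch, belongs_to_group):
    """Partition changed files into (group changes, machine changes).

    If no group branch is configured, everything is machine-specific,
    which matches skipping straight to applying the remaining changes.
    """
    if group_branch is None:
        return [], list(changed_paths)
    group, machine = [], []
    for path in changed_paths:
        (group if belongs_to_group(path) else machine).append(path)
    return group, machine


if __name__ == "__main__":
    changed = ["etc/resolv.conf", "etc/hostname"]
    # Stand-in for asking the user: only resolv.conf is shared.
    print(split_changes(changed, "group/dev", lambda p: p == "etc/resolv.conf"))
```

Group changes would be committed on the group branch first and then merged down, so that the machine branch only ever adds to what the group provides.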
Goals vs. the proposal
If we make it possible to configure the package manager to
run configurator in commit mode automatically before
merging any new packages, the single desktop case has no
extra administrative burden compared to the etc-update way.
At the same time, good UI tools for examining changes
at all levels are available, which can make the end user's
life a lot easier.
This satisfies goal 1.
Any git history browsing tool lets you navigate the root
tree's history.
This satisfies goal 2.
Group branches make management of multiple machines
easy. Goal 4.
I am not sure what the use case is for more than one
level of group branches. Or even for more than one
group branch per master tree. Cases I can imagine
(multiple clusters with identical packages but different
/etc/passwd? Sure, but what for?) are not likely to
happen in real life. OTOH, it doesn't add
any complexity, neither to configurator logic nor
for the end user, so let it be there.
We need means of distributing information about which tree
is the master between machines in a group.
It would be nice to provide a way for a new package to be installed
into / with already merged configuration files. This way
service interruption is minimal, especially when config file formats
or locations change considerably.
It would be nice to have a way to record config file location
changes between package versions. Implementation described above
doesn't let us do it. Probably ebuilds would have to supply
the information to make it possible.
It might make sense to do group->machine merge for all
available machine branches, and then just pull from master
to root on each machine after upgrade, if installed package
state tag is present.
Ilya A. Volynets-Evenbakh