[Exherbo-dev] Labels - cross and granularity

Johannes Nixdorf mixi at shadowice.org
Tue Feb 9 12:46:58 UTC 2016

On Sun, Feb 07, 2016 at 02:48:43PM -0800, Saleem Abdulrasool wrote:
> On Wed, Feb 3, 2016 at 3:26 PM, Alex Elsayed <eternaleye at gmail.com> wrote:
> > On Wed, 03 Feb 2016 23:52:08 +0100, Bo Ørsted Andresen wrote:
> >
> > <snippity>
> >
> > > 2) Change all our labels from defining when a dependency is used to
> > > defining how it is used. E.g. read:, import:, link:, exec:, etc.
> > >
> > > If this needs more fleshing out, I'll leave that to eternaleye..
> > >
> > > This might provide a lot of additional granularity but may also make the
> > > common case too complicated?
> >
> > It's less changing them _from_ when a dependency is used, and more
> > _adding_ how it is used as a separate thing.
> >
> > There'd basically be two axes for dependencies: When it needs to be
> > present, and how it is used.
> >
> > On the "how" axis, I've wound up with:
> > - read: the dep contains data files which will be read by the package
> > - import: the dep contains code that will become part of the package
> > - link: the dep contains code that the package will refer to
> > - exec: the dep contains executables that the package will run
> >
> In terms of bike shedding: these mostly are terrible names. They are
> ambiguous and confusing for those of us who are not primarily packagers. I
> think that these are better alternative names:
> read -> data (or share)
> import -> static-link (or include)
> link -> dynamic-link (or lib)
> exec -> invoke (or libexec)
> I'm not particularly fond of the alternative suggestions, but it does make
> it a bit more obvious without looking up what they are referring to.
> I've given two suggestions per label, the parenthetical having a scheme that
> is more readily accessible to most packagers. The suggestion here
> indicates both the behavioral aspect of the dependency as well as the file
> system origin. The data that you are reading is usually from (/usr/)share,
> the statically included content is from (/usr/)include, the dynamically
> included content from (/usr/)lib, and the tools invoked usually live in
> (/usr/)libexec. The names here mostly map to common configuration
> options (libdir, includedir, libexecdir).

In my opinion it should be 's/libexec/bin/g'.

But let's not discuss names before we have agreed on the concept.

> That said, I'm not sure I understand the need for the increased verbosity of
> the requirements of a dependency in contrast to the current label set.  You
> are making the common use case more complex for no gain.  The above set
> really boils down to two sets: build time dependencies
> (static-link/include, what you called "import") or run time dependencies
> (everything else).  Now, there are cases where the dependency is ambiguous
> (e.g. provides pkg-config data which is used at build time, but the
> remainder of the dependency is runtime), but the status quo is maintained
> there.

I think it mainly comes from ciaranm wanting some sort of "descriptive"
(for the use case, not the resolver logic as far as I understood him)
name for the proposed new dependency labels.

> > On the "when" axis, I feel the sensible values are:
> > - fetch: the dep is needed to download the package
> > - build: the dep is needed to go from source to binary
> > - test: the dep is needed when tests are run
> > - install: the dep is needed to run preinst/postinst/etc hooks
> > - use: the dep is needed when the package is used by the end user (or
> > other packages)
> >
> > Note "use" rather than "run" - "use" time is when a package can _satisfy_
> > dependencies, including read, build, and exec; only the last is apropos
> > to "run".
> >
> > The behavior of these regarding cross is as follows:
> >
> > read: required on host, regardless of timing
> > import: required on target, regardless of timing
> > link: required on target, regardless of timing
> > exec: required on host at fetch and build; required on target at test,
> > install, and use. At fetch and build time, dependencies of exec deps have
> > their target == host.
> >
> I'm afraid I do not understand this.  host and target are being mixed here.
> The three terms are:
> build: the target of the machine performing the build
> host: the target of the machine where the generated binary will be run
> target: the target of the machine to be targeted when compiling.  This is
> only relevant for toolchain components.

I think he's using the terms the following way:
    - "his" host is "your" build
    - "his" target is "your" host
    - he doesn't mention "your" target

(This terminology probably makes sense if one doesn't think about "your"
target at all)
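
As a rough sketch of that mapping (the dictionary and function names are
made up for illustration and aren't part of anyone's proposal):

```python
# Hypothetical helper: translate eternaleye's machine names into the
# autotools-style names (build / host / target) listed in the quote above.
ETERNALEYE_TO_AUTOTOOLS = {
    "host": "build",   # "his" host is "your" build
    "target": "host",  # "his" target is "your" host
}


def to_autotools(term):
    """Return the autotools name for one of eternaleye's machine names."""
    try:
        return ETERNALEYE_TO_AUTOTOOLS[term]
    except KeyError:
        # Only "host" and "target" appear in eternaleye's rules, so any
        # other name has no known equivalent here.
        raise ValueError("no autotools equivalent known for %r" % term)


# eternaleye: "link: required on target, regardless of timing"
print("link is required on the autotools", to_autotools("target"))
```

Note that nothing maps onto the autotools "target", which matches the
observation that he doesn't mention it.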

So let's come back to the question of which labels are actually needed.
First I'll redefine the "when"s and "how"s together, with more explicit
reasoning for why the semantics are as proposed than eternaleye gave.

I'll use your (autotools') terminology here, but will probably write
"${name_according_to_your_terminology} host", creating the nice
term "host host", because I don't know what else I should call them. "The
build" as opposed to "the build host" doesn't work too well.

For the label names I'll use eternaleye's names in the summary as he only
proposed one name per idea, which makes it easier to refer to them.

(Also definitions like
> install, and use. At fetch and build time, dependencies of exec deps
> have their target == host.
didn't make much sense to me, so I'll need to provide the definition
I've been working with for those)

The next part will get quite boring and only serves to reduce
eternaleye's matrix of proposed labels back to a list of labels that
make sense in the cases I can think of (and this list only adds one new
dependency label). The list of the proposed labels starts after a line
that's just "<<<", so you can just skip to it.

the easy "when":

- test:
    It only ever makes sense to require the dependencies to be satisfied
    by the build host. If host != build the tests aren't run. The
    dependencies also need to be handled as our current test label is
    (disabled if tests are disabled). This makes test an easy label to
    handle, and I propose to make it only one item without supporting the
    whole matrix of possible labels. I'll disregard this "when" item in
    the following.

- install:
    As we can't run binaries from the host host, together with exec this
    definitely means we require the dependency to be satisfied from the
    build host. I'd argue that together with read it should be provided by
    the build host too. It'll be data files used by (read by) an
    executable that probably already needs to be native to the build
    host, so it'll probably already be required for the build host by an
    exec+install dependency. Here I also propose to make it one label
    without the possibility to combine them with different "when"s, as
    the semantics won't change anyway.


- read/data/share:
    Use case:
        Shared data (can by design be provided by the build host or the
        host host)
        When used together with a build-time "when" (fetch/build) it
        should be required from the build host. When used together with
        "use" I'd propose to require it from the host host, as that way
        I can only copy the shared data files used by packages on the
        host host to my real cross target machine and expect it to work.

        On "use" I diverge from eternaleye's proposed semantics here.

- import/static-link/include:
    Use case:
        Using data that's specific to the host host at build time. This
        can be including include files, linking static libraries or
        anything similar (does any package use a pkg-config variable
        from the host host without linking any library?).
        This only makes sense at build time, as its use case suggests.
        One could argue that it could be required for tests as they may
        compile something too (or do we not like packages to compile in
        src_test?), but I'd argue we don't include the test "when" here
        and require a separate test dependency for that.

- link/dynamic-link/lib:
    Use case:
        Using data that's specific to the host host at build or
        runtime, like linking a shared library (build and runtime) or
        loading a shared library dynamically (dlopen, only runtime)
        This always needs to be provided from the host host as
        eternaleye suggested. Also this only makes sense at build or use
        time. Also according to the use case this definitely needs to be
        used with "use" if it's used at all.

- exec/invoke/libexec/(bin):
    Use case:
        Executing binaries at build or runtime.
        At build time it may only be from the build host. At runtime of
        course this should be provided by the host host.

To summarize what each label is supposed to mean:

        | read  | import | link  | exec
fetch   | build | -      | -     | build
build   | build | +host  | +host | build
use     | host  | -      | host  | host

The first thing one notices there is that the concepts aren't actually
orthogonal. import and link have a lot of cases where their "when" axis
is predetermined.
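
A quick way to see this is to write the matrix down as data and ask, per
"how", which "when"s have an entry at all. This is only a sketch: the
structure and function names are invented here, and the cell values
(including "+host") are copied verbatim from the table above.

```python
# Summary matrix from above: rows are "when", columns are "how".
# "-" marks a combination that isn't used.
HOWS = ("read", "import", "link", "exec")
MATRIX = {
    "fetch": ("build", "-", "-", "build"),
    "build": ("build", "+host", "+host", "build"),
    "use": ("host", "-", "host", "host"),
}


def column(how):
    """Return the cells of one "how" column, in fetch/build/use order."""
    col = HOWS.index(how)
    return [row[col] for row in MATRIX.values()]


def whens_for(how):
    """List the "when" values for which the given "how" has an entry."""
    col = HOWS.index(how)
    return [when for when, row in MATRIX.items() if row[col] != "-"]


for how in HOWS:
    print(how, "->", whens_for(how))

# read and exec have identical columns in this matrix.
print("read == exec:", column("read") == column("exec"))
```

import ends up with a single possible "when" (build), and link with two
(build and use), so the "when" axis adds little for either of them.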

First I'd like to propose here that we put read and exec under one
target label:

        | read  | import | link
fetch   | build | -      | -
build   | build | +host  | +host
use     | host  | -      | host

Also it's easy to see that import doesn't need a "when" axis, as it can
only be used in one case, so we'll take import (let's call it
build-native later on) out of the matrix for now:

        | read  | link
fetch   | build | -
build   | build | +host
use     | host  | host

At this point our only case for the build dependencies is read/exec at
fetch or build time. I propose we just keep our old build/fetch label
for that case. This leaves us with:

        | read  | link
fetch   | -     | -
build   | -     | +host
use     | host  | host

If we now say we keep the old run label for use+read/exec and use+link,
we'll have the following:

        | read  | link
fetch   | -     | -
build   | -     | +host
use     | -     | -

The semantics of the last one are already contained in build-native.
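
The reduction can also be replayed mechanically. The following is just an
illustrative sketch (the `drop` helper and the "read/exec" column name are
made up): start from the matrix with read and exec merged and import taken
out, then blank every cell an existing label already covers.

```python
# Matrix after merging read and exec and taking import out.
# Cell values are copied from the tables above; "-" means "not used".
matrix = {
    "fetch": {"read/exec": "build", "link": "-"},
    "build": {"read/exec": "build", "link": "+host"},
    "use": {"read/exec": "host", "link": "host"},
}


def drop(matrix, covered):
    """Blank out every (when, how) cell that an existing label covers."""
    return {
        when: {
            how: "-" if (when, how) in covered else cell
            for how, cell in row.items()
        }
        for when, row in matrix.items()
    }


# The old build/fetch label covers read/exec at fetch and build time.
step1 = drop(matrix, {("fetch", "read/exec"), ("build", "read/exec")})
# The old run label covers use+read/exec and use+link.
step2 = drop(step1, {("use", "read/exec"), ("use", "link")})

remaining = [
    (when, how, cell)
    for when, row in step2.items()
    for how, cell in row.items()
    if cell != "-"
]
print(remaining)
```

The only cell left is link at build time, which is the case build-native
is meant to handle.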

- test (+test-expensive)
- install
- build-native (was import)
- build (+fetch)
- run (+post)

Which, as shown with the matrices, covers all the proposed use cases
where using the "when" and "how" axes together made sense. This also
gives us a minimal definition of those labels.

<<<


test: (+test-expensive)
    Use case:
        Binaries or data provided by a package are required for
        tests/expensive tests. (same as before)
        Required if tests are enabled. This means only if build == host.
        So in the cross case we only ever need to require them from the
        build host.

install:
    Use case:
        Binaries or data provided by a package are required in the pkg_*
        functions. (same as before)
        Required from the build host at install time.

build: (+fetch)
    Use case:
        Host-agnostic (or "host-symlink host"-specific) data/executables
        provided by a package are used at build/fetch time (executing
        host tools, etc.) (same as before)
        Required from the build host at build time only.

build-native:
    Use case:
        Host-specific data provided by a package is used at build time
        (include files, static libraries, etc.) only.
        Required from the host host at build time only.

run: (+post)
    Use case:
        Host-specific data/executables provided by a package are used at
        runtime. (same as before)
        Required from the host host at runtime.

For shared libraries we would now need to use build-native+run instead
of build+run. The names are of course subject to bikeshedding.
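
To make the label list a bit more concrete, here is a minimal sketch of it
as a lookup table, using the autotools names for the machines. The table
layout and the `machines_for` helper are assumptions for illustration only,
not how any resolver actually represents labels.

```python
# Proposed labels -> machine that has to satisfy the dependency.
# "build" = machine performing the build, "host" = machine the result runs
# on (the "host host" above). Sub-labels (fetch, post, test-expensive) are
# left out to keep the sketch short.
REQUIRED_FROM = {
    "test": "build",
    "install": "build",
    "build": "build",
    "build-native": "host",
    "run": "host",
}


def machines_for(labels, tests_enabled=True):
    """Return the set of machines a dependency must be installed for."""
    machines = set()
    for label in labels:
        if label == "test" and not tests_enabled:
            continue  # test dependencies are dropped when tests are disabled
        machines.add(REQUIRED_FROM[label])
    return machines


# A shared library: linked against at build time, loaded at runtime.
print(machines_for(["build-native", "run"]))
# A tool that is only executed during the build.
print(machines_for(["build"]))
```

With build+run instead of build-native+run, the shared library would be
pulled in for both machines, which is why the label needs to change.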

This proposal doesn't discuss suggestion/recommendation semantics for
cross, which is something we probably need to consider too (if we break
all dependency labels we might as well do it only once, but this
proposal as it stands is backwards-compatible for now).

Now to http://exherbo.org/docs/multiarch-TODO.html#labels:

This proposal only finds a solution for build-time at target dependencies
(and mainly shows that the full matrix of "how" and "when" labels isn't
needed).

run-time at native dependencies are ruled out early on, when making the
first matrix, for a reason similar to the one mentioned on the page:
they'd break the switch-the-host-symlink use case in the same way they'd
break the only-copy-the-data-files-used-by-the-cross-host use case.
Fixing this needs more complicated semantics than considered here.
Regardless I'd like to propose run-shared as the name for the label (as
it refers to shared data or host-symlink executables).

build-cross dependencies aren't considered either, as they need more
complex semantics than were considered here.
