[Exherbo-dev] Tags vs. categories

Jonathan Dehan jdehan at gmail.com
Thu Jul 10 06:37:19 BST 2008


---------- Forwarded message ----------
From: Jonathan Dehan <jdehan at gmail.com>
Date: Thu, Jul 10, 2008 at 1:36 AM
Subject: Re: [Exherbo-dev] Tags vs. categories
To: Bernd Steinhauser <exherbo at bernd-steinhauser.de>


I am going to word the problems a bit differently.
The first is how to uniquely identify a package. This covers the on-disk
layout, which should be how the package manager and maintainer see the package.
The second is how to browse or search for a particular package, with or
without knowing its name.

I propose having the on-disk format as follows:
$packagedir/packages/$repository/$provider/$package-$version-$revision.$format
$provider can be any of {organization,main_developer,homepage}. It does not
need to be chosen consistently, since it is only used (along with
$repository) to uniquely identify a package and all its versions when the
$package name alone is ambiguous.
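
A minimal sketch of how such a path could be assembled, in Python. The
function name and every example value (package directory, repository,
provider, version, format) are made up for illustration and are not part of
paludis or of any existing layout:

```python
from pathlib import PurePosixPath


def package_path(packagedir, repository, provider, package, version, revision, fmt):
    """Build the proposed on-disk path for one package version.

    Mirrors the layout
    $packagedir/packages/$repository/$provider/$package-$version-$revision.$format
    """
    filename = f"{package}-{version}-{revision}.{fmt}"
    return PurePosixPath(packagedir) / "packages" / repository / provider / filename


if __name__ == "__main__":
    # Two hypothetical packages sharing a name are kept apart by $provider.
    print(package_path("/pkg", "arbor", "gnu", "foo", "1.0", "r1", "exheres-0"))
    print(package_path("/pkg", "arbor", "example-org", "foo", "1.0", "r1", "exheres-0"))
```

The point is only that $repository and $provider together separate two
packages that share a name.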

To help search for a particular package, since replacing categories with
providers almost actively discourages browsing, there is a short term
workaround and a medium term possible solution.

For searching, tag support could be artificially added by appending
'\ntags: blah blah' to the description field. Thus 'inquisitio -s any number of
fake tags here' will return good enough results. In addition, a script that
automatically adds the current category as a tag can help. Currently, manually
browsing the categories is only useful if you forgot a package name but know
it's in the tree, know what it does and which category it is in, and would
recognize the name on seeing it. Google is also your friend in these cases.
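
As a rough illustration of the workaround, here is a Python sketch that
appends a tags line to a description and does a naive substring search over
it. The helper names and the sample description are invented, and the search
function is only a stand-in for a plain-text description search; it does not
claim to show how inquisitio matches internally:

```python
def add_tags(description, tags):
    """Append a fake 'tags:' line to a package description."""
    return description + "\ntags: " + " ".join(tags)


def category_as_tag(qualified_name):
    """Return the category part of 'category/package' for use as a tag."""
    category, _, _ = qualified_name.partition("/")
    return category


def matches(description, terms):
    """True if every search term occurs somewhere in the description."""
    text = description.lower()
    return all(term.lower() in text for term in terms)


if __name__ == "__main__":
    tags = ["compiler", "cpp", category_as_tag("sys-devel/gcc")]
    desc = add_tags("The GNU Compiler Collection", tags)
    print(desc)
    print(matches(desc, ["compiler", "cpp"]))
```

Because the tags live in the description text, any existing description
search picks them up without further changes.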

For more in-depth browsing, and not just tag searching, the tags can be
exported as symlinks in
$packagedir/tags/{$all-tags}/{$all-tags-plus-the-tag-selected-above}/{etc,etc}/name-provider-repository
  -> $packagedir/packages/$repository/$provider/$name-$best_available_version_and_revision.$format

Exporting all permutations of tags as symlinks on the filesystem makes it
very flexible to browse for packages, at least until a proper interactive
client comes around.
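
A sketch of the symlink export in Python, under some assumptions: the
function names, the example tags and the target path are all hypothetical,
and "all permutations" is read here as every ordered selection of one or
more of the package's tags:

```python
import os
from itertools import permutations


def tag_links(packagedir, tags, name, provider, repository, target):
    """List (link_path, target) pairs for every ordering of the tags.

    Each ordered selection of 1..n tags becomes a directory chain under
    $packagedir/tags, ending in a 'name-provider-repository' symlink.
    """
    leaf = f"{name}-{provider}-{repository}"
    links = []
    for depth in range(1, len(tags) + 1):
        for combo in permutations(sorted(tags), depth):
            link = os.path.join(packagedir, "tags", *combo, leaf)
            links.append((link, target))
    return links


def export_links(links):
    """Create the symlinks on disk, replacing any stale ones."""
    for link, target in links:
        os.makedirs(os.path.dirname(link), exist_ok=True)
        if os.path.lexists(link):
            os.remove(link)
        os.symlink(target, link)
```

Note that the number of links grows quickly: a package with n tags gets
n!/(n-k)! links at each depth k, so 3 tags already yield 3 + 6 + 6 = 15
links. That may be fine as a stopgap but is worth keeping in mind for
heavily tagged packages.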

- Jonathan Dehan


On Wed, Jul 9, 2008 at 11:09 PM, Bernd Steinhauser <exherbo at bernd-steinhauser.de> wrote:

> Michael Croes wrote:
>
>>
>>
>>
>>            So what would a user do to install a package, let's say gcc?
>>            First, he would search for the package, which might look
>>            like this:
>>            inquisitio --search --tags compiler cpp
>>
>>
>>        First off, there are 2 possibilities:
>>        1. The user knows the package name and types 'paludis -i
>>        package-name'
>>        2. The user doesn't know the name, uses something else
>>        (presumably inquisitio) to find it out, and then types
>>        'paludis -i package-name'
>>
>>    Yes, and finding stuff isn't always that easy. Tags help to find
>>    things more easily, and they help to find alternatives (you can
>>    search for packages that are similar).
>>
>>  What I tried to point out is that either way you still tell paludis to
>> install the package using the unique package identifier, not the tags.
>>
> Yes, definitely. But it still gives you an advantage for
> non-interactive clients (which atm we don't have).
> Maybe there could also be a way to tell paludis to install result
> 10 of the last search. (Which would imply --pretend.)
> But maybe that isn't a good idea.
>
>
>>            Then he would install the package using paludis.
>>            Maybe an installation mode might be possible using tags, if
>>            the tags are enough to narrow it down, but maybe that might
>>            not be a good idea.
>>            The good thing about this is, that 1. and 2. can change, but
>>            what the user uses (and he should mainly use the tags) stays
>>            the same, so a change wouldn't cause as much confusion as it
>>            would if all three change.
>>
>>
>>        The thing where you go wrong is that you assume that tags are
>>        part of the unique package name. Of course you don't want tags
>>        as part of the unique package name, you don't want a category
>>        as part of the unique package name either. The way this turns
>>        out right now in gentoo with paludis is that if a package
>>        exists in multiple categories you need to provide paludis with
>>        extra information to establish the unique identifier for the
>>        package.
>>
>>    No, not at all. The tags don't have anything to do with the way the
>>    package is identified. They are just additional information that
>>    the user can access.
>>
>>
>> Did you misunderstand/misread what I wrote? You say tags don't have
>> anything to do with the way a package is identified, which is exactly what I
>> was saying. It would be nice if you could clarify...
>>
> Yeah, I guess I misread your mail, sorry.
>
>
>>
>>
>>            If we also (auto-)create some special tags, like system,
>>            world or installed, it would also be possible to for example
>>            search for all installed cpp compilers using --tags
>>            installed compilers cpp.
>>            (Currently one would use --kind for that.)
>>
>>
>>        Let's do versions with tags too and create the Totally Tagged
>>        Package Manager. All you need to do is figure out tags, you
>>        don't need any other metadata than tags, that would be
>>        useless...
>>
>>    Stop the useless nagging, thank you.
>>
>>
>> I hope that my nagging did point out to you that if you're gonna add tags
>> because a package is installed, you might as well add tags for other stuff
>> that's not supposed to be in package metadata.
>>
> Hm, not sure if I understood you here.
> inquisitio would add the tag, if the package is installed, on the fly.
> Of course it shouldn't modify anything on the disk for that.
> And of course, we shouldn't do things without thinking about it, and I
> guess that we would realize, when it goes horribly wrong.
> (So yes, tags for everything wouldn't work.)
>
>
>>
>>
>>            But here we see the problem with tags.
>>            Changing tags should not have any effect on 1. or 2., so
>>            tags are not a complete replacement for the
>>            categories we currently use.
>>            They should be used for the user interface and only there,
>>            not for the structure our repos have.
>>
>>
>>        Because tags should not be part of the unique identifier for a
>>        package, this issue doesn't exist.
>>
>>        The real issue when dropping categories is how to distinguish
>>        between different packages with the same name, I think it has
>>        already been mentioned on the mailing list before. Your email
>>        shows that having tags fixes the issues you see with categories
>>        and shows that if you use tags as if they were categories,
>>        stuff would go wrong. That's why they're not called categories,
>>        they're different.
>>
>>    See, you didn't really read what I wrote.
>>    What you are trying to solve belongs to point 1. in my list.
>>    I was *only* talking about what the user uses.
>>
>>
>> Actually the notion of tags solves point 3. Tags and categories are
>> somewhat similar, but with categories as implemented in gentoo you limit
>> packages to one category. If you extend this so packages can be in more than
>> one category then you might as well call it tags so you can mix more
>> metadata in. In the end you can still have a multi-level structure where you
>> can find your packages, so that should not be an issue for the end user.
>>
> Well, in the end, it is more what you are calling it, isn't it? ;)
>
>
>>
>>
>>    You could even use a system for the actual storage, inspired by tags
>>    or similar. What I proposed gives you the ability to basically
>>    select just about any solution, because you don't have to worry
>>    about the user interface as much as you had to before.
>>
>>
>> Which you pointed out in your previous mail to suck because if the tags
>> change packages would have to be moved around on the 'system for actual
>> storage'.
>>
> Well, I was referring to something "tag-like" whatever that might be.
> I'm not really in favor of such a solution, I just wanted to say, that
> what I proposed allows you to freely choose the solution for 1. and 2.
> There might be other solutions for 3. that achieve that, too, of course. :)
>
>> I don't think point 2 is a real issue. It's more a personal preference how
>> you should lay out packages.
>>
>> So to answer your three points of concern:
>> 1. Not changed by using tags, only changed by removing categories. Not
>> solved by categories either because 2 packages can have the same name and be
>> in the same category. Solved by having a unique identifier for a package
>> (which you need anyway), then the structure on disk could perhaps be
>> /path/to/repository/[unique-identifier]. There's other ways too, but they
>> probably all end up using the unique identifier as part of the path.
>>
> It's not that easy, because you have to think about performance, too.
> But basically yes, that's the main problem.
>
>> 2. Let's say I work on package foo. I don't really care if the dir with
>> files is /path/to/repository/app-useless/foo or if the dir with files is
>> /path/to/repository/foo, I still end up with the same dir with files. I
>> think this has nothing to do with tags again, only with removing categories.
>> Then again, even removing categories doesn't really change anything, because
>> in gentoo the category is just part of the unique identifier for a package.
>> The only real difference is that if I work on half the packages in
>> app-useless, they now might be all over the place, because I also work on
>> (app-useless/)zoo and (app-useless/)aoo. So the list which also includes the
>> packages I work on has now grown from just the 50% of app-useless to all
>> available packages in a repository... BUT: if you're sane you're probably
>> not modifying stuff inside the repository, but rather somewhere outside of
>> the repository where there's still only those packages I work on, and
>> nothing else...
>>
> Working on stuff outside of the repository isn't really a nice thing,
> because you lose the functionality that git provides.
> What I mainly wanted to say with bringing up a difference between 1. and
> 2. is, that basically the layout of 1. could be very very ugly (looking
> at it as a human being). (But maybe doesn't have to be ugly.)
> But only, if you, at the same time, provide something nice in the repo,
> that the devs can work on.
>
>> 3. If we start by using the category as a tag for every package, then by
>> searching for the tag www-client I would get exactly the same as what's in
>> the category www-client. Now I think that with tags you can certainly
>> improve a lot from here, but in the worst case it's just as bad or good as
>> categories, so this seems like an absolute non-issue to me...
>>
> Absolutely.
>
>
> Regards,
> Bernd
>
>
> _______________________________________________
> Exherbo-dev mailing list
> Exherbo-dev at lists.exherbo.org
> http://lists.exherbo.org/mailman/listinfo/exherbo-dev
>