The experiment.

A tagging prototype, a room full of librarians, and the idea that became self-growing tag trees.

V Peter · Founder, VDAM

Last week I wrote about the discovery interviews I conducted, where one major pain point was inconsistent tagging. The next phase was to present a prototype of a feature that addresses it.

In my prototype, the user was guided to tag an image, and when entering a tag such as "tele-mark," a standardized value such as "telemark" was suggested. The idea was to use fuzzy string matching and pattern matching to find closely related terms, discourage new tags, and in the process, enforce tag consistency. Overall, the user interface was received positively, even enthusiastically. It became clear to me that these DAM admins, librarians, and Creative Ops Leads wanted an easy, natural way to enforce consistent tagging.
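To make the suggestion step concrete, here is a minimal sketch of the idea, not the prototype's actual code. It assumes a small hypothetical vocabulary, strips punctuation and case before comparing, and uses Python's standard-library difflib for the fuzzy match. The function names and the 0.8 cutoff are illustrative choices.

```python
import difflib
import re

# Hypothetical controlled vocabulary; the real tag list is not given in this post.
VOCABULARY = ["telemark", "alpine", "snowboard", "cross-country"]


def normalize(tag: str) -> str:
    """Lowercase the tag and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", tag.lower())


def suggest_tag(raw: str, vocabulary=VOCABULARY, cutoff: float = 0.8):
    """Return the closest existing tag, or None if nothing is similar enough."""
    # Map each normalized form back to the tag as it is stored.
    by_key = {normalize(tag): tag for tag in vocabulary}
    matches = difflib.get_close_matches(
        normalize(raw), list(by_key), n=1, cutoff=cutoff
    )
    return by_key[matches[0]] if matches else None


if __name__ == "__main__":
    for typed in ["tele-mark", "Telemrk", "shacket"]:
        print(typed, "->", suggest_tag(typed))
```

A result of None is the interesting case: it is the point where the tagger either gives up or asks for a new tag, which is what the rest of this post is about.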

Controlled vocabulary.

A phrase that DAM librarians love to hear is "controlled vocabulary." This implies a few things. It implies that there is a set of specific values for a field that can serve as the "vocabulary," and that this set is under some sort of control. A free-form text box is not controlled. A dropdown list is controlled, but it may be a limited vocabulary. A tag taxonomy can be a highly structured set of values, comprehensive enough to fulfill the requirement of a field's controlled vocabulary. Maintaining these taxonomy structures and values can be very tedious.

So there are quite a few things happening here. Taxonomies need to be established, and comprehensive enough that when new assets come in, asset taggers can find the values they need and apply them to the asset. Sometimes the tagger just doesn't find the tag, and either leaves it off or mistags the asset. It is also often the case that the tagger feels a new tag is needed. This can go a couple of different ways.

One: the tagger has the authority to add new tags. If I said that out loud in front of an audience, I'd get at least a few "boos" from the room. Two: the tagger can file a ticket to recommend adding a new tag to the taxonomy. This presents problems, because it reduces tagging efficiency and you need a way to put those assets "on hold" until a decision has been made. These tickets eventually pile up, and the committee needs to meet to review the new tag suggestions, research the taxonomy to see if there's an equivalent somewhere else, and determine if a suggestion is truly novel enough to warrant its own tag. For example, a few years ago a jacket that looked like a plaid shirt got popular and the term "shacket" needed to be deliberated on. In that case, if the tag is worth it, they create the new tag, make sure the affected assets are updated, and let the team know about it.

To even have such a process implies that the business finds value in these tags. Typically this has come about because there were painful experiences before, where people weren't able to find what they needed, and the organization learned the importance of proper tagging.

So, what are we to do?

The modern way.

These days, using AI to auto-tag with general keyword tags is becoming more common. Some of the leading systems are starting to employ custom-field prompt tagging. That's a lot of fancy words, so let me give you an example. Let's say you're working on images in fashion photography, and it's useful to know how many people are in the shot, what accessories the model talent is wearing, what hairstyle and makeup style the model was featured in for that shot, and so on. These can be custom metadata fields that are more specific than general tagging like color palettes.

The trad-tagging way of doing this is to have a human look at the image and tag it or, if you're next-level, tag it at the source, as far upstream as possible: the photoshoot. The modern way is to define a custom prompt so that a VLM (vision language model) can take your prompt, combine it with the image, and return an answer that goes into the metadata field. This AI tagger should also be aware of the list options or taxonomy tag values that are allowed in that field.
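Here is a rough sketch of what "aware of the allowed values" can look like in practice. It is an illustration under assumptions, not VDAM's implementation: the model call is passed in as a plain function (the hypothetical ask_model), so no particular vendor API is implied, and the prompt wording is made up for the example.

```python
from typing import Callable, Optional


def build_prompt(question: str, allowed: list[str]) -> str:
    """Combine the custom field question with the values the field accepts."""
    options = ", ".join(allowed)
    return (
        f"{question}\n"
        f"Answer with exactly one of: {options}.\n"
        "If none fit, answer NONE."
    )


def tag_field(
    image_bytes: bytes,
    question: str,
    allowed: list[str],
    ask_model: Callable[[str, bytes], str],  # hypothetical: (prompt, image) -> reply text
) -> Optional[str]:
    """Ask the model, then accept the reply only if it is an allowed value."""
    reply = ask_model(build_prompt(question, allowed), image_bytes)
    lookup = {value.lower(): value for value in allowed}
    # Anything outside the vocabulary (including NONE) is rejected here.
    return lookup.get(reply.strip().lower())


if __name__ == "__main__":
    def fake_model(prompt: str, image: bytes) -> str:
        return " updo "  # stand-in for a real vision model reply

    hairstyles = ["Updo", "Braids", "Loose"]
    print(tag_field(b"", "What hairstyle is the model wearing?", hairstyles, fake_model))
```

The check after the model call matters as much as the prompt. Telling the model the options is a request; validating the reply is what actually keeps the field controlled.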

Problem solved, right? No.

Self-growing tag trees.

What about new tags? Should we have AI automatically file tickets for new tags it wants to insert into the taxonomies, and wait for humans to meet and deliberate on the merits of each proposed tag? My take is to let the system have autonomy in this area, since it is a tough problem to tackle. Let the AI do its thinking and research and add the new tag itself. This is AI-assisted, self-tagging, self-maintaining, controlled vocabulary. (Somewhere out there, a DAM librarian just fainted.)
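As a toy illustration only (the actual VDAM mechanism is not described in this post), the sketch below shows one way a tree could accept a new tag on its own: look for a near-equivalent anywhere in the taxonomy first, add the tag only if none exists, and keep a log so humans can audit the additions later. The class, function names, and 0.85 cutoff are all hypothetical.

```python
import difflib
from dataclasses import dataclass, field


@dataclass
class TagNode:
    name: str
    children: list["TagNode"] = field(default_factory=list)

    def walk(self):
        """Yield this node and every descendant, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


def find_equivalent(root: TagNode, candidate: str, cutoff: float = 0.85):
    """Return an existing node whose name is a near match, else None."""
    names = {node.name.lower(): node for node in root.walk()}
    hits = difflib.get_close_matches(candidate.lower(), list(names), n=1, cutoff=cutoff)
    return names[hits[0]] if hits else None


def grow(root: TagNode, parent_name: str, candidate: str, audit_log: list[str]) -> TagNode:
    """Reuse an equivalent tag if one exists; otherwise add the new tag under parent_name."""
    existing = find_equivalent(root, candidate)
    if existing is not None:
        return existing
    parent = next(
        (node for node in root.walk() if node.name.lower() == parent_name.lower()),
        None,
    )
    if parent is None:
        raise KeyError(f"no such parent tag: {parent_name}")
    node = TagNode(candidate)
    parent.children.append(node)
    audit_log.append(f"added '{candidate}' under '{parent.name}'")
    return node


if __name__ == "__main__":
    tree = TagNode("apparel", [
        TagNode("outerwear", [TagNode("jacket")]),
        TagNode("tops", [TagNode("shirt")]),
    ])
    log: list[str] = []
    grow(tree, "outerwear", "shacket", log)   # novel enough, so it is added
    grow(tree, "tops", "Shackets", log)       # near-duplicate, so it is reused
    print(log)
```

String similarity is a crude stand-in for the "thinking and research" step. A real system would need a much better notion of equivalence than spelling, but the shape is the same: search first, add second, leave a trail.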

I've patented this, because it seemed novel enough and useful enough to stake a claim on, and no one else has it except VDAM. I sometimes refer to it as self-growing tag trees. Trees are meant to be living and to grow, right?

Have you tried out VDAM's custom-field, custom-prompt tagging? Have you let the system auto-grow tag trees for those taxonomies? If not, you should 'go out on a limb' and try it. ;)

Field notes from VDAM — written by a human.

Filed under: controlled vocabulary, tag taxonomy, auto-growing tag trees, custom prompt tagging

Let your taxonomy grow itself.