So why do everything you can to keep metadata intact? Because it’s from this information that new products can be automatically created, at a scale and rapidity that would be impossible otherwise. With every piece of metadata that you don’t throw away, you gain a factor more potential ways of slicing through your content and delivering it as a separate product, simply as a result of a database lookup.
Let me give a worked example, for starters. AnnArbor.com uses Movable Type as its publishing platform, and as part of that we have tags for each story. A staff of journalists tagging their stories creates a lot of tags (because they are writing about a lot of things) and a few tags with a lot of stories (because they are writing a lot of stories about those things).
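To make that shape concrete, here is a minimal sketch in Python. The story list is invented for illustration and has nothing to do with how Movable Type actually stores tags; it only shows how counting tags surfaces the few heavily used ones alongside the many used once.

```python
from collections import Counter

# Hypothetical sample data: each story is a (title, tags) pair.
stories = [
    ("Council approves budget", ["city council", "budget"]),
    ("Council debates parking", ["city council", "parking"]),
    ("Council hears zoning appeal", ["city council", "zoning"]),
    ("Farmers market opens", ["farmers market"]),
]

def tag_counts(stories):
    """Count how many stories carry each tag."""
    return Counter(tag for _title, tags in stories for tag in tags)

counts = tag_counts(stories)

# Tags that appear on exactly one story: the long tail.
singletons = sorted(tag for tag, n in counts.items() if n == 1)

print(counts.most_common(1))
print(singletons)
```

Even in a toy set like this, one tag carries most of the stories and the rest show up once each.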
Keeping a collaboratively maintained set of tags sane is work (in the same way that moderating comments is work; but that's another post). When you are writing on a deadline, you don't have the luxury of working out every last possible reuse of your work, and so you don't tag aggressively. When you are copy editing, you might well have a reason to use or prefer a specific tag, in part because it lets you direct readers to previous coverage on the topic in a way that's much simpler than deep linking to each page.
What helps me keep at least part of the metadata I care about in the AnnArbor.com world sane is to link to the stories I want to refer to from an outside source - in this case Arborwiki - and, as I update that site, to use it as a sanity spot-check for coverage. Some wiki templates speed up the explicit linking, and an internal category helps me figure out what fragment of the tag space I've covered. When I see a story that isn't easy to link to from a relevant page, I go back and add the tag - not because it's useful in the abstract, but because it's relevant to one specific external instance.
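I do this spot-check by hand, but the idea can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `uncovered_stories` helper, the sample stories, and the notion that you can get a plain list of the tags your wiki pages link to.

```python
def uncovered_stories(stories, linked_tags):
    """Return titles of stories with no tag that any wiki page links to."""
    # Compare case-insensitively, since tag capitalization tends to drift.
    linked = {tag.lower() for tag in linked_tags}
    return [
        title
        for title, tags in stories
        if not linked & {tag.lower() for tag in tags}
    ]

# Hypothetical sample data: (title, tags) pairs from the publishing system.
stories = [
    ("Council approves budget", ["city council", "budget"]),
    ("New deli opens on Main Street", ["restaurants"]),
    ("Deli wins award", []),
]

# Hypothetical list of tags that wiki pages already link to.
linked_tags = ["City Council", "Farmers Market"]

print(uncovered_stories(stories, linked_tags))
```

The stories that come back are the candidates for going back and adding a tag, or for adding a link from the relevant wiki page.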
In this way I think that Ben is missing one of the elements of metadata: you never really stop working on it, and simple repurposing through a database lookup only works when you are still actively editing the database. There's nothing worse than going to a lot of work building an attractively formatted page driven by a query against a database that you can't control, then having to answer why a certain piece of unwanted data is there, with someone stopping you in your chair and watching over your keyboard until it's gone.
Metadata only works when you un-meta it and deal with it again as data. The list of metadata elements that I care enough to keep updating is not just meta; it's a first class real list, one that has to be treated as a first class citizen and not just some accidental system artifact. Sometimes the metadata you expose just makes it clear how incomplete your first pass at storytelling was and what it takes to bring it back to the level of refinement that you expect.
(tags: stopped watched; so meta it hurts; arborwiki)