Industries that talk proudly of the “content” they offer — raise your hand, journalism organizations — have a special need to preserve what they’ve created in a consistent and easy-to-find way. Content, in this context, includes the links that people have been using to find it.
You would imagine that the news industry would understand this. If so, you would be overestimating the industry’s collective common sense.
A current case in point is what the New York Times Co. (a company in which I, perhaps stupidly at this point, still hold a bit of stock) just did to my friend Thomas Crampton and a host of other journalists whose work has appeared in the International Herald Tribune (owned by the Times) in recent years. In an open letter to the NYT chief executive, “Reporter to NY Times Publisher: You Erased My Career”, he observes:
The IHT website earned an ever-increasing pagerank due to all of the blogs and sites linking to stories there. (Based on the number of Internet pages linking back to a site, pagerank starts at 1 and rises to 10. A page with a Google rank of 5 will show up higher than a page with a Google rank of 3 and the IHT.com grew to match nytimes.com at a Google rank of 9. You can check pagerank of any site here.)
So, what did the NY Times do to merge these sites?
They killed the IHT and erased the archives.
No doubt, the Times will eventually make his and other IHT journalists’ articles available again via search through the mother ship’s own archives. But not preserving the URLs is a truly foolish move.
Thomas isn’t alone among media folks in this content carnage, as he reported in a follow-up posting. Changing the URL structure of websites is a too-common event. Even if, as is the case most of the time, the originals are still around, disappearing the links is tantamount to hiding the original material.
In my case, when I worked for Knight Ridder, what happened was considerably worse. My former employer deleted my entire archive of blog postings — not just once but twice.
The first time, Knight Ridder moved all of its Web properties to a centralized system. This was part of a move decreed by the bosses who’d been sold on the notion that homogenizing the company’s content (and, more importantly, centralizing the display advertising engine) would be a brilliant business move. I question that, by itself, but I can assure you it was a stupid journalistic move to wipe out years’ worth of what I’d been creating for Knight Ridder.
The second time came after I left the company. I was offered a CD of my blog archives before the site was turned off, but I don’t recall ever receiving it, and by the time I realized I didn’t have anything it was too late.
I was enraged, both times. And entirely powerless to do anything about it.
I’ve learned my lesson. Anything I write — for myself or for someone else — is backed up on my machines under my control. I’m creating a cloud backup as well. I realize that there are circumstances under which I could lose even those copies, but I can’t make my stuff 100 percent safe.
This applies in spades to other kinds of things we store online. As far as I know, practically every service you use reserves the right to delete your account. Some of them will give you an opportunity to download what you’ve posted, but you should not even count on that when push comes to shove, especially when economic pressures are as high as they have become today.
The point is that I no longer rely entirely on the good graces of other people, including employers, to preserve what I’ve created, much less keep it available for you to see. I try to rely on myself.
3 thoughts on “When Others Delete Your Past”
Thanks for weighing in on this.
One commenter on my posting suggests offering a service to journalists in backing up and tracking the quality of links to their work. I’m not sure many journalists would want to pay for this, but it would be an interesting route to pursue.
PS: A further issue was raised today in the Wikipedia community. What should they do about the dead links to the IHT website?
Institutions regularly wipe out electronic-only chunks of history. I managed the creation and operation of the DNC’s first website, which went live in 1995, and disappeared after the 1996 election, so I know the dismay that Crampton felt when he discovered that his links were all dead.
Seeing your own work disappear is unpleasant; but what about the historical record?
Who is responsible for preserving history in any form, electronic or otherwise? Historians have always struggled to answer questions about issues for which the original sources no longer exist, whether carved on stone, printed on paper, or typed into a computer.
Who decides what electronic records to attempt to preserve, leaving aside the well-known problems with archiving electronic records in the first place, with programming languages going extinct, formats disappearing, etc.?
Dealing with these archival problems at a global level would cost a lot of money. Who should pay? Should there be a global database that sucks down the Internet every day?
Entropy is one of the most powerful forces in the universe. Preserving the records of the past is swimming upstream, and the electronic revolution has vastly increased the speed of the current. Can some variant of Moore’s Law decrease the cost of storage fast enough to keep up?
Lots of deep questions here. Right now, it looks like everyone for him/herself, sauve qui peut.
I just took your advice and once again downloaded my blog onto my computer. In a sense these organisations are repeating the mistakes of their media forebears. TV stations regularly wiped and re-used magnetic tapes in the past and film companies were notoriously lax about how they stored movies. Think of what kind of revenue that could generate now.
Strange that with storage capacity growing at such a rapid rate the Times couldn’t just keep the IHT stuff in storage.