By: Steve Outing
(This is the third and final column on the topic of permanent links for Web site news articles.)
As documented in this column this week, the majority of news Web sites are engaging in what I believe is an unwise policy of letting hypertext links to published stories expire after short periods — in many cases, as short as 24 hours. The result is that search engines, personal Web sites, and sites that link to news stories on various news sites (such as New Century Network’s much-ballyhooed NewsWorks), are being prevented from driving traffic to these news sites.
This week, I heard numerous complaints from linking sites and from consumers about the annoyance of finding a link to an article on a news site, only to get the dreaded “Error 404 – File Not Found on This Server” message because the URL had intentionally been expired. News sites’ policies of terminating article links quickly are creating holes in the World Wide Web. Put bluntly, news site archiving policies are angering consumers, as well as proprietors of Web services that can send substantial business to news Web sites. Quick-expiring news links defeat the best thing about the Web.
I received a lot of input and letters on this topic this week, and I’m hearing about heated debates at newspaper Web sites over this issue. Comments like these convince me of the need for permanent URLs at news sites:
“The best thing about the Web should be that no one needs to keep personal copies of anything they might want later; just keep the URL. In fact, you now have to keep things locally because they won’t be there when you go back. … As an early Web aficionado, I almost feel betrayed.”
— Jack Scholfield, computer editor, The Guardian (London)
“I come down firmly on the side of permanent links. One of the main attractions of the Web is that it allows readers/users to control how information comes to them. For us to say it’s only available now, and for X number of hours, demonstrates the kind of arrogance that is doing in the newspaper industry.”
— Patricia Sullivan, online editor, Mercury Center/San Jose Mercury News
The money thing
The real issue, of course, is money. Long before the Internet became popular, newspapers in particular generated substantial revenue from participating in paid proprietary digital archives like Nexis, DataTimes and others. With the Web, newspaper archives are now more easily searchable by the general consumer (minus the middle man). Publishers, naturally, want to make money from this.
That’s all well and good, but the way many publishers are going about it, they’re shooting themselves in the foot. Expiring news article URLs in 24 hours, then moving the stories to a paid archive where those original URLs no longer function, is throwing away business.
Consider this. Search engines and Internet directories like Yahoo!, Excite, AltaVista and InfoSeek are the busiest sites on the Web, generating hits that the largest news publisher can’t even conceive of approaching. Yet you can’t search for an old article from the Los Angeles Times using Excite or AltaVista; you’ll have to visit the Times’ archive on the Web (or a proprietary service like Nexis).
The Times Web archive may generate some substantial revenues for the newspaper, but imagine how much greater the revenue stream would be if a consumer search using AltaVista turned up archived Times stories along with all the other “stuff” on the Web, which would be possible if the Times site started using permanent links.
Here’s how this scenario might play out, continuing to use The Times for our hypothetical scenario:
The Times begins using permanent URLs on its content published on the Web, so that all stories are kept on the Web site indefinitely (except those that must be removed from the Web for contractual reasons, such as some wire service content, or freelancers’ works, depending on contract provisions).
Some consumers continue to use The Times’ searchable Web archive to find, and pay for, old articles.
Others continue to search for Times articles by using time-worn proprietary services like Nexis and DataTimes.
Other consumers use one of the big-name Internet search engines, and specify in their searches to look only for articles from The Times. Because of the paper’s use of permanent URLs, the search is successful, and results turn up capsule information about particular stories.
Consumers using the Internet search engines also may do a Web-wide search, and among the results will be articles from The Times, mixed in with all the other relevant search hits.
When a consumer clicks on a Times archived article link found via an Internet search engine, if the article is older than a news site-defined period (say, 30 days), then a payment screen will be returned to the consumer, because the older-than-30-days articles are tied in to the paper’s archive system.
Likewise, someone who finds a URL linking to an older Times article on some unaffiliated Web site would not face “Error 404,” but rather be informed that the story still exists, but it will cost a nominal fee to view it because it has been “archived.”
Some observers, such as author David Rothman, whom I have quoted in earlier columns on this topic, want to see a more benevolent model, where publishers keep aging articles online permanently with no fee for viewing. He would have this be supported with targeted, current advertising dynamically inserted into old content as it is requested by Web users.
While some publishers are apt to try this, I doubt that many larger news organizations will adopt such a model any time soon, for the (legitimate) fear of decimating their traditional and Web archive revenue streams. Rather, I would expect to see a model in which content is free for a set period, after which older content carries a price tag.
The price tag needs to be nominal, however, if you expect Joe Websurfer to pay up when he hits a paid link. The Los Angeles Times, for instance, currently charges $1.50 to view an article in its archives. Joe is highly likely to balk at that when stumbling on an old Times article via surfing or an AltaVista search, but he might pay a modest fee like 25 cents or 50 cents. Thus, Web surfers become a prime new market for archived news site stories when pricing is adjusted for Web customer expectations. (Lower pricing like this also can be subsidized by adding advertising into the mix.)
Also, it’s highly recommended that you consider making some content from your news site permanently available on the Web for free. This might include such content as obituaries and birth notices, or anything for which you determine a public service reason for “giving to the community.”
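To make the proposal concrete, here is a minimal Python sketch of the decision a news site's server might make when a permanent URL is requested. It is only an illustration under stated assumptions: the `Article` record, the `resolve` function, the section labels and the example URL are all hypothetical, and the 30-day window and 25-cent fee are simply the example figures used above, not any real site's policy.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from decimal import Decimal

# Illustrative policy values; both are assumptions taken from the column's examples.
FREE_WINDOW = timedelta(days=30)             # "say, 30 days"
ARCHIVE_FEE = Decimal("0.25")                # a nominal fee, e.g. 25 cents
ALWAYS_FREE = {"obituary", "birth-notice"}   # hypothetical public-service sections


@dataclass(frozen=True)
class Article:
    url: str                  # the permanent URL, never reassigned
    section: str
    published: date
    withdrawn: bool = False   # e.g. wire copy removed for contractual reasons


def resolve(article: Article, today: date) -> tuple[str, str]:
    """Decide what a request for a permanent URL should return."""
    if article.withdrawn:
        # The URL still answers, but explains why the story is unavailable.
        return ("gone", "removed for contractual reasons")
    if article.section in ALWAYS_FREE:
        return ("free", article.url)
    if today - article.published <= FREE_WINDOW:
        return ("free", article.url)
    # Older than the free window: show a payment screen instead of Error 404.
    return ("payment_required", f"${ARCHIVE_FEE} to view archived article")


story = Article("/news/1997/06/12/budget.html", "news", date(1997, 6, 12))
print(resolve(story, date(1997, 6, 20))[0])  # free
print(resolve(story, date(1997, 9, 1))[0])   # payment_required
```

The point of the sketch is that the URL never stops working; only the response changes as the article ages.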
Short of the proposal above, another option for the newspaper business is to create an industry-wide electronic archive of news content (permanently) housed on the Web. If news sites cooperate, an organization like New Century Network could manage the archive billing and provide the technology that is shared by all its members. (Thus, aging URLs that point to articles on a news site older than 30 days would generate a request for payment from the central archive administrator, which intercepts the request before the actual article is served up to the customer.)
So far, no such system has been proposed. These are just ideas.
Of course, I don’t mean to suggest that any of this is simple. A large reason that nothing like this has yet been implemented is the lack of an adequate online microtransactions system to support it. Those sites (mentioned in my last column) that do use permanent links do not charge a consumer to read an old article; in practice, they probably take away some traditional archive revenue.
Even discounting the pay-per-read angle, keeping all of a busy news site’s URLs permanent presents data management challenges. Bob Kennedy, director of development at Knight-Ridder MediaStream, which produces digital archiving systems for the news industry, says that permanently saving static HTML pages eventually will create an unmanageable system. What’s called for if you want an efficient system for permanent article links is a database publishing system, in which URLs are actually database calls. Thus, a URL for today’s story, when used one year from now, will retrieve the relevant article from the database, rather than go in search of a static HTML page on a bloated server.
A good example of such a system in action is CNET’s News.com Web site. (CNET uses Vignette’s StoryServer system.) CNET does not have a paid archive.
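For readers who want to see what "URLs are actually database calls" can mean in practice, here is a rough Python sketch. It makes no claim about how StoryServer or any other commercial product works: the `articles` table, the `/story?id=...` URL shape, the sample story and the function names are all assumptions made for illustration, and an in-memory SQLite database stands in for a real publishing system.

```python
import sqlite3
from urllib.parse import urlparse, parse_qs


def open_store() -> sqlite3.Connection:
    """Create a throwaway in-memory article database with one sample story."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE articles (id INTEGER PRIMARY KEY, headline TEXT, body TEXT)"
    )
    db.execute(
        "INSERT INTO articles VALUES (?, ?, ?)",
        (1042, "Council passes budget", "Full text of the story."),
    )
    db.commit()
    return db


def fetch_by_url(db: sqlite3.Connection, url: str):
    """Treat the URL as a database call: parse the id, then look up the row."""
    ids = parse_qs(urlparse(url).query).get("id")
    if not ids or not ids[0].isdecimal():
        return None  # malformed URL: no usable article id
    return db.execute(
        "SELECT headline, body FROM articles WHERE id = ?", (int(ids[0]),)
    ).fetchone()  # a (headline, body) tuple, or None if no such article


db = open_store()
print(fetch_by_url(db, "/story?id=1042"))
print(fetch_by_url(db, "/story?id=7"))
```

Because the page is assembled from the database on each request, the same URL works on the day of publication and a year later, with no pile of static HTML files to manage.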
Who will fix this?
I don’t have all the answers, but I do know that the Web news industry needs to come up with a solution to the dead links and lost business created by short-expiration URL policies. The news industry needs to have some hard discussions about existing archival models — which are geared to generating revenue from businesses, professionals and librarians who can afford to pay high archive retrieval fees — and how to rewrite them to accommodate the existing and coming masses of consumers using the Internet.
There’s potentially much money to be made by publishers who create a consumer-priced archival scheme — and permanent links are a major part of the eventual solution.
This column is written by Steve Outing exclusively for Editor & Publisher Interactive three days a week. News, tips, and other communications may be sent to Mr. Outing at firstname.lastname@example.org
The views expressed in the above column do not necessarily represent the views of the Editor & Publisher company