By: Steve Outing
In my last column, I presented the problem of news Web sites allowing article URLs to expire, so that linking sites throughout the rest of the Web point to dead links after a few days or weeks (depending on news site policy). This problem particularly impacts sites like New Century Network’s NewsWorks, which link to the best articles from newspapers around the U.S.; too often, a link on such a site will go bad because the originating news publisher expires the link and moves articles to a permanent digital archive. The solution, I suggested, is permanent links.
A growing number of publishers are recognizing this problem, and their Web sites have adopted policies of never letting their Web articles expire — in effect creating a Web site archive as the days and weeks pass. Last week, I surveyed Web news sites to learn about their link practices. Here I’ll review what I found. Then, in my next (Friday) column, I’ll discuss some technical and business model issues that you’ll need to address if you decide that permanent links are for you, and suggest an industry-wide solution to the permanent links dilemma.
No permanent links: the standard
The timing of expired links varies widely, but the majority of news sites seem to cling to the notion of letting Web articles expire after a set period — after which older stories can be found only in an electronic archive, typically for a fee. Obviously, this is a strategy designed to enhance archive revenues. The Web visitor who can’t find an article with a free search of the Web site next tries the paid digital archive.
At the Post and Courier (Charleston, South Carolina), most of the site’s URLs die within 24 hours and are transferred to an archive area, according to webmaster David MacDougall. Daily news and sports stories survive only until the next publishing cycle overwrites them. Recent stories are held in a “Recent news stories” area for seven days, after which they are purged from the system. The paper’s online archive system goes back to 1994, and there is a charge for accessing the archives. MacDougall says that columns are held on the Web for a week, and selected features stick around for one month.
At the Vancouver (British Columbia, Canada) Sun and Province, local news stories on the Web also are replaced each day, with exceptions made for certain stories with longer shelf lives. The newspaper site keeps no archive of any of the stories directly, but all stories from the two Southam papers are part of a paid archive service provided by InfoMart.
At Syracuse Online, the Web site of the Syracuse (New York) Newspapers, new media chief Stan Linhorst reports that articles from the newspapers remain online for about one week, then are automatically purged. Once the paid archives can be reached from the Web, Linhorst says he expects to take that down to a 24-hour expiration period.
And at the New York Times Web site, most links are good only for 24 hours, with certain articles (such as Arts & Leisure and CyberTimes) good for a week. Some articles are hand-selected for topical archives (for example, Princess Diana package stories and theater reviews), which stick around for longer periods.
It is these short-expiration policies that give fits to sites that link to stories within other news sites.
Some Web sites do keep up “Webified” articles for longer periods. At Network World, the Web site keeps stories from the trade magazine up for about six months. Online editor Adam Gaffin explains why they don’t keep the articles online indefinitely: “It’s not so much that I’m concerned about our content going stale as I am about the hyperlinks we build into the stories going bad. Actually, even six months might be too long. I’ve had a couple of cases where links went bad between the time I found them on Friday and our first users logged in on Monday!”
At the Knoxville (Tennessee) News-Sentinel, Web director Jack Lail says that most news stays online for 30 days, but columnists’ works are online “pretty much forever.”
At many sites, article expiration dates can get complex. At the Los Angeles Times, stories last anywhere from one day to one year. A columnist’s URL might stay on the Web for a year; an article from a weekly section of the newspaper will stay for one week; and daily stories die after one day, according to deputy editorial director Travis Smith.
Steve Yelvington, editor of the (Minneapolis, Minnesota) Star Tribune’s Web site, says that expirations for content on his site are sometimes dictated by contractual obligations. Local news stories are kept on the Star Tribune site for three weeks before being purged, while Associated Press content is deleted after two weeks. And because of the demands of the New York Times Syndicate, NYT stories are killed after only one day.
In my research on what news sites are doing with durable links, I also found a number that use temporary storage areas for articles older than one day. With this scheme — which, again, works to the detriment of external sites that want to link to a news site’s specific articles — an initial URL for a story is good for only one day, then the article URL is changed and it is put in a temporary “recent news” archive. The original URL is “recycled” — used again for the next day’s story.
(This is similar to the system used by Editor & Publisher for archiving this column. The same URL will always get you to my current column. When a new column goes online, the previous one is given a new URL and kept on the E&P Web site as part of a column archive. You can always find a list of my previous columns, going back to 1995, via a link at the bottom of this column.)
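For readers who think in code, the recycled-URL scheme can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name, the "/today.html" and "/recent/" paths, and the sample stories are hypothetical and do not describe any particular site's system. The point is that a link saved on day one silently points at a different story on day two.

```python
from datetime import date


class RecyclingSite:
    """Toy model of a site that reuses one URL per section (hypothetical)."""

    def __init__(self):
        # Maps URL -> (publication date, slug, body).
        self.pages = {}

    def publish(self, section, published, slug, body):
        current = f"/{section}/today.html"  # assumed naming scheme
        old = self.pages.get(current)
        if old is not None:
            # Yesterday's story gets a NEW address in a "recent" area.
            old_date, old_slug, _ = old
            moved = f"/{section}/recent/{old_date:%Y%m%d}-{old_slug}.html"
            self.pages[moved] = old
        # The original address is recycled for the new story.
        self.pages[current] = (published, slug, body)
        return current

    def fetch(self, url):
        page = self.pages.get(url)
        return page[2] if page else None


site = RecyclingSite()
monday = site.publish("local", date(1997, 10, 6), "council-vote", "Council votes")
tuesday = site.publish("local", date(1997, 10, 7), "bridge", "Bridge opens")
```

An outside site that bookmarked the Monday address now gets Tuesday's story from it; Monday's story is reachable only at its new "recent" address, which the outside site never saw.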
Let’s hear it for permanent links
While not the majority, an increasing number of news Web sites are embracing the idea of permanent URLs — recognizing the advantages of letting linking sites bring them site traffic long after an article is originally published.
At Dagbladet in Oslo, Norway, Web site director Arne Krumsvik explains: “We used to remove news articles after three weeks. (But) this limited our possibilities to link to earlier related stories, and therefore we terminated the automatic removal.” He says that the newspaper does not currently have a paid archive on the Web, “but if we choose to offer that, we might use free entries to selected stories in the archive as related links instead of letting the Web site grow bigger and bigger.”
At the Modesto (California) Bee, URLs currently live for only 24 hours. But online news manager Tom Rouillard reports that the Web site next month will change over to database format for its pages (as opposed to static HTML pages), with story URLs looking like odd strings of numbers and commas — which in effect creates permanent URLs.
Another site that takes a similar approach is CNET’s News.com, which uses database calls as its URLs, thus making them “permanent.” This technique has the disadvantage of the URLs looking cryptic, but the article URLs the News.com visitors see will always pull up a particular story, no matter how old. And this system can work over the long term, where permanent link schemes that merely keep track of static HTML pages can become too cumbersome over time. CNET reportedly is also working on incorporating references to more recent related stories from within older stories — a nice feature made possible by the database approach to article storage.
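The database approach can be illustrated with a minimal sketch. It assumes a single table of stories keyed by a numeric ID and a hypothetical "/story/<id>" URL pattern; neither is taken from News.com or the Modesto Bee, whose actual schemes are not described here. Because the URL is just a lookup key, it keeps working for as long as the row exists.

```python
import sqlite3


def open_store():
    """Create an in-memory story database (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE stories ("
        "id INTEGER PRIMARY KEY, headline TEXT NOT NULL, body TEXT NOT NULL)"
    )
    return conn


def add_story(conn, headline, body):
    """Store a story and return its permanent URL."""
    cur = conn.execute(
        "INSERT INTO stories (headline, body) VALUES (?, ?)", (headline, body)
    )
    conn.commit()
    return f"/story/{cur.lastrowid}"  # hypothetical URL pattern


def resolve(conn, url):
    """Return (headline, body) for a story URL, or None if it does not match."""
    prefix = "/story/"
    if not url.startswith(prefix):
        return None
    try:
        story_id = int(url[len(prefix):])
    except ValueError:
        return None
    return conn.execute(
        "SELECT headline, body FROM stories WHERE id = ?", (story_id,)
    ).fetchone()


conn = open_store()
first = add_story(conn, "Council votes", "Full text of the first story")
second = add_story(conn, "Bridge opens", "Full text of the second story")
```

Publishing the second story does nothing to the first story's address, which is the property that makes such URLs effectively permanent.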
At the Christian Science Monitor, the Web site staff has been assigning “durable URLs” to articles since July. Supervising online editor Tom Regan explains: “It used to be that once we changed our stories at 6 p.m., all the previous day’s stories disappeared. After that, the only place to find them was in the archives. But we found that created a problem for people in other time zones, particularly on the (U.S.) West Coast. So we started creating a ‘durable’ HTML archive. Although the original URLs still are wiped out at 6 p.m., we’ve created links from our Today’s Paper page to a mini-archive of the past two weeks’ worth of the Monitor.”
When people express an interest in linking to a story on the Monitor site — which the Monitor actively encourages — they are given the durable URL. “This allows us to not only permit but to actively seek permanent links from other sites directly into ours at the story level,” says director of electronic publishing Dave Creagh.
Mercury Center, the Web site of the San Jose Mercury News, doesn’t do permanent URLs, but it does encourage links to its articles. Managing editor Bruce Koon says that the site has created a “reprint center” where people who want to link to a Mercury News story can send an e-mail request to do so. When such a request is approved, the story is put in a special reprint area on the Mercury Center server and the URL is sent to the person making the request. Those reprint stories are kept online for six months, after which the Mercury staff informs the requesters that they are welcome to take the story onto their own server.
“We think it’s a good way to provide a public service, drive traffic to our site, but long term allow the user a way to finally have the story,” Koon says. “Generally speaking, this is for the occasional user and not for someone or some institution that’s trying to harvest all our stories.”
Another interesting approach to permanent links is being employed by MaineStreet Communications, a Gray, Maine, company that produces Web sites for publishers. Publisher Christopher Miller explains that Web advertising is constantly updated whenever someone requests an old article. (His sites use permanent URLs.) “So the article could be a year old and the advertising current,” he says. “We’ve gone through considerable hoops to make that possible and to map out a process so that will continue; it is part of our model.”
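One way to picture the idea of current advertising on an old article is to choose the ads when the page is requested rather than when the story is published. The sketch below is an assumption about how that could work, not a description of MaineStreet's system; the Ad class, its fields, and the sample campaigns are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Ad:
    text: str
    start: date  # first day the campaign runs
    end: date    # last day the campaign runs


def current_ads(ads, today):
    """Keep only the campaigns that are running on the request date."""
    return [ad for ad in ads if ad.start <= today <= ad.end]


def render_page(article_html, ads, today):
    """Assemble the page at request time: current ads plus the stored article."""
    banner = "\n".join(
        f'<div class="ad">{ad.text}</div>' for ad in current_ads(ads, today)
    )
    return f"{banner}\n{article_html}" if banner else article_html


ads = [
    Ad("Old sale", date(1996, 10, 1), date(1996, 10, 31)),
    Ad("New sale", date(1997, 10, 1), date(1997, 10, 31)),
]
page = render_page("<p>Year-old story</p>", ads, date(1997, 10, 15))
```

The stored article never changes; only the ad selection depends on the date of the request, so a year-old story can carry this month's campaign.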
Other newspaper sites embracing permanent links include:
American City Business Journals, which operates a Web site for its 35 business newspapers. “A person who has linked to our site since we went up in June 1996 continues to reach the story with the same link,” says managing editor for online services Ben Eubanks.
The Detroit News keeps all of its Web articles online permanently, linked to their original URLs. The site’s managers report “a good deal of traffic” to old stories maintained on the Web.
The San Francisco Chronicle site, The Gate, uses permanent URLs.
The Amarillo (Texas) Globe-News.
The (Raleigh, North Carolina) News & Observer.
Next column …
I started writing my previous (Monday) column on this topic fully expecting it to be a single piece. But as I dug into this topic, I found it to be of considerable interest, not just to me but to the many people who wrote to me with information and comments. And there seems to be a problem with the archiving model in use by the majority of news Web sites. I apologize if you’ve heard too much about the permanent links issue already, but I plan to do one more column (on Friday) discussing the technical and business model issues involved in switching to permanent links.
This column is written by Steve Outing exclusively for Editor & Publisher Interactive three days a week. News, tips, and other communications may be sent to Mr. Outing at firstname.lastname@example.org
The views expressed in the above column do not necessarily represent the views of the Editor & Publisher company