Newspaper Web Archives Slow to Hit the Market

By: Steve Outing

Over the (Thanksgiving) holiday weekend, I went hunting for some newspaper articles by using the Web archival services of some newspaper companies. A growing number of papers now allow Internet users to search through their back archives, in some cases going back to the 1980s. The services open up newspaper archives to the general public, since you no longer have to use the proprietary archival services like Nexis/Lexis, Datatimes, Dialog, Newsbank and others. (Some of those services now sport Web interfaces too, of course, and are useful when you want to search multiple publications.)

Newspaper Web archive services are evolving, and remain far from perfect. I spent some time sampling newspaper archives on the Web, to gauge the progress that the newspaper industry is making. There's still work to be done, but the archives that do exist are getting better. Below, I've included "grades" (on an A-B-C-D-F scale) on several publications' Web archive services.

(For this column, I am dealing only with archive services that cover content from a newspaper going back several years. I do not include search features on Web sites that only search content that is a few days or weeks old and remains on a paper's Web site.)

All the News That's Not Archived: No Grade

First, let's note that many newspapers -- including some prominent ones -- do not yet have archive access available on the Web. The New York Times is a glaring example. The user who wants to find an old Times article is referred to Nexis/Lexis or their public library. A Web archive service is forthcoming, perhaps in 1998.

The Times isn't alone in its tardiness to get a Web archive running. Other notable newspapers that still lack a Web archive interface include: USA Today; The Washington Post (which says it will have a newspaper archive dating back to 1986 on the Web soon); Seattle Times; Dallas Morning News (which says a Web archive is coming soon); The Guardian (London); The Times of London (which says it's having trouble with its archive system and is rebuilding it); Sydney Morning Herald (Australia); and the Jerusalem Post (archive searches currently unavailable) -- to name just a few.

A lot of newspapers continue to do without a Web archive service that lets consumers pay for access to their archived articles. These publishers are missing out on an obvious and potentially lucrative Web revenue stream for their sites -- when the technology is available for them to create such services.

Here is a sampling of newspapers that do have long-term archives available on the Web, with some commentary about good points and bad:

Wall Street Journal: B+

Journal Web site subscribers, who pay annual fees to view the site, are referred to the Dow Jones News/Retrieval Publications Library premium Web service, where they can search the Journal, Dow Jones content, and that of many other publications. Searching is free, but you are charged $2.95 for each article actually read, which is added to your normal WSJ Interactive account.

If you're searching for an example of a good implementation of a Web archive feature, this is the place to look. A search returns the headline; publication name; date the article ran; number of words in the story; and the first two or three lines of the article (typically the first sentence). I'd like to see a little more preview text in the search results screens, but this can be enough information to make a decision on whether to spend the money to see the piece. Also, the articles are ranked in order of relevance to the searched term. Use the "Advanced search" feature, and you can specify date ranges to search and get results returned in date order.

The pricing model could use some work in order not to price this service out of the small-office/home-office and educational markets. The basic per-article fee of $2.95 is too high -- especially since to use the archive, a user already is paying an annual Web subscription fee. As an alternative, you can pay $9.95 per month, which gets you 15 articles. The problem is, you have to keep track of how many articles you've accessed; the system doesn't tell you. Go over 15, and it's back to being charged $2.95 each. You also can view just first paragraphs of stories ($2 apiece) or citations ($1). Again, I don't think that's enough of a price break, especially when other newspaper Web archive services offer first paragraphs for free.
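To put numbers on that trade-off, here is a minimal Python sketch of the two Journal pricing options described above. The prices come from this column; the function names are my own illustration, and the sketch assumes the annual Web subscription fee is paid either way and so leaves it out.

```python
# Prices in cents, as quoted in the column. Names are illustrative only.
PER_ARTICLE = 295     # $2.95 per full-text article
MONTHLY_FEE = 995     # $9.95 per month
MONTHLY_QUOTA = 15    # articles covered by the monthly fee


def pay_per_article_cost(n_articles: int) -> int:
    """Cost in cents of reading n articles with no monthly plan."""
    return n_articles * PER_ARTICLE


def monthly_plan_cost(n_articles: int) -> int:
    """Cost in cents on the monthly plan; articles past the quota cost $2.95 each."""
    overage = max(0, n_articles - MONTHLY_QUOTA)
    return MONTHLY_FEE + overage * PER_ARTICLE


def break_even_articles() -> int:
    """Smallest monthly article count at which the plan is no more expensive."""
    n = 0
    while pay_per_article_cost(n) < monthly_plan_cost(n):
        n += 1
    return n


if __name__ == "__main__":
    print("Break-even at", break_even_articles(), "articles a month")
    for n in (3, 4, 15, 17):
        print(n, pay_per_article_cost(n) / 100, monthly_plan_cost(n) / 100)
```

Under these assumptions the monthly plan pays for itself once you read four articles in a month, which is why losing count of your retrievals near the 15-article mark is the real risk.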

Knight-Ridder Newspapers: A-

Knight-Ridder's newspapers all share the News Library system, which enables a user of any K-R Web site to search any of the company's newspapers for past articles. Anyone can use the system to conduct searches, either of an individual paper or all K-R papers. Searching is free, but it will cost you $1 for each article you choose to read.

This is one of the better newspaper archive services on the Web. You can specify what order you want results displayed (newest to oldest, oldest to newest, relevance, or term frequency). Search results return the headline, date, publication, number of words in the article, and the first couple of paragraphs of the story. News Library gives you a good peek at an article before you decide to plunk down your dollar to view it. It lacks an adequate date range selector; you have to choose a year, but can't specify your search to be just for the last 10 days. Available databases vary by paper, but many go back pretty far. The Miami Herald archives available via a News Library Web search go back to 1982, for example.

While anyone can search the service for free, when you want to retrieve a full-text article, you have to be a "member" of the service. This can be done by becoming an InfiNet ISP customer (because K-R papers offer Internet access, co-branded through InfiNet), or by applying for a "LibraryCard," which allows you to use News Library and be billed for article retrievals via credit card. A third option is to purchase a News Library subscription by calling an 800 telephone number. What the service lacks is a way for an infrequent user to simply purchase a single article without having to go through the hassle of becoming a member.

Los Angeles Times: A-

The Times also has a good Web archive service, with content available back to 1990. Searching is free, and results include the headline, date, author, length of story, and the first few sentences -- again, enough to make a decision about whether to "buy." Price is $1.50 per article, which I consider too high for a consumer publication that wants to get a lot of archive business from small businesses, students, teachers, etc.

The Times Web site offers two monthly subscription plans, to give regular users a break on the $1.50 per story fee. A $4.95 a month subscription gets 10 stories, then subsequent stories retrieved are $1.50. Or for $25 a month, you can get 25 stories, with subsequent retrievals at $1.50. These subscriptions automatically renew each month.
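A rough way to see which option suits a given reader is to compute the monthly bill under each one. The Python sketch below uses the prices quoted above; the plan labels and function names are hypothetical, chosen only for illustration.

```python
# Prices in cents, as quoted in the column. Plan labels are illustrative.
PER_STORY = 150  # $1.50 per story outside (or beyond) a plan

# plan label -> (monthly fee in cents, stories included)
PLANS = {
    "pay-per-story": (0, 0),
    "$4.95 plan": (495, 10),
    "$25 plan": (2500, 25),
}


def monthly_cost(plan: str, n_stories: int) -> int:
    """Monthly bill in cents for retrieving n_stories under the given plan."""
    fee, included = PLANS[plan]
    return fee + max(0, n_stories - included) * PER_STORY


def cheapest_plan(n_stories: int) -> str:
    """Label of the plan with the lowest bill for n_stories a month."""
    return min(PLANS, key=lambda plan: monthly_cost(plan, n_stories))


if __name__ == "__main__":
    for n in (3, 4, 10, 23, 24, 40):
        best = cheapest_plan(n)
        print(n, best, monthly_cost(best, n) / 100)
```

On these figures, paying per story is cheapest up to three stories a month, the $4.95 plan wins from four through 23, and the $25 plan comes out ahead only at 24 stories or more.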

It also features a nice "Power search" feature, where a user can enter multiple words or phrases in a series of text fields, separated by AND, OR or NOT choices selected by clicking on radio buttons. This is an excellent way to make searching easy for people who may not understand the standard AND/OR/NOT conventions of many search engines. You can search for "Last 7 days," "Last 30 days," etc. Better are those search interfaces that allow you to specify exactly what days to search.

Boston Globe: D

The Globe's Web site includes an archive search service for the Boston Globe's content going back to 1982 -- although it only covers staff-written articles. I had trouble even using this service, because even after signing up to be a registered user (including my credit card number so that I could pay to retrieve full-text archived articles), clicking on the Search link brought up an error message -- repeatedly.

Like most other newspaper Web archives, the site charges nothing to do a search, but retrieving full-text articles costs money. If you do your retrieving between 6 a.m. and 6 p.m. U.S. Eastern time, each article costs $2.95; other times it's $1.50. Not only is this too expensive for a consumer market Internet service, but it's foolish to stick with different fees for different times of the day. Because the Internet is an international medium, and the site's archive customers are as likely to be in Europe as in Boston, the site should get rid of this antiquated approach to charging. (It's worth noting that Knight-Ridder axed a similar pricing scheme on its News Library service in favor of a flat $1 per article charge at any time of day. In fact, the Globe's site is using News Library itself as its archive search technology, but with a different business model.)
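The time-zone problem is easy to demonstrate. The Python sketch below is only an illustration of the pricing rule as described here, not the site's actual billing logic: it assumes a fixed UTC-5 offset for U.S. Eastern time (ignoring daylight time) and treats the peak window as 6:00 a.m. up to, but not including, 6:00 p.m.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC-5 for U.S. Eastern time; daylight time is ignored.
EASTERN_STANDARD = timezone(timedelta(hours=-5))
PEAK_CENTS = 295      # $2.95 between 6 a.m. and 6 p.m. Eastern
OFF_PEAK_CENTS = 150  # $1.50 at other times


def article_price_cents(when: datetime) -> int:
    """Price of one article for a timezone-aware retrieval time."""
    eastern = when.astimezone(EASTERN_STANDARD)
    # Assumption: the peak window includes 6:00 a.m. and excludes 6:00 p.m.
    return PEAK_CENTS if 6 <= eastern.hour < 18 else OFF_PEAK_CENTS


if __name__ == "__main__":
    central_europe = timezone(timedelta(hours=1))  # e.g. Paris in winter
    evening_in_paris = datetime(1997, 12, 1, 20, 0, tzinfo=central_europe)
    evening_in_boston = datetime(1997, 12, 1, 20, 0, tzinfo=EASTERN_STANDARD)
    print(article_price_cents(evening_in_paris) / 100)
    print(article_price_cents(evening_in_boston) / 100)
```

A reader in Paris at 8 p.m. local time is retrieving at 2 p.m. Eastern and pays the peak rate, while a reader in Boston at 8 p.m. pays the off-peak rate for the same article.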

San Francisco Newspapers: C+

The Gate, the Web site serving the Chronicle and the Examiner, has a free searchable archive available on the Web. While the price is right, it covers content only back to January 1995, and it includes only staff-written articles -- nothing that appeared in the newspapers that originated from wire services or freelancers. Search results return the headline, date of publication, and publication (Chronicle or Examiner), ranked in order of date. Oddly, there's no mention on the Gate's "Search" page of how to find pre-1995 articles.

One nice feature is the ability to look back at the news of a particular day. Using the search form, just select a date, and the headlines of that day are returned to you. (You can select News, Sports, Business, etc.) This is a feature that every site should have (assuming, of course, that you don't dump your news content into a paid archive and delete it from your Web site).

Toronto Star: D

This Canadian daily has a skimpy archive, which allows you to search issues of the newspaper going back a little over one month. The search function consists of a field to enter a search string, but no "advanced search" options to narrow your search. At least it's free.

New Jersey Online: Not graded

The New Jersey Online Web site uses a different business model than the rest. To search back issues of the Star-Ledger and the Times of Trenton, users can sign up for a single day's access to the Web archives for $6.95. Corporate users can pay $150 a month for unlimited use of the newspapers' archive. I like this model, yet I think it will scare off some infrequent users who otherwise might pay 50 cents or $1 for an occasional article. I'll admit to that myself; I was too "cheap" to pay $6.95 just to try out the interface in order to review it for this column.

Some good models, if you can find them

What I noticed foremost in my newspaper Web archive surfing expedition was how few papers have successfully implemented a good system. A consumer-oriented Web archive service is a revenue stream not to be overlooked, yet many publishers are dragging their feet. To be sure, it's no easy task to get a Web archive system up and running, but the potential payoff -- especially for large metro papers -- is significant.

As noted above, there are a few good Web archive systems worth emulating. To my mind, these are a few of the key elements to creating the best newspaper archive service on the Web:

Charge fees for viewing full-text articles that are palatable to consumers. That would be a maximum of $1 per article, plus bulk discounts such as 50 articles for $20. This puts the price point in reach of the home-office crowd, educators and students. You'll make up in volume what you lose from the price breaks.

Give archive users options. Let them purchase a single story without having to take out a monthly subscription. Give them bulk discounts, or a corporate unlimited-use rate. One price does not fit all.

Let the user closely define searches. That means creating a date range field that can be as specific as "search for stories between January 13 and January 23, 1997."

Let the user define how the results are returned -- in date order (forward or reverse); by relevancy; or by frequency of searched keywords in the articles.

Let users search for free, then charge for full-text retrieval.

Return enough information from a search so that the user can determine whether or not to "buy" an article. Include headline, author, date, length in number of words, and at least the first paragraph of the story.

Steve

This column is written by Steve Outing exclusively for Editor & Publisher Interactive three days a week. News, tips, and other communications may be sent to Mr. Outing at

The views expressed in the above column do not necessarily represent the views of the Editor & Publisher company

