My Word!

Why google is in the dictionary but AJAX isn't.

Podcast is in. PL/1 is out. Mashup is on the short list. But pity poor eight-track: It came and went without ever making it into The Merriam-Webster English Dictionary.

The ephemeral nature of high-tech terminology poses a challenge to dictionary editors, who tend to wait years -- or decades -- to adopt new words. Because technology moves so quickly, the slow pace at which mainstream dictionaries add high-tech terms can make the editors look out of step.

Not so, says Peter Sokolowski, editor at large at Merriam-Webster Inc., which offers both a print edition of its dictionary and free access to its contents at Merriam-Webster.com. Editors track new words from the point of first citation. Most terms don't make the cut simply because they disappear too quickly. "We don't want to have words that come and go," he says.

The word that was adopted most quickly by Merriam-Webster was Google as a verb. It entered the dictionary in 2006, five years after the first citation was noted. "In lexicographical terms, that's light speed," Sokolowski says.

In contrast, the editors had been monitoring malware since 1990, but it entered the dictionary only this year.

There's good reason to hold back, says Steve Kleinedler, supervising editor for The American Heritage Dictionary of the English Language at Houghton Mifflin Harcourt Publishing Co. "It is not the editors' intent to capture every word that comes out of any given industry," he says.

In the high-tech field, where terms such as those in Wired magazine's "Jargon Watch" may rise and fall in a matter of weeks, time separates the wheat from the chaff. "Leaving out the whimsy and the flash in the pan leaves room for essential vocabulary," says Kleinedler.

Jesse Sheidlower, editor at large for The Oxford English Dictionary, says the OED is less conservative than Merriam-Webster in how quickly it adopts new words, adding 200 to 250 every quarter. Blog, for example, entered the dictionary in 2003, just four years after the first occurrence was noted.

Where Is It Used?

Though the earliest citations for many new high-tech words come from the blogosphere and other online venues, those media carry less weight than do print publications when it comes to deciding which words make it into the dictionary.

Words are accepted when the editors observe a critical mass of citations in "serious" publications, Sokolowski says. That means (with a few exceptions) major print publications. Online-only publications with "carefully edited prose," such as Slate and Salon, are also influential. But blogs are not. "Most blogs are just not carefully edited," he says. "The gold standard has always been the printed page."

Because of the reliance on print citations, the world's biggest search engine also plays a limited role in research at Merriam-Webster. Sokolowski says he uses Google and other online search engines only informally, to begin investigating new words. "Most of the citations we take come from print sources or Nexis [a searchable archive of print and online publications] because that really represents the press," he says.

The OED's Sheidlower says citations in blogs and other online sources do influence his decision-making process, and he acknowledges that some terms appear chiefly in online media. But he, too, wants to see words spill over into print publications before adopting them. "We would be less likely [to include words] that are only used in online sources," he says. "But it could happen."

Although widespread usage is a requirement for including new technology terms, editorial judgment also plays a role in determining what goes in and what stays out. For example, Blu-ray the technology won the next-generation DVD format battle, but Blu-ray the word won't make it into the OED. The dictionary passed on the names of both competing DVD formats, HD-DVD and Blu-ray. "You want to make sure you're not putting in something that's going to go away in two months when the technology goes one way or another," says Sheidlower. "Neither of those products is that widespread."

Nonetheless, both terms are popular enough for Merriam-Webster's editors to include them in the next edition of its dictionary.

Seemingly outdated terms may also be added in some cases. For example, gopher (in the sense of a network protocol) was added to the OED several years ago, but it is just now on the short list for inclusion in an upcoming version of the Merriam-Webster dictionary. Sheidlower says he wouldn't add the term if it were under consideration today, but he notes that the fact that a technical term is obsolete isn't sufficient reason not to include it -- or to remove it. "For us to put something in is a statement of faith that it is important," Sheidlower says.

The more technical the term, the less likely it is to make it into the dictionary. While Wi-Fi is in, the protocols it uses, such as 802.11b and 802.11g, are not. Another term, AJAX, has yet to prove its permanence. "It's a common technology right now, but it is by no means the only technology for dynamically updating Web sites," says Sheidlower.

He also sees it as more of a technical term. "The man on the street probably hasn't heard of AJAX, even if they've used Google Maps," which uses the technology, he adds.

Another sticky area is known as the "Xerox problem," which involves trademarked names that become synonymous with a function, as Xerox has with photocopying. In the high-tech realm, Google has come into common use as a verb. That's a problem for Kleinedler. "We cannot legally add Google [to the dictionary] as a verb. Trademark law is one area where lexicographers' hands are tied," he says, adding that publishers are legally obligated to define trademarks as trademarks (rather than, say, as a verb).

As a result, American Heritage uses roundabout language in its definition of Google, as cited on Dictionary.com: "A trademark used for an Internet search engine. This trademark often occurs in print as a verb, sometimes in lowercase."

But not all publishers are so cautious. Merriam-Webster defines google as "to use the Google search engine to obtain information about (as a person) on the World Wide Web," giving the etymology as "Google, trademark for a search engine."

Losing Favor

For a word, getting into the dictionary is the lexicographical equivalent of achieving tenure in academia. Once a term gets in, editors are loath to remove it. Editors at Merriam-Webster and American Heritage do occasionally retire words, but those that make it into the OED stay forever and become part of the historic record of the English language, says Sheidlower. That's one reason why he's selective when it comes to high-tech terms. "Other dictionaries can put something in, make a marketing splash and take it out two years later," he says. "We can't do that."

Competitors are also very conservative when it comes to removing words. Merriam-Webster drops words only during major revisions, which occur once a decade, and those that do get dropped tend to be real antiques. Some of the terms that have gotten the ax recently include microreader (first cited in 1949) and record changer (first cited in 1931). PL/1, the name of a programming language first cited in 1973 and a relatively modern term by that measure, was recently removed from both the Merriam-Webster and American Heritage dictionaries.

Taking out words carries real risks, says Kleinedler. In 1998, during a review of obsolete computer science terminology, the word chad faced the chopping block. Then Kleinedler recalled voting with a punch card machine in the previous election. "We decided to hold onto it [for the dictionary's fourth edition, published in 2000]. And lo and behold, the election of 2000 comes along, and suddenly chad is everywhere."

American Heritage has removed a handful of technology terms, including data diddling. "Anything you'd need to know about that term can be adequately defined by diddle itself," says Kleinedler.

Although dictionaries consistently add new high-tech terms as they go mainstream, the editors don't seem too worried about keeping up with either the fast-changing computer technology landscape or high-tech readers. "They're not following us; we're following them. It is they who inform us what needs to go in," says Kleinedler.

For the typical technology enthusiast, the dictionary serves a more important function than providing definitions of high-tech terms that they already know, he says. "[They] are much more likely to use a dictionary to find out how to spell mischievous or figure out what the word olivine or slatternly means, rather than looking up some term that they themselves are on the front lines of establishing in the language."

This is a print version of a story that originally ran on Computerworld.com.

Copyright © 2008 IDG Communications, Inc.
