Uppsala University Publications (uu.se)
Developing Methods for Very-Large-Scale Searches in Proquest Historical Newspapers Collection and Infotrac The Times Digital Archive: The Case of Two Million versus Two Millions
Uppsala University, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Faculty of Languages, Department of English.
2004 (English). In: Journal of English Linguistics, Vol. 32, no. 2, pp. 124-143. Article in journal (Refereed), Published.
Abstract [en]

Historical corpora designed for linguistic research are often too small to provide statistically robust information about infrequent items. Alternative sources exist in the form of historical collections available online, but these databases may present methodological problems. Some of these problems can be circumvented, and useful results can be gleaned, including a proxy for incidence. In studies on the integration of the word 'million' into the English system of number words, based on billions of words from historical newspapers, it was possible to determine that parity was reached between obsolescent ('two millions') and Present-Day ('two million') forms in American papers around 1880 and in The Times around 1920. The explosive growth in the use of 'million' proved to start with WWII in the U.S. and in the 1950s in the U.K. This information could not be teased from a 20-million-word 'megacorpus' of commonly used diachronic and synchronic corpora designed by linguists.

Keyword [en]
historical corpus linguistics, number words, s-inflection, newspaper language, American-British variation, electronic sources
National Category
Specific Languages
URN: urn:nbn:se:uu:diva-69117
DOI: doi:10.1177/0075424204265944
OAI: oai:DiVA.org:uu-69117
DiVA: diva2:97028
Available from: 2005-03-22 Created: 2005-03-22 Last updated: 2011-01-12

Author: MacQueen, Donald
