
Corpus linguistics

Corpus linguistics is the study of language as expressed in samples (corpora) of "real world" text. The approach runs counter to Noam Chomsky's view that real language is riddled with performance-related errors and therefore requires careful analysis of small speech samples obtained in a highly controlled laboratory setting. Corpus linguistics does away with Chomsky's competence/performance split, holding that language can be analysed reliably only if the researcher does not interfere.

In some areas corpus linguistics overlaps with computational linguistics, as the latter moves towards language-processing applications. Such applications must deal with real input data, for which descriptions based on a linguist's intuition are not usually helpful.

The field was established in 1967, when Henry Kucera and Nelson Francis published their classic work Computational Analysis of Present-Day American English. The book was based on the Brown Corpus, a carefully compiled selection of then-current American English totalling about a million words drawn from a wide variety of sources. Kucera and Francis subjected the corpus to a variety of computational analyses, from which they compiled a rich and variegated opus combining elements of linguistics, psychology, statistics, and sociology.
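As a rough illustration of what a computational analysis of a corpus can involve, the sketch below counts word frequencies in a short text sample. It is not the method used by Kucera and Francis; the function name `word_frequencies`, the sample sentence, and the simple tokenisation rule (lowercase runs of letters and apostrophes) are all assumptions made for this example.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count word tokens in a text sample (hypothetical helper).

    Assumption: a token is any run of letters or apostrophes,
    compared case-insensitively. Real corpus work uses far more
    careful tokenisation.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


# A made-up sample standing in for a corpus.
sample = "The cat sat on the mat. The dog sat too."
freqs = word_frequencies(sample)

# Show the two most frequent word forms.
print(freqs.most_common(2))  # [('the', 3), ('sat', 2)]
```

Applied to a corpus of about a million words, a count of this kind yields the sort of frequency list on which further statistical analysis can be built.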

Shortly thereafter the Boston publisher Houghton Mifflin approached Kucera to supply a million-word, three-line citation base for its new American Heritage Dictionary (AHD), the first dictionary to be compiled using corpus linguistics. The AHD took the innovative step of combining prescriptive elements (how language should be used) with descriptive information (how it actually is used).

Other publishers followed suit. The COBUILD dictionaries of the British publisher Collins, designed for users learning English as a foreign language, were also compiled using corpus linguistics.


This article is licensed under the GNU Free Documentation License. It uses material from the Wikipedia article "Corpus linguistics".

 

Copyright © 1999-2003 Knowledgerush.com. All rights reserved.