Corpus Linguistics

Links

American National Corpus

The American National Corpus (ANC) project is creating a massive electronic collection of American English, including texts of all genres and transcripts of spoken data produced from 1990 onward. The ANC will provide the most comprehensive picture of American English ever created, and will serve as a resource for education, linguistic and lexicographic research, and technology development. When completed, the ANC will contain a core corpus of at least 100 million words, comparable across genres to the British National Corpus (BNC). Beyond this, the corpus will include an additional component of potentially several hundred million words, chosen to provide both the broadest and largest selection of texts possible.
Added: Jun 29 2006

Brown Corpus Manual

This Standard Corpus of Present-Day American English consists of 1,014,312 words of running text of edited English prose printed in the United States during the calendar year 1961. So far as it has been possible to determine, the writers were native speakers of American English. Although all of the material first appeared in print in the year 1961, some of it was undoubtedly written earlier. However, no material known to be a second edition or reprint of earlier text has been included.
Added: Apr 19 2004

Centre for English Corpus Linguistics

The Centre for English Corpus Linguistics (CECL) at the Université catholique de Louvain (UCL) is a specialist research centre with two core areas of research activity: computer learner corpus research and cross-linguistic research.
Added: May 29 2005

Corpus.Byu.Edu

A collection of corpora created by Mark Davies, Professor of Corpus Linguistics at Brigham Young University.
Added: Jun 22 2007

Dialogue Diversity Corpus

DDC is intended to facilitate all varieties of research that require dialogues from multiple situations as data. For studies of dialogue dynamics, situational effects in dialogue, dialogue coherence, dialogue genre comparison, studies of role and status in dialogue and many other topics, very diverse dialogue data must be brought to bear on single studies.
Added: May 24 2004

FrameNet

The Berkeley FrameNet project is a lexicon-building effort in which we (1) study words; (2) describe the frames or conceptual structures which underlie these; (3) examine sentences, using a very large corpus of contemporary English that contains these words; and (4) record the ways in which information from the associated frames is expressed in these sentences.
Added: Jan 30 2005

Michigan Corpus of Academic Spoken English

Welcome to the on-line, searchable part of our collection of transcripts of academic speech events recorded at the University of Michigan. There are currently 152 transcripts (totaling 1,848,364 words) available at this site.
Added: Jun 06 2004

Natural Language and Computational Linguistics

The Natural Language and Computational Linguistics (NLCL) group (part of the Department of Informatics at the University of Sussex) is one of the largest groups in the UK of researchers focusing on statistical and corpus-based approaches to natural language processing.
Added: Jun 18 2004

Online Concordancers

A concordance gives a list of several words, phrases, or distributed structures along with immediate contexts, from a corpus or other collection of texts assembled for language study.
Added: Jun 16 2006
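
As a rough illustration of what a concordancer produces, the following is a minimal keyword-in-context (KWIC) sketch in Python. It is not the method used by any of the tools linked here; the function name `kwic`, the `width` parameter, and the sample sentence are all assumptions made for this example. It matches whole words only, ignores case, and measures context in characters rather than words.

```python
import re

def kwic(text, keyword, width=30):
    """Return (left context, match, right context) for each whole-word hit.

    Hypothetical helper for illustration: context is measured in characters,
    and matching is case-insensitive.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    hits = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        hits.append((left, m.group(0), right))
    return hits

if __name__ == "__main__":
    sample = "The cat sat on the mat. A cat is a small animal."
    for left, word, right in kwic(sample, "cat", width=10):
        # Right-align the left context so the keywords line up in a column.
        print(f"{left:>10} [{word}] {right}")
```

Real concordancers typically add sorting by left or right context, word-based windows, and searches over tagged or lemmatised text, none of which this sketch attempts.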

Variation in English words and phrases

This website allows you to quickly and easily search for a wide range of words and phrases of English in the 100 million word British National Corpus. You can search for words and phrases by exact word or phrase, wildcard or part of speech, or combinations of these.
Added: Mar 03 2005

What is Computational Linguistics?

Computational linguistics (CL) is a discipline between linguistics and computer science which is concerned with the computational aspects of the human language faculty. It belongs to the cognitive sciences and overlaps with the field of artificial intelligence (AI), a branch of computer science aiming at computational models of human cognition. Computational linguistics has applied and theoretical components.
Added: Jan 11 2004

Word Frequency

Word frequency lists and dictionary from the Corpus of Contemporary American English
Added: Feb 25 2011

WordSmith Tools

WordSmith Tools is lexical analysis software for the PC. It has been published by Oxford University Press since 1996 and is now at version 4.0.
Added: Dec 16 2004