  #21 (permalink)  
Old 04-Apr-2005, 10:52
Editor, UsingEnglish.com
 
Join Date: Nov 2002
Country: UK
Posts: 25,125
Current Location: Phnom Penh
First Language: English
Thanks: 2
Thanked 240 Times in 230 Posts
Tdol has disabled reputation
Default Re: 100 million word British National Corpus.

In that sense, though it generates error, it is the most accurate picture of English use of all.

"depends of" - 176,000
Yikes!
  #22 (permalink)  
Old 04-Apr-2005, 21:31
M56
Guest
 
Posts: n/a
Default Re: 100 million word British National Corpus.

Quote:
Originally Posted by tdol
In that sense, though it generates error, it is the most accurate picture of English use of all.

"depends of" - 176,000
Yikes!
It gets worse:

Googled:

155,000 English pages for ABSORBTION.
37,200 English pages for "conflicting feelings".
4,420 English pages for "I me myself".
1,790 English pages for "married with him".
571,000 English pages for "grammer".
759 English pages for "he teached me".

BNC:

• ABSORBTION 4
• CONFLICTING FEELINGS 6
• I me myself 0
• got married with her 0
• he teached me 0
• many photo of 0
• office equipments 0
• acquainted to 0
• was borned 0

Last edited by M56; 04-Apr-2005 at 22:30.
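A raw hit count only means something relative to the size of the collection it came from. As a rough sketch, the BNC counts above can be turned into a rate per million words, taking the corpus size from the thread title (100 million words). The function name `per_million` is made up for this illustration. The Google figures count pages rather than words, so they cannot be normalised the same way.

```python
# Corpus size taken from the thread title ("100 million word British National Corpus").
BNC_SIZE_WORDS = 100_000_000

# Raw BNC hit counts as quoted in the post above.
bnc_hits = {
    "ABSORBTION": 4,
    "CONFLICTING FEELINGS": 6,
    "he teached me": 0,
}


def per_million(hits: int, corpus_size: int = BNC_SIZE_WORDS) -> float:
    """Convert a raw hit count into occurrences per million words."""
    return hits / corpus_size * 1_000_000


for phrase, hits in bnc_hits.items():
    print(f"{phrase}: {per_million(hits):.2f} per million words")
```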
  #23 (permalink)  
Old 05-Apr-2005, 02:15
Editor, UsingEnglish.com
 
Join Date: Nov 2002
Country: UK
Posts: 25,125
Current Location: Phnom Penh
First Language: English
Thanks: 2
Thanked 240 Times in 230 Posts
Tdol has disabled reputation
Default Re: 100 million word British National Corpus.

There are so many students searching for 'grammer' that many sites include the word to bring in the students through their error.
  #24 (permalink)  
Old 05-Apr-2005, 09:09
M56
Guest
 
Posts: n/a
Default Re: 100 million word British National Corpus.

Quote:
Originally Posted by tdol
There are so many students searching for 'grammer' that many sites include the word to bring in the students through their error.
Let's hope the porn sites don't catch on to that spelling and tactic then.

  #25 (permalink)  
Old 06-Apr-2005, 02:40
Editor, UsingEnglish.com
 
Join Date: Nov 2002
Country: UK
Posts: 25,125
Current Location: Phnom Penh
First Language: English
Thanks: 2
Thanked 240 Times in 230 Posts
Tdol has disabled reputation
Default Re: 100 million word British National Corpus.

I'm sure they already have for many common mistakes.
  #26 (permalink)  
Old 01-Sep-2005, 05:30
Newbie
 
Join Date: Sep 2005
Posts: 4
First Language: English
Thanks: 0
Thanked 0 Times in 0 Posts
wordhound is on a distinguished road
Default Re: 100 million word British National Corpus.

To find the most common words in the media, why not just Google them? I find the corpus opaque; it must be taken on trust, whereas Google is open and transparent.
  #27 (permalink)  
Old 01-Sep-2005, 06:24
Newbie
 
Join Date: Sep 2005
Posts: 4
First Language: English
Thanks: 0
Thanked 0 Times in 0 Posts
wordhound is on a distinguished road
Default Re: 100 million word British National Corpus.

Sept 2005, Google word hits (in billions)

the 3.5
to 3.4
and 3.3
of 3.2
a 3.2
for 3.2
in 3.1
on 2.8
home 2.7
about 2.6
site 2.5
is 2.5
all 2.4
from 2.3
search 2.3
at 2.3
you 2.3
this 2.2
web 2.2
our 2.2
more 2.2
new 2.1
your 2.1


more to come
  #28 (permalink)  
Old 01-Sep-2005, 10:55
M56
Guest
 
Posts: n/a
Default Re: 100 million word British National Corpus.

Quote:
Originally Posted by wordhound
To find the most common words in the media, why not just Google them? I find the corpus opaque; it must be taken on trust, whereas Google is open and transparent.
How would I do that in Google? I mean, how could I separate all other uses from uses in the media? What would my search term be?
  #29 (permalink)  
Old 02-Sep-2005, 15:50
Newbie
 
Join Date: Sep 2005
Posts: 4
First Language: English
Thanks: 0
Thanked 0 Times in 0 Posts
wordhound is on a distinguished road
Default Re: 100 million word British National Corpus.

When I took a TEFL/TESL course last year I got the list of the most common words in rank order. The same handout also said the top 8 words make up 33% of spoken English. I thought later, how do they know without some numbers attached? Most common means most used, so counts can be established, but this is never done. What we get are statistical analyses like joint frequency. This corpus is taken on faith; we trust someone to do a good job. Where can we verify?

Most lists get the first eight words right, then after that it varies. How about some consistency? Google is consistent.

I suspect web companies use this system already but the ESL world has declined to notice it.
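For what it is worth, a claim like "the top 8 words are 33% of the language" can be checked against any text you have in hand by counting tokens. A minimal sketch, assuming a crude letters-only tokenizer; the function name `top_n_coverage` and the sample sentence are made up for illustration, and this is not how the BNC or Google produce their figures.

```python
import re
from collections import Counter


def top_n_coverage(text: str, n: int = 8):
    """Return the n most frequent words and the share of all tokens they cover."""
    # Lowercase and treat runs of letters/apostrophes as tokens (deliberately crude).
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    top = counts.most_common(n)
    covered = sum(count for _, count in top)
    share = covered / len(tokens) if tokens else 0.0
    return top, share


# Hypothetical sample text; substitute any corpus you can read into a string.
sample = "the cat sat on the mat and the dog sat on the rug"
top, share = top_n_coverage(sample, n=2)
print(top)
print(f"top 2 words cover {share:.0%} of tokens")
```

The counts are fully verifiable for the text you feed in, but the ranking still depends on which texts are chosen, which is the same sampling question that applies to any corpus.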
  #30 (permalink)  
Old 05-Sep-2005, 10:17
Editor, UsingEnglish.com
 
Join Date: Nov 2002
Country: UK
Posts: 25,125
Current Location: Phnom Penh
First Language: English
Thanks: 2
Thanked 240 Times in 230 Posts
Tdol has disabled reputation
Default Re: 100 million word British National Corpus.

I use both Google and the BNC for language concordancing. No matter how big, any database will naturally have a certain degree of inaccuracy or bias built in, based on the choice of texts and sources. However, the BNC, at sites like http://view.byu.edu, does allow us to use more language tools than Google does, for the moment. Any list of the most popular words is going to start getting suspect as it moves down the list.
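To give an idea of what concordancing means in practice, here is a minimal keyword-in-context (KWIC) sketch over a plain string. The function name `kwic`, the `width` parameter, and the sample sentence are assumptions for illustration; this does not describe how view.byu.edu or Google work internally.

```python
import re


def kwic(text: str, keyword: str, width: int = 30):
    """Return one line per match, showing up to `width` characters of context each side."""
    lines = []
    # re.escape treats the keyword literally; \b restricts matches to whole words.
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so the keyword lines up in a column.
        lines.append(f"{left:>{width}} [{m.group(0)}] {right}")
    return lines


# Hypothetical sample text.
sample = "It depends on the weather. Everything depends on you."
for line in kwic(sample, "depends on", width=15):
    print(line)
```

Lining the keyword up in a column makes it easy to scan what comes before and after it, for example to compare "depends on" with "depends of".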
All times are GMT.