  1. VIP Member
    Retired English Teacher
    • Member Info
      • Native Language:
      • British English
      • Home Country:
      • Europe
      • Current Location:
      • Czech Republic

    • Join Date: Jul 2015
    • Posts: 18,412

    New corpora available

    Probably not the right forum, but at least people will see it here.

    I received this today:

    We are pleased to announce two new corpora from the BYU suite of corpora:

    -- The TV Corpus: 325 million words in 75,000 very informal TV episodes (e.g. comedies and dramas) from 1950-2018

    -- The Movie Corpus: 200 million words in 25,000 movies from 1930-2018

    Brief overview (PDF) of the new TV and Movie corpora

    These corpora serve as a great resource to look at very informal language -- at least as well as corpora of actual spoken English. Users can also examine frequency and usage over time (1930-2018 for movies, 1950-2018 for TV shows), as well as compare between different dialects of English (for example British vs American English).

    Users can also quickly and easily create, search, and build keyword lists from their own "Virtual Corpora", such as (for TV) all episodes of Dr Who, Star Trek Next Generation, The Office, or The Good Place, or (for movies) all James Bond movies, or all American sci-fi movies from 1990-present that have a certain movie rating or IMDB score and a given keyword in the IMDB plot summary.

    Finally, all 75,000 episodes from TV shows and all 25,000 movies are linked directly to their IMDB entry and OpenSubtitles page. As a result, if you find some interesting data in the corpus and want to see the original subtitles file or find out more about the TV show or movie (actors, rating, extended plot summary, etc), it's just one click away.

    In summary, we believe that the new TV Corpus and Movie Corpus are the largest, most searchable corpora of very informal English, and we hope that they are of value to you in your research and teaching.

    (Also, we're glad to announce that "one click" comparisons in the BYU corpora are back, which allows you to seamlessly move between and compare results in the different BYU corpora -- TV, Movies, Soap Operas, iWeb, COCA, COHA, GloWbE, BYU-BNC, NOW, Wikipedia, and others).

    Mark Davies

    BYU Corpora

    Mark Davies
    Professor of Linguistics / Brigham Young University
    ** Corpus design and use // Linguistic databases **
    ** Historical linguistics // Language variation **
    ** English, Spanish, and Portuguese **

    Last edited by Piscean; 08-Feb-2019 at 21:11. Reason: fixed title
    Typoman - writer of rongs

  2. teechar
    English Teacher
    • Member Info
      • Native Language:
      • English
      • Home Country:
      • Iraq
      • Current Location:
      • Iraq

    • Join Date: Feb 2015
    • Posts: 11,196

    Re: New corpora available

    Thanks, Piscean. I have pinned this thread.
