  1. #11
    Sanmayce is offline Junior Member
    • Member Info
      • Member Type:
      • Student or Learner
      • Native Language:
      • Bulgarian
      • Home Country:
      • Bulgaria
      • Current Location:
      • Bulgaria
    Join Date
    Jan 2011
    Posts
    36

    Re: Google Ngram Viewer

    Thanks, birdeen's call,

    Regarding 1):
    My style of writing is both brutal and nearly incorrect, due to the simple fact that I have never formally studied English; I am self-taught, that is, I learn it on the fly.
    It is useful for me when my mistakes are pointed out, thanks.

    Regarding 2):
    Yes, that is true. I will make an effort to make the package usable from the desktop, so that users are not forced to open the command prompt themselves.

    Regarding 3):
    Openness is one of my strong qualities, but I intend to use, as always, my own resources. Currently the package is not worth uploading; it is one day old - I started it yesterday.

    Regarding 4):
    That is right, I did feel the gap, and it needs to be explained for sure, but I wanted to give an overview first - it is hard for me to explain the idea (let alone the specifications) while the need itself is unclear or unexplained. I firmly believe that the ability to arrange words properly is the hard core of English, especially for guys like me who don't want to read textbooks but prefer to learn by reading texts on a daily basis.

    Regards


  2. #12
    birdeen's call is offline VIP Member
    • Member Info
      • Member Type:
      • Student or Learner
      • Native Language:
      • Polish
      • Home Country:
      • Poland
      • Current Location:
      • Poland
    Join Date
    Jul 2010
    Posts
    5,099

    Re: Google Ngram Viewer

    Quote Originally Posted by Sanmayce View Post
    My style of writing is both brutal and nearly incorrect, due to the simple fact that I have never formally studied English; I am self-taught, that is, I learn it on the fly.
    Your English is very good!

  3. #13
    Sanmayce is offline Junior Member

    Re: Google Ngram Viewer

    Enter the 'Graphein' package revision 1

    Mini-guide in HTML format, 18 screenshots: here
    Mini-guide in PDF format, 4MB: here
    The package itself in ZIP format, 757MB: here

    Two modes of operation are available:
    - Fully automatic mode: by running 'Graphein_TXT.bat' you can compare your text files (with the .TXT extension, placed in the _TXT-TREE folder) against the US English Google Books 4grams from 2009-07-15;
    - Semi-automatic mode (keyboard input is needed): by running 'GRAPHEIN_keyboard.bat' you can search for your patterns (4grams) in the US English Google Books 4grams from 2009-07-15.
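What the fully automatic comparison amounts to can be shown with a minimal sketch. This is not Graphein's code: the function names and the tiny corpus set below are hypothetical, and the tokenization rule (lowercase letters and apostrophes only) is an assumption.

```python
import re

def four_grams(text):
    """Return the underscore-joined 4grams of a text, in order of appearance."""
    # Assumed tokenization: lowercase words made of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return ["_".join(words[i:i + 4]) for i in range(len(words) - 3)]

def unfamiliar(text, corpus):
    """Return the 4grams of `text` that are absent from `corpus` (a set)."""
    return [gram for gram in four_grams(text) if gram not in corpus]

# Toy stand-in for the Google Books 4gram list (hypothetical data).
corpus = {"a_hard_time_getting", "hard_time_getting_a"}
print(unfamiliar("A hard time getting a handle", corpus))
```

Every 4gram of the input that the corpus does not contain is reported as unfamiliar, which is the kind of list the comparison mode is meant to produce.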

    Pluses:
    - Desktop launching;
    - In this initial revision the console tools can be executed either from command prompt or desktop via 2 icons;
    - Open-source.
    Minuses:
    - In fully automatic mode, files with more than a few thousand words are processed awfully slowly;
    - Unfortunately, the US English Google Books 4grams from 2009-07-15 made me lose my momentum: the corpus is not (yet) as rich as I expected;
    - This revision gives only an outline (it is mainly illustrative); it has no future unless some serious approach is applied (mixing Graffith & Leprechaun is a nifty one);
    - It took me 3 days to realize the [f]utility of the current (100% brute-force) way of comparing. Still, processing in chunks is THE main way to go because of 32-bit address limitations. I dream of 5x100+ million phrases inserted in a million b-trees, which demands 64-bit code; sadly, I am not ready to walk this way yet.
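The chunk-by-chunk idea can be sketched as follows. This is only an illustration under assumptions, not the package's implementation: the helper names and the chunk size are hypothetical, and a real run would read the corpus lines from files rather than from a list.

```python
from itertools import islice

def chunks(lines, size):
    """Yield successive lists of at most `size` items from an iterable."""
    iterator = iter(lines)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk

def found_in_corpus(wanted, corpus_lines, chunk_size=2):
    """Return the subset of `wanted` present in the corpus, one chunk at a time."""
    remaining = set(wanted)
    found = set()
    for chunk in chunks(corpus_lines, chunk_size):
        hits = remaining & set(chunk)  # only this chunk is held in memory
        found |= hits
        remaining -= hits
        if not remaining:              # stop early once everything is found
            break
    return found

corpus_lines = ["a_hard_time_getting", "getting_a_handle_on", "zero_chance_of_getting"]
print(found_in_corpus({"zero_chance_of_getting", "no_such_phrase_here"}, corpus_lines))
```

Only one chunk of the corpus is in memory at any moment, so the memory needed depends on the chunk size and not on the size of the whole corpus.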

    Anyway enjoy!

    P.S.
    If anyone finds it useful, please feel free to make mirrors at once; I cannot guarantee hosting the .ZIP file for more than a month or so.

  4. #14
    Sanmayce is offline Junior Member

    Re: Google Ngram Viewer

    Maybe the easiest way for an idea to be understood is to state its ultimate goal, in an 'I have a dream ...' style.

    In my case: I dream of an instant English-phrase sidekick (suggester/hinter) tool.
    The goal is for everybody to be able to write, in real time, as correctly as possible.
    Nowadays, when text is typed into a search-engine field, some pale text suggestions appear; that is what I am talking about.

    Imagine this: a kid playing with words and wanting to construct a sentence using two previously chosen words, but not knowing any grammar.
    The situation is similar to forcing an inexperienced person to ride a bicycle without first riding a tricycle.

    The point is that even when one is extensively trained, it is never enough - the tricycle remains forever as a reminder of everlasting ignorance.
    Or, as one of our renowned translators of Jack London has said: 'The difference between the better translator and the good one is in using a dictionary.' Contrary to what one might expect, the former uses it, whereas the latter is too 'versed' (sick with vanity) and needs it not. Well said.

    Of course, for a corporation or for skillful programmers it is not a big deal to achieve such functionality.
    The problem lies not so much in the algorithms applied or the programming language used but in the scarcity of the data that feeds the suggester.

    Currently only Google (I am not a fan of theirs) has shared its ngram datasets, made, as far as I know, from 4% of all printed books, or 5+ million books.
    In my view these huge numbers carry little weight, because practice shows a corpus that is (still) poor and far from what is needed; to talk of comprehensiveness here is nonsense (look at the 210 unfamiliar 4grams encountered in 'The Little Match Girl').
    So my dream needs a NEARLY-UNABRIDGED corpus of English (British, American, Australian ...) 4grams as a minimum, preferably 9grams (enabling whole sentences to appear under your fingers).
    I know, I know, it sounds like a long shot, but such greediness fits well here; after all, I am speaking of billions of phrases, with no room for profanity, sloppy thinking, or half-done encoding.
    In my vocabulary 'greedy' and 'uncompromising' are synonyms and I use them interchangeably.

    In the next paragraphs I try to give a rough picture of one static suggestion for the word 'getting' by listing the PROPER three-word collocations on its left and right sides.
    To get the suggestions I ran Graphein in search for:
    *_getting|
    also for:
    getting_*
    The first search yields 17,425 hits, while the second search yields 18,773 hits.
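The two searches can be imitated with a small sketch. It rests on assumptions that the post does not confirm: that '*' stands for any run of characters, that a trailing '|' anchors the pattern at the end of the phrase, and that matching always starts at the beginning of the phrase. The function names are hypothetical and this is not Graphein's matcher.

```python
import re

def compile_pattern(pattern):
    """Turn a pattern such as '*_getting|' or 'getting_*' into a regex."""
    anchored_end = pattern.endswith("|")        # assumed meaning of '|'
    body = pattern[:-1] if anchored_end else pattern
    parts = re.split(r"(\*)", body)             # keep '*' as separate parts
    regex = "".join(".*" if part == "*" else re.escape(part) for part in parts)
    if anchored_end:
        regex += r"\Z"
    return re.compile(regex)

def search(pattern, phrases):
    """Return the phrases matching `pattern` from their first character on."""
    compiled = compile_pattern(pattern)
    return [phrase for phrase in phrases if compiled.match(phrase)]

phrases = ["a_hard_time_getting", "getting_a_handle_on", "not_getting_a_grip"]
print(search("*_getting|", phrases))
print(search("getting_*", phrases))
```

Under these assumptions the first pattern keeps only phrases that end with 'getting' and the second only phrases that begin with it, which matches the shape of the two result lists.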

    The results for '*_getting|':
    a_baby_is_getting
    a_baby_without_getting
    a_bad_time_getting
    a_balance_between_getting
    ...
    a_habit_of_getting
    a_half_in_getting
    a_hand_at_getting
    a_hand_in_getting
    a_handicap_in_getting
    a_hard_job_getting
    a_hard_time_getting
    a_harder_time_getting
    ...
    as_things_were_getting
    as_time_was_getting
    as_to_avoid_getting
    as_to_ensure_getting
    as_to_his_getting
    as_to_my_getting
    as_to_my_getting
    as_to_our_getting
    as_to_the_getting
    ...
    assist_clients_in_getting
    assist_her_in_getting
    assist_him_in_getting
    ...
    be_accustomed_to_getting
    be_achieved_by_getting
    be_active_in_getting
    be_adopted_for_getting
    be_afraid_of_getting
    be_aided_in_getting
    be_aimed_at_getting
    ...
    expressed_concern_about_getting
    expressed_interest_in_getting
    extraordinary_power_of_getting
    extreme_difficulty_in_getting
    extreme_difficulty_of_getting
    extreme_passion_for_getting
    extremely_desirous_of_getting
    extremely_effective_in_getting
    extremely_fortunate_in_getting
    extremely_helpful_in_getting
    extremely_important_in_getting
    extremely_interested_in_getting
    extremely_useful_in_getting
    ...
    within_despair_of_getting
    within_minutes_of_getting
    ...
    your_work_is_getting
    yourself_began_by_getting
    yourself_for_not_getting
    zero_chance_of_getting


    The results for 'getting_*':
    ...
    getting_a_better_chance
    getting_a_better_class
    getting_a_better_deal
    getting_a_better_education
    getting_a_better_feel
    getting_a_better_grade
    getting_a_better_grasp
    getting_a_better_grip
    ...
    getting_a_hand_on
    getting_a_handful_of
    getting_a_handle_on
    getting_a_handle_on
    getting_a_handle_on
    getting_a_handle_on
    getting_a_hard_on
    getting_a_hard_time
    ...
    getting_along_very_fast
    getting_along_very_nicely
    getting_along_very_slowly
    getting_along_very_well
    ...
    getting_as_much_fun
    getting_as_much_good
    getting_as_much_help
    getting_as_much_in
    getting_as_much_information
    ...
    getting_started_on_an
    getting_started_on_her
    getting_started_on_his
    getting_started_on_it
    getting_started_on_its
    getting_started_on_my
    getting_started_on_our
    ...
    getting_yourself_ready_for
    getting_yourself_ready_to
    getting_yourself_talked_about
    getting_yourself_to_the
    getting_yourself_very_wet
    getting_yourself_worked_up
    getting_youth_by_years


    Having all these suggestions is not enough; here comes the last but not least part: visualization.
    Well, I cannot show anything yet, but the intended outcome is something like:
    Step 1: you write 'getting'
    two floating pop-up windows appear, each filled with the suggestions above.
    Step 2: you write 'in getting'
    the useful part here is that the entered 2gram is to be searched not in 4grams but in, say, 5grams (3 words on both sides, again), and so on until a suitable sentence (or 9gram) appears.
    For example, one suggestion is: 'helpful in getting information'.
    Step 3: you have obtained a stable/proper 4gram containing the needed words on the left and right, and you finish the sentence by yourself.
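The steps above can be sketched as a lookup that returns the left and right context of whatever has been typed so far. This is a minimal illustration, assuming underscore-joined phrases as in the result lists; the function name is hypothetical and the three phrases are taken from the examples in this post.

```python
def suggest(typed, phrases):
    """Return (left, right) contexts for every phrase that contains `typed`."""
    key = typed.lower().split()
    if not key:
        return []
    contexts = []
    for phrase in phrases:
        words = phrase.split("_")
        # Slide over the phrase looking for the typed words as a contiguous run.
        for i in range(len(words) - len(key) + 1):
            if words[i:i + len(key)] == key:
                left = " ".join(words[:i])
                right = " ".join(words[i + len(key):])
                contexts.append((left, right))
    return contexts

phrases = [
    "extremely_helpful_in_getting",
    "helpful_in_getting_information",
    "getting_a_handle_on",
]
print(suggest("getting", phrases))     # Step 1: one word typed
print(suggest("in getting", phrases))  # Step 2: two words typed
```

Each typed word narrows the candidates, and the left and right parts of each hit are what the two pop-up windows would display.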

    The magical part is that (even if you are not sure what you are looking for) the sheer abundance of phrases will guide/supply/form your thought dynamically, i.e. writing and thinking will (hopefully) happen simultaneously.
    I recall a funny scene from the movie 'Captain Apache' in which Lee Van Cleef is asked 'What are you searching for?' and his answer is something like: 'Nothing. It is amazing how many things pop up when you just search.'

    Let there be greediness ...
    The choice of style would be try-once-give-me-more, i.e. addictive/indispensable, as in the next example:
    getting_a_handle_on [Google books]
    getting_a_handle_on [Australian magazines & newspapers]
    getting_a_handle_on [Agatha Christie - anthology]
    getting_a_handle_on [Arthur Conan Doyle - 'Sherlock Holmes' collection]
    getting_a_handle_on [Irish legends]
    getting_a_handle_on [Fairy tales translated into English]

    Obviously, relying on one corpus alone is not remotely as useful as using hundreds, why not thousands, of corpora.
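Tagging each hit with the corpora it comes from, as in the example above, can be sketched with a simple index. The corpus names are borrowed from the example, but their contents here are invented toy data and the function name is hypothetical.

```python
from collections import defaultdict

def build_index(corpora):
    """Map each phrase to the sorted names of the corpora that contain it."""
    index = defaultdict(set)
    for name, phrases in corpora.items():
        for phrase in phrases:
            index[phrase].add(name)
    return {phrase: sorted(names) for phrase, names in index.items()}

# Toy contents (hypothetical); real corpora would hold millions of phrases.
corpora = {
    "Google books": ["getting_a_handle_on", "a_hard_time_getting"],
    "Irish legends": ["getting_a_handle_on"],
}
index = build_index(corpora)
for source in index["getting_a_handle_on"]:
    print("getting_a_handle_on [%s]" % source)
```

A phrase found in several corpora then appears once per source, which is the layout shown in the example.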

    It must feature smooth scrolling and an animated appearance, but this is up to graphic-interface designers and gadget manufacturers.
    Yes, the futuristic way of writing is no longer left-to-right or right-to-left but from the middle to the edges, he-he.

    I believe/KNOW the time (for such an assistant) is nearer than most of us think.
