In language discussions, search engine results are often quoted as evidence, either to show whether a form is in use or to compare forms and see which is more common. GoogleBlogoscoped has run 27,000 words from a dictionary through Google to measure their popularity; the full results of the study can be downloaded here. The table below shows the top thirty words from the 2006 and 2003 surveys, together with the top thirty words from the British National Corpus (BNC).
The method used in the Google study does not count multiple occurrences in a single page, so a single copyright message at the foot of a page counts for as much as all the occurrences of "the" on that page put together. This accounts for the presence of copyright, contact, site, home, etc. However, the other entries suggest that the contents of the Google databases, and therefore those of any other reputable search engine, are likely to give a fairly accurate reflection of usage for terms that are not directly related to the language of webpage layout. As a rough and ready check, it seems that search engines can be used as basic concordancing tools.
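The difference between counting pages and counting occurrences can be illustrated with a small sketch. This is not the method used in the Google study, only a toy comparison; the three sample pages and the function names `page_counts` and `occurrence_counts` are invented for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase words, ignoring digits and punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def page_counts(pages):
    """Count how many pages contain each word (at most one hit per page)."""
    counts = Counter()
    for page in pages:
        counts.update(set(tokenize(page)))
    return counts


def occurrence_counts(pages):
    """Count every occurrence of each word, as a corpus count would."""
    counts = Counter()
    for page in pages:
        counts.update(tokenize(page))
    return counts


# Hypothetical pages, each ending in a copyright footer.
pages = [
    "The cat sat on the mat. Copyright 2006.",
    "The dog chased the cat round the garden. Copyright 2006.",
    "Contact us about the site. Copyright 2006.",
]

by_page = page_counts(pages)
by_occurrence = occurrence_counts(pages)

print("pages:      ", by_page["the"], by_page["copyright"])
print("occurrences:", by_occurrence["the"], by_occurrence["copyright"])
```

Counted by page, "the" and "copyright" tie in this sample, because each appears on all three pages. Counted by occurrence, "the" is twice as frequent. That is the effect that pushes layout words up a page-based ranking.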
We have just updated the Phrasal Verb section of the site, and it now automatically creates quizzes based on the definitions of the verbs. One of the quiz types uses the particles and therefore generates a list of them. We have over 1,400 phrasal verbs, which is a fairly representative sample, and it is clear that a small number of particles dominate: up, out, on, off, in and down account for about 65% of the total. Here's the full particle list: