A language corpus is, as you described, basically a very large collection of texts or speech transcripts in a given language. The Collins WordbanksOnline English corpus, for example, contains 56 million words.
The greatest value of a corpus is that it lets you check whether a sentence is grammatical, whether one expression is used more often than another (e.g. "all of the above" vs "everything of the above"), and in which semantic contexts a sentence is usually found.
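To make the frequency comparison concrete, here is a minimal sketch in Python. It is not how any real corpus tool works internally; `toy_corpus` and `phrase_count` are hypothetical names made up for illustration, and the three sentences stand in for the millions of words a real corpus would contain.

```python
import re

def phrase_count(texts, phrase):
    """Count case-insensitive, whole-word occurrences of `phrase` across `texts`."""
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
    return sum(len(pattern.findall(text)) for text in texts)

# Hypothetical toy corpus: a stand-in for a real collection of texts.
toy_corpus = [
    "Which answer is correct? All of the above.",
    "She ticked all of the above and moved on.",
    "None of the above applies here.",
]

# Compare how often each competing expression occurs.
counts = {
    phrase: phrase_count(toy_corpus, phrase)
    for phrase in ("all of the above", "everything of the above")
}
print(counts)
```

With a real corpus you would normally use its own query interface, but the idea is the same: the expression with clearly more hits is the one native speakers actually use.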
The main difference between a corpus and, for example, Google is that a corpus is compiled exclusively from texts (books, journals, newspaper articles, etc.) or speech transcripts. This excludes (in most, but not all cases) texts written by non-native speakers and, generally speaking, ungrammatical texts. Google, by contrast, indexes pages written by millions of non-native speakers, and as a result its results can't be considered 100% accurate.