"SEE" works as a sensory verb (smell, taste, feel, look...) and the verb it takes as its object is either in the simple or base form (dance, sing) or the progressive form (dancing, singing). Usually, if you use the base form, it means you have, in this case, seen the entire action from beginning to end ( "I saw him sing on American Idol": I saw the entire song); if you use the progressive form, you saw part of the action ("I saw him singing on American Idol": I saw him for a minute when it was a commercial (for example)