It seems to me that you are trying too hard: you want to follow spoken language "word-for-word", as if it were the same as written language. If you do this, it's natural that you will find it hard to continue once you miss something. However, spoken English isn't word-for-word; we squash things together, use forms so unstressed that they are virtually inaudible, run several words together into a single unit, add sounds to make the flow better, and cut sounds for the same reason. Spoken language is usually improvised, so it often lacks the organisational clarity of written text: we start, stop, change, go back, reformulate, drift, etc. It strikes me that you might be trying to impose an artificial model of expectation on the language you hear. You should try not to go for the word-for-word approach, but to go with the flow. Spoken language can be disjointed and seem messy in comparison with written language, and we make many changes to words. You should try to use the same analytical techniques that have enabled you to reach a high level of proficiency with written text. Don't focus on the individual words; step back and look at what we do with the sounds and how we organise our ideas, but don't try to match it to the patterns of written communication.
Do you practise listening specifically?
When you say you read more slowly than average, what's the average? If it's the average native speaker, I wouldn't lose any sleep over that; given the language differences, reading in English may simply take you more processing time. Have you any idea of your approximate reading speed? Also, how fast a reader are you in Mandarin?