Both are fine, and there's not much difference that I can detect. The first concentrates on the event (the reported speech), and the second on the fact (the name). So in some contexts there'd be a preference for one over the other.
E.g., someone habitually gives a false name: What did he say his name was [this time]?
Or someone is sending a postcard but can't remember the name of the addressee: What did he say was his name?
But the difference is very slight. In most cases, they're equivalent.