All information we receive is colored by the medium it comes through. Even the most credible system will sometimes present information that doesn't match the question or context you hold in your mind (even though it may seem to). It's not that the system is wrong or deceitful; it's simply a misunderstanding. The key is enabling people to continue the dialogue and qualify what is returned, removing the reliance on blind faith. With audio-only interfaces, there are very few cues to help people do that.
For analytics over audio or voice-first systems, it's best to use voice for the information request and a visual medium for the reply. Speech is very effective for requests and commands, but visual information is better for cognition and understanding, in part because it supplies all those extra cues we are skilled at spotting, especially in the form of data visualisation. By all means supplement it with an audible statement or 'key insight,' but always back it up with visual evidence. That's what enables people to qualify whether the system understood them, spot other interesting data points, and even build data literacy. It's also far quicker to visually scan a list of possibilities than to sit through someone reading them out. Yes, the intelligence behind the curtain could just give you what it thinks is most useful, but then you'll never know what you're missing.
In the age of alt-facts and fake news, it's more important than ever to help people check and qualify the information they receive.
So remember: be prepared for doubt, be open to questions, and imagine your users channeling the Lou Reed song "Last Great American Whale": "Don't believe half of what you see, and none of what you hear."