Another one of my anti-hype articles 🙂
AI conversational analytics is currently being sold as the next evolution of self-service BI: the moment when anyone in a company can ask questions about data in natural language and get an answer without SQL and without waiting on the analytics team. Technically, that direction is real. NL2BI approaches and new conversational layers on top of BI genuinely do lower the barrier to entry.
Strategically, though, that is the least interesting layer of the whole thing. Value does not come from people asking questions. It comes from what the company systematically learns from those questions and how it turns that learning into distributed analytics, decision workflows, and institutional memory. Conversational analytics is not the goal. It is an interface, an observability layer, and a temporary mechanism for exploration.
1. The biggest mistake: confusing the access layer with the strategy
Many companies today think about conversational analytics primarily as a new access layer: the user asks a question, the system translates it into a query and returns an answer, done. That is a valid starting point, but a weak end state.
If the goal is simply to let more people “talk to the data,” organizations usually end up with more ad hoc queries, more parallel interpretations of the same thing, and higher costs for inference, governance, and support. On the surface, it looks like democratization. In reality, it is often just chaos moving from dashboards into chat. Unmanaged self-service BI had the same problem: broader access without semantic discipline led to duplicate and conflicting metrics.
The ability to generate answers is not the same as the ability to build an analytics system that scales.
2. The biggest limitation is not the LLM, but the context
It is convenient to debate models, prompting, or which vendor has the better demo. In practice, most failures do not happen because the model cannot answer. They happen because the organization has not prepared the context: ambiguous metric definitions, a weak business glossary, inconsistent naming, a fragile semantic layer, unclear permissions, and missing links to decision workflows. The NIST AI RMF repeatedly emphasizes that the performance and interpretation of AI outputs must be evaluated in the specific context of use.
That is exactly why the semantic layer matters so much: it translates raw data into business language and centralizes definitions. Tools such as Looker, the dbt Semantic Layer, and Cube all address the same problem: define a metric once and use it everywhere. With conversational analytics, this becomes even more visible. Tight integration with the semantic layer increases answer trustworthiness and reduces error rates in natural-language querying.
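To make "define a metric once and use it everywhere" concrete, here is a minimal sketch in Python. It is not the API of Looker, the dbt Semantic Layer, or Cube; the `Metric` class, the `METRICS` registry, the `compile_query` helper, and the table and column names are all hypothetical, chosen only to show the idea that a dashboard and a chat interface should resolve the same governed definition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Metric:
    """One governed metric definition (hypothetical structure)."""
    name: str
    expression: str   # SQL aggregate expression, assumed for illustration
    table: str        # source table, assumed for illustration
    description: str


# Single registry: every consumer (dashboard, chat, alert) resolves metrics here.
METRICS = {
    "churn_rate": Metric(
        name="churn_rate",
        expression="SUM(churned) * 1.0 / COUNT(*)",
        table="customers_monthly",
        description="Share of customers who cancelled during the period.",
    ),
}


def compile_query(metric_name: str, group_by: Optional[str] = None) -> str:
    """Render SQL for a governed metric; unknown metrics fail loudly."""
    if metric_name not in METRICS:
        raise KeyError(f"'{metric_name}' is not a governed metric")
    m = METRICS[metric_name]
    select = f"{m.expression} AS {m.name}"
    if group_by:
        return f"SELECT {group_by}, {select} FROM {m.table} GROUP BY {group_by}"
    return f"SELECT {select} FROM {m.table}"


# A dashboard tile and a chat answer go through the same definition,
# so they cannot silently disagree on what "churn rate" means.
dashboard_sql = compile_query("churn_rate", group_by="region")
chat_sql = compile_query("churn_rate", group_by="region")
print(dashboard_sql == chat_sql)
```

The point of the design is the failure mode: a question about a metric that is not in the registry raises an error instead of letting the model improvise a definition.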
Governance is just as important. Not as a corporate brake, but as a condition for scale. For systems that influence decisions, it is not enough for an answer to be plausible. It has to be auditable, explainable, and situated in a known risk context, which the NIST RMF Playbook makes more concrete through documentation, audit logs, interpretability, and continuous monitoring.
3. Chat is not the product, but the input
Most people picture conversational analytics as a chat interface. But chat is not the product. Chat is the input.
The real product is what emerges, after enough repetition and enough importance, from recurring query patterns: a standardized dashboard, an alert, an executive brief, a decision workflow, an automated recommendation, or an approved metric in the semantic layer.
Concretely: if the sales team asks about churn rate every week in slightly different ways, the output should not be twenty similar answers. It should be one productized churn alert with a metric definition, a threshold, and routing to an owner. At that point, the question no longer needs to be asked. That is the success, not the speed of the answer.
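A productized churn alert of this kind can be sketched in a few lines of Python. Everything here is an illustrative assumption: the `AlertRule` structure, the `evaluate` function, the 5% threshold, and the owner address are hypothetical, not taken from any specific tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AlertRule:
    """A recurring question turned into a standing artifact (hypothetical)."""
    metric: str
    definition: str   # the agreed business definition, stored with the rule
    threshold: float  # fire when the observed value exceeds this
    owner: str        # who receives the alert and is expected to act


def evaluate(rule: AlertRule, observed: float) -> Optional[dict]:
    """Return a routed notification if the threshold is exceeded, else None."""
    if observed <= rule.threshold:
        return None
    return {
        "to": rule.owner,
        "metric": rule.metric,
        "observed": observed,
        "threshold": rule.threshold,
        "definition": rule.definition,
    }


churn_alert = AlertRule(
    metric="churn_rate",
    definition="Share of customers who cancelled during the month.",
    threshold=0.05,  # assumed value for illustration
    owner="customer-success-lead@example.com",
)

print(evaluate(churn_alert, 0.03))  # below threshold: nothing is sent
print(evaluate(churn_alert, 0.07))  # above threshold: routed to the owner
```

The definition travels with the alert, so the recipient sees what was measured, not just a number.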
Instead of optimizing for the number of questions answered, organizations should optimize for how many recurring questions no longer need to be asked at all. That is the same logic companies use when building institutional memory and expert knowledge transfer.
The goal of a mature analytics organization is not to maximize the number of queries. The goal is to minimize the number of repeated uncertainties.
4. Query logs are not a side effect. They are a data asset
The most underestimated part of this whole topic is usually the query log. In both search and conversational systems, it has long been clear that logs and query sequences carry valuable signals about user intent, recurring tasks, and latent needs, from intent clustering to the analysis of task-level search behavior.
Every query in conversational analytics is a signal of several things at once:
- what people need to know
- what they do not understand
- which metrics matter to the business
- where decision pressure keeps recurring
- where the analytics infrastructure does not match users’ mental models
Repeat queries often do not just indicate demand for an answer. They reveal a structural deficit: a missing dashboard, a missing alert, an undefined metric, or an insight that has not yet been distributed. In that sense, conversational analytics is also a diagnostic tool for the entire data stack.
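As a rough illustration of how a query log can surface recurring intents, the sketch below groups similar questions by token overlap. This is a deliberately simple stand-in: a real system would more likely cluster on embeddings, and the stopword list, the 0.5 similarity threshold, and the sample log are all assumptions made for the example.

```python
import re

# Small, hand-picked stopword list (an assumption for this example).
STOPWORDS = {"what", "is", "was", "the", "our", "for", "show", "me",
             "this", "by", "of", "a", "in", "last"}


def tokens(query: str) -> frozenset:
    """Lowercase, split into words, and drop stopwords."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    return frozenset(w for w in words if w not in STOPWORDS)


def jaccard(a: frozenset, b: frozenset) -> float:
    """Token-set overlap between two queries, from 0.0 to 1.0."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_queries(queries, min_similarity=0.5):
    """Greedy clustering: attach each query to the first similar cluster."""
    clusters = []  # list of (representative token set, member queries)
    for q in queries:
        t = tokens(q)
        for representative, members in clusters:
            if jaccard(t, representative) >= min_similarity:
                members.append(q)
                break
        else:
            clusters.append((t, [q]))
    # Largest clusters first: these are the productization candidates.
    return sorted((members for _, members in clusters), key=len, reverse=True)


query_log = [
    "What is our churn rate this month?",
    "Revenue by region last quarter",
    "Show me churn rate by month",
    "churn rate for enterprise this month",
    "What was revenue in the last quarter by region?",
]

for group in cluster_queries(query_log):
    print(len(group), group)
```

Even this crude grouping shows the pattern the text describes: several differently worded questions collapse into a handful of intents, and the largest groups point at a missing dashboard, alert, or metric definition.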
5. Explorers and consumers are not the same problem
Different user groups use conversational analytics in completely different ways.
Explorers are senior analysts, product people, strategists, and leadership. They iterate on questions, test hypotheses, and compare perspectives. They need flexibility and depth. They generate discovery value.
Consumers are most of the organization. They do not need to discover the unknown. They need a stable, validated, and contextualized answer they can rely on operationally, delivered quickly. They generate scale value.
This distinction aligns with familiar BI user typologies, such as analyst vs. consumer personas described by Eckerson, and it matters because many companies try to serve both needs equally with a single system. Scalable value only emerges when exploration is converted into reusable artifacts for the rest of the organization.
6. The real ROI is not in answers, but in decisions
Many implementations implicitly optimize for answer retrieval: the ability to return an answer that looks correct, arrives quickly, and sounds natural. But the business does not pay for answers. It pays for better decisions. Across analytics more broadly, the value of data is tied above all to the speed and quality of decision-making, not to the mere production of insights.
Meaningful evaluation therefore rests more on questions like:
- What types of decisions does the system support?
- How often do those decisions recur?
- What is the cost of a wrong answer?
- Where does the system shorten the time from question to action?
- Where does it replace repeated work with a stable, distributed output?
That is where the real economics begin.
7. Without an operating model, there is no value
The most expensive mistake is running conversational analytics without a clear operating model. The typical picture: the system keeps answering the same questions, but no one productizes them. The team collects feedback, but no one turns it into changes in the semantic layer. An expensive model handles even trivial queries that should be routed through a cheaper mechanism. No one evaluates the business impact of individual intents. It is the familiar organizational failure to learn from repetition and experience.
The result: high costs for inference, support, and adoption, but low value accumulation.
What makes sense:
- do not start with everyone; start with a small number of high-value use cases
- separate the discovery layer from the delivery layer
- implement intent logging and clustering from day one
- evaluate recurring intents by frequency, business impact, cost of error, and standardization potential
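The last item can be made operational with a simple prioritization score. The sketch below is one possible way to combine the four criteria; the formula, the 1 to 5 scales, and the sample intents are assumptions for illustration, not a validated scoring model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intent:
    """A recurring intent found in the query log (hypothetical fields)."""
    name: str
    weekly_frequency: int            # how often it is asked per week
    business_impact: int             # 1 (low) to 5 (high), judged by the owner
    cost_of_error: int               # 1 (low) to 5 (high)
    standardization_potential: float  # 0.0 (always bespoke) to 1.0 (fully templatable)


def productization_score(intent: Intent) -> float:
    """Assumed heuristic: frequent, consequential, templatable intents rank highest.

    A high cost of error raises the score because a validated artifact
    is worth more where an improvised answer is risky.
    """
    stakes = intent.business_impact + intent.cost_of_error
    return intent.weekly_frequency * stakes * intent.standardization_potential


intents = [
    Intent("adhoc_pricing_experiment", 2, 5, 4, 0.2),
    Intent("weekly_churn_rate", 20, 4, 3, 0.9),
    Intent("headcount_lookup", 15, 1, 1, 1.0),
]

ranked = sorted(intents, key=productization_score, reverse=True)
for intent in ranked:
    print(f"{intent.name}: {productization_score(intent):.1f}")
```

The exact weights matter less than having an explicit, reviewable rule for deciding which recurring questions become dashboards, alerts, or governed metrics first.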
And separately, because it is often underestimated: cost discipline does not come from banning AI. It comes from allocating different types of queries to the right layers of the system. A trivial lookup should not run through the same expensive generative stack as open-ended exploratory analysis.
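A minimal sketch of such routing, in Python, might look like the following. The three tiers, the keyword rules, and the sets of known answers and governed metrics are hypothetical; a production router would more likely use an intent classifier than keyword matching.

```python
import re
from enum import Enum


class Tier(Enum):
    CACHE = "precomputed answer"
    SEMANTIC = "templated semantic-layer query"
    GENERATIVE = "full generative stack"


# Assumed examples: intents already productized, and metrics already governed.
KNOWN_ANSWERS = {"churn rate this month"}
GOVERNED_METRICS = {"churn rate", "revenue", "active users"}
# Words that suggest open-ended exploration rather than a lookup.
EXPLORATORY_WORDS = {"why", "how", "compare", "explain"}


def route(query: str) -> Tier:
    """Send each query to the cheapest layer that can plausibly serve it."""
    q = query.lower().strip(" ?")
    if q in KNOWN_ANSWERS:
        return Tier.CACHE
    words = set(re.findall(r"[a-z]+", q))
    mentions_metric = any(metric in q for metric in GOVERNED_METRICS)
    if mentions_metric and not (words & EXPLORATORY_WORDS):
        return Tier.SEMANTIC
    return Tier.GENERATIVE


for question in [
    "Churn rate this month?",
    "Show me revenue by region",
    "Why did churn rate rise in Q3?",
]:
    print(question, "->", route(question).value)
```

Only the third question needs the expensive path. The first two are lookups against definitions the organization has already agreed on.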
8. The end state is not more searching, but less friction
A mature analytics system gradually moves from a pull model to a push model. Relevant insight is not just available on request, but distributed at the moment when its decision relevance is highest: as an alert, a briefing, a workflow-linked recommendation, or a contextual notification. Search research also shows that value does not come from answering a single query, but from understanding the full task and the full interaction sequence.
This is exactly where conversational analytics can become the seed of organizational memory. Not because chat itself is the company’s memory, but because it captures which questions keep returning, what has already been analyzed, which conclusions are robust, and what should be converted into a more stable form. The company does not start from zero every time someone asks the same thing again.
That is the real shift. Not from dashboards to chat. But from fragmented search to the systematic distribution of decision intelligence.
Summary
AI conversational analytics is not strategically interesting because it removes the need for SQL. It is interesting because it gives companies a way to observe what questions their people actually have, where the data model is failing, and what should be productized.
If you understand it only as chat over data, you get an expensive access layer.
If you understand it as an observability, learning, and productization layer, you get a mechanism for reducing decision friction, building institutional memory, and turning exploration into scalable value.
The highest level of maturity, then, is not an organization where everyone knows how to ask a question.
It is an organization that knows which questions should no longer need to be asked.
