GenBI and the semantic layer: the missing link between your data and natural language

Monday morning, exec meeting. The VP of Sales reports quarterly revenue of €14.2M. Thirty seconds later the CFO, who typed an almost identical question into the same GenBI tool, has €12.8M on her screen. Nobody in the room can say which one is right. The meeting goes sideways: instead of talking about the pipeline, everyone spends the next half hour trying to figure out where two different numbers came from. One session like that is usually enough to kill trust in a generative BI tool for good.

The culprit is neither the language model nor the user. It’s the missing semantic layer. Without it, GenBI stays a slick demo that falls apart the moment a real decision rides on it.

Text-to-SQL over a raw schema is a trap

The architecture you see in most demos is appealingly simple: hand the LLM the database schema, let the user ask a question, the model writes the SQL, the engine runs it. With three clean tables and a well-phrased question, it holds for the length of a proof of concept (POC). Point the same machinery at a warehouse with hundreds of tables and it comes apart fast.

The reason is that a raw schema carries no business logic. Back to our two executives. “Revenue” has no single definition in the database. To Sales it means bookings, net of credits, ex-tax, dated on the order. To Finance it means recognised revenue, dated on the invoice, scoped to consolidated subsidiaries. Each query is correct under its own definition, and the gap between them is baked in. The LLM has no way of knowing which one you meant. It picks one, and it doesn’t tell you.
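To make that gap concrete, here is a minimal Python sketch over three made-up rows. The field names (booked_amount, credit, invoiced_amount, consolidated) and both functions are hypothetical, chosen only to mirror the two definitions described above; a real recognised-revenue rule would be considerably more involved.

```python
from datetime import date

# Illustrative rows only: amounts are ex-tax, all names are assumptions.
orders = [
    {"booked_amount": 100.0, "credit": 0.0, "invoiced_amount": 100.0,
     "order_date": date(2024, 3, 20), "invoice_date": date(2024, 4, 2),
     "consolidated": True},
    {"booked_amount": 50.0, "credit": 10.0, "invoiced_amount": 40.0,
     "order_date": date(2024, 2, 10), "invoice_date": date(2024, 2, 15),
     "consolidated": True},
    {"booked_amount": 80.0, "credit": 0.0, "invoiced_amount": 80.0,
     "order_date": date(2024, 1, 5), "invoice_date": date(2024, 1, 9),
     "consolidated": False},
]

Q1 = (date(2024, 1, 1), date(2024, 3, 31))


def in_period(day, period):
    """True when the day falls inside the inclusive (start, end) period."""
    start, end = period
    return start <= day <= end


def sales_revenue(rows, period):
    """Sales view: bookings net of credits, dated on the order."""
    return sum(r["booked_amount"] - r["credit"]
               for r in rows if in_period(r["order_date"], period))


def finance_revenue(rows, period):
    """Finance view: invoiced amounts, dated on the invoice, consolidated only."""
    return sum(r["invoiced_amount"]
               for r in rows
               if in_period(r["invoice_date"], period) and r["consolidated"])


sales_q1 = sales_revenue(orders, Q1)      # 220.0
finance_q1 = finance_revenue(orders, Q1)  # 40.0
```

Both functions answer "what was Q1 revenue?" correctly under their own rules, and nothing in the rows themselves says which rule the person asking had in mind.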

Then add the usual hazards of SQL written on the fly:

  • Wrong joins. The model links two tables on a key that looks right but isn’t, and nothing in the result gives it away.
  • Double counting. A one-to-many join inflates the total: an order with three line items gets counted three times in the revenue sum.
  • Forgotten filters. Cancelled rows, test accounts and technical duplicates stay in the calculation because no rule excludes them.
  • Mixed granularity. Adding monthly figures to daily ones produces a number that means nothing.
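The double-counting case is easy to reproduce. The sketch below uses Python's built-in sqlite3 module with two hypothetical tables, orders and order_lines, invented for the illustration: one order with three line items and one order with a single line.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE order_lines (
        line_id INTEGER PRIMARY KEY,
        order_id INTEGER REFERENCES orders(order_id),
        sku TEXT
    );
    INSERT INTO orders VALUES (1, 300.0), (2, 100.0);
    INSERT INTO order_lines VALUES
        (1, 1, 'A'), (2, 1, 'B'), (3, 1, 'C'),  -- order 1 has three lines
        (4, 2, 'A');                            -- order 2 has one line
""")

# Naive query: the one-to-many join repeats each order once per line item,
# so order 1's amount is summed three times.
naive_total = conn.execute("""
    SELECT SUM(o.amount)
    FROM orders AS o
    JOIN order_lines AS l ON l.order_id = o.order_id
""").fetchone()[0]

# Correct query: aggregate at the grain where the amount actually lives.
correct_total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]

print(naive_total, correct_total)  # 1000.0 400.0
conn.close()
```

Both queries run without error and return a tidy number, which is exactly why the inflated total is so hard to spot.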

The real danger isn’t that the LLM gets it wrong. It’s that it hands you a wrong answer with exactly the same confidence as a right one: cleanly formatted, ready to paste into a slide. A traditional dashboard is at least wrong the same way for everyone, so the error surfaces sooner or later. Free-form GenBI is wrong a different way on every question, and you never know where to look.

What a semantic layer actually is

A semantic layer is a centralised, version-controlled, shared definition of your business logic, sitting between the warehouse and the tools that query it. It stores no data. It stores the meaning of the data. In practice, it declares:

  • Metrics and their exact formula: what “net revenue”, “gross margin” or “monthly churn” really is, with the filters, exclusions and reference date baked in.
  • Dimensions to slice by: region, customer segment, product line, time period, and how to aggregate each one correctly.
  • Relationships between entities, so joins are written once, correctly, and reused everywhere.
  • Business synonyms: “revenue”, “sales”, “turnover” and “top line” all resolve to the same validated metric.
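As a rough illustration of those four kinds of declaration, here is a toy model in Python. It is not the format of any particular product: the class names (Metric, Join, SemanticLayer), the column names and the filters are all assumptions made up for the sketch.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str
    expression: str                  # aggregate formula, written once
    filters: tuple[str, ...] = ()    # exclusions baked into the metric
    date_column: str = "order_date"  # reference date for time slicing
    synonyms: tuple[str, ...] = ()   # business terms that mean this metric


@dataclass(frozen=True)
class Join:
    left: str
    right: str
    on: str


class SemanticLayer:
    """Holds definitions only; it stores no data."""

    def __init__(self, metrics, dimensions, joins):
        self.metrics = {m.name: m for m in metrics}
        self.dimensions = dict(dimensions)
        self.joins = list(joins)
        # Index every metric under its name and each of its synonyms.
        self._lookup = {}
        for metric in metrics:
            for term in (metric.name, *metric.synonyms):
                self._lookup[self._normalise(term)] = metric

    @staticmethod
    def _normalise(term: str) -> str:
        return " ".join(term.lower().replace("_", " ").split())

    def resolve_metric(self, term: str) -> Metric:
        """Map a business term to its validated metric, or fail loudly."""
        try:
            return self._lookup[self._normalise(term)]
        except KeyError:
            raise KeyError(f"No validated metric for {term!r}") from None


layer = SemanticLayer(
    metrics=[
        Metric(
            name="net_revenue",
            expression="SUM(amount - credit)",
            filters=("status <> 'cancelled'", "is_test_account = 0"),
            synonyms=("revenue", "sales", "turnover", "top line"),
        )
    ],
    dimensions={"region": "customers.region", "product_line": "products.line"},
    joins=[Join("orders", "customers", "orders.customer_id = customers.customer_id")],
)

resolved = layer.resolve_metric("Top line")  # the net_revenue metric
```

The point of the lookup is that an unknown term raises an error instead of quietly falling back to some improvised calculation.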

None of this is new. The metadata layer in the old BI suites did exactly this. GenBI just changes the stakes: what used to be a nice-to-have is now the thing that keeps the project alive, because it’s the only place to anchor a language model that, by design, has no idea what your accounting truth is.

Bounding the answers, not boxing in the user

A well-built semantic layer makes results deterministic and auditable. The same question returns the same number today and six months from now, no matter who is asking. And every answer is traceable back to its definition: you can show the metric that was called, its filters, its formula.

The objection always lands at the same spot: if everything is predefined, haven’t you lost the freedom of conversational BI? It’s the other way round. Users are still free to mix whatever metrics and dimensions they like, to filter, to put two quarters side by side. The one thing they can’t do anymore is improvise a revenue figure that contradicts the one sitting next to them. You bound the space of calculations, never the space of questions. The whole value lives in that gap.

The LLM’s real job: translate intent, not invent the math

With a semantic layer in place, the model’s job changes completely. It no longer writes free SQL against a raw schema. It does text-to-metrics: it turns a sentence in plain language into a structured call to metrics and dimensions that are already validated.

“What’s our net revenue by region this quarter versus last quarter?” becomes a query that selects the net_revenue metric, the region dimension, a quarterly grain and a period-over-period comparison. The LLM decides what to ask. The semantic layer decides how it’s computed. The warehouse engine runs it and returns the one authoritative number.
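A minimal sketch of that split, in Python: the model is assumed to emit a small JSON request, and a hypothetical compile_query function owned by the semantic layer validates it and produces the SQL. The JSON shape, the METRICS and DIMENSIONS registries and the PostgreSQL-style DATE_TRUNC call are all assumptions for the illustration, and the period-over-period comparison is left out to keep the example short.

```python
import json

# Registries owned by the semantic layer (illustrative definitions).
METRICS = {
    "net_revenue": {
        "expression": "SUM(amount - credit)",
        "table": "orders",
        "filters": ["status <> 'cancelled'"],
        "date_column": "order_date",
    }
}
DIMENSIONS = {"region": "region"}
GRAINS = {"month", "quarter", "year"}


def compile_query(request_json: str) -> str:
    """Validate a structured metric request and compile it to SQL."""
    request = json.loads(request_json)
    metric_name = request["metric"]
    dims = request.get("dimensions", [])
    grain = request.get("grain")

    # Anything not declared in the layer is rejected, never improvised.
    if metric_name not in METRICS:
        raise ValueError(f"Unknown metric: {metric_name!r}")
    unknown = [d for d in dims if d not in DIMENSIONS]
    if unknown:
        raise ValueError(f"Unknown dimensions: {unknown}")
    if grain not in GRAINS:
        raise ValueError(f"Unsupported grain: {grain!r}")

    metric = METRICS[metric_name]
    period = f"DATE_TRUNC('{grain}', {metric['date_column']})"
    columns = [f"{period} AS period"] + [DIMENSIONS[d] for d in dims]
    select = ", ".join(columns + [f"{metric['expression']} AS {metric_name}"])
    where = " AND ".join(metric["filters"]) or "1 = 1"
    group_by = ", ".join(str(i + 1) for i in range(len(columns)))
    return (
        f"SELECT {select} FROM {metric['table']} "
        f"WHERE {where} GROUP BY {group_by} ORDER BY 1"
    )


# What the LLM is allowed to produce: a choice of names, not a formula.
llm_output = '{"metric": "net_revenue", "dimensions": ["region"], "grain": "quarter"}'
sql = compile_query(llm_output)
print(sql)
```

If the model picks the wrong metric, the mistake is a single visible name in the request rather than a subtle error buried in generated SQL.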

That division of labour is what moves a GenBI project from gadget to decision tool. The model keeps all of its linguistic flexibility, coping with ambiguous phrasing, typos, whatever in-house jargon people throw at it. What it loses is the right to invent the business logic. The surface area for hallucination shrinks from a whole calculation down to a single metric choice, and that choice is something you can put on screen and correct.

The tools doing this today

  • dbt Semantic Layer / MetricFlow. You define metrics in dbt, right where data is transformed, and they become consistently queryable by every downstream tool. A natural fit if your pipeline already runs on dbt.
  • Cube. A standalone semantic layer with an API, caching and access control, built to serve dashboards, agents and applications alike. Often the pick when you want a single entry point on top of several sources.
  • LookML (Looker). The long-standing semantic model of the Looker ecosystem: metrics and explores defined in code, governed and reused. Solid if you already live in that environment.

The tool you pick matters less than the principle: one definition of your metrics, in code, version-controlled, with an LLM that plugs into it instead of writing SQL into the void.

Governance runs through the layer, not around it

There’s one more point, and it carries real weight in a mid-market or enterprise setup. When the semantic layer is the only path to the data, it’s also where access rights get enforced. A sales rep queries their own territory, not the payroll. A subsidiary sees its own figures, not the group’s. The rules live in the same place as the metrics, and they hold whether the question comes from a dashboard, an agent or an API.

Without it, governance gets replayed at every tool and every generated query, and sooner or later a clever bit of SQL slips past a confidentiality filter. With it, you have a single place to audit: who can reach which metric and on what scope, plus a record of who asked for what. That trail is what your risk function will eventually come asking for, and it’s what makes GenBI defensible in front of an auditor.
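Here is a small Python sketch of what "a single place to audit" can look like. The User class, the POLICIES table and the authorize function are hypothetical; real semantic layers expose access control through their own configuration, which this does not attempt to reproduce.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class User:
    name: str
    role: str
    territory: Optional[str] = None


# metric -> role -> column used to scope rows (None means unrestricted).
POLICIES = {
    "net_revenue": {"sales_rep": "territory", "finance": None},
    "payroll_cost": {"hr": None},
}

AUDIT_LOG = []  # every request is recorded, granted or not


def authorize(user: User, metric: str):
    """Return the row-level filter to apply, or raise PermissionError."""
    allowed_roles = POLICIES.get(metric, {})
    granted = user.role in allowed_roles
    AUDIT_LOG.append({
        "at": datetime.now(timezone.utc).isoformat(),
        "user": user.name,
        "metric": metric,
        "granted": granted,
    })
    if not granted:
        raise PermissionError(f"{user.name} may not query {metric}")

    scope_column = allowed_roles[user.role]
    if scope_column is None:
        return None
    # Parameterised filter: the value is bound separately, never inlined.
    return (f"{scope_column} = ?", getattr(user, scope_column))


rep = User(name="ana", role="sales_rep", territory="EMEA")
row_filter = authorize(rep, "net_revenue")  # ("territory = ?", "EMEA")

try:
    authorize(rep, "payroll_cost")
except PermissionError as exc:
    denied_message = str(exc)
```

Because the check sits in front of every metric call, the same rule applies whether the request comes from a dashboard, an agent or an API, and the log records refusals as well as grants.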

So the order of operations isn’t “let’s wire an LLM to the warehouse and see what happens.” Formalise the semantic layer on the handful of metrics that actually matter first, then connect natural language to it. Never the other way round. That’s exactly the approach we build tooling for in our GenBI offer. For more on the method and the architecture choices, see our guides & resources.

