Now generally available

Your data, finally thinking for itself.

Talk to Data deploys autonomous AI agents across your entire data stack — querying, interpreting, and surfacing insights that would take your team weeks to find.

Trusted by 1,00+ data teams

Talk to Data — Query Workspace
You

What was last week's sales ranking by store, and why do the stores differ?

30+ Agent Nodes
4 Memory Layers
7 Route Decisions
7 Security Layers

Semantic Governance

Three-Layer Data Governance

A structured semantic layer that transforms raw metadata into governed, auditable knowledge — from schema definitions to compliance packs and industry archetypes.

Layer 1

Canonical Metadata

Table schemas, column definitions, metric contracts, join rules, and few-shot NL2SQL examples — version-controlled as YAML, ingested into PostgreSQL.

table_asset, metric_asset, join_rule, few_shot_example

Layer 2

Term Governance

Business term lifecycle management, bilingual disambiguation, synonym networks, and ambiguity matrices — powered by pgvector embeddings.

business_term, term_relationship, ambiguity_matrix

Layer 3

Business Knowledge

Compliance rules, business context assumptions, industry archetypes, and regulatory constraints — enforced via Apache AGE graph traversal.

business_rule, business_context, compliance_pack, industry_archetype

Governance as Code — Git-Managed YAML → PG Supernode

YAML Metadata
Schema Validation
Cross-Layer Refs
Version Check
CI Pipeline
PG Ingest (R+V+G)
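
The validation stages listed above (schema validation, cross-layer reference checks, version check) might be sketched as a single CI step like the one below. This is an illustration only, not the product's implementation: the field names `kind`, `name`, `version`, and `refs`, and the function `validate_assets`, are assumptions, and the assets are shown as already-parsed dictionaries rather than YAML files.

```python
# Asset kinds taken from the three governance layers described above.
LAYER_KINDS = {
    1: {"table_asset", "metric_asset", "join_rule", "few_shot_example"},
    2: {"business_term", "term_relationship", "ambiguity_matrix"},
    3: {"business_rule", "business_context", "compliance_pack", "industry_archetype"},
}
KNOWN_KINDS = set().union(*LAYER_KINDS.values())

# Hypothetical minimal contract for every asset; the real schema is not public.
REQUIRED_FIELDS = ("kind", "name", "version")


def validate_assets(assets):
    """Return a list of error strings; an empty list means the batch may be ingested."""
    errors = []
    names = {a.get("name") for a in assets}
    for asset in assets:
        label = asset.get("name", "<unnamed>")

        # Schema validation: required fields and a known asset kind.
        missing = [f for f in REQUIRED_FIELDS if f not in asset]
        if missing:
            errors.append(f"{label}: missing fields {missing}")
            continue
        if asset["kind"] not in KNOWN_KINDS:
            errors.append(f"{label}: unknown kind '{asset['kind']}'")

        # Version check: a positive integer version (assumed convention).
        if not isinstance(asset["version"], int) or asset["version"] < 1:
            errors.append(f"{label}: invalid version {asset['version']!r}")

        # Cross-layer references: every ref must resolve to an asset in the batch.
        for ref in asset.get("refs", []):
            if ref not in names:
                errors.append(f"{label}: unresolved reference '{ref}'")
    return errors
```

A CI job would fail the build whenever the returned list is non-empty, so only validated metadata reaches the ingest step.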

PG Supernode

R/V/G Triple-Engine Retrieval

One PostgreSQL instance, three retrieval paradigms. Relational precision, vector semantics, and graph governance — unified in a single transaction.

R: Relational Engine

Structured DDL, metric definitions, join rules, lifecycle tracking, and ML tool registry in PostgreSQL

V: Vector Engine

7 embedding types with pgvector — cosine similarity search for fuzzy term matching and SQL cache

G: Graph Engine

Apache AGE property graph enforces governance boundaries, relationship traversal, and compliance packs

Four-Stage RAG Pipeline

  • Precision matching: exact term lookup + SQL cache fast-path
  • Semantic recall: cosine similarity across 7 embedding types
  • Graph traversal: relationship discovery + governance boundary enforcement
  • Reranker + Corrective: relevance re-rank, column pruning, token budgeting, error-based re-retrieval

Engines: Graph (AGE), Vector (pgvector), Relational (PG)
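
To make the four stages concrete, here is a minimal in-memory sketch of how they could compose. It is an assumption-laden illustration: the function `retrieve`, the `terms` and `edges` structures, and the 0.5 neighbor discount are all hypothetical, plain Python dictionaries stand in for PostgreSQL, pgvector, and Apache AGE, and governance enforcement, column pruning, and corrective re-retrieval are left out.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query, query_vec, terms, edges, token_budget=50, top_k=3):
    """terms: {name: {"vec": [...], "tokens": int}}; edges: {name: [related names]}."""
    hits = {}

    # Stage 1, precision matching: exact term lookup as a fast path.
    key = query.strip().lower()
    if key in terms:
        hits[key] = 1.0

    # Stage 2, semantic recall: top-k terms by cosine similarity.
    scored = sorted(
        ((cosine(query_vec, t["vec"]), name) for name, t in terms.items()),
        reverse=True,
    )
    for score, name in scored[:top_k]:
        hits.setdefault(name, score)

    # Stage 3, graph traversal: one hop to related terms at a discounted score.
    for name in list(hits):
        for neighbor in edges.get(name, []):
            if neighbor in terms:
                hits.setdefault(neighbor, hits[name] * 0.5)

    # Stage 4, rerank and token budgeting: best first, skip what does not fit.
    ranked = sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))
    selected, used = [], 0
    for name, _score in ranked:
        cost = terms[name]["tokens"]
        if used + cost <= token_budget:
            selected.append(name)
            used += cost
    return selected
```

Running the stages in this order lets an exact match keep its top score while later stages only add candidates, and the token budget bounds how much context reaches the SQL generator.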

Multi-Agent System

30+ Node Agentic Orchestration

LangGraph four-layer orchestration: Supervisor → Query Understanding → Router → Sub-agents. The LLM reasons and plans; ML tools compute and analyze.

Supervisor

Four-layer orchestration entry

Query Understanding

Intent + analytical depth detection

Router

7-route intelligent dispatch

Planner

ML DAG step decomposition

Data Retrieval

R/V/G hybrid search

Analytical Agent

ML tool selection & execution

SQL Generation

Code-specialized LLM

Code Interpreter

Sandboxed Python execution

Guardrails

7-layer security validation

Visualization

Auto chart + ML charts

Insights

ML-grounded data insights

Suggested Followup

Next-best-action guidance

7 Intelligent Route Decisions

KPI Lookup, NL2SQL Query, Deep Analysis, Graph Reasoning, Knowledge QA, Clarification, Analytical Workflow
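
As a rough illustration of seven-way dispatch, the sketch below maps a question to one of the routes named above. The route names come from this page; everything else is an assumption. In particular, the keyword rules are a simple stand-in for whatever classifier the Router actually uses, and `route` and `RULES` are hypothetical names.

```python
from enum import Enum


class Route(Enum):
    KPI_LOOKUP = "KPI Lookup"
    NL2SQL_QUERY = "NL2SQL Query"
    DEEP_ANALYSIS = "Deep Analysis"
    GRAPH_REASONING = "Graph Reasoning"
    KNOWLEDGE_QA = "Knowledge QA"
    CLARIFICATION = "Clarification"
    ANALYTICAL_WORKFLOW = "Analytical Workflow"


# Ordered rules, first match wins. The keywords are illustrative only.
RULES = [
    (Route.ANALYTICAL_WORKFLOW, ("forecast", "cluster", "anomaly")),
    (Route.DEEP_ANALYSIS, ("why", "root cause", "driver")),
    (Route.GRAPH_REASONING, ("related to", "depends on", "lineage")),
    (Route.KNOWLEDGE_QA, ("what does", "definition", "mean")),
    (Route.KPI_LOOKUP, ("kpi", "current value")),
]


def route(question: str) -> Route:
    """Pick a route for a natural-language question."""
    text = question.lower().strip()
    # Very short questions are treated as too ambiguous to answer directly.
    if len(text.split()) < 3:
        return Route.CLARIFICATION
    for target, keywords in RULES:
        if any(keyword in text for keyword in keywords):
            return target
    # Default path: translate the question to SQL.
    return Route.NL2SQL_QUERY
```

For example, a question containing "why" lands on Deep Analysis, while a plain listing request falls through to the NL2SQL default.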

Personalized Intelligence

Trace-Native Memory System

Four-layer personalized memory that learns from every interaction. Every retrieve, write, merge, and suppress operation is linked to the request trace for full observability and replay.

Working Memory

In-flight conversational state for the current request — powered by LangGraph checkpointer.

Profile Memory

Durable user preferences: chart styles, favored grains, default comparisons, repeated metric choices.

Episodic Memory

Decaying summaries of successful past interactions — preserves continuity without replaying raw history.

Correction Memory

User-approved corrections and disambiguation outcomes — high-value for accuracy, governance-gated.

Per-user isolation

Zero cross-user sharing. Every memory record is scoped to user_id with adversarial isolation guarantees.

Replayable & evaluatable

Memory-on vs memory-off replay experiments validate every retrieval policy change before production rollout.

Governance-first lifecycle

Explicit retention, redaction, suppression, and expiry semantics. User deletion is a first-class operation.

High-Signal Memory Formation

Memory writes only from durable signals — explicit preferences, accepted clarifications, recurring patterns, and user-approved corrections. Never from transient failures or low-confidence single turns.
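
The write policy and per-user scoping described above could be expressed as a small gate in front of the memory store. This is a sketch under stated assumptions: `Signal`, `MemoryStore`, the signal kind strings, and the 0.8 confidence threshold are hypothetical, and an in-memory dictionary stands in for the real storage layer.

```python
from dataclasses import dataclass

# Durable signal kinds, named after the four sources listed above.
DURABLE_SIGNALS = {
    "explicit_preference",
    "accepted_clarification",
    "recurring_pattern",
    "approved_correction",
}


@dataclass(frozen=True)
class Signal:
    user_id: str
    kind: str
    key: str
    value: str
    confidence: float = 1.0


class MemoryStore:
    def __init__(self, min_confidence=0.8):  # threshold is an assumed value
        self._records = {}  # user_id -> {key: value}
        self.min_confidence = min_confidence

    def write(self, signal):
        """Persist only durable, high-confidence signals. Returns True if written."""
        if signal.kind not in DURABLE_SIGNALS:
            return False  # e.g. transient failures never form memory
        if signal.confidence < self.min_confidence:
            return False  # low-confidence single turns are dropped
        self._records.setdefault(signal.user_id, {})[signal.key] = signal.value
        return True

    def retrieve(self, user_id):
        """Return a copy of one user's records; other users are never visible."""
        return dict(self._records.get(user_id, {}))

    def delete_user(self, user_id):
        """User deletion as a first-class operation. Returns True if data existed."""
        return self._records.pop(user_id, None) is not None
```

Keying every record by `user_id` at the storage boundary means isolation does not depend on callers remembering to filter.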

Unified Observability

Full-Pipeline Tracing & Event Streaming

One ObservabilityFacade unifies Langfuse tracing, AutoMQ event streaming, and structured logging. Every agent step, memory operation, and evaluator result shares the same trace context.

Langfuse Deep Tracing

Every request gets a single trace_id. All spans — LLM calls, retrieval, SQL generation, memory operations — nest under one trace with full input/output capture.

AutoMQ Event Streaming

Lightweight, reference-based events streamed to Kafka-compatible topics. Build real-time dashboards, alerts, and audit trails from canonical lifecycle events.

Evaluation & Replay

Compare policy variants with memory-on/off replay experiments. Correlate memory effectiveness with SQL quality, user feedback, and evaluator pass rates.

Memory Lifecycle Events

Seven dedicated event types — retrieve, rank, write, upsert, suppress, expire, delete — each linked to trace context for full auditability.

Single-Trace Request Lifecycle

API Request

start_trace()

Agent Spans

start_span() → end_span()

Memory Events

emit_event(memory.*)

Score Recording

record_score()

Trace Finalize

finalize_trace()
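
The lifecycle above can be read as a sequence of calls on one facade object. The sketch below keeps everything in memory to show how the calls relate; the method names come from this page, but their signatures, the stored fields, and the open-span check are assumptions, and no Langfuse or AutoMQ client is involved.

```python
import time
import uuid


class ObservabilityFacade:
    """In-memory stand-in: every span, event, and score hangs off one trace_id."""

    def __init__(self):
        self.traces = {}

    def start_trace(self, name):
        trace_id = uuid.uuid4().hex
        self.traces[trace_id] = {
            "name": name, "spans": [], "events": [], "scores": {}, "finalized": False,
        }
        return trace_id

    def start_span(self, trace_id, name):
        span = {"name": name, "start": time.monotonic(), "end": None}
        self.traces[trace_id]["spans"].append(span)
        return span

    def end_span(self, span):
        span["end"] = time.monotonic()

    def emit_event(self, trace_id, event_type, ref):
        # Reference-based event: carries an ID, not the payload itself.
        self.traces[trace_id]["events"].append(
            {"type": event_type, "ref": ref, "trace_id": trace_id}
        )

    def record_score(self, trace_id, name, value):
        self.traces[trace_id]["scores"][name] = value

    def finalize_trace(self, trace_id):
        trace = self.traces[trace_id]
        open_spans = [s["name"] for s in trace["spans"] if s["end"] is None]
        if open_spans:
            raise RuntimeError(f"unfinished spans: {open_spans}")
        trace["finalized"] = True
        return trace
```

A request handler would call `start_trace()` once, wrap each agent step in `start_span()` and `end_span()`, emit `memory.*` events as they occur, and call `finalize_trace()` last.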

Defense in Depth

7-Layer Security

Every query passes through seven independent security checks — from JWT authentication to result-level PII masking. Fail-safe by default: when in doubt, reject.

GDPR, HIPAA, PCI-DSS, SOX

  1. JWT Authentication: role-based access control + rate limiting
  2. Permission Filter: domain-scoped data access boundaries
  3. Metadata Safety: PII exposure policy (hidden / masked / visible)
  4. Graph Constraints: Apache AGE enforces governance joins
  5. SQL Policy Engine: SELECT-only whitelist + column validation
  6. Execution Isolation: dedicated workgroup + statement timeout
  7. Sandbox & Result Masking: code interpreter whitelist + column-level PII masking
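
Two of the layers above, the SELECT-only SQL policy with column validation and column-level result masking, lend themselves to a short sketch. It is deliberately naive and purely illustrative: `check_sql` and `mask_rows` are hypothetical names, and regular expressions stand in for the SQL parser a production policy engine would need. In keeping with the fail-safe default, anything the check cannot understand is rejected.

```python
import re

# Keywords that must never appear, even inside an otherwise valid SELECT.
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|create|truncate|grant|revoke|merge)\b",
    re.IGNORECASE,
)


def check_sql(sql, allowed_columns):
    """Return (ok, reason). Rejects anything that is not a simple allowed SELECT."""
    stmt = sql.strip().rstrip(";").strip()
    if ";" in stmt:
        return False, "multiple statements"
    if not re.match(r"(?i)select\b", stmt):
        return False, "only SELECT is allowed"
    if FORBIDDEN.search(stmt):
        return False, "forbidden keyword"
    match = re.match(r"(?is)select\s+(.*?)\s+from\b", stmt)
    if not match:
        return False, "cannot parse column list"
    columns = [c.strip().lower() for c in match.group(1).split(",")]
    unknown = [c for c in columns if c not in allowed_columns]
    if unknown:
        return False, f"columns not allowed: {unknown}"
    return True, "ok"


def mask_rows(rows, pii_columns):
    """Replace values in PII columns with a fixed mask before results leave the system."""
    return [
        {key: ("***" if key in pii_columns else value) for key, value in row.items()}
        for row in rows
    ]
```

Note that `SELECT *` is rejected here because `*` is not in the allowed column set, which is the conservative behavior for a whitelist.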

Capabilities

Built for Enterprise Hybrid BI

Trace-Native Memory

Four-layer personalized memory (profile, episodic, correction, working) learns your preferences and patterns — every interaction makes the agent smarter.

Full-Pipeline Observability

Every query, span, and decision traced end-to-end via ObservabilityFacade. Langfuse for deep tracing, AutoMQ for real-time event streaming.

LLM + ML Hybrid

LLM handles reasoning, planning, and explanation. Traditional ML handles attribution, forecasting, clustering, anomaly detection, and regression.

Four-Level Analytical Depth

What (descriptive) → Why (diagnostic) → Next (predictive) → How (prescriptive). System detects analytical depth and routes to the right execution path.

Code Interpreter

Sandboxed Python execution with whitelisted ML libraries. The LLM generates analysis code; the sandbox executes it under memory and time limits.

Block-Based Output

Results as decision-ready card blocks — narrative, metrics, charts, ML results, insights, and next-best-action suggestions. Pin any block to Dashboard.

Multi-Model Strategy

Smart model routing — DeepSeek-v3.2 for understanding, qwen3-coder-plus for SQL, Qwen3-max for insights. Automatic fallback chains.
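
A fallback chain of the kind mentioned above might look like the following sketch. The primary model per task is taken from this page; the order of fallbacks, the `CHAINS` table, the function `call_with_fallback`, and the idea of passing model clients as plain callables are all assumptions made for illustration.

```python
# Primary model per task comes from the text above; the fallbacks are assumed.
CHAINS = {
    "understanding": ["DeepSeek-v3.2", "Qwen3-max"],
    "sql": ["qwen3-coder-plus", "Qwen3-max"],
    "insights": ["Qwen3-max", "DeepSeek-v3.2"],
}


def call_with_fallback(task, prompt, clients, chains=CHAINS):
    """Try each model in the task's chain in order; return (model_name, response).

    clients maps a model name to a callable taking a prompt and returning text.
    """
    errors = []
    for model in chains[task]:
        try:
            return model, clients[model](prompt)
        except Exception as exc:  # any failure moves on to the next model
            errors.append(f"{model}: {exc}")
    raise RuntimeError(f"all models failed for task '{task}': {errors}")
```

Returning the model name alongside the response makes it easy to record which model actually served a request.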

Pin to Dashboard

Convert conversational analysis blocks into reusable BI dashboard widgets. Persist chart configs, SQL, filters, and ML context as durable assets.

Ready to talk to your data?

From natural language to governed SQL, ML-powered analysis, and actionable dashboard cards. Deploy on your own infrastructure with full data sovereignty.