Let's explore an exciting project that leverages LangGraph Cloud's streaming API to create a data visualization agent. You can upload an SQLite database or CSV file, ask questions about your data, and the agent will generate appropriate visualizations. This blog is a brief dive into the agent’s workflow and key features.
(Demo video: the data visualization agent in action.)
The entire workflow is orchestrated using LangGraph Cloud, which provides a framework for easily building complex AI agents, a streaming API for real-time updates, and a visual studio for monitoring and experimenting with the agent's behavior.
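For readers unfamiliar with LangGraph, here is a minimal sketch of how such a graph could be wired up and streamed with the open-source library. The node names, state fields, and placeholder logic are illustrative assumptions, not the agent's actual implementation.

```python
# Minimal sketch (not the actual agent): a two-node LangGraph graph whose
# intermediate state updates are streamed, similar in spirit to the hosted
# streaming API. Node names and state fields are illustrative assumptions.
from typing import TypedDict
from langgraph.graph import StateGraph, START, END


class AgentState(TypedDict):
    question: str
    sql: str
    chart_spec: str


def write_sql(state: AgentState) -> dict:
    # Placeholder: a real node would call an LLM with the schema and question.
    return {"sql": f"-- SQL for: {state['question']}"}


def build_chart(state: AgentState) -> dict:
    # Placeholder: a real node would choose a chart type and format the data.
    return {"chart_spec": f"bar chart built from ({state['sql']})"}


builder = StateGraph(AgentState)
builder.add_node("write_sql", write_sql)
builder.add_node("build_chart", build_chart)
builder.add_edge(START, "write_sql")
builder.add_edge("write_sql", "build_chart")
builder.add_edge("build_chart", END)
graph = builder.compile()

# Stream state updates node by node instead of waiting for the final result.
for update in graph.stream({"question": "Monthly sales by region?"}, stream_mode="updates"):
    print(update)
```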
First, let's walk through the current state-of-the-art (SOTA) text-to-SQL workflow:
Schema and Metadata Extraction:
- The system processes the provided database (e.g., SQLite or CSV) to extract key information such as table structure and column details.
- This initial step gives the agent a comprehensive picture of how the database is organized (a minimal extraction sketch follows below).
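As a rough illustration, here is a sketch of pulling table and column metadata out of a SQLite file using only the standard library; the exact metadata the agent collects and the helper name are assumptions.

```python
# Sketch: extract table and column metadata from a SQLite database.
import sqlite3


def extract_schema(db_path: str) -> dict[str, list[dict]]:
    """Return {table_name: [{name, type, pk}, ...]} for every user table."""
    schema: dict[str, list[dict]] = {}
    with sqlite3.connect(db_path) as conn:
        tables = conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type='table' AND name NOT LIKE 'sqlite_%'"
        ).fetchall()
        for (table,) in tables:
            # PRAGMA table_info returns (cid, name, type, notnull, default, pk).
            cols = conn.execute(f"PRAGMA table_info('{table}')").fetchall()
            schema[table] = [
                {"name": c[1], "type": c[2], "pk": bool(c[5])} for c in cols
            ]
    return schema


if __name__ == "__main__":
    print(extract_schema("example.db"))  # hypothetical database file
```

For a CSV upload, the same information can be approximated by loading the file into an in-memory SQLite table first.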
Embedding Creation:
- For larger datasets, embeddings are generated for schema elements (tables, columns) and sample data. These embeddings speed up retrieval and matching in later steps (see the sketch below).
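Here is a minimal embedding sketch. The use of sentence-transformers and the all-MiniLM-L6-v2 model are assumptions about tooling, and the sample schema snippets are made up for illustration.

```python
# Sketch: embed short textual descriptions of schema elements.
# Model choice is an assumption, not the agent's actual stack.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Describe each table as a short text snippet, then embed it.
schema_docs = [
    "table: orders | columns: id (INTEGER), customer_id (INTEGER), total (REAL)",
    "table: customers | columns: id (INTEGER), name (TEXT), region (TEXT)",
]
schema_embeddings = model.encode(schema_docs, normalize_embeddings=True)
print(schema_embeddings.shape)  # (num_docs, embedding_dim)
```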
Entity and Context Retrieval:
- The user's question is analyzed to identify the entities it mentions and the overall context.
- For matching those entities against literal database values, a syntactic search backed by a Locality Sensitive Hashing (LSH) index can be implemented, as sketched below.
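One way to build such an index is with the datasketch library's MinHash LSH; the snippet below is a sketch under that assumption, with hypothetical sample values.

```python
# Sketch: syntactic matching of question entities to database values via LSH.
# The datasketch dependency and sample values are assumptions for illustration.
from datasketch import MinHash, MinHashLSH


def minhash_of(text: str, num_perm: int = 128) -> MinHash:
    m = MinHash(num_perm=num_perm)
    for token in text.lower().split():
        m.update(token.encode("utf-8"))
    return m


# Index a few sample database values.
lsh = MinHashLSH(threshold=0.5, num_perm=128)
for value in ["United States", "United Kingdom", "New Zealand"]:
    lsh.insert(value, minhash_of(value))

# Query with an entity mentioned in the user's question.
print(lsh.query(minhash_of("united states")))  # likely ['United States']
```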
Relevant Table Extraction using Retrieval-Augmented Generation (RAG):
- This step utilizes RAG to pinpoint the relevant tables that hold the information the user seeks (a retrieval sketch follows this list).
- Experimental Approaches:
- If the schema is manageable within the context window, this step might be skipped.
- Exploring a Knowledge Graph-based RAG for multi-hop functionalities is a potential avenue for future development.
- Relevant columns can also be extracted and fed into the RAG step for more precise table selection.
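Below is a minimal retrieval sketch that ranks tables against the question by cosine similarity over schema embeddings, in the spirit of the step above. The model choice, table descriptions, and top-k cutoff are all assumptions for illustration.

```python
# Sketch: retrieve the tables most relevant to a question by cosine similarity
# over schema embeddings. Names, model, and data are illustrative assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

table_docs = {
    "orders": "table: orders | columns: id, customer_id, total, created_at",
    "customers": "table: customers | columns: id, name, region",
    "products": "table: products | columns: id, name, price",
}
names = list(table_docs)
doc_vecs = model.encode(list(table_docs.values()), normalize_embeddings=True)

question = "What is the total revenue per customer region?"
q_vec = model.encode([question], normalize_embeddings=True)[0]

# With normalized vectors, the dot product equals cosine similarity.
scores = doc_vecs @ q_vec
top_k = [names[i] for i in np.argsort(scores)[::-1][:2]]
print(top_k)  # the selected tables would then be passed to the SQL-writing step
```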
Large Schema Handling: