Simplifying AI Development: A Practical Guide to HeatWave GenAI’s RAG & Vector Store Features
This article builds upon the concepts introduced in my previous blog posts, HeatWave GenAI: Sentiment Analysis Made Easy-Peasy, HeatWave GenAI: Your AI-Powered Content Creation Partner, In-Database LLMs for Efficient Text Translation with HeatWave GenAI and HeatWave GenAI: Sentiment Analysis Made Easy-Peasy.
For a deeper understanding, also consider reading these articles.
This tutorial explores HeatWave GenAI, a cloud service that simplifies interacting with unstructured data using natural language. It combines large language models, vector stores, and SQL queries to enable tasks like content generation, chatbot, and retrieval-augmented generation (RAG). The focus is on RAG and how HeatWave GenAI’s architecture helps users gain insights from their data. By the end, you’ll learn how to implement these features effectively to unlock the full potential of generative AI for your applications.
HeatWave GenAI empowers you to interact with unstructured data seamlessly using natural language. Whether you’re performing content generation, summarization, sentiment analysis, or retrieval-augmented generation (RAG), HeatWave GenAI simplifies the process with its intuitive design and robust features.
Here’s why HeatWave GenAI stands out:
- Natural-Language Queries: Communicate with your data effortlessly by running natural-language searches using in-database or external large language models (LLMs).
- SQL Familiarity: The system leverages a SQL-based interface, making it accessible and easy to adopt for users already familiar with relational databases.
- Integrated and Optimized Pipeline: All the necessary components—vector store generation, RAG-based vector search, and chatbot capabilities—are seamlessly integrated, ensuring smooth performance and ease of use.
- End-to-End Generative AI Capabilities: HeatWave GenAI supports direct integration of generative AI into your applications, providing a complete pipeline for data retrieval, analysis, and interaction.
What is RAG?
RAG stands for Retrieval-Augmented Generation. It’s a technique that combines the power of Large Language Models (LLMs) with external knowledge bases. The primary goal of RAG is to enhance the quality, relevance, and accuracy of AI-generated responses by retrieving relevant information from these sources.
Essentially, it’s like giving an LLM access to a vast library, allowing it to provide more informative and contextually relevant answers.
How RAG Works
- Data Retrieval: The system retrieves relevant information from a connected knowledge base, vector store, or database using a query.
- Generation: The retrieved data is then used as context to guide the language model in generating a response.
- Output: The result is a response that blends the creative capabilities of the LLM with factual, targeted insights from the retrieved data.
Key Features of RAG
- Accuracy: Responses are grounded in real-world data, reducing errors and hallucinations typical of standalone LLMs.
- Relevance: Ensures the output directly addresses user queries by relying on contextually relevant information.
- Scalability: Can handle large datasets stored in vector databases for quick, efficient searches.
Some use Cases for RAG
- Customer Support: Enhancing chatbot responses with real-time company knowledge or product details.
- Content Creation: Generating articles or summaries based on specific datasets.
- Research: Assisting with document retrieval and summarization for academic or industry research.
- Personalization: Tailoring recommendations or messages based on customer data.
What is a Vector Store?
Vectors are numerical representations that capture the meaning, context, or characteristics of data, often used to represent text, images, or other complex data types. By storing these vectors, vector stores enable efficient similarity search, making it possible to find similar items based on their underlying semantic meaning.
Vector stores are crucial for modern AI applications like recommendation systems, natural language processing, and retrieval-augmented generation (RAG). They provide a powerful tool for understanding and analyzing complex data.
How a Vector Store Works
- Data Vectorization: Input data, such as text or images, is transformed into vector embeddings using a model like a large language model (LLM) or an image encoder.
- Storage: These embeddings are stored in the vector store, often alongside metadata about the original data.
- Search and Retrieval: When a query is made, it is also converted into a vector, and the store performs similarity searches to retrieve the most relevant results based on distance measures like cosine similarity or Euclidean distance.
Use Cases for Vector Stores
- Search and Retrieval: Powering AI-driven search engines that find relevant documents, images, or videos based on content rather than keywords.
- Recommendation Systems: Suggesting products, movies, or music by comparing user preferences to stored vectors.
- Chatbots and Virtual Assistants: Retrieving knowledge or context to answer user queries more accurately.
- Content Matching: Identifying similar or related content, such as plagiarism detection or duplicate image recognition.
What is HeatWave?
Oracle’s HeatWave is a fully-managed Oracle Cloud Infrastructure service. It offered different solutions for different workloads:
- HeatWave MySQL: built on MySQL, for your transactional needs.
- HeatWave Analytics: get real-time analytics on your data.
- HeatWave Lakehouse: query data in various formats in object storage
- HeatWave AutoML: automate the pipeline to build, train, and explain ML models
- HeatWave GenAI: integrated & automated GenAI with in-database LLMs
HeatWave GenAI (from version 9.0.1 onward) is an powerful integrated platform that combines automated in-database large language models (LLMs), vector stores, and scalable vector processing. This simplifies content generation, enabling businesses to harness generative AI without requiring AI expertise, data transfer, or extra costs.
The key benefits of such architecture are:
- Effortless LLM Integration: Simple SQL queries unlock AI power.
- Seamless Database Integration: No complex external connections.
- Simplified Architecture: Easy management and scalability.
- Scalable Performance: Handles large datasets and high traffic.
- Always Up-to-Date Data: In-database LLMs ensure data freshness.
HeatWave GenAI is available on major cloud platforms: Oracle Cloud Infrastructure (OCI), Amazon Web Services (AWS), and Microsoft Azure.
In this article I’m using HeatWave 9.1.0.
SQL > SELECT version();
+-------------+
| version() |
+-------------+
| 9.1.0-cloud |
+-------------+
I’ll leverage HeatWave Lakehouse and GenAI capabilities.
HeatWave Lakehouse
Allows to process data stored in OCI Object Storage. Data are retrieved from Object Storage, transformed into a memory-optimized format, and stored in the HeatWave persistence layer. This ensures efficient data processing, with the data loaded into the HeatWave cluster for real-time analytics and insights.
HeatWave Generative AI
Enables natural-language interactions with unstructured data, using SQL-based queries for content generation, summarization, and retrieval-augmented generation (RAG). This makes it easy to generate and extract valuable insights from your data with minimal complexity.
Preparation
The data sources that will augmented the LLM are 6 pdf files based on the following Wikipedia pages:
- https://en.wikipedia.org/wiki/Iris_versicolor
- https://species.wikimedia.org/wiki/Iris_versicolor
- https://en.wikipedia.org/wiki/Iris_virginica
- https://species.wikimedia.org/wiki/Iris_virginica
- https://en.wikipedia.org/wiki/Iris_setosa
- https://species.wikimedia.org/wiki/Iris_setosa
They were loaded into an OCI Object Storage bucket.
Load data into the HeatWave cluster, generate vector embedding & load them into a vector store table
If you’re loading a vector store table on HeatWave for the first time, call the following method to create a schema used for task management:
sql> SELECT mysql_task_management_ensure_schema();
RAG & Vector Store preparation
There are many ways to do this, the easiest is probably to use the sys.HEATWAVE_LOAD stored procedure with a Pre-Authenticated Requests (PAR).
An alternative is to use sys.vector_store_load. More info here.
In short, the PDF files (stored in the Object Storage bucket) will be uploaded to the HeatWave cluster, where they will be split into smaller segments, transformed into numerical representations (vector embeddings), and stored in a dedicated vector store table for efficient search and retrieval.
-- Load unstructured data (pdf) to HeatWave Lakehouse.
-- Ingesting Files Using Auto Parallel Load
SET @dl_tables = '[ { "db_name": "Iris_AI",
"tables": [
{"table_name": "IrisAIembeddings", "engine_attribute": { "dialect": {"format": "pdf", "language": "en"},
"file": [ {"par": "https://objectstorage.eu-frankfurt.../Iris_AI/o/"}
]}}] }]';
SET @options = JSON_OBJECT('mode', 'normal');
CALL sys.HEATWAVE_LOAD(CAST(@dl_tables AS JSON), @options);
-- Checks (count embeddings):
SELECT count(*) FROM Iris_AI.IrisAIembeddings;
The vector store table is: Iris_AI.IrisAIembeddings. Its structure is:
SQL >
SHOW CREATE TABLE Iris_AI.IrisAIembeddings\G
*************************** 1. row ***************************
Table: IrisAIembeddings
Create Table: CREATE TABLE `IrisAIembeddings` (
`document_name` varchar(1024) NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=VARLEN',
`metadata` json NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=VARLEN',
`document_id` int unsigned NOT NULL,
`segment_number` int unsigned NOT NULL,
`segment` varchar(1024) NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=VARLEN',
`segment_embedding` vector(384) NOT NULL COMMENT 'RAPID_COLUMN=ENCODING=VARLEN' /*!80021 ENGINE_ATTRIBUTE '{"model": "minilm"}' */,
PRIMARY KEY (`document_id`,`segment_number`)
) ENGINE=Lakehouse DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci SECONDARY_ENGINE=RAPID /*!80021 ENGINE_ATTRIBUTE='{"file": [{"par": "https://objectstorage.eu-frankfurt.../Iris_AI/o/"}], "dialect": {"format": "pdf", "language": "en", "is_strict_mode": false}}' */
Some data:
SELECT document_name, metadata, document_id, segment_number, segment FROM Iris_AI.IrisAIembeddings ORDER BY RAND() LIMIT 2\G
*************************** 1. row ***************************
document_name: https://objectstorage.eu-frankfurt.../Iris_AI/o/Iris_versicolor_Wikipedia.pdf
metadata: {"TITLE": "Iris versicolor - Wikipedia", "PUBLISHER": "Skia/PDF m128", "CREATIONDATE": " 9/14/2024 02:36:16 PM", "REVISIONDATE": " 9/14/2024 02:36:16 PM", "USERDEFINEDPROP": "Creator Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/128.0.0.0 Safari/537.36"}
document_id: 2
segment_number: 8
segment: The purple iris is the official flower of Kappa Pi International Honorary Art Fraternity.[13]
Iris versicolor 'Blue Flag'
Iris versicolor
Iris versicolor – botanical illustration in American Medicinal Plants, 1887
Saint-Prosper-de-Champlain, Quebec, Canada
Batiscan, Quebec floral emblem, Canada[14]
*************************** 2. row ***************************
document_name: https://objectstorage.eu-frankfurt.../Iris_versicolor_Wikispecies.pdf
metadata: {"TITLE": "Iris versicolor - Wikispecies", "PUBLISHER": "Skia/PDF m128", "CREATIONDATE": " 9/14/2024 02:39:44 PM", "REVISIONDATE": " 9/14/2024 02:39:44 PM", "USERDEFINEDPROP": "Creator Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/128.0.0.0 Safari/537.36"}
document_id: 3
segment_number: 6
segment: References: BRUMMITT, R.K. 2001. TDWG – World Geographical Scheme for Recording Plant Distributions, 2nd Edition (https://github.com/tdwg/wgsrpd/blob/master/109-488-1 -ED/2nd%20Edition/TDWG_geo2.pdf)
LINNAEUS, C. 1753. Species Plantarum. Tomus I: 39 (http://www.biodiversitylibrary.org/ite m/13829#page/51/mode/1up). Reference page.
Flora of North America Editorial Committee (2002). Flora of North America North of Mexico 26: 1-723. Oxford University Press, New York, Oxford.
We can find the PDF references:
SELECT DISTINCT document_name FROM Iris_AI.IrisAIembeddings;
+---------------------------------------------------------------------------------+
| document_name |
+---------------------------------------------------------------------------------+
| https://objectstorage.eu-frankfurt.../Iris_AI/o/Iris_versicolor_Wikispecies.pdf |
| https://objectstorage.eu-frankfurt.../Iris_AI/o/Iris_virginica_Wikispecies.pdf |
| https://objectstorage.eu-frankfurt.../Iris_AI/o/Iris_setosa_Wikispecies.pdf |
| https://objectstorage.eu-frankfurt.../Iris_AI/o/Iris_virginica_Wikipedia.pdf |
| https://objectstorage.eu-frankfurt.../Iris_AI/o/Iris_setosa_Wikipedia.pdf |
| https://objectstorage.eu-frankfurt.../Iris_AI/o/Iris_versicolor_Wikipedia.pdf |
+---------------------------------------------------------------------------------+
If you have ever wondered what a vector (stored in the VECTOR data type) might look like, you can use VECTOR_TO_STRING()
or FROM_VECTOR()…
SELECT VECTOR_DIM(segment_embedding), FROM_VECTOR(segment_embedding) FROM Iris_AI.IrisAIembeddings LIMIT 1\G
*************************** 1. row ***************************
VECTOR_DIM(segment_embedding): 384
FROM_VECTOR(segment_embedding): [8.00615e-02,2.11577e-02,1.61691e-02,4.23411e-02,-3.24994e-02,4.45356e-02,-9.23082e-02,-3.37680e-02,-2.03409e-02,6.48912e-02,4.07830e-02,1.68436e-02,-3.48651e-02,5.55622e-02,-3.63239e-02,6.86629e-02,4.45186e-02,3.82741e-02,1.09397e-02,1.40209e-03,2.67049e-02,2.23335e-03,5.67095e-02,-2.48191e-02,-5.93457e-02,5.69686e-02,1.88271e-03,7.52686e-02,3.77259e-02,-2.84845e-02,-3.70396e-02,1.23335e-01,6.92697e-02,-4.33865e-02,6.80390e-03,-1.47766e-02,2.61221e-02,4.78904e-02,5.68956e-02,7.15472e-02,-5.28972e-02,-1.00536e-02,-1.24508e-02,-5.44387e-02,7.01239e-03,2.39283e-03,-4.52311e-02,-1.01765e-02,1.66316e-02,-5.89363e-03,-7.39317e-02,-4.40087e-02,-4.70511e-02,-7.55958e-04,7.00977e-02,1.76238e-02,-3.46344e-02,3.42340e-02,3.16051e-03,2.01610e-02,6.16689e-02,-3.29546e-02,-1.77070e-02,-3.53866e-02,-4.38236e-02,2.54502e-02,-9.36790e-02,-9.03236e-03,8.05398e-03,-1.30124e-01,-2.29885e-02,4.14870e-02,7.31113e-02,4.12350e-02,-5.03330e-02,1.53553e-03,1.31312e-02,1.68512e-02,-7.28380e-02,-2.29530e-02,-5.44221e-02,-2.94721e-02,1.01206e-01,-1.42966e-02,3.71721e-02,1.80400e-02,-2.11835e-02,2.84860e-02,-3.72609e-02,1.53701e-02,-4.35036e-02,-5.51637e-02,1.50130e-02,-3.49194e-02,-1.39024e-02,7.81750e-02,6.52888e-02,-8.32752e-02,-1.03520e-02,-1.13159e-02,-6.32495e-02,-1.73173e-02,-3.33632e-02,-1.90669e-02,-3.64169e-02,2.54803e-03,-8.01785e-02,-1.22022e-01,-2.22540e-02,-2.70522e-03,3.12901e-02,3.36771e-02,-6.77026e-02,5.30156e-02,2.64821e-02,-7.79997e-03,-5.18856e-02,-1.98439e-02,1.09730e-01,4.17542e-02,1.41583e-02,-8.88374e-03,1.00671e-02,9.43218e-03,3.44302e-03,5.37422e-02,-3.77832e-02,-1.26111e-02,2.97282e-02,-2.65987e-02,-2.07729e-02,-1.04494e-02,-8.58481e-03,-7.74056e-02,-4.44796e-02,-6.00447e-02,2.15521e-02,9.89563e-03,-5.55751e-03,3.89642e-02,-6.50692e-02,-7.91098e-02,-7.90359e-02,-3.78152e-03,-2.96150e-02,5.58467e-02,4.15553e-02,1.22073e-01,2.91632e-02,1.20790e-01,-1.12918e-03,1.55323e-02,-2.26794e-03,2.69730e-02,2.47722e-02,-1.58608e-01,-6.48694e-03,7.97874e-02,1.30401e-01,-7.63800e-02,5.34007e-02,1.20450e-01,9.73293e-03,3.36432e-02,-4.79591e-02,-8.73279e-02,8.55590e-03,5.65378e-02,4.04146e-02,2.00272e-02,-1.53599e-02,3.70622e-02,5.86816e-02,1.57900e-02,-3.75850e-02,-6.28287e-02,2.97302e-02,-6.33792e-02,-1.58955e-01,4.47092e-02,-2.49065e-03,4.68349e-02,6.57920e-03,4.96956e-02,-5.24389e-04,1.12109e-02,-5.98008e-02,8.31092e-03,4.62907e-04,7.37732e-03,-5.63335e-02,-3.69826e-02,6.56116e-02,4.68743e-02,-3.02025e-02,-1.07677e-02,-3.90057e-02,3.40080e-02,1.53216e-02,4.84835e-02,-1.91672e-02,-2.53384e-03,4.27504e-02,-1.87664e-02,6.07610e-02,4.13199e-02,4.74177e-03,-8.20841e-02,-2.18529e-02,-8.70064e-04,-3.18450e-02,9.38491e-03,-7.51281e-02,6.61539e-02,2.76727e-03,9.03535e-03,5.58634e-02,-1.13491e-01,1.05393e-01,7.08349e-02,-3.90654e-02,-9.73421e-02,-2.31931e-02,2.50503e-34,-1.33356e-03,-8.12520e-02,-3.57421e-02,6.36865e-03,6.98303e-02,-3.79885e-02,1.89998e-03,4.15160e-02,2.64551e-02,-7.81740e-02,-6.01936e-02,8.89259e-02,6.85453e-02,-4.35710e-02,6.15102e-03,-4.12384e-02,-2.30391e-02,8.04307e-02,-4.20083e-02,4.30390e-02,-6.42659e-02,-4.87679e-02,-2.36370e-02,-6.97454e-02,-3.80549e-02,8.03606e-03,4.82721e-02,-8.53376e-02,-1.30280e-01,-4.02262e-02,-1.85272e-02,-7.64051e-02,-6.02972e-02,-8.30899e-03,-3.24592e-02,-6.51084e-02,-3.45417e-02,-5.57729e-02,-2.41541e-02,9.03948e-02,-6.26660e-03,-2.82211e-02,7.14078e-02,-7.29852e-04,-9.84422e-02,-4.61796e-03,-8.73099e-02,6.90387e-02,-3.40576e-02,-4.15616e-02,1.83394e-02,1.00965e-02,-1.69803e-02,-6.09804e-02,-8.52834e-03,-7.98361e-02,3.18808e-04,4.31007e-02,4.49218e-02,3.58964e-02,-2.11669e-02,-2.59622e-02,-2.92579e-02,5.19722e-02,-2.18184e-02,1.11915e-02,7.29237e-02,-1.75118e-02,-1.87051e-02,-5.20259e-02,7.49990e-02,-3.41171e-02,-6.95442e-03,1.92275e-02,5.38307e-02,-1.38445e-02,1.54541e-03,-5.89235e-02,4.19111e-02,8.50421e-02,-1.25453e-02,2.32194e-02,3.06071e-02,-1.00385e-02,9.18060e-02,-1.00535e-02,-1.32639e-01,-5.87926e-03,5.28075e-02,2.70706e-02,-2.62010e-02,4.78546e-02,-5.04491e-02,5.85774e-02,8.84199e-02,8.55492e-32,3.13937e-02,-7.58448e-03,-1.08424e-02,5.32128e-02,7.68471e-03,-4.37979e-02,-8.93071e-02,5.37401e-02,-9.28643e-03,-1.31074e-01,-3.15308e-02,1.66151e-04,-2.33636e-02,6.18941e-02,4.06328e-04,-8.38102e-02,4.22972e-02,6.47196e-02,-3.49144e-02,-5.97541e-03,2.07894e-02,2.67512e-02,5.52530e-02,-8.61502e-02,-6.22585e-02,1.19297e-01,-3.34995e-02,-6.14199e-03,4.33000e-02,3.92206e-02,5.98244e-02,-1.97159e-02,5.18834e-02,1.14972e-02,1.28108e-02,2.56998e-02,-4.65825e-02,1.79345e-02,-5.45621e-02,-3.79862e-02,2.39628e-02,-5.79710e-02,2.82440e-02,1.96145e-02,9.74760e-03,-4.65814e-02,1.20515e-02,-5.32089e-02,9.85598e-02,-6.20520e-03,-4.63758e-02,-6.89194e-03,-1.66197e-02,-1.36524e-02,-9.35889e-02,3.43503e-02,3.12809e-02,-1.08981e-01,4.64091e-02,2.83066e-02,8.21929e-02,4.31020e-02,2.17023e-02,9.03776e-02]
VECTOR_DIM()
gets the length of a vector, that is, the number of entries it contains.
More vector functions here.
Update Vector Store
You can update your vector store by either drop the table and recreate it or by using the refresh_external_tables clause.
-- Updating the Vector Store using Auto Parallel Load
SET @dl_tables = '[
{ "db_name": "DBName", "tables": [ { "table_name": "VectorStoreTableName" }] }
]';
SET @options = JSON_OBJECT('mode', 'normal', 'refresh_external_tables', true);
CALL sys.HEATWAVE_LOAD(CAST(@dl_tables AS JSON), @options);
Running Retrieval-Augmented Generation
The ML_RAG routine performs retrieval-augmented generation (RAG), but the LLM that you will use must be loaded.
You can get the list of the supported LLMs within HeatWave GenAI by querying the ML_SUPPORTED_LLMS table in sys schema:
-- List of supported LLMs
SELECT * FROM sys.ML_SUPPORTED_LLMS;
+--------------------------------------+------------+
| model_name | model_type |
+--------------------------------------+------------+
| llama2-7b-v1 | generation |
| mistral-7b-instruct-v1 | generation |
| llama3-8b-instruct-v1 | generation |
| cohere.command-text-v14 | generation |
| cohere.command-light-text-v14 | generation |
| meta.llama2-70b-chat-v1 | generation |
| meta.llama3-70b-instruct-v1:0 | generation |
| cohere.command-r-plus | generation |
| cohere.command-r-16k | generation |
| meta.llama-3-70b-instruct | generation |
| qwen2-7b-instruct-v1 | generation |
| all_minilm_l12_v2 | embedding |
| multilingual-e5-small | embedding |
| cohere.embed-english-light-v3.0 | embedding |
| cohere.embed-multilingual-v3.0 | embedding |
| cohere.embed-multilingual-light-v3.0 | embedding |
| cohere.embed-english-v3.0 | embedding |
+--------------------------------------+------------+
At the time of writing – December 2024 – the HeatWave in-database LLMs are:
- mistral-7b-instruct-v1
- llama3-8b-instruct-v1
- llama2-7b-v1 is deprecated
the HeatWave in-Database Embedding models are:
- all_minilm_l12_v2
- multilingual-e5-small
The other models are from OCI Generative AI Service. They can also be used in the HeatWave workflow however their use will incur additional costs.
The comprehensive list of the languages, embedding models, and large language models (LLMs) that HeatWave GenAI supports is available here.
In order to run your RAG, you should load the model, specify some options like the vector store table, the distance metric, the language, and also options like temperature, top_k, top_p, repeat_penalty, etc…), then define your natural language query and finally run the sys.ML_RAG stored procedure:
-- Load the model: ex. Llama 3 8B
CALL sys.ML_MODEL_LOAD("llama3-8b-instruct-v1", NULL);
-- Specify the table for retrieving the vector embeddings, / ['distance_metric', {'COSINE'|'DOT'|'EUCLIDEAN'}]
SET @irisAI_model_info = JSON_OBJECT("vector_store", JSON_ARRAY("Iris_AI.IrisAIembeddings"), "distance_metric",
"COSINE", "model_options", JSON_OBJECT("model_id", "llama3-8b-instruct-v1", "language", "en", "temperature",0));
-- Define your natural language query
SET @query="Describe the appearance of the Iris versicolor flower";
-- Retrieve the prompt and print the output:
CALL sys.ML_RAG(@query, @output, @irisAI_model_info);
HeatWave GenAI currently supports 3 distance metrics: COSINE, DOT & EUCLIDEAN. Other options are available here.
The result of the query is stored in JSON format in the variable @output (in this example). This result includes the answer and also citations referring to segments from the loaded documents (my PDF files).
In MySQL, when dealing with JSON documents, it is usually important to know the following functions: JSON_PRETTY, JSON_UNQUOTE & JSON_EXTRACT
-- Print the outputs
SELECT JSON_PRETTY(@output)\G
*************************** 1. row ***************************
JSON_PRETTY(@output): {
"text": "\n\nAccording to the provided context, the well-developed blue flower of Iris versicolor has 6 petals and sepals that are spread out nearly flat. The longer sepals are hairless and have a greenish-yellow blotch at their base.",
"citations": [
{
"segment": "Iris versicolor is a flowering herbaceous perennial plant, growing 10–80 cm (4–31 in) high. It tends to form large clumps from thick, creeping rhizomes. The unwinged, erect stems generally have basal leaves that are more than 1 cm (1?2 in) wide. Leaves are folded on the midribs so that they form an overlapping flat fan. The well developed blue flower has 6 petals and sepals spread out nearly flat and have two forms. The longer sepals are hairless and have a greenish-yellow blotch at their base.",
"distance": 0.1099172830581665,
"document_name": "https://objectstorage.eu-frankfurt.../Iris_versicolor_Wikipedia.pdf"
},
{
"segment": "Iris versicolor\nIris versicolor\nTaxonavigation: Asparagales \nClassification System: APG IV Superregnum: Eukaryota Regnum: Plantae Cladus: Angiosperms Cladus: Monocots Ordo: Asparagales\nFamilia: Iridaceae Subfamilia: Iridoideae Tribus: Irideae Genus: Iris Subgenus: I. subg. Limniris Sectio: I. sect. Limniris Series: Iris ser. Laevigatae Species: Iris versicolor\nIris versicolor L., Sp. Pl. 1: 39 (1753).",
"distance": 0.2635544538497925,
"document_name": "https://objectstorage.eu-frankfurt.../Iris_versicolor_Wikispecies.pdf"
},
{
"segment": "Homotypic Iris versicolor f. virginica (L.) VOSS, Vilm. Blumengaertn., ed. 3. 1: 978. 1895. Iris versicolor var. virginica (L.) BAKER, J. Linn. Soc., Bot. 16: 141 (1877). Xiphion virginicum (L.) ALEF., Bot. Zeitung (Berlin) 21: 297 (1863).\nHeterotypic Iris albilinea ALEXANDER SMALL & ALEXANDER, Contr. New York Bot. Gard. 327: 349 (1931).\nTaxonavigation\nName\nSynonyms",
"distance": 0.27319884300231934,
"document_name": "https://objectstorage.eu-frankfurt.../Iris_virginica_Wikispecies.pdf"
}
],
"vector_store": [
"Iris_AI.IrisAIembeddings"
]
}
To display only the answer:
-- Get the answer from the generated JSON result.
SELECT JSON_UNQUOTE(JSON_EXTRACT(@output, '$.text')) AS answer\G
*************************** 1. row ***************************
answer:
According to the provided context, the well-developed blue flower of Iris versicolor has 6 petals and sepals that are spread out nearly flat. The longer sepals are hairless and have a greenish-yellow blotch at their base.
To display only the citations:
-- You can also get the citations referring to segments from the loaded documents.
SELECT JSON_UNQUOTE(JSON_EXTRACT(@output, '$.citations')) AS citations\G
*************************** 1. row ***************************
citations: [{"segment": "Iris versicolor is a flowering herbaceous perennial plant, growing 10–80 cm (4–31 in) high. It tends to form large clumps from thick, creeping rhizomes. The unwinged, erect stems generally have basal leaves that are more than 1 cm (1?2 in) wide. Leaves are folded on the midribs so that they form an overlapping flat fan. The well developed blue flower has 6 petals and sepals spread out nearly flat and have two forms. The longer sepals are hairless and have a greenish-yellow blotch at their base.", "distance": 0.1099172830581665, "document_name": "https://objectstorage.eu-frankfurt.../Iris_versicolor_Wikipedia.pdf"}, {"segment": "Iris versicolor\nIris versicolor\nTaxonavigation: Asparagales \nClassification System: APG IV Superregnum: Eukaryota Regnum: Plantae Cladus: Angiosperms Cladus: Monocots Ordo: Asparagales\nFamilia: Iridaceae Subfamilia: Iridoideae Tribus: Irideae Genus: Iris Subgenus: I. subg. Limniris Sectio: I. sect. Limniris Series: Iris ser. Laevigatae Species: Iris versicolor\nIris versicolor L., Sp. Pl. 1: 39 (1753).", "distance": 0.2635544538497925, "document_name": "https://objectstorage.eu-frankfurt.../Iris_versicolor_Wikispecies.pdf"}, {"segment": "Homotypic Iris versicolor f. virginica (L.) VOSS, Vilm. Blumengaertn., ed. 3. 1: 978. 1895. Iris versicolor var. virginica (L.) BAKER, J. Linn. Soc., Bot. 16: 141 (1877). Xiphion virginicum (L.) ALEF., Bot. Zeitung (Berlin) 21: 297 (1863).\nHeterotypic Iris albilinea ALEXANDER SMALL & ALEXANDER, Contr. New York Bot. Gard. 327: 349 (1931).\nTaxonavigation\nName\nSynonyms", "distance": 0.27319884300231934, "document_name": "https://objectstorage.eu-frankfurt.../Iris_virginica_Wikispecies.pdf"}]
Et voilà!
As simple as that.
Create your chatbot with HeatWave Chat
HeatWave Chat is an advanced conversational agent powered by LLMs. It utilizes RAG and a vector store to deliver accurate, data-driven insights. By simulating human-like conversations, it provides users with a dynamic and informative experience, offering deep knowledge and meaningful insights in a conversational format.
HeatWave Chat is designed to understand user inputs and respond naturally. It enhances text generation by leveraging chat history for follow-up questions and utilizes vector search to access knowledge stored in its built-in vector store. With all communication and processing securely managed within the HeatWave service, HeatWave Chat delivers fast and reliable responses.
In order to run your chat, you should load the model, specify some chat options like the vector store table, the distance metric, the language, and also options like temperature, top_k, top_p, repeat_penalty, etc…), then define your natural language query and finally run the sys.HEATWAVE_CHAT stored procedure:
-- Load the model: ex. Mistral 7B
CALL sys.ML_MODEL_LOAD("mistral-7b-instruct-v1", NULL);
-- Setup HeatWave Chat options
SET @chat_options = JSON_OBJECT("tables", JSON_ARRAY(JSON_OBJECT("schema_name", "`Iris_AI`",
"table_name","`IrisAIembeddings`")), "model_options", JSON_OBJECT("model_id", "mistral-7b-instruct-v1", "language",
"en", "temperature", 0));
-- Query in natural language (en, fr, de, es, hi...)
SET @question = "Where is Iris setosa native to?";
-- Add your query to HeatWave Chat
CALL sys.HEATWAVE_CHAT(@question);
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| response |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Iris setosa is native to a circum-arctic distribution ranging from coastal Aleutian islands, Alaska, Maine in USA, within Canada (including British Columbia, Newfoundland, Quebec, and the Yukon), within Russia (including near to the lower Lena River, Siberia), northeastern Asia, China (including Manchuria and east Jilin), Korea and southwards to Japan (including Honshu and Hokkaido). |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-- Another query in natural language (en, fr, de, es, hi...)
SET @question = "How is Iris virginica used in Seminole culture?";
-- Add your query to HeatWave Chat
CALL sys.HEATWAVE_CHAT(@question);
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| response |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Answer: Iris virginica has a cultural significance in Seminole culture as it is one of the Iris species used to treat "shock following alligator-bite". However, there is limited information available on how Iris virginica is specifically used in Seminole culture for this purpose. |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
The HEATWAVE_CHAT routine leverages the @chat_options variable as a dynamic parameter store. This variable is populated with additional context and configuration settings, which are subsequently utilized to refine the response generation process.
SELECT JSON_PRETTY(@chat_options)\G
*************************** 1. row ***************************
JSON_PRETTY(@chat_options): {
"tables": [
{
"table_name": "`IrisAIembeddings`",
"schema_name": "`Iris_AI`"
}
],
"response": "\n\nAccording to the provided context, it is mentioned that Iris virginica may be one of the Iris species used by the Seminole tribe to treat \"shock following alligator-bite\". This suggests that the plant has been traditionally used by the Seminole people for medicinal purposes in treating a specific condition related to alligator bites.",
"documents": [
{
"id": "https://objectstorage.eu-frankfurt.../Iris_virginica_Wikipedia.pdf",
"title": "Iris_virginica_Wikipedia.pdf",
"segment": "p://herb.umd.umich.edu/herb/search.pl?searchstring=Iris+virginica) 12. Hamel, Paul B. and Mary U. Chiltoskey 1975 Cherokee Plants and Their Uses -- A 400 Year History. Sylva, N.C. Herald Publishing Co. (p. 41) 13. \"USDA Plants Database\" (https://plants.usda.gov/home/plantProfile?symbol=IRVI).\nplants.usda.gov.\nKemper Center for Home Gardening.org: Iris virginica (Virginia iris) (http://www.missouribota nicalgarden.org/PlantFinder/PlantFinderDetails.aspx?kempercode=y390)",
"distance": 0.2979528307914734
},
{
"id": "https://objectstorage.eu-frankfurt.../Iris_virginica_Wikipedia.pdf",
"title": "Iris_virginica_Wikipedia.pdf",
"segment": "It may be one of the Iris species used by the Seminole to treat \"shock following alligator-bite\".[13]\nIris virginica is one of three iris species in Ronald Fisher's Iris flower data set.",
"distance": 0.3149242401123047
},
{
"id": "https://objectstorage.eu-frankfurt.../Iris_virginica_Wikispecies.pdf",
"title": "Iris_virginica_Wikispecies.pdf",
"segment": "Retrieved from \"https://species.wikimedia.org/w/index.php?title=Iris_virginica&oldid=8735713\"\nThis page was last edited on 5 April 2022, at 21:56.\nText is available under the Creative Commons Attribution-ShareAlike License; additional terms may apply. By using this site, you agree to the Terms of Use and Privacy Policy.\nVernacular names",
"distance": 0.3498268127441406
}
],
"chat_history": [
{
"user_message": "Where is Iris setosa native to?",
"chat_query_id": "d2dd2c8e-b3e8-11ef-bdb3-020017081917",
"chat_bot_message": " Iris setosa is native to a circum-arctic distribution ranging from coastal Aleutian islands, Alaska, Maine in USA, within Canada (including British Columbia, Newfoundland, Quebec, and the Yukon), within Russia (including near to the lower Lena River, Siberia), northeastern Asia, China (including Manchuria and east Jilin), Korea and southwards to Japan (including Honshu and Hokkaido)."
},
{
"user_message": "How is Iris virginica used in Seminole culture?",
"chat_query_id": "7d4ea547-b3e9-11ef-bdb3-020017081917",
"chat_bot_message": " Answer: Iris virginica has a cultural significance in Seminole culture as it is one of the Iris species used to treat \"shock following alligator-bite\". However, there is limited information available on how Iris virginica is specifically used in Seminole culture for this purpose."
}
],
"model_options": {
"language": "en",
"model_id": "mistral-7b-instruct-v1",
"temperature": 0
},
"request_completed": true
}
The list of all the parameters that you can set in the @chat_options variable is available here and here.
You can modify chat parameters using the @chat_options
variable, without resetting the chat session:
-- Modifying a chat parameter, here the model, without resetting the chat session:
-- Load the new model if needed
CALL sys.ML_MODEL_LOAD("llama3-8b-instruct-v1", NULL);
SET @chat_options = JSON_SET(@chat_options, '$.model_options.model_id', "llama3-8b-instruct-v1");
or by resetting the chat session:
-- Modifying a chat parameter, here the model, and resetting the chat session:
-- Load the new model if needed
CALL sys.ML_MODEL_LOAD("llama3-8b-instruct-v1", NULL);
SET @chat_options = JSON_OBJECT("model_options", JSON_OBJECT("model_id", "llama3-8b-instruct-v1", "language", "en", "temperature", 0));
More examples here.
As you have seen, HeatWave GenAI simplifies LLM integration with SQL queries, streamlines database connections, and offers a user-friendly architecture that lets you quickly develop applications.
That’s all for today—it’s time to wrap things up! 🙂
Peroration
HeatWave GenAI offers a seamless and powerful platform for integrating generative AI capabilities into your data workflows.
By combining advanced features like retrieval-augmented generation (RAG), an integrated vector store, and user-friendly SQL-based interfaces, HeatWave GenAI empowers users to unlock insights from unstructured data with minimal complexity.
Its efficient and secure in-database architecture ensures real-time performance, scalability, and always up-to-date data access.
Whether you aim to streamline content generation, enhance search and retrieval tasks, or build interactive chatbots, HeatWave GenAI provides all the tools you need within a single ecosystem. By leveraging the tutorials and techniques shared in this guide, you can fully harness the capabilities of HeatWave GenAI to transform how you interact with data and elevate your applications to new heights.
Key Features, Unmatched Benefits
Here’s what sets HeatWave GenAI apart:
- Effortless LLM Integration: Harness the power of large language models with simple SQL queries.
- Seamless Database Integration: No need for intricate external connections, simplifying your workflow.
- Simplified Architecture: Easy to manage and scale, ensuring a smooth user experience.
- Always Up-to-Date Data: In-database LLMs guarantee data freshness for accurate results.
Stay tuned for more insights!
Watch my videos on my YouTube channel and subscribe.
Thanks for using HeatWave & MySQL!