{"id":7566,"date":"2025-04-08T08:50:32","date_gmt":"2025-04-08T07:50:32","guid":{"rendered":"https:\/\/dasini.net\/blog\/?p=7566"},"modified":"2025-04-15T09:28:13","modified_gmt":"2025-04-15T08:28:13","slug":"build-an-ai-powered-search-engine-with-heatwave-genai-part-2","status":"publish","type":"post","link":"https:\/\/dasini.net\/blog\/2025\/04\/08\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\/","title":{"rendered":"Build an AI-Powered Search Engine with HeatWave GenAI (part 2)"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">In <a href=\"https:\/\/dasini.net\/blog\/2025\/03\/13\/build-an-ai-powered-search-engine-with-heatwave-genai-part-1\/\" target=\"_blank\" rel=\"noopener\" title=\"Build an AI-Powered Search Engine with HeatWave GenAI (part 1)\">Build an AI-Powered Search Engine with HeatWave GenAI (part 1)<\/a>, we explored how to build an AI-powered search engine using HeatWave GenAI. We highlighted the advantages of AI-driven semantic search over traditional SQL-based methods and provided a detailed guide on generating embeddings and conducting similarity searches. These techniques enhance the retrieval of relevant articles, improving the user\u2019s ability to find information efficiently.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In this second part, we will explore how to enhance the relevance of our answers using <strong>reranking techniques<\/strong>. Next, we will further refine our results by instructing the model to generate embeddings based on article summaries. 
All these steps will be performed within HeatWave, leveraging its capability to write <strong><a href=\"https:\/\/dev.mysql.com\/doc\/refman\/9.2\/en\/stored-routines-js.html\" target=\"_blank\" rel=\"noopener\" title=\"Stored Programs in JavaScript\">Stored Programs in JavaScript<\/a><\/strong>.<\/p>\n\n\n\n<div style=\"height:20px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">I&rsquo;m using HeatWave 9.2.1:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">SELECT version();\n+-------------+\n| version()   |\n+-------------+\n| 9.2.1-cloud |\n+-------------+<\/code><\/pre>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\"><a href=\"https:\/\/dasini.net\/blog\/2025\/03\/13\/build-an-ai-powered-search-engine-with-heatwave-genai-part-1\/\" target=\"_blank\" rel=\"noopener\" title=\"Build an AI-Powered Search Engine with HeatWave GenAI (part 1)\">In the previous episode<\/a><\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">If you remember, in <a href=\"https:\/\/dasini.net\/blog\/2025\/03\/13\/build-an-ai-powered-search-engine-with-heatwave-genai-part-1#example3.1\" target=\"_blank\" rel=\"noopener\" title=\"-- Ex 3.1 Similarity search on title &amp; excerpt (post_title_embedding &amp; post_excerpt_embedding)\">example 3.1<\/a>, we saw how to run a similarity search on title &amp; excerpt (post_title_embedding &amp; post_excerpt_embedding), using an elegant CTE:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">SQL &gt;\n-- Ex 3.1 Similarity search on title &amp; excerpt (post_title_embedding &amp; post_excerpt_embedding)\n\nWITH distances AS (\n    SELECT\n        ID, \n        post_title,\n        post_excerpt,\n        (\n          DISTANCE(post_title_embedding, @searchItemEmbedding, 'COSINE') + \n          DISTANCE(post_excerpt_embedding, @searchItemEmbedding, 'COSINE')\n       
 ) \/ 2 AS avg_distance\n    FROM WP_embeddings.wp_posts_embeddings_minilm\n)\nSELECT *\nFROM distances\nORDER BY avg_distance\nLIMIT 5\\G\n*************************** 1. row ***************************\n          ID: 1234\n  post_title: HeatWave GenAI: Your AI-Powered Content Creation Partner\npost_excerpt: Generative artificial intelligence (GenAI) is reshaping the content creation landscape. By training on vast datasets, these \"intelligent\" systems can produce new, human-quality content across a multitude of domains.\n\nOracle's HeatWave GenAI (starting with version 9.0.1) is at the forefront of this revolution, offering an integrated platform that combines in-database large language models (LLMs), vector stores, and scale-out vector processing to streamline content generation.\nThis article explores how HeatWave GenAI is empowering businesses to produce high-quality content rapidly and effectively, making it an indispensable tool for industries demanding speed, accuracy, and security.\navg_distance: 0.5131600499153137\n...<\/code><\/pre>\n\n\n\n<div style=\"height:20px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">This result can be further improved using reranking techniques. <br>Reranking involves reordering or refining an initial set of retrieved documents to enhance their relevance to a user\u2019s query. 
This step is essential for optimizing search quality, often leading to a significant boost in the relevance of the retrieved information.<\/p>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Reranking<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Here we set up the same context (i.e., embedding model, natural language query encoding, and vector similarity operations) as in <a href=\"https:\/\/dasini.net\/blog\/2025\/03\/13\/build-an-ai-powered-search-engine-with-heatwave-genai-part-1#encodeQuery\" target=\"_blank\" rel=\"noopener\" title=\"Build an AI-Powered Search Engine with HeatWave GenAI (part 1)\">part 1<\/a>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">SQL &gt; \n-- Set variables\nSET @embeddOptions = '{\"model_id\": \"minilm\"}';\nSET @searchItem = \"Generative artificial intelligence\";\n\n-- Encode the query using the embedding model\nSELECT sys.ML_EMBED_ROW(@searchItem, @embeddOptions) into @searchItemEmbedding;\n<\/code><\/pre>\n\n\n\n<div style=\"height:20px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">The retrieved results are now re-sorted using weighted title and excerpt distances:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">-- Ex 3.2 Similarity search on title &amp; excerpt (post_title_embedding &amp; post_excerpt_embedding) with weights\n\nWITH initial_results AS (\n    SELECT\n        ID, \n        post_title,\n        post_excerpt,\n        DISTANCE(post_title_embedding, @searchItemEmbedding, 'COSINE') AS title_distance, \n        DISTANCE(post_excerpt_embedding, @searchItemEmbedding, 'COSINE') AS excerpt_distance, \n    guid \n    FROM WP_embeddings.wp_posts_embeddings_minilm \n    ORDER BY title_distance + excerpt_distance  -- Simple combination\n    LIMIT 15 -- Retrieve a larger initial set\n),\nreranked_results AS (\n    SELECT\n        
ID,\n        post_title,\n        post_excerpt,\n        (0.3 * title_distance + 0.7 * excerpt_distance) AS combined_distance,   -- Weighted combination\n    guid \n    FROM initial_results\n)\nSELECT post_title, post_excerpt, combined_distance, guid\nFROM reranked_results\nORDER BY combined_distance ASC\nLIMIT 5\\G\n\n*************************** 1. row ***************************\n       post_title: HeatWave GenAI: Sentiment Analysis Made Easy-Peasy\n     post_excerpt: This new AI tech, called generative AI (or GenAI), can dive deep into what people are saying and tell us if they\u2019re feeling positive, negative, or neutral.\nLet\u2019s see how HeatWave GenAI, can help you to enhance your understanding of customer sentiment, improve decision-making, and drive business success.\ncombined_distance: 0.49683985114097595\n             guid: https:\/\/dasini.net\/blog\/?p=3456\n*************************** 2. row ***************************\n       post_title: HeatWave GenAI: Your AI-Powered Content Creation Partner\n     post_excerpt: Generative artificial intelligence (GenAI) is reshaping the content creation landscape. By training on vast datasets, these \"intelligent\" systems can produce new, human-quality content across a multitude of domains.\n\nOracle's HeatWave GenAI (starting with version 9.0.1) is at the forefront of this revolution, offering an integrated platform that combines in-database large language models (LLMs), vector stores, and scale-out vector processing to streamline content generation.\nThis article explores how HeatWave GenAI is empowering businesses to produce high-quality content rapidly and effectively, making it an indispensable tool for industries demanding speed, accuracy, and security.\ncombined_distance: 0.4994780898094177\n             guid: https:\/\/dasini.net\/blog\/?p=1234\n*************************** 3. 
row ***************************\n       post_title: Simplifying AI Development: A Practical Guide to HeatWave GenAI\u2019s RAG &amp;amp; Vector Store Features\n     post_excerpt: This tutorial explores HeatWave GenAI, a cloud service that simplifies interacting with unstructured data using natural language. It combines large language models, vector stores, and SQL queries to enable tasks like content generation, chatbot, and retrieval-augmented generation (RAG). The focus is on RAG and how HeatWave GenAI\u2019s architecture helps users gain insights from their data.\ncombined_distance: 0.6582363367080688\n             guid: https:\/\/dasini.net\/blog\/?p=2345\n*************************** 4. row ***************************\n       post_title: Webinar - Apprentissage automatique avec MySQL HeatWave\n     post_excerpt: HeatWave Machine Learning (ML) inclut tout ce dont les utilisateurs ont besoin pour cr\u00e9er, former, d\u00e9ployer et expliquer des mod\u00e8les d\u2019apprentissage automatique dans MySQL HeatWave, sans co\u00fbt suppl\u00e9mentaire.\n\nDans ce webinaire vous apprendrez...\ncombined_distance: 0.694593733549118\n             guid: https:\/\/dasini.net\/blog\/?p=6789\n*************************** 5. row ***************************\n       post_title: Building an Interactive LLM Chatbot with  HeatWave Using Python\n     post_excerpt: AI-powered applications require robust and scalable database solutions to manage and process large amounts of data efficiently. 
HeatWave is an excellent choice for such applications, providing high-performance OLTP, analytics, machine learning and generative artificial intelligence capabilities.\n\nIn this article, we will explore a Python 3 script that connects to an HeatWave instance and enables users to interact with different large language models (LLMs) dynamically.\ncombined_distance: 0.7135995388031006\n             guid: https:\/\/dasini.net\/blog\/?p=5678<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">In the example above, the excerpt is given more than twice the weight of the title (0.7 vs 0.3), based on the assumption that the excerpt holds more relevant information in this context. <br>Depending on your use case, you may want to fine-tune these weights, as adjusting them can significantly improve the quality of search results.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Now, let\u2019s explore how we can further refine our results by leveraging article summaries along with JavaScript-based stored procedures and functions.<\/p>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"js_sp_ai\">A JavaScript, stored procedure &amp; AI story<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">When searching through articles, relying solely on the title and excerpt may not yield the most relevant results. <br>What if we used a summary instead? <br>This approach can strike an excellent balance between relevance and implementation simplicity. With HeatWave GenAI, the entire workflow can be handled directly within the database using SQL and JavaScript-based stored procedures.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In WordPress, article content is stored as HTML in the <code><em>post_content<\/em><\/code> column. 
To make it suitable for processing by a large language model (LLM), this content must first be sanitized \u2014 that is, all HTML tags need to be removed, as they are not meaningful to the LLM.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Table preparation<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">As a preparation step, I added a new column (<em><code>post_content_text longtext NOT NULL<\/code><\/em>) to the <em><code>wp_posts_embeddings_minilm<\/code><\/em> table that we have used in <a href=\"https:\/\/dasini.net\/blog\/2025\/03\/13\/build-an-ai-powered-search-engine-with-heatwave-genai-part-1\/\" target=\"_blank\" rel=\"noopener\" title=\"Build an AI-Powered Search Engine with HeatWave GenAI (part 1)\">part 1<\/a>.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">ALTER TABLE wp_posts_embeddings_minilm ADD COLUMN post_content_text longtext NOT NULL;<\/code><\/pre>\n\n\n\n<div style=\"height:20px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">The structure of the table is now:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">CREATE TABLE `wp_posts_embeddings_minilm` (\n  `ID` bigint unsigned NOT NULL AUTO_INCREMENT,\n  `post_title` text NOT NULL,\n  `post_excerpt` text NOT NULL,\n  `guid` varchar(255) NOT NULL DEFAULT '',\n  `post_title_embedding` vector(2048) NOT NULL COMMENT 'GENAI_OPTIONS=EMBED_MODEL_ID=minilm',\n  `post_excerpt_embedding` vector(2048) NOT NULL COMMENT 'GENAI_OPTIONS=EMBED_MODEL_ID=minilm',\n  `post_content_text` longtext NOT NULL,\n  PRIMARY KEY (`ID`),\n  KEY `post_title` (`post_title`(255)),\n  KEY `post_excerpt` (`post_excerpt`(255)),\n  KEY `guid` (`guid`)\n) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">This new column \u2014 <em><code>post_content_text<\/code><\/em> \u2014 will be populated with the sanitized content of 
<em><code>post_content<\/code><\/em>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">For example, the original HTML in <em><code>post_content<\/code><\/em>:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><code>&lt;!-- wp:paragraph --&gt;<br>&lt;p&gt;HeatWave GenAI brings LLMs directly into your database, enabling powerful AI capabilities and natural language processing.&lt;\/p&gt;<br>&lt;!-- \/wp:paragraph --&gt;<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">should be stored in <em><code>post_content_text<\/code><\/em> as:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><code>HeatWave GenAI brings LLMs directly into your database, enabling powerful AI capabilities and natural language processing.<\/code><\/pre>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Strip HTML tags with a JavaScript stored routine<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">This can be done in JavaScript, inside HeatWave, as a <a href=\"https:\/\/dev.mysql.com\/doc\/refman\/9.2\/en\/stored-routines-js.html\" target=\"_blank\" rel=\"noopener\" title=\"JavaScript Stored Programs\">stored program<\/a>. 
Isn&rsquo;t it magnificent?<br>HeatWave supports stored routines written in JavaScript, since version 9.0.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">A simple implementation could be the following:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">CREATE FUNCTION sp.stripHtmlTags(htmlString LONGTEXT) RETURNS LONGTEXT NO SQL LANGUAGE JAVASCRIPT AS\n$$\n    if (!htmlString) {\n        return \"\";\n    }\n\n    \/\/ Replace HTML tags with a space\n    return htmlString\n        .replace(\/&lt;[^&gt;]+&gt;\/g, \" \")  \/\/ Replace all tags with a space\n        .replace(\/\\s+\/g, \" \")      \/\/ Replace multiple spaces with a single space\n        .trim(); \n$$\n;<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Please bear in mind that <strong>I\u2019m not a developer<\/strong>, so&nbsp;<strong>this code is provided for illustrative purposes only<\/strong>. It may contain errors or limitations. Please&nbsp;<strong>use it at your own risk<\/strong>&nbsp;and adapt it to your specific needs (also feel free to share back).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Let&rsquo;s see if it works:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">SELECT sp.stripHtmlTags('&lt;!-- wp:paragraph --&gt;&lt;p&gt;HeatWave GenAI brings LLMs directly into your database, enabling powerful AI capabilities and natural language processing.&lt;\/p&gt;&lt;!-- \/wp:paragraph --&gt;') ;\n+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| sp.stripHtmlTags('&lt;!-- wp:paragraph --&gt;&lt;p&gt;HeatWave GenAI brings LLMs directly into your database, enabling powerful AI capabilities and natural language processing.&lt;\/p&gt;&lt;!-- \/wp:paragraph --&gt;') 
|\n+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| HeatWave GenAI brings LLMs directly into your database, enabling powerful AI capabilities and natural language processing.                                                                       |\n+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Looks like this JavaScript stored function is doing the job.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We can now use it to sanitize all the articles.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">UPDATE wp_posts_embeddings_minilm \n    INNER JOIN wp_posts USING (ID) \n    SET post_content_text = sp.stripHtmlTags(post_content) \n    WHERE post_status = 'publish' \n       AND post_type = 'post' ; <\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Now all the published posts have a sanitized, text-only version in the database, which we can use to generate article summaries.<\/p>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Generating summaries with the JavaScript HeatWave GenAI API<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">HeatWave offers a JavaScript API that enables seamless integration with HeatWave GenAI, allowing you to perform natural language searches powered by LLMs. 
You can find more details in the <a href=\"https:\/\/dev.mysql.com\/doc\/refman\/9.2\/en\/srjs-genai-api.html\" target=\"_blank\" rel=\"noopener\" title=\"\">official documentation<\/a>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To generate summaries for all my articles, I\u2019ll create a stored procedure using the <a href=\"https:\/\/dev.mysql.com\/doc\/refman\/9.2\/en\/srjsapi-ml.html#srjsapi-ml-generate\" target=\"_blank\" rel=\"noopener\" title=\"loads the model, generates a response based on the prompt, and returns the response\"><strong><code><em>ml.generate<\/em><\/code><\/strong><\/a> method. This method supports two modes: single invocation and batch processing. While single invocation is ideal for handling new articles individually, we\u2019ll focus on the batch mode here to process all existing articles efficiently:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">SQL &gt; \nCREATE PROCEDURE sp.summarizePostBatch() LANGUAGE JAVASCRIPT AS\n$$\n    let schema = session.getSchema(\"wordpress\");\n    let table = schema.getTable(\"wp_posts_embeddings_minilm\");\n\n    ml.generate(table, \"post_content_text\", \"post_summary_json\", {model_id: \"mistral-7b-instruct-v1\", task: \"summarization\"});\n$$\n;<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">In batch mode, <a href=\"https:\/\/dev.mysql.com\/doc\/refman\/9.2\/en\/srjsapi-ml.html#srjsapi-ml-generate\" target=\"_blank\" rel=\"noopener\" title=\"loads the model, generates a response based on the prompt, and returns the response\"><em><code>ml.generate<\/code><\/em><\/a> loads the model (mistral-7b-instruct-v1), uses each article in the <em><code>post_content_text<\/code><\/em> column as the prompt, and writes the generated summary into the <em><code>post_summary_json<\/code><\/em> column, which is created automatically.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Then call the stored procedure:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" 
class=\"language-SQL\">CALL sp.summarizePostBatch();\n\n Query OK, 0 rows affected (1 hour 2 min 16.3605 sec)<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">The process successfully created a new JSON column named <em><code>post_summary_json<\/code><\/em>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">SQL &gt; SHOW CREATE TABLE wp_posts_embeddings_minilm;\n\nCREATE TABLE `wp_posts_embeddings_minilm` (\n  `ID` bigint unsigned NOT NULL AUTO_INCREMENT,\n  `post_title` text NOT NULL,\n  `post_excerpt` text NOT NULL,\n  `guid` varchar(255) NOT NULL DEFAULT '',\n  `post_title_embedding` vector(2048) NOT NULL COMMENT 'GENAI_OPTIONS=EMBED_MODEL_ID=minilm',\n  `post_excerpt_embedding` vector(2048) NOT NULL COMMENT 'GENAI_OPTIONS=EMBED_MODEL_ID=minilm',\n  `post_content_text` longtext NOT NULL,\n  `post_summary_json` json DEFAULT NULL,\n  PRIMARY KEY (`ID`),\n  KEY `post_title` (`post_title`(255)),\n  KEY `post_excerpt` (`post_excerpt`(255)),\n  KEY `guid` (`guid`)\n) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Using the HeatWave <a href=\"https:\/\/dev.mysql.com\/doc\/refman\/9.2\/en\/json-function-reference.html\" target=\"_blank\" rel=\"noopener\" title=\"HeatWave MySQL JSON functions\">MySQL JSON functions<\/a>, we can check the content of this new column:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">SELECT LEFT(post_summary_json-&gt;&gt;\"$.text\", 100) AS json FROM wp_posts_embeddings_minilm WHERE ID = 1234\\G\n*************************** 1. row ***************************\njson: Generative artificial intelligence (GenAI) is reshaping the content creation landscape. 
By training <\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Now let&rsquo;s see how to create embeddings for all the summaries.<\/p>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Create the embeddings<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">We have seen in <a href=\"https:\/\/dasini.net\/blog\/2025\/03\/13\/build-an-ai-powered-search-engine-with-heatwave-genai-part-1\/\" target=\"_blank\" rel=\"noopener\" title=\"Build an AI-Powered Search Engine with HeatWave GenAI (part 1)\">Build an AI-Powered Search Engine with HeatWave GenAI (part 1)<\/a> how to create embeddings with the&nbsp;<em><code>minilm<\/code><\/em>&nbsp;model, using the&nbsp;<a href=\"https:\/\/dev.mysql.com\/doc\/heatwave\/en\/mys-hwgenai-ml-embed-table.html\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>sys.ML_EMBED_TABLE<\/strong><\/a> routine.<br>We\u2019ll follow the same process, but this time using JavaScript stored routines. Specifically, we\u2019ll use the <code><strong><em><a href=\"https:\/\/dev.mysql.com\/doc\/refman\/9.2\/en\/srjsapi-ml.html#srjsapi-ml-embed\" target=\"_blank\" rel=\"noopener\" title=\"HeatWave - generates an embedding\">ml.embed<\/a><\/em><\/strong><\/code> method.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Before implementing the routine, we need to modify the table. The reason is that in HeatWave 9.2.1, the table columns must be of one of the following types: varchar, tinytext, text, mediumtext, or longtext. 
If that is not the case, you&rsquo;ll trigger the following error: <em><strong><code>ERROR: 1644 (45000): ML006093: Type of 'post_summary_json' must be in [\"varchar\", \"tinytext\", \"text\", \"mediumtext\", \"longtext\"]<\/code><\/strong><\/em><br>Quite explicit!<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We&rsquo;ll transfer the content of <em><code>post_summary_json<\/code><\/em> into a text column named <em><code>post_summary<\/code><\/em>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">SQL &gt;\nALTER TABLE wp_posts_embeddings_minilm ADD COLUMN post_summary text;\n\nUPDATE wp_posts_embeddings_minilm SET post_summary = post_summary_json-&gt;&gt;\"$.text\"; \n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Finally, to avoid the error <em>ERROR: 1644 (45000): ML006093<\/em>, we must drop <em><code>post_summary_json<\/code><\/em>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">ALTER TABLE wp_posts_embeddings_minilm DROP COLUMN post_summary_json;<\/code><\/pre>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p class=\"wp-block-paragraph\"><strong><em><a href=\"https:\/\/dev.mysql.com\/doc\/refman\/9.2\/en\/srjsapi-ml.html#srjsapi-ml-embed\" target=\"_blank\" rel=\"noopener\" title=\"HeatWave - generates an embedding\"><code>ml.embed<\/code><\/a><\/em><\/strong> also supports two variants: one for a single invocation and one for batch processing.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Single invocation of ml.embed<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">The goal is to create a stored procedure that encodes a summarized article into a vector embedding using the <em><code>minilm<\/code><\/em> embedding model:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">SQL &gt;\nCREATE PROCEDURE sp.createEmbeddings(IN text2embed LONGTEXT, OUT vect VECTOR) LANGUAGE JAVASCRIPT AS\n$$\n    let embedding = 
ml.embed(text2embed, {model_id: \"minilm\"});\n    vect = embedding;\n$$\n;<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">The input parameter is the summarized article:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">SQL &gt;\nCALL sp.createEmbeddings(\" This article explores the integration of HeatWave GenAI with Python for AI-driven applications. It provides a step-by-step guide to building a simple chatbot system that interacts with HeatWave using its in-database LLMs and external APIs from OCI Generative AI Service. The script demonstrates how to establish a connection to the HeatWave MySQL database, load and manage multiple LLMs within HeatWave, allow users to select their preferred model dynamically, facilitate chatbot interactions using HeatWave Chat, and retrieve and manage chat options. The article also discusses the benefits of using HeatWave GenAI for AI-driven applications, including its modular and scalable design, dynamic LLM selection, and powerful capabilities in AI.\", @myVect);<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">The output parameter \u2014 @myVect \u2014 contains the embedding:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">SQL &gt;\nSELECT @myVect\\G\n*************************** 1. row ***************************\n@myVect: ?RK?p?????!&lt;Q?c?\n?????h?[???;??qV@=?mW???F?_?M=??&lt;??$=EN&amp;??Xv?6?`=E????P?V^v????=Dw?&lt;Q?f?.?x?\n                                                                            ?N?????&lt;?1=\n                                                                                        P-???=??O=\n                                                                                                  ?A=21?=Ez????@??1j??*]?T??K????o?V?\u017d?l\n??+??=W??=3??????                                                                                                                       
&gt;?&lt;iyx&lt;?????&lt;j????&lt;???;z?_?H&lt;??,=?'@?j?&lt;N??=???M?C?`Wb?Qr?U?c???V&lt;DmT&lt;{N6=\n                ????T??$?&lt;?u\n                            ?v$?=g?=?k???D?\"??6?\\??\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">A more human representation (or not) is possible using <a href=\"https:\/\/dev.mysql.com\/doc\/refman\/9.2\/en\/vector-functions.html#function_vector-to-string\" target=\"_blank\" rel=\"noopener\" title=\"Given the binary representation of a VECTOR column value, this function returns its string representation\"><code><em>FROM_VECTOR<\/em><\/code><\/a>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">SQL &gt;\nSELECT FROM_VECTOR(@myVect)\\G\n*************************** 1. row ***************************\nFROM_VECTOR(@myVect): [-4.96396e-02,-8.32623e-02,9.86042e-03,-1.38911e-02,-1.98374e-02,-5.67944e-02,-3.55247e-02,2.43442e-03,-7.37070e-03,4.69574e-02,-5.25947e-02,-4.85626e-02,5.02704e-02,9.01929e-03,4.01000e-02,-2.53762e-03,-1.50358e-02,5.48794e-02,-3.82407e-02,-5.09363e-02,-6.01486e-02,6.62680e-02,1.97102e-02,-3.52313e-03,-6.07311e-02,-5.04761e-02,-2.28147e-02,2.70393e-02,3.71569e-02,-2.64454e-03,1.19642e-01,5.07343e-02,4.72823e-02,6.94298e-02,-1.87351e-02,-1.17510e-02,-5.71763e-02,-5.39956e-02,-1.96540e-02,-2.69486e-02,-3.46894e-02,-3.81926e-02,-9.65525e-02,3.40393e-02,2.23684e-02,1.51657e-02,-1.00843e-01,2.82697e-02,-2.81484e-02,2.06150e-02,-8.44855e-02,-9.88659e-02,1.22386e-02,4.21836e-02,-2.93204e-03,7.91035e-03,7.20659e-02,-8.70026e-03,-4.77326e-02,-1.38148e-02,-2.58113e-02,-4.66768e-03,-9.27910e-03,1.30998e-02,1.29655e-02,4.45084e-02,-2.69566e-02,7.25501e-02,7.15968e-02,-1.46195e-01,-6.30012e-02,-7.91235e-02,-5.19264e-02,2.40655e-02,-3.42919e-02,7.72180e-02,7.22950e-02,-7.19829e-02,-7.17984e-03,-2.89341e-02,-1.34624e-02,-4.61435e-02,1.58105e-02,7.25958e-02,-2.99175e-02,-7.80187e-02,-5.74389e-03,-2.57604e-02,7.18643e-02,4.47061e-02,2.45664e-02,-1.13831e-02,4.58998e-02,-1.19727e-02,5.7
7563e-02,9.52516e-02,3.84456e-02,-3.92485e-02,-3.90266e-02,2.87119e-02,-7.21781e-02,4.85492e-02,-8.24129e-02,-2.01809e-02,-1.77014e-02,6.55753e-02,-4.22585e-02,-6.67007e-02,6.90129e-02,-2.04837e-02,-5.49840e-02,-8.76547e-03,4.42857e-02,5.63733e-02,7.53812e-02,-4.63425e-02,1.82254e-02,1.18100e-02,-7.03449e-02,7.29235e-02,2.41146e-02,-3.35154e-02,-3.56129e-02,7.28611e-02,6.86842e-02,-5.54628e-02,-1.93234e-02,-1.89223e-03,9.01134e-02,6.44596e-02,2.64851e-02,9.06968e-02,4.06748e-02,5.16320e-02,3.33822e-02,5.54104e-02,-3.21358e-02,-6.66219e-02,2.00648e-02,1.09276e-01,-6.25206e-02,4.89027e-02,-4.17487e-02,-5.67389e-02,7.44691e-03,-4.42884e-02,3.18385e-02,-1.98049e-02,-7.42827e-02,-6.14671e-02,7.44849e-02,3.02982e-02,2.98403e-02,7.55829e-02,1.20768e-01,1.26906e-02,1.04550e-01,6.59261e-02,1.30538e-02,-1.39226e-02,-3.02682e-02,7.50245e-02,1.57195e-02,-2.63368e-03,-9.18113e-02,-6.67840e-02,-2.06254e-02,7.86020e-02,-2.10694e-02,1.30604e-03,5.31497e-02,5.42747e-02,-1.07493e-01,-4.50561e-02,-1.25854e-01,-2.63633e-02,2.99242e-02,-2.35737e-02,-1.81825e-02,2.76637e-02,-4.12063e-02,4.46310e-02,-1.92102e-02,-8.92773e-02,4.45359e-02,-1.11110e-02,-5.22624e-02,-2.45076e-02,-1.96262e-02,-1.98535e-02,8.15337e-02,7.87625e-02,5.73686e-02,1.47477e-02,6.91806e-03,1.53566e-02,7.33980e-03,1.07936e-02,1.51880e-03,2.81434e-02,-6.74229e-02,2.64910e-02,-1.99371e-02,5.59245e-02,2.73276e-02,-1.19287e-01,8.55642e-03,6.91925e-02,1.21589e-02,-4.05408e-02,-7.55252e-02,-3.61071e-02,2.72453e-02,-8.27803e-02,3.18795e-02,-5.05173e-03,-1.99760e-02,3.63341e-02,-6.50054e-02,-8.56357e-03,5.15133e-02,6.26782e-02,-4.57907e-02,-1.35400e-33,-3.12649e-02,1.42193e-02,-6.76464e-02,2.92021e-02,-5.87203e-02,-1.18268e-01,6.15763e-02,4.44978e-02,4.98419e-02,-1.72100e-02,-1.13092e-01,6.02123e-02,1.55891e-02,-5.68059e-02,-1.35132e-02,-4.19842e-02,-2.60558e-02,2.80308e-03,1.32598e-01,1.07273e-01,-3.47401e-02,4.75040e-02,-7.63112e-02,-2.59081e-02,2.11842e-02,2.25822e-02,-1.98663e-02,-1.04105e-02,-6.49599e-02,1.09372e-03,-5.171
67e-02,1.38617e-02,-6.26832e-02,1.26345e-02,-1.71078e-02,5.77133e-02,-5.38819e-03,-6.03902e-02,1.65896e-02,-8.53037e-02,6.79781e-02,-2.80634e-02,-8.47060e-02,5.93425e-02,-1.46708e-02,4.85509e-02,-1.12832e-01,6.27912e-03,-6.16615e-03,-2.94036e-02,3.11167e-02,2.30855e-03,6.65979e-02,-6.83186e-02,4.35274e-02,-1.24276e-01,1.87970e-02,-3.84284e-02,-1.53174e-02,-3.97068e-02,1.07243e-02,-3.07626e-02,2.31247e-02,-3.29633e-02,-6.20930e-02,1.08394e-01,7.25451e-02,7.60207e-02,-4.54095e-02,-8.78456e-02,4.84518e-02,-4.07884e-02,7.50114e-02,1.14496e-02,2.41463e-02,5.81300e-02,-1.30148e-02,1.39014e-02,-1.40898e-02,-1.72861e-02,-3.93898e-02,-2.61493e-02,2.51394e-02,-2.53317e-02,-9.82916e-04,-2.21086e-02,5.22690e-02,3.53555e-02,6.22829e-03,3.66027e-02,2.17727e-03,6.67700e-02,-2.32061e-02,-1.69419e-03,-9.06145e-03,6.34304e-32,-1.78320e-02,-2.15657e-02,-6.16218e-03,5.44959e-02,1.88435e-02,-1.29032e-02,-4.80581e-02,4.21709e-02,1.50341e-02,4.37988e-02,8.35770e-02,-8.84255e-02,7.72715e-02,5.24237e-02,-3.56883e-02,-8.12593e-03,-3.41413e-02,-1.69388e-02,3.47573e-02,-1.21286e-01,3.22201e-02,3.63997e-02,-2.33719e-02,-2.07887e-02,-8.05931e-02,-7.69887e-02,-4.39181e-02,-9.76368e-03,-1.48867e-03,3.44279e-02,-3.62778e-02,-7.50277e-03,4.86021e-02,5.18756e-02,1.85040e-02,-2.77495e-03,-3.53111e-02,5.16714e-02,-2.28214e-02,-6.10510e-02,2.86529e-02,1.37693e-01,8.80472e-03,-5.79396e-02,3.57790e-02,1.58254e-03,3.59281e-02,-9.29721e-02,1.09798e-02,1.83488e-02,-1.42538e-03,-3.40717e-02,3.98733e-02,-1.05952e-02,3.72981e-02,-5.85627e-02,3.13814e-02,-5.80935e-02,-3.15143e-03,6.94674e-02,-6.51819e-03,6.39965e-02,3.89966e-02,-7.56700e-02]\n\n<\/code><\/pre>\n\n\n\n<div style=\"height:25px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">Batch processing of ml.embed &nbsp;<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">To generate embeddings for all current posts at once, batch processing is the most efficient approach. 
Let\u2019s create a new stored procedure in JavaScript:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">SQL &gt;\nCREATE PROCEDURE sp.createEmbeddingsBatch() LANGUAGE JAVASCRIPT AS\n$$\n    let schema = session.getSchema(\"wordpress\");\n    let table = schema.getTable(\"wp_posts_embeddings_minilm\");\n\n    ml.embed(table, \"post_summary\", \"post_summary_embedding\", {model_id: \"minilm\"});\n$$\n;<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/dev.mysql.com\/doc\/refman\/9.2\/en\/srjsapi-ml.html#srjsapi-ml-embed\" target=\"_blank\" rel=\"noopener\" title=\"generates an embedding\"><em><code>ml.embed<\/code><\/em><\/a>&nbsp;loads the model (<em><code>minilm<\/code><\/em>) and generates an embedding for each post summary in the <code><em>post_summary<\/em><\/code> column, storing it in the <code><em>post_summary_embedding<\/em><\/code> column, which is created automatically.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Then run the stored procedure:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">SQL &gt; \nCALL sp.createEmbeddingsBatch();\nQuery OK, 0 rows affected (32.9131 sec)<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">The process successfully created a new VECTOR column named <em><code>post_summary_embedding<\/code><\/em>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">SQL &gt; SHOW CREATE TABLE wp_posts_embeddings_minilm;\n\nCREATE TABLE `wp_posts_embeddings_minilm` (\n  `ID` bigint unsigned NOT NULL AUTO_INCREMENT,\n  `post_title` text NOT NULL,\n  `post_excerpt` text NOT NULL,\n  `guid` varchar(255) NOT NULL DEFAULT '',\n  `post_title_embedding` vector(2048) NOT NULL COMMENT 'GENAI_OPTIONS=EMBED_MODEL_ID=minilm',\n  `post_excerpt_embedding` vector(2048) NOT NULL COMMENT 'GENAI_OPTIONS=EMBED_MODEL_ID=minilm',\n  `post_content_text` longtext NOT NULL,\n  `post_summary` text,\n  `post_summary_embedding` vector(2048) NOT NULL 
COMMENT 'GENAI_OPTIONS=EMBED_MODEL_ID=minilm',\n  PRIMARY KEY (`ID`),\n  KEY `post_title` (`post_title`(255)),\n  KEY `post_excerpt` (`post_excerpt`(255)),\n  KEY `guid` (`guid`)\n) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci<\/code><\/pre>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Query Encoding and Vector Similarity Operations<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The penultimate step is to convert the user\u2019s query into a vector representation that captures its semantic meaning. This process, known as query encoding, transforms text into numerical embeddings. Once the query is encoded, we can perform a similarity search to find the most relevant results by comparing the query\u2019s embedding with the precomputed vectors stored in HeatWave.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Encode the query into a vector embedding<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">As we saw in <a href=\"https:\/\/dasini.net\/blog\/2025\/03\/13\/build-an-ai-powered-search-engine-with-heatwave-genai-part-1\/\" target=\"_blank\" rel=\"noopener\" title=\"Build an AI-Powered Search Engine with HeatWave GenAI (part 1) - Encode the query into a vector embedding\">part 1<\/a>, to generate a vector embedding for the query, we use the <a href=\"https:\/\/dev.mysql.com\/doc\/heatwave\/en\/mys-hwgenai-ml-embed-row.html\" target=\"_blank\" rel=\"noopener\" title=\"ML_EMBED_ROW uses the specified embedding model to encode the specified text or query into a vector embedding.\">ML_EMBED_ROW<\/a> routine. This function applies the specified embedding model to encode the given text into a vector representation. 
The routine returns a <code><a href=\"https:\/\/dev.mysql.com\/doc\/refman\/9.2\/en\/vector.html\" target=\"_blank\" rel=\"noopener\" title=\"The MySQL VECTOR Type\"><code>VECTOR<\/code><\/a><\/code> containing the numerical embedding of the text.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Using it is straightforward. Let&rsquo;s define two variables: <code><em>@searchItem<\/em><\/code> (the text to encode) and <em><code>@embeddOptions<\/code><\/em> (the embedding model used for encoding):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">SQL &gt;\n-- Set variables\nSET @embeddOptions = '{\"model_id\": \"minilm\"}';\nSET @searchItem = \"Generative artificial intelligence\";\n\n-- Encode the query using the embedding model\nSELECT sys.ML_EMBED_ROW(@searchItem, @embeddOptions) INTO @searchItemEmbedding;<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">You can print the vector using the following query:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">SELECT FROM_VECTOR(vect) \nFROM (\n      SELECT sys.ML_EMBED_ROW(@searchItem, @embeddOptions) AS vect\n     ) AS dt;\n[-6.21515e-02,1.61460e-02,1.25987e-02,-1.98096e-02,... (truncated)<\/code><\/pre>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">Similarity search<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">To retrieve relevant blog content, we perform vector similarity calculations using the <code><a href=\"https:\/\/dev.mysql.com\/doc\/refman\/9.2\/en\/vector-functions.html#function_distance\" target=\"_blank\" rel=\"noopener\" title=\"Calculates the distance between two vectors per the specified calculation method\"><strong>DISTANCE<\/strong><\/a><\/code> function. This function computes the distance between two vectors using the <code><strong>COSINE<\/strong><\/code>, <code><strong>DOT<\/strong><\/code>, or <code><strong>EUCLIDEAN<\/strong><\/code> metric. 
<br>Here, the two vectors being compared are the encoded query (<code>@searchItemEmbedding<\/code>) and the precomputed embeddings stored in the <code><em>wp_posts_embeddings_minilm<\/em><\/code> table (<em><code>post_summary_embedding<\/code><\/em>).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">A <strong>cosine similarity search<\/strong> for article summaries can be conducted using:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">-- Ex 4.1. Similarity search only on the post summaries (post_summary_embedding)\n\nWITH distances AS (\n    SELECT\n        ID, \n        post_title,\n        post_excerpt,\n        DISTANCE(@searchItemEmbedding, post_summary_embedding, 'COSINE') AS min_distance\n    FROM WP_embeddings.wp_posts_embeddings_minilm\n)\nSELECT *\nFROM distances\nORDER BY min_distance\nLIMIT 5\\G<\/code><\/pre>\n\n\n\n<div style=\"height:25px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">A <strong>cosine similarity search<\/strong> for article titles, excerpts and summaries can be conducted using:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">-- Ex 5.1 Similarity search on title, excerpt &amp; post summary (post_title_embedding, post_excerpt_embedding &amp; post_summary_embedding)\n\nWITH distances AS (\n    SELECT\n        post_title,\n        post_excerpt,\n        (\n          DISTANCE(post_title_embedding, @searchItemEmbedding, 'COSINE') + \n          DISTANCE(post_excerpt_embedding, @searchItemEmbedding, 'COSINE') + \n          DISTANCE(post_summary_embedding, @searchItemEmbedding, 'COSINE')\n        ) \/ 3 AS avg_distance,\n        guid\n    FROM WP_embeddings.wp_posts_embeddings_minilm\n)\nSELECT *\nFROM distances\nORDER BY avg_distance\nLIMIT 5\\G<\/code><\/pre>\n\n\n\n<div style=\"height:25px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">Finally, you can (try to) improve the results 
using a reranking technique:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">-- Ex 5.2 Weighted Similarity search on title, excerpt &amp; post summary (post_title_embedding, post_excerpt_embedding &amp; post_summary_embedding)\n\nWITH initial_results AS (\n    SELECT\n        ID,\n        post_title,\n        post_excerpt,\n        DISTANCE(post_title_embedding, @searchItemEmbedding, 'COSINE') AS title_distance, \n        DISTANCE(post_excerpt_embedding, @searchItemEmbedding, 'COSINE') AS excerpt_distance,\n        DISTANCE(post_summary_embedding,  @searchItemEmbedding, 'COSINE') AS summary_distance, \n        guid\n    FROM WP_embeddings.wp_posts_embeddings_minilm \n    ORDER BY title_distance + excerpt_distance + summary_distance ASC -- Simple combination\n    LIMIT 15 -- Retrieve a larger initial set\n),\nreranked_results AS (\n    SELECT\n        ID,\n        post_title,\n        post_excerpt,\n        (0.2 * title_distance + 0.3 * excerpt_distance + 0.5 * summary_distance) AS combined_distance,  -- Weighted combination\n        guid\n    FROM initial_results\n)\nSELECT post_title, post_excerpt, combined_distance, guid\nFROM reranked_results\nORDER BY combined_distance ASC\nLIMIT 5\\G\n\n*************************** 1. row ***************************\n       post_title: HeatWave GenAI: Your AI-Powered Content Creation Partner\n     post_excerpt: Generative artificial intelligence (GenAI) is reshaping the content creation landscape. 
By training on vast datasets, these \"intelligent\" systems can produce new, human-quality content across a multitude of domains.\n\nOracle's HeatWave GenAI (starting with version 9.0.1) is at the forefront of this revolution, offering an integrated platform that combines in-database large language models (LLMs), vector stores, and scale-out vector processing to streamline content generation.\nThis article explores how HeatWave GenAI is empowering businesses to produce high-quality content rapidly and effectively, making it an indispensable tool for industries demanding speed, accuracy, and security.\ncombined_distance: 0.5093500733375549\n             guid: https:\/\/dasini.net\/blog\/?p=1234\n*************************** 2. row ***************************\n       post_title: Simplifying AI Development: A Practical Guide to HeatWave GenAI\u2019s RAG &amp;amp; Vector Store Features\n     post_excerpt: This tutorial explores HeatWave GenAI, a cloud service that simplifies interacting with unstructured data using natural language. It combines large language models, vector stores, and SQL queries to enable tasks like content generation, chatbot, and retrieval-augmented generation (RAG). The focus is on RAG and how HeatWave GenAI\u2019s architecture helps users gain insights from their data.\ncombined_distance: 0.637738311290741\n             guid: https:\/\/dasini.net\/blog\/?p=2345\n*************************** 3. row ***************************\n       post_title: HeatWave GenAI: Sentiment Analysis Made Easy-Peasy\n     post_excerpt: This new AI tech, called generative AI (or GenAI), can dive deep into what people are saying and tell us if they\u2019re feeling positive, negative, or neutral.\nLet\u2019s see how HeatWave GenAI, can help you to enhance your understanding of customer sentiment, improve decision-making, and drive business success.\ncombined_distance: 0.6417026937007904\n             guid: https:\/\/dasini.net\/blog\/?p=3456\n*************************** 4. 
row ***************************\n       post_title: Building an Interactive LLM Chatbot with  HeatWave Using Python\n     post_excerpt: AI-powered applications require robust and scalable database solutions to manage and process large amounts of data efficiently. HeatWave is an excellent choice for such applications, providing high-performance OLTP, analytics, machine learning and generative artificial intelligence capabilities.\n\nIn this article, we will explore a Python 3 script that connects to an HeatWave instance and enables users to interact with different large language models (LLMs) dynamically.\ncombined_distance: 0.6545232772827148\n             guid: https:\/\/dasini.net\/blog\/?p=5678\n*************************** 5. row ***************************\n       post_title: Webinar - Apprentissage automatique avec MySQL HeatWave\n     post_excerpt: HeatWave Machine Learning (ML) inclut tout ce dont les utilisateurs ont besoin pour cr\u00e9er, former, d\u00e9ployer et expliquer des mod\u00e8les d\u2019apprentissage automatique dans MySQL HeatWave, sans co\u00fbt suppl\u00e9mentaire.\n\nDans ce webinaire vous apprendrez...\ncombined_distance: 0.7031511843204499\n             guid: https:\/\/dasini.net\/blog\/?p=6789<\/code><\/pre>\n\n\n\n<figure data-wp-context=\"{&quot;imageId&quot;:&quot;6a4a7c7c1113d&quot;}\" data-wp-interactive=\"core\/image\" data-wp-key=\"6a4a7c7c1113d\" class=\"wp-block-image size-full wp-lightbox-container\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" width=\"2560\" height=\"1440\" data-wp-class--hide=\"state.isContentHidden\" data-wp-class--show=\"state.isContentVisible\" data-wp-init=\"callbacks.setButtonStyles\" data-wp-on--click=\"actions.showLightbox\" data-wp-on--load=\"callbacks.setButtonStyles\" data-wp-on--pointerdown=\"actions.preloadImage\" data-wp-on--pointerenter=\"actions.preloadImageWithDelay\" data-wp-on--pointerleave=\"actions.cancelPreload\" data-wp-on-window--resize=\"callbacks.setButtonStyles\" 
src=\"https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/04\/HWGenAIsearchEngine2.gif?resize=2560%2C1440&#038;ssl=1\" alt=\"Similarity search across title, excerpt and summary\" class=\"wp-image-7704\"\/><\/figure>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Peroration<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">In this second part of our journey into <strong>building an AI-powered search engine with HeatWave GenAI<\/strong>, we explored advanced techniques to refine search relevance. By incorporating reranking strategies and leveraging article summaries for embedding generation, we significantly improved the quality of retrieved results.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Furthermore, we demonstrated <strong>how to harness HeatWave\u2019s support for JavaScript-based stored programs<\/strong> to sanitize content, generate summaries, and compute embeddings \u2014 all within the database. 
This seamless integration of AI-powered search within HeatWave showcases its potential for efficient, scalable, and intelligent information retrieval.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">With these enhancements, our search engine is now better at understanding queries and delivering relevant results, although we can still optimize it further by experimenting with different embedding models or by integrating additional AI-driven ranking techniques.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Stay tuned for more insights!<\/p>\n\n\n\n<div style=\"height:25px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p class=\"wp-block-paragraph\"><em>To be continued&#8230;<\/em><\/p>\n\n\n\n<div style=\"height:20px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/www.linkedin.com\/groups\/12524512\/\" target=\"_blank\" rel=\"noopener\" title=\"Olivier DASINI on Linkedin\">Follow me on LinkedIn<\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Watch my videos on my <a href=\"https:\/\/www.youtube.com\/channel\/UC12TulyJsJZHoCmby3Nm3WQ\" target=\"_blank\" rel=\"noreferrer noopener\" title=\"Olivier's MySQL Channel\">YouTube channel<\/a> and <a href=\"https:\/\/www.youtube.com\/channel\/UC12TulyJsJZHoCmby3Nm3WQ\/?sub_confirmation=1\" target=\"_blank\" rel=\"noreferrer noopener\" title=\"Subscribe\">subscribe<\/a>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">My <a href=\"https:\/\/speakerdeck.com\/freshdaz\/\" target=\"_blank\" rel=\"noreferrer noopener\" title=\"Olivier DASINI on Speaker Deck\">Speaker Deck account<\/a>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">My <a href=\"https:\/\/www.slideshare.net\/freshdaz\" target=\"_blank\" rel=\"noreferrer noopener\" title=\"Olivier DASINI on Slideshare\">Slideshare account<\/a> (<em>archive<\/em>).<\/p>\n\n\n\n<div style=\"height:25px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p class=\"has-vivid-red-color 
has-text-color wp-block-paragraph\"><strong>Thanks for using HeatWave &amp; MySQL!<\/strong><\/p>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this part 2 we focused on enhancing search relevance. We introduced reranking techniques using weighted distances of titles and excerpts to refine initial search results. Then we delved into leveraging article summaries for more effective semantic search, utilizing HeatWave&rsquo;s capability to execute JavaScript stored procedures for sanitizing HTML content and generating these summaries. Finally, we demonstrated how to create embeddings from these summaries and perform similarity searches, showcasing HeatWave GenAI&rsquo;s power for advanced information retrieval directly within the database.<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":"","jetpack_post_was_ever_published":false},"categories":[1702,1740,1694,339],"tags":[1704,306,1700,1697,1742,1738],"class_list":["post-7566","post","type-post","status-publish","format-standard","hentry","category-ai","category-artificial-intelligence","category-heatwave-en","category-tuto-en","tag-ai","tag-cloud","tag-genai","tag-heatwave-fr-en","tag-javascript","tag-llm"],"aioseo_notices":[],"aioseo_head":"\n\t\t<!-- All in One SEO 4.9.9 - aioseo.com -->\n\t<meta name=\"description\" content=\"In Build an AI-Powered Search Engine with HeatWave GenAI (part 1), we explored how to build an AI-powered search engine using HeatWave GenAI. We highlighted the advantages of AI-driven semantic search over traditional SQL-based methods and provided a detailed guide on generating embeddings and conducting similarity searches. 
These techniques enhance the retrieval of relevant articles,\" \/>\n\t<meta name=\"robots\" content=\"max-image-preview:large\" \/>\n\t<meta name=\"author\" content=\"Olivier DASINI\"\/>\n\t<meta name=\"google-site-verification\" content=\"HN6x37mM2NBzUxcsZUKC8OY_DxAioI6oKazYPPNq02c\" \/>\n\t<meta name=\"keywords\" content=\"ai,cloud,genai,heatwave,javascript,llm,artificial intelligence,tuto\" \/>\n\t<link rel=\"canonical\" href=\"https:\/\/dasini.net\/blog\/2025\/04\/08\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\/\" \/>\n\t<meta name=\"generator\" content=\"All in One SEO (AIOSEO) 4.9.9\" \/>\n\t\t<meta property=\"og:locale\" content=\"fr_FR\" \/>\n\t\t<meta property=\"og:site_name\" content=\"Data Daz (dasini.net) - Data Systems, AI, and Real-World Insights | Exploring Beyond the Limits of Knowledge\" \/>\n\t\t<meta property=\"og:type\" content=\"article\" \/>\n\t\t<meta property=\"og:title\" content=\"Build an AI-Powered Search Engine with HeatWave GenAI (part 2) | Data Daz (dasini.net) - Data Systems, AI, and Real-World Insights\" \/>\n\t\t<meta property=\"og:description\" content=\"In Build an AI-Powered Search Engine with HeatWave GenAI (part 1), we explored how to build an AI-powered search engine using HeatWave GenAI. We highlighted the advantages of AI-driven semantic search over traditional SQL-based methods and provided a detailed guide on generating embeddings and conducting similarity searches. 
These techniques enhance the retrieval of relevant articles,\" \/>\n\t\t<meta property=\"og:url\" content=\"https:\/\/dasini.net\/blog\/2025\/04\/08\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\/\" \/>\n\t\t<meta property=\"article:published_time\" content=\"2025-04-08T07:50:32+00:00\" \/>\n\t\t<meta property=\"article:modified_time\" content=\"2025-04-15T08:28:13+00:00\" \/>\n\t\t<meta name=\"twitter:card\" content=\"summary\" \/>\n\t\t<meta name=\"twitter:site\" content=\"@freshdaz\" \/>\n\t\t<meta name=\"twitter:title\" content=\"Build an AI-Powered Search Engine with HeatWave GenAI (part 2) | Data Daz (dasini.net) - Data Systems, AI, and Real-World Insights\" \/>\n\t\t<meta name=\"twitter:description\" content=\"In Build an AI-Powered Search Engine with HeatWave GenAI (part 1), we explored how to build an AI-powered search engine using HeatWave GenAI. We highlighted the advantages of AI-driven semantic search over traditional SQL-based methods and provided a detailed guide on generating embeddings and conducting similarity searches. 
These techniques enhance the retrieval of relevant articles,\" \/>\n\t\t<meta name=\"twitter:creator\" content=\"@freshdaz\" \/>\n\t\t<script type=\"application\/ld+json\" class=\"aioseo-schema\">\n\t\t\t{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/dasini.net\\\/blog\\\/2025\\\/04\\\/08\\\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\\\/#article\",\"name\":\"Build an AI-Powered Search Engine with HeatWave GenAI (part 2) | Data Daz (dasini.net) - Data Systems, AI, and Real-World Insights\",\"headline\":\"Build an AI-Powered Search Engine with HeatWave GenAI (part 2)\",\"author\":{\"@id\":\"https:\\\/\\\/dasini.net\\\/blog\\\/author\\\/freshdaz\\\/#author\"},\"publisher\":{\"@id\":\"https:\\\/\\\/dasini.net\\\/blog\\\/#person\"},\"image\":{\"@type\":\"ImageObject\",\"url\":\"https:\\\/\\\/i0.wp.com\\\/dasini.net\\\/blog\\\/wp-content\\\/uploads\\\/2025\\\/04\\\/HWGenAIsearchEngine2.gif?fit=2560%2C1440&ssl=1\",\"@id\":\"https:\\\/\\\/dasini.net\\\/blog\\\/2025\\\/04\\\/08\\\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\\\/#articleImage\",\"width\":2560,\"height\":1440,\"caption\":\"Similarity search across title, excerpt and summary\"},\"datePublished\":\"2025-04-08T08:50:32+01:00\",\"dateModified\":\"2025-04-15T09:28:13+01:00\",\"inLanguage\":\"fr-FR\",\"commentCount\":2,\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/dasini.net\\\/blog\\\/2025\\\/04\\\/08\\\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\\\/#webpage\"},\"isPartOf\":{\"@id\":\"https:\\\/\\\/dasini.net\\\/blog\\\/2025\\\/04\\\/08\\\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\\\/#webpage\"},\"articleSection\":\"AI, Artificial Intelligence, HeatWave, Tuto, AI, Cloud, GenAI, Heatwave, JavaScript, LLM, 
English\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/dasini.net\\\/blog\\\/2025\\\/04\\\/08\\\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\\\/#breadcrumblist\",\"itemListElement\":[{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/dasini.net\\\/blog#listItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/dasini.net\\\/blog\",\"nextItem\":{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/dasini.net\\\/blog\\\/category\\\/tuto-en\\\/#listItem\",\"name\":\"Tuto\"}},{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/dasini.net\\\/blog\\\/category\\\/tuto-en\\\/#listItem\",\"position\":2,\"name\":\"Tuto\",\"item\":\"https:\\\/\\\/dasini.net\\\/blog\\\/category\\\/tuto-en\\\/\",\"nextItem\":{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/dasini.net\\\/blog\\\/2025\\\/04\\\/08\\\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\\\/#listItem\",\"name\":\"Build an AI-Powered Search Engine with HeatWave GenAI (part 2)\"},\"previousItem\":{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/dasini.net\\\/blog#listItem\",\"name\":\"Home\"}},{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/dasini.net\\\/blog\\\/2025\\\/04\\\/08\\\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\\\/#listItem\",\"position\":3,\"name\":\"Build an AI-Powered Search Engine with HeatWave GenAI (part 2)\",\"previousItem\":{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/dasini.net\\\/blog\\\/category\\\/tuto-en\\\/#listItem\",\"name\":\"Tuto\"}}]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/dasini.net\\\/blog\\\/#person\",\"name\":\"Olivier DASINI\",\"image\":{\"@type\":\"ImageObject\",\"@id\":\"https:\\\/\\\/dasini.net\\\/blog\\\/2025\\\/04\\\/08\\\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\\\/#personImage\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/d46da4fb4b99ea9ce527a93ce9b998a446577e42e1f132d8437235fa5b963636?s=96&d=mm&r=g\",\"width\":96,\"height\":96,\"caption\":\"Olivier 
DASINI\"}},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/dasini.net\\\/blog\\\/author\\\/freshdaz\\\/#author\",\"url\":\"https:\\\/\\\/dasini.net\\\/blog\\\/author\\\/freshdaz\\\/\",\"name\":\"Olivier DASINI\",\"image\":{\"@type\":\"ImageObject\",\"@id\":\"https:\\\/\\\/dasini.net\\\/blog\\\/2025\\\/04\\\/08\\\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\\\/#authorImage\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/d46da4fb4b99ea9ce527a93ce9b998a446577e42e1f132d8437235fa5b963636?s=96&d=mm&r=g\",\"width\":96,\"height\":96,\"caption\":\"Olivier DASINI\"}},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/dasini.net\\\/blog\\\/2025\\\/04\\\/08\\\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\\\/#webpage\",\"url\":\"https:\\\/\\\/dasini.net\\\/blog\\\/2025\\\/04\\\/08\\\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\\\/\",\"name\":\"Build an AI-Powered Search Engine with HeatWave GenAI (part 2) | Data Daz (dasini.net) - Data Systems, AI, and Real-World Insights\",\"description\":\"In Build an AI-Powered Search Engine with HeatWave GenAI (part 1), we explored how to build an AI-powered search engine using HeatWave GenAI. We highlighted the advantages of AI-driven semantic search over traditional SQL-based methods and provided a detailed guide on generating embeddings and conducting similarity searches. 
These techniques enhance the retrieval of relevant articles,\",\"inLanguage\":\"fr-FR\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/dasini.net\\\/blog\\\/#website\"},\"breadcrumb\":{\"@id\":\"https:\\\/\\\/dasini.net\\\/blog\\\/2025\\\/04\\\/08\\\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\\\/#breadcrumblist\"},\"author\":{\"@id\":\"https:\\\/\\\/dasini.net\\\/blog\\\/author\\\/freshdaz\\\/#author\"},\"creator\":{\"@id\":\"https:\\\/\\\/dasini.net\\\/blog\\\/author\\\/freshdaz\\\/#author\"},\"datePublished\":\"2025-04-08T08:50:32+01:00\",\"dateModified\":\"2025-04-15T09:28:13+01:00\"},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/dasini.net\\\/blog\\\/#website\",\"url\":\"https:\\\/\\\/dasini.net\\\/blog\\\/\",\"name\":\"dasini.net - Diary of a MySQL expert\",\"description\":\"Exploring Beyond the Limits of Knowledge\",\"inLanguage\":\"fr-FR\",\"publisher\":{\"@id\":\"https:\\\/\\\/dasini.net\\\/blog\\\/#person\"}}]}\n\t\t<\/script>\n\t\t<!-- All in One SEO -->\n\n","aioseo_head_json":{"title":"Build an AI-Powered Search Engine with HeatWave GenAI (part 2) | Data Daz (dasini.net) - Data Systems, AI, and Real-World Insights","description":"In Build an AI-Powered Search Engine with HeatWave GenAI (part 1), we explored how to build an AI-powered search engine using HeatWave GenAI. We highlighted the advantages of AI-driven semantic search over traditional SQL-based methods and provided a detailed guide on generating embeddings and conducting similarity searches. 
These techniques enhance the retrieval of relevant articles,","canonical_url":"https:\/\/dasini.net\/blog\/2025\/04\/08\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\/","robots":"max-image-preview:large","keywords":"ai,cloud,genai,heatwave,javascript,llm,artificial intelligence,tuto","webmasterTools":{"google-site-verification":"HN6x37mM2NBzUxcsZUKC8OY_DxAioI6oKazYPPNq02c","miscellaneous":""},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/dasini.net\/blog\/2025\/04\/08\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\/#article","name":"Build an AI-Powered Search Engine with HeatWave GenAI (part 2) | Data Daz (dasini.net) - Data Systems, AI, and Real-World Insights","headline":"Build an AI-Powered Search Engine with HeatWave GenAI (part 2)","author":{"@id":"https:\/\/dasini.net\/blog\/author\/freshdaz\/#author"},"publisher":{"@id":"https:\/\/dasini.net\/blog\/#person"},"image":{"@type":"ImageObject","url":"https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/04\/HWGenAIsearchEngine2.gif?fit=2560%2C1440&ssl=1","@id":"https:\/\/dasini.net\/blog\/2025\/04\/08\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\/#articleImage","width":2560,"height":1440,"caption":"Similarity search across title, excerpt and summary"},"datePublished":"2025-04-08T08:50:32+01:00","dateModified":"2025-04-15T09:28:13+01:00","inLanguage":"fr-FR","commentCount":2,"mainEntityOfPage":{"@id":"https:\/\/dasini.net\/blog\/2025\/04\/08\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\/#webpage"},"isPartOf":{"@id":"https:\/\/dasini.net\/blog\/2025\/04\/08\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\/#webpage"},"articleSection":"AI, Artificial Intelligence, HeatWave, Tuto, AI, Cloud, GenAI, Heatwave, JavaScript, LLM, 
English"},{"@type":"BreadcrumbList","@id":"https:\/\/dasini.net\/blog\/2025\/04\/08\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\/#breadcrumblist","itemListElement":[{"@type":"ListItem","@id":"https:\/\/dasini.net\/blog#listItem","position":1,"name":"Home","item":"https:\/\/dasini.net\/blog","nextItem":{"@type":"ListItem","@id":"https:\/\/dasini.net\/blog\/category\/tuto-en\/#listItem","name":"Tuto"}},{"@type":"ListItem","@id":"https:\/\/dasini.net\/blog\/category\/tuto-en\/#listItem","position":2,"name":"Tuto","item":"https:\/\/dasini.net\/blog\/category\/tuto-en\/","nextItem":{"@type":"ListItem","@id":"https:\/\/dasini.net\/blog\/2025\/04\/08\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\/#listItem","name":"Build an AI-Powered Search Engine with HeatWave GenAI (part 2)"},"previousItem":{"@type":"ListItem","@id":"https:\/\/dasini.net\/blog#listItem","name":"Home"}},{"@type":"ListItem","@id":"https:\/\/dasini.net\/blog\/2025\/04\/08\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\/#listItem","position":3,"name":"Build an AI-Powered Search Engine with HeatWave GenAI (part 2)","previousItem":{"@type":"ListItem","@id":"https:\/\/dasini.net\/blog\/category\/tuto-en\/#listItem","name":"Tuto"}}]},{"@type":"Person","@id":"https:\/\/dasini.net\/blog\/#person","name":"Olivier DASINI","image":{"@type":"ImageObject","@id":"https:\/\/dasini.net\/blog\/2025\/04\/08\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\/#personImage","url":"https:\/\/secure.gravatar.com\/avatar\/d46da4fb4b99ea9ce527a93ce9b998a446577e42e1f132d8437235fa5b963636?s=96&d=mm&r=g","width":96,"height":96,"caption":"Olivier DASINI"}},{"@type":"Person","@id":"https:\/\/dasini.net\/blog\/author\/freshdaz\/#author","url":"https:\/\/dasini.net\/blog\/author\/freshdaz\/","name":"Olivier 
DASINI","image":{"@type":"ImageObject","@id":"https:\/\/dasini.net\/blog\/2025\/04\/08\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\/#authorImage","url":"https:\/\/secure.gravatar.com\/avatar\/d46da4fb4b99ea9ce527a93ce9b998a446577e42e1f132d8437235fa5b963636?s=96&d=mm&r=g","width":96,"height":96,"caption":"Olivier DASINI"}},{"@type":"WebPage","@id":"https:\/\/dasini.net\/blog\/2025\/04\/08\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\/#webpage","url":"https:\/\/dasini.net\/blog\/2025\/04\/08\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\/","name":"Build an AI-Powered Search Engine with HeatWave GenAI (part 2) | Data Daz (dasini.net) - Data Systems, AI, and Real-World Insights","description":"In Build an AI-Powered Search Engine with HeatWave GenAI (part 1), we explored how to build an AI-powered search engine using HeatWave GenAI. We highlighted the advantages of AI-driven semantic search over traditional SQL-based methods and provided a detailed guide on generating embeddings and conducting similarity searches. 
These techniques enhance the retrieval of relevant articles,","inLanguage":"fr-FR","isPartOf":{"@id":"https:\/\/dasini.net\/blog\/#website"},"breadcrumb":{"@id":"https:\/\/dasini.net\/blog\/2025\/04\/08\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\/#breadcrumblist"},"author":{"@id":"https:\/\/dasini.net\/blog\/author\/freshdaz\/#author"},"creator":{"@id":"https:\/\/dasini.net\/blog\/author\/freshdaz\/#author"},"datePublished":"2025-04-08T08:50:32+01:00","dateModified":"2025-04-15T09:28:13+01:00"},{"@type":"WebSite","@id":"https:\/\/dasini.net\/blog\/#website","url":"https:\/\/dasini.net\/blog\/","name":"dasini.net - Diary of a MySQL expert","description":"Exploring Beyond the Limits of Knowledge","inLanguage":"fr-FR","publisher":{"@id":"https:\/\/dasini.net\/blog\/#person"}}]},"og:locale":"fr_FR","og:site_name":"Data Daz (dasini.net) - Data Systems, AI, and Real-World Insights | Exploring Beyond the Limits of Knowledge","og:type":"article","og:title":"Build an AI-Powered Search Engine with HeatWave GenAI (part 2) | Data Daz (dasini.net) - Data Systems, AI, and Real-World Insights","og:description":"In Build an AI-Powered Search Engine with HeatWave GenAI (part 1), we explored how to build an AI-powered search engine using HeatWave GenAI. We highlighted the advantages of AI-driven semantic search over traditional SQL-based methods and provided a detailed guide on generating embeddings and conducting similarity searches. 
These techniques enhance the retrieval of relevant articles,","og:url":"https:\/\/dasini.net\/blog\/2025\/04\/08\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\/","article:published_time":"2025-04-08T07:50:32+00:00","article:modified_time":"2025-04-15T08:28:13+00:00","twitter:card":"summary","twitter:site":"@freshdaz","twitter:title":"Build an AI-Powered Search Engine with HeatWave GenAI (part 2) | Data Daz (dasini.net) - Data Systems, AI, and Real-World Insights","twitter:description":"In Build an AI-Powered Search Engine with HeatWave GenAI (part 1), we explored how to build an AI-powered search engine using HeatWave GenAI. We highlighted the advantages of AI-driven semantic search over traditional SQL-based methods and provided a detailed guide on generating embeddings and conducting similarity searches. These techniques enhance the retrieval of relevant articles,","twitter:creator":"@freshdaz"},"aioseo_meta_data":{"post_id":"7566","title":null,"description":null,"keywords":null,"keyphrases":{"focus":{"keyphrase":"","score":0,"analysis":{"keyphraseInTitle":{"score":0,"maxScore":9,"error":1}}},"additional":[]},"primary_term":null,"canonical_url":null,"og_title":null,"og_description":null,"og_object_type":"default","og_image_type":"default","og_image_url":null,"og_image_width":null,"og_image_height":null,"og_image_custom_url":null,"og_image_custom_fields":null,"og_video":"","og_custom_url":null,"og_article_section":null,"og_article_tags":null,"twitter_use_og":false,"twitter_card":"default","twitter_image_type":"default","twitter_image_url":null,"twitter_image_custom_url":null,"twitter_image_custom_fields":null,"twitter_title":null,"twitter_description":null,"schema":{"blockGraphs":[],"customGraphs":[],"default":{"data":{"Article":[],"Course":[],"Dataset":[],"FAQPage":[],"Movie":[],"Person":[],"Product":[],"ProductReview":[],"Car":[],"Recipe":[],"Service":[],"SoftwareApplication":[],"WebPage":[]},"graphName":"Article","isEnabled":true},"graphs":[]},"s
chema_type":"default","schema_type_options":null,"pillar_content":false,"robots_default":true,"robots_noindex":false,"robots_noarchive":false,"robots_nosnippet":false,"robots_nofollow":false,"robots_noimageindex":false,"robots_noodp":false,"robots_notranslate":false,"robots_max_snippet":"-1","robots_max_videopreview":"-1","robots_max_imagepreview":"large","priority":null,"frequency":"default","location":null,"local_seo":null,"breadcrumb_settings":null,"limit_modified_date":false,"ai":null,"created":"2025-03-31 09:41:53","updated":"2025-06-24 10:32:35","seo_analyzer_scan_date":null},"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p9LfWW-1Y2","jetpack-related-posts":[{"id":7711,"url":"https:\/\/dasini.net\/blog\/2025\/04\/15\/build-an-ai-powered-search-engine-with-heatwave-genai-part-3\/","url_meta":{"origin":7566,"position":0},"title":"Build an AI-Powered Search Engine with HeatWave GenAI (part 3)","author":"Olivier DASINI","date":"15 April 2025","format":false,"excerpt":"In this latest post, the final part of my series on building an AI-powered search engine with HeatWave GenAI, I dive into enhancing AI-powered search by embedding full article content into HeatWave. By cleaning HTML, chunking content, generating embeddings, and running semantic similarity searches directly within HeatWave, we unlock highly\u2026","rel":"","context":"In &quot;AI&quot;","block_context":{"text":"AI","link":"https:\/\/dasini.net\/blog\/category\/ai\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":7363,"url":"https:\/\/dasini.net\/blog\/2025\/03\/13\/build-an-ai-powered-search-engine-with-heatwave-genai-part-1\/","url_meta":{"origin":7566,"position":1},"title":"Build an AI-Powered Search Engine with HeatWave GenAI (part 1)","author":"Olivier DASINI","date":"13 March 2025","format":false,"excerpt":"Discover how to build an AI-powered search engine for your applications using HeatWave GenAI. 
This approach leverages large language models (LLMs) for semantic search, offering a smarter alternative to traditional SQL and full-text search methods. By using embeddings\u2014vector representations of words\u2014the search engine understands context and intent, delivering more relevant\u2026","rel":"","context":"In &quot;AI&quot;","block_context":{"text":"AI","link":"https:\/\/dasini.net\/blog\/category\/ai\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/03\/HW_GenaI_search_engine.gif?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/03\/HW_GenaI_search_engine.gif?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/03\/HW_GenaI_search_engine.gif?resize=525%2C300&ssl=1 1.5x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/03\/HW_GenaI_search_engine.gif?resize=700%2C400&ssl=1 2x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/03\/HW_GenaI_search_engine.gif?resize=1050%2C600&ssl=1 3x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/03\/HW_GenaI_search_engine.gif?resize=1400%2C800&ssl=1 4x"},"classes":[]},{"id":7812,"url":"https:\/\/dasini.net\/blog\/2025\/05\/13\/oracle-dev-days-2025-french-edition\/","url_meta":{"origin":7566,"position":2},"title":"Oracle Dev Days 2025 \u2013 French Edition","author":"Olivier DASINI","date":"13 May 2025","format":false,"excerpt":"Join the Oracle Dev Days \u2013 French Edition, from May 20 to 22, 2025! This must-attend event (in French) offers a rich program exploring the latest advancements in AI, databases, cloud, and Java. 
Join me on May 21 at 2:00 PM for the day dedicated to \u201cDatabase & AI.\u201d I\u2019ll\u2026","rel":"","context":"In &quot;Conf\u00e9rence&quot;","block_context":{"text":"Conf\u00e9rence","link":"https:\/\/dasini.net\/blog\/category\/conference-en\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/05\/Olivier_Dasini_HeatWave.png?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/05\/Olivier_Dasini_HeatWave.png?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/05\/Olivier_Dasini_HeatWave.png?resize=525%2C300&ssl=1 1.5x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/05\/Olivier_Dasini_HeatWave.png?resize=700%2C400&ssl=1 2x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/05\/Olivier_Dasini_HeatWave.png?resize=1050%2C600&ssl=1 3x"},"classes":[]},{"id":8533,"url":"https:\/\/dasini.net\/blog\/2026\/05\/14\/building-an-ai-vision-search-engine-with-mysql-heatwave-genai\/","url_meta":{"origin":7566,"position":3},"title":"Building an AI Vision Search Engine with MySQL HeatWave GenAI","author":"Olivier DASINI","date":"14 May 2026","format":false,"excerpt":"Modern AI systems increasingly rely on multimodal data: text, images, documents, audio, and video. Among these modalities, image understanding has become one of the most important capabilities for AI-powered applications. Traditionally, implementing these capabilities required specialized computer vision infrastructure, external vector databases, custom ML pipelines, and multiple frameworks. 
With MySQL\u2026","rel":"","context":"In &quot;AI&quot;","block_context":{"text":"AI","link":"https:\/\/dasini.net\/blog\/category\/ai\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2026\/05\/5.2_workflow_Reverse_Image_Search.png?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":7252,"url":"https:\/\/dasini.net\/blog\/2025\/02\/11\/building-an-interactive-llm-chatbot-with-heatwave-using-python\/","url_meta":{"origin":7566,"position":4},"title":"Building an Interactive LLM Chatbot with HeatWave Using Python","author":"Olivier DASINI","date":"11 February 2025","format":false,"excerpt":"AI-powered applications require robust and scalable database solutions to manage and process large amounts of data efficiently. HeatWave is an excellent choice for such applications, providing high-performance OLTP, analytics, machine learning and generative artificial intelligence capabilities. In this article, we will explore a Python 3 script that connects to an\u2026","rel":"","context":"In &quot;HeatWave&quot;","block_context":{"text":"HeatWave","link":"https:\/\/dasini.net\/blog\/category\/heatwave-en\/"},"img":{"alt_text":"simple but robust chatbot system leveraging HeatWave GenAI and its in-database Mistral LLM","src":"https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/02\/HW-Chat-mistral-7b.gif?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/02\/HW-Chat-mistral-7b.gif?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/02\/HW-Chat-mistral-7b.gif?resize=525%2C300&ssl=1 1.5x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/02\/HW-Chat-mistral-7b.gif?resize=700%2C400&ssl=1 2x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/02\/HW-Chat-mistral-7b.gif?resize=1050%2C600&ssl=1 
3x"},"classes":[]},{"id":7058,"url":"https:\/\/dasini.net\/blog\/2024\/12\/10\/simplifying-ai-development-a-practical-guide-to-heatwave-genais-rag-vector-store-features\/","url_meta":{"origin":7566,"position":5},"title":"Simplifying AI Development: A Practical Guide to HeatWave GenAI\u2019s RAG &amp; Vector Store Features","author":"Olivier DASINI","date":"10 December 2024","format":false,"excerpt":"This tutorial explores HeatWave GenAI, a cloud service that simplifies interacting with unstructured data using natural language. It combines large language models, vector stores, and SQL queries to enable tasks like content generation, chatbot, and retrieval-augmented generation (RAG). The focus is on RAG and how HeatWave GenAI\u2019s architecture helps users\u2026","rel":"","context":"In &quot;AI&quot;","block_context":{"text":"AI","link":"https:\/\/dasini.net\/blog\/category\/ai\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2024\/12\/HeatWave_chatbot3.gif?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2024\/12\/HeatWave_chatbot3.gif?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2024\/12\/HeatWave_chatbot3.gif?resize=525%2C300&ssl=1 1.5x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2024\/12\/HeatWave_chatbot3.gif?resize=700%2C400&ssl=1 2x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2024\/12\/HeatWave_chatbot3.gif?resize=1050%2C600&ssl=1 3x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2024\/12\/HeatWave_chatbot3.gif?resize=1400%2C800&ssl=1 
4x"},"classes":[]}],"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/dasini.net\/blog\/wp-json\/wp\/v2\/posts\/7566","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dasini.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dasini.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dasini.net\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/dasini.net\/blog\/wp-json\/wp\/v2\/comments?post=7566"}],"version-history":[{"count":135,"href":"https:\/\/dasini.net\/blog\/wp-json\/wp\/v2\/posts\/7566\/revisions"}],"predecessor-version":[{"id":7791,"href":"https:\/\/dasini.net\/blog\/wp-json\/wp\/v2\/posts\/7566\/revisions\/7791"}],"wp:attachment":[{"href":"https:\/\/dasini.net\/blog\/wp-json\/wp\/v2\/media?parent=7566"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dasini.net\/blog\/wp-json\/wp\/v2\/categories?post=7566"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dasini.net\/blog\/wp-json\/wp\/v2\/tags?post=7566"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}