
{"id":7363,"date":"2025-03-13T10:19:49","date_gmt":"2025-03-13T09:19:49","guid":{"rendered":"https:\/\/dasini.net\/blog\/?p=7363"},"modified":"2025-04-04T10:27:22","modified_gmt":"2025-04-04T09:27:22","slug":"build-an-ai-powered-search-engine-with-heatwave-genai-part-1","status":"publish","type":"post","link":"https:\/\/dasini.net\/blog\/2025\/03\/13\/build-an-ai-powered-search-engine-with-heatwave-genai-part-1\/","title":{"rendered":"Build an AI-Powered Search Engine with HeatWave GenAI (part 1)"},"content":{"rendered":"\n<p>This article builds upon the concepts introduced in my previous blog posts:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/dasini.net\/blog\/2025\/02\/11\/building-an-interactive-llm-chatbot-with-heatwave-using-python\/\" target=\"_blank\" rel=\"noopener\" title=\"Building an Interactive LLM Chatbot with HeatWave Using Python\">Building an Interactive LLM Chatbot with HeatWave Using Python<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/dasini.net\/blog\/2024\/12\/10\/simplifying-ai-development-a-practical-guide-to-heatwave-genais-rag-vector-store-features\/\" target=\"_blank\" rel=\"noopener\" title=\"Simplifying AI Development: A Practical Guide to HeatWave GenAI\u2019s RAG &amp; Vector Store Features\">Simplifying AI Development: A Practical Guide to HeatWave GenAI\u2019s RAG &amp; Vector Store Features<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/dasini.net\/blog\/2024\/09\/10\/heatwave-genai-sentiment-analysis-made-easy-peasy\/\" target=\"_blank\" rel=\"noopener\" title=\"HeatWave GenAI: Sentiment Analysis Made Easy-Peasy\">HeatWave GenAI: Sentiment Analysis Made Easy-Peasy<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/dasini.net\/blog\/2024\/08\/07\/heatwave-genai-your-ai-powered-content-creation-partner\/\" target=\"_blank\" rel=\"noopener\" title=\"&nbsp;HeatWave GenAI: Your AI-Powered Content Creation Partner\">HeatWave GenAI: Your AI-Powered Content Creation Partner<\/a><\/li>\n\n\n\n<li><a 
href=\"https:\/\/dasini.net\/blog\/2024\/08\/13\/in-database-llms-for-efficient-text-translation-with-heatwave-genai\/\" target=\"_blank\" rel=\"noopener\" title=\"In-Database LLMs for Efficient Text Translation with HeatWave GenAI\">In-Database LLMs for Efficient Text Translation with HeatWave GenAI<\/a><\/li>\n<\/ul>\n\n\n\n<p>For a deeper understanding, also consider reading these articles.<\/p>\n\n\n<div class=\"wp-block-image is-style-default\">\n<figure data-wp-context=\"{&quot;imageId&quot;:&quot;6a0391b646056&quot;}\" data-wp-interactive=\"core\/image\" data-wp-key=\"6a0391b646056\" class=\"aligncenter size-thumbnail wp-lightbox-container\"><img loading=\"lazy\" decoding=\"async\" width=\"830\" height=\"271\" data-wp-class--hide=\"state.isContentHidden\" data-wp-class--show=\"state.isContentVisible\" data-wp-init=\"callbacks.setButtonStyles\" data-wp-on--click=\"actions.showLightbox\" data-wp-on--load=\"callbacks.setButtonStyles\" data-wp-on-window--resize=\"callbacks.setButtonStyles\" src=\"https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2024\/12\/HeatWave_logo.png?resize=150%2C150&amp;ssl=1\" alt=\"HeatWave\" class=\"wp-image-7222\" srcset=\"https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2024\/12\/HeatWave_logo.png?w=830&amp;ssl=1 830w, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2024\/12\/HeatWave_logo.png?resize=300%2C98&amp;ssl=1 300w, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2024\/12\/HeatWave_logo.png?resize=800%2C261&amp;ssl=1 800w, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2024\/12\/HeatWave_logo.png?resize=768%2C251&amp;ssl=1 768w\" sizes=\"auto, (max-width: 830px) 100vw, 830px\" 
\/><\/figure>\n<\/div>\n\n\n<p>Traditional <strong>SQL search relies on structured queries<\/strong> (SELECT, WHERE, JOIN, &#8230;) and exact or partial matches based on conditions (e.g., <code>WHERE name = 'Olivier'<\/code> \/ <code>WHERE name LIKE '%Olivier%'<\/code>).<\/p>\n\n\n\n<p>A typical query may look like:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">SELECT title \nFROM articles \nWHERE Category = 'HeatWave' OR tag LIKE \"%AI%\";<\/code><\/pre>\n\n\n\n<p>While efficient for structured data, this approach has limited flexibility for search variations and fails to grasp context or intent.<\/p>\n\n\n\n<p>An alternative is <strong>SQL <a href=\"https:\/\/dev.mysql.com\/doc\/refman\/9.2\/en\/fulltext-search.html\" target=\"_blank\" rel=\"noopener\" title=\"MySQL Full-Text Search\">Full-Text Search<\/a><\/strong> (<strong>FTS<\/strong>), which enables efficient keyword-based searches across large text datasets. For example, MySQL implements FTS using <code>MATCH<\/code> and <code>AGAINST<\/code> (e.g., <code>MATCH(name) AGAINST('Olivier')<\/code>). 
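<\/p>\n\n\n\n<p>As a minimal sketch, assuming the hypothetical <code>articles<\/code> table with <code>title<\/code> and <code>body<\/code> columns used in the examples of this article, a <code>FULLTEXT<\/code> index has to exist before <code>MATCH ... AGAINST<\/code> can be used on an InnoDB table, and it must cover exactly the columns listed in <code>MATCH()<\/code>. The index name <code>ft_title_body<\/code> is only an illustration:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">-- Assumption: articles(title, body) is a hypothetical table, not part of WordPress\n-- The column list must match the one used later in MATCH (title, body)\nALTER TABLE articles\n  ADD FULLTEXT INDEX ft_title_body (title, body);<\/code><\/pre>\n\n\n\n<p>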
This feature indexes text content within database columns, allowing for advanced search capabilities such as phrase matching, proximity searches, and relevance scoring.<\/p>\n\n\n\n<p>A typical query may look like:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">SELECT * FROM articles\nWHERE MATCH (title, body)\nAGAINST ('HeatWave' IN NATURAL LANGUAGE MODE);<\/code><\/pre>\n\n\n\n<p>FTS is usually faster and more relevant than basic SQL searches, efficiently handling large text fields. However, it remains keyword-based rather than semantic, meaning it may overlook context-based variations.<\/p>\n\n\n\n<p>Another option is <strong>AI-powered search<\/strong> using <strong>large language models<\/strong> (<strong>LLMs<\/strong>), also known as <strong>semantic search<\/strong>. Unlike keyword-based methods, it leverages <strong>embeddings<\/strong> \u2014 vector representations of words or sentences \u2014 to understand meaning. This enables it to handle synonyms, paraphrasing, and contextual relationships (e.g., searching for &lsquo;AI&rsquo; may also return articles on &lsquo;machine learning&rsquo;). 
Additionally, it often integrates <a href=\"https:\/\/dasini.net\/blog\/2024\/12\/10\/simplifying-ai-development-a-practical-guide-to-heatwave-genais-rag-vector-store-features\/\" target=\"_blank\" rel=\"noopener\" title=\"Simplifying AI Development: A Practical Guide to HeatWave GenAI\u2019s RAG &amp; Vector Store Features\"><strong>retrieval-augmented generation<\/strong><\/a> (<strong>RAG<\/strong>) to enhance responses with external knowledge.<\/p>\n\n\n\n<p>In this article, we&rsquo;ll dive deeper into AI-powered search using an LLM with the help of <a href=\"https:\/\/www.oracle.com\/heatwave\/genai\/\" target=\"_blank\" rel=\"noopener\" title=\"HeatWave GenAI provides integrated, automated, and secure generative AI with in-database large language models (LLMs)\"><strong>HeatWave GenAI<\/strong><\/a>&#8230;<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure data-wp-context=\"{&quot;imageId&quot;:&quot;6a0391b646699&quot;}\" data-wp-interactive=\"core\/image\" data-wp-key=\"6a0391b646699\" class=\"aligncenter size-large wp-lightbox-container\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" width=\"800\" height=\"299\" data-wp-class--hide=\"state.isContentHidden\" data-wp-class--show=\"state.isContentVisible\" data-wp-init=\"callbacks.setButtonStyles\" data-wp-on--click=\"actions.showLightbox\" data-wp-on--load=\"callbacks.setButtonStyles\" data-wp-on-window--resize=\"callbacks.setButtonStyles\" src=\"https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/03\/Key-features-of-HeatWave-GenAI.png?resize=800%2C299&#038;ssl=1\" alt=\"Key features of HeatWave GenAI: In-database LLMs, In-database vector store, Automated generation of embeddings, HeatWave Chat, Integrated with other generative AI services, Scale-out vector processing\" class=\"wp-image-7475\" srcset=\"https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/03\/Key-features-of-HeatWave-GenAI.png?resize=800%2C299&amp;ssl=1 800w, 
https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/03\/Key-features-of-HeatWave-GenAI.png?resize=300%2C112&amp;ssl=1 300w, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/03\/Key-features-of-HeatWave-GenAI.png?resize=768%2C287&amp;ssl=1 768w, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/03\/Key-features-of-HeatWave-GenAI.png?w=1344&amp;ssl=1 1344w\" sizes=\"auto, (max-width: 800px) 100vw, 800px\" \/><figcaption class=\"wp-element-caption\"><em><span style=\"text-decoration: underline;\">Key features of HeatWave GenAI<\/span><\/em><\/figcaption><\/figure>\n<\/div>\n\n\n<div style=\"height:25px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">What do we want to do?<\/h2>\n\n\n\n<p>The goal is to build an <strong>AI-powered search engine for an application<\/strong>, designed to provide users with the most relevant articles based on their queries, using semantic search. 
<br>I&rsquo;ll be using the data from my WordPress-based blog \u2014 <a href=\"https:\/\/dasini.net\/blog\/\" target=\"_blank\" rel=\"noopener\" title=\"https:\/\/dasini.net\/blog\/\">https:\/\/dasini.net\/blog\/<\/a> \u2014 with the <strong>AI component powered by HeatWave GenAI<\/strong>. This leverages its in-database large language models and vector store capabilities (<a href=\"https:\/\/dev.mysql.com\/doc\/heatwave\/en\/mys-hw-genai-supported-models.html#mys-hw-genai-embedding-models\" target=\"_blank\" rel=\"noopener\" title=\"HeatWave In-Database Embedding Models\">In-Database Embedding Models<\/a>, <a href=\"https:\/\/dev.mysql.com\/doc\/refman\/9.2\/en\/vector.html\" target=\"_blank\" rel=\"noopener\" title=\"The HeatWave MySQL VECTOR Type\">The VECTOR Type<\/a>, &nbsp;<a href=\"https:\/\/dev.mysql.com\/doc\/refman\/9.2\/en\/vector-functions.html\" target=\"_blank\" rel=\"noopener\" title=\"HeatWave MySQL Vector Functions\">Vector Functions<\/a>).<br>To ensure a focused and concise presentation, I&rsquo;ll simplify this implementation by limiting the search to post titles and excerpts rather than the full article. 
Although this will significantly reduce the relevance of the results, it provides an opportunity to explore more comprehensive solutions in a future article.<\/p>\n\n\n\n<div style=\"height:20px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p>I&rsquo;m using HeatWave 9.2.1:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">SELECT version();\n+-------------+\n| version()   |\n+-------------+\n| 9.2.1-cloud |\n+-------------+<\/code><\/pre>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Create the table that will contain the embeddings<\/h2>\n\n\n\n<p>In <a href=\"https:\/\/wordpress.org\/download\/\" target=\"_blank\" rel=\"noopener\" title=\"Get WordPress\">WordPress<\/a>, blog posts are stored in the <em><a href=\"https:\/\/codex.wordpress.org\/Database_Description#Table:_wp_posts\" target=\"_blank\" rel=\"noopener\" title=\"The core of the WordPress data is the posts\">wp_posts<\/a><\/em> table. The columns that matter for our goal are:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><em><strong>ID<\/strong><\/em> bigint unsigned NOT NULL AUTO_INCREMENT<\/li>\n\n\n\n<li><em><strong>post_title<\/strong><\/em> text NOT NULL<\/li>\n\n\n\n<li><em><strong>post_excerpt<\/strong><\/em> text NOT NULL<\/li>\n\n\n\n<li><em><strong>guid<\/strong><\/em> varchar(255) NOT NULL DEFAULT ''<\/li>\n<\/ul>\n\n\n\n<p>These columns contain, respectively, the unique identifier of the post, its title, a short excerpt (hopefully) of the post, and the URL of the article.<br>Example:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">wordpress  SQL&gt; \nSELECT ID, post_title, post_excerpt, guid FROM wp_posts WHERE post_status = 'publish' AND post_type = 'post' AND ID = 1234\\G\n*************************** 1. 
row ***************************\n          ID: 1234\n  post_title: HeatWave GenAI: Your AI-Powered Content Creation Partner\npost_excerpt: Generative artificial intelligence (GenAI) is reshaping the content creation landscape. By training on vast datasets, these \"intelligent\" systems can produce new, human-quality content across a multitude of domains.\n\nOracle's HeatWave GenAI (starting with version 9.0.1) is at the forefront of this revolution, offering an integrated platform that combines in-database large language models (LLMs), vector stores, and scale-out vector processing to streamline content generation.\nThis article explores how HeatWave GenAI is empowering businesses to produce high-quality content rapidly and effectively, making it an indispensable tool for industries demanding speed, accuracy, and security.\n        guid: https:\/\/dasini.net\/blog\/?p=1234<\/code><\/pre>\n\n\n\n<div style=\"height:20px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p>Based on this information, I created a table, <em>wp_posts_embeddings_minilm<\/em>, that contains these four columns:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">wordpress  SQL&gt; \nCREATE TABLE `wp_posts_embeddings_minilm` (\n  `ID` bigint unsigned NOT NULL AUTO_INCREMENT,\n  `post_title` text NOT NULL,\n  `post_excerpt` text NOT NULL,\n  `guid` varchar(255) NOT NULL DEFAULT '',\n  PRIMARY KEY (`ID`),\n  KEY `post_title` (`post_title`(255)),\n  KEY `post_excerpt` (`post_excerpt`(255)),\n  KEY `guid` (`guid`)\n) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;<\/code><\/pre>\n\n\n\n<div style=\"height:20px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p>Then I populated this new table with my published posts:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">wordpress  SQL&gt; \nINSERT INTO wp_posts_embeddings_minilm SELECT ID, post_title, post_excerpt, guid FROM wp_posts WHERE 
post_status = 'publish' AND post_type = 'post';<\/code><\/pre>\n\n\n\n<p>That completes the first step. Now we need to create the embeddings for each published post.<\/p>\n\n\n\n<div style=\"height:25px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Create the embeddings<\/h2>\n\n\n\n<p>We can query HeatWave&rsquo;s <em>sys.ML_SUPPORTED_LLMS<\/em> table to see which embedding models are available:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">wordpress  SQL&gt; \nSELECT * FROM sys.ML_SUPPORTED_LLMS WHERE model_type = \"embedding\";\n+--------------------------------------+------------+\n| model_name                           | model_type |\n+--------------------------------------+------------+\n| minilm                               | embedding  |\n| all_minilm_l12_v2                    | embedding  |\n| multilingual-e5-small                | embedding  |\n| cohere.embed-english-light-v3.0      | embedding  |\n| cohere.embed-multilingual-v3.0       | embedding  |\n| cohere.embed-multilingual-light-v3.0 | embedding  |\n| cohere.embed-english-v3.0            | embedding  |\n+--------------------------------------+------------+<\/code><\/pre>\n\n\n\n<p>The first three (<strong>minilm<\/strong>, <strong>all_minilm_l12_v2<\/strong> &amp; <strong>multilingual-e5-small<\/strong>) are HeatWave&rsquo;s in-database embedding models, and there is <strong>no extra cost<\/strong> to use them (minilm and all_minilm_l12_v2 are two names for the same model). Both are for <strong>encoding text or files in any supported language<\/strong>. 
HeatWave GenAI uses <strong>minilm<\/strong> by default for encoding English documents. Although my blog also contains French articles, I&rsquo;ll use this model.<\/p>\n\n\n\n<p>The other models are from&nbsp;<strong><a href=\"https:\/\/www.oracle.com\/artificial-intelligence\/generative-ai\/generative-ai-service\/\" target=\"_blank\" rel=\"noopener\" title=\"OCI Generative AI Service\">OCI Generative AI Service<\/a><\/strong>. They can also be used in the HeatWave workflow; however, their use incurs additional costs.<\/p>\n\n\n\n<p>The comprehensive list of the languages, embedding models, and large language models (LLMs) that HeatWave GenAI supports is available&nbsp;<a href=\"https:\/\/dev.mysql.com\/doc\/heatwave\/en\/mys-hw-genai-supported-models.html\" target=\"_blank\" rel=\"noopener\" title=\"Supported Languages, Embedding Models, and LLMs\">here<\/a>.<\/p>\n\n\n\n<div style=\"height:25px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p>We are going to use the <em>minilm<\/em> model. The idea is to use this embedding model to encode the text of each row into a vector embedding. Creating embeddings is very easy with HeatWave: we only need one routine, <a href=\"https:\/\/dev.mysql.com\/doc\/heatwave\/en\/mys-hwgenai-ml-embed-table.html\" target=\"_blank\" rel=\"noopener\" title=\"The ML_EMBED_TABLE routine runs multiple embedding generations in a batch, in parallel\"><strong>sys.ML_EMBED_TABLE<\/strong><\/a>. 
This stored procedure runs multiple embedding generations in a batch, in parallel.<\/p>\n\n\n\n<p>As a reminder this is what the <em>wp_posts_embeddings_minilm<\/em> table looks like:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">DESC wp_posts_embeddings_minilm;\n+--------------+-----------------+------+-----+---------+----------------+\n| Field        | Type            | Null | Key | Default | Extra          |\n+--------------+-----------------+------+-----+---------+----------------+\n| ID           | bigint unsigned | NO   | PRI | NULL    | auto_increment |\n| post_title   | text            | NO   | MUL | NULL    |                |\n| post_excerpt | text            | NO   | MUL | NULL    |                |\n| guid         | varchar(255)    | NO   | MUL |         |                |\n+--------------+-----------------+------+-----+---------+----------------+<\/code><\/pre>\n\n\n\n<div style=\"height:20px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p>Let&rsquo;s encode the title, with <a href=\"https:\/\/dev.mysql.com\/doc\/heatwave\/en\/mys-hwgenai-ml-embed-table.html\" target=\"_blank\" rel=\"noopener\" title=\"The ML_EMBED_TABLE routine runs multiple embedding generations in a batch, in parallel\">sys.ML_EMBED_TABLE<\/a>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">wordpress  SQL&gt; \nCALL sys.ML_EMBED_TABLE(\"wordpress.wp_posts_embeddings_minilm.post_title\", \"wordpress.wp_posts_embeddings_minilm.post_title_embedding\", JSON_OBJECT(\"model_id\", \"minilm\"));<\/code><\/pre>\n\n\n\n<p>A new column &#8211; <em>post_title_embedding<\/em> &#8211; was added. 
It uses the <a href=\"https:\/\/dev.mysql.com\/doc\/refman\/9.2\/en\/vector.html\" target=\"_blank\" rel=\"noopener\" title=\"The MySQL VECTOR Type\">vector type<\/a>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">DESC wp_posts_embeddings_minilm;\n+----------------------+-----------------+------+-----+---------+----------------+\n| Field                | Type            | Null | Key | Default | Extra          |\n+----------------------+-----------------+------+-----+---------+----------------+\n| ID                   | bigint unsigned | NO   | PRI | NULL    | auto_increment |\n| post_title           | text            | NO   | MUL | NULL    |                |\n| post_excerpt         | text            | NO   | MUL | NULL    |                |\n| guid                 | varchar(255)    | NO   | MUL |         |                |\n| post_title_embedding | vector(2048)    | NO   |     | NULL    |                |\n+----------------------+-----------------+------+-----+---------+----------------+<\/code><\/pre>\n\n\n\n<p>If you want to inspect its content, use the <a href=\"https:\/\/dev.mysql.com\/doc\/refman\/9.1\/en\/vector-functions.html#function_vector-to-string\" target=\"_blank\" rel=\"noopener\" title=\"Given the binary representation of a VECTOR column value, this function returns its string representation\">FROM_VECTOR<\/a> (alias of <a href=\"https:\/\/dev.mysql.com\/doc\/refman\/9.1\/en\/vector-functions.html#function_vector-to-string\" target=\"_blank\" rel=\"noopener\" title=\"Given the binary representation of a VECTOR column value, this function returns its string representation\">VECTOR_TO_STRING<\/a>) function:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">SELECT post_title, FROM_VECTOR(post_title_embedding) FROM wp_posts_embeddings_minilm WHERE ID = 1234\\G\n*************************** 1. 
row ***************************\n                       post_title: HeatWave GenAI: Your AI-Powered Content Creation Partner\nFROM_VECTOR(post_title_embedding): [-6.91916e-02,-8.97512e-02,4.70568e-02,2.00090e-04,2.08057e-03,-3.68097e-02, ...<\/code><\/pre>\n\n\n\n<div style=\"height:20px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p>Now let&rsquo;s create the embeddings for the excerpt:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">CALL sys.ML_EMBED_TABLE(\"wordpress.wp_posts_embeddings_minilm.post_excerpt\", \"wordpress.wp_posts_embeddings_minilm.post_excerpt_embedding\", JSON_OBJECT(\"model_id\", \"minilm\"));<\/code><\/pre>\n\n\n\n<p>Now the new table structure is:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">DESC wp_posts_embeddings_minilm;\n+------------------------+-----------------+------+-----+---------+----------------+\n| Field                  | Type            | Null | Key | Default | Extra          |\n+------------------------+-----------------+------+-----+---------+----------------+\n| ID                     | bigint unsigned | NO   | PRI | NULL    | auto_increment |\n| post_title             | text            | NO   | MUL | NULL    |                |\n| post_excerpt           | text            | NO   | MUL | NULL    |                |\n| guid                   | varchar(255)    | NO   | MUL |         |                |\n| post_title_embedding   | vector(2048)    | NO   |     | NULL    |                |\n| post_excerpt_embedding | vector(2048)    | NO   |     | NULL    |                |\n+------------------------+-----------------+------+-----+---------+----------------+<\/code><\/pre>\n\n\n\n<p>Title and excerpt embeddings have been created!<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">SHOW CREATE TABLE wp_posts_embeddings_minilm\\G\n*************************** 1. 
row ***************************\n       Table: wp_posts_embeddings_minilm\nCreate Table: CREATE TABLE `wp_posts_embeddings_minilm` (\n  `ID` bigint unsigned NOT NULL AUTO_INCREMENT,\n  `post_title` text NOT NULL,\n  `post_excerpt` text NOT NULL,\n  `guid` varchar(255) NOT NULL DEFAULT '',\n  `post_title_embedding` vector(2048) NOT NULL COMMENT 'GENAI_OPTIONS=EMBED_MODEL_ID=minilm',\n  `post_excerpt_embedding` vector(2048) NOT NULL COMMENT 'GENAI_OPTIONS=EMBED_MODEL_ID=minilm',\n  PRIMARY KEY (`ID`),\n  KEY `post_title` (`post_title`(255)),\n  KEY `post_excerpt` (`post_excerpt`(255)),\n  KEY `guid` (`guid`)\n) ENGINE=InnoDB AUTO_INCREMENT=7059 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci<\/code><\/pre>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Query Encoding and Vector Similarity Operations<\/h2>\n\n\n\n<p>To retrieve the most relevant articles, we first need to convert the user\u2019s query into a vector representation that captures its semantic meaning. This process, known as query encoding, transforms text into numerical embeddings. Once encoded, we can perform a similarity search to find the most relevant results by comparing the query\u2019s embedding with precomputed vectors from our database.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"encodeQuery\">Encode the query into a vector embedding<\/h3>\n\n\n\n<p>To generate a vector embedding for the query, we use the <code><a href=\"https:\/\/dev.mysql.com\/doc\/heatwave\/en\/mys-hwgenai-ml-embed-row.html\" target=\"_blank\" rel=\"noopener\" title=\"ML_EMBED_ROW uses the specified embedding model to encode the specified text or query into a vector embedding.\">ML_EMBED_ROW<\/a><\/code> routine. This routine applies the specified embedding model to encode the given text into a vector representation. 
The routine returns a <code><a href=\"https:\/\/dev.mysql.com\/doc\/refman\/9.2\/en\/vector.html\" target=\"_blank\" rel=\"noopener\" title=\"The MySQL VECTOR Type\">VECTOR<\/a><\/code> containing the numerical embedding of the text.<\/p>\n\n\n\n<p>Using it is straightforward. I define two variables: <code>@searchItem<\/code> (the text to encode) and <code>@embeddOptions<\/code> (the embedding model used for encoding):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">wordpress  SQL&gt; \n-- Set variables\nSET @embeddOptions = '{\"model_id\": \"minilm\"}';\nSET @searchItem = \"Generative artificial intelligence\";\n\n-- Encode the query using the embedding model\nSELECT sys.ML_EMBED_ROW(@searchItem, @embeddOptions) INTO @searchItemEmbedding;<\/code><\/pre>\n\n\n\n<p>If you want to see the raw content of the variable (it is binary, so not sure it is a good idea):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">SELECT @searchItemEmbedding;\n0x94927EBDB244843CB36A4E3CC747A2BC7EBFC6BDB5F4B7B... (truncated)<\/code><\/pre>\n\n\n\n<p>You can print the vector in a readable form using the following query:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">SELECT FROM_VECTOR(vect) \nFROM (\n      SELECT sys.ML_EMBED_ROW(@searchItem, @embeddOptions) AS vect\n     ) AS dt;\n[-6.21515e-02,1.61460e-02,1.25987e-02,-1.98096e-02,... (truncated)<\/code><\/pre>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h3 class=\"wp-block-heading\">Similarity search<\/h3>\n\n\n\n<p>To retrieve relevant blog content, we perform vector similarity calculations using the <code><a href=\"https:\/\/dev.mysql.com\/doc\/refman\/9.2\/en\/vector-functions.html#function_distance\" target=\"_blank\" rel=\"noopener\" title=\"Calculates the distance between two vectors per the specified calculation method\"><strong>DISTANCE<\/strong><\/a><\/code> function. 
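<\/p>\n\n\n\n<p>To get a feel for the values involved, here is a small sketch that uses hand-written two-dimensional vectors (illustrative values, not real embeddings) built with <code>STRING_TO_VECTOR<\/code>. It assumes that the <code>COSINE<\/code> metric follows the usual definition of cosine distance, 1 minus the cosine similarity, so that vectors pointing in the same direction are at a distance close to 0:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">-- Toy vectors, for illustration only (not produced by an embedding model)\nSELECT\n  DISTANCE(STRING_TO_VECTOR('[1,0]'), STRING_TO_VECTOR('[2,0]'), 'COSINE') AS same_direction,\n  DISTANCE(STRING_TO_VECTOR('[1,0]'), STRING_TO_VECTOR('[0,1]'), 'COSINE') AS orthogonal;<\/code><\/pre>\n\n\n\n<p>Under that assumption, the closest articles are the ones with the smallest distance, which is why the searches in this article sort by ascending distance.<\/p>\n\n\n\n<p>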
<code>DISTANCE<\/code> computes the distance between two vectors using the <code><strong>COSINE<\/strong><\/code>, <code><strong>DOT<\/strong><\/code>, or <code><strong>EUCLIDEAN<\/strong><\/code> metric. <br>In our case, the two vectors being compared are the encoded query (<code>@searchItemEmbedding<\/code>) and the precomputed embeddings stored in the <code>wp_posts_embeddings_minilm<\/code> table (<code>post_title_embedding<\/code> and <code>post_excerpt_embedding<\/code>).<\/p>\n\n\n\n<p>A <strong>cosine similarity search<\/strong> for article titles can be conducted using:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">-- Ex 1.0. Similarity search only on titles (post_title_embedding)\n\nSELECT ID, post_title, post_excerpt, DISTANCE(@searchItemEmbedding, post_title_embedding, 'COSINE') AS min_distance \nFROM wordpress.wp_posts_embeddings_minilm \nORDER BY min_distance ASC \nLIMIT 3\\G\n*************************** 1. row ***************************\n          ID: 2345\n  post_title: Simplifying AI Development: A Practical Guide to HeatWave GenAI\u2019s RAG &amp;amp; Vector Store Features\npost_excerpt: This tutorial explores HeatWave GenAI, a cloud service that simplifies interacting with unstructured data using natural language. It combines large language models, vector stores, and SQL queries to enable tasks like content generation, chatbot, and retrieval-augmented generation (RAG). The focus is on RAG and how HeatWave GenAI\u2019s architecture helps users gain insights from their data.\nmin_distance: 0.1232912540435791\n*************************** 2. row ***************************\n          ID: 1234\n  post_title: HeatWave GenAI: Your AI-Powered Content Creation Partner\npost_excerpt: Generative artificial intelligence (GenAI) is reshaping the content creation landscape. 
By training on vast datasets, these \"intelligent\" systems can produce new, human-quality content across a multitude of domains.\n\nOracle's HeatWave GenAI (starting with version 9.0.1) is at the forefront of this revolution, offering an integrated platform that combines in-database large language models (LLMs), vector stores, and scale-out vector processing to streamline content generation.\nThis article explores how HeatWave GenAI is empowering businesses to produce high-quality content rapidly and effectively, making it an indispensable tool for industries demanding speed, accuracy, and security.\nmin_distance: 0.1264844536781311\n*************************** 3. row ***************************\n          ID: 3456\n  post_title: HeatWave GenAI: Sentiment Analysis Made Easy-Peasy\npost_excerpt: This new AI tech, called generative AI (or GenAI), can dive deep into what people are saying and tell us if they\u2019re feeling positive, negative, or neutral.\nLet\u2019s see how HeatWave GenAI, can help you to enhance your understanding of customer sentiment, improve decision-making, and drive business success.\nmin_distance: 0.12810611724853516<\/code><\/pre>\n\n\n\n<p>An arguably more elegant query, using a Common Table Expression (CTE), is:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">-- Ex 1.1. Similarity search only on titles (post_title_embedding) w\/ CTE\nWITH distances AS (\n    SELECT\n        ID, \n        post_title,\n        post_excerpt,\n        DISTANCE(@searchItemEmbedding, post_title_embedding, 'COSINE') AS min_distance\n    FROM wordpress.wp_posts_embeddings_minilm\n)\nSELECT *\nFROM distances\nORDER BY min_distance\nLIMIT 3\\G\n*************************** 1. 
row ***************************\n          ID: 2345\n  post_title: Simplifying AI Development: A Practical Guide to HeatWave GenAI\u2019s RAG &amp;amp; Vector Store Features\npost_excerpt: This tutorial explores HeatWave GenAI, a cloud service that simplifies interacting with unstructured data using natural language. It combines large language models, vector stores, and SQL queries to enable tasks like content generation, chatbot, and retrieval-augmented generation (RAG). The focus is on RAG and how HeatWave GenAI\u2019s architecture helps users gain insights from their data.\nmin_distance: 0.1232912540435791\n*************************** 2. row ***************************\n          ID: 1234\n  post_title: HeatWave GenAI: Your AI-Powered Content Creation Partner\npost_excerpt: Generative artificial intelligence (GenAI) is reshaping the content creation landscape. By training on vast datasets, these \"intelligent\" systems can produce new, human-quality content across a multitude of domains.\n\nOracle's HeatWave GenAI (starting with version 9.0.1) is at the forefront of this revolution, offering an integrated platform that combines in-database large language models (LLMs), vector stores, and scale-out vector processing to streamline content generation.\nThis article explores how HeatWave GenAI is empowering businesses to produce high-quality content rapidly and effectively, making it an indispensable tool for industries demanding speed, accuracy, and security.\nmin_distance: 0.1264844536781311\n*************************** 3. 
row ***************************\n          ID: 3456\n  post_title: HeatWave GenAI: Sentiment Analysis Made Easy-Peasy\npost_excerpt: This new AI tech, called generative AI (or GenAI), can dive deep into what people are saying and tell us if they\u2019re feeling positive, negative, or neutral.\nLet\u2019s see how HeatWave GenAI, can help you to enhance your understanding of customer sentiment, improve decision-making, and drive business success.\nmin_distance: 0.12810611724853516<\/code><\/pre>\n\n\n\n<div style=\"height:25px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p>Of course, you can perform the same search on the excerpt alone. <br>Alternatively, you can run a similarity search across both columns for more comprehensive results:<\/p>\n\n\n\n<pre id=\"example3.1\" class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">-- Ex 3.1 Similarity search on title &amp; excerpt (post_title_embedding &amp; post_excerpt_embedding)\n\nWITH distances AS (\n    SELECT\n        ID, \n        post_title,\n        post_excerpt,\n        (\n          DISTANCE(post_title_embedding, @searchItemEmbedding, 'COSINE') + \n          DISTANCE(post_excerpt_embedding, @searchItemEmbedding, 'COSINE')\n        ) \/ 2 AS avg_distance\n    FROM WP_embeddings.wp_posts_embeddings_minilm\n)\nSELECT *\nFROM distances\nORDER BY avg_distance\nLIMIT 5\\G\n*************************** 1. row ***************************\n          ID: 1234\n  post_title: HeatWave GenAI: Your AI-Powered Content Creation Partner\npost_excerpt: Generative artificial intelligence (GenAI) is reshaping the content creation landscape. 
By training on vast datasets, these \"intelligent\" systems can produce new, human-quality content across a multitude of domains.\n\nOracle's HeatWave GenAI (starting with version 9.0.1) is at the forefront of this revolution, offering an integrated platform that combines in-database large language models (LLMs), vector stores, and scale-out vector processing to streamline content generation.\nThis article explores how HeatWave GenAI is empowering businesses to produce high-quality content rapidly and effectively, making it an indispensable tool for industries demanding speed, accuracy, and security.\navg_distance: 0.5131600499153137\n*************************** 2. row ***************************\n          ID: 3456\n  post_title: HeatWave GenAI: Sentiment Analysis Made Easy-Peasy\npost_excerpt: This new AI tech, called generative AI (or GenAI), can dive deep into what people are saying and tell us if they\u2019re feeling positive, negative, or neutral.\nLet\u2019s see how HeatWave GenAI, can help you to enhance your understanding of customer sentiment, improve decision-making, and drive business success.\navg_distance: 0.5587222874164581\n*************************** 3. row ***************************\n          ID: 2345\n  post_title: Simplifying AI Development: A Practical Guide to HeatWave GenAI\u2019s RAG &amp;amp; Vector Store Features\npost_excerpt: This tutorial explores HeatWave GenAI, a cloud service that simplifies interacting with unstructured data using natural language. It combines large language models, vector stores, and SQL queries to enable tasks like content generation, chatbot, and retrieval-augmented generation (RAG). The focus is on RAG and how HeatWave GenAI\u2019s architecture helps users gain insights from their data.\navg_distance: 0.6403274536132812\n*************************** 4. 
row ***************************\n          ID: 6789\n  post_title: Webinar - Apprentissage automatique avec MySQL HeatWave\npost_excerpt: HeatWave Machine Learning (ML) inclut tout ce dont les utilisateurs ont besoin pour cr\u00e9er, former, d\u00e9ployer et expliquer des mod\u00e8les d\u2019apprentissage automatique dans MySQL HeatWave, sans co\u00fbt suppl\u00e9mentaire.\n\nDans ce webinaire vous apprendrez...\navg_distance: 0.7226708233356476\n*************************** 5. row ***************************\n          ID: 5678\n  post_title: Building an Interactive LLM Chatbot with  HeatWave Using Python\npost_excerpt: AI-powered applications require robust and scalable database solutions to manage and process large amounts of data efficiently. HeatWave is an excellent choice for such applications, providing high-performance OLTP, analytics, machine learning and generative artificial intelligence capabilities.\n\nIn this article, we will explore a Python 3 script that connects to an HeatWave instance and enables users to interact with different large language models (LLMs) dynamically.\navg_distance: 0.736954927444458\n<\/code><\/pre>\n\n\n\n<p>In this example, title and excerpt similarity are combined by averaging the two distances. 
However, other aggregation strategies, such as a weighted average of the two distances, can be used instead.<\/p>\n\n\n\n<figure data-wp-context=\"{&quot;imageId&quot;:&quot;6a0391b6474e4&quot;}\" data-wp-interactive=\"core\/image\" data-wp-key=\"6a0391b6474e4\" class=\"wp-block-image size-full wp-lightbox-container\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" width=\"2560\" height=\"1440\" data-wp-class--hide=\"state.isContentHidden\" data-wp-class--show=\"state.isContentVisible\" data-wp-init=\"callbacks.setButtonStyles\" data-wp-on--click=\"actions.showLightbox\" data-wp-on--load=\"callbacks.setButtonStyles\" data-wp-on-window--resize=\"callbacks.setButtonStyles\" src=\"https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/03\/HW_GenaI_search_engine.gif?resize=2560%2C1440&#038;ssl=1\" alt=\"Similarity search across title and excerpt\" class=\"wp-image-7556\"\/><\/figure>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Manage new articles<\/h2>\n\n\n\n<p>My most recent article (before this one) is titled: <a href=\"https:\/\/dasini.net\/blog\/2025\/02\/11\/building-an-interactive-llm-chatbot-with-heatwave-using-python\/\" target=\"_blank\" rel=\"noopener\" title=\"Building an 
Interactive LLM Chatbot with HeatWave Using Python\">Building an Interactive LLM Chatbot with HeatWave Using Python<\/a>.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"SQL\" class=\"language-SQL\">wordpress  SQL&gt; \nSELECT ID, post_title, post_excerpt, guid FROM wp_posts WHERE ID = 4567\\G\n*************************** 1. row ***************************\n          ID: 4567\n  post_title: Building an Interactive LLM Chatbot with  HeatWave Using Python\npost_excerpt: AI-powered applications require robust and scalable database solutions to manage and process large amounts of data efficiently. HeatWave is an excellent choice for such applications, providing high-performance OLTP, analytics, machine learning and generative artificial intelligence capabilities.\n\nIn this article, we will explore a Python 3 script that connects to an HeatWave instance and enables users to interact with different large language models (LLMs) dynamically.\n        guid: https:\/\/dasini.net\/blog\/?p=4567<\/code><\/pre>\n\n\n\n<div style=\"height:20px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p>To enable this article in the AI-powered search engine, I need to generate embeddings for both the title (<code><em>post_title<\/em><\/code>) and excerpt (<code><em>post_excerpt<\/em><\/code>). <br>This is accomplished using <code>sys.<a href=\"https:\/\/dev.mysql.com\/doc\/heatwave\/en\/mys-hwgenai-ml-embed-row.html\" target=\"_blank\" rel=\"noopener\" title=\"ML_EMBED_ROW uses the specified embedding model to encode the specified text or query into a vector embedding\">ML_EMBED_ROW<\/a><\/code>, which encodes text into a vector embedding based on a specified model, returning a <code><a href=\"https:\/\/dev.mysql.com\/doc\/refman\/9.2\/en\/vector.html\" target=\"_blank\" rel=\"noopener\" title=\"The HeatWave MySQL VECTOR Type\"><code>VECTOR<\/code><\/a><\/code> data type. 
 <\/p>\n\n\n\n<p>These embeddings will then be inserted into the <code><em>wp_posts_embeddings_minilm<\/em><\/code> table, utilizing the same <code><em>minilm<\/em><\/code> embedding model:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"sql\" class=\"language-sql\">wordpress  SQL&gt; \nINSERT INTO wp_posts_embeddings_minilm (ID, post_title, post_excerpt, guid, post_title_embedding, post_excerpt_embedding) \n     SELECT ID, post_title, post_excerpt, guid, sys.ML_EMBED_ROW(post_title, '{\"model_id\": \"minilm\"}'), sys.ML_EMBED_ROW(post_excerpt,'{\"model_id\": \"minilm\"}')      \n     FROM wordpress.wp_posts \n     WHERE ID = 4567;<\/code><\/pre>\n\n\n\n<div style=\"height:25px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p>Et voil\u00e0!<br>Although this implementation focused on post titles and excerpts for brevity, it lays a solid foundation for more comprehensive searches across entire article content, a topic to be explored in a future article.<\/p>\n\n\n\n<div style=\"height:75px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Peroration<\/h2>\n\n\n\n<p>In this article, we explored how <strong>HeatWave GenAI enables AI-powered semantic search<\/strong> by leveraging <strong>in-database embeddings and vector similarity operations<\/strong>. Unlike traditional SQL search and full-text search, which rely on keyword matching, HeatWave GenAI provides deeper contextual understanding by transforming text into vector representations and performing similarity searches using the <code>DISTANCE<\/code> function.<\/p>\n\n\n\n<p>The power of HeatWave GenAI&rsquo;s in-database LLMs and vector store features was highlighted, showcasing its efficiency in handling both embedding generation and similarity calculations. Furthermore, the process of integrating new articles into the search engine by generating and inserting their embeddings was outlined, ensuring the search remains up-to-date. 
This approach not only enhances content discovery but also lays the groundwork for more advanced applications, such as personalized recommendations and intelligent query responses.<\/p>\n\n\n\n<p>By adopting AI-powered search, we empower users with a more intuitive and effective way to discover relevant information, ultimately improving the overall user experience. This is made possible by <strong>HeatWave GenAI, which provides a robust and scalable solution for integrating advanced AI capabilities directly within the database<\/strong>.<\/p>\n\n\n\n<p>Stay tuned for more insights!<\/p>\n\n\n\n<div style=\"height:25px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p><em>To be continued&#8230;<\/em><\/p>\n\n\n\n<div style=\"height:100px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div style=\"height:20px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p><a href=\"https:\/\/www.linkedin.com\/groups\/12524512\/\" target=\"_blank\" rel=\"noopener\" title=\"Olivier DASINI on Linkedin\">Follow me on LinkedIn<\/a><\/p>\n\n\n\n<p>Watch my videos on my <a href=\"https:\/\/www.youtube.com\/channel\/UC12TulyJsJZHoCmby3Nm3WQ\" target=\"_blank\" rel=\"noreferrer noopener\" title=\"Olivier's MySQL Channel\">YouTube channel<\/a> and <a href=\"https:\/\/www.youtube.com\/channel\/UC12TulyJsJZHoCmby3Nm3WQ\/?sub_confirmation=1\" target=\"_blank\" rel=\"noreferrer noopener\" title=\"Subscribe\">subscribe<\/a>.<\/p>\n\n\n\n<p>My <a href=\"https:\/\/www.slideshare.net\/freshdaz\" target=\"_blank\" rel=\"noreferrer noopener\" title=\"Olivier DASINI on Slideshare\">Slideshare account<\/a>.<\/p>\n\n\n\n<p>My <a href=\"https:\/\/speakerdeck.com\/freshdaz\/\" target=\"_blank\" rel=\"noreferrer noopener\" title=\"Olivier DASINI on Speaker Deck\">Speaker Deck account<\/a>.<\/p>\n\n\n\n<div style=\"height:25px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p class=\"has-vivid-red-color has-text-color\"><strong>Thanks for using HeatWave 
&amp; MySQL!<\/strong><\/p>\n\n\n\n<div style=\"height:50px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Discover how to build an AI-powered search engine for your applications using HeatWave GenAI. This approach leverages large language models (LLMs) for semantic search, offering a smarter alternative to traditional SQL and full-text search methods. By using embeddings\u2014vector representations of words\u2014the search engine understands context and intent, delivering more relevant results.<\/p>\n<p>In this article, I&rsquo;ll guide you through building an AI-powered search for a WordPress blog using HeatWave GenAI, focusing on its in-database LLMs and vector store capabilities. We&rsquo;ll create embeddings for post titles and excerpts to enable semantic search, ensuring users find the most relevant content quickly.<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"footnotes":""},"categories":[1702,1740,1694,1719,339],"tags":[1704,1700,1697,1738],"class_list":["post-7363","post","type-post","status-publish","format-standard","hentry","category-ai","category-artificial-intelligence","category-heatwave-en","category-mds-en","category-tuto-en","tag-ai","tag-genai","tag-heatwave-fr-en","tag-llm"],"aioseo_notices":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p9LfWW-1UL","jetpack-related-posts":[{"id":7711,"url":"https:\/\/dasini.net\/blog\/2025\/04\/15\/build-an-ai-powered-search-engine-with-heatwave-genai-part-3\/","url_meta":{"origin":7363,"position":0},"title":"Build an AI-Powered Search Engine with HeatWave GenAI (part 3)","author":"Olivier DASINI","date":"15 avril 2025","format":false,"excerpt":"In this latest post, the final part of my series on building an AI-powered search 
engine with HeatWave GenAI, I dive into enhancing AI-powered search by embedding full article content into HeatWave. By cleaning HTML, chunking content, generating embeddings, and running semantic similarity searches directly within HeatWave, we unlock highly\u2026","rel":"","context":"Dans &quot;AI&quot;","block_context":{"text":"AI","link":"https:\/\/dasini.net\/blog\/category\/ai\/"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":7812,"url":"https:\/\/dasini.net\/blog\/2025\/05\/13\/oracle-dev-days-2025-french-edition\/","url_meta":{"origin":7363,"position":1},"title":"Oracle Dev Days 2025 \u2013 French Edition","author":"Olivier DASINI","date":"13 mai 2025","format":false,"excerpt":"Join the Oracle Dev Days \u2013 French Edition, from May 20 to 22, 2025! This must-attend event (in French) offers a rich program exploring the latest advancements in AI, databases, cloud, and Java. Join me on May 21 at 2:00 PM for the day dedicated to \u201cDatabase & AI.\u201d I\u2019ll\u2026","rel":"","context":"Dans &quot;Conf\u00e9rence&quot;","block_context":{"text":"Conf\u00e9rence","link":"https:\/\/dasini.net\/blog\/category\/conference-en\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/05\/Olivier_Dasini_HeatWave.png?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/05\/Olivier_Dasini_HeatWave.png?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/05\/Olivier_Dasini_HeatWave.png?resize=525%2C300&ssl=1 1.5x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/05\/Olivier_Dasini_HeatWave.png?resize=700%2C400&ssl=1 2x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/05\/Olivier_Dasini_HeatWave.png?resize=1050%2C600&ssl=1 
3x"},"classes":[]},{"id":7566,"url":"https:\/\/dasini.net\/blog\/2025\/04\/08\/build-an-ai-powered-search-engine-with-heatwave-genai-part-2\/","url_meta":{"origin":7363,"position":2},"title":"Build an AI-Powered Search Engine with HeatWave GenAI (part 2)","author":"Olivier DASINI","date":"8 avril 2025","format":false,"excerpt":"In this part 2 we focused on enhancing search relevance. We introduced reranking techniques using weighted distances of titles and excerpts to refine initial search results. Then we delved into leveraging article summaries for more effective semantic search, utilizing HeatWave's capability to execute JavaScript stored procedures for sanitizing HTML content\u2026","rel":"","context":"Dans &quot;AI&quot;","block_context":{"text":"AI","link":"https:\/\/dasini.net\/blog\/category\/ai\/"},"img":{"alt_text":"Similarity search across title, excerpt and summary","src":"https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/04\/HWGenAIsearchEngine2.gif?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/04\/HWGenAIsearchEngine2.gif?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/04\/HWGenAIsearchEngine2.gif?resize=525%2C300&ssl=1 1.5x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/04\/HWGenAIsearchEngine2.gif?resize=700%2C400&ssl=1 2x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/04\/HWGenAIsearchEngine2.gif?resize=1050%2C600&ssl=1 3x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/04\/HWGenAIsearchEngine2.gif?resize=1400%2C800&ssl=1 4x"},"classes":[]},{"id":7252,"url":"https:\/\/dasini.net\/blog\/2025\/02\/11\/building-an-interactive-llm-chatbot-with-heatwave-using-python\/","url_meta":{"origin":7363,"position":3},"title":"Building an Interactive LLM Chatbot with  HeatWave Using Python","author":"Olivier DASINI","date":"11 f\u00e9vrier 
2025","format":false,"excerpt":"AI-powered applications require robust and scalable database solutions to manage and process large amounts of data efficiently. HeatWave is an excellent choice for such applications, providing high-performance OLTP, analytics, machine learning and generative artificial intelligence capabilities. In this article, we will explore a Python 3 script that connects to an\u2026","rel":"","context":"Dans &quot;HeatWave&quot;","block_context":{"text":"HeatWave","link":"https:\/\/dasini.net\/blog\/category\/heatwave-en\/"},"img":{"alt_text":"simple but robust chatbot system leveraging HeatWave GenAI and its in-database Mistral LLM","src":"https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/02\/HW-Chat-mistral-7b.gif?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/02\/HW-Chat-mistral-7b.gif?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/02\/HW-Chat-mistral-7b.gif?resize=525%2C300&ssl=1 1.5x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/02\/HW-Chat-mistral-7b.gif?resize=700%2C400&ssl=1 2x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2025\/02\/HW-Chat-mistral-7b.gif?resize=1050%2C600&ssl=1 3x"},"classes":[]},{"id":6752,"url":"https:\/\/dasini.net\/blog\/2024\/08\/07\/heatwave-genai-your-ai-powered-content-creation-partner\/","url_meta":{"origin":7363,"position":4},"title":"HeatWave GenAI: Your AI-Powered Content Creation Partner","author":"Olivier DASINI","date":"7 ao\u00fbt 2024","format":false,"excerpt":"Generative artificial intelligence (GenAI) is reshaping the content creation landscape. By training on vast datasets, these \"intelligent\" systems can produce new, human-quality content across a multitude of domains. 
Oracle's HeatWave GenAI (starting with version 9.0.1) is at the forefront of this revolution, offering an integrated platform that combines in-database large\u2026","rel":"","context":"Dans &quot;AI&quot;","block_context":{"text":"AI","link":"https:\/\/dasini.net\/blog\/category\/ai\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2024\/07\/hw_product_image.png?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2024\/07\/hw_product_image.png?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2024\/07\/hw_product_image.png?resize=525%2C300&ssl=1 1.5x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2024\/07\/hw_product_image.png?resize=700%2C400&ssl=1 2x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2024\/07\/hw_product_image.png?resize=1050%2C600&ssl=1 3x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2024\/07\/hw_product_image.png?resize=1400%2C800&ssl=1 4x"},"classes":[]},{"id":7058,"url":"https:\/\/dasini.net\/blog\/2024\/12\/10\/simplifying-ai-development-a-practical-guide-to-heatwave-genais-rag-vector-store-features\/","url_meta":{"origin":7363,"position":5},"title":"Simplifying AI Development: A Practical Guide to HeatWave GenAI\u2019s RAG &amp; Vector Store Features","author":"Olivier DASINI","date":"10 d\u00e9cembre 2024","format":false,"excerpt":"This tutorial explores HeatWave GenAI, a cloud service that simplifies interacting with unstructured data using natural language. It combines large language models, vector stores, and SQL queries to enable tasks like content generation, chatbot, and retrieval-augmented generation (RAG). 
The focus is on RAG and how HeatWave GenAI\u2019s architecture helps users\u2026","rel":"","context":"Dans &quot;AI&quot;","block_context":{"text":"AI","link":"https:\/\/dasini.net\/blog\/category\/ai\/"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2024\/12\/HeatWave_chatbot3.gif?resize=350%2C200&ssl=1","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2024\/12\/HeatWave_chatbot3.gif?resize=350%2C200&ssl=1 1x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2024\/12\/HeatWave_chatbot3.gif?resize=525%2C300&ssl=1 1.5x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2024\/12\/HeatWave_chatbot3.gif?resize=700%2C400&ssl=1 2x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2024\/12\/HeatWave_chatbot3.gif?resize=1050%2C600&ssl=1 3x, https:\/\/i0.wp.com\/dasini.net\/blog\/wp-content\/uploads\/2024\/12\/HeatWave_chatbot3.gif?resize=1400%2C800&ssl=1 4x"},"classes":[]}],"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/dasini.net\/blog\/wp-json\/wp\/v2\/posts\/7363","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dasini.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dasini.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dasini.net\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/dasini.net\/blog\/wp-json\/wp\/v2\/comments?post=7363"}],"version-history":[{"count":175,"href":"https:\/\/dasini.net\/blog\/wp-json\/wp\/v2\/posts\/7363\/revisions"}],"predecessor-version":[{"id":7779,"href":"https:\/\/dasini.net\/blog\/wp-json\/wp\/v2\/posts\/7363\/revisions\/7779"}],"wp:attachment":[{"href":"https:\/\/dasini.net\/blog\/wp-json\/wp\/v2\/media?parent=7363"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dasini.net\/blog\/wp-json\/wp\/v2\/categories?post=7363"},{"taxonomy":"post_tag","embeddable":true,"href":"
https:\/\/dasini.net\/blog\/wp-json\/wp\/v2\/tags?post=7363"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}