HeatWave GenAI: Sentiment Analysis Made Easy-Peasy

September 10, 2024
Tags: ,

This article builds upon the concepts introduced in my previous blog posts, HeatWave GenAI: Your AI-Powered Content Creation Partner and In-Database LLMs for Efficient Text Translation with HeatWave GenAI.
For a deeper understanding, also consider reading these articles.

So, you’ve probably heard about AI getting “smarter” by the day. Well, one cool thing it can do now is read between the lines of what people say. It’s like having a mind-reading machine for text!
This new AI tech, called generative AI (or GenAI), can dive deep into what people are saying and tell us if they’re feeling positive, negative, or neutral.
Let’s see how HeatWave GenAI, can help you to enhance your understanding of customer sentiment, improve decision-making, and drive business success.

What is AI sentiment Analysis?

You can find a more formal and complete definition on Wikipedia, but in few words, it is like having a computer that can understand human emotions. This technology can analyze text data and thus dive deep into what people are saying and tell us if they’re happy, sad, or even mad.

By understanding the sentiment behind words, AI can help businesses gauge customer satisfaction, track brand reputation, and even predict market trends.
It’s a powerful tool that has a wide range of applications in fields like marketing, social media, and customer service.

What is HeatWave?

HeatWave is a fully-managed Oracle Cloud Infrastructure service. It offered different solutions for different workloads:

  • HeatWave MySQL: built on MySQL, for your transactional needs.
  • HeatWave Analytics: get real-time analytics on your data.
  • HeatWave Lakehouse: query data in various formats in object storage
  • HeatWave AutoML: automate the pipeline to build, train, and explain ML models
  • HeatWave GenAI: integrated & automated GenAI with in-database LLMs

Oracle’s HeatWave GenAI (starting with version 9.0.1) is at the forefront of this revolution, offering an integrated platform that combines in-database large language models (LLMs), vector stores, and scale-out vector processing to streamline content generation.

The key benefits of such architecture are:

  • Effortless LLM Integration: Simple SQL queries unlock AI power.
  • Seamless Database Integration: No complex external connections.
  • Simplified Architecture: Easy management and scalability.
  • Scalable Performance: Handles large datasets and high traffic.
  • Always Up-to-Date Data: In-database LLMs ensure data freshness.

In this article I’m using HeatWave 9.0.1.

For the purpose of this article, let’s consider a scenario where you’re a mobile phone manufacturer. In today’s digital age, it’s crucial to keep a pulse on customer sentiment. Your company actively monitors various social networks to stay informed about what customers are saying about your products.
For example, short messages like:

  • Just got my hands on the latest mobile phone and I’m loving it! Thanks for creating such an awesome product!
  • A mobile phone that prioritizes functionality and reliability over flashy features.
  • Why does your mobile phone company insist on selling devices that are constantly glitching and breaking? Do better!

You get the idea!

So, how can you harnessing HeatWave GenAI for effortless sentiment analysis?
Like we have seen in HeatWave GenAI: Your AI-Powered Content Creation Partner & In-Database LLMs for Efficient Text Translation with HeatWave GenAI the first step is to load the LLM in HeatWave memory.

Load the LLM in HeatWave memory

Use the ML_MODEL_LOAD stored procedure: CALL sys.ML_MODEL_LOAD(‘<LLM>’, NULL);
The model needs to be loaded only once before generating text. It remains accessible within HeatWave as long as the service is running. This means you don’t have to reload the model for each text generation.

At the time of writing (beginning of September 2024) the available models are Mistral 7B (mistral-7b-instruct-v1) and Llama 2 (llama2-7b-v1).

mysql>
SELECT @@version, @@version_comment;
+-------------+--------------------------+
| @@version   | @@version_comment        |
+-------------+--------------------------+
| 9.0.1-cloud | MySQL Enterprise - Cloud |
+-------------+--------------------------+


-- Load Mistral 7B model
CALL sys.ML_MODEL_LOAD('mistral-7b-instruct-v1', NULL);

Mistral is loaded. Now you can think about defining your prompt.

Define your prompt in natural language

To make it more convenient, let’s use some variables. The sort messages above are now:

mysql>
SET @customer_msgA="Just got my hands on the latest mobile phone and I'm loving it! Thanks for creating such an awesome product!" ;

SET @customer_msgB="A mobile phone that prioritizes functionality and reliability over flashy features." ;

SET @customer_msgC="Why does your mobile phone company insist on selling devices that are constantly glitching and breaking? Do better!" ;

The prompt contains a message and the inference instructions.
Let’s start with a very simple instruction text, something like:
What is the feeling from the text above?

mysql>
SET @instructions=" What is the feeling from the text above?";

SET @prompt=CONCAT(@customer_msgA, @instructions);

The final prompt is the concatenation of 2 variables: @customer_msg{X} and @instructions.

In order to have the optimal result, maybe you’ll need to do some fine tuning to set up the inference.

Inference Setting

I use generation as a task and the Mistral model and I put temperature, top-k and top-p to zero because I do not want any creativity from the model 🙂 .

mysql>
SET @inferenceSetup  = '{"task": "generation", "temperature": 0, "top_k": 0, "top_p": 0, "model_id": "mistral-7b-instruct-v1"}';

We have now all what we need to do sentiment analysis.

Sentiment Analysis

Like we have seen in HeatWave GenAI: Your AI-Powered Content Creation Partner & In-Database LLMs for Efficient Text Translation with HeatWave GenAI we must use the ML_GENERATE function.

One way to display the result is to use the following query:

SELECT Answer->>"$.text" AS "Answer" FROM (SELECT sys.ML_GENERATE(@prompt, @inferenceSetup) AS Answer ) AS dt;

Result with: SET @customer_msgA=”Just got my hands on the latest mobile phone and I’m loving it! Thanks for creating such an awesome product!” ;

mysql>
/* msg: "Just got my hands on the latest mobile phone and I'm loving it! Thanks for creating such an awesome product!" */

SELECT Answer->>"$.text" AS "Answer" FROM (SELECT sys.ML_GENERATE(@prompt, @inferenceSetup) AS Answer ) AS dt;
+---------------------------------------------------------+
| Answer                                                  |
+---------------------------------------------------------+
|  The feeling expressed in the text is happiness or joy. |
+---------------------------------------------------------+

With SET @customer_msgB=”A mobile phone that prioritizes functionality and reliability over flashy features.” ;

mysql>
/* msg: "A mobile phone that prioritizes functionality and reliability over flashy features." */

SELECT Answer->>"$.text" AS "Answer" FROM (SELECT sys.ML_GENERATE(@prompt, @inferenceSetup) AS Answer ) AS dt\G
*************************** 1. row ***************************
Answer:  The feeling conveyed in the text is one of practicality and pragmatism, with a focus on the importance of functionality and reliability in a mobile phone.

With SET @customer_msgC=”Why does your mobile phone company insist on selling devices that are constantly glitching and breaking? Do better!” ;

mysql>
/* msg: "Why does your mobile phone company insist on selling devices that are constantly glitching and breaking? Do better!" */

SELECT Answer->>"$.text" AS "Answer" FROM (SELECT sys.ML_GENERATE(@prompt, @inferenceSetup) AS Answer ) AS dt\G
*************************** 1. row ***************************
Answer:  The feeling conveyed in the text above is frustration and disappointment.

Amazing!
The model provide a summary of the sentiment expressed in the message.

But what if we want to use this feature more programmatically? Well we need to improve our prompt.

Sentiment Analysis in Code

Basically we can ask the LLM to generate the result in a specific format, other than raw text like JSON, CSV, … or an integer.

Let’s say we have the following code:

  • Positive feeling: 1.
  • Negative feeling: -1
  • Neutral feeling or unknown: 0

The goal is to automatically identify problematic messages — those that express non positive sentiments — and then take action.

To do this, we only need to rewrite our prompt, improving the instruction part. For example we can have something like:
What is the feeling from the text above? if it is positive write 1. If it is negative write -1. if it is neutral or if you don’t know write 0. Regarding to your analysis, write only -1 or 0 or 1

Still a basic prompt, far from being optimal, but good enough to keep this article simple and most important, to do the job.

mysql>
SET @instructions=" What is the feeling from the text above? if it is positive write 1. If it is negative write -1. if it is neutral or if you don't know write 0. Regarding to your analysis, write only -1 or 0 or 1";

All the other variables are the same than the previous use case.
Let’s see the result:

Result with: SET @customer_msgA=”Just got my hands on the latest mobile phone and I’m loving it! Thanks for creating such an awesome product!” ;

mysql>
/* msg: "Just got my hands on the latest mobile phone and I'm loving it! Thanks for creating such an awesome product!" */

SELECT Answer->>"$.text" AS "Answer" FROM (SELECT sys.ML_GENERATE(@prompt, @inferenceSetup) AS Answer ) AS dt;
+--------+
| Answer |
+--------+
|  1     |
+--------+

With SET @customer_msgB=”A mobile phone that prioritizes functionality and reliability over flashy features.” ;

mysql>
/* msg: "A mobile phone that prioritizes functionality and reliability over flashy features." */

SELECT Answer->>"$.text" AS "Answer" FROM (SELECT sys.ML_GENERATE(@prompt, @inferenceSetup) AS Answer ) AS dt;
+--------+
| Answer |
+--------+
|  0     |
+--------+

With SET @customer_msgC=”Why does your mobile phone company insist on selling devices that are constantly glitching and breaking? Do better!” ;

mysql>
/* msg: "Why does your mobile phone company insist on selling devices that are constantly glitching and breaking? Do better!" */

SELECT Answer->>"$.text" AS "Answer" FROM (SELECT sys.ML_GENERATE(@prompt, @inferenceSetup) AS Answer ) AS dt;
+--------+
| Answer |
+--------+
|  -1    |
+--------+

Awesome!
The model produces results based on the format chosen and needed in our code. It could have been a JSON format for example and the logic would have been the same.

With such output you can easily integrate HeatWave in your data pipeline.

Sentiment Analysis into your data workflow with HeatWave

Imagine we store our customer messages in a (MySQL) table. A possible workflow involves regularly scanning this table for negative feedback. Once identified, these messages can be referenced in ( or be moved to) a separate table. This new table can then be used to populate the customer service team’s dashboard. By tracking unsatisfied customers in this way, the team can work to improve overall customer satisfaction.

A simplified data architecture could be:

mysql>
CREATE SCHEMA sentimentAnalysis;


CREATE TABLE `sentimentAnalysis`.`tweet` (
  `tweet_id` int unsigned NOT NULL AUTO_INCREMENT,
  `customer_msg` varchar(4000) DEFAULT NULL,
  PRIMARY KEY (`tweet_id`)
);


CREATE TABLE `sentimentAnalysis`.`tweet_to_review` (
  `tweet_to_review_id` int unsigned NOT NULL AUTO_INCREMENT,
  `tweet_id_to_check` int unsigned NOT NULL,
  `analysis_result` tinyint NOT NULL,
  PRIMARY KEY (`tweet_to_review_id`)
);

Table tweet contains the customer messages and a message ID..
The problematic messages (ie status 0 and -1) are referenced into table tweet_to_review, using the message ID. We also store the result of the sentiment analysis (0 or -1). Support may prefer to focus first on messages that are definitely negative (status -1).

mysql>
INSERT `sentimentAnalysis`.`tweet` (customer_msg) VALUES("Just got my hands on the latest mobile phone and I'm loving it! Thanks for creating such an awesome product!");
INSERT `sentimentAnalysis`.`tweet` (customer_msg) VALUES("Loving the features on my new mobile phone! It's like a whole new world in the palm of my hand.");
INSERT `sentimentAnalysis`.`tweet` (customer_msg) VALUES("Your mobile phone company is a joke. Can't believe I wasted my hard-earned money on your crappy products.");
INSERT `sentimentAnalysis`.`tweet` (customer_msg) VALUES("Hey, just wanted to say how much I love your product! It's been a game changer for me.");
INSERT `sentimentAnalysis`.`tweet` (customer_msg) VALUES("Recently upgraded my old mobile phone to something newer - let's see how it goes");
INSERT `sentimentAnalysis`.`tweet` (customer_msg) VALUES("A mobile phone that prioritizes functionality and reliability over flashy features.");
INSERT `sentimentAnalysis`.`tweet` (customer_msg) VALUES("Why does your mobile phone company insist on selling devices that are constantly glitching and breaking? Do better!");
INSERT `sentimentAnalysis`.`tweet` (customer_msg) VALUES("Shoutout for making such an awesome product. I couldn't imagine my life without it!");
INSERT `sentimentAnalysis`.`tweet` (customer_msg) VALUES("Just another day dealing with the incompetence of my mobile phone company. Seriously considering switching to a different provider.");
INSERT `sentimentAnalysis`.`tweet` (customer_msg) VALUES("For those who want a device that just works.");
INSERT `sentimentAnalysis`.`tweet` (customer_msg) VALUES("Big thanks for creating a top-notch mobile phone. You guys never disappoint!");
INSERT `sentimentAnalysis`.`tweet` (customer_msg) VALUES("Just got my hands on a new mobile phone and I'm excited to see what it can do!");

Some data inserted in tweet table.
Most of these data were generated using AI Tweet Generator.

mysql>
SELECT * FROM `sentimentAnalysis`.`tweet` LIMIT 5\G
*************************** 1. row ***************************
    tweet_id: 1
customer_msg: Just got my hands on the latest mobile phone and I'm loving it! Thanks for creating such an awesome product!
*************************** 2. row ***************************
    tweet_id: 2
customer_msg: Loving the features on my new mobile phone! It's like a whole new world in the palm of my hand.
*************************** 3. row ***************************
    tweet_id: 3
customer_msg: Your mobile phone company is a joke. Can't believe I wasted my hard-earned money on your crappy products.
*************************** 4. row ***************************
    tweet_id: 4
customer_msg: Hey, just wanted to say how much I love your product! It's been a game changer for me.
*************************** 5. row ***************************
    tweet_id: 5
customer_msg: Recently upgraded my old mobile phone to something newer - let's see how it goes

Sample of the data.

The workflow of our data pipeline is quite simple:

  1. Get the tweet content from the source (table tweet)
  2. Do the sentiment analysis using HeatWave GenAI
  3. Write problematic tweets (status <> 0) into the target (table tweet_to_review) for further processing

It can be done with your favourite programing language. It can also be done on the database side using HeatWave MySQL stored routines capabilities.
If you are a javascript fan, you can now write your stored routine in that language since HeatWave MySQL 9.0: JavaScript Stored Programs

That said, for the love of the (SQL) art, let’s try something in SQL. But before diving into the code, let’s break down the 3 steps above.

Get the tweet content from the source (table tweet)

In order to have a flexible stored procedure, the input parameters are:

  • source schema name: VARCHAR (sentimentAnalysis)
  • source table name: VARCHAR (tweet)
  • source column message: VARCHAR (customer_msg)
  • source column primary key name: VARCHAR (tweet_id)
  • tweet ID: INTEGER UNSIGNED (ex: 1, 2, 3….)
  • target schema name: VARCHAR (sentimentAnalysis)
  • target table name: VARCHAR (tweet_to_review)
  • target column result name: VARCHAR (analysis_result)

To get the tweet content from the source the query is:

SELECT <source column message>
FROM <source schema name>.<source table name>
WHERE <source column primary key name>=<tweet ID>
INTO @customer_msg

Do the sentiment analysis using HeatWave GenAI

We’ve already seen this part:

Prompt instruction:
instructions=” What is the feeling from the text above? if it is positive write 1. If it is negative write -1. if it is neutral or if you don’t know write 0. Regarding to your analysis, write only -1 or 0 or 1″;

Add the tweet to the prompt:
prompt=CONCAT(‘”‘, @customer_msg, instructions, ‘”‘);

Inference setup:
inferenceSetup = ‘\'{“task”: “generation”, “temperature”: 0, “top_k”: 0, “top_p”: 0, “model_id”: “mistral-7b-instruct-v1″}\”;

And the Inference:
SELECT Answer->>”$.text” AS “Answer” FROM (SELECT sys.ML_GENERATE(prompt, inferenceSetup) AS Answer ) AS dt INTO @answer ;

Write problematic tweets into the target for further processing

And finally store the ID of the problematic message (tweet ID) and its status (-1 or 0):

INSERT INTO <target schema name>.<target table name> (<target column message ID name>, <target column result name>) VALUES (<tweetID>, <@answer>)

How it works?

The stored routine is called: genai_sentimentAnalysis.
In the following examples, it is run with tweet ID: 1, 3 and 10:

mysql>
CALL genai_sentimentAnalysis("sentimentAnalysis", "tweet", "customer_msg", "tweet_id", 1, "sentimentAnalysis", "tweet_to_review", "analysis_result");
+-----------------------------------------------+
| INFO                                          |
+-----------------------------------------------+
| Tweet number 1 as been analyzed as  POSITIVE  |
+-----------------------------------------------+
1 row in set (2.4341 sec)

Query OK, 0 rows affected (2.4341 sec)


CALL genai_sentimentAnalysis("sentimentAnalysis", "tweet", "customer_msg", "tweet_id", 3, "sentimentAnalysis", "tweet_to_review", "analysis_result");
+-----------------------------------------------+
| INFO                                          |
+-----------------------------------------------+
| Tweet number 3 as been analyzed as  NEGATIVE  |
+-----------------------------------------------+
1 row in set (1.9574 sec)

Query OK, 0 rows affected (1.9574 sec)


CALL genai_sentimentAnalysis("sentimentAnalysis", "tweet", "customer_msg", "tweet_id", 10, "sentimentAnalysis", "tweet_to_review", "analysis_result");
+-----------------------------------------------+
| INFO                                          |
+-----------------------------------------------+
| Tweet number 10 as been analyzed as  NEUTRAL  |
+-----------------------------------------------+
1 row in set (1.8339 sec)

Query OK, 0 rows affected (1.8339 sec)

The stored procedure output is a log that is displayed on the console.
The first call generates 1 (tweet ID 1 is positive), the second -1 (tweet ID 3 is negative) and the third 0 (tweet ID 10 is neutral).

Let’s check the tweet_to_review table:

mysql>
SELECT tweet_id_to_check, analysis_result FROM tweet_to_review;
+-------------------+-----------------+
| tweet_id_to_check | analysis_result |
+-------------------+-----------------+
|                 3 |              -1 |
|                10 |               0 |
+-------------------+-----------------+

Only , tweet 3 and 10 are selected for the customer service team.

For their dashboard you can provide more details with a query like:

mysql>
SELECT tweet_id_to_check, analysis_result, customer_msg 
FROM tweet_to_review INNER JOIN tweet ON tweet_id=tweet_id_to_check 
ORDER BY analysis_result\G
*************************** 1. row ***************************
tweet_id_to_check: 3
  analysis_result: -1
     customer_msg: Your mobile phone company is a joke. Can't believe I wasted my hard-earned money on your crappy products.
*************************** 2. row ***************************
tweet_id_to_check: 10
  analysis_result: 0
     customer_msg: For those who want a device that just works.


Show me the code

Here the stored procedure used.
As a disclaimer, this code is not meant to run in production, it is for demonstration purpose only 🙂

mysql>
DELIMITER $$
CREATE DEFINER=`MyHeatwaveUser`@`%` PROCEDURE `genai_sentimentAnalysis`(IN _sourceSchema VARCHAR(64), IN _sourceTable VARCHAR(64), IN _sourceColMsg VARCHAR(64), IN _sourceColPK VARCHAR(64), IN _tweetID INTEGER UNSIGNED, IN _targetSchema VARCHAR(64), IN _targetTable VARCHAR(64), IN _targetColResult VARCHAR(64))
    SQL SECURITY INVOKER
BEGIN
    DECLARE sourceTableName, targetTableName, targetCols VARCHAR(129);
    DECLARE instructions, inferenceSetup VARCHAR(512);
    DECLARE prompt VARCHAR(5000);
    DECLARE rep VARCHAR(100);
    
    /* Get the tweet content from source table */
    SET sourceTableName = CONCAT("`",_sourceSchema, "`.`", _sourceTable, "`");
    SET @txtStmt = CONCAT("SELECT ", _sourceColMsg, " INTO @customer_msg FROM  ", sourceTableName, " WHERE ", _sourceColPK, " = ", _tweetID);
    PREPARE getTweet FROM @txtStmt;
    EXECUTE getTweet;
    DEALLOCATE PREPARE getTweet;
    
    /* Sentiment analysis using HeatWave GenAI */
    -- Prompt instruction
    SET instructions=" What is the feeling from the text above? if it is positive write 1. If it is negative write -1. if it is neutral or if you don't know write 0. Regarding to your analysis, write only -1 or 0 or 1";
	
    -- Add the tweet to the prompt
    SET prompt=CONCAT('"', @customer_msg, instructions, '"');
	
    -- Inference setup
    SET inferenceSetup  = '\'{"task": "generation", "temperature": 0, "top_k": 0, "top_p": 0, "model_id": "mistral-7b-instruct-v1"}\'';

    -- Inference: if positive retunr 1 / if negative return -1 / if neutral or unknown return 0
    SET @txtStmt = CONCAT(' SELECT Answer->>"$.text" AS Answer FROM (SELECT sys.ML_GENERATE(', prompt, ' , ', inferenceSetup, ' ) AS Answer ) AS dt INTO @answer');
    PREPARE analysis FROM @txtStmt;
    EXECUTE analysis;
    DEALLOCATE PREPARE analysis;

    /* Write problematic tweets (status <> 0) into the target table for further processing */	
    SET rep = CONCAT('Tweet number ', _tweetID, ' as been analyzed as ');
    IF @answer < 1 THEN
        BEGIN
            SET targetTableName = CONCAT("`",_targetSchema, "`.`", _targetTable, "`");
            SET targetCols = CONCAT("(tweet_id_to_check, ", _targetColResult,")");
            SET @txtStmt = CONCAT('INSERT INTO ', targetTableName, targetCols, ' VALUES (', _tweetID, ', ', @answer, ')');
            PREPARE ins FROM @txtStmt;
            EXECUTE ins;
            DEALLOCATE PREPARE ins;
            
            /* Logging */
            CASE @answer
                WHEN 0 THEN SELECT CONCAT(rep, ' NEUTRAL ') AS 'INFO';
                WHEN -1 THEN SELECT CONCAT(rep, ' NEGATIVE ') AS 'INFO';
            END CASE;
        END;
    ELSE SELECT CONCAT(rep, ' POSITIVE ') AS 'INFO';
    END IF;
       
END$$
DELIMITER ;

As you see, all the logic is coded inside the procedure. It would probably make sense to break the different tasks to separate procedures/functions.

Areas for improvement

As you can imagine, the prompt is crucial. While I’ve tried kept things simple for this tutorial, effective prompt engineering is essential for getting the desired results from an LLM.

Remember to review the LLM’s output carefully. Even though we expect an integer, the model might generate something unexpected (not handle in the stored procedure).

As of {HeatWave} MySQL 9.0.0, you can now write stored routines in JavaScript. However, stored procedures aren’t mandatory. Choose the language that best suits your needs and start experimenting! 🙂

Peroration

In this article, we’ve explored how to leverage Oracle HeatWave GenAI to perform sentiment analysis on text data. By integrating in-database large language models HeatWave offers an easy, streamlined and efficient solution for understanding customer sentiment.

We’ve covered the essential steps involved, from loading the LLM into HeatWave memory to crafting effective prompts and analyzing the model’s output. By following these guidelines, you can easily incorporate sentiment analysis into your data pipeline and gain valuable insights into customer feedback.

Remember that effective prompt engineering is key to obtaining accurate and meaningful results. Experiment with different prompts and refine your approach to achieve the desired outcomes.

With HeatWave GenAI, you have a powerful tool at your disposal for extracting valuable information from text data. By harnessing its capabilities, you can enhance your understanding of customer sentiment, improve decision-making, and drive business success.

Stay tuned for more insights!

Follow me on Linkedin

Watch my videos on my YouTube channel and subscribe.

My Slideshare account.

My Speaker Deck account.

Thanks for using HeatWave & MySQL!

Leave a Reply