loader from loading.io

Oracle AI Vector Search: Part 1

Oracle University Podcast

Release Date: 10/22/2024

What is Oracle GoldenGate 23ai? show art What is Oracle GoldenGate 23ai?

Oracle University Podcast

In a new season of the Oracle University Podcast, Lois Houston and Nikita Abraham dive into the world of Oracle GoldenGate 23ai, a cutting-edge software solution for data management. They are joined by Nick Wagner, a seasoned expert in database replication, who provides a comprehensive overview of this powerful tool.   Nick highlights GoldenGate's ability to ensure continuous operations by efficiently moving data between databases and platforms with minimal overhead. He emphasizes its role in enabling real-time analytics, enhancing data security, and reducing costs by offloading data to...

info_outline
Integrating APEX with OCI AI Services show art Integrating APEX with OCI AI Services

Oracle University Podcast

Discover how Oracle APEX leverages OCI AI services to build smarter, more efficient applications. Hosts Lois Houston and Nikita Abraham interview APEX experts Chaitanya Koratamaddi, Apoorva Srinivas, and Toufiq Mohammed about how key services like OCI Vision, Oracle Digital Assistant, and Document Understanding integrate with Oracle APEX.   Packed with real-world examples, this episode highlights all the ways you can enhance your APEX apps.   Oracle APEX: Empowering Low Code Apps with AI: Oracle University Learning Community: LinkedIn: X:   Special thanks to Arijit Ghosh,...

info_outline
AI-Assisted Development in Oracle APEX show art AI-Assisted Development in Oracle APEX

Oracle University Podcast

Get ready to explore how generative AI is transforming development in Oracle APEX. In this episode, hosts Lois Houston and Nikita Abraham are joined by Oracle APEX experts Apoorva Srinivas and Toufiq Mohammed to break down the innovative features of APEX 24.1. Learn how developers can use APEX Assistant to build apps, generate SQL, and create data models using natural language prompts.   Oracle APEX: Empowering Low Code Apps with AI: Oracle University Learning Community: LinkedIn: X:   Special thanks to Arijit Ghosh, David Wright, Kris-Ann Nansen, Radhika Banka, and the OU Studio...

info_outline
Unlocking the Power of Oracle APEX and AI show art Unlocking the Power of Oracle APEX and AI

Oracle University Podcast

Lois Houston and Nikita Abraham kick off a new season of the podcast, exploring how Oracle APEX integrates with AI to build smarter low-code applications. They are joined by Chaitanya Koratamaddi, Director of Product Management at Oracle, who explains the basics of Oracle APEX, its global adoption, and the challenges it addresses for businesses managing and integrating data. They also explore real-world use cases of AI within the Oracle APEX ecosystem   Oracle APEX: Empowering Low Code Apps with AI: Oracle University Learning Community: LinkedIn: X:   Special thanks to Arijit...

info_outline
Raise Your Game with Oracle Cloud Applications show art Raise Your Game with Oracle Cloud Applications

Oracle University Podcast

In this special episode of the Oracle University Podcast, Bill Lawson and Nikita Abraham chat with Peter Fernandez, Senior Director of Cloud Certification at Oracle University, about the exciting new Raise Your Game challenge. They discuss how the initiative is designed to enhance participants' skills in Oracle Fusion Cloud Applications and Oracle Cloud Success Navigator. They also cover key details about the challenge, such as how to get started, who can participate, the way it is structured, and the prizes up for grabs.   Raise Your Game:  Oracle University Learning...

info_outline
Oracle Database@Azure show art Oracle Database@Azure

Oracle University Podcast

The final episode of the multicloud series focuses on Oracle Database@Azure, a powerful cloud database solution. Hosts Lois Houston and Nikita Abraham, along with Senior Manager of CSS OU Cloud Delivery Samvit Mishra, discuss how this service allows customers to run Oracle databases within the Microsoft Azure data center, simplifying deployment and management. The discussion also highlights the benefits of native integration with Azure services, eliminating the need for complex networking setups.   Oracle Cloud Infrastructure Multicloud Architect Professional: Oracle University Learning...

info_outline
Oracle Interconnect for Azure show art Oracle Interconnect for Azure

Oracle University Podcast

Join Lois Houston and Nikita Abraham as they interview Samvit Mishra, Senior Manager of CSS OU Cloud Delivery, on Oracle Interconnect for Azure. Learn how this interconnect revolutionizes the customer experience by providing a direct, private link between Oracle Cloud Infrastructure and Microsoft Azure. From use cases to bandwidth considerations, get an in-depth look into how Oracle and Azure come together to create a unified cloud experience.   Oracle Cloud Infrastructure Multicloud Architect Professional: Oracle University Learning Community: LinkedIn: X:   Special thanks to...

info_outline
What is Multicloud? show art What is Multicloud?

Oracle University Podcast

This week, hosts Lois Houston and Nikita Abraham are shining a light on multicloud, a game-changing strategy involving the use of multiple cloud service providers. Joined by Senior Manager of CSS OU Cloud Delivery Samvit Mishra, they discuss why multicloud is becoming essential for businesses, offering freedom from vendor lock-in and the ability to cherry-pick the best services. They also talk about Oracle's pioneering role in multicloud and its partnerships with Microsoft Azure, Google Cloud, and Amazon Web Services.   Oracle Cloud Infrastructure Multicloud Architect Professional: ...

info_outline
Oracle Fusion Cloud Applications Foundations Training & Certifications show art Oracle Fusion Cloud Applications Foundations Training & Certifications

Oracle University Podcast

In this special episode of the Oracle University Podcast, hosts Lois Houston and Nikita Abraham dive into Oracle Fusion Cloud Applications and the new courses and certifications on offer. They are joined by Oracle Fusion Apps experts Patrick McBride and Bill Lawson who introduce the concept of Oracle Modern Best Practice (OMBP), explaining how it helps organizations maximize results by mapping Fusion Application features to daily business processes. They also discuss how the new courses educate learners on OMBP and its role in improving Fusion Cloud Apps implementations.   OMBP: Oracle...

info_outline
Monitoring MySQL and HeatWave show art Monitoring MySQL and HeatWave

Oracle University Podcast

In this episode, Lois Houston and Nikita Abraham chat with MySQL expert Perside Foster on the importance of keeping MySQL performing at its best. They discuss the essential tools for monitoring MySQL, tackling slow queries, and boosting overall performance.   They also explore HeatWave, the powerful real-time analytics engine that brings machine learning and cross-cloud flexibility into MySQL.   MySQL 8.4 Essentials: Oracle University Learning Community: LinkedIn: X:   Special thanks to Arijit Ghosh, David Wright, Kris-Ann Nansen, Radhika Banka, and the OU Studio Team for...

info_outline
 
More Episodes
In this episode, Senior Principal APEX and Apps Dev Instructor Brent Dayley joins hosts Lois Houston and Nikita Abraham to discuss Oracle AI Vector Search. Brent provides an in-depth overview, shedding light on the brand-new vector data type, vector embeddings, and the vector workflow.
 
 
 
Oracle University Learning Community: https://education.oracle.com/ou-community
 
 
 
Special thanks to Arijit Ghosh, David Wright, Radhika Banka, and the OU Studio Team for helping us create this episode.
 
---------------------------------------------------------
 
Episode Transcript:
 

00:00

Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we’ll bring you foundational training on the most popular Oracle technologies. Let’s get started!

 

00:26

Lois: Hello and welcome to the Oracle University Podcast! I’m Lois Houston, Director of Innovation Programs here at Oracle University. Joining me as always is our Team Lead of our Editorial Services, Nikita Abraham.

Nikita: Hi everyone! Thanks for tuning in over the last few months as we’ve been discussing all the Oracle Database 23ai new features. We’re coming to the end of the season, and to close things off, in this episode and the next one, we’re going to be talking about the fundamentals of Oracle AI Vector Search. In today’s episode, we’ll try to get an overview of what vector search is, why Oracle Vector Search stands out, and dive into the new vector data type. We’ll also get insights into vector embedding models and the vector workflow.

01:11

Lois: To take us through all of this, we’re joined by Brent Dayley, who is a Senior Principal APEX and Apps Development Instructor with Oracle University. Hi Brent! Thanks for joining us today. Can you tell us about the new vector data type?

Brent: So this data type was introduced in Oracle Database 23ai. And it allows you to store vector embeddings alongside other business data. Now, the vector data type allows a foundation to store vector embeddings.

01:42

Lois: And what are vector embeddings, Brent?

Brent: Vector embeddings are mathematical representations of data points. They assign mathematical representations based on meaning and context of your unstructured data. You have to generate vector embeddings from your unstructured data either outside or within the Oracle Database. In order to get vector embeddings, you can either use ONNX embedding machine learning models or access third-party REST APIs. Embeddings can be used to represent almost any type of data, including text, audio, or visual, such as pictures. And they are used in proximity searches.

02:28

Nikita: Hmmm, proximity search. And similarity search, right? Can you break down what similarity search is and how it functions?

Brent: So vector data tends to be unevenly distributed and clustered into groups that are semantically related. Doing a similarity search based on a given query vector is equivalent to retrieving the k nearest vectors to your query vector in your vector space. What this means is that basically you need to find an ordered list of vectors by ranking them, where the first row is the closest or most similar vector to the query vector. The second row in the list would be the second closest vector to the query vector, and so on, depending on your data set. What we need to do is to find the relative order of distances. And that's really what matters rather than the actual distance.

Now, similarity searches tend to get data from one or more clusters, depending on the value of the query vector and the fetch size. Approximate searches using vector indexes can limit the searches to specific clusters. Exact searches visit vectors across all clusters.

03:44

Lois: Ok. I want to move on to vector embedding models. What are they and why are they valuable?

Brent: Vector embedding models allow you to assign meaning to what a word, or a sentence, or the pixels in an image, or perhaps audio. It allows you to quantify features or dimensions. Most modern vector embeddings use a transformer model. Bear in mind that convolutional neural networks can also be used. Depending on the type of your data, you can use different pretrained open source models to create vector embeddings. As an example, for textual data, sentence transformers can transform words, sentences, or paragraphs into vector embeddings.

04:33

Nikita: And what about visual data?

Brent: For visual data, you can use residual network also known as ResNet to generate vector embeddings. You can also use visual spectrogram representation for audio data. And that allows us to use the audio data to fall back into the visual data case. Now, these can also be based on your own data set. Each model also determines the number of dimensions for your vectors.

05:02

Lois: Can you give us some examples of this, Brent?

Brent: Cohere's embedding model, embed English version 3.0, has 1,024 dimensions. Open AI's embedding model, text-embedding-3-large, has 3,072 dimensions.

05:24

Want to get the inside scoop on Oracle University? Head over to the Oracle University Learning Community. Attend exclusive events. Read up on the latest news. Get first-hand access to new products. Read the OU Learning Blog. Participate in Challenges. And stay up-to-date with upcoming certification opportunities. Visit mylearn.oracle.com to get started. 

05:50

Nikita: Welcome back! Let’s now get into the practical side of things. Brent, how do you import embedding models?

Brent: Although you can generate vector embeddings outside the Oracle Database using pre-trained open source embeddings or your own embedding models, you also have the option of doing those within the Oracle Database. In order to use those within the Oracle Database, you need to use models that are compatible with the Open Neural Network Exchange Standard, or ONNX, also known as Onyx.

Oracle Database implements an Onyx runtime directly within the database, and this is going to allow you to generate vector embeddings directly inside the Oracle Database using SQL.

06:35

Lois: Brent, why should people choose to use Oracle AI Vector Search?

Brent: Now one of the biggest benefits of Oracle AI Vector Search is that semantic search on unstructured data can be combined with relational search on business data, all in one single system. This is very powerful, and also a lot more effective because you don't need to add a specialized vector database. And this eliminates the pain of data fragmentation between multiple systems.

It also supports Retrieval Augmented Generation, also known as RAG. Now this is a breakthrough generative AI technique that combines large language models and private business data. And this allows you to deliver responses to natural language questions. RAG provides higher accuracy and avoids having to expose private data by including it in the large language model training data.

07:43

Nikita: In the last part of our conversation today, I want to ask you about the Oracle AI Vector Search workflow, starting with generating vector embeddings.

Brent: Generate vector embeddings from your data, either outside the database or within the database. Now, embeddings are a mathematical representation of what your data meaning is. So what does this long sentence mean, for instance? What are the main keywords out of it?

You can also generate embeddings not only on your typical string type of data, but you can also generate embeddings on other types of data, such as pictures or perhaps maybe audio wavelengths.

08:28

Lois: Could you give us some examples?

Brent: Maybe we want to convert text strings to embeddings or convert files into text. And then from text, maybe we can chunk that up into smaller chunks and then generate embeddings on those chunks. Maybe we want to convert files to embeddings, or maybe we want to use embeddings for end-to-end search.

Now you have to generate vector embeddings from your unstructured data, either outside or within the Oracle Database. You can either use the ONNX embedding machine learning models or you can access third-party REST APIs.

You can import pre-trained models in ONNX format for vector generation within the database. You can download pre-trained embedding machine learning models, convert them into the ONNX format if they are not already in that format. Then you can import those models into the Oracle Database and generate vector embeddings from your data within the database.

Oracle also allows you to convert pre-trained models to the ONNX format using Oracle machine learning for Python. This enables the use of text transformers from different companies.

09:51

Nikita: Ok, so that was about generating vector embeddings. What about the next step in the workflow—storing vector embeddings?

Brent: So you can create one or more columns of the vector data type in your standard relational data tables. You can also store those in secondary tables that are related to the primary tables using primary key foreign key relationships.

You can store vector embeddings on structured data and relational business data in the Oracle Database. You do store the resulting vector embeddings and associated unstructured data with your relational business data inside the Oracle Database.

10:30

Nikita: And the third step is creating vector indexes?

Brent: Now you may want to create vector indexes in the event that you have huge vector spaces. This is an optional step, but this is beneficial for running similarity searches over those huge vector spaces.

So once you have generated the vector embeddings and stored those vector embeddings and possibly created the vector indexes, you can then query your data with similarity searches. This allows for Native SQL operations and allows you to combine similarity searches with relational searches in order to retrieve relevant data.

11:15

Lois: Ok. I think I’ve got it. So, Step 1, generate the vector embeddings from your unstructured data. Step 2, store the vector embeddings. Step 3, create vector indices. And Step 4, combine similarity and keyword search.

Brent: Now there is another optional step. You could generate a prompt and send it to a large language model for a full RAG inference. You can use the similarity search results to generate a prompt and send it to your generative large language model in order to complete your RAG pipeline.

11:59

Lois: Thank you for sharing such valuable insights about Oracle AI Vector Search, Brent. We can’t wait to have you back next week to talk about vector indices and memory.

Nikita: And if you want to know more about Oracle AI Vector Search, visit mylearn.oracle.com and check out the Oracle Database 23ai: Oracle AI Vector Search Fundamentals course.

Lois: Yes, and if you're serious about advancing in your development journey, we recommend taking the Oracle Database 23ai SQL workshop. It’s designed for those who might be familiar with SQL from other database platforms or even those completely new to SQL.

Nikita: Yeah, we’ll add the link to the workshop in the show notes so you can find it easily. Until next week, this is Nikita Abraham…

Lois: And Lois Houston signing off!

12:45

That’s all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We’d also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.