{"id":77,"date":"2024-03-02T17:11:00","date_gmt":"2024-03-02T17:11:00","guid":{"rendered":"https:\/\/sanjayk7r.com\/?p=77"},"modified":"2024-03-07T16:53:59","modified_gmt":"2024-03-07T16:53:59","slug":"power-up-llm-with-hybrid-search-rag","status":"publish","type":"post","link":"https:\/\/sanjayk7r.com\/index.php\/2024\/03\/02\/power-up-llm-with-hybrid-search-rag\/","title":{"rendered":"Power up LLM bots with RAG + Hybrid search"},"content":{"rendered":"\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"1024\" src=\"https:\/\/sanjayk7r.com\/wp-content\/uploads\/2024\/03\/DALL\u00b7E-2024-03-03-18.20.02-Create-a-high-quality-detailed-isometric-illustration-inspired-by-Pixar-featuring-a-llama-surrounded-by-cute-little-flying-robots.-The-llama-stands-.webp\" alt=\"\" class=\"wp-image-82\" srcset=\"https:\/\/sanjayk7r.com\/wp-content\/uploads\/2024\/03\/DALL\u00b7E-2024-03-03-18.20.02-Create-a-high-quality-detailed-isometric-illustration-inspired-by-Pixar-featuring-a-llama-surrounded-by-cute-little-flying-robots.-The-llama-stands-.webp 1024w, https:\/\/sanjayk7r.com\/wp-content\/uploads\/2024\/03\/DALL\u00b7E-2024-03-03-18.20.02-Create-a-high-quality-detailed-isometric-illustration-inspired-by-Pixar-featuring-a-llama-surrounded-by-cute-little-flying-robots.-The-llama-stands--300x300.webp 300w, https:\/\/sanjayk7r.com\/wp-content\/uploads\/2024\/03\/DALL\u00b7E-2024-03-03-18.20.02-Create-a-high-quality-detailed-isometric-illustration-inspired-by-Pixar-featuring-a-llama-surrounded-by-cute-little-flying-robots.-The-llama-stands--150x150.webp 150w, https:\/\/sanjayk7r.com\/wp-content\/uploads\/2024\/03\/DALL\u00b7E-2024-03-03-18.20.02-Create-a-high-quality-detailed-isometric-illustration-inspired-by-Pixar-featuring-a-llama-surrounded-by-cute-little-flying-robots.-The-llama-stands--768x768.webp 768w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Retrieval-Augmented Generation or RAG is 
important because it helps LLMs give more accurate answers by also including content (or context) from your private data. See this post on <a href=\"https:\/\/sanjayk7r.com\/index.php\/2024\/03\/04\/use-private-data-securely-with-an-llm\/\" target=\"_blank\" rel=\"noopener\" title=\"\">how you can implement RAG quickly and securely.<\/a><\/p>\n\n\n\n<p>With RAG, your private data is stored as embeddings in a vector database. Depending on the user query, data is retrieved from this database using a type of search called &#8220;semantic search.&#8221; Semantic search is cool because it understands questions more like a human does, e.g. it knows that &#8220;cost&#8221; and &#8220;price&#8221; mean the same thing.<\/p>\n\n\n\n<p>But it&#8217;s not perfect. It sometimes misses relevant keywords. In some cases the precise keyword, e.g. a brand or website name, must be included in the query to get correct results.<\/p>\n\n\n\n<h5 class=\"wp-block-heading\">Hybrid Search<\/h5>\n\n\n\n<p>To improve this, Knowledge Bases for Amazon Bedrock introduced a new feature called &#8220;hybrid search.&#8221; Hybrid search combines semantic search with the old-school keyword search. It uses the strengths of both to give you better results. 
Here is an example:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>what is the <mark style=\"background-color:rgba(0, 0, 0, 0);color:#73c1ff\" class=\"has-inline-color\">cost<\/mark> of the cycle of the brand &lt;<mark style=\"background-color:rgba(0, 0, 0, 0);color:#ff7373\" class=\"has-inline-color\">brand<\/mark>&gt; on &lt;<mark style=\"background-color:rgba(0, 0, 0, 0);color:#ff7373\" class=\"has-inline-color\">website<\/mark>&gt;?<\/code><\/pre>\n\n\n\n<p>In this query, a semantic search for &#8220;cost&#8221; combined with a keyword search for &#8220;brand&#8221; and &#8220;website&#8221; will yield better search results, and therefore better context.<\/p>\n\n\n\n<p>The Amazon Bedrock console currently offers three options for querying your vector database.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"521\" height=\"264\" src=\"https:\/\/sanjayk7r.com\/wp-content\/uploads\/2024\/03\/image-edited.png\" alt=\"\" class=\"wp-image-213\" style=\"width:451px;height:auto\" srcset=\"https:\/\/sanjayk7r.com\/wp-content\/uploads\/2024\/03\/image-edited.png 521w, https:\/\/sanjayk7r.com\/wp-content\/uploads\/2024\/03\/image-edited-300x152.png 300w\" sizes=\"(max-width: 521px) 100vw, 521px\" \/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">\ud83d\udc50<\/h2>\n\n\n\n<p>That&#8217;s it. With hybrid search, you are going to get more accurate answers from your foundation models (FMs). Note that semantic search is more precise when the domain is narrow and there is little room for misinterpretation.<\/p>\n\n\n\n<p>\ud83d\udc49 For more details, you can visit the full blog post <a href=\"https:\/\/aws.amazon.com\/blogs\/machine-learning\/knowledge-bases-for-amazon-bedrock-now-supports-hybrid-search\/\" target=\"_blank\" rel=\"noopener\" title=\"\">here<\/a>.<\/p>\n\n\n\n<p>\ud83d\udc49 Want to try RAG on Bedrock? 
Here is a <a href=\"https:\/\/github.com\/aws-samples\/amazon-bedrock-samples\/tree\/main\/knowledge-bases\" target=\"_blank\" rel=\"noopener\" title=\"\">GitHub repo<\/a> to help you get started.<\/p>\n\n\n\n<p>\ud83d\udc49 Here is a superb video on the topic as well.<\/p>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe loading=\"lazy\" title=\"AWS re:Invent 2023 - Use RAG to improve responses in generative AI applications (AIM336)\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/N0tlOXZwrSs?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>\n<\/div><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<div class=\"wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button\"><a class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/aws.amazon.com\/blogs\/machine-learning\/knowledge-bases-for-amazon-bedrock-now-supports-hybrid-search\/\" target=\"_blank\" rel=\"noreferrer noopener\">Read more<\/a><\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Retrieval-Augmented Generation or RAG is important because it helps LLMs give more accurate answers by also including content (or context) from your private data. See this post on how you can implement RAG quickly and securely. With RAG, your private data is stored as an embedding in a vector database. 
Depending on the user query, [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":82,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[6,13,11],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/sanjayk7r.com\/index.php\/wp-json\/wp\/v2\/posts\/77"}],"collection":[{"href":"https:\/\/sanjayk7r.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sanjayk7r.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sanjayk7r.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/sanjayk7r.com\/index.php\/wp-json\/wp\/v2\/comments?post=77"}],"version-history":[{"count":14,"href":"https:\/\/sanjayk7r.com\/index.php\/wp-json\/wp\/v2\/posts\/77\/revisions"}],"predecessor-version":[{"id":214,"href":"https:\/\/sanjayk7r.com\/index.php\/wp-json\/wp\/v2\/posts\/77\/revisions\/214"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/sanjayk7r.com\/index.php\/wp-json\/wp\/v2\/media\/82"}],"wp:attachment":[{"href":"https:\/\/sanjayk7r.com\/index.php\/wp-json\/wp\/v2\/media?parent=77"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sanjayk7r.com\/index.php\/wp-json\/wp\/v2\/categories?post=77"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sanjayk7r.com\/index.php\/wp-json\/wp\/v2\/tags?post=77"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}