{"id":97,"date":"2024-03-04T22:36:13","date_gmt":"2024-03-04T22:36:13","guid":{"rendered":"https:\/\/sanjayk7r.com\/?p=97"},"modified":"2024-03-07T16:40:02","modified_gmt":"2024-03-07T16:40:02","slug":"use-private-data-securely-with-an-llm","status":"publish","type":"post","link":"https:\/\/sanjayk7r.com\/index.php\/2024\/03\/04\/use-private-data-securely-with-an-llm\/","title":{"rendered":"Use private data securely with an LLM"},"content":{"rendered":"\n<figure data-wp-context=\"{&quot;uploadedSrc&quot;:&quot;https:\\\/\\\/sanjayk7r.com\\\/wp-content\\\/uploads\\\/2024\\\/03\\\/llmrag-1.jpg&quot;,&quot;figureClassNames&quot;:&quot;wp-block-image alignwide size-large&quot;,&quot;figureStyles&quot;:null,&quot;imgClassNames&quot;:&quot;wp-image-184&quot;,&quot;imgStyles&quot;:null,&quot;targetWidth&quot;:2386,&quot;targetHeight&quot;:1491,&quot;scaleAttr&quot;:false,&quot;ariaLabel&quot;:&quot;Enlarge image&quot;,&quot;alt&quot;:&quot;&quot;}\" data-wp-interactive=\"core\/image\" class=\"wp-block-image alignwide size-large wp-lightbox-container\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"640\" data-wp-init=\"callbacks.setButtonStyles\" data-wp-on--click=\"actions.showLightbox\" data-wp-on--load=\"callbacks.setButtonStyles\" data-wp-on-window--resize=\"callbacks.setButtonStyles\" src=\"https:\/\/sanjayk7r.com\/wp-content\/uploads\/2024\/03\/llmrag-1-1024x640.jpg\" alt=\"\" class=\"wp-image-184\" srcset=\"https:\/\/sanjayk7r.com\/wp-content\/uploads\/2024\/03\/llmrag-1-1024x640.jpg 1024w, https:\/\/sanjayk7r.com\/wp-content\/uploads\/2024\/03\/llmrag-1-300x187.jpg 300w, https:\/\/sanjayk7r.com\/wp-content\/uploads\/2024\/03\/llmrag-1-768x480.jpg 768w, https:\/\/sanjayk7r.com\/wp-content\/uploads\/2024\/03\/llmrag-1-1536x960.jpg 1536w, https:\/\/sanjayk7r.com\/wp-content\/uploads\/2024\/03\/llmrag-1-2048x1280.jpg 2048w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" 
\/><button\n\t\t\tclass=\"lightbox-trigger\"\n\t\t\ttype=\"button\"\n\t\t\taria-haspopup=\"dialog\"\n\t\t\taria-label=\"Enlarge image\"\n\t\t\tdata-wp-init=\"callbacks.initTriggerButton\"\n\t\t\tdata-wp-on--click=\"actions.showLightbox\"\n\t\t\tdata-wp-style--right=\"context.imageButtonRight\"\n\t\t\tdata-wp-style--top=\"context.imageButtonTop\"\n\t\t>\n\t\t\t<svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"12\" height=\"12\" fill=\"none\" viewBox=\"0 0 12 12\">\n\t\t\t\t<path fill=\"#fff\" d=\"M2 0a2 2 0 0 0-2 2v2h1.5V2a.5.5 0 0 1 .5-.5h2V0H2Zm2 10.5H2a.5.5 0 0 1-.5-.5V8H0v2a2 2 0 0 0 2 2h2v-1.5ZM8 12v-1.5h2a.5.5 0 0 0 .5-.5V8H12v2a2 2 0 0 1-2 2H8Zm2-12a2 2 0 0 1 2 2v2h-1.5V2a.5.5 0 0 0-.5-.5H8V0h2Z\" \/>\n\t\t\t<\/svg>\n\t\t<\/button><\/figure>\n\n\n\n<p>Large language models (LLMs) are amazing, but one shortcoming is that they won't be able to answer questions related to your private data. This is where RAG, or retrieval-augmented generation, comes in. RAG is a technique that lets you use LLMs with your private data.<\/p>\n\n\n\n<h6 class=\"wp-block-heading\"><strong>Don't roll your own RAG<\/strong><\/h6>\n\n\n\n<p>A RAG architecture has quite a few moving parts. It involves an ingestion database, a vector database, a retrieval system, a prompt, and finally a generative model. Moreover, these need to be orchestrated so that they can ingest new or updated data. Managing these interconnected components adds challenges to the development and deployment process that can slow your team down. Open-source libraries can simplify these tasks, but they risk introducing errors and require constant version updates. 
Using these libraries still demands substantial coding, as well as determining data chunk sizes and generating embeddings.<\/p>\n\n\n\n<p>\ud83d\udc49 Your team should focus on building your product and not on the undifferentiated heavy lifting behind a RAG process.<\/p>\n\n\n\n<h5 class=\"wp-block-heading\">Hello Bedrock Knowledge Base<\/h5>\n\n\n\n<p>Here is how simple it is to set up a fully managed RAG workflow that is secure and production-grade.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">1. Create a Knowledge Base<\/h4>\n\n\n\n<p>Knowledge Bases for Amazon Bedrock is a serverless approach to building your RAG workflow. It automates synchronisation of your data with the vector database.<\/p>\n\n\n\n<p>Simply point to the location of your data on S3, and Knowledge Base takes care of the entire ingestion process into your vector database, including embedding and synchronisation when your data changes. Your data can be in many formats, e.g. txt, csv, xls, doc, or pdf.<\/p>\n\n\n\n<p>You can simply use the default embedding model and vector database, or choose your own.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"658\" src=\"https:\/\/sanjayk7r.com\/wp-content\/uploads\/2024\/03\/Untitled-1024x658.png\" alt=\"\" class=\"wp-image-205\" srcset=\"https:\/\/sanjayk7r.com\/wp-content\/uploads\/2024\/03\/Untitled-1024x658.png 1024w, https:\/\/sanjayk7r.com\/wp-content\/uploads\/2024\/03\/Untitled-300x193.png 300w, https:\/\/sanjayk7r.com\/wp-content\/uploads\/2024\/03\/Untitled-768x494.png 768w, https:\/\/sanjayk7r.com\/wp-content\/uploads\/2024\/03\/Untitled.png 1064w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<h4 class=\"wp-block-heading\">2. Query your knowledge base<\/h4>\n\n\n\n<p>Once your data is ingested, you can go ahead and start talking to it. 
There are different ways to query your data.<\/p>\n\n\n\n<h5 class=\"wp-block-heading\">Use the AWS Console<\/h5>\n\n\n\n<p>The console lets you quickly try both retrieval and generation.<\/p>\n\n\n\n<h6 class=\"wp-block-heading\">Retrieval<\/h6>\n\n\n\n<p>This means finding the correct set of documents based on the user query. You can try different search types, such as hybrid search or semantic search, or let Bedrock choose one for you. <a href=\"https:\/\/sanjayk7r.com\/index.php\/2024\/03\/02\/power-up-llm-with-hybrid-search-rag\/\" target=\"_blank\" rel=\"noopener\" title=\"More about that in this post\">More about that in this post<\/a>.<\/p>\n\n\n\n<h6 class=\"wp-block-heading\">Generation<\/h6>\n\n\n\n<p>Next is &#8220;generation&#8221;, i.e. taking the search results that you have pulled from the vector database, bundling them as context together with the user prompt, and sending it all to an LLM. Choosing a model takes just one click, e.g. select Claude and have it generate answers for you. All the plumbing happens behind the scenes, and you can focus on getting the best answers from your data.<\/p>\n\n\n\n<p>The screenshot below is simple, but it demonstrates the entire RAG workflow.<\/p>\n\n\n\n<ol>\n<li>Private data was uploaded to S3.<\/li>\n\n\n\n<li>Data was ingested and embedded into a vector database.<\/li>\n\n\n\n<li>A context was prepared from the vector database based on my query.<\/li>\n\n\n\n<li>Then that context was sent to Claude together with my prompt. 
<\/li>\n\n\n\n<li>Claude then generated the response that you see below. It even attached links to my data, which help trace the answer back to its source.<\/li>\n<\/ol>\n\n\n\n<p>The first step is manual; the rest is entirely managed by Bedrock for you!<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"452\" height=\"738\" src=\"https:\/\/sanjayk7r.com\/wp-content\/uploads\/2024\/03\/bedrock-query.png\" alt=\"\" class=\"wp-image-206\" style=\"width:417px;height:auto\" srcset=\"https:\/\/sanjayk7r.com\/wp-content\/uploads\/2024\/03\/bedrock-query.png 452w, https:\/\/sanjayk7r.com\/wp-content\/uploads\/2024\/03\/bedrock-query-184x300.png 184w\" sizes=\"(max-width: 452px) 100vw, 452px\" \/><\/figure>\n\n\n\n<h5 class=\"wp-block-heading\">Use the SDK<\/h5>\n\n\n\n<p>If you want to develop an app or talk to your data via a notebook, that is easy as well. You just need the latest AWS SDK. For example, here are some API options if you use Python.<\/p>\n\n\n\n<p>&#8211; api: <a href=\"https:\/\/boto3.amazonaws.com\/v1\/documentation\/api\/latest\/reference\/services\/bedrock-agent-runtime\/client\/retrieve_and_generate.html#retrieve-and-generate\" target=\"_blank\" rel=\"noopener\" title=\"retrieve_and_generate\">retrieve_and_generate<\/a> &#8211; Retrieves documents and obtains answers directly with the native Bedrock Knowledge Base.<\/p>\n\n\n\n<p>&#8211; api: <a href=\"https:\/\/boto3.amazonaws.com\/v1\/documentation\/api\/latest\/reference\/services\/bedrock-agent-runtime\/client\/retrieve.html#retrieve\" target=\"_blank\" rel=\"noopener\" title=\"\">retrieve<\/a> &#8211; Retrieves documents based on your query, which you can then pass to an LLM of your choice.<\/p>\n\n\n\n<h5 class=\"wp-block-heading\">Build a fully customised solution<\/h5>\n\n\n\n<p>If you want to build your own fully customised Q&amp;A application, you can integrate Bedrock with LangChain. 
Have a look at the <a href=\"https:\/\/python.langchain.com\/docs\/integrations\/retrievers\/bedrock\" target=\"_blank\" rel=\"noopener\" title=\"Langchain integration\">LangChain integration<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">\ud83d\udc50 <\/h2>\n\n\n\n<p>That's it. It really is that straightforward to put together an LLM with RAG.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">\ud83d\ude80 Get Started<\/h4>\n\n\n\n<p>\ud83d\udc49 For a step-by-step walkthrough on setting up RAG on Bedrock, read this <a href=\"https:\/\/aws.amazon.com\/blogs\/aws\/knowledge-bases-now-delivers-fully-managed-rag-experience-in-amazon-bedrock\/\" target=\"_blank\" rel=\"noopener\" title=\"blog post\">blog post<\/a>.<\/p>\n\n\n\n<p>\ud83d\udc49 For a really good look at how to use RAG for drug discovery using Amazon Bedrock, read this <a href=\"https:\/\/aws.amazon.com\/blogs\/machine-learning\/use-rag-for-drug-discovery-with-knowledge-bases-for-amazon-bedrock\/\" target=\"_blank\" rel=\"noopener\" title=\"blog post\">blog post<\/a>.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Large language models (LLMs) are amazing, but one shortcoming is that they won't be able to answer questions related to your private data. This is where RAG, or retrieval-augmented generation, comes in. RAG is a technique that lets you use LLMs with your private data. 
Don't roll your own RAG RAG architecture has quite a [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":184,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[6,13,11],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/sanjayk7r.com\/index.php\/wp-json\/wp\/v2\/posts\/97"}],"collection":[{"href":"https:\/\/sanjayk7r.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sanjayk7r.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sanjayk7r.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/sanjayk7r.com\/index.php\/wp-json\/wp\/v2\/comments?post=97"}],"version-history":[{"count":37,"href":"https:\/\/sanjayk7r.com\/index.php\/wp-json\/wp\/v2\/posts\/97\/revisions"}],"predecessor-version":[{"id":210,"href":"https:\/\/sanjayk7r.com\/index.php\/wp-json\/wp\/v2\/posts\/97\/revisions\/210"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/sanjayk7r.com\/index.php\/wp-json\/wp\/v2\/media\/184"}],"wp:attachment":[{"href":"https:\/\/sanjayk7r.com\/index.php\/wp-json\/wp\/v2\/media?parent=97"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sanjayk7r.com\/index.php\/wp-json\/wp\/v2\/categories?post=97"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sanjayk7r.com\/index.php\/wp-json\/wp\/v2\/tags?post=97"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}