RAG (Retrieval-Augmented Generation) lets you augment the knowledge baked into the model with new material at response time. There are many ways to use this; for example, Bing Chat and Bard leverage RAG to bring live content into their answers to your queries.
At its simplest, RAG can be done by prompting an LLM with something like this:
Answer the question based only on the following context, citing the page number(s) of the document(s) you used to answer the question:
<document>
<content>
The content goes here.
</content>
<page>10</page>
<file>example.pdf</file>
</document>
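To make this concrete, here is a minimal Python sketch of how such a prompt might be assembled; build_rag_prompt and the document dictionaries are hypothetical stand-ins for whatever your retriever returns:

# A minimal sketch of filling in the RAG prompt above. The 'documents'
# list is a hypothetical stand-in for retriever output.
def build_rag_prompt(question, documents):
    # Each document is a dict with 'content', 'page', and 'file' keys
    context = '\n'.join(
        '<document>\n'
        f"<content>{doc['content']}</content>\n"
        f"<page>{doc['page']}</page>\n"
        f"<file>{doc['file']}</file>\n"
        '</document>'
        for doc in documents
    )
    return ('Answer the question based only on the following context, '
            'citing the page number(s) of the document(s) you used to '
            'answer the question:\n\n'
            f'{context}\n\nQuestion: {question}')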
Grounding answers in retrieved context also leads to fewer hallucinations from the LLM, which is important for building trust with the bot's users.
The main tools used are Streamlit, LangChain, Replit, and FAISS. Streamlit is a Python framework for quickly building interactive web apps, and LangChain is a framework for composing LLM-powered applications.
Replit is a tool to “build, test, and deploy directly from the browser”. It gives you a nice code editor with some AI capabilities right in your browser. The free tier provides a small VM to run your code on, but makes whatever you write public (so be careful with your OpenAI API key). It can also deploy your app to Replit's cloud, if you choose to.
FAISS (Facebook AI Similarity Search): a library for efficient similarity search and clustering of dense vectors. It is used here to find the chunks of content relevant to the user's query, which are then passed to the LLM in the RAG prompt.
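Before you can search, the document has to be chunked, embedded, and indexed. Here is a minimal sketch of that step, assuming the same LangChain version as the article; the file and folder names mirror the loading code below:

# A minimal sketch of building and saving a FAISS index from a PDF
# (assumes the langchain, openai, faiss-cpu, and pypdf packages)
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS

file_folder = 'index'
file_name = 'Some_content_to_search.pdf'

# Split the PDF into overlapping chunks that fit comfortably in a prompt
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = PyPDFLoader(file_name).load_and_split(text_splitter=splitter)

# Embed the chunks and persist the index for the app to load later
index = FAISS.from_documents(chunks, OpenAIEmbeddings())
index.save_local(folder_path=file_folder, index_name=file_name + '.index')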
Below is the basic code for loading the search index for your own document:
def get_search_index():
    # Load the saved FAISS index from disk; OpenAI embeddings are used
    # to embed incoming queries at search time
    from langchain.vectorstores import FAISS
    from langchain.embeddings.openai import OpenAIEmbeddings

    file_folder = 'index'
    file_name = 'Some_content_to_search.pdf'
    search_index = FAISS.load_local(folder_path=file_folder,
                                    index_name=file_name + '.index',
                                    embeddings=OpenAIEmbeddings())
    return search_index
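Once loaded, the index can be queried directly to sanity-check retrieval; the query string below is just an illustration:

# Retrieve the four chunks most similar to a (hypothetical) user query
index = get_search_index()
docs = index.similarity_search('What is this document about?', k=4)
for doc in docs:
    print(doc.metadata.get('page'), doc.page_content[:100])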
The full code from the article is available on GitHub:
https://github.com/nimamahmoudi/LLMStreamlitDemoBasic
References:
https://itnext.io/building-rag-based-chatbots-using-streamlit-langchain-e5c8554ea435