This project serves as a functional RAG UI for both end users who want to do QA on their documents and developers who want to build their own RAG pipeline.
For end users:
A clean & minimalistic UI for RAG-based QA.
Supports LLM API providers (OpenAI, AzureOpenAI, Cohere, etc.) and local LLMs (via ollama and llama-cpp-python); see the Ollama sketch after this list.
Easy installation scripts.
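If you plan to run local models through Ollama, a typical setup looks like the sketch below (the model names are only examples; any chat and embedding model hosted by Ollama should work):
# Pull a chat model and an embedding model (example names)
ollama pull llama3.1
ollama pull nomic-embed-text
# Start the Ollama server if it is not already running (default: http://localhost:11434)
ollama serve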
For developers:
A framework for building your own RAG-based document QA pipeline.
Customize and see your RAG pipeline in action with the provided UI (built with Gradio); a from-source setup sketch follows this list.
If you use Gradio for development, check out our theme here: kotaemon-gradio-theme.
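For development, you can also install the project from source. The sketch below assumes the repository's libs/kotaemon and libs/ktem package layout and a recent Python (>= 3.10); check the repository for the exact, up-to-date steps:
# Clone the repository and enter it
git clone https://github.com/Cinnamon/kotaemon
cd kotaemon
# Create and activate a fresh virtual environment (optional but recommended)
python -m venv .venv && source .venv/bin/activate
# Install the core library and the web app in editable mode
pip install -e "libs/kotaemon[all]"
pip install -e "libs/ktem"
# Start the Gradio web UI
python app.py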
Key Features
Host your own document QA (RAG) web-UI. Support multi-user login, organize your files in private / public collections, collaborate and share your favorite chat with others.
Organize your LLM & Embedding models. Support both local LLMs & popular API providers (OpenAI, Azure, Ollama, Groq); see the environment-variable sketch after this list.
Hybrid RAG pipeline. Sane default RAG pipeline with hybrid (full-text & vector) retriever + re-ranking to ensure best retrieval quality.
Multi-modal QA support. Perform Question Answering on multiple documents with figures & tables support. Support multi-modal document parsing (selectable options on UI).
Advanced citations with document preview. By default the system provides detailed citations to ensure the correctness of LLM answers. View your citations (incl. relevance score) directly in the in-browser PDF viewer with highlights. A warning is shown when the retrieval pipeline returns low-relevance articles.
Support complex reasoning methods. Use question decomposition to answer complex / multi-hop questions. Support agent-based reasoning with ReAct, ReWOO and other agents.
Configurable settings UI. You can adjust the most important aspects of the retrieval & generation process in the UI (incl. prompts).
Extensible. Built on Gradio, you are free to customize or add any UI elements you like. We also aim to support multiple strategies for document indexing & retrieval; a GraphRAG indexing pipeline is provided as an example.
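As a concrete example of wiring up an API provider, the key typically just needs to be visible to the app, either entered in the web UI after startup or exported as an environment variable before launch (the variable names below are assumptions; check the project's environment template for the exact names):
# Assumed variable names; set before starting the app, or pass with -e to docker run
export OPENAI_API_KEY=<your-openai-key>
export COHERE_API_KEY=<your-cohere-key>   # optional, e.g. for Cohere reranking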
Installation
For end users
This document is intended for developers. If you just want to install and use the app as is, please follow the non-technical User Guide. Use the most recent release .zip to get the latest features and bug fixes.
For developers
With Docker (recommended)
We provide lite & full versions of the Docker image. The full version additionally installs the extra packages of unstructured, which adds support for more file types (.doc, .docx, ...) at the cost of a larger image size. The lite image should work well for most users.
To use the lite version:
docker run \
-e GRADIO_SERVER_NAME=0.0.0.0 \
-e GRADIO_SERVER_PORT=7860 \
-p 7860:7860 -it --rm \
ghcr.io/cinnamon/kotaemon:main-lite
To use the full version:
docker run \
-e GRADIO_SERVER_NAME=0.0.0.0 \
-e GRADIO_SERVER_PORT=7860 \
-p 7860:7860 -it --rm \
ghcr.io/cinnamon/kotaemon:main-full
Currently, two platforms are provided & tested: linux/amd64 and linux/arm64 (for newer Macs). You can specify the platform by passing --platform to the docker run command. For example:
# To run docker with platform linux/arm64
docker run \
-e GRADIO_SERVER_NAME=0.0.0.0 \
-e GRADIO_SERVER_PORT=7860 \
-p 7860:7860 -it --rm \
--platform linux/arm64 \
ghcr.io/cinnamon/kotaemon:main-lite
If everything is set up correctly, navigate to http://localhost:7860/ to access the web UI.
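Note that the commands above use --rm, so uploaded documents and settings are discarded when the container exits. To persist them across restarts, you will likely want to mount a volume for the app's data directory; a sketch is below, assuming the data lives under /app/ktem_app_data inside the image:
# Persist app data (documents, settings, index) in a named volume
# The container path is an assumption; adjust it to where the image stores its data
docker run \
-e GRADIO_SERVER_NAME=0.0.0.0 \
-e GRADIO_SERVER_PORT=7860 \
-v ktem_app_data:/app/ktem_app_data \
-p 7860:7860 -it --rm \
ghcr.io/cinnamon/kotaemon:main-lite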
References:
https://github.com/Cinnamon/kotaemon