In this tutorial, we explain how to install and run RAGFlow locally, together with local large language and embedding models running in Ollama. That is, you will learn how to install and host your own RAG AI system completely for free on a Linux Ubuntu system. In the next tutorial, we will explain how to install RAGFlow on a Windows computer.
One of the main advantages of RAGFlow is that it is distributed under the Apache 2.0 license, which is a very permissive license (it even allows commercial use). Furthermore, you can integrate most popular large language models: local ones, such as Llama, Mistral, and DeepSeek, as well as cloud-based ones, such as Gemini and ChatGPT. In this tutorial, we will use a Llama 3.2 model and the bge-m3 text embedding model.
The only issue we see with RAGFlow is that its installation is far from trivial. You need to carefully install several components and configure them properly in order to run RAGFlow locally. Also, you might run into several errors that are not strictly caused by RAGFlow itself, but rather by inappropriate Linux configuration parameters.
The YouTube tutorial explaining all the installation steps is given below.
Installation Procedure
First, we need to install Docker Engine. To do that, follow the tutorial given here. Next, update and upgrade the packages, and install curl:
sudo apt update && sudo apt upgrade
sudo apt install curl
curl --version
Next, we need to install Git and Git LFS (Git Large File Storage):
sudo apt install git-all
curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
sudo apt-get install git-lfs
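After installing the package, initialize Git LFS for your user account:
git lfs install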
The next step is to check the value of the kernel parameter vm.max_map_count
sysctl vm.max_map_count
If this value is smaller than 262144, then you need to increase it to at least 262144. To do that, type this
sudo sysctl -w vm.max_map_count=262144
This change will be reset after a system reboot. Consequently, you have to make sure that this value is set every time you start the computer. To do that, you need to add a single line to the file /etc/sysctl.conf. You can do it like this
sudo nano /etc/sysctl.conf
and add the following line to this file
vm.max_map_count=262144
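To apply the value from /etc/sysctl.conf immediately, without waiting for a reboot, you can reload the configuration:
sudo sysctl -p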
The next step is to clone the repository and check out the RAGFlow version that we want to run. To do that, type this
cd ~
git clone https://github.com/infiniflow/ragflow.git
cd ragflow
git checkout -f v0.16.0
Here, we are checking out the most recent version of RAGFlow (as of February 2025). However, if you are following this tutorial at a later date, then you might need to check out a different version. To see the available versions, go to the website
https://github.com/infiniflow/ragflow.git
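Alternatively, you can list all the available release tags directly in the cloned repository:
git tag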
Next, run this command in order to download the image and start the RAGFlow containers:
docker compose -f docker/docker-compose.yml up -d
If you see this error
Error response from daemon: driver failed programming external connectivity on endpoint ragflow-server (19f6c3fdc67d795f0c37ef03b776f8c0b6d26e80de2b3389edc974491033fc5d): failed to bind
port 0.0.0.0:80/tcp: Error starting userland proxy: listen tcp4 0.0.0.0:80: bind: address already in use
then, in order to fix this error, you need to find out which service is already using port 80. First, install net-tools (which provides network diagnostic utilities):
sudo apt install net-tools
and then type this
sudo lsof -i :80
to see which services are using port 80 (if the lsof utility is not preinstalled on your system, install it with sudo apt install lsof). In our case, the service creating the issue is apache2. To stop it, type this:
sudo service apache2 stop
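If you do not need Apache at all, you can also prevent it from starting automatically at boot, so that the port conflict does not reappear after a restart:
sudo systemctl disable apache2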
Then, you need to repeat the command:
docker compose -f docker/docker-compose.yml up -d
Then, monitor the RAGFlow server logs to confirm that the server has started correctly:
docker logs -f ragflow-server
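You can also list the running containers to verify that all the RAGFlow services are up:
docker ps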
Then, start a web browser and enter the address localhost/login or the address 127.0.0.1 (see the YouTube tutorial for more details).
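As an optional check from the terminal (assuming that the server is listening on port 80, as in our setup), you can request the HTTP headers of the login page:
curl -I http://localhost/login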
Next, you need to install Ollama and the models by using Docker containers. Before that, we need to install the NVIDIA Container Toolkit, which enables GPU support inside Docker containers. Execute these commands in the terminal:
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
The next step is to configure the Docker runtime by using the nvidia-ctk command
sudo nvidia-ctk runtime configure --runtime=docker
The final step is to restart the Docker engine
sudo systemctl restart docker
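At this point, you can verify that Docker containers can access the GPU by running nvidia-smi inside a disposable container (this assumes that the NVIDIA driver is already installed on the host):
sudo docker run --rm --gpus all ubuntu nvidia-smi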
The next step is to download the Ollama Docker image and start an Ollama container
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
Here, the name of the container is “ollama”, and it is created from the official image “ollama/ollama”. The next step is to download the LLM and the embedding model
docker exec -it ollama ollama pull llama3.2
docker exec -it ollama ollama pull bge-m3
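To verify that both models were downloaded, list the models available inside the container:
docker exec -it ollama ollama list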
You can also use other LLM and embedding models. The next step is to verify that the Ollama container can be reached from the RAGFlow container. First, open a shell inside the ragflow-server container:
sudo docker exec -it ragflow-server bash
Then, use curl to verify that Ollama is running:
curl http://host.docker.internal:11434/
You should get a response saying that Ollama is running. Note that this address:
http://host.docker.internal:11434/
should be entered when configuring Ollama inside the RAGFlow GUI. The final step is to configure Ollama and the models in the RAGFlow GUI. To do that, watch the YouTube video tutorial given above.
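As an optional final check (assuming that the llama3.2 model was pulled as shown above), you can send a simple generation request to the Ollama API from the host, since the container publishes port 11434:
curl http://localhost:11434/api/generate -d '{"model": "llama3.2", "prompt": "Why is the sky blue?", "stream": false}'
If Ollama and the model are working correctly, you will get a JSON response containing the generated answer.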