February 13, 2025

How to Install RAGFlow with Ollama and Large Language Models on Linux Ubuntu – Host and Run

In this tutorial, we explain how to install and run RAGFlow locally, with local large language and embedding models running in Ollama. That is, in this tutorial, you will learn how to install and host your own RAG AI system completely for free on a Linux Ubuntu system. In the next tutorial, we will explain how to install RAGFlow on a Windows computer.

One of the main advantages of RAGFlow is that it is distributed under the Apache 2.0 license, which is a very permissive license (it even allows commercial use). Furthermore, you can connect the most popular large language models: local models such as Llama, Mistral, and DeepSeek, as well as cloud-based models such as Gemini and ChatGPT. In this tutorial, we will use a Llama 3.2 model and a simple text embedding model.

The only issue we see with RAGFlow is that its installation is far from trivial. You need to carefully install several components and configure them properly in order to be able to run RAGFlow locally. Also, you might need to resolve several errors that are not strictly caused by RAGFlow itself, but rather by inappropriate Linux configuration parameters.

The YouTube tutorial explaining all the installation steps is given below.

Installation Procedure

First, we need to install Docker Engine. To do that, follow the tutorial given here. Next, update and upgrade the packages, and install curl:

sudo apt update && sudo apt upgrade
sudo apt install curl
curl --version

Next, we need to install Git and Git Large File Storage (Git LFS):

sudo apt install git-all
curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
sudo apt-get install git-lfs
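
After the package is installed, you can initialize Git LFS for your user account and verify that it works:

git lfs install
git lfs version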

The next step is to check the kernel parameter vm.max_map_count:

sysctl vm.max_map_count
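
On a default Ubuntu installation, this typically prints something like the following (the exact value on your system may differ):

vm.max_map_count = 65530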


If this value is smaller than 262144, then you need to increase it to at least 262144. To do that, type this

sudo sysctl -w vm.max_map_count=262144

This change will be reset after a system reboot. Consequently, you have to make sure that this value is set every time you start the computer. To do that, you need to add a single line to the file /etc/sysctl.conf. You can do it like this

sudo nano /etc/sysctl.conf 

and add the following line to this file

vm.max_map_count=262144
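
After saving the file, you can apply the new value immediately, without rebooting, by reloading the configuration:

sudo sysctl -p

You can then rerun sysctl vm.max_map_count to confirm that the value is now 262144.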

The next step is to clone the repository and check out a release version of RAGFlow (the Docker images themselves are pulled in the step that follows). To do that, type this

cd ~
git clone https://github.com/infiniflow/ragflow.git
cd ragflow
git checkout -f v0.16.0

Here, we are checking out the most recent version of RAGFlow (as of February 2025). However, if you are following this tutorial after this date, then you may need to check out a different version. To see the correct version, go to the website

https://github.com/infiniflow/ragflow.git
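
Alternatively, you can list the available release tags directly from the cloned repository and pick the most recent one:

git tag --sort=-v:refname | head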

Next, try to run this command in order to download the images and start the RAGFlow containers:

docker compose -f docker/docker-compose.yml up -d

If you see this error

Error response from daemon: driver failed programming external connectivity on endpoint ragflow-server (19f6c3fdc67d795f0c37ef03b776f8c0b6d26e80de2b3389edc974491033fc5d): failed to bind 
port 0.0.0.0:80/tcp: Error starting userland proxy: listen tcp4 0.0.0.0:80: bind: address already in use

then, in order to fix this error, you need to find out which service is already using port 80. You can do that with the lsof utility; if it is not already installed, install it

sudo apt install lsof

and then you need to type this

sudo lsof -i :80

to see which services are using port 80. In our case, the service creating the issue is apache2. To stop it, type this:

sudo service apache2 stop
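
Note that this only stops the service until the next reboot; if you do not need Apache at all, you can also prevent it from starting at boot:

sudo systemctl disable apache2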

Then, you need to repeat the command:

docker compose -f docker/docker-compose.yml up -d
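
You can check that the RAGFlow containers are up and running by listing them:

docker compose -f docker/docker-compose.yml ps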

Then, monitor the server logs to confirm that RAGFlow started correctly

docker logs -f ragflow-server

Then, you need to start a web browser and enter the address localhost/login or the address 127.0.0.1 (see the YouTube tutorial for more details).
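
If the login page does not load in the browser, you can first check from the terminal that the web server responds on port 80:

curl -I http://localhost/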

Next, you need to install Ollama and the LLMs by using Docker containers. For the Ollama container to be able to access the GPU, we first need to install the NVIDIA Container Toolkit. To do that, execute these commands in the terminal:

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit

The next step is to configure Docker to use the NVIDIA runtime, by using the nvidia-ctk command

sudo nvidia-ctk runtime configure --runtime=docker
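
This command registers the NVIDIA runtime in the Docker daemon configuration file /etc/docker/daemon.json; you can inspect the result like this:

cat /etc/docker/daemon.json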

The final step of the toolkit installation is to restart the Docker engine

sudo systemctl restart docker
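
To verify that containers can now access the GPU, you can run a disposable test container that calls nvidia-smi (this assumes that the NVIDIA driver is already installed on the host):

sudo docker run --rm --gpus all ubuntu nvidia-smi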

The next step is to download the Ollama Docker image and start an Ollama container

docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

Here, a container named “ollama” is created from the official image “ollama/ollama”. The next step is to download the LLM and the embedding model

docker exec -it ollama ollama pull llama3.2
docker exec -it ollama ollama pull bge-m3
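
To confirm that both models were downloaded into the container, you can list them:

docker exec -it ollama ollama list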

You can also use other LLM and embedding models. The next step is to verify that the Ollama container can be reached from inside the RAGFlow container. First, open a shell inside the ragflow-server container

sudo docker exec -it ragflow-server bash

Then, use curl to verify that Ollama is running

curl http://host.docker.internal:11434/

You should get a response saying that Ollama is running. Note that this address:

http://host.docker.internal:11434/

should be entered when configuring Ollama inside the GUI of RAGFlow. The final step is to configure Ollama and the models in the GUI of RAGFlow. To do that, watch the YouTube video tutorial given above.
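
As a quick sanity check, you can also talk to the model directly from the terminal before configuring the GUI:

docker exec -it ollama ollama run llama3.2

Type a test prompt, and then type /bye to exit the interactive session.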