September 19, 2024

Install and Run Llama 3.1 Large Language Model (LLM) in Python Using Ollama on Windows on a Local Computer

In this tutorial, we explain how to run the Llama 3.1 Large Language Model (LLM) in Python using Ollama on Windows on a local computer. Ollama is an interface and a platform for running different LLMs on local computers. Llama 3.1 is Meta’s (previously Facebook) most powerful LLM to date. We will call Llama 3.1 by using Ollama’s Python library. After the response is generated in Python, we will save it in a text file so that you can use the generated text for other purposes. The YouTube tutorial accompanying this webpage is given here.

The procedure is:

1. Install Ollama and download the Llama 3.1 model from the Ollama website
2. Create a workspace folder, create a Python virtual environment, and install the Ollama Python Library
3. Write Python code that calls Llama 3.1 by using the Ollama library and that saves the response in a text file.

1. Install Ollama and download the Llama 3.1 model from the Ollama website

To install Ollama in Windows, go to the Ollama website

https://ollama.com/download/windows

Download the installation file and install Ollama in Windows. This is very simple: you just have to click on the downloaded file, and the installation process will start. Next, we explain how to install and use LLMs in Ollama. To see a list of LLMs that can be used with Ollama, go to this website, and select a model. In our case, we will use Llama 3.1. It is an open-source LLM released by Meta. Click on the model, or directly go to this webpage

https://ollama.com/library/llama3.1

Then, select your model. In our case, we will use the 8B model, and consequently, we have to execute the command below to install and run llama3.1:8b. To execute this command, open a command prompt in Windows (administrator mode might not be necessary), and run this command

ollama run llama3.1:8b

The first time you run this command, the model will be downloaded and then automatically started. On subsequent runs, the model will simply be started. To exit the model, simply type /bye in the Ollama command prompt.
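To verify that the model has been downloaded, you can list the models that are installed locally by running this command in the command prompt:

ollama list

The output should contain an entry for llama3.1:8b, together with its size and modification date.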

2. Create a workspace folder, create a Python virtual environment, and install the Ollama Python Library

Create a folder called codes on the C: drive, and inside of that folder create another folder called ollamaTest. We can do this from a command prompt like this:

cd\
mkdir codes
cd codes
mkdir ollamaTest
cd ollamaTest

Create the Python virtual environment and activate it:

python -m venv ollama
ollama\Scripts\activate.bat

Install the Ollama library:

pip install ollama
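To quickly check that the library is installed and can communicate with Ollama, you can run a minimal sketch such as the one below. It assumes that the Ollama application is running in the background on your computer:

import ollama

# List the models that are installed locally; this call talks to the
# local Ollama server, so Ollama must be running in the background
models = ollama.list()
print(models)

If this prints information that includes llama3.1:8b, everything is set up correctly.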

3. Write Python code that calls Llama 3.1 by using the Ollama library and that saves the response in a text file.

Create a new Python file called testOllama.py, and write and execute this code:

import ollama

# Name of the model installed earlier with "ollama run llama3.1:8b",
# and the question that we will send to it
desiredModel = 'llama3.1:8b'
questionToAsk = 'How to solve a quadratic equation?'

# Send the question to the model and wait for the complete response
response = ollama.chat(model=desiredModel, messages=[
  {
    'role': 'user',
    'content': questionToAsk,
  },
])

# Extract the text of the response
OllamaResponse = response['message']['content']

print(OllamaResponse)

# Save the response in a text file
with open("OutputOllama.txt", "w", encoding="utf-8") as text_file:
    text_file.write(OllamaResponse)

For example, you can use VS Code to write and execute this file. This file will ask the question “How to solve a quadratic equation?”, and the response from the LLM will be printed. Then, the response will be saved in the text file called OutputOllama.txt. This text file can later be used for other purposes.
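As a possible extension, the Ollama Python library also supports streaming responses, so you can print the answer piece by piece while it is being generated instead of waiting for the complete response. A minimal sketch of this approach, using the same model and question as above, is given below:

import ollama

desiredModel = 'llama3.1:8b'
questionToAsk = 'How to solve a quadratic equation?'

# Setting stream=True makes ollama.chat() return a generator of chunks
stream = ollama.chat(
    model=desiredModel,
    messages=[{'role': 'user', 'content': questionToAsk}],
    stream=True,
)

# Print every chunk of the response as soon as it arrives
for chunk in stream:
    print(chunk['message']['content'], end='', flush=True)

Streaming is useful for longer answers, since you can see the model's output immediately instead of waiting for the whole response to be generated.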