In this tutorial, we explain how to run the Llama 3.1 Large Language Model (LLM) in Python using Ollama on a local Windows computer. Ollama is a platform and interface for running different LLMs on local computers. Llama 3.1 is Meta’s (formerly Facebook) most powerful LLM to date. We will call Llama 3.1 by using Ollama’s Python library. After the response is generated in Python, we will save it to a text file so that you can use the generated text for other purposes. The YouTube tutorial accompanying this webpage is given here
The procedure is:
1. Install Ollama and download the Llama 3.1 model from the Ollama website
2. Create a workspace folder, create a Python virtual environment, and install the Ollama Python Library
3. Write Python code that calls Llama 3.1 by using the Ollama library and saves the response in a text file.
1. Install Ollama and download the Llama 3.1 model from the Ollama website
To install Ollama in Windows, go to the Ollama website
https://ollama.com/download/windows
Download the installation file and install Ollama on Windows. The installation is very simple: just click on the downloaded file and the installation process will start. Next, we explain how to install and use LLMs in Ollama. To see a list of LLMs that can be used with Ollama, go to the Ollama model library and select a model. In our case, we will use Llama 3.1, an open-source LLM released by Meta. Click on the model, or go directly to this webpage
https://ollama.com/library/llama3.1
Then, select your model. In our case, we will use the 8B model, and consequently, we have to execute the command below to install and run llama3.1:8b. To execute this command, open a command prompt in Windows (administrator mode might not be necessary) and run
ollama run llama3.1:8b
The first time you run this command, the model will be downloaded and then started automatically. On subsequent runs, the model will only be started. To exit the model, simply type /bye in the Ollama command prompt.
2. Create a workspace folder and create a Python virtual environment
Create the codes folder on the C: drive and, inside that folder, create another folder called ollamaTest. We can do it like this:
cd\
mkdir codes
cd codes
mkdir ollamaTest
cd ollamaTest
Create the Python virtual environment and activate it
python -m venv ollama
ollama\Scripts\activate.bat
Install the Ollama library
pip install ollama
3. Write Python code that calls Llama 3.1 by using the Ollama library and saves the response in a text file.
Create a new Python file called testOllama.py, and write and execute this code:
import ollama

# Use the same model tag that was downloaded with "ollama run llama3.1:8b"
desiredModel = 'llama3.1:8b'
questionToAsk = 'How to solve a quadratic equation?'

response = ollama.chat(model=desiredModel, messages=[
    {
        'role': 'user',
        'content': questionToAsk,
    },
])

# Extract the generated text from the response dictionary
OllamaResponse = response['message']['content']
print(OllamaResponse)

# Save the response to a text file
with open("OutputOllama.txt", "w", encoding="utf-8") as text_file:
    text_file.write(OllamaResponse)
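The messages argument passed to ollama.chat() is a list of dictionaries, each with a 'role' (such as 'user', 'assistant', or 'system') and a 'content' string. As a minimal sketch of this format (the helper name make_messages is our own for illustration, not part of the Ollama library), a conversation with an optional system prompt can be built like this:

```python
# Sketch of the role/content chat message format expected by ollama.chat().
# The helper make_messages is hypothetical, not part of the Ollama library.
def make_messages(question, system_prompt=None):
    """Build a messages list in the role/content chat format."""
    messages = []
    if system_prompt is not None:
        # An optional system message steers the model's overall behavior
        messages.append({'role': 'system', 'content': system_prompt})
    # The user message carries the actual question
    messages.append({'role': 'user', 'content': question})
    return messages

msgs = make_messages('How to solve a quadratic equation?',
                     system_prompt='Answer concisely.')
print(msgs[0]['role'])  # system
print(msgs[1]['role'])  # user
```

Such a helper makes it easy to reuse the same system prompt across several questions.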
For example, you can use VS Code to write and execute this file. The script asks the question “How to solve a quadratic equation?”, prints the response from the LLM, and then saves the response in a text file called OutputOllama.txt. This text file can later be used for other purposes.
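Because the response is stored on disk, it can be loaded back later in a separate script. A small sketch (the file name matches the script above; the sample text here is only a stand-in for a real LLM response):

```python
from pathlib import Path

# Stand-in text in place of a real LLM response
Path("OutputOllama.txt").write_text("Use the quadratic formula.", encoding="utf-8")

# Later, load the saved response back for further processing
saved = Path("OutputOllama.txt").read_text(encoding="utf-8")
print(saved)  # Use the quadratic formula.
```

The same pattern lets you feed the saved text into other tools, such as a summarizer or a document pipeline.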