January 16, 2025

Install and Run the You Only Look Once (YOLO) Computer Vision Model on GPU/CPU in Windows

In this brief computer vision tutorial, we explain how to install and run the You Only Look Once (YOLO) computer vision model. We will install YOLO version 11; however, everything explained here also works for older and newer versions (released after January 2025) of YOLO. The YOLO algorithm can be used for the standard computer vision operations: object detection, segmentation, classification, and pose estimation, as well as for other computer vision tasks. The YouTube tutorial is given below.

Here is a brief demonstration of the performance of the algorithm. We randomly placed several objects on a table and took a photo of the scene with a phone camera. Note that this is a real, raw photo and not an overly processed photo found on the internet. The raw image is given below.

The image produced by YOLO is given below.

Prerequisites

To follow this tutorial, you need a Windows computer with Python installed. For GPU inference, you also need an NVIDIA GPU with a recent driver; if you do not have one, YOLO will still run on the CPU, only slower.

Installation instructions

First, open the Windows Command Prompt and check the Python version

python --version 

If Python is installed, you should see the Python version. Next, create the workspace folder

cd\
mkdir testYolo
cd testYolo

Create a Python virtual environment and activate it

python -m venv env1
env1\Scripts\activate.bat
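
If you use PowerShell instead of the Command Prompt, the batch file above will not work; the virtual environment created by venv also contains a PowerShell activation script, shown below.

env1\Scripts\Activate.ps1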

Install the necessary libraries

pip install setuptools

Then install PyTorch with CUDA support. Visit the official PyTorch website (https://pytorch.org/get-started/locally/) to generate the installation command for your system and CUDA version (if you do not have an NVIDIA GPU, select the CPU option instead). Then run the generated command; in our case, for CUDA 12.4, it is:

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
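
After the installation finishes, it is a good idea to verify that PyTorch can actually see the GPU. Below is a minimal sketch of such a check; if it prints False, the installation is still usable, but YOLO will run on the CPU.

import torch

# True if PyTorch was built with CUDA support and a compatible GPU driver is found
print(torch.cuda.is_available())

# Name of the first detected GPU (only meaningful if CUDA is available)
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))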

Then install YOLO. A good strategy is to install the latest development version directly from the main branch of the Ultralytics GitHub repository

pip install git+https://github.com/ultralytics/ultralytics.git@main
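
To confirm that the installation succeeded, you can run a quick sanity check from the Command Prompt; the one-liner below simply imports the library and prints its version.

python -c "import ultralytics; print(ultralytics.__version__)"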

Then copy a test image to the workspace folder (in the code below it is named test2.jpg) and save the test Python script given below (for example, as test.py).

from ultralytics import YOLO

# Load a pretrained model; available sizes, from smallest to largest:
# yolo11n.pt, yolo11s.pt, yolo11m.pt, yolo11l.pt, yolo11x.pt
# The weights are downloaded automatically the first time the script is run
model = YOLO("yolo11l.pt")

# Run inference on an image
results = model("test2.jpg")  # list of Results objects, one per image

# Process results list
for result in results:
    boxes = result.boxes  # Boxes object for bounding box outputs
    masks = result.masks  # Masks object for segmentation masks outputs
    keypoints = result.keypoints  # Keypoints object for pose outputs
    probs = result.probs  # Probs object for classification outputs
    obb = result.obb  # Oriented boxes object for OBB outputs
    #result.show()  # display to screen
    result.save(filename="result.jpg")  # save to disk
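
If you also want to print what was detected, the results object exposes the predicted class indices, confidences, and bounding boxes. Below is a minimal sketch that extends the loop above; it assumes the detection model from the example, so only the boxes are populated (masks, keypoints, and obb will be None).

from ultralytics import YOLO

model = YOLO("yolo11l.pt")
results = model("test2.jpg")

for result in results:
    # result.names maps class indices to human-readable class names
    for box in result.boxes:
        class_index = int(box.cls[0])           # predicted class index
        confidence = float(box.conf[0])         # prediction confidence
        x1, y1, x2, y2 = box.xyxy[0].tolist()   # bounding box corners in pixels
        print(f"{result.names[class_index]}: {confidence:.2f} "
              f"at ({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})")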