
2 Easy Ways to Use a Parallel For Loop in Python


In this tutorial, we will learn about parallel for loops in Python. You will see how to run a parallel for loop with two easy-to-follow examples: one using the multiprocessing module and one using Joblib.

Introduction

A parallel for loop is a loop whose iterations are executed concurrently. This stands in contrast to a traditional for loop, where the iterations are executed sequentially, one after the other. The appeal of a parallel for loop is its potential for significant performance improvements.

This is particularly true when each iteration of the loop is independent and can be executed simultaneously without waiting for the previous iteration to complete.

Running iterations concurrently makes better use of the available CPU cores and can reduce the total run time of the loop, in the best case to roughly the run time of the loop divided by the number of workers.
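
To make the independence requirement concrete, here is a minimal sketch that contrasts a loop whose iterations are independent with one whose iterations are not. The function names squares() and running_totals() are illustrative only and are not part of any library.

```python
def squares(numbers):
    # Each iteration uses only its own input, so the iterations could run
    # in any order, or at the same time, and still give the same result.
    return [n * n for n in numbers]


def running_totals(numbers):
    # Each iteration needs the total from the previous one, so this loop
    # cannot be split into independent pieces as written.
    totals = []
    total = 0
    for n in numbers:
        total += n
        totals.append(total)
    return totals


print(squares([1, 2, 3, 4]))         # [1, 4, 9, 16]
print(running_totals([1, 2, 3, 4]))  # [1, 3, 6, 10]
```

Only the first kind of loop is a straightforward candidate for the techniques in this tutorial.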

Understanding Python’s Global Interpreter Lock (GIL) Before Using a Parallel For Loop

The Global Interpreter Lock (GIL) is a feature of CPython, the most widely used Python interpreter.

It is a mechanism that ensures only one thread executes Python bytecode at a time, even on multi-core systems. This synchronization is necessary because CPython’s memory management is not thread-safe.

The GIL has profound implications for multi-threaded Python programs.

Even when a program starts multiple threads, the GIL allows only one of them to execute Python bytecode at a time, so threads running Python code inside a single process cannot make full use of multiple cores.

This is a significant limitation for applications that are CPU-bound, meaning their performance is primarily limited by the speed of the CPU rather than I/O operations or network latency.

However, it is important to note that the GIL does not rule out concurrency in all scenarios.

For I/O-bound tasks, where the program spends most of its time waiting for data from the network or a disk, Python threads can be effective. CPython releases the GIL while a thread is blocked on I/O, so other threads can keep running while that thread waits.
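
The point about I/O-bound work can be illustrated with a short sketch. Threads are not one of the two approaches this tutorial focuses on; the example only shows why the GIL is not a problem when tasks mostly wait. The function fake_download() is a hypothetical stand-in, and time.sleep() is assumed to behave like a blocking network call.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def fake_download(item):
    # time.sleep stands in for a blocking network or disk call; the GIL is
    # released while the thread waits, so other threads can run.
    time.sleep(0.2)
    return f"downloaded {item}"


def download_all(items):
    # executor.map returns the results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=5) as executor:
        return list(executor.map(fake_download, items))


start = time.perf_counter()
results = download_all(["a", "b", "c", "d", "e"])
elapsed = time.perf_counter() - start

print(results)
print(f"Elapsed: {elapsed:.2f} seconds")
```

On a typical machine the five waits overlap, so the elapsed time should be close to 0.2 seconds rather than the 1 second a sequential loop would need.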

Moreover, Python provides ways to bypass the GIL and achieve true parallelism, especially for CPU-bound tasks.

One approach is to use multiple processes instead of threads.

Each Python process has its own Python interpreter and memory space, so the GIL of one interpreter does not affect the execution of another. The multiprocessing module in Python’s standard library makes it easy to create multi-process applications.
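
As a rough illustration of the statement that every worker is a separate process, the sketch below asks each worker for its process ID and compares it with the parent's. The helper names worker_pid() and collect_worker_pids() are made up for this example.

```python
from multiprocessing import Pool
import os


def worker_pid(_):
    # Runs inside whichever process executes the task and reports its ID.
    return os.getpid()


def collect_worker_pids(n_workers=2, n_tasks=8):
    # Sends n_tasks small tasks to a pool and gathers the distinct worker IDs.
    with Pool(n_workers) as pool:
        return set(pool.map(worker_pid, range(n_tasks)))


if __name__ == "__main__":
    parent = os.getpid()
    workers = collect_worker_pids()
    print(f"Parent process: {parent}")
    print(f"Worker processes: {sorted(workers)}")
    print(f"Parent among workers: {parent in workers}")  # False
```

Because the worker IDs differ from the parent's, each worker runs its own interpreter with its own GIL.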

Python Parallel For Loop Using Multiprocessing

Multiprocessing in Python is a powerful technique that allows for the execution of multiple processes concurrently.

This approach can be particularly useful when dealing with Python for loops, where each iteration of the loop is independent and can be executed simultaneously. This is often referred to as a parallel for loop in Python.

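The following code demonstrates how to use the multiprocessing module to run a parallel for loop.

The function square(n) calculates the square of a number and stands in for a time-consuming task by sleeping for one second. Note that time.sleep() only waits and does not keep the CPU busy, so the example shows the mechanics of a parallel loop rather than a real CPU-bound workload.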

from multiprocessing import Pool
import time

def square(n):
    time.sleep(1)  # Simulate a time-consuming task
    return n * n

The execute_task_parallel() function creates a pool of five worker processes and uses the pool's map() method to distribute the input range across them.

This is where the for loop is effectively parallelized.

Each worker process executes the square(n) function on a different value of n, and the results are collected into a Python list.

def execute_task_parallel():
    with Pool(5) as p:
        return p.map(square, range(1, 6))
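
One detail worth knowing about Pool.map() is that it returns results in the order of the inputs, even if the tasks finish in a different order. The sketch below is a small variation on the example above; slow_square() and squares_in_parallel() are hypothetical names, and the sleep times are chosen so that later inputs finish first.

```python
from multiprocessing import Pool
import time


def slow_square(n):
    # Larger inputs sleep for less time, so tasks finish out of input order.
    time.sleep(0.4 / n)
    return n * n


def squares_in_parallel(numbers):
    # map() blocks until every task is done and keeps the input order.
    with Pool(4) as pool:
        return pool.map(slow_square, numbers)


if __name__ == "__main__":
    print(squares_in_parallel([1, 2, 3, 4]))  # [1, 4, 9, 16]
```

This is why the results can later be printed next to their inputs without any extra bookkeeping.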

For comparison, the execute_task_sequential() function performs the same calculations but does so sequentially, without using multiple processes. This function represents a traditional for loop, where each iteration is executed one after the other.

def execute_task_sequential():
    return [square(i) for i in range(1, 6)]

The measure_time() function is used to measure the execution time of both the parallel and sequential tasks.

It takes as arguments the function to execute and a string representing the type of execution.

It prints the results of the calculations and the total execution time, allowing for a direct comparison of the performance of the parallel and sequential approaches.

def measure_time(execution_function, execution_type):
    start_time = time.time()
    results = execution_function()
    end_time = time.time()
    
    print(f"{execution_type} execution:")
    for i, result in enumerate(results, start=1):
        print(f"Square of {i}: {result}")
    print(f"Total time: {end_time - start_time}")
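
If you want to reuse the measurement instead of printing it, a variation like the following may be handy. It is only a sketch: the name timed() is hypothetical, and it uses time.perf_counter() instead of time.time() because perf_counter() is a monotonic clock intended for measuring short intervals.

```python
import time


def timed(func, *args, **kwargs):
    # Calls func with the given arguments and returns (result, seconds).
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


result, elapsed = timed(sum, [1, 2, 3])
print(result)  # 6
print(f"{elapsed:.6f} seconds")
```

Returning the elapsed time makes it easy to compare two runs in code, for example by dividing the sequential time by the parallel time to get the speedup.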

Below is the complete Python script to test out the code:

from multiprocessing import Pool
import time

def square(n):
    time.sleep(1)  # Simulate a time-consuming task
    return n * n

def execute_task_parallel():
    with Pool(5) as p:
        return p.map(square, range(1, 6))
    
def execute_task_sequential():
    return [square(i) for i in range(1, 6)]

def measure_time(execution_function, execution_type):
    start_time = time.time()
    results = execution_function()
    end_time = time.time()
    
    print(f"{execution_type} execution:")
    for i, result in enumerate(results, start=1):
        print(f"Square of {i}: {result}")
    print(f"Total time: {end_time - start_time}")

def main():
    measure_time(execute_task_sequential, "Sequential")
    measure_time(execute_task_parallel, "Parallel")

if __name__ == "__main__":
    main()

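We have added a main() function that calls measure_time() on both the sequential and the parallel version. The if __name__ == "__main__": guard is required here, not just good style: on platforms where worker processes are started with the spawn method (the default on Windows and macOS), each worker imports the script again, and without the guard that import would try to create a new pool.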

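When this Python script is run, the output shows the time saved by executing the tasks in parallel: the sequential run takes about five seconds (five one-second sleeps back to back), while the parallel run finishes in a little over one second, the extra being the cost of starting the worker processes.

[Image: output of the parallel for loop in Python using multiprocessing]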
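
The timings follow from simple arithmetic, which can be written down as a small helper. This is an idealized sketch: the function ideal_parallel_seconds() is hypothetical, it assumes every task takes the same time, and it ignores the overhead of starting processes and passing data between them.

```python
import math


def ideal_parallel_seconds(n_tasks, seconds_per_task, n_workers):
    # Workers handle tasks in "waves"; each wave takes one task duration.
    waves = math.ceil(n_tasks / n_workers)
    return waves * seconds_per_task


print(ideal_parallel_seconds(5, 1, 1))  # 5 (sequential: one worker)
print(ideal_parallel_seconds(5, 1, 5))  # 1 (one worker per task)
print(ideal_parallel_seconds(5, 1, 2))  # 3 (three waves with two workers)
```

Measured times will be somewhat higher than these figures, but the estimate shows why adding workers beyond the number of tasks brings no further gain.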

Parallel For Loop Using Joblib in Python

Joblib is a third-party Python library (installable with pip install joblib) that provides tools for pipelining Python jobs and has built-in support for parallelism. It is particularly useful for tasks that are independent and can be run simultaneously, such as the iterations of a for loop, which makes it another way to write a parallel for loop.

The following code demonstrates how to use Joblib to run a parallel for loop in Python.

As in the previous section, square(n) calculates the square of a number and simulates a time-consuming task with a one-second sleep.

from joblib import Parallel, delayed
import time

def square(n):
    time.sleep(1)  # Simulate a time-consuming task
    return n * n

The execute_task_parallel() function uses Joblib's Parallel class and delayed() function to distribute the input range across five workers (n_jobs=5). The expression delayed(square)(i) does not call square immediately; it records the function and its argument so that a worker can make the call later. Each worker executes square(n) on a different value of n, and the results are collected into a list.

This is where the for loop is effectively parallelized.

def execute_task_parallel():
    return Parallel(n_jobs=5)(delayed(square)(i) for i in range(1, 6))
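
The delayed(square)(i) pattern can look odd at first. The sketch below mimics the idea using only the standard library: the call is recorded as data and performed later by a worker. The names my_delayed() and run_all() are hypothetical, threads are used only to keep the example short, and this is not a description of how Joblib is implemented internally.

```python
from concurrent.futures import ThreadPoolExecutor


def my_delayed(func):
    # Returns a wrapper that records the call instead of performing it.
    def record_call(*args, **kwargs):
        return func, args, kwargs
    return record_call


def run_all(tasks, n_workers=5):
    # Performs each recorded call on a worker thread; results keep task order.
    with ThreadPoolExecutor(max_workers=n_workers) as executor:
        futures = [executor.submit(f, *a, **kw) for f, a, kw in tasks]
        return [future.result() for future in futures]


def square(n):
    return n * n


tasks = [my_delayed(square)(i) for i in range(1, 6)]
print(tasks[0][1])     # (1,) : the argument was stored, square has not run yet
print(run_all(tasks))  # [1, 4, 9, 16, 25]
```

Seen this way, the generator expression passed to Parallel is simply a stream of recorded calls for the workers to carry out.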

For comparison, the execute_task_sequential() function performs the same calculations but does so sequentially, without using multiple workers.

This function represents a traditional for loop, where each iteration is executed one after the other.

def execute_task_sequential():
    return [square(i) for i in range(1, 6)]

In this example, we also define a measure_time() function to measure the execution time of both the parallel and sequential tasks. The complete script to test the execution times for both sequential and parallel executions is shown below:

from joblib import Parallel, delayed
import time

def square(n):
    time.sleep(1)  # Simulate a time-consuming task
    return n * n

def execute_task_parallel():
    return Parallel(n_jobs=5)(delayed(square)(i) for i in range(1, 6))

def execute_task_sequential():
    return [square(i) for i in range(1, 6)]

def measure_time(execution_function, execution_type):
    start_time = time.time()
    results = execution_function()
    end_time = time.time()
    
    print(f"{execution_type} execution:")
    for i, result in enumerate(results, start=1):
        print(f"Square of {i}: {result}")
    print(f"Total time: {end_time - start_time}")

def main():
    measure_time(execute_task_sequential, "Sequential")
    measure_time(execute_task_parallel, "Parallel")

if __name__ == "__main__":
    main()

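As with multiprocessing, the output shows the time saved by executing the tasks in parallel: about five seconds for the sequential run versus a little over one second for the parallel run, plus the time Joblib needs to start its workers.

[Image: output of the parallel for loop in Python using Joblib]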

Parallelization is an important concept in modern computing and programming that can lead to significant performance improvements.

Despite the limitations imposed by the GIL, there are still effective ways to run Python code in parallel, and the multiprocessing module and Joblib are two of the easiest to start with. We encourage you to explore these techniques and experiment with them in your projects.