Comparing Threads and Processes for CPU-Bound Tasks
This example demonstrates the difference between using threads and processes for CPU-bound tasks. We'll compare their performance when computing Fibonacci numbers with a deliberately expensive recursive implementation.
Concepts Behind the Snippet
This snippet illustrates the crucial distinction between threads and processes for CPU-intensive operations in Python. The Global Interpreter Lock (GIL) in CPython (the standard Python implementation) allows only one thread to hold control of the Python interpreter at any given time. This means that true parallel execution of Python bytecode by multiple threads is not possible for CPU-bound tasks. Processes, on the other hand, bypass the GIL because each process has its own Python interpreter and memory space. Using multiple processes therefore achieves true parallelism on multi-core processors.
Code: Calculating Fibonacci Numbers Sequentially
This code calculates Fibonacci numbers sequentially. It's our baseline for comparison. It iterates from 30 to 34 and calculates the Fibonacci number for each. The execution time is recorded to provide a reference.
import time

def fibonacci(n):
    if n <= 1:
        return n
    else:
        return fibonacci(n-1) + fibonacci(n-2)

if __name__ == '__main__':
    start_time = time.time()
    for i in range(30, 35):
        print(f'Fibonacci({i}) = {fibonacci(i)}')
    end_time = time.time()
    print(f'Sequential Execution Time: {end_time - start_time:.4f} seconds')
Code: Using Threads (Inefficient for CPU-Bound)
This code uses threads to calculate the Fibonacci numbers. Due to the GIL, the threads largely execute in sequence rather than in parallel, so there is minimal performance improvement over the sequential version (it may even be slower because of thread-management overhead). Each thread is assigned one Fibonacci number to calculate. `threading.Thread` creates and starts each thread, and `thread.join()` makes the main thread wait for all of them to complete before the elapsed time is measured. The thread name is printed to help monitor execution.
import threading
import time

def fibonacci(n):
    if n <= 1:
        return n
    else:
        return fibonacci(n-1) + fibonacci(n-2)

def calculate_fibonacci(n):
    print(f'Thread {threading.current_thread().name}: Fibonacci({n}) = {fibonacci(n)}')

if __name__ == '__main__':
    start_time = time.time()
    threads = []
    for i in range(30, 35):
        thread = threading.Thread(target=calculate_fibonacci, args=(i,), name=f'Thread-{i}')
        threads.append(thread)
        thread.start()
    for thread in threads:
        thread.join()
    end_time = time.time()
    print(f'Threads Execution Time: {end_time - start_time:.4f} seconds')
Code: Using Processes (Efficient for CPU-Bound)
This code utilizes multiple processes to perform the Fibonacci calculations. Because each process has its own Python interpreter, they can truly run in parallel across multiple CPU cores, yielding significantly faster execution than threads for CPU-bound tasks. `multiprocessing.Process` creates and starts each process, and a `multiprocessing.Queue` passes the calculated results back to the main process, since each process has its own memory space. The results are retrieved from the queue, sorted, and printed; sorting is needed because items arrive on the queue in whatever order the processes happen to finish, which is not deterministic.
import multiprocessing
import time

def fibonacci(n):
    if n <= 1:
        return n
    else:
        return fibonacci(n-1) + fibonacci(n-2)

def calculate_fibonacci(n, queue):
    result = fibonacci(n)
    queue.put((n, result))

if __name__ == '__main__':
    start_time = time.time()
    processes = []
    queue = multiprocessing.Queue()
    for i in range(30, 35):
        process = multiprocessing.Process(target=calculate_fibonacci, args=(i, queue))
        processes.append(process)
        process.start()
    # Drain the queue before joining: one result per process.
    # (queue.empty() is unreliable across processes, and joining a process
    # that still has queued data can deadlock for large payloads.)
    results = [queue.get() for _ in processes]
    for process in processes:
        process.join()
    results.sort()
    for n, result in results:
        print(f'Process: Fibonacci({n}) = {result}')
    end_time = time.time()
    print(f'Processes Execution Time: {end_time - start_time:.4f} seconds')
Real-Life Use Case
This concept is applicable to scenarios like image processing, scientific simulations, or any task that involves heavy computations where data does not need to be shared extensively between the parallel units. For instance, consider applying a complex filter to a large number of images. Using multiprocessing, each image can be processed in a separate process, effectively utilizing all available CPU cores and significantly reducing the overall processing time.
Best Practices
Prefer processes (or `ProcessPoolExecutor`) for CPU-bound work and threads (or `asyncio`) for I/O-bound work. Always guard process-spawning code with `if __name__ == '__main__':`, since the `spawn` start method re-imports the main module in each child. Keep the data passed between processes small, because every transfer is pickled, and favor the high-level `concurrent.futures` API over managing `Thread` and `Process` objects by hand.
Interview Tip
Be prepared to explain the GIL and its impact on multi-threaded Python programs. Understand the difference between concurrency and parallelism. Also, know when to use threads vs. processes and be able to articulate the advantages and disadvantages of each.
When to Use Them
Use threads for I/O-bound workloads (network calls, disk access) where the GIL is released while waiting, and for tasks that benefit from cheap shared state. Use processes for CPU-bound workloads that must scale across cores, accepting the extra cost of process startup and of serializing data between workers.
Memory Footprint
Processes generally have a larger memory footprint than threads because each process has its own memory space, including a copy of the Python interpreter and any loaded libraries. Threads, on the other hand, share the same memory space, making them more memory-efficient.
Alternatives
The `concurrent.futures` module provides `ThreadPoolExecutor` and `ProcessPoolExecutor`, which wrap threads and processes behind a common pool interface, and `asyncio` offers single-threaded concurrency for I/O-bound code. For numeric workloads, libraries such as NumPy release the GIL inside their C routines, so even threads can run such computations in parallel.
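As a sketch of how `concurrent.futures` lets you swap strategies behind one interface (reusing the same `fibonacci` function shown earlier; only the executor class changes between the two runs):

```python
import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

def run(executor_cls, label):
    start = time.time()
    with executor_cls() as ex:
        # map() distributes the inputs across the pool and preserves input order
        results = list(ex.map(fibonacci, range(30, 35)))
    print(f'{label}: {time.time() - start:.4f}s, Fibonacci(34) = {results[-1]}')

if __name__ == '__main__':
    run(ThreadPoolExecutor, 'Threads')     # serialized by the GIL
    run(ProcessPoolExecutor, 'Processes')  # true parallelism on multiple cores
```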
Pros and Cons: Threads
Pros:
- Lightweight to create, and all threads share one memory space, so no data needs to be copied between workers.
- Effective for I/O-bound tasks, since the GIL is released during blocking waits.
Cons:
- The GIL prevents parallel execution of Python bytecode, so CPU-bound tasks see no speedup.
- Shared mutable state requires explicit synchronization (e.g. locks) to avoid race conditions.
Pros and Cons: Processes
Pros:
- True parallelism on multiple cores, since each process has its own interpreter and its own GIL.
- Memory isolation makes crashes and data races between workers less likely.
Cons:
- Higher startup cost and a larger memory footprint per worker.
- Data must be serialized (pickled) for inter-process communication, which adds overhead.
FAQ
What is the GIL?
The Global Interpreter Lock (GIL) is a mechanism in CPython (the standard Python implementation) that allows only one thread to hold control of the Python interpreter at any given time. Only one thread can execute Python bytecode at a time, even on multi-core processors, which limits the ability of threads to achieve true parallelism for CPU-bound tasks.
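As a small aside, the interval at which CPython asks the running thread to release the GIL can be inspected and tuned (the 0.005-second default has held since Python 3.2):

```python
import sys

# How often (in seconds) CPython requests a thread switch
print(sys.getswitchinterval())  # 0.005 by default

# Raising it can reduce switching overhead in thread-heavy programs
sys.setswitchinterval(0.01)
```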
Why use threads at all if the GIL limits CPU-bound performance?
Threads are still useful for I/O-bound tasks, where a thread spends most of its time waiting for external operations (network requests, file reads/writes). During this waiting time the GIL is released, allowing other threads to run. Threads are also simpler to manage, have lower overhead than processes, and are ideal for tasks that benefit from shared memory.
How does `multiprocessing.Queue` work?
`multiprocessing.Queue` is a thread-safe and process-safe queue implementation specifically designed for inter-process communication (IPC). It allows you to safely pass data between different processes. When a process puts data into the queue, it's serialized and copied into the queue's buffer. When another process retrieves data from the queue, it's deserialized and copied into the recipient process's memory space. This ensures that each process has its own independent copy of the data.