Basic Threading Example: Counting with Threads
This example demonstrates the fundamental usage of threads in Python using the `threading` module. It creates multiple threads, each responsible for incrementing a shared counter. This illustrates how threads can execute concurrently, potentially leading to race conditions if not managed properly with synchronization mechanisms.
Code Snippet
The code defines a `Counter` class with an `increment` method that simulates some work using `time.sleep` before incrementing the counter. Five threads are created, each executing the `worker` function, which increments the counter 1000 times. The `join()` method makes the main thread wait for all worker threads to complete before printing the final counter value. Each thread sleeps for about 10 seconds in total (1000 × 0.01 s), so the script takes roughly that long to run.

Because no lock protects the counter, the final value is not guaranteed to be 5000. `self.value += 1` is a read-modify-write sequence rather than a single atomic step, so two threads can read the same old value and one of the updates is lost. How often this happens depends on the interpreter version and on thread scheduling, so a given run may still print 5000. This is why the advanced code snippet uses locks to synchronize threads' access to shared resources.
import threading
import time

class Counter:
    def __init__(self):
        self.value = 0

    def increment(self):
        time.sleep(0.01)  # Simulate some work
        self.value += 1

counter = Counter()

def worker():
    for _ in range(1000):
        counter.increment()

threads = []
for i in range(5):
    t = threading.Thread(target=worker)
    threads.append(t)
    t.start()

for t in threads:
    t.join()

print(f"Final counter value: {counter.value}")
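The snippet above leaves the counter unprotected. As a minimal sketch of the lock-based fix the text mentions, the version below guards the increment with a `threading.Lock`. The names `SafeCounter` and `run_safe_count` are made up for this illustration, and the `time.sleep` call is dropped so the example finishes quickly.

```python
import threading


class SafeCounter:
    """Counter whose increment is protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may run the read-modify-write below.
        with self._lock:
            self.value += 1


def run_safe_count(num_threads=5, increments_per_thread=1000):
    counter = SafeCounter()

    def worker():
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(f"Final counter value: {run_safe_count()}")
```

With the lock in place, every increment is applied, so the result equals the number of threads times the increments per thread.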
Concepts Behind the Snippet
This snippet showcases basic thread creation and execution. Key concepts include:
- `threading` Module: Python's built-in module for creating and managing threads.
- `Thread` Class: Represents a thread of execution.
- `start()` Method: Starts the thread's execution.
- `join()` Method: Waits for the thread to complete its execution.
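The four pieces listed above fit together roughly as follows. This is a small sketch rather than part of the original snippet: the `greet` function, its arguments, and the thread name `"greeter"` are invented for illustration.

```python
import threading

results = []  # filled in by the worker thread


def greet(name, times):
    # Runs inside the worker thread. Only this one thread writes to
    # `results`, so no lock is needed in this sketch.
    for i in range(times):
        results.append(f"{name}-{i}")


# Thread: describes what to run, but does not run it yet.
t = threading.Thread(target=greet, args=("worker", 3), name="greeter")
alive_before_start = t.is_alive()  # False: not started yet

t.start()  # start(): begin executing greet("worker", 3) in a new thread
t.join()   # join(): block the main thread until greet() has returned

alive_after_join = t.is_alive()  # False: the thread has finished
print(results)
```

Reading `results` is only safe here because `join()` has already returned, which means the worker can no longer be modifying the list.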
Real-Life Use Case
Imagine downloading multiple files simultaneously. Each file download can be handled by a separate thread, allowing the application to continue responding to user input while the downloads proceed in the background. Another use case is in web servers, where each incoming request can be handled by a separate thread, increasing the server's throughput.
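The download scenario can be sketched without touching the network. In the example below, `fake_download` is a hypothetical stand-in that sleeps instead of fetching anything, and the `example.com` URLs are placeholders. A real program would call an HTTP client at that point.

```python
import threading
import time


def fake_download(url, results, lock):
    # Hypothetical stand-in for a real download: sleep instead of network I/O.
    time.sleep(0.05)
    with lock:
        results[url] = f"contents of {url}"


def download_all(urls):
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=fake_download, args=(url, results, lock))
        for url in urls
    ]
    for t in threads:
        t.start()
    # The main thread could keep handling user input here while downloads run.
    for t in threads:
        t.join()
    return results


urls = [
    "https://example.com/a",
    "https://example.com/b",
    "https://example.com/c",
]
downloaded = download_all(urls)
print(sorted(downloaded))
```

The lock around the dictionary write is a conservative choice: it keeps the sketch correct even if the per-download bookkeeping later grows beyond a single assignment.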
Best Practices
Interview Tip
Be prepared to explain the difference between threads and processes, the challenges of concurrent programming (e.g., race conditions, deadlocks), and how to use synchronization primitives to solve these challenges. Understand the Global Interpreter Lock (GIL) and its impact on CPU-bound tasks in Python.
When to Use Threads
Threads are suitable for I/O-bound tasks, such as network requests or file I/O, where the threads spend most of their time waiting for external resources. They are less effective for CPU-bound tasks in Python due to the Global Interpreter Lock (GIL), which prevents multiple threads from executing Python bytecode simultaneously.
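To make the I/O-bound case concrete, the sketch below compares running four waiting tasks one after another with running them in four threads. `time.sleep` stands in for waiting on a socket or disk. It releases the GIL while sleeping, so the waits overlap. The function names and the 0.2-second delay are arbitrary choices for this illustration, and exact timings vary by machine.

```python
import threading
import time


def io_task():
    time.sleep(0.2)  # stand-in for waiting on a socket or disk


def run_sequential(n):
    start = time.perf_counter()
    for _ in range(n):
        io_task()
    return time.perf_counter() - start


def run_threaded(n):
    start = time.perf_counter()
    threads = [threading.Thread(target=io_task) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


sequential_time = run_sequential(4)
threaded_time = run_threaded(4)
print(f"sequential: {sequential_time:.2f}s, threaded: {threaded_time:.2f}s")
```

The sequential run takes about four times the single-task delay, while the threaded run takes close to one delay. Replacing the sleep with a pure-Python CPU loop would not show the same gain under the GIL.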
Memory Footprint
Threads typically have a smaller memory footprint than processes, as they share the same memory space. This makes them more efficient for tasks that require frequent data sharing.
Alternatives
Alternatives to threads include:
- `multiprocessing` module to create separate processes, which bypasses the GIL and allows for true parallelism on CPU-bound tasks.
- `asyncio` to write concurrent code using coroutines, which are lightweight and efficient for I/O-bound tasks.
- `concurrent.futures` to manage a pool of threads, which can improve performance by reusing threads for multiple tasks.
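Of the alternatives above, `concurrent.futures` is the closest to plain threads. The sketch below uses `ThreadPoolExecutor` with two workers. `fetch_length` is a hypothetical task whose sleep stands in for a network call.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def fetch_length(name):
    # Hypothetical I/O-bound task: the sleep stands in for a network call.
    time.sleep(0.05)
    return len(name)


names = ["alpha", "beta", "gamma", "delta"]

# Two worker threads are reused across the four tasks.
with ThreadPoolExecutor(max_workers=2) as pool:
    # map() yields results in input order, regardless of completion order.
    lengths = list(pool.map(fetch_length, names))

print(lengths)  # [5, 4, 5, 5]
```

Leaving the `with` block waits for all submitted tasks, so no explicit `join()` calls are needed.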
Pros
Cons
FAQ
- What is a race condition?
A race condition occurs when multiple threads access and modify shared data concurrently, and the final outcome depends on the unpredictable order in which the threads execute. This can lead to data corruption or unexpected behavior.
- What is the Global Interpreter Lock (GIL)?
The Global Interpreter Lock (GIL) is a mutex that allows only one thread to hold control of the Python interpreter at any given time. This means that only one thread can execute Python bytecode at a time, limiting true parallelism for CPU-bound tasks in CPython. Other Python implementations (like Jython or IronPython) may not have this limitation.
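The lost-update pattern behind a race condition can be reproduced deterministically. In the sketch below, a `threading.Barrier` is used purely as a teaching device: it forces every thread to finish reading the shared value before any thread writes it back. Real races arise from the scheduler without any such help, and the names `shared` and `unsafe_increment` are invented for this example.

```python
import threading

NUM_THREADS = 5
shared = {"value": 0}
barrier = threading.Barrier(NUM_THREADS)


def unsafe_increment():
    current = shared["value"]      # 1. read the shared value
    barrier.wait()                 # wait until every thread has read it
    shared["value"] = current + 1  # 2. write back a result based on stale data


threads = [threading.Thread(target=unsafe_increment) for _ in range(NUM_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared["value"])  # 1, not 5: four of the five updates were lost
```

Every thread reads 0 and writes 1, so five increments collapse into one. Holding a lock around the read and the write together would prevent this interleaving.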