Basic Threading Example with the `threading` Module
This snippet demonstrates the basic use of Python's `threading` module: creating and starting threads that run a function concurrently. The example defines a simple function that prints a message and sleeps briefly to simulate work, then runs that function in several threads at once. The goal is a basic understanding of how threads provide concurrent execution.
Core Concepts: Thread Creation and Execution
The code defines a `worker` function, which represents the task each thread executes. The `main` function creates three `threading.Thread` objects, each configured to run `worker` with a unique thread ID. The `start()` method launches each thread, and the `join()` method makes the main thread wait for every created thread to complete before exiting.
```python
import threading
import time


def worker(thread_id):
    """Task executed by each thread."""
    print(f'Thread {thread_id}: Starting')
    time.sleep(2)  # Simulate some work
    print(f'Thread {thread_id}: Finishing')


def main():
    threads = []
    for i in range(3):
        # args must be a tuple, hence the trailing comma
        t = threading.Thread(target=worker, args=(i,))
        threads.append(t)
        t.start()

    for t in threads:
        t.join()  # Wait for all threads to complete

    print('All threads finished.')


if __name__ == "__main__":
    main()
```
Real-Life Use Case: Background Tasks
Imagine a GUI application that needs to perform a long-running task, such as downloading a large file or processing a complex image. Performing this task in the main thread would freeze the GUI until the task finished. Threading lets you offload the task to a background thread so the GUI stays responsive. Another common use case is running several independent I/O-bound operations, such as network requests, at the same time to reduce the total waiting time.
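The background-task pattern can be sketched without a GUI toolkit. In the minimal example below, the `while` loop stands in for a GUI event loop, and `long_running_task` and `run_with_responsive_main_loop` are hypothetical names chosen for illustration. A `threading.Event` lets the worker signal completion while the main thread keeps doing short units of work.

```python
import threading
import time


def long_running_task(result, done):
    """Hypothetical stand-in for a slow job such as a download."""
    time.sleep(0.5)  # Simulate slow I/O
    result["status"] = "complete"
    done.set()  # Tell the main thread the work is finished


def run_with_responsive_main_loop():
    result = {}
    done = threading.Event()
    worker = threading.Thread(target=long_running_task, args=(result, done))
    worker.start()

    ticks = 0
    # Stand-in for a GUI event loop: wait() returns False on timeout,
    # so the main thread wakes up regularly instead of blocking.
    while not done.wait(timeout=0.1):
        ticks += 1  # A real GUI would process user events here

    worker.join()
    return result["status"], ticks


if __name__ == "__main__":
    status, ticks = run_with_responsive_main_loop()
    print(f"Task {status}; main loop ran {ticks} times while waiting")
```

Real GUI frameworks usually have their own rules about which thread may touch widgets, so treat this only as an outline of the hand-off between the two threads.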
Best Practices: Proper Thread Management
Manage threads carefully to avoid race conditions and deadlocks. Use synchronization primitives such as locks (`threading.Lock`) or semaphores (`threading.Semaphore`) when multiple threads access shared resources. Handle exceptions inside each thread's target function: an uncaught exception ends that thread while the rest of the program keeps running, so the failure is easy to miss. Call `join()` on the threads you start so the program knows when their work is done. Note that `join()` only waits for a thread; it does not stop one.
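As a minimal sketch of these practices, the example below protects a shared counter with a `threading.Lock` and records an exception raised inside a thread instead of letting it pass unnoticed. The function names and the `errors` list are illustrative assumptions, not part of the original example.

```python
import threading


def increment_many(counter, lock, n):
    """Add 1 to the shared counter n times."""
    for _ in range(n):
        with lock:  # Only one thread at a time may update the counter
            counter["value"] += 1


def failing_worker(errors, lock):
    """Catch the exception inside the thread and report it to the caller."""
    try:
        raise ValueError("simulated failure")
    except Exception as exc:
        with lock:
            errors.append(exc)


def run_demo(num_threads=4, n=10_000):
    counter = {"value": 0}
    errors = []
    lock = threading.Lock()

    threads = [
        threading.Thread(target=increment_many, args=(counter, lock, n))
        for _ in range(num_threads)
    ]
    threads.append(threading.Thread(target=failing_worker, args=(errors, lock)))

    for t in threads:
        t.start()
    for t in threads:
        t.join()  # Wait so the results below are final

    return counter["value"], errors


if __name__ == "__main__":
    total, errors = run_demo()
    print(f"Counter: {total}, errors captured: {len(errors)}")
```

Holding the lock only around the update keeps the critical section small, which limits how long other threads have to wait.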
Interview Tip: The Global Interpreter Lock (GIL)
Be prepared to discuss the Global Interpreter Lock (GIL). In the standard CPython build, the GIL allows only one thread to execute Python bytecode at any given time. As a result, the threads of a single process cannot run Python bytecode in parallel, even on multi-core processors. This limitation primarily affects CPU-bound tasks. I/O-bound tasks still benefit from threading because a thread releases the GIL while it waits for an I/O operation to complete, which lets other threads run. Understanding the GIL is crucial for making informed decisions about when to use threading versus multiprocessing.
When to Use Threads
Use threads when dealing with I/O-bound tasks, such as network requests, file I/O, or waiting for external processes. In these scenarios, threads can release the GIL while waiting for I/O operations, allowing other threads to run. Threads are generally easier to manage than processes, especially when sharing data. However, for CPU-bound tasks, consider using the `multiprocessing` module to bypass the GIL and achieve true parallelism.
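The benefit for I/O-bound work can be illustrated with a rough timing sketch. It uses `ThreadPoolExecutor` from the standard library's `concurrent.futures` module (a convenience layer over threads that this article does not otherwise cover), and `fake_io_call` is a hypothetical stand-in that sleeps instead of performing real I/O.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_call(item):
    """Hypothetical I/O operation: the sleep simulates waiting on a network."""
    time.sleep(0.2)
    return item * 2


def fetch_all(items, max_workers=4):
    # The pool runs the calls in worker threads; map() yields results
    # in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_io_call, items))


def timed_fetch(items):
    start = time.perf_counter()
    results = fetch_all(items)
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = timed_fetch([1, 2, 3, 4])
    print(f"Results: {results} in {elapsed:.2f}s")
```

Run one after another, the four calls would take about 0.8 seconds. Because the waits overlap, the threaded version should finish in roughly the time of a single call, although exact timings vary by machine.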
Memory Footprint
Threads generally have a smaller memory footprint compared to processes, as they share the same memory space. However, this shared memory space also introduces the need for synchronization mechanisms to prevent data corruption.
Alternatives: `multiprocessing` and `asyncio`
For CPU-bound tasks, the `multiprocessing` module provides true parallelism by creating separate processes, each with its own Python interpreter and memory space. For concurrent I/O-bound operations, `asyncio` is a single-threaded concurrency framework that provides asynchronous programming using coroutines and an event loop. `asyncio` can achieve high concurrency without the overhead of multiple threads or processes.
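For comparison, here is a minimal `asyncio` sketch of three concurrent tasks, loosely mirroring the earlier threaded `worker`. The coroutine names and delays are illustrative assumptions; `asyncio.sleep` stands in for a real non-blocking I/O call.

```python
import asyncio


async def worker(task_id, delay):
    """Coroutine that yields to the event loop while it waits."""
    await asyncio.sleep(delay)  # Simulated non-blocking I/O
    return f"Task {task_id}: done"


async def run_all():
    # gather() runs the coroutines concurrently on one thread and
    # returns their results in the order they were passed in.
    return await asyncio.gather(*(worker(i, 0.1) for i in range(3)))


if __name__ == "__main__":
    for line in asyncio.run(run_all()):
        print(line)
```

All three waits overlap on a single thread, so no locks are needed here. The trade-off is that every blocking call has to be replaced with an awaitable equivalent.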
FAQ
- What is the difference between a thread and a process?
  A process is an independent execution environment with its own memory space, while a thread is a lightweight execution unit within a process, sharing the same memory space. Processes provide true parallelism, while threads are subject to the GIL in Python (limiting CPU-bound parallelism).
- How can I prevent race conditions in my threaded code?
  Use synchronization primitives like locks (`threading.Lock`), semaphores (`threading.Semaphore`), or condition variables (`threading.Condition`) to protect shared resources from concurrent access.
- What is the purpose of the `join()` method?
  The `join()` method blocks the calling thread until the thread whose `join()` method is called completes its execution. This ensures that the main thread waits for all spawned threads to finish before exiting the program.
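The blocking behavior of `join()` can be observed with its optional `timeout` argument and `Thread.is_alive()`. This is a small sketch with hypothetical function names and arbitrary durations.

```python
import threading
import time


def slow_task(duration):
    time.sleep(duration)  # Simulate work


def join_with_timeout():
    t = threading.Thread(target=slow_task, args=(0.5,))
    t.start()

    t.join(timeout=0.1)  # Give up waiting after 0.1 seconds
    still_running = t.is_alive()  # The thread has not finished yet

    t.join()  # No timeout: block until the thread completes
    finished = not t.is_alive()
    return still_running, finished


if __name__ == "__main__":
    still_running, finished = join_with_timeout()
    print(f"Alive after timed join: {still_running}; finished after join: {finished}")
```

A timed `join()` returns whether or not the thread has finished, so checking `is_alive()` afterwards is the way to tell the two cases apart.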