Python > Working with Data > File Formats > Working with Compressed Files (`gzip`, `bz2`, `zipfile` modules)
Working with Zip Archives
This snippet demonstrates how to create, read, and extract files from ZIP archives using the zipfile
module in Python. ZIP archives are a common way to bundle multiple files into a single compressed file.
Creating a Zip Archive
This part of the code imports the zipfile
module. It defines a list of filenames filenames
that will be added to the zip archive. It creates two dummy text files (file1.txt
and file2.txt
) with some content. The zipfile.ZipFile()
function is used in write mode ('w'
) to create a new zip archive named myarchive.zip
. The code then iterates through the list of filenames and uses the zf.write()
method to add each file to the archive.
import zipfile
filenames = ['file1.txt', 'file2.txt']
# Create some dummy files
with open('file1.txt', 'w') as f:
f.write('This is the content of file1.')
with open('file2.txt', 'w') as f:
f.write('This is the content of file2.')
with zipfile.ZipFile('myarchive.zip', 'w') as zf:
for filename in filenames:
zf.write(filename)
Extracting Files from a Zip Archive
This part of the code demonstrates how to extract all files from a zip archive. It imports the zipfile
module and opens the myarchive.zip
file in read mode ('r'
) using zipfile.ZipFile()
. The zf.extractall()
method extracts all files from the archive into a directory named extracted_files
. The directory will be created if it doesn't already exist.
import zipfile
with zipfile.ZipFile('myarchive.zip', 'r') as zf:
zf.extractall('extracted_files')
Reading the Contents of a Zip Archive
This part of the code demonstrates how to list the files inside a zip archive without extracting them. It opens the myarchive.zip
file in read mode. The zf.namelist()
method returns a list of strings, where each string is the name of a file or directory within the archive.
import zipfile
with zipfile.ZipFile('myarchive.zip', 'r') as zf:
print(zf.namelist())
Concepts Behind ZIP Archives
ZIP archives use DEFLATE as their default compression algorithm (though other algorithms can be used). They support storing multiple files and directories within a single file. ZIP archives also support encryption and password protection.
Real-Life Use Case Section
ZIP archives are commonly used for distributing software, archiving documents, and transferring files. They are supported by most operating systems and file archiving tools.
Best Practices
with
statements ensures automatic closing.
Interview Tip
Be prepared to discuss the different modes for opening a zip file (read, write, append). Understand the differences between extract()
and extractall()
.
When to Use ZIP
Use ZIP when you need to bundle multiple files into a single archive, especially when compatibility with different operating systems and tools is important.
Memory Footprint
The memory footprint of zipfile operations depends on the size of the files being processed. Extracting large files may require significant memory.
Alternatives
Alternatives to ZIP include tar archives (often combined with gzip or bzip2 for compression) and 7z archives (which offer higher compression ratios).
Pros
Cons
FAQ
-
How do I add a file to an existing zip archive?
You can open the zip archive in append mode ('a'
) and use thezf.write()
method to add the new file. Example:with zipfile.ZipFile('myarchive.zip', 'a') as zf: zf.write('newfile.txt')
-
How do I extract a specific file from a zip archive?
You can use thezf.extract()
method to extract a specific file. Provide the filename and optionally the destination directory as arguments. Example:with zipfile.ZipFile('myarchive.zip', 'r') as zf: zf.extract('file1.txt', 'extracted_files')