
Going Full Monty with the Zip File - Python

Python is a great choice for anyone wanting to play with the increasingly popular ZIP or GZIP (not covered here) file formats, and as usual Python makes it surprisingly fun and easy! Don't believe me? In this article we'll look at creating, extracting, and adding to Zip archives using Python's standard zipfile module, defining a set of functions you can use with your own Zip files, and ending with an example which recursively scans a Zip file and its sub-archives.

TABLE OF CONTENTS:
  1. Python UnZipped
  2. Going Full Monty with the Zip File
  3. Listings in the Key of Zip
By: Mark Lee Smith
January 08, 2004

You guessed it, we're going to unzip them. (We're using our file.txt and file.gif sample files again, just to make things easier to follow.)


    >>> import zipfile
    >>> zip = zipfile.ZipFile('Python.zip', 'r')
    >>> open('file.txt', 'wb').write(zip.read('file.txt'))
    >>> open('file.gif', 'wb').write(zip.read('file.gif'))
    >>> zip.close()

Note: Images are binary, and zip.read() hands back the raw bytes of the archived file, so I've used the 'wb' (write binary) flag for both files. On Python 3 this is required, since bytes cannot be written to a file opened in text mode.
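As an aside, if you don't have Python.zip to hand, the same idea can be tried with a throwaway archive. This is only a minimal sketch, assuming Python 3; the helper names make_sample_zip and read_members and the sample contents are made up for illustration and are not part of zipfile.

```python
import io
import zipfile

def make_sample_zip():
    """Build a small in-memory archive (hypothetical sample data)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, 'w') as archive:
        archive.writestr('file.txt', 'Hello from inside the Zip!')
        archive.writestr('file.gif', b'GIF89a')  # just a header, not a real image
    buffer.seek(0)
    return buffer

def read_members(fileobj, names):
    """Return a dict mapping each requested name to its raw bytes."""
    with zipfile.ZipFile(fileobj, 'r') as archive:
        return {name: archive.read(name) for name in names}

contents = read_members(make_sample_zip(), ['file.txt', 'file.gif'])
print(contents['file.txt'])
```

The with statement closes the archive for us, which saves the explicit zip.close() call.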

OK, we just extracted two files from our Zip, and in only five lines! This example is fine if you know the names of the files you want to extract, but what if you don't?


    #!/usr/bin/env python

    import zipfile

    def inzip(filename, zip):
        # True if filename is one of the entries in the open archive
        return filename in zip.namelist()

    if __name__ == '__main__':

        zip = zipfile.ZipFile('Python.zip', 'r')
        print(inzip('file.txt', zip))
        zip.close()

Short and sweet just like its name, this function simply returns True or False (True in the example above) depending on whether the string filename is in the list of files in the Zip.
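Here is one way to try that check without a real archive on disk. It is a sketch only, assuming Python 3 and an in-memory sample archive. The second function, inzip_getinfo, is a hypothetical variant that leans on the fact that ZipFile.getinfo() raises KeyError for a name that isn't in the archive.

```python
import io
import zipfile

def inzip(filename, archive):
    """Same test as in the article: is filename listed in the archive?"""
    return filename in archive.namelist()

def inzip_getinfo(filename, archive):
    """Variant: getinfo() raises KeyError when the name is absent."""
    try:
        archive.getinfo(filename)
    except KeyError:
        return False
    return True

# Build a one-file sample archive in memory (made-up contents)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as archive:
    archive.writestr('file.txt', 'sample text')

buffer.seek(0)
with zipfile.ZipFile(buffer, 'r') as archive:
    found = inzip('file.txt', archive)
    missing = inzip('nothing.txt', archive)
    found_alt = inzip_getinfo('file.txt', archive)
    missing_alt = inzip_getinfo('nothing.txt', archive)

print(found, missing, found_alt, missing_alt)
```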

The namelist() method (along with its brothers and sisters, such as infolist() and getinfo()) provides information about a Zip file. namelist() itself returns a list of all the files within a Zip. For example:


    >>> zip.namelist()
    ['1.txt', '2.txt', '3.txt', 'file.gif', 'file.txt', 'folder/file.html', 'folder/']
    >>>
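When names alone aren't enough, infolist() returns a ZipInfo object per entry. The following is a small sketch, assuming Python 3 and an in-memory archive with made-up contents, that pairs each stored name with its uncompressed size.

```python
import io
import zipfile

# Build a two-file sample archive in memory (made-up contents)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as archive:
    archive.writestr('file.txt', 'twelve bytes')
    archive.writestr('folder/file.html', '<p>hi</p>')

buffer.seek(0)
with zipfile.ZipFile(buffer, 'r') as archive:
    # Each ZipInfo carries the stored name plus sizes and a timestamp
    summary = [(info.filename, info.file_size) for info in archive.infolist()]

for filename, size in summary:
    print(filename, size)
```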

You've checked the contents of the file and you want to get extracting. Rather than sitting there typing names into your Python shell one by one (which, let's face it, is pretty boring), I'm going to show you how to extract everything in one go.


    #!/usr/bin/env python

    import zipfile

    def unzip(zip):
        # Write every entry into the current directory under its own name
        for name in zip.namelist():
            open(name, 'wb').write(zip.read(name))

    if __name__ == '__main__':

        zip = zipfile.ZipFile('Python.zip', 'r')
        unzip(zip)
        zip.close()

This is fine for flat Zip files (those without subfolders), but it'd just barf all over the screen if one of the names included a non-existent directory. There are two choices:

  1. Remove everything before the filename. Simple, yes, but you could end up with two files named the same, and we all know what happens next.
  2. Create all the directories before unzipping our file, which is a lot safer, though it requires a little more work…
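To see why the first choice is risky, here is a small sketch (Python 3 assumed) that reports which names would clash once the directories are stripped. The function name flattened_collisions and the sample list are hypothetical. It uses posixpath because names inside a Zip use forward slashes.

```python
import posixpath
from collections import Counter

def flattened_collisions(names):
    """Return base names that more than one archive entry would map to."""
    counts = Counter(
        posixpath.basename(name)
        for name in names
        if not name.endswith('/')  # skip directory entries
    )
    return sorted(base for base, count in counts.items() if count > 1)

names = ['file.txt', 'folder/file.txt', 'folder/file.html', 'folder/']
print(flattened_collisions(names))
```

With this sample list, file.txt and folder/file.txt would both be written to file.txt, so the second would silently overwrite the first.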

Of course we're going for the second choice; not only is it the most interesting, it's also the most Pythonic!

To borrow from another TV snake (Blackadder): "I have a cunning plan!"


    #!/usr/bin/env python

    import os, zipfile

    def unzip(path, zip):

        # Local aliases for the os.path functions used below
        isdir = os.path.isdir
        join = os.path.join
        norm = os.path.normpath
        split = os.path.split

        for each in zip.namelist():
            # Names ending in '/' are directory entries, not files
            if not each.endswith('/'):
                root, name = split(each)
                directory = norm(join(path, root))
                if not isdir(directory):
                    os.makedirs(directory)
                open(join(directory, name), 'wb').write(zip.read(each))

    if __name__ == '__main__':

        zip = zipfile.ZipFile('Python.zip', 'r')
        unzip('', zip)
        zip.close()

Don't panic! This is a little more advanced than the other functions we've created so far, and there's actually quite a lot going on inside it, so we'll go through it step by step. You might have noticed the os module sitting at the core of this example too.

The first part of this function is pretty strange as functions go; basically all it does is create local names for some of the functions from os.path (a small performance tweak, since a local name is quicker to look up inside the loop than os.path.join).

Next we loop through each of the names in zip.namelist() and carry on only if the name isn't a directory (directory entries end with a forward slash).


    >>> for each in zip.namelist(): print(each)
    ...
    1.txt
    2.txt
    3.txt
    file.gif
    file.txt
    folder/file.html
    folder/

The path is split from the filename and the two parts are assigned to root and name. Our next line creates a variable named directory that holds the new path for the file, which is simply path and root joined and then normalised.
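The split, join and normalise steps are easier to follow with concrete values. In this sketch (Python 3 assumed) the helper name target_for is hypothetical, and posixpath, the Unix flavour of os.path, is used so the results look the same on every platform.

```python
import posixpath  # the Unix flavour of os.path, importable anywhere

def target_for(path, member):
    """Return (directory, full file path) for one archive member."""
    root, name = posixpath.split(member)                        # 'folder', 'file.html'
    directory = posixpath.normpath(posixpath.join(path, root))  # path + root, tidied up
    return directory, posixpath.join(directory, name)

print(target_for('', '1.txt'))
print(target_for('out', 'folder/file.html'))
```

Note how an empty path and an empty root normalise to '.', the current directory, which always exists, so no directory needs creating for top-level files.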

Note: This won't work with absolute paths like C:\Folder\Folder\File.ext; in this case os.path.join() discards path, so the file is extracted to that absolute location instead (tested on Windows). For this example I'm assuming that absolute paths won't be used.
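The behaviour described in the note can be checked directly, and guarded against if you'd rather not rely on the assumption. This is a sketch only, assuming Python 3: is_relative_member is a hypothetical helper, not part of zipfile, and ntpath and posixpath are the Windows and Unix flavours of os.path, both importable on any platform.

```python
import ntpath      # Windows flavour of os.path
import posixpath   # Unix flavour of os.path

# join() throws away the first argument when the second is absolute
windows_result = ntpath.join('out', 'C:\\Folder\\File.ext')
unix_result = posixpath.join('out', '/etc/passwd')
print(windows_result)
print(unix_result)

def is_relative_member(name):
    """Hypothetical guard: reject absolute names and '..' components."""
    if posixpath.isabs(name) or ntpath.isabs(name):
        return False
    if ntpath.splitdrive(name)[0]:  # a drive letter such as 'C:'
        return False
    parts = name.replace('\\', '/').split('/')
    return '..' not in parts

print(is_relative_member('folder/file.html'))
print(is_relative_member('C:\\Folder\\File.ext'))
```

A loop like the one in unzip() could skip any name for which such a check returns False.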

All we do then is check whether the directory tree already exists, create it if it does NOT, and extract our file. Overall, it's a very small function (especially compared to some other languages).
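For comparison, newer versions of the zipfile module (Python 2.6 onwards) include ZipFile.extractall(), which creates missing directories as it goes. The sketch below assumes Python 3; the wrapper name unzip_all, the temporary directory, and the sample contents are all made up for illustration.

```python
import os
import tempfile
import zipfile

def unzip_all(zip_path, destination):
    """Extract every member, creating sub-directories as needed."""
    with zipfile.ZipFile(zip_path, 'r') as archive:
        archive.extractall(destination)

with tempfile.TemporaryDirectory() as workdir:
    # Build a small sample archive on disk (made-up contents)
    zip_path = os.path.join(workdir, 'Python.zip')
    with zipfile.ZipFile(zip_path, 'w') as archive:
        archive.writestr('file.txt', 'sample text')
        archive.writestr('folder/file.html', '<p>sample</p>')

    destination = os.path.join(workdir, 'out')
    unzip_all(zip_path, destination)

    # Check the results before the temporary directory is cleaned up
    nested_ok = os.path.isfile(os.path.join(destination, 'folder', 'file.html'))
    with open(os.path.join(destination, 'file.txt'), 'rb') as handle:
        text = handle.read()

print(nested_ok, text)
```

Writing the function by hand, as above, is still the better way to understand what is going on underneath.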



 
 