Python UnZipped
By: Mark Lee Smith
    2004-01-08


    Table of Contents:
  • Python UnZipped
  • Going Full Monty with the Zip File
  • Listings in the Key of Zip



    Python UnZipped - Going Full Monty with the Zip File
    ( Page 2 of 3 )

    You guessed it, we're going to unzip them. (I'm using our file.txt and file.gif sample files again, just to make things easier to follow.)


    >>> import zipfile
    >>> zip = zipfile.ZipFile('Python.zip', 'r')
    >>> open('file.txt', 'wb').write(zip.read('file.txt'))
    >>> open('file.gif', 'wb').write(zip.read('file.gif'))
    >>> zip.close()

    Note: Images are binary, so I've used the 'wb' (write binary) flag for file.gif. Because read() hands back the raw bytes of the member, I've used it for file.txt as well; on older Pythons a plain 'w' also works for text, so the binary flag may not always be necessary there.
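
The same idea can be wrapped in a small helper. This is only a sketch, assuming a modern Python 3 interpreter; the function name extract_members and the throwaway archive it builds are made up for illustration and are not part of the original example.

```python
import os
import tempfile
import zipfile


def extract_members(zip_path, names, dest_dir):
    """Write each named archive member into dest_dir; return the paths written."""
    written = []
    # The with-statement closes the archive even if a read or write fails.
    with zipfile.ZipFile(zip_path, 'r') as archive:
        for name in names:
            target = os.path.join(dest_dir, name)
            # read() returns bytes, so the target is opened in binary mode.
            with open(target, 'wb') as handle:
                handle.write(archive.read(name))
            written.append(target)
    return written


# Build a throwaway archive so the sketch runs on its own.
work_dir = tempfile.mkdtemp()
sample_zip = os.path.join(work_dir, 'Python.zip')
with zipfile.ZipFile(sample_zip, 'w') as archive:
    archive.writestr('file.txt', 'hello from the zip')
    archive.writestr('file.gif', b'GIF89a')

paths = extract_members(sample_zip, ['file.txt', 'file.gif'], work_dir)
```

Using with-blocks means neither the archive nor the output files are left open if something goes wrong part way through.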

    OK, we just extracted two files from our Zip, and in only five lines! This example is fine if you know the names of the files you want to extract, but what if you don't?


    #!/usr/bin/env python

    import zipfile

    def inzip(filename, zip):
        # True if filename is one of the member names in the archive
        return filename in zip.namelist()

    if __name__ == '__main__':

        zip = zipfile.ZipFile('Python.zip', 'r')
        print(inzip('file.txt', zip))
        zip.close()

    Short and sweet, just like its name, this function simply returns True if the string filename is in the list of files in the Zip and False if it isn't (True in the example above).

    The namelist() method (along with its brothers and sisters) provides information about a Zip file; namelist() itself returns a list of the names of all the files within a Zip. For example:


    >>> zip.namelist()
    ['1.txt', '2.txt', '3.txt', 'file.gif', 'file.txt', 'folder/file.html', 'folder/']
    >>>
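
For a feel of what the "brothers and sisters" of namelist() look like, here is a minimal sketch, assuming Python 3. It uses infolist() and getinfo(), which are real zipfile methods; the in-memory archive and its contents are invented for the demonstration.

```python
import io
import zipfile

# Build a small in-memory archive (hypothetical contents) to inspect.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as archive:
    archive.writestr('file.txt', 'twelve bytes')
    archive.writestr('folder/file.html', '<p>hi</p>')

with zipfile.ZipFile(buffer, 'r') as archive:
    # namelist() gives the member names in archive order.
    names = archive.namelist()
    # infolist() gives one ZipInfo object per member, with sizes, dates and so on.
    sizes = {info.filename: info.file_size for info in archive.infolist()}
    # getinfo() looks up a single ZipInfo by name.
    html_info = archive.getinfo('folder/file.html')
    # The same membership test that inzip() performs.
    has_gif = 'file.gif' in names
```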

    You’ve checked the contents of the file and you want to get extracting. Rather than sitting there typing names into your Python shell one by one (which, let's face it, is pretty boring), I’m going to show you how to extract everything in one go.


    #!/usr/bin/env python

    import zipfile

    def unzip(zip):
        # Write every member into the current directory, in binary mode
        for name in zip.namelist():
            open(name, 'wb').write(zip.read(name))

    if __name__ == '__main__':

        zip = zipfile.ZipFile('Python.zip', 'r')
        unzip(zip)
        zip.close()

    This is fine for flat Zip files (those without subfolders), but it would just barf all over the screen if we tried to write to a name that included a nonexistent directory. There are two choices:

    1. Remove everything before the filename. Simple, yes, but you could end up with two files of the same name, and we all know what happens next.
    2. Create all the directories before unzipping our file, which is a lot safer, though it requires a little more work…
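
To see why the first choice is risky, the sketch below (assuming Python 3) lists which members would land on the same name once their folders are stripped. The helper flattened_collisions and the sample names are hypothetical; posixpath is used because names inside a Zip are separated by forward slashes.

```python
import posixpath


def flattened_collisions(names):
    """Return base names that more than one archive member would flatten to."""
    seen = {}
    for name in names:
        if name.endswith('/'):
            continue  # directory entry, nothing to write
        base = posixpath.basename(name)
        seen.setdefault(base, []).append(name)
    return {base: members for base, members in seen.items() if len(members) > 1}


collisions = flattened_collisions(
    ['file.txt', 'folder/file.txt', 'folder/file.html', 'folder/'])
```

Here both file.txt and folder/file.txt flatten to file.txt, so whichever is written second would silently replace the first.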

    Of course we’re going for the second choice; not only is it the more interesting, it's also the more Pythonic!

    To borrow from another TV snake (Blackadder): "I have a cunning plan!"


    #!/usr/bin/env python

    import os, zipfile

    def unzip(path, zip):

        # Local aliases for the os.path functions used in the loop
        isdir = os.path.isdir
        join = os.path.join
        norm = os.path.normpath
        split = os.path.split

        for each in zip.namelist():
            # Skip directory entries, which end with a forward slash
            if not each.endswith('/'):
                root, name = split(each)
                directory = norm(join(path, root))
                # Create the directory tree if it isn't there yet
                if not isdir(directory):
                    os.makedirs(directory)
                open(join(directory, name), 'wb').write(zip.read(each))

    if __name__ == '__main__':

        zip = zipfile.ZipFile('Python.zip', 'r')
        unzip('', zip)
        zip.close()

    Don’t panic! This is a little more advanced than the other functions we've created so far, and there’s actually quite a lot going on inside it, so we'll go through it step by step. You might have noticed the os module sitting at the core of this example too.

    The first part of this function is pretty strange as functions go; basically all it does is create local aliases for some of the functions from os.path, which saves repeated attribute lookups inside the loop (a small performance gain).

    Next we loop through each of the names in zip.namelist() and carry on only if the name isn’t a directory (directory names end with a forward slash).


    >>> for each in zip.namelist(): print(each)
    1.txt
    2.txt
    3.txt
    file.gif
    file.txt
    folder/file.html
    folder/

    The path is split from the filename and assigned to root, name. Our next line creates a variable named directory that holds the new path for the file, which is simply path and root joined.

    Note: This won't work as intended with absolute paths like C:\Folder\Folder\File.ext; join() discards path when the name it is given is absolute, so the file would be extracted to that absolute location instead (tested on Windows). For this example I'm assuming that absolute paths won’t be used.
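
The join() behaviour behind that note can be checked without touching the disk. A minimal sketch, assuming Python 3: ntpath and posixpath are the Windows and POSIX flavours of os.path, so the result is the same on any platform. The guard function is_safe_member is a hypothetical addition, not something the original function does.

```python
import ntpath
import posixpath

# An absolute second argument makes join() throw away the first one.
windows_result = ntpath.join('extracted', 'C:\\Folder\\Folder\\File.ext')
posix_result = posixpath.join('extracted', '/Folder/File.ext')


def is_safe_member(name):
    """Hypothetical guard: reject absolute names and names containing '..'."""
    drive, rest = ntpath.splitdrive(name)
    if drive or rest.startswith(('/', '\\')):
        return False  # drive letter, UNC share, or leading slash
    return '..' not in name.replace('\\', '/').split('/')
```

Calling a check like this on each name before joining it to path is one way to drop the assumption that absolute paths never turn up.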

    All we do then is check if the directory tree does NOT already exist before attempting to create it and extracting our file. Overall, it's a very small function (especially compared to some other languages).
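
For comparison, later versions of the zipfile module (Python 2.6 onwards) grew an extractall() method that creates the directories for you. The sketch below assumes Python 3; the sample archive and the 'out' directory name are invented so that it runs on its own.

```python
import os
import tempfile
import zipfile

# Build a throwaway archive containing a nested member.
work_dir = tempfile.mkdtemp()
sample_zip = os.path.join(work_dir, 'Python.zip')
with zipfile.ZipFile(sample_zip, 'w') as archive:
    archive.writestr('file.txt', 'top level')
    archive.writestr('folder/file.html', '<p>nested</p>')

# extractall() creates dest_dir and any subfolders as needed.
dest_dir = os.path.join(work_dir, 'out')
with zipfile.ZipFile(sample_zip, 'r') as archive:
    archive.extractall(dest_dir)

# Collect what ended up on disk, using forward slashes for readability.
extracted = sorted(
    os.path.relpath(os.path.join(root, name), dest_dir).replace(os.sep, '/')
    for root, _dirs, files in os.walk(dest_dir)
    for name in files
)
```

Writing the loop by hand, as above, is still handy when you want to filter or rename members as they are extracted.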



     
     