Paul's Python Intro 04

Paul Dickson plug-devel@lists.PLUG.phoenix.az.us
Thu Sep 9 14:37:06 2004


Here's a peek at where I'd like to take this:

    >>> import csv
    >>> db = {}
    >>> file = open("sac72.txt","r")  # Open file for reading
    >>> reader = csv.reader(file)     # Create an object that parses each line
    >>> header = reader.next()        # Extract first record into header
    >>> for record in reader:
    ...     db[record[0].strip()] = record
    ...
    >>> file.close()                  # Close file

    >>> len(header[-1])
    71
    >>> header[-1].strip()
    'NOTES'
    >>> db["NGC  224"][-1].strip()
    'Local Group;Andromeda Galaxy;nearest spiral'

    >>> header[4:6]
    ['R.A.   ', 'DEC.  ']
    >>> db["NGC  224"][4:6]
    ['00 42.7', '+41 16']

In this example, I imported the csv module (a Python library module),
created an empty dictionary (db = {}), then read the entire SAC Deep-Sky
Database into that dictionary (the for loop).  This database can be found
at:  http://www.saguaroastro.org/content/db/sac72.zip

The "..." lines are continuation prompts: Python is waiting for the
indented body of the for loop.  More on this later.
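
If you'd rather try the loading steps without downloading the database,
here is a minimal sketch.  It is written for a current Python 3
interpreter, where reader.next() is spelled next(reader) and print is a
function.  The load_db name, the four-column layout and the "OBJECT"
header are my assumptions for illustration; the real file has many more
fields.  The data values are the ones shown in the session above.

```python
import csv
import io

# Hypothetical four-column stand-in for sac72.txt (the real file has many
# more fields).  The "OBJECT  " header name is an assumption.
SAMPLE = (
    "OBJECT  ,R.A.   ,DEC.  ,NOTES\n"
    "NGC  224,00 42.7,+41 16,Local Group;Andromeda Galaxy;nearest spiral\n"
)

def load_db(lines):
    """Return (header, db); db maps the stripped first field to its record."""
    reader = csv.reader(lines)       # parses each line into a list of fields
    header = next(reader)            # first record is the header
    db = {}
    for record in reader:
        db[record[0].strip()] = record
    return header, db

header, db = load_db(io.StringIO(SAMPLE))
print(header[1:3])                   # the R.A. and DEC. column names
print(db["NGC  224"][-1])            # the NOTES field for NGC 224
```

Passing an open file instead of io.StringIO(SAMPLE) should work the same
way, since csv.reader accepts any object that yields lines.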

The last field in the database is the NOTES field (the header[-1]).  It's
supposed to be 71 characters and the len() function agrees with it.  When
I peeked at these values, I invoked the .strip() method to remove the
padding spaces.  I don't need to do this for the RA and DEC values
because they're already short enough to print.

The dictionary, db, now holds the entire database.  The text from the
first field of the database is the dictionary's key for that record and
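the record is stored as a list.  There's a minor problem in the above
code: the object names in the database have spaces added inside them to
help with sorting (e.g. it's "NGC  224" above, and for NGC 1, it's
"NGC    1"), and .strip() only removes spaces at the ends, so a lookup
has to use exactly that padded name.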
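
One possible way around the padded names, sketched here under my own
assumptions (the normalize helper is hypothetical, and this is written
for Python 3), is to collapse each run of spaces to a single space before
using the name as a key:

```python
def normalize(name):
    """Collapse runs of whitespace to one space: 'NGC    1' -> 'NGC 1'."""
    return " ".join(name.split())

# A one-record stand-in for the real dictionary, keyed by the padded name.
padded = {"NGC  224": ["NGC  224", "00 42.7", "+41 16"]}

db = {}
for key, record in padded.items():
    db[normalize(key)] = record      # store under the unpadded name

print(db["NGC 224"][1:3])            # lookup no longer needs the padding
```

The trade-off is that the unpadded keys no longer sort the way the padded
ones do, so you may want to keep the original name in the record, as this
sketch does.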

To quickly summarize dictionaries, a dictionary is a generalized form of a
structure, but you don't need to know in advance what elements you'll
need.

    >>> g = {}
    >>> g["name"] = "John Smith"
    >>> g["phone"] = "555-1111"
    >>> g
    {'phone': '555-1111', 'name': 'John Smith'}
    >>> m = {"name":"Mary Johnson", "phone":"555-2222"}
    >>> m
    {'phone': '555-2222', 'name': 'Mary Johnson'}

You must create the dictionary (using the braces "{...}") before assigning
keys and values to it.  

    >>> h["name"] = "Mike Smith"
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    NameError: name 'h' is not defined

Dictionaries are unordered.  To sort the above SAC database:

    >>> keys = db.keys()        # Obtain a list of keys
    >>> keys[0:4]
    ['NGC 6747', 'NGC 5541', 'NGC 5939', 'NGC 5546']
    >>> keys.sort()             # Sort the list
    >>> keys[0:4]
    ['3C  48', '3C 147', '3C 249.1', '3C 273']
    >>> for i in keys:          # For each sorted key,
    ...     print db[i]         #   print that record
    ...

I only printed the first 4 entries of the "keys" list.  I also didn't
include the dump of the SAC database.  You might want to use Control-C to
stop before all 10,000+ records are printed.
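
If you only want a look at the first few records in sorted order, a
slice saves you the Control-C.  This is a minimal sketch for Python 3;
the first_records helper is hypothetical, and the records here are cut
down to just the object name.

```python
# Stand-in dictionary using a few of the keys shown above; each record is
# truncated to the name field only.
db = {
    "NGC 6747": ["NGC 6747"],
    "3C 273": ["3C 273"],
    "3C  48": ["3C  48"],
    "3C 147": ["3C 147"],
}

def first_records(db, count):
    """Return the records for the first `count` keys in sorted order."""
    return [db[key] for key in sorted(db)[:count]]

for record in first_records(db, 3):
    print(record)
```

sorted(db) returns a new sorted list of the keys, so the dictionary
itself is left alone.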

You can use any CSV (Comma Separated Values) database as input.  Such
files are just text exports from Excel or Gnumeric, with fields separated
by commas and quoted as necessary.
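
To see what "quoted as necessary" means, here is a small sketch for
Python 3.  The sample text is made up for illustration.  A field that
contains a comma is wrapped in double quotes, and a double quote inside a
quoted field is written twice; csv.reader undoes both.

```python
import csv
import io

# Made-up sample: one header line and one record with quoted fields.
text = 'name,phone,notes\n"Smith, John",555-1111,"said ""call back"""\n'

rows = list(csv.reader(io.StringIO(text)))
print(rows[0])    # the header fields
print(rows[1])    # the comma and the inner quotes survive as plain text
```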

The csv module is new in Python 2.3.

This concludes my brief summary of Python data types.  There are other
types and you can create your own, but, with the exception of classes, I
don't plan on covering them.

This is also all I have written so far.  When I have some free time I'll
continue.  Feel free to post questions as replies to these messages.  You
could find your answers in the tutorial link posted, but a query to this
list will likely get you a direct and relevant answer.

	-Paul