
Comparing XML Ballots - XML

EVM2003 brings XML to the democratic process: In this article, David Mertz discusses his practical experiences developing interrelated XML data formats for the EVM2003 Free Software project to develop voting machines that produce voter-verifiable paper ballots. Some design principles of format subsetting emerge. In addition, David looks at how an application-specific meaning for XML document equivalence can be programmed, and why canonicalization is insufficient. (This intermediate-level article was first published by IBM developerWorks, 28 Jun 2004, at http://www.ibm.com/developerWorks.)

  1. Practical XML Data Design and Manipulation for Voting Systems
  2. EVM2003 and XML
  3. XML Samples
  4. Comparing XML Ballots
  5. Conclusion and Resources
By: developerWorks
September 22, 2004

I have mentioned that the programming for EVM2003 was done in Python; in addition, the XML access is performed using my Gnosis Utilities library, specifically gnosis.xml.objectify. Using this library makes operations on ballot or EBI files particularly painless. For example, information on contests and candidates is loaded into some Python data structures with the following code:

Listing 3. ballot-election.xml to Python conversion

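from gnosis.xml.objectify import make_instance
ballot = make_instance(xml_data)
contnames, cont = [], {}
for contest in ballot.contest:
    name = contest.name
    contnames.append(name)      # reconstructed: keep contests in ballot order
    if contest.coupled=="No":
        # Uncoupled contest: a flat list of candidate names
        cont[name] = [select.PCDATA for select in contest.selection]
        if contest.allow_writein=="Yes":
            # Branch body is missing from this copy of the listing;
            # a blank write-in slot is assumed here
            cont[name].append("")
    else:
        # Coupled contest: consecutive selections form running-mate pairs
        cont[name] = []
        for n in range(0, len(contest.selection), 2):
            cont[name].append([s.PCDATA for s in contest.selection[n:n+2]])
        if contest.allow_writein=="Yes":
            # Assumed, as above: a blank write-in pair
            cont[name].append(["",""])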

The function make_instance() generally confines any concern with the XML-ness of the data format to a single line; after that, it's just Python.
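For readers without Gnosis Utilities at hand, roughly the same load can be sketched with the standard library. This is only an illustration under stated assumptions: it uses xml.etree.ElementTree and Python 3 syntax instead of gnosis.xml.objectify, the sample ballot (contest names, candidates, and the choice to store name, coupled, and allow_writein as XML attributes) is hypothetical rather than taken from ballot-election.xml, and the blank write-in slots are an assumption about how write-ins are represented.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real ballot-election.xml layout may differ.
SAMPLE_BALLOT = """
<ballot>
  <contest name="President" coupled="Yes" allow_writein="No">
    <selection>Alice Adams</selection>
    <selection>Bob Brown</selection>
    <selection>Carol Clark</selection>
    <selection>Dan Davis</selection>
  </contest>
  <contest name="Treasurer" coupled="No" allow_writein="Yes">
    <selection>Eve Evans</selection>
    <selection>Frank Ford</selection>
  </contest>
</ballot>
"""

def load_contests(xml_data):
    """Return (contest names in document order, {name: choices})."""
    ballot = ET.fromstring(xml_data)
    contnames, cont = [], {}
    for contest in ballot.findall("contest"):
        name = contest.get("name")
        contnames.append(name)
        names = [sel.text for sel in contest.findall("selection")]
        if contest.get("coupled") == "No":
            # Uncoupled contest: a flat list of candidate names.
            cont[name] = names
            if contest.get("allow_writein") == "Yes":
                cont[name].append("")          # assumed: blank write-in slot
        else:
            # Coupled contest: consecutive selections form pairs.
            cont[name] = [names[n:n + 2] for n in range(0, len(names), 2)]
            if contest.get("allow_writein") == "Yes":
                cont[name].append(["", ""])    # assumed: blank write-in pair
    return contnames, cont

contnames, cont = load_contests(SAMPLE_BALLOT)
```

The trade-off is visible in the body of the loop: ElementTree needs explicit get() and findall() calls where the objectify version reads plain Python attributes.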

Several related issues come up in comparing EBIs with each other, or with REBIs. As I mentioned, REBIs are not generally byte-wise identical to their corresponding EBIs because write-in names are not recorded in full on barcodes. But more generally, the OVC intends to set standards for data formats, not simply to produce them with specific code. Third-party code should be able to produce and process EBIs -- for example, to confirm that tabulation has been performed accurately.

The document equality question applies to many classes of XML documents: When are two documents identical according to application requirements? Conforming to the same DTD or schema is a minimum necessary condition, and XML canonicalization can remove many trivial syntactic variants. But as a rule, meaningful identity cannot be expressed by schemas alone. For example, deciding when the order of child elements is meaningful and when it is incidental is strictly an application-level issue.
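A small experiment shows where canonicalization stops helping. The sketch below assumes Python 3.8 or later, whose standard library provides xml.etree.ElementTree.canonicalize() (C14N 2.0); the ballot fragment and its values are hypothetical. Canonicalization erases a difference in attribute order, but it preserves a difference in child order, even in a contest that the application declares unordered.

```python
from xml.etree.ElementTree import canonicalize  # Python 3.8+

# Hypothetical ballot fragment, written without whitespace between elements.
stored = ('<cast_ballot state="CA" county="Santa Clara">'
          '<contest name="Council" ordered="No">'
          '<selection>Ann</selection><selection>Bob</selection>'
          '</contest></cast_ballot>')

# Same ballot with the top-level attributes in a different order.
attrs_swapped = stored.replace(
    'state="CA" county="Santa Clara"',
    'county="Santa Clara" state="CA"')

# Same ballot with the unordered selections in a different order.
children_swapped = stored.replace(
    '<selection>Ann</selection><selection>Bob</selection>',
    '<selection>Bob</selection><selection>Ann</selection>')

# Attribute order is a syntactic variant that canonicalization removes.
same_after_attr_swap = canonicalize(stored) == canonicalize(attrs_swapped)

# Child order is kept, so these canonical forms still differ.
same_after_child_swap = canonicalize(stored) == canonicalize(children_swapped)
```

Here same_after_attr_swap is true and same_after_child_swap is false, although both variants record the same vote. Only application code that knows the meaning of ordered="No" can treat the second pair as equal.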

The Gnosis Utilities library provides (in my opinion) a rather elegant way to customize the meaning of equality. You may define a custom class with equality and inequality tests to hold all XML documents with the root element <cast_ballot>. The module evm2003.utils.equiv injects an application-specific equality test into EBI Python objects, and may also be used as a command-line tool to compare EBIs/REBIs. Here it is, including the detailed docstring:

Listing 4. evm2003.utils.equiv.py module

"""Compare ballot XML files for equivalence

This file may be imported as a module or used as a command-line ballot
comparison tool. If imported, e.g.:

    >>> import evm2003.utils.equiv
    >>> from gnosis.xml.objectify import make_instance
    >>> a = make_instance('scanned.xml')
    >>> b = make_instance('stored.xml')
    >>> a == b

At the command-line:

    % python equiv.py scanned.xml stored.xml

(lack of any output means success, in that ultra-terse UNIX-philosophy
way).

We implement custom .__eq__() and .__ne__() methods specific to cast
ballots.  Injecting such methods is the recommended technique for enhancing
gnosis.xml.objectify objects.

The documents scanned.xml and stored.xml were used to test this.
They differ in several non-significant respects:

    (1) the top-level attributes occur in a different order;
    (2) non-ordered multi-select contests have selections in a different
        order;
    (3) Write-in votes have different PCDATA content (for example, nothing for
        scanned ballots).
"""
import gnosis.xml.objectify
import sys
class cast_ballot(gnosis.xml.objectify._XO_):
    def __eq__(self, other):
        metadata = '''election_date country state county
                      number precinct serial'''.split()
        for attr in metadata:
            if getattr(self, attr) != getattr(other, attr):
                return 0
        by_name = lambda a, b: cmp(a.name, b.name)
        for my, your in zip(self.contest, other.contest):
            if my.name != your.name or \
               my.ordered != your.ordered or \
               my.coupled != your.coupled:
                return 0
            if my.ordered == "No":
                # Compare non-writeins (but don't know if same num writeins)
                my_select = dict([(x.PCDATA,None) for x in my.selection
                                                  if x.writein=="No"])
                your_select = dict([(x.PCDATA,None) for x in your.selection
                                                    if x.writein=="No"])
                if my_select != your_select:
                    return 0
            for my_select, your_select in zip(my.selection, your.selection):
                if (my_select.writein, your_select.writein) == ("Yes","Yes"):
                    pass    # two write-ins match whatever their PCDATA
                elif my_select.PCDATA != your_select.PCDATA:
                    return 0
        return 1
    def __ne__(self, other):
        return not self == other
#-- Namespace injection
gnosis.xml.objectify._XO_cast_ballot = cast_ballot
#-- Command-line operation
if __name__=='__main__':
    a, b = map(gnosis.xml.objectify.make_instance, sys.argv[1:3])
    if a != b:
        print sys.argv[1], "and", sys.argv[2], "are NOT equivalent ballots!"

I see no need to explain the principles of EBI equivalence in more detail than the docstring gives. The sample code suffices as an illustration of similar considerations that arise in many XML processing applications.
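For readers who want to experiment without Gnosis Utilities, the three rules in the docstring can be approximated with the standard library. The following is a sketch under stated assumptions, not the evm2003.utils.equiv logic: it uses xml.etree.ElementTree and Python 3, it assumes that the metadata and the ordered, coupled, and writein flags are XML attributes, and the function names and sample ballots are hypothetical. It also takes one reading of rule (2): unordered contests are compared as multisets, with every write-in collapsed to a single token.

```python
import xml.etree.ElementTree as ET
from collections import Counter

METADATA = "election_date country state county number precinct serial".split()

def _unordered_key(selections):
    """Multiset of names, with every write-in collapsed to one token."""
    return Counter(
        ("writein", None) if s.get("writein") == "Yes" else ("name", s.text)
        for s in selections)

def _ordered_match(mine, yours):
    """Position-by-position match; two write-ins always match."""
    for a, b in zip(mine, yours):
        both_writein = a.get("writein") == b.get("writein") == "Yes"
        if not both_writein and \
           (a.get("writein"), a.text) != (b.get("writein"), b.text):
            return False
    return True

def ballots_equivalent(xml_a, xml_b):
    a, b = ET.fromstring(xml_a), ET.fromstring(xml_b)
    # Rule (1): attributes are compared by name, so their order is ignored.
    if any(a.get(k) != b.get(k) for k in METADATA):
        return False
    contests_a, contests_b = a.findall("contest"), b.findall("contest")
    if len(contests_a) != len(contests_b):
        return False
    for mine, yours in zip(contests_a, contests_b):
        if any(mine.get(k) != yours.get(k)
               for k in ("name", "ordered", "coupled")):
            return False
        sel_a, sel_b = mine.findall("selection"), yours.findall("selection")
        if len(sel_a) != len(sel_b):
            return False
        if mine.get("ordered") == "No":
            # Rule (2): selection order is incidental here.
            if _unordered_key(sel_a) != _unordered_key(sel_b):
                return False
        elif not _ordered_match(sel_a, sel_b):
            # Ordered contest: position matters; rule (3) covers write-ins.
            return False
    return True

# Hypothetical test ballots.
STORED = ('<cast_ballot election_date="2004-11-02" state="CA" serial="213">'
          '<contest name="Council" ordered="No" coupled="No">'
          '<selection writein="No">Ann</selection>'
          '<selection writein="Yes">Zed Zimmer</selection>'
          '<selection writein="No">Bob</selection></contest>'
          '<contest name="Mayor" ordered="Yes" coupled="No">'
          '<selection writein="No">Cy</selection>'
          '<selection writein="No">Di</selection></contest>'
          '</cast_ballot>')
SCANNED = ('<cast_ballot serial="213" state="CA" election_date="2004-11-02">'
           '<contest name="Council" ordered="No" coupled="No">'
           '<selection writein="No">Bob</selection>'
           '<selection writein="No">Ann</selection>'
           '<selection writein="Yes"></selection></contest>'
           '<contest name="Mayor" ordered="Yes" coupled="No">'
           '<selection writein="No">Cy</selection>'
           '<selection writein="No">Di</selection></contest>'
           '</cast_ballot>')
# Swap the ranked Mayor choices: a significant difference.
TAMPERED = (STORED.replace(">Cy<", ">TMP<")
                  .replace(">Di<", ">Cy<")
                  .replace(">TMP<", ">Di<"))
```

With these samples, ballots_equivalent(STORED, SCANNED) holds even though the two documents differ in all three non-significant respects, while ballots_equivalent(STORED, TAMPERED) fails because the order of an ordered contest changed.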
