Proposal: collaborative comments on the Commons debates

mskala at ansuz.sooke.bc.ca mskala at ansuz.sooke.bc.ca
Tue Apr 30 19:13:19 EDT 2002


On Tue, 30 Apr 2002, Chris Brand wrote:
> >representing status or approval?  Some MP gets up to give a speech on
> >copyright, and I might add a comment saying "Made interesting point
> >re: blank media levy, +2 points".
> Great idea. Ideally, you'd have links where people could go directly to
> the source, too.

OK, I still don't know how to make a game out of this, but here's what I
imagine in the way of a database-driven comment system:

* Have a script that parses each day's Hansard, capturing all the names
  and headings and storing them in a database.  The Government
  postings already contain link targets that could be stored in the
  database.

* Store comments in the database too.  Each comment would be associated
  with a specific point in some issue of Hansard.

* A "main" screen showing the N most recently added comments in the
  database, with columns for who wrote the comment, which member was
  speaking, and that member's party.  Also shown for each comment would be
  the comment itself, and a link that would take the user to that point in
  the annotated Hansard (see below).

* By clicking on any of the column values on the main screen, users could
  limit the display to comments having that value.  For instance, if
  there's a comment from me, they could click on my name and see a
  similar display of only my comments.  Or they could click on the party
  affiliation of the member speaking, and see comments only on that
  party's members.  These filters would be cumulative, so you could see
  only my comments on a particular party, for instance.

* Every comment would have a positive or negative point value and these
  pages would show the total for comments matching the filter.

* Each page would also be available in RSS format (trivial to do, just use
  the same code to spit out the results in XML instead of HTML) so people
  could keep an eye on it with an appropriate client.

* A script to provide an "annotated" Hansard, which I imagine as being a
  cached copy of the one from the Government's site, with extra icons
  added.  Anywhere users could add a comment, there would be an icon to
  click to add a comment at that point; for any points where there were
  already comments, there would be a little icon and note inserted (N
  comments, click to read them).  This would preferably be available, just
  like the files on www.parl.gc.ca, as both a complete transcript of the
  day's events, and in 5-minute segments.

* Other scripts could be added to do things like the top N Members by
  score, the form-letter thing I mentioned, and so on.  Also necessary
  would be the add-a-comment script, and administrative stuff for adding
  users, deleting users and comments, etc.
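
A minimal sketch of the comment store, the cumulative click-to-filter
behaviour, and the running point total described in the list above.
Python and SQLite are used purely for illustration; the table layout, the
column names, the `filtered_comments` helper, and the sample rows are all
hypothetical and not part of any existing system.

```python
import sqlite3

# Hypothetical schema: one row per comment, tied to a point in one Hansard issue.
SCHEMA = """
CREATE TABLE comments (
    id      INTEGER PRIMARY KEY,
    author  TEXT NOT NULL,     -- who wrote the comment
    member  TEXT NOT NULL,     -- which Member was speaking
    party   TEXT NOT NULL,     -- that Member's party
    hansard TEXT NOT NULL,     -- link target inside the Hansard issue
    body    TEXT NOT NULL,     -- the comment itself
    points  INTEGER NOT NULL   -- positive or negative score
);
"""

# Only these columns may be used as filters (the clickable columns).
FILTERABLE = ("author", "member", "party")


def filtered_comments(conn, limit=20, **filters):
    """Return (rows, total_points) for comments matching every given filter."""
    unknown = set(filters) - set(FILTERABLE)
    if unknown:
        raise ValueError(f"cannot filter on: {sorted(unknown)}")
    # Column names come from the fixed whitelist; values are bound parameters.
    where = " AND ".join(f"{col} = ?" for col in filters) or "1 = 1"
    params = list(filters.values())
    # The N most recent matching comments, newest first.
    rows = conn.execute(
        "SELECT author, member, party, body, points, hansard "
        f"FROM comments WHERE {where} ORDER BY id DESC LIMIT ?",
        params + [limit],
    ).fetchall()
    # The total covers every matching comment, not just the N displayed.
    total = conn.execute(
        f"SELECT COALESCE(SUM(points), 0) FROM comments WHERE {where}", params
    ).fetchone()[0]
    return rows, total


def demo():
    """Build an in-memory database with placeholder comments."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO comments (author, member, party, hansard, body, points) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        [
            ("alice", "Member A", "Party X", "#anchor1", "Good point on levy", 2),
            ("bob", "Member B", "Party Y", "#anchor2", "Dodged the question", -1),
            ("alice", "Member B", "Party Y", "#anchor3", "Misstated the bill", -2),
        ],
    )
    return conn
```

Calling `filtered_comments(demo(), author="alice", party="Party Y")`
stacks two filters, which is the "cumulative" behaviour: each click on a
column value would just add one more keyword to the same query.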

What do you think?  Is this something we'd like to have on our Web site?
(For some value of that; I know this list reaches the people in charge of
several different sites.)  Am I missing anything critical?  Would we be
able to get people to use it?

Some technical issues:

* My preferred platform for developing stuff like this is PHP4, with
  either mySQL or PostgreSQL.  I have both database packages on my system
  at home, but it isn't network-accessible.  I can post things for
  debugging purposes on a friend's machine, www.edifyingfellowship.org - 
  a "production" site wouldn't be welcome there for traffic reasons, but
  testing/debugging would be fine.  That system supports only PostgreSQL,
  not mySQL, so if we wanted to port from there to a mySQL-based system
  elsewhere, then some rewriting and conversion would be necessary.

* If this were going to be hosted on a system that already has a user
  account database (for phpSlash or similar) then it might be desirable to
  write code to connect with those user accounts instead of having a
  separate account base for the parliamentary-comment system; then we'd be
  spared having to deal with account management ourselves.

* Disk space requirements: an issue of Hansard, in English, is about
  600K.  Double that if we include French as well.  Double again if we 
  store both the "complete" file and the "in 5-minute segments" files; divide
  by two, maybe, to account for compression.  We could probably fake one 
  of {complete,segmented} by splitting or joining the other, although 
  if space is cheap it would be nicer not to have to, because storing 
  both would allow better synch with the Government site.  My guess is
  that if we didn't cache the actual text, but only stored the "heading"
  information, that would take about half as much space, counting database
  overhead.  If we *did* cache the text we'd probably still want to store
  the headings in a database; my bottom line rough estimate is that we'd
  have about 2M of data to store per day of Hansard, plus whatever
  comments people add.  That's not a huge amount of disk space but is
  enough to be worth thinking about; I wouldn't want to have to store it
  on ansuz.sooke.bc.ca with my 100M space limit.

* We could have lots of "fun" figuring out how to represent what happens
  in the database when Members change parties, change portfolios, resign
  mid-term and get replaced, etc.  Other people speak in Parliament
  besides regular Members - for instance, the Speaker and his deputies.
  Also, I anticipate parsing problems with situations like the Committee of
  the Whole, and the editorial notes that occasionally get inserted when, 
  for instance, Members speak in languages other than English and French.
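
One possible way to handle Members changing parties or roles is to store
dated affiliation records instead of a single party field, and look up
whichever record applied on the day of the speech.  The sketch below
assumes that approach; the `Affiliation` record, the `affiliation_on`
helper, and the sample data are hypothetical, and Python is used only for
illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Affiliation:
    """One stretch of time during which a speaker held a given party and role."""
    speaker: str
    party: Optional[str]   # None when no party applies to the role
    role: str              # e.g. "Member", "Minister", "Speaker"
    start: date            # first day this record applies
    end: Optional[date]    # first day it no longer applies; None if current


def affiliation_on(history, speaker, day):
    """Return the Affiliation that applied to `speaker` on `day`, or None."""
    for record in history:
        if record.speaker != speaker:
            continue
        # Half-open interval: start is included, end is excluded.
        if record.start <= day and (record.end is None or day < record.end):
            return record
    return None


# Placeholder data: a Member who changes parties part-way through a term.
HISTORY = [
    Affiliation("Member A", "Party X", "Member", date(2001, 1, 29), date(2002, 3, 1)),
    Affiliation("Member A", "Party Y", "Member", date(2002, 3, 1), None),
]
```

With records like these, a comment only needs to store the speaker and
the date of the Hansard issue; the party shown next to it is looked up,
so a later change of party does not rewrite older comments.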

Some non-technical issues:

* Such a system would need people to be "operators", to keep an eye on it
  and make sure everything was going smoothly.  I could foresee
  vandalism/trolling problems; a "lack of critical mass" problem if we
  ever got into a situation where there were no recent comments in the
  system; and all kinds of fun when (as always happens eventually, with
  systems designed to automatically parse other people's
  for-human-consumption postings) the Government Web people changed the
  format of the Parliamentary site.

* What's the copyright on Hansard, and would this violate it in any way?

* To what extent should or could such a project be bilingual?

* Area of coverage: I am most interested in the House of Commons debates,
  but interesting things happen in the Senate and the Provincial
  legislatures too, and many if not all Parliamentary debates in Canada
  are online and could be fodder for such a system.

* Programming: I think I can write a parser and basic query script, but I
  don't have time and energy to do all of the development for a nice
  idiotproof system with all the features I've talked about.  Do we have,
  or can we recruit, other people who would participate in building it?
  I'm actually more concerned about recruiting "operators", because
  that's an activity I hate doing, whereas building new stuff is an
  activity I enjoy and will do as much as I have time for.

If a system like that described above seems too large or complicated,
there may be lighter-weight things we could do instead that would still be
valuable and serve many of the same purposes.  What I've described sounds
a lot like what a "Wiki" does; perhaps we could simply build a parser that
imports Hansard into the database of an existing Wiki package.  Then we'd
get all the collaboration features for free.  Less work to build, but my
guess is that it would also be less nice to use.  A good thing about that,
though, is that if we wanted a Wiki for other reasons (for instance, the
"open dictionary" that has been discussed), they could seamlessly merge.
-- 
Matthew Skala
mskala at ansuz.sooke.bc.ca                    Embrace and defend.
http://ansuz.sooke.bc.ca/



