File Naming Convention
The files are named as follows: data/us/CCC/rolls/[hs]SSSS-NNN.xml:
- CCC signifies the Congress number. See the first column of data/us/sessions.tsv. It is a number from 1 to 111 (at the time of writing) and does NOT have zero-padding.
- "h" or "s" signifies whether the vote took place in the House or Senate.
- SSSS is the "session" of Congress that the vote took place in. Here I mean what is really called a session. Today, sessions follow calendar years and are named in GovTrack accordingly, so SSSS will be the four-digit year that the vote took place in. Before the end of World War II, there were usually three sessions to a Congress, and these are labeled either 1, 2, 3 or with a letter. See the second column of data/us/sessions.tsv. (There is no zero- or space-padding.)
- NNN is the roll call number according to the House or Senate. There is no zero-padding. The Senate restarts its numbering every calendar year, whereas the House restarts its numbering after each two-year session.
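The naming rules above can be sketched in Python. This is a minimal illustration, not part of GovTrack's own tooling: the function names build_roll_path and parse_roll_path and the regular expression are my own, and the pattern assumes a session label is made only of digits or letters.

```python
import re

# Hypothetical pattern for data/us/CCC/rolls/[hs]SSSS-NNN.xml.
# Assumes the session label contains only digits or letters.
ROLL_PATH_RE = re.compile(
    r"^data/us/(?P<congress>\d+)/rolls/"
    r"(?P<chamber>[hs])(?P<session>[0-9A-Za-z]+)-(?P<number>\d+)\.xml$"
)


def build_roll_path(congress, chamber, session, number):
    """Build a roll call path; no component is zero-padded."""
    if chamber not in ("h", "s"):
        raise ValueError("chamber must be 'h' or 's'")
    return f"data/us/{int(congress)}/rolls/{chamber}{session}-{int(number)}.xml"


def parse_roll_path(path):
    """Split a roll call path back into its four components."""
    match = ROLL_PATH_RE.match(path)
    if match is None:
        raise ValueError(f"not a roll call path: {path!r}")
    return {
        "congress": int(match["congress"]),
        "chamber": match["chamber"],
        # Kept as a string: it may be a year, a digit, or a letter.
        "session": match["session"],
        "number": int(match["number"]),
    }
```

The session is deliberately left as a string, since older Congresses use labels such as 1, 2, 3 or a letter rather than a year.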
Here is an example file, data/us/111/rolls/h2009-3.xml:
<roll where="house" session="111" year="2009" roll="3" when="1231276080" datetime="2009-01-06T16:08:00-05:00" updated="2009-01-06T19:25:39-05:00" aye="174" nay="249" nv="6" present="0">
  <type>On Motion to Commit with Instructions</type>
  <question>On Motion to Commit with Instructions: H RES 5 Adopting rules for the One Hundred Eleventh Congress</question>
  <required>1/2</required>
  <result>Failed</result>
  <bill session="111" type="hr" number="5" />
  <option key="+">Yea</option>
  <option key="-">Nay</option>
  <option key="P">Present</option>
  <option key="0">Not Voting</option>
  <voter id="400004" vote="+" value="Yea" state="AL" district="4"/>
  <voter id="400005" vote="+" value="Yea" state="MO" district="2"/>
  <voter id="400006" vote="+" value="Yea" state="LA" district="5"/>
  ...
The root element is "roll" and has the following attributes:
- where attribute: either "house" or "senate"
- session attribute which contains what GovTrack calls a session but is really the Congress number, e.g. 111
- year attribute: the year the vote took place in (same as in the file name for recent data)
- roll attribute: the number of the vote (same as in the file name)
- datetime attribute: the date and time of the vote, such as "2007-01-04T12:32:00-05:00". If the time of day is not known, the date will be in YYYY-MM-DD format.
- updated attribute: the date and time the file was updated, same date format as above (but time is always present)
- aye, nay, nv, present attributes: total counts for aye, nay, not voting, and present votes.
- source attribute: either "house.gov", "senate.gov", or "keithpoole". The last refers to data imported from this page.
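As a rough sketch of reading these root attributes, the following Python uses the standard library's xml.etree.ElementTree. The function names parse_when and read_roll_header and the keys of the returned dictionary are my own choices, and the code assumes the numeric attributes are present (the source attribute may be missing, as in the example above).

```python
import xml.etree.ElementTree as ET
from datetime import date, datetime


def parse_when(value):
    """Return a datetime if a time of day is present, otherwise a date."""
    if "T" in value:
        return datetime.fromisoformat(value)
    return date.fromisoformat(value)  # plain YYYY-MM-DD


def read_roll_header(xml_text):
    """Collect the attributes of the root "roll" element into a dict."""
    root = ET.fromstring(xml_text)
    if root.tag != "roll":
        raise ValueError("root element is not 'roll'")
    return {
        "where": root.get("where"),
        # The "session" attribute is really the Congress number.
        "congress": int(root.get("session")),
        "year": int(root.get("year")),
        "roll": int(root.get("roll")),
        "datetime": parse_when(root.get("datetime")),
        "updated": parse_when(root.get("updated")),
        "totals": {k: int(root.get(k, "0")) for k in ("aye", "nay", "nv", "present")},
        "source": root.get("source"),  # None when the attribute is absent
    }
```

Returning a date rather than a datetime when the time of day is unknown keeps the missing precision visible to the caller instead of inventing a midnight timestamp.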
Then there are these elements:
- type element: The type of the vote, taken right from the source data. Something like "On Passage". This is suitable for display. It's up to you to scan all of the different values of this element if you want to use it programmatically.
- question element: A description of the vote, taken right from the source data. Again, suitable for display, not necessarily processing.
- required element: Describes what is required for the vote to pass. Values can be unknown, 1/2, 2/3, 3/5, and QUORUM (if it's a quorum call), but be prepared for other things to show up. This isn't simple to use, because the base of the fraction varies. For most votes, the fraction is out of the Members voting, so exclude the "voter" elements whose vote is "0". 1/2 means more than 50% of those voting, 2/3 means at least two-thirds of Members voting, etc. Senate quorum calls, cloture votes, and Motions to Waive Rule XXVIII and XLIV are out of all senators "duly sworn", including those who didn't actually vote: count all voter elements except those with VP="1".
- result element: This is a textual element for human consumption, but any of the following substrings indicates a passing vote: "Passed", "Agreed", "Confirmed". Any of "Failed", "Defeated", "Rejected", and "Not Sustained" indicates a failing vote.
- bill element: If this vote is related to a bill (on passage of bill, on a motion related to the bill, on an amendment to the bill, etc.), this node will be present with session, type, and number attributes.
- amendment element: If this vote is related to an amendment, this node will be present. The ref attribute will be either "bill-serial" or "regular". If ref is "bill-serial", the number attribute indicates the ordinal number of the amendment for the bill given in the bill node. That is, if the number attribute is "5", the amendment referred to is the 5th amendment to the bill referenced in the bill node. If ref is "regular", number contains an identifier of an amendment, such as "s1234".
- option elements: These elements indicate the types of votes that are permitted to be cast. Each vote option has a key and a textual description of the vote cast. The reason for this is primarily that votes in favor can be both Aye and Yea depending on the type of vote, and we want to preserve this info. The keys for normal votes are always "+" (aye/yea), "-" (nay,no), "P" (present but not voting), "0" (zero; absent/not voting). There are also quorum calls and votes to determine the Speaker of the House which have other values.
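The rules for the required and result elements can be sketched roughly as follows. This is an illustration under stated assumptions, not GovTrack's own logic: the function names are hypothetical, failing substrings are checked first as a cautious ordering of my own, and 3/5 is treated as "at least" by analogy with 2/3, which the text above only implies with "etc."

```python
import xml.etree.ElementTree as ET
from fractions import Fraction

PASS_WORDS = ("Passed", "Agreed", "Confirmed")
FAIL_WORDS = ("Failed", "Defeated", "Rejected", "Not Sustained")


def classify_result(result_text):
    """Return True (passed), False (failed), or None (unrecognised)."""
    if any(word in result_text for word in FAIL_WORDS):
        return False
    if any(word in result_text for word in PASS_WORDS):
        return True
    return None


def voting_base(roll, all_sworn=False):
    """Count the voter elements that form the base of the fraction."""
    voters = roll.findall("voter")
    if all_sworn:
        # All senators duly sworn: everyone except the Vice President.
        return sum(1 for v in voters if v.get("VP") != "1")
    # Members voting: drop those whose vote is "0" (not voting).
    return sum(1 for v in voters if v.get("vote") != "0")


def meets_requirement(required, ayes, base):
    """Return True/False for known fractions, None for anything else."""
    if required == "1/2":
        return 2 * ayes > base  # strictly more than 50%
    if required in ("2/3", "3/5"):
        # Assumption: 3/5 is "at least", like 2/3.
        return ayes >= Fraction(required) * base
    return None  # "unknown", "QUORUM", or an unexpected value
```

For the example file above, 174 ayes out of 174 + 249 = 423 Members voting does not exceed one half, which agrees with its "Failed" result. Deciding when to pass all_sworn=True is left to the caller, since it depends on the kind of Senate vote.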
And finally there is a "voter" element for each Member of Congress who was eligible to vote. There are two oddities here. First, the Vice President casts tie-breakers in the Senate. The VP has a voter element only when he casts a tie-breaker. Second, the Speaker of the House is generally not required to vote. When the Speaker abstains in such a vote, he or she is simply omitted from the roll call, rather than being portrayed as having abstained or missed the vote. (We just do what the House does.) These elements have the following attributes:
- id attribute: The GovTrack ID of the person who cast the vote. Can be "0" if the ID of the voter could not be determined (but the node is left in so that the totals are correct). Also "0" if VP = "1".
- VP attribute. Set to "1" if the node represents the vote of the vice president in the case of a tie in the Senate. The id attribute is set to "0". Otherwise the attribute is absent.
- vote attribute: Generally "+", "-", "0", or "P" indicating an aye, nay, a "no vote" (absence), or a "Present" vote, which is used in quorum calls (and perhaps can show up elsewhere). These correspond to the keys of the vote options, described above.
- value attribute: This is a textual name for the actual vote cast, e.g. Aye, Yea, etc.
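To show how the option and voter elements fit together, here is a small Python sketch that tallies votes by key and sets aside the special cases described above. The function names and the sample roll (including its voter IDs) are made up for illustration and do not come from real data.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical Senate roll with a VP tie-breaker and one unidentified voter.
SAMPLE_ROLL = """
<roll where="senate">
  <option key="+">Yea</option>
  <option key="-">Nay</option>
  <option key="P">Present</option>
  <option key="0">Not Voting</option>
  <voter id="300001" vote="+" value="Yea" state="AL"/>
  <voter id="0" VP="1" vote="+" value="Yea"/>
  <voter id="300002" vote="-" value="Nay" state="AK"/>
  <voter id="0" vote="0" value="Not Voting" state="AZ"/>
</roll>
"""


def tally_votes(roll):
    """Map each vote key to (option text, number of voter elements)."""
    labels = {o.get("key"): (o.text or "").strip() for o in roll.findall("option")}
    counts = Counter(v.get("vote") for v in roll.findall("voter"))
    return {key: (labels.get(key, key), n) for key, n in counts.items()}


def vp_vote(roll):
    """Return the Vice President's vote key, or None if no tie-breaker."""
    for voter in roll.findall("voter"):
        if voter.get("VP") == "1":
            return voter.get("vote")
    return None


def identified_voter_ids(roll):
    """IDs of voters that could be matched to a person (id is not "0")."""
    return [v.get("id") for v in roll.findall("voter") if v.get("id") != "0"]


roll = ET.fromstring(SAMPLE_ROLL)
print(tally_votes(roll))
print(vp_vote(roll), identified_voter_ids(roll))
```

Voters with id "0" are still counted in the tally, matching the note above that such nodes are left in so that the totals stay correct.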