Getting Bulk Raw Data with Rsync
To download GovTrack’s raw data files in bulk, or if you plan to regularly update the files, you should use the rsync tool. Rsync is good for selecting which directory of data you want and keeping your files up to date by only downloading changes on each update.
Using rsync is pretty easy on Linux and Mac if you are comfortable with the command line. It is harder on Windows. Windows users may prefer the GovTrack API.
rsync on Linux and Mac
Once rsync is installed, type the following on the command line:
rsync -avz --delete --delete-excluded govtrack.us::govtrackdata/us/112/bills .
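That is all one line. Note the double colons after the host name and the period at the end. The period is the destination and tells rsync to download into the current directory.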
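The command above can be wrapped in a small shell function so the Congress number, data directory, and destination are easy to change. This is only a sketch: the names sync_govtrack, RSYNC, and DRY_RUN are hypothetical and not part of GovTrack or rsync. By default it just prints the rsync command it would run. Because --delete removes local files, the optional --dry-run flag is a cautious way to preview a transfer first.

```shell
#!/bin/sh
# Minimal sketch. RSYNC, DRY_RUN, and sync_govtrack are hypothetical names.
# By default the command is only printed; set RSYNC=rsync to really transfer.
RSYNC="${RSYNC:-echo rsync}"

sync_govtrack() {
    congress="$1"   # number in the data path, e.g. 112
    subdir="$2"     # data directory under that number, e.g. bills
    dest="$3"       # local destination, e.g. . for the current directory

    extra=""
    # --dry-run makes rsync list what it would change without changing
    # anything, a useful check before --delete removes local files.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        extra="--dry-run"
    fi

    # -a  archive mode (recursive, preserves times and permissions)
    # -v  verbose output, -z  compress data in transit
    # --delete           remove local files no longer on the server
    # --delete-excluded  also remove local files matching exclude patterns
    $RSYNC -avz --delete --delete-excluded $extra \
        "govtrack.us::govtrackdata/us/${congress}/${subdir}" "$dest"
}

sync_govtrack 112 bills .
```

Running the script with RSYNC=rsync set in the environment performs the real transfer, and adding DRY_RUN=1 previews it without changing any files.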
rsync on Windows
On Windows, install DeltaCopy, which contains rsync for Windows. Then on a command line type:
mkdir C:\GovTrackData
cd "\Program Files\Synametrics Technologies\DeltaCopy"
rsync -avz --delete govtrack.us::govtrackdata/us/112/bills /GovTrackData
Note that you have to give the path to your GovTrackData directory without the drive letter (/GovTrackData, not C:\GovTrackData). There are no drive letters in the Unix world, and rsync treats a name followed by a colon as a remote host, so it would interpret "C:" as something other than a drive. Watch out for the double colons in the middle of the source path.
This will put bill XML files in either C:\GovTrackData\bills or C:\cygwin\GovTrackData\bills. Cygwin is the name of a common Windows wrapper around Unix tools. Which of the two locations is used depends on DeltaCopy, not GovTrack.