Website Source Code
The "front-end" website source code is the code behind the website that you see. It is based on ASP.NET, though it uses a very custom page generation system based on XSLT. This is only half of the source code that makes the site possible. The other half is the scrapers.
To get involved with the development of the front-end of GovTrack (i.e. this website that you see), you can set yourself up to run a version of the GovTrack website on your own computer. You will be able to modify the source code of the website, and if you make changes you can send a patch to me so I can integrate your changes into the production site.
The pages and source code of the site are licensed under the GNU AGPL. In short, you may modify the code, but if you distribute your modified version or run it as a public website, you must make your modifications publicly available.
I've only tested the steps below on Linux. In principle it should work on Mac OS X fine. It might work on Windows too, but I have no idea.
You will need Mono (including "mcs" and "xsp") and Subversion installed.
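The prerequisites above can be checked before going further. The following is a minimal sketch, not part of the GovTrack tooling: `check_tools` is a hypothetical helper name, and the list of commands to pass it (mono, mcs, xsp2, svn) is an assumption based on the tools this page mentions.

```shell
#!/bin/sh
# Report which of the named commands are missing from PATH.
# Returns 0 if all are present, 1 if any is missing.
# (check_tools is a hypothetical helper, not part of GovTrack.)
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call for the tools this page relies on:
# check_tools mono mcs xsp2 svn && echo "all prerequisites found"
```

Any command reported as missing needs to be installed through your system's package manager before continuing.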
Make a directory for GovTrack files.
Check out the website "page" files from the source repository. These files are the XSLT templates (.xpd) that generate the pages of the site. This will create a "www" directory. Then make sure a 'data' directory exists inside it.
svn co svn://occams.info/govtrack/website/www
cd www
mkdir data
On Windows the page caching system is broken. To disable it, edit www/web.config and delete the <appSettings> section.
Download the website "code" binary .NET DLLs. These are some helper routines for the front-end files.
wget http://www.govtrack.us/frontend_bin.tgz
tar -zxf frontend_bin.tgz
rm frontend_bin.tgz
At this point you can start the website in sandbox mode, running locally on your system. On Linux, just run the following from the command line, then open http://localhost:8080/index.xpd in your browser:
SANDBOX=1 xsp2
On Windows, you'll need to set the SANDBOX environment variable to 1 before starting xsp2. I think the following works, but I haven't tested it:
set SANDBOX=1
xsp2
The website will download data files from GovTrack's web server as it needs them and store them on disk for later use, so the running process needs write permission in the data directory. It will also connect remotely to GovTrack's MySQL database to access other information, so if you have a firewall, it must allow outgoing connections.
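Because the sandbox has to write into the data directory, it can help to check that before starting the server. This is only a sketch under assumptions: `dir_is_writable` and `start_sandbox` are hypothetical names, and the wrapper assumes it is run from inside the www directory on Linux.

```shell
#!/bin/sh
# Return 0 if the given path is a directory the current user can write to.
dir_is_writable() {
    [ -d "$1" ] && [ -w "$1" ]
}

# Hypothetical wrapper: refuse to start the sandbox if data/ is unusable.
# Intended to be run from inside the www directory.
start_sandbox() {
    if ! dir_is_writable data; then
        echo "data directory is missing or not writable" >&2
        return 1
    fi
    # Same launch command as shown earlier on this page.
    SANDBOX=1 xsp2
}
```

Calling start_sandbox in place of the bare launch command gives an immediate error message when the data directory is missing or not writable.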
The sandbox does not have access to the user profiles database, which means you cannot "log in" in the sandbox.
If you do a "svn update" to update your website source files, you may need to grab an updated frontend_bin.tgz file, since there may have been .NET code changes along with the changes in the pages.
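The update and the re-download can be combined into one step. The sketch below reuses the commands from earlier on this page and assumes it is run from inside the www directory, where frontend_bin.tgz was originally unpacked. The names `run`, `update_sandbox`, and the `DRY_RUN` variable are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Run a command, or just print it when DRY_RUN=1 is set.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Update the page templates, then refresh the helper DLLs.
# Stops at the first step that fails.
update_sandbox() {
    run svn update &&
    run wget http://www.govtrack.us/frontend_bin.tgz &&
    run tar -zxf frontend_bin.tgz &&
    run rm frontend_bin.tgz
}
```

Running it with DRY_RUN=1 set prints the four commands without executing them, which is a cheap way to confirm what the function would do.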
Some backend .NET code is used as helper functions to generate the pages of the website. The code is compiled to www/bin/GovTrackWeb.dll. To edit this code, check out the backend source files. You don't need to do this just to run the sandbox.
(cd out of the www directory)
svn co svn://occams.info/govtrack/website/src
After editing files, recompile the binary by running make:
cd src
make
Note that because GovTrack uses .NET 2.0 classes you need to start it with xsp2 (as indicated above), not xsp. (And if you are daring and try mod_mono, you need to add to your httpd.conf MonoServerPath default /usr/bin/mod-mono-server2.)
Once data files are downloaded, they won't be updated from GovTrack's server, so your files will go out of date. To update them efficiently, use this command:
rsync -az --existing govtrack.us::govtrackdata/us/ data/us/
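The --existing flag in the command above tells rsync to skip any file that is not already present on the receiving side, so only files the sandbox has previously downloaded get refreshed. A small wrapper makes that easy to repeat. This is a sketch: `refresh_data` is a hypothetical name, and its defaults are simply the source and destination from the command above.

```shell
#!/bin/sh
# Refresh only those files that already exist in DEST from SRC.
# With no arguments, uses the source and destination shown above.
# (refresh_data is a hypothetical helper, not part of GovTrack.)
refresh_data() {
    src=${1:-govtrack.us::govtrackdata/us/}
    dest=${2:-data/us/}
    rsync -az --existing "$src" "$dest"
}
```

Run from inside the www directory, refresh_data with no arguments is equivalent to the command above. Passing two local directories instead is a safe way to see the effect of --existing without contacting GovTrack's server.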
The sandbox won't download files that are needed only by the web browser, so PDFs for bills and automatically generated images like vote maps will not appear. If you really want all of the files, you can download them for a current session of Congress with the command below. It will download almost 500 megabytes, so for both your sake and mine, don't do this unless you specifically want the missing files. In fact, if you want to rsync regularly, see the source data page.

