Analysis Methodology
Statistical analyses of legislation and legislators provide context for the legislative process. Of all of the 10,000+ bills pending at any given time, our unique analyses help GovTrack visitors know what is relevant and what to pay attention to.
Ideology
The Ideology Analysis compares the sponsorship and cosponsorship patterns of Members of Congress.
Prognosis
The Prognosis Analysis looks at the factors that help or hurt a bill’s chance of getting out of committee and being enacted. It is based on a regression model.
Leadership
The Leadership Analysis looks at who is cosponsoring whose bills to see who the legislative leaders are. It’s a little like asking: if you scratch my back, will I scratch yours? The analysis is based on Google PageRank, the algorithm Google uses to order search results.
Ideology Analysis of Members of Congress
The ideology analysis assigns a left–right score to each Member of Congress based on their pattern of cosponsorship. The left–right score reflects the dominant ideological difference or differences among Members of Congress, which changes over time.
In a nutshell, Members of Congress who cosponsor similar sets of bills will get scores close together, while Members of Congress who cosponsor different sets of bills will have scores far apart. Members of Congress with similar political views will tend to cosponsor the same set of bills, or bills by the same set of authors, and conversely Members of Congress with different political views will tend to cosponsor different bills.
You can find this analysis on the pages for current Members of Congress and in the charts to the right which plot the ideology score on the horizontal axis and the leadership score on the vertical axis.
Overview
The data that goes into this analysis is a list of who sponsored or cosponsored which bills. The process doesn’t look at the content of the bills or the party affiliation or anything else about the Members of Congress, but it is able to infer underlying behavioral patterns, some of which correspond to real-world concepts like left-right ideology.
You’ll see in the charts on the right that the ideology analysis does a good job at separating the Democrats from the Republicans, and within each party the moderates from the extremes. If you wanted to know how your representatives stood in relation to their peers ideologically, this chart is a good place to start.
But keep in mind its limitations. Although we don’t report a margin of error, the scores fluctuate significantly over time because the analysis uses limited data and because legislating is a process that involves chance. In addition, while we sometimes refer to this as a left-right score, that’s something we attribute to the analysis after we see the results. It may be measuring something else, perhaps something more closely related to partisanship, and it may be affected by the popularity of a legislator, since the analysis looks at when legislators work together. Cosponsorship is also a low-risk legislative action that might not reflect how a legislator would vote when forced to make yes-or-no decisions. And our scores may be gamed by legislators who cosponsor bills with the intent to move their score to the left or right.
We first began publishing this analysis in 2004, then calling it a political spectrum. A similar analysis by Professor Keith Poole using voting records rather than cosponsorship produces similar results: see voteview.com. (As far as we know, we were the first to apply this sort of analysis to cosponsorship behavior.)
Methodology
The statistical method behind this analysis is Principal Components Analysis, a dimensionality reduction technique that reveals underlying patterns in data.
Here’s how it works: Form a matrix (a grid of numbers) with columns representing Members of Congress and rows also representing Members of Congress. Do this for the House and Senate separately. We include (co)sponsorship from the current and previous two Congresses, so between four and six years of data. For the Senate, you have a 100x100 table. In each cell of the table, put the number of times the senator for the row cosponsored a bill introduced by the senator for the column. Or if it's the same senator in the row and column, put in the number of bills he or she introduced. Then compute the singular value decomposition of the matrix (which is how Principal Components Analysis is often done).
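The matrix described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `cosponsorship_matrix`, the `(sponsor, cosponsors)` record layout, and the toy member names are hypothetical and are not taken from GovTrack's source code.

```python
import numpy as np

def cosponsorship_matrix(members, bills):
    """Build the member-by-member matrix described in the text.

    members: list of member identifiers (hypothetical toy names here).
    bills:   list of (sponsor, [cosponsors]) records (assumed layout).
    """
    index = {m: i for i, m in enumerate(members)}
    P = np.zeros((len(members), len(members)))
    for sponsor, cosponsors in bills:
        col = index[sponsor]
        # Diagonal cell: number of bills the member introduced.
        P[col, col] += 1
        for c in cosponsors:
            # Row = cosponsor, column = the bill's sponsor.
            P[index[c], col] += 1
    return P

members = ["A", "B", "C"]
bills = [("A", ["B", "C"]), ("A", ["B"]), ("B", [])]
P = cosponsorship_matrix(members, bills)
```

In this toy input, A introduced two bills, B cosponsored both of them, and C cosponsored one, so the first column of `P` holds 2, 2, and 1.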
Every square matrix has a singular value decomposition, which can be interpreted as a collection of sets of scores for the Members of Congress, each set ranking them along a different dimension. The dimensions are themselves ranked in order by how much of the original data they explain. We have found that the second dimension best corresponds with what people generally consider political ideology. We use the scores from that dimension in our charts.
The analysis is blind to actual information like what the legislation is about or what party each Member of Congress is affiliated with. In fact, there’s no guarantee that the scores have anything to do with liberal- or conservative-ness or any other standard frame for political ideology. All it tells us is how to spread Members of Congress out along a spectrum in a way that explains their record of cosponsorship. But in practice it captures left-right ideology very well.
Data
The ideology scores can be found in two CSV files sponsorshipanalysis_h.txt and sponsorshipanalysis_s.txt (House and Senate) over here.
Source Code
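Running this analysis is pretty simple in Python. Apart from the import, it is literally two lines. Assuming you have the cosponsorship matrix in P:

    import numpy
    u, s, vT = numpy.linalg.svd(P)  # singular value decomposition
    ideology = vT[1,:]              # scores along the second dimension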
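To see what the second dimension picks up, here is a small made-up example. The 4x4 matrix is invented for illustration (it is not real cosponsorship data): members 0 and 1 mostly cosponsor each other, and so do members 2 and 3.

```python
import numpy as np

# Toy cosponsorship matrix (invented numbers): two blocs of two members.
P = np.array([
    [5, 4, 1, 0],
    [4, 5, 0, 1],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

u, s, vT = np.linalg.svd(P)

# The first dimension is the same for everyone here (overall activity).
activity = vT[0, :]

# The second dimension separates the two blocs. Its overall sign is
# arbitrary, so only relative positions are meaningful.
ideology = vT[1, :]
```

In this toy case the first dimension gives every member the same magnitude, while the second gives members 0 and 1 one sign and members 2 and 3 the opposite sign. Which bloc ends up on the "left" is a labeling choice made after the fact.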
The full source code for this analysis can be found on github.
Citation
To cite our methodology and results, we recommend either of these:
GovTrack.us. 2013. Ideology Analysis of Members of Congress. Accessed at https://www.govtrack.us/about/analysis.
Tauberer, Joshua. 2012. Observing the Unobservables in the U.S. Congress, presented at Law Via the Internet 2012, Cornell Law School, October 2012.
References
For more on how to use singular value decomposition, check out:
Wall, Rechtsteiner, and Rocha. “Singular value decomposition and principal component analysis.” in A Practical Approach to Microarray Data Analysis. D.P. Berrar, W. Dubitzky, M. Granzow, eds. pp. 91-109, Kluwer: Norwell, MA (2003). LANL LA-UR-02-4001.
Leadership Analysis of Members of Congress
A leadership score is computed for each Member of Congress by looking at how often other Members of Congress cosponsor their bills — more or less. The analysis is based on PageRank, Google’s algorithm for ranking pages on the web.
The idea behind a leadership score is that if X cosponsors Y’s bills but Y does not cosponsor X’s bills, then X is a follower and Y is a leader relative to each other.
You can find this analysis on the pages for current Members of Congress.
The charts to the right plot the leadership score on the vertical axis and the ideology score on the horizontal axis.
There are some interesting things in this chart. There’s a distinct V-shape. Congressional leaders appear to be more extreme. There are some confounding effects to consider here. Leaders tend to be more senior members of Congress, they tend to be older, and they have had more time to participate in legislating. But somewhere among those factors there’s an interesting correlation to having an extreme political ideology.
These leadership and ideology scores give us a view into Congress that is normally hidden to us. We can’t observe leadership. We’re not there, in Congress, to see it. We’re not in the meetings where you can see relationships form. But those relationships are known to the representatives and senators. It’s obvious to them. They know whether they lead or follow. Their staff know. This is a sort of social knowledge that is locked within the institution of Congress, unless we get a little creative with how we try to observe it.
Overview
The data that goes into this analysis is a list of who sponsored or cosponsored which bills. The process doesn’t look at the content of the bills or anything else about the Members of Congress, but it is able to infer underlying behavioral patterns, some of which correspond to real-world concepts like leadership.
We first began publishing leadership scores in 2010. As far as we know, this analysis is unique to GovTrack.
Methodology
The inspiration for this analysis comes from Google’s PageRank algorithm, which governs how Google ranks the order of pages in its search results. Google’s method is widely known: the more links you get to your website from other websites, and the more links those other websites have, the higher your PageRank and the higher up in search results you appear.
Here’s how we apply it to Congress: the more Members of Congress that cosponsor Member X’s bills, and the more cosponsors those other Members of Congress have, the higher X’s leadership score.
We start by forming a matrix (a grid of numbers) with cosponsorship data. It is the same matrix as in the ideology analysis, so see the methodology section there for details. Then we run the PageRank algorithm on the matrix, which yields a new number for each Member of Congress. That is the leadership score.
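As a rough illustration of running PageRank on such a matrix, here is a minimal sketch. Several details are our assumptions rather than anything stated above: the diagonal is dropped, each member's row is normalized to sum to 1, members who cosponsor nothing are treated as cosponsoring everyone equally, and the conventional damping factor of 0.85 is used. The function name `leadership_scores` is hypothetical.

```python
import numpy as np

def leadership_scores(counts, damping=0.85, tol=1e-12, max_iter=1000):
    """PageRank-style scores from a cosponsorship count matrix.

    counts[i, j] = times member i cosponsored a bill by member j.
    The damping factor and normalization are assumptions of this sketch.
    """
    counts = np.asarray(counts, dtype=float).copy()
    np.fill_diagonal(counts, 0.0)  # ignore a member's own bills
    n = counts.shape[0]
    row_sums = counts.sum(axis=1, keepdims=True)
    # Each cosponsor spreads one unit of "endorsement" over the sponsors
    # they back; members who back no one spread it uniformly.
    safe = np.where(row_sums > 0, row_sums, 1.0)
    transition = np.where(row_sums > 0, counts / safe, 1.0 / n)
    x = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        y = (1.0 - damping) / n + damping * (transition.T @ x)
        if np.abs(y - x).sum() < tol:  # 1-norm convergence check
            break
        x = y
    return y

# Toy data: B and C cosponsor A's bills; C also cosponsors one of B's.
counts = [[0, 0, 0],
          [3, 0, 0],
          [2, 1, 0]]
scores = leadership_scores(counts)
```

With this toy input, A receives the highest score, B the next, and C the lowest, which matches the intuition that members whose bills attract cosponsors are the leaders.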
This analysis came from a suggestion by Joseph Barillari (whom GovTrack’s creator knew in college). (The original formulation of the score for Member of Congress X was the mean across all other Members of Congress Y of the log of the number of bills sponsored by X and cosponsored by Y divided by the number of bills sponsored by Y and cosponsored by X.)
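Reading that description as the log of a ratio, the original formulation can be written as follows. Here $c(X, Y)$ is our own shorthand (not GovTrack's notation) for the number of bills sponsored by X and cosponsored by Y, and $n$ is the number of Members of Congress in the chamber:

```latex
\mathrm{score}(X) = \frac{1}{n - 1} \sum_{Y \neq X} \log \frac{c(X, Y)}{c(Y, X)}
```

A positive term means Y cosponsors X's bills more often than X cosponsors Y's. The description does not say how pairs with a zero count were handled.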
Data
The leadership scores can be found in two CSV files sponsorshipanalysis_h.txt and sponsorshipanalysis_s.txt (House and Senate) over here.
Source Code
Here is pseudo-code in Python. Assuming you have the cosponsorship matrix in P:
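    import numpy

    def onenorm(u):
        # 1-norm: the sum of the absolute values of the entries
        return numpy.abs(u).sum()

    N = P.shape[0]                      # number of Members of Congress
    x = numpy.ones((N, 1)) / float(N)   # start with equal scores
    while True:
        y = numpy.dot(P, x)             # one PageRank step
        if onenorm(y - x) < .00000000001:
            break                       # scores have stopped changing
        x = y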
The full source code for this analysis can be found on github.
Citation
To cite our methodology and results, we recommend either of these:
GovTrack.us. 2013. Leadership Analysis of Members of Congress. Accessed at https://www.govtrack.us/about/analysis.
Tauberer, Joshua. 2012. Observing the Unobservables in the U.S. Congress, presented at Law Via the Internet 2012, Cornell Law School, October 2012.
References
Kamvar, Sep. 2010. Numerical algorithms for personalized search in self-organizing information networks. Princeton University Press.
Text Incorporation
An analysis we incorporated into GovTrack in 2016 reveals when provisions of bills are incorporated into other bills. Our new tool will reveal much more about what Congress is doing, and what laws are being made, than has ever been known to the general public.
All too often Congress cuts bills apart and pastes them back together — sometimes into an “omnibus.” The bills that finally get a vote are an amalgam of provisions from other bills that either can’t or won’t get a standalone vote themselves. The most important legislation is crafted this way.
Congress and the President may not be enacting many new laws by the numbers, but those new laws come from an intricate web of connections that the general public has not been able to see until now. This isn’t just a matter of discovery. It is a window into how Congress really works, the processes that only insiders are normally able to see.
Our text incorporation analysis finds provisions of bills that are incorporated into enacted legislation. You can trace enacted bills back to the original legislation where provisions were introduced, and you can now see when bills that appear to have died have instead been incorporated into other legislative vehicles.
Only about 3% of bills will be enacted through the signature of the President or a veto override. Another 1% are identical to those bills, so-called “companion bills,” which are identified by the Congressional Research Service. Our new analysis reveals almost another 3% of bills which had substantial parts incorporated into an enacted bill in 2015–2016. To miss that last 3% is to be practically 100% wrong about how many bills are being enacted by Congress.
For further details, see How a complex network of bills becomes a law: Introducing a new data analysis of text incorporation!


The following information pertained to our prognosis analysis until October 2016, when we began showing predictions by Skopos Labs. You may find the description of our old analysis below informative, but it is no longer the methodology used on GovTrack.
Bill Prognosis Analysis
GovTrack computes a prognosis for each bill, which is the probability that the bill will be enacted. Our computation is based on factors that are correlated with successful or failed bills in the past, such as whether the sponsor is a committee chair.
What is the point of this?
- More than 10,000 bills will be considered by each Congress. About 7% will become law. Which bills should we focus on?
- Representatives and senators, their staff, and lobbyists all know what bills are important because they have the institutional knowledge of what makes a bill important. The prognosis highlights the factors that make a bill successful.
The prognosis scores can be found on the pages for bills throughout the site.
Overview
The data that goes into this analysis is a set of factors that we compute for bills, such as whether the sponsor is a committee chair (see right for a full list), along with whether the bill was successful. We “train” the model on bills from the 113th Congress (2013-2015) to compute probabilities for bills in the current Congress.
We first began publishing prognosis scores in 2012. As far as we know, we were the first to apply this analysis to Congressional bills.
Methodology
This analysis is based on a logistic regression. Logistic regression is similar to simple linear regression but it is more appropriate when modeling probabilities. We create eight separate models: For each of the four types of legislative measures (bills, joint resolutions, concurrent resolutions, and simple resolutions), we compute one model that predicts whether the bill/resolution will get out of committee and a separate model that computes, for bills/resolutions out of committee, whether the bill/resolution will be enacted or agreed to.
The independent variables are the binary factors mentioned above and listed in the factors table at the right.
The dependent variable is how successful the bill or resolution was. When predicting whether a bill or resolution will make it out of committee, it is a binary variable. When predicting whether a bill will be enacted or a resolution agreed to, this is a continuous variable computed as the percentage of paragraphs in the bill that appear in any enacted bill (and similarly for resolutions). We do this because there are often identical bills in Congress (so-called companion bills) and often bills are incorporated into other bills (such as omnibus bills), and we want to give the original bills credit for being successful even if the original bill itself is not enacted per se.
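The continuous version of the dependent variable can be sketched as below. This is a simplification under our own assumptions: paragraphs are compared after lowercasing and collapsing whitespace, and only exact matches count. How GovTrack actually matched paragraphs is not described here, and the function names are hypothetical.

```python
def normalize(paragraph):
    # Assumed normalization: lowercase and collapse runs of whitespace.
    return " ".join(paragraph.lower().split())

def incorporated_fraction(bill_paragraphs, enacted_paragraphs):
    """Fraction of a bill's paragraphs that appear in enacted text."""
    if not bill_paragraphs:
        return 0.0
    enacted = {normalize(p) for p in enacted_paragraphs}
    hits = sum(1 for p in bill_paragraphs if normalize(p) in enacted)
    return hits / len(bill_paragraphs)

# Made-up text for illustration only.
bill = ["Section 1. Short title.",
        "The Secretary shall report.",
        "Funds are authorized."]
enacted = ["the secretary   shall report.", "Something else."]
fraction = incorporated_fraction(bill, enacted)
```

Here one of the bill's three paragraphs shows up in the enacted text, so the bill gets partial credit of one third rather than a flat zero.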
The outputs of the logistic regression models are weights assigned to the factors, called β in the table at the right. The prognosis score for a bill is computed by adding up the weights of all of the factors that apply to the bill and passing the sum through the logistic function, which is equivalent to multiplying together the odds multipliers of those factors (more or less, see logistic regression on Wikipedia for details). The result is a number that can be interpreted as a probability.
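As a minimal sketch of how such weights turn into a probability, the function below adds up the β values of the factors that apply and passes the sum through the logistic function. The β values in the example come from the results tables for bills sent out of committee; the intercept of -2.0 is a made-up placeholder, since the model's intercept is not published here.

```python
import math

def prognosis(betas, intercept=0.0):
    """Logistic-regression probability from the weights that apply."""
    z = intercept + sum(betas)
    return 1.0 / (1.0 + math.exp(-z))

HYPOTHETICAL_INTERCEPT = -2.0  # placeholder, not GovTrack's value

# Sponsor is a relevant committee chair (1.8), cosponsors from both parties (0.6).
strong = prognosis([1.8, 0.6], HYPOTHETICAL_INTERCEPT)

# Sponsor is a member of the minority party (-0.5).
weak = prognosis([-0.5], HYPOTHETICAL_INTERCEPT)
```

Each unit of β multiplies the odds of success by e, which is why the very large negative weights in the tables (around -30) drive the probability to practically zero.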
In choosing the factors for the model, we select from a large set of plausible factors those which appear to be statistically significant on their own (using a binomial distribution). After the logistic regression, we remove factors that appear statistically non-significant and re-compute the model.
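One way to screen a single factor with a binomial distribution is sketched below. The exact test GovTrack used is not described, so this is an assumption: it asks how likely it would be to see at least the observed number of successes among bills with the factor if they succeeded only at the overall base rate. The function name is hypothetical.

```python
from math import comb

def binomial_upper_tail(successes, n, base_rate):
    """P(at least `successes` out of n) under the base success rate."""
    return sum(
        comb(n, k) * base_rate**k * (1.0 - base_rate) ** (n - k)
        for k in range(successes, n + 1)
    )

# Example from the results tables: 67 postal-facility bills, 72% of which
# got out of committee, against an overall rate of about 15%.
n = 67
successes = round(0.72 * n)
p_value = binomial_upper_tail(successes, n, 0.15)
```

A tiny tail probability suggests the factor is worth including. A factor that hurts a bill's chances would be screened with the lower tail instead.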
Citation
To cite our methodology and results, we recommend either of these:
GovTrack.us. 2013. Bill Prognosis Analysis. Accessed at https://www.govtrack.us/about/analysis.
Tauberer, Joshua. 2012. Observing the Unobservables in the U.S. Congress, presented at Law Via the Internet 2012, Cornell Law School, October 2012.
References
Here is some academic work on the same subject:
Tae Yano, Noah A. Smith, and John D. Wilkerson. 2012. "Textual Predictors of Bill Survival in Congressional Committees," at New Directions in Analyzing Text as Data 2012, 5-6 October at Harvard.
John Wilkerson, David Smith, Nick Stramp, and Jeremy Dashiell. 2013. "Tracing the Flow of Policy Ideas in Legislatures: A Computational Approach".
Results
The following tables show how various factors help or hurt a bill or resolution’s chance of making it out of committee and getting enacted (or agreed to). Two tables are given for each of the four bill types.
In the tables, N is the number of bills/resolutions in the training corpus that had the indicated factor; %S is the percentage of those bills/resolutions that were successful (got past committee or were enacted); and β is the regression coefficient (weight) from the prognosis analysis. Higher weights increase the bill or resolution’s probability of success.
Bills sent out of committee to the floor
Overall, about 15% of the 8,905 bills in 2013-2015 were sent out of committee to the floor. The following factors help or hurt that:
N | %S | β | Factor |
---|---|---|---|
67 | 72% | 2.5 | Title starts with "To designate the facility of the United States Postal". |
27 | 48% | 1.9 | Title starts with "A bill to designate the". |
534 | 55% | 1.8 | Sponsor is a relevant committee chair. |
286 | 60% | 1.6 | Got past committee in a previous Congress. |
30 | 53% | 1.6 | Referred to Senate Appropriations (incl. companion). |
799 | 46% | 1.4 | A cosponsor is a relevant committee chair. |
70 | 49% | 1.2 | Referred to Senate Indian Affairs (incl. companion). |
158 | 22% | 1.0 | Referred to House Appropriations (incl. companion). |
802 | 34% | 0.9 | Referred to House Natural Resources (incl. companion). |
151 | 23% | 0.9 | On a companion bill: A cosponsor is a relevant committee chair. |
412 | 34% | 0.7 | Referred to Senate Energy and Natural Resources (incl. companion). |
999 | 28% | 0.7 | A cosponsor is a relevant committee ranking member. |
439 | 23% | 0.6 | Has a companion bill sponsored by a member of the other party. |
2,312 | 20% | 0.6 | Has cosponsors from both parties. |
725 | 28% | 0.5 | Sponsor is in majority party and 1/3rd+ of cosponsors are in minority party. |
2,497 | 26% | 0.5 | Sponsor is on a relevant committee & in majority party. |
1,650 | 24% | 0.3 | Cosponsor has high leadership score (majority party). |
3,335 | 20% | -0.2 | 2 or more cosponsors are on a relevant committee. |
345 | 11% | -0.4 | Introduced in the last 90 days of the Congress (incl. companion bills). |
4,045 | 8% | -0.5 | Sponsor is a member of the minority party. |
229 | 8% | -0.7 | Referred to House House Administration (incl. companion). |
1,501 | 6% | -0.7 | Referred to House Ways and Means (incl. companion). |
391 | 11% | -0.7 | Referred to House Veterans' Affairs (incl. companion). |
1,032 | 11% | -0.8 | Referred to House Judiciary (incl. companion). |
387 | 11% | -0.8 | Referred to Senate Judiciary (incl. companion). |
778 | 5% | -0.8 | Referred to House Education and the Workforce (incl. companion). |
1,233 | 9% | -0.8 | Referred to House Energy and Commerce (incl. companion). |
572 | 6% | -0.9 | Referred to Senate Health, Education, Labor, and Pensions (incl. companion). |
2,228 | 6% | -1.0 | Is a bill reintroduced from a previous Congress. |
401 | 6% | -1.1 | Referred to House Armed Services (incl. companion). |
120 | 8% | -1.1 | Referred to House Rules (incl. companion). |
726 | 4% | -1.3 | Referred to Senate Finance (incl. companion). |
79 | 4% | -1.8 | Referred to Senate Agriculture, Nutrition, and Forestry (incl. companion). |
121 | 2% | -1.9 | Referred to Senate Armed Services (incl. companion). |
32 | 0% | -30.1 | Title starts with "A bill for the relief of". |
29 | 0% | -30.2 | Title starts with "A bill to amend the Internal Revenue Code of". |
Simple resolutions sent out of committee to the floor
Overall, about 46% of the 1,385 simple resolutions in 2013-2015 were sent out of committee to the floor. The following factors help or hurt that:
N | %S | β | Factor |
---|---|---|---|
21 | 100% | 33.2 | Title starts with "A resolution to authorize". |
99 | 98% | 4.7 | Title starts with "Providing for consideration of". |
44 | 86% | 2.9 | Title starts with "A resolution recognizing the". |
18 | 22% | 2.2 | On a companion bill: Sponsor has a high leadership score (majority party). |
20 | 90% | 2.1 | Title starts with "A resolution commemorating the". |
44 | 89% | 1.8 | Got past committee in a previous Congress. |
97 | 56% | 1.6 | Sponsor is a relevant committee chair. |
32 | 84% | 1.1 | Title starts with "A resolution congratulating the". |
111 | 33% | 0.8 | A cosponsor is a relevant committee ranking member. |
276 | 35% | 0.6 | Cosponsor has high leadership score (majority party). |
101 | 56% | -0.7 | Referred to Senate Foreign Relations (incl. companion). |
86 | 57% | -0.8 | Referred to Senate Judiciary (incl. companion). |
580 | 26% | -0.9 | Sponsor is a member of the minority party. |
393 | 24% | -0.9 | 2 or more cosponsors are on a relevant committee. |
30 | 23% | -1.2 | Referred to House Ways and Means (incl. companion). |
16 | 19% | -1.4 | Has a companion bill sponsored by a member of the other party. |
26 | 19% | -2.0 | Title starts with "A resolution expressing the sense of the Senate that". |
179 | 21% | -2.4 | Referred to House Foreign Affairs (incl. companion). |
149 | 72% | -2.6 | Referred to House Rules (incl. companion). |
31 | 16% | -2.8 | Referred to House Armed Services (incl. companion). |
45 | 20% | -3.0 | Referred to Senate Health, Education, Labor, and Pensions (incl. companion). |
63 | 6% | -3.2 | Referred to House Judiciary (incl. companion). |
120 | 3% | -3.4 | Is a bill reintroduced from a previous Congress. |
84 | 1% | -3.7 | Title starts with "Expressing the sense of the House of Representatives that". |
52 | 13% | -4.2 | Referred to Senate Rules and Administration (incl. companion). |
96 | 2% | -4.4 | Referred to House Energy and Commerce (incl. companion). |
110 | 3% | -4.5 | Referred to House Oversight and Government Reform (incl. companion). |
78 | 1% | -4.8 | Referred to House Education and the Workforce (incl. companion). |
21 | 0% | -33.2 | Title starts with "Expressing support for designation of the". |
20 | 0% | -34.2 | Title starts with "Supporting the goals and ideals of National". |
21 | 0% | -35.3 | Title starts with "Expressing support for the". |
Bills enacted
Overall, about 21% of the 1,333 bills that got past committee in 2013-2015 were enacted. The following factors help or hurt that:
N | %S | β | Factor |
---|---|---|---|
48 | 72% | 1.8 | Title starts with "To designate the facility of the United States Postal". |
29 | 53% | 1.3 | Referred to House Budget (incl. companion). |
36 | 56% | 1.2 | Referred to Senate Health, Education, Labor, and Pensions (incl. companion). |
205 | 44% | 1.0 | Sponsor is in majority party and 1/3rd+ of cosponsors are in minority party. |
50 | 53% | 1.0 | Referred to Senate Homeland Security and Governmental Affairs (incl. companion). |
34 | 46% | 0.9 | Referred to House Appropriations (incl. companion). |
27 | 48% | 0.9 | Referred to Senate Finance (incl. companion). |
25 | 48% | 0.9 | Referred to Senate Banking, Housing, and Urban Affairs (incl. companion). |
28 | 42% | 0.9 | Referred to Senate Foreign Relations (incl. companion). |
34 | 41% | 0.8 | Referred to Senate Indian Affairs (incl. companion). |
140 | 40% | 0.8 | Referred to Senate Energy and Natural Resources (incl. companion). |
50 | 37% | 0.7 | Referred to Senate Commerce, Science, and Transportation (incl. companion). |
110 | 44% | 0.6 | Referred to House Energy and Commerce (incl. companion). |
322 | 41% | 0.6 | Sponsor is a member of the minority party. |
282 | 39% | 0.5 | A cosponsor is a relevant committee ranking member. |
456 | 32% | 0.3 | Has cosponsors from both parties. |
142 | 31% | -0.4 | Is a bill reintroduced from a previous Congress. |
654 | 29% | -0.8 | 2 or more cosponsors are on a relevant committee. |
Simple resolutions agreed to
Overall, about 96% of the 634 simple resolutions that got past committee in 2013-2015 were agreed to. The following factors help or hurt that:
N | %S | β | Factor |
---|---|---|---|
54 | 82% | -1.9 | Sponsor is a relevant committee chair. |
96 | 84% | -2.3 | 2 or more cosponsors are on a relevant committee. |
Joint resolutions sent out of committee to the floor
Overall, about 19% of the 178 joint resolutions in 2013-2015 were sent out of committee to the floor. The following factors help or hurt that:
N | %S | β | Factor |
---|---|---|---|
21 | 57% | 4.6 | Sponsor is a relevant committee chair. |
38 | 50% | 3.8 | Sponsor is on a relevant committee & in majority party. |
57 | 2% | -4.2 | Introduced in the first 90 days of the Congress (incl. companion bills). |
21 | 0% | -37.9 | A cosponsor is a relevant committee ranking member. |
56 | 0% | -40.3 | Title starts with "Proposing an amendment to the Constitution of the United". |
Concurrent resolutions sent out of committee to the floor
Overall, about 39% of the 169 concurrent resolutions in 2013-2015 were sent out of committee to the floor. The following factors help or hurt that:
N | %S | β | Factor |
---|---|---|---|
15 | 93% | 3.1 | Got past committee in a previous Congress. |
65 | 18% | -1.7 | Sponsor is a member of the minority party. |
18 | 11% | -2.0 | Has a companion bill in the other chamber. |
17 | 0% | -35.6 | Referred to House Oversight and Government Reform (incl. companion). |
25 | 0% | -36.6 | Title starts with "Expressing the sense of Congress that". |
Concurrent resolutions agreed to
Overall, about 83% of the 66 concurrent resolutions that got past committee in 2013-2015 were agreed to. The following factors help or hurt that:
There were no statistically significant factors in the model.
Joint resolutions enacted or passed
Overall, about 42% of the 33 joint resolutions that got past committee in 2013-2015 were enacted or passed. The following factors help or hurt that:
There were no statistically significant factors in the model.
Did it work?
The following charts compare the prognoses computed for bills to their actual rate of success. The prognosis model for these charts was trained on the 112th Congress and tested on the 113th Congress.
For each regression model, the bills are divided into 10 bins by prognosis. The median prognosis is plotted on the horizontal axis and the percentage of successful bills in the bin is plotted on the vertical axis.
The prognosis closely estimates the actual chances of a bill getting out of committee. Though the accuracy is much less for other predictions, the rough upward slope in most of the charts shows that the prognosis was often predictive of a bill’s future.
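The binning behind these charts can be sketched as follows. We assume equal-count bins here (the text does not say whether the bins are equal-count or equal-width), and the function name and synthetic data are ours.

```python
import numpy as np

def calibration_bins(prognosis, succeeded, n_bins=10):
    """Return (median prognosis, success rate) for each bin."""
    prognosis = np.asarray(prognosis, dtype=float)
    succeeded = np.asarray(succeeded, dtype=float)
    order = np.argsort(prognosis)  # sort bills by predicted probability
    points = []
    for idx in np.array_split(order, n_bins):  # assumed: equal-count bins
        if len(idx) == 0:
            continue
        points.append((float(np.median(prognosis[idx])),
                       float(succeeded[idx].mean())))
    return points

# Synthetic data: 100 bills, where every bill scored above 0.5 succeeds.
scores = np.linspace(0.0, 1.0, 100)
outcomes = scores > 0.5
points = calibration_bins(scores, outcomes)
```

For a well-calibrated model, the points fall near the diagonal: a bin whose median prognosis is 30% should contain roughly 30% successful bills.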
[Charts not shown: one chart per model, for bills sent out of committee to the floor, simple resolutions sent out of committee to the floor, bills enacted, simple resolutions agreed to, joint resolutions sent out of committee to the floor, concurrent resolutions sent out of committee to the floor, concurrent resolutions agreed to, and joint resolutions enacted or passed. Horizontal axis: median prognosis. Vertical axis: percentage of successful bills.]
Here are some additional charts for machine learning researchers.
The charts below show precision vs. recall plotted parametrically for various values of a success-fail threshold t. Bills with prognosis above t are predicted successes for the purposes of these charts. The prognosis model for these charts was trained on the 112th Congress and tested on the 113th Congress.
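For readers who want to reproduce a point on such a curve, here is a minimal sketch of precision and recall at a single threshold t. The function name and the four-bill example are made up for illustration.

```python
def precision_recall(prognosis, succeeded, t):
    """Precision and recall when bills with prognosis above t are
    predicted to succeed. Returns None where a value is undefined."""
    predicted = [p > t for p in prognosis]
    tp = sum(1 for pred, s in zip(predicted, succeeded) if pred and s)
    fp = sum(1 for pred, s in zip(predicted, succeeded) if pred and not s)
    fn = sum(1 for pred, s in zip(predicted, succeeded) if not pred and s)
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    return precision, recall

# Made-up prognoses and outcomes for four bills.
scores = [0.9, 0.8, 0.3, 0.1]
outcomes = [True, False, True, False]
high = precision_recall(scores, outcomes, 0.5)
low = precision_recall(scores, outcomes, 0.2)
```

Lowering the threshold from 0.5 to 0.2 catches the successful bill that was missed, so recall rises. Sweeping t across its range and plotting each (recall, precision) pair traces out the curve.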