Sunday, August 28, 2016

Mapping Congressional Cosponsorship Over Time

A recently developed interest of mine has been visualizing and studying social networks. At first, I looked to revisit the Instagram API I used in the past to study follower/following relationships amongst verified accounts (who doesn't want to know if Taylor Swift is really the most central celebrity ever?). Unfortunately, Instagram has now closed off their API, so I found a more timely dataset to analyze: sponsorships and cosponsorships of legislation.
This article mainly introduces the dataset, visualizes the networks over time, and does some preliminary analysis. Much deeper analysis will follow in the future.

The Data
The data is all available through the GovTrack API, which tracks all legislation that goes through the House and Senate. All code for this article is found on GitHub, and interactive visualizations can be found on Plotly, a visualization tool that can layer interactivity onto regular matplotlib graphs.

I tracked sponsorships and cosponsorships of House and Senate bills and joint resolutions from Congress #100 (1987-89) to Congress #114 so far (2015-17). I disregarded approval status because the goal is simply to study the social dynamics of which legislators support one another, regardless of the outcome of the bill. Simple resolutions, which do not have the force of law, are excluded.

Visualizing Networks

All networks are constructed as directed graphs. Nodes represent congresspeople who have sponsored legislation, and edges lead into these nodes from the congresspeople who cosponsor, or support, their legislation. Edge weights count the number of times a cosponsor has cosponsored a particular sponsor's legislation, so more cosponsorships from Person B to Person A produce a greater weight on the edge from B to A. Node weights are the total number of cosponsorships a person receives, that is, the weighted in-degree of the node, so more cosponsorships from anyone to Person A produce a greater node weight for Person A. In the Senate, a bill may have multiple sponsors, whereas in the House it may not. Thus, most of the analysis that follows focuses on cosponsorships, which give a more direct basis of comparison between the two chambers.
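As a minimal sketch of the construction described above, the edge and node weights can be tallied with the standard library alone. The `bills` records and the legislator names here are made up for illustration; the real GovTrack data has its own format, which this does not attempt to reproduce.

```python
from collections import Counter, defaultdict

# Hypothetical records: (sponsor, [cosponsors]) for each bill.
bills = [
    ("A", ["B", "C"]),
    ("A", ["B"]),
    ("B", ["C"]),
]


def build_edge_weights(bills):
    """Map (cosponsor, sponsor) -> number of bills cosponsored."""
    weights = Counter()
    for sponsor, cosponsors in bills:
        for cosponsor in cosponsors:
            if cosponsor != sponsor:  # skip self-loops
                weights[(cosponsor, sponsor)] += 1
    return weights


def weighted_in_degree(weights):
    """Node weight: total cosponsorships received by each sponsor."""
    totals = defaultdict(int)
    for (_, sponsor), w in weights.items():
        totals[sponsor] += w
    return dict(totals)


edge_weights = build_edge_weights(bills)
node_weights = weighted_in_degree(edge_weights)
```

Here the edge from B to A has weight 2 because B cosponsored two of A's bills, and A's node weight is the sum of all weights on edges pointing into A.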

All node positions are computed with force-directed drawing algorithms, which place pairs of nodes with greater edge weights closer together and those with lesser edge weights further apart. In other words, nodes that sit close together share more cosponsorships. Most of these graphs show a tight cluster of nodes in the center, meaning those legislators share many cosponsorships with many of the other nodes in the center. The most recent Congress (#114) is shown directly below; historical Congresses are in the links that follow.

We also look at average eigenvector and in-degree centralities over time for the top 50 senators and representatives. Each legislator's average is taken over the number of terms they served.

In-degree centrality of a node is proportional to the in-degree of that node, so it measures how effective a congressperson is at attracting cosponsors. Eigenvector centrality also depends on the number of connections, but additionally weights each connection by its own centrality: $C_{E}(v_{i}) = \frac{1}{\lambda}\sum_{j \neq i} A_{j,i}\,C_{E}(v_{j})$, where $A$ is the square adjacency matrix of the network and $\lambda$ is some constant. If we write $C_{E}(v)$ for the vector of eigenvector centralities of all nodes, then $\lambda C_{E}(v) = A^{T}C_{E}(v)$, so $\lambda$ is an eigenvalue of $A^{T}$ and $C_{E}(v)$ is the corresponding eigenvector. In this context, the most central senators and representatives are those connected to influential senators and representatives.
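To make the two measures concrete, here is a small sketch using power iteration, where each node's score is repeatedly replaced by the weighted sum of its cosponsors' scores. This is an illustration under stated assumptions, not necessarily how the figures in this article were computed: the toy `edges` and `nodes` are invented, in-degree centrality is taken as the fraction of other nodes that cosponsor a node, and the iteration adds the previous vector at each step (a shift by the identity, which leaves the eigenvectors unchanged) to help convergence.

```python
import math

# Hypothetical toy network: (cosponsor, sponsor) -> cosponsorship count.
edges = {
    ("B", "A"): 2,
    ("C", "A"): 1,
    ("A", "B"): 1,
    ("C", "B"): 1,
    ("A", "C"): 1,
}
nodes = ["A", "B", "C"]


def in_degree_centrality(edges, nodes):
    """Fraction of the other nodes that cosponsor each node (unweighted)."""
    supporters = {v: set() for v in nodes}
    for src, dst in edges:
        supporters[dst].add(src)
    return {v: len(s) / (len(nodes) - 1) for v, s in supporters.items()}


def eigenvector_centrality(edges, nodes, max_iter=1000, tol=1e-10):
    """Power iteration on A^T: a node scores highly when its cosponsors do."""
    x = {v: 1.0 / len(nodes) for v in nodes}
    for _ in range(max_iter):
        new = dict(x)  # start from x, i.e. iterate with (I + A^T)
        for (src, dst), w in edges.items():
            new[dst] += w * x[src]
        norm = math.sqrt(sum(val * val for val in new.values()))
        new = {v: val / norm for v, val in new.items()}
        if sum(abs(new[v] - x[v]) for v in nodes) < tol:
            return new
        x = new
    raise RuntimeError("power iteration did not converge")


idc = in_degree_centrality(edges, nodes)
ec = eigenvector_centrality(edges, nodes)
```

In this toy network A and B have the same in-degree centrality, but A ends up with the higher eigenvector centrality because its support carries more weight and comes from better-connected nodes.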


Senate #100-114

| Senator | Eigenvector Centrality | Senator | In-Degree Centrality |
|---|---|---|---|
| Orrin Hatch | 0.147 | Orrin Hatch | 0.781 |
| Charles Grassley | 0.140 | Patrick Leahy | 0.753 |
| Patrick Leahy | 0.139 | Max Baucus | 0.695 |
| Thomas Harkin | 0.134 | John McCain | 0.692 |
| Dianne Feinstein | 0.125 | John Rockefeller | 0.662 |
| Christopher Dodd | 0.116 | Thomas Harkin | 0.641 |
| Richard Durbin | 0.116 | Dianne Feinstein | 0.617 |
| Harry Reid | 0.105 | Harry Reid | 0.606 |
| Jeff Bingaman | 0.105 | Pete Domenici | 0.591 |
| Frank Lautenberg | 0.102 | Barbara Mikulski | 0.585 |
| John Rockefeller | 0.098 | Jeff Bingaman | 0.571 |
| Thomas Daschle | 0.097 | Edward Kennedy | 0.568 |
| Max Baucus | 0.097 | Mitch McConnell | 0.557 |
| John Kerry | 0.096 | Christopher Dodd | 0.547 |
| John McCain | 0.096 | John Kerry | 0.538 |
| Charles Schumer | 0.095 | Kay Hutchison | 0.532 |
| Olympia Snowe | 0.082 | Susan Collins | 0.530 |
| Daniel Moynihan | 0.077 | Joseph Biden | 0.498 |
| Barbara Boxer | 0.076 | Frank Lautenberg | 0.493 |
| Robert Dole | 0.076 | Christopher Bond | 0.491 |
| Pete Domenici | 0.074 | Thad Cochran | 0.486 |
| Susan Collins | 0.072 | Charles Schumer | 0.477 |
| Joseph Biden | 0.068 | Daniel Inouye | 0.476 |
| Barbara Mikulski | 0.068 | Richard Durbin | 0.473 |
| Robert Menéndez | 0.066 | Olympia Snowe | 0.472 |
| Kay Hutchison | 0.062 | Kent Conrad | 0.470 |
| John Chafee | 0.062 | Barbara Boxer | 0.469 |
| Daniel Inouye | 0.061 | Daniel Akaka | 0.460 |
| Joseph Lieberman | 0.060 | Joseph Lieberman | 0.458 |
| John Reed | 0.060 | John Warner | 0.444 |
| David Pryor | 0.059 | Carl Levin | 0.443 |
| Daniel Akaka | 0.058 | Richard Lugar | 0.442 |
| Byron Dorgan | 0.058 | Arlen Specter | 0.435 |
| Patty Murray | 0.057 | Byron Dorgan | 0.418 |
| George Mitchell | 0.054 | Trent Lott | 0.414 |
| Bob Graham | 0.054 | James Inhofe | 0.413 |
| Christopher Bond | 0.054 | Thomas Daschle | 0.406 |
| Alfonse D'Amato | 0.053 | Bob Graham | 0.391 |
| Alan Cranston | 0.053 | John Breaux | 0.390 |
| Arlen Specter | 0.052 | Ted Stevens | 0.386 |
| Sherrod Brown | 0.052 | Ron Wyden | 0.381 |
| James Jeffords | 0.051 | John Chafee | 0.381 |
| Howard Metzenbaum | 0.051 | Michael Enzi | 0.378 |
| Hillary Clinton | 0.050 | Ernest Hollings | 0.375 |
| Michael DeWine | 0.050 | Patty Murray | 0.369 |
| Ernest Hollings | 0.049 | Larry Craig | 0.367 |
| Trent Lott | 0.048 | Daniel Moynihan | 0.364 |
| Richard Lugar | 0.048 | Tim Johnson | 0.357 |
| Kent Conrad | 0.047 | Samuel Brownback | 0.355 |

House #100-114

| Representative | Eigenvector Centrality | Representative | In-Degree Centrality |
|---|---|---|---|
| Carolyn Maloney | 0.126 | Charles Rangel | 0.547 |
| Rosa DeLauro | 0.113 | Michael Bilirakis | 0.503 |
| Nita Lowey | 0.104 | Don Young | 0.479 |
| Charles Rangel | 0.104 | Nita Lowey | 0.470 |
| John Conyers | 0.103 | George Miller | 0.467 |
| Fortney Stark | 0.091 | John Conyers | 0.455 |
| Louise Slaughter | 0.087 | Elton Gallegly | 0.454 |
| Henry Waxman | 0.085 | Louise Slaughter | 0.439 |
| Nancy Johnson | 0.083 | Fred Upton | 0.433 |
| Michael Bilirakis | 0.083 | Carolyn Maloney | 0.423 |
| Christopher Smith | 0.081 | Nancy Johnson | 0.423 |
| Edward Markey | 0.081 | Peter King | 0.417 |
| John Lewis | 0.066 | Henry Waxman | 0.412 |
| Jerrold Nadler | 0.066 | John Lewis | 0.408 |
| Barney Frank | 0.066 | Rosa DeLauro | 0.408 |
| Barbara Lee | 0.065 | Ileana Ros-Lehtinen | 0.404 |
| Jim McDermott | 0.060 | Barney Frank | 0.404 |
| Lois Capps | 0.060 | Edward Markey | 0.402 |
| Lloyd Doggett | 0.059 | Eliot Engel | 0.400 |
| Peter King | 0.059 | Fortney Stark | 0.397 |
| Lane Evans | 0.057 | Sander Levin | 0.393 |
| Constance Morella | 0.055 | Dale Kildee | 0.387 |
| Eliot Engel | 0.053 | Bob Goodlatte | 0.379 |
| William Clay | 0.053 | E. Shaw | 0.373 |
| John Dingell | 0.053 | Sam Johnson | 0.368 |
| E. Shaw | 0.053 | Clifford Stearns | 0.367 |
| Don Young | 0.051 | F. Sensenbrenner | 0.366 |
| Peter DeFazio | 0.051 | Peter DeFazio | 0.356 |
| James McGovern | 0.050 | H. Saxton | 0.351 |
| Christopher Shays | 0.050 | Dave Camp | 0.347 |
| Bob Filner | 0.050 | Lois Capps | 0.342 |
| Maxine Waters | 0.049 | Lane Evans | 0.340 |
| Elton Gallegly | 0.049 | Bob Filner | 0.338 |
| Philip English | 0.047 | Gary Ackerman | 0.336 |
| Frank Pallone | 0.047 | Walter Jones | 0.332 |
| Ileana Ros-Lehtinen | 0.046 | Bill Pascrell | 0.331 |
| Gary Ackerman | 0.046 | Joe Barton | 0.327 |
| Sander Levin | 0.045 | Thomas Davis | 0.324 |
| Lynn Woolsey | 0.045 | Steny Hoyer | 0.323 |
| Rush Holt | 0.045 | James Moran | 0.322 |
| Patricia Schroeder | 0.044 | Mike Thompson | 0.317 |
| Charles Schumer | 0.044 | John Dingell | 0.316 |
| Benjamin Gilman | 0.043 | Kevin Brady | 0.314 |
| Barbara Kennelly | 0.043 | Anna Eshoo | 0.312 |
| Dale Kildee | 0.043 | Lamar Smith | 0.312 |
| Bernard Sanders | 0.042 | C. Cox | 0.312 |
| Fred Upton | 0.042 | Philip English | 0.311 |
| Mike Thompson | 0.041 | Jim McDermott | 0.308 |
| Amory Houghton | 0.041 | Constance Morella | 0.306 |


Next, we look at the divisiveness of the House and Senate by tracking modularity over time. For a given division of the network into groups, modularity essentially sums, over all pairs of nodes in the same group, the difference between the actual number of edges between them and the number expected if edges were placed at random. High modularity indicates that edge connections are not random. Instead, connections are dense within groups and sparse between groups. First, we look at the modularity given a partition along political party lines:

We can see that modularity in the House of Representatives is typically higher than in the Senate, a conclusion supported by Y. Zhang, 2009. This may be explained by the intuition that House elections are smaller and more local than statewide Senate elections, and thus are more likely to elect more partisan people. We can see modularity rise dramatically during the 112th Congress for both the Senate and the House, a period that has been deemed "the least productive since the Civil War" due to extreme polarization. This period followed the 2010 midterm elections, in which more partisan congresspeople were elected and Republicans took control of the House. This polarization was compounded by the fact that Obamacare was signed in March 2010, before these midterm elections.
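For readers who want to see the party-line modularity spelled out, here is a rough sketch. It assumes the standard directed, weighted form $Q = \frac{1}{m}\sum_{i,j}\left[A_{i,j} - \frac{k_{i}^{out}k_{j}^{in}}{m}\right]\delta(c_{i}, c_{j})$, where $m$ is the total edge weight and $c_{i}$ is node $i$'s party; the article's own code may use a different variant. The legislators, party labels, and edge lists are invented.

```python
from collections import defaultdict


def directed_modularity(edges, party):
    """Modularity of a weighted directed graph for a given party partition.

    edges: dict mapping (cosponsor, sponsor) -> weight
    party: dict mapping node -> party label
    """
    m = sum(edges.values())
    out_by_party = defaultdict(float)  # total out-weight of each party
    in_by_party = defaultdict(float)   # total in-weight of each party
    within = 0.0                       # observed within-party weight
    for (src, dst), w in edges.items():
        out_by_party[party[src]] += w
        in_by_party[party[dst]] += w
        if party[src] == party[dst]:
            within += w
    # Within-party weight expected if edges ignored party labels.
    expected = sum(out_by_party[p] * in_by_party[p] for p in out_by_party) / m
    return (within - expected) / m


# Hypothetical four-person chamber with two parties.
party = {"a": "D", "b": "D", "c": "R", "d": "R"}

# Every cosponsorship stays within a party.
polarized = {("a", "b"): 1, ("b", "a"): 1, ("c", "d"): 1, ("d", "c"): 1}
# Every cosponsorship crosses the aisle.
crossed = {("a", "c"): 1, ("c", "a"): 1, ("b", "d"): 1, ("d", "b"): 1}

q_polarized = directed_modularity(polarized, party)
q_crossed = directed_modularity(crossed, party)
```

The fully polarized chamber scores well above zero and the fully cross-party chamber scores below zero, which is the sense in which higher party modularity reads as more divisiveness.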

Next, we look at the clusters that the Louvain method identifies in each Congress and compare them to the actual split along party lines. This community detection method greedily and iteratively maximizes the modularity measure (our measure of divisiveness):

High error periods generally correspond to periods when party-modularity is not high, as expected. 
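One simple way to score how far detected clusters stray from party lines is to assign each cluster its majority party and count the legislators who do not match. The sketch below uses that definition (one minus cluster purity) purely as an assumption for illustration; the error measure behind the chart may be defined differently, and the cluster and party labels here are invented.

```python
from collections import Counter, defaultdict


def partition_error(cluster, party):
    """Share of nodes whose party differs from their cluster's majority party.

    cluster: dict mapping node -> detected cluster id
    party:   dict mapping node -> party label
    """
    members = defaultdict(list)
    for node, cluster_id in cluster.items():
        members[cluster_id].append(node)
    mismatched = 0
    for nodes in members.values():
        counts = Counter(party[n] for n in nodes)
        majority_size = counts.most_common(1)[0][1]
        mismatched += len(nodes) - majority_size
    return mismatched / len(cluster)


# Hypothetical example: legislator "c" lands in the mostly-D cluster.
party = {"a": "D", "b": "D", "c": "R", "d": "R"}
detected = {"a": 0, "b": 0, "c": 0, "d": 1}

error = partition_error(detected, party)
```

Under this definition the error is zero when every cluster is made up of a single party, and it grows as clusters mix legislators from both parties.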

What's Next?
Now that we've introduced the data and done some basic visualizations and analyses, there is a lot left to explore. Possible next steps, as time allows, include:

1) Panel-regressing centrality measures on senator or representative characteristics over time. Are women less central? Are older people more or less central? Does being from a certain state automatically make you more or less central?
2) Reciprocity. Are some pairs always cosponsoring each other's legislation?
3) Predicting cosponsorship edges based on the characteristics of the sponsor, or even the content of the bill.
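
As a starting point for the reciprocity question, one common summary is the share of directed edges whose reverse edge also exists. This is only a sketch of that single statistic on an invented edge set; a fuller analysis would likely also account for edge weights.

```python
def reciprocity(edges):
    """Fraction of directed edges (u, v) for which (v, u) is also present."""
    if not edges:
        return 0.0
    mutual = sum(1 for (src, dst) in edges if (dst, src) in edges)
    return mutual / len(edges)


# Hypothetical edges: A and B cosponsor each other; C supports A one-way.
example_edges = {("A", "B"), ("B", "A"), ("C", "A")}

share_mutual = reciprocity(example_edges)
```

Two of the three edges in this example are reciprocated, so pairs like A and B would be the ones to examine more closely.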