
Tracking the Congressional attention span

Ars Technica reports: "While text mining 330,000 New York Times articles poses an interesting challenge, it’s not as interesting as sifting through 70 million words (from over 70,000 unique documents) found in the Congressional Record. A team of political science researchers found that their software was able to answer questions too difficult for humans to handle on their own.

What’s exciting about this project and others like it is that computers are at last capable of unsupervised, dynamic analysis, and they can produce meaningful results with little or no intervention (humans will still be required to interpret the results, of course). The researchers in this project turned their software loose on 70 million words of Congressional debate without doing any initial topic coding. They wanted to know several things: How do elected leaders distribute their attention? Under what circumstances do leaders push or follow public attention to an issue? Is debate on most issues incremental or explosive? Now that they could accurately track topics over time, the researchers found, for instance, that 'judicial nominations' consumed steadily more Congressional attention between 1997 and 2004. In fact, the topic produced the largest number of words published in a single 'day' of the Congressional Record: 230,000 on November 12, 2003.

Another hot issue, abortion, has moved in the other direction. Abortion has received steadily less Congressional attention over the last decade, and floor speeches on abortion now hold steady at one percent of the total (down from six percent in the 105th Congress)."