Note: more detailed datasets, with the corresponding codebook, are available for download via the Pennsylvania Policy Database Project’s website. The Pennsylvania Policy Database Project is a free online resource that provides access to more than 180,000 state and news media records and enables users to trace and analyze, with a few clicks, the history of public policy in the Commonwealth since 1979.
This dataset contains a measure of the government’s attention to news reporting and opinion on public policy issues in Pennsylvania, beginning in 1979 and typically updated to within a year or two of the present. Since there is no dominant news source for the entire state, these data are composed of a 10 percent random sample of news reports produced by dozens of newspapers and electronic media across Pennsylvania, as collected by state Capitol press offices and circulated to key policymakers each weekday in the form of news digests. This is the only dataset that is not universal; that is, it is composed of a random sample rather than the full set of records.
The governor’s press office is the source of the news reports during the Thornburgh (1979-1987), Casey (1987-1995), Rendell (2003-2011), and Corbett (2011-2015) administrations. During the combined Ridge and Schweiker administrations (1995-2003), the project sampled similar news digests produced by the press offices of the House Democratic and Republican caucuses. Press offices collect and circulate these reports to ensure that policymakers and their staffs are aware of what news organizations and opinion writers across Pennsylvania are saying about policy issues. In simple terms, the articles represent the news media’s “agenda” as perceived by key policymakers.
The more than 30 years of articles included in the dataset were collected, abstracted, and then coded by researchers at Temple University, Carnegie Mellon University, the University of Pittsburgh, Pennsylvania State University, Pennsylvania State University Harrisburg, and the University of Pennsylvania. On average, there are around 1,500 articles per year. The random sample was collected by examining the articles on every tenth page of the archived news digests. Once collected, the articles were read and then condensed into three- to five-sentence abstracts by undergraduate researchers.
During this process, the researchers also coded a variety of filter variables recording whether important actors or topics were mentioned; these are explained in further detail below. In addition to the filters, each article’s newspaper of origin, date of publication, original headline, and article type were coded so that future researchers can locate the original source documents easily and efficiently. The final step in collecting these data was to apply the Pennsylvania Policy Database Project’s adaptation of the Policy Agendas Project topic coding scheme to the article abstracts. These data are then combined for each individual record, so that the following variables are available for analysis for every article sampled.
67,963 observations spanning the years 1979 to 2015
download dataset
download codebook
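The sampling step described above (taking the articles on every tenth page of an archived news digest) can be illustrated with a minimal sketch. This is not the project's own tooling: the function name, the representation of a digest as a list of pages, and the choice of starting page are all assumptions made for illustration, since the description says only that every tenth page was examined.

```python
from typing import List, Sequence


def sample_every_tenth_page(
    pages: Sequence[List[str]], start: int = 0, step: int = 10
) -> List[str]:
    """Collect the articles on every `step`-th page, beginning at index `start`.

    `pages` is a hypothetical representation of one digest: a sequence of
    pages, each page being a list of article identifiers.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    sampled: List[str] = []
    for page in pages[start::step]:
        sampled.extend(page)
    return sampled


# Hypothetical digest: 25 pages with two articles on each page.
digest = [[f"p{n}-a1", f"p{n}-a2"] for n in range(1, 26)]

# Starting from the first page, this picks pages 1, 11, and 21.
sample = sample_every_tenth_page(digest)
print(sample)
```

Taking every tenth page yields roughly 10 percent of the pages, which is how a page-level rule can approximate a 10 percent sample of articles when articles are spread fairly evenly across pages.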