
Last updated in May 2025
This document describes the variables in the shared dataset. Please cite “Corporate Lobbying of Bureaucrats” by Michelle Lowry and Ekaterina Volkova (2025) when using the data. More details on data construction are available in the latest draft: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5006884.
The dataset covers the 500 largest publicly traded companies by market capitalization in each year from 1999 through 2023, drawn from the CRSP-Compustat universe.
We collect all annual reports (10-Ks) from SEC EDGAR for each company-year observation, and the texts of all proposed and final rules from the Federal Register.
We compute the cosine similarity between each firm’s 10-K and each rule text to measure rule-relatedness. We also report the number of English words in each rule, based on the Grady Ward dictionary.
Finally, we identify significant rules using Office of Information and Regulatory Affairs (OIRA) classifications, and major rules based on U.S. Government Accountability Office (U.S. GAO) designations.
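The relatedness and word-count measures described above can be illustrated with a minimal Python sketch. It assumes raw term counts and a simple alphabetic tokenizer; the exact text preprocessing and term weighting used to build the dataset are documented in the paper and may differ. The function names and the small word set in the usage note are hypothetical and stand in for the Grady Ward dictionary.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (an assumed tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the raw term-count vectors of two texts."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    # Only terms present in both texts contribute to the dot product.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def count_dictionary_words(text, dictionary):
    """Count tokens that appear in a given set of English words."""
    return sum(1 for token in tokenize(text) if token in dictionary)
```

For example, cosine_similarity(tenk_text, rule_text) returns a value between 0 and 1 for count vectors, and count_dictionary_words(rule_text, {"the", "rule", "applies"}) counts only the tokens found in the supplied word set.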
Variable Description
document_number
: Federal Register document ID number.
agency
: Issuing federal agency (e.g., Department of Agriculture).
publication_date
: Date the rule was published in the Federal Register.
rule_type
: Either Proposed Rule or Rule (for final rules).
major_rule
: Indicator for rules designated "major" under the Congressional Review Act (CRA), based on U.S. GAO designations.
significant_rule
: Indicator for rules classified as "significant" under Executive Order 12866 (EO 12866), based on OIRA classifications.
rule_word_count
: Number of English words in the rule text, based on the Grady Ward dictionary.
cik
: SEC Central Index Key identifying the company.
company_name
: Legal name of the company associated with the CIK.
relatedness
: Cosine similarity between the firm’s 10-K and the rule text.
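As a rough illustration of how these variables fit together, the sketch below ranks firms by relatedness for a single rule. It rests on several assumptions not stated in this document: that the data are distributed as a CSV file with one row per rule-firm pair, that the indicators are coded 0/1, and that dates are in ISO format. The sample rows are made-up placeholders, not actual records, and the helper names are hypothetical.

```python
import csv
import io

# Made-up placeholder rows for illustration only; not taken from the dataset.
SAMPLE = """document_number,agency,publication_date,rule_type,major_rule,significant_rule,rule_word_count,cik,company_name,relatedness
DOC-0001,Example Agency,2020-01-15,Rule,1,1,12000,1111111,Example Corp,0.42
DOC-0001,Example Agency,2020-01-15,Rule,1,1,12000,2222222,Sample Inc,0.05
DOC-0002,Example Agency,2020-03-02,Proposed Rule,0,1,3000,1111111,Example Corp,0.10
"""


def load_rows(handle):
    """Read rule-firm rows and convert the numeric columns (assumed 0/1 indicators)."""
    rows = []
    for row in csv.DictReader(handle):
        row["major_rule"] = int(row["major_rule"])
        row["significant_rule"] = int(row["significant_rule"])
        row["rule_word_count"] = int(row["rule_word_count"])
        row["relatedness"] = float(row["relatedness"])
        rows.append(row)
    return rows


def most_related_firms(rows, document_number, top_n=5):
    """Return the firms most related to one rule, highest relatedness first."""
    matches = [r for r in rows if r["document_number"] == document_number]
    return sorted(matches, key=lambda r: r["relatedness"], reverse=True)[:top_n]


rows = load_rows(io.StringIO(SAMPLE))
for r in most_related_firms(rows, "DOC-0001"):
    print(r["cik"], r["company_name"], r["relatedness"])
```

To use it on the shared file, the io.StringIO(SAMPLE) argument would be replaced with an opened file handle; cik is kept as a string here so that any leading zeros are preserved.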