Data Sources: Economics of Digitization

This site links to data sources for research in the economics of digitization. See the Table of Contents (on the left) for a list of topics covered, or scroll down the page. For more information, please contact Shane Greenstein and Imke Reimers. Please sign up and help curate! 
  • A summary of the impact of the internet on the economy:

Individuals and the Internet
Current Population Survey (CPS) Internet Use 2010  (here).
The special Internet Use Supplement of the Current Population Survey (U.S. Census Bureau/BLS) periodically surveys about 50,000 households and 129,000 individuals in the U.S. about their computer and internet use. Tables depicting trends by various demographics are also available. Microdata may be available.
Pew Internet and American Life Project (here and here).
Pew, a nonprofit, nonpartisan “fact tank,” surveys individuals on a variety of topics. Of particular interest is its Internet and American Life Project. The surveys ask a large number of questions about internet use, though some questions are not asked consistently across surveys. Raw data is available to researchers. Summary statistics are available as well, split up by adults (18+) and teens (12-17).
Comscore (2010, 2011, 2012 and 2013).
Comscore uses software voluntarily installed on millions of computers to record consumers’ online activity and purchase behavior. Micro panel data is available for purchase (and may be available through Wharton’s WRDS); in addition, various statistics can be gleaned from press releases. Similar data is available from Nielsen, but not through the WRDS interface.
Forrester Technographics (here).
Forrester Technographics uses an annual mail survey of approximately 40,000 households, intended to be nationally representative.    Data is available for purchase.
U.S. Department of Commerce Service Annual Survey (here)
The U.S. Department of Commerce’s Service Annual Survey collects data on revenue, expenditure, and other variables from a sample of companies in a variety of sectors. This data is used to generate total industry estimates. Reporting is mandatory, so the response rate is excellent. Due to the privacy concerns of reporting companies, micro data is not available.
Consumer Expenditure Survey, U.S. Department of Labor (here)
The U.S. Bureau of Labor Statistics’ Consumer Expenditure Survey consists of two surveys: the Quarterly Interview Survey and the Diary Survey. Respondents report their buying habits and expenditures, as well as demographic variables. Micro data is available.
Comscore press releases on e-commerce (here, here and here).
Internet Systems Consortium (here)
The Internet Systems Consortium Domain Survey reports discovered website hosts (proxy for websites).  The free data reports total worldwide hosts.  Disaggregated data is available for purchase.  
Comscore press releases on websites and media (here)
Crunchbase (here)
This is a crowdsourced dataset of start-up companies, their founders, and their investors.
Newspaper Association of America (here)
The Newspaper Circulation Volume Data from the Editor and Publisher Yearbook provides newspaper count, circulation, and expenditures for members of the Newspaper Association of America (believed to be most newspapers operating in the U.S.).
Interactive Advertising Bureau (here)
The IAB, an advertising business organization whose members are responsible for selling 86% of online advertising in the U.S., reports annual estimates of total internet advertising revenues in the U.S.
Floodwatch (here)
Floodwatch is a browser tool that collects data on all the ads that are served while browsing. It does not currently provide any public data, but the tool could be used to track this information for survey or experiment participants.
U.S. Department of Commerce, Statistics of U.S. Businesses (here)