Pandemic Data Room

A Resource Provided by The Center for Global Data Visualization

The Pandemic Data Room is a comprehensive global COVID-19 data repository created by a consortium of partners and led by QED Group to improve understanding of the impact of physical distancing policies on social behavior, disease rates, hospital utilization, and local/national economies. This initiative will generate critical information needed to adjust policies to control the outbreak. We hope to bring amazing talent to work on the data and generate new tools that can be used to manage and understand this pandemic.

Contribute Research Questions and Analysis Ideas to the Pandemic Data Room

In order that the Pandemic Data Room best reflects questions being asked among the global health and international development communities, we have created a portal where people can pose questions on COVID-19 they are looking to get answered. This question portal will be available to Data Challenge participants and they can use it to generate ideas in creating compelling visualization and analysis tools. Please visit the portal here.

Participate in the COVID-19 Data Challenge using this data resource. Both students and professionals are encouraged to participate. For each track, submissions are judged separately and prizes (1st Place $2000, 2nd Place $1500, 3rd Place $1000, Honorable Mentions $100) are awarded.

Contribute to the Pandemic Data Room by submitting a new data source request in this Google Form. We will evaluate your data source and get back to you soon!

Data sources provided by partners

IDS International and Clear Outcomes: Survey Data on Physical Distancing and Hygiene Behaviors in the United States - updated data April 29

Detailed Description: Flattening the curve depends on the population following policies on physical distancing (e.g., staying home, avoiding contact closer than 6 feet) and hygiene (e.g., washing hands, wearing masks). Getting back to work depends on measuring the effectiveness of policies and behaviors so that governments, institutions, businesses, and individuals can make better decisions on what can be done without triggering new outbreaks. IDS is a data and technology company working to create data collection and analysis tools to better measure compliance with, and the effectiveness of, pandemic safety behaviors. On April 6, 2020, IDS conducted an online survey among a nationally representative sample about Physical Distancing and Hygiene Behaviors. The survey was conducted again on April 20. See and download the latest results of the Clear Outcomes and IDS nationwide poll, a summary of the survey results, and the survey questions here.
Data Resolution: US
File type: .csv

Fraym: Geospatial Data For Covid-19 Prevention and Crisis Response

Detailed Description: The risks posed by coronavirus are especially high for millions of people who live in low- and middle-income countries, where financial, medical equipment, and health personnel resources are highly constrained. To rapidly identify countries, cities, and communities that exhibit the greatest risk of emergency cases and rapid transmission, Fraym provides access to relevant data layers including Emergency Case Risk Factors (smoking prevalence, elderly households, body health: obesity, child stunting, child wasting) and Transmission Risk Factors (population density, household size, occupation, transportation modes, hand washing practices). CGDV has requested the above data layers for countries including Guatemala, Kenya, Nigeria, Pakistan, Philippines, Rwanda, Senegal, and South Africa. Each folder should have a data dictionary and a citation guide for use. Download raster files with resolution down to 1 km² in CGDV Google Drive.
Data Resolution: Country
File type: TIF File

GeoPoll: Coronavirus In Sub-Saharan Africa -- Updated April 21

Detailed Description: As an organization that conducts remote research, GeoPoll has taken the initiative to assist the global response to coronavirus. From March 10th – 13th, 2020, GeoPoll administered a first-round survey on knowledge of and perceptions towards coronavirus in South Africa, Kenya, and Nigeria. This survey examined awareness levels, primary information sources, knowledge of how to prevent the virus, and levels of worry.

On April 15th, GeoPoll conducted a second-round survey on How Africans in 12 Nations are Responding to the COVID-19 Outbreak. The remote study examined the effects coronavirus is already having on people throughout the region.

Click links above to read full reports of both surveys. Download a copy of the survey data in CGDV Google Drive.
Data Resolution: County in African countries
File type: Excel

Exovera: COVID-19 Related Articles Published In US Newspapers

Detailed Description: Exovera provides COVID-19 social media data through its robust API platform. Download data files in CGDV Google Drive.

politics_coronavirus_rawdata_Jan012020-Apr072020.json: The US Politics dataset is a set of ~1m articles published since Jan 01 2020 by ~10k local and national US newspapers and online news sources, all related to US politics (selected using an Exovera classifier that tags politics-related content at a high level of recall).
coronavirus_english_topSources_04072020.json: Data from the 500 largest English-language publishers (by reach) in Exovera's overall dataset. The data is collected via API from social media posts that contain URLs from the top publishers.
coronavirus_general_media_timeseries-04072020.csv: The time series cover coronavirus-related terms/content within all English-language online news/print media that Exovera has access to worldwide. It encompasses 55k sources and uses an initial set of keywords to pull up content. The initial set of search terms has ~15m results with keywords 'Coronavirus', 'covid-19', 'covid19', '2019-nCoV' and 'Sars-COV-2'. Data are based around tagging / subtopic detection with labels applied.

Data Resolution: US
File type: .json, .csv

IDS International and Clear Outcomes: Contact tracing, rapid testing, and antibody testing data for US States - added April 29

Detailed Description: Clear Outcomes and IDS compiled information on state COVID-19 testing resources April 20-23. Detailed methods are described on the data webpage.
Data Resolution: US States
File type: .json, .xlsx

COVID-19 Case Data

Johns Hopkins Coronavirus Dashboard Dataset: Country, foreign provinces, and U.S. county case statistics

Detailed Description: Contains recovered, infected, and fatality case numbers for all countries, at province level for many countries, and at county level for the US. Data is sourced from a variety of health organizations around the world.
Data Resolution: Global (some province level), U.S. County
Frequency of update: Daily
Download Method: Download / Clone

File type: CSV

Cleaning requirements: Minimal
Link: https://github.com/CSSEGISandData/COVID-19
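
The sketch below shows one way to reshape a wide time-series file (one row per location, one column per date) into long records that are easier to filter and plot. It assumes the column layout shown in the sample (Province/State, Country/Region, Lat, Long, then one column per date); the place names and counts are made up for illustration and are not taken from the repository.

```python
import csv
import io

# Hypothetical sample in a wide layout: one row per location,
# one column per date. Names and numbers are invented.
SAMPLE_WIDE = """Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20
,Exampleland,10.0,20.0,0,1,3
Sampleprovince,Samplestan,30.0,40.0,10,12,15
"""

# Columns that describe the location rather than a date (assumed layout).
ID_COLUMNS = ("Province/State", "Country/Region", "Lat", "Long")


def wide_to_long(csv_text):
    """Turn one-column-per-date rows into (location, date, count) records."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column, value in row.items():
            if column in ID_COLUMNS:
                continue  # skip location metadata, keep only date columns
            records.append({
                "province": row["Province/State"],
                "country": row["Country/Region"],
                "date": column,
                "count": int(value),
            })
    return records


long_records = wide_to_long(SAMPLE_WIDE)
for record in long_records[:3]:
    print(record)
```

To use it on a real file, read the file's text and pass it to `wide_to_long`; check the header row first, since the actual column names may differ from the ones assumed here.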

New York City Public Health Department: Cases, hospitalizations, deaths by date of diagnosis as well as cases by ZIP code

Detailed Description: The GitHub repo contains many files, but only two datasets appear valuable: case-hosp-death.csv and tests-by-zcta.csv. The case-hosp-death file counts cases by date of diagnosis, hospitalizations, and deaths in NYC hospitals. The tests-by-zcta file gives cumulative positive cases per ZIP code.
Data Resolution: U.S., U.S. ZIP
Frequency of update: Daily
Download Method: Download / Clone

File type: CSV

Cleaning requirements: Minimal
Link: https://github.com/nychealth/coronavirus-data

New York Times Data: Two time-series datasets collected by the New York Times from various U.S. state and local agencies; the first record aligns with the first case in the United States on 21 January 2020.

Detailed Description: Two time-series datasets collected by the New York Times from various state and local government agencies; the first record is the first case in the United States on 21 January 2020. One dataset contains information aggregated at the state-level and the other is information broken down by county. Features contained are: date, county/state, fips, cases, and deaths. NOTE: This source only provides information about positive cases.
Data Resolution: U.S. States, U.S. County
Frequency of update: Daily
Download Method: Download / Clone

File type: CSV

Cleaning requirements: Minimal
Link: https://github.com/nytimes/covid-19-data
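
A minimal sketch of deriving daily new cases from the state-level file, assuming the `cases` column holds running (cumulative) totals and that the columns are named as in the feature list above (date, state, fips, cases, deaths). The rows in the sample are invented for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows using the feature names listed above; values are invented.
SAMPLE = """date,state,fips,cases,deaths
2020-03-01,Washington,53,10,1
2020-03-02,Washington,53,14,1
2020-03-03,Washington,53,21,2
2020-03-02,Oregon,41,3,0
2020-03-03,Oregon,41,3,0
"""


def daily_new_cases(csv_text):
    """Convert cumulative case counts into per-day increases for each state."""
    by_state = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        by_state[row["state"]].append((row["date"], int(row["cases"])))

    result = {}
    for state, series in by_state.items():
        series.sort()  # ISO dates sort chronologically as plain strings
        previous = 0
        increases = []
        for date, cumulative in series:
            increases.append((date, cumulative - previous))
            previous = cumulative
        result[state] = increases
    return result


new_cases = daily_new_cases(SAMPLE)
print(new_cases["Washington"])
```

The first value for each state is its total on the first recorded day. Negative increases can appear if an agency revises its totals downward, so it is worth checking for them before plotting.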

INDIA COVID-19 TRACKER: Crowdsourced India COVID-19 data. Some interesting points because it takes data from anyone.

Detailed Description: This is a link to a GitHub repository that is used to crowdsource data about COVID-19 in India. The crowdsourced data has been used to make an HTML page (the link is in the GitHub repository). The data is crowdsourced through Telegram, a messaging application, and it is not thoroughly validated. It is really interesting data about India, but it needs to be used appropriately in analysis. Because it is submitted through a social platform, some of it is likely incorrect, but it could make fantastic supplementary data.
Data Resolution: Country
Frequency of update: Daily
Download Method: Clone / API

File type: JSON

Cleaning requirements: Minimal
Link: https://github.com/covid19india/api

European Centre for Disease Prevention and Control: Dataset of positive cases and deaths by country worldwide

Detailed Description: Contains a dataset that tracks positive cases and deaths per country. The data is stored as individual records (one row per country per day) but can be transformed into time series with a moderate amount of coding.
Data Resolution: Global
Frequency of update: Daily
Download Method: Download

File type: CSV, JSON, XML

Cleaning requirements: Minimal/Moderate
Link: https://www.ecdc.europa.eu/en/publications-data/download-todays-data-geographic-distribution-covid-19-cases-worldwide
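
The record-to-time-series transformation mentioned in the description can be sketched as follows. The column names (`dateRep`, `cases`, `deaths`, `countriesAndTerritories`) and the day/month/year date format are assumptions about the download; verify them against the file header. The countries and counts in the sample are invented.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical records: one row per country per day, in no particular order.
SAMPLE_RECORDS = """dateRep,cases,deaths,countriesAndTerritories
02/04/2020,5,0,Exampleland
01/04/2020,3,1,Exampleland
02/04/2020,7,2,Samplestan
"""


def records_to_timeseries(csv_text, value_column="cases"):
    """Group daily records by country and sort each group by date."""
    series = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumed date format: day/month/year.
        day = datetime.strptime(row["dateRep"], "%d/%m/%Y").date()
        series[row["countriesAndTerritories"]][day] = int(row[value_column])
    # One chronologically ordered list of (date, value) pairs per country.
    return {country: sorted(by_day.items()) for country, by_day in series.items()}


timeseries = records_to_timeseries(SAMPLE_RECORDS)
for day, count in timeseries["Exampleland"]:
    print(day.isoformat(), count)
```

Passing `value_column="deaths"` builds the same structure for deaths.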

DXY: a Chinese pandemic tracking online platform: A Chinese online platform showing the number of cases locally and globally

Detailed Description: Daily confirmed cases, deaths, and recoveries worldwide. An English version is available by clicking "switch to English version", but the site does not provide a dataset to download.
Data Resolution: Global, China
Frequency of update: Daily
Download Method: Copy-paste

File type: Text

Cleaning requirements: Significant
Link: https://ncov.dxy.cn/ncovh5/view/pneumonia

1Point3Acres covid19 dataset: case based covid 19 dataset in US and Canada

Detailed Description: The case data contains case ID, confirmed date, state/province, county (US only), confirmed case count, and death count. The source has rules on how it must be cited.
Data Resolution: US (county level) and Canada
Frequency of update: Daily
Download Method: API (an access token must be requested; the token obtained allows 20 requests per 24 hours)

File type: CSV

Cleaning requirements: Minimal
Link: https://coronavirus.1point3acres.com/en

Dados COVID-19 em Portugal: Data on COVID-19 cases at the municipality-level in Portugal - added April 14

Detailed Description: Data sourced from the Portuguese Directorate General of Health dashboard site, as well as other reports.
Data Resolution: Portugal, Municipalities of Portugal
Frequency of update: Daily
Download Method: Download / Clone

File type: CSV

Cleaning requirements: Moderate (Data in Portuguese)
Link: https://github.com/dssg-pt/covid19pt-data

COVID-19 Italia — Dipartimento della Protezione Civile: Italian Dept. of Civil Protection github data on COVID-19 cases - added April 14

Detailed Description: Data from the ArcGIS dashboard set up by the Italian Dept. of Civil Protection.
Data Resolution: Italy, Regions/Provinces of Italy
Frequency of update: Daily
Download Method: Download / Clone

File type: CSV

Cleaning requirements: Moderate (Data in Italian)
Link: https://github.com/pcm-dpc/COVID-19

Coronavirus (COVID-19) Data in the UK: Case and death data breakdown in UK - Added April 19

Detailed Description: Compilation of case and death data at the nation, region, and upper tier local authority (UTLA) levels by data.gov.uk
Data Resolution: UK, Nations, Regions, UTLA
Frequency of update: Daily
Download Method: Download

File type: CSV

Cleaning requirements: Minimal
Link: https://coronavirus.data.gov.uk/#local-authorities

COVID Tracking Project: US data on COVID-19 testing and patient outcomes - Added April 21

Detailed Description: The COVID Tracking Project is a volunteer organization launched from The Atlantic and dedicated to collecting and publishing the data required to understand the COVID-19 outbreak in the United States. Every day, they collect data on COVID-19 testing and patient outcomes from all 50 states, 5 territories, and the District of Columbia. Their dataset is currently in use by national and local news organizations across the US and by research projects and agencies worldwide.
Data Resolution: US state, county
Frequency of update: Daily
Download Method: Download

File type: CSV, .json

Cleaning requirements: Moderate
Link: https://covidtracking.com/api

Coronavirus (COVID-19) Data in South Africa: Data compiled from the Department of Health and other Health sources in South Africa - Added April 24

Data Resolution: South Africa, South African States/Provinces
Frequency of update: Daily
Download Method: Download / Clone

File type: CSV

Cleaning requirements: Minimal/Moderate
Link: https://github.com/dsfsi/covid19za

Coronavirus (COVID-19) Data for Africa: Data compiled from WHO and various national health agencies in Africa. Data on coronavirus case numbers and information on individual cases - Added April 24

Data Resolution: Africa Aggregate, African Nations with some state/province
Frequency of update: Daily
Download Method: Download / Clone

File type: CSV

Cleaning requirements: Minimal/Moderate
Link: https://github.com/dsfsi/covid19africa

Our World in Data: COVID-19 Testing Dataset - Added April 23

Detailed Description: Testing is our window onto the pandemic and how it is spreading. Without testing we have no way of understanding the pandemic. The goal of Our World in Data is to provide testing data over time for many countries around the world. Alongside the data, they also provide a good understanding of the definitions used and any important limitations they might have. You will also find descriptions of the data for each country.
Data Resolution: Country
Frequency of update: Daily
Download Method: Download

File type: CSV, Excel

Cleaning requirements: Moderate
Link: https://github.com/owid/covid-19-data/tree/master/public/data/testing

Korean Ministry of Health Data: Korean COVID-19 statistics, dataset is in Korean - Added April 30

Data Resolution: Korea
Frequency of update: Daily
Download Method: API

File type: XML

Cleaning requirements: Minimal/Moderate
Link: http://data.go.kr/tcs/dss/selectApiDataDetailView.do?publicDataPk=15043376

Government Response Data

Worldwide Lockdown Dataset: Country and province stay-at-home order data

Detailed Description: Two files listing lockdown dates for each country. A lockdown is assumed to be complete when all schools and non-essential businesses are closed. The references where the information was found are also listed for each country. Some rows have a blank province when the entry applies to the whole nation.
Data Resolution: Global (country, some provinces)
Frequency of update: Static? (updated 3 days ago)
Download Method: Download

File type: CSV

Cleaning requirements: Minimal/Moderate
Link: https://www.kaggle.com/jcyzag/covid19-lockdown-dates-by-country#countryLockdowndates.csv

US Lockdown Dataset: State and county stay-at-home order data

Detailed Description: Dates when each state's or county's stay-at-home order became effective as a result of the COVID-19 pandemic. This dataset is updated daily as more states and counties issue stay-at-home orders. Some rows have a blank county when the order applies to the whole state.
Data Resolution: U.S. States, U.S. County
Frequency of update: Daily
Download Method: Download

File type: CSV

Cleaning requirements: Minimal/Moderate
Link: https://www.kaggle.com/lin0li/us-lockdown-dates-dataset
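
Because a blank county marks a statewide order, looking up the order date for a given county needs a fallback to the state row. The sketch below is one way to do that. The column names (`state`, `county`, `date`), the sample rows, and the rule that the earlier of the county and state orders applies are all assumptions made for illustration, not properties of the dataset.

```python
import csv
import io

# Hypothetical rows; a blank county means the order covers the whole state.
SAMPLE_ORDERS = """state,county,date
Examplestate,,2020-03-25
Examplestate,Alpha County,2020-03-20
Otherstate,Beta County,2020-03-28
"""


def build_lookup(csv_text):
    """Split rows into statewide orders and county-specific orders."""
    statewide, countywide = {}, {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["county"].strip():
            countywide[(row["state"], row["county"])] = row["date"]
        else:
            statewide[row["state"]] = row["date"]
    return statewide, countywide


def order_date(state, county, statewide, countywide):
    """Return the earliest order covering the county, or None if there is none."""
    candidates = [
        date
        for date in (countywide.get((state, county)), statewide.get(state))
        if date is not None
    ]
    # ISO-formatted dates compare correctly as strings.
    return min(candidates) if candidates else None


statewide, countywide = build_lookup(SAMPLE_ORDERS)
print(order_date("Examplestate", "Alpha County", statewide, countywide))
print(order_date("Examplestate", "Gamma County", statewide, countywide))
```

The same fallback pattern applies to the worldwide file, with country and province in place of state and county.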

Healthcare Resource Data

IHME: Institute for Health Metrics and Evaluation COVID-19 Estimate Data

Detailed Description: IHME has produced forecasts which show hospital bed use, need for intensive care beds, and ventilator use due to COVID-19 based on projected deaths for the United States, at the country and subnational level, and countries in the European Economic Area (EEA). Forecasts at the subnational level are included for three EEA countries: Germany, Italy, and Spain. These projections are produced by models based on observed death rates from COVID-19, and include uncertainty intervals. They incorporate information about social distancing and other protective measures and are being updated daily with new data. These forecasts were developed in order to provide hospitals, policy makers, and the public with crucial information about how expected need aligns with existing resources, so that cities and countries can best prepare.
Data Resolution: US, Countries in the European Economic Area (EEA)
Frequency of update: Last updated at 1 p.m. Pacific, April 13, 2020 (as of April 15, 2020)
Download Method: Download

File type: CSV

Cleaning requirements: Minimal
Link: http://www.healthdata.org/covid/data-downloads

USA Hospital Beds: County-level data. Contains hospital bed data (number of beds, utilization rate, bed type, etc.) as well as hospital geographic data

Detailed Description: Contains hospital bed data (number of beds, utilization rate, bed type, etc.) as well as hospital geographic data
Data Resolution: US county
Frequency of update: Daily (not sure, last updated 'yesterday')
Download Method: Download

File type: CSV

Cleaning requirements: Minimal
Link: https://coronavirus-disasterresponse.hub.arcgis.com/datasets/definitivehc::definitive-healthcare-usa-hospital-beds/data?geometry=94.394%2C-16.820%2C-119.356%2C72.123&page=10

US Hospital Facility Bed Capacity: Includes information about hospital beds and ventilators per capita, health care capacity data, etc.

Detailed Description: High-quality data on US hospital capacity, including beds per capita, COVID care data, etc.
Data Resolution: US county
Frequency of update: Last updated on April 7
Download Method: Clone

File type: CSV/geojson

Cleaning requirements: Minimal
Link: https://github.com/covidcaremap/covid19-healthsystemcapacity/tree/master/data/published

Patient Medical Data for COVID-19: Medical records of patients infected with COVID-19

Detailed Description: Patient record including age, sex, location, date of onset, symptoms, travel history, chronic diseases, and date of discharge or death.
Data Resolution: Global
Frequency of update: Last updated on April 1
Download Method: Download

File type: CSV/JSON

Cleaning requirements: Minimal
Link: https://datarepository.wolframcloud.com/resources/Patient-Medical-Data-for-Novel-Coronavirus-COVID-19

WDI Health Systems: Data on the state of each country's healthcare system.

Detailed Description: The stated purpose of this data is "Does health spending levels (public or private), or hospital staff have any effect on the rate at which Covid-19 spreads in a country? Can we use this data to predict the rate at which Cases or Fatalities will grow?". It contains only data on healthcare expenditures and the amount of healthcare available in countries throughout the world. There is no direct COVID-19 data, but it could make good supplementary data for a question similar to the one posed as inspiration.
Data Resolution: Global
Frequency of update: Every 2-3 Days
Download Method: Download

File type: CSV

Cleaning requirements: Minimal
Link: https://www.kaggle.com/danevans/world-bank-wdi-212-health-systems

Social Data

Google Trends: Data on the trends in people's google searches.

Detailed Description: Google Trends data is phenomenal: it is interesting, important, and can be very insightful, IF IT IS USED CORRECTLY. It can be a little confusing the first time you see it, and the instructions on the Google Trends page will help you understand the graphs shown when you enter a search term. However, how to go further and get more out of the data is not very clear.

All of the data is given as search intensity, scaled from 0 to 100, where 100 is the maximum search intensity. The maximum does not tell you anything about the actual number of searches; it marks the point where the search term peaked, and everything else is scaled to that value. A search intensity of 50 means the term was searched half as many times as at the peak (100).

To put that in context, Google Trends allows you to vary the time period, the regional resolution, and the search term(s):
- You can specify a time period of any range dating back to 2004.
- Time periods of less than a week return hourly data.
- Time periods over a week but less than 269 days (about 9 months, but using 8 is safe) return daily data.
- Time periods over 269 days return weekly data.
- You can choose the whole world or a specific country.
- The whole world gives you country-level comparisons.
- Different countries have different levels you can compare; for example, the U.S. defaults to comparing states, but you can also compare by metro region.

Relative search intensities (i.e., comparing different searches):
- You specify a time period, and what is returned may be hourly, daily, or weekly search intensities.
- Only one term reaches 100 over that time period. This is the highest search intensity across that term and all of the other terms you are comparing.
- Every other search intensity is scaled from that point. No matter which term you look at, its search intensity = 100 × (# searches for that term) / (# searches at the peak search intensity).
- Google Trends allows you to compare up to five words or phrases at one time. There are ways to overlap time periods and search terms to get a pretty good estimate for comparison, but DO NOT DO THIS UNLESS IT IS ABSOLUTELY NECESSARY. It is very difficult, and a tiny mistake makes all of your data inaccurate.

Regional search intensities (comparing a term's search intensity by location):
- You enter a search term and specify either the whole world or one particular country.
- Google Trends gives you colored maps representing this data.
- The underlying data is similar to the relative search intensities.
- Only one region in the area and time period you specified will reach 100.
- The rest of the regions are scaled to that region's search intensity, the same way as for relative search intensity.

You can also do regional searches that compare multiple terms, and it is really interesting. However, manipulating that data is even more difficult and requires a lot of attention to unravel. It is very easy to make a small mistake, and that small mistake will echo throughout all of the data, again making it worthless.

This is just a brief summary of the data and of the things I have found to watch out for; look at the Google Trends descriptions as well for details specific to their user interface. If you still want to dive deeper into this data, there is a large body of research articles using it, and webpages dedicated to manipulating the data to get more out of it. I will just warn you to be careful: the manipulation, overlapping, and other methods of changing the data are always approximations, and not always correct, so read them thoroughly and check that the authors validated their method in some clear and accurate way.
Data Resolution: Global, Country Level, U.S. State Level, U.S. Metro Region Level, Other Countries Have Unique Regional Breakdowns
Frequency of update: Daily
Download Method: Download / API (pytrends)

File type: CSV

Cleaning requirements: Minimal
Link: https://trends.google.com/trends/?geo=US
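
The scaling and time-resolution rules described for Google Trends can be illustrated with a small sketch. It follows the description in this entry only: the raw search counts are invented (Google Trends never exposes them), both function names are hypothetical, and the behavior at exactly 7 or exactly 269 days is not stated in the description, so the boundaries chosen here are a guess.

```python
def to_search_intensity(counts_by_term):
    """Scale raw counts to 0-100 so the single highest value becomes 100.

    All terms share one peak, which is why only one term reaches 100
    when several terms are compared over the same period.
    """
    peak = max(max(counts) for counts in counts_by_term.values())
    if peak == 0:
        return {term: [0 for _ in counts] for term, counts in counts_by_term.items()}
    return {
        term: [round(100 * count / peak) for count in counts]
        for term, counts in counts_by_term.items()
    }


def expected_granularity(days):
    """Resolution returned for a time period, per the rules in this entry."""
    if days < 7:
        return "hourly"
    if days < 269:  # boundary handling is an assumption
        return "daily"
    return "weekly"


# Invented raw counts for two terms over three time steps.
scaled = to_search_intensity({"term a": [10, 40, 20], "term b": [4, 20, 8]})
print(scaled)
print(expected_granularity(30))
```

Note that "term b" peaks at 50, not 100: its values only make sense relative to "term a" in the same query, which is why results from separate queries cannot be compared directly.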

Postman: COVID-19 API Resource Center: Contains links and detailed information about accessing public feeds from 28 different organizations and topics via application programming interfaces (APIs). Organizations represented include the WHO, CDC, and Johns Hopkins University.

Detailed Description: Contains links and detailed information about accessing public feeds from 28 different organizations and topics via application programming interfaces (APIs). This site contains information for connecting to feeds from the WHO, CDC, COVID Tracking Project, and Johns Hopkins University COVID Database, to name a few. There are examples of how to access an organization's Twitter and YouTube feeds; however, individuals must have the requisite API keys / access tokens to access the information on those sites.
Data Resolution: Various
Frequency of update: Not specified
Download Method: API

File type: Various

Cleaning requirements: Significant
Link: https://covid-19-apis.postman.com/

SafeGraph Dataset: Data on foot traffic throughout the US. It has the number of times people pass by over 6 million different points of interest in the US.

Detailed Description: This Data is based on businesses and consumer hot spots. It uses over 6 million points throughout the US and tracks the amount of foot traffic at each of these points. They give data like number of visitors over a certain period, and also offer shapefiles for mapping or any locational visualizations.
Data Resolution: US Points of Interest
Frequency of update: Daily
Download Method: Download

File type: CSV

Cleaning requirements: Minimal
Link: https://www.safegraph.com/covid-19-data-consortium

Covid-19 Twitter dataset: Dataset of tweets acquired from the Twitter Stream related to COVID-19 chatter

Detailed Description: Interesting social media dataset, including the daily top 1000 terms, bigrams, trigrams, etc. It also contains a cleaned version of the tweet text. Tweet languages include English, Spanish, and French.
Data Resolution: Global
Frequency of update: every 2 days
Download Method: Clone

File type: CSV

Cleaning requirements: Minimal
Link: https://github.com/thepanacealab/covid19_twitter/tree/master/dailies/2020-03-22

Schools affected by COVID-19: Dataset of Higher Education schools moving to online-only instruction due to COVID-19

Data Resolution: US county
Frequency of update: Last updated March 27
Download Method: Download

File type: CSV

Cleaning requirements: Minimal
Link: https://www.notion.so/Schools-affected-by-COVID-19-a28139cb40814869a2cd64cc9453d82c

COVID-19 Legislation: Interactive site for users to access: statewide or nationwide data on all covid-19 legislation

Detailed Description: Queryable and downloadable data pertaining to United States COVID-19 legislation. The data contains name of the bill, the region it spans, description of the legislation, link to the source, status, last action, date of last action, type (house/senate/other), the internal quorum link.
Data Resolution: U.S. States, U.S.
Frequency of update: At least daily
Download Method: Download

File type: CSV

Cleaning requirements: Minimal
Link: https://www.quorum.us/spreadsheet/external/QCKYcPmSvYoAhnkIdcSS/

Cuebiq Mobility Index Data: Dataset shows mobility and store visitation patterns

Detailed Description: This data represents the level of movement within each county in the U.S.
Data Resolution: US county
Frequency of update: Daily
Download Method: AWS S3 (premier Cuebiq account needed)

File type: CSV

Cleaning requirements: Minimal
Link: https://help.cuebiq.com/hc/en-us/articles/360041350092-Cuebiq-Mobility-Visit-Index-Feed-Specs#h_e4633fc1-3206-4ee5-a3b8-6f7735e22c7e

Google COVID-19 Community Mobility Reports: See how your community is moving around differently due to COVID-19

Detailed Description: These Community Mobility Reports aim to provide insights into what has changed in response to policies aimed at combating COVID-19. The reports chart movement trends over time by geography, across different categories of places such as retail and recreation, groceries and pharmacies, parks, transit stations, workplaces, and residential.
Data Resolution:
Frequency of update:
Download Method:

File type:

Cleaning requirements:
Link: https://www.google.com/covid19/mobility/

Apple Mobility Data: Change in routing requests since January 13, 2020 - added April 14

Detailed Description: Data on mobility based on direction requests. Data has variables on transit, walking, and driving.
Data Resolution: Global, Major World Cities
Frequency of update: Daily (not sure, first made available April 14)
Download Method: Download

File type: CSV

Cleaning requirements: Minimal
Link: https://www.apple.com/covid19/mobility

538 Survey Data on US Coronavirus Concern and Response Approval: Surveys conducted on concern about economy and risk of infection and approval of President Trump's response - Added April 19

Detailed Description: 538 compiled surveys from pollsters on concern about the economy, concern about getting infected, and approval of President Trump's response. Users should use the files ending in '_toplines', as this is the data used on the site. The polls files show how the various polls are weighted to arrive at the topline numbers.
Data Resolution: US
Frequency of update: Daily
Download Method: Download

File type: CSV

Cleaning requirements: Minimal
Link: https://projects.fivethirtyeight.com/coronavirus-polls/

Mozilla Desktop Usage Data: Daily data on desktop usage and number of active hours per user on Mozilla Firefox - Added April 19

Data Resolution: Global, Major World Cities, Most US Cities
Frequency of update: Daily
Download Method: Download

File type: CSV

Cleaning requirements: Moderate
Link: https://blog.mozilla.org/data/2020/03/30/opening-data-to-understand-social-distancing/

Pew Research Center American Trends Panel: Raw survey data on US social trends with multiple coronavirus questions - Added April 19

Detailed Description: Survey data on American trends. Data is individual-level survey responses for the American Trends Panel by the Pew Research Center. Ensure that you apply the survey weights appropriately when analyzing this data. The download includes multiple PDF files with the methodology and questionnaire, which should be closely reviewed.
Data Resolution: US
Frequency of update: Not specified
Download Method: Download (must make an account)

File type: SPSS (.SAV) Format

Cleaning requirements: Significant
Link: https://www.people-press.org/dataset/covid-19-late-march-2020/
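
To show why the weighting matters for individual-level survey responses, the sketch below compares an unweighted share with a weighted one. The answers and weights are invented, and `weighted_share` is a hypothetical helper; the real weight variable and its correct use are documented in the methodology files included with the download.

```python
def weighted_share(responses, answer):
    """Share of respondents giving `answer`, using each respondent's weight.

    `responses` is a list of (answer, weight) pairs.
    """
    total_weight = sum(weight for _, weight in responses)
    if total_weight == 0:
        raise ValueError("total weight must be positive")
    matching_weight = sum(weight for given, weight in responses if given == answer)
    return matching_weight / total_weight


# Invented responses: (answer, survey weight).
responses = [
    ("very concerned", 0.5),
    ("not concerned", 1.5),
    ("very concerned", 2.0),
]

unweighted = sum(1 for given, _ in responses if given == "very concerned") / len(responses)
weighted = weighted_share(responses, "very concerned")
print(f"unweighted: {unweighted:.3f}, weighted: {weighted:.3f}")
```

Here two of three respondents are "very concerned" (about 0.667 unweighted), but the weighted share is 0.625 because the respondents differ in how much of the population each represents.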

Carnegie Mellon University Delphi Research Center: Self-reported descriptions of COVID-19-related symptoms - Added April 20

Detailed Description: The survey from CMU Delphi Research Center asks people to self-report symptoms associated with COVID-19 or the flu that they or anyone in their household has experienced in the last 24 hours. Data are gathered nationwide with the help of Facebook and Google. High correlation between self-reported descriptions of COVID-19-related symptoms and test-confirmed cases of the disease suggests self-reports might soon help the researchers in forecasting COVID-19 activity.

The Delphi COVID-19 Response Team has developed an API for accessing COVIDcast (Delphi's COVID-19 Surveillance Streams), one of the data sources in Delphi's epidemiological data. COVIDcast displays signals related to COVID-19 activity levels across the United States, derived from a variety of anonymized, aggregated data sources made available by multiple partners. Each signal may reflect the prevalence of COVID-19 infection, mild symptoms, or more severe disease over time. Each signal can be presented at multiple geographic resolutions: state, county, and/or metropolitan area. All these signals taken together may suggest heightened or rising COVID-19 activity in specific locations.

Find home of Delphi's epidemiological data API here

Find Delphi's homepage here

Find Facebook & Carnegie Mellon University COVID-19 Symptom Map here
Data Resolution: US county
Frequency of update: daily
Download Method: API access. Libraries and code samples are available for CoffeeScript, JavaScript, Python, and R.
Cleaning requirements: Significant
Link: https://cmu-delphi.github.io/delphi-epidata/api/
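As a rough sketch of what a county-level, daily query to the covidcast endpoint might look like, the snippet below only builds a request URL and does not send it. The base URL, the parameter names, and the example source and signal names (`fb-survey`, `smoothed_cli`) are assumptions that should be checked against the API documentation linked above before use; `build_covidcast_url` is a hypothetical helper, not part of any Delphi client library.

```python
from urllib.parse import urlencode

# Assumed base URL; confirm against the API documentation before relying on it.
BASE_URL = "https://api.delphi.cmu.edu/epidata/covidcast/"


def build_covidcast_url(data_source, signal, geo_type, geo_value,
                        time_values, base_url=BASE_URL):
    """Build a query URL for daily values of one signal (hypothetical helper)."""
    params = {
        "data_source": data_source,  # assumed parameter name
        "signal": signal,            # assumed parameter name
        "time_type": "day",
        "geo_type": geo_type,        # e.g. "county" or "state"
        "time_values": time_values,  # YYYYMMDD, or a range "YYYYMMDD-YYYYMMDD"
        "geo_value": geo_value,      # e.g. a county FIPS code
    }
    return base_url + "?" + urlencode(params)


# Illustrative request: one week of a survey-based signal for one county.
url = build_covidcast_url("fb-survey", "smoothed_cli", "county",
                          "42003", "20200420-20200426")
print(url)
```

The resulting URL can be passed to any HTTP client; the official client libraries mentioned above wrap the same kind of request.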

Stanford COVID-19 Survey Data: Survey data of US and UK citizens on COVID-19 - Added April 30

Detailed Description: Survey results on US and UK citizens' knowledge and perceptions of COVID-19
Data Resolution: US and UK
Frequency of update: Static
Download Method: Download

File type: CSV

Cleaning requirements: Minimal
Link: https://purl.stanford.edu/tr461wp6422

Brazilian COVID 19 Donations Monitor: Website tracking the donations to fight COVID-19 and amounts - Added April 30

Detailed Description: Dataset tracking the major donors, donation amounts, and sectors of publicly announced donations in Brazil in response to COVID-19
Data Resolution: Brazil
Frequency of update: Weekly
Download Method: Download

File type: XLSX

Cleaning requirements: Minimal
Link: https://en.monitordasdoacoes.org.br/

Academic Data

Scholarly Article Database: Big database of scholarly article metadata with links and queryable json files for Natural Language Processing

Detailed Description: This dataset combines 44k+ scholarly articles pertaining to the coronavirus. It can be used to analyze the main authors, sources, titles, journals, and abstracts. Each row provides a link to the article in case Natural Language Processing on the full text is desired.
Data Resolution: U.S.
Frequency of update: Static
Download Method: Download/Embedded link

File type: JSON

Cleaning requirements: Significant
Link: https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge#metadata.csv

WHO Scientific Knowledge Database: Scientific findings and journal articles on COVID-19 - added April 14

Detailed Description: Not provided
Data Resolution: Global
Frequency of update: Daily
Download Method: Download

File type: CSV

Cleaning requirements: Moderate
Link: https://www.who.int/emergencies/diseases/novel-coronavirus-2019/global-research-on-novel-coronavirus-2019-ncov

Open database of COVID-19 cases with chest X-ray or CT images: Infectious disease CT / X-ray images from journal publications - added April 14

Detailed Description: Articles can be submitted by anyone, and images are scraped from them to create the dataset. Variables include basic demographic info, location, and medical condition info.
Data Resolution: Global
Frequency of update: Daily
Download Method: Download / Clone

File type: CSV / JPG

Cleaning requirements: Moderate
Link: https://github.com/ieee8023/covid-chestxray-dataset

Chinese National Bioinformatics Center Data: Genomic sequences and variation data submitted by researchers around the world - Added April 30

Data Resolution: Global
Frequency of update: Daily
Download Method: Download

File type: fasta

Cleaning requirements: Minimal/Moderate
Link: https://bigd.big.ac.cn/ncov
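Since the sequences are distributed as FASTA files, a minimal reader may be a useful starting point. The sketch below assumes the standard FASTA layout (a header line starting with `>` followed by one or more sequence lines); the sample records are invented for illustration and are not taken from this data source. For real analyses, an established library such as Biopython is probably a better choice.

```python
def parse_fasta(lines):
    """Parse FASTA-formatted lines into a {header: sequence} dictionary."""
    records = {}
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                records[header] = "".join(chunks)  # store the previous record
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence data may span multiple lines
    if header is not None:
        records[header] = "".join(chunks)  # store the final record
    return records


# Invented sample data for illustration only.
sample = """>seq1 hypothetical example
ATGC
GGTA
>seq2
TTAA
""".splitlines()

genomes = parse_fasta(sample)
lengths = {name: len(seq) for name, seq in genomes.items()}
print(lengths)
```

To read a downloaded file, pass an open file handle instead of the sample list, for example `parse_fasta(open(path))` with `path` pointing at the local file.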

Other Relevant Data

AirNow Data: Real time air quality data for major world cities and US locations - Added April 30

Detailed Description: The link goes to the Embassies and Consulates section, where you can click a specific city and then choose the historical tab to download CSV files of air quality data for specific years. There is also a developer API, documented at https://docs.airnowapi.org/.
Data Resolution: Global, Major World Cities, US Cities
Frequency of update: Hourly
Download Method: Download

File type: CSV

Cleaning requirements: Moderate
Link: https://www.airnow.gov/international/us-embassies-and-consulates

COVID-19 Interactive Dashboard

COVID-19 Background Information

Experts' Thoughts on Dealing with COVID-19 Data

We encourage you to recognize both the limitations of the data and the limits of your ability to draw conclusions from it. The importance of COVID-19 means that any visuals created may be displayed in other contexts, and we ask you not to overreach in any conclusions you attempt to draw. See below for a list of articles on this topic.

You are (almost definitely) not qualified to make predictions about COVID-19. We’re here to help explain why, by Andy Cotgreave
Visualizing coronavirus data? Consider adding a disclaimer, by Amanda Makulec
Display New Daily Cases of COVID-19 with Care, by Stephen Few
10 considerations before you create another chart about COVID-19, by Amanda Makulec
Coronavirus Case Counts Are Meaningless*, by Nate Silver
Rates of change are tricky, by Alberto Cairo
A conversation with an Epidemiologist: 5 things to keep in mind when you look at the numbers on COVID-19, by Amanda Makulec