So if you want to teach an algorithm what a narwhal looks like, this would be a good place to start.

The Centers for Disease Control and Prevention maintains a database on cause of death.

Maybe to train a never-ending language learner named NELL? Did you know that Google has a search engine for data sets? The economic incentives for predicting the weather are absurd. How about fraternity members or ham radio operators?

As a consequence, students of the game today benefit from one of the richest data sets of any game or sport. Airline Data Project. This set of five surveys regarding how different groups experience employment could answer that question.

Educational Statistics: data on education by country. Maybe your plans are slightly less ambitious. Bureau of Labor Statistics. On the topic of games, for soccer fans, I recently came across this freely available data set of soccer games, players, teams, goals, and more.

One convenient way to use that API is through the choroplethr package for R. But where to find the data for such a thing? The Open Product Data website aims to make barcode data available for every brand for free. The first step is to find an appropriate, interesting data set.

Predicting stock prices is a major application of data analysis and machine learning. Data can be exported into statistical software such as Excel and SAS.

Summary statistics are a way to.
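As a rough sketch of what computing summary statistics on one of these data sets might look like, here is a small Python example. It assumes a column of numbers has already been loaded into a plain list; the `summarize` helper and the sample values are made up for illustration and do not come from any of the data sets mentioned here.

```python
import statistics


def summarize(values):
    """Return basic summary statistics for a non-empty list of numbers.

    Hypothetical helper for illustration only.
    """
    data = sorted(values)
    return {
        "count": len(data),
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        # Sample standard deviation needs at least two points.
        "stdev": statistics.stdev(data) if len(data) > 1 else 0.0,
        "min": data[0],
        "max": data[-1],
    }


# Made-up sample values standing in for one numeric column.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
summary = summarize(sample)

for name, value in summary.items():
    print(f"{name}: {value}")
```

For a real data set you would probably reach for a data-frame library instead, but the standard library is enough to get a first feel for a column.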

Health Statistics & Data: Datasets/Raw Data

A growing collection of datasets that come exclusively from UCLA researchers and exist for a variety of classroom uses. The Housing Affordability Data System (HADS) is a set of files derived from the national and metropolitan American Housing Survey (AHS).

UC Berkeley's principal archive of digitized social science data and statistics. It operates as part of UC Berkeley's new Social Science Data Lab (D-Lab) and provides access to a broad range of computerized social science data for faculty, staff, and students at UC Berkeley.

+ Interesting Data Sets for Statistics. May 29, by Robb Seaton. Edit: Hey guys! This has proved to be one of the most popular articles on the site, so I’ve created a supplemental download on the 5 biggest statistics mistakes beginners make and how to avoid them.

The project has been collecting user data for years, and gwern has.

