At DataMarket, for example, you can search for datasets, gather the data, upload your own data, and compare. You can also output the results using DataMarket's chart and visualization templates.
Amazon offers Public Data Sets on AWS, which provides a centralized repository of free public data sets that can be integrated into AWS cloud-based applications. Google has Public Data Explorer, and IBM's ManyEyes is geared toward visualization.
Factual concentrates on places and products. You can mash up your information with its data on local businesses, points of interest, restaurants, hotels, and consumer packaged goods.
Infochimps has many free datasets, including the raw text of 4,771 erotica stories, 100,000+ official crossword words, and the birth and death rates of US teenagers, culled from the US Census.
Microsoft Windows Azure Data Marketplace, as the name implies, integrates data with its applications. Its data assets include economic indicators, telephone numbers, and weather data, as well as regional datasets like crime statistics for England and Wales.
There are advantages to buying data from these marketplaces. For one thing, it's clean, which may be a welcome change from the messy data you've been trying to scrub. Many of the services also enable you to do your data crunching on their servers, freeing you from time-consuming and often complicated downloads. If you are already using a cloud-based data analytics solution from one of the providers, the process is even easier.
And you may be surprised by the variety of data that's available.
"The wide availability of data continues to surprise me every day," said Shawndra Hill, who works with and teaches about Big Data in the Operations and Information Management Department at The Wharton School of the University of Pennsylvania.
"My colleagues and I have used publicly available data to predict drought in Ethiopia, the success of TV shows, what people will follow on Twitter, the success of advertising, and stock market trends," Hill said. "We have also worked on linking drugs to their side effects. In the past, these projects wouldn’t be possible without partnerships with firms that allowed the use of their proprietary data."
Editor’s Note: This is the fourth post in the ongoing series “Are You Ready for Big Data?” by DC Denison. Download the complete “Are You Ready for Big Data?” ebook to learn more about Big Data, its applications in creating the next-generation digital experience, and what it takes to get into the game.