Data Project 2: Analytics for Good

OIDD 245 Tambe


Project theme

This assignment asks you to create a project using R around the theme "Analytics for Good". Analytics for Good is any use of analytics that is meant to serve the public good. In other words, the focus should be on using analytics to better society or the world in some way. There are many examples of analytics being used for good, ranging from catching wildlife poachers to detecting climate change to helping the homeless.

A successful data product (e.g. 1, 2) should have users thinking "This is interesting and this is useful!" and requires some amount of research or domain knowledge about an industry. It is worthwhile to think hard about your options and to collect feedback from other students. This must be a new project, conceived and implemented specifically for this assignment.

Ultimately, this project is meant to provide a platform to apply data skills in a context where you can specify some of the project structure. You are highly encouraged to pursue new and creative data sources. Many are freely available on the web, and data from web sources like Twitter, Yelp, and other companies can be harvested for analysis. In this class, we have covered web scraping and surveyed some visualization technologies. You are also encouraged to use methods that we have not yet covered in class. If you are unsure whether a project you are considering is a good candidate, please ask me!

Learning objectives

  1. Gain experience with data skills and with working with large data sets using R.
  2. Learn to appreciate the combinatorial nature of the possibilities that arise when combining data sources.
  3. Try to be creative in your projects. Many data scientists would argue that creativity, domain knowledge, and storytelling are as important as, or even more important than, skills such as R or Python when developing data products.

Project requirements


Grading criteria

You will be graded on the quantity, quality, and creativity of the data sources, the visualizations, and the utility of the project. Additional visualizations earn more credit only if they contribute to the central story being told in the post. The project is worth a total of 100 points, divided into several parts: