Data Project 2: Analytics for Good
OIDD 245 Tambe
- Data project 2 is to be completed individually.
- Deadline: Please see the Canvas assignment page for the deadline.
This assignment asks you to create a project using R around the theme "Analytics for Good". Analytics for good is any use of analytics that is meant to serve the public good. In other words, the focus should be on the use of analytics to better society or better the world in some way. There are many examples of where analytics is being used for good, ranging from catching wildlife poachers to detecting climate change to helping the homeless.
A successful data product should have users thinking "This is interesting and this is useful!" and requires some research or domain knowledge about an industry. It is worthwhile to think hard about your options and to collect feedback from other students. This must be a new project, conceived and implemented specifically for this assignment.
Ultimately, this project is meant to provide a platform for applying your data skills in a context where you can specify some of the project structure. You are highly encouraged to pursue new and creative data sources. Many are freely available on the web, and data from web sources like Twitter, Yelp, and other companies can be harvested for analysis. In this class, we have covered web scraping and surveyed some visualization technologies, and you are also encouraged to use methods that we have not yet covered in class. If you are unsure whether a project you are considering is a good candidate, please ask me!
- Gain experience with data skills and with working with large data sets using R.
- Learn to appreciate the combinatorial nature of the possibilities that arise when combining data sources.
- Try to be creative in your projects. Many data scientists would argue that creativity, domain knowledge, and storytelling are as important as, or even more important than, skills such as R or Python when developing data products.
- A typical project may involve the R-based tools we covered in class, which can include some data cleaning and tools used for data visualization. If they can add value, using multiple data sources is recommended.
- The key output of the project is the visualizations. You should include a number of R-based data visualizations to support a central data-driven story. You should choose visualizations that contribute to your story. You cannot use Tableau for this project. Moreover, you need not perform any statistical analysis for this project; that will be left for the final data project.
- The deliverable for this project is a link (URL) to a Medium post where your project can be viewed.
- Building a Medium post: A goal of this project is to provide you with the space to take another step towards building a project portfolio that you can share with others. When submitting your Medium post link to Canvas, make sure you get the 'friend link' from the top right of the page as shown below, so that anyone with that link can access it even if it's behind a paywall.
- Through Canvas, you should also submit any code (e.g. R files) you used for your project.
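As a rough illustration of the workflow described in the list above (combining data sources, light cleaning, and an R-based visualization), here is a minimal sketch. It assumes the ggplot2 package is installed. The two data frames, their column names, and all of their values are hypothetical and made up purely to show the mechanics; a real project would load actual, credited data sources instead.

```r
library(ggplot2)

# Hypothetical toy data: invented values, used only to show the mechanics.
shelters <- data.frame(
  district = c("North", "South", "East"),
  beds     = c(120, 80, 150)
)
population <- data.frame(
  district  = c("North", "South", "East", "West"),
  residents = c(40000, 25000, 60000, 30000)
)

# Combine the two sources on their shared key. merge() performs an inner
# join by default, so districts missing from either table are dropped.
combined <- merge(shelters, population, by = "district")

# Derive a comparable metric: shelter beds per 10,000 residents.
combined$beds_per_10k <- combined$beds / combined$residents * 10000

# One visualization supporting a single point: how capacity compares
# across districts once population is taken into account.
p <- ggplot(combined,
            aes(x = reorder(district, beds_per_10k), y = beds_per_10k)) +
  geom_col() +
  coord_flip() +
  labs(
    x = "District",
    y = "Shelter beds per 10,000 residents",
    title = "Shelter capacity relative to population (toy data)"
  )

print(p)
```

The join is what makes the chart informative here: raw bed counts alone would rank the districts differently than the per-capita figure does, which is one small example of the value added by combining sources.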
In addition to your project information, you should also include the following information in your post.
- Who you are (so we can directly see on the site who the submission is from)
- Please make sure you give credit to the data sources you use
Some examples of Medium posts that incorporate data visualizations to tell a story (unrelated to the "Analytics for Good" theme):
You will be graded on the quantity, quality, and creativity of your data sources and visualizations, and on the utility of the resulting data product. Additional visualizations earn more credit only if they contribute to the central story being told in the post. The project is worth a total of 100 points, divided into several parts:
- Creativity of the idea and utility of the data product (30 points)
- Quality of analysis (40 points)
- How well the web site is executed (30 points)