Data project 3

OIDD 245 Tambe


Data project 3 is meant to be completed individually and gives you a platform to apply your R skills to a project whose structure and context you are free to specify. One goal is to complete a data project that may be of specific use to you in getting a future job (i.e., something you can talk about in an interview or show to a future employer). Another goal is to let you spend time on a project based in an industry context that is of specific interest to you, since as a group you are going to be entering different industries (e.g., consulting, finance, real estate, healthcare, technology). This can also be a “passion” project on a topic you are particularly interested in, which could be anything from protecting endangered species to Game of Thrones.

You are highly encouraged to pursue rich and creative data sources. Many are freely available on the web. Moreover, data from sources such as Facebook, Twitter, Yelp, and other companies can be harvested for analysis. In this class, we have covered web scraping, using APIs, text mining, and prediction, and have surveyed some visualization techniques. You are also welcome to use packages and methods that we have not covered in class.

Choosing an audience and specifying an “interesting” question are important parts of the assignment (and of any good data science work). What makes for an interesting question can be subjective and is often domain specific. If you are stuck, it is therefore a good idea to ask friends, family, TAs, or me for feedback. I am happy to provide feedback over the next few weeks as you develop your projects.


Project Requirements


Grading Criteria (125 pts)

Some sample projects from past years