Lab 4. AI and Auction Houses (20 pts)

Tambe, OIDD 255X


Figure 1: Sample submission page from Sotheby’s

Introduction

Imagine that, in a consulting role, a new engagement leads you to work with a well-regarded auction house that focuses on luxury goods (e.g. fine art, jewelry, high-end bags, and rare artifacts). This auction house is the largest in the industry and employs a team of trained experts who evaluate the authenticity of all objects submitted to auction. Parties interested in selling items through the auction house initiate the process online by submitting images of the object (see, for example, Figure 1, taken from the Sotheby’s site).

As a first step, objects in these images are evaluated by experts to assess authenticity (i.e. to ensure that an object is not counterfeit) and only those items deemed authentic move forward to auction. All of the images that have ever been submitted to the auction house and the expert judgments they receive as to their authenticity have been stored in their databases. The auction house has been collecting this type of data for years and they have the largest database of this kind in the industry.

This auction house has hired you to help lead it into the age of AI. As a high-profile auction house, they are becoming overwhelmed by parties looking to sell through them. They have asked for guidance from you on how an AI-based tool could be used to help algorithmically identify counterfeit goods.

Deliverables

Please place your answers in a separate Word or Google document and submit it through Canvas. A response of 400 to 1000 words is a reasonable length.

Part 1. Counterfeit detection

You explain to the auction house that the database they have been building is a particularly useful asset for developing a machine learning tool that can identify counterfeit objects from image data, because it contains a large volume of object images, each with a clear label assigned by experts (counterfeit/authentic).

As a first effort, you work with their engineers to use these data to build two classifiers that can predict whether an image is of a counterfeit item. These classifiers exhibit the following performance characteristics on a test data set. Assume that, for all of the tools discussed in this assignment, the positive class (1) corresponds to when the classifier predicts that something is “counterfeit”.

Property       Classifier A    Classifier B
Precision      0.8             0.73
Sensitivity    0.7             0.77
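
For reference, both metrics in the table are derived from confusion-matrix counts on the test set. The sketch below is illustrative only: the counts are hypothetical, chosen so that they reproduce Classifier A's figures, and the function names are not taken from any particular library.

```python
def precision(tp: int, fp: int) -> float:
    """Share of items flagged as counterfeit (predicted 1) that are truly counterfeit."""
    return tp / (tp + fp)


def sensitivity(tp: int, fn: int) -> float:
    """Share of truly counterfeit items that the classifier flags (also called recall)."""
    return tp / (tp + fn)


# Hypothetical test-set counts, with "counterfeit" as the positive class (1).
tp = 56  # counterfeit items correctly flagged
fp = 14  # authentic items wrongly flagged as counterfeit
fn = 24  # counterfeit items the classifier missed

print(round(precision(tp, fp), 2))    # 0.8
print(round(sensitivity(tp, fn), 2))  # 0.7
```

Note that precision is affected by false positives (authentic items flagged), while sensitivity is affected by false negatives (counterfeits missed).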

The auction house would like to deploy one of these two models (i.e. put it into production on their web site to predict whether a submitted image is of an object that is counterfeit or not).

Question 1A

Under what circumstances might you advise the auction house to use Classifier A over Classifier B? Be very specific about the cost/benefit assumptions that inform this recommendation and feel free to use examples. (2 pts)

Question 1B

Under what circumstances might you advise the auction house to use Classifier B over Classifier A? Be very specific about the cost/benefit assumptions that inform this recommendation and feel free to use examples. (2 pts)

Part 2. AI and Bias

During the initial rollout of your algorithm, you notice that it is less accurate at identifying counterfeits when the items come from sellers outside the US. This is something the auction house would like to address.
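
One way such a gap might surface is by computing the same metric separately for each seller group on held-out data. The sketch below assumes hypothetical test records of the form (seller region, expert label, model prediction); the records, the group names, and the helper function are made up for illustration and are not the auction house's actual data or tooling.

```python
from collections import defaultdict

# Hypothetical records: (seller_region, expert_label, model_prediction),
# where 1 means "counterfeit" and 0 means "authentic".
records = [
    ("US", 1, 1), ("US", 1, 1), ("US", 1, 1), ("US", 1, 0), ("US", 0, 0),
    ("non-US", 1, 1), ("non-US", 1, 0), ("non-US", 1, 0), ("non-US", 1, 1),
    ("non-US", 0, 0),
]


def sensitivity_by_group(rows):
    """Return, per group, the share of true counterfeits the model flagged."""
    flagged = defaultdict(int)  # true counterfeits predicted as counterfeit
    total = defaultdict(int)    # all true counterfeits
    for group, label, prediction in rows:
        if label == 1:
            total[group] += 1
            flagged[group] += prediction
    return {group: flagged[group] / total[group] for group in total}


print(sensitivity_by_group(records))  # {'US': 0.75, 'non-US': 0.5}
```

In this made-up sample, the model catches a smaller share of true counterfeits from non-US sellers than from US sellers.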

Question 2A

Assuming that you have adequate training data from US and international sellers, how might this type of bias have arisen in the classifier? This question is asking you to speculate about what a source of bias COULD be. There is not enough information to identify a specific source of bias with certainty. (2 pts)

To address this issue, you suggest reverting to a simpler machine learning model that does not use the raw images, and is based only on hand-coded features, such as country of origin, item category, shipment details, seller history, and potentially some features hand-coded from the images. Given this limited feature set, you advise your client that there are different approaches that can be used to avoid the perception of unfairness in algorithmic decisions.

Question 2B

When using the limited feature set, how might you implement anti-classification as a fairness approach in this context? What are the advantages and disadvantages of using this approach to achieve fair outcomes? (2 pts)

Question 2C

When using the limited feature set, how might you implement demographic parity as a fairness measure in this context? What are the advantages and disadvantages of using this approach to achieve fair outcomes? (2 pts)

Question 2D

Imagine you are given a sample of the classifier output on some test data. How would you check for predictive parity in this data set? (2 pts)

Question 2E

Imagine you are given a sample of the classifier output on some test data. How would you check for individual fairness in this data set? (2 pts)

Part 3. AI and competition

A tech startup has approached your client. This startup is trying to aggregate databases of the type described earlier from a number of major auction houses in order to train deep learning models for counterfeit detection that they would then sell as a service to auction houses around the world.

They propose to your client that they can take in their data and combine it with data from other auction houses. Your client can then subscribe to their service which, when given an image of an object, predicts whether or not it is counterfeit. Your auction house client would be their first partner.

Since your auction house client is the largest in the industry, and because their data represent a valuable asset, the startup is offering the subscription to your client free of charge: there is no cost to use it if your client is willing to part with their data.

Question 3A

From the perspective of the auction house you are advising, what is a strategic advantage to be gained by working with this startup? (3 pts)

Question 3B

From the perspective of the auction house, what is a potential strategic disadvantage of working with this startup? (3 pts)