On Wednesday, January 6, the FTC released a report titled “Big Data: A Tool for Inclusion or Exclusion?” (the “Report”). The Report addresses the effects of the growing use of big data analytics on low-income and underserved populations, and the FTC’s role in monitoring and regulating the impacts of the commercial use of big data. There are two high-level takeaways from the Report. First, big data is a powerful tool that can be used to include or to exclude. Used responsibly, it can be a key to unlocking opportunities for underprivileged and underserved classes; but the FTC emphasizes that, when used with disregard for its effects, big data can shut the underprivileged and underserved out of those same opportunities. Second, the FTC will be the cop on the beat. The Report’s emphasis on the tools at the FTC’s disposal for regulating the use of big data analytics signals that the FTC intends to make use of its enforcement powers where it can.
The Report begins by observing that “mining large data sets to find useful, nonobvious patterns is a relatively new but growing practice in marketing, fraud prevention, human resources, and a variety of other fields.” Big data is beginning to help underserved communities in various areas, including education, access to credit, individually tailored healthcare, and equal access to employment. For example, Google used analytics to determine that its emphasis on grade point averages and brainteasers in its recruiting process introduced unintended biases into the candidate pool, and it revised its practices accordingly. For these points, the FTC repeatedly cited the workshop testimony of Hogan Lovells partner Chris Wolf, as well as comments and a report from the Future of Privacy Forum.
The Report next identifies several risks of predictive analytics, such as mistakenly denying individuals opportunities based on the actions of others, creating new disparities or reinforcing existing ones, exposing sensitive information, assisting the targeting of vulnerable individuals for fraud, and creating new justifications for exclusion. The Report cautions data miners that, while using more data can “increase the power of the analysis, simply adding more data does not necessarily correct inaccuracies or remove biases.”
The Report notes that companies assessing the risks of big data should both (1) maintain awareness of, and familiarity with, research aimed at identifying potential biases and inaccuracies, and (2) understand the relevant laws. The FTC urges companies to ensure that they use data ethically and comply with existing fair credit, equal opportunity, and consumer protection laws. Drawing upon testimony and research collected in connection with the FTC’s workshops, including a paper by Hogan Lovells senior associate Andrew Selbst, the Report recommends that data miners consider four questions when using predictive analytics:
- How representative is your data set? Depending on how the data was collected, data sets could be missing information about certain populations, and companies should work to address issues of underrepresentation and overrepresentation.
- Does your data model account for biases? Models can contain built-in biases when they incorporate data that itself reflects prejudices, and thereby reproduce past, disparate results. Companies should develop strategies to overcome such biases.
- How accurate are your predictions based on big data? Big data is excellent at finding correlations but not at explaining them. Companies should take care to account for all relevant variables and give sufficient attention to traditional applied statistics techniques to avoid inaccuracies.
- Does your reliance on big data raise ethical or fairness concerns? Companies should assess the predictive value of the various factors in the predictive analysis and ensure that certain factors do not raise fairness concerns. For example, one company incorporated into its hiring algorithm the fact that employees who lived closer to work stayed at their jobs longer than those who lived farther away, while another excluded this factor as racially discriminatory because of the differing racial composition of neighborhoods.
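As a rough illustration of the first question, a company's analysts might compare each group's share of a data set against that group's share of a reference population. The sketch below is only one simple way to do this; the function name `representation_gaps`, the group labels, and all of the figures are hypothetical and do not come from the Report.

```python
from collections import Counter


def representation_gaps(records, reference_shares, group_key="group"):
    """Return each group's share of the data minus its reference share.

    A positive gap suggests overrepresentation in the data set and a
    negative gap suggests underrepresentation (hypothetical helper).
    """
    counts = Counter(record[group_key] for record in records)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("records is empty")
    return {
        group: counts.get(group, 0) / total - reference
        for group, reference in reference_shares.items()
    }


# Made-up example: group A is 80% of the data set but 60% of the
# reference population, so group B is underrepresented.
sample = [{"group": "A"}] * 80 + [{"group": "B"}] * 20
reference = {"A": 0.6, "B": 0.4}

gaps = representation_gaps(sample, reference)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

A gap of this kind is only a starting point: it flags where a data set departs from a chosen benchmark, and the choice of benchmark is itself a judgment that the company would need to justify.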
The Report reviews the various laws under which the FTC has authority to regulate uses of data: the Fair Credit Reporting Act (FCRA), the Equal Credit Opportunity Act (ECOA), and the FTC Act. The FTC observes that, although big data allows for credit determinations based on non-traditional data, such as social media information, the use of such data in a credit determination would be subject to the FCRA. The FTC also observes that, to the extent the use of data disadvantages protected classes, it may implicate equal opportunity laws such as the ECOA. Finally, the FTC states that it will take action against unfair and deceptive uses of data under the authority granted to it in Section 5 of the FTC Act.
The Report also makes several recommendations for compliance. A company that compiles big data for others to use in eligibility decisions should comply with the accuracy and privacy provisions of the FCRA, and a company that receives such data should comply with the provisions applicable to users of consumer reports. Additionally, the Report recommends that companies ensure they are not treating people differently on a prohibited basis, such as race or national origin. Further recommendations are highlighted on page 24 of the Report, including compliance with Section 5’s requirements to avoid deceptive practices and to use reasonable data security.
The Report concludes by exhorting government, academics, consumer advocates, and industry to work together to “help ensure that we maximize big data’s capacity for good while identifying and minimizing the risks it presents,” and notes that the FTC will continue to monitor the space.
Andrew Selbst, a senior associate in our Washington, D.C. office, contributed to this entry.