A Policy for More Transparent Recommender Systems

[This is part two of three in a blog post series about transparency and recommender systems. Here are part one and part three]

Recommender systems have become prevalent in communication platforms, e-commerce, and online information search [1]. However, as we discussed in a previous blog post, the use of these systems may give rise to conflicts of interest and unintended negative societal consequences. Many AI policies have acknowledged the need for transparency of AI systems in general. Here we argue for a concrete policy for increased transparency specifically of recommender systems. We address the most important arguments against such a policy and conclude that it is nevertheless well motivated.

Proposal

Providers of consumer-facing recommender systems should be required to disclose to an independent public authority how their systems work. They would have to describe which types of algorithms are used, how these are optimized, i.e., their goal functions, and what types of input data have been used to train them.

Hence, suppliers would have to divulge whether an algorithm aims to maximize engagement in terms of time spent on the platform, or the retention of paying customers. They would also have to disclose what kinds of behavioral input data are gathered to train the algorithm, for example, whether a user’s hovering over a video thumbnail is registered and used. Any non-behavioral input variables that are used, such as age, location, and gender, should also be disclosed.
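To make the proposal concrete, a disclosure of this kind could be expressed in a simple machine-readable form. The sketch below is purely illustrative: the field names and example values are our own assumptions, not a format required by any existing regulation.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class RecommenderDisclosure:
    """Hypothetical record a provider might file with an independent public authority."""
    algorithm_family: str             # e.g. "collaborative filtering", "deep ranking model"
    optimization_objective: str       # the goal function the system is trained to maximize
    behavioral_inputs: List[str]      # behavioral signals gathered from users
    non_behavioral_inputs: List[str]  # demographic or contextual variables
    opt_out_available: bool           # whether users can disable personalized recommendations

# Illustrative filing for a fictitious video platform
example = RecommenderDisclosure(
    algorithm_family="deep ranking model",
    optimization_objective="maximize expected session watch time",
    behavioral_inputs=["watch history", "thumbnail hover events", "search queries"],
    non_behavioral_inputs=["age", "location", "gender"],
    opt_out_available=False,
)
```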

Rationale

Currently, platform providers are not required to reveal information about how their systems work. As a result, consumers often have no way of knowing how their own preferences may conflict with the objectives of the systems they are exposed to. By making it possible to learn how a particular algorithm works, consumers could make more informed choices about using a particular platform, and perhaps opt out of seeing its recommendations.

Algorithmic transparency is likely to promote the development of alternative systems that better match consumer desires, use less personal data, or offer the possibility to disable algorithms that encourage unwanted behavior such as increased self-disclosure or longer time spent on mobile apps. If companies were obliged to disclose how their recommender systems work, online environments that offer an opt-out alternative could gain a competitive advantage, as platforms offering such possibilities may become more attractive to users.

The idea that consumers have a right to information about the products and services they use is arguably already within the purview of existing consumer protection regulations. For instance, the Consumer Rights Directive 2011/83/EU states that consumers buying a product or service online are entitled to extensive information in advance of purchase. Parallels can also be drawn to other areas of commerce. For example, the Food and Drug Administration (FDA) requires all ingredients to be listed on food labels.

“…the rationale for the proposed policy parallels the way that food labeling regulations were motivated in the US in the 1960s and 1970s… calls for more detailed nutritional information were driven by the idea that the right to be informed could not be separated from the right not to be misled”

Hence, the rationale for the proposed policy parallels the way that food labeling regulations were motivated in the US in the 1960s and 1970s; at that time, the FDA’s calls for more detailed nutritional information were driven by the idea that the right to be informed could not be separated from the right not to be misled, as grounded in sections 201(n) and 403(a) of the US Federal Food, Drug, and Cosmetic Act of 1938 [2]. Accordingly, the agency argued that the omission of material information could be as deceptive as the inclusion of false information, and that labeling promoted “honesty and fair dealing in the interests of consumers” [2].

Potential drawbacks and perspectives to consider

We have identified five potential concerns related to this policy: it may result in increased bureaucracy; the algorithms might be too complex for users to comprehend; it may reduce profit for businesses; it may enable users to exploit the algorithm in ways that are detrimental to other users (so-called “gaming”); and it may hinder technology providers’ swift responses to urgent threats. Below, these five aspects are addressed.

An overwhelming bureaucratic burden?

First, it can be argued that divulging details about the algorithms that underlie recommender systems will impose an insurmountable bureaucratic burden on technology providers, due to the high frequency of updates and the complex nature of these systems.

However, transparency is already an integral part of running a large public company under current regulations. For instance, publicly traded companies in the U.S. are required to disclose standardized financial information to a government agency, the Securities and Exchange Commission (SEC). Likewise, listed companies in the EU are required to disclose financial information in a standardized form.

In the case of recommender algorithms, it is evident that the providers have the information available. Such technologies do not just appear; the goal functions and the input variables used to train them are outcomes of deliberate processes and strategic decision making by profit-maximizing organizations. The developers have programmed the algorithms themselves, and thus have access to the information they would be asked to disclose. Further, companies would not be asked to reveal details that could be considered trade secrets.

A proposal made pointless by algorithmic complexity?

Second, it may be argued that consumers will be unable to comprehend the complex algorithms that recommenders involve, and that this will make the policy toothless. Machine-learning systems often base their decisions on statistical patterns that do not lend themselves easily to human interpretation. The difficulty of interpretation grows with the number of input features, each contributing to the resulting output in nearly indiscernible ways.

Even the developers of a recommender system may sometimes be unable to explain the logic behind a specific suggestion. Demanding transparency may seem untenable when the algorithms appear inherently opaque. Our policy recommendation circumvents these issues by focusing on information that is directly under human control, namely, the input variables and what the algorithms are trying to optimize. In these cases, the lack of transparency is an intentional choice, not the result of technical limitations. 
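The point that goal functions are deliberate human choices can be illustrated with a minimal, hypothetical training sketch: whatever the internal complexity of the learned model, the quantity it is optimized for is written down explicitly by its developers, and can therefore be disclosed without interpreting the model’s inner workings.

```python
# Minimal, hypothetical sketch: the objective a recommender is trained to
# maximize is an explicit choice in the code, even when the resulting model
# is hard to interpret.

def engagement_objective(predicted_watch_times):
    # Goal function chosen by the provider: maximize expected time on platform.
    return sum(predicted_watch_times)

def wellbeing_aware_objective(predicted_watch_times, predicted_regret):
    # An alternative goal function a provider could equally well have chosen.
    return sum(t - r for t, r in zip(predicted_watch_times, predicted_regret))

# During training, exactly one of these is plugged in as the quantity to
# maximize; disclosing which one requires no access to the model's weights.
```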

Notably, a similar argument about lack of consumer comprehension was put forth when nutrition labels were introduced in the 1970s. However, while it has been shown that consumers often do not understand or use food labeling, for instance in restaurants [3], the policy prevails and remains relevant for two reasons:

“The reported privacy violations highlight the need for independent institutions that can guard against malfeasance and assess related social impacts.”

First, labeling helps consumers with individual needs as they strive to avoid (or consume more of) foods with particular contents, due to allergies or other dietary restrictions. In fact, it has been shown that nutrition label reading correlates with dietary practices [4]. Similarly, knowing what an algorithm is optimizing would allow consumers of digital products to make more informed decisions about how they use them. This could be particularly important for children or adults who are especially susceptible to manipulation, for instance, people with cognitive disabilities or addictive tendencies. It is notable in this context that food allergies are increasingly common, which highlights that food labeling is becoming ever more relevant for consumers. This parallels the current policy proposal, since the public’s worry over data use is growing, and a recent survey revealed that 69% of consumers are concerned about how their data is collected in mobile apps. This indicates an increasing demand for disclosure of what types of personal data are used to train recommender algorithms. 

Second, the fact that the information is available makes it possible for engaged citizens, journalists, researchers, and public health authorities to compare products and spread awareness of the implications of various contents. Transparency of this kind thus enables public debate, which in the case of nutrition labels may concern, for instance, the health effects of particular contents such as sugar, or the taxation of sugary drinks.

In terms of digital environments, the risks related to recommender systems and personal data disclosure online are becoming increasingly evident in the public press (see here and here for examples). The reported privacy violations highlight the need for independent institutions that can guard against malfeasance and assess related social impacts. To do this, specific knowledge of how the underlying algorithms work is needed. It is often put forth as a problem that only technology providers are able to thoroughly assess the behavioral and social implications of their algorithms. A transparency policy could thus improve the relationship between providers and the public by reducing the risk of, for instance, sudden revelations in the media by whistleblowers. Therefore, we argue that revealing functionality will help engaged institutions compare technologies and drive the public debate, even though the algorithms may be too complex for many consumers to engage with properly.

Bad for business?

Third, it could be argued that this policy would reduce profit, since it would limit companies’ opportunities to increase profit margins by relying on recommender algorithms to nudge or otherwise persuade users to purchase products or services. However, the policy we propose promotes consumer choice; it does not prohibit or preclude the use of algorithms that aim to promote user engagement and purchases.

Nevertheless, we recognize that the policy may initially reduce profits since it may make the public less inclined to use digital services with recommender systems that optimize the platform’s own profit. Profits may be negatively affected by consumer reactions to information about what is being optimized. For instance, consumers may choose to avoid websites using recommendations that appear to disregard their well-being when it conflicts with profit margins. But, while this may be damaging for individual businesses, enabling consumers to act on their preferences should be viewed as a policy success: a debugging of a market failure.

The policy will motivate companies to meet the real, long-term needs and desires of consumers. With disclosure, users will make more informed choices, driving providers towards alternatives that better match user demands, for instance related to privacy. A recent survey revealed that more than half of users do not trust their connected devices to handle their personal information respectfully. Food labeling has been shown to promote self-regulation, as companies need to be more open about the content of their products [2]. Hence, we believe that transparency will be good for the market in the long run, since it will promote trust. This trust will be maintained as there will be fewer opportunities for hidden motives, and the overall quality of the consumer experience is likely to improve over time.

Risks related to gaming?

Fourth, an argument against disclosure is that some users may exploit knowledge of the recommendation system for their own ends, to the detriment of other users, so-called “gaming” [5]. Some algorithms require that users be unaware of how they work in order to be effective. For example, Google keeps the details of its search algorithm secret to avoid being “gamed” by search engine optimization companies that seek to boost the visibility of certain web pages. De Laat [6] detailed why total algorithm transparency may lead to perverse effects: when models are open, interested parties may use information about proxies to evade them. Content providers may use knowledge of the algorithms’ functionality and input variables to promote their own content, for example by avoiding certain words or certain images. This may bias the recommended content in favor of actors engaging in this practice, with drawbacks for individuals and society, such as less efficient spam filters and increased spread of hate speech, misinformation, and pornographic and violent content. This is an unavoidable drawback of the policy, and it motivates a strong role for the public body that is to receive and assess the information from technology providers.

An obstacle to swift action?

Fifth, it should be considered that the requirement to disclose the algorithm could constrain or delay companies’ ability to act in emergency situations that require a fast response, for example, if a tweak to an algorithm inadvertently promotes an anti-semitic conspiracy theory. We find this a valid point and concede that any policy in this regard will have to be designed to allow tech providers to act without obstruction when such urgent responses are called for.

The next step

In this blog post we have proposed that platforms that use recommender systems should be compelled to disclose their functionality to an independent public body, and we have discussed some concerns that such a policy may give rise to. In the third and final blog post on this topic, we will elaborate on the specifics of the proposed policy and provide a more detailed outline.

[1] Burke, R. Hybrid Recommender Systems: Survey and Experiments. User Model User-Adap Inter 12, 331–370 (2002). https://doi.org/10.1023/A:1021240730564

[2] For a discussion on this see French, W. A., & Barksdale, H. C. (1974). Food Labeling Regulations: Efforts toward Full Disclosure. Journal of Marketing, 38(3), 14–19. https://doi.org/10.2307/1249845

[3] Krukowski, R. A., Harvey-Berino, J., Kolodinsky, J., Narsana, R. T., & DeSisto, T. P. (2006). Consumers may not use or understand calorie labeling in restaurants. Journal of the American Dietetic Association, 106(6), 917-920. https://doi.org/10.1016/j.jada.2006.03.005

[4] Kreuter, M. W., Brennan, L. K., Scharff, D. P., & Lukwago, S. N. (1997). Do nutrition label readers eat healthier diets? Behavioral correlates of adults’ use of food labels. American journal of preventive medicine, 13(4), 277-283. https://doi.org/10.1016/S0749-3797(18)30175-2

[5] Eslami, M., Vaccaro, K., Lee, M. K., Elazari Bar On, A., Gilbert, E., & Karahalios, K. (2019, May). User attitudes towards algorithmic opacity and transparency in online reviewing platforms. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (pp. 1-14). https://doi.org/10.1145/3290605.3300724

[6] de Laat, P.B. Algorithmic Decision-Making Based on Machine Learning from Big Data: Can Transparency Restore Accountability?. Philos. Technol. 31, 525–541 (2018). https://doi.org/10.1007/s13347-017-0293-z

Authors appear in alphabetical order.

2021-12-12