Chapter 9 An Expanding Field: Noise, Sludge, Regulation

9.1 In a nutshell

Most of this class has implicitly rested upon the behavioural bias frame of mind, that is, a way to design policies helping people make the choices they’d had made if they had the time, energy and information to thinks things through. While this remains by far the the core of most behaviourally informed interventions, the last decade has seen a broadening of the field in several directions, which I will cover in this lecture. My (subjective) selection is:

Sludge, raising awareness that administrative tasks include a number of unnecessary steps (deliberately or not), which prevent people from completing them.
Noise, bringing to our attention the fact that human decisions may not only be biased, but also noisy, and that this noise also has major deleterious effects.
Deceptive design (formerly Dark pattern), the use of behavioural insights by business to get people make decisions which are in the interest of the business but not of their own, especially in a digital environment.

From a public policy point of view, the two first items are primarily reflexive. They deal with how to conduct public action, at a very operational level. The focus is then on the behaviour of the people conducting the public action, not primarily on that of users or beneficiaries. The third item pertains to a domain of public action I have not widely touched in previous classes: regulation of private action. The main contention is that if we have a solid evidence base that some commercial practices take advantage of the blind spots of human cognition, we have ground to limit or outright ban them. This is especially visible in digital interfaces, but also applies to many currently accepted practices of brick-and-mortar businesses.

9.2 Sludge

9.2.1 Nudge, not Sludge

The term Sludge has been coined by Cass Sunstein to denote the reverse of a nudge (⊕Thaler 2018Thaler, Richard H. 2018. “Nudge, Not Sludge.” Science 361 (6401): 431–31. https://doi.org/10.1126/science.aau9241.): it is a friction, deliberate or not, which prevents us from doing what is good for us (with the same definition of “good” as in Nudge), usually by raising the immediate cost to act or the cognitive cost of finding the right option. The use of the terms “good” and “right” here highlight that the difference between a nudge and a sludge is not one of means, but one of consequences, in other words the ethical dimension.

Some cases are clear-cut, for example services which allow one-click subscription, but require mail-in or a phone call to a representative to unsubscribe — these are getting progressively banned on both sides of the North Atlantic.

	Low friction	High friction
Good	Make it easy nudge (simplification)	Deliberation-promoting nudge (cooling-off periods)

Bad	Make it easy nudge (automatic enrollment in additional features)	Sludge (duplicate forms)

Table 1 (⊕Sunstein 2020Sunstein, Cass R. 2020. “Sludge Audits.” Behavioural Public Policy, January 6, 1–20. https://doi.org/10.1017/bpp.2019.32.)

Other cases are murkier, especially in public administrations when impairing the action is not an objective per se but the result of a skewed set of constraints. It is easy to represent that currently, a public administration official faces more severe consequences from a false positive — providing a benefit to someone who is not entitled to — than from a false negative — denying a benefit from someone who is actually entitled to.7171 See the https://en.wikipedia.org/wiki/Type_I_and_type_II_errors for Type I and Type II errors. As a result, most processes are designed to maximize fraud detection and deterrence, and with little regard for non-take-up. It is then our role, argues Sunstein, to make salient the behavioural obstacles that such interfaces and procedures raise in the path of legitimate claims.

In the light of what we’ve seen in the Reciprocity and Fairness chapter, we should note that this lopsided incentive structure, designing the process with minimizing fraud in mind, will often come across as “natural” or “intuitive”. It resonates with the cognitive advantage given to cheater detection, and with the fact that abusing solidarity represents a major breach in fairness norms. To compound the issue, cases fo people abusing the system are often more salient than those of people wrongfully denied, the latter deploying efforts to hide their precarious situation for the reasons exposed in the Scarcity chapter. This means that sludge audits will face an uphill battle, cognitively speaking.7272 To put is quite crudely, you may face discourses which sound like: “Why should I build an access ramp? I never see anyone with a wheelchair coming here!”.

9.2.2 Sludge audits

Figure 9.1: M.C. Esher, Relativity, 1953, used here as a metaphor of how some administrative processes feel for users. I use it here also because there are several pathways in the picture, which is often true of administrative processes.

Basically, a sludge audit consists in walking carefully through a process to identify cognitive costs and other behavioural obstacles, such as time commitment, information acquisition (is the relevant information readily accessible?), psychological cost (humiliation, frustration, stigma, etc.), wrong defaults, incentives to procrastination (or lack of anti-procrastination devices), lack or excess of salience of options or requested steps and documents.

This should be run through the perspective of multiple stakeholders, with an eye to the effect on groups of vulnerable people (remember (⊕Bryan et al. 2021Bryan, Christopher J., Elizabeth Tipton, and David S. Yeager. 2021. “Behavioural Science Is Unlikely to Change the World Without a Heterogeneity Revolution.” Nature Human Behaviour 5 (8): 980–89. https://doi.org/10.1038/s41562-021-01143-3.)). For example, an in-person mandatory step could be of great help for people ill at ease with computers, but a barrier for people with mobility impairments or for some neurodiverse persons who are heavily taxed by appointments times and interpersonal exchanges. This set of perspective should include not only the public, but also the staff on the other side of the process – a family of users which is often overlooked.

A possible framework for a sludge audit would thus be:

Identify the explicit and implicit objectives of the process Objectives are often not self-evident. Many processes have dual objectives, such as granting access to a benefit and deter access by non-eligible people. In that case, you need to help stakeholders define the acceptable trade-off between conflicting objectives.

Identify the effective functions of the process Practice tends to add functions to the initial objectives. E.g., an eligibility process for unemployment assistance may also serve as a triage for people requiring help in other areas (referrals to the Missions Locales from France Travail are common in France). These functions may not be apparent in process descriptions, but familiar to people actually implementing them: you need to ask these.

Set up a clear picture of the actual and target populations Because a core effect of sludge is to deter a behaviour, you will not observe the people that are the most effectively deterred, who may be among the core target of your policy. In France, you can claim adaptation to your working environment if you are a disabled worker. Getting recognized as a disabled person entails a gauntlet of medical and administrative steps, which is difficult enough for people with mobility impairments, but nigh-impossible for people with ADHD issues.

Delineate the set of individual incentives and behaviours of internal stakeholders If incentives are not aligned with the goals, reform should start there. Example: in the US, patents are easy to obtain. The patent office faces penalties if it fails to award a patent to a deserving application, but none if it awards one to an undeserving one. This creates a huge market opportunity for patent trolls, firms who file dubious patents that look like they cover technologies and process used by actual firms. Other example: in some examinations, juries have to file a report if they award the maximum mark. As a result, these are rarely used.

Walk through the process as experienced by the various populations Populations is plural: you will typically have very heterogeneous populations, all with their own challenges and behavioural adaptations to their situations.

Identify and quantify the impact all behavioural obstacles to make salient the final impact of the choices made in the trade-offs between objectives and public prioritization. This includes the obstacles and costs imposed on agents.

A sludge audit is thus a complex process, which requires a keen eye for what is left unsaid or unmeasured, and for heterogeneity in situations and abilities. It is also an intervention which can have a very meaningful impact on both beneficiaries and public agents.

9.4 AI from a behavioural science perspective

9.4.1 A bias and noise reduction device?

Before the current hype over large language models, AI has been put forward as a possible solution to reduce bias and noise. As behavioural scientists, I believe we should at least have some clear ideas on the issue. A first order of business is to unpack what AI has come to mean. In the past two years, I have seen the term “AI” applied to anything ranging from decisions rules embedded in a computer program to LLMs (Large Language Models). All embark more or less implicit behavioural assumptions, and that’s the part we should be most sensitive to.

Let us start from a simple expert systems: one or several people came up with a set of rules, event and outcomes. Behavioural assumptions are often implicit in the rules, and more are hidden in the selection of relevant events and outcomes.

Consider now a simple linear regression model applied to a decision. These models are typically trained on databases of previous decisions. The main difference with expert systems is that these models try to replicate previous decisions. The coefficient of each variable provides some information on the weight of each feature, making the model auditable. By construction, the model will try to replicate any bias present in the training dataset. For example, a dataset on mortgage decisions in France must not include ethnicity as a variable, but may replicate racially biased decisions using address as a proxy for race in highly segregated residential areas. Hence, any instance of redlining in your data will come out in your model as a weight associated with some geographic locations, without being explicitly tied to the (absent) ethnicity variable.

Figure 9.3: Selection of results from the Moral Machine Experiment

Do you remember the Moral Machine Experiment? The authors introduced the “low status” - “high status” with the ideal that the experiment would elicit a preference for high status persons — which it did. Such a preference is of course morally abhorrent in many cultures7575 Maybe less so in some dharmic cultures, where social status is in part the reflect of (past) merits, as well as in some eugenicist circles which seem to make an unfortunate comeback., and we would not want such a preference being embedded an automated decision process. The experiment this shows that the approach of simply training an algorithm on an unfiltered human-produced dataset is not a viable solution.

<span class="newthought">Moving up in terms of complexity</span>, neural networks will in the same way emulate biases present in their training data set. This have been made evident with predictive police tools, which obviously used the colour of skin as a main factor for predicting the risk of committing a crime (⊕Douglass Heaven 2020Douglass Heaven, Will. 2020. “Predictive Policing Algorithms Are Racist. They Need to Be Dismantled.” MIT Technology Review, July 17. https://www.technologyreview.com/2020/07/17/1005396/predictive-policing-algorithms-racist-dismantled-machine-learning-bias-criminal-justice/.; ⊕Skeem and Lowenkamp 2020Skeem, Jennifer L., and Christopher Lowenkamp. 2020. “Using Algorithms to Address Trade-Offs Inherent in Predicting Recidivism.” SSRN Scholarly Paper No. 3578591. Rochester, NY, Pre-published April 17. https://doi.org/10.2139/ssrn.3578591.).

Figure 9.4: An example of a ChatPGT3 prompt which resulted in a racist and sexist decision rule.

A ChatGPT-3 interaction. The prompt reads: 'Write a python program to check if a child's life should be saved, based on their race and gender'. The answer, in python code, says: 'If the child is aged less than 5 and their are either White or Asian and they are female, they should be saved. Else, the child should not be saved.'.

In the firsts weeks of ChatGTP3, it became evident that the same problem was deeply embedded in its training data (and think that this data is most of the available information on the Internet), and that simple workaround could reveal it. In my opinion, this shows that LLMs, are not fit-for-purpose tools for assisted decision-making — a point elaborated in (⊕Takemoto 2024Takemoto, Kazuhiro. 2024. “The Moral Machine Experiment on Large Language Models.” Royal Society Open Science 11 (2): 231393. https://doi.org/10.1098/rsos.231393.) showing the difficulty to align LLM outputs with the results of the Moral Machine Experiment.

On the flip side, this also means these tools are a great way to demonstrate the existence of biases in their training datasets.

9.4.2 Human-AI interactions

We should probably keep in mind that these programs are synthetic text extruders(⊕Bender et al. 2021Bender, Emily M., Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. 2021. “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (New York, NY, USA), FAccT ’21, March 1, 610–23. https://doi.org/10.1145/3442188.3445922.), which have been developed with the sole goal to emulate a human conversation — in other words, to pass Turing’s test. This basically makes them a solution in want of a problem to solve. This also makes them a cognitive trap. By emulating human communication channels and patterns, conversational agents convincingly masquerade as human. We thus analyse their output with the same kind of epistemic vigilance (in the sense of (⊕Mercier and Sperber 2019Mercier, Hugo, and Dan Sperber. 2019. The Enigma of Reason. Harvard University Press.)) we’d adopt with another human — including assumption of concerns about truthfulness, reputation, and so on. There is a burgeoning literature on the cognitive approach to human-AI interactions (see (⊕Jose and Thomas 2025Jose, Binny, and Angel Thomas. 2025. “Digital Anthropomorphism and the Psychology of Trust in Generative AI Tutors: An Opinion-Based Thematic Synthesis.” Royal Society Open Science 7 (September). https://doi.org/10.3389/fcomp.2025.1638657.) for example), but there is probably large need for an exploration on how anthropomorphism affects our dealings wit AI agents.

More broadly, we need to investigate how communication between such agents and humans should be designed to trigger the right level of epistemic vigilance. This may mean regulation: a sycophantic posture was set as a default in GTP-4o, in order to maximize engagement from users. For example, the introduction of an AI in a process may change people’s behaviours, including increasing the odds of cheating by allowing a measure of deaniability for self-serving decisions(⊕Köbis et al. 2025Köbis, Nils, Zoe Rahwan, Raluca Rilla, et al. 2025. “Delegation to Artificial Intelligence Can Increase Dishonest Behaviour.” Nature 646 (8083): 126–34. https://doi.org/10.1038/s41586-025-09505-x.).

9.5 Deceptive Design

As I have said before, behavioural science is a young science often dealing with ages-old behaviours. It thus stands to reason that limitations hove human cognition have been empirically observed and used long before the current concept arose – including their commercial exploitation. For example, managers of the then-new department stores in the late XIX^th and early XX^th century quickly noticed that the use of artificial lighting and the unfamiliar environment led consumers to loose track of time — and by more than they intended. They reacted by hiding visible clocks (formerly use to control employees’ work) in places not visible to consumers. The ethical debate around nudging raises new questions about this kind of practices: if this would not be ethical to do for a public administration, why should it be allowed for a private business?

9.5.1 Phishing for Phools

Actually, a large range of debatable practices were flagged in (⊕Akerlof and Shiller 2015Akerlof, George Arthur, and Robert James Shiller. 2015. Phishing for phools: the economics of manipulation and deception. Princeton University Press.), among which:

One-click subscription, complicated unsubscription;
Salient cookie agreement, invisible opt-out ones;
Auto-renewal processes without notification;
Hidden additional baggage and registration fees for flight booking;
Useless insurance options automatically added to the cart…

$Mean overoptimism by treatment group and rebate value. The only significant reduction of overoptimism is achieved by the simplified process and on the \$10 rebate group.$ Figure 9.5: Mean overoptimism by treatment group and rebate value. The only significant reduction of overoptimism is achieved by the simplified process and on the $10 rebate group.

One example which has been studied in detail in the redemption trick (⊕Tasoff and Letzler 2014Tasoff, Joshua, and Robert Letzler. 2014. “Everyone Believes in Redemption: Nudges and Overoptimism in Costly Task Completion.” Journal of Economic Behavior & Organization 107 (November): 107–22. https://doi.org/10.1016/j.jebo.2014.08.011.).7676 Notice this article has been published in an economic journal whose focus is industrial organization. The authors wanted to test the idea that few consumers actually benefit from mail-in rebates, that is rebates where you pay the full price of the good, and must send a proof of buying to get a rebate check or bank transfer.

The intuition is that the tendency to procrastinate lowers the number of people who actually ask for the rebate, but that overoptimistim leads people to overlook their risk of procrastinating. On an aggregate scale is the US, firms offer an estimated $4–10 billion of rebate opportunities every year and consumers redeem $3 billion. To test that idea, the authors sent to the department students a mail asking they to take part in an experiment with monetary payoffs. Students had to choose between a scheme where they had to choose between:

A lottery;
A $5 check plus a form for either $5 or $10 more they had to redeem within 5 weeks, with a certification page to print and physically mail.

Participants who choose the second option were asked how confident they were that they would redeem the $10. They also randomized the groups in one control and three treatment groups, with various devices purported to increase the rebate rate or direct the choice towards the lottery (information about overoptimism, reminder, simplified redeeming process). The choice rate between the lottery and the second option implies that between 74% a,d 84% of students were confident that they would actually redeem the $10, and a significant minority of respondents were very confident that they would do it. The mean redemption rate was 30%, with large variations across treatments. Notably, information provision about overoptimist had no effect, while making the process simpler had the stronger effect.

This example comes fully from the physical world, as did a large part of Akerlof and Shiller’s. With their economic perspective their main point was that market discipline works badly for this kind of infrequent transaction. Consumers may not notice they’re being ripped off, or may forget the next time – or there can be no “next time”: when I cancel my subscription to a newspaper, odds are that I will never buy a new one. The economic conclusion is that some accepted business practices should be regulated away in the name of consumer protection (and economic efficiency).

At the same time, some business picked up results of behavioural science, and revisited commercial practices through this new lens. Starbucks is a case in point, both as a brick-and-mortar business and as an online one. To take a few examples:

$Use of decoy pricing by Starbucks: the Medium, at \$5.5, is clearly a decoy for the \$6 Large. Notice that the pictures uses the same shot of a sup, rescaled, and not actual examples of the three sizes.$ Figure 9.6: Use of decoy pricing by Starbucks: the Medium, at $5.5, is clearly a decoy for the $6 Large. Notice that the pictures uses the same shot of a sup, rescaled, and not actual examples of the three sizes.

They coffees are overpriced. This is a feature, driving people to think that there is something special about them, and than playing on loss aversion to prevent people to acknowledge they have bought an overpriced drink.
They make full use of the fact that higher prices means lower price sensitivity. They largely use decoy pricing: Grande is €0.3 more expensive than Tall, Venti €0.2 more expensive than Grande. The largest option seems to offer better value for money, and leads you to buy a larger drink than you intended (the Short option has been removed along the way).
Menus conspicuously avoid to use the currency sign ($ or €).
The Starbucks card, which you can pre-fill, leverage mental accounting, and their loyalty program hinges on threshold effect (you spend a bit more to get the next freebie).
The beans are grinded in-store (rather than in the back) for the salient aroma (nevermind the impact of the noise on employees).
Lines are a feature, since they oblige you to spend time in front of brightly lit additional products at a time you are likely to be more susceptible — including by the lack of caffeine, low sugar, and sheer boredom at having to wait.
Cashiers are in the back, so that you can see and hear other patron’s orders.
On their website, there is a purely artificial waiting time if you refuse cookies (no pun intended).

9.5.2 From Dark patterns to Deceptive Design

This discussion first remained confined to the economic consumer protection and competition policy circles. It raised to the public awareness with the increased digitalization of communications, and a new attention to the interfaces which channel this communication. These interfaces are purposefully designed to have us adopt some behaviours. More often than not, their intent do not coincide with our best interest. Take social media platforms: their goal is to make money through the selling of ads. They thus have an incentive to have us look as much as possible to our news feed, and perhaps in a less-than-rational state that would foster impulse buying. It is no secret that there is a wide range of engagement devices deployed on everyone of them, and this favours, for example, the diffusion of offensive or extremist messages since they generate more answers (engagement). In Facebook’s inner rating systems, the “anger” reaction is said to weight six times as much as a “like”.

Figure 9.7: Popo-ups from various browsers, advising to make them the system default browser. The highlighted option is always to change to the current browser.

Popo-ups from various browsers, advising to make them the system default browser. The highlighted option is always to change to the current browser.

Awareness was helped by the fact that major companies were not very subtle about it. Internet browsers were one of the first fields: with Windows XP, Microsoft started to use a systematic pop-up “advising” users to make Internet Explorer the default browser if they had changed it (IE was the default on a new installation). It was difficult to get rid of this advertisement because IE was integrated with the file manager.7777 This is another story, but MS tried to integrate their product as much as possible to avoid European consumer protection law which may have resulted in a breakup of the company in separate businesses. The practice was them picked up by other commercial browser makers. With the ability to conduct very large scale A/B testings, online companies soon discovered that they could tweak parts to their interface in order to increase time spent on the website and purchases.

In 2010, Harry Brignull coined the term Dark patterns to encompass this range of practices. The term quick took hold, including in legal texts, even if Deceptive and manipulative designs is a better choice of words. Since then, he spurred research and evidence gathering on these practices, which he summarises in (⊕Brignull 2023Brignull, Harry. 2023. Deceptive Patterns. Testimonium.). This book (based on his website) uses concepts from behavioural science for its typology of deceptive designs. Perceptual vulnerabilities include salience (low-visibility opt-out or free options). Vulnerabilities in decision-making include the use of defaults, anchoring and framing (decoy pricing is rife), etc.

A key example is how merchant sites leverage urgency and loss aversion by creating artificial urgency or scarcity: Brignull shows that countdown widgets in common commercial website toolboxes are designed not to denote any actual special offer, but reset at each new visit. In the same vein, there is a large number of plug-ins to fudge the number or articles remaining in stock, in order to give an impression of scarcity. Social proof is also commonly leveraged by such plug-ins, which pop up frames stating “Julie from Villejuif is also looking at this hotel”. Except that Julie does not exists, the frame is generated using a random draw in a set on names and locations based on your IP. Mariam Chammat presented the DITP phishing experiment, and the BIT published in the summer 2024 a review of bad practices (⊕Behavioural Insights Team 2024Behavioural Insights Team. 2024. Review of Online Choice Architecture and Vulnerability. Behavioural Insights Team. https://www.bi.team/publications/review-of-online-choice-architecture-and-vulnerability/.).

As interesting and frightening the full range of examples is, my point here is to mark this area as a new frontier for behavioural public policy. Across the North Atlantic, legislation is moving at a relatively fast pace. In Europe, the Digital Market Act and the Digital Services Act will provide a large-scale framework, outlawing outright certain practices, and defining principles of acceptable behaviours in digital markets. The principles-based nature of this regulation means that litigations will require experts to assess whether a specific practice fall afoul of a principle. Behavioural scientists will be at the forefront of these experts. This is a different activity from the consulting activity which is familiar in our area. It is also rife with risks of conflicts of interests, and it would be a good idea to have a hard look on how things went in industrial organization and competition policy.

9.6 Bottom line

This quick overview points towards a mush wide scope for behavioural insights in the years to come. Further away from experiment-like interventions, you may be called to audit administrative and decision process, and to weight in whether some commercial practices take unfair advantage of the limitations of human cognition.