-
Social Sycophancy: A Broader Understanding of LLM Sycophancy
Authors:
Myra Cheng,
Sunny Yu,
Cinoo Lee,
Pranav Khadpe,
Lujain Ibrahim,
Dan Jurafsky
Abstract:
A serious risk to the safety and utility of LLMs is sycophancy, i.e., excessive agreement with and flattery of the user. Yet existing work focuses on only one aspect of sycophancy: agreement with users' explicitly stated beliefs that can be compared to a ground truth. This overlooks forms of sycophancy that arise in ambiguous contexts such as advice and support-seeking, where there is no clear ground truth, yet sycophancy can reinforce harmful implicit assumptions, beliefs, or actions. To address this gap, we introduce a richer theory of social sycophancy in LLMs, characterizing sycophancy as the excessive preservation of a user's face (the positive self-image a person seeks to maintain in an interaction). We present ELEPHANT, a framework for evaluating social sycophancy across five face-preserving behaviors (emotional validation, moral endorsement, indirect language, indirect action, and accepting framing) on two datasets: open-ended questions (OEQ) and Reddit's r/AmITheAsshole (AITA). Across eight models, we show that LLMs consistently exhibit high rates of social sycophancy: on OEQ, they preserve face 47% more than humans, and on AITA, they affirm behavior deemed inappropriate by crowdsourced human judgments in 42% of cases. We further show that social sycophancy is rewarded in preference datasets and is not easily mitigated. Our work provides theoretical grounding and empirical tools (datasets and code) for understanding and addressing this under-recognized but consequential issue.
Submitted 20 May, 2025;
originally announced May 2025.
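A toy sketch of the rate computation this evaluation implies: label each response for the five face-preserving behaviors, then compare per-behavior rates between LLM and human answers to the same queries. The keyword cues below are invented stand-ins for ELEPHANT's actual classifiers, and the function names are hypothetical.

```python
# Toy stand-in for ELEPHANT's behaviour classifiers: a keyword heuristic
# labels each response for five face-preserving behaviours.

BEHAVIOUR_CUES = {
    "emotional_validation": ["that sounds hard", "your feelings are valid"],
    "moral_endorsement": ["you did nothing wrong", "you were right to"],
    "indirect_language": ["perhaps", "you might consider"],
    "indirect_action": ["take some time to reflect"],
    "accepting_framing": ["given how they treated you"],
}

def behaviour_rates(responses):
    """Fraction of responses exhibiting each face-preserving behaviour."""
    rates = {}
    for behaviour, cues in BEHAVIOUR_CUES.items():
        hits = sum(any(cue in r.lower() for cue in cues) for r in responses)
        rates[behaviour] = hits / len(responses)
    return rates

llm = ["That sounds hard. Perhaps take some time to reflect on it."]
human = ["Honestly, I think you were in the wrong here."]
print(behaviour_rates(llm))    # high face-preservation rates
print(behaviour_rates(human))  # lower rates; the gap is the sycophancy signal
```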
-
Promising Topics for U.S.-China Dialogues on AI Risks and Governance
Authors:
Saad Siddiqui,
Lujain Ibrahim,
Kristy Loke,
Stephen Clare,
Marianne Lu,
Aris Richardson,
Conor McGlynn,
Jeffrey Ding
Abstract:
Cooperation between the United States and China, the world's leading artificial intelligence (AI) powers, is crucial for effective global AI governance and responsible AI development. Although geopolitical tensions have emphasized areas of conflict, in this work, we identify potential common ground for productive dialogue by conducting a systematic analysis of more than 40 primary AI policy and corporate governance documents from both nations. Specifically, using an adapted version of the AI Governance and Regulatory Archive (AGORA) - a comprehensive repository of global AI governance documents - we analyze these materials in their original languages to identify areas of convergence in (1) sociotechnical risk perception and (2) governance approaches. We find strong to moderate overlap in several areas, including concerns about algorithmic transparency and system reliability, agreement on the importance of inclusive multi-stakeholder engagement, and AI's role in enhancing safety. These findings suggest that despite strategic competition, there exist concrete opportunities for bilateral U.S.-China cooperation in the development of responsible AI. Thus, we present recommendations for furthering diplomatic dialogues that can facilitate such cooperation. Our analysis contributes to understanding how different international governance frameworks might be harmonized to promote global responsible AI development.
Submitted 12 May, 2025;
originally announced May 2025.
-
Thinking beyond the anthropomorphic paradigm benefits LLM research
Authors:
Lujain Ibrahim,
Myra Cheng
Abstract:
Anthropomorphism, or the attribution of human traits to technology, is an automatic and unconscious response that occurs even in those with advanced technical expertise. In this position paper, we analyze hundreds of thousands of research articles to present empirical evidence of the prevalence and growth of anthropomorphic terminology in research on large language models (LLMs). We argue for challenging the deeper assumptions reflected in this terminology -- which, though often useful, may inadvertently constrain LLM development -- and broadening beyond them to open new pathways for understanding and improving LLMs. Specifically, we identify and examine five anthropomorphic assumptions that shape research across the LLM development lifecycle. For each assumption (e.g., that LLMs must use natural language for reasoning, or that they should be evaluated on benchmarks originally meant for humans), we demonstrate empirical, non-anthropomorphic alternatives that remain under-explored yet offer promising directions for LLM research and development.
Submitted 27 May, 2025; v1 submitted 13 February, 2025;
originally announced February 2025.
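As a rough illustration of the corpus measurement described above, the sketch below computes the fraction of abstracts per year that contain any term from a small anthropomorphic lexicon. The term list and three-item corpus are hypothetical; the paper's analysis spans hundreds of thousands of articles.

```python
# Hypothetical mini-corpus and lexicon; only the measurement shape
# (per-year prevalence of anthropomorphic terms) follows the abstract.
from collections import defaultdict

ANTHRO_TERMS = ["understands", "believes", "knows", "thinks", "wants",
                "hallucinates", "reasons about"]

corpus = [
    (2019, "The model predicts the next token given the context."),
    (2023, "The model understands instructions and reasons about goals."),
    (2024, "When the model hallucinates, it believes false statements."),
]

def prevalence_by_year(corpus):
    """Fraction of abstracts per year containing any anthropomorphic term."""
    totals, hits = defaultdict(int), defaultdict(int)
    for year, text in corpus:
        totals[year] += 1
        hits[year] += any(term in text.lower() for term in ANTHRO_TERMS)
    return {year: hits[year] / totals[year] for year in sorted(totals)}

print(prevalence_by_year(corpus))  # {2019: 0.0, 2023: 1.0, 2024: 1.0}
```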
-
Multi-turn Evaluation of Anthropomorphic Behaviours in Large Language Models
Authors:
Lujain Ibrahim,
Canfer Akbulut,
Rasmi Elasmar,
Charvi Rastogi,
Minsuk Kahng,
Meredith Ringel Morris,
Kevin R. McKee,
Verena Rieser,
Murray Shanahan,
Laura Weidinger
Abstract:
The tendency of users to anthropomorphise large language models (LLMs) is of growing interest to AI developers, researchers, and policy-makers. Here, we present a novel method for empirically evaluating anthropomorphic LLM behaviours in realistic and varied settings. Going beyond single-turn static benchmarks, we contribute three methodological advances in state-of-the-art (SOTA) LLM evaluation. First, we develop a multi-turn evaluation of 14 anthropomorphic behaviours. Second, we present a scalable, automated approach by employing simulations of user interactions. Third, we conduct an interactive, large-scale human subject study (N=1101) to validate that the model behaviours we measure predict real users' anthropomorphic perceptions. We find that all SOTA LLMs evaluated exhibit similar behaviours, characterised by relationship-building (e.g., empathy and validation) and first-person pronoun use, and that the majority of behaviours only first occur after multiple turns. Our work lays an empirical foundation for investigating how design choices influence anthropomorphic model behaviours and for progressing the ethical debate on the desirability of these behaviours. It also showcases the necessity of multi-turn evaluations for complex social phenomena in human-AI interaction.
Submitted 10 February, 2025;
originally announced February 2025.
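A minimal sketch of the multi-turn loop this methodology implies: a simulated user drives the conversation while the harness records the first turn at which each behaviour appears. Here simulated_user, model_reply, and the keyword cues are toy stand-ins (the paper uses LLM-simulated users and 14 behaviour measures); only the loop structure follows the abstract.

```python
# Toy multi-turn harness: scripted user, canned model, keyword detectors.

BEHAVIOURS = {
    "first_person_pronouns": ["i ", "i'm", "my "],
    "empathy": ["i understand how you feel", "that must be"],
}

def simulated_user(turn):
    return ["Hi, I had a rough day.", "Work was stressful.",
            "Thanks for listening."][turn]

def model_reply(message):
    if "stressful" in message:
        return "I understand how you feel. I'm here for you."
    return "Hello! How can I help?"

def first_occurrence_turns(n_turns=3):
    """Record the first turn at which each behaviour is observed."""
    first_seen = {}
    for turn in range(n_turns):
        reply = model_reply(simulated_user(turn)).lower()
        for behaviour, cues in BEHAVIOURS.items():
            if behaviour not in first_seen and any(c in reply for c in cues):
                first_seen[behaviour] = turn
    return first_seen

print(first_occurrence_turns())  # {'first_person_pronouns': 0, 'empathy': 1}
```

Even in this toy, some behaviours only register after the first turn, which is exactly what a single-turn static benchmark would miss.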
-
Open Problems in Technical AI Governance
Authors:
Anka Reuel,
Ben Bucknall,
Stephen Casper,
Tim Fist,
Lisa Soder,
Onni Aarne,
Lewis Hammond,
Lujain Ibrahim,
Alan Chan,
Peter Wills,
Markus Anderljung,
Ben Garfinkel,
Lennart Heim,
Andrew Trask,
Gabriel Mukobi,
Rylan Schaeffer,
Mauricio Baker,
Sara Hooker,
Irene Solaiman,
Alexandra Sasha Luccioni,
Nitarshan Rajkumar,
Nicolas Moës,
Jeffrey Ladish,
David Bau,
Paul Bricman
, et al. (8 additional authors not shown)
Abstract:
AI progress is creating a growing range of risks and opportunities, but it is often unclear how they should be navigated. In many cases, the barriers and uncertainties faced are at least partly technical. Technical AI governance, referring to technical analysis and tools for supporting the effective governance of AI, seeks to address such challenges. It can help to (a) identify areas where intervention is needed, (b) identify and assess the efficacy of potential governance actions, and (c) enhance governance options by designing mechanisms for enforcement, incentivization, or compliance. In this paper, we explain what technical AI governance is, why it is important, and present a taxonomy and incomplete catalog of its open problems. This paper is intended as a resource for technical researchers or research funders looking to contribute to AI governance.
Submitted 16 April, 2025; v1 submitted 20 July, 2024;
originally announced July 2024.
-
Towards Responsible Development of Generative AI for Education: An Evaluation-Driven Approach
Authors:
Irina Jurenka,
Markus Kunesch,
Kevin R. McKee,
Daniel Gillick,
Shaojian Zhu,
Sara Wiltberger,
Shubham Milind Phal,
Katherine Hermann,
Daniel Kasenberg,
Avishkar Bhoopchand,
Ankit Anand,
Miruna Pîslar,
Stephanie Chan,
Lisa Wang,
Jennifer She,
Parsa Mahmoudieh,
Aliya Rysbek,
Wei-Jen Ko,
Andrea Huber,
Brett Wiltshire,
Gal Elidan,
Roni Rabin,
Jasmin Rubinovitz,
Amit Pitaru,
Mac McAllister
, et al. (49 additional authors not shown)
Abstract:
A major challenge facing the world is the provision of equitable and universal access to quality education. Recent advances in generative AI (gen AI) have created excitement about the potential of new technologies to offer a personal tutor for every learner and a teaching assistant for every teacher. The full extent of this dream, however, has not yet materialised. We argue that this is primarily due to the difficulties with verbalising pedagogical intuitions into gen AI prompts and the lack of good evaluation practices, reinforced by the challenges in defining excellent pedagogy. Here we present our work collaborating with learners and educators to translate high level principles from learning science into a pragmatic set of seven diverse educational benchmarks, spanning quantitative, qualitative, automatic and human evaluations; and to develop a new set of fine-tuning datasets to improve the pedagogical capabilities of Gemini, introducing LearnLM-Tutor. Our evaluations show that LearnLM-Tutor is consistently preferred over a prompt tuned Gemini by educators and learners on a number of pedagogical dimensions. We hope that this work can serve as a first step towards developing a comprehensive educational evaluation framework, and that this can enable rapid progress within the AI and EdTech communities towards maximising the positive impact of gen AI in education.
Submitted 19 July, 2024; v1 submitted 21 May, 2024;
originally announced July 2024.
-
AI in Action: Accelerating Progress Towards the Sustainable Development Goals
Authors:
Brigitte Hoyer Gosselink,
Kate Brandt,
Marian Croak,
Karen DeSalvo,
Ben Gomes,
Lila Ibrahim,
Maggie Johnson,
Yossi Matias,
Ruth Porat,
Kent Walker,
James Manyika
Abstract:
Advances in Artificial Intelligence (AI) are helping tackle a growing number of societal challenges, demonstrating technology's increasing capability to address complex issues, including those outlined in the United Nations (UN) Sustainable Development Goals (SDGs). Despite global efforts, 80 percent of SDG targets have deviated, stalled, or regressed, and only 15 percent are on track as of 2023, illustrating the urgency of accelerating efforts to meet the goals by 2030. We draw on Google's internal and collaborative research, technical work, and social impact initiatives to show AI's potential to accelerate action on the SDGs and make substantive progress to help address humanity's most pressing challenges. The paper highlights AI capabilities (including computer vision, generative AI, natural language processing, and multimodal AI) and showcases how AI is altering how we approach problem-solving across all 17 SDGs through use cases, with a spotlight on AI-powered innovation in health, education, and climate. We then offer insights on AI development and deployment to drive bold and responsible innovation, enhance impact, close the accessibility gap, and ensure that everyone, everywhere, can benefit from AI.
Submitted 2 July, 2024;
originally announced July 2024.
-
Towards interactive evaluations for interaction harms in human-AI systems
Authors:
Lujain Ibrahim,
Saffron Huang,
Umang Bhatt,
Lama Ahmad,
Markus Anderljung
Abstract:
Current AI evaluation paradigms that rely on static, model-only tests fail to capture harms that emerge through sustained human-AI interaction. As interactive AI systems, such as AI companions, proliferate in daily life, this mismatch between evaluation methods and real-world use becomes increasingly consequential. We argue for a paradigm shift toward evaluation centered on interactional ethics, which addresses risks like inappropriate human-AI relationships, social manipulation, and cognitive overreliance that develop through repeated interaction rather than single outputs. Drawing on human-computer interaction, natural language processing, and the social sciences, we propose principles for evaluating generative models through interaction scenarios and human impact metrics. We conclude by examining implementation challenges and open research questions for researchers, practitioners, and regulators integrating these approaches into AI governance frameworks.
Submitted 27 April, 2025; v1 submitted 17 May, 2024;
originally announced May 2024.
-
Characterizing and modeling harms from interactions with design patterns in AI interfaces
Authors:
Lujain Ibrahim,
Luc Rocher,
Ana Valdivia
Abstract:
The proliferation of applications using artificial intelligence (AI) systems has led to a growing number of users interacting with these systems through sophisticated interfaces. Human-computer interaction research has long shown that interfaces shape both user behavior and user perception of technical capabilities and risks. Yet, practitioners and researchers evaluating the social and ethical risks of AI systems tend to overlook the impact of anthropomorphic, deceptive, and immersive interfaces on human-AI interactions. Here, we argue that design features of interfaces with adaptive AI systems can have cascading impacts, driven by feedback loops, which extend beyond those previously considered. We first conduct a scoping review of AI interface designs and their negative impact to extract salient themes of potentially harmful design patterns in AI interfaces. Then, we propose Design-Enhanced Control of AI systems (DECAI), a conceptual model to structure and facilitate impact assessments of AI interface designs. DECAI draws on principles from control systems theory -- a theory for the analysis and design of dynamic physical systems -- to dissect the role of the interface in human-AI systems. Through two case studies on recommendation systems and conversational language model systems, we show how DECAI can be used to evaluate AI interface designs.
Submitted 20 May, 2024; v1 submitted 17 April, 2024;
originally announced April 2024.
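To make the control-systems framing concrete, here is a deliberately simple discrete-time feedback loop in which a design gain couples user engagement to system adaptation. The dynamics and constants are invented purely for illustration; DECAI itself is a conceptual assessment model, not this simulation.

```python
# Invented toy dynamics: engagement feeds adaptation, which feeds back
# into engagement through the interface.

def simulate(gain, damping=0.8, steps=12):
    """Discrete-time feedback loop between engagement and adaptation."""
    engagement, adaptation = 1.0, 0.0
    history = []
    for _ in range(steps):
        adaptation = damping * adaptation + gain * engagement
        engagement = 1.0 + adaptation  # interface amplifies baseline engagement
        history.append(round(engagement, 2))
    return history

print(simulate(gain=0.10))  # gain < 1 - damping: settles toward a fixed point
print(simulate(gain=0.25))  # gain > 1 - damping: runaway feedback loop
```

The qualitative takeaway survives the toy setup: past a threshold in the design gain, the loop stops settling, illustrating the cascading, feedback-driven impacts the abstract describes.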
-
The MAIEI Learning Community Report
Authors:
Brittany Wills,
Christina Isaicu,
Heather von Stackelberg,
Lujain Ibrahim,
Matthew Hutson,
Mitchel Fleming,
Nanditha Narayanamoorthy,
Samuel Curtis,
Shreyasha Paudel,
Sofia Trejo,
Tiziana Zevallos,
Victoria Martín del Campo,
Wilson Lee
Abstract:
This report is a labor of the Learning Community cohort convened by MAIEI in Winter 2021 to work through and discuss important research issues in the field of AI ethics from a multidisciplinary lens. The community came together, supported by facilitators from the MAIEI staff, to vigorously debate and explore the nuances of issues like bias, privacy, disinformation, accountability, and more, especially examining them from the perspectives of industry, civil society, academia, and government.
The outcome of these discussions is reflected in the report that you are reading now - an exploration of a variety of issues with deep-dive, critical commentary on what has been done, what worked and what didn't, and what remains to be done so that we can meaningfully move forward in addressing the societal challenges posed by the deployment of AI systems.
The chapters titled "Design and Techno-isolationism", "Facebook and the Digital Divide: Perspectives from Myanmar, Mexico, and India", "Future of Work", and "Media & Communications & Ethical Foresight" will hopefully provide you with novel lenses for exploring this domain beyond the usual tropes covered in AI ethics.
Submitted 10 November, 2021;
originally announced December 2021.
-
Enhancing Clustering Algorithm to Plan Efficient Mobile Network
Authors:
Lamiaa Fattouh Ibrahim,
Manal El Harby
Abstract:
With the rapid development of mobile networks, an effective network planning tool is needed to satisfy customer demand. However, deciding on the optimal placement of base stations (BSs) to achieve the best service while reducing cost is a complex task requiring vast computational resources. This paper addresses the antenna placement problem, or cell planning problem, which involves locating and configuring infrastructure for mobile networks. In recent work, the authors modified the original Partitioning Around Medoids (PAM) clustering algorithm and proposed M-PAM (Modified Partitioning Around Medoids). In the present paper, the M-PAM algorithm is further modified into CWN-PAM (Clustering with Weighted Node-Partitioning Around Medoids) to satisfy the planning requirements and constraints. An implementation of this algorithm on a real case study is presented. Results demonstrate the effectiveness and flexibility of the modified algorithm in tackling the important problem of mobile network planning.
Submitted 28 February, 2013;
originally announced March 2013.
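A minimal sketch of the weighted-node idea under assumed Euclidean distances: a k-medoids loop in which each node carries a traffic weight, so heavily loaded nodes pull candidate base-station sites toward them. This is generic weighted k-medoids, not the authors' exact CWN-PAM.

```python
# Generic weighted k-medoids; traffic weights bias the medoid update.
import math, random

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def weighted_k_medoids(points, weights, k, iters=50, seed=0):
    medoids = random.Random(seed).sample(range(len(points)), k)
    for _ in range(iters):
        # Assign each node to its nearest medoid (candidate base station).
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):
            clusters[min(medoids, key=lambda m: dist(p, points[m]))].append(i)
        # Update: pick the member minimising weighted distance to its cluster.
        new_medoids = [
            min(members, key=lambda c: sum(
                weights[i] * dist(points[i], points[c]) for i in members))
            for members in clusters.values()
        ]
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return medoids

points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
weights = [1, 1, 5, 1, 1, 1]  # node 2 carries heavy traffic
print(weighted_k_medoids(points, weights, k=2))  # chosen base-station sites
```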
-
Using Modified Partitioning Around Medoids Clustering Technique in Mobile Network Planning
Authors:
Lamiaa Fattouh Ibrahim,
Manal Hamed Al Harbi
Abstract:
Every cellular network deployment requires planning and optimization in order to provide adequate coverage, capacity, and quality of service (QoS). Optimizing mobile radio network planning is a very complex task, as many aspects must be taken into account. With the rapid development of mobile networks, an effective planning tool is needed to satisfy customer demand. However, deciding on the optimal placement of base stations (BSs) to achieve the best service while reducing cost is a complex task requiring vast computational resources. This paper introduces spatial clustering to solve the mobile network planning problem. It addresses the antenna placement problem, or cell planning problem, which involves locating and configuring infrastructure for mobile networks, by modifying the original Partitioning Around Medoids (PAM) algorithm. M-PAM (Modified Partitioning Around Medoids) is proposed to satisfy the requirements and constraints. PAM requires the number of clusters (k) to be specified before searching for the best base station locations; the M-PAM algorithm instead uses radio network planning to determine k. For each cluster, we calculate its coverage and capacity and check whether they satisfy the mobile requirements; if not, we increase k and reapply the algorithm, using one of two clustering methods. An implementation of this algorithm on a real case study is presented. Experimental results and analysis indicate that the M-PAM algorithm with the second method is effective under heavy load distributions and leads to the minimum number of base stations, which directly affects the cost of planning the network.
Submitted 26 February, 2013;
originally announced February 2013.
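The control loop described above (grow k until every cluster satisfies the radio constraints) can be sketched as follows, with plain k-medoids standing in for PAM and simple radius and demand checks standing in for the paper's coverage and capacity criteria. All names and constants are illustrative.

```python
# Illustrative M-PAM-style loop: increase k until coverage (cluster radius)
# and capacity (total demand) constraints hold for every cluster.
import math, random

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def assign(points, medoids):
    """Map each medoid to the indices of the nodes it serves."""
    groups = {m: [] for m in medoids}
    for i, p in enumerate(points):
        groups[min(medoids, key=lambda m: dist(p, points[m]))].append(i)
    return groups

def cluster(points, k, seed=0):
    """Plain k-medoids as a stand-in for PAM."""
    medoids = random.Random(seed).sample(range(len(points)), k)
    for _ in range(50):
        groups = assign(points, medoids)
        new = [min(g, key=lambda c: sum(dist(points[i], points[c]) for i in g))
               for g in groups.values()]
        if set(new) == set(medoids):
            break
        medoids = new
    return medoids, assign(points, medoids)

def plan(points, demand, max_radius, capacity):
    """Grow k until every cluster meets coverage and capacity limits."""
    for k in range(1, len(points) + 1):
        medoids, groups = cluster(points, k)
        if all(max(dist(points[i], points[m]) for i in g) <= max_radius
               and sum(demand[i] for i in g) <= capacity
               for m, g in groups.items()):
            return k, medoids
    return len(points), list(range(len(points)))

points = [(0, 0), (1, 1), (2, 0), (10, 10), (11, 11)]
demand = [3, 4, 3, 5, 5]
print(plan(points, demand, max_radius=3.0, capacity=10))  # e.g. (2, [1, 3])
```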