On May 28, 2025, the “Proceedings of the 2025 AAAI Spring Symposium Series” (Vol. 5 No. 1) were published. Oliver Bendel was involved in two papers at the symposium “Human-Compatible AI for Well-being: Harnessing Potential of GenAI for AI-Powered Science”. The paper “Revisiting the Trolley Problem for AI: Biases and Stereotypes in Large Language Models and their Impact on Ethical Decision-Making” by Sahan Hatemo, Christof Weickhardt, Luca Gisler, and Oliver Bendel is summarized as follows: “The trolley problem has long served as a lens for exploring moral decision-making, now gaining renewed significance in the context of artificial intelligence (AI). This study investigates ethical reasoning in three open-source large language models (LLMs) – LLaMA, Mistral and Qwen – through variants of the trolley problem. By introducing demographic prompts (age, nationality and gender) into three scenarios (switch, loop and footbridge), we systematically evaluate LLM responses against human survey data from the Moral Machine experiment. Our findings reveal notable differences: Mistral exhibits a consistent tendency to over-intervene, while Qwen chooses to intervene less and LLaMA balances between the two. Notably, demographic attributes, particularly nationality, significantly influence LLM decisions, exposing potential biases in AI ethical reasoning. These insights underscore the necessity of refining LLMs to ensure fairness and ethical alignment, leading the way for more trustworthy AI systems.” The renowned and traditional conference took place from March 31 to April 2, 2025 in San Francisco. The proceedings are available at ojs.aaai.org/index.php/AAAI-SS/issue/view/654.
Miss Tammy in the AAAI Proceedings
On May 28, 2025, the “Proceedings of the 2025 AAAI Spring Symposium Series” (Vol. 5 No. 1) were published. Oliver Bendel was involved in two papers at the symposium “Human-Compatible AI for Well-being: Harnessing Potential of GenAI for AI-Powered Science”. The paper “Miss Tammy as a Use Case for Moral Prompt Engineering” by Myriam Rellstab and Oliver Bendel is summarized as follows: “This paper describes an LLM-based chatbot as a use case for moral prompt engineering. Miss Tammy, as it is called, was created between February 2024 and February 2025 at the FHNW School of Business as a custom GPT. Different types of prompt engineering were used. In addition, RAG was applied by building a knowledge base with a collection of netiquettes. These usually guide the behavior of users in communities but also seem to be useful to control the actions of chatbots and make them competent in relation to the behavior of humans. The tests with pupils aged between 14 and 16 showed that the custom GPT had significant advantages over the standard GPT-4o model in terms of politeness, appropriateness, and clarity. It is suitable for de-escalating conflicts and steering dialogues in the right direction. It can therefore contribute to users’ well-being and is a step forward in human-compatible AI.” The renowned and traditional conference took place from March 31 to April 2, 2025 in San Francisco. The proceedings are available at ojs.aaai.org/index.php/AAAI-SS/issue/view/654.