Towards Moral Prompt Engineering

Machine ethics, often dismissed as a curiosity ten years ago, is now part of everyday business. It is needed, for example, when so-called guardrails are built into language models or chatbots, whether through alignment in the form of fine-tuning or through prompt engineering. When you create GPTs, i.e. "custom versions of ChatGPT", as OpenAI calls them, the "Instructions" field is available for prompt engineering. Here, the "prompteur" or "prompteuse" can set specifications and restrictions for the chatbot, including references to uploaded documents (a minimal sketch of this mechanism follows below). This is exactly what Myriam Rellstab is currently doing at the FHNW School of Business as part of her final thesis "Moral Prompt Engineering", whose interim results she presented on May 28, 2024. As a "prompteuse", she tames GPT-4o with the help of her instructions and, as suggested by the initiator of the project, Prof. Dr. Oliver Bendel, with the help of netiquettes that she has collected and made available to the chatbot. The chatbot is tamed; the tiger becomes a house cat that can be used safely in the classroom, for example. With GPT-4o, guardrails have already been put in place beforehand, either programmed in or obtained via reinforcement learning from human feedback. So, strictly speaking, one turns an already tamed tiger into a house cat. This is different with certain open-source language models: the wild animal must first be captured and then tamed, and even then it can seriously injure you. But even with GPTs there are pitfalls, and as we know, even house cats can hiss and scratch. The results of the project will be available in August. Moral prompt engineering had already been applied to Data, a chatbot for the Data Science course at the FHNW School of Engineering (Image: Ideogram).
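
For readers who want to try the basic mechanism themselves, the following Python sketch approximates what the "Instructions" field of a custom GPT does, using a system message via the OpenAI API. It is an illustration under assumptions: the netiquette rules, the `ask` helper, and the test message are invented placeholders, not the instructions or netiquettes actually used in the thesis.

```python
# A minimal sketch of "moral prompt engineering": behavioral rules are
# placed in the system message, the API-level analogue of the
# "Instructions" field of a custom GPT.
# Assumption: the rules below are illustrative placeholders, not the
# netiquettes collected in the thesis.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Illustrative instructions in the spirit of a netiquette: they restrict
# tone and behavior before any user input reaches the model.
NETIQUETTE_INSTRUCTIONS = """\
You are a classroom assistant. Follow these rules of netiquette:
1. Be polite and respectful; never insult or belittle the user.
2. De-escalate: if the user is aggressive, respond calmly.
3. Refuse to produce discriminatory or harassing content.
4. Admit uncertainty instead of guessing.
"""

def ask(user_message: str) -> str:
    """Send one user message through the guardrailed chatbot."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": NETIQUETTE_INSTRUCTIONS},
            {"role": "user", "content": user_message},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    # An aggressive test message should trigger a calm, polite reply.
    print(ask("You are useless!"))
```

The design point is that the system message is prepended to every conversation, so the restrictions take effect before any user input is processed, just as the Instructions field shapes the behavior of a custom GPT from the first turn onward.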