Ideogram appeared to start out as a rather free and permissive image generator in August 2023. Since then, a noticeable number of images have been censored. What matters is not the prompt but the image itself. If the platform detects during generation that an image might be problematic, the image is not finished but replaced by a tile showing a cat holding a sign in its paws that says “MAYBE NOT SAFE”. One prompt read: “The sculpture Galatea, resembling the beautiful Aphrodite, creates itself, photo, film”. So, Pygmalion’s sculpture was to empower itself. The four images, two of which showed breasts, were seen by the user and also by the platform itself, which apparently led to the images being replaced by the said warning before they were completed. On the other hand, photorealistic images of women in revealing poses remain unproblematic, as long as the women are wearing bikinis or hot pants. As with other American platforms, the problem here seems to be the visibility of nipples, whether human or sculptural. In another experiment, the nipples in one of the four pictures were visible until they disappeared under the cat’s fur. In another image of the sculpture, Ideogram itself had covered the nipples, one with her hand, the other with a piece of clay or stone jewellery. This Galatea was spared the fate of her sister.
ChatGPT can See, Hear, and Speak
OpenAI reported on its blog on September 25, 2023: “We are beginning to roll out new voice and image capabilities in ChatGPT. They offer a new, more intuitive type of interface by allowing you to have a voice conversation or show ChatGPT what you’re talking about.” (OpenAI Blog, 25 September 2023) The company gives some examples of using ChatGPT in everyday life: “Snap a picture of a landmark while traveling and have a live conversation about what’s interesting about it. When you’re home, snap pictures of your fridge and pantry to figure out what’s for dinner (and ask follow up questions for a step by step recipe). After dinner, help your child with a math problem by taking a photo, circling the problem set, and having it share hints with both of you.” (OpenAI Blog, 25 September 2023) But the application can do more than see. It can also hear and speak: “You can now use voice to engage in a back-and-forth conversation with your assistant. Speak with it on the go, request a bedtime story for your family, or settle a dinner table debate.” (OpenAI Blog, 25 September 2023) More information is available at openai.com/blog/chatgpt-can-now-see-hear-and-speak.
CONVERSATIONS 2023 in Oslo
CONVERSATIONS 2023, a two-day workshop on chatbot research, applications, and design, will take place at the University of Oslo, Norway. According to the CfP, contributions concerning applications of large language models such as the GPT family are warmly welcome, as are contributions on applications that combine information retrieval approaches with large language model approaches. Building on the results of the six previous CONVERSATIONS workshops, the following topics are of particular interest: 1. Chatbot users and implications, 2. Chatbot user experience, design, and evaluation, 3. Chatbot frameworks and platforms, 4. Chatbots for collaboration, 5. Democratizing chatbots – chatbots for all, 6. Ethics and safety implications of chatbots and large language models, 7. Leveraging advances in AI technology and large language models. More information is available at 2023.conversations.ws.
Introducing Visual ChatGPT
Researchers at Microsoft are working on a new application based on ChatGPT and solutions like Stable Diffusion. Visual ChatGPT is designed to allow users to generate images from text input and then edit individual elements. In their paper “Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models”, Chenfei Wu and his co-authors write: “We build a system called Visual ChatGPT, incorporating different Visual Foundation Models, to enable the user to interact with ChatGPT by 1) sending and receiving not only languages but also images 2) providing complex visual questions or visual editing instructions that require the collaboration of multiple AI models with multi-steps” and, not to forget, “3) providing feedback and asking for corrected results” (Wu et al. 2023). For example, a user can have a suitable prompt create an image of a landscape with blue sky, hills, meadows, flowers, and trees. With another prompt, the user then instructs Visual ChatGPT to make the hills higher and the sky duskier and cloudier. The user can also ask the program what color the flowers are and recolor them with a further prompt. A final prompt makes the trees in the foreground appear greener. The paper can be downloaded from arxiv.org.