OpenAI reported on September 25, 2023 in its blog: “We are beginning to roll out new voice and image capabilities in ChatGPT. They offer a new, more intuitive type of interface by allowing you to have a voice conversation or show ChatGPT what you’re talking about.” (OpenAI Blog, 25 September 2023) The company gives some examples of using ChatGPT in everyday life: “Snap a picture of a landmark while traveling and have a live conversation about what’s interesting about it. When you’re home, snap pictures of your fridge and pantry to figure out what’s for dinner (and ask follow up questions for a step by step recipe). After dinner, help your child with a math problem by taking a photo, circling the problem set, and having it share hints with both of you.” (OpenAI Blog, 25 September 2023) But the application can not only see, it can also hear and speak: “You can now use voice to engage in a back-and-forth conversation with your assistant. Speak with it on the go, request a bedtime story for your family, or settle a dinner table debate.” (OpenAI Blog, 25 September 2023) More information via openai.com/blog/chatgpt-can-now-see-hear-and-speak.
AI-generated Short Stories
The technology philosopher and writer Oliver Bendel published the book “ARTIFACTS WITH HANDICAPS” on 24 September 2023. The information about the author reads: “Oliver Bendel featuring Ideogram and GPT-4”. In fact, the entire work was created with the help of generative AI. It consists of 11 images, each followed by a short story. Each story deals with the imperfection of the depiction. In one image a hand looks like that of a mummy, in another a skateboard floats in the air above its wheels. But a few depictions look perfect. In these cases, the story explains what is different about the person, their history, or their behavior. Ultimately, the book is about otherness and the fact that it is a special feature. The book is freely available and can be distributed and used as desired, with credit given to the authors, i.e. the artist and the AI systems. Oliver Bendel has been writing experimental literature, including digital literature, for 40 years. From 2007 onward, he was one of the best-known cell phone novelists in Europe. In 2010, he attracted attention with a volume of haiku – “handyhaiku” – in which the poems were printed in the form of QR codes. In 2020, the volume “Die Astronautin” was published, in which the poems are printed in the form of 3D codes. The standard work “Die Struktur der modernen Literatur” (“The Structure of Modern Literature”) by Mario Andreotti devotes two pages to the writer’s work.
AAAI Spring Symposia Return to Stanford
In late August 2023, AAAI announced the continuation of the AAAI Spring Symposium Series, to be held at Stanford University from 25-27 March 2024. Due to staff shortages, the prestigious conference had to be held at the Hyatt Regency SFO Airport in San Francisco in 2023 – and will now return to its traditional venue. The call for proposals is available on the AAAI Spring Symposium Series page. Proposals are due by 6 October 2023. They should be submitted to the symposium co-chairs, Christopher Geib (SIFT, USA) and Ron Petrick (Heriot-Watt University, UK), via the online submission page. Over the past ten years, the AAAI Spring Symposia have been relevant not only to classical AI, but also to roboethics and machine ethics. Groundbreaking symposia were, for example, “Ethical and Moral Considerations in Non-Human Agents” in 2016, “AI for Social Good” in 2017, or “AI and Society: Ethics, Safety and Trustworthiness in Intelligent Agents” in 2018. More information is available at aaai.org/conference/spring-symposia/sss24/.
A Universal Translator Comes
The idea of a Babel Fish comes from the legendary novel or series of novels “The Hitchhiker’s Guide to the Galaxy”. Douglas Adams alluded to the Tower of Babel. In 1997, Yahoo launched a web service for the automatic translation of texts under this name. Various attempts to implement the Babel Fish in hardware and software followed. Meta’s SeamlessM4T software can handle almost a hundred languages. In a blog post, the American company refers to the work of Douglas Adams. “M4T” stands for “Massively Multilingual and Multimodal Machine Translation”. Again, it is a language model that makes spectacular things possible. It has been trained on four million hours of raw audio. A demo is available at seamless.metademolab.com/demo. The first step is to record a sentence. The sentence is displayed as text. Then select the language you want to translate into, for example Japanese. The translated sentence is displayed in text form and, if desired, played back in spoken language. A synthetic voice is used. In principle, the output could also use your own voice, but this is not yet integrated into the application. A paper by Meta AI and UC Berkeley is available for download.
AI-based Robots for the Disposal of Discarded Ammunition
The Robotics Innovation Center (RIC) at the German Research Centre for Artificial Intelligence (DFKI) in Bremen aims to clear the seabed of discarded ammunition in the North Sea and Baltic Sea. This was reported by the online magazine Golem on 14 June 2023. The researchers are using the autonomous underwater vehicle Cuttlefish, developed at DFKI, as a test platform. According to Golem, the robot has been equipped with two deep-sea-capable gripper systems. These are designed to enable flexible handling of objects under water, even difficult objects such as explosive devices. The AI-based control system allows the robot to change its buoyancy and centre of gravity during the dive. According to the online magazine, the AUV is equipped with numerous sensors such as cameras, sonars, laser scanners, and magnetometers. These are intended to let it approach an object without colliding with it. The system will certainly be effective – whether it is efficient remains to be seen.
Self-driving Cars Stopped by Fog
“Five self-driving vehicles blocked traffic early Tuesday morning in the middle of a residential street in San Francisco’s Balboa Terrace neighborhood, apparently waylaid by fog that draped the southwestern corner of the city.” (San Francisco Chronicle, 11 April 2023) The San Francisco Chronicle reported this in an article published on April 11, 2023. The fact that fog is a problem for Waymo’s vehicles has been known to the company for some time. A blog post from 2021 states: “Fog is finicky – it comes in a range of densities, it can be patchy, and can affect a vehicle’s sensors differently.” (Blog Waymo, 15 November 2021) Against this background, it is surprising that vehicles are allowed to roll through the city unaccompanied, especially since Frisco – this name comes from sailors – is very often beset by fog. But fog is not the only challenge for the sensors of self-driving cars. A thesis commissioned and supervised by Prof. Dr. Oliver Bendel presented dozens of phenomena and methods that can mislead sensors of self-driving cars. The San Francisco Chronicle article “Waymo says dense S.F. fog brought 5 vehicles to a halt on Balboa Terrace street” can be accessed at www.sfchronicle.com/bayarea/article/san-francisco-waymo-stopped-in-street-17890821.php.
Bar Robots for Well-being of Guests
From March 27-29, 2023, the AAAI 2023 Spring Symposia will feature the symposium “Socially Responsible AI for Well-being” by Takashi Kido (Teikyo University, Japan) and Keiki Takadama (The University of Electro-Communications, Japan). The venue is usually Stanford University. For staffing reasons, this year the conference will be held at the Hyatt Regency in San Francisco. On March 28, Prof. Dr. Oliver Bendel and Lea Peier will present their paper “How Can Bar Robots Enhance the Well-being of Guests?”. From the abstract: “This paper addresses the question of how bar robots can contribute to the well-being of guests. It first develops the basics of service robots and social robots. It gives a brief overview of which gastronomy robots are on the market. It then presents examples of bar robots and describes two models used in Switzerland. A research project at the School of Business FHNW collected empirical data on them, which is used for this article. The authors then discuss how the robots could be improved to increase the well-being of customers and guests and better address their individual wishes and requirements. Artificial intelligence can play an important role in this. Finally, ethical and social problems in the use of bar robots are discussed and possible solutions are suggested to counter these.” More information via aaai.org/conference/spring-symposia/sss23/.
GPT-4 as Multimodal Model
GPT-4 was launched by OpenAI on March 14, 2023. “GPT-4 is a large multimodal model (accepting image and text inputs, emitting text outputs) that, while less capable than humans in many real-world scenarios, exhibits human-level performance on various professional and academic benchmarks.” (Website OpenAI) On its website, the company explains the multimodal options in more detail: “GPT-4 can accept a prompt of text and images, which – parallel to the text-only setting – lets the user specify any vision or language task. Specifically, it generates text outputs (natural language, code, etc.) given inputs consisting of interspersed text and images.” (Website OpenAI) The example that OpenAI gives is impressive. An image with multiple panels was uploaded. The prompt is: “What is funny about this image? Describe it panel by panel”. This is exactly what GPT-4 does and then comes to the conclusion: “The humor in this image comes from the absurdity of plugging a large, outdated VGA connector into a small, modern smartphone charging port.” (Website OpenAI) The technical report is available via cdn.openai.com/papers/gpt-4.pdf.
Bard Comes into the World
Sundar Pichai, the CEO of Google and Alphabet, announced the answer to ChatGPT in a blog post dated February 6, 2023. According to him, Bard is an experimental conversational AI service powered by LaMDA. It has been opened to trusted testers and will be made available to the public in the coming weeks. “Bard seeks to combine the breadth of the world’s knowledge with the power, intelligence and creativity of our large language models. It draws on information from the web to provide fresh, high-quality responses. Bard can be an outlet for creativity, and a launchpad for curiosity, helping you to explain new discoveries from NASA’s James Webb Space Telescope to a 9-year-old, or learn more about the best strikers in football right now, and then get drills to build your skills.” (Sundar Pichai 2023) In recent weeks, Google had come under heavy pressure from OpenAI’s ChatGPT. It was clear that the company had to present a comparable application based on LaMDA as soon as possible. In addition, Baidu wants to launch the Ernie Bot, another competing product. More information via blog.google/technology/ai/bard-google-ai-search-updates/.
AI for Well-being
As part of the AAAI 2023 Spring Symposia in San Francisco, the symposium “Socially Responsible AI for Well-being” is organized by Takashi Kido (Teikyo University, Japan) and Keiki Takadama (The University of Electro-Communications, Japan). The AAAI website states: “For our happiness, AI is not enough to be productive in exponential growth or economic/financial supremacies but should be socially responsible from the viewpoint of fairness, transparency, accountability, reliability, safety, privacy, and security. For example, AI diagnosis system should provide responsible results (e.g., a high-accuracy of diagnostics result with an understandable explanation) but the results should be socially accepted (e.g., data for AI (machine learning) should not be biased (i.e., the amount of data for learning should be equal among races and/or locations). Like this example, a decision of AI affects our well-being, which suggests the importance of discussing ‘What is socially responsible?’ in several potential situations of well-being in the coming AI age.” (Website AAAI) According to the organizers, the first perspective is “(Individually) Responsible AI”, which aims to clarify what kinds of mechanisms or issues should be taken into consideration to design Responsible AI for well-being. The second perspective is “Socially Responsible AI”, which aims to clarify what kinds of mechanisms or issues should be taken into consideration to implement social aspects in Responsible AI for well-being. More information via www.aaai.org/Symposia/Spring/sss23.php#ss09.