RT/ Star Trek’s Holodeck recreated using ChatGPT and video game assets

Published in Paradigm · 29 min read · Apr 30, 2024

Robotics & AI biweekly vol.93, 15th April — 30th April

TL;DR

  • Engineers leverage AI to bring Star Trek’s Holodeck to reality, generating 3D environments from everyday language prompts.
  • Modular, spring-like devices are engineered to amplify the power of live muscle fibers, enabling them to fuel biohybrid robots.
  • Researchers pioneer a privacy-centric camera design, processing and obscuring visual data pre-digitization to safeguard anonymity.
  • A novel method based on SLAM algorithms enhances a bipedal climbing robot’s ability to map its surroundings on trusses.
  • Biomedical, mechanical, and aerospace engineers unite to create a hopping robot by affixing a telescopic leg to a quadcopter’s underside.
  • Nanorobots gain adaptive time delay capabilities, improving collaboration efficiency through research advancements.
  • AI-driven enhancement accelerates cell imaging by 100 times, improving contrast for evaluating eye diseases like AMD.
  • DeepMind’s AI specialists teach miniature robots soccer through machine learning techniques.
  • Orthodontists benefit from AI predictions, ensuring optimal braces fit using virtual patient data.
  • Biomimetic olfactory chips revolutionize artificial sensor technology, mimicking human olfaction with nanostructured arrays.
  • And more!

Robotics market

The global market for robots is expected to grow at a compound annual growth rate (CAGR) of around 26 percent to reach just under 210 billion U.S. dollars by 2025.

Size of the global market for industrial and non-industrial robots between 2018 and 2025 (in billion U.S. dollars). Source: Statista

Latest News & Research

Holodeck: Language Guided Generation of 3D Embodied AI Environments

by Yue Yang, Fan-Yun Sun, Luca Weihs, Eli VanderBilt, Alvaro Herrasti, Winson Han, Jiajun Wu, Nick Haber, Ranjay Krishna, Lingjie Liu, Chris Callison-Burch, Mark Yatskar, Aniruddha Kembhavi, Christopher Clark in arXiv

In Star Trek: The Next Generation, Captain Picard and the crew of the U.S.S. Enterprise leverage the holodeck, an empty room capable of generating 3D environments, to prepare for missions and to entertain themselves, simulating everything from lush jungles to the London of Sherlock Holmes. Deeply immersive and fully interactive, holodeck-created environments are infinitely customizable, using nothing but language: the crew has only to ask the computer to generate an environment, and that space appears in the holodeck.

Today, virtual interactive environments are also used to train robots prior to real-world deployment, in a process called “Sim2Real.” Such environments, however, have been in surprisingly short supply.

“Artists manually create these environments,” says Yue Yang, a doctoral student in the labs of Mark Yatskar and Chris Callison-Burch, Assistant and Associate Professors in Computer and Information Science (CIS), respectively. “Those artists could spend a week building a single environment,” Yang adds, noting all the decisions involved, from the layout of the space to the placement of objects to the colors employed in rendering.

That paucity of virtual environments is a problem if you want to train robots to navigate the real world with all its complexities. Neural networks, the systems powering today’s AI revolution, require massive amounts of data, which in this case means simulations of the physical world.

“Generative AI systems like ChatGPT are trained on trillions of words, and image generators like Midjourney and DALL-E are trained on billions of images,” says Callison-Burch. “We only have a fraction of that amount of 3D environments for training so-called ‘embodied AI.’ If we want to use generative AI techniques to develop robots that can safely navigate in real-world environments, then we will need to create millions or billions of simulated environments.”

Example outputs of HOLODECK — a large language model powered system, which can generate diverse types of environments (arcade, spa, museum), customize for styles (Victorian-style), and understand fine-grained requirements (“has a cat”, “fan of Star Wars”).

Enter Holodeck, a system for generating interactive 3D environments co-created by Callison-Burch, Yatskar, Yang and Lingjie Liu, Aravind K. Joshi Assistant Professor in CIS, along with collaborators at Stanford, the University of Washington, and the Allen Institute for Artificial Intelligence (AI2). Named for its Star Trek forebear, Holodeck generates a virtually limitless range of indoor environments, using AI to interpret users’ requests. “We can use language to control it,” says Yang. “You can easily describe whatever environments you want and train the embodied AI agents.”

Holodeck leverages the knowledge embedded in large language models (LLMs), the systems underlying ChatGPT and other chatbots. “Language is a very concise representation of the entire world,” says Yang. Indeed, LLMs turn out to have a surprisingly high degree of knowledge about the design of spaces, thanks to the vast amounts of text they ingest during training. In essence, Holodeck works by engaging an LLM in conversation, using a carefully structured series of hidden queries to break down user requests into specific parameters.

Given a text input, HOLODECK generates the 3D environment through multiple rounds of conversation with an LLM.

Just like Captain Picard might ask Star Trek’s Holodeck to simulate a speakeasy, researchers can ask Penn’s Holodeck to create “a 1b1b apartment of a researcher who has a cat.” The system executes this query by dividing it into multiple steps: first, the floor and walls are created, then the doorway and windows. Next, Holodeck searches Objaverse, a vast library of premade digital objects, for the sort of furnishings you might expect in such a space: a coffee table, a cat tower, and so on. Finally, Holodeck queries a layout module, which the researchers designed to constrain the placement of objects, so that you don’t wind up with a toilet extending horizontally from the wall.
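To make that multi-step pipeline concrete, here is a minimal Python sketch of an LLM-driven scene generator in the same spirit: floor and walls first, then openings, then asset retrieval, then a constrained layout pass. The function names, prompts and canned responses are illustrative assumptions, not Holodeck’s actual code or API, and the Objaverse lookup is stubbed out.

```python
import json

# Hypothetical sketch of an LLM-driven indoor-scene pipeline in the spirit of
# Holodeck: floor/walls first, then doors/windows, then asset retrieval, then
# a constrained layout pass. All names, prompts and values are illustrative.

def ask_llm(prompt: str) -> str:
    """Stand-in for a chat-LLM call; returns canned JSON so the sketch runs."""
    canned = {
        "floor_plan": '{"rooms": [{"name": "studio", "width_m": 5, "depth_m": 4}]}',
        "openings": '{"doors": [{"wall": "south"}], "windows": [{"wall": "north"}]}',
        "objects": '{"objects": ["bed", "desk", "coffee table", "cat tower"]}',
    }
    for key, value in canned.items():
        if key in prompt:
            return value
    return "{}"

def search_asset_library(query: str) -> str:
    """Stand-in for an Objaverse-style lookup; returns a fake asset id."""
    return f"asset::{query.replace(' ', '_')}"

def constrain_layout(assets, room):
    """Toy layout rule: keep every object on the floor, inside the room bounds."""
    placements = []
    for i, asset in enumerate(assets):
        x = 0.5 + i * (room["width_m"] - 1.0) / max(len(assets) - 1, 1)
        placements.append({"asset": asset, "x": round(x, 2), "y": 0.5, "rotation": 0})
    return placements

def generate_scene(request: str) -> dict:
    room = json.loads(ask_llm(f"floor_plan for: {request}"))["rooms"][0]
    openings = json.loads(ask_llm(f"openings for: {request}"))
    objects = json.loads(ask_llm(f"objects for: {request}"))["objects"]
    assets = [search_asset_library(name) for name in objects]
    layout = constrain_layout(assets, room)
    return {"room": room, "openings": openings, "layout": layout}

print(json.dumps(generate_scene("a 1b1b apartment of a researcher who has a cat"), indent=2))
```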

To evaluate Holodeck’s abilities, in terms of their realism and accuracy, the researchers generated 120 scenes using both Holodeck and ProcTHOR, an earlier tool created by AI2, and asked several hundred Penn Engineering students to indicate their preferred version, not knowing which scenes were created by which tools. For every criterion — asset selection, layout coherence and overall preference — the students consistently rated the environments generated by Holodeck more favorably.

The researchers also tested Holodeck’s ability to generate scenes that are less typical in robotics research and more difficult to manually create than apartment interiors, like stores, public spaces and offices. Comparing Holodeck’s outputs to those of ProcTHOR, which were generated using human-created rules rather than AI-generated text, the researchers found once again that human evaluators preferred the scenes created by Holodeck. That preference held across a wide range of indoor environments, from science labs to art studios, locker rooms to wine cellars.

Finally, the researchers used scenes generated by Holodeck to “fine-tune” an embodied AI agent. “The ultimate test of Holodeck,” says Yatskar, “is using it to help robots interact with their environment more safely by preparing them to inhabit places they’ve never been before.”

Across multiple types of virtual spaces, including offices, daycares, gyms and arcades, Holodeck had a pronounced and positive effect on the agent’s ability to navigate new spaces. For instance, whereas the agent successfully found a piano in a music room only about 6% of the time when pre-trained using ProcTHOR (which involved the agent taking about 400 million virtual steps), the agent succeeded over 30% of the time when fine-tuned using 100 music rooms generated by Holodeck.

“This field has been stuck doing research in residential spaces for a long time,” says Yang. “But there are so many diverse environments out there — efficiently generating a lot of environments to train robots has always been a big challenge, but Holodeck provides this functionality.”

Enhancing and Decoding the Performance of Muscle Actuators with Flexures

by Naomi Lynch, Nicolas Castro, Tara Sheehan, Laura Rosado, Brandon Rios, Martin Culpepper, Ritu Raman in Advanced Intelligent Systems

Our muscles are nature’s perfect actuators — devices that turn energy into motion. For their size, muscle fibers are more powerful and precise than most synthetic actuators. They can even heal from damage and grow stronger with exercise.

For these reasons, engineers are exploring ways to power robots with natural muscles. They’ve demonstrated a handful of “biohybrid” robots that use muscle-based actuators to power artificial skeletons that walk, swim, pump, and grip. But for every bot, there’s a very different build, and no general blueprint for how to get the most out of muscles for any given robot design.

Now, MIT engineers have developed a spring-like device that could be used as a basic skeleton-like module for almost any muscle-bound bot. The new spring, or “flexure,” is designed to get the most work out of any attached muscle tissue. Like a leg press fitted with just the right amount of weight, the device maximizes the amount of movement that a muscle can naturally produce. The researchers found that when they fit a ring of muscle tissue onto the device, much like a rubber band stretched around two posts, the muscle pulled on the spring reliably and repeatedly, stretching it five times farther than in previous device designs.

The team sees the flexure design as a new building block that can be combined with other flexures to build any configuration of artificial skeleton. Engineers can then fit the skeletons with muscle tissue to power their movements.

“These flexures are like a skeleton that people can now use to turn muscle actuation into multiple degrees of freedom of motion in a very predictable way,” says Ritu Raman, the Brit and Alex d’Arbeloff Career Development Professor in Engineering Design at MIT. “We are giving roboticists a new set of rules to make powerful and precise muscle-powered robots that do interesting things.”

Mathematical model characterizing muscle–flexure interaction.

Raman and her colleagues report the details of the new flexure design in a paper. The study’s MIT co-authors include Naomi Lynch ’12, SM ’23; undergraduate Tara Sheehan; graduate students Nicolas Castro, Laura Rosado, and Brandon Rios; and professor of mechanical engineering Martin Culpepper.

When left alone in a petri dish in favorable conditions, muscle tissue will contract on its own but in directions that are not entirely predictable or of much use.

“If muscle is not attached to anything, it will move a lot, but with huge variability, where it’s just flailing around in liquid,” Raman says.

To get a muscle to work like a mechanical actuator, engineers typically attach a band of muscle tissue between two small, flexible posts. As the muscle band naturally contracts, it can bend the posts and pull them together, producing some movement that would ideally power part of a robotic skeleton. But in these designs, muscles have produced limited movement, mainly because the tissues are so variable in how they contact the posts. Depending on where the muscles are placed on the posts, and how much of the muscle surface is touching the post, the muscles may succeed in pulling the posts together but at other times may wobble around in uncontrollable ways.

Raman’s group looked to design a skeleton that focuses and maximizes a muscle’s contractions regardless of exactly where and how it is placed on a skeleton, to generate the most movement in a predictable, reliable way.

“The question is: How do we design a skeleton that most efficiently uses the force the muscle is generating?” Raman says.

The researchers first considered the multiple directions that a muscle can naturally move. They reasoned that if a muscle is to pull two posts together along a specific direction, the posts should be connected to a spring that only allows them to move in that direction when pulled.

“We need a device that is very soft and flexible in one direction, and very stiff in all other directions, so that when a muscle contracts, all that force gets efficiently converted into motion in one direction,” Raman says.

As it turns out, Raman found many such devices in Professor Martin Culpepper’s lab. Culpepper’s group at MIT specializes in the design and fabrication of machine elements such as miniature actuators, bearings, and other mechanisms, that can be built into machines and systems to enable ultraprecise movement, measurement, and control, for a wide variety of applications. Among the group’s precision machined elements are flexures — spring-like devices, often made from parallel beams, that can flex and stretch with nanometer precision.

“Depending on how thin and far apart the beams are, you can change how stiff the spring appears to be,” Raman says.

She and Culpepper teamed up to design a flexure specifically tailored with a configuration and stiffness to enable muscle tissue to naturally contract and maximally stretch the spring. The team designed the device’s configuration and dimensions based on numerous calculations they carried out to relate a muscle’s natural forces with a flexure’s stiffness and degree of movement.
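The kind of spring math involved can be illustrated with the standard parallel-beam flexure relation, in which two fixed-guided beams in parallel give a stiffness of k = 24EI/L³ and the deflection under a muscle force F is simply x = F/k. The numbers below are made-up placeholders, not the MIT team’s design values.

```python
# Illustrative spring math for a parallel-beam flexure driven by a muscle band.
# The formula (k = 24*E*I/L^3 for two fixed-guided beams in parallel) is the
# standard textbook result; all numbers below are made-up placeholders, not
# values from the MIT study.

E = 2.5e9          # Pa, Young's modulus of a stiff polymer (assumed)
L = 10e-3          # m, beam length (assumed)
b = 5e-3           # m, beam depth (assumed)
t = 0.1e-3         # m, beam thickness in the bending direction (assumed)

I = b * t**3 / 12                  # second moment of area of one beam
k_flexure = 24 * E * I / L**3      # N/m, parallel-guided flexure stiffness

F_muscle = 1e-3                    # N, assumed muscle-band contraction force
x = F_muscle / k_flexure           # m, resulting flexure deflection

print(f"flexure stiffness: {k_flexure:.1f} N/m")
print(f"deflection under {F_muscle*1e3:.1f} mN: {x*1e6:.1f} micrometers")
```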

The flexure they ultimately designed is 1/100 the stiffness of muscle tissue itself. The device resembles a miniature, accordion-like structure, the corners of which are pinned to an underlying base by a small post, which sits near a neighboring post that is fit directly onto the base. Raman then wrapped a band of muscle around the two corner posts (the team molded the bands from live muscle fibers that they grew from mouse cells), and measured how close the posts were pulled together as the muscle band contracted.

The team found that the flexure’s configuration enabled the muscle band to contract mostly along the direction between the two posts. This focused contraction allowed the muscle to pull the posts much closer together — five times closer — compared with previous muscle actuator designs.

“The flexure is a skeleton that we designed to be very soft and flexible in one direction, and very stiff in all other directions,” Raman says. “When the muscle contracts, all the force is converted into movement in that direction. It’s a huge magnification.”

Inherently privacy-preserving vision for trustworthy autonomous systems: Needs and solutions

by Adam K. Taras, Niko Sünderhauf, Peter Corke, Donald G. Dansereau in Journal of Responsible Technology

From robotic vacuum cleaners and smart fridges to baby monitors and delivery drones, the smart devices being increasingly welcomed into our homes and workplaces use vision to take in their surroundings, taking videos and images of our lives in the process.

In a bid to restore privacy, researchers at the Australian Centre for Robotics at the University of Sydney and the Centre for Robotics (QCR) at Queensland University of Technology have created a new approach to designing cameras that process and scramble visual information before it is digitised so that it becomes obscured to the point of anonymity.

Known as sighted systems, devices like smart vacuum cleaners form part of the “internet of things,” smart systems that connect to the internet. They are at risk of being hacked by bad actors or compromised through human error, and their images and videos can be stolen by third parties, sometimes with malicious intent. Under the new approach, the distorted images act as a kind of “fingerprint”: robots can still use them to complete their tasks, but they do not provide a comprehensive visual representation that would compromise privacy.

“Smart devices are changing the way we work and live our lives, but they shouldn’t compromise our privacy and become surveillance tools,” said Adam Taras, who completed the research as part of his Honours thesis.

“When we think of ‘vision’ we think of it like a photograph, whereas many of these devices don’t require the same type of visual access to a scene as humans do. They have a very narrow scope in terms of what they need to measure to complete a task, using other visual signals, such as colour and pattern recognition,” he said.

The researchers have been able to move the part of the processing that normally happens inside a computer into the optics and analogue electronics of the camera, which exist beyond the reach of attackers.

“This is the key distinguishing point from prior work which obfuscated the images inside the camera’s computer — leaving the images open to attack,” said Dr Don Dansereau, Taras’ supervisor at the Australian Centre for Robotics. “We go one level beyond to the electronics themselves, enabling a greater level of protection.”
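The general concept can be mimicked in simulation: reduce the scene to a handful of coarse, task-relevant measurements before anything is digitized, so that a navigation cue survives while no recognizable picture ever exists in digital form. The numpy sketch below is a conceptual toy under those assumptions, not the actual optical and analogue-electronic design from the Sydney/QUT team.

```python
import numpy as np

# Toy illustration of "process before digitize": the analogue front end reduces
# the scene to a handful of coarse, task-relevant measurements, and only those
# numbers are ever digitized. A conceptual sketch, not the Sydney/QUT design.

rng = np.random.default_rng(0)

H, W = 64, 64
scene = np.zeros((H, W))
scene[:, W // 2:] = 1.0                      # bright region on the right
scene += 0.05 * rng.standard_normal((H, W))

def analogue_front_end(img, blocks=4):
    """Average brightness over a coarse grid of blocks (4x4 = 16 numbers).
    Enough for e.g. a 'steer toward the bright side' cue, far too little to
    reconstruct a recognizable picture of the room."""
    h, w = img.shape
    bh, bw = h // blocks, w // blocks
    return np.array([[img[i*bh:(i+1)*bh, j*bw:(j+1)*bw].mean()
                      for j in range(blocks)] for i in range(blocks)])

digitized = analogue_front_end(scene)        # all the digital side ever sees

left = digitized[:, :2].mean()
right = digitized[:, 2:].mean()
print(f"digitized values: {digitized.size} numbers (vs {scene.size} pixels)")
print("steer toward:", "right" if right > left else "left")
```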

The researchers tried to hack their approach but were unable to reconstruct the images in any recognisable format. They have opened this task to the research community at large, challenging others to hack their method.

“If these images were to be accessed by a third party, they would not be able to make much of them, and privacy would be preserved,” said Taras.

Dr Dansereau said privacy was becoming an increasing concern as more devices come with built-in cameras, and as new technologies such as parcel drones, which travel into residential areas to make deliveries, edge closer to widespread use.

“You wouldn’t want images taken inside your home by your robot vacuum cleaner leaked on the dark web, nor would you want a delivery drone to map out your backyard. It is too risky to allow services linked to the web to capture and hold onto this information,” said Dr Dansereau.

The approach could also be used to make devices that work in places where privacy and security are a concern, such as warehouses, hospitals, factories, schools and airports. The researchers hope to next build physical camera prototypes to demonstrate the approach in practice.

“Current robotic vision technology tends to ignore the legitimate privacy concerns of end-users. This is a short-sighted strategy that slows down or even prevents the adoption of robotics in many applications of societal and economic importance. Our new sensor design takes privacy very seriously, and I hope to see it taken up by industry and used in many applications,” said Professor Niko Suenderhauf, Deputy Director of the QCR, who advised on the project.

Professor Peter Corke, Distinguished Professor Emeritus and Adjunct Professor at the QCR who also advised on the project said: “Cameras are the robot equivalent of a person’s eyes, invaluable for understanding the world, knowing what is what and where it is. What we don’t want is the pictures from those cameras to leave the robot’s body, to inadvertently reveal private or intimate details about people or things in the robot’s environment.”

BiCR-SLAM: A multi-source fusion SLAM system for biped climbing robots in truss environments

by Haifei Zhu et al in Robotics and Autonomous Systems

Climbing robots could have many valuable real-world applications, ranging from the completion of maintenance tasks on roofs or other tall structures to the delivery of parcels or survival kits in locations that are difficult to access. To be successfully deployed in real-world settings, however, these robots should be able to effectively sense and map their surroundings, while also accurately predicting where they are located within mapped environments.

Researchers at Guangdong University of Technology recently developed a new method to enhance the ability of a bipedal climbing robot to estimate its state and map its surroundings while climbing a truss (i.e., a triangular system of straight interconnected elements, such as a bridge, roof or other man-made structure). Their proposed method is based on a simultaneous localization and mapping (SLAM) algorithm.

“Our recent work deploys SLAM methods to a particular biped climbing robot (BiCR), which was developed by our lab, named the Biomimetic and Intelligent Robotics Lab,” said Weinan Chen, co-author of the paper. “BiCR is an electromechanical system similar to a moving manipulator that is able to move via grippers at both ends and rotate with multiple joints. The robot can be used for installation, maintenance, and inspection in high-altitude and high-risk environments, such as construction site scaffolding and power towers.”

The primary objective of the recent study by Chen and his colleagues was to allow a bipedal climbing robot to autonomously localize itself and create a map of its surroundings while navigating environments characterized by truss structures. The SLAM-based approach they propose was specifically applied to BiCR, a bipedal climbing robot previously developed in their lab.

“Since there are many differences in the configurations and working environments of the BiCR and other robots (ground vehicles, UAVs, etc.), this paper proposes a method that fuses robot joint information and environmental information to improve the localization accuracy of the BiCR,” Chen said.

Factor graph outlining the functioning of the team’s model. The encoder gets the forward kinematics factor by encoder dead reckoning. The point cloud from LiDAR is used as the odometry factor. In the pole landmark mapping, the team uses a LiDAR-visual fusion scheme to sense the poles. The constraint relationship between the poles and the grippers is utilized to form the gripping factor.

BiCR-SLAM, the simultaneous localization and mapping system developed by the researchers, uses information about the BiCR robot’s configuration, data from a LiDAR sensing system, and visual data collected by cameras to localize the robot and map the truss it is climbing. The framework can determine the pose of the robotic gripper and create a map of the poles surrounding it, so that the robot can better plan its actions while climbing a truss.

“The framework consists of four components: an encoder dead reckoning, a LiDAR odometry estimation tool, a pole landmark mapping model, and a global optimization technique,” Jianhong Xu, another author of the paper, said. “In the global optimization, we propose a multi-source fusion factor graph to jointly optimize the robot localization and pole landmarks.”
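As a rough intuition for how such a factor graph fuses different sources, the tiny one-dimensional example below lets encoder dead reckoning, a LiDAR odometry measurement and a gripping constraint to a surveyed pole joint all pull on the same pose estimates through weighted least squares. It is a generic illustration of multi-source fusion with invented numbers, not BiCR-SLAM itself.

```python
import numpy as np

# Generic 1-D illustration of multi-source factor-graph fusion (not BiCR-SLAM):
# three gripper poses x0, x1, x2 along a pole are constrained by encoder dead
# reckoning, LiDAR odometry, and a "gripping" constraint to a known pole joint.
# Each factor is a row in a weighted linear least-squares problem.

rows, rhs, weights = [], [], []

def add_factor(coeffs, value, sigma):
    rows.append(coeffs)
    rhs.append(value)
    weights.append(1.0 / sigma)

# prior: x0 = 0 (start of the climb)
add_factor([1, 0, 0], 0.0, 0.01)
# encoder dead reckoning (drifts a little): x1 - x0 = 1.02, x2 - x1 = 1.03
add_factor([-1, 1, 0], 1.02, 0.10)
add_factor([0, -1, 1], 1.03, 0.10)
# LiDAR odometry over the whole segment: x2 - x0 = 2.00
add_factor([-1, 0, 1], 2.00, 0.05)
# gripping factor: at pose 2 the gripper holds a pole joint surveyed at 1.98 m
add_factor([0, 0, 1], 1.98, 0.02)

A = np.array(rows, dtype=float) * np.array(weights)[:, None]
b = np.array(rhs) * np.array(weights)
x, *_ = np.linalg.lstsq(A, b, rcond=None)
print("fused pose estimates [x0, x1, x2]:", np.round(x, 3))
```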

A notable advantage of BiCR-SLAM is that it simultaneously considers information related to the bipedal robot’s joints and data collected by sensors. It thus allows the BiCR robot to map its surroundings and predict its pose, using this information to plan its next moves and safely climb a truss.

“The system can also keep working in some low-texture and single-structure truss environments,” Chen said. “To the best of our knowledge, BiCR-SLAM is the first SLAM system solution to incorporate BiCR’s information for a truss map. This work can advance the development of BiCR and SLAM and improve BiCR’s localization and navigation performance in autonomous operations.”

While Chen and his collaborators specifically designed their SLAM method for the BiCR robot, in the future it could also be adapted and applied to other climbing robots. So far, the team used a single LiDAR system to sense poles within a small sensing range around a robot, but they soon hope to further advance its capabilities using more of these systems along with deep learning techniques.

“In our next studies, we plan to use multiple LiDARs to sense pole objects using deep learning approaches that are free from the calibration of the external parameters between various sensors,” Chen added.

“We also plan to use the sensing configuration with a more extensive scanning range to improve segmentation accuracy and apply this work to the autonomous navigation of climbing robots. Specifically, we will integrate the motion planning part to realize an autonomous navigation function.”

An agile monopedal hopping quadcopter with synergistic hybrid locomotion

by Songnan Bai et al in Science Robotics

A team of biomedical, mechanical, and aerospace engineers from City University of Hong Kong and Hong Kong University of Science and Technology has developed a hopping robot by attaching a spring-loaded telescopic leg to the underside of a quadcopter.

Quadcopters have become widely popular over the past several years for recreational use by the general public, as a means of surveillance, and as a research tool, since they allow for unprecedented aerial viewing and can sometimes carry payloads.

Two features of the flying robots that are notably in need of improvement are flight time and payload capacity. In this new study, the researchers working in Hong Kong have devised a means to overcome both problems.

The approach they developed involved adding a spring-loaded telescopic leg (essentially a pogo stick) beneath a standard quadcopter, allowing it to hop when necessary. To allow the leg to work properly, the researchers also added stabilizing capabilities.

Adding the hopping ability reduced battery drain, allowing for longer flight times. It also allowed the quadcopter to lift much heavier loads because it did not have to keep them aloft.
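A back-of-the-envelope comparison shows why hopping can stretch the battery: hovering pays a continuous induced-power cost, while hopping only pays for the energy of each launch. The hover term below uses the textbook actuator-disk formula; every numerical value is a rough assumption for illustration, not a measurement from the paper.

```python
import math

# Back-of-the-envelope: continuous power to hover vs average power to hop.
# Hover uses the ideal actuator-disk induced-power formula; every number here
# is a rough assumption for illustration, not a measurement from the paper.

m = 0.035              # kg, robot mass (the article's ~35 g)
g = 9.81               # m/s^2
rho = 1.225            # kg/m^3, air density
rotor_radius = 0.023   # m, assumed
A = 4 * math.pi * rotor_radius**2   # total disk area of four rotors

# Ideal induced power to hover: P = sqrt(T^3 / (2 * rho * A)), with T = m*g
T = m * g
p_hover = math.sqrt(T**3 / (2 * rho * A))

# Hopping: pay m*g*h per hop (divided by an assumed actuation efficiency),
# at an assumed hop height and hop rate.
hop_height = 0.3       # m, assumed
hops_per_second = 2.0  # assumed
efficiency = 0.5       # assumed spring/motor efficiency
p_hop = hops_per_second * m * g * hop_height / efficiency

print(f"ideal hover power : {p_hover:.2f} W")
print(f"average hop power : {p_hop:.2f} W")
```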

Hopping robot prototype.

The researchers found that the robot could hop around as desired, moving easily from one location to another. It could also take flight mid-hop and then fly as a normal quadcopter. Testing showed that in addition to clean vertical hops, the robot was capable of hopping on uneven ground and could even hop horizontally, which meant the leg could be used as a bumper of sorts, preventing damage if the robot ran into a wall or other structure.

The researchers describe their robot as bird-sized and lightweight, at approximately 35 grams. Among possible applications, they suggest it could be used to monitor wildlife, for example by hopping among branches high in the trees. It could also be used in disaster areas, helping with assessments and finding survivors, or as a farm monitor, hopping from plant to plant to test soil and moisture levels.

Persistent and responsive collective motion with adaptive time delay

by Zhihan Chen, Yuebing Zheng in Science Advances

In natural ecosystems, herd mentality plays a major role, from schools of fish to beehives to ant colonies. This collective behavior allows the whole to exceed the sum of its parts and better respond to threats and challenges.

This behavior inspired researchers from The University of Texas at Austin, and for more than a year they’ve been working on creating “smart swarms” of microscopic robots. The researchers engineered social interactions among these tiny machines so that they can act as one coordinated group, performing tasks better than they would if they were moving as individuals or at random.

“All these groups, flocks of birds, schools of fish and others, each member of the group has this natural inclination to work in concert with its neighbor, and together they are smarter, stronger and more efficient than they would be on their own,” said Yuebing Zheng, associate professor in the Walker Department of Mechanical Engineering and Texas Materials Institute. “We wanted to learn more about the mechanisms that make this happen and see if we can reproduce it.”

Collective motion of light-powered particles with adaptive time delay investigated by an optical feedback-control platform.

In the new paper, Zheng and his team have given these swarms a new trait called adaptive time delay. This concept allows each microrobot within the swarm to adapt its motion to changes in local surroundings. By doing this, the swarm showed a significant increase in responsivity without decreasing its robustness — the ability to quickly respond to any environment change while maintaining the integrity of the swarm.
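The flavor of the idea can be captured in a toy flocking simulation: each agent aligns with the headings its neighbors had some number of steps in the past, and that delay shrinks when the externally imposed goal direction suddenly changes, then relaxes once things are steady. The numpy sketch below illustrates only this general principle; it is not the authors’ light-driven feedback-control platform, and all parameters are invented.

```python
import numpy as np

# Toy sketch of collective motion with an adaptive time delay (numpy only).
# Each agent aligns with the average heading its neighbors had `delay` steps
# ago; the delay shrinks when the externally imposed goal direction changes
# quickly and grows back when it is steady. This illustrates the general idea
# only; it is not the authors' optical feedback-control platform.

rng = np.random.default_rng(1)
N, steps = 50, 400
pos = rng.uniform(0, 10, (N, 2))
theta = rng.uniform(-np.pi, np.pi, N)
history = [theta.copy()]                      # past headings for delayed lookup
delay = np.full(N, 10)                        # per-agent delay, in steps
max_delay, min_delay, speed, radius = 20, 1, 0.05, 1.5

def goal_direction(t):
    # goal heading switches abruptly halfway through the run
    return 0.0 if t < steps // 2 else np.pi / 2

for t in range(steps):
    goal = goal_direction(t)
    goal_change = abs(goal - goal_direction(max(t - 1, 0)))
    # adapt the delay: react fast when the goal just changed, relax otherwise
    delay = np.where(goal_change > 1e-6, min_delay,
                     np.minimum(delay + 1, max_delay))
    new_theta = theta.copy()
    for i in range(N):
        past = history[max(len(history) - 1 - int(delay[i]), 0)]
        near = np.linalg.norm(pos - pos[i], axis=1) < radius
        align = np.arctan2(np.sin(past[near]).mean(), np.cos(past[near]).mean())
        # blend neighbor alignment, the goal direction, and a little noise
        new_theta[i] = align + 0.3 * np.sin(goal - align) + 0.05 * rng.standard_normal()
    theta = new_theta
    pos += speed * np.column_stack((np.cos(theta), np.sin(theta)))
    history.append(theta.copy())

mean_heading = np.arctan2(np.sin(theta).mean(), np.cos(theta).mean())
print(f"mean heading after run: {mean_heading:.2f} rad (goal was {goal_direction(steps - 1):.2f} rad)")
```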

This finding builds on a novel optical feedback system — the ability to direct these microrobots in a collective way using controllable light patterns. This system was first unveiled in the researchers’ 2023 paper and it facilitated the development of adaptive time delay for microrobots.

The adaptive time delay strategy offers potential for scalability and integration into larger machinery. This approach could significantly enhance the operational efficiency of autonomous drone fleets. Similarly, it could enable convoys of trucks and cars to autonomously navigate extensive highway journeys in unison, with improved responsiveness and increased robustness. Just as schools of fish communicate and follow one another, so will these machines. As a result, there is no need for any kind of central control, which takes more data and energy to operate.

“Nanorobots, on an individual basis, are vulnerable to complex environments; they struggle to navigate effectively in challenging conditions such as bloodstreams or polluted waters,” said Zhihan Chen, a Ph.D. student in Zheng’s lab and co-author on the new paper. “This collective motion can help them better navigate a complicated environment and reach the target efficiently and avoid obstacles or threats.”

Having proven this swarm mentality in the lab, the researchers’ next step is to introduce more obstacles. These experiments were conducted in a static liquid solution. Up next, they’ll try to repeat the behavior in flowing liquid, and then they’ll move on to replicating it inside an organism.

Once fully developed, these smart swarms could serve as advanced drug delivery forces, able to navigate the human body and elude its defenses to bring medicine to its target. Or, they could operate like iRobot robotic vacuums, but for contaminated water, collectively cleaning every bit of an area together.

Revealing speckle obscured living human retinal cells with artificial intelligence assisted adaptive optics optical coherence tomography

by Vineeta Das, Furu Zhang, Andrew J. Bower, Joanne Li, Tao Liu, Nancy Aguilera, Bruno Alvisio, Zhuolin Liu, Daniel X. Hammer, Johnny Tam in Communications Medicine

Researchers at the National Institutes of Health applied artificial intelligence to a technique that produces high-resolution images of cells in the eye. They report that with AI, imaging is 100 times faster and improves image contrast 3.5-fold. The advance, they say, will provide researchers with a better tool to evaluate age-related macular degeneration (AMD) and other retinal diseases.

“Artificial intelligence helps overcome a key limitation of imaging cells in the retina, which is time,” said Johnny Tam, Ph.D., who leads the Clinical and Translational Imaging Section at NIH’s National Eye Institute.

Tam is developing a technology called adaptive optics (AO) to improve imaging devices based on optical coherence tomography (OCT). Like ultrasound, OCT is noninvasive, quick, painless, and standard equipment in most eye clinics.

Imaging RPE cells with AO-OCT comes with new challenges, including a phenomenon called speckle. Speckle interferes with AO-OCT the way clouds interfere with aerial photography. At any given moment, parts of the image may be obscured. Managing speckle is somewhat similar to managing cloud cover. Researchers repeatedly image cells over a long period of time. As time passes, the speckle shifts, which allows different parts of the cells to become visible. The scientists then undertake the laborious and time-consuming task of piecing together many images to create an image of the RPE cells that’s speckle-free.

Tam and his team developed a novel AI-based method called parallel discriminator generative adversarial network (P-GAN) — a deep learning algorithm. By feeding the P-GAN network nearly 6,000 manually analyzed AO-OCT-acquired images of human RPE, each paired with its corresponding speckled original, the team trained the network to identify and recover speckle-obscured cellular features.
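For readers who want a feel for the training setup, the PyTorch sketch below shows a generic paired image-to-image GAN with an adversarial loss plus an L1 term on speckled/clean pairs. Random tensors stand in for real AO-OCT data, and the small networks are placeholders; it is not the authors’ P-GAN, whose parallel-discriminator architecture is not reproduced here.

```python
import torch
import torch.nn as nn

# Minimal paired image-to-image GAN sketch (adversarial + L1 loss), in the
# spirit of despeckling speckled/clean image pairs. Random tensors stand in
# for real AO-OCT data; this is a generic sketch, not the authors' P-GAN.

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),
        )
    def forward(self, x):
        return self.net(x)

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 1, 4, stride=2, padding=1),
        )
    def forward(self, speckled, candidate):
        # condition the discriminator on the speckled input, pix2pix-style
        return self.net(torch.cat([speckled, candidate], dim=1))

gen, disc = Generator(), Discriminator()
opt_g = torch.optim.Adam(gen.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=2e-4)
bce, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()

for step in range(100):                      # placeholder training loop
    speckled = torch.rand(4, 1, 64, 64)      # stand-in for speckled AO-OCT
    clean = torch.rand(4, 1, 64, 64)         # stand-in for averaged ground truth

    # discriminator: real pairs vs generated pairs
    fake = gen(speckled).detach()
    d_real = disc(speckled, clean)
    d_fake = disc(speckled, fake)
    loss_d = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # generator: fool the discriminator + stay close to the ground truth
    fake = gen(speckled)
    d_fake = disc(speckled, fake)
    loss_g = bce(d_fake, torch.ones_like(d_fake)) + 100.0 * l1(fake, clean)
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()
```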

When tested on new images, P-GAN successfully de-speckled the RPE images, recovering cellular details. With one image capture, it generated results comparable to the manual method, which required the acquisition and averaging of 120 images. Using a variety of objective performance metrics that assess things like cell shape and structure, P-GAN outperformed other AI techniques. Vineeta Das, Ph.D., a postdoctoral fellow in the Clinical and Translational Imaging Section at NEI, estimates that P-GAN reduced imaging acquisition and processing time by about 100-fold. P-GAN also yielded greater contrast, about 3.5 times greater than before.

“Adaptive optics takes OCT-based imaging to the next level,” said Tam. “It’s like moving from a balcony seat to a front row seat to image the retina. With AO, we can reveal 3D retinal structures at cellular-scale resolution, enabling us to zoom in on very early signs of disease.”

While adding AO to OCT provides a much better view of cells, processing AO-OCT images after they’ve been captured takes much longer than OCT without AO.

Tam’s latest work targets the retinal pigment epithelium (RPE), a layer of tissue behind the light-sensing retina that supports the metabolically active retinal neurons, including the photoreceptors. The retina lines the back of the eye and captures, processes, and converts the light that enters the front of the eye into signals that it then transmits through the optic nerve to the brain. Scientists are interested in the RPE because many diseases of the retina occur when the RPE breaks down.

By integrating AI with AO-OCT, Tam believes that a major obstacle for routine clinical imaging using AO-OCT has been overcome, especially for diseases that affect the RPE, which has traditionally been difficult to image.

“Our results suggest that AI can fundamentally change how images are captured,” said Tam. “Our P-GAN artificial intelligence will make AO imaging more accessible for routine clinical applications and for studies aimed at understanding the structure, function, and pathophysiology of blinding retinal diseases. Thinking about AI as a part of the overall imaging system, as opposed to a tool that is only applied after images have been captured, is a paradigm shift for the field of AI.”

Learning agile soccer skills for a bipedal robot with deep reinforcement learning

by Tuomas Haarnoja et al in Science Robotics

A team of AI specialists at Google’s DeepMind has used machine learning to teach tiny robots to play soccer.

As machine-learning-based LLMs make their way into mainstream use, computer engineers continue to look for other applications of AI tools. One goal that has long captured the imagination of scientists and the public at large is building robots that can carry out traditionally human tasks that are difficult or arduous.

The basic design for most such robots has typically relied on direct programming or a mimicking approach. In this new effort, the research team in the U.K. applied machine learning to the process and created tiny robots (approximately 510 mm tall) that are remarkably good at playing soccer.

The robot soccer environment.

The process of creating the robots involved developing and training two main reinforcement learning skills in computer simulation: getting up off the ground after falling, and attempting to score a goal. The team then trained the system to play a full, one-on-one version of soccer using a massive amount of video and other data.
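The staged idea of learning a recovery skill first and the full task second can be illustrated with a deliberately tiny tabular Q-learning toy, shown below: phase one rewards only getting up, phase two adds the scoring reward on top of the learned values. This is a loose sketch of skill-first training on an invented one-dimensional pitch, not DeepMind’s actual method, which trains neural-network policies in physics simulation and transfers them to hardware.

```python
import random

# Toy illustration of staged skill training with tabular Q-learning: phase 1
# rewards only the "get up" skill, phase 2 trains the full "get up, then walk
# to the goal" task starting from the phase-1 values. A loose sketch of the
# general idea only, not DeepMind's actual method.

random.seed(0)
POSITIONS, GOAL = 5, 4                 # a 1-D pitch; the goal is at the far end
ACTIONS = ["get_up", "step_forward"]
Q = {}                                 # (state, action) -> value; state = (position, fallen?)

def q(state, action):
    return Q.get((state, action), 0.0)

def step(state, action):
    pos, fallen = state
    if fallen:
        # only getting up works while fallen
        return ((pos, False), 1.0) if action == "get_up" else ((pos, True), -0.1)
    if action == "step_forward":
        pos = min(pos + 1, GOAL)
        if random.random() < 0.2:      # occasionally the robot falls over
            return ((pos, True), -0.1)
        return ((pos, False), 10.0 if pos == GOAL else 0.0)
    return ((pos, False), -0.1)

def train(episodes, goal_reward_scale):
    alpha, gamma, eps = 0.5, 0.95, 0.2
    for _ in range(episodes):
        state = (0, True)              # start fallen at one end of the pitch
        for _ in range(30):
            action = random.choice(ACTIONS) if random.random() < eps else \
                     max(ACTIONS, key=lambda a: q(state, a))
            nxt, reward = step(state, action)
            if reward == 10.0:
                reward *= goal_reward_scale
            target = reward + gamma * max(q(nxt, a) for a in ACTIONS)
            Q[(state, action)] = q(state, action) + alpha * (target - q(state, action))
            state = nxt
            if state[0] == GOAL:
                break

train(episodes=200, goal_reward_scale=0.0)   # phase 1: learn to get up
train(episodes=500, goal_reward_scale=1.0)   # phase 2: learn the full task
print("policy when fallen at start :", max(ACTIONS, key=lambda a: q((0, True), a)))
print("policy when upright mid-pitch:", max(ACTIONS, key=lambda a: q((2, False), a)))
```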

Once the virtual robots could play as desired, the system was transferred to several Robotis OP3 robots. The team also added software that allowed the robots to learn and improve as they first tested out individual skills and then when they were placed on a small soccer field and asked to play a match against one another.

In watching their robots play, the research team noted that many of the moves they made were accomplished more smoothly than robots trained using standard techniques. They could get up off the pitch much faster and more elegantly, for example.

The robots also learned to use techniques such as faking a turn to push their opponent into overcompensating, giving them a path toward the goal area. The researchers claim that their AI robots played considerably better than robots trained with any other technique to date.

Deep-Learning-Based Segmentation of Individual Tooth and Bone With Periodontal Ligament Interface Details for Simulation Purposes

by Peidi Xu, Torkan Gholamalizadeh, Faezeh Moshfeghifar, Sune Darkner, Kenny Erleben in IEEE Access

A new tool being developed by the University of Copenhagen and 3Shape will help orthodontists correctly fit braces onto teeth. Using artificial intelligence and virtual patients, the tool predicts how teeth will move, so as to ensure that braces are neither too loose nor too tight.

Many of us remember the feeling of having our braces regularly adjusted and retightened at the orthodontist’s office. And every year, about 30 percent of Danish youth up to the age of 15 wear braces to align crooked teeth. Orthodontists draw on their training and experience to do their jobs, but without the predictive possibilities that a computer could provide for the final result.

A new tool, developed in a collaboration between the University of Copenhagen’s Department of Computer Science and the company 3Shape, makes it possible to simulate how braces should fit to give the best result without too many unnecessary inconveniences. The tool has been developed with the help of scanned imagery of teeth and bone structures from human jaws, which artificial intelligence then uses to predict how sets of braces should be designed to best straighten a patient’s teeth.

“Our simulation is able to let an orthodontist know where braces should and shouldn’t exert pressure to straighten teeth. Currently, these interventions are based entirely upon the discretion of orthodontists and involve a great deal of trial and error. This can lead to many adjustments and visits to the orthodontist’s office, which our simulation can help reduce in the long run,” says Professor Kenny Erleben, who heads IMAGE (Image Analysis, Computational Modelling and Geometry), a research section at UCPH’s Department of Computer Science.

Illustration of gap generation.

It’s no wonder that it can be difficult to predict exactly how braces will move teeth, because teeth continue shifting slightly throughout a person’s life. And, these movements are very different from mouth to mouth.

“The fact that tooth movements vary from one patient to another makes it even more challenging to accurately predict how teeth will move for different people. Which is why we’ve developed a new tool and a dataset of different models to help overcome these challenges,” explains Torkan Gholamalizadeh, a researcher at 3Shape with a PhD from the Department of Computer Science.

As an alternative to classic brackets and braces, a new generation of clear braces, known as aligners, has gained ground. Aligners are transparent plastic casts that patients fit over their teeth. Patients must wear aligners for at least 22 hours a day, and they need to be swapped for new and tighter sets every two weeks. Because aligners are made of plastic, a person’s teeth also change the contours of the aligner itself, something the new tool also takes into account.

“As transparent aligners are softer than metal braces, calculating how much force it takes to move the teeth becomes even more complicated. But it’s a factor that we’ve taught our model to take into account, so that one can predict tooth movements when using aligners as well,” says Torkan Gholamalizadeh.
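To give a flavor of the kind of quantity such a simulation predicts, the sketch below treats each tooth as a rigid body held by a linear periodontal-ligament spring and converts an assumed aligner force into a displacement via x = F/k. The real tool builds detailed, patient-specific finite-element models; every tooth name, stiffness and force below is an illustrative assumption, not data from the study.

```python
# Heavily simplified "digital tooth" sketch: each tooth is a rigid body held by
# a linear periodontal-ligament (PDL) spring, and an applied aligner force
# produces a proportional displacement. The real tool builds patient-specific
# finite-element models from CT scans; every value below is an illustrative
# assumption, not data from the paper.

teeth = {
    # tooth id: assumed PDL stiffness in N/mm
    "upper_incisor": 8.0,
    "upper_canine": 12.0,
    "upper_premolar": 15.0,
}

aligner_forces_N = {
    # assumed net force the aligner applies to each tooth
    "upper_incisor": 0.5,
    "upper_canine": 0.3,
    "upper_premolar": 0.1,
}

for tooth, stiffness in teeth.items():
    force = aligner_forces_N[tooth]
    displacement_mm = force / stiffness        # linear spring model: x = F / k
    flag = "check fit" if displacement_mm > 0.05 else "ok"
    print(f"{tooth:15s} force {force:.2f} N -> predicted move {displacement_mm:.3f} mm ({flag})")
```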

Researchers created a computer model that creates accurate 3D simulations of an individual patient’s jaw, and which dentists and technicians can use to plan the best possible treatment.

To create these simulations, the researchers mapped sets of human teeth using detailed CT scans of the teeth and of the small, fine structures between the jawbone and the teeth known as periodontal ligaments, a kind of fiber-rich connective tissue that holds teeth firmly in the jaw. This type of precise digital imitation is referred to as a digital twin, and in this context the researchers built up a database of ‘digital dental patients’. But they didn’t stop there. The database also contains other digital patient types that could one day be of use elsewhere in the healthcare sector:

“Right now, we have a database of digital patients that, besides simulating aligner designs, can be used for hip implants, among other things. In the long run, this could make life easier for patients and save resources for society,” says Kenny Erleben.

The area of research that makes use of digital twins is relatively new and, for the time being, Professor Erleben’s database of virtual patients is a world leader. However, the database will need to get even bigger if digital twins are to truly take root and benefit the healthcare sector and society.

“More data will allow us to simulate treatments and adapt medical devices so as to more precisely target patients across entire populations,” says Professor Erleben.

Furthermore, the tool must clear various regulatory hurdles before it is rolled out for orthodontists. This is something that the researchers hope to see in the foreseeable future. A digital twin is a virtual model that lives in the cloud, and is designed to accurately mirror a human being, physical object, system, or real-world process.

“The virtual model can answer what’s happening in the real world, and do so instantly. For example, one can ask what would happen if you pushed on one tooth and get answers with regards to where it would move and how it would affect other teeth. This can be done quickly, so that you know what’s happening. Today, weeks must pass before finding out whether a desired effect has been achieved,” says Professor Kenny Erleben.

Digital twins can be used to plan, design, and optimize, and can therefore be used to operate companies, robots, and factories, with many further applications in the energy, healthcare and other sectors.

One of the goals of working with digital twins at the Department of Computer Science is to be able to create simulations of populations, for example in the healthcare sector. When working with a medical product, virtual people can be exposed to it and their reactions tested in various situations. A simulation provides a picture of what would happen to an individual, and ultimately to an entire population.

Biomimetic olfactory chips based on large-scale monolithically integrated nanotube sensor arrays

by Chen Wang, Zhesi Chen, Chak Lam Jonathan Chan, Zhu’an Wan, Wenhao Ye, Wenying Tang, Zichao Ma, Beitao Ren, Daquan Zhang, Zhilong Song, Yucheng Ding, Zhenghao Long, Swapnadeep Poddar, Weiqi Zhang, Zixi Wan, Feng Xue, Suman Ma, Qingfeng Zhou, Geyu Lu, Kai Liu, Zhiyong Fan in Nature Electronics

A research team led by the School of Engineering of the Hong Kong University of Science and Technology (HKUST) has addressed the long-standing challenge of creating artificial olfactory sensors with arrays of diverse high-performance gas sensors. Their newly developed biomimetic olfactory chips (BOC) are able to integrate nanotube sensor arrays on nanoporous substrates with up to 10,000 individually addressable gas sensors per chip, a configuration that is similar to how olfaction works for humans and other animals.

For decades, researchers worldwide have been developing artificial olfaction and electronic noses (e-noses) with the aim of emulating the intricate mechanism of the biological olfactory system to effectively discern complex odorant mixtures. Yet major challenges in their development lie in the difficulty of miniaturizing the system and increasing its recognition capabilities for determining the exact gas species and their concentrations within complex odorant mixtures.

To tackle these issues, the research team led by Prof. FAN Zhiyong, Chair Professor at HKUST’s Department of Electronic & Computer Engineering and Department of Chemical & Biological Engineering, used an engineered material composition gradient that allows for wide arrays of diverse sensors on one small nanostructured chip. Leveraging the power of artificial intelligence, their biomimetic olfactory chips exhibit exceptional sensitivity to various gases with excellent distinguishability for mixed gases and 24 distinct odors. With a vision to expand their olfactory chip’s applications, the team also integrated the chips with vision sensors on a robot dog, creating a combined olfactory and visual system that can accurately identify objects in blind boxes.
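The downstream pattern-recognition step behind any e-nose can be sketched in a few lines: each odor excites the sensor array with a characteristic response pattern, and a simple classifier matches new readings against learned patterns. The numpy example below uses synthetic responses and a nearest-centroid classifier as stand-ins; the chip’s actual sensing physics and AI models are not reproduced here.

```python
import numpy as np

# Minimal sketch of the pattern-recognition step behind an e-nose: each odor
# excites the sensor array with a characteristic response pattern, and a
# nearest-centroid classifier identifies new samples. Synthetic responses stand
# in for real chip data; the study's own AI models are not detailed here.

rng = np.random.default_rng(0)
n_sensors = 100                      # a small slice of a large sensor array

# assumed characteristic response pattern ("fingerprint") per odor
odor_fingerprints = {name: rng.uniform(0.1, 1.0, n_sensors)
                     for name in ["coffee", "ethanol", "ammonia"]}

def measure(odor, noise=0.05):
    """Simulate one sniff: the fingerprint plus sensor noise."""
    return odor_fingerprints[odor] + noise * rng.standard_normal(n_sensors)

# "Training": average a few noisy sniffs per odor into centroids
centroids = {name: np.mean([measure(name) for _ in range(20)], axis=0)
             for name in odor_fingerprints}

def classify(sample):
    return min(centroids, key=lambda name: np.linalg.norm(sample - centroids[name]))

# Evaluate on fresh noisy samples
correct = sum(classify(measure(name)) == name
              for name in odor_fingerprints for _ in range(50))
print(f"accuracy on synthetic sniffs: {correct / 150:.0%}")
```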

Fabrication process of the biomimetic olfactory chip (BOC).

The development of the biomimetic olfactory chips will not only improve the existing broad applications of artificial olfaction and e-nose systems in food, environmental, medical and industrial process control, but also open up new possibilities in intelligent systems, such as advanced robots and portable smart devices, for applications in security patrols and rescue operations.

For example, in their applications in real-time monitoring and quality control, the biomimetic olfactory chips can be used to detect and analyze specific odors or volatile compounds associated with different stages of industrial processes to ensure safety; detect any abnormal or hazardous gases in environmental monitoring; and identify leakage in pipes to facilitate timely repair.

The technology presented in this study is a pivotal breakthrough in the digitization of odor. While the digitization of visual information has become ubiquitous, thanks to modern and mature imaging technologies, scent-based information has so far remained largely untapped because advanced odor sensors have been lacking. The work conducted by Prof. Fan’s team paves the way for biomimetic odor sensors with immense potential. With further advancements, these sensors could become as widespread as the miniaturized cameras in cell phones and portable electronics, enriching and enhancing people’s quality of life.

“In the future, with the development of suitable bio-compatible materials, we hope that the biomimetic olfactory chip can also be placed on human body to allow us to smell odor that normally cannot be smelled. It can also monitor the abnormalities in volatile organic molecules in our breath and emitted by our skin, to warn us on potential diseases, reaching further potential of biomimetic engineering,” said Prof. Fan.

Subscribe to Paradigm!

Medium. Twitter. Telegram. Telegram Chat. Reddit. LinkedIn.

Main sources

Research articles

Science Robotics

Science Daily

IEEE Spectrum
