Deep Science: Introspective, detail-oriented and disaster-chasing AIs

Research papers come out far too frequently for anyone to read them all. That’s especially true in the field of machine learning, which now affects (and produces papers in) practically every industry and company. This column aims to collect some of the most relevant recent discoveries and papers — particularly in, but not limited to, artificial intelligence — and explain why they matter.

It takes an emotionally mature AI to admit its own mistakes, and that’s exactly the kind of AI this project from the Technical University of Munich aims to create. Maybe not the emotion, exactly, but the ability to recognize and learn from mistakes, specifically in self-driving cars. The researchers propose a system in which the car would look at all the times in the past when it has had to relinquish control to a human driver and thereby learn its own limitations — what they call “introspective failure prediction.”

For instance, if there are a lot of cars ahead, the autonomous vehicle’s brain could use its sensors and logic to make a decision de novo about whether a given approach will work or whether none will. But the TUM team says that by simply comparing new situations to old ones, the car can reach a decision much faster on whether it will need to disengage. Saving six or seven seconds here could make all the difference for a safe handover.
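
The paper’s exact method isn’t reproduced here, but the gist, judging a new situation by its similarity to logged disengagements, can be sketched in a few lines. Everything below (the feature encoding, the k and threshold values, the class name) is illustrative rather than taken from the TUM system:

```python
import numpy as np

class IntrospectiveFailurePredictor:
    """Predict an imminent disengagement by comparing the current situation
    to logged situations that previously forced a human handover.
    (Hypothetical sketch, not the TUM implementation.)"""

    def __init__(self, past_features, past_disengaged, k=5, threshold=0.6):
        # past_features: (N, D) array of sensor-derived feature vectors
        # past_disengaged: (N,) booleans, True where a human had to take over
        self.past_features = np.asarray(past_features, dtype=float)
        self.past_disengaged = np.asarray(past_disengaged, dtype=bool)
        self.k = k
        self.threshold = threshold

    def will_likely_disengage(self, current_features):
        # Distance from the current situation to every logged one.
        dists = np.linalg.norm(self.past_features - current_features, axis=1)
        nearest = np.argsort(dists)[: self.k]
        # Fraction of the k most similar past situations that ended in a handover.
        failure_rate = self.past_disengaged[nearest].mean()
        return failure_rate >= self.threshold

# Hypothetical usage: features might encode traffic density, speed, visibility, etc.
predictor = IntrospectiveFailurePredictor(
    past_features=np.random.rand(1000, 4),
    past_disengaged=np.random.rand(1000) < 0.1,
    k=7,
)
if predictor.will_likely_disengage(np.array([0.9, 0.4, 0.2, 0.7])):
    print("Alert driver early: likely handover ahead")
```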

It’s important for robots and autonomous vehicles of all types to be able to make decisions without phoning home, especially in combat, where decisive and concise movements are necessary. The Army Research Lab is looking into ways in which ground and air vehicles can interact autonomously, allowing, for instance, a mobile landing pad that drones can land on without needing to coordinate, ask permission or rely on precise GPS signals.

Their solution, at least for the purposes of testing, is actually rather low tech. The ground vehicle has a landing area on top painted with an enormous QR code, which the drone can see from a fairly long way off. The drone can track the exact location of the pad totally independently. In the future, the QR code could be done away with and the drone could identify the shape of the vehicle instead, presumably using some best-guess logic to determine whether it’s the one it wants.
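
For a sense of how simple the vision side of this can be, here is a rough sketch using OpenCV’s stock QR detector. The payload string, camera setup and flight-controller handoff are all hypothetical, since only the broad outline of the ARL setup is known:

```python
import cv2
import numpy as np

detector = cv2.QRCodeDetector()

def locate_pad(frame):
    """Return the pad's center in image coordinates, or None if not visible."""
    data, points, _ = detector.detectAndDecode(frame)
    if points is None or data != "LANDING_PAD_01":  # hypothetical payload
        return None
    corners = points.reshape(-1, 2)
    center = corners.mean(axis=0)                    # pixel center of the code
    size = np.linalg.norm(corners[0] - corners[2])   # crude proxy for distance
    return center, size

cap = cv2.VideoCapture(0)  # downward-facing camera
while True:
    ok, frame = cap.read()
    if not ok:
        break
    result = locate_pad(frame)
    if result is not None:
        center, size = result
        # Feed the offset from the image center into the flight controller
        # as a horizontal correction; 'size' grows as the drone descends.
        print("pad at", center, "apparent size", size)
cap.release()
```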

Illustration showing how an AI tracks cells through a microscope.

Image Credits: Nagoya City University

In the medical world, AI is being put to work not on tasks that are especially difficult but on ones that are tedious for people to do. A good example of this is tracking the activity of individual cells in microscopy images. It’s not a superhuman task to look at a few hundred frames spanning several depths of a petri dish and track the movements of cells, but that doesn’t mean grad students like doing it.

This software from researchers at Nagoya City University in Japan does it automatically using image analysis and the capability (much improved in recent years) of understanding objects over a period of time rather than just in individual frames. Read the paper here, and check out the extremely cute illustration showing off the tech at right … more research organizations should hire professional artists.
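
The Nagoya software’s actual model is more sophisticated, but the core task, linking detections across frames into continuous tracks, can be illustrated with a bare-bones nearest-neighbor matcher. The distance threshold and data layout here are invented for the example:

```python
import numpy as np

def link_cells(frames, max_dist=15.0):
    """Link detected cell centroids across frames into tracks by
    nearest-neighbor matching -- a bare-bones stand-in for the kind of
    temporal association the Nagoya software automates.

    frames: list of (N_i, 2) arrays of (x, y) cell centroids per frame.
    Returns a list of tracks, each a list of (frame_index, centroid).
    """
    tracks = [[(0, c)] for c in frames[0]]
    for t, centroids in enumerate(frames[1:], start=1):
        unclaimed = list(range(len(centroids)))
        for track in tracks:
            last_frame, last_pos = track[-1]
            if last_frame != t - 1 or not unclaimed:
                continue  # track already ended, or nothing left to claim
            dists = [np.linalg.norm(centroids[j] - last_pos) for j in unclaimed]
            best = int(np.argmin(dists))
            if dists[best] <= max_dist:
                j = unclaimed.pop(best)
                track.append((t, centroids[j]))
        # Any centroid nobody claimed starts a new track (a cell entering view).
        tracks.extend([[(t, centroids[j])] for j in unclaimed])
    return tracks
```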

This process is similar to tracking moles and other skin features on people at risk for melanoma. While they might see a dermatologist every year or so to find out whether a given spot seems sketchy, the rest of the time they must track their own moles and freckles in other ways. That’s hard when the spots are somewhere hard to see, like one’s back.

RPA market surges as investors, vendors capitalize on pandemic-driven tech shift

When UiPath filed its S-1 last week, it was a watershed moment for the robotic process automation (RPA) market. The company, which first appeared on our radar for a $30 million Series A in 2017, has so far raised an astonishing $2 billion while still private. In February, it was valued at $35 billion when it raised $750 million in its latest round.

RPA and process automation came to the fore during the pandemic as companies took steps to digitally transform. When employees couldn’t be in the same office together, it became crucial to cobble together more automated workflows that required fewer people in the loop.

When UiPath raised money in 2017, RPA was not well known in enterprise software circles even though it had already been around for several years. The category was gaining popularity by that point because it addressed automation in a legacy context. That meant companies with deep legacy technology — practically everyone not born in the cloud — could automate across older platforms without ripping and replacing them, an expensive and risky undertaking that most CEOs would rather avoid.

RPA has enabled executives to provide a level of workflow automation, a taste of the modern. It essentially buys them time to update systems to more modern approaches while reducing the large number of mundane manual tasks that are part of just about every industry’s workflow.

While some people point to RPA as job-elimination software, it also provides a way to liberate people from some of the most mind-numbing and mundane chores in the organization. The argument goes that this frees up employees for higher-level tasks.

As an example, RPA could take advantage of older workflow technologies like OCR (optical character recognition) to read a number from a form, enter the data into a spreadsheet, generate an invoice, send it for printing and mailing, and post a Slack message to the accounting department that the task has been completed.
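
As a very rough sketch of what such a bot boils down to under the hood, here is that workflow written out with off-the-shelf Python libraries rather than an RPA product. The file paths, webhook URL, "TOTAL:" field format and invoice numbering are all placeholders:

```python
import pytesseract          # OCR (Tesseract wrapper)
import requests
from PIL import Image
from openpyxl import load_workbook

def process_form(form_path, ledger_path, slack_webhook):
    # 1. Read the amount off a scanned form with OCR.
    text = pytesseract.image_to_string(Image.open(form_path))
    amount = float(text.split("TOTAL:")[1].split()[0])  # assumes "TOTAL: 123.45"

    # 2. Append it to a spreadsheet.
    wb = load_workbook(ledger_path)
    wb.active.append([form_path, amount])
    wb.save(ledger_path)

    # 3. Generate an invoice (stubbed out here) and queue it for mailing.
    invoice_id = f"INV-{abs(hash(form_path)) % 10000}"  # placeholder numbering

    # 4. Tell accounting the task is done.
    requests.post(slack_webhook, json={
        "text": f"Processed {form_path}: ${amount:.2f}, invoice {invoice_id} sent to print."
    })

process_form("scans/form_001.png", "books/ledger.xlsx",
             "https://hooks.slack.com/services/T000/B000/XXXX")
```

The point of RPA products is that a business analyst can assemble this visually, without writing or maintaining the glue code above.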

We’re going to take a deep dive into RPA and the larger process automation space — explore the market size and dynamics, look at the key players and the biggest investors, and finally, try to chart out where this market might go in the future.

Meet the vendors

UiPath is clearly an RPA star, with a significant market share lead of 27.1%. Automation Anywhere is in second place with 19.4% and Blue Prism is third with 10.3%, all according to IDC’s July 2020 report, the last time the firm reported on the market.

Two other players with significant market share are worth mentioning: WorkFusion, with 6.8%, and NTT, with 5%.

Deep Science: AI adventures in arts and letters

There’s more AI news out there than anyone can possibly keep up with. But you can stay tolerably up to date on the most interesting developments with this column, which collects AI and machine learning advancements from around the world and explains why they might be important to tech, startups or civilization.

To begin on a lighthearted note: The ways researchers find to apply machine learning to the arts are always interesting — though not always practical. A team from the University of Washington wanted to see if a computer vision system could learn to tell what is being played on a piano just from an overhead view of the keys and the player’s hands.

Audeo, the system trained by Eli Shlizerman, Kun Su and Xiulong Liu, watches video of piano playing and first extracts a simple, piano-roll-like sequence of key presses. It then adds expression in the form of the length and strength of the presses, and finally polishes the result for playback through a MIDI synthesizer. The results are a little loose but definitely recognizable.
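
The learned parts of Audeo (reading the keys from video and predicting expression) are the hard bit, but the final stage, turning a piano roll into playable MIDI, is mechanical enough to sketch. This version assumes a binary 128-pitch roll and a fixed velocity, which is exactly the expressiveness Audeo’s middle stage exists to improve on:

```python
import numpy as np
import pretty_midi

def roll_to_midi(piano_roll, fps=25, out_path="performance.mid"):
    """Convert a binary piano roll (128 pitches x T frames) into a MIDI file.
    Only the final synthesis step of a pipeline like Audeo's; the learned
    models are stood in for here with a fixed velocity."""
    pm = pretty_midi.PrettyMIDI()
    piano = pretty_midi.Instrument(program=0)  # acoustic grand
    for pitch in range(128):
        start = None
        for t, on in enumerate(piano_roll[pitch]):
            if on and start is None:
                start = t / fps                 # note-on
            elif not on and start is not None:  # note-off
                piano.notes.append(pretty_midi.Note(
                    velocity=80, pitch=pitch, start=start, end=t / fps))
                start = None
        if start is not None:                   # note held to the last frame
            piano.notes.append(pretty_midi.Note(
                velocity=80, pitch=pitch, start=start,
                end=len(piano_roll[pitch]) / fps))
    pm.instruments.append(piano)
    pm.write(out_path)

# Demo: a one-second middle C at 25 frames per second.
roll = np.zeros((128, 50), dtype=bool)
roll[60, 0:25] = True
roll_to_midi(roll, out_path="demo.mid")
```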

Diagram showing how video of a piano player's hands on the keys is turned into MIDI sequences.

Image Credits: Shlizerman, et al.

“To create music that sounds like it could be played in a musical performance was previously believed to be impossible,” said Shlizerman. “An algorithm needs to figure out the cues, or ‘features,’ in the video frames that are related to generating music, and it needs to ‘imagine’ the sound that’s happening in between the video frames. It requires a system that is both precise and imaginative. The fact that we achieved music that sounded pretty good was a surprise.”

Also from the field of arts and letters is this fascinating research into the computational unfolding of ancient letters too delicate to handle. The MIT team was looking at “locked” letters from the 17th century, folded and sealed so intricately that removing and flattening a letter might permanently damage it. Their approach was to X-ray the letters and set a new, advanced algorithm to work deciphering the resulting imagery.

Diagram showing X-ray views of a letter and how it is analyzed to virtually unfold it. Image Credits: MIT

“The algorithm ends up doing an impressive job at separating the layers of paper, despite their extreme thinness and tiny gaps between them, sometimes less than the resolution of the scan,” MIT’s Erik Demaine said. “We weren’t sure it would be possible.” The work may be applicable to many kinds of documents that are difficult for simple X-ray techniques to unravel. It’s a bit of a stretch to categorize this as “machine learning,” but it was too interesting not to include. Read the full paper at Nature Communications.

Diagram showing how reviews of electric car charge points are analyzed and turned into useful data.

Image Credits: Asensio, et al.

You arrive at a charge point for your electric car and find it out of service. You might even leave a bad review online. In fact, thousands of such reviews exist, and together they constitute a potentially very useful map for municipalities looking to expand electric vehicle infrastructure.

Georgia Tech’s Omar Asensio trained a natural language processing model on such reviews, and it soon became expert at parsing them by the thousands, surfacing insights such as where outages were common, comparative costs and other factors.
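
Asensio’s model was purpose-built and trained on thousands of labeled reviews; as a toy illustration of the general approach, a simple bag-of-words classifier can already separate outage complaints from ordinary gripes. The example reviews and labels below are invented:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented training data: 1 = review reports an outage or fault, 0 = it doesn't.
reviews = [
    "Station was dead, no power at all",
    "Charged quickly, great location",
    "Screen frozen, could not start a session",
    "A bit pricey but worked fine",
]
labels = [1, 0, 1, 0]

# TF-IDF over word unigrams and bigrams, fed into a logistic regression.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(reviews, labels)

print(clf.predict(["charger out of service again"]))  # -> [1]
```

Aggregating predictions like these over geotagged reviews is what turns scattered complaints into the infrastructure map the research describes.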