Carbon capture is headed for the high seas

Unless you live near a port, you probably don’t think much about the tens of thousands of container ships tearing through the seas, hauling some 1.8 billion metric tons of stuff each year. Yet these vessels run on some of the dirtiest fuel there is, spewing more greenhouse gases than airplanes do in the process. The industry is exploring alternative fuels and electrification to solve the problem for next-generation ships, but in the meantime a Y Combinator-backed startup is gearing up to (hopefully) help decarbonize the big boats that are already in the water.

London-based Seabound is currently prototyping carbon capture equipment that connects to ships’ smokestacks, using a “lime-based approach” to cut carbon emissions by as much as 95%, cofounder and CEO Alisha Fredriksson said in a call with TechCrunch. The startup’s tech works by routing the exhaust into a container filled with porous calcium oxide pebbles, which in turn “bind to carbon dioxide to form calcium carbonate” — essentially limestone, per Fredriksson.
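For a sense of the quantities involved, here is a back-of-the-envelope stoichiometry sketch in Python. The molar masses are standard chemistry; the voyage emissions figure is invented for illustration and is not a Seabound number. It also shows why the captured carbon comes back to port as heavier limestone cargo.

```python
# CaO + CO2 -> CaCO3: how much quicklime binds a given mass of CO2, and how much
# limestone results. The 1,000-tonne CO2 figure below is a made-up example.

M_CAO = 56.08     # g/mol, calcium oxide
M_CO2 = 44.01     # g/mol, carbon dioxide
M_CACO3 = 100.09  # g/mol, calcium carbonate

def cao_needed_tonnes(co2_tonnes: float) -> float:
    """Tonnes of CaO required to bind a given mass of CO2 (1:1 molar ratio)."""
    return co2_tonnes * M_CAO / M_CO2

def caco3_produced_tonnes(co2_tonnes: float) -> float:
    """Tonnes of limestone produced from that CO2."""
    return co2_tonnes * M_CACO3 / M_CO2

if __name__ == "__main__":
    co2 = 1000.0  # hypothetical tonnes of CO2 captured on one voyage
    print(f"CaO needed:     {cao_needed_tonnes(co2):.0f} t")   # ~1274 t
    print(f"CaCO3 produced: {caco3_produced_tonnes(co2):.0f} t")  # ~2274 t
```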

Though carbon capture has yet to really catch on for ships, Seabound is just one of the companies out to prove the tech can eventually scale. Others, including Japanese shipping firm K Line and Netherlands-based Value Maritime, are developing their own carbon-capture tech for ships, typically using the better-established, solvent-based approach (which is increasingly used in factories). Yet this comparatively tried-and-true method demands more space and energy aboard ships, because the process of isolating the CO2 happens on the vessel, according to Fredriksson.

In contrast, Seabound intends to process the CO2 on land, if at all. When the ships return from their journey, the limestone can be sold as is or separated via heat. In the latter case, the calcium oxide would be reused and the carbon sold for use or sequestration, per Fredriksson, who previously helped build maritime fuel startup Liquid Wind. Her cofounder, CTO Roujia Wen, previously worked on AI products at Amazon.

Seabound says it has signed six letters of intent with “major shipowners,” and it aims to trial the tech aboard ships beginning next year. To get there, the company has secured $4.4 million in a seed round led by Chris Sacca’s Lowercarbon Capital. Several other firms also chipped in on the deal, including Eastern Pacific Shipping, Emles Venture Partners, Hawktail, Rebel Fund and Soma Capital.

Beyond carbon capture, another Y Combinator-backed startup is setting out to decarbonize existing ships via a novel battery-swapping scheme. New Orleans-based Fleetzero aims to power electrified ships using shipping container-sized battery packs, which could be recharged through a network of charging stations at small ports.

Behold NeuroMechFly, the best fruit fly simulator yet

Drosophila melanogaster, the common fruit fly, is in some ways a simple creature. But in others it is so complex that, as with any form of life, we are only scratching the surface of understanding it. Researchers have taken a major step with D. melanogaster by creating the most accurate digital twin yet — at least in how it moves and, to a certain extent, why.

NeuroMechFly, as the researchers at EPFL call their new model, is a “morphologically realistic biomechanical model” based on careful scans and close observation of actual flies. The result is a 3D model and movement system that, when prompted, does things like walk around or respond to certain basic stimuli pretty much as a real fly would.

To be clear, this isn’t a complete cell-by-cell simulation, which we’ve seen some progress on in the last few years with much smaller microorganisms. It doesn’t simulate hunger, vision, or any sophisticated behaviors — not even how it flies, only how it walks along a surface and grooms itself.

What’s so hard about that, you ask? Well, it’s one thing to approximate this type of movement or behavior and make a little 3D fly that moves more or less like a real one. It’s another to do so to a precise degree in a physically simulated environment, including a biologically accurate exoskeleton, muscles, and a neural network analogous to the fly’s that controls them.

To make this very precise model, they started with a CT scan of a fly to create the morphologically realistic 3D mesh. Then they recorded a fly walking in carefully controlled circumstances and tracked its precise leg movements. EPFL researchers then needed to model exactly how those movements corresponded to the physically simulated “articulating body parts, such as head, legs, wings, abdominal segments, proboscis, antennae, halteres,” the last of which are small motion-sensing organs that help during flight.

Image Credits: Pavan Ramdya (EPFL)

They showed that these worked by bringing the precise motions of the observed fly into a simulation environment and replaying them with the simulated fly — the real movements mapped correctly onto the model’s. Then they demonstrated that they could create new gaits and movements based on these, letting the fly run faster or in a more stable way than what they had observed.
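That kinematic replay step is easy to picture in code. Below is a minimal sketch of driving a physically simulated articulated body with recorded joint angles and letting the physics engine resolve contacts and forces. NeuroMechFly’s actual model and code live on GitHub; here a stock PyBullet robot stands in for the fly mesh, and the “recorded” angles are placeholder zeros.

```python
# Kinematic replay sketch: push each simulated joint toward a recorded pose,
# step the physics, repeat. Stand-in assets only; not NeuroMechFly's own API.
import numpy as np
import pybullet as p
import pybullet_data

p.connect(p.DIRECT)                                   # headless physics server
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -9.81)
p.loadURDF("plane.urdf")
body = p.loadURDF("r2d2.urdf", basePosition=[0, 0, 0.5])  # stand-in articulated body

n_joints = p.getNumJoints(body)
recorded = np.zeros((1000, n_joints))                 # placeholder tracked joint angles (rad)

p.setTimeStep(1e-3)
for frame in recorded:
    for j, angle in enumerate(frame):
        # Position control nudges the joint toward the recorded angle; the
        # engine computes the resulting forces and contacts.
        p.setJointMotorControl2(body, j, p.POSITION_CONTROL, targetPosition=float(angle))
    p.stepSimulation()
p.disconnect()
```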

Image Credits: Pavan Ramdya (EPFL)

Not that they’re improving on nature, exactly; they’re just showing that the simulation of the fly’s movement extended to other, more extreme examples. Their model was even robust against virtual projectiles…to a certain extent, as you can see in the animation above.

“These case studies built our confidence in the model. But we are most interested in when the simulation fails to replicate animal behavior, pointing out ways to improve the model,” said EPFL’s Pavan Ramdya, who leads the group that built the simulator (and other D. melanogaster–related models). Seeing where their simulation breaks down shows where there’s work to do.

“NeuroMechFly can increase our understanding of how behaviors emerge from interactions between complex neuromechanical systems and their physical surroundings,” reads the abstract of the paper published last week in Nature Methods. By better understanding how and why a fly moves the way it does, we can understand the systems that underlie it better as well, producing insights in other areas (fruit flies are among the most used experimental animals). And of course if we ever wanted to create an artificial fly for some reason, we would definitely want to know how it works first.

While NeuroMechFly is in some ways a huge advance in the field of digitally simulating life, it’s still (as its creators would be the first to acknowledge) incredibly limited, focusing solely on specific physical processes and not on the many other aspects of the tiny body and mind that make a Drosophila a Drosophila. You can check out the code and perhaps contribute over at GitHub or Code Ocean.

Isabl’s rapid whole-genome analysis opens the playbook for cancer treatment

Every cancer is unique because every person is unique, and one of the most important weapons in any cancer battle is information. Isabl offers that in abundance through rapid sequencing of cancer cells’ entire genomes, potentially showing which therapies will and won’t be effective within days. The company has received a breakthrough designation from the FDA and raised $3 million to bring its approach to market.

The last ten years have brought numerous medical advances due to the commoditization of genomic processes from sequencing to analysis, and cancer treatment is no exception. In fact, because cancer is, to simplify, genetic mutation that has gotten out of hand, understanding those genes is an especially promising line of research.

Panel tests look within the DNA of cancerous cells for mutations in a selection of several hundred genes known to affect prognosis and clinical strategy. For instance, a cancer may have certain mutations that render it susceptible to radiation treatment but resistant to chemo, or vice versa — it’s incredibly helpful to know which.

Isabl co-founder and CEO Elli Papaemmanuil explains that however helpful panel tests are, they’re only the beginning.

“These tests have been designed very carefully to look for the most common mutations, and they have revolutionized cancer diagnosis for patients with common cancers,” she said. “But patients with rare cancers — and what we define as a rare cancer is still a third of patients — don’t benefit from them.”

Even many with common cancers may find that their condition does not involve mutations of these most predictive genes. The relevant genes are somewhere among the other two billion base pairs — current tests only look at about 1 percent of the genome.

While the technology exists to look at that other 99 percent, it has historically been expensive and slow compared with panels, and analysis of the resulting large body of data was likewise difficult and time-consuming. But Isabl’s tests show that it’s definitely worthwhile.

Diagram showing information (groups, individuals, cells) going into an analysis.

Image Credits: Isabl

“It turns out that whole-genome sequencing can detect many more clinically relevant findings — results we can act on today. And what we’ve done is develop a platform that lets us summarize it in a way that doctors can read and use, in a day,” Papaemmanuil said. They call it a “clinically actionable whole genome and transcriptome test,” or cWGTS.

The company was formed out of research Papaemmanuil did at Memorial Sloan Kettering, a cancer care and science nexus in New York. “You could see all these successes from panel testing, then all these patients who weren’t benefiting. But in my lab we had the tech and the know-how,” she recalled. They collected and combined three different datasets: the germline (i.e., the patient’s) genome, the tumor’s genome, and the tumor’s transcriptome, essentially what the body produces by transcribing the DNA.

“This gives a really full picture of the profile of the tumor,” she said. “Rather than having a classifier or a model that annotates the mutations [i.e. an automated panel test], we have analytics that integrate those three layers to interpret the role of the mutation and its relevance to each tumor type.”
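To make the “three layers” concrete, here is a toy sketch (emphatically not Isabl’s pipeline) of one way germline variants, tumor variants, and expression data can be intersected: anything in the tumor but not the germline is treated as somatic, and only variants in genes the tumor actually expresses are kept. The column names, coordinates, and expression cutoff are all illustrative assumptions.

```python
# Toy three-layer intersection: germline genome, tumor genome, tumor transcriptome.
import pandas as pd

germline = pd.DataFrame({"chrom": ["7", "17"], "pos": [55191822, 7675088],
                         "alt": ["T", "A"], "gene": ["EGFR", "TP53"]})
tumor = pd.DataFrame({"chrom": ["7", "12", "17"], "pos": [55191822, 25245350, 7675088],
                      "alt": ["T", "T", "A"], "gene": ["EGFR", "KRAS", "TP53"]})
expression = pd.DataFrame({"gene": ["EGFR", "KRAS", "TP53"], "tpm": [3.0, 85.0, 0.2]})

# Somatic = present in the tumor but absent from the patient's germline.
key = ["chrom", "pos", "alt"]
somatic = tumor.merge(germline[key], on=key, how="left", indicator=True)
somatic = somatic[somatic["_merge"] == "left_only"].drop(columns="_merge")

# Keep somatic variants in genes with meaningful expression (arbitrary TPM >= 1).
actionable = somatic.merge(expression, on="gene").query("tpm >= 1")
print(actionable[["gene", "chrom", "pos", "alt", "tpm"]])
```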

Though it does own the whole process from sampling to report, Isabl’s key advance is data-based, and therefore “there is no technical obstacle to making this solution available today. And we’ve demonstrated we can do it at scale,” Papaemmanuil said. But in the medical world, just because it’s possible doesn’t mean it’s permitted. The FDA has granted the technology “breakthrough” status, which is a fast track — but even the fast track is slow in the federal government.

While full clinical approval is probably 3-5 years away, that’s much faster than the 5-10 years estimated by the industry for this type of application. But research, both for validation and other purposes, is ongoing; the main paper proving out the process was published today in Nature Communications. (Though this study focuses on pediatric and young adult cancers, the technique is not limited to those demographics.)

“The seed round is very much to let us do the roadmap — it’s a good starting point for getting the necessary evidence and approvals,” Papaemmanuil said. “We’re already partnering with cancer centers to do studies, and most importantly, to hear from oncologists on what they need and how they’d like the data.”

From left, Isabl co-founders Andrew Kung, Elli Papaemmanuil, and Juan Santiago Medina.

The $3 million round was led by Two Sigma Ventures, with participation from Y Combinator, Box 1, and other firms. Papaemmanuil’s co-founders are CTO Juan Santiago Medina and Andrew Kung.

She also made it clear that Isabl’s research would be conducted openly — “We have a very strong scientific foundation and will be active in publishing the work. The data needs to be both published and made accessible in a form that will enable further research,” she said. The self-reinforcing play of producing and identifying predictive data could prove an incredibly valuable resource across many disciplines.

Isabl is an example of the power of a more or less pure data play in an industry more frequently associated with advances in the lab — though of course it took a lot of lab work to produce in the first place. But when automation of key processes, in this case DNA sequencing, enables a huge uptick in data capture, there’s always value to be found in it. Here, that value could save many lives.

48 hours left to save $200 on TC Sessions: Mobility

Attention mobility startups, professionals, investors and enthusiasts! You have just 48 hours remaining to save $200 on TC Sessions: Mobility 2022, our first in-person mobility event since 2019. The event takes place on May 18-19 in San Mateo, California, with online analyst commentary on May 20. It’s the must-see mobility event of the year, and you have until Sunday, May 15, to save $200 on a General Admission pass. Once Monday hits, the price for a pass goes up to $495, so take advantage of these savings while they last.

Join us at next week’s event and you’ll walk away with a deeper understanding of trends and market influences that can help you position your business for success. Here’s what serial entrepreneur Parug Demircioglu, CEO at Invemo and a partner at Nito Bikes, told us about his experience.

“We were planning to launch Nito Bikes in the U.S., and the conference was an excellent opportunity to gain a solid grasp of the micromobility space. We heard from industry experts, learned about current and future trends and checked out the competition. I thoroughly enjoyed the experience.”

TC Sessions: Mobility is jam-packed with mobility-focused content, from main stage keynotes to topic-driven breakouts and smaller, more intimate roundtable discussions. There’s something for everyone — click here to view the full agenda.

We not only have fantastic speakers and companies onstage, but we also have the future of mobility on our expo floor, with more than 50 early-stage startups. Get your ticket and get hands-on with the latest and greatest in mobility tech. And who knows, maybe you’ll end the week by meeting your next co-founder.

In fact, you can take that next step in finding your future investor or co-founder on our AI-powered CrunchMatch platform. It’s a smart, targeted and efficient way to meet the right people — in person and/or online — and maximize your time.

Can’t make it in person, but want to soak up all the great content and networking? We’ve got that covered with our Online Only ticket offering. Enjoy recorded content that drops on May 20 and meet fellow attendees on CrunchMatch.

TC Sessions: Mobility 2022 takes place in person on May 18-19 in San Mateo, California, followed by an online event on May 20. Buy your pass before May 15 and you’ll save $200. Now, get ready to connect with the influential people who can help you drive your business forward.

DeepMind’s new AI can perform over 600 tasks, from playing games to controlling robots

The ultimate achievement to some in the AI industry is creating a system with artificial general intelligence (AGI), or the ability to understand and learn any task that a human can. Long relegated to the domain of science fiction, it’s been suggested that AGI would bring about systems with the ability to reason, plan, learn, represent knowledge, and communicate in natural language.

Not every expert is convinced that AGI is a realistic goal — or even possible. But it could be argued that DeepMind, the Alphabet-backed research lab, took a step toward it this week with the release of an AI system called Gato.

Gato is what DeepMind describes as a “general-purpose” system, one that can be taught to perform many different types of tasks. Researchers at DeepMind trained Gato to complete 604 of them, to be exact, including captioning images, engaging in dialogue, stacking blocks with a real robot arm, and playing Atari games.

Jack Hessel, a research scientist at the Allen Institute for AI, points out that a single AI system that can solve many tasks isn’t new. For example, Google recently began using a system in Google Search called multitask unified model, or MUM, which can handle text, images, and videos to perform tasks from finding interlingual variations in the spelling of a word to relating a search query to an image. But what’s potentially new here, Hessel says, is the diversity of the tasks that are tackled and the training method.


DeepMind’s Gato architecture.

“We’ve seen evidence previously that single models can handle surprisingly diverse sets of inputs,” Hessel told TechCrunch via email. “In my view, the core question when it comes to multitask learning … is whether the tasks complement each other or not. You could envision a more boring case if the model implicitly separates the tasks before solving them, e.g., ‘If I detect task A as an input, I will use subnetwork A. If I instead detect task B, I will use different subnetwork B.’ For that null hypothesis, similar performance could be attained by training A and B separately, which is underwhelming. In contrast, if training A and B jointly leads to improvements for either (or both!), then things are more exciting.”

Like all AI systems, Gato learned by example, ingesting billions of words, images from real-world and simulated environments, button presses, joint torques, and more in the form of tokens. These tokens served to represent data in a way Gato could understand, enabling the system to — for example — tease out the mechanics of Breakout, or which combination of words in a sentence might make grammatical sense.
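For a rough idea of how such disparate data can share one token stream, here is a hedged sketch of binning continuous values (say, joint torques) into their own token range, separate from a text tokenizer’s ids. The vocabulary size, bin count, offset, and mu-law constant below are illustrative assumptions, not DeepMind’s exact scheme.

```python
# Squash real-valued signals into [-1, 1], discretize into bins, then shift the
# resulting ids past the (assumed) text vocabulary so the modalities don't collide.
import numpy as np

TEXT_VOCAB = 32_000     # assumed size of the text tokenizer's vocabulary
N_BINS = 1_024          # bins for continuous values
MU = 100.0              # mu-law compression strength

def tokenize_continuous(x: np.ndarray) -> np.ndarray:
    """Map real-valued observations/actions to integer tokens after the text ids."""
    squashed = np.sign(x) * np.log1p(MU * np.abs(x)) / np.log1p(MU)   # -> [-1, 1]
    bins = np.clip(((squashed + 1) / 2 * N_BINS).astype(int), 0, N_BINS - 1)
    return bins + TEXT_VOCAB  # continuous tokens live above the text id range

torques = np.array([-0.7, 0.0, 1.3, -2.5])
print(tokenize_continuous(torques))  # four integer tokens >= 32000
```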

Gato doesn’t necessarily do these tasks well. For example, when chatting with a person, the system often responds with a superficial or factually incorrect reply (e.g., “Marseille” in response to “What is the capital of France?”). In captioning pictures, Gato misgenders people. And the system correctly stacks blocks using a real-world robot only 60% of the time.

But on 450 of the 604 aforementioned tasks, DeepMind claims that Gato performs better than an expert more than half the time.

“If you’re of the mind that we need general [systems], which is a lot of folks in the AI and machine learning area, then [Gato is] a big deal,” Matthew Guzdial, an assistant professor of computing science at the University of Alberta, told TechCrunch via email. “I think people saying it’s a major step towards AGI are overhyping it somewhat, as we’re still not at human intelligence and likely not to get there soon (in my opinion). I’m personally more in the camp of many small models [and systems] being more useful, but there’s definitely benefits to these general models in terms of their performance on tasks outside their training data.”

Curiously, from an architectural standpoint, Gato isn’t dramatically different from many of the AI systems in production today. It shares characteristics with OpenAI’s GPT-3 in the sense that it’s a “Transformer.” Dating back to 2017, the Transformer has become the architecture of choice for complex reasoning tasks, demonstrating an aptitude for summarizing documents, generating music, classifying objects in images, and analyzing protein sequences.


The various tasks that Gato learned to complete.

Perhaps even more remarkably, Gato is orders of magnitude smaller than single-task systems including GPT-3 in terms of the parameter count. Parameters are the parts of the system learned from training data and essentially define the skill of the system on a problem, such as generating text. Gato has just 1.2 billion, while GPT-3 has more than 170 billion.

DeepMind researchers kept Gato purposefully small so the system could control a robot arm in real time. But they hypothesize that — if scaled up — Gato could tackle any “task, behavior, and embodiment of interest.”

Assuming this turns out to be the case, several other hurdles would have to be overcome before Gato could beat cutting-edge single-task systems at specific tasks, such as its inability to learn continuously. Like most Transformer-based systems, Gato’s knowledge of the world is grounded in its training data and remains static. If you ask Gato a date-sensitive question, like who the current president of the U.S. is, chances are it would respond incorrectly.

The Transformer — and Gato, by extension — has another limitation in its context window, or the amount of information the system can “remember” in the context of a given task. Even the best Transformer-based language models can’t write a lengthy essay, much less a book, without failing to remember key details and thus losing track of the plot. The forgetting happens in any task, whether writing or controlling a robot, which is why some experts have called it the “Achilles’ heel” of machine learning.
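A toy illustration of that window: once a task’s token stream outgrows the context, the earliest tokens simply fall out of view. The window size here is arbitrary and tiny, purely to keep the demo readable.

```python
# Fixed-size context as a bounded queue: appending beyond the limit evicts the
# oldest tokens, which is the "forgetting" described above.
from collections import deque

CONTEXT_WINDOW = 8  # real models use thousands of tokens

context = deque(maxlen=CONTEXT_WINDOW)
for token in range(20):        # pretend these are incoming tokens
    context.append(token)

print(list(context))           # [12, 13, ..., 19]; everything earlier is gone
```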

For these reasons and others, Mike Cook, a member of the Knives & Paintbrushes research collective, cautions against assuming Gato is a path to truly general-purpose AI.

“I think the result is open to misinterpretation, somewhat. It sounds exciting that the AI is able to do all of these tasks that sound very different, because to us it sounds like writing text is very different to controlling a robot. But in reality this isn’t all too different from GPT-3 understanding the difference between ordinary English text and Python code,” Cook told TechCrunch via email. “Gato receives specific training data about these tasks, just like any other AI of its type, and it learns how patterns in the data relate to one another, including learning to associate certain kinds of inputs with certain kinds of outputs. This isn’t to say this is easy, but to the outside observer this might sound like the AI can also make a cup of tea or easily learn another ten or fifty other tasks, and it can’t do that. We know that current approaches to large-scale modelling can let it learn multiple tasks at once. I think it’s a nice bit of work, but it doesn’t strike me as a major stepping stone on the path to anything.”

Simulation meets observation in first image of the supermassive black hole at our galaxy’s center

As countless science and general news outlets have reported today, the image of Sagittarius A*, the supermassive black hole at the center of our galaxy, is a fabulous scientific achievement. But one aspect that hasn’t gotten quite as much attention is the central role played by simulations and synthetic data in the discovery.

If you haven’t read about this awesome science news yet, the Event Horizon Telescope’s own post is a great place to get the gist. Based on years of observations from around the globe, a huge team at over a hundred institutions managed to assemble an image of the black hole around which our galaxy rotates, despite its relative closeness and the interference from light-years worth of dust, nebulae and other vagaries of the void.

But this wasn’t just a matter of pointing the telescope in the right direction at the right time. Black holes can’t be observed directly using something like the Hubble or even the still-warming-up Webb. Instead, all kinds of other direct and indirect measurements of the object must be made — how radiation and gravity bend around it and so on.

This means data from dozens of sources must be assembled and reconciled, itself an enormous task and a big part of why observations made in 2017 are only now being published as a final image, which you can see below. But because this project really has no precedent (even the famous M87* image, though superficially similar, used different processes), it was necessary to essentially test multiple possibilities for how the same observations might have been made.

For instance, if it’s “dark” in the middle, is it because there’s something in the way (and there is — about half the galaxy) or because the hole itself has a hole (and it seems to)? The lack of direct observational data makes it hard to say. (Note that the images here don’t simply show an image based on visible light, but the inferred shape based on countless readings of radiation and other measures.)

This is the first image of Sgr A*, the supermassive black hole at the centre of our galaxy.

Image Credits: EHT

Think about viewing an ordinary object from a distance. From straight on it looks like a circle — but does that mean it’s a ball? A plate? A cylinder viewed end on? Here on Earth you might move your head or take a few steps to the side to get a little more info — but try doing that on a cosmic scale! To get effective parallax on a black hole 27,000 light-years away, you’d need to go quite a distance, and probably break the laws of physics in the process. So the researchers needed to use other methods to determine what shapes and phenomena best explained what little they could observe.

To systematically explore and evaluate the imaging algorithms’ design choices and their effects on the resulting image reconstructions, we generated a series of synthetic data sets. The synthetic data were carefully constructed to match properties of Sgr A* EHT measurements. The use of synthetic data enables quantitative evaluation of image reconstruction by comparison to the known ground truth. This in turn enables evaluation of the design choices and imaging algorithms’ performance.

In other words, they generated oceans of data relating to different possible explanations for their observations, and looked at how predictive these simulated black hole environments were.
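As a cartoon of that idea, the sketch below builds two toy source shapes, predicts the sparse, noisy “measurements” each would produce, and asks which shape best explains a synthetic observation. Everything here (the shapes, the sampling, the noise model) is an assumption for illustration; the real EHT pipelines are vastly more involved.

```python
# Compare candidate source morphologies against sparse synthetic measurements.
import numpy as np

rng = np.random.default_rng(0)
N = 64
y, x = np.mgrid[-N//2:N//2, -N//2:N//2]
r = np.hypot(x, y)

models = {
    "ring": ((r > 10) & (r < 14)).astype(float),
    "disc": (r < 14).astype(float),
}

def visibilities(img, uv_idx):
    """Sample the image's 2D Fourier transform at a sparse set of (u, v) points."""
    V = np.fft.fftshift(np.fft.fft2(img))
    return V[uv_idx]

uv_idx = (rng.integers(0, N, 200), rng.integers(0, N, 200))   # sparse coverage
sigma = 5.0
observed = visibilities(models["ring"], uv_idx) + sigma * (
    rng.normal(size=200) + 1j * rng.normal(size=200))

for name, img in models.items():
    chi2 = np.sum(np.abs(observed - visibilities(img, uv_idx)) ** 2) / sigma ** 2
    print(f"{name}: chi^2 = {chi2:.0f}")   # the ring should fit far better
```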

Lia Medeiros from the Institute for Advanced Study, in a very interesting Q&A worth watching in its entirety if you have the time, explained a bit of this with regard to how and why the study looked at the spin of the black hole, and how that relates to the spin of the material around it and to the galaxy at large.

“What was really exciting about this new result, compared to what we did in 2019 for M87, was in paper 5 we actually include several simulations where we explore that [i.e. the spin relationships],” she said. “So, there are simulations where the spin axis of the black hole is not aligned with the spin axis of the matter that is swirling around the black hole, and this is a really new and exciting simulation that was not included in the 2019 publications.”

Image Credits: EHT

Naturally these simulations are unbelievably complicated things that require supercomputers to process, and there’s an art and a science to figuring out how many make sense to do, and how close together they should be. In this case the alignment question being looked at is of inherent scientific value, but could also help interpret, for example, the interference caused by gases and dust swirling around the black hole. If the spin is like this, its gravity would affect the dust like this, meaning the readings should be read like this.

“Our simulations, when we look at the simulations compared with the data, we tend to prefer models that are almost pointed at us — not pointed directly at us, but off by about 30 degrees or so,” Medeiros continued. “And that would indicate that the spin axis of the black hole is not aligned with the spin axis of the galaxy as a whole, and if you believe what I said earlier, the disk does prefer to be aligned with the spin axis of the black hole. It does seem like the disk and black hole are aligned, but that neither are aligned with the galaxy.”

In addition to going after specific aspects like this, there was the more general question of what shape (or “underlying source morphology”) would produce the readings they got: essentially the “ball vs. plate” question, but way, way more complicated.

In one of the papers released today, the team describes building seven different potential morphologies for the black hole, reflecting different arrangements of its matter, from ring to disc and even a sort of binary black hole — why not, right? They simulated how these different shapes would produce different results in their instruments, and compared those with a more computationally (and linguistically) demanding “general relativistic magnetohydrodynamic,” or GRMHD, simulation.

You can see those in a combination of two images from the paper here:

Images of simulated black holes and how their data might appear to sensors on Earth.

Image Credits: EHT

The idea was to find which of the simulations produced results most like those they actually saw, and while there was no runaway winner, the ring and GRMHD sims (which, it must be said, were rather ringlike) produced the most consistent results. This informed the final interpretation of the data and the resultant image. (Note that I am broadly summarizing a wildly complex process here.)

Considering these observations were made some five years ago and much has happened since then, there’s still plenty to investigate and more simulations to run. But they had to hit “print” at some point and the image at top is their most informed interpretation of the data produced. As observations and simulations stack up we can no doubt expect even better ones.

In fact, as the University of Texas at San Antonio’s Richard Anantua put it at the Q&A session, you might even give it a shot yourself.

“If you’re in sixth grade, and you can get access to some of your school’s computers, I think there’s EHT imaging, and we have all sorts of pipelines and tools that you can teach your class,” he said, seemingly only half joking. “The data for some of this is public — so you can start working on this now and by the time you’re in college, you pretty much have an image.”

Less than 7 Days until TC Sessions: Mobility

We’re less than seven days away from TC Sessions: Mobility 2022, our first in-person mobility event since 2019. The event takes place on May 18-19 in San Mateo, California, with online analyst commentary on May 20. It’s the must-see mobility event of the year, and you have until Sunday, May 15, to save $200 on a General Admission pass.

Here’s a quick reminder of what goes on and why you don’t want to miss out. The FOMO is real. You’ll hear and learn from mobility’s leading founders, CEOs, VCs and policymakers as TechCrunch editors shove the hype aside to ask tough, thought-provoking questions during one-on-one interviews, panel discussions and fireside chats.

You’ll walk away with a deeper understanding of trends and market influences that can help you position your business for success. Here’s what serial entrepreneur Parug Demircioglu, CEO at Invemo and a partner at Nito Bikes, told us about his experience.

“We were planning to launch Nito Bikes in the U.S., and the conference was an excellent opportunity to gain a solid grasp of the micromobility space. We heard from industry experts, learned about current and future trends and checked out the competition. I thoroughly enjoyed the experience.”

Don’t skip the smaller, topic-focused roundtable discussions. They let you really dig into a subject, connect with other founders and expand your network. You can find the entire agenda here.

What’s better than watching amazing speakers onstage? How about checking out the latest and greatest in early-stage mobility startups? We’re expecting more than 50 startups in our expo hall, which means opportunities galore to get your hands on the newest in transportation tech. All work and no play? Not a chance.

So you’re inspired by what you’re seeing onstage and in our expo area, and you want to take the next step toward finding your next investor or co-founder, but you’re not sure what to do next. We’ve got that covered. Take it from 2019 TC Sessions: Mobility attendee Karin Maake, senior director of communications at FlashParking:

“TC Sessions Mobility offers several big benefits. First, networking opportunities that result in concrete partnerships. Second, the chance to learn the latest trends and how mobility will evolve. Third, the opportunity for unknown startups to connect with other mobility companies and build brand awareness.”

We have networking opportunities aplenty, whether it be on the expo floor, at one of our intimate, topic-driven roundtable sessions with industry leaders, or on our AI-powered CrunchMatch platform. It’s a smart, targeted and efficient way to meet the right people — in person and/or online — and maximize your time.

Can’t make it in person, but want to soak up all the great content and networking? We’ve got that covered with our Online Only ticket offering. Enjoy recorded content on May 20 and meet fellow attendees on CrunchMatch.

TC Sessions: Mobility 2022 takes place in person on May 18-19 in San Mateo, California, followed by an online event on May 20. Buy your pass before May 15 and you’ll save $200. Now, get ready to connect with the influential people who can help you drive your business forward.

Is your company interested in sponsoring or exhibiting at TC Sessions: Mobility 2022? Contact our sponsorship sales team by filling out this form.

Brightseed’s first AI-detected ‘phytonutrient’ comes to market alongside a $68M B round

Who knows what secrets await in the hearts of plants? Brightseed has found a couple of them, anyway, using an AI-based analysis method and will be bringing its first products to market soon with the help of a $68 million B round.

Brightseed’s thesis is simply that plants almost certainly hold some very helpful and healthy substances, which the company calls phytonutrients, that have yet to be either discovered or popularized. As many of our medicines and vitamins are plant-derived, it’s hardly a controversial idea — but how can you sort through the countless compounds that plants make and are made up of?

The company’s answer is Forager, a machine learning platform that identifies and categorizes plant compounds at a very fast clip; they’ve already mapped two million, considerably more than are characterized in scientific literature.

Of course they’re not all winners, but the system helps pick out ones that are more likely to be beneficially bioactive, either unstudied or analogous to existing compounds.

This process has historically been done in labs, where you painstakingly test thousands of substances to see if there’s any effect, something that takes years and is horribly wasteful. But as we’re seeing elsewhere in the drug discovery space, AI can cut through the noise and eliminate the vast majority of these substances, surfacing the best of the best. Brightseed claims its process is about a hundred times faster, and much cheaper, than standard screening approaches.
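As a hedged illustration of what “analogous to existing compounds” can mean computationally, here is a generic cheminformatics sketch using RDKit fingerprints and Tanimoto similarity. It is not Brightseed’s Forager, and the molecules, reference compound, and cutoff are arbitrary examples.

```python
# Rank candidate compounds by structural similarity to a known bioactive molecule.
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

reference = Chem.MolFromSmiles("CC(=O)OC1=CC=CC=C1C(=O)O")   # aspirin, a known bioactive

candidates = {
    "salicylic acid (willow bark)": "OC(=O)C1=CC=CC=C1O",
    "caffeine": "CN1C=NC2=C1C(=O)N(C(=O)N2C)C",
    "ethanol": "CCO",
}

def fp(mol):
    # Morgan (circular) fingerprint as a fixed-length bit vector.
    return AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048)

ref_fp = fp(reference)
for name, smiles in candidates.items():
    sim = DataStructs.TanimotoSimilarity(ref_fp, fp(Chem.MolFromSmiles(smiles)))
    flag = "candidate" if sim > 0.3 else "skip"              # arbitrary cutoff
    print(f"{name:30s} similarity={sim:.2f} -> {flag}")
```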

For example: Brightseed’s processes have identified a pair of compounds, sourced from cast-off hempseed hulls and black pepper, called N-trans-caffeoyltyramine (NCT) and N-trans-feruloyltyramine (NFT, unfortunately) that “showed a remarkable ability to clear fat from the livers of mice and in human cells.” And these are on their way to being bottled as we speak.

Don’t worry, it’s not just some “all natural herbal weight loss” pill in the making. No one is trying to lose a few pounds specifically from their liver. Though the substance will be getting a “generally recognized as safe” rating from the FDA, it’s a long way from being prescribed for acute conditions.

A scientist weighs out a powder in a plant-filled lab.

Image Credits: Brightseed

That said, there is a huge demand for supplements with potentially large benefits and no known issues. Who reading this doesn’t have a couple bottles of this or that for joint health or better sleep? There is evidence backing these up, just as there is for NCT and NFT, which are in human clinical trials after some basic initial mouse models and human safety testing.

“A growing body of scientific research highlights the central role that natural dietary bioactives play in reducing the risk of many chronic diseases and health conditions, and longevity,” co-founder and CEO Jim Flatt told TechCrunch. “The application of AI to large biology data sets is enabling a new golden era of discovery that can solve our most pressing needs in healthcare, and that is where Brightseed is focused today.”

The AI side of things is going to be expanding to cover more areas of health needs and sets of substances, eventually looking at fungi and bacteria as well as plants. “Dozens are in various stages of validation,” Flatt said.

More immediately, though, and powered by the $68 million, “we are expanding our data business and launching our novel bioactive ingredients business, based on our clinically studied natural bioactives.”

The Brightseed team in a lab.

Image Credits: Brightseed

In other words, commercialization; the company cited Danone, Ocean Spray and Pharmavite as existing partners, though not necessarily just for NCT and NFT products. It isn’t just signing over the rights, either; it’ll be producing the substances itself at a facility in Raleigh, where it plans to hire up and then scale up.

“Forager sees deeper into plants and nature than we’ve ever been able to see before,” said Flatt. “That vision, paired with insight and validation, is unlocking new opportunities for companies across the health continuum, and ultimately a future where health solutions are both natural and scientifically validated.”

The latest round added Temasek as lead investor, with participation from all previous investors: Lewis & Clark AgriFood, Seed 2 Growth Ventures, Horizons Ventures, CGC Ventures, Fifty Years, Germin8 and AgFunder.

(The headline of this article originally misstated the round as an A round; it’s definitely a B round. My mistake.)

Perceptron: AI bias can arise from annotation instructions

Research in the field of machine learning and AI, now a key technology in practically every industry and company, is far too voluminous for anyone to read it all. This column, Perceptron (previously Deep Science), aims to collect some of the most relevant recent discoveries and papers — particularly in, but not limited to, artificial intelligence — and explain why they matter.

This week in AI, a new study reveals how bias, a common problem in AI systems, can start with the instructions given to the people recruited to annotate data from which AI systems learn to make predictions. The coauthors find that annotators pick up on patterns in the instructions, which condition them to contribute annotations that then become over-represented in the data, biasing the AI system toward these annotations.

Many AI systems today “learn” to make sense of images, videos, text, and audio from examples that have been labeled by annotators. The labels enable the systems to extrapolate the relationships between the examples (e.g., the link between the caption “kitchen sink” and a photo of a kitchen sink) to data the systems haven’t seen before (e.g., photos of kitchen sinks that weren’t included in the data used to “teach” the model).

This works remarkably well. But annotation is an imperfect approach — annotators bring biases to the table that can bleed into the trained system. For example, studies have shown that the average annotator is more likely to label phrases in African-American Vernacular English (AAVE), the informal grammar used by some Black Americans, as toxic, leading AI toxicity detectors trained on the labels to see AAVE as disproportionately toxic.

As it turns out, annotators’ predispositions might not be solely to blame for the presence of bias in training labels. In a preprint study out of Arizona State University and the Allen Institute for AI, researchers investigated whether a source of bias might lie in the instructions written by data set creators to serve as guides for annotators. Such instructions typically include a short description of the task (e.g. “Label all birds in these photos”) along with several examples.


Image Credits: Parmar et al.

The researchers looked at 14 different “benchmark” data sets used to measure the performance of natural language processing systems, or AI systems that can classify, summarize, translate, and otherwise analyze or manipulate text. In studying the task instructions provided to the annotators who worked on the data sets, they found evidence that the instructions influenced the annotators to follow specific patterns, which then propagated to the data sets. For example, over half of the annotations in Quoref, a data set designed to test the ability of AI systems to understand when two or more expressions refer to the same person (or thing), start with the phrase “What is the name,” a phrase present in a third of the instructions for the data set.
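The pattern itself is easy to quantify. Here is a small check of how often annotations begin with a phrase lifted from the task instructions; the strings below are invented examples, not actual Quoref data.

```python
# Measure how dominant an instruction-derived prefix is in a set of annotations.
from collections import Counter

annotations = [
    "What is the name of the ship's captain?",
    "What is the name of the protagonist's sister?",
    "Who rescued the crew?",
    "What is the name of the second album?",
]

prefix = "what is the name"   # a phrase appearing in the (hypothetical) instructions
share = sum(a.lower().startswith(prefix) for a in annotations) / len(annotations)
print(f"{share:.0%} of annotations start with '{prefix}'")

# Counting four-word prefixes surfaces dominant patterns without picking one up front.
print(Counter(" ".join(a.lower().split()[:4]) for a in annotations).most_common(2))
```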

The phenomenon, which the researchers call “instruction bias,” is particularly troubling because it suggests that systems trained on biased instruction/annotation data might not perform as well as initially thought. Indeed, the coauthors found that instruction bias leads to overestimates of system performance and that these systems often fail to generalize beyond instruction patterns.

The silver lining is that large systems, like OpenAI’s GPT-3, were found to be generally less sensitive to instruction bias. But the research serves as a reminder that AI systems, like people, are susceptible to developing biases from sources that aren’t always obvious. The intractable challenge is discovering these sources and mitigating the downstream impact.

In a less sobering paper, scientists hailing from Switzerland concluded that facial recognition systems aren’t easily fooled by realistic AI-edited faces. “Morphing attacks,” as they’re called, involve the use of AI to modify the photo on an ID, passport, or other form of identity document for the purposes of bypassing security systems. The coauthors created “morphs” using AI (Nvidia’s StyleGAN 2) and tested them against four state-of-the-art facial recognition systems. The morphs didn’t pose a significant threat, they claimed, despite their true-to-life appearance.

Elsewhere in the computer vision domain, researchers at Meta developed an AI “assistant” that can remember the characteristics of a room, including the location and context of objects, to answer questions. Detailed in a preprint paper, the work is likely a part of Meta’s Project Nazare initiative to develop augmented reality glasses that leverage AI to analyze their surroundings.


Image Credits: Meta

The researchers’ system, which is designed to be used on any body-worn device equipped with a camera, analyzes footage to construct “semantically rich and efficient scene memories” that “encode spatio-temporal information about objects.” The system remembers where objects are and when they appeared in the video footage, and moreover grounds answers to questions a user might ask about the objects in that memory. For example, when asked “Where did you last see my keys?,” the system can indicate that the keys were on a side table in the living room that morning.
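Here is a toy sketch of that kind of object memory, with detection, mapping, and natural language all stubbed out. It is a data-structure illustration only, not Meta’s system.

```python
# Log sightings of labeled objects with a location and timestamp, then answer
# "where did you last see X?" by returning the most recent entry.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Sighting:
    label: str
    location: str
    timestamp: datetime

memory: list[Sighting] = []

def observe(label: str, location: str) -> None:
    """A real system would get these from detection + localization on camera frames."""
    memory.append(Sighting(label, location, datetime.now()))

def where_last_seen(label: str) -> str:
    hits = [s for s in memory if s.label == label]
    if not hits:
        return f"I haven't seen your {label}."
    last = max(hits, key=lambda s: s.timestamp)
    return f"Your {label} was last on the {last.location} at {last.timestamp:%H:%M}."

observe("keys", "side table in the living room")
observe("mug", "kitchen counter")
observe("keys", "desk in the office")
print(where_last_seen("keys"))   # -> desk in the office
```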

Meta, which reportedly plans to release fully-featured AR glasses in 2024, telegraphed its plans for “egocentric” AI last October with the launch of Ego4D, a long-term “egocentric perception” AI research project. The company said at the time that the goal was to teach AI systems to — among other tasks — understand social cues, how an AR device wearer’s actions might affect their surroundings, and how hands interact with objects.

From language and augmented reality to physical phenomena: an AI model has been useful in an MIT study of waves — how they break and when. While it seems a little arcane, the truth is wave models are needed both for building structures in and near the water, and for modeling how the ocean interacts with the atmosphere in climate models.

Image Credits: MIT

Normally waves are roughly simulated by a set of equations, but the researchers trained a machine learning model on hundreds of wave instances in a 40-foot tank of water filled with sensors. By observing the waves and making predictions based on empirical evidence, then comparing that to the theoretical models, the AI aided in showing where the models fell short.

A startup is being born out of research at EPFL, where Thibault Asselborn’s PhD thesis on handwriting analysis has turned into a full-blown educational app. Using algorithms he designed, the app (called School Rebound) can identify habits and corrective measures with just 30 seconds of a kid writing on an iPad with a stylus. These are presented to the kid in the form of games that help them write more clearly by reinforcing good habits.

“Our scientific model and rigor are important, and are what set us apart from other existing applications,” said Asselborn in a news release. “We’ve gotten letters from teachers who’ve seen their students improve by leaps and bounds. Some students even come before class to practice.”

Image Credits: Duke University

Another new finding in elementary schools has to do with identifying hearing problems during routine screenings. These screenings, which some readers may remember, often use a device called a tympanometer, which must be operated by trained audiologists. If one is not available, say in an isolated school district, kids with hearing problems may never get the help they need in time.

Samantha Robler and Susan Emmett at Duke decided to build a tympanometer that essentially operates itself, sending data to a smartphone app where it is interpreted by an AI model. Anything worrying will be flagged and the child can receive further screening. It’s not a replacement for an expert, but it’s a lot better than nothing and may help identify hearing problems much earlier in places without the proper resources.

GPS signals could detect tsunamis better and faster than seismic sensors

GPS networks are already a crucial part of everyday life around the world, but an international team of scientists has found a new, potentially life-saving use for them: tsunami warnings.

Researchers from University College London and universities across Japan studied the ability of the GPS network to detect tsunamis, concluding that instruments can indeed detect the destructive waves from space. They’ve also determined that GPS can provide more detailed information than current detection systems — at an extremely low cost — allowing authorities to issue more accurate warnings in advance of a tsunami’s impact on shore.

Tsunamis are created when ocean water is dramatically displaced by earthquakes, landslides, or volcanic eruptions. In the deep ocean, the waves are usually less than a foot high but travel at speeds of up to 500 miles per hour; as they approach land, they grow in height rapidly before inundating the shoreline. GPS networks can detect these waves long before they reach land.

Though the disturbance at the ocean’s surface is slight, it’s enough to create a ripple effect through the atmosphere. As air is pushed upward, an acoustic wave travels all the way to the ionosphere, some 186 miles above the Earth, amplifying in scale as it travels. There, the density of electrons in the ionosphere is reduced by the wave, which directly affects the radio signals sent from GPS satellites to ground receivers. The researchers have developed a way to interpret the changes in radio signals to glean critical information about tsunamis.
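As a rough illustration of the signal-processing side of that idea, the sketch below flags a short-lived dip in a synthetic total electron content (TEC) series, the quantity GPS receivers can estimate from signal delays. The data, background model, and thresholds are invented for the example; this is not the study’s actual method.

```python
# Detrend a synthetic TEC time series and flag an anomalous depletion.
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(0, 3600, 30.0)                       # one hour of 30-second samples
background = 20 + 0.001 * t                        # slow drift in TEC (TECU)
noise = 0.05 * rng.normal(size=t.size)
dip = -1.5 * np.exp(-((t - 2400) / 120) ** 2)      # tsunami-driven depletion ~40 min in
tec = background + noise + dip

# Remove the slow background with a low-order fit, then flag residuals far outside
# the scatter of the quiet first half-hour.
trend = np.polyval(np.polyfit(t, tec, 2), t)
residual = tec - trend
threshold = 5 * residual[: t.size // 2].std()
alerts = t[residual < -threshold]                  # depletions only
if alerts.size:
    print(f"ionospheric disturbance flagged between t = {alerts.min():.0f} s and {alerts.max():.0f} s")
else:
    print("no disturbance detected")
```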

Animation of particles disturbing the ionosphere above a tsunami.

Image Credits: University College London

Currently, tsunami warnings are issued based on seismic activity. The warnings are not necessarily very accurate, only indicating that a tsunami may happen at some point in the near future, but providing little other detail.

“In 2011, Japan’s warning system underestimated the [Tōhoku] wave’s height. A better warning may have saved lives and reduced the widespread destruction that occurred, allowing people to get to higher ground and further away from the sea,” Professor Serge Guillas of UCL Statistical Science and the Alan Turing Institute and senior author of the paper said in a press release. “Our study, a joint effort by statisticians and space scientists, demonstrates a new method of detecting tsunamis that is low-cost, as it relies on existing GPS networks, and could be implemented worldwide, complementing other ways of detecting tsunamis and improving the accuracy of warning systems.”

The researchers suspect that if GPS data had been used during the Tōhoku disaster, an accurate tsunami warning could have been issued at least 10 minutes before the wave reached land, potentially giving more people time to prepare for impact.

The team believes that with further research, they will be able to more precisely determine the size and shape of tsunamis based on GPS radio signals.

“From my experience of working for the Japanese government in the past and seeing the damage caused by the tsunami, I believe that if this research comes to fruition, it will surely contribute to saving lives,” said Ph.D. researcher Ryuichi Kanai of UCL Statistical Science and the Alan Turing Institute, who co-authored the paper.

The researchers’ study was published in the journal Natural Hazards and Earth System Sciences last month.