When big AI labs refuse to open source their models, the community steps in

Benchmarks are as important a measure of progress in AI as they are for the rest of the software industry. But when the benchmark results come from corporations, secrecy very often prevents the community from verifying them.

For example, OpenAI granted Microsoft, with which it has a commercial relationship, the exclusive licensing rights to its powerful GPT-3 language model. Other organizations say that the code they use to develop systems is dependent on impossible-to-release internal tooling and infrastructure or uses copyrighted data sets. While motivations can be ethical in nature — OpenAI initially declined to release GPT-2, GPT-3’s predecessor, out of concerns that it might be misused — the effect is the same. Without the necessary code, it’s far harder for third-party researchers to verify an organization’s claims.

“This isn’t really a sufficient alternative to good industry open-source practices,” Columbia computer science Ph.D. candidate Gustaf Ahdritz told TechCrunch via email. Ahdritz is one of the lead developers of OpenFold, an open source version of DeepMind’s protein structure-predicting AlphaFold 2. “It’s difficult to do all of the science one might like to do with the code DeepMind did release.”

Some researchers go so far as to say that withholding a system’s code “undermines its scientific value.” In October 2020, a rebuttal published in the journal Nature took issue with a cancer-predicting system trained by Google Health, the branch of Google focused on health-related research. The coauthors noted that Google withheld key technical details including a description of how the system was developed, which could significantly impact its performance.


Image Credits: OpenFold

In lieu of change, some members of the AI community, like Ahdritz, have made it their mission to open source the systems themselves. Working from technical papers, these researchers painstakingly try to recreate the systems, either from scratch or building on the fragments of publicly available specifications.

OpenFold is one such effort. Begun shortly after DeepMind announced AlphaFold 2, the project aims to verify that AlphaFold 2 can be reproduced from scratch and to make available components of the system that might be useful elsewhere, according to Ahdritz.

“We trust that DeepMind provided all the necessary details, but … we don’t have [concrete] proof of that, and so this effort is key to providing that trail and allowing others to build on it,” Ahdritz said. “Moreover, originally, certain AlphaFold components were under a non-commercial license. Our components and data — DeepMind still hasn’t published their full training data — are going to be completely open-source, enabling industry adoption.”

OpenFold isn’t the only project of its kind. Elsewhere, loosely affiliated groups within the AI community are attempting implementations of OpenAI’s code-generating Codex and art-creating DALL-E, DeepMind’s chess-playing AlphaZero, and even AlphaStar, a DeepMind system designed to play the real-time strategy game StarCraft 2. Among the more successful are EleutherAI and AI startup Hugging Face’s BigScience, open research efforts that aim to deliver the code and datasets needed to run a model comparable (though not identical) to GPT-3.

Philip Wang, a prolific member of the AI community who maintains a number of open source implementations on GitHub, including one of OpenAI’s DALL-E, posits that open-sourcing these systems reduces the need for researchers to duplicate their efforts.

“We read the latest AI studies, like any other researcher in the world. But instead of replicating the paper in a silo, we implement it open source,” Wang said. “We are in an interesting place at the intersection of information science and industry. I think open source is not one-sided and benefits everybody in the end. It also appeals to the broader vision of truly democratized AI not beholden to shareholders.”

Brian Lee and Andrew Jackson, two Google employees, worked together to create MiniGo, a replication of AlphaZero. While not affiliated with the official project, Lee and Jackson — being at Google, DeepMind’s initial parent company — had the advantage of access to certain proprietary resources.


Image Credits: MiniGo

“[Working backward from papers is] like navigating before we had GPS,” Lee, a research engineer at Google Brain, told TechCrunch via email. “The instructions talk about landmarks you ought to see, how long you ought to go in a certain direction, which fork to take at a critical juncture. There’s enough detail for the experienced navigator to find their way, but if you don’t know how to read a compass, you’ll be hopelessly lost. You won’t retrace the steps exactly, but you’ll end up in the same place.”

The developers behind these initiatives, Ahdritz and Jackson included, say that these efforts will not only help to demonstrate whether the systems work as advertised but also enable new applications and better hardware support. Systems from large labs and companies like DeepMind, OpenAI, Microsoft, Amazon, and Meta are typically trained on expensive, proprietary datacenter servers with far more compute power than the average workstation, adding to the hurdles of open-sourcing them.

“Training new variants of AlphaFold could lead to new applications beyond protein structure prediction — for example, predicting how drugs bind proteins, how proteins move, and how proteins interact with other biomolecules — which is not possible with DeepMind’s original code release because it lacked the training code,” Ahdritz said. “There are dozens of high-impact applications that require training new variants of AlphaFold or integrating parts of AlphaFold into larger models, but the lack of training code prevents all of them.”

“These open-source efforts do a lot to disseminate the ‘working knowledge’ about how these systems can behave in non-academic settings,” Jackson added. “The amount of compute needed to reproduce the original results [for AlphaZero] is pretty high. I don’t remember the number off the top of my head, but it involved running about a thousand GPUs for a week. We were in a pretty unique position to be able to help the community try these models with our early access to the Google Cloud Platform’s TPU product, which was not yet publicly available.”

Implementing proprietary systems in open source is fraught with challenges, especially when there’s little public information to go on. Ideally, the code is available in addition to the data set used to train the system and what are called weights, which are responsible for transforming data fed to the system into predictions. But this isn’t often the case.
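
The distinction between code and weights can be made concrete with a toy model. The sketch below is a generic illustration, not any lab's actual system: the code fixes the shape of the computation, while the weights, learned during training, determine what it actually predicts.

```python
# Toy illustration of code vs. weights (not any lab's actual model).
# The function below is the "released code": it fixes how inputs are
# transformed. The weights are the learned numbers that make it useful.

def predict(inputs, weights, bias):
    """A one-neuron linear 'model': prediction = dot(inputs, weights) + bias."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias

features = [1.0, 2.0, 3.0]

# With the published weights, anyone can reproduce the original predictions.
published = predict(features, weights=[0.4, -1.2, 0.7], bias=0.05)

# With the same code but independently re-trained weights, outputs differ,
# which is why reproductions can only approximate the original results.
retrained = predict(features, weights=[0.39, -1.18, 0.71], bias=0.02)

print(published, retrained)
```

Real systems like AlphaFold 2 contain hundreds of millions of such numbers, which is why a paper alone rarely pins them down.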

For example, in developing OpenFold, Ahdritz and team had to gather information from the official materials and reconcile differences among sources, including the source code, supplemental code, and presentations that DeepMind researchers gave early on. Ambiguities in steps like data prep and training code led to false starts, while a lack of hardware resources necessitated design compromises.

“We only really get a handful of tries to get this right, lest this drag on indefinitely. These things have so many computationally intensive stages that a tiny bug can greatly set us back, such that we had to retrain the model and also regenerate lots of training data,” Ahdritz said. “Some technical details that work very well for [DeepMind] don’t work as easily for us because we have different hardware … In addition, ambiguity about what details are critically important and which ones are selected without much thought makes it hard to optimize or tweak anything and locks us in to whatever (sometimes awkward) choices were made in the original system.”

So, do the labs behind the proprietary systems, like OpenAI, care that their work is being reverse-engineered and even used by startups to launch competing services? Evidently not. Ahdritz says the fact that DeepMind in particular releases so many details about its systems suggests it implicitly endorses the efforts, even if it hasn’t said so publicly.

“We haven’t received any clear indication that DeepMind disapproves or approves of this effort,” Ahdritz said. “But certainly, no one has tried to stop us.”

Everstream Analytics secures new cash to predict supply chain disruptions

Everstream Analytics, a supply chain insights and risk analytics startup, today announced that it raised $24 million in a Series A round led by Morgan Stanley Investment Management with participation from Columbia Capital, StepStone Group, and DHL. CEO Julie Gerdeman said that the new money would be used to “propel technology innovation” and “further global expansion.”

Everstream, which was formed from the merger of Resilience360 and Riskpulse, provides predictive insights for supply chains. Drawing on billions of supply chain interactions, the company applies AI to assess materials, suppliers, and facilities for risk.

Plenty of startups claim to do this, including Backbone, Altana, and Craft. Project44 recently raised $202 million to expand its own set of predictive analytics tools, including estimated time of arrivals for shipments.

But what sets Everstream apart is its access to proprietary data that goes beyond what competitors are leveraging, according to Gerdeman.

“[Everstream provides] visibility into essentially every network, component, ingredient, ​and raw material around the world,” she told TechCrunch via email. “Connected business networks, scalable computing power, graph database technology, and advances in AI algorithms enable Everstream to combine massive volumes of public and proprietary data to build a model of the global supply chain.”

As new data enters the platform, Everstream, which integrates with existing enterprise resource planning systems, retrains its AI system to reflect the current supply chain environment. Customers receive proactive warnings based on signals including financial reports and news of weather events, environmental and sustainability risks, and natural disasters.

For example, Everstream can warn businesses when it might be difficult to source a specific material and how likely customers are to cancel, increase, or move forward orders. It can also provide suggestions for optimizing logistics operations based on metrics such as timeliness, quality, and cost of goods shipped.
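
A minimal sketch of what signal-based alerting like this can look like. The signal names, weights, and threshold below are invented for illustration; Everstream has not published its model.

```python
# Hypothetical supplier-risk alerting sketch (signal names, weights, and
# the threshold are invented; this is not Everstream's actual model).

SIGNAL_WEIGHTS = {"weather": 0.5, "financial": 0.3, "news": 0.2}

def risk_score(signals):
    """Combine 0-1 signal severities into a single weighted risk score."""
    return sum(SIGNAL_WEIGHTS[name] * severity for name, severity in signals.items())

def warn(supplier, signals, threshold=0.6):
    """Return a warning string when combined risk crosses the threshold."""
    score = risk_score(signals)
    return f"WARN {supplier}: risk {score:.2f}" if score > threshold else None

# A severe storm plus negative news pushes this supplier over the threshold.
print(warn("supplier-a", {"weather": 0.9, "financial": 0.4, "news": 0.8}))
```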

“Everstream’s AI-based models and preset dynamic thresholds can be used to predict disruptions and prescribe recommendations to mitigate risk and deliver better results to the business,” Gerdeman added. “[Everstream] identifies the most impactful risks in the network and creates targeted insights based on inputs from the … platform, including incident monitoring, predictive risks, ESG, and shipment data — slashing time, cost, and complexity.”

Most would argue these are useful tools at a time when uncertainty continues to dog the supply chain — assuming Everstream’s AI systems perform as well as advertised. While some surveys show tepid adoption of predictive analytics in the supply chain industry, Gartner recently found that 87% of supply chain professionals plan to invest in “resilience” within the next two years, including automation and AI.

Investors seemingly see the potential. Last year was a banner year for venture-backed supply chain management companies, which saw $11.3 billion in funding, according to Crunchbase.

For its part, Everstream claims its customer base has grown 550% to date in 2022 and now includes brands like AB InBev, Google, Bayer, Schneider Electric, Unilever, and Whirlpool. Gerdeman demurred when asked about concrete revenue numbers.

“The pandemic has illustrated why deep visibility is needed not only into a company’s network, but down to the component, ingredient, ​and raw material level, because it doesn’t matter if the company’s supplier is operational if their suppliers are not,” Gerdeman said. “Everstream’s insights are not only predictive in nature, but they are also prescriptive – meaning we not only tell clients what’s coming next, but also what they should do about it.”

Everstream, which employs 100 people, has raised $70 million in equity and debt funding so far.

Heartex raises $25M for its AI-focused, open source data labeling platform

Heartex, a startup that bills itself as an “open source” platform for data labeling, today announced that it landed $25 million in a Series A funding round led by Redpoint Ventures. Unusual Ventures, Bow Capital, and Swift Ventures also participated, bringing Heartex’s total capital raised to $30 million.

Co-founder and CEO Michael Malyuk said that the new money will be put toward improving Heartex’s product and expanding the size of the company’s workforce from 28 people to 68 by the end of the year.

“Coming from engineering and machine learning backgrounds, [Heartex’s founding team] knew what value machine learning and AI can bring to the organization,” Malyuk told TechCrunch via email. “At the time, we all worked at different companies and in different industries yet shared the same struggle with model accuracy due to poor-quality training data. We agreed that the only viable solution was to have internal teams with domain expertise be responsible for annotating and curating training data. Who can provide the best results other than your own experts?”

Software developers Malyuk, Maxim Tkachenko, and Nikolay Lyubimov co-founded Heartex in 2019. Lyubimov was a senior engineer at Huawei before moving to Yandex, where he worked as a backend developer on speech technologies and dialogue systems.


Heartex’s dashboard.

The ties to Yandex, a company sometimes referred to as the “Google of Russia”, might unnerve some — particularly in light of accusations by the European Union that Yandex’s news division played a sizeable role in spreading Kremlin propaganda. Heartex has an office in San Francisco, California, but several of the company’s engineers are based in the former Soviet Republic of Georgia.

When asked, Heartex says that it doesn’t collect any customer data and open sources the core of its labeling platform for inspection. “We’ve built a data architecture that keeps data private on the customer’s storage, separating the data plane and control plane,” Malyuk added. “Regarding the team and their locations, we’re a very international team with no current members based in Russia.”

Setting aside its geopolitical affiliations, Heartex aims to tackle what Malyuk sees as a major hurdle in the enterprise: extracting value from data by leveraging AI. There’s a growing wave of businesses aiming to become ‘data-centric’ — Gartner recently reported that enterprise use of AI grew a whopping 270% over the past several years. But many organizations are struggling to use AI to its fullest.

“Having reached a point of diminishing returns in algorithm-specific development, enterprises are investing in perfecting data labeling as part of their strategic, data-centric initiatives,” Malyuk said. “This is a progression from earlier development practices that focused almost exclusively on algorithm development and tuning.”

If, as Malyuk asserts, data labeling is receiving increased attention from companies pursuing AI, it’s because labeling is a core part of the AI development process. Many AI systems “learn” to make sense of images, videos, text and audio from examples that have been labeled by teams of human annotators. The labels enable the systems to extrapolate the relationships between the examples (e.g., the link between the caption “kitchen sink” and a photo of a kitchen sink) to data the systems haven’t seen before (e.g., photos of kitchen sinks that weren’t included in the data used to “teach” the model).
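
The extrapolation step can be illustrated with a deliberately tiny classifier. This is a generic sketch of learning from labeled examples, not Heartex's system; the two-number "feature vectors" stand in for the rich embeddings a real model would compute from images.

```python
# Toy 1-nearest-neighbor classifier: labeled examples let the system assign
# labels to data it has never seen. (Generic illustration, not Heartex's
# technology; the 2-D vectors stand in for image embeddings.)

LABELED_EXAMPLES = [
    ((0.9, 0.1), "kitchen sink"),
    ((0.8, 0.2), "kitchen sink"),
    ((0.1, 0.9), "dog"),
    ((0.2, 0.8), "dog"),
]

def classify(features):
    """Label an unseen example with the label of its closest labeled neighbor."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(LABELED_EXAMPLES, key=lambda ex: sq_dist(ex[0], features))[1]

# A "photo" that wasn't in the labeled set still gets the right label.
print(classify((0.85, 0.15)))
```

Mislabeled examples, of the sort the MIT analysis found, corrupt exactly this lookup: a bad label near the query drags the prediction with it.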

The trouble is, not all labels are created equal. Labeling data like legal contracts, medical images, and scientific literature requires domain expertise that not just any annotator has. And — being human — annotators make mistakes. In an MIT analysis of popular AI data sets, researchers found mislabeled data like one breed of dog confused for another and an Ariana Grande high note categorized as a whistle.

Malyuk makes no claim that Heartex completely solves these issues. But in an interview, he explained that the platform is designed to support labeling workflows for different AI use cases, with features that touch on data quality management, reporting, and analytics. For example, data engineers using Heartex can see the names and email addresses of annotators and data reviewers, which are tied to labels that they’ve contributed or audited. This helps to monitor label quality and — ideally — to fix problems before they impact training data.

“The angle for the C-suite is pretty simple. It’s all about improving production AI model accuracy in service of achieving the project’s business objective,” Malyuk said. “We’re finding that most C-suite managers with AI, machine learning, and/or data science responsibilities have confirmed through experience that, with more strategic investments in people, processes, technology, and data, AI can deliver extraordinary value to the business across a multitude of diverse use cases. We also see that success has a snowball effect. Teams that find success early are able to create additional high-value models more quickly building not just on their early learnings but also on the additional data generated from using the production models.”

In the data labeling toolset arena, Heartex competes with startups including AIMMO, Labelbox, Scale AI, and Snorkel AI, as well as Google and Amazon (which offer data labeling products through Google Cloud and SageMaker, respectively). But Malyuk believes that Heartex’s focus on software as opposed to services sets it apart from the rest. Unlike many of its competitors, the startup doesn’t sell labeling services through its platform.

“As we’ve built a truly horizontal solution, our customers come from a variety of industries. We have small startups as customers, as well as several Fortune 100 companies. [Our platform] has been adopted by over 100,000 data scientists globally,” Malyuk said, while declining to reveal revenue numbers. “[Our customers] are establishing internal data annotation teams and buying [our product] because their production AI models aren’t performing well and [they] recognize that poor training data quality is the primary cause.”

Glean aims to help employees surface info across sprawling enterprise systems

At enterprises of a certain size, keeping track of information about apps, employees, and projects becomes increasingly challenging. According to McKinsey, employees spend 1.8 hours every day — 9.3 hours per week, on average — searching for and gathering information. The veracity of metrics like these has been challenged over the years. But it’s reasonable to say that knowledge workers in particular devote a sizeable chunk of their workdays to sifting through data, whether to find basic contact info or domain-specific files.

The emergence in recent years of AI algorithms that can parse natural language has fueled the rise of platforms that can shrink that chunk. At least, that’s the assertion of Arvind Jain, a former Google engineer and Rubrik co-founder, whose startup, Glean, employs AI to power a unified search experience across all apps used at a company.

Arvind began work on Glean while at Rubrik, the cloud data management company. In Rubrik’s annual employee pulse survey, he observed that one of the biggest productivity challenges was workers not being able to find the information they needed — whether a specific document or subject-matter expert.

“Engineers were spending too much time outside code; account managers couldn’t find the latest research or presentation needed to close deals; new employees took too long to ramp,” Arvind told TechCrunch in an email interview. “This growing problem was not only destroying productivity, but also sapping energy and detracting from the employee experience.”

Other companies were experiencing the same issues, as it turned out — exacerbated by their embrace of the cloud and distributed work setups. Sensing an opportunity, Arvind managed to convince former engineering lead Piyush Prahladka, ex-Facebook and -Microsoft engineer T.R. Vishwanath, and Tony Gentilcore, previously at AT&T and Google, to build the prototype for Glean.


Image Credits: Glean

Fast forward to 2022, and Glean has over 70 customers including Okta, Confluent, Samsara, Grammarly, and Outreach. Reflecting growth since its 2019 founding, Glean today closed a $100 million Series C round at a post-money valuation of $1 billion, led by Sequoia with participation from the Slack Fund.

On the one hand, Glean’s technology isn’t incredibly novel. Services like Microsoft’s SharePoint Syntex, Amazon Kendra, and Google Cloud Search tap natural language processing technology to understand not only document minutia but the searches employees across an organization might perform, like “How do I invest in our company’s 401k?” They fall under the banner of “cognitive search,” a product category encompassing search tools that implement AI to ingest, understand, organize, and query data from multiple sources.

But Arvind claims that Glean is simpler to set up and use than the competition, including smaller outfits like Coveo, Elastic, Lucidworks, and Mindbreeze.

“[Glean] takes less than two hours for initial setup, and doesn’t require any engineering talent or manual fine-tuning for implementation,” Arvind said. “And Glean has seamless workflow integration, whether you’re using Glean in the web app, new tab page, sidebar search, native search, or Slack commands.”

Arvind notes that one of the major problems in enterprise search is the diversity of data sources, like knowledge bases, tickets, chat messages, and pull requests. To address this, Glean uses AI systems to predict for every query the relative importance of content across these sources, training separate systems on customer data to learn company-specific jargon, concepts, entities, and acronyms. To deliver personalized results, as well as proactively recommend documents, Glean accounts for variables like a person’s role, work patterns, job function, and specific projects and responsibilities in its indexing.
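
A hypothetical sketch of that kind of scoring. The source weights, role boost, and document fields below are invented; Glean has not published its ranking algorithm.

```python
# Hypothetical cross-source ranking sketch (weights, boosts, and fields are
# invented; Glean's actual ranking algorithm is not public).

SOURCE_WEIGHT = {"wiki": 1.0, "tickets": 0.6, "chat": 0.3}  # per-source importance

def score(doc, query_terms, user_role):
    """Blend text match, source importance, and a simple personalization boost."""
    text_match = sum(term in doc["text"].lower() for term in query_terms)
    source = SOURCE_WEIGHT.get(doc["source"], 0.5)
    boost = 1.5 if user_role in doc.get("relevant_roles", []) else 1.0
    return text_match * source * boost

docs = [
    {"source": "chat", "text": "401k question thread"},
    {"source": "wiki", "text": "401k investment guide", "relevant_roles": ["engineer"]},
]
ranked = sorted(docs, key=lambda d: score(d, ["401k"], "engineer"), reverse=True)
print(ranked[0]["source"])  # the boosted wiki page outranks the chat message
```

In a production system each of these hand-set numbers would itself be learned, per company, which is the training Arvind describes.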

“Glean’s biggest competitor is the status quo: employees continuing to deal with the complexity of finding the information and people they need at work. In a typical sales process, potential customers often require Glean to first start with a pilot to demonstrate how much value implementing Glean can provide,” Arvind said. “Glean uses the user’s information to personalize the search experience for the user along several dimensions — for example, for the same query an engineer may see very different results than a sales executive. Glean also uses the user’s activity, such as clicks on search results, to improve the search relevance.”

Because Glean acts like a layer on top of all other apps a company uses, it can double as a work portal from where managers can create and share “shortlinks” to resources (e.g., “go/benefits” instead of a long URL). Management can also share news, handbooks, expense policies, KPI dashboards, and company OKRs and expose the company’s people directory, which shows who people are and what projects they’re working on.
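
At its core, the shortlink feature is a lookup table from a memorable path to a full URL. A minimal sketch, with invented names and URLs:

```python
# Minimal shortlink sketch (names and URLs are invented examples; a real
# service would respond with an HTTP redirect instead of returning a string).

SHORTLINKS = {}

def register(short_name, url):
    """Map a memorable go/ path to a full URL."""
    SHORTLINKS[f"go/{short_name}"] = url

def resolve(path):
    """Look up the destination for a shortlink, or None if unregistered."""
    return SHORTLINKS.get(path)

register("benefits", "https://intranet.example.com/hr/benefits-overview")
print(resolve("go/benefits"))
```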


Image Credits: Glean

In a dashboard, Glean surfaces answers to frequently asked questions and devotes a space to links and descriptions of those links that can be shared with the wider organization. A control panel allows users to run data loss prevention reports on the sources from which Glean draws and check for compliance with GDPR, CCPA, and other privacy regulations.

“Prospective customers are often anxious about providing Glean with access to all of their data, which is why Glean has spent so much time ensuring it respects all privacy controls from the applications that it integrates with, and invested heavily in security certifications and processes from the very beginning … When [a] user deletes a document in the underlying application (Slack, Drive, Office, etc.), the document gets deleted from the Glean system as well,” Arvind said. “Glean customers can choose to have Glean host them or self-host Glean to keep their information within their environment. With Glean’s e-discovery and data loss prevention tools, companies can be confident about what data is available within their organization — and how that information is used.”

No enterprise search tool is without limitations. In a 2021 survey by APQC, which provides benchmarks and best practices for businesses, 19% of workers said that poor search functionality is a key problem in their organizations. But there’s a healthy market for enterprise search solutions regardless. The same survey found that 41% of respondents expect to “significantly” increase investment in search and findability within their organizations in the next 12 to 18 months.

Glean, whose total capital raised stands at $155 million, plans to use the proceeds from the latest round to expand its team, build out a go-to-market plan, and “drive new feature innovation.” Glean has more than 100 employees today and expects to have over 250 by the end of the year.

“Increased value put on employee productivity and happiness has been a boon to Glean’s growth among fast-growing companies that care about employee experience,” Arvind said. “Glean provides value to prospective customers from the first minute they start searching, and Glean is constantly working through product development and customer enablement to ensure customers have the best experience from then on.”

ZMO.ai secures $8M led by Hillhouse to create AI-generated fashion models

With breakthroughs in machine learning, it’s no longer uncommon to see algorithmically generated bodies that can move and talk authentically like real humans. The question is now down to whether startups offering such products can achieve a sustainable business model. Some of them have demonstrated that potential and attracted investors.

ZMO.ai, founded by a team of Chinese entrepreneurs who have spent years studying and working abroad, just closed an $8 million Series A financing round led by Hillhouse Capital. GGV Capital and GSR Ventures also participated in the round.

The startup has found healthy demand from fashion e-commerce companies that are struggling to hire and afford models as consumer tastes grow more changeable and their number of stock-keeping units (SKUs), or styles, balloons. Using generative adversarial networks (GANs), ZMO has built software that lets these companies create full-body virtual models by setting simple parameters like face, height, skin color, body shape, and pose.
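
Conditional GANs of this sort pair a random noise vector with an encoding of the chosen attributes. The sketch below shows only that conditioning step, with invented attribute names; ZMO's actual model and API are not public.

```python
# Sketch of attribute conditioning for a conditional GAN (attribute names are
# invented; ZMO's actual model and API are not public). A real generator maps
# (noise, condition) to image pixels; here we only build those two inputs.

import random

ATTRIBUTES = {
    "skin_tone": ["light", "medium", "dark"],
    "body_shape": ["slim", "athletic", "plus"],
    "pose": ["standing", "walking", "seated"],
}

def encode_condition(choices):
    """One-hot encode the user's attribute choices into a conditioning vector."""
    vec = []
    for attr, options in ATTRIBUTES.items():
        vec.extend(1.0 if choices[attr] == opt else 0.0 for opt in options)
    return vec

def generator_inputs(choices, noise_dim=4, seed=0):
    """Noise plus condition: the two inputs a conditional generator consumes."""
    rng = random.Random(seed)
    noise = [rng.gauss(0, 1) for _ in range(noise_dim)]
    return noise, encode_condition(choices)

noise, condition = generator_inputs(
    {"skin_tone": "medium", "body_shape": "athletic", "pose": "standing"}
)
print(len(noise), len(condition))  # 4 noise values, 9 one-hot entries
```

Varying the noise while holding the condition fixed is what yields many distinct "models" with the same requested look.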

“Traditionally, the entire cycle of garment manufacturing may take two to three months, from design, fabric selection, pattern making, modeling, to actually hitting the shelves,” says Ella Zhang, ZMO’s CEO and co-founder, a former engineer at Google and Apple.

“We are flipping and shortening that process. [Customers] can now test a piece of clothing by putting it on a virtual model, which can go on the website. Once orders come in, the e-commerce customer can start manufacturing,” she tells TechCrunch. “They can also test what type of people would suit a certain product by trying it out on different virtual models.”

It’s unsurprising that fashion e-commerce operators would find ZMO and its ilk a cost-saving tool. Zhang says her company is in early discussion with fast fashion giant Shein, which rolls out 2,000-3,000 new products per day, about potential collaborations.

Screen capture of ZMO’s AI-generated video

We previously covered Surreal, a Sequoia-backed, Shenzhen-based startup also working on synthetic media to replace humans in lifestyle photos and other commercial scenarios. The business attracted a surge in interest as the COVID-19 pandemic hit China’s e-commerce exporters, who were having a hard time finding foreign models as the country went into strict border controls.

Going forward, ZMO is also planning to apply GPT-3, which uses big data and deep learning to imitate the natural language patterns of humans, to generate speech for its models. As spooky as it may sound, the feature would make it breezy for e-commerce companies to churn out TikTok videos quickly and cheaply for product promotion.

On average, e-commerce companies spend around 3-5% of their annual gross merchandise value (a rough metric measuring sales, usually excluding returns and refunds) on photoshoots, according to Roger Yin, who worked at Evernote and ran his own cross-border e-commerce business before co-founding ZMO with Zhang.

“Images play a big role in driving e-commerce sales. The problem is that the [sales] cycle is short but the cost of images is high,” Yin observes, adding that costs can be even higher for fashion companies with a quick turnover of styles. The goal of ZMO is to reduce the costs of photoshoots to 1% of GMV.

Right now, 80% of ZMO’s customers are based in China, but it’s working to attract more overseas users this year using its new financial infusion. Operating with a team of 30 staff, the startup boasts 30 “medium and large-sized” customers, including Tencent-backed Chicv, one of Shein’s numerous challengers, and over 100 “small and medium” customers, such as dropshipping sellers.

ZMO’s other co-founders include Ma Liqian, who holds a Ph.D. in computer vision from Belgium’s KU Leuven, and Yang Han, who previously worked on AI-powered styling at Tencent and SenseTime.

Tractian, which uses AI to monitor industrial equipment, raises $15M

Tractian, a startup developing a product to monitor the status of machines and electrical infrastructure, today announced that it closed a $15 million Series A funding round led by Next47, with participation from Y Combinator and others. The money will be put toward product development and expanding Tractian’s workforce and geographic footprint, according to co-founder and co-CEO Igor Marinelli, as well as ongoing customer acquisition efforts.

Founded in 2019, Tractian is the brainchild of Y Combinator alumni Marinelli and Gabriel Lameirinhas. Prior to starting Tractian, the two worked as software engineers at International Paper, a paper manufacturer, where Marinelli says they noticed how backward the systems for monitoring machinery health were.

“Industrial managers of any kind need traceability of work orders, and need to know the health of their machines from kilometers away from the operations,” Marinelli said. “[W]ithout the proper combination of hardware and software, you can’t solve the industry’s real challenge.”

Tractian’s flagship product, which Marinelli says is patent pending in the U.S., uses AI to identify mechanical problems a machine might be having by analyzing its “rotational assets,” like motors, pumps and compressors. Tractian can spot signs of looseness, imbalance and misalignment from vibration and temperature anomalies measured by custom sensors, Marinelli claims, in addition to potential electrical failures.
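
A common baseline approach to this kind of condition monitoring is to flag readings that deviate sharply from a machine's healthy signature. The sketch below is a generic z-score test, not Tractian's patent-pending method, and all the numbers are made up.

```python
# Generic vibration anomaly sketch (a simple z-score test, not Tractian's
# patent-pending method; all readings below are made up).

import statistics

def find_anomalies(baseline, readings, threshold=3.0):
    """Return indices of readings more than `threshold` standard deviations
    from the healthy baseline's mean; such spikes can indicate imbalance."""
    mean = statistics.fmean(baseline)
    std = statistics.stdev(baseline)
    return [i for i, reading in enumerate(readings)
            if abs(reading - mean) / std > threshold]

healthy = [1.00, 1.10, 0.90, 1.05, 0.95, 1.02, 0.98]  # mm/s RMS vibration
incoming = [1.00, 1.03, 2.40, 0.97]                   # one reading spikes

print(find_anomalies(healthy, incoming))  # flags index 2, the 2.40 spike
```

Production systems refine this idea considerably, for example by analyzing frequency spectra rather than raw amplitudes, since imbalance, looseness, and misalignment each leave distinct spectral signatures.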

Tractian provides sensors that attach to — and send data about — machines via 3G or 4G cellular networks. The company’s software provides checklist and inspection steps for each machine, plus diagnostics, recommendations, alerts and scheduling tools and inventories.


Monitoring equipment with Tractian. Image Credits: Tractian

Marinelli readily acknowledges that Tractian isn’t the first to the machine analytics space. Predictive maintenance technologies have been used for decades in jet engines and gas turbines, and companies including Samsara, Augury, Upkeep and MaintainX offer solutions with capabilities similar to Tractian. In April, Amazon threw its hat in the ring with the general launch of Lookout for Equipment, a service that ingests sensor data from a customer’s industrial equipment and then trains a machine learning model to predict early warning signs of machine failure.

In a sign of the segment’s competitiveness, Augury just this month acquired Seebo, a startup that provided manufacturing teams with the insights to optimize their industrial processes. Augury is one of the better-funded startups in the sector, having raised nearly $300 million in venture capital to date.

But both Marinelli and Lameirinhas sense opportunity in a market that could be worth $12.3 billion by 2025. In 2018, Gartner predicted that by 2022, spend on internet of things-enabled predictive maintenance would increase to $12.9 billion, up from $3.4 billion in 2018.

While Marinelli declined to discuss the technical specifics of Tractian’s platform, including the accuracy of its algorithms, he noted that Tractian’s customer base of roughly 200 companies spans well-known brands like John Deere, Bosch, Embraer and Hyundai.

Looking ahead, the key for Tractian will be convincing would-be customers that its technology performs better than the rest. In an analysis, McKinsey highlighted the dangers of an underperforming predictive maintenance algorithm, noting that one company used an algorithm to save over 10% on equipment breakdowns but spent significantly more overall as a result of the algorithm’s high false-positive rate.
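The trade-off McKinsey describes is easy to make concrete: savings from prevented failures have to outweigh the cost of chasing false alarms. A back-of-the-envelope sketch, with entirely hypothetical figures:

```python
# Hypothetical numbers to illustrate how false positives can erase
# the savings from a predictive maintenance algorithm.
breakdown_cost = 100_000        # cost of one unplanned failure
failures_prevented = 12         # true positives acted on per year
inspection_cost = 4_000         # cost of responding to one alert
false_positives_per_year = 400  # alerts that turn out to be nothing

savings = failures_prevented * breakdown_cost
wasted = false_positives_per_year * inspection_cost
net = savings - wasted
print(f"Savings: ${savings:,}  Wasted: ${wasted:,}  Net: ${net:,}")
# With a high enough false-positive rate, the net benefit goes negative.
```

With these made-up inputs, $1.2 million in avoided breakdowns is swamped by $1.6 million in wasted inspections, which is exactly the failure mode McKinsey warns about.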

“[O]ur technology involves the same concept of Shazam, but for machines,” Marinelli said. “[The pandemic especially] increased the necessity of real time monitoring of assets because many operators [can’t] be physically working [near them] for long periods of time.”

In March, Tractian announced its expansion to North America, opening a new office in Mexico with a team dedicated to developing the company’s activities there. Tractian plans to follow up with market entry in Atlanta, Georgia later this year.

When reached for comment, Debjit Mukerji, a partner at Next47 who plans to join Tractian’s board of directors, said: “This is a critical space, the heartbeat of our economy. Next47 is thrilled to join Tractian on its mission to transform the maintenance experience for enterprises globally. Having followed this space for years, we concluded that frictionless deployment, intuitive user interfaces and a mobile/cloud-first approach are essential ingredients of success, particularly in the underserved medium enterprise segment. Tractian combines these in its extraordinary product vision and consistently delights its customers.”

Tractian currently has 100 employees and it expects to expand its headcount to 200 in the next 18 months. The company’s total capital raised stands at $19 million; Marinelli demurred when asked about the valuation.

Virtual product placement ads are coming to Amazon Prime Video and Peacock

At this month’s NewFronts, Amazon and Peacock demonstrated new ad formats built on similar virtual product placement (VPP) tools, a post-production technique for inserting a brand into a TV show or movie scene.

Amazon presented its new VPP tool, currently operating in beta, that lets advertisers place their branded products directly into streaming content after they have already been filmed and produced. Meanwhile, Peacock’s new “In-Scene” ads will identify key moments within a show and digitally insert a brand’s customized messaging or product post-production so the brand is showcased in the right TV show/movie and at the right time.

Product placement is nothing new and has long been a holy grail of the advertising industry. In 2019 alone, product placement in the U.S. garnered about $11.44 billion, per Statista data. That same year, approximately 49% of American viewers took action after seeing product placement in media.

Brands that use product placement in movies and TV shows capture target markets and promote products in a subtle way. Research by Sortlist revealed that, on average, customers are being sold 12.61 products per movie without even noticing.

However, traditional product placement is inflexible: decisions about which products appear in the content, a can of Coke on a table, for instance, are made months in advance.

Streaming services are rethinking this model: virtual product placement lets a platform introduce new ads in the future and remonetize a piece of content over and over again.

In an illustrative video, Amazon demonstrated how its new VPP program enables brands to strategically insert products post-production into content streaming on Amazon Prime Video and the newly rebranded Amazon Freevee. The video shown at NewFronts featured an M&Ms billboard (pictured above) that was digitally added long after the show had been filmed.

Colleen Aubrey, Senior Vice President, Advertising Products & Tech at Amazon, explained to the audience, “Working with content creators and using machine learning, we’re able to insert products and branded findings into a TV show or movie.” Billboards, signs, and screens in any chosen show can now have specific messaging on the streamer. Amazon will now be able to integrate different products into episodes at different moments and scenes.

She added that the M&M’s virtual product placement drove an almost 7% increase in brand favorability and almost a 15% increase in purchase intent. This gives advertisers the ability to bring their brands “in the content instead of just around,” she said, giving more flexibility and opportunity for customers to easily discover and engage with products. “Amazon ads are helping advertisers create long-term connections with customers in very everyday interactions,” Aubrey said.

The virtual product placement beta program has already been implemented in several Prime Video and Freevee original series such as “Tom Clancy’s Jack Ryan,” “Bosch: Legacy,” and the overall Bosch franchise, “Reacher,” and “Leverage: Redemption.”

Henrik Bastin, Chief Executive of Fabel Entertainment and Executive Producer of “Bosch: Legacy,” said, “Virtual product placement is a game-changer. It creates the ability to film your series without thinking about all that is required with traditional placements during production. Instead, you can sit with the final cut and see where a product could be seamlessly and naturally integrated into the storytelling.”

Image Credits: Peacock

Peacock also announced its own digitally inserted ad strategy at NewFronts. The new In-Scene Ads are designed to strengthen commercial opportunities with marketing partners, seamlessly blending products and/or messaging with content during post-production to insert advertisements during scenes that are deemed relevant to customers.

“The majority of Peacock customers are opting for our ad-supported experience,” said John Jelley, SVP of Product and UX at Peacock, “and we remain focused on collaborating with our brand partners to develop innovative, personalized ad experiences that continue to enhance the customer experience.”

While perhaps not as “mind-blowing” as some suggest, the possibility of customizing ads for different users is fascinating to consider.

The unique technology brought forth by Peacock, Amazon Prime Video, and Amazon Freevee has the potential to transform ad-supported streaming. The insertion of carefully curated, digitally implemented ads could become the new way streaming platforms and their marketing partners target audiences and increase ad revenue.

We’re curious to see how other streaming services improve their advertising methods, especially now that ad-supported options have become more popular. Netflix and Disney+ are the latest to announce upcoming cheaper ad-supported tiers.

AI-powered construction management platform Buildots lands $60M

In the construction industry, managers can become disconnected from what’s happening on-site — particularly when dealing with pandemic-related disruptions. Among the top hurdles are staying on top of costs, communicating with all stakeholders, and assessing risk related to aspects like contractor billing and performance. The disconnect can lead to delays and unanticipated expenses. One study found that 85% of construction projects over the course of a 70-year period experienced cost overruns and just 25% came close to their original deadlines.

The enormous addressable market — $1.3 trillion in the U.S. alone — continues to attract entrepreneurs like Roy Danon, who co-founded construction tech startup Buildots in 2018 with Aviv Leibovici and Yakir Sudry. Graduates of the Israel Defense Forces (IDF), the founders created a platform that leverages AI and hardhat-mounted 360-degree cameras to capture images of ongoing construction projects during site inspections.

Buildots today announced that it raised $60 million in a Series C round co-led by Viola Growth and Eyal Ofer’s O.G. Tech with participation from TLV Partners, Lightspeed Venture Partners, Future Energy Ventures, and Maor Investments. Danon says that the new capital, which brings Buildots’ total raised to $106 million, will be put toward product development and expanding the Buildots team, particularly in Europe and North America.

“The construction industry has been going through major transformation over the past few years,” Danon told TechCrunch in an email interview. “Drivers of this change include rising demand for new commercial and residential projects, increasing complexity of these projects, and manpower shortage. Traditionally, contractors have suffered from low profit margins, but Buildots’ solution is leading the charge to connect data to decision-making so that they can maximize revenue.”

The seed of the idea for Buildots came in 2017, ten years after Danon, Leibovici, and Sudry met in the IDF’s elite Talpiot unit. Without prior experience in or knowledge of the construction industry, the trio spent six months researching projects to better understand the challenges that construction companies and contractors face.


Image Credits: Buildots

Buildots analyzes project schedules, designs, and other data to generate a model of an active construction site. When workers equip their hardhats with a compatible 360-degree camera and upload the footage to Buildots, the platform automatically compares features of the site — e.g., pipes under a kitchen sink — to the model to gauge progress, automatically blurring out people in the footage for compliance purposes.
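Buildots hasn’t detailed its internals, but at its simplest, the comparison step (elements detected on site versus elements in the plan) reduces to a set comparison. A toy sketch with made-up element names, purely to illustrate the idea:

```python
def progress(planned: set[str], detected: set[str]) -> float:
    """Fraction of planned elements observed on site.

    In a real system, 'elements' would be objects (pipes, sockets,
    ducts) recognized by computer vision in 360-degree footage and
    matched against the project's model; here they are plain strings
    for illustration."""
    if not planned:
        return 1.0
    return len(planned & detected) / len(planned)

planned = {"kitchen_sink_pipe", "hvac_duct_a", "socket_12", "socket_13"}
detected = {"kitchen_sink_pipe", "socket_12", "socket_13", "worker_helmet"}
print(progress(planned, detected))  # 0.75
```

The hard part, of course, is the recognition and matching that produce the `detected` set reliably; the progress arithmetic itself is the easy final step.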

“Having operated in numerous and diverse regions, including North America, Europe, Asia-Pacific, and the Middle East, we have collected varied datasets that allow us to create extremely robust AI models,” Danon said. “For example, when a solution looks at hundreds of millions of sockets, it gets really good at recognizing issues and flagging mistakes. While most items look similar between construction sites, there’s always that one client that decides to put a pink cubicle toilet that challenges our models at first glance.”

Buildots’ two-way integrations with planning platforms Oracle Primavera P6, Asta Powerproject, and Microsoft Project are designed to allow instant timeline updates. Monthly progress reports, meanwhile, ostensibly help to validate subcontractor payment applications.

“Increasing visibility helps decision-makers reduce risk, improve cash flow, and save significant time,” Danon added. “[This gives] them the confidence to know their projects are being carried out as planned and in the most efficient way possible.”

Of course, Buildots isn’t the only company applying AI in the construction domain. Others include BeamUp, which is developing an AI-powered building design platform, and Versatile, which — like Buildots — captures and analyzes data across the construction site to provide a picture of construction progress.

Construction tech is a lucrative pursuit right now, thanks to the wider industry boom. In 2021, investment in construction startups reached a record $4.5 billion, triple the amount of money invested in 2020, according to Cemex Ventures.

When asked about traction and future plans, Danon said that Buildots’ revenue grew “tenfold” over the course of 2021 as the company’s customer base expanded to “dozens” of contractors.


Image Credits: Buildots

“There are certainly companies trying to make construction more efficient. Examples include Avvir and Doxel in the U.S. and Disperse in the U.K. However, when it comes to syncing decision makers with data from building sites, Buildots does this with more depth and accuracy than anyone else,” Danon said. “Buildots has more than doubled the size of its team over the past year and recently surpassed the 200 mark. The new funding will be used to expand our R&D, sales, and marketing [organizations]. We expect to be 300 strong by the end of 2022.”

When contacted for comment, Viola Growth general partner Natalie Refuah said: “With top-notch technology and a superb team, Buildots offers immense potential in terms of efficiency and profitability. We are excited for their continued success capitalizing on this market.”

DOJ warns that misuse of algorithmic hiring tools could violate accessibility laws

AI tools for the hiring process have become a hot category, but the Department of Justice warns that careless use of these processes could lead to violations of U.S. laws protecting equal access for people with disabilities. If your company uses algorithmic screening, facial tracking or other high-tech methods for sorting and rating applicants, you may want to take a closer look at what they’re doing.

The Equal Employment Opportunity Commission, which watches for and advises on industry trends and actions pertaining to its namesake matters, has issued guidance on how companies can safely use algorithm-based tools without risking the systematic exclusion of people with disabilities.

“New technologies should not become new ways to discriminate. If employers are aware of the ways AI and other technologies can discriminate against persons with disabilities, they can take steps to prevent it,” said EEOC Chair Charlotte A. Burrows in the press release announcing the guidance.

The general sense of the guidance is to think hard (and solicit the opinions of affected groups) about whether these filters, tests, metrics and so on measure qualities or quantities relevant to doing the job. They offer a few examples:

  • An applicant with a visual impairment must complete a test or task with a visual component, such as a game, to qualify for an interview. Unless the job itself has a visual component, this unfairly cuts out blind applicants.
  • A chatbot screener asks questions that are poorly phrased or designed, like whether a person can stand for several hours straight, with “no” answers disqualifying the applicant. A person in a wheelchair could certainly do many jobs that others might do standing, just from a sitting position.
  • An AI-based resume analysis service downranks an application due to a gap in employment, but that gap may be for reasons related to a disability or condition it is improper to penalize for.
  • An automated voice-based screener requires applicants to respond to questions or test problems vocally. Naturally this excludes the deaf and hard of hearing, as well as anyone with speech disorders. Unless the job involves a great deal of speech, this is improper.
  • A facial recognition algorithm evaluates someone’s emotions during a video interview. But the person is neurodivergent, or suffers from facial paralysis due to a stroke; their scores will be outliers.

This is not to say that none of these tools or methods are wrong or fundamentally discriminatory in a way that violates the law. But companies that use them must recognize their limitations and offer reasonable accommodations in case an algorithm, machine learning model or some other automated process is inappropriate for use with a given candidate.

Having accessible alternatives is part of it, but so is being transparent about the hiring process and declaring up front what skills will be tested and how. People with disabilities are the best judges of what their needs are and what accommodations, if any, to request.

If a company does not or cannot provide reasonable accommodations for these processes — and yes, that includes processes built and operated by third parties — it can be sued or otherwise held accountable for this failure.

As usual, the earlier this kind of thing is brought into consideration, the better; if your company hasn’t consulted with an accessibility expert on matters like recruiting, website and app access, and internal tools and policies, get to it.

Meanwhile, you can read the full guidance from the DOJ here, with a brief version aimed at workers who feel they may be discriminated against here, and for some reason there is another truncated version of the guidance here.

To win insurtech 2.0, focus on underwriting before growth

Like many legacy markets poised for change, the insurance industry has already seen its first wave of innovation.

Similar in many ways to the initial novelty of opening a bank account online, insurtech 1.0 brought a centuries-old product into the digital era by giving customers a way to apply for insurance online. Customer excitement translated into investor excitement, and everybody rode off into the sunset.

Well, not quite. It seems some might have flown a little too close to the sun instead: Focusing on customer experience on the front end leads to rapid growth indeed, but failing to focus on underwriting on the back end can lead to a very large number of claims, very quickly.

That’s because insurance, fundamentally, is about risk. It follows that digital insurance innovation should primarily focus on digital underwriting innovation — in essence, using technology to correctly assess and price risk in real time.
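None of this reflects any particular insurer’s model, but the principle that price follows from assessed risk can be sketched in a few lines. All parameters below are hypothetical, and real underwriting models are far richer:

```python
def price_premium(p_claim: float, avg_claim: float,
                  expense_ratio: float = 0.25, margin: float = 0.05) -> float:
    """Toy pure-premium pricing: the expected loss, loaded for
    expenses and a target margin. Mispricing the risk on the left
    side shows up later as excess claims."""
    expected_loss = p_claim * avg_claim
    return expected_loss / (1 - expense_ratio - margin)

# A riskier applicant (higher claim probability) gets a higher price.
print(round(price_premium(0.02, 5_000), 2))  # low-risk applicant
print(round(price_premium(0.08, 5_000), 2))  # 4x the risk, 4x the premium
```

This is why underwriting comes first: if `p_claim` is assessed wrong, no amount of front-end polish fixes the unit economics.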

The truly magical (and most misunderstood) fact is that everything else can simply flow from that innovative underwriting foundation: an instant, digital customer experience, sustainable growth unburdened by excessive claims and the ability to embed insurance in other digital journeys, creating better experiences for consumers, partners and insurtechs alike.

By focusing first on growth and then on underwriting, the insurtech 1.0 wave essentially flowed in the wrong direction. But there is plenty of time to reverse the tide — consumers’ enormous appetite for convenient, modern insurance products has only been whetted.

Insurtech companies need to keep pace with the demand they have created through sustainable unit economics and wise risk management.

So what does focusing on next-generation underwriting really look like, and how should you build upon it? Here’s our five-step playbook for winning in the insurtech 2.0 era.

Realign your business around underwriting excellence

Refocusing on underwriting innovation starts with refocusing your business.

Ask yourself the following questions:

  • Do your primary KPIs include ways to measure underwriting outcomes alongside traditional growth metrics?
  • Do a majority of your employees work on underwriting directly or indirectly?
  • Do your company goals include explicit underwriting goals?
  • Can all your employees articulate how/why underwriting is a differentiator at your company?

If you’ve answered no to one or more questions, it might be worth rethinking your goals, metrics and organizational structure.

Prove your models

Nobody likes to qualify growth, but in insurtech, smart growth is the name of the game. Resist the urge to rapidly scale acquisition before you’ve built confidence in your underwriting engine. But how do you do that?