When big AI labs refuse to open source their models, the community steps in

Benchmarks are as important a measure of progress in AI as they are for the rest of the software industry. But when the benchmark results come from corporations, secrecy very often prevents the community from verifying them.

For example, OpenAI granted Microsoft, with which it has a commercial relationship, the exclusive licensing rights to its powerful GPT-3 language model. Other organizations say that the code they use to develop systems is dependent on impossible-to-release internal tooling and infrastructure or uses copyrighted data sets. While motivations can be ethical in nature (OpenAI initially declined to release GPT-2, GPT-3’s predecessor, out of concerns that it might be misused), the effect is the same. Without the necessary code, it’s far harder for third-party researchers to verify an organization’s claims.

“This isn’t really a sufficient alternative to good industry open-source practices,” Columbia computer science Ph.D. candidate Gustaf Ahdritz told TechCrunch via email. Ahdritz is one of the lead developers of OpenFold, an open source version of DeepMind’s protein structure-predicting AlphaFold 2. “It’s difficult to do all of the science one might like to do with the code DeepMind did release.”

Some researchers go so far as to say that withholding a system’s code “undermines its scientific value.” In October 2020, a rebuttal published in the journal Nature took issue with a cancer-predicting system trained by Google Health, the branch of Google focused on health-related research. The coauthors noted that Google withheld key technical details including a description of how the system was developed, which could significantly impact its performance.


In lieu of change, some members of the AI community, like Ahdritz, have made it their mission to open source the systems themselves. Working from technical papers, these researchers painstakingly try to recreate the systems, either from scratch or building on the fragments of publicly available specifications.

OpenFold is one such effort. Begun shortly after DeepMind announced AlphaFold 2, the project aims to verify that AlphaFold 2 can be reproduced from scratch and to make available components of the system that might be useful elsewhere, according to Ahdritz.

“We trust that DeepMind provided all the necessary details, but … we don’t have [concrete] proof of that, and so this effort is key to providing that trail and allowing others to build on it,” Ahdritz said. “Moreover, originally, certain AlphaFold components were under a non-commercial license. Our components and data — DeepMind still hasn’t published their full training data — are going to be completely open-source, enabling industry adoption.”

OpenFold isn’t the only project of its kind. Elsewhere, loosely affiliated groups within the AI community are attempting implementations of OpenAI’s code-generating Codex and art-creating DALL-E, DeepMind’s chess-playing AlphaZero, and even AlphaStar, a DeepMind system designed to play the real-time strategy game StarCraft 2. Among the more successful are EleutherAI and AI startup Hugging Face’s BigScience, open research efforts that aim to deliver the code and datasets needed to run a model comparable (though not identical) to GPT-3.

Philip Wang, a prolific member of the AI community who maintains a number of open source implementations on GitHub, including one of OpenAI’s DALL-E, posits that open-sourcing these systems reduces the need for researchers to duplicate their efforts.

“We read the latest AI studies, like any other researcher in the world. But instead of replicating the paper in a silo, we implement it open source,” Wang said. “We are in an interesting place at the intersection of information science and industry. I think open source is not one-sided and benefits everybody in the end. It also appeals to the broader vision of truly democratized AI not beholden to shareholders.”

Brian Lee and Andrew Jackson, two Google employees, worked together to create MiniGo, a replication of AlphaZero. While not affiliated with the official project, Lee and Jackson — being at Google, DeepMind’s initial parent company — had the advantage of access to certain proprietary resources.


“[Working backward from papers is] like navigating before we had GPS,” Lee, a research engineer at Google Brain, told TechCrunch via email. “The instructions talk about landmarks you ought to see, how long you ought to go in a certain direction, which fork to take at a critical juncture. There’s enough detail for the experienced navigator to find their way, but if you don’t know how to read a compass, you’ll be hopelessly lost. You won’t retrace the steps exactly, but you’ll end up in the same place.”

The developers behind these initiatives, Ahdritz and Jackson included, say that they’ll not only help to demonstrate whether the systems work as advertised but enable new applications and better hardware support. Systems from large labs and companies like DeepMind, OpenAI, Microsoft, Amazon, and Meta are typically trained on expensive, proprietary datacenter servers with far more compute power than the average workstation, adding to the hurdles of open-sourcing them.

“Training new variants of AlphaFold could lead to new applications beyond protein structure prediction, which is not possible with DeepMind’s original code release because it lacked the training code — for example, predicting how drugs bind proteins, how proteins move, and how proteins interact with other biomolecules,” Ahdritz  said. “There are dozens of high-impact applications that require training new variants of AlphaFold or integrating parts of AlphaFold into larger models, but the lack of training code prevents all of them.”

“These open-source efforts do a lot to disseminate the ‘working knowledge’ about how these systems can behave in non-academic settings,” Jackson added. “The amount of compute needed to reproduce the original results [for AlphaZero] is pretty high. I don’t remember the number off the top of my head, but it involved running about a thousand GPUs for a week. We were in a pretty unique position to be able to help the community try these models with our early access to the Google Cloud Platform’s TPU product, which was not yet publicly available.”

Implementing proprietary systems in open source is fraught with challenges, especially when there’s little public information to go on. Ideally, the code is available in addition to the data set used to train the system and what are called weights, which are responsible for transforming data fed to the system into predictions. But this isn’t often the case.
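To make the role of weights concrete, here is a minimal sketch in Python. It is a toy, not how any of the systems named here actually work: the `predict` function, the three weights, and the bias are all made-up values for illustration. The point is only that the same code produces nothing useful until trained weights are supplied.

```python
# Toy illustration only: production models hold millions or billions of weights.
def predict(weights, bias, features):
    """Return a weighted sum of the input features plus a bias term."""
    return sum(w * x for w, x in zip(weights, features)) + bias


# Hypothetical "trained" values. Without them, the code alone cannot predict.
weights = [0.5, -1.0, 2.0]
bias = 0.25

prediction = predict(weights, bias, [2.0, 1.0, 0.5])
print(prediction)
```

A lab that publishes only the function, and not the numbers fed into it, leaves replicators to rediscover those numbers through training.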

For example, in developing OpenFold, Ahdritz and team had to gather information from the official materials and reconcile the differences between different sources, including the source code, supplemental code, and presentations that DeepMind researchers gave early on. Ambiguities in steps like data prep and training code led to false starts, while a lack of hardware resources necessitated design compromises.

“We only really get a handful of tries to get this right, lest this drag on indefinitely. These things have so many computationally intensive stages that a tiny bug can greatly set us back, such that we had to retrain the model and also regenerate lots of training data,” Ahdritz said. “Some technical details that work very well for [DeepMind] don’t work as easily for us because we have different hardware … In addition, ambiguity about what details are critically important and which ones are selected without much thought makes it hard to optimize or tweak anything and locks us in to whatever (sometimes awkward) choices were made in the original system.”

So, do the labs behind the proprietary systems, like OpenAI, care that their work is being reverse-engineered and even used by startups to launch competing services? Evidently not. Ahdritz says the fact that DeepMind in particular releases so many details about its systems suggests it implicitly endorses the efforts, even if it hasn’t said so publicly.

“We haven’t received any clear indication that DeepMind disapproves or approves of this effort,” Ahdritz said. “But certainly, no one has tried to stop us.”

Heartex raises $25M for its AI-focused, open source data labeling platform

Heartex, a startup that bills itself as an “open source” platform for data labeling, today announced that it landed $25 million in a Series A funding round led by Redpoint Ventures. Unusual Ventures, Bow Capital, and Swift Ventures also participated, bringing Heartex’s total capital raised to $30 million.

Co-founder and CEO Michael Malyuk said that the new money will be put toward improving Heartex’s product and expanding the size of the company’s workforce from 28 people to 68 by the end of the year.

“Coming from engineering and machine learning backgrounds, [Heartex’s founding team] knew what value machine learning and AI can bring to the organization,” Malyuk told TechCrunch via email. “At the time, we all worked at different companies and in different industries yet shared the same struggle with model accuracy due to poor-quality training data. We agreed that the only viable solution was to have internal teams with domain expertise be responsible for annotating and curating training data. Who can provide the best results other than your own experts?”

Software developers Malyuk, Maxim Tkachenko, and Nikolay Lyubimov co-founded Heartex in 2019. Lyubimov was a senior engineer at Huawei before moving to Yandex, where he worked as a backend developer on speech technologies and dialogue systems.


The ties to Yandex, a company sometimes referred to as the “Google of Russia,” might unnerve some, particularly in light of accusations by the European Union that Yandex’s news division played a sizeable role in spreading Kremlin propaganda. Heartex has an office in San Francisco, California, but several of the company’s engineers are based in the former Soviet Republic of Georgia.

When asked, Heartex says that it doesn’t collect any customer data and open sources the core of its labeling platform for inspection. “We’ve built a data architecture that keeps data private on the customer’s storage, separating the data plane and control plane,” Malyuk added. “Regarding the team and their locations, we’re a very international team with no current members based in Russia.”

Setting aside its geopolitical affiliations, Heartex aims to tackle what Malyuk sees as a major hurdle in the enterprise: extracting value from data by leveraging AI. There’s a growing wave of businesses aiming to become ‘data-centric’ — Gartner recently reported that enterprise use of AI grew a whopping 270% over the past several years. But many organizations are struggling to use AI to its fullest.

“Having reached a point of diminishing returns in algorithm-specific development, enterprises are investing in perfecting data labeling as part of their strategic, data-centric initiatives,” Malyuk said. “This is a progression from earlier development practices that focused almost exclusively on algorithm development and tuning.”

If, as Malyuk asserts, data labeling is receiving increased attention from companies pursuing AI, it’s because labeling is a core part of the AI development process. Many AI systems “learn” to make sense of images, videos, text and audio from examples that have been labeled by teams of human annotators. The labels enable the systems to extrapolate the relationships between the examples (e.g., the link between the caption “kitchen sink” and a photo of a kitchen sink) to data the systems haven’t seen before (e.g., photos of kitchen sinks that weren’t included in the data used to “teach” the model).
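A rough sketch of that idea, using one of the simplest possible learners: a nearest-neighbor lookup that gives an unseen example the label of the closest labeled one. Everything here is an assumption for illustration. The two-number “features” stand in for real images, and the `nearest_label` helper is hypothetical and has nothing to do with Heartex’s product.

```python
import math


def nearest_label(labeled_examples, query):
    """Return the label of the labeled example closest to the query point."""
    closest = min(labeled_examples, key=lambda ex: math.dist(ex[0], query))
    return closest[1]


# Hypothetical (features, label) pairs produced by human annotators.
labeled = [
    ((0.9, 0.1), "kitchen sink"),
    ((0.8, 0.2), "kitchen sink"),
    ((0.1, 0.9), "bathtub"),
    ((0.2, 0.8), "bathtub"),
]

# A point that was never labeled still gets a sensible answer.
unseen = (0.85, 0.3)
print(nearest_label(labeled, unseen))
```

If an annotator had mislabeled the sink examples as bathtubs, the same code would confidently return the wrong answer, which is why label quality matters as much as the algorithm.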

The trouble is, not all labels are created equal. Labeling data like legal contracts, medical images, and scientific literature requires domain expertise that not just any annotator has. And — being human — annotators make mistakes. In an MIT analysis of popular AI data sets, researchers found mislabeled data like one breed of dog confused for another and an Ariana Grande high note categorized as a whistle.

Malyuk makes no claim that Heartex completely solves these issues. But in an interview, he explained that the platform is designed to support labeling workflows for different AI use cases, with features that touch on data quality management, reporting, and analytics. For example, data engineers using Heartex can see the names and email addresses of annotators and data reviewers, which are tied to labels that they’ve contributed or audited. This helps to monitor label quality and — ideally — to fix problems before they impact training data.

“The angle for the C-suite is pretty simple. It’s all about improving production AI model accuracy in service of achieving the project’s business objective,” Malyuk said. “We’re finding that most C-suite managers with AI, machine learning, and/or data science responsibilities have confirmed through experience that, with more strategic investments in people, processes, technology, and data, AI can deliver extraordinary value to the business across a multitude of diverse use cases. We also see that success has a snowball effect. Teams that find success early are able to create additional high-value models more quickly building not just on their early learnings but also on the additional data generated from using the production models.”

In the data labeling toolset arena, Heartex competes with startups including AIMMO, Labelbox, Scale AI, and Snorkel AI, as well as Google and Amazon (which offer data labeling products through Google Cloud and SageMaker, respectively). But Malyuk believes that Heartex’s focus on software as opposed to services sets it apart from the rest. Unlike many of its competitors, the startup doesn’t sell labeling services through its platform.

“As we’ve built a truly horizontal solution, our customers come from a variety of industries. We have small startups as customers, as well as several Fortune 100 companies. [Our platform] has been adopted by over 100,000 data scientists globally,” Malyuk said, while declining to reveal revenue numbers. “[Our customers] are establishing internal data annotation teams and buying [our product] because their production AI models aren’t performing well and recognize that poor training data quality is the primary cause.”

Glean aims to help employees surface info across sprawling enterprise systems

At enterprises of a certain size, keeping track of data including apps, employees, and projects becomes increasingly challenging. According to McKinsey, employees spend 1.8 hours every day — 9.3 hours per week, on average — searching for and gathering information. The veracity of metrics like these has been challenged over the years. But it’s reasonable to say that knowledge workers in particular devote a sizeable chunk of their workdays to sifting through data, whether to find basic contact info or domain-specific files.

The emergence in recent years of AI algorithms that can parse natural language has fueled the rise of platforms that can shrink that chunk. At least, that’s the assertion of Arvind Jain, a former Google engineer and Rubrik co-founder, whose startup, Glean, employs AI to power a unified search experience across all apps used at a company.

Arvind began work on Glean while at Rubrik, the cloud data management company. In Rubrik’s annual employee pulse survey, Arvind observed that one of the biggest productivity challenges was workers not being able to find the information they needed, whether a specific document or a subject-matter expert.

“Engineers were spending too much time outside code; account managers couldn’t find the latest research or presentation needed to close deals; new employees took too long to ramp,” Arvind told TechCrunch in an email interview. “This growing problem was not only destroying productivity, but also sapping energy and detracting from the employee experience.”

Other companies were experiencing the same issues, as it turned out — exacerbated by their embrace of the cloud and distributed work setups. Sensing an opportunity, Arvind managed to convince former engineering lead Piyush Prahladka, ex-Facebook and -Microsoft engineer T.R. Vishwanath, and Tony Gentilcore, previously at AT&T and Google, to build the prototype for Glean.


Fast forward to 2022, and Glean has over 70 customers including Okta, Confluent, Samsara, Grammarly, and Outreach. Reflecting growth since its 2019 founding, Glean today closed a $100 million Series C round led by Sequoia with participation from the Slack Fund at a $1 billion valuation post-money.

On the one hand, Glean’s technology isn’t incredibly novel. Services like Microsoft’s SharePoint Syntex, Amazon Kendra, and Google Cloud Search tap natural language processing technology to understand not only document minutiae but the searches employees across an organization might perform, like “How do I invest in our company’s 401k?” They fall under the banner of “cognitive search,” a product category encompassing search tools that implement AI to ingest, understand, organize, and query data from multiple sources.

But Arvind claims that Glean is simpler to set up and use than the competition, including smaller outfits like Coveo, Elastic, Lucidworks, and Mindbreeze.

“[Glean] takes less than two hours for initial setup, and doesn’t require any engineering talent or manual fine-tuning for implementation,” Arvind said. “And Glean has seamless workflow integration, whether you’re using Glean in the web app, new tab page, sidebar search, native search, or Slack commands.”

Arvind notes that one of the major problems in enterprise search is the diversity of data sources, like knowledge bases, tickets, chat messages, and pull requests. To address this, Glean uses AI systems to predict for every query the relative importance of content across these sources, training separate systems on customer data to learn company-specific jargon, concepts, entities, and acronyms. To deliver personalized results, as well as proactively recommend documents, Glean accounts for variables like a person’s role, work patterns, job function, and specific projects and responsibilities in its indexing.

“Glean’s biggest competitor is the status quo: employees continuing to deal with the complexity of finding the information and people they need at work. In a typical sales process, potential customers often require Glean to first start with a pilot to demonstrate how much value implementing Glean can provide,” Arvind said. “Glean uses the user’s information to personalize the search experience for the user along several dimensions — for example, for the same query an engineer may see very different results than a sales executive. Glean also uses the user’s activity, such as clicks on search results, to improve the search relevance.”

Because Glean acts like a layer on top of all other apps a company uses, it can double as a work portal from which managers can create and share “shortlinks” to resources (e.g., “go/benefits” instead of a long URL). Management can also share news, handbooks, expense policies, KPI dashboards, and company OKRs and expose the company’s people directory, which shows who people are and what projects they’re working on.
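The shortlink idea is simple enough to sketch in a few lines of Python. This is only an illustration of the general pattern, not Glean’s implementation: the `SHORTLINKS` table, the `resolve` function, and the example URLs are all hypothetical.

```python
# Hypothetical table of shortlinks. Names and URLs are illustrative only.
SHORTLINKS = {
    "benefits": "https://intranet.example.com/hr/benefits/2022/overview",
    "expenses": "https://intranet.example.com/finance/policies/expenses",
}


def resolve(shortlink):
    """Map 'go/<name>' to its long URL, or return None if it is unknown."""
    prefix, _, name = shortlink.partition("/")
    if prefix != "go":
        return None
    return SHORTLINKS.get(name)


print(resolve("go/benefits"))
```

An employee types the short form, and the portal looks up the long URL on their behalf.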


In a dashboard, Glean surfaces answers to frequently asked questions and devotes a space to links and descriptions of those links that can be shared with the wider organization. A control panel allows users to run data loss prevention reports on the sources from which Glean draws and check for compliance with GDPR, CCPA, and other privacy regulations.

“Prospective customers are often anxious about providing Glean with access to all of their data, which is why Glean has spent so much time ensuring it respects all privacy controls from the applications that it integrates with, and invested heavily in security certifications and processes from the very beginning … When [a] user deletes a document in the underlying application (Slack, Drive, Office, etc.), the document gets deleted from the Glean system as well,” Arvind said. “Glean customers can choose to have Glean host them or self-host Glean to keep their information within their environment. With Glean’s e-discovery and data loss prevention tools, companies can be confident about what data is available within their organization — and how that information is used.”

No enterprise search tool is without limitations. In a 2021 survey by APQC, which provides benchmarks and best practices for businesses, 19% of workers said that poor search functionality is a key problem in their organizations. But there’s a healthy market for enterprise search solutions regardless. The same survey found that 41% of respondents expect to “significantly” increase investment in search and findability within their organizations in the next 12 to 18 months.

Glean, whose total capital raised stands at $155 million, plans to use the proceeds from the latest round to expand its team, build out a go-to-market plan, and “drive new feature innovation.” Glean has more than 100 employees today and expects to have over 250 by the end of the year.

“Increased value put on employee productivity and happiness has been a boon to Glean’s growth among fast-growing companies that care about employee experience,” Arvind said. “Glean provides value to prospective customers from the first minute they start searching, and Glean is constantly working through product development and customer enablement to ensure customers have the best experience from then on.”

Apollo GraphQL launches its Supergraph

The name kind of gives it away, but Apollo GraphQL has long focused on helping developers use the GraphQL query language for APIs to integrate data from a variety of services. Over the course of the last few years, it also worked with large enterprises to help them bring together data from a wide variety of sources into a single ‘supergraph,’ as the company likes to call it. Now, it is making these capabilities, which were previously the domain of large enterprises like Expedia, Walmart, and Zillow, available to anybody on its platform.

Apollo CEO and co-founder Geoff Schmidt wasn’t shy about what he thinks this announcement means when I talked to him ahead of today’s announcement. “We’ve been working on GraphQL since 2016, back when we were Meteor.js. But what we have to announce today is really why we built the company through all these years and through all these open-source projects,” he said. “It’s something that I think history will look at as being as big a deal as the database or the message bus or containerization — or maybe even the cloud itself.”

That’s a lot to live up to.

“The Supergraph is a whole new way to think about GraphQL and what it’s for and what it delivers,” Schmidt continued. “I think the key idea of the Supergraph is the graph of graphs. It’s how these individual graphs that people have been building come together into a new layer of the stack, a different way of building applications, something that is as significant for how we’re all going to use the stack in the future as the database was.”

Schmidt argues that as enterprises broke up their monolithic application architectures and moved to microservices, everything became so atomized that it now puts the burden on developers to piece everything back together when they want to build a new application on top of these systems.

At the core of the Supergraph are three projects. The first is the Apollo Router, a Rust-based runtime that processes GraphQL queries, plans and executes them across federated subgraphs, and returns the responses to the client. This router, the company says, is 10x faster than the old Apollo Gateway, which the company previously used for querying federated graphs. The second piece is a set of new capabilities for the free tier of Apollo Studio, the company’s tool for managing data sources. The free tier will now include schema checks to ensure a new schema won’t break any existing applications and a launch dashboard that provides visibility into the schema-checking and launch process, which was only available to enterprise users until now. And the third piece is Apollo Federation 2, which launched in April and allows users to compose their subgraphs into a single Supergraph.
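To give a sense of what a schema check guards against, here is a deliberately simplified sketch in Python. It is not Apollo Studio’s actual algorithm or API. It assumes a schema can be reduced to a mapping from type names to sets of field names, and the hypothetical `breaking_changes` helper only flags removed fields, one common kind of breaking change.

```python
def breaking_changes(old_schema, new_schema):
    """List 'Type.field' entries present in old_schema but missing from new_schema."""
    removed = []
    for type_name, fields in old_schema.items():
        new_fields = new_schema.get(type_name, set())
        for field in sorted(fields - new_fields):
            removed.append(f"{type_name}.{field}")
    return removed


# Hypothetical schemas: the new version drops Customer.email and adds Order.status.
old = {"Customer": {"id", "name", "email"}, "Order": {"id", "total"}}
new = {"Customer": {"id", "name"}, "Order": {"id", "total", "status"}}

# Adding a field is safe; removing one may break clients that still query it.
print(breaking_changes(old, new))
```

A real check would also consider type changes, nullability, and which fields clients actually use, but the principle is the same: compare the proposed schema with what existing applications depend on before launching it.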

Schmidt stressed that the company isn’t trying to replicate data lakes for analytical use cases here but to build a layer in the stack that allows developers to build new use cases.

“It’s not just how many pizzas that I sell, but can I order a pizza? You want to create something that’s almost like a virtual database — or a virtual server — that has objects that represent everything in a company: every customer, every product, every order, every like, every blog post — and you want to be able to ask questions like, ‘show me all the orders that this customer did,’ even though all that stuff lives in 1000 different services,” Schmidt explained.
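Schmidt’s example of one question answered from many services can be sketched roughly as follows. This is plain Python rather than GraphQL, and the two in-memory “services” and the `customer_with_orders` function are hypothetical stand-ins. It illustrates only the idea of a single entry point that gathers data from several owners, not how Apollo’s router plans queries.

```python
# Two hypothetical services, each owning part of the company's data.
CUSTOMER_SERVICE = {"c1": {"name": "Ada"}}
ORDER_SERVICE = [
    {"id": "o1", "customer_id": "c1", "item": "pizza"},
    {"id": "o2", "customer_id": "c2", "item": "salad"},
    {"id": "o3", "customer_id": "c1", "item": "soda"},
]


def customer_with_orders(customer_id):
    """Answer one question by combining results from both services."""
    customer = CUSTOMER_SERVICE.get(customer_id)
    if customer is None:
        return None
    orders = [o["item"] for o in ORDER_SERVICE if o["customer_id"] == customer_id]
    return {"name": customer["name"], "orders": orders}


print(customer_with_orders("c1"))
```

The caller asks about a customer and never needs to know that names and orders live in different places.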

It’ll be interesting to see if the Supergraph can live up to Apollo’s hype. The company’s GraphQL client, server and gateway are currently being downloaded more than 17 million times a month and the company says its products are being used in production by 30% of the Fortune 500. With the Supergraph, the company hopes to establish itself as a core part of the modern development stack.

QuestBook raises $8.3 million to help web3 developers secure funds

Blockchain and other web3 projects are racing to reach developers, hosting hackathons, handing out grants and offering other perks to lure those who can build. But they currently don’t have the bandwidth to review the large volume of applications they receive, which pushes away some of the same builders who can bring immense value to those projects.

QuestBook, a startup that is attempting to solve that and more, said today it has raised $8.3 million in its Series A funding as it looks to scale its efforts.

The financing round was led by Lemniscap and saw participation from scores of investors, including Coinbase Ventures, Alameda Research, Dragonfly, Hashed, Polygon, Balaji Srinivasan, Raj Gokal of Solana, Arjun Sethi of Tribe Capital and Maneesh Sharma of GitHub.

QuestBook operates an eponymous platform that allows firms to give grants to developers and invest in them in an efficient and more transparent way. It also screens projects using a number of factors such as a developer’s on-chain and GitHub history to shoulder much of the burden from various blockchain and web3 firms.

The startup, founded in May 2021, originally set out to help builders earn in crypto. But it soon realized, explained Madhavan Malolan, one of its co-founders, that there weren’t many developers who knew how to code in Solidity and Rust, the programming languages underpinning Ethereum and Solana blockchains, respectively.

There’s a large whitespace in the developer ecosystem today. Just as retail investors’ appetite slows down when crypto enters the bear cycle, many developers also start to explore other, more familiar opportunities.

But more importantly, there aren’t too many developers coding for web3 in the first place, a problem that can be solved with long-term incentives. Even as more than 34,000 new developers committed code for a web3 project in 2021, it’s still a tiny fraction of the global software engineering base, according to a recent report by Electric Capital, a venture firm that invests in web3 startups.

To kick things off, QuestBook began teaching developers how to code in these languages at no charge and also published over 100 tutorials for anyone to get started. The program quickly took off, said Malolan, and amassed over 18,000 developers. Since attracting a large enough base, QuestBook has been exploring ways to help developers secure funds to build their projects.

“That’s something we always kind of vibe with,” he said in an interview. “For example, I have never bought any crypto in my life. All the crypto that I have, I earned them by contributing to open source projects.”

“One of the things we saw during the late December break was that people were not looking at crypto as a full-time job. They were looking at it as a side-job. We wondered how we could give these people an opportunity to start earning on the side so that they can find financial mobility and then double down,” he said.

Nearly every blockchain, their foundations, or firms building atop these projects offer grants. They typically fund these projects through their own tokens, thereby increasing the value of their own digital assets if more developers build something viable on their platforms.

But as mutually beneficial as this transaction appears, it is confusing and overwhelming at times to identify the themes the variety of firms wish to back and the rate at which they are deploying the funds. Often it may take a developer months to just hear from the protocol, for instance.

And these firms need additional assistance in credential-gating the applications to remove potential bad players. Uniswap, a decentralized network on Ethereum credited with the emergence of the DeFi ecosystem on the mothership, learned this lesson the hard way.

The firm last year offered a no-strings-attached $20 million grant to developers, which quickly saw over 50% of the capital being swapped for a stablecoin.

QuestBook today works with the Ethereum Foundation, Polygon, Aave, Near and Harmony, and has helped them provide developers over $1.5 million in grant money. The startup, which currently doesn’t monetize its software, plans to increase this grant disbursement to $30 million to $50 million over the next two quarters, it said.

QuestBook has much larger ambitions beyond helping developers secure grants, Malolan said.

QuestBook’s tool is already helping firms create small funds across the globe so that local talent can find and invest in new projects. The investments are all transparent and if the developers think the individual running the show isn’t doing a good job, they can vote to have the person replaced.

“It’s minimal grants DAO with maximum community participation. I know we need to come up with a better name,” Malolan laughed. “The idea is that let us say you’re running a network or a protocol, and you’re able to see some innovation happening in a space or a region that you don’t have the expertise or bandwidth to evaluate, you delegate capital to others.”

QuestBook is also working to broaden the ways developers can find work opportunities.

“Right now, we are solving the capital allocation problem,” said Malolan. “What we are aiming for is to bring permissionless work to crypto. An Uber driver can tap a button and get the job and start earning. The same infrastructure is not available for developers. Of course to solve this, you will need strong credentialing, strong workflows, and capital. We are beginning to tackle the workflows.”

Malolan hesitated to talk much about it, saying the efforts are at an early stage, but shared that QuestBook has built a wallet of its own that does gasless transactions (meaning users won’t be required to pay transaction fees).

“Wallets we all use today are designed for DeFi. Since in our case, we are not transacting money, but just information, it made sense for us to build our own gasless wallet where it doesn’t cost people anything to save information. For the most part, developers will not even know, nor should they, that there’s a wallet in play,” he said.

Fable funds quest for accessibility-inclusive development with $10M A round

The importance of designing accessibility in software from the ground up has only been emphasized by the pandemic, and as a consequence Fable’s on-demand accessibility experts have proven their value many times over. The company has raised $10.5 million to scale up and pursue its goal to “make Inclusive Product Development the status quo,” as CEO Anwar Pillai put it.

Fable raised a $1.5 million seed round in the summer of 2020 (it was founded in 2018), in response to increasing demand for firsthand expertise in accessibility — essentially, people with disabilities and software experience who could be tapped to provide testing and advice to developers. If you’re designing your app to be usable by blind folks, you should probably have blind folks testing it, right? Fable makes that sort of thing easy.

But the point of the company isn’t just to provide a diverse group of testers and experts — it’s to ensure that accessibility can be on the roadmap at any company and any project from the start. The last couple years have driven that message home.

“With the onset of COVID-19 the physical world ground to a halt and everything went online. This put a spotlight on the importance of ensuring that everyone can access digital products and services,” said Pillai. “While this spotlight has helped bring awareness of the problem to more organizations, our mission has always been, and continues to be, to empower people with disabilities to participate, contribute, and shape society.”

The new funding round, led by Five Elms Capital with participation from Difference Partners, Disruption Ventures, and several angels, will of course help scale and improve the products they already offer. Companies like Microsoft, Shopify, Slack and Meta are already customers.

But now the company is also going to dip its toe into the world of corporate training.

“While we’ve seen an explosion in the practice of accessibility across organizations, knowledge and skillsets have not kept up,” Pillai said. “Our second product, Fable Upskill, was developed in response to overwhelming customer demand for accessibility training.”

Upskill will be “video based courses designed by accessibility experts specifically for each team,” including the actual products and processes the customer already uses. So not just a standard one-size-fits-all “How to do accessibility” video (we probably have enough of those already).

Of course, given Fable’s core strength of a widespread community of professionals with disabilities, the content will also put those experts and their voices front and center so the advice doesn’t come across as abstract. Upskill will be getting a shot in the arm with this $10 million infusion, so don’t be surprised if you see one of Fable’s videos in your own training materials sometime soon.

Koyeb is a serverless platform that integrates with your GitHub repository

Koyeb has evolved quite a lot since I first covered the startup. The company is still focused on serverless infrastructure. But it now offers a general purpose serverless platform that you can configure through a simple “git push” command or by using Docker containers.

The company’s serverless platform is now available as a public preview with a free tier to get started and try out the service — the free tier lets you run two nano apps on the platform. It has already been tested by 10,000 developers during the private beta phase. There are currently 3,000 applications running on Koyeb’s infrastructure.

Koyeb wants to abstract your server infrastructure as much as possible so that you can focus on development instead of system administration. You can use it to host a web app, an API or event-driven workloads.

Behind the scenes, the startup doesn’t use Kubernetes. Instead, it has built its own custom stack based on Firecracker microVMs, Nomad and Kuma. It runs on bare-metal servers with recent Intel and AMD chips.

There are two ways to deploy your apps to Koyeb. You can deploy from your git repository (currently limited to GitHub repositories) or from any public or private container registry. Koyeb has a web interface but also offers a command-line interface and an API.

When you deploy a new app, Koyeb gives your app a “.koyeb.app” subdomain and automatically secures the app with TLS. You can also configure your own domain name.

If you need more resources, you can easily scale your app from a slider. In that case, Koyeb launches your app on several new instances and traffic is automatically load balanced between those instances.
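The article doesn’t say how Koyeb distributes traffic between instances. As a rough illustration of the general idea only, here is a minimal round-robin sketch in Python; the class, the instance names and the choice of round-robin itself are all assumptions, not Koyeb’s implementation.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out instances in a fixed rotating order (illustrative only)."""

    def __init__(self, instances):
        if not instances:
            raise ValueError("at least one instance is required")
        self._instances = list(instances)
        self._rotation = cycle(self._instances)

    def next_instance(self):
        # Each call returns the next instance in the rotation.
        return next(self._rotation)


# Hypothetical instance names standing in for three copies of one app.
balancer = RoundRobinBalancer(["instance-a", "instance-b", "instance-c"])
assignments = [balancer.next_instance() for _ in range(6)]
print(assignments)
```

Six requests end up spread evenly, two per instance. Real load balancers typically also account for instance health and current load.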

All of this is transparent for the development team. Every time there’s a new git commit, Koyeb automatically starts building and deploying your app.

Koyeb plans to offer a global edge network. The service is currently live in one core location in Paris and 250 edge locations for native load balancing, TLS encryption and CDN-like caching. By the end of the year, your app will be simultaneously deployed to 10 core locations around the world.

It’s clear that Koyeb is still a work in progress. But it sounds like a promising start for lean development teams who don’t want to spend too much time on managing a cloud infrastructure.

Announcing the startups and judges onstage at TC Sessions: Mobility 2022

TechCrunch is excited to announce the six companies pitching in person and onstage at TC Sessions: Mobility 2022. Hailing from around the United States and the globe, founders will each pitch on the main stage for four minutes, followed by an intense Q&A with our expert panel of judges.

The judges for this pitch-off will be Yoon Choi (Muirwoods Ventures), Mar Hershenson (Pear VC) and Gabriel Scheer (Elemental Excelerator) on day one; and Sven Strohband (Khosla Ventures), Victoria Beasley (Prelude Ventures) and John Du (GM Ventures) on day two. You can find additional details on each of the judges below.

Alright, alright. I know you want to see who made the cut. Join us on Wednesday, May 18 and Thursday, May 19 to watch these incredible founders take the stage.

Startups pitching on the main stage

Day 1 — Wednesday May 18: 1:10 p.m.–1:45 p.m. PDT

Koop Technologies (Pittsburgh, PA, USA) — Presenter: Sergey Litvinenko, co-founder and CEO

“Koop Technologies is an insurance platform for autonomous vehicles and robotics. The Singularity Platform is essentially a combo of three tools that Koop built: Koop API, Portal By Koop, and Insurability Sufficiency Framework (ISF). Koop provides autonomy insurance through data collection and proprietary analysis, wrapped up in the UX/UI provided by the portal.”

Boston Materials (Billerica, MA, USA) — Presenter: Anvesh Gurijala, founder and CEO

“Boston Materials is a high-performance materials company enabling manufacturers of industrial and consumer products to break through their design trade-offs with new materials. The company’s patented Z-axis Fiber™ technology is a lightweight material that has an extraordinary ability to diffuse energy (whether from impact, heat or electrical surge, for example). It is produced from 100% reclaimed carbon fiber, enabling new, high-volume, energy-efficient products that have a low carbon footprint.”

Swyft Cities (Mountain View, CA) — Presenter: Jeral Poskey, CEO

“Swyft is a new form of urban mobility, using autonomous cabins on lightweight cable infrastructure to solve transportation problems in densely developed areas including corporate campuses, airports, universities and tourism districts. Swyft adds a new connection, increasing capacity to the site with an attractive alternative to automobiles. This adds capacity into an area, allowing higher density and more profitable developments. It also reduces costs on parking and traffic mitigation. In some areas, providing connections within the site can drive high value.”

Day 2 — Thursday May 19: 1:15 p.m.–1:45 p.m. PDT

Beyond Aero (Paris/Toulouse, France) — Presenter: Eloa Guillotin, co-founder and CEO

“Beyond Aero is making long range electric aircraft possible using hydrogen-electric propulsion. The first aircraft is a zero emission private aircraft (6-9 seat), designed for hydrogen propulsion, flying 1,000 miles in range.”

MeterFeeder (Pittsburgh, PA, USA) — Presenter: Jim Gibbs, co-founder and CEO

“MeterFeeder powers parking payments, data and management for individuals, fleets and municipalities. MeterFeeder allows individual users to pay simply with geolocation, and fleets to remain compliant and rapidly pay when needed. MeterFeeder provides backend software, enforcement devices and a payment platform that are cost-effective.”

DIMO (Brooklyn, NY, USA) – Presenter: Andy Chatham, co-founder

“DIMO enables a new class of mobility applications to be built by developers on real world data. DIMO is based on a network of drivers and fleets that collect and share their vehicle data to learn more about their vehicles, save money, and build better mobility applications.”

Expert panel of judges

Day 1

Yoon Choi — Muirwoods Ventures

“Yoon has been a venture investor and strategic partner to many Silicon Valley startups/founders for 18 years prior to Muirwoods. Yoon founded a seed fund, Forest Ventures, focusing on the automotive sector, and was an investment director at SAIC Capital, one of the leaders in China’s automotive industry. Before SAIC, she led the Corporate Venture Group at Maxim Integrated, where she led multiple strategic technology acquisitions and venture investments. Earlier in her career, Yoon was one of the founding members at Samsung Ventures.”

Mar Hershenson — Pear VC

“Mar Hershenson is a co-founder and Managing Partner at Pear VC, a seed-stage investment firm in Palo Alto backing companies like Guardant Health (NASDAQ: GH), Doordash (NYSE: DASH), Gusto, and Branch. Mar has been recognized in the Midas List of Top Tech Investors in 2021.” Mar is a successful serial entrepreneur, with numerous industry accolades. Mar received her Ph.D. in Electrical Engineering from Stanford University in 2000 for her breakthrough work in circuit design automation.

Gabriel Scheer — Elemental Excelerator

“Gabriel is the Director of Innovation, focused on mobility and energy, for Elemental Excelerator, a climatetech accelerator founded in 2009 in Hawaii. Previously, Gabriel was on the founding team at Lime, where he spent three years working on government relations, data policy, and transit partnerships globally. He has also worked at and consulted to Chariot, Zipcar, Superpedestrian, and Spin. In addition, he founded and ran a large environmental nonprofit and built a social innovation consultancy, as well as attempting to build a company to help small businesses pursue energy efficiency retrofits.”

Day 2

Victoria Beasley — Prelude Ventures

“Victoria is General Partner at Prelude Ventures, where her climate tech investments span mobility, food and agriculture, clean energy, sustainable apparel and carbon markets. Prior to Prelude Ventures, Victoria worked on climate change strategy at BCG and started an agriculture supply chain company. Earlier, she led Finance at a major solar manufacturer. Victoria holds an MBA and M.S. in Environment and Resources from Stanford University. She also holds a B.S. in Biology from the University of North Carolina at Chapel Hill where she was a Copland Scholar. Victoria lives in San Francisco with her husband and young son.”

Sven Strohband — Khosla Ventures

“Sven Strohband is a partner and managing director at Khosla Ventures, and led the firm’s early investments in Berkshire Grey, GitLab, Hermeus and Rocket Lab among others. An engineer at heart, Sven is passionate about technologies that forge new industries and accelerate novel user experiences.
Previously, he spent six years at Mohr Davidow Ventures, where he led technical diligence for the infrastructure IT and sustainability practices and worked with the portfolio to recruit technical talent, run product-market fit experiments and develop fundraising strategies.” Strohband also served as PM at Volkswagen, led Stanford racing’s autonomous car, Stanley, co-founded Metamind, and sits on the board of various companies. He holds a B.A. from Purdue University and a Ph.D. from Stanford University.

John Du — GM Ventures

John Du is a partner at GM Ventures and the chief technologist for GM China. Prior to this appointment, John was director of the General Motors Research & Development organization’s China Science Lab, which he led from its founding in 2009. He was responsible for building a strong and innovative research team and leading research and development in intelligent and connected vehicles, batteries, advanced materials and electrified propulsion systems. Prior to GM, Du held several positions at Intel Corp. starting in 1993, including leading the Intel network processor business expansion in China in 2001 and serving as director of the Intel China Research Center. John received his Ph.D. degree in Electrical Engineering from the Beijing Institute of Technology in 1989 and an Executive MBA from China Europe International Business School in 2010.

Tractian, which uses AI to monitor industrial equipment, raises $15M

Tractian, a startup developing a product to monitor the status of machines and electrical infrastructure, today announced that it closed a $15 million Series A funding round led by Next47, with participation from Y Combinator and others. The money will be put toward product development and expanding Tractian’s workforce and geographic footprint, according to co-founder and co-CEO Igor Marinelli, as well as ongoing customer acquisition efforts.

Founded in 2019, Tractian is the brainchild of Y Combinator alumni Marinelli and Gabriel Lameirinhas. Prior to starting Tractian, they worked at a paper manufacturer, International Paper, as software engineers, where Marinelli says they noticed how backwards the systems were for monitoring machinery health.

“Industrial managers of any kind need traceability of work orders, and need to know the health of their machines from kilometers away from the operations,” Marinelli said. “[W]ithout the proper combination of hardware and software, you can’t solve the industry’s real challenge.”

Tractian’s flagship product, which Marinelli says is patent pending in the U.S., uses AI to identify mechanical problems a machine might be having by analyzing its “rotational assets,” like motors, pumps and compressors. Tractian can spot signs of looseness, imbalance and misalignment from vibration and temperature anomalies measured by custom sensors, Marinelli claims, in addition to potential electrical failures.
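Tractian hasn’t published how its detection works, so the following is only a generic sketch of one common way to flag anomalous sensor readings: compare each new reading against a baseline of healthy samples and flag anything more than a few standard deviations away. The function name, the threshold and the sample vibration values are all made up for illustration.

```python
from statistics import mean, stdev


def find_anomalies(baseline, readings, threshold=3.0):
    """Return (index, value) pairs for readings far from the baseline.

    A reading is flagged when it lies more than `threshold` standard
    deviations from the mean of the baseline (healthy) samples.
    """
    if len(baseline) < 2:
        raise ValueError("baseline needs at least two samples")
    center = mean(baseline)
    spread = stdev(baseline)
    if spread == 0:
        raise ValueError("baseline has no variation")
    return [
        (i, value)
        for i, value in enumerate(readings)
        if abs(value - center) / spread > threshold
    ]


# Made-up vibration amplitudes (mm/s), for illustration only.
healthy = [2.0, 2.1, 1.9, 2.0, 2.2, 1.8, 2.0, 2.1]
incoming = [2.0, 2.1, 4.5, 1.9]
flagged = find_anomalies(healthy, incoming)
print(flagged)
```

Here only the third incoming reading (4.5) is flagged. A production system would presumably analyze much richer signals, such as vibration frequency spectra, rather than single amplitude values.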

Tractian provides sensors that attach to — and send data about — machines via 3G or 4G cellular networks. The company’s software provides checklist and inspection steps for each machine, plus diagnostics, recommendations, alerts and scheduling tools and inventories.


Monitoring equipment with Tractian. Image Credits: Tractian

Marinelli readily acknowledges that Tractian isn’t the first to the machine analytics space. Predictive maintenance technologies have been used for decades in jet engines and gas turbines, and companies including Samsara, Augury, Upkeep and MaintainX offer solutions with capabilities similar to Tractian. In April, Amazon threw its hat in the ring with the general launch of Lookout for Equipment, a service that ingests sensor data from a customer’s industrial equipment and then trains a machine learning model to predict early warning signs of machine failure.

In a sign of the segment’s competitiveness, Augury just this month acquired Seebo, a startup that provided manufacturing teams with the insights to optimize their industrial processes. Augury is one of the better-funded startups in the sector, having raised nearly $300 million in venture capital to date.

But both Marinelli and Lameirinhas sense opportunity in a market that could be worth $12.3 billion by 2025. In 2018, Gartner predicted that by 2022, spend on internet of things-enabled predictive maintenance would increase to $12.9 billion, up from $3.4 billion in 2018.

While Marinelli declined to elaborate when asked about the technical details of Tractian’s platform, including the accuracy of its algorithms, he noted that Tractian’s customer base of roughly 200 companies spans well-known brands like John Deere, Bosch, Embraer and Hyundai.

Looking ahead, the key for Tractian will be convincing would-be customers that its technology performs better than the rest. In a survey by McKinsey, analysts at the firm highlight the dangers of an under-performing predictive maintenance algorithm, claiming that one company saved over 10% on the breakdown of a piece of equipment using an algorithm but spent significantly more as a result of the algorithm’s high false-positive rate.
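To see how false positives can wipe out an algorithm’s savings, consider a back-of-the-envelope calculation. The figures below are hypothetical and not taken from the McKinsey survey; the function is just a sketch of the arithmetic.

```python
def net_savings(avoided_breakdown_cost, false_alarms, cost_per_false_alarm):
    """Savings from prevented breakdowns minus the cost of false alarms."""
    return avoided_breakdown_cost - false_alarms * cost_per_false_alarm


# Hypothetical figures, not taken from the McKinsey survey.
saved = 10_000     # breakdown costs avoided thanks to correct alerts
alarms = 40        # inspections triggered by alerts that proved false
cost_each = 500    # cost of each unnecessary inspection

result = net_savings(saved, alarms, cost_each)
print(result)  # -10000: false alarms cost twice what the algorithm saved
```

In this made-up scenario, 40 needless inspections at $500 apiece cost $20,000, turning a $10,000 saving into a $10,000 net loss.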

“[O]ur technology involves the same concept of Shazam, but for machines,” Marinelli said. “[The pandemic especially] increased the necessity of real time monitoring of assets because many operators [can’t] be physically working [near them] for long periods of time.”

In March, Tractian announced its expansion to North America, opening a new office in Mexico with a team dedicated to developing the company’s activities there. Tractian plans to follow up with market entry in Atlanta, Georgia later this year.

When reached for comment, Debjit Mukerji, a partner at Next47 who plans to join Tractian’s board of directors, said: “This is a critical space, the heartbeat of our economy. Next47 is thrilled to join Tractian on its mission to transform the maintenance experience for enterprises globally. Having followed this space for years, we concluded that frictionless deployment, intuitive user interfaces and a mobile/cloud-first approach are essential ingredients of success, particularly in the underserved medium enterprise segment. Tractian combines these in its extraordinary product vision and consistently delights its customers.”

Tractian currently has 100 employees and it expects to expand its headcount to 200 in the next 18 months. The company’s total capital raised stands at $19 million; Marinelli demurred when asked about the valuation.

With a fresh $46M, Instabug aims to do more than fix your app’s bugs

Instabug, a startup that aims to help mobile developers monitor, identify and fix bugs within apps, has raised $46 million in a Series B funding round led by Insight Partners.

The raise comes just over two years after the startup raised $5 million in a Series A round led by Accel, which doubled down on its investment in the latest financing. New backers Forgepoint Capital and Endeavor also put money in the round. The company declined to reveal its current valuation. The new capital brings its total equity raised to date to $54 million.

A lot has changed for Instabug, which has dual headquarters in Cairo, Egypt and San Francisco, since its last raise. For one, the company expanded its focus from bug and crash reporting to building out application performance monitoring software “to capture everything around mobile performance.”

The company was founded in 2013, first released its offering in beta in 2015, and publicly launched in February 2016 during its time at Y Combinator. Its recent decision to expand into monitoring was driven by the fact that consumers have higher expectations from the apps they’re using and companies need to be more proactive to prevent bugs from occurring in the first place, said Instabug CEO and co-founder Omar Gabr in an interview with TechCrunch. For example, he said, four years ago, if an app wasn’t crashing, that was considered good. But now, users increasingly expect apps to not only not crash, but be slick and run well.

“Our goal is to make sure that the developers and the engineering teams who are building apps have full visibility about how that app operates and is performing,” he told TechCrunch. “We want to give them the right tools so they can be proactive. For example, so they can see if an issue is happening, and understand what’s going on before a user is giving bad reviews or ranting on Twitter.” 

Instabug saw “record” growth in 2021, a year in which its ARR doubled and its number of enterprise customers grew tenfold, with new clients such as DoorDash, Verizon, Qualtrics, Porsche and Gojek joining existing ones such as Clubhouse, according to Gabr. Overall, Instabug counts “many” of the Fortune 500 companies and top 100 apps on the App Store as customers, he added.

Gabr went on to share that in 2021, the company’s software sat within 2.7 billion mobile devices, processed 110 billion mobile sessions (up at least 20x from 2020) and helped customers resolve 4.2 billion issues. 

Image Credits: Instabug

With so much adoption on the enterprise side, Instabug last year invested more on compliance and security so it would be able to onboard large organizations, including banks and telcos. Landing more enterprise clients has been one of the main drivers of the company’s revenue growth.

“Mobile in general has been around for 15 years, but with the pandemic, it became the primary way we interface with brands and services around us,” Gabr told TechCrunch. “Before, for large companies, mobile was nice to have but now, it’s a must have — a core product, not just a channel or marketing thing.”

When an app crashes, Instabug automatically reports the incident back to its customer. The company says it goes beyond crash reporting, though, to give mobile developers detailed information such as where the bugs are, how an app is performing generally and when it’s completely failing.

“There are so many alternatives to apps out there, and if users don’t like one, they’ll delete and use another,” Gabr said. “Companies can’t afford to be reactive to issues.”

Combining performance monitoring with crash reporting was the “missing piece of the puzzle,” Gabr said.

“The data we gather helps us make our product better,” he added. “Now they can get the full picture of every single interaction happening on the phone and if the customer is having a really good experience or a bad one.”

Instabug has a standard SaaS business model, charging companies that are building mobile apps an annual subscription fee based on how big the app is. For example, the product is free for a company whose app has fewer than 10,000 monthly active users. As a business grows and more sessions are conducted on its app, it pays more, according to Gabr. In other words, how much revenue Instabug makes is directly correlated to the number of sessions that are processed on its customers’ apps.

While the company was once profitable, it currently is not, as it has been doubling down on growth, according to Gabr, who says his team has been “really capital efficient.” It plans to put its new funding toward more hiring and product development. Presently, it has 190 employees, with a predominantly Cairo-based engineering team.

Ganesh Bell, managing director at Insight Partners, said in doing its due diligence, his firm concluded that Instabug was “developers’ most trusted solution for performance monitoring on the mobile app stack.”

“Because Instabug focuses on mobile platforms exclusively, we believe it has the best mobile-specific capabilities and comprehensive mobile platform and framework coverage,” Bell told TechCrunch. “Instabug is flexible enough to deploy on a single app or for testing in a development environment but can immediately scale to monitor even the largest global enterprises’ app portfolios.”