Document onboarding startup Flatfile nabs $50M from investors, including Workday

Data cleansing — prepping data for applications like predictive analytics — takes time. In fact, data scientists spend an estimated 60% of their time cleaning and organizing data, according to one recent survey. It’s not just time that’s lost. According to Experian, “dirty data” costs the average business 15% to 25% of their revenue and the U.S. economy $3 trillion annually.

On a mission to change things, Eric Crane and David Boskovic started Flatfile, a platform that automatically learns how imported data should be structured and cleaned. With customers like ClickUp, Square, AstraZeneca and Spotify, the startup is gearing up for its next growth phase, closing a $50 million Series B round that brings Flatfile’s total funding to $94.7 million.

Tiger Global led the Series B tranche with participation from GV (Google’s AI-focused fund) and Workday — the last of which no doubt saw the applicability of Flatfile’s data processing pipeline to its HR business. Scale Ventures and angel investors from Airtable, DocuSign, LinkedIn and Gainsight also contributed, Boskovic told TechCrunch in an email.

“Data exchange and onboarding the data of new customers in particular can take thousands of hours to complete as data is collected, cleaned and moved from one business to another,” Boskovic said. “Examples of this include clients sending bulk payments to a credit card company, or vendors sending supply chain updates to a food conglomerate. For large companies, data exchange can mean upwards of six months to prepare data causing delayed customer onboarding, cost overruns and lost clients … We envisioned a way to streamline the data exchange process to save them vast amounts of time and money.”

Crane and Boskovic created the tech behind Flatfile while at productivity startup Envoy, where they shared a mutual frustration with the many wasted hours spent manipulating and cleaning up the firm’s data. Through Flatfile, they sought specifically to address challenges in data onboarding, where the high variance across input files has historically made rule-based models ineffective.


Flatfile uses AI trained on over 25 billion “data decisions” to map and resolve schemas in files such as spreadsheets and CSVs. When the algorithms encounter an anomaly or a data type they can’t process automatically, they prompt customers to make a decision and then add that scenario to a database for future reference.
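
To make that loop concrete, here is a minimal Python sketch of the general idea: match incoming column headers to a target schema, fall back to a human decision when no match is confident, and remember that decision for next time. It is not Flatfile's implementation. Simple string similarity from the standard library stands in for the trained model, and the names `map_columns`, `ask_user`, `remembered` and the 0.8 threshold are all hypothetical.

```python
from difflib import SequenceMatcher


def normalize(header):
    """Lowercase a header and drop everything except letters and digits."""
    return "".join(ch for ch in header.lower() if ch.isalnum())


def similarity(a, b):
    """Similarity between two headers, from 0.0 to 1.0, after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def map_columns(headers, target_fields, remembered, ask_user, threshold=0.8):
    """Map each incoming header to a target field.

    remembered: dict of past human decisions (normalized header -> field).
    ask_user:   hypothetical callback standing in for prompting the customer.
    """
    mapping = {}
    for header in headers:
        key = normalize(header)
        # 1. Reuse an earlier human decision if one exists.
        if key in remembered:
            mapping[header] = remembered[key]
            continue
        # 2. Otherwise take the most similar target field, if it is close enough.
        best = max(target_fields, key=lambda field: similarity(header, field))
        if similarity(header, best) >= threshold:
            mapping[header] = best
        else:
            # 3. No confident match: ask a person and store the answer.
            choice = ask_user(header, target_fields)
            remembered[key] = choice
            mapping[header] = choice
    return mapping


if __name__ == "__main__":
    decisions = {}
    result = map_columns(
        ["First Name", "E-mail", "Tel"],
        ["first_name", "email", "phone"],
        decisions,
        ask_user=lambda header, fields: "phone",  # pretend the customer picked "phone"
    )
    print(result)
    print(decisions)
```

In this sketch only the "Tel" column triggers the prompt; a later import that contains the same header is resolved from the stored decision without asking again.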

Flatfile recently released a software development kit that will allow developers to build on top of Flatfile’s components to access import, match, merge and export functions. While the company continues to offer an out-of-the-box import workflow, the kit enables customers with more specific requirements to customize the experience, Boskovic said.

“It’s basically letting our customers get under the hood, allowing them to stitch together all the pieces required to move information between systems with maximum flexibility and at scale,” he added. “[The] platform enables companies to leverage their data sooner. It allows employees to focus on their core strengths and leave the dirty work to us. By eliminating the thousands of hours that companies consume ensuring that data is properly formatted for their system, Flatfile helps them get their products to market faster and at a substantial cost savings.”

Flatfile competes with incumbents like Textract, Amazon’s service that can automatically extract text and data from scanned documents, and Microsoft’s data onboarding tool Form Recognizer. Google offers its own data-extracting tools including Cloud Natural Language, which performs syntax, sentiment and entity analysis on existing files.

In any case, Boskovic says that the pandemic and economic downturn were huge growth opportunities for Flatfile — the pandemic because it led companies to migrate data to the cloud and the downturn because it put pressure on them to “prove their value faster.” Flatfile’s customer base stands at thousands of developers and 500 companies as well as several unnamed government organizations.

“Flatfile is in a strong position because it offers a comprehensive solution to a business critical challenge. While we had two years of runway left, we raised an opportunistic Series B to maximize on investor demand, [and now] we have four years of runway to continue improving our operations around customer feedback,” Boskovic said. “This investment will be used to expand and support Flatfile’s fastest growing segment: global enterprise companies. We have been rapidly growing over the last three quarters to reach about 75 employees, and we expect to continue this growth into the near future. Annual recurring revenue is over $5 million, and we project it will more than double over the next 12 months.”

Document onboarding startup Flatfile nabs $50M from investors, including Workday by Kyle Wiggers originally published on TechCrunch

LibreOffice begins charging Mac App Store users $8.99

LibreOffice, the popular open source document processing suite, has begun charging users who download the software through the Mac App Store a one-time fee of $8.99. First spotted by The Register, it’s an unexpected step for The Document Foundation (TDF) — the organization behind LibreOffice — which since its inception has made all versions of LibreOffice available at no charge.

In a blog post, Italo Vignoli, head of marketing and public relations at LibreOffice, said that the change was reflective of a “new marketing strategy” where TDF will focus on releasing free, community versions of LibreOffice while “ecosystem companies” develop “value-added” releases targeted at enterprise customers. The LibreOffice client on the Mac App Store falls into this latter category because it’s not based on the same source code as the base LibreOffice project, Vignoli says, and was maintained by U.K.-based software consultancy Collabora. (LibreOffice on the Mac App Store doesn’t include Java because external dependencies aren’t allowed on the store.)

The objective is to draw a clearer distinction between LibreOffice clients backed by professional services and community releases that are supported by volunteers, Vignoli added. “We are grateful to Collabora for having supported LibreOffice on Apple’s Mac App Stores for quite a long time,” Vignoli said. “The objective is to fulfill the needs of individual and enterprise users in a better way, although we know that the positive effects of the change will not be visible for some time. Educating enterprises about [free and open source software] is not a trivial task and we have just started our journey in this direction.”

It’s unclear whether Collabora or another developer will take charge of maintaining the version of LibreOffice hosted on the Mac App Store; The Register notes that Collabora previously charged $10 for “LibreOffice Vanilla” with three years of support. When contacted for comment, Collabora productivity general manager Michael Meeks implied that TDF would lead the charge going forward:

“Recently, TDF have got around to distributing LibreOffice themselves on the Mac App Store, and (it is to be hoped) should re-invest the proceeds in developing LibreOffice themselves — although there’s quite a complex picture around that,” he said via email. “In terms of financial arrangements, when we created LibreOffice Vanilla, we decided to donate 10% of our revenue to TDF to ensure that there was no financial loss from missed donations there. We also donated 10% of Collabora Office revenues, and when the Apple app-store reduced its commission rate, we increased that, too.”

In any case, Vignoli was quick to note that LibreOffice will remain free for macOS, just not through the Mac App Store. Users have to take the extra step of downloading the release from the LibreOffice website, forgoing App Store features such as automatic updates and account management.

The newly implemented charge might still strike some as consumer-hostile. But it costs $100 per year to publish apps on the App Store, with Apple taking a 30% cut of sales, and loads of open source projects have commercial flavors, as their licenses don’t prevent developers from creating paid apps. For example, the open source painting programs Paint.NET and Krita are available for free from the projects’ websites but charge for downloads through the Microsoft Store, with the proceeds going toward development and support.

LibreOffice begins charging Mac App Store users $8.99 by Kyle Wiggers originally published on TechCrunch

Hebbia raises $30M to launch an AI-powered document search tool

Hebbia, a startup developing AI-infused search tools, today announced that it raised $30 million in a Series A round led by Index Ventures with participation from Radical Ventures. Of note, among the investors were Yahoo! co-founder Jerry Yang (full disclosure: Yahoo! is TechCrunch’s parent company) and Raquel Urtasun, a former head of AI research at Uber.

CEO George Sivulka says that the new cash will be put toward building out Hebbia’s engineering team and “accelerating development” of its product platform, in addition to expanding its customer acquisition efforts into professional services industries.

When TechCrunch last wrote about Hebbia, the company — founded by a team of Stanford AI researchers — was applying AI techniques to create search and summarization tools that could make sense of specialized domain knowledge. One of them was a Chrome plugin called Ctrl-F, which upgraded Chrome’s search functionality to go beyond text pattern matching with natural language processing, highlighting useful information directly on pages.

Now, after something of a pivot, Hebbia is launching a new AI-powered product with an eye toward deep document analysis: a “neural” search engine. Launched today, it can look over billions of documents at once including PDFs, PowerPoints, spreadsheets and transcripts to return answers to questions like “Which are the largest acquisitions in the supply chain industry within the past five years?”

“With Hebbia, you bring your own data or you search a trusted … primary source repository of data we’ve already indexed for you: earnings transcripts, news, [meeting] minutes, SEC filings, recently passed legislation, scientific research and more,” a company spokesperson told TechCrunch. “[There’s more] trust and transparency around what corpus is informing your search results.”

The inspiration for the neural search engine came from Sivulka’s personal experience. During his doctoral research, Sivulka says that many of his friends, who worked in finance, had to scramble over thousands of documents in hundred-hour work weeks. AI, he thought, could solve this problem — or at least streamline some of the core processes involved.

It’s early days. But Sivulka says that Hebbia’s search engine is finding early traction among financial services firms, which are using it for due diligence and other steps across investment pipelines.

“Hebbia currently counts 20 paying enterprises as customers, including several of the world’s largest private equity firms, hedge funds, consultancies and government projects,” the spokesperson continued.

New York-based Hebbia, which has a 15-person workforce currently, expects to double its headcount by the end of the year.

Hebbia raises $30M to launch an AI-powered document search tool by Kyle Wiggers originally published on TechCrunch

PSPDFkit raises $116M, its first outside money; now nearly 1B people use apps powered by its collaboration, signing and markup tools

An under-the-radar, bootstrapped startup from Vienna, Austria — a hit with developers for technology that underpins user experience for some of the world’s most popular apps — is doubling down on momentum and announcing its first outside investment, in the form of a large growth round of funding.

PSPDFkit — which provides APIs and an SDK that developers use to power document processing features like e-signing, document viewing and editing, collaboration and much more — has raised €100 million ($116 million). The funding is coming from a single investor, Insight Partners.

PSPDFkit is already profitable, and it has been for a while, so this investment is about stepping up its pace of growth. It plans to use the investment to build more developer tools, make strategic acquisitions (co-founder and CEO Jonathan Rhyne is mum about what, except to say that they will expand the suite of useful tools that it provides), and, for the first time, make some concerted efforts in the areas of sales and marketing.

A lot of PSPDFkit’s growth to date, in fact, has been by word of mouth, a strategy that has gotten it very far up to now. Its customers include Dropbox, DocuSign, SAP, IBM, Volkswagen, Fabasoft, Wolters Kluwer Deutschland, and the European Patent Office, among a number of others that it works with under NDA.

Not every company is happy to admit just how much of its user experience and technology has been built by third parties, and that is the case with where and how PSPDFkit is used, too. But the fact remains that it’s quietly huge: altogether, PSPDFkit’s tech, by way of its APIs and SDKs, is now approaching 1 billion users in 150 countries.

Unsurprisingly, this traction has also meant that PSPDFkit has had a lot of acquisition interest over the years. Large technology companies that build productivity tools, already work with developers, and are very active in mobile apps and cloud services have all knocked on PSPDFkit’s door. Although the startup is not disclosing its valuation, you can guess that, given its size and profitable status, with this latest round, it’s definitely become a more expensive buy. (And if all goes to plan, it will become even more so.)

The story of PSPDFkit is an interesting one that mirrors a lot of how mobile development itself has grown up over the years.

Originally the company started out around 2011 as a framework and toolset created by Austrian engineer Peter Steinberger, who was already involved in the iOS developer community and could see a need in the market from a number of apps for document-manipulation features like e-signing, document editing, and document viewing. Apps were letting us turn a lot of things into virtual experiences, and paper was shaping up to be one of the first things to go.

That need turned out to be a classic use case for building that functionality and making it something that many others could access by way of SDKs and APIs: document manipulation tools — even something so basic as previewing a file that is contained in a cloud-based folder — are very hard to build from scratch, are not necessarily part of companies’ core businesses, yet are nevertheless central to how they work.

Steinberger’s framework thus became one of the early examples of how SDKs and APIs could be used to integrate different services and functionality into other apps. That basic principle has now been applied to a whole host of other features: embedded financial services, e.g., neobanks; embedded payments, e.g., Stripe and others; embedded communications, e.g., Twilio, Sinch, etc.; and so on.

(The name PSPDFkit combines “PS,” for Peter Steinberger’s initials; “PDF,” because PDFs were, and largely remain, the company’s focus; and “kit,” in reference to the SDK that it was.)

As mobile app creation and usage started to really take off, so did Steinberger’s framework, and soon Martin Schürrer joined him in building it. Rhyne — who was working in the U.S. as an attorney representing developers — then became their lawyer after meeting them at a developer event. Soon into that relationship, the three realized that not only was there a proper, growing business to be managed, but Steinberger and Schürrer had little interest in doing that. By 2014, Rhyne gave up practicing law and came on as a third co-founder, with Steinberger and Schürrer in Vienna, and Rhyne based out of North Carolina in the U.S.

With this round, Steinberger and Schürrer are stepping away from full-time roles but remain “significantly invested” in the business, while Rhyne is staying on as CEO.

Things evolved rapidly from there, in keeping with the meteoric rise of apps themselves.

Starting out on iOS, PSPDFkit today provides tools that can be used for building apps on Android and the web, and with Flutter and React Native. The strategy is to build out a bigger platform that handles multiple related functions developers might want for documents, and perhaps for the wider world of productivity.

Rhyne’s basic description of what PSPDFkit and its customers do today is “obsoleting paper.” But over time — not unlike how Stripe has progressed from its core feature, providing APIs to power payments, to a wider suite of services related to transactions — PSPDFkit sees an opportunity to do more, too, not least because the world’s expectations have also changed.

For example, Rhyne recalled how PSPDFkit put out a real-time collaboration platform in 2015, “but we were like, ‘Man, it’s not getting traction, people don’t really want to use this.’” Then Covid-19 happened, he continued: “Now every single customer is saying, ‘This looks great. Yeah, we want to use that…’ I think that we’re seeing a change in the way people interact with documents.”

That speaks to a lot of opportunity, both for the startup and for its investors.

“Software developers and engineers are on the cutting edge of work simply by the nature of their craft,” said Ryan Hinkle, MD at Insight Partners, in a statement. “How they work and collaborate should be on the cutting edge, too. With PSPDFKit’s software development kits and hosted solutions, the company is revolutionizing document processing for enterprises and the developers they task with keeping the company at the forefront of innovation. Insight is thrilled to play a role in the company’s growth journey.” Hinkle is joining the board with this round.