Arrikto expands its MLOps platform with Kubeflow as a Service

Arrikto’s mission is to enable data scientists to build and deploy their machine learning models faster. The company, which raised a $10 million Series A round in late 2020, is building its platform on top of Kubeflow, a cloud-native open-source project for machine learning operations that was originally developed by Google but is now mostly managed by the community. Until now, Arrikto’s main product was a self-managed distribution of Kubeflow (aptly named ‘Enterprise Kubeflow’) for enterprises that wanted to run it in their data centers or virtual private clouds. Today, the company is also launching a fully managed version of Kubeflow.

“Pushing ML models from experimentation all the way to production is incredibly complex,” Arrikto CEO and co-founder Constantinos Venetsanopoulos told me. “We see a few common reasons for this. Number one is data scientists are essentially not ops experts and ops people aren’t data scientists — and they don’t want to become data scientists. Second, we have seen an explosion of ML tools the last couple of years. They are extremely fragmented and they require a lot of integration. What we’re seeing is people struggling to stitch everything together. Both of those factors create a massive barrier to entry.”

Image Credits: Arrikto

With its fully managed Kubeflow, Arrikto aims to give businesses a platform that can help them accelerate their ML pipelines and free data scientists from having to worry about the infrastructure, while also allowing them to continue to use the tools they are already familiar with (think notebooks, TensorFlow, PyTorch, Hugging Face, etc.). “We want to break down the technical barrier that keeps most companies from deploying real machine learning capabilities,” said Venetsanopoulos.

With Kubeflow as a Service, the company argues, data scientists will get instant access to an end-to-end MLOps platform. It’s essentially Arrikto’s Enterprise Kubeflow with a lot of custom automation tooling layered on top to abstract away the details of the underlying Kubernetes platform.

For now, Arrikto’s managed service will only run on a single cloud, but in the long run, the plan is to support the three major cloud providers to ensure low latencies (and reduce the need to move lots of data between clouds).

Interestingly, Venetsanopoulos argues that the company’s biggest competitor right now isn’t other managed services like AWS’ SageMaker but businesses trying to build their own platforms by stitching together open-source tools.

“Kubeflow as a Service gives both data scientists and DevOps engineers the easiest way to use an MLOps platform on Kubernetes without having to request any infrastructure from their IT departments,” said Venetsanopoulos. “When an organization deploys Kubeflow in production – whether on-prem or in the cloud – Arrikto’s Kubeflow as a Service will turbocharge the process.”

The company, which now has about 60 employees, will continue to offer Enterprise Kubeflow in addition to this new fully managed service.

Baseten nabs $20M to make it easier to build machine learning-based applications

As the tech world inches closer to the idea of artificial general intelligence, we’re seeing another interesting theme emerging in the ongoing democratization of AI: a wave of startups building tech to make AI technologies more accessible to a wider range of users and organizations.

Today, one of these, Baseten — which is building tech to make it easier to incorporate machine learning into a business’s operations, production and processes without a need for specialized engineering knowledge — is announcing $20 million in funding and the official launch of its tools.

These include a client API and a library of pretrained models for deploying models built in TensorFlow, PyTorch or scikit-learn; the ability to build APIs to power your own applications; and the ability to create custom UIs for your applications from drag-and-drop components.

The company has been operating in a closed, private beta for about a year and has amassed an interesting group of customers so far, including Stanford University, the University of Sydney, Cockroach Labs and Patreon, among others, who use it for tasks such as automated abuse detection (through content moderation) and fraud prevention.

The $20 million is being discussed publicly for the first time now to coincide with the commercial launch. It was raised in two tranches, with equally notable names among the backers.

The seed was co-led by Greylock and South Park Commons Fund, with participation also from the AI Fund, Caffeinated Capital and individuals including Greg Brockman, co-founder and CTO of general intelligence startup OpenAI; Dylan Field, co-founder and CEO of Figma; Mustafa Suleyman, co-founder of DeepMind; and DJ Patil, ex-Chief Data Scientist of the United States.

Greylock also led the Series A, with participation from South Park Commons; early Stripe exec Lachy Groom; Dev Ittycheria, CEO of MongoDB; Jay Simons, ex-president of Atlassian, now at Bond; Jean-Denis Greze, CTO of Plaid; and Cristina Cordova, another former Stripe exec.

Tuhin Srivastava, Baseten’s co-founder and CEO, said in an interview that the funding will be used in part to bring on more technical and product people, and to ramp up its marketing and business development.

The issue that Baseten has identified and is trying to solve is a critical one in the evolution of AI: Machine learning tools are becoming ever more ubiquitous, thanks to cheaper computing power, better access to training models and a growing understanding of how and where they can be used. But one area where developers still need to make a major leap, and businesses still need to make big investments, is in actually adopting and integrating machine learning: there remains a wide body of technical knowledge that developers and data scientists need before they can integrate machine learning into their work.

“We were born out of the idea that machine learning will have a massive impact on the world, but it’s still difficult to extract value from machine learning models,” Srivastava said. Difficult, because developers and data scientists need to have specific knowledge of how to handle machine learning ops, as well as technical expertise to manage production at the back end and the front end, he said. “This is one reason why machine learning programs in businesses often actually have very little success: it takes too much effort to get them into production.”

This is something that Srivastava and his co-founders Amir Haghighat (CTO) and Philip Howes (Chief Scientist) experienced first-hand when they worked together at Gumroad. Haghighat, who was head of engineering, and Srivastava and Howes, who were data scientists, wanted to use machine learning at the payments company to help with fraud detection and content moderation, and realized that they needed to pick up a lot of extra full-stack engineering skills — or hire specialists — to build and integrate that machine learning along with all of the tooling needed to run it (e.g., notifications, and integrating that data into other tools to take action).

They built the systems — still in use, and screening “hundreds of millions of dollars of transactions” — but also picked up an idea in the process: others surely were facing the same issues they did, so why not work on a set of tools to help all of them and take away some of that work?

Today, the main customers of Baseten — a reference to base ten blocks, often used to help younger students learn the basics of mathematics (“It humanizes the numbers system, and we wanted to make machine learning less abstract, too,” said the CEO) — are developers and data scientists who may be adopting existing machine learning models, or even building their own, but lack the skills to practically incorporate them into their own production flows. There, Baseten is part of a bigger group of emerging companies building “MLOps” solutions — full sets of tools to make machine learning more accessible and usable by those working in devops and product. These include Databricks, Clear, Gathr and more. The idea here is to give technical people more power and more time to work on other tasks.

“Baseten gets the process of tool-building out of the way so we can focus on our key skills: modeling, measurement and problem solving,” said Nikhil Harithras, senior machine learning engineer at Patreon, in a statement. Patreon is using Baseten to help run an image classification system, used to find content that violates its community guidelines.

Over time, there is a logical step that Baseten could make, continuing on its democratization trajectory: building tools for non-technical audiences as well — an interesting idea in light of the many no-code and low-code products being rolled out to give those users the power to build their own data science applications.

“Non-technical audiences are not something we focus on today, but that is the evolution,” Srivastava said. “The highest level goal is to accelerate the impact of machine learning.”

Strong Compute wants to speed up your ML model training

Training neural networks takes a lot of time, even with the fastest and costliest accelerators on the market. It’s maybe no surprise then that a number of startups are looking at how to speed up the process at the software level and remove some of the current bottlenecks in the training process. For Strong Compute, a Sydney, Australia-based startup that was recently accepted into Y Combinator’s Winter ’22 class, it’s all about removing these inefficiencies in the training process. By doing so, the team argues that it can speed up the training process by 100x or more.

“PyTorch is beautiful and so is TensorFlow. These toolkits are amazing, but the simplicity they have — and the ease of implementation they have — comes at the cost of things being inefficient under the hood,” said Strong Compute CEO and founder Ben Sand, who previously co-founded AR company Meta (before Facebook used that name).

While there are companies that focus on optimizing the models themselves (and Strong Compute will do that, too, if its customers request it), Sand noted that this approach may compromise the results. What the team focuses on instead is everything around the model. That may mean fixing a slow data pipeline or pre-computing a lot of values before the training begins. Sand also noted that the company has optimized some of the often-used libraries for data augmentation.
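The pre-computation idea is easy to illustrate in the abstract: if a preprocessing step is deterministic, you can pay its cost once before the training loop instead of on every epoch. The sketch below is a generic toy example of that trade-off, not Strong Compute’s actual tooling, and the simulated transform cost is made up:

```python
import time

def expensive_transform(sample):
    # Stand-in for deterministic preprocessing (decoding, resizing, normalizing).
    time.sleep(0.001)
    return sample * 2

dataset = list(range(100))

# Naive: the transform runs again on every pass over the data.
def train_naive(epochs=3):
    for _ in range(epochs):
        for sample in dataset:
            _ = expensive_transform(sample)

# Precomputed: pay the preprocessing cost once, before training starts.
def train_precomputed(epochs=3):
    cache = [expensive_transform(s) for s in dataset]  # one-time cost
    for _ in range(epochs):
        for sample in cache:
            _ = sample  # the training step consumes ready-made values

start = time.perf_counter(); train_naive(); naive_t = time.perf_counter() - start
start = time.perf_counter(); train_precomputed(); pre_t = time.perf_counter() - start
print(f"naive: {naive_t:.2f}s, precomputed: {pre_t:.2f}s")
```

With three epochs, the naive loop runs the transform 300 times while the cached version runs it 100 times, so the speedup here approaches the epoch count. In real pipelines the win comes from moving decoding, augmentation and I/O out of the hot loop.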

The company also recently hired Richard Pruss, a former Cisco principal engineer, to focus on removing networking bottlenecks in the training pipeline, which can quickly add up to a lot of latency. But, of course, the hardware, too, can make a lot of difference, so Strong Compute works with its customers to run models on the right platform, too.

“Strong Compute took our core algorithm training from thirty hours to five minutes, training hundreds of terabytes of data,” said Miles Penn, the CEO of MTailor, which specializes in creating custom clothes for its online clients. “Deep learning engineers are probably the most precious resource on this planet, and Strong Compute has enabled ours to be 10x more productive. Iteration and experimentation time is the most important lever for ML productivity, and we were lost without Strong Compute.”

Sand argues that the large cloud providers don’t really have any incentive to do what his company does, given that their business model relies on people using their machines for as long as possible, something Y Combinator managing director Michael Seibel agrees with. “Strong Compute is aimed at a serious incentive misalignment in cloud computing, where faster results that are valued by clients are less profitable for providers,” Seibel said.

Image Credits: Strong Compute’s Ben Sand (left) and Richard Pruss (right).

Currently, the team still provides white-glove service to its customers, though developers shouldn’t notice too much of a difference since integrating its optimizations should not really change their workflow. The promise Strong Compute makes here is that it can “10x your dev cycles.” Looking ahead, the idea is to automate as much of the process as possible.

“AI companies can keep their focus on their customer, data and core algorithm, which is where their core IP and value lies, leaving all the configuration and operations work to Strong Compute,” said Sand. “This not only gives them the rapid iteration they need for success, it critically makes sure that their developers are only focused on work that is adding value for the company. Today they are spending up to two-thirds of their time on complex system administration work (‘ML Ops’), which is largely generic across AI companies and often outside their area of expertise. It makes no sense for that to be in house.”

Bonus: here’s a video of our own Lucas Matney trying out the Meta 2 AR headset from Sand’s last company back in 2016.

Jina AI raises $30M for its neural search platform

Berlin-based Jina AI, an open-source startup that uses neural search to help its users find information in their unstructured data (including videos and images), today announced that it has raised a $30 million Series A funding round led by Canaan Partners. New investor Mango Capital, as well as existing investors GGV Capital, SAP.iO and Yunqi Partners, also participated in this round, which brings the company’s total funding to $39 million to date.

CEO and co-founder Han Xiao, who co-founded the company together with Nan Wang and Bing He, explained that the idea behind neural search is to use deep learning neural networks to go beyond traditional keyword-based search tools. Making use of relatively new machine learning technologies like transfer learning and representation learning, the company’s core Jina framework can help developers quickly build search tools for their specific use cases.

“Given an image, audio, video or whatever — we first use deep neural networks to translate this data format into a universal representation,” Xiao explained. “In this case, it’s mostly a mathematic vector — 100-dimensional vectors. And then, the matching [algorithm] does not count how many letters match but counts the mathematical distance, the vector distance between these two vectors. In this way, you can basically use this kind of methodology to solve all kinds of data search problems or relevance problems.”
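The approach Xiao describes — embed everything into a shared vector space, then rank by vector distance instead of keyword overlap — can be sketched in a few lines. The tiny three-dimensional “embeddings” below are made-up stand-ins for what a real neural network would produce, and this is a generic illustration, not Jina’s actual API:

```python
import numpy as np

def cosine_distance(a, b):
    # Smaller distance means the two items are more similar.
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Pretend a neural network already mapped each indexed item to a vector.
index = {
    "cat photo": np.array([0.9, 0.1, 0.0]),
    "dog photo": np.array([0.8, 0.3, 0.1]),
    "tax form":  np.array([0.0, 0.1, 0.95]),
}

def search(query_vec, index, k=2):
    # Rank every indexed item by its distance to the query vector.
    ranked = sorted(index, key=lambda name: cosine_distance(query_vec, index[name]))
    return ranked[:k]

query = np.array([0.85, 0.2, 0.05])  # e.g. the embedding of a pet image
print(search(query, index))  # both pet photos rank above the tax form
```

Because the query and the indexed items live in the same vector space, the same ranking code works whether the underlying data is text, images, audio or video; only the network that produces the vectors changes.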

Xiao described Jina as akin to TensorFlow for search (with TensorFlow being Google’s open-source machine learning framework). Just like TensorFlow or PyTorch defined the design pattern of how people design AI systems, Jina wants to define how people build neural search systems — and become the de-facto standard for doing so in the process.

But Jina is only one of the company’s current set of products. It also offers the Jina Hub, a marketplace that allows developers to share and discover the building blocks for Jina-based neural search applications, as well as the recently launched Finetuner, a tool for fine-tuning any deep neural network.

“Over the last 18 months, we spent a lot of effort on building the core infrastructure, on building the foundation of this big neural search tower — and that part is already done,” Xiao said. “And now we are slowly building the first floor, the second floor of this big building — and we try to provide an end-to-end development experience.”

Image Credits:

The company says the Jina AI developer community currently counts about 1,000 users, with applications that range from a video game developer that uses it to auto-fill relevant game assets in the right-click menu of its game editor to a legal-tech startup that uses it to enable its chatbot to provide a Q&A experience drawing on data from PDF documents.

The open-source Jina framework has attracted almost 200 external contributors since its launch in May 2020, and the company also hosts an active Slack community around the project.

“The reason we are doing open source is mostly the velocity of open source — and I believe the velocity of development is a key factor in the success of a software project. A lot of software just dies because this velocity goes to zero,” Xiao said. “We are building the community and we are leveraging the community to gather feedback to iterate fast. And this is super important for infrastructure software like us. You need all these top-tier developers to give you feedback about the usability, accessibility and so on in order to improve it quickly.”

Jina AI plans to use the new funding to double its team and especially to expand its operations in North America. With this expanded team, the company plans to invest in R&D to expand the overall Jina ecosystem and launch new tools and services around it.

“Traditional search systems built for textual data don’t work in a world brimming with images, video, and other multimedia. Jina AI is moving companies from black and white into color, unlocking unstructured data in a way that’s fast, scalable, and data-agnostic,” said Canaan Partners’ Joydeep Bhattacharyya. “The early applications of its open-source framework already show glimmers of the future, with neural search underpinning opportunities to improve decision-making, refine operations and even create new revenue streams.”

Microsoft Azure launches enterprise support for PyTorch

Microsoft today announced PyTorch Enterprise, a new Azure service that provides developers with additional support when using PyTorch on Azure. It’s basically Microsoft’s commercial support offering for PyTorch.

PyTorch is a Python-centric open-source machine learning framework with a focus on computer vision and natural language processing. It was originally developed by Facebook and is, at least to some degree, comparable to Google’s popular TensorFlow framework.

Frank X. Shaw, Microsoft’s corporate VP for communications, described the new PyTorch Enterprise service as providing developers with “a more reliable production experience for organizations using PyTorch in their data sciences work.”

With PyTorch Enterprise, members of Microsoft’s Premier and Unified support programs will get benefits like prioritized requests, hands-on support and solutions for hotfixes, bugs and security patches, Shaw explained. Every year, Microsoft will also select one PyTorch release for long-term support.

Azure already made it relatively easy to use PyTorch, and Microsoft has long invested in the library by, for example, taking over the development of PyTorch for Windows last year. As Microsoft noted in today’s announcement, the latest release of PyTorch will be integrated with Azure Machine Learning, and the company promises to feed the PyTorch code it develops back into the public PyTorch distribution.

Enterprise support will be available for PyTorch version 1.8.1 and up on Windows 10 and a number of popular Linux distributions.

“This new enterprise-level offering by Microsoft closes an important gap. PyTorch gives our researchers unprecedented flexibility in designing their models and running their experiments,” said Jeremy Jancsary, Senior Principal Research Scientist at Nuance. “Serving these models in production, however, can be a challenge. The direct involvement of Microsoft lets us deploy new versions of PyTorch to Azure with confidence.”

With this new offering, Microsoft is taking a page out of the open-source monetization playbook for startups by offering additional services on top of an open-source project. Since PyTorch wasn’t developed by a startup, only to have a major cloud provider then offer its own commercial version on top of the open-source code, this feels like a rather uncontroversial move.


SambaNova raises $676M at a $5.1B valuation to double down on cloud-based AI software for enterprises

Artificial intelligence technology holds a huge amount of promise for enterprises — as a tool to process and understand their data more efficiently; as a way to leapfrog into new kinds of services and products; and as a critical stepping stone into whatever the future might hold for their businesses. But the problem for many enterprises is that they are not tech businesses at their cores and so bringing on and using AI will typically involve a lot of heavy lifting. Today, one of the startups building AI services is announcing a big round of funding to help bridge that gap.

SambaNova — a startup building AI hardware and the integrated systems that run on it, which only officially came out of three years in stealth last December — is announcing a huge round of funding today to take its business out into the world. The company has closed on $676 million in financing, a Series D that co-founder and CEO Rodrigo Liang has confirmed values the company at $5.1 billion.

The round is being led by SoftBank, which is making the investment via Vision Fund 2. Temasek and the Government of Singapore Investment Corp. (GIC), both new investors, are also participating, along with previous backers BlackRock, Intel Capital, GV (formerly Google Ventures), Walden International and WRVI, among other unnamed investors. (Sidenote: BlackRock and Temasek separately kicked off an investment partnership yesterday, although it’s not clear if this falls into that remit.)

Co-founded by two Stanford professors, Kunle Olukotun and Chris Ré, and Liang, who had been an engineering executive at Oracle, SambaNova has been around since 2017 and has raised more than $1 billion to date — both to build out its AI-focused hardware, which it calls DataScale, and to build out the system that runs on it. (The “Samba” in the name is a reference to Liang’s Brazilian heritage, he said, but also to the Latino music and dance that speak of constant movement and shifting, not unlike the journey AI data regularly needs to take that makes it too complicated and too intensive to run on more traditional systems.)

SambaNova on one level competes for enterprise business against companies like Nvidia, Cerebras Systems and Graphcore — another startup in the space which earlier this year also raised a significant round. However, SambaNova has also taken a slightly different approach to the AI challenge.

In December, the startup launched Dataflow-as-a-service, an on-demand, subscription-based way for enterprises to tap into SambaNova’s AI system, with the focus just on the applications that run on it and without needing to maintain those systems themselves. It’s this subscription service that SambaNova will be focusing on selling and delivering with this latest tranche of funding, Liang said.

SambaNova’s opportunity, Liang believes, lies in selling software-based AI systems to enterprises that are keen to adopt more AI into their business, but might lack the talent and other resources to do so if it requires running and maintaining large systems.

“The market right now has a lot of interest in AI. They are finding they have to transition to this way of competing, and it’s no longer acceptable not to be considering it,” said Liang in an interview.

The problem, he said, is that most AI companies “want to talk chips,” yet many would-be customers will lack the teams and appetite to essentially become technology companies to run those services. “Rather than you coming in and thinking about how to hire scientists and then deploy an AI service, you can now subscribe and bring in that technology overnight. We’re very proud that our technology is pushing the envelope on cases in the industry.”

To be clear, a company will still need data scientists, just not the same number, and specifically not the same number dedicating their time to maintaining systems, updating code and other more incremental work that comes with managing an end-to-end process.

SambaNova has not disclosed many customers so far in the work that it has done — the two reference names it provided to me are both research labs, the Argonne National Laboratory and the Lawrence Livermore National Laboratory — but Liang noted some typical use cases.

One was in imaging, such as in the healthcare industry, where the company’s technology is being used to help train systems based on high-resolution imagery, along with other healthcare-related work. The coincidentally-named Corona supercomputer at the Livermore Lab (it was named after the 2014 lunar eclipse, not the dark cloud of a pandemic that we’re currently living through) is using SambaNova’s technology to help run calculations related to some Covid-19 therapeutic and antiviral compound research, Marshall Choy, the company’s VP of product, told me.

Another set of applications involves building systems around custom language models, for example in specific industries like finance, to process data quicker. And a third is in recommendation algorithms, something that appears in most digital services and frankly could always stand to work a little better than it does today. I’m guessing that in the coming months it will release more information about where and who is using its technology.

Liang also would not comment on whether Google and Intel were specifically tapping SambaNova as a partner in their own AI services, but he didn’t rule out the prospect of partnering to go to market. Indeed, both have strong enterprise businesses that span well beyond technology companies, and so working with a third party that is helping to make even their own AI cores more accessible could be an interesting prospect, and SambaNova’s DataScale (and the Dataflow-as-a-service system) both work using input from frameworks like PyTorch and TensorFlow, so there is a level of integration already there.

“We’re quite comfortable in collaborating with others in this space,” Liang said. “We think the market will be large and will start segmenting. The opportunity for us is in being able to take hold of some of the hardest problems in a much simpler way on their behalf. That is a very valuable proposition.”

The promise of creating a more accessible AI for businesses is one that has eluded quite a few companies to date, so the prospect of finally cracking that nut is one that appeals to investors.

“SambaNova has created a leading systems architecture that is flexible, efficient and scalable. This provides a holistic software and hardware solution for customers and alleviates the additional complexity driven by single technology component solutions,” said Deep Nishar, Senior Managing Partner at SoftBank Investment Advisers, in a statement. “We are excited to partner with Rodrigo and the SambaNova team to support their mission of bringing advanced AI solutions to organizations globally.”

AWS launches Trainium, its new custom ML training chip

At its annual re:Invent developer conference, AWS today announced the launch of AWS Trainium, the company’s next-gen custom chip dedicated to training machine learning models. The company promises that it can offer higher performance than any of its competitors in the cloud, with support for TensorFlow, PyTorch and MXNet.

It will be available as EC2 instances and inside Amazon SageMaker, the company’s machine learning platform.

New instances based on these custom chips will launch next year.

The main arguments for these custom chips are speed and cost. AWS promises 30% higher throughput and 45% lower cost-per-inference compared to the standard AWS GPU instances.

In addition, AWS is partnering with Intel to launch Habana Gaudi-based EC2 instances for machine learning training. Coming next year, these instances promise to offer up to 40% better price/performance compared to the current set of GPU-based EC2 instances for machine learning. These chips will support TensorFlow and PyTorch.

These new chips will make their debut in the AWS cloud in the first half of 2021.

Both of these new offerings complement AWS Inferentia, which the company launched at last year’s re:Invent. Inferentia, which also uses a custom chip, is the inferencing counterpart to these machine learning training offerings.

Trainium, it’s worth noting, will use the same SDK as Inferentia.

“While Inferentia addressed the cost of inference, which constitutes up to 90% of ML infrastructure costs, many development teams are also limited by fixed ML training budgets,” the AWS team writes. “This puts a cap on the scope and frequency of training needed to improve their models and applications. AWS Trainium addresses this challenge by providing the highest performance and lowest cost for ML training in the cloud. With both Trainium and Inferentia, customers will have an end-to-end flow of ML compute from scaling training workloads to deploying accelerated inference.”

Google updates Android Studio with better TensorFlow Lite support and a new database inspector

Google launched version 4.1 of Android Studio, its IDE for developing Android apps, into its stable channel today. As usual for Android Studio, the minor uptick in version numbers doesn’t quite do the update justice. It includes a vast number of new and improved features that should make life a little bit easier for Android developers. The team also fixed a whopping 2370 bugs during this release cycle and closed 275 public issues.

Image Credits: Google

The highlights of today’s release are a new database inspector and better support for on-device machine learning, which lets developers bring TensorFlow Lite models to Android, as well as the ability to run the Android Emulator right inside of Android Studio and support for testing apps for foldable phones in the emulator. That’s in addition to various other changes the company has outlined here.

The one feature that will likely improve the quality of life for developers the most is the ability to run the Android Emulator right in Android Studio. That’s something the company announced earlier this summer, so it’s not a major surprise, but it’s a nice update for developers since they won’t have to switch back and forth between different windows and tools to test their apps.

Talking about testing, the other update is support for foldable devices in the Android Emulator, which now allows developers to simulate the hinge angle sensor and posture changes so their apps can react accordingly. That’s still a niche market, obviously, but more and more developers are now aiming to offer apps that actually support these devices.

Image Credits: Google

Also new is improved support for TensorFlow Lite models in Android Studio, so that developers can bring those models to their apps, as well as a new database inspector that helps developers get easier insights into their queries and the data they return — and that lets them modify values while running their apps to see how their apps react to those changes.

Other updates include new templates in the New Project dialog that support Google’s Material Design Components, Dagger navigation support, System Trace UI improvements and new profilers to help developers optimize their apps’ performance and memory usage.

Hailo challenges Intel and Google with its new AI modules for edge devices

Hailo, a Tel Aviv-based startup best known for its high-performance AI chips, today announced the launch of its M.2 and Mini PCIe AI acceleration modules. Based around its Hailo-8 chip, these new modules are meant to be used in edge devices for anything from smart city and smart home solutions to industrial applications.

Today’s announcement comes about half a year after the company announced a $60 million Series B funding round. At the time, Hailo said it was raising those new funds to roll out its new AI chips, and with today’s announcement, it’s making good on this promise. In total, the company has now raised $88 million.

“Manufacturers across industries understand how crucial it is to integrate AI capabilities into their edge devices. Simply put, solutions without AI can no longer compete,” said Orr Danon, CEO of Hailo, in today’s announcement. “Our new Hailo-8 M.2 and Mini PCIe modules will empower companies worldwide to create new powerful, cost-efficient, innovative AI-based products with a short time-to-market – while staying within the systems’ thermal constraints. The high efficiency and top performance of Hailo’s modules are a true gamechanger for the edge market.”

Image Credits: Hailo

Developers can still use frameworks like TensorFlow and ONNX to build their models, and Hailo’s Dataflow compiler will handle the rest. One thing that makes Hailo’s chips different is its architecture, which allows it to automatically adapt to the needs of the neural network running on it.

Hailo is not shy about comparing its solution to that of heavyweights like Intel, Google and Nvidia. With 26 tera-operations per second (TOPS) and power efficiency of 3 TOPS/W, the company claims its edge modules can analyze significantly more frames per second than Intel’s Myriad-X and Google’s Edge TPU modules — all while also being far more energy efficient.
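As a quick back-of-the-envelope check using only the figures quoted above, the rated throughput and efficiency imply a power draw of roughly 8.7 watts for the module:

```python
# Implied power draw from Hailo's quoted figures:
# 26 TOPS of throughput at an efficiency of 3 TOPS/W.
throughput_tops = 26.0          # tera-operations per second
efficiency_tops_per_watt = 3.0  # TOPS per watt

implied_power_watts = throughput_tops / efficiency_tops_per_watt
print(f"Implied power draw: {implied_power_watts:.1f} W")
```

That single-digit wattage, rather than raw TOPS alone, is what makes the comparison to larger accelerators meaningful for thermally constrained edge devices.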

Image Credits: Hailo

The company is already working with Foxconn to integrate the M.2 module into its “BOXiedge” edge computing platform. Because it’s just a standard M.2 module, Foxconn was able to integrate it without any rework. Using the Hailo-8 M.2 solution, this edge computing server can process 20 camera streams at the same time.

“Hailo’s M.2 and Mini PCIe modules, together with the high-performance Hailo-8 AI chip, will allow many rapidly evolving industries to adopt advanced technologies in a very short time, ushering in a new generation of high performance, low power, and smarter AI-based solutions,” said Dr. Gene Liu, VP of Semiconductor Subgroup at Foxconn Technology Group.

The AI stack that’s changing retail personalization

Consumer expectations are higher than ever as a new generation of shoppers looks to buy experiences rather than commodities. They expect instant, highly tailored (pun intended?) customer service and recommendations across any retail channel.

To stay ahead, brands and retailers are turning to image recognition and machine learning startups to understand, at a very deep level, each consumer's current context and personal preferences and how they evolve. But while brands and retailers are sitting on enormous amounts of data, only a handful are actually leveraging it to its full potential.

To provide hyper-personalization in real time, a brand needs a deep understanding of its products and customer data. Imagine a case where a shopper is browsing the website for an edgy dress: the brand can recognize the shopper's context and preferences across attributes like style, fit, occasion and color, then use this information implicitly when fetching similar dresses for the user.

Another situation is where the shopper searches for clothes inspired by their favorite fashion bloggers or Instagram influencers using images in place of text search. This would shorten product discovery time and help the brand build a hyper-personalized experience which the customer then rewards with loyalty.
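A minimal sketch of how such image-based search typically works: each product image is reduced to a feature vector (an embedding), and "visually similar" items are those whose vectors lie closest together. The catalog and embedding values below are made-up illustrations; in a real system they would be produced by a trained vision model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical 4-dimensional embeddings for catalog items
# (real embeddings would come from a trained vision model).
catalog = {
    "edgy black dress":    [0.9, 0.1, 0.8, 0.2],
    "floral summer dress": [0.1, 0.9, 0.2, 0.7],
    "leather jacket":      [0.8, 0.2, 0.9, 0.1],
}

def most_similar(query_embedding, catalog):
    """Return the catalog item visually closest to the query image."""
    return max(catalog,
               key=lambda name: cosine_similarity(query_embedding, catalog[name]))

# Made-up embedding of an influencer photo the shopper uploaded.
query = [0.85, 0.15, 0.75, 0.25]
print(most_similar(query, catalog))  # "edgy black dress"
```

The same nearest-neighbor idea, scaled up with approximate search indexes, is what lets a shopper search a catalog with a photo instead of a text query.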

With the sheer number of products being sold online, shoppers primarily discover products through category or search-based navigation. However, inconsistencies in product metadata created by vendors or merchandisers lead to poor recall of products and broken search experiences. This is where image recognition and machine learning can help: by deeply analyzing enormous data sets and the vast assortment of visual features in a product, they can automatically extract labels from product images and improve the accuracy of search results.

Why is image recognition better than ever before?

While computer vision has been around for decades, it has recently become far more powerful, thanks to the rise of deep neural networks. Traditional vision techniques laid the foundation for detecting edges, corners, colors and objects in input images, but they required humans to engineer the features the algorithms looked for. These traditional algorithms also struggled to cope with changes in illumination, viewpoint, scale, image quality and so on.

Deep learning, on the other hand, uses massive training data sets and far greater computation power to extract features from unstructured data and learn without human intervention. Inspired by the biological structure of the human brain, deep learning uses neural networks to analyze patterns and find correlations in unstructured data such as images, audio, video and text. DNNs are at the heart of today's AI resurgence, as they allow more complex problems to be tackled and solved with higher accuracy and less cumbersome fine-tuning.
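In contrast to hand-designed features, a neural network learns its weights from data. The single-neuron example below (a minimal sketch, nowhere near a deep network) learns the logical OR function purely by gradient descent, with no human-designed rule:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Training data for logical OR: input pairs and target outputs.
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

# Start from zero weights; gradient descent shapes them from the data.
w = [0.0, 0.0]
bias = 0.0
learning_rate = 0.5

for _ in range(2000):
    for x, target in data:
        prediction = sigmoid(w[0] * x[0] + w[1] * x[1] + bias)
        error = prediction - target
        # Gradient of the cross-entropy loss for a logistic neuron.
        w[0] -= learning_rate * error * x[0]
        w[1] -= learning_rate * error * x[1]
        bias -= learning_rate * error

predictions = [round(sigmoid(w[0] * x[0] + w[1] * x[1] + bias)) for x, _ in data]
print(predictions)
```

A deep network stacks many such units in layers, which is what lets it discover edge- and texture-like features on its own instead of having them specified by hand.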

How much training data do you need?