Google launches a 9 exaflop cluster of Cloud TPU v4 pods into public preview

At its I/O developer conference, Google today announced the public preview of a full cluster of Google Cloud’s new Cloud TPU v4 Pods.

Google’s fourth iteration of its Tensor Processing Units launched at last year’s I/O, and a single TPU pod consists of 4,096 of these chips. Each chip has a peak performance of 275 teraflops, and each pod promises up to 1.1 exaflops of combined compute power. Google now operates a full cluster of eight of these pods in its Oklahoma data center, with up to 9 exaflops of peak aggregate performance. Google believes this makes it “the world’s largest publicly available ML hub in terms of cumulative computing power, while operating at 90% carbon-free energy.”
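Those headline figures are straightforward to sanity-check with back-of-the-envelope arithmetic (these are peak, theoretical numbers, not sustained throughput):

```python
# Rough check of Google's stated figures (approximate peak numbers).
chips_per_pod = 4096
teraflops_per_chip = 275            # peak per TPU v4 chip
pods_in_cluster = 8

pod_exaflops = chips_per_pod * teraflops_per_chip / 1_000_000
cluster_exaflops = pod_exaflops * pods_in_cluster

print(f"Per pod: ~{pod_exaflops:.2f} exaflops")       # ~1.13 exaflops
print(f"Cluster: ~{cluster_exaflops:.1f} exaflops")   # ~9.0 exaflops
```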

“We have done extensive research to compare ML clusters that are publicly disclosed and publicly available (meaning – running on Cloud and available for external users),” a Google spokesperson told me when I asked the company to clarify its benchmark. “Those clusters are powered by supercomputers that have ML capabilities (meaning that they are well-suited for ML workloads such as NLP, recommendation models etc.). The supercomputers are built using ML hardware — e.g. GPUs (graphic processing units) — as well as CPU and memory. With 9 exaflops, we believe we have the largest publicly available ML cluster.”

At I/O 2021, Google’s CEO Sundar Pichai said that the company would soon have “dozens of TPU v4 pods in our data centers, many of which will be operating at or near 90% carbon-free energy. And our TPUv4 pods will be available to our cloud customers later this year.” Clearly, that took a bit longer than planned, but we are in the middle of a global chip shortage and these are, after all, custom chips.

Ahead of today’s announcement, Google worked with researchers to give them access to these pods. “Researchers liked the performance and scalability that TPU v4 provides with its fast interconnect and optimized software stack, the ability to set up their own interactive development environment with our new TPU VM architecture, and the flexibility to use their preferred frameworks, including JAX, PyTorch, or TensorFlow,” Google writes in today’s announcement. No surprise there. Who doesn’t like faster machine learning hardware?

Google says users will be able to slice and dice the new Cloud TPU v4 cluster and its pods to meet their needs, whether that’s access to four chips (the minimum for a TPU virtual machine) or thousands — though not too many, since there are only so many chips to go around.
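For a sense of what that looks like from the developer’s side, here is a minimal sketch, assuming a JAX installation on one of those Cloud TPU VMs; the reported device count simply reflects the size of the slice you requested:

```python
# Minimal sketch: inspect the TPU slice attached to a Cloud TPU VM with JAX.
# Assumes jax[tpu] is installed on the VM; exact counts depend on the slice you asked for.
import jax
import jax.numpy as jnp

print(jax.device_count())   # e.g. 4 for the smallest v4 slice, more for larger slices
print(jax.devices()[0])     # metadata for the first TPU device

# A trivially parallel computation that runs one shard on every chip in the slice:
out = jax.pmap(lambda x: x * 2)(jnp.arange(jax.device_count(), dtype=jnp.float32))
print(out)
```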

As of now, these pods are only available in Oklahoma. “We have run an extensive analysis of various locations and determined that Oklahoma, with its exceptional carbon-free energy supply, is the best place to host such a cluster. Our customers can access it from almost anywhere,” a spokesperson explained.

Gensyn applies a token to distributed computing for AI developers, raises $6.5M

For self-driving cars and other applications developed using AI, you need what’s known as ‘deep learning’, the core concepts of which emerged in the ‘50s. This involves training models whose structure is loosely inspired by the patterns of the human brain. That, in turn, requires a large amount of compute power, as afforded by TPUs (Tensor Processing Units) or GPUs (Graphics Processing Units) running for lengthy periods. However, the cost of this compute power is out of reach of most AI developers, who largely rent it from cloud computing platforms such as AWS or Azure. What is to be done?

Well, one approach is that taken by UK startup Gensyn. It has taken the distributed-computing idea behind older projects such as SETI@home and the COVID-19-focused Folding@home and applied it to this appetite for deep learning among AI developers. The result is a way to get high-performance compute power from a distributed network of computers.

Gensyn has now raised a $6.5 million seed round led by Eden Block, a Web3 VC. Also participating in the round are Galaxy Digital, Maven 11, CoinFund, Hypersphere, Zee Prime and founders from some blockchain protocols. This adds to a previously unannounced pre-seed investment of $1.1 million in 2021, led by 7percent Ventures and Counterview Capital, with participation from Entrepreneur First and id4 Ventures.
 
In a statement, Harry Grieve, co-founder of Gensyn, said: “The ballooning demand for hardware – and fat margins – is why the usual names like AWS and Azure have fought to command such high market share. The result is a market that is expensive and centralized…. We designed a better way – superior on price, with unlimited scalability, and no gatekeepers.” 

To achieve this, Gensyn says it will launch its decentralized compute network for training AI models. This network uses a blockchain to verify that the deep learning tasks have been performed correctly, triggering payments via a token. This then monetizes unused compute power in a verifiable manner. Gensyn also claims it’s a more environmentally conscious solution, because this compute power would otherwise go unused.
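Gensyn hasn’t published implementation details in this announcement, so the flow it describes (train off-chain, verify the work, release a token payment) can only be sketched in broad strokes. Every name in the toy sketch below is hypothetical and merely labels the steps; it is not Gensyn’s actual protocol or API:

```python
# Illustrative only: a toy model of the "train off-chain, verify, pay in tokens" loop
# Gensyn describes. All names are hypothetical stand-ins, not Gensyn's protocol or API.
import hashlib
from dataclasses import dataclass

@dataclass
class Task:
    task_id: str
    reward_tokens: float   # payment escrowed for completing the task

def do_work(task: Task) -> tuple[str, str]:
    """A worker performs the training job (stubbed here) and returns a result
    plus a commitment that the verifier can check."""
    result = f"weights-for-{task.task_id}"              # stand-in for trained weights
    commitment = hashlib.sha256(result.encode()).hexdigest()
    return result, commitment

def verify_and_pay(task: Task, result: str, commitment: str, balances: dict, worker: str):
    """The network checks the commitment; only a valid check releases the tokens.
    A real protocol would verify that the training itself was done correctly
    (via a cryptographic or probabilistic proof), not a simple hash comparison."""
    if hashlib.sha256(result.encode()).hexdigest() == commitment:
        balances[worker] = balances.get(worker, 0) + task.reward_tokens

balances = {}
task = Task(task_id="resnet-run-42", reward_tokens=12.5)
result, commitment = do_work(task)
verify_and_pay(task, result, commitment, balances, worker="0xWORKER")
print(balances)   # {'0xWORKER': 12.5}
```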

Lior Messika, managing partner at Eden Block, commented: “Gensyn’s goal of truly democratizing compute with decentralized technology is perhaps the most ambitious endeavor we’ve come across… The team aims to positively disrupt one of the largest and fastest-growing markets in the world, by drastically reducing the costs and friction associated with training neural networks at scale.” 

Over a call with me, Grieve added: “Our estimate is that it’s up to 80% cheaper in the average price per unit for the kind of standard Nvidia GPU, or 45 cents an hour, compared to about two bucks an hour for other cloud services.”
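Taking those figures at face value, the arithmetic behind the claim works out roughly as follows:

```python
# Rough check of the quoted comparison: $0.45/hr vs. ~$2.00/hr for a comparable GPU.
gensyn_hourly = 0.45
cloud_hourly = 2.00
savings = 1 - gensyn_hourly / cloud_hourly
print(f"~{savings:.0%} cheaper per GPU-hour")   # ~78%, in line with the "up to 80%" claim
```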

Google launches TensorFlow Enterprise with long-term support and managed services

Google open-sourced its TensorFlow machine learning framework back in 2015 and it quickly became one of the most popular platforms of its kind. Enterprises that wanted to use it, however, had to either work with third parties or do it themselves. To help these companies — and capture some of this lucrative market itself — Google is launching TensorFlow Enterprise, which includes hands-on, enterprise-grade support and optimized managed services on Google Cloud.

One of the most important features of TensorFlow Enterprise is that it will offer long-term support. For some versions of the framework, Google will offer patches for up to three years. For what looks to be an additional fee, Google will also offer engineering assistance from its Google Cloud and TensorFlow teams to companies that are building AI models.

All of this, of course, is deeply integrated with Google’s own cloud services. “Because Google created and open-sourced TensorFlow, Google Cloud is uniquely positioned to offer support and insights directly from the TensorFlow team itself,” the company writes in today’s announcement. “Combined with our deep expertise in AI and machine learning, this makes TensorFlow Enterprise the best way to run TensorFlow.”

Google also includes Deep Learning VMs and Deep Learning Containers to make getting started with TensorFlow easier, and the company has optimized the enterprise version for Nvidia GPUs and Google’s own Cloud TPUs.
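None of this changes how TensorFlow code itself is written. A generic snippet like the one below (ordinary TensorFlow APIs, nothing specific to the Enterprise tier) is roughly how a training script picks up the GPUs or Cloud TPUs that those Deep Learning VMs and Containers expose:

```python
# Generic TensorFlow code (not specific to TensorFlow Enterprise) showing how a
# training script detects the accelerators available on a Deep Learning VM or TPU VM.
import tensorflow as tf

print(tf.__version__)                              # the pinned, long-term-supported release
print(tf.config.list_physical_devices("GPU"))      # Nvidia GPUs attached to the VM

# On a Cloud TPU VM, a distribution strategy targets the TPU instead:
try:
    resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="local")
    tf.config.experimental_connect_to_cluster(resolver)
    tf.tpu.experimental.initialize_tpu_system(resolver)
    strategy = tf.distribute.TPUStrategy(resolver)
    print("TPU cores:", strategy.num_replicas_in_sync)
except (ValueError, tf.errors.NotFoundError):
    print("No TPU detected; falling back to the default strategy.")
    strategy = tf.distribute.get_strategy()
```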

Today’s launch is yet another example of Google Cloud’s focus on enterprises, a move the company accelerated when it hired Thomas Kurian to run its cloud business. After years of mostly ignoring the enterprise, the company is now clearly looking at what enterprises are struggling with and how it can adapt its products for them.

Google brings in BERT to improve its search results

Google today announced one of the biggest updates to its search algorithm in recent years. By using new neural networking techniques to better understand the intentions behind queries, Google says it can now offer more relevant results for about one in ten searches in the U.S. in English (with support for other languages and locales coming later). For featured snippets, the update is already live globally.

In the world of search updates, where algorithm changes are often far more subtle, an update that affects 10 percent of searches is a pretty big deal (and will surely keep the world’s SEO experts up at night).

Google notes that this update will work best for longer, more conversational queries — and in many ways, that’s how Google would really like you to search these days because it’s easier to interpret a full sentence than a sequence of keywords.


The technology behind this new neural network is called “Bidirectional Encoder Representations from Transformers,” or BERT. Google first talked about BERT last year and open-sourced the code for its implementation and pre-trained models. Transformers are one of the more recent developments in machine learning. They work especially well for data where the sequence of elements is important, which obviously makes them a useful tool for working with natural language and, hence, search queries.
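Because the code and checkpoints are public, the model family is easy to poke at yourself. Here is a minimal sketch that loads a public BERT checkpoint via the Hugging Face transformers library; it illustrates what the model produces, not how Google Search actually serves queries:

```python
# Illustrative only: load one of the publicly available BERT checkpoints with the
# Hugging Face `transformers` library (requires PyTorch). This shows the model family
# in action; it is not how Google Search itself runs BERT.
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("can you get medicine for someone at a pharmacy", return_tensors="pt")
outputs = model(**inputs)

# One contextual vector per token: each token's vector depends on every other word
# in the query, which is the "bidirectional" part that helps with conversational searches.
print(outputs.last_hidden_state.shape)   # (1, sequence_length, 768)
```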

This BERT update also marks the first time Google is using its latest Tensor Processing Unit (TPU) chips to serve search results.

Ideally, this means that Google Search is now better able to understand exactly what you are looking for and provide more relevant search results and featured snippets. The update started rolling out this week, so chances are you are already seeing some of its effects in your search results.

 

Google is making a fast specialized TPU chip for edge devices and a suite of services to support it

In a pretty substantial move toward owning the entire AI stack, Google today announced that it will be rolling out a version of its Tensor Processing Unit — a custom chip optimized for its machine learning framework, TensorFlow — built for inference in edge devices.

That’s a bit of a word salad to unpack, but here’s the end result: Google is looking to have a complete suite of customized hardware for developers looking to build products around machine learning, such as image or speech recognition, that it owns from the device all the way through to the server. Google will have the Cloud TPU (the third version of which will soon roll out) to handle training models for various machine learning-driven tasks, and then run inference from that model on a specialized chip that runs a lighter version of TensorFlow and doesn’t consume as much power. Google is exploiting an opportunity to split model training and inference across two different sets of hardware and dramatically reduce the footprint required in the device that’s actually capturing the data. That would result in faster processing, less power consumption, and potentially more importantly, a dramatically smaller surface area for the actual chip.

Google is also rolling out a new set of services to compile TensorFlow (Google’s machine learning development framework) into a lighter-weight version that can run on edge devices without having to call the server for those operations. That, again, reduces latency and could pay off in any number of ways, from safety (in autonomous vehicles) to just a better user experience (voice recognition). As competition heats up in the chip space, from both the larger companies and the emerging class of startups, nailing these use cases is going to be really important for larger companies. That’s especially true for Google, which also wants to own the actual development framework in a world where there are multiple options like Caffe2 and PyTorch.
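The “lighter-weight version” in question is along the lines of TensorFlow Lite. As a rough sketch of the developer workflow (the Edge TPU itself requires an additional chip-specific compilation step not shown here, and the tooling has evolved since this announcement), converting a trained model into a small, quantized file looks roughly like this:

```python
# Minimal sketch: converting a trained Keras model into a small, quantized
# TensorFlow Lite model suitable for on-device inference. Targeting the Edge TPU
# itself involves a further, chip-specific compilation step.
import tensorflow as tf

# A toy stand-in for a trained model.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(224, 224, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]   # enables post-training quantization
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)
print(f"{len(tflite_model) / 1024:.1f} KB on disk")    # far smaller than the float model
```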

Google will be releasing the chip on a kind of modular board not so dissimilar to the Raspberry Pi, which will get it into the hands of developers who can tinker and build unique use cases. But more importantly, it’ll help entice developers who are already working with TensorFlow as their primary machine learning framework with the promise of a chip that’ll run those models even faster and more efficiently. That could open the door to new use cases and ideas and, should it be successful, will lock those developers further into Google’s cloud ecosystem on both the hardware (the TPU) and framework (TensorFlow) levels. While Amazon owns most of the stack for cloud computing (with Azure being the other major player), it looks like Google is looking to own the whole AI stack – and not just offer on-demand GPUs as a stopgap to keep developers operating within that ecosystem.

Thanks to the proliferation of GPUs, machine learning has become increasingly common across a variety of use cases. Those use cases don’t just require the horsepower to train a model to identify what a cat looks like; they also need the ability to take in an image and quickly identify that said four-legged animal is a cat, based on a model trained with tens of thousands (or more) images of cats. GPUs were great for both jobs, but it’s clear that better hardware is necessary with the emergence of use cases like autonomous driving, photo recognition on cameras, and a variety of others, for which even millisecond-level lag is too much and power consumption, or surface area, is a dramatic limiting factor.

The edge-specialized TPU is an ASIC, a breed of chip that’s increasingly popular because it excels at doing one specific thing really well. That has opened up an opportunity to tap niches like cryptocurrency mining (where companies like Bitmain have built their businesses) with chips optimized for exactly those calculations. Edge-focused ML chips of this kind tend to do a lot of low-precision calculations very fast, which makes the whole process of shuttling data between memory and the actual cores significantly less complicated and consumes less power as a result.
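The low-precision point is easy to quantify: storing weights as 8-bit integers rather than 32-bit floats cuts both on-chip memory and the data moved between memory and the cores by roughly a factor of four. A toy calculation:

```python
# Why low precision matters at the edge: int8 weights take a quarter of the space
# (and memory bandwidth) of float32 weights for the same number of parameters.
params = 5_000_000                       # e.g. a small mobile vision model (illustrative)
float32_mb = params * 4 / 1_000_000      # 4 bytes per weight
int8_mb = params * 1 / 1_000_000         # 1 byte per weight
print(f"float32: {float32_mb:.0f} MB, int8: {int8_mb:.0f} MB")   # 20 MB vs. 5 MB
```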

While Google’s entry into this arena has long been a whisper in the Valley, this is a stake in the ground: the company wants to own everything from the hardware all the way up to the end-user experience, passing through the development layer and others along the way. It might not necessarily alter the calculus of the ecosystem; even though the chip ships on a development board to create a playground for developers, Google still has to get the hardware designed into other companies’ products, not just its own, if it wants to rule the ecosystem. That’s easier said than done, even for a juggernaut like Google, but it is a big salvo from the company that could have rather significant ramifications down the line as every big company races to create its own custom hardware stack specialized for its own needs.