We (skim)read Meta’s metaverse manifesto so you don’t have to…

Meta’s recently crowned president of global affairs, Nick Clegg — who, in a former life, was literally the deputy prime minister of the U.K. — has been earning his keep in California by penning an approximately 8,000-word manifesto to promote “the metaverse”: aka, the sci-fi-inspired vapourware the company we all know as Facebook fixed on for a major rebranding last fall.

Back then, founder and CEO Mark Zuckerberg pronounced that the new entity (Meta) would be a “metaverse-first” company “from now on”. So it’s kinda funny that the key question Clegg says he’s addressing in his essay is “what is the metaverse” — and, basically, why should anyone care? But trying to explain such core logic is apparently keeping Meta’s metamates plenty busy.

The Medium post Clegg published yesterday warns readers it will require 32 minutes of their lives to take in. So few people may have cared to read it. As a Brit, I can assure you, no one should feel obliged to submit to 32 minutes of Nick Clegg — especially not bloviating at his employer’s behest. So TechCrunch took that bullet for the team and read (ok, skim-read) the screed so you don’t have to.

What follows is our bullet-pointed digest of Clegg’s metaverse manifesto. But first we invite you to chew over this WordCloud (below), which condenses his ~7,900-word essay down to 50 — most boldly featuring the word “metaverse” orbiting “internet”, thereby grounding the essay firmly in our existing digital ecosystem.

Glad we could jettison a few thousand words to arrive at that first base. But, wait, there’s more!

Image credits: Natasha Lomas/TechCrunch

Fun found word pairs that leap out of the CleggCloud include “companies rules” (not democratic rules then Clegg?); “people technologies” (possibly just an oxymoron; but we’re open to the possibility that it’s a euphemistic catch-all for ill-fated startups like HBO’s Silicon Valley’s (satirical) ‘Human Heater’); “around potential” (not actual potential then?); “meta physical” (we lol’d); and — squint or you’ll miss it! — “privacy possible” (or possibly “possible privacy”).

The extremely faint ink for that latter pairing adds a fitting layer of additional uncertainty that life in the Zuckerberg-Clegg metaverse will be anything other than truly horrific for privacy. (Keen-eyed readers may feel obligated to point out that the CleggCloud also contains “private experience” as another exceptionally faint pairing. Albeit, having inhaled the full Clegg screed, we can confirm he’s envisaging “private experience” in exceptional, siloed, close-friend spaces — not that the entire metaverse will be a paradise for human privacy. Lol!)

Before we move on to the digest, we feel it’s also worth noting a couple of words that aren’t used in Clegg’s essay — and so can only be ‘invisibly inked’ on our wordcloud (much like a tracking pixel) — deserving a mention by merit of their omission: Namely, “tracking” and “profiling”; aka, how advertising giant Meta makes its money now. Because, we must assume, tracking and profiling is how Meta plans to make its money in the mixed reality future Clegg is trying to flog.

His essay doesn’t spare any words on how Meta plans to monetize its cash-burning ‘pivot’ or reconfigure the current “we sell ads” business model in the theoretical, mixed reality future scenario he’s sketching, where the digital commerce playground comprises a mesh of interconnecting services owned and operated by scores of different/competing companies.

But perhaps — and we’re speculating wildly here — Meta is envisaging being able to supplement selling surveillance-targeted ads by collecting display-rents from the cottage industry of “creators” Clegg & co. hope will spring up to serve these spaces by making digital items to sell users, such as virtual threads for their avatars, or virtual fitting rooms to buy real threads… (‘That’s a nice ‘Bored Ape T-Shirt’ you’re planning to sell — great job! — but if you want metamates to be able to see it in full glorious color you’ll want to pay our advanced display fees’, type thing. Just a thought!)

Now onwards to our digest of Clegg’s screed — which we’ve filleted into a series of bulleted assertions/suggestions being made by the Meta president (adding our commentary alongside in bold-italics). Enjoy how much time we’ve saved you.

  • There won’t be ‘a’ or ‘the metaverse’, in the sense of a single experience/owned entity; there will be “metaverse spaces” across different devices, which may — or may not — interoperate nicely [so it’s a giant rebranding exercise of existing techs like VR, AR, social gaming etc?] 
  • But the grand vision is “a universal, virtual layer that everyone can experience on top of today’s physical world” [aka total intermediation of human interaction and the complete destruction of privacy and intimacy in service of creating limitless, real-time commercial opportunities and enhanced data capture]
  • Metaverse spaces will over-index on ephemerality, embodiment and immersion and be more likely to centre speech-based communication vs current social apps, which suggests users may act more candidly and/or forget they’re not actually alone with their buddies [so Meta and any other mega corporates providing “metaverse spaces” can listen in to less guarded digital chatter and analyze avatar and/or actual body language to derive richer emotional profiles for selling stuff] 
  • The metaverse could be useful for education and training [despite the essay’s headline claim to answer “why it matters”, Clegg doesn’t actually make much of a case for the point of the metaverse or why anyone would actually want to fritter their time away in a heavily surveilled virtual shopping mall — but he includes some vague suggestions it’ll be useful for things like education or healthcare training. At one point he enthuses that the metaverse will “make learning more active” — which implies he was hiding under a rock during pandemic school shutdowns. He also suggests metaverse tech will remove limits on learning related to geographical location — to which one might respond: have you heard of books? Or the Internet? etc]
  • The metaverse will create new digital divides — given those who can afford the best hardware will get the most immersive experience [not a very equally distributed future then is it Clegg?] 
  • It’s anyone’s guess how much money the metaverse might generate — or how many jobs it could create! [🤷]
  • But! Staggeringly vast amounts of labor will be required to sustain these interconnected metaverse spaces [i.e. to maintain any kind of suspension of disbelief that it’s worth the time sink and to prevent them from being flooded with toxicity]
  • Developers especially — there will be so much work for you!!! [developers, developers, developers!]
  • Unlike Facebook, there won’t be one set of rules for the metaverse — it’s going to be a patchwork of ToS [aka, it’ll be a confusing mess. Plus governments/states may also be doing some of the rule-making via regulation]
  • A lack of interoperability/playing nice between any commercial entities that build “metaverse experiences” could fatally fragment the seamless connectivity Meta is so keen on [seems inevitable tbh; thereby threatening the entire Meta rebranding project. Immersive walled gardens anyone?]
  • Meta’s metaverse might let you create temporary, siloed private spaces where you can talk with friends [but only in the same siloed way that FB Messenger offers E2EE via “Secret Conversations” — i.e. surveillance remains Meta’s overarching rule]
  • Bad metaverse experiences will probably be even more horrible than 2D-based cyberbullying etc [yep, virtual sexual assault is already a thing]
  • There are big challenges and uncertainties ahead for Meta [no shit]
  • It’s going to take at least 10-15 years for anything resembling Meta’s idea of connected metaverse/s to be built [Clegg actually specified: “if not longer”; imagine entire decades of Zuckerberg-Clegg!]
  • Meta hopes to work with all sorts of stakeholders as it develops metaverse technologies [aka, it needs massive buy-in if there’s to be a snowflake’s chance in hell of pulling off this rebranding pivot and not just sinking billions into a metaverse money-hole]
  • Meta names a few “priority areas” it says are guiding its metaverse development — topped by “economic opportunity” [just think of all those developer/creator jobs again! Just don’t forget who’s making the mega profits right now… All four listed priorities offer more PR soundbite than substance. For example, on “privacy” — another of Meta’s stated priorities — Clegg writes: “how we can build meaningful transparency and control into our products”. Which is a truly rhetorical ask from the former politician, since Facebook does not give users meaningful control over their privacy now — so we must assume Meta is planning a future of more of the same old abusive manipulations and dark patterns so it can extract as much of people’s data as it can get away with… Ditto “safety & integrity” and “equity & inclusion” under the current FB playbook.] 
  • “The metaverse is coming, one way or another” [Clegg’s concluding remark comes across as more of a threat than bold futuregazing. Either way, it certainly augurs Meta burning A LOT more money on this circus]

New Apple ad targets data brokers

Apple is doubling down on raising consumer awareness of privacy risks in a new ad campaign, unveiled today, which puts the spotlight on how the data broker industry trades in mobile users’ personal data — from selling browsing history and shopping habits, to location data, contacts and plenty more besides.

The campaign also highlights a number of features Apple has developed to counter this background trade in web users’ information by giving iOS users tools they can use to thwart tracking — such as Mail Privacy Protection, which helps users combat email trackers; and App Tracking Transparency (ATT), which lets them request that third party apps do not track their mobile activity.

The new 90-second ad spot will run globally this summer on broadcast and social media across 24 countries, per Apple, which also said the campaign will include related creative splashed across billboards.

In a press screening of the ad ahead of today’s launch, the iPhone maker said the goal is to show how features it’s developed can help iOS users protect their privacy by taking back control over their personal data.

The ad (which can be seen in the embedded video below) casts the data broker industry as a gaggle of “dubious” ‘human trackers’ whom the protagonist — a consumer called Ellie, first seen shopping for records — stumbles upon engaged in a backroom auction.

Shock horror! — or, well, zero surprise to those of us who are more than casually online — it’s her personal data that’s going under the hammer…

In the ad, the smirking audience of data brokers can be seen making bids for Ellie’s ‘digital items’ — including her drug store purchases, emails she’s opened, details of her late night messaging habits and the contact data of her nana (as well as, presumably, the rest of her address book). With mounting horror at the sale of her private information, Ellie is shown activating features on her iPhone, including the aforementioned Mail Privacy Protection — which result in the data brokers vanishing in a puff of smoke, until, eventually, the room has been cleaned out.

The advert makes a decent stab at trying to get consumers to understand — and thus care — about a murky trade that’s designed to strip away their privacy by tracking their daily activity and trading and triangulating different bundles of information gleaned about them to create highly detailed per-person profiles — which may contain thousands of inferred characteristics.

It does this by dramatizing what is undoubtedly an exceptionally intrusive trade as an in-person auction for a single consumer’s data. Of course the reality is that most tracking (and trading) is done at scale, with trackers invisibly baked into everyday services, both online (via technologies such as tracking cookies and pixels) and offline (data gathered via card payment firms can be, and is, sold to brokers) — so it can be hard for consumers to understand the real-world implications of technologies like cookies. Or know there’s an entire data broker industry that’s busy buying and selling their info for a fat profit.

The ad is perhaps not as instantly powerful as an earlier tracking-focused ad — in which Apple depicted trackers as an ever-growing crowd of stalkers who inserted themselves, rudely and without asking, into an iPhone user’s personal space — watching them and taking notes on their daily activity.

One narrative challenge for Apple with this latest privacy-focused ad is that it can’t show Ellie using a rival device — which could help explain why so much of her info is being tracked in the first place.

That said, many of Apple’s privacy features do require the user to opt in to obtain the slated protections — not all, though (Safari’s Intelligent Tracking Prevention feature is on by default, for example) — so even iOS users need to take proactive action to get the best level of protection possible. Hence there’s value in Apple shelling out to drive awareness of privacy — both for existing iOS users, as well as in the hopes of encouraging Android users to make the switch.

The tech giant has made pro-privacy messaging an increasingly important plank of its brand over the past five years or so, leaning into blistering attacks on what CEO Tim Cook memorably dubbed the “data industrial complex” back in a major 2018 keynote speech.

It’s a stance that has become an essential differentiator for a premium brand in a world of commoditized mobile devices and services. But it also brings Cupertino into conflict not only with adtech giants like Google and Facebook — the latter’s revenue was reported to have taken a hit after Apple launched ATT, for example — but with developers themselves, many of whom rely on ads to monetize free apps and do that by being plugged into the tracking and targeting adtech ecosystem Apple is busy warning consumers against.

The company also risks straining relations with carriers — many of whom are themselves implicated in privacy-hostile tracking of users — after it debuted Private Relay, a VPN-like, encrypted network-proxy browsing feature for iCloud+, last year. The feature, which is still in beta, is designed to prevent ISPs from logging web users’ browsing data — and it’s notable that certain carriers (and countries) have reportedly blocked access.

Private Relay does not feature in Apple’s new ad on data brokers. Asked about this, Apple said it necessarily had to limit the number of features it focused on to fit the 90-second ad format. It also noted that, as well as still being in beta, the feature needs in-region partners to work as smoothly as possible — a network Apple said it’s still building out.

Certain of Apple’s privacy flexes — most notably ATT — have also drawn attention from competition regulators, following ad industry complaints. So there are wider reasons for Cupertino to be keen for its pro-consumer actions to be viewed through a privacy (rather than an anti-competition) lens.

Earlier this year, an interesting research paper found that Apple and other large companies had been able to increase their market power as a result of the ATT feature giving individual users more control over what third parties could do with their data — linking better consumer privacy to more concentrated data collection. Although the researchers also found evidence of the tracking industry trying to evolve its tactics to circumvent a user denial of tracking.

Spain slaps Google for frustrating the EU’s ‘right to be forgotten’

Here’s a rare sight: Google has been hit with a €10 million fine by Spain for serious breaches of the European Union’s General Data Protection Regulation (GDPR), after the country’s regulator found it had passed information that could be used to identify citizens requesting deletion of their personal data under EU law — including their email address, the reasons given and the URL claimed — to a US-based third party, without a valid legal basis for this further processing.

As well as being fined, Google has been ordered to amend its procedures to bring them into compliance with the GDPR — and to delete any personal data it still holds related to this enforcement.

The fine is not Google’s first GDPR penalty — France gets the accolade for most swiftly enforcing the bloc’s flagship data protection framework against it, some years ago — but, as far as we’re aware, it’s only the second time the adtech giant has been sanctioned under the GDPR since the regulation came into application, four years ago this month. (Although use of certain Google tools has, more recently, been found to breach GDPR data export rules. Google has also been hit with far meatier fines under the EU’s ePrivacy rules.)

Spain’s data protection authority, the AEPD, announced the penalty today, saying it was sanctioning Google for what it described as “two very serious infringements” — related to transferring EU citizens’ data to a third party without a legal basis; and, in doing so, hampering people’s right of erasure of their personal data under the GDPR.

The third party Google was deemed to be illegally transferring data to is the Lumen Project, a US-based academic project out of the Berkman Klein Center for Internet & Society at Harvard University which aims to collect and study legal requests for the removal of online information by amassing a database of content takedown requests.

The AEPD found that by passing the personal data of European citizens who were requesting erasure of their data to the Lumen Project, Google was essentially frustrating their legal right to erasure of their information (under GDPR Article 17) — aka the ‘right to be forgotten’; rtbf. (And Google has, to put it mildly, a long history of railing against the EU’s rtbf — which, in search de-indexing form, predates the GDPR, via a 2014 CJEU ruling. So the ability of EU individuals to make certain legal requests attached to their personal data is not at all new.)

In its decision, the AEPD says Google did not provide users who were requesting erasure of their data with a choice over their information being passed to the Lumen Project — meaning it lacked a valid legal basis for sharing the data.

The regulator also criticized the form-based system Google devised for individuals to request erasure of their data — for being confusing and for requiring they select an option for their request which, it said, could lead to the request being treated under a different regulatory regime than data protection.

“The Agency’s decision states that this system is equivalent to ‘leaving Google LLC’s decision as to when and when not GDPR applies, and this would mean accepting that this entity can circumvent the application of personal data protection rules and, more specifically, accept that the right to erase personal data is conditioned by the content removal system designed by the responsible entity’,” the AEPD notes in a press release.

A Google spokesperson told us it’s assessing the regulator’s decision.

The company claimed it’s already taken steps to amend its processes, such as by reducing the amount of information it shares with Lumen for removal requests which come from EU countries. Google also suggested its general policy is not to share any right to erasure/right to be forgotten search delisting requests or any other removal requests in which data protection or privacy rights are invoked — but if that’s the case it’s not clear why the AEPD found otherwise.

In a statement Google’s spokesperson added:

“We have a long commitment to transparency in our management of content removal requests. Like many other Internet companies, we have worked with Lumen, an academic project of the Harvard Berkman Klein Center for Internet and Society, to help researchers and the public better understand content removal requests online.

“We are reviewing the decision and continually engage with privacy regulators, including the AEPD, to reassess our practices. We’re always trying to strike a balance between privacy rights and our need to be transparent and accountable about our role in moderating content online. We have already started reevaluating and redesigning our data sharing practices with Lumen in light of these proceedings.”

We’ve reached out to the Lumen Project with questions.

The AEPD has also ordered Google to “urge” the Lumen Project to cease use of and erase any EU people’s data it communicated to it without a valid legal basis — although, ultimately, Spain’s regulator has limited means to force a non-EU based entity to comply with EU law.

The case is interesting because of a separate GDPR jurisdiction question.

The regulation’s one-stop-shop (OSS) mechanism funnels cross border complaints through a ‘lead’ supervisor, typically in the EU market where the company has its main establishment — which in Google’s case (and for many other tech giants) is Ireland’s Data Protection Commission (DPC), which continues to face strong criticism over the painstaking pace of its GDPR enforcement, especially in cross-border cases which apply to tech giants. Indeed, the DPC is currently being sued over inaction on a Google adtech complaint.

That complaint dates back almost four years at this point. The DPC also has a number of other long-running Google enquiries — including one looking into its location tracking practices. But the Irish regulator still hasn’t issued any decisions on any Google cases. Hence GDPR enforcement of Google being a rare sight.

If Spain’s far less well resourced data protection agency can get a decision and enforcement out the door (it’s actually one of the most active EU DPAs), critics will surely ask, why can’t Ireland?

France’s earlier GDPR spank of Google, meanwhile, was only possible because the adtech giant had not yet reconfigured its business to shrink its ‘regulatory risk’ in the region, via OSS ‘forum shopping’, by moving citizens’ accounts to its Irish-based entity — thereby putting EU users under the jurisdiction of the (painstaking) Irish DPC.

So how, then, has Spain sidestepped the DPC GDPR enforcement bottleneck in this Google-Lumen case?

Basically the agency has competency because Google’s US-based business was carrying out the processing in question, as well as the Lumen Project itself being US-based. The regulator was also, presumably, able to show that Spanish citizens’ data was being processed — meaning it could step in on their behalf.

The AEPD confirmed it relied upon a mechanism in the GDPR to liaise with the Irish DPC on the question of competency, telling us: “Once this procedure was completed, and after jurisdiction had been determined, the AEPD agreed to open this sanctioning procedure.”

India pushes ahead with its strict VPN and breach disclosure rules despite concerns

India is pushing ahead with its new cybersecurity rules that will require cloud service providers and VPN operators to maintain names of their customers and their IP addresses despite many players threatening to leave the world’s second largest internet market over the new guidelines.

The Indian Computer Emergency Response Team clarified (PDF) on Wednesday that “virtual private server (VPS) providers, cloud service providers, VPN service providers, virtual asset service providers, virtual asset exchange providers, custodian wallet providers and government organisations” shall follow the directive, called Cyber Security Directions, that requires them to store customers’ names, email addresses, IP addresses, know-your-customer records and financial transactions for a period of five years.

The new rules, which were unveiled late last month, won’t be applicable to corporate and enterprise VPNs, the government agency clarified.

New Delhi is also not relaxing a new rule that mandates firms report security incidents and data breaches within six hours of noticing such cases.

Rajeev Chandrasekhar, the junior IT minister of India, told reporters on Wednesday that India was being “very generous” in giving firms six hours of time to report security incidents, pointing to nations such as Indonesia and Singapore that have stricter requirements.

“If you look at precedence all around the world — and understand that cybersecurity is a very complex issue, where situational awareness of multiple incidents allow us to understand the larger force behind it — reporting accurately, on time, and mandatorily is an absolute essential part of the ability of CERT and the government to ensure that the internet is always safe,” he said.

Several VPN providers have expressed worries about India’s new cybersecurity rules. NordVPN, one of the most popular VPN operators, said earlier that it may remove its services from India if “no other options are left.”

Other service providers, including ExpressVPN and ProtonVPN, have also shared their concerns.

On the other hand, many have welcomed some changes. “There has been a lot of pressure on CERT-In with large scale data breaches being reported across India. Most of the breaches were denied by the companies and despite its mandate, CERT-In never acted on these reports,” said Srinivas Kodali, a researcher.

This is a developing story. More to follow…

Google faces fresh class action-style suit in UK over DeepMind NHS patient data scandal

Google is facing a new class-action style lawsuit in the UK in relation to a health data scandal that broke back in 2016, when it emerged that its AI division, DeepMind, had been passed data on more than a million patients as part of an app development project by the Royal Free NHS Trust in London — without the patients’ knowledge or consent.

The Trust was later sanctioned by the UK’s data protection watchdog which found, in mid 2017, that it had breached UK data protection law when it signed the 2015 data-sharing deal with DeepMind. However the tech firm — which had been engaged by the Trust to help develop an app wrapper for an NHS algorithm to alert clinicians to the early signs of acute kidney injury (aka the Streams app) — avoided sanction since the Trust had been directly responsible for sending it the patients’ data.

So it’s interesting that this private litigation is targeting Google and DeepMind Technologies, several years later. (Albeit, if a claim seeking damages against one of the world’s most valuable companies prevails there is likely to be considerably more upside vs litigation aimed at a publicly funded healthcare Trust.)

Mishcon de Reya, the law firm that’s been engaged to represent the sole named claimant, a man called Andrew Prismall — who says he’s bringing the suit on behalf of approximately 1.6 million individuals whose records were passed to DeepMind — said the litigation will seek damages for unlawful use of patients’ confidential medical records. The claim is being brought in the High Court of Justice of England & Wales.

The law firm also confirmed that the Royal Free is not being sued.

“The claim is for Misuse of Private Information by Google and DeepMind. This is under common law,” a spokeswoman for Mishcon de Reya told us. “We can also confirm this is a damages claim.”

A similar claim, announced last September, was discontinued, according to the spokeswoman — who confirmed: “This is a new claim for the misuse of private information.”

In a statement on why he’s suing Google/DeepMind, Prismall said: “I hope that this case can achieve a fair outcome and closure for the many patients whose confidential records were — without the patients’ knowledge — obtained and used by these large tech companies.”

“This claim is particularly important as it should provide some much-needed clarity as to the proper parameters in which technology companies can be allowed to access and make use of private health information,” added Ben Lasserson, partner at Mishcon de Reya, in another supporting statement.

The firm notes that the litigation is being funded by a litigation finance agreement with Litigation Capital Management Ltd, a Sydney, Australia-headquartered entity which it describes as an alternative asset manager specialising in dispute financing solutions internationally.

Google was contacted for comment on the new suit but at the time of writing the adtech giant had not responded.

There has been an uptick in class action-style litigation targeting tech giants over misuse of data in Europe, although a number of suits have focused on trying to bring claims under data protection law.

One such case, a long-running consumer class action-style suit in the UK against Google related to a historic overriding of Safari users’ privacy settings, failed in the UK Supreme Court last year. However Prismall is (now) suing for damages under the common law tort of misuse of private information so the failure of that earlier UK case does not necessarily have strong relevance here.

It does appear to explain why the earlier suit was discontinued and a fresh one filed, though. “It’s correct that the previous claim was brought on the basis of a breach to the Data Protection Act and the new claim is being brought for misuse of private information,” Mishcon de Reya’s spokeswoman told us when we asked about this.

While the DeepMind NHS patient data scandal may seem like (very) old news, there was plenty of criticism of the regulatory response at the time — as the Trust itself did not face anything more than reputational damage.

It was not, for example, ordered to tell DeepMind to delete patient data — and DeepMind was able to carry on inking deals with other NHS Trusts to roll out the app despite it having been developed without a valid legal basis to use the patient data in the first place.

And while DeepMind had defended itself against privacy concerns attached to its adtech parent Google, claiming the latter would have no access to the sensitive medical data after the scandal broke, it subsequently handed off its health division to Google, in 2018, meaning the adtech giant directly took over the role of supplying and supporting the app for NHS Trusts and processing patients’ data… (Which may be why both Google and DeepMind Technologies are named in the suit.)

There was also the issue of the memorandum of understanding inked between DeepMind and the Royal Free which set out a five-year plan to build AI models using NHS patient data. Though DeepMind always claimed no patient data had been processed for AI.

In a further twist to the saga last summer, Google announced it would be shuttering the Streams app — which, at the time, was still being used by the Royal Free NHS Trust. The Trust claimed it would continue using the app despite Google announcing its intention to decommission it — raising questions over the security of patient data once support (e.g. security patching) got withdrawn by Google.

While the tech giant may have been hoping to put the whole saga behind it by quietly shuttering Streams it will now either have to defend itself in court, generating fresh publicity for the 2015 NHS data misuse scandal — or offer to settle in order to make the suit go away quietly. (And the litigation funders are, presumably, sniffing enough opportunity either way.)

The backlash against market-dominating tech giants continues to fuel other types of class-action style lawsuits. Earlier this year, for example, a major suit was launched against Facebook’s parent, Meta, seeking billions in damages for alleged abuse of UK competition law. But the jury is out on which — or whether — representative actions targeting tech giants’ data processing habits will prevail.

Report spotlights vast scale of adtech’s ‘biggest data breach’

New data about the real-time-bidding (RTB) system’s use of web users’ info for tracking and ad targeting, released today by the Irish Council for Civil Liberties (ICCL), suggests Google and other key players in the high velocity, surveillance-based ad auction system are processing and passing people’s data billions of times per day.

“RTB is the biggest data breach ever recorded,” argues the ICCL. “It tracks and shares what people view online and their real-world location 294 billion times in the U.S. and 197 billion times in Europe every day.”

The ICCL’s report, which is based on industry figures that the rights organization says it obtained from a confidential source, offers an estimate of RTB per person per day across US states and European countries which suggests that web users in Colorado and the UK are among the most exposed by the system — with 987 and 462 RTB broadcasts apiece per person per day.

But even individuals living at the bottom of the chart — in the District of Columbia or Romania — have their information exposed by RTB an estimated 486 times or 149 times per day respectively, per the report.

The ICCL calculates that people living in the U.S. have their online activity and real-world location exposed 57% more often than people in Europe — likely as a result of differences in privacy regulation across the two regions.

Collectively, the ICCL estimates that U.S. Internet users’ online behaviour and locations are tracked and shared 107 trillion times a year, while Europeans’ data is exposed 71 trillion times a year.

“On average, a person in the U.S. has their online activity and location exposed 747 times every day by the RTB industry. In Europe, RTB exposes people’s data 376 times a day,” it also writes, adding: “Europeans and U.S. Internet users’ private data is sent to firms across the globe, including to Russia and China, without any means of controlling what is then done with the data.”
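As a back-of-envelope cross-check of the report’s scaling (our own arithmetic, not the ICCL’s), the per-person daily figures and the annual totals can be combined to back out the user base each annual total implies:

```python
# Back-of-envelope cross-check of the ICCL's headline RTB figures.
# Inputs are the report's published numbers; the implied-user-base
# calculation is our own arithmetic, not a figure from the report.

per_day = {"US": 747, "Europe": 376}            # exposures per person per day
annual_total = {"US": 107e12, "Europe": 71e12}  # total exposures per year

for region, daily in per_day.items():
    per_person_year = daily * 365
    implied_users_m = annual_total[region] / per_person_year / 1e6
    print(f"{region}: {per_person_year:,} exposures/person/year; "
          f"annual total implies ~{implied_users_m:.0f}M users")
```

This comes out at roughly 272,655 exposures per US user per year (implying a user base of about 392 million) and 137,240 per European user per year (implying about 517 million); any gap versus census internet-user counts presumably reflects differing populations or measurement windows underlying the two sets of figures.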

The report’s figures are likely a conservative estimate of the full extent of RTB, since the ICCL includes the caveat that: “[T]he figures presented for RTB broadcasts [are] a low estimate. The industry figures on which we rely do not include Facebook or Amazon RTB broadcasts.”

Per the report, Google, the biggest player in the RTB system, allows 4,698 companies to receive RTB data about people in the U.S., while Microsoft — which ramped up its involvement in RTB in December last year when it bought adtech firm Xandr from AT&T — says it may send data to 1,647 companies.

That too is likely just the tip of the iceberg, since RTB data is broadcast across the Internet — making it ripe for interception and exploitation by RTB ‘partners’ that aren’t officially listed, such as data brokers whose business is people farming: compiling dossiers of data to reidentify and profile individual web users for profit, using info like device IDs, device fingerprints and location to link web activity to a named individual.
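To illustrate the reidentification mechanism described above, here is a toy sketch (ours, not any specific broker’s method): a handful of attributes broadcast in bid requests can be combined into a stable identifier that links activity across unrelated sites.

```python
import hashlib

def fingerprint(attrs: dict) -> str:
    """Toy device fingerprint: hash a canonicalized set of attributes into a
    short stable ID. Real brokers use far richer signals; this only shows why
    a few broadcast fields can be enough to re-identify the same browser."""
    canonical = "|".join(f"{k}={attrs[k]}" for k in sorted(attrs))
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]

# The same device yields the same ID regardless of attribute order...
site_a = fingerprint({"ua": "Mozilla/5.0", "screen": "1440x900", "tz": "Europe/London"})
site_b = fingerprint({"tz": "Europe/London", "ua": "Mozilla/5.0", "screen": "1440x900"})
print(site_a == site_b)  # True: two sites can link their logs to one browser
```

The point of the sketch is that no single attribute needs to be a unique ID; it is the combination, rebroadcast billions of times a day, that does the identifying.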

Privacy and security concerns have been raised about RTB for years — especially in Europe where there are laws in place that are supposed to prevent such a systematic abuse of people’s information. But awareness of the issue has been rising in the US too, following a number of location-tracking and data-sharing scandals.

The leaked Supreme Court opinion earlier this month which suggested the US’ highest court is preparing to overturn Roe v Wade — removing the constitutional protection for abortion — has further dialled up concern and sent shock waves through the country, with some commentators immediately urging women to delete their period tracking apps and pay close attention to their digital security and privacy hygiene.

The concern is that ad tracking could expose personal data that can be used to identify women and others who are pregnant and/or seeking abortion services.

Many US states have already heavily restricted access to abortion. But if the Supreme Court overturns Roe v Wade, a number of states are expected to ban abortion entirely — which means people who can get pregnant will be at increased risk from online surveillance, as any online searches for abortion services, location tracking or other data mining of their digital activity could be used to build a case against them for obtaining or seeking to obtain an illegal abortion.

Highly sensitive personal data on web users is, meanwhile, routinely sucked up and shared for ad targeting purposes, as previous ICCL reports have documented in hair-raising detail. The data broker industry also collects information on individuals to trade and sell — and in the US, especially, people’s location data appears all too easy to obtain.

Last year, for example, a top Catholic priest in the US was reported to have resigned after allegations were made about his sexuality based on a claim that data on his phone had been obtained which indicated use of the location-based gay hook-up app, Grindr.

A lack of online privacy could also harm women’s health more directly — making it easier to gather information to criminalize pregnant people who seek an abortion in a post-Roe world.

“There is no way to restrict the use of RTB data after it is broadcast,” emphasizes the ICCL in the report. “Data brokers used it to profile Black Lives Matter protestors. The US Department of Homeland Security and other agencies used it for warrant-less phone tracking. It was implicated in the outing of a gay Catholic priest through his use of Grindr. ICCL uncovered the sale of RTB data revealing likely survivors of sexual abuse.”

The report raises especially cutting questions for European regulators since, unlike the US, the region has a comprehensive data protection framework. The General Data Protection Regulation (GDPR) has been in force across the EU since May 2018, and regulators should have been enforcing these privacy rights against out-of-control adtech for years.

Instead, there has been a collective reluctance to do so — likely a result of how extensively and pervasively individual tracking and profiling tech has been embedded into web infrastructure, coupled with loud claims by the adtech industry that the free web cannot survive if Internet users’ privacy is respected. (Such claims ignore the existence of alternative forms of ad targeting, such as contextual, which do not require tracking and profiling of individual web users’ activity to function and which have been shown to be profitable for years, as the non-tracking search engine DuckDuckGo demonstrates.)

An investigation opened by the Irish Data Protection Commission (DPC) into Google’s adtech three years ago (May 2019), following a number of RTB complaints, is — ostensibly — ongoing. But no decision has been issued.

The UK’s ICO also repeatedly fumbled enforcement action against RTB following complaints filed back in 2018, despite voicing a view publicly since 2019 that the behavioral ad industry is wildly out of control. And in a parting shot last fall, the outgoing information commissioner, Elizabeth Denham, urged the industry to undertake meaningful privacy reforms.

Since then, a flagship adtech industry mechanism for gathering web users’ consent to ad tracking — the IAB Europe’s self-styled Transparency and Consent Framework (TCF) — has itself been found in breach of the GDPR by Belgium’s data protection authority.

Its February 2022 decision also found the IAB itself at fault, giving the industry body two months to submit a reform plan and six months to implement it. (NB: Google and the IAB are the two bodies that set standards for RTB.)

That consent issue is one (solid) complaint against RTB under Europe’s GDPR. However, the ICCL’s concern has been focused on security — it argues that high velocity, massive scale trading of people’s data to place ads, by broadcasting it over the Internet to thousands of ‘partners’ (with the clear risk of interception and appropriation by scores of unknown others), is inherently insecure. And, regardless of the consent issues, the GDPR requires that people’s information be adequately protected — hence the ICCL’s framing of RTB as the “biggest ever data breach”.

In March, the ICCL announced it intended to sue the DPC — accusing the regulator of years of inaction over RTB complaints (some of which were lodged the same year the GDPR came into application). That litigation is still pending.

It has also approached the EU ombudsperson to complain that the European Commission is failing to properly monitor application of the regulation — which led the ombudsperson to open an enquiry earlier this year into the Commission’s claims to the contrary.

A requested deadline for the EU’s executive to submit information to the ombudsperson passed yesterday without a submission, per the ICCL, with the Commission reportedly asking for 10 more days to provide the requested data — which suggests the four-year anniversary of the GDPR coming into force (May 25, 2018) will pass by in the meanwhile (perhaps a little more quietly than it might have done if the ombudsperson had been in a position to issue a verdict)…

“As we approach the four year anniversary of the GDPR we release data on the biggest data breach of all time. And it is an indictment of the European Commission, and in particular commissioner [Didier] Reynders, that this data breach is repeated every day,” Johnny Ryan, senior fellow at the ICCL, told TechCrunch.

“It is time that the Commission does its job and compels Ireland to apply the GDPR correctly,” he added.

We also contacted Google, Microsoft, the DPC and the European Commission with questions about the ICCL’s report but at the time of writing none had responded.

Ryan told us the ICCL is also writing to US lawmakers to highlight the scale of the “privacy crisis in online advertising” — and specifically pressing the Senate Subcommittee on Competition Policy, Antitrust and Consumer Rights to ensure adequate enforcement resources are provided to the FTC — so it can take urgent action “against this enormous breach”.

In the letter, which we’ve reviewed, the ICCL points out that private data on US citizens is sent to firms across the globe, including to Russia and China — “without any means of controlling what is then done with the data”.

War in Europe certainly adds a further dimension to this surveillance adtech story.

Russia’s invasion of Ukraine earlier this year has fuelled added concern about adtech’s mass surveillance of web users — i.e., whether citizens’ data is finding its way, via online tracking, to hostile third countries like Russia and its ally China.

Back in March, the Financial Times reported that scores of apps contain SDK technology made by the Russian search giant Yandex — which was accused of sending user data back to servers in Russia where it might be accessible to the Russian government. 

In Europe, the GDPR requires that personal data exported out of the bloc be protected to the same standard that applies when citizens’ information is processed or stored within Europe.

A landmark EU ruling in July 2020 saw the bloc’s top court strike down a flagship EU-US data transfer agreement over security concerns attached to US government mass surveillance programs — creating ongoing legal uncertainty around international data flows to risky third countries as the court underscored the need for EU regulators to proactively monitor data exports and step in to suspend any data flows to jurisdictions that lack adequate data protection.

Many of the key players in adtech are US-based — raising questions about the legality of any processing of Europeans’ data by the sector that’s taking place over the pond too, given the high standard that EU law requires for data to be legally exported.

Twitter launches a new web game to make its privacy policy easier to understand

Twitter announced today that it has rolled out a new web video game to make it easier for users to understand its privacy policy. The goal of the game, which is called Data Dash, is to educate people on the information that Twitter collects, how the information is used and what controls users have over it. The social media giant says the game is designed to help users learn how to “safely navigate the Twitterverse.”

Once you start the game, you’ll be asked to pick the language that you would like to play in. After that, you’ll have the option to select a character. The game is played by helping a dog, named Data, safely navigate “PrivaCity” by dodging ads, steering clear of spammy DMs and avoiding Twitter trolls. Each time you complete a level, you’ll learn more about Twitter’s privacy policy and what you can do to keep yourself safe on the platform.

The easy gameplay is designed to help users learn something during each level. The game was created by pixel artist and game developer Momo Pixel.

“Through Twitter Data Dash, we hope to encourage more people around the world to take charge of their personal information on our service and maybe even have a little fun in the process,” the company said in a statement. “Transparency is core to our approach and we want to help you understand the information we collect, how it’s used, and the controls at your disposal.”

Twitter has released the game as part of its broader efforts to make its privacy policy easier to understand. The social media giant has revamped its privacy policy website to include less “legalese”. Twitter has also reorganized the policy into three primary categories: data collection, data use and data sharing. There’s also a clearer explanation of how Twitter personalizes users’ experiences and the ads they see. The new privacy policy website and game are available beginning today in nine languages: English, Spanish, French, Italian, German, Japanese, Korean, Portuguese and Russian.

The launch comes a day after Twitter rolled out a new “Copypasta and Duplicate Content” policy to clarify how the platform works to combat spam and duplicative content. The social media giant first revealed in August 2020 that it would limit the visibility of copypasta tweets and is now highlighting what it considers to be a violation and what action is taken to limit the visibility of such violations.

Europe’s CSAM scanning plan unpicked

The European Union has formally presented its proposal to move from a situation in which some tech platforms voluntarily scan for child sexual abuse material (CSAM) to something more systematic — publishing draft legislation that will create a framework which could obligate digital services to use automated technologies to detect and report existing or new CSAM, and also identify and report grooming activity targeting kids on their platforms.

The EU proposal — for “a regulation laying down rules to prevent and combat child sexual abuse” (PDF) — is intended to replace a temporary and limited derogation from the bloc’s ePrivacy rules, which was adopted last year in order to enable messaging platforms to continue long-standing CSAM scanning activity which some undertake voluntarily.

However, that was only ever a stop-gap measure. EU lawmakers say they need a permanent solution to tackle the explosion of CSAM and the abuse the material is linked to — noting how reports of child sexual abuse online rose from 1M+ in 2014 to 21.7M in 2020, when 65M+ CSAM images and videos were also discovered — and also pointing to an increase in online grooming seen since the pandemic.

The Commission also cites a claim that 60%+ of sexual abuse material globally is hosted in the EU as further underpinning its impetus to act.

Some EU Member States are already adopting their own proposals for platforms to tackle CSAM at a national level so there’s also a risk of fragmentation of the rules applying to the bloc’s Single Market. The aim for the regulation is therefore to avoid that risk by creating a harmonized pan-EU approach.  

EU law contains a prohibition on placing general monitoring obligations on platforms because of the risk of interfering with fundamental rights like privacy — but the Commission’s proposal aims to circumvent that hard limit by setting out what the regulation’s preamble describes as “targeted measures that are proportionate to the risk of misuse of a given service for online child sexual abuse and are subject to robust conditions and safeguards”.

What exactly is the bloc proposing? In essence, the Commission’s proposal seeks to normalize CSAM mitigation by putting it on the same operational footing for services as tackling spam or malware — creating a targeted framework of supervised risk assessments, combined with a permanent legal basis that authorizes (and may require) detection technologies to be implemented, while also baking in safeguards over how, and indeed whether, detection must be done, including time limits and multiple layers of oversight.

The regulation itself does not prescribe which technologies may or may not be used for detecting CSAM or ‘grooming’ (aka, online behavior that’s intended to solicit children for sexual abuse).

“We propose to make it mandatory for all providers of service and hosting to make a risk assessment: If there’s a risk that my service, my hosting will be used or abused for sharing CSAM. They have to do the risk assessment,” said home affairs commissioner Ylva Johansson, explaining how the Commission intends the regulation to function at a press briefing to announce the proposal today. “They have also to present what kind of mitigating measures they are taking — for example if children have access to this service or not.

“They have to present these risk assessments and the mitigating measures to a competent authority in the Member State where they are based or in the Member State where they appointed a legal representative authority in the EU. This competent authority will assess this. See how big is the risk. How effective are the mitigating measures and is there a need for additional measures,” she continued. “Then they will come back to the company — they will consult the EU Centre, they will consult their data protection agencies — to say whether there will be a detection order and if they find there should be a detection order then they should ask another independent authority — it could be a court in that specific Member State — to issue a detection order for a specific period of time. And that could take into account what kind of technology they are allowed to use for this detection.”

“So that’s how we put the safeguards [in place],” Johansson went on. “It’s not allowed to do a detection without a detection order. But when there is a detection order you’re obliged to do it and you’re obliged to report when and if you find CSAM. And this should be reported to the EU Centre which will have an important role to assess whether [reported material] will be put forward to law enforcement [and to pick up what the regulation calls “obviously false positives” to prevent innocent/non-CSAM from being forward to law enforcement].”

The regulation will “put the European Union in the global lead on the fight on online sexual abuse”, she further suggested.

Stipulations and safeguards

The EU’s legislation proposing body says the regulation is based on both the bloc’s existing privacy framework (the General Data Protection Regulation; GDPR) and the incoming Digital Services Act (DSA), a recently agreed horizontal update to rules for ecommerce and digital services and platforms which sets governance requirements in areas like illegal content.

CSAM is already illegal across the EU but the problem of child sexual abuse is so grave — and the role of online tools, not just in spreading and amplifying it but also potentially in facilitating abuse — that the Commission argues dedicated legislation is merited in this area.

It adopted a similarly targeted regulation aimed at speeding up takedowns of terrorism content last year — and the EU approach is intended to support continued expansion of the bloc’s digital rulebook by bolting on other vertical instruments, as needed.

“This comes of course with a lot of safeguards,” emphasized Johansson of the latest proposed addition to EU digital rules. “What we are targeting in this legislation are service providers online and hosting providers… It’s tailored to target this child sexual abuse material online.”

As well as applying to messaging services, the regime includes some targeted measures for app stores which are intended to help prevent kids downloading risky apps — including a requirement that app stores use “necessary age verification and age assessment measures to reliably identify child users on their services”.  

Johansson explained that the regulation bakes in multiple layers of requirements for in-scope services — starting with an obligation to conduct a risk assessment that considers any risks their service may present to children in the context of CSAM, and a requirement to present mitigating measures for any risks they identify.

EU lawmakers appear to intend this structure to encourage services to proactively adopt a robust security- and privacy-minded approach towards users, better safeguarding minors from abuse and predatory attention, in a bid to shrink their regulatory risk and avoid more robust interventions that could mean having to warn all their users they are scanning for CSAM (which wouldn’t exactly do wonders for a service’s reputation).

It looks to be no accident that — also today — the Commission published a new strategy for a “better Internet for kids” (BI4K) which will encourage platforms to conform to a new, voluntary “EU code for age-appropriate design”; as well as fostering development of “a European standard on online age verification” by 2024 — which the bloc’s lawmakers also envisage looping in another plan for a pan-EU ‘privacy-safe’ digital ID wallet (i.e. as a non-commercial option for certifying whether a user is underage or not).

The BI4K strategy doesn’t contain legally binding measures but adherence to approved practices, such as the planned age-appropriate design code, could be seen as a way for digital services to earn brownie points towards compliance with the DSA — which is legally binding and carries the threat of major penalties for infringers. So the EU’s approach to platform regulation should be understood as intentionally broad and deep; with a long-tail cascade of stipulations and suggestions which both require and nudge.

Returning to today’s proposal to combat child sexual abuse: if a service provider ends up being deemed in breach, the Commission has proposed fines of up to 6% of global annual turnover — although it would be up to the Member State agencies to determine the exact level of any penalties.

These local regulatory bodies will also be responsible for assessing the service provider’s risk assessment and existing mitigations — and, ultimately, deciding whether or not a detection order is merited to address specific child safety concerns.

Here the Commission looks to have its eye on avoiding forum shopping and enforcement blockages/bottlenecks (as have hampered GDPR) as the regulation requires Member State-level regulators to consult with a new, centralized (but independent of the EU) agency — called the “European Centre to prevent and counter child sexual abuse” (aka, the “EU Centre” for short) — a body lawmakers intend to support their fight against child sexual abuse in a number of ways.

Among the Centre’s tasks will be receiving and checking reports of CSAM from in-scope services (and deciding whether or not to forward them to law enforcement); maintaining databases of “indicators” of online CSAM which services could be required to use on receipt of a detection order; and developing (novel) technologies that might be used to detect CSAM and/or grooming.

“In particular, the EU Centre will create, maintain and operate databases of indicators of online child sexual abuse that providers will be required to use to comply with the detection obligations,” the Commission writes in the regulation preamble.

“The EU Centre should also carry out certain complementary tasks, such as assisting competent national authorities in the performance of their tasks under this Regulation and providing support to victims in connection to the providers’ obligations. It should also use its central position to facilitate cooperation and the exchange of information and expertise, including for the purposes of evidence-based policy-making and prevention. Prevention is a priority in the Commission’s efforts to fight against child sexual abuse.”
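It is worth sketching how a database of “indicators” gets used in practice, since this is the least controversial end of the detection spectrum. Known-CSAM matching is typically done by comparing content hashes against a known-bad list; deployed systems use perceptual hashes such as Microsoft’s PhotoDNA, which survive resizing and re-encoding, whereas the exact-match hash below is a simplified stand-in, and the indicator values are invented for illustration:

```python
import hashlib

# Hypothetical indicator list: in reality these would be perceptual hashes
# distributed by a body like the proposed EU Centre, not SHA-256 digests.
INDICATORS = {
    hashlib.sha256(sample).hexdigest()
    for sample in (b"known-bad-sample-1", b"known-bad-sample-2")
}

def matches_indicator(content: bytes) -> bool:
    """Exact-match stand-in for indicator-database detection: the service
    never 'reads' the content, it only compares a digest against the list."""
    return hashlib.sha256(content).hexdigest() in INDICATORS

print(matches_indicator(b"known-bad-sample-1"))      # True
print(matches_indicator(b"ordinary holiday photo"))  # False
```

Note the limits of this model, though: exact hashing fails if a single byte changes, which is why real systems use perceptual hashing, and why detecting new CSAM or grooming requires probabilistic classifiers, a far harder and more error-prone problem.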

The prospect of apps having to incorporate CSAM detection technology developed by a state agency has, unsurprisingly, caused alarm among a number of security, privacy and digital rights watchers.

Although alarm isn’t limited to that one component; Pirate Party MEP, Patrick Breyer — a particularly vocal critic — dubs the entire proposal “mass surveillance” and “fundamental rights terrorism” on account of the cavalcade of risks he says it presents, from mandating age verification to eroding privacy and confidentiality of messaging and cloud storage for personal photos.

Re: the Centre’s listed detection technologies, it’s worth noting that Article 10 of the regulation includes this caveated line on obligatory use of its tech — which states [emphasis ours]: “The provider shall not be required to use any specific technology, including those made available by the EU Centre, as long as the requirements set out in this Article are met” — which, at least, suggests providers have a choice over whether or not they apply its centrally devised technologies to comply with a detection order vs using some other technologies of their choice.

(Okay, so what are the requirements that must be “met”, per the rest of the Article, to be freed from the obligation to use EU Centre approved tech? These include that the selected technologies are “effective” at detecting known/new CSAM and grooming activity; are unable to extract information from comms other than what is “strictly necessary” for detecting the targeted CSAM content/behavior; are “state of the art” and have the “least intrusive” impact on fundamental rights like privacy; and are “sufficiently reliable, in that they limit to the maximum extent possible the rate of errors regarding the detection”… So the primary question arising from the regulation is probably whether such subtle and precise CSAM/grooming detection technologies exist anywhere at all — or even could ever exist outside the realms of sci-fi.)

That the EU is essentially asking for the technologically impossible has been another quick criticism of the proposal.

Crucially for anyone concerned about the potential impact to (everybody’s) privacy and security if messaging comms/cloud storage etc are compromised by third party scanning tech, local oversight bodies responsible for enforcing the regulation must consult EU data protection authorities — who will clearly have a vital role to play in assessing the proportionality of proposed measures and weighing the impact on fundamental rights.

Per the Commission, technologies developed by the EU Centre will also be assessed by the European Data Protection Board (EDPB), a steering body for application of the GDPR, which it stipulates must be consulted on all detection techs included in the Centre’s list. (“The EDPB is also consulted on the ways in which such technologies should be best deployed to ensure compliance with applicable EU rules on the protection of personal data,” the Commission adds in a Q&A on the proposal.)

There’s a further check built in, according to EU lawmakers, as a separate independent body (which Johansson suggests could be a court) will be responsible for finally issuing — and, presumably, considering the proportionality of — any detection order. (But if this check doesn’t include a wider weighing of proportionality/necessity it might just amount to a procedural rubber stamp.)

The regulation further stipulates that detection orders must be time limited, implying that requiring indefinite detection would not be possible under the plan. Consecutive detection orders might have a similar effect — albeit you’d hope the EU’s data protection agencies would advise against that, or the risk of a legal challenge to the whole regime would certainly crank up.

Whether all these checks and balances and layers of oversight will calm the privacy and security fears swirling around the proposal remains to be seen.

A version of the draft legislation which leaked earlier this week quickly sparked loud alarm klaxons from a variety of security and industry experts — who reiterated (now) perennial warnings over the implications of mandating content-scanning in a digital ecosystem that contains robustly encrypted messaging apps.

The concern is especially over what the move might mean for end-to-end encrypted services — with industry watchers querying whether the regulation could force messaging platforms to bake in backdoors to enable the ‘necessary’ scanning, given they don’t have access to content in the clear.

E2EE messaging platform WhatsApp’s chief, Will Cathcart, was quick to amplify concerns of what the proposal might mean in a tweet storm.

Some critics also warned that the EU’s approach looked similar to a controversial proposal by Apple last year to implement client-side CSAM scanning on users’ devices — which was dropped by the tech giant after another storm of criticism from security and digital rights experts.

Assuming the Commission proposal gets adopted (the European Parliament and Council have to weigh in before that can happen), one major question for the EU is what happens if/when services ordered to carry out detection of CSAM are using end-to-end encryption — meaning they are not in a position to scan message content to detect CSAM or potential grooming in progress, since they do not hold the keys to decrypt the data.

Johansson was asked about encryption during today’s presser — and specifically whether the regulation poses the risk of backdooring encryption. She sought to close down the concern, but the Commission’s circuitous logic on this topic makes that task perhaps as difficult as inventing a perfectly effective and privacy-safe CSAM-detecting technology.

“I know there are rumors on my proposal but this is not a proposal on encryption. This is a proposal on child sexual abuse material,” she responded. “CSAM is always illegal in the European Union, no matter the context it is in. [The proposal is] only about detecting CSAM — it’s not about reading or communication or anything. It’s just about finding this specific illegal content, report it and to remove it. And it has to be done with technologies that have been consulted with data protection authorities. It has to be with the least privacy intrusive technology.

“If you’re searching for a needle in a haystack you need a magnet. And a magnet will only see the needle, and not the hay, so to say. And this is how they use the detection today — the companies. To detect for malware and spam. It’s exactly the same kind of technology, where you’re searching for a specific thing and not reading everything. So this is what this about.”

“So yes I think and I hope that it will be adopted,” she added of the proposal. “We can’t continue leaving children without protection as we’re doing today.”

As noted above, the regulation does not stipulate exact technologies to be used for detection of CSAM. So EU lawmakers are — essentially — proposing to legislate a fudge. Which is certainly one way to try to sidestep the inexorable controversy of mandating privacy-intrusive detection without fatally undermining privacy and breaking E2EE in the process.

During the brief Q&A with journalists, Johansson was also asked why the Commission had not made it explicit in the text that client-side scanning would not be an acceptable detection technology — given the major risks that particular ‘state of the art’ technology is perceived to pose to encryption and to privacy.

She responded by saying the legislation is “technology neutral”, before reiterating a familiar refrain: that the regulation has been structured to limit interventions so they have the least intrusive impact on privacy.

“I think she is extremely important in these days. Technology is developing extremely fast. And of course we have been listening to those that have concerns about the privacy of the users. We’ve also been listening to those that have concerns about the privacy of the children victims. And this is the balance to find,” she suggested. “That’s why we set up this specific regime with the competent authority and they have to make a risk assessment — mitigating measures that will foster safety by design by the companies.

“If that’s not enough — if detection is necessary — we have built in the consultation of the data protection authorities and we have built in a specific decision by another independent authority, it could be a court, that will take the specific detection order. And the EU Centre is there to support and to help with the development of the technology so we have the least privacy intrusive technology.

“But we choose not to define the technology because then it might be outdated already when it’s adopted because the technology and development goes so fast. So the important [thing] is the result and the safeguards and to use the least intrusive technology to reach that result that is necessary.”

There is, perhaps, a little more reassurance to be found in the Commission’s Q&A on the regulation where — in a section responding to the question of how the proposal will “prevent mass surveillance” — it writes [emphasis ours]:

“When issuing detection orders, national authorities have to take into account the availability and suitability of relevant technologies. This means that the detection order will not be issued if the state of development of the technology is such that there is no available technology that would allow the provider to comply with the detection order.”

That said, the Q&A does confirm that encrypted services are in-scope — with the Commission writing that had it explicitly excluded those types of services “the consequences would be severe for children”. (Even as it also gives a brief nod to the importance of encryption for “the protection of cybersecurity and confidentiality of communications”.)

On E2EE specifically, the Commission writes that it continues to work “closely with industry, civil society organisations, and academia in the context of the EU Internet Forum, to support research that identifies technical solutions to scale up and feasibly and lawfully be implemented by companies to detect child sexual abuse in end-to-end encrypted electronic communications in full respect of fundamental rights”.

“The proposed legislation takes into account recommendations made under a separate, ongoing multi-stakeholder process exclusively focused on encryption arising from the December 2020 Council Resolution,” it further notes, adding [emphasis ours]: “This work has shown that solutions exist but have not been tested on a wide scale basis. The Commission will continue to work with all relevant stakeholders to address regulatory and operational challenges and opportunities in the fight against these crimes.”

So — the tl;dr looks to be that, in the short term, E2EE services are likely to dodge a direct detection order, given there’s likely no (legal) way to detect CSAM without fatally compromising user privacy/security — so the EU’s plan could, in the first instance, end up encouraging further adoption of strong encryption (E2EE) by in-scope services, i.e. as a means of managing regulatory risk. (What that might mean for services whose business models intentionally depend on scanning users is another question.)

That said, the proposed framework has been set up in such a way as to leave the door open to a pan-EU agency (the EU Centre) being positioned to consult on the design and development of novel technologies that could, one day, tread the line — or thread the needle, if you prefer — between risk and rights.

Or else that theoretical possibility is being entertained as another stick for the Commission to hold over unruly technologists to encourage them to engage in more thoughtful, user-centric design as a way to combat predatory behavior and abuse on their services.

UK opts for slow reboot of Big Tech rules, pushes ahead on privacy ‘reforms’

The UK government has confirmed it will move forward on a major ex ante competition reform aimed at Big Tech, as it set out its priorities for the new parliamentary session earlier today.

However it has only said that draft legislation will be published over this period — booting the prospect of passing updated competition rules for digital giants further down the road.

At the same time today it confirmed that a “data reform bill” will be introduced in the current parliamentary session.

This follows a consultation it kicked off last year to look at how the UK might diverge from EU law in this area, post-Brexit, by making changes to domestic data protection rules.

There has been concern that the government is planning to water down citizens’ data protections. Details the government published today, setting out some broad-brush aims for the reform, don’t offer a clear picture either way — suggesting we’ll have to wait to see the draft bill itself in the coming months.

Read on for an analysis of what we know about the UK’s policy plans in these two key areas… 

Ex ante competition reform

The government has been teasing a major competition reform since the end of 2020 — putting further meat on the bones of the plan last month, when it detailed a bundle of incoming consumer protection and competition reforms.

But today, in a speech setting out prime minister Boris Johnson’s legislative plans for the new session at the state opening of parliament, it committed to publish measures to “create new competition rules for digital markets and the largest digital firms”; also saying it would publish “draft” legislation to “promote competition, strengthen consumer rights and protect households and businesses”.

In briefing notes to journalists published after the speech, the government said the largest and most powerful platforms will face “legally enforceable rules and obligations to ensure they cannot abuse their dominant positions at the expense of consumers and other businesses”.

A new Big Tech regulator will also be empowered to “proactively address the root causes of competition issues in digital markets” via “interventions to inject competition into the market, including obligations on tech firms to report new mergers and give consumers more choice and control over their data”, it also said.

However another key detail from the speech specifies that the forthcoming Digital Markets, Competition and Consumer Bill will only be put out in “draft” form over the parliament — meaning the reform won’t be speeding onto the statute books.

Instead, up to a year could be added to the timeframe for passing laws to empower the Digital Markets Unit (DMU) — assuming, of course, Johnson’s government survives that long. The DMU was set up in shadow form last year but does not yet have the legislative power to make the planned “pro-competition” interventions which policymakers intend to use to correct structural abuses by Big Tech.

(The government’s Online Safety Bill, for example — which was published in draft form in May 2021 — wasn’t introduced to parliament until March 2022; and remains at the committee stage of the scrutiny process, with likely many more months before final agreement is reached and the law passed. That bill was included in the 2022 Queen’s Speech so the government’s intent continues to be to pass the wide-ranging content moderation legislation during this parliamentary session.)

The delay to introducing the competition reform means the government has cemented a position lagging the European Union — which reached political agreement on its own ex ante competition reform in March. The EU’s Digital Markets Act is slated to enter into force next Spring, by which time the UK may not even have a draft bill on the table yet. (While Germany passed an update to its competition law last year and has already designated Google and Meta as in scope of the ex ante rules.)

The UK’s delay will be welcomed by tech giants, of course, as it provides another parliamentary cycle to lobby against an ex ante reboot that’s intended to address competition and consumer harms in digital markets which are linked to giants with so-called “Strategic Market Status”.

This includes issues that the UK’s antitrust regulator, the CMA, has already investigated and confirmed (such as Google and Facebook’s anti-competitive dominance of online advertising); and others it suspects of harming consumers and hampering competition too (like Apple and Google’s chokepoint hold over their mobile app stores).

Any action in the UK to address those market imbalances doesn’t now look likely before 2024 — or even later.

Recent press reports, meanwhile, have suggested Johnson may be going cold on the ex ante regime — which will surely encourage Big Tech’s UK lobbyists to seize the opportunity to spread self-interested FUD in a bid to totally derail the plan.

The delay also means tech giants will have longer to argue against the UK introducing an Australian-style news bargaining code — which the government appears to be considering for inclusion in the future regime.

One of the main benefits of the bill is listed as [emphasis ours]:

“Ensuring that businesses across the economy that rely on very powerful tech firms, including the news publishing sector, are treated fairly and can succeed without having to comply with unfair terms.”

“The independent Cairncross Review in 2019 identified an imbalance of bargaining power between news publishers and digital platforms,” the government also writes in its briefing note, citing a Competition and Markets Authority finding that “publishers see Google and Facebook as ‘must have’ partners as they provide almost 40 per cent of large publishers’ traffic”.

Major consumer protection reforms which are planned in parallel with the ex ante regime — including letting the CMA decide for itself when UK consumer law has been broken and fine violating platforms over issues like fake reviews, rather than having to take the slow route of litigating through the courts — are also on ice until the bill gets passed. So major ecommerce and marketplace platforms will also have longer to avoid hard-hitting regulatory action for failures to purge bogus reviews from their UK sites.

Consumer rights group Which? welcomed the government’s commitment to legislate to strengthen the UK’s competition regime and beef up powers to clamp down on tech firms that breach consumer law. However, the group described it as “disappointing” that only a draft bill will be published in this parliamentary session.

“The government must urgently prioritise the progress of this draft Bill so as to bring forward a full Bill to enact these vital changes as soon as possible,” added Rocio Concha, Which? director of policy and advocacy, in a statement.

Data reform bill

In another major post-Brexit policy move, the government has been loudly flirting with ripping up protections for citizens’ data — or, at least, killing off cookie banners.

Today it confirmed it will move forward with ‘reforming’ the rules wrapping people’s data — just without being clear about the exact changes it plans to make. So where exactly the UK is headed on data protection still isn’t clear.

That said, in briefing notes on the forthcoming data reform bill, the government appears to be directing most focus at accelerating public sector data sharing instead of suggesting it will pass amendments that pave the way for unfettered commercial data-mining of web users.

Indeed, it claims that ensuring people’s personal data “is protected to a gold standard” is a core plank of the reform.

A section on the “main benefits” of the reform also notably lingers on public sector gains — with the government writing that it will be “making sure that data can be used to empower citizens and improve their lives, via more effective delivery of public healthcare, security, and government services”.

But of course the devil will be in the detail of the legislation presented in the coming months. 

Here’s what else the government lists as the “main elements” of the upcoming data reform bill:

  • Using data and reforming regulations to improve the everyday lives of people in the UK, for example, by enabling data to be shared more efficiently between public bodies, so that delivery of services can be improved for people.
  • Designing a more flexible, outcomes-focused approach to data protection that helps create a culture of data protection, rather than “tick box” exercises.

Discussing other “main benefits” for the reform, the government touts increased “competitiveness and efficiencies” for businesses, via a suggested reduction in compliance burdens (such as “by creating a data protection framework that is focused on privacy outcomes rather than box-ticking”); a “clearer regulatory environment for personal data use” which it suggests will “fuel responsible innovation and drive scientific progress”; “simplifying the rules around research to cement the UK’s position as a science and technology superpower”, as it couches it; and ensuring the data protection regulator (the ICO) takes “appropriate action against organisations who breach data rights and that citizens have greater clarity on their rights”.

The upshot of all these muscular-sounding claims boils down to whatever the government means by an “outcomes-focused” approach to data protection vs “tick-box” privacy compliance. (As well as what “responsible innovation” might imply.)

It’s also worth mulling what the government means when it says it wants the ICO to take “appropriate” action against breaches of data rights. Given the UK regulator has been heavily criticized for inaction in key areas like adtech, you could interpret that as the government intending the regulator to take more enforcement action over privacy breaches, not less.

(And its briefing note does list “modernizing” the ICO, as a “purpose” for the reform — in order to “[make] sure it has the capabilities and powers to take stronger action against organisations who breach data rules while requiring it to be more accountable to Parliament and the public”.)

However, on the flip side, if the government really intends to water down Brits’ privacy rights — by, say, letting businesses override the need to obtain consent to mine people’s info via a more expansive legitimate interest regime for commercial entities (something the government has been considering in the consultation) — then the question is how that would square with a top-line claim for the reform ensuring “UK citizens’ personal data is protected to a gold standard”?

The overarching question here is whose “gold standard” the UK is intending to meet? Brexiters might scream for their own yellow streak — but the reality is there are wider forces at play once you’re talking about data exports.

Despite Johnson’s government’s fondness for ‘Brexit freedom’ rhetoric, when it comes to data protection law the UK’s hands are tied by the need to continue meeting the EU’s privacy standards, which require an equivalent level of protection for citizens’ data outside the bloc — at least if the UK wants data to keep flowing freely into the country from the bloc’s ~447M citizens, i.e. to all those UK businesses keen to sell digital services to Europeans.

This free flow of data is governed by a so-called adequacy decision, which the European Commission granted the UK in June last year — essentially because no changes had (yet) been made to domestic rules since the UK incorporated the bloc’s General Data Protection Regulation (GDPR) into its own law in 2018.

And the Commission simultaneously warned that any attempt by the UK to weaken domestic data protection rules — and thereby degrade fundamental protections for EU citizens’ data exported to the UK — would risk an intervention. Put simply, that means the EU could revoke adequacy — requiring all EU-UK data flows to be assessed for legality on a case-by-case basis, vastly ramping up compliance costs for UK businesses wanting to import EU data.

Last year’s adequacy agreement also came with a baked-in sunset clause of four years — meaning it will be up for automatic review in 2025. Ergo, the amount of wiggle room the UK government has here is highly limited. Unless it’s truly intent on digging ever deeper into the lunatic sinkhole of Brexit by gutting this substantial and actually expanding sunlit upland of the economy (digital services).

The cost — in pure compliance terms — of the UK losing EU adequacy has been estimated at between £1BN-£1.6BN. But the true cost in lost business/less scaling would likely be far higher.

The government’s briefing note on its legislative program itself notes that the UK’s data market represented around 4% of GDP in 2020; also pointing out that data-enabled trade makes up the largest part of international services trade (accounting for exports of £234BN in 2019).

It’s also notable that Johnson’s government has never set out a clear economic case for tearing up UK data protection rules.

The briefing note continues to gloss over that rather salient detail — saying that analysis by the Department for Digital, Culture, Media and Sport (DCMS) “indicates our reforms will create over £1BN in business savings over ten years by reducing burdens on businesses of all sizes”; but without specifying exactly what regulatory changes it’s attaching those theoretical savings to.

And that’s important because — keep in mind — if the touted compliance savings are created by shrinking citizens’ data protections that risks the UK’s adequacy status with the EU — which, if lost, would swiftly lead to at least £1BN in increased compliance costs around EU-UK data flows… thereby wiping out the claimed “business savings” from ‘less privacy red tape’.

The government does cite a 2018 economic analysis by DCMS and a tech consultancy, called Ctrl-Shift, which it says estimated that the “productivity and competition benefits enabled by safe and efficient data flows would create a £27.8BN uplift in UK GDP”. But the keywords in that sentence are “safe and efficient”; whereas unsafe EU-UK data flows would face being slowed and/or suspended — at great cost to UK GDP…

The whole “data reform bill” bid does risk feeling like a bad-faith PR exercise by Johnson’s thick-on-spin, thin-on-substance government — i.e. to try to claim a Brexit ‘boon’ where there is, in fact, none.

See also this “key fact” which accompanies the government’s spiel on the reform — claiming:

“The UK General Data Protection Regulation and Data Protection Act 2018 are highly complex and prescriptive pieces of legislation. They encourage excessive paperwork, and create burdens on businesses with little benefit to citizens. Because we have left the EU, we now have the opportunity to reform the data protection framework. This Bill will reduce burdens on businesses as well as provide clarity to researchers on how best to use personal data.”

Firstly, the UK chose to enact those pieces of legislation after the 2016 Brexit vote to leave the EU. Indeed, it was a Conservative government (not led by Johnson at that time) that passed these “highly complex and prescriptive pieces of legislation”.

Moreover, back in 2017, the former digital secretary Matt Hancock described the EU GDPR as a “decent piece of legislation” — suggesting then that the UK would, essentially, end up continuing to mirror EU rules in this area because it’s in its interests to do so to in order to keep data flowing.

Fast forward five years and the Brexit bombast may have cranked up to Johnsonian levels of absurdity but the underlying necessity for the government to “maintain unhindered data flows”, as Hancock put it, hasn’t gone anywhere — or, well, assuming ministers haven’t abandoned the idea of actually trying to grow the economy.

But there again the government lists creating a “pro-growth” (and “trusted”) data protection framework as a key “purpose” for the data reform bill — one which it claims can both reduce “burdens” for businesses and “boosts the economy”. It just can’t tell you how it’ll pull that Brexit bunny out of the hat yet.

The market for synthetic data is bigger than you think

“By 2024, 60% of the data used for the development of AI and analytics projects will be synthetically generated.” This is a prediction from Gartner that you will find in almost every single article, deck, or press release related to synthetic data.

We are repeating this quote here despite its ubiquity because it says a lot about the total addressable market of synthetic data.

Let’s unpack: First, describing synthetic data as “synthetically generated” may seem tautological, but it is also quite clear: We are talking about data that is artificial/fake and created, rather than gathered in the real world.

Next, there’s the core of the prediction — that synthetic data will be used in the development of most AI and analytics projects. Since such projects are on the rise, the corollary is that the market for synthetic data is also set to grow.

Last but not least is the time horizon. In our startup world, 2024 is almost today, and people at Gartner already have a longer-term prediction: Some of its team published a piece of research titled “Forget About Your Real Data — Synthetic Data Is the Future of AI.”

“The future of AI” is the kind of promise that investors like to hear, so it’s no surprise that checks have been flowing into synthetic data startups.

In 2022 alone, MOSTLY AI raised a $25 million Series B round led by Molten Ventures; Datagen landed a $50 million Series B led by Scale Venture Partners, and Synthesis AI pocketed a $17 million Series A.

Synthetic data startups that have raised significant amounts of funding already serve a wide range of sectors, from banking and healthcare to transportation and retail. But they expect use cases to keep on expanding, both inside new sectors as well as those where synthetic data is already common.

To understand what’s happening, but also what’s coming if synthetic data does get more broadly adopted, we talked to various CEOs and VCs over the last few months. We learned about the two main categories of synthetic data companies, which sectors they address, how to size the market, and more.

The tip of the iceberg

Quiet Capital’s founding partner, Astasia Myers, is one of the investors bullish about synthetic data and its applications. She declined to disclose whether she invested in this space, but said that “there’s a lot to be excited about in the synthetic data world.”

Why the enthusiasm? “Because it gives teams faster access to data in a secure way at a lower cost,” she told TechCrunch.

“We can simply say that the TAM of synthetic data and the TAM of data will converge.” — Ofir Zuk (Chakon)

Access to large troves of data has become critical for machine learning teams, and real data is often not up to the task, for different reasons. This is the gap that synthetic data startups are hoping to fill.

There are two main contexts in which these startups focus: structured data and unstructured data. The former refers to the kind of datasets that sit in tables and spreadsheets, while the latter points toward what we could call media files, such as audio, text, and visual data.
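To make the structured case concrete, here’s a deliberately minimal, stdlib-only Python sketch of what a synthetic tabular data generator does at its simplest: fit per-column statistics on real rows, then sample artificial rows from those fits. The dataset and numbers are invented for illustration; commercial tools model the joint structure between columns (with GANs, copulas and the like) rather than independent marginals.

```python
import random
import statistics

# Toy "real" dataset: rows of (age, income). Purely illustrative values.
real_rows = [(34, 52000), (29, 48000), (41, 61000), (37, 55000)]

def fit_marginals(rows):
    """Estimate mean and standard deviation for each column."""
    columns = list(zip(*rows))
    return [(statistics.mean(c), statistics.stdev(c)) for c in columns]

def sample_synthetic(params, n, seed=0):
    """Draw n artificial rows from the fitted per-column Gaussians."""
    rng = random.Random(seed)  # seeded for reproducibility
    return [tuple(rng.gauss(mu, sd) for mu, sd in params) for _ in range(n)]

params = fit_marginals(real_rows)
fake = sample_synthetic(params, 3)
print(fake)  # three rows that resemble, but are not, the real records
```

Unstructured generators — synthetic images, audio, text — are a different beast entirely, typically built on generative neural networks, which underlines Myers’ point about the two categories serving different use cases and buyers.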

“It makes sense to distinguish between structured and unstructured synthetic data companies,” Myers said, “because the synthetic data type is applied to different use cases and therefore different buyers.”