Computerspeak by Alexandru Voica

In UK AI policy, the vibes (and gloves) are off; a Neuralink implant gets a boost from AI; inside the org of small, AI-native startups; NIMBYs oppose AI infrastructure; 1,000 ways to use AI in health

Meta wants us to have AI friends; China AI market on course to hit $50b; Europe struggles to keep hold of its AI companies; meet an artist who's not afraid of AI; hallucinations in models are worse

Alexandru Voica
May 09, 2025
∙ Paid


The level of public discourse about AI hit a new low in the UK this week. For the first time in a while, I’m starting to get the sense that — and I can’t believe I’m writing this as an almost 40-year-old — the vibes are off.

On Wednesday, I listened to the full, four-hour debate in the House of Commons in which MPs discussed the Data (Use and Access) Bill and its amendments. MPs expressed apprehension about AI's impact on copyright and described many problems, but offered no clear strategies to address them. Then on Thursday, I attended Politico’s AI & Tech Summit which hosted a melee… (*checks notes*) apologies, a panel specifically on AI and copyright.

As I sat through both events, I kept trying to come up with a term that describes people who use simple, emotionally charged language to frame complex issues and fixate on the division between two groups, but nothing came to mind. Maybe it’ll come to me by the end of this article.

Nevertheless, the word transparency was brought up repeatedly during both conversations. Transparency is a great and noble principle, of course, but how you put it into practice is a different matter. For example, there are some who argue for absolute transparency regarding the data used in AI models and envision a public register where companies dump massive Excel spreadsheets with links and IDs. But that will only produce two predictable outcomes: a bureaucratic system akin to GDPR that favors only incumbents, and a myriad of performative technological solutions such as the cookie banners we mindlessly click on today.

A large AI model ingests tens of trillions of tokens scraped from billions of URLs plus private data. Publishing the full list of files or URLs would run to many terabytes and would need to be refreshed with every incremental fine-tune. Even if we envisage monthly updates, this would be barely feasible for UK AI startups whose models retrain continuously, though it may be achievable for their much better-funded American counterparts. So what is the likely outcome of such a measure? UK AI companies are forced to train smaller models and, over time, they get wiped out by big American tech.
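The scale argument can be checked with a quick back-of-envelope calculation. The figures below are illustrative assumptions, not measured numbers: suppose a frontier model trains on roughly ten billion scraped documents, and a register stores one record per document (URL, content hash, licence tag, timestamp).

```python
# Back-of-envelope estimate of a full training-data register.
# All figures are assumptions for illustration, not measured values.
num_records = 10_000_000_000   # ~10 billion scraped documents (assumption)
bytes_per_record = 300         # URL (~150 B) + SHA-256 hex (64 B) + metadata

register_bytes = num_records * bytes_per_record
register_tb = register_bytes / 1e12
print(f"Register size: ~{register_tb:.1f} TB per full refresh")
```

Even under these conservative assumptions the register lands in the terabyte range per refresh, and every incremental fine-tune or continuous retraining cycle would require regenerating and re-publishing it.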

A register that literally enumerates every record also risks re-exposing personal data that the GDPR obliges developers to minimize or anonymize. AI models are not built like traditional software — they require a lot of experimentation. Therefore, dataset composition, filtering tricks, and dead-end experiments are often the secret sauce of model quality. Publishing them all would wipe out any advantage that a smaller company such as Mistral has over OpenAI or Anthropic. It would also likely prevent the emergence of a European DeepSeek.

Lastly, regulators would need to audit petabytes of material to check if the register is honest. This is an unfunded mandate that today’s ICO, CMA or EU AI Office simply cannot meet.

So what are some real-world solutions? We can start by limiting disclosure to a sufficiently detailed summary rather than raw data, and focusing on general purpose AI models rather than domain-specific ones. The companies developing these general purpose models would have to publish a concise, standardised summary that lists each major dataset, its provenance (e.g., Common Crawl, licensed, synthetic, etc.), date range, basic diversity statistics, and filtering steps. This could be enough for rights holders, civil-society auditors and competing developers to understand where risk may lie, without revealing every byte.
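To make this concrete, here is a minimal sketch of what such a standardised summary might look like, serialised as JSON. Every field name and value below is hypothetical — this is not an existing or proposed standard, just one plausible shape for a "nutrition label" covering provenance, date range, composition statistics, and filtering steps.

```python
import json

# Hypothetical training-data summary ("nutrition label") for one model.
# Field names and values are illustrative, not a real standard.
summary = {
    "model": "example-model-v1",
    "datasets": [
        {
            "name": "web-crawl",
            "provenance": "Common Crawl",            # e.g. crawled / licensed / synthetic
            "date_range": ["2019-01", "2024-12"],
            "share_of_tokens": 0.62,                  # basic composition statistic
            "languages": {"en": 0.71, "de": 0.06, "other": 0.23},
            "filtering": ["deduplication", "toxicity filter", "robots.txt opt-outs"],
        },
        {
            "name": "licensed-news",
            "provenance": "licensed",
            "date_range": ["2000-01", "2024-12"],
            "share_of_tokens": 0.08,
            "filtering": ["licence verification"],
        },
    ],
}

print(json.dumps(summary, indent=2))
```

A few kilobytes of structured metadata like this tells rights holders and auditors where risk may lie, without publishing the terabytes of underlying records.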

Then, we can build the infrastructure that would allow for secure deposit and accredited audits, with a trusted authority under strict confidentiality. This authority would accredit auditors (similar to the ISO model) who can check compliance and publish an assurance statement. This satisfies the “show your working” demand without a public data dump.

Finally, we can close the transparency loop with a rights holder query portal where a creator or publisher can upload hashes or small samples. The system returns a yes/no on whether their work is included in an AI model and — if yes — provides a licensing or takedown workflow. This aligns with the intent of offering granular transparency for rights holders while limiting broad public exposure.
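The query portal's core mechanic can be sketched in a few lines. This is a simplified illustration under stated assumptions: the portal operator privately holds an index of content hashes of training documents (never published), and a rights holder's uploaded sample is hashed and checked for membership.

```python
import hashlib

# Hypothetical private index held by the portal operator: SHA-256 hashes
# of training documents. The index itself is never made public.
training_index = {
    hashlib.sha256(doc).hexdigest()
    for doc in [b"sample article text", b"another training document"]
}

def query(work_sample: bytes) -> bool:
    """Return True if this exact sample appears in the training set."""
    return hashlib.sha256(work_sample).hexdigest() in training_index

print(query(b"sample article text"))  # a rights holder's uploaded sample -> True
print(query(b"an unrelated novel"))   # -> False
```

One honest caveat: exact hashing only matches byte-identical content, so a production system would also need near-duplicate or perceptual matching to catch reformatted or excerpted works — a harder problem, but the yes/no interface stays the same.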

I heard none of these solutions this week. Instead, people got red in the face a lot and shouted “Trumpian tech bro” and “luddite” at each other, hoping for a good headline.

I’ve always been of the opinion that it is far easier to track money flows than petabytes of individual files. So the creative industry should put down its “leave AI to sci-fi” signs and come back to the real world with practical solutions for high-value copyrighted corpora, such as collective licences that are fit for purpose, take months rather than years to sign, and are accessible to AI companies of all sizes. On the other hand, the AI industry should speed up the adoption of technical standards such as C2PA and not be afraid to call out bullshit when AI crawlers ignore them.

Investors and developers need predictable, proportionate rules; regulators and creators need verifiable insight. A layered regime that combines public nutrition labels, confidential audit access, and targeted tools for rights holders delivers accountability without paralysing UK AI companies.

And now, here’s the week’s news:

❤️Computer loves

Our top news picks for the week - your essential reading from the world of AI

  • MIT Technology Review: This patient’s Neuralink brain implant gets a boost from generative AI

  • Bloomberg: Built to Stay Small: Inside the Org Charts of AI-Native Startups

  • Semafor: NIMBYism hits US AI infrastructure buildout

  • New Yorker: Everyone Is Cheating Their Way Through College

  • WSJ: Zuckerberg’s Grand Vision: Most of Your Friends Will Be AI

  • Sifted: Meet the 18-year-old dropout building the AI agent to rule them all

  • Business Insider: A Caltech professor who led Nvidia's AI lab says AI can't replace this one skill

  • Bloomberg: Nvidia CEO Says China AI Market Is on Course to Hit $50 Billion

  • Fortune: Europe should be a global AI leader given its talent pool, but the trouble is keeping companies there

  • The Guardian: Ministers reconsider changes to UK copyright law ahead of vote

  • Time: Why This Artist Isn’t Afraid of AI’s Role in the Future of Art

  • Forbes: This Startup Is Using AI Agents To Fight Malicious Ads And Impersonator Accounts

  • The New York Times: A.I. Is Getting More Powerful, but Its Hallucinations Are Getting Worse

  • WSJ: UnitedHealth Now Has 1,000 AI Use Cases, Including in Claims

  • Bloomberg: Google Can Train Search AI With Web Content After AI Opt-Out

© 2025 Alexandru Voica