Computerspeak by Alexandru Voica

Researchers have a better definition for low resource languages; Physical Intelligence is a $1bn robotics startup; Slack surveys 17,000 workers about AI; tech giants investing in "sovereign AI"

Big labs struggle to build more advanced AI; xAI's supercomputer freaks out rivals; Amazon bets big on in-house AI chips; Donald Trump's AI policies explained; Andrew Ross Sorkin meets his AI clone

Alexandru Voica
Nov 15, 2024

I’m from Romania, the sixth largest member of the European Union by population. Despite being surrounded by mostly Slavic nations, Romanians speak a dialect of Vulgar Latin that separated from Western Romance languages such as Italian or Spanish between the 5th and the 8th centuries. Romanian has grown from fewer than 2,500 words to a lexicon of over 150,000 words in its contemporary form by demonstrating a high degree of lexical permeability. That vocabulary reflects contact with the indigenous Thraco-Dacian language, with the languages of nations that invaded or ruled over parts of Romania (such as Russian, Greek, Hungarian, German and Turkish), and with languages that served as cultural models during and after the Age of Enlightenment, in particular French, Italian and English.

Today, there are over 25 million speakers of Romanian around the world, yet many AI researchers consider it a low resource language. This may surprise you because the term immediately brings up the image of an isolated community with thousands of speakers living somewhere in the Global South, not an entire country with tens of millions of people communicating in a language directly derived from Latin.

A new paper presented this week by researchers from MBZUAI and UC Berkeley at the EMNLP conference in Miami explains why Romanian or Cherokee (a language spoken by 2,000 people in the United States) can both be considered low resource, albeit for different reasons.

The paper argues that the relationship between low and high resource languages is best understood through Zeno’s paradox of Achilles and the tortoise. Imagine Achilles forever pursuing a slow-moving tortoise with a head start. Despite his speed, Achilles can never quite reach the tortoise, a metaphor for the perpetual struggle of low resource languages to catch up to the ever-moving target set by well-resourced counterparts like English or Chinese.

Historically, low resource languages have been defined in a binary way, in direct contrast with high resource languages in terms of data availability. However, this approach oversimplifies the reality that resourcedness sits on a spectrum that is also influenced by critical socio-political and cultural dimensions. Criteria for labeling languages as low resource range from the number of speakers and available linguistic resources to economic conditions and digital presence. For example, while Quechua is spoken by millions, it is still considered low resource because it lacks the linguistic data and infrastructure needed for AI tasks. By contrast, some languages with smaller speaker populations but richer linguistic data are better represented in AI research.

With these factors in mind, the researchers propose that we should evaluate low resource languages by looking at four key dimensions that affect the resourcedness of a language:

  1. Socio-political context: Economic and historical factors shape language resources, especially for languages marginalized within their countries. Many communities lack the financial means to create language resources, and in some cases, language policies prioritize dominant languages, further marginalizing others.

  2. Human and digital resources: Low resource languages often lack essential human resources, such as linguistic experts or native speakers, as well as digital tools. The scarcity of trained AI researchers from these communities exacerbates the issue.

  3. Artifacts and infrastructure: These include curated linguistic data, computational tools, and other resources for language technology development. For many languages, even if data exists, it may lack consistency or standardization, complicating efforts to build effective language technologies such as AI models. In the case of Romanian, many existing datasets contain a significant amount of noise, with samples in Slavic languages incorrectly labeled as Romanian. This is partly due to the mistaken assumption that Romanian must be a Slavic language because it is spoken in an Eastern European country.

  4. Community agency: The degree of involvement by native-speaking communities significantly impacts the creation and adoption of language tools. Technologies built without considering community needs may have minimal real-world application, limiting their effectiveness.

This more precise classification of low resource languages is crucial for developing targeted interventions. Without a clear definition, measuring progress and addressing the specific needs of each language becomes nearly impossible. For example, if one language lacks community agency while another lacks linguistic data, they require different strategies to achieve technological equity.

But there’s another reason why this paper stood out to me. When ChatGPT exploded into the mainstream, the AI community began to debate when and how we will make the leap from foundation models to AGI. The problem was that everyone was using their own definition of AGI, so Google DeepMind, Anthropic, OpenAI and others began publishing their views on the topic, including frameworks for classifying the capabilities of AI. These positions then converged on a relatively stable definition of AGI, which allowed society at large to move from debating the progress we’re making in advancing AI to measuring it.

So while low resource languages may never fully overtake high resource languages, we now have a clearer definition of these concepts, which will hopefully lead to faster progress or, at least, a more accurate measurement of the distance between Achilles and the tortoise.

And now, here is this week’s news:

❤️Computer loves

Our top news picks for the week - your essential reading from the world of AI

  • Wired: Inside the Billion-Dollar Startup Bringing AI Into the Physical World

  • Business Insider: 5 interesting takeaways from Slack's survey of 17,000 desk workers about AI

  • CNBC: Tech giants are investing in ‘sovereign AI’ to help Europe cut its dependence on the U.S.

  • Harvard Business Review: Research: How Gen AI Is Already Impacting the Labor Market

  • Fortune: This United Nations AI official explains why she doesn’t want an international agency for AI

  • Bloomberg: OpenAI, Google and Anthropic Are Struggling to Build More Advanced AI

  • The Information: How Elon Musk’s Supercomputer Freaked Out AI Rivals

  • WSJ: It’s a Legacy Agriculture Company—And Your Newest AI Vendor

  • FT: Amazon steps up effort to build AI chips that can rival Nvidia

  • Fortune: Think Donald Trump’s AI policy plans are predictable? Prepare to be surprised

  • FT: AI groups rush to redesign model testing and create new benchmarks

  • Reuters: OpenAI and others seek new path to smarter AI as current methods hit limitations

  • The Information: OpenAI Shifts Strategy as Rate of ‘GPT’ AI Improvements Slows

  • WSJ: How ChatGPT Brought Down an Online Education Giant

  • Time: What Donald Trump’s Win Means For AI

  • New York Times: I Took a ‘Decision Holiday’ and Put A.I. in Charge of My Life

  • CNBC: Meet the AI version of Andrew Ross Sorkin and David Faber

⚙️Computer does

AI in the wild: how artificial intelligence is used across industry, from the internet, social media, and retail to transportation, healthcare, banking, and more

  • TechCrunch: ChatGPT can now read some of your Mac’s desktop apps

  • VentureBeat: This startup's AI platform could replace 90% of your accounting tasks—here's how

  • The Verge: YouTube is testing music remixes made by AI

  • The Verge: Particle is a new app using AI to organize and summarize the news

  • MIT Technology Review: Generative AI taught a robot dog to scramble around a new environment

  • Variety: Jerry Garcia’s AI-Created Voice Can Now Narrate Audiobooks, Articles and More

  • The Verge: Instagram could let AI generate a profile picture for you

  • Axios: Putting AI to work for public defenders

  • The Telegraph: NHS doctors get AI assistant to listen to appointments and make notes

  • The Telegraph: Blind woman has better than 20/20 vision after AI surgery

  • Washington Post: Randy Travis’s beautiful baritone was lost. AI helped him sing again.

🧑‍🎓Computer learns

Interesting trends and developments from various AI fields, companies and people

  • Washington Post: AI travel influencers are here. Human travelers hate it.

  • TechCrunch: TikTok plugs Getty Images into its AI-generated ads and avatars

  • Wired: The First Entirely AI-Generated Video Game Is Insanely Weird and Fun

  • Axios: NHL updates massive video trove, readying for an AI world

  • TechCrunch: Tiger Global-backed InVideo launches gen AI-based video creation

  • TechCrunch: AI pioneer Francois Chollet leaves Google

  • WSJ: The Wall Street Journal is testing AI article summaries

  • The Verge: More AI-generated ads are coming to TikTok

  • New York Times: Are A.I. Clones the Future of Dating? I Tried Them for Myself.

  • The Information: The Enterprise Search App That Got Google and OpenAI’s Attention

  • MIT Technology Review: Google DeepMind has a new way to look inside an AI’s “mind”

  • Business Insider: Amazon's AI chatbot Q is entering enemy turf by integrating with Microsoft's Office 365

  • Business Insider: Instead of killing jobs, there's a strange AI hiring boom happening, according to Marc Andreessen

  • Fortune: Europe’s AI industry watches Trump’s return with a mix of fear and hope

  • Business Insider: Golin's first chief AI officer shares the company's strategy for using AI to transform public relations

  • The Telegraph: Shakespeare’s poetry ‘not as good as AI’

  • Fortune: Elon Musk’s xAI safety whisperer just became an advisor to Scale AI

  • Bloomberg: OpenAI Nears Launch of AI Agent Tool to Automate Tasks for Users

  • VentureBeat: You can now run the most powerful open source AI models locally on Mac M4 computers, thanks to Exo Labs

  • Business Insider: The race for the best AI model is 'heated,' a TeamViewer tech executive says — here's how the company is leveraging it

  • Business Insider: Inside Forward's failed attempt to revolutionize the doctor's office

  • Axios: Study: Growth of AI adoption slows among U.S. workers

  • TechCrunch: DeepL launches DeepL Voice, real-time, text-based translations from voices and videos

  • TechCrunch: Marc Benioff says it’s ‘crazy talk’ that AI will hurt Salesforce, wants a billion AI agents in a year

  • TechCrunch: Almost all of this year’s top 40 startups at Station F use AI

  • TechCrunch: Perplexity brings ads to its platform

  • New York Times: Stand-Up, Drama and Spambots: The Creative World Takes On A.I.

  • CNBC: Startup CEO says humans won’t be needed for translation in 3 years as it launches AI app

  • VentureBeat: ServiceNow rolls out enterprise AI governance capabilities to accelerate production deployment

  • FT: Recruiters urge candidates to use AI to apply for jobs

  • Fortune: Glassdoor CEO talks about the hottest jobs in the AI boom—and the one job he thinks is phasing out

  • Sifted: Google, Meta and some of France's top universities: Where Mistral poaches its top talent from

  • MIT Technology Review: The AI lab waging a guerrilla war over exploitative AI

  • TechCrunch: Amazon attempts to lure AI researchers with $110M in grants and credits

  • Bloomberg: OpenAI Co-Founder Returns to Startup After Monthslong Leave

  • The Information: Ex-OpenAI CTO Murati’s New Team Takes Shape

  • VentureBeat: Qwen2.5-Coder just changed the game for AI programming—and it's free

  • VentureBeat: Magic Story launches AI-based media platform for children to create their own adventures

  • VentureBeat: Box continues to expand beyond just data sharing, with agent-driven enterprise AI studio and no-code apps

  • Fortune: Recession could create an ‘abrupt shift’ in AI adoption: ‘That’s when you really see the effects of automation’

  • Fortune: AT&T’s CEO says AI may cause power shortages and it could be ‘the next big social issue in the United States’

  • Reuters: Baidu bolsters AI lineup with enhanced text-to-image tech, no-code app builder

  • The Verge: Google’s AI ‘learning companion’ takes chatbot answers a step further

  • CNBC: China’s Alibaba releases AI search tool for small businesses in Europe and the Americas

  • FT: China’s Baidu joins Meta in race to make AI-integrated smart glasses

  • Lex Fridman: Dario Amodei: Anthropic CEO on Claude, AGI & the Future of AI & Humanity

  • VentureBeat: Google DeepMind open-sources AlphaFold 3, ushering in a new era for drug discovery and molecular biology

  • Reuters: Vatican unveils AI services for St. Peter's Basilica ahead of Jubilee

  • Import AI: Tencent's new Hunyuan model is a MoE triumph, and by some measures is world class

  • Wired: I Went Birding With the World’s First AI-Powered Binoculars

  • TechCrunch: OpenAI loses another lead safety researcher, Lilian Weng

  • TechCrunch: X is testing a free version of AI chatbot Grok

  • The Verge: Spotify’s AI is no match for a real DJ

  • The Verge: How to use the latest AI video editing tools in Google Photos

  • VentureBeat: Multimodal RAG is growing, here's the best way to get started

  • Wired: The AI Machine Gun of the Future Is Already Here

  • AFP: Robot artist Ai-Da’s portrait of Alan Turing shatters auction records selling for over $1 million—a first for AI artwork

  • MIT Technology Review: A bold AI movement is underway in Africa—but it is being held up

  • Business Insider: Google's head of research on whether 'learn to code' is still good advice in the age of AI

  • Business Insider: Nvidia CEO says there's 'no question' that we'll all be working alongside AI employees

  • Business Insider: Here's how far we are from AGI, according to the people developing it

  • Business Insider: Indeed prepares for 2025 rollout of its new AI tool, Pathfinder, which aims to help job seekers

  • Reuters: Web Summit kicks off in Lisbon as tech leaders weigh Trump’s return
