
6 emerging AI and data trends for 2025

By Björn De Vidts, CEO @ Athumi

It’s that time of the year again, when everyone is eager to be inspired by trends and innovations for the fresh year. So I wanted to share my top 6 trends (I set out to list 5 but became too enthusiastic and allowed myself 1 more) that will define companies and projects working at the intersection of AI and data in the coming months and years - which is, of course, a truly essential area for my company Athumi.

1. Deep data: from emotions to actions

The most interesting (emerging) technologies of our times - from AI and robots to agents, IoT, AR, VR, quantum computing and many more - are extremely data-intensive and, as a result, also privacy-intensive. We have long passed the era when data was text-based only, when the hardware to gather richer data was far from ubiquitous and the software to analyze it was scarce. Today, we find ourselves at a point where both are converging:

  1. On the one hand, we see the rise of context-aware hardware that is able to read, see and listen. I’m talking about (humanoid) robots like those of Figure and Tesla (Optimus), XR devices like Apple Vision Pro, Meta Quest, the Meta Ray-Ban glasses and Meta’s prototype Orion, (self-driving) EVs, brain-computer interfaces (like those of Neuralink and Synchron), etc.
  2. On the other hand, many companies are also investing in software that is able to analyze this new type of data.

See and understand

For instance, some of the most underestimated launches of the year took place in the realm of Large World Models and spatial computing. One of them was OpenAI’s Sora, which does not just create fun (and, according to some, scary) videos but “teaches AI to understand and simulate the physical world in motion, with the goal of training models that help people solve problems that require real-world interaction.”

Another was the launch of World Labs by Fei-Fei Li, the “Godmother of AI”: “a spatial intelligence company building Large World Models (LWMs) to perceive, generate, and interact with the 3D world.” Simply put, these models teach (among other things) robots and AR glasses to deeply understand what they “see” and “hear”, up to the actual physics of the 3D world, which is a very hard thing for a computer system to do. The latter is part of Moravec’s paradox, which states that tasks that are easy for humans - like motor or social skills - are difficult for machines to replicate, whereas tasks that are difficult for humans - like performing mathematical calculations or large-scale data analysis - are relatively easy for machines to accomplish.

“Feel”

Companies are also investing quite some money in teaching their systems to understand emotions. A recent example is Emteq Labs’ ‘Sense’ glasses, which have inward-pointing cameras that aim to track the user’s life and quantify how they’re feeling. Hume AI recently introduced its latest voice-to-voice foundation model, EVI 2 - with which you can have remarkably human-like voice conversations - and made it “emotionally intelligent” (well, you know...). OpenAI’s GPT-4o voice model is also said to be able to detect the emotions of the user. In its demo, for instance, it listened to an executive’s breathing and encouraged him to calm down.

Act

Not only are AI systems starting to understand context and emotion, they are also nearing a point where they will be able to act on behalf of users, through the Large Action Models of the much-hyped AI agents. OpenAI, for instance, is working on an AI agent codenamed 'Operator' that will be able to use a computer to take actions on users' behalf. Google is working on Jarvis, an advanced AI agent capable of autonomously browsing the web, making purchases, and completing online tasks for users. Anthropic updated its flagship model, which can now attempt to control your screen to perform tasks in your apps.
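To make the idea of an “action model” concrete, here is a deliberately minimal sketch of the plan-act-observe loop behind such agents. Everything in it is invented for illustration - the tool names, the scripted stand-in for the model, the whole scenario - and it is in no way OpenAI’s, Google’s or Anthropic’s actual code:

```python
def run_agent(goal: str, next_action, tools: dict, max_steps: int = 10) -> str:
    """Drive a plan-act-observe loop until the model says it is done."""
    history = [f"GOAL: {goal}"]
    for _ in range(max_steps):
        action = next_action(history)            # the model proposes the next step
        if action["tool"] == "done":
            return action["result"]              # goal reached, report back
        result = tools[action["tool"]](**action["args"])  # act on the user's behalf
        history.append(f"{action['tool']} -> {result}")   # observe, then loop
    raise TimeoutError("agent did not finish within the step budget")

# A scripted stand-in for the model, purely to make the sketch runnable.
script = iter([
    {"tool": "search", "args": {"query": "morning train Brussels-Ghent"}},
    {"tool": "book", "args": {"item": "IC 512 at 09:03"}},
    {"tool": "done", "result": "Booked IC 512 at 09:03"},
])
tools = {
    "search": lambda query: f"found: {query}",
    "book": lambda item: f"confirmed: {item}",
}

print(run_agent("book my usual morning train", lambda history: next(script), tools))
```

The essence of the pattern: the model no longer just answers. In every iteration it chooses a tool, the tool touches the real world (a browser, a purchase, an app), and the observation is fed back into the next decision. That loop is also exactly why these agents need so much, and such sensitive, data.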

Computer systems have always been fed on data, of course. But the big change I see is that all of these emerging technologies are gathering even MORE data and more TYPES of data. And this will obviously bring with it quite some privacy, security, control and compliance challenges - all of which, I’m proud to say, my company Athumi can help address.

2. A raging war for data

After the war for talent, we are now entering the war for data. As with the trend described above, the essence of this is not new, but Generative AI and Large Language Models (LLMs) like Gemini (Google), Claude (Anthropic) and GPT (OpenAI) have truly changed the scale. And the stakes.

If we want Generative AI models for consumers and for companies to become smarter, they need to be fed with an enormous amount of data. And market leaders like OpenAI and Perplexity are running into all kinds of trouble amassing it, since they tend to take an “ask for forgiveness, not permission” approach.

The media and creative industries, for instance, are deeply unamused about the unauthorized use of their data. The New York Times and Daily News filed lawsuits against OpenAI, alleging that it scraped their data without permission. The former also sent a cease-and-desist letter to Perplexity, the AI search engine startup, demanding that it stop accessing content from its site. Facebook, Instagram, Craigslist, Tumblr, The New York Times, The Financial Times, The Atlantic, Vox Media, the USA Today network, and WIRED’s parent company, Condé Nast, are among the many organizations that exclude their data from Apple’s AI training. Over 25,000 creatives, including Thom Yorke and Julianne Moore, also signed a statement addressing AI companies’ use of copyrighted work: they feel that training AI models on their work without permission threatens their professional livelihoods and creative rights.

Others chose collaboration and compensation over resistance. Meta, for instance, signed a licensing deal with Reuters to train its AI models on high-quality news content. Condé Nast struck a multiyear partnership with OpenAI. And Cloudflare launched a marketplace where website owners can sell AI companies access to their content for scraping, aiming to address the issue of uncompensated content use in AI models.

Fight or collaborate?

Of course, IP and copyright problems aren’t the only challenge here; privacy is also a big one. Many people are worried. A survey by PrivacyHawk, for instance, found that nearly half of the US population (45%) is “very or extremely concerned about their personal data being exploited, breached, or exposed,” while about 94% are “generally concerned.” Only about 6% were not concerned at all about their personal data risk. Interestingly, nearly 90% said they would like to get a “privacy score,” similar to a credit score, that shows how exposed their data is. If you add these concerns to trend #1 described above - where the amount and the types of sensitive data being amassed and analyzed are only increasing - then you’ll understand why we need to act now.

Once these LLMs and other models become more trustworthy and (mostly) stop hallucinating, we’ll probably see massive uptake in companies as well, with so-called enterprise GenAI. That means that, for instance, a pharmaceutical company could add its own proprietary data to an existing model like GPT-4, for internal or external use. Even better, it could work together with companies from other industries, like cosmetics, sports and healthcare, to combine their data. And that would raise the exact same IP, copyright and privacy problems as described above for consumer GenAI.

For me, far more interesting than companies fighting for the best clean data will be the moment they start working together and sharing their data to develop applications that could help solve some of the world’s biggest challenges, like climate change or health hazards.

3. Slop-py models

Slop - poor-quality synthetic data which is not human-made but created by AI systems - is quickly becoming one of the major challenges of Generative AI and other types of data applications. The biggest danger of slop and other synthetic data is that it can lead to Habsburg AI and model collapse.

Habsburg AI is a term coined by data researcher Jathan Sadowski. This is how he himself described it on X: “Habsburg AI – a system that is so heavily trained on the outputs of other generative AI's that it becomes an inbred mutant, likely with exaggerated, grotesque features.”

Habsburg AI

The term draws an analogy to the Habsburg dynasty, a European royal family known for extensive intermarriage, which led to inbreeding and genetic anomalies over generations. Similarly, AI models trained extensively on AI-generated data risk developing exaggerated, distorted, grotesque and potentially nonsensical features, deviating from the richness and nuance of human-generated data. This phenomenon is also called model collapse.
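A toy simulation makes the mechanism tangible. The sketch below is my own illustration, not how any real LLM is trained: the “model” simply fits a mean and a spread to its training data, and every new generation is trained purely on the previous model’s output. Because generative models tend to over-produce their most likely outputs, the sketch crudely mimics that by discarding the tails:

```python
import numpy as np

rng = np.random.default_rng(0)

# Generation 0: "human-made" data from a rich, wide distribution.
data = rng.normal(loc=0.0, scale=1.0, size=5_000)

for generation in range(1, 11):
    # "Train" a trivial generative model: estimate mean and spread.
    mu, sigma = data.mean(), data.std()
    # Produce synthetic training data for the next generation,
    # over-representing high-likelihood samples (the tails get dropped).
    samples = rng.normal(loc=mu, scale=sigma, size=20_000)
    data = samples[np.abs(samples - mu) < 1.5 * sigma]
    print(f"generation {generation:2d}: spread = {data.std():.3f}")
```

Run it and the spread shrinks generation after generation: the rare, tail-end outputs vanish first, until the model can only reproduce an ever-narrower average. That is the statistical version of the inbreeding the Habsburg metaphor points at.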

If we want the data economy to live up to its promises, data quality will be key. The fact that huge amounts of data are being gathered and produced does not mean that this data is clean, human-made and relevant. One way of dealing with the slop and synthetic data problem would be to create data pods for human users. These would function as a kind of certificate making clear that the data was made by humans, and is thus a clean source for training models. With, as a cherry on the cake, the possibility for users to track and trace who uses their data and why.
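What could such a pod look like in code? Below is a minimal sketch of the concept. The class names, fields and API are entirely mine, invented for illustration, and say nothing about how Athumi’s actual pod technology is built; the sketch only captures the three ingredients from the paragraph above: a human-provenance certificate, owner-controlled consent, and a traceable access log.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AccessRecord:
    accessor: str    # who read the data
    purpose: str     # why they said they needed it
    timestamp: datetime

@dataclass
class PodEntry:
    owner: str
    content: str
    human_made: bool = True                       # the "certificate": provenance flag
    allowed: set = field(default_factory=set)     # accessors the owner consented to
    access_log: list = field(default_factory=list)

    def read(self, accessor: str, purpose: str) -> str:
        if accessor not in self.allowed:
            raise PermissionError(f"{self.owner} gave no consent to {accessor}")
        # Every access is logged, so the owner can track and trace usage.
        self.access_log.append(
            AccessRecord(accessor, purpose, datetime.now(timezone.utc)))
        return self.content

# Usage: a trainer only ingests entries that are consented to AND human-made.
entry = PodEntry(owner="alice", content="a short story")
entry.allowed.add("model-trainer-x")
if entry.human_made:
    text = entry.read("model-trainer-x", purpose="LLM training")
print([(r.accessor, r.purpose) for r in entry.access_log])
```

The design choice that matters here is that consent and logging live with the data itself, not with whichever company happens to hold a copy.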

4. A reshuffle of trust

The age of Generative AI also ushered in a complete reshuffle of trust, on many different levels. First of all, users are worried about what big tech companies are using their data for, as mentioned above. Then there is also its hallucination problem, leading to only 29% of users agreeing that they would trust information from GenAI (Forrester).

There is also a very interesting phenomenon where users tend to like interacting with Generative AI, until they learn that they are talking to a machine and not a human. Especially when it comes to emotions like empathy, AI aversion is still a big hurdle for developing trust. So it’s probably not a surprise that 73% of GenAI-aware online adults agree that companies should disclose when they use the technology to interact with them (Forrester).

The proliferation of slop on the one hand, and of misinformation and even scamming - both exponentially increased by Generative AI - on the other, also leaves us feeling a lot of doubt when we scroll our social media accounts or read our e-mails. I find myself wondering “is this real?” more and more when I encounter phenomena or news items in my feed that are more radical or weird - even in cases where they turn out not to be fabricated.

There has been a lot of resistance to technology, and not just for the reasons described above: people also fear that AI will cost them their jobs, that it’s bad for the environment, or that it is responsible for the collapse of democracy. It’s safe to say that AI, and technology in general, have a trust problem.

Trust by design

That is why I am such a big believer in trust by design, where you embed trust in the system rather than taking restrictive measures. It is not exactly a new evolution, of course; it probably started with some of the sharing economy pioneers, like Airbnb. Remember when your mother told you never to get into a car with a complete stranger? Uber solved that problem with extreme transparency and simple ratings: when you misbehave in an Uber, everyone will know it, and others may choose not to ride with you or not to pick you up.

My company Athumi follows the same concept of embedding trust in the system, but goes a lot further than that: we make every “data subject” a “data controller”. That means that everyone who leaves data traces gets to choose who may use them and who may not. Many of the trust problems created by AI can and should be tackled by such a digital trust approach.

5. No more USP

One of AI’s biggest advantages is also one of its biggest hurdles: it is a great leveler. For instance, a study conducted by the Boston Consulting Group revealed that generative AI functions as a skill equalizer: "The consultants who scored the lowest at the start of the experiment experienced the largest improvement in performance—43%—when they used AI. Meanwhile, the top-performing consultants also saw gains, but to a lesser extent." This means that AI is highly effective at improving poor performance but is less adept at turning mediocre work into excellence or raising top performers to an even higher standard.

Peter Hinssen calls this phase “The end of Awful”, where the quality distribution of work, content and other output in organizations will fundamentally be altered by AI. Mediocre will explode. Awful will disappear. Brilliant will keep existing, but it might also become more difficult to uncover in a sea of mediocrity.

[Figure: “The end of Awful” quality shift. Source: Peter Hinssen]

The biggest question, however, may be this: if everyone uses AI and all companies become hyper efficient, super productive and extra intelligent, then how will companies differentiate themselves? What will their competitive advantage be? In other words: how can they make sure that they are the brilliant ones?

Well, one differentiator will be talent, of course. Just like the war for data, there will still be a ruthless war for talent, where human creativity and a premium human service might become a USP. On the other hand, it will no longer be the algorithms or even heaps of company data that make you stand out. Your competitors will have access to similar data, and they will use the same models and the same algorithms as you to analyze it.

What will allow you to be different, then, has everything to do with connections: between departments, companies and industries. The greater the amount and the more types of data, the better the products and services that will emerge from them.

I believe that the winners in this coming age of mediocrity will be those companies that organize themselves in effective ecosystems and sharing collaborations, so that they have access to a massive amount of data. If an energy company only mines its own data, it will never reach the greatness of insight that emerges when it shares that data with other companies, especially ones from different sectors like mobility, hospitality, entertainment and smart cities.

There is a parallel with how we build software. The impact of software used to be measured by the number of lines of code: the more complex it was, the greater the result. Today, the number of lines of code is no longer a differentiator; what matters is the amount of data the code can access. I truly believe that in the near future, the depth and scale of the ecosystems that companies build around themselves will become their true USP.

6. Sovereign AI & the AI race

One of the darker sides of AI is national inequality. Some countries, like the US and China, are significantly ahead in the race, while many European nations - with some exceptions like the UK, Germany and France - are lagging. So, in our current geopolitically unstable environment, we see an increasing number of countries planning and building their own domestic AI infrastructure, expertise and industry in order to reduce reliance on foreign AI technologies, gear up their competitiveness and safeguard their future. They are developing what they consider their own “Sovereign AI”, which I believe will only grow in importance in the coming years.

Stimulating Sovereign AI is essential in the current context. It is about having geopolitical leverage and, above all, about minimizing our own (supply chain) vulnerability and growing our autonomy. Just think about what would happen if Europeans no longer had access to some of the leading Generative AI ecosystems because of the current geopolitical volatility. It’s unlikely, but not impossible.

According to the World Economic Forum, nations that want to build Sovereign AI tend to be guided by these six strategic pillars:

  1. Digital infrastructure: Sovereign AI relies on robust digital infrastructure, featuring advanced data centers and data localization policies to ensure efficient processing, enhanced security, and effective deployment of AI technologies.
  2. Workforce development: A skilled workforce is essential for AI advancement, requiring investments in STEM education, updated curricula, vocational training, and lifelong learning to cultivate talent and drive national AI innovation.
  3. Research, development and innovation (RDI): Investing in RDI is crucial for advancing AI, requiring government incentives, funding, and a collaborative innovation ecosystem among industry, academia, and investors to drive breakthroughs and global competitiveness.
  4. Regulatory and ethical framework: Balancing innovation with ethics and compliance requires a comprehensive framework that ensures responsible AI development through clear guidelines, oversight, and accountability in areas like privacy, transparency, and cybersecurity.
  5. Stimulating AI industry: Fostering AI-driven growth requires government incentives, public sector adoption, and public-private partnerships to drive innovation, entrepreneurship, and economic advancement across vital sectors.
  6. International cooperation: While developing Sovereign AI is about harnessing capabilities within national borders, its development also requires international cooperation to establish global standards, enable secure data flows, address shared challenges, and accelerate progress through collaborative projects for mutual benefit.

This is exactly why we at Athumi have ambitions on a European level (and are, in fact, already in the process of building European data ecosystems): we believe that we need to put our own critical data to work in our economies, so we can help drive growth on our continent too. We are at risk of becoming a mere data colony for AI giants like China and the US, while we have so much talent and so many ideas at our disposal. Just remember that DeepMind was British before it was acquired by Google, and that there is a reason why Microsoft invested in Mistral AI. Above all, building a powerful European Sovereign AI economy will be about collaboration and sharing data across companies, industries and borders.

As the AI era unfolds, it is clear that the convergence of advanced hardware and software is reshaping the foundations of innovation and progress. From enabling machines to understand emotions and actions, to making sure that companies can keep their competitive edge when AI becomes a commodity, to fostering global collaborations that address shared challenges, data will be a leading character in our future economy. And the accompanying privacy, ethical and trust challenges will be equally significant. Success in this new age will depend on embedding trust by design, fostering transparent practices that keep citizen and consumer data safe, and creating ecosystems that prioritize quality, collaboration and sovereignty. Which is why I’m so proud to be leading a company that can help with each of these challenges.