Interview on generative AI

I was interviewed by Elena de Sus of the Spanish progressive magazine CTXT (Contexto y Acción). The article in Spanish is published on their website. This is the English version of the interview, published with the kind permission of CTXT.

Wim Vanderbauwhede is Professor in Computing Science at the University of Glasgow, where he leads the Low Carbon and Sustainable Computing research group.

He has written about the high energy consumption of the Large Language Models used for generative AI such as ChatGPT. He posits that their projected expansion is not sustainable. He is also skeptical that advances in efficiency will lead to a reduction in emissions from this industry.

He talked to CTXT over video call.

You research low-carbon and sustainable computing. How did you get interested in the topic?

I have been aware of climate change for a very long time, since the 1980s. After all, it is nothing new. I am originally from Belgium, and when I lived there I was active in an environmental organization; I did volunteer work with them.

For my academic career, my focus has been mostly on improving the efficiency of computers. But it’s been known for a long time that if you improve the efficiency of something, usually it means it gets cheaper and then you get more demand, and because you get more demand, your emissions actually go up, not down.

The whole history of the Industrial Revolution is one of improved efficiency. The efficiency gains of the steam engine led to us burning lots of coal, because it allowed the mines to be pumped out more efficiently, so mining coal became cheaper.

Computers have become literally millions of times more efficient since the 1930s or 40s. At the same time their use has become ubiquitous, so the total emissions from computing have gone up despite all the efficiency savings. There was this conflict with doing efficiency work, and I wanted to look at the sustainability of computing more widely. So some years ago I got the opportunity, with the support of the head of the department where I work, to start a new research activity: the Low Carbon and Sustainable Computing group, which we have now.

The term I use is frugal computing. The frugal computing message is that we should use fewer computing resources, just as we should use less of any resource if we don't want catastrophic climate change. We should not go for growth in terms of resource consumption and energy consumption, because that is destructive. Our whole societal model is built to encourage us to use more resources and more energy. But that is not a sustainable model.

And the opposite is happening because we are developing very resource intensive stuff like generative AI and Bitcoin.

At some point before the AI hype started, there was the Bitcoin bubble, and it looked like Bitcoin might end up using a huge amount of compute resources. But Bitcoin is not a viable currency for something like a nation state; the former finance minister of Greece, Yanis Varoufakis, has written extensively to explain why. If any evidence were needed, El Salvador has abandoned Bitcoin as its currency. That means that Bitcoin and the derived models will probably remain rather popular in some circles, but not grow dramatically, and therefore the emissions will probably not go up a lot. Also, cryptocurrencies like Ethereum, based on the proof-of-stake protocol as opposed to proof of work, have become more popular. The carbon footprint of those is literally a hundred times smaller, so the emissions from cryptocurrencies haven't grown dramatically. The current amount of emissions is not dramatic, so if it stays that way it is not really a problem. It could have been different, but it didn't turn out like that.
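
To make concrete where the proof-of-work energy cost comes from, here is a minimal toy sketch of the brute-force nonce search that proof-of-work miners perform. This is only an illustration of the general mechanism, not any real cryptocurrency's implementation, and the difficulty value is chosen to be trivially small:

```python
# Toy proof-of-work sketch: keep hashing until the digest meets a difficulty
# target. Every failed attempt is wasted computation, which is where the
# energy cost of proof-of-work mining comes from. Illustrative only.
import hashlib

def mine(block_data: str, difficulty: int) -> tuple[int, str]:
    """Try nonces until the SHA-256 hash starts with `difficulty` zero hex digits."""
    target = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce, digest
        nonce += 1

nonce, digest = mine("example block", difficulty=4)  # ~65,000 hashes on average
print(nonce, digest)
# Proof of stake replaces this search with validator selection weighted by
# stake, which is why its energy footprint is orders of magnitude smaller.
```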

AI is a different issue, because there's huge support from governments around the world. Everybody seems to think it's like magic and it will create unlimited growth. Or maybe even if they don't believe that, they act as if they believe it. That means this is a serious driver for producing more computer chips, building more data centers and generating more electricity. And at the moment, around 70% of electricity is still from fossil fuels. So it means we're just going to burn more coal.

So the problem here is the state support of AI?

It works as a delaying factor. If there is a hype bubble, it would normally collapse by itself, because people start to see that there’s nothing in it. But if governments think this is a good idea and they should invest in it, they will commit the investments and the investments will happen even after people start to see that it wasn’t worth it, because they are slow. So the whole thing gets delayed. That’s really what happens. Not so much that it stops the process. It just causes a delay. But in that delay, of course, you create more emissions.

These days it’s very hard to get economic growth. And if you believe that you must get growth, then anything that promises to give you growth is something that you may want to look into. The UK government, for example, is like that, and the US government too: they think that AI is going to give them growth, so they commit to investments in that area, and those investments will happen. Even if the bubble were to burst this year, they would probably still happen, because things have already been set in motion. It’s not that governments drive the hype; they just don’t help. Of course, if your government is saying that AI is good, then it’s much harder for the ordinary person to say that AI is bad.

With the launch of generative AI models by the Chinese company DeepSeek, which are apparently more efficient and had to get around the chip export constraints for China, it looks like the bubble has burst. Nvidia stock has gone down and there has been a lot of discourse about the fact that we don’t need that many data centers if we can have a more efficient AI. But I think you are skeptical about this.

I looked at DeepSeek, based on the information that they have chosen to release. To start with, the narrative that they had to use less capable computer chips because of the export restrictions of the US government is, in short, not true. I’ll explain why. There are export restrictions on GPUs, and to satisfy these restrictions Nvidia has created a special series of chips for the Chinese market. These are less capable in one specific respect: double-precision floating-point performance. Now, AI does not use double-precision floating-point arithmetic. It is needed for supercomputers that do scientific computing. But the Chinese have their own very good supercomputers for scientific computing, so they’re not buying Nvidia GPUs for their supercomputers. They’re buying them for AI, and for AI you don’t need this.

OpenAI, Google, and the rest of the US companies use an Nvidia GPU called the A100 for running the model. For training, they use a more capable type called the H100. For the Chinese market, Nvidia sells the A800 and H800. In their whitepaper, DeepSeek say that they used the H800. Now, the H800 is more capable than the A100 in almost every respect. It is a little less capable in networking: if you combine several of these GPUs in a network, the bandwidth of the network is lower, and in that paper they explain what they did to get around that. That’s nice engineering, but it doesn’t get you that much of a benefit.

All in all, it’s not as if this is some really constrained compute device. This is high end. This is actually better than what most of the large companies use now for their data centers.

DeepSeek has been very clever. They have this app that people started to like, their pricing is competitive, and they have a lot of smaller models that people can play with. I think that’s what the media has jumped on, these smaller models. But that’s nothing special: Meta has also released open-source smaller models with Llama. Those are not really open source, but then DeepSeek is also not really open source either, because the data set is not open; it’s only the code that does the compute that is released, as a binary. But that’s a very different discussion.

I think this was a clever way to say that they have a small model, but what runs the main inference is not all that small. Compared to GPT-4, if they can really get the same performance, then they have done something quite clever, because they use far fewer of the parameters at any one time. So it will be somewhat more energy efficient, but not all that much.

I mean, this idea is clever; they show that it works, and that is a good thing. But we get the same problem: if their pricing is competitive, it means it’s cheaper, so more people will use it. So it’s very likely that there will not really be a decrease in emissions as a result. It might very well be an increase, if the company gets really big.

We talk a lot about the cost of training the generative AI models, but you have written that using them is a lot more costly.

Yes. And that’s true both in terms of environmental cost and in terms of financial cost. I’m not the only one to write about this; there are lots of people who look at the economic costs and say the training cost is fast becoming just a detail. I have calculated it.

For inference (running prompts), the cost scales with the number of users, whereas the cost of training only scales if you make a bigger model. And this is probably where DeepSeek has done something clever, because their cluster is not very big: they managed to train the model on a smaller cluster, which saved them on the initial cost, because they started as a small company. So that’s the training cost. But if they are going to become a big company, then they will need lots of data centers for serving all the queries, and that will be the dominant cost. If you’re an AI company that works with big models and you train them, that’s a huge cost, so you need a huge number of users to make it profitable. But to serve all those users, you need a lot of hardware, and that’s expensive.
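
To illustrate that reasoning, here is a toy cost model. All the numbers in it are hypothetical, purely to show how a one-off training cost gets dwarfed by per-query inference costs once the user base grows:

```python
# Toy cost model: training is a one-off cost, inference scales with usage.
# All numbers below are hypothetical, for illustration only.

def inference_cost(users: int, queries_per_user_per_day: float,
                   cost_per_query: float, days: int) -> float:
    return users * queries_per_user_per_day * cost_per_query * days

training = 100e6  # assume a $100M training run (hypothetical)
serving = inference_cost(users=100_000_000,           # hypothetical user base
                         queries_per_user_per_day=10,
                         cost_per_query=0.002,        # hypothetical $/query
                         days=365)

print(f"training (one-off): ${training:,.0f}")  # $100,000,000
print(f"inference (1 year): ${serving:,.0f}")   # $730,000,000, which dominates
```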

A few years ago, the costs of training were a lot higher because they started by training these models not very efficiently. They didn’t really know how to do this well. So they needed a lot of resources to get a not so good model. They probably needed to do it a few times and so on. But now it’s definitely the cost of inference that dominates. And also the emissions from inference dominate everything.

Do you think the markets have overreacted a little bit?

Absolutely. Yeah. Especially the US markets because this is Chinese and they are afraid of China. But I think Nvidia should not be worried. I mean, for the reason I explained, their sales depend more on the fact that people project enormous growth.

The CEOs of big tech companies have been saying that they need something like a 100-fold expansion of chip manufacturing in the next ten years or so. These claims made the market go up. The problem is that the data centers are getting committed, and then the electricity generators have to provision the electricity ahead of time, because a data center needs to get its electricity as soon as it’s finished. That means if you want more electricity, you have to start building today. And because most of it is not renewable, it’s going to be gas or coal. So even if all this AI stuff does not happen at all, they will have started building the capacity, and then they will want to use it, because once you’ve built capacity you want to sell your electricity, right? Otherwise you have made a very bad deal. So that’s the damage I think this is doing. It’s the hype that does the damage.

It’s not possible to scale up semiconductor manufacturing by a factor of 100 because, at best, we could scale up the global mining capacity for all the materials that we need to make the chips by a factor of two. So this factor of 100 is not going to happen. And probably all those people know that.

So everybody may know it’s a bubble?

Yes. But it does a lot of damage because it gives the fossil fuel industry a perfect excuse to produce more fossil fuels, to provide the energy that they say we will need for something that is probably not going to happen.

So you think these large language models are obviously not worth it, even if they can be useful for some things?

Yeah, I personally think that the generative AI that is pushed by OpenAI, and by the other companies that follow suit to compete with them, is not very useful. I mean, it is useful for specific scenarios, but usually when you have a specific scenario, you could use a much, much smaller model and achieve the same thing.

We have had the really big models that can do everything for everybody since 2020 or so, and meanwhile global productivity definitely has not gone up. The companies that start using Microsoft Copilot and so on, the large language models for programming, see that it’s problematic, because it’s much harder to debug code that was not written by your own developers but by a machine. Although you may think that you write code faster because the machine writes it, the machine doesn’t guarantee your code is correct. It can’t: a generative language model has no notion of what the code means. It’s just guessing. If you’re lucky, the guess works, but usually it doesn’t, and then the developers still have to debug it. And that takes more time, because they can’t read the code so well, since they haven’t written it.

And there’s a lot of things like that. If you look at generative AI for image processing, for generating images, superficially it looks brilliant, but it’s actually quite average. It will not replace good illustrators because people who really want a decent illustration cannot use this. Do you want to burn the planet to produce cheap illustrations?

I think that before generative AI appeared, there was no demand from people to have it. So it’s technology push, right? That’s what it’s called, rather than market pull. The problem is that by creating this extra technology we create a lot of extra emissions at a point in time when we cannot afford any extra emissions. Emissions should go down. It’s not affordable for the planet as a whole. That’s really the problem. Whether it’s useful or not is neither here nor there. It may be extremely useful, but if it still burns the planet, it’s no good.

And from the calculations I’ve done, if those business people’s projections were realized, AI on its own would be enough to make us miss all the climate targets. Like I said, this is very unlikely. But it shows that they don’t care whether it would happen. In terms of energy expenditure, we can’t afford it.

Plain and simple.

We can afford smaller models. In computing science, we really make a big distinction between what we prefer to call Machine Learning and what is being called AI, which usually means generative AI.

Yeah. Okay. There’s a lot of confusion about this. Could you explain what the difference is?

The UK government also makes this mistake. They talk about how AI can do great things like detecting cancer in an MRI or X-ray image, and that therefore we should build more data centers for generative AI. But SegNet, the leading model in colon cancer detection, with a 99% accuracy rate, has 7.6 million parameters, while GPT-4 has more than a trillion. This means that SegNet uses something like 100,000 times less energy than GPT-4. It can run on a PC in the hospital. You don’t need to build any data center to get better diagnostics, just a few servers in hospitals.
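
As a rough back-of-envelope check of that figure (with the simplifying assumption that energy per inference scales roughly with parameter count):

```python
# Back-of-envelope: ratio of parameter counts, assuming energy per inference
# scales roughly with the number of parameters (a simplification).
segnet_params = 7.6e6   # ~7.6 million
gpt4_params = 1.0e12    # more than a trillion
print(f"{gpt4_params / segnet_params:,.0f}x")  # ~131,579x, on the order of 100,000
```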

But is there something in common between these different things that we call AI?

Most models these days use a neural network. A neural network is an abstraction inspired by the brain: you get some inputs, which are numbers, you multiply them by other numbers and add them up, and that gives you another number as an output. Then usually you try to limit the range of that output. That’s what they call a neuron: something that takes a few inputs, multiplies them by weights, adds them, normalizes the result and sends it on to another neuron. And if you do that enough times, you get something that can actually do extrapolation over a very large parameter space. So it is very good at… let’s call it guessing, but it’s statistical approximation.
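
A minimal sketch of that single neuron, just to make the description concrete (the sigmoid is chosen here as the normalizing function; real networks use various activations):

```python
# One artificial neuron: weighted sum of the inputs plus a bias, squashed
# by an activation function into a limited range. Minimal sketch only.
import math

def neuron(inputs: list[float], weights: list[float], bias: float) -> float:
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-total))  # sigmoid keeps the output in (0, 1)

print(neuron([0.5, -1.2, 3.0], [0.8, 0.1, -0.4], bias=0.2))
# A network chains many of these: the outputs of one layer of neurons
# become the inputs of the next layer.
```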

The model that is used for cancer detection is called a convolutional neural network; these are the ones used for images. For text, it’s called a recurrent neural network. In an image, you have to look at all the data in parallel in space: all the pixels are next to one another. In language, your words come one after another. So there are basically two types of neural networks. The generative AI large language models that we use are essentially much more advanced versions of these simple neural networks that I described.
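
A minimal sketch of the two processing styles being contrasted: a convolution slides a small window over neighbouring values (pixels sit next to each other in space), while a recurrent network consumes its input strictly one element after another (words arrive in a sequence). Both functions here are toy versions with made-up weights:

```python
# Toy contrast between the two styles. Weights are made up for illustration.

def convolve1d(signal: list[float], kernel: list[float]) -> list[float]:
    """Slide a small window over the data: each output mixes neighbours."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def recurrent(tokens: list[float], w_in: float = 0.5, w_state: float = 0.9) -> float:
    """Consume a sequence one element at a time, carrying a state along."""
    state = 0.0
    for t in tokens:  # strictly sequential: each step depends on the previous
        state = w_state * state + w_in * t
    return state

print(convolve1d([1, 2, 3, 4, 5], [0.25, 0.5, 0.25]))  # spatial neighbourhoods
print(recurrent([1, 2, 3, 4, 5]))                      # order matters
```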

There is a difference between a model that detects a pattern in an image and a generative model that has to produce new text or a new image. Producing is more work than just finding a pattern. That’s also why generative models are more expensive in energy terms: they need to do more computations because they do more work.

Some people are saying that there is a limit on the training data, that these models have already used as much data as possible and maybe cannot find much more. I don’t know if that’s true.

It’s worse than that. A lot of the content on the internet now is AI generated, and the problem is that if you give an AI model AI-generated content as input data, it tends to get very poor performance very quickly. It’s called poisoning. But it’s not easy to avoid, because the bots that scrape the internet cannot tell whether a page is AI generated or not. That actually means the best-quality general-purpose data sets will be from before 2022.

Also, you can’t really keep on making the models bigger; you have to start doing things like what DeepSeek does. Actually, OpenAI already did that, they just didn’t do it on the same scale. OpenAI has a model with 1.76 trillion parameters, but at any point in time they only use something like two times 200 billion of them. So they use a much smaller subset, purely because you can’t access all of them all the time. What DeepSeek has shown is that if you use even less, it still works well. Most of the concepts they use in their paper are already being researched by all the other companies as well.
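
The technique of only activating a subset of the parameters per input is known as a mixture of experts. Here is a minimal sketch of the routing idea; it is an illustration of the general concept, not DeepSeek’s or OpenAI’s actual architecture, and both the "experts" and the "router" below are stand-ins:

```python
# Minimal mixture-of-experts sketch: a router scores the "experts" and only
# the top-k of them run for a given input, so most parameters stay idle on
# any one query. Illustration only; experts and router here are stand-ins.
import math

def expert(expert_id: int, x: float) -> float:
    """Stand-in for a full sub-network with its own parameters."""
    return x * (expert_id + 1)

def router(num_experts: int, x: float) -> list[float]:
    """Stand-in for a learned gating function: gives each expert a score."""
    return [abs(math.sin(x * (i + 1))) for i in range(num_experts)]

def moe_forward(x: float, num_experts: int = 8, k: int = 2) -> float:
    scores = router(num_experts, x)
    top_k = sorted(range(num_experts), key=lambda i: scores[i], reverse=True)[:k]
    # Only k of the num_experts sub-networks do any work for this input;
    # the parameters of the other experts are not touched at all.
    return sum(scores[i] * expert(i, x) for i in top_k)

print(moe_forward(0.7))
```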

Anyway, you can’t keep making the models bigger and expect them to perform better, because there are limits both on the quality of the data and on the engineering needed to make this happen. So the performance will probably start to stagnate; it will not get much better.

So the notion that we can reach some kind of artificial general intelligence by doing this is false?

This is absurd. I mean, I think the people who promote this idea know it. It’s a distraction, right? Because then you can say, “Oh, artificial general intelligence would be very dangerous, and we have to have all kinds of safeguards in place to make sure that if we have one, it behaves and does the right thing for us”, and that’s a perfect distraction from having to worry about all the real negative consequences of the products these companies put out. That’s, to my mind, what’s behind it.

There is no chance that what is effectively just a statistical pattern generator can become intelligent. There is nothing in the model that actually mimics intelligence.

I mean, people have been thinking about artificial intelligence, very deeply, for probably 50 years or more. And I think anyone who has really spent a lot of thought on this would agree that the generative AI models, or whatever models we call AI now, are really not of the type that would give us a self-aware piece of software. It seems intelligent because almost everything that we know is in there: a summary of all the knowledge that humans have put online is in those models. So there is an approximation of just about anything in there. But it’s by no means intelligent.