7th February 2025
In this edition of That Space Cadet Glow, I ponder how much of all this AI stuff people are actually taking in, I look at the impact of DeepSeek, and I rant about the disproportionate and disturbing influence of all the President’s men.
Information Overload
I have a daily alert email (generated from a ChatGPT Scheduled Task, obvs) that tells me if any new LLM models have been released that day. In most weeks there will be at least a couple of entries. And in many cases, those new models are not just upgrades (or downgrades) but models with new capabilities or functionalities. Just before Christmas there was a slew of new releases, particularly from OpenAI and Google, and this week we have seen even more models, including o3-mini from OpenAI and Flash-Lite from Google. And we can’t ignore the seismic entrance of DeepSeek R1 the week before (more of which later).
Now, I earn my money from advising clients on AI and, particularly in the last few years, on Generative AI. So I have to test and understand each of these models, but even I am finding it a challenge to keep up with the constant onslaught of new releases. And if I am finding it tricky, then how must other people feel, the ones with proper day jobs to do? The plain fact is that 99.99% (my estimate) of business people have either not used GenAI or have only used it for relatively simple tasks, and 99.99% (my estimate) of those will have used the browser version of ChatGPT.
So, what does that mean for all these other models that are being released? I know that I, and the companies I partner with, are building LLM-enabled applications that use some of these ‘other’ models. Personally, I use two or three different LLMs every day to help me in my work. But, outside of my AI bubble, all of this is the exception rather than the rule. With the huge costs incurred in designing and training each model (and don’t forget the energy and water), the only conclusion you can come to is that the developers (and investors) are betting big that, in the future, we will all be using all of these models all of the time.
I’m not sure that is going to be the case. As with most information-focused capitalist industries, the market could be dominated by one or two giants, with a very, very long tail of also-rans. Or (and this is my guess) we could see it go the other way, with a proliferation of specialist models focused on giving really good answers in specific domains. These domains could be as big as, say, ‘Physics’ or as small as, say, ‘You’. What that requires is more transparency and openness in how the models are built, with the trained weights released for others to use (what is called ‘open-weight’), and making them much more accessible, either as web pages or built into apps (including on your smartphone). Which brings me nicely onto the subject of DeepSeek…
DeepSeek and ye shall find
As a quick introduction, DeepSeek R1 is an open-weight Large Language Model that has been developed in China and made freely available through a number of channels, including a smartphone app, a web interface and a direct download. There are multiple versions that can be downloaded (depending on your storage and processing power limits) and ‘fine-tuned’ on specific subjects. This approach is very similar to the one Meta has taken with its Llama models but, right now, DeepSeek seems to be a much more capable model. So not only has DeepSeek got OpenAI and Google worried, it has got Meta panicking as well: DeepSeek could easily become the default ‘base’ model for many of those domain-specific models, and it could also cannibalise big chunks of revenue from OpenAI et al. Nvidia, the leading computer chip maker, is also worried, since DeepSeek was trained on older versions of its chips, and pretty much all of Nvidia’s valuation is based on demand for its latest and next-generation chips. This use of last year’s chips, plus some clever training techniques, means that DeepSeek consumed far less energy in training (according to the developer’s claims, but even if you double those figures it is still much more efficient), and, as I pointed out in the last issue, GenAI’s disproportionate impact on the climate will have many businesses seeking environmentally friendly solutions in 2025. No wonder the markets got all jittery.
But, and this is a big but, the model was developed in China. In the West, at least, there is limited trust in Chinese technology, especially technology that ingests so much data and can be influenced by the Chinese state as to the answers it gives (it will famously not answer questions on Tiananmen Square, for example). You only have to look at the T&Cs for the app to give you enough of a scare: the company would be able to store (on ‘secure servers in China’) your email address, phone number, date of birth, any user input including text and audio, chat histories, your phone’s model and operating system, your IP address and your ‘keystroke patterns’. It can then share all this information with service providers, advertising partners and its corporate group, and all of it will be kept ‘for as long as necessary’. I’m pretty certain that none of my clients will start using DeepSeek instead of, say, ChatGPT or Copilot.
But is it possible to use DeepSeek safely? There are actually a few options, although they do require some technical expertise. The first is to access DeepSeek through one of the cloud providers such as Microsoft Azure, AWS or Google Cloud, where you will be charged by the amount of compute resource you require. The other is to download one of the smaller versions onto a (relatively powerful) computer, which means there is no registration required and the model is effectively isolated from sending or receiving data unless you explicitly want it to (you could even turn the wifi off and it would still work). This download approach is the one I have taken - my MacBook Pro M4 Max comfortably runs the 32b version (although it struggles with the 70b).
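If you fancy trying the same thing, here is a minimal sketch of how the local setup might look. It assumes you use Ollama (one popular tool for running downloaded models) together with its Python client, and that you have already pulled a DeepSeek R1 model sized for your hardware - treat the model tag and response handling as illustrative rather than gospel:

```python
# Minimal sketch: chatting with a locally downloaded DeepSeek R1 model
# through Ollama's Python client (pip install ollama). Assumes the model
# has already been pulled, e.g. with: ollama pull deepseek-r1:32b
import ollama

response = ollama.chat(
    model="deepseek-r1:32b",  # pick a size your machine can actually run
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)

# Reasoning models think out loud, so expect a long answer even to a
# simple question like this one.
print(response["message"]["content"])
```

Once the weights are on disk, everything runs on-device: you could, as I say, turn the wifi off and it would still answer.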
On testing DeepSeek, the first thing you notice is that it is very good indeed. It is a ‘reasoning’ model, which means it was designed to think through problems before it answers the question (so it is comparable to OpenAI’s o1). That makes it great for research tasks and problem solving, but means it can be too verbose for simple queries (it gave me a 750-word answer to the question ‘why is the sky blue?’). Interestingly, it was able to write me a letter that was critical of President Xi’s foreign policies, so the censorship must be happening outside of the core model.
So the fact that it is capable, free and less energy-hungry means that it could indeed become the model of choice when building LLM-enabled applications. The race now is for someone else (not based in China) to build their own version of DeepSeek.
Information Overlord
With all of these new models being released, you would think that everything in the AI garden is rosy and abundant with opportunities. Whilst that may be true, there is also a dark side to all of this, which, in the current political climate, is only getting darker.
The fears that some of us (most of us?) had as we watched a few core technology companies develop these dominating models have pretty much come to pass. The election of Donald Trump has seen most of the leaders of these firms bow to the extreme right-wing agenda, either because this has always been their natural persona and they are now free to express it, or because they are simply prioritising profit over the safety and health of society, without the safeguards (and morals) that would normally be in place.
We have seen, for example, Meta remove all human fact-checking from their US services, following X’s earlier lead, and Google drop its commitment to not using AI for weapons. The demonisation of Diversity, Equity and Inclusion (DEI) policies in the US will ultimately affect the level of safeguards that are applied to AI models, and it is quite easy to imagine a world in the not-too-distant future where it will be illegal to apply bias checking to AI models. Efforts to halt the climate emergency have been put into reverse in the race to build data centre capacity, despite the recent DeepSeek developments (again, all encouraged by the Presidential mandate to “drill, baby, drill”). And all of this is before I mention the richest tech bro in the world and close confidant of the President apparently making a Nazi salute on stage.
None of this is good for AI and, since AI is now dominating global discourse and will soon have a disproportionate impact on everyone’s lives, it is not good for society either. The world is in a bad place, and, if things continue as they have done in the last six months, it will get worse. I don’t have any easy answers for this, but, in each of our own ways, we must not just bend over and accept any of this as normal. We must look at boycotting the worst offenders (you’re not still on X, are you?) or seek alternatives, even if that means more cost or lower performance. And adversity can be the mother of invention - DeepSeek has shown, ironically, how new solutions can be found despite severe limitations. This must be the approach we take in how we build and use AI, and, actually, in everything we do. We can’t let these people win.
Doechii - Alligator Bites Never Heal
To bring us back to lovelier things, one of my favourite new rap artists of last year is Doechii. Her LP (or ‘mixtape’ to be precise), “Alligator Bites Never Heal”, is a masterwork of style and wordplay. The rapping on the song Nissan Altima is next-level. Her Tiny Desk concert is a joy to watch. And just this week she won the 2025 Grammy for Best Rap Album, delivering a performance at the ceremony that just made your jaw drop. You can watch it here. Enjoy!
As a reminder, you can still check out my 2024 Spotify playlist.
And don’t forget, you can buy the second edition of my book, The Executive Guide to Artificial Intelligence, which has been massively updated to include everything GenAI.