If you have ever been suspicious of all those “open-source” AI solutions on the market, you might be right. Not every “AI” solution is artificial intelligence, and not every one that claims to be open source really is.
The topic has become so prominent and widespread that even the New York Times ran a feature on it: “Openwashing” is the title. It appeared in the “Shop Talk” column and was written by Sarah Kessler, who regularly dives into the depths of IT and nerd jargon.
Open Source makes you look good!
Her observation is entirely correct and represents a well-known view that has now arrived in mainstream journalism:
Proponents of open source A.I. models say they’re more equitable and safer for society, while detractors say they are more likely to be abused for malicious intent. One big hiccup in the debate? There’s no agreed-upon definition of what open source A.I. actually means. And some are accusing A.I. companies of “openwashing” — using the “open source” term disingenuously to make themselves look good. (Sarah Kessler, NYT)
Yes, openwashing has arrived in the world of AI. After years as an evil “best practice” in the world of open source itself, openwashing has made it into the latest hype. Of course, one might say. The term took off when more and more companies and public administrations started to require open source software, e.g. in tenders.
“We need the OSS label!”
In late 2022 a company asked us for help: they needed consulting and coaching services. Their main question was: “We don’t want our IP to go public, we will never publish code, but we want to win tenders that require open source. What can we do? We’re losing one tender after the other because we’re not open source. Where do we get that label?” Sadly, the potential customer left when we suggested a basic, introductory workshop on the world of open source, a kind of Open Source 101 for management.
But the example shows that open source has become a conditio sine qua non. If you’re not open source, you’re out, or at least you have some explaining to do. That is a change. I remember the days when it was the other way round, when consultants came in and told us: “You can’t do that in open source. How will you make sure your customers have to stay? How do you plan to lock them in?”
In the world of software, many companies have created innovative evasion strategies (see my talks and panels on the topic at FOSDEM, or come to KDE Akademy soon). Meanwhile, first cloud companies and shortly afterwards AI companies learnt that their creations won’t work without open source. And they are not only using it quietly, they are also actively promoting it, because the public has accepted the term as a label of quality. Of course, open source guarantees neither security nor safety, but both are impossible to achieve without it.
OSS AI is all about marketing
But when it comes to artificial intelligence, and large language models in particular, “open source AI” is increasingly regarded as an empty buzzword.
Kessler quotes from a scientific study done at Carnegie Mellon called “Open (For Business): Big Tech, Concentrated Power, and the Political Economy of Open AI” that comes to a devastating conclusion:
We find that the terms ‘open’ and ‘open source’ are used in confusing and diverse ways, often constituting more aspiration or marketing than technical descriptor, and frequently blending concepts from both open source software and open science. This complicates an already complex landscape, in which there is currently no agreed on definition of ‘open’ in the context of AI, and as such the term is being applied to widely divergent offerings with little reference to a stable descriptor. (Widder, West, Whittaker)
Fun fact: Among the authors is Meredith Whittaker, the well-known president of the Signal Foundation, who was just awarded this year’s Helmut-Schmidt Future Award and gave a remarkable acceptance speech on that occasion.
There is no open source AI
Face it, marketing: There’s no open-source AI, and there’s no complete open-source AI solution, no matter how much even open-source cloud companies keep pretending otherwise. Not all of the necessary software is available as OSS, and neither the models, nor the training data, nor the crucial settings (like those for weights, prompts, bias) are freely available. Only parts of the whole tool chain are open source. And that is the true reason why some small open source companies offer “open source AI integration” but leave the dirty details to the customers.
And it gets worse:
The main reason is that while open source software allows anyone to replicate or modify it, building an A.I. model requires much more than code. Only a handful of companies can fund the computing power and data curation required. That’s why some experts say labeling any A.I. as “open source” is at best misleading and at worst a marketing tool. (Sarah Kessler, NYT)
I couldn’t have said it better. Whoever promises an “open source AI” solution is either lying, omitting parts of reality, or has a marketing department gone wild. Or they just made a mistake. That also happens, they say, and even more so with AI these days.