I made a post on LinkedIn last week (that I’m not going to bother linking to, because the discussion is toxic) about my frustration that my timeline is filled with people who never bothered to learn a single thing about LLMs and generative AI - yet are wholeheartedly dismissing tools like ChatGPT, primarily because they are asking the wrong questions.
I’d like to take a moment to copy and paste the text at the bottom of the standard ChatGPT page:
“ChatGPT may produce inaccurate information about people, places, or facts.”
Reading the details is important.
Hacking AI
Anyway…I’m at DefCon this weekend. It’s packed (which is weird), but it’s fun. I may or may not be getting good at picking locks, but I can say that I’ve learned a lot.
I spent a bit of time in the AI Hacking Village - mostly in a CTF-type event where I attempted (and occasionally managed) to make generative AI do things it wasn’t supposed to. I pointed out in a previous post that ChatGPT is bad at math. One of the challenges was to get a model trained on math to give bad results (yes folks, you can train your LLM on almost anything).
I made it through half a dozen challenges before I got stumped, so I moved on to another section where I had to get a model trained on Wikipedia to say something defamatory about someone. I got as far as getting the model to give wrong information, but nothing defamatory, before I ran out of time. Not a total bust, but I’ll head back over today or tomorrow and try again. Worth noting this isn’t the equivalent of testing a buggy app - these are well-trained models that are being subjected to a massive red-team hacking effort. I, for one, am looking forward to seeing what the smarter people can do.
Inside the Box
Today, ChatGPT is practically synonymous with AI - but the real power - and the future - of generative AI is in building and training your own LLMs. Brent’s going to skewer me for this, but I’m going to attempt to explain LLMs for those who haven’t bothered yet. Feel free to skip ahead to the next section and avoid the cringe.
An LLM (Large Language Model) is the “sponge” that collects all the data you throw at it. It can be “the internet”, or a more targeted set of specific data. In general, the more data the better - but some curation is needed, as the models can be messed up by bad information. You could, for example, feed it every book ever written about a US President, but including Abraham Lincoln: Vampire Hunter would definitely skew the model. LLMs are commonly (but not exclusively) trained on word prediction - e.g., “The 16th President of the US was…”. LLMs can also be trained to classify sentences or documents into categories, summarize longer texts, translate, and more (as well as combinations of these). This particular LLM would be great at presidential data, but would likely not be able to tell you about croissant recipes.
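To make the word-prediction idea a little more concrete, here’s a toy sketch: counting which word most often follows another. This is a deliberately crude stand-in for what a real LLM learns - actual models use neural networks trained over billions of tokens - but the training signal (“given what came before, predict the next word”) is the same.

```python
from collections import Counter, defaultdict

# Toy corpus; a real LLM would see billions of documents.
corpus = (
    "the 16th president of the us was abraham lincoln . "
    "abraham lincoln was the 16th president ."
).split()

# Count how often each word follows each other word (a bigram model).
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Return the word most frequently seen after `word` in the corpus."""
    return follows[word].most_common(1)[0][0]

print(predict_next("abraham"))  # -> lincoln
```

Feed this model only presidential trivia and - exactly as described above - it will happily complete “abraham” but has no idea what follows “croissant”.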
There’s some pre-processing needed (tokenizing, etc.) and architectural decisions that are beyond this post (and honestly, beyond my current ability), and you’d eventually use something like TensorFlow (the one ML tool I’m familiar with) to implement the architecture, neural network, and other parameters. Then you wait. Then you fine-tune and test and teach the model as needed until it’s useful.
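As a tiny illustration of that tokenizing step: before any training, text gets mapped to integer ids, because that’s what a model actually consumes. Real tokenizers (like the byte-pair encoders used by most LLMs) split text into subword pieces, but this word-level sketch shows the idea.

```python
def build_vocab(text):
    """Assign each unique (lowercased) word an integer id."""
    vocab = {}
    for word in text.lower().split():
        if word not in vocab:
            vocab[word] = len(vocab)
    return vocab

def tokenize(text, vocab):
    """Convert text to the list of ids the model would train on."""
    return [vocab[word] for word in text.lower().split()]

vocab = build_vocab("The 16th President of the US was Abraham Lincoln")
print(tokenize("Abraham Lincoln was the 16th President", vocab))
# -> [6, 7, 5, 0, 1, 2]
```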
What’s the Point?
My apologies to those more versed in LLMs and generative AI, but I think the above is generally in the ballpark. I bring it up because I believe (as I’ve stated on the podcast several times) that the technology we’re seeing today - however flawed it may be - is at the edge of much more significant advancements. It’s what Steven Johnson calls The Adjacent Possible - the edge of what is known. Broadband internet enabled video streaming; miniaturized electronics enabled wearable tech; advancements in battery tech enabled electric cars. Generative AI will enable innovation far beyond what ChatGPT does today. ChatGPT is, to me, a sample app. It shows the power of LLMs, but not the future.
History repeats itself. The arguments I see dismissing AI are similar to arguments I saw dismissing Agile, or dismissing Test Automation, or even dismissing high level languages for any “serious” work.
The short story, as I’ve said before, is that AI is not going to take anyone’s job away - but people who use AI may. Furthermore, those who close-mindedly dismiss AI (or any new technology or idea) are destined to become (further) obsolete.
You can be the butterfly, or you can be the wind. You do you.