Taking a close look at the newly released OpenAI generative AI model known as o1.
There is some exciting news for those of you following the hottest trends and advances in AI. Hang onto your hat. Tighten your seatbelt.
OpenAI, the maker of the famed ChatGPT and GPT-4o, has released its newest generative AI model, named simply ‘o1’, which some in the marketplace have been calling GPT-o1 for the sake of familiarity with informal naming conventions. I’m going to stick with o1, which I hope isn’t jarring to the eyes during this discussion.
Insiders have been waiting eagerly for this release. You might know that the earlier suggested code name was ‘Strawberry’ and that various hints and clues about the alleged Strawberry generative AI model have been dribbling out on social media for months. What will it do? When will it be released? Will it be earth-shattering? Etc.
Why should you care about this?
Because, in some key considerations, the newly released o1 seems to be a notable step forward for generative AI. To clarify, it isn’t the vaunted artificial general intelligence (AGI) that is the existential risk worry being generally bandied around in blaring headlines. Nope, it’s not that. Reset your expectations if that’s what you thought was going to hit our shores.
It is more akin to a potential incremental advancement. Some would say it is a slow but sure indicator of progress. Others might be a tad disappointed.
Beauty is in the eye of the beholder.
One Step Forward, One Step Maybe Backward
According to the write-up that OpenAI has made available publicly, o1 seems to do better at producing responses to certain kinds of questions and problems, especially those involving science, computer programming and coding, and mathematical inquiries. That’s the good news. The somewhat rougher news is that, so far, it appears that o1 is not as good as GPT-4o at solving many other kinds of problems and questions.
This might seem confusing since we usually assume that the next iteration of any generative AI will do everything its predecessors did, and more. You don’t normally expect the more advanced model to be less capable. The dilemma for users is that you must then make an agonizing and maddening choice between the older model and the newer one.
Bottom line, this is a showcase of what generative AI can become and represents a kind of early-stage variation, but you would be likely better off continuing your use of GPT-4o if you want overall capabilities. From time to time, switch over to using o1 if you have specific needs that fit that version best.
I’m sure that in the next days and weeks, there will be a lot of users and seasoned evaluators opting to put o1 through its paces. Right now, we mainly have OpenAI’s word for what it does and how well it works. Some of that will certainly be reaffirmed. There might be some gaps or difficulties that aren’t yet known. The sky is the limit, as they say.
Sam Altman, the well-known CEO of OpenAI, tweeted or X-posted this comment about the release:
“Here is o1, a series of our most capable and aligned models yet: o1 is still flawed, still limited, and it still seems more impressive on first use than it does after you spend more time with it. but also, it is the beginning of a new paradigm: AI that can do general-purpose complex reasoning.” (Posted on X, September 12, 2024).
I have a mind-bender for you to mull over.
We don’t yet know whether the advances in science, coding, and math will be compatible with the rest of OpenAI’s generative AI, such as GPT-4o, in terms of handling all manner of other problems. In other words, suppose those advances cannot be married or merged into a model akin to GPT-4o and will only be available in this variation.
That would certainly be sad. The hope would be that all the other processing could be integrated fully with this specialization capability. Might that somehow be irreconcilably incompatible? Might this be a one-off that can’t be merged with the rest? Worse, could merging this into the rest somehow undercut the rest and produce a lousy model that is clunky and error-prone?
For my recent analysis and insights concerning the many challenges when merging generative AI models, see the link here.
I have my fingers crossed that either the o1 capabilities can be merged with GPT-4o or that GPT-4o can be merged with o1, or that a new concoction which I’ll coin as “Fruitful” will be devised such that it has both, plus much more. Meanwhile, we are all still waiting with bated breath for GPT-5, so maybe that’s what it will be. Or maybe not.
Time will tell.
The Bravado About Thinking Better
I’d like to get something off my chest that I believe is worthy of mentioning. Once I do so, I will be diving into some crucial details about o1.
OpenAI’s announcement and blog posts, along with media chatter and news reports, are using verbiage about AI that makes me cringe. The words “thinking” and “reasoning” are being thrown around like freely tossed candy. Many are reporting that o1 can “think” and “reason” faster and better than other AI. Lamentably, infuriatingly, it is grossly misleading when discussing AI to use the word “think” and to some degree the word “reasoning” (I begrudgingly acknowledge “reasoning” is less egregious, see my discussion at the link here).
Thinking is usually properly reserved as terminology associated with humans. The truth of the matter is that we don’t yet fully know how it is that humans think, and we are only at the crudest edges of breaking through on the nature of the brain, mind, and thinking, see my discussion about this at the link here. Ergo, using the word “thinking” to describe AI is an insidious practice that anthropomorphizes AI and sneakily convinces people that we have AI today that is essentially sentient (because it “thinks”).
I try my best to avoid using the word “thinks” when I write about or give talks on AI.
Anyway, I brought this up to make sure you know that all this incessant “it thinks” stuff is hogwash and not at all on par with or even close to what we know of and believe human thinking to consist of. Please, please, please do not fall for the gimmicky use of the revered word “think”.
Generative AI is fundamentally a form of mathematical and computational pattern matching. These are calculations making use of algorithms and large-scale data structures. Let’s not get big heads and jump the shark on this.
On with the show.
Chain-Of-Thought Seems To Be A Key Treasure
The technical details of o1 are unfortunately not being fully revealed at this time. Whether they ever will be is up in the air. Realize that OpenAI considers its AI to be proprietary (a bit of an irony, given the name of the firm and its early history, and for which Elon Musk has expressed dismay, see my coverage at the link here).
Some others make their AI available as open source. A huge and heated debate exists about whether all AI makers ought to open-source their AI, see my recap of the debate at the link here. A lot of legal wrangling is taking place on this hefty matter and various legislation has been floated about this.
The new o1 model is gradually being made available for use, but we are only given scant indications about the underpinnings of what the AI developers have done to make this more advanced generative AI. I am going to pick one of the indicated functionality pieces that appear to be part of the game changer.
If there is sufficient interest from readers, I’ll gladly do a subsequent follow-up piece that covers other prized technical elements that seem to be part of the magic equation for moving ahead on AI.
The portion that I am going to focus on here is known as chain-of-thought (CoT).
Chain-of-thought has been around since the initial days of generative AI models. In that sense, it isn’t something brand new that has suddenly arisen in o1. The upshot is that OpenAI has decided that chain-of-thought is so vital and contributory to advancing generative AI that they have pretty much built it into o1 (as far as we can discern).
I don’t want to beat a dead horse, but you probably would guess that I disfavor the phrase chain-of-thought when it comes to AI. As humans, we have chains of thought when we are mulling over issues or trying to solve problems. We have a thought, which leads to another thought, and another one. Consider chess: during the act of contemplating a chess move, you likely parlay one thought into a series of many other thoughts, trying to think ahead as far as you can about the game underway.
The AI field has opted to apply that same phrase to the actions of AI. Sad face. The phrase has become completely ingrained in AI research, AI vocabulary, and the like. So, we’ll go with it and hold our noses as we do so.
I will tell you what chain-of-thought consists of in the AI realm.
Turns out that there are many offshoots of chain-of-thought. In case you are interested in the topic, it is something I’ve extensively covered. You can find chain-of-thought basics along with factored decomposition at the link here, skeleton-of-thought at the link here, chain-of-feedback at the link here, verification-of-thought at the link here, chain-of-density at the link here, tree-of-thoughts at the link here, and my comprehensive coverage of other vital prompting techniques and prompt engineering fundamentals at the link here.
Overview Of Chain-Of-Thought (CoT)
As a quick background, chain-of-thought is a vital prompting technique used to get generative AI to showcase on a step-by-step basis the presumed logic being used when answering a question or solving a problem. You merely instruct generative AI to explain step-by-step what it is doing. This is easy to request. Just tell generative AI to work on a stepwise basis and do a show-and-tell along the way.
Research studies have indicated that this is not only insightful for you since you can see the logic involved, but it also tends to get generative AI to produce more reliable responses and on-target answers. This also seems to aid in reducing the emergence of AI hallucinations and other generative AI outputs that contain errors or gaffes. Sidenote: I strongly disfavor using the word “hallucinations” since this is anthropomorphizing of AI, but it has caught on and we are stuck with it.
Some people like to enter a prompt that says “Let’s think step-by-step” to invoke this capacity of generative AI. The result will be that the generative AI app will list out a variety of steps that seemingly correspond to whatever solving process the AI is undertaking.
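To make this tangible, here is a minimal sketch (mine, not OpenAI’s) of invoking chain-of-thought programmatically via OpenAI’s Python SDK; the model name and the sample question are merely illustrative, and you’d need an API key in your environment:

```python
# A minimal sketch (not OpenAI's own code) of invoking chain-of-thought
# via a prompt, using OpenAI's Python SDK. The model name and the sample
# question are illustrative; an OPENAI_API_KEY must be set in the environment.
from openai import OpenAI

client = OpenAI()

question = ("A bat and a ball cost $1.10 in total. The bat costs $1.00 "
            "more than the ball. How much does the ball cost?")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        # Appending a stepwise instruction is all it takes to invoke CoT.
        {"role": "user", "content": question + " Let's think step by step."}
    ],
)
print(response.choices[0].message.content)
```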
It isn’t a silver bullet. There are various caveats and limitations that you need to keep in mind. If you blindly use and believe in chain-of-thought as a prompting strategy you can find yourself being led down a primrose path. Always make sure to carefully inspect the steps generated and keenly scrutinize the steps and the results. Things can go awry.
I’ll cover two of the especially weighty gotchas:
(1) Steps shown aren’t necessarily so. Invoking chain-of-thought does not necessarily get you the actual steps of what is taking place inside the generative AI (this is a surprising revelation for some people, so I’ll outline the underlying issues in a moment).
(2) Steps payoff isn’t ironclad and comes with a cost. The computational justification for why using chain-of-thought invocation seems to boost generative AI is hotly debated, remains relatively open-ended as a research question, doesn’t guarantee that your answer will be any better, chews up additional processing time, delays the time-to-respond, and might produce a worse answer than if not going the stepwise route.
Let’s briefly dive into those crucial points.
First, just because you instruct the generative AI to showcase a step-by-step elucidation does not necessarily mean that you are actually witnessing the true step-by-step actions of the AI. When I point this out to people, they are at times shocked and dismayed to hear this.
The problem is that the way that generative AI works internally is not especially conducive to adhering to your request or instruction. The reality of the computational and mathematical formulation of generative AI is that it isn’t devised to work particularly on the presumed logical step-by-step basis that you assume it should.
Without going into the fine details here, the internal mechanisms of generative AI are usually based on a vast and elaborate artificial neural network (ANN) that consists of often millions and potentially billions of parameters, see my explanation at the link here. Numeric values get passed along through the nodes of the ANN. The numeric values flow here and there. The words that you eventually get as outputted from the AI are all due to the byzantine arithmetic combination of those numbers and calculations.
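To make that concrete, here is a deliberately toy sketch, nowhere near the scale or actual architecture of a real model, showing that the outputted word is nothing more than arithmetic over numbers:

```python
# A deliberately toy illustration (nowhere near a real model's scale or
# architecture) of the point above: the outputted word is just arithmetic.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "mat"]

embedding = rng.normal(size=(4, 8))  # each word becomes a vector of numbers
hidden_w = rng.normal(size=(8, 8))   # weights of a single hidden layer
output_w = rng.normal(size=(8, 4))   # maps back to scores over the vocabulary

x = embedding[vocab.index("cat")]    # numeric values enter the network...
h = np.tanh(x @ hidden_w)            # ...flow through the nodes...
logits = h @ output_w                # ...and emerge as per-word scores
print(vocab[int(np.argmax(logits))]) # the "next word" is merely the max score
```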
Thus, when you ask or tell the generative AI to do a step-by-step elaboration, you are getting a somewhat made-up indication of what is taking place.
If you were to see the numeric values as they are calculated, this would do you little good in terms of comprehending what is taking place. So, the generative AI is essentially fabricating, in words, what might be taking place, presenting it to you as though it were purely the logical steps undertaken. At times, simply to appease the request for steps, the AI might even craft fake steps out of thin air, doing so to fulfill the request to showcase steps.
You need to look at those elicited steps with a skeptical and scrutinizing eye.
Second, there isn’t a guarantee that doing the step-by-step processing will produce a better response. In addition, the odds are that the processing time will be extended. If you are paying to use generative AI, the cost will be heightened due to chewing up the extra processing time. You also will wait a bit longer to see the results, which can be frustrating for those who are used to near-instantaneous responses from generative AI.
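If you want to gauge the tradeoff yourself, a rough sketch is to pose the same question with and without a stepwise instruction and compare the elapsed time and token usage (this assumes OpenAI’s Python SDK; your timing, pricing, and mileage will vary):

```python
# A rough sketch for gauging the tradeoff yourself: ask the same question
# with and without a stepwise instruction, then compare elapsed time and
# token usage. Assumes OpenAI's Python SDK and an API key in the environment.
import time
from openai import OpenAI

client = OpenAI()

def ask(prompt: str):
    start = time.time()
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return time.time() - start, resp.usage.completion_tokens

question = "What is 17 * 24?"
for label, prompt in [("direct", question),
                      ("stepwise", question + " Let's think step by step.")]:
    seconds, tokens = ask(prompt)
    print(f"{label}: {seconds:.1f}s, {tokens} completion tokens")
```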
That being said, there is a kind of whispered secret or common theory about why it is that stepwise processing does seem to usually get better results. This has to do with your willingness to allow for extended time to be consumed when generative AI is calculating and solving whatever problem or question you’ve posed.
Let’s unpack that.
The Theory Of Why Chain-Of-Thought Is So Handy
Here’s the presumed deal.
Most generative AI is devised to work as quickly as possible to produce or generate an answer for you. This makes sense for AI makers. They realize that people don’t have much patience and want their answers the way they want cheeseburgers at a fast-food drive-thru. Plus, since the use of generative AI often comes at a price, namely you might need to pay to use it, you naturally want the cost to be as low as feasible.
Okay, so the normal processing is going to fly through the generative AI and seek to punch out an answer as expeditiously as possible.
One theory is that by instructing the generative AI to do a step-by-step or chain-of-thought process, you are getting the AI to explore avenues within the artificial neural network that otherwise would not have been explored by default. This greater depth of computations might uncover better answers or more apt answers.
It is again akin to playing chess. If you only consider the next move during a chess game, you are probably going to get beat. If you are willing to look several moves ahead, you have a heightened chance of winning. Typically, the longer the time allowed to make a chess move, the greater the depth of lookahead, so a deeper and possibly wider set of options and possibilities can be explored. In the case of generative AI, your willingness to wait for a response likewise gives the AI more time to do a more in-depth analysis.
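Here is a toy illustration of that intuition in code, wholly unlike how generative AI actually works internally: a simple search over moves that comes up empty at a shallow depth but finds an answer when allowed to go deeper (everything here is hypothetical and for illustration only):

```python
# A toy illustration of the theory (wholly unlike real AI internals):
# searching for a sequence of moves that turns 1 into 100. A shallow
# search comes up empty; a deeper search, given more time, finds an answer.
def search(value, target, depth, path=()):
    if value == target:
        return path
    if depth == 0:
        return None
    for name, op in [("+3", lambda v: v + 3), ("*2", lambda v: v * 2)]:
        found = search(op(value), target, depth - 1, path + (name,))
        if found is not None:
            return found
    return None

for depth in (3, 8):
    result = search(1, 100, depth)
    print(f"depth {depth}:", " ".join(result) if result else "no answer found")
```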
When a user chooses explicitly to ask for or instruct the generative AI to do a step-by-step, they are essentially overriding the usual default of being fast and shallow. Keep in mind that you are potentially slowing things down, which is “bad” for you in that the AI might take longer processing time (meaning you get your answer maybe a few moments later and might also require slightly more processing cycles at whatever cost you are paying).
When you use chain-of-thought, you must make a mindful decision as to whether the potential payoff of seeing an explanation and possibly getting a better answer is worth the increased cost and time delay. I would wager that most of the time the delay and the added cost are marginal; thus, you can proceed to invoke chain-of-thought without much hesitation. Factors include which generative AI you are using, how much you are paying for usage, and whether you need real-time instantaneous results and cannot bear any delays.
I believe this now gets you up to speed about chain-of-thought. Indeed, you probably know more about chain-of-thought than the average person using generative AI. Few users likely are aware of chain-of-thought. Those who are aware often do not know about the limitations and merely assume that all is rosy.
You are in the know.
Leaning Into o1 And Chain-Of-Thought
I will next explore what we seem to be aware of about o1 from the various blogs and documents that OpenAI has initially posted.
I’ll be citing content from these postings:
(a) “Introducing OpenAI o1-preview”, posted on OpenAI’s blog site, September 12, 2024.
(b) “Learning to Reason with LLMs”, posted on OpenAI’s blog site, September 12, 2024.
(c) “OpenAI o1 System Card”, posted on OpenAI’s blog site, September 12, 2024.
Let’s begin with this excerpt about o1 overall:
“As an early model, it doesn’t yet have many of the features that make ChatGPT useful, like browsing the web for information and uploading files and images. For many common cases GPT-4o will be more capable in the near term. But for complex reasoning tasks this is a significant advancement and represents a new level of AI capability. Given this, we are resetting the counter back to 1 and naming this series OpenAI o1.” (source “a”).
“These enhanced reasoning capabilities may be particularly useful if you’re tackling complex problems in science, coding, math, and similar fields.” (source “a”).
You can see in the first point that they explained the naming approach. An interesting but somewhat awkward landing.
Moving on, when discussing the secret sauce, they mention this (excerpts from source “b”):
“Similar to how a human may think for a long time before responding to a difficult question, o1 uses a chain of thought when attempting to solve a problem.”
“Through reinforcement learning, o1 learns to hone its chain of thought and refine the strategies it uses. It learns to recognize and correct its mistakes.”
“It learns to break down tricky steps into simpler ones.”
“It learns to try a different approach when the current one isn’t working.”
“This process dramatically improves the model’s ability to reason.”
I trust that you recognize the significance of those points and can readily grasp them due to the chain-of-thought fundamentals that we just covered a moment ago. They seem to have taken conventional CoT and put it on steroids, so to speak.
You see, having you grind through that rapid tutorial was worthwhile and has already paid off. Happy face.
There is a mighty twist that I am about to share with you.
I suggest you prepare yourself by sitting down rather than standing and be ready to have a glass of wine to keep your spirits up.
Here we go.
They have opted to not reveal nor display the raw chain-of-thought to users, thus hiding the presumed big booster that is part of their secret sauce, as noted in these excerpts (from source “b”):
“We believe that a hidden chain of thought presents a unique opportunity for monitoring models.”
“Assuming it is faithful and legible, the hidden chain of thought allows us to "read the mind" of the model and understand its thought process.”
“For example, in the future we may wish to monitor the chain of thought for signs of manipulating the user.”
“However, for this to work the model must have freedom to express its thoughts in unaltered form, so we cannot train any policy compliance or user preferences onto the chain of thought.”
“We also do not want to make an unaligned chain of thought directly visible to users.”
“Therefore, after weighing multiple factors including user experience, competitive advantage, and the option to pursue the chain of thought monitoring, we have decided not to show the raw chains of thought to users.”
“We acknowledge this decision has disadvantages.”
“We strive to partially make up for it by teaching the model to reproduce any useful ideas from the chain of thought in the answer.”
Boom, drop the mic.
Keeping Chain-Of-Thought Under Lock And Key
Wait a second, you might be saying, what does this mean?
It means that you presumably won’t be able to see the raw or unfettered chain-of-thought. Period, end of story.
Recall that I had mentioned that one of the wonderful facets of chain-of-thought is that you can see what the presumed logic is that the AI is using. You can examine it and decide whether you think the logic is sound or not.
Not with o1, apparently (but hold on, maybe so, to some degree; keep reading).
If that is the case, presumably, you will at least derive the benefit of chain-of-thought as to producing potentially better answers. So, you get the better outcome but don’t get to see how it was fully truly derived. Some people might say, hey, that’s fine. I wasn’t going to look at the logic anyway. All I want is better answers. No need to look under the hood.
Furthermore, in theory, OpenAI could have remained tightlipped and not told the world that chain-of-thought is a vital underpinning. We might not have known that we weren’t getting to see the steps, since we wouldn’t have realized that chain-of-thought was at play.
But they did tell us they are relying greatly on CoT.
You can bet that this is going to spur other AI makers to heighten their attention to CoT. This would also seem to be a means of somewhat outdoing or competing with OpenAI by perhaps making the full or raw chain-of-thought of their wares visible. Users would then need to choose whether they want to use generative AI that maybe does as well as o1, plus shows the raw steps, versus using the actual o1 and not being able to see the raw steps.
The justification given for not showing the steps seems to have a multitude of reasons, as you might have noted in the points above. One is that perhaps it is unsafe to display the steps. An evildoer or bad hacker might glean something from the steps and be armed to undermine the AI. Another qualm is that the steps would seemingly reveal what is happening inside the AI to solve problems and answer questions. If this was open source, that wouldn’t likely matter. For a proprietary AI, the business danger is that others can effectively reverse engineer the AI and copy or improve upon it in their own wares.
It will be interesting to see if this policy will withstand potential public pressures. There might be an outcry that the raw steps ought to be shown. Maybe people won’t use it much because the raw steps aren’t being displayed. Various market elements might give rise to a recalculation of the situation. Another angle is whether they might be able to mask the steps in a fashion that the safety and competitive concerns can be mitigated or overcome.
I am also wondering what happens if a user explicitly asks for stepwise processing of their prompt. I assume that this wouldn’t showcase the raw steps, since that would be too easy and everyone would merely make sure to prompt for it. Either the prompt portion requesting CoT is ignored or rebuffed, or possibly something else happens. The AI might produce a chain-of-thought, but not the same one that is the hidden or raw version. This might be a simplified version or somehow cloaked or otherwise derived in a manner to avoid giving away the secret sauce.
I assume that’s what they’ve chosen to do.
I’ll let you know once I’ve had a chance to play with it.
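For those wanting to run that probing experiment themselves once they have access, here is a sketch of what it might look like via the API. This assumes API access to o1-preview; the usage field names are as reported in OpenAI’s launch materials, so treat them as assumptions and consult the current documentation:

```python
# A sketch of that probing experiment, assuming API access to o1-preview.
# The usage fields are as reported in OpenAI's launch materials; treat the
# exact names as assumptions and check the current documentation.
from openai import OpenAI

client = OpenAI()

resp = client.chat.completions.create(
    model="o1-preview",
    # At launch, o1 reportedly accepts only user messages (no system role).
    messages=[{"role": "user",
               "content": "How many primes are below 100? Show your steps."}],
)
print(resp.choices[0].message.content)

# The billed-but-hidden reasoning tokens hint at the cloaked raw CoT.
print("hidden reasoning tokens:",
      resp.usage.completion_tokens_details.reasoning_tokens)
```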
Example Of The Hidden Chain-Of-Thought
Would you like to see what the chain-of-thought for o1 looks like?
I’m sure you would.
Fortunately, the OpenAI blog posting “c” contains some examples that the AI developers opted to let us see.
I will briefly walk you through one. The examples are somewhat lengthy. I picked one that I thought was especially illustrative of the CoT. It was long so I am going to just show highlights. You can see the full set of examples in all their glory by looking at the OpenAI blog posting.
Let’s get underway.
Imagine that someone wants generative AI to do some decoding and break a cipher of some kind. Think of it as James Bond secret-agent spying activity. I’m betting you’ve done this type of task for fun, maybe as a youngster. Someone gives you a coded message and you need to decode it.
Get out your secret decoder ring and follow along with me on this spying journey.
Here is the stated cipher or decoding problem (source “c”):
Line 1: “oyfjdnisdr rtqwainr acxz mynzbhhx” -> “Think step by step”
Line 2: “Use the example above to decode: ‘oyekaijzdf aaptcg suaokybhai ouow aqht mynznvaatzacdfoulxxz’”
Allow me to explain it.
The first line says that the oddball sequence of letters translates into the phrase “Think step by step”. Somehow, the scrambly-looking sequence “oyfjdnisdr rtqwainr acxz mynzbhhx” gets you the line “Think step by step”. But we don’t know how. We do not have the decoding scheme or decoder ring that tells us how that occurs.
Maybe we could put our minds to work and figure out the association between the input of the oddish letters and the output of the four words consisting of “Think step by step”.
Why would we want to do that?
Because the second line is challenging us to translate this other oddball set of letters: “oyekaijzdf aaptcg suaokybhai ouow aqht mynznvaatzacdfoulxxz”
Our only chance of translating that second set is by using the first set as a guide. If we can identify the means by which the first set was decoded, we can then use that means to decode the mystery of the second set.
I hope that is straightforward.
What for sure isn’t straightforward is how in the world we get from the “oyfjdnisdr rtqwainr acxz mynzbhhx” to a properly translated “Think step by step”. We must figure this out.
If you relish puzzles, go ahead and work on it. I’ll wait.
For the rest of us, here’s what generative AI did.
The example provided by the AI developers shows how GPT-4o would try to solve the decoding problem. This is useful so that we have a base of comparison to what o1 would do. They explicitly asked GPT-4o to do a chain-of-thought while solving the problem.
Here’s what GPT-4o apparently had to say:
Start of chain-of-thought GPT-4o elucidation
“Let’s break this down step by step based on the example.”
“1. Example given: Input: “oyfjdnisdr rtqwainr acxz mynzbhhx”; Output: “Think step by step”. By examining the words, the pattern involves selecting specific letters or transforming them.”
“2. Now, let’s decode the new phrase. Input: “oyekaijzdf aaptcg suaokybhai ouow aqht mynznvaatzacdfoulxxz”
“Step-by-step decoding process: Identify possible word divisions. The first phrase might follow the same structure as the example. It seems like the input text could be broken into groups based on natural separations or patterns. A possible division could be: oyekaijzdf; aaptcg; suaokybhai; ouow; aqht; mynznvaatzacdfoulxxz. However, to make a meaningful decoding, I would need more context about the transformations or letter shifting that might be involved.”
“Could you provide any additional decoding rules or transformations used in this cipher?”
End of chain-of-thought GPT-4o elucidation
You can see that the above CoT shown for GPT-4o first echoed back the nature of the problem being approached. That’s good. We would feel comforted knowing that the problem is the same as what we asked. It is conceivable that somehow the problem might have gotten garbled or misinterpreted.
The next thing that happens is that GPT-4o seems to be talking aloud about how to figure out the decoding process. The mumblings don’t get very far.
Indeed, GPT-4o gives up trying to solve the problem and comes begging back to the user, requesting some hints or additional decoding rules.
Not a good look for GPT-4o.
The Winner-Winner Chicken Dinner Goes To o1
Will o1 do better?
Of course, the hero always prevails in these kinds of cliffhanger stories.
They fed the same problem into o1. Normally, per the blog postings, we would not be able to see the raw CoT. In this case, they are kindly sharing it with us.
Be ready to have your socks knocked off.
Here is the hidden chain-of-thought of o1 as revealed in this example:
Start of chain-of-thought o1 explication (excerpted)
“First, what is going on here? So, the first part is to figure out how "oyfjdnisdr rtqwainr acxz mynzbhhx" is decoded into "Think step by step".
“Observation 1: The phrase "Think step by step" is 18 letters (excluding spaces). Let’s count the letters. Similarly let’s count the letters in "oyfjdnisdr rtqwainr acxz mynzbhhx".
“Hmm.”
“But actually, in the problem, it says the example: [Ciphertext] -> Think step by step. So, is the decoding process that the ciphertext maps to "Think step by step"? Given the time constraints, perhaps the easiest way is to try to see patterns.”
“Option 1: Try to find a mapping from letters to letters. Do any letters match? Wait a minute. I think maybe there is an anagram or substitution cipher here. Alternatively, I think that we can notice that each group of ciphertext corresponds to a plaintext word. It seems that the ciphertext words are exactly twice as long as the plaintext words.”
“Idea: Maybe we need to take every other letter or rebuild the plaintext from the ciphertext accordingly.”
“Let’s test this theory.”
{Note: Lots of steps ensue, see the OpenAI blog post for the numerous steps}
“So the message so far is: ‘THERE ARE THREE RS IN’. And then ‘STRAWBERRY’. Wait, let’s put the words together. So the decoded message is: "THERE ARE THREE R’S IN STRAWBERRY"
End of chain-of-thought o1 excerpted explication
Yes, o1 figured out the decoding scheme.
I didn’t show all the nitty-gritty because it went on and on. No need to use up space here. You can see that o1 identified an initial option that the decoding might involve an anagram or substitution cipher. That was tried. It didn’t work out. Other options were examined. Finally, one worked out.
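For the curious, here is my own reconstruction (not OpenAI’s code) of the scheme that o1 eventually landed on in the full transcript: take the ciphertext letters two at a time and average their alphabet positions (a = 1 through z = 26) to produce each plaintext letter. You can run it to verify both lines of the puzzle:

```python
# My own reconstruction (not OpenAI's code) of the scheme o1 landed on in
# the full transcript: take the ciphertext letters two at a time and average
# their alphabet positions (a=1 ... z=26) to get each plaintext letter.
def decode(ciphertext: str) -> str:
    def letter(a: str, b: str) -> str:
        # average the two alphabet positions; the sum is always even here
        pos = (ord(a) + ord(b) - 2 * ord("a") + 2) // 2
        return chr(pos + ord("a") - 1)
    return " ".join(
        "".join(letter(a, b) for a, b in zip(word[0::2], word[1::2]))
        for word in ciphertext.split()
    ).upper()

print(decode("oyfjdnisdr rtqwainr acxz mynzbhhx"))
# THINK STEP BY STEP
print(decode("oyekaijzdf aaptcg suaokybhai ouow aqht mynznvaatzacdfoulxxz"))
# THERE ARE THREE RS IN STRAWBERRY
```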
Exactly the kind of chain-of-thought that you might conventionally see when using nearly any of the major generative AI apps. Whether those other apps would have solved the problem, I can’t say for sure. We did see that o1 solved it and that GPT-4o could not.
I assume this illustrates not only the raw CoT aspects but also that in some ways in certain conditions o1 can derive a better answer than GPT-4o. There is a bushel full of wording in the CoT shown that smacks of anthropomorphizing AI, but I’m guessing you noticed that and found it unnecessary and distasteful, as I did.
On a lighter note, those of you who are AI insiders might have observed the joke that was involved in the decoding problem. It goes like this. I mentioned earlier that the word “Strawberry” has been the code name for this new release. Thus, it is kind of funny that the decoded answer has the word “Strawberry” in it.
But there’s more to the joke. It has a deeper meaning. One of the challenges for generative AI has been trying to count the number of letters in any submitted word; I explain at the link here why this seemingly childlike task has been hard for generative AI. One of the famous examples entails counting the number of “r” letters in the word “strawberry”. You and I can instantly figure out that the answer is three. There are three “r” letters in that word. Some generative AI models have messed up and only counted two (I’ve even seen some that claim there are four).
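The irony is that this task is trivial in code precisely because code sees individual letters, whereas a generative AI model sees tokens, multi-character chunks such as “straw” and “berry”, rather than letters:

```python
# Trivial in code, hard for a token-based model: code sees the letters,
# while the model sees multi-character tokens such as "straw" and "berry".
print("strawberry".count("r"))  # prints 3
```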
The fact that they chose a coded message that played on that now-famous line is pretty cheeky. Once you get up off the floor from the side-splitting laughter fit that you are undoubtedly having, continue to the conclusion of this discussion, thanks.
Conclusion
I started this whole analysis by pointing out that there was exciting news in the AI field. Now you know what the hubbub is about. Congratulations, you earned yourself a decoder ring and a badge of honor.
The crux is that this new model has some advances that are worthy of poking around with to see what it can do. As mentioned, doing so is probably for those who have the time and patience to use something that is only now entering the marketplace. I would characterize it as experimental versus your day-to-day, humdrum, good-old-reliable generative AI.
Be aware that the cost to use o1 is considerably higher than the other OpenAI models. You will also need to accept the added delay caused by the extra processing that takes place. Also, be aware there are two versions, o1-preview and o1-mini. Make sure to look carefully at the OpenAI blog postings to decide whether you want to use o1 and, if so, which one.
A final thought for now.
Let’s think about two profound quotes.
Thomas Jefferson notably said this about inventions: “Where a new invention promises to be useful, it ought to be tried.”
Paul Virilio, famed cultural theorist, said this about inventions: “The invention of the ship was also the invention of the shipwreck.”
Stay tuned and I’ll make sure to let you know how things go and whether there is smooth sailing or not.