
In today’s column, I identify three eras underlying the advent of so-called “thinking time” when it comes to using generative AI and large language models (LLMs) and discuss the changes in prompting that will arise soon accordingly.
Let’s talk about it.
This analysis of an innovative AI breakthrough is part of my ongoing Forbes column coverage on the latest in AI including identifying and explaining various impactful AI complexities (see the link here). For my extensive discussion of fifty key prompting strategies and the value of solid prompt engineering, see the link here.
When you use modern-day generative AI, the latest versions tend to have a chain-of-thought (CoT) capability that is now built into the AI, see my in-depth explanation at the link here.
The essence of the chain-of-thought approach is that the AI derives a series of steps to follow when trying to process the prompt that a user has entered. You can somewhat liken this to human decision-making and problem-solving in the broad sense of coming up with logical steps to figure things out (though do not anthropomorphize AI; AI isn't on par with how humans think).
In the AI community, this extended use of processing time has been given all kinds of fanciful titles, such as providing additional test-time compute or so-called thinking, reasoning, or inference time. I disfavor the phrasing of thinking/reasoning/inference time since those words smack of human thought. I also find the phrase test-time compute quite questionable: testing is usually something done before a system or app is rolled out to users, whereas "test-time" here seemingly also encompasses when users are actively utilizing a supposedly fielded system.
I prefer the admittedly bland but apt indication that you are simply providing more processing time for the AI to dig deeper into deriving an answer or solution.
Imagine things this way.
You are using an AI app that plays chess. If you timebox the app to a split second of processing time, there isn't enough time to examine a plethora of possible chess moves. The AI must cut off its look-ahead since the processing time doesn't allow for many calculations. On the other hand, you might tell the AI that it can take up to five minutes to make a chess move. This would allow a bunch more processing time to explore a wide variety of chess moves and potentially derive a better choice for the move at hand.
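To make the timeboxing concrete, here is a minimal sketch in Python of the underlying mechanism, assuming hypothetical legal_moves and evaluate helpers that stand in for a real chess engine. The time budget simply caps how many plies of look-ahead get finished.

```python
import time

def pick_move(position, legal_moves, evaluate, time_budget_s):
    # Iterative deepening under a hard time budget: search one ply deeper
    # on each pass, keeping the best move from the last completed pass.
    # `legal_moves` and `evaluate` are hypothetical stand-ins for a real
    # chess engine's move generator and scoring function.
    deadline = time.monotonic() + time_budget_s
    best_so_far, depth = None, 1
    while best_so_far is None or time.monotonic() < deadline:
        scored = [(evaluate(position, move, depth), move)
                  for move in legal_moves(position)]
        best_so_far = max(scored, key=lambda pair: pair[0])[1]
        depth += 1
    return best_so_far

# A split-second budget cuts off the look-ahead at a shallow depth;
# a five-minute budget lets the search dig far deeper before answering.
```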
Some would say this is adding thinking time, reasoning time, test-time compute, and so on. I’d suggest we could also be more plainspoken and note this as extending the run-time, real-time processing, or execution time allotment.
You decide which vernacular is the most parsimonious and, frankly, honest.
Since the AI community seems to have decided that they like the impressive-sounding parlance of thinking time, I am going to proceed to use that catchphrase in this discussion since I will be showing generative AI examples entailing this topic. Please know that I use that phrase with great reluctance. In your mind, construe the thinking time as simply the processing time, thanks.
I have identified that there are three eras associated with the amount of thinking time for generative AI:

(1) First era: the AI maker sets the thinking time, and the user gets little or no say.

(2) Second era: the user selects the thinking time, such as via a low/medium/high picklist or a sliding scale.

(3) Third era: the thinking time is collaboratively ascertained between the user and the AI, with the user retaining the final say.
Right now, we are primarily in the first era.
It goes like this. You ask generative AI a question, and the AI runs for some amount of thinking time to figure out your answer. The AI maker has decided how much thinking time you will be allowed. To some extent, you can pay a fee to get extended thinking time; otherwise, the amount is set by the AI maker.
This has led us to the second era.
After realizing that users might want to set how much thinking time is to be consumed, AI makers are now variously implementing the option for users to choose the amount of thinking time. For example, you might choose between Low, Medium, and High. That's the ChatGPT approach for the moment. Another angle is a sliding scale, which is what Anthropic is moving toward.
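Under the hood, a second-era choice typically rides along with the prompt as a request parameter. Here's a rough sketch of that plumbing; the client and the parameter names (effort, budget_scale) are hypothetical stand-ins, since the exact knobs and ranges vary by AI maker.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ThinkingConfig:
    effort: str = "medium"              # picklist style: "low" | "medium" | "high"
    budget_scale: Optional[int] = None  # sliding-scale style: e.g., 1 through 10

def ask(client, prompt: str, thinking: ThinkingConfig):
    # `client.complete` is a hypothetical call; the point is that the
    # user-chosen thinking time rides along with every prompt.
    return client.complete(
        prompt=prompt,
        thinking={"effort": thinking.effort, "budget_scale": thinking.budget_scale},
    )

# An easy factual question merits a low setting:
# answer = ask(client, "What is the capital of France?", ThinkingConfig(effort="low"))
```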
You will see in a moment that this ability to choose the desired thinking time isn’t all that it’s cracked up to be. Hang in there.
I predict we are heading at a rapid pace toward a third era.
In the third era, the amount of thinking time will be collaboratively ascertained on a human-AI basis. In short, the generative AI will discuss the thinking-time aspects with the user, and the user then gets the final say on the amount of thinking time to be utilized for a given prompt or set as a default.
My logic for saying that AI makers are inevitably and soon going to shift into my defined third era is that users of generative AI are going to express their dissatisfaction with the second-era approach. Few AI makers will admit that there is such discontent; they will merely proclaim that they have enhanced how thinking time is set. Sure, whatever gets us to improvements for those using generative AI. Spin away.
The first era is what you have already experienced when using most of the conventional generative AI. No need to dwell on that era. Let’s jump into the slowly emerging second era and see some examples.
The ability to pick from low, medium, or high is straightforward and provides a useful illustration of the second-era approach. There is either a picklist presented to you for each prompt, or the AI directly asks you which you prefer.
Here is an example of generative AI asking for your preference.

Generative AI response: "Before I proceed with your prompt, how much thinking time would you like me to use: Low, Medium, or High?"

My entered reply: "Go with low."
Observe that I told generative AI that I wanted to go with low for the prompt I am entering.
My basis for choosing low is that the question I am going to ask is pretty easy. Since I am simply asking for the capital of France, I expect that the thinking time should be quite minimal. No sense in telling the AI to go with high when I can guess that the needed thinking time isn't going to be notable.
This brings up an important point.
Some generative AI apps will do whatever you say, such that if you say high, the AI will potentially whirl and calculate far beyond the true amount of time needed. You have essentially told the AI that this is okay with you. The problem is that you might end up paying for extra cycles of processing that you really didn’t need to have expended. It’s kind of a rip-off.
The retort by the AI maker is that if the user has said they want a high amount of thinking time, by gosh, that’s on their shoulders. It is up to the user to realize what they are doing. Plus, the added thinking time can be construed as a handy double-check. You are getting the AI to do a lengthier exploration and as a result, you can be happier with the answer given.
Mull that over.
Though there is a bit of a splash made about using a sliding scale instead of the low, medium, and high, the rub is still about the same.
Take a look at this example.

My entered prompt: "On a scale of 1 to 10, use a thinking time of 7 for the question I am about to ask."
You can certainly tout the extravagance of being able to use a sliding scale, which might be an actual bar on the screen with a slider or could be, as I've shown above, the entry of a number. In reality, with a scale of 1 to 10, you could reasonably assume that 1 is low, 5 is medium, and 10 is high. Your ability to choose those numbers or something in between might be nice, but it doesn't get us to the moon, if you know what I mean.
The essence is that the selection of the thinking time is still on the shoulders of the user.
I will now shift into the approaching third era.
Let’s continue my example using the sliding scale. Suppose that the AI consumed the 7 as the amount of thinking time, but the answer wasn’t quite as good as it could have been if I had given a higher number. Please know that in the first and second eras, you would not be informed that your choice of time was a tad low. Tough luck to you.
In the third era, something like this would happen.

Generative AI response: "I used the thinking time of 7 that you specified. Heads-up: I detected that a deeper exploration was feasible, and a setting of 9 would likely have produced a stronger answer."
You can see that the AI not only proceeded as I originally instructed, but it also detected that there was more that could have been done to give a stronger answer. The AI kindly informed me accordingly.
I then opted to do a rerun with a 9 as the amount of thinking time.
This showcases the third era as consisting of human-AI collaboration in establishing the thinking time. Rather than the first era where the AI makes the choice, and the second era where the user makes the choice (though somewhat blindly), the third era entails the AI and the user working hand-in-hand to figure out the thinking time. Nice.
You might have had some heartburn that the AI informed me after the fact that my 7 was less than what might have been a better choice. I reran the prompt with a 9, but I had already incurred the cost and delay associated with my prompt that said to use 7. You might say that I am doubling my cost and that this seems unfair.
I agree.
The third era will introduce the midstream capability of adjusting thinking time. So, for this next example, envision that my initial prompt of 7 was accepted and the AI got underway.
Here's what might have happened.

Generative AI response (midstream): "I am partway through processing at your specified thinking time of 7. I believe a stronger answer is within reach if you allow me to go to 9. Shall I proceed at 9?"

My entered reply: "Yes, go ahead."
The beauty is that I don’t incur a complete rerun. Midstream of processing, the AI came back to me and asked if I was willing to up the ante to 9. I said yes.
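As a rough sketch of the plumbing for such a midstream check-in, consider the loop below. The session object, event kinds, and method names are all hypothetical; the point is the protocol of pause, ask, resume, with no full rerun.

```python
def run_with_midstream_checkin(session, prompt, budget):
    # All names here are hypothetical. The AI starts under the user's
    # chosen budget; if it judges partway through that a higher budget
    # would yield a stronger answer, it pauses and asks for approval.
    job = session.start(prompt, thinking_budget=budget)
    while not job.done:
        event = job.next_event()
        if event.kind == "budget_checkin":
            reply = input(f"{event.message} Raise to {event.suggested}? (y/n) ")
            if reply.strip().lower() == "y":
                job.raise_budget(event.suggested)  # continues in place, no rerun
            else:
                job.continue_as_is()
    return job.answer
```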
Cynical readers might right away be bellowing that this is going to incentivize the AI makers to convince users to increase their thinking times, perhaps just to make a buck. I get that. There is little doubt that the AI could be tilted to ask the user for more thinking time even when the request is misleading or an outright lie. Ka-ching goes the cash register for the AI maker. It will be hard for an average user to discern whether they are being honestly told to increase their time or are being tricked into doing so.
The saving grace, perhaps, would be that AI makers doing this tomfoolery are taking a huge reputational risk if it is discovered they are purposely gaming users. Possibly lawsuits and maybe criminal action could be in their future for such deceptions of users (for more on the evolving realm of AI and the law, see my analysis at the link here).
We’ll have to wait and see how this pans out.
Part of the issue with my having stipulated the 7 as my desired thinking time for my prompt was that I had to take a wild guess about the matter.
Consider things this way.
You go to a car mechanic to fix your car. Suppose the car mechanic asks you how much you are willing to spend to fix the car. That seems zany. The car mechanic ought to give you an estimate. Few people would magically know how much they think the car fix is going to cost. It doesn’t make much sense to do things that way.
The same will hold true in the third era of generative AI thinking time.
Here is an example.

My entered prompt: "Before you work on my question, give me an estimate of the thinking time that will be needed."

Generative AI response: "My preliminary estimate is that a low amount of thinking time will suffice for your question. Shall I proceed on that basis?"

My entered reply: "Yes, approved."
This makes a lot more sense. I was able to provide my prompt and get a preliminary estimate. I approved the estimate.
Once the AI gets underway, if it determines that the estimate was not sufficient, it will come back to me midstream and let me know. I could then adjust if desired.
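In code terms, that flow amounts to an estimate-then-approve handshake. The sketch below uses hypothetical calls (estimate_thinking, run) as placeholders for whatever a given AI maker ships.

```python
def ask_with_estimate(session, prompt):
    # Hypothetical API: the AI quotes a thinking-time estimate up front,
    # the way a mechanic quotes a repair, rather than making the user guess.
    estimate = session.estimate_thinking(prompt)
    print(f"Estimated thinking time: {estimate.budget} on a 1-10 scale, "
          f"approximate cost: {estimate.cost}")
    if input("Approve this estimate? (y/n) ").strip().lower() != "y":
        return None
    # If the estimate proves insufficient, the midstream check-in applies.
    return session.run(prompt, thinking_budget=estimate.budget)
```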
A user who frequently utilizes generative AI might get tired of continually dealing with estimates and approvals for thinking time. It could be exasperating and irksome.
In the third era, the AI will keep track of how things are going and make recommendations to the user.
Consider this example.

Generative AI response: "Looking across your recent prompts, nearly all of them needed only a low amount of thinking time. I recommend we set Low as your default. I will alert you whenever a prompt seems to warrant more."
The AI has handily determined that my best bet is to generally be at Low. This isn’t rigid. The AI will adjust, and I can adjust.
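One plausible way the AI could arrive at such a recommendation is by tracking which thinking-time budgets sufficed for past prompts and defaulting to the observed sweet spot. Here's a small self-contained sketch of that bookkeeping; the categories and numbers are illustrative.

```python
from collections import defaultdict
from statistics import median

class BudgetAdvisor:
    # Tracks which thinking-time budgets sufficed for past prompts, per
    # rough category, and recommends a default for new prompts.
    def __init__(self):
        self.history = defaultdict(list)

    def record(self, category: str, budget_that_sufficed: int) -> None:
        self.history[category].append(budget_that_sufficed)

    def recommend(self, category: str, fallback: int = 5) -> int:
        past = self.history.get(category)
        return round(median(past)) if past else fallback

advisor = BudgetAdvisor()
advisor.record("simple-factual", 2)
advisor.record("simple-factual", 1)
advisor.record("simple-factual", 2)
print(advisor.recommend("simple-factual"))  # -> 2, i.e., keep the default low
```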
If you haven’t been dealing with thinking time when using generative AI, you now know what’s coming up. I trust that you are prepared for the changes afoot.
My expectation is that we will advance quickly to the third era. No sense in making life harder for users by getting mired in the first era or the second era. It’s time to move on.
A final comment for now.
Henry Ford famously said this: “Coming together is a beginning; keeping together is progress; working together is success.”
The same applies to working with generative AI. Human-AI collaboration is the best path toward success. Humans will be happy, and I suppose the AI will be "happy," though let's not hand out that emotion to non-sentient AI. We must keep our heads and minds clear on what contemporary AI can and cannot do.
That’s a good use of our thinking time.