Much of the content of this forum involves numerical calculation and combining more than one fact into a chain of reasoning. I believe that ChatGPT, Bing Chat, and Bard make gross errors on such topics when they cannot find a ready-made template in their knowledge base. How serious is this?
As an editor remarked in the 1980s when a word processor company tried to sell her a scanner with OCR, "if it is only 98% accurate, it's no good to us, because we would still need to proofread, and proofreading takes more time than data entry." Fact-checking the output from these language models is labor-intensive, skilled, and unpleasant work. Any work saved by using the language model has to be weighed against the work needed to verify and fact-check the results. And you can't say "let's not bother to fact-check the ones that are right!"
Here's an example. If looking at this makes your eyes glaze over, I get it. My initial reaction to the first answer was "Wow! Nailed it!" Then I did a double-take: "almost two-and-a-half feet?" My reaction to the second was "Wow! It was able to find the error and it fixed it." But no, it didn't. It made a serious conceptual error again, this time on something it had gotten right the first time (allowing for the width of the cut made when removing the 2' piece). (Of course I'm troubled by the nagging fear that I made a dumb mistake and that ChatGPT was right, and that I am about to look like an idiot. But this, too, would be a problem in any real-world use of ChatGPT for numeric problems. My own answer is at the very end.)
Q: I have a saw whose blade is 2 mm thick, and a board that is 10 feet long. I saw off a 2-foot-long piece, and then cut the remaining part into four equal pieces. How long is each piece?
chat.forefront.ai, ChatGPT v4, 'Helpful Assistant' wrote: To solve this problem, we first need to convert the board's length to the same unit as the saw blade's thickness, which is millimeters. There are 304.8 millimeters in a foot. Then, we'll calculate the total length of the cuts made by the saw and subtract that from the board's length. Finally, we'll divide the remaining length by the number of pieces to find the length of each piece.
Convert the board's length to millimeters: 10 feet * 304.8 mm/foot = 3048 mm
Calculate the total length of the cuts made by the saw: 1 cut (2-foot piece) + 3 cuts (4 equal pieces) = 4 cuts * 2 mm/cut = 8 mm
Subtract the total length of the cuts from the board's length: 3048 mm - 8 mm = 3040 mm
Divide the remaining length by the number of pieces: 3040 mm / 4 pieces = 760 mm/piece
Convert the length of each piece back to feet: 760 mm * 1 foot/304.8 mm ≈ 2.493 feet
Each piece is approximately 2.493 feet long.
It completely forgot that we cut off the 2' piece. Although its next answer is numerically closer, in my opinion it's worse, because it shows clearly that the model does not have any actual understanding of the situation. It makes another guess at how to solve the problem and bungles it in a different way.
Prompt: "I think you got confused about 'remaining length.'"
chat.forefront.ai, ChatGPT v4, 'Helpful Assistant' wrote: Apologies for the confusion. Let's re-calculate the lengths of the pieces.
Cut off a 2-foot-long piece: 10 feet - 2 feet = 8 feet remaining
The saw blade is 2 mm thick, which is approximately 0.00656 feet (2 mm * 1 foot/304.8 mm)
Since there are 3 cuts to make the 4 equal pieces, the total length removed by the saw is 3 * 0.00656 feet ≈ 0.01968 feet
Subtract the total length of the cuts from the remaining length: 8 feet - 0.01968 feet ≈ 7.98032 feet
Divide the remaining length by the number of pieces: 7.98032 feet / 4 pieces ≈ 1.99508 feet
Each of the four equal pieces is approximately 1.99508 feet long.
The fact that the result is numerically closer to the right answer should not obscure the fact that the mistake is again "forgetting" a key fact. This time, it forgot that the total number of cuts is 4, not 3.
It seems as if it couldn't link the two cutting operations involved: cutting off the 2' piece, and cutting the remaining part into four pieces... possibly because the particular combination was not already in its knowledge base?
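One way to catch slips like these without trusting either reply is a plain accounting check: the pieces plus the sawdust have to add up to the original board. Here is a rough Python sketch of that idea. The function name unaccounted_mm and the constants are my own, and it assumes that every one of the four cuts destroys exactly 2 mm of wood and that the first piece is exactly 2 feet.

```python
from fractions import Fraction

MM_PER_FOOT = Fraction(3048, 10)  # 1 foot = 304.8 mm exactly
BOARD_MM = 10 * MM_PER_FOOT       # 3048 mm
KERF_MM = 2                       # blade thickness from the question
TOTAL_CUTS = 4                    # 1 cut for the 2' piece + 3 cuts for four pieces


def unaccounted_mm(piece_mm, first_piece_mm):
    """Board length minus everything the answer claims exists.

    Zero means the answer balances; positive means wood is missing,
    negative means the pieces add up to more than the board.
    """
    used = first_piece_mm + 4 * piece_mm + TOTAL_CUTS * KERF_MM
    return BOARD_MM - used


two_feet = 2 * MM_PER_FOOT  # 609.6 mm

# First reply: four pieces of 760 mm each.
first_reply = unaccounted_mm(Fraction(760), two_feet)

# Second reply: (8 ft - 3 kerfs) / 4, i.e. about 1.99508 ft per piece.
second_piece = (8 * MM_PER_FOOT - 3 * KERF_MM) / 4
second_reply = unaccounted_mm(second_piece, two_feet)

print(float(first_reply))   # -609.6
print(float(second_reply))  # -2.0
```

Under those assumptions the first reply is over by exactly the 2-foot piece it forgot, and the second is over by exactly one saw cut. Fractions are used instead of floats only so that the comparison with zero is exact.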
I gave it one more chance. The result was still wrong, but very revealing:
Prompt: Try making a sketch.
chat.forefront.ai, ChatGPT v4, 'Helpful Assistant' wrote: I apologize for any confusion, but as a text-based AI, I am unable to create visual sketches. However, I can help you visualize the problem with a text-based description.
Start with a 10-foot-long board.
Cut off a 2-foot-long piece from one end. You now have an 8-foot-long board remaining.
You need to make 3 cuts to divide the remaining 8-foot-long board into 4 equal pieces.
Each cut removes 0.00656 feet (2 mm) of material due to the saw blade's thickness.
Here's a text representation of the board and cuts:
|---2ft---|---cut1---|---piece1---|---cut2---|---piece2---|---cut3---|---piece3---|---piece4---|
Now, let's calculate the length of each piece:
Calculate the total length removed by the saw: 3 cuts * 0.00656 feet/cut ≈ 0.01968 feet
Subtract the total length of the cuts from the remaining length: 8 feet - 0.01968 feet ≈ 7.98032 feet
Divide the remaining length by the number of pieces: 7.98032 feet / 4 pieces ≈ 1.99508 feet
Each of the four equal pieces is approximately 1.99508 feet long.
Again, in a way this is hugely impressive, but it also confirms that ChatGPT has lost track of the logical connections, and forgets that it needs to allow for the width of the cut between piece3 and piece4. But it "knew" it had to do that in its first try!
For the record, I hope this is the right answer! We end up with a total of five pieces each about two feet long. The first piece we cut off is exactly two feet long. The other four pieces are each missing the width of one cut. They are each two feet minus one cut long, i.e. 609.6 - 2 = 607.6 mm long = 1.99344 feet. To check, we have made a total of four cuts, so adding up the length of the final boards plus the length gone into sawdust, we have 4x607.6 + 1x609.6 + 4x2 = 3048 mm, accounting for the original length of the board.
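For anyone who wants to redo the arithmetic, here is a minimal Python sketch of the same reasoning. The function equal_piece_mm and its parameter names are made up for illustration. It assumes that every cut, including the one that frees the 2-foot piece, removes the full blade thickness, and that nothing is left over as an offcut.

```python
from fractions import Fraction

MM_PER_FOOT = Fraction(3048, 10)  # 1 foot = 304.8 mm exactly


def equal_piece_mm(board_mm, first_piece_mm, n_pieces, kerf_mm):
    """Length of each equal piece after sawing off first_piece_mm.

    Assumes every cut, including the one that frees the first piece,
    destroys kerf_mm of wood, and that no offcut is left over.
    """
    cuts = 1 + (n_pieces - 1)  # one cut for the first piece, then n-1 more
    wood_left = board_mm - first_piece_mm - cuts * kerf_mm
    return wood_left / n_pieces, cuts


board = 10 * MM_PER_FOOT
first = 2 * MM_PER_FOOT
piece, cuts = equal_piece_mm(board, first, n_pieces=4, kerf_mm=2)

print(float(piece))                          # 607.6 (mm)
print(round(float(piece / MM_PER_FOOT), 5))  # 1.99344 (feet)

# Accounting check: pieces plus sawdust must equal the original board.
assert first + 4 * piece + cuts * 2 == board
```

The final assert is the same bookkeeping as the sum above: four equal pieces, the 2-foot piece, and four saw cuts have to come back to 3048 mm.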
Annual income twenty pounds, annual expenditure nineteen nineteen and six, result happiness; Annual income twenty pounds, annual expenditure twenty pounds ought and six, result misery.