But your example could have been written as an explicit program in under an hour. Better still, the program would have well-defined, completely predictable behavior, and it would consume a tiny fraction of one percent of the processing cycles and resources that asking ChatGPT does.
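For illustration only, since the original task isn't quoted here: suppose the near-trivial problem were something like normalizing US phone numbers (a hypothetical stand-in, not the actual example). The explicit version is a few lines, runs in microseconds, and succeeds or fails reproducibly:

    import re

    def normalize_us_phone(raw: str) -> str:
        """Return a US phone number as +1XXXXXXXXXX, or raise ValueError.

        Hypothetical stand-in for the 'near-trivial problem' above;
        the point is determinism, not this particular task.
        """
        digits = re.sub(r"\D", "", raw)   # strip everything but digits
        if len(digits) == 11 and digits.startswith("1"):
            digits = digits[1:]           # drop a leading country code
        if len(digits) != 10:
            raise ValueError(f"not a US phone number: {raw!r}")
        return "+1" + digits

Same input, same output, every time. A failing case becomes a unit test and a one-line fix, not a prompt-tuning session.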
Code is a liability; but calling a black box that is not guaranteed to get the right answer, and that cannot obviously be fixed when it gets a wrong one, is a far greater liability.
If you turn this near-trivial problem over to an LLM, it is nearly certain to get it wrong sometimes. That is not engineering.