Interesting, I'm not in full agreement on o3-mini being better than Gemini 2. The position of the yellow circle is relative to the outermost box, which makes it take unnatural turns that follow the box's rotation. My guess is that the circle is a child of the box and would bounce around it naturally if the box weren't rotating; because of the rotation, though, the circle doesn't move naturally. Gemini doesn't have this issue. Its innermost ball has a coefficient of restitution of 0, which means it doesn't bounce at all, and that looks somewhat unnatural, but I'd still say Gemini's animation was the most physically correct.
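To illustrate the "child of a rotating box" hypothesis, here's a minimal sketch (with made-up constants and names, not the video's actual generated code): the ball's physics run entirely in the box's local frame, so its world-space path bends with the box's rotation. It also shows why a restitution of 0 kills the bounce entirely.

```ts
// Hypothetical sketch: local-frame physics inside a rotating box.
// All names and values (HALF, E, GRAVITY, OMEGA) are assumptions for illustration.

type Vec = { x: number; y: number };

const HALF = 100;     // half-width of the square box, local units
const E = 0.8;        // coefficient of restitution (0 = no bounce at all)
const GRAVITY = 500;  // local-frame gravity, units/s^2
const OMEGA = 0.5;    // box angular velocity, rad/s
const DT = 1 / 60;    // fixed 60 fps timestep

let pos: Vec = { x: 0, y: -50 };
let vel: Vec = { x: 120, y: 0 };
let angle = 0;

function step(): Vec {
  // Physics entirely in the box's LOCAL frame -- the suspected flaw.
  vel.y += GRAVITY * DT;
  pos.x += vel.x * DT;
  pos.y += vel.y * DT;

  // Bounce off the local walls, scaling speed by the restitution E.
  // With E = 0 (like Gemini's innermost ball) the ball just stops at the wall.
  if (Math.abs(pos.x) > HALF) { pos.x = Math.sign(pos.x) * HALF; vel.x *= -E; }
  if (Math.abs(pos.y) > HALF) { pos.y = Math.sign(pos.y) * HALF; vel.y *= -E; }

  // The box keeps rotating, dragging the ball's world position with it --
  // this is what produces the unnatural turns on screen.
  angle += OMEGA * DT;
  return {
    x: pos.x * Math.cos(angle) - pos.y * Math.sin(angle),
    y: pos.x * Math.sin(angle) + pos.y * Math.cos(angle),
  };
}

for (let i = 0; i < 5; i++) console.log(step());
```

If OMEGA is set to 0 the world-frame trajectory looks like a normal bounce, which would match what I'd expect if the box weren't rotating.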
This is a cool and very visually understandable test 👍🏼
nice comparison!
Very cool test. Have you listed the prompts you used anywhere?
I used your prompt in ChatGPT o3-mini. It created HTML, but the green square was always in the centre of the yellow circle without any collisions; it was rotating, by the way. The ball inside the green rectangle worked fine, and the yellow circle also bounced off the white square... So maybe your results are somewhat random?
In my coding tests I'd say there's still no LLM that writes good code, and we must wait. Although, I don't use corporate cloud models at all. The Mistral models are the most "fraudulent": even Codestral or Mistral Large, which needs somewhere in the 150 GB RAM range, can't repair code at all or create what I needed. DeepSeek V2.5 was better at code than V3 or R1, all of which I've tested on my 576 GB RAM machine.
Your video is already too old in the AI world.