Frontier models score at PhD-level intelligence yet fail to execute well, a gap that holds from robotics to computer use.
To Think Is Not to Do: The Paradox of…