Huawei's New Benchmark Gives AI Agents Months of Your Life—Then Watches Them Fail
In brief: Researchers from Huawei and three partner institutions released Claw-Anything, a benchmark that evaluates AI agents on personal-assistant tasks. GPT-5.5, OpenAI’s flagship model, scored...
