Claude Fable 5 was tested again in BridgeBench after its return. The results dropped sharply.

Debugging: 86.2 → 25.9

Refactoring: 73.6 → 38.4

Hallucination: 75.9 → 61.7

When tasks pass the safety filters, the model performs like the June 12 version.

The main problem is the new filters. They too often classify coding tasks as risky and switch execution to Opus 4.8.