AI Models Rapidly Advance Toward Longer Task Completion
- Recent analysis reveals a striking pattern in AI development: the length of tasks that language models can complete is doubling roughly every seven months, at both the 50% and 80% success thresholds. The METR chart tracking this progress shows models like GPT-4, Gemini 2.5 Pro Preview, and Claude 3.7 Sonnet steadily advancing into territory that once required sustained human attention for multi-step tasks.

- This rapid advancement has sparked regulatory discussions, particularly around taxation of AI-driven productivity gains. Policymakers are considering new approaches, though experts warn that poorly designed tax policies could backfire. Higher levies on automated output might push smaller companies out of business and drive experienced engineers to relocate abroad where regulatory frameworks are more supportive.
- Financial projections suggest governments face budget gaps if they rely on outdated tax structures while automation accelerates. Industry voices propose an alternative: raise profit taxes on large AI companies rather than taxing automation output directly. This approach, they argue, would preserve competitiveness while maintaining stable revenue collection.
- Multiple legislative amendments spanning employment and tax law are being explored to address potential labor market disruptions as AI handles increasingly complex workflows.
- The capabilities are expanding faster than many anticipated. OpenAI’s unreleased internal systems reportedly handle up to four hours of continuous work, a dramatic leap from current public models. Sam Altman expects next year’s Codex-derived models to progress from multi-hour tasks to multi-day workflows, matching the upward trajectory shown in the METR chart through 2026.
- As AI capabilities scale at this pace, balanced policy becomes critical. Governments must avoid overcorrecting while ensuring AI-driven productivity gains remain sustainable and broadly shared.
Source: Haider
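
To make the doubling claim concrete, here is a minimal sketch of the arithmetic behind a fixed seven-month doubling period. It is not taken from the METR analysis itself: the function names are hypothetical, the four-hour starting point is the reported figure quoted above, and treating 16 hours of work as "multi-day" is purely an illustrative assumption.

```python
import math

# Doubling period quoted in the summary above; everything else here is illustrative.
DOUBLING_MONTHS = 7.0


def projected_horizon(start_hours: float, months_ahead: float,
                      doubling_months: float = DOUBLING_MONTHS) -> float:
    """Task length (in hours) after `months_ahead`, assuming steady exponential doubling."""
    return start_hours * 2 ** (months_ahead / doubling_months)


def months_to_reach(start_hours: float, target_hours: float,
                    doubling_months: float = DOUBLING_MONTHS) -> float:
    """Months needed to grow from `start_hours` to `target_hours` at the same doubling rate."""
    return doubling_months * math.log2(target_hours / start_hours)


if __name__ == "__main__":
    start = 4.0    # reported four-hour figure (from the text above)
    target = 16.0  # assumed stand-in for a "multi-day" task (two 8-hour workdays)
    print(f"Horizon after 12 months: {projected_horizon(start, 12):.1f} hours")
    print(f"Months to reach {target:.0f} hours: {months_to_reach(start, target):.1f}")
```

Under these assumptions, going from four hours to 16 hours takes two doublings, or about 14 months. The trend is an extrapolation, so the numbers are only as reliable as the assumption that the doubling period stays constant.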