FOUNDERS FEED

first to know · first to grow—tech daily

Live

▸ Anthropic raises $5B at $170B valuation ▸ OpenAI ships o3 with native tool-use ▸ Cursor crosses $500M ARR ▸ Meta open-sources Llama 4 reasoning weights ▸ YC W26 batch: 38% AI infra ▸ Vercel acqui-hires Linear AI team ▸ Supabase Realtime v3 ships native pgvector streaming ▸ Bun 1.3 native Postgres driver: 2.4× faster ▸ Stratechery: 'the end of horizontal SaaS'

▸ AI · 4 stories today

The Lead · signal 80

AI · signal 80 · 24d ago

ITBench-AA: frontier models score below 50% on enterprise IT agentic tasks

Artificial Analysis and IBM released ITBench-AA, the first benchmark measuring frontier model performance on agentic enterprise IT tasks. Frontier models score below 50%, revealing significant gaps in current capabilities.

Why it matters: Clear evidence that general-purpose models still struggle with real enterprise automation, which sets a realistic difficulty baseline for builders targeting IT operations.

More AI · 3 more

Cisco and OpenAI scale enterprise AI with Codex integration

Warp orchestrates coding agents with GPT-5.5 across dev environments

ElevenLabs releases music generation model with genre-switching capability