TestingCatalog AI News 🗞
4.74K subscribers
2.92K photos
378 videos
40 files
3.86K links
Reporting AI nonsense. A future news media, driven by virtual assistants 🤖
Perplexity working on Model Council, combining 3 AI models

Perplexity is developing Model Council, a Max-tier feature enabling users to compare outputs from top AI models like GPT-5.2, Gemini 3 Pro, and Claude Opus 4.5. A separate mode, Gamma, hints at experimental high-tier capabilities.

🗞 #perplexity
๐Ÿ‘331
BREAKING 🚨: Perplexity is working on a new Model Council multi-model system, combining outputs from GPT-5.2, Opus 4.5 and Gemini 3 Pro into one response.

In addition, a new mode named Gamma is in the works, labelled as "ASI".
โค12๐Ÿ‘3๐Ÿ‘Ž1
Perplexity launches Advanced Deep Research for Max users

Perplexity launched the DRACO Benchmark to publicly assess AI research tools on real-world tasks across ten domains. It measures accuracy, depth, presentation, and sourcing, with initial results showing Perplexity leads in both precision and speed.

🗞 #perplexity
👍3
GitHub Copilot Pro+ and Copilot Enterprise subscribers can now use Codex and Claude agents on GitHub.

GitHub Codex Pilot or GitHub Claude Pilot?
โค5๐Ÿ‘2
According to The Information, Meta's upcoming Avocado model is referred to internally as its “most capable” to date.

Soon? 👀
๐Ÿ‘11๐Ÿ˜2๐Ÿคฎ1
BREAKING 🚨: A new Gemini checkpoint has been spotted in A/B testing.

Will we see this live? 👀

h/t x@marmaduke091
🔥14👍33
43
BREAKING 🚨: CLAUDE OPUS 4.6 HAS BEEN SPOTTED IN PERPLEXITY APIs!

* Keep in mind that this doesn’t imply an imminent release.

h/t x@synthwavedd
๐Ÿ‘85
BREAKING 🚨: OPENAI ANNOUNCED OPENAI FRONTIER, A NEW ENTERPRISE PLATFORM TO CREATE AND MANAGE AI COWORKERS.

"Frontier gives agents the same skills people need to succeed at work: Understand how work gets done, Use a computer and tools, Improve quality over time, Stay governed & observable"

The biggest part 👀

"Built-in ways to evaluate and optimise performance make it clear to human managers and AI coworkers what’s working and what isn’t, so good behaviours improve over time. Over time, AI coworkers learn what good looks like and get better at the work that matters most."
🔥5👍41
BREAKING 🚨: A BIG DROP IS EXPECTED FOR CODEX TODAY! CODEX GITHUB ALSO DOESN’T STATE “LATEST” NEXT TO GPT-5.2 ANYMORE.
โค74๐Ÿ‘€3
BREAKING 🚨: PERPLEXITY IS PREPARING CLAUDE OPUS 4.6 FOR RELEASE ON THE WEB. A STRONG SIGNAL THAT IT WILL ARRIVE TODAY.

We are super close 👀
❤6👍1
BREAKING 🚨: CLAUDE OPUS 4.6 IS ROLLING OUT ON THE WEB, APPS AND DESKTOP!

TESTING TIME 🔥
❤10😭5👍1
BREAKING 🚨: Claude Opus 4.6 has been officially announced. Opus 4.6 comes with improved performance across various agentic, research and coding tasks.

What would you test first? 👀
❤3👍1
Opus 4.6 comes with big improvements in agentic search, agentic financial analysis and office tasks.

"Financial professionals use AI to research across multiple data sources, support financial analyses, and create deliverables that their teams and customers can act on."
โค2๐Ÿ‘1
BREAKING 🚨: GPT-5.3-CODEX IS ROLLING OUT ON CODEX CLI AND DESKTOP APP!

COMPETITION AT SCALE 🔥
❤11👍1
BREAKING 🚨: GPT-5.3-CODEX WAS USED TO SUPPORT CREATING ITSELF, ACCORDING TO OPENAI'S BLOG!

It achieves SOTA scores of 57% on SWE Bench Pro and 76% on TerminalBench.

"With GPT-5.3-Codex, Codex goes from an agent that can write and review code to an agent that can do nearly anything developers and professionals can do on a computer."
โค4๐Ÿ‘1๐Ÿ”ฅ1
OpenAI opens up Trusted Access framework to accelerate cyber defence.

GPT-5.3-Codex was the first model to hit a "High" on OpenAI's preparedness framework.

Shit is about to get real 👀
โค6๐Ÿ˜6๐Ÿค”3