OpenAI acquired Promptfoo, an AI security platform for enterprises.
"Promptfoo brings deep engineering expertise in evaluating, securing, and testing AI systems at enterprise scale"
Figure published a new demonstration of Helix 02, where it can clean up your living room fully autonomously.
In several years, we will be testing completely different things.
Anthropic released Code Review for Claude Code, a new code-review solution that uses parallel agents to hunt for bugs and issues.
"Agents search for bugs in parallel, verify each bug to reduce false positives, and rank bugs by severity. You get one high-signal summary comment plus inline flags."
In general, I expect 2026 to bring a rise in "parallelization", which will also significantly increase average token consumption. Models will get cheaper, but we will consume them more and more.
BREAKING: Advanced Machine Intelligence (AMI), founded by Yann LeCun, raised $1.03B in a seed round.
"AMI is building a new breed of AI systems that understand the world, have persistent memory, can reason and plan, and are controllable and safe."
BREAKING: According to Axios, Meta has acquired @moltbook, a social network for AI agents that became popular alongside the rise of OpenClaw agents.
Looks like all AI agents will get their Facebook page at some point.
Google is rolling out a new Gemini experience in Docs, Sheets, and Slides, allowing users to offload more tasks to AI.
Gemini will be able to pull context from relevant sources and generate or modify the document's content.
I have high hopes for this feature.
Google released a new multimodal embedding model, Gemini Embedding 2, with SOTA performance!
Unimodal and multimodal.
Google tests Multi-agent planning mode on Gemini Business
Google is testing a Gemini Enterprise feature in Workspace to identify task specialists and create delegation plans.
#gemini
TestingCatalog
Anthropic launches AI code review tool for Claude Teams & Enterprise
Anthropic has introduced Code Review, an automated PR evaluation tool in research preview for Team and Enterprise users. It deploys multiple AI agents to detect, verify, and prioritize bugs, delivering detailed feedback. Pricing ranges from $15–25 per review.
#claude
TestingCatalog
What's new? Anthropic launched Code Review, a PR evaluation tool that uses AI agents to inspect errors and rank them by severity, in beta for Team and Enterprise users.
Google launches new multimodal Gemini Embedding 2 model
Google has launched Gemini Embedding 2 in Public Preview via the Gemini API and Vertex AI, delivering a unified multimodal embedding model for text, images, video, audio, and PDFs. It supports 100+ languages and flexible dimensions for search, RAG, and clustering use cases.
#gemini
TestingCatalog
What's new? Gemini Embedding 2 supports text, image, video, audio, and document embeddings in a unified space. It is available via the Gemini API and Vertex AI with adjustable output dimensions.
Claude for mobile got updated.
"Improvements to voice mode, transcription, LaTeX rendering, artifact display, large prompts perf, MCP connections, attachment uploads, and more."
Anthropic is also working on a new bottom nav bar and upgraded onboarding experience for Claude on mobile.
Thinking Machines and NVIDIA announced a long-term strategic partnership to deploy at least 1 gigawatt of NVIDIA Vera Rubin systems to support Thinking Machines' frontier model training.
"Deployment on NVIDIA's Vera Rubin platform is targeted for early next year."
OpenClaw is recruiting beta testers for a new stealth project that may let companies better scale OpenClaw agents across the organisation.