Salesforce reveals digital twin for business ops so your business can test AI agents before deployment

  • Many AI pilots fail in real-world operations, and 95% of GenAI pilots don’t reach production, Salesforce claims
  • CRMArena-Pro lets enterprises stress-test their AI agents with digital twins
  • Two new benchmarks, MCP-Eval and MCP-Universe, measure agent performance

Salesforce says enterprises are struggling as their AI pilots fail in real-world operations, and has launched CRMArena-Pro, a new service that lets businesses create a digital twin of their operations to stress-test AI agents before deployment.

The company cited recent MIT research that found 95% of generative AI pilots never reach the production stage.

CRMArena-Pro evaluates AI agents on real tasks, such as customer service, sales forecasting and handling supply chain disruptions, using synthetic data that has been validated by experts.

Salesforce lets you stress-test AI agents using digital twins

“CRMArena-Pro creates a rigorous, context-rich simulated enterprise environment framework with synthetic data, where it can safely evaluate API calls to relevant systems, as well as the ability to safeguard PII data,” the company wrote in an announcement.

By adding real-world noise to the test environment, CRMArena-Pro can better evaluate performance, strengthen resilience and bridge the gap between pre- and post-deployment behavior.

“The result is AI agents that are capable, consistent, trustworthy, and agentic enterprise-ready.”

Companies can also see how AI agents handle real-world challenges like messy data, legacy systems and complex workflows.

Salesforce noted that part of the complexity comes from the vast array of models available to choose from today, and knowing which specific model or combination of models to use isn’t simple.

To that end, the company has published two new benchmarks to measure agent performance: MCP-Eval, which evaluates agents through synthetic tasks, and MCP-Universe, which adds real-world tasks and execution-based evaluators to stress-test agents in complex scenarios.

In a previous post, Salesforce noted that CRMArena-Pro “lays the groundwork for the next frontier: Enterprise General Intelligence”. For now, users can expect “safe, capable and impactful” AI for all organizations.

With several years’ experience freelancing in tech and automotive circles, Craig’s specific interests lie in technology that is designed to better our lives, including AI and ML, productivity aids, and smart fitness. He is also passionate about cars and the decarbonisation of personal transportation. As an avid bargain-hunter, you can be sure that any deal Craig finds is top value!
