Preventing Human-Agent Trust Exploitation in AI Agents
Your health data agent says: “Your sleep quality improved 23% this month compared to last month.” You adjust your bedtime routine, change your medication timing, or skip a doctor’s appointment because “the AI says I’m improving.” But what if the 23% was hallucinated? What if the agent compared 30 days to 28 days without normalising? What if one of the tool calls failed and the agent filled in the gap with a plausible-sounding number? ...
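The month-length pitfall above is easy to demonstrate with made-up numbers. This is a minimal sketch (the scores and the `pct_change` helper are hypothetical, not from any real agent): identical sleep quality every night still yields a roughly 7% "improvement" if the agent compares raw 30-day and 28-day totals instead of per-day averages.

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) / old * 100

# Assume identical sleep quality every night: a score of 80 out of 100.
feb_scores = [80.0] * 28   # 28-day month
mar_scores = [80.0] * 30   # 30-day month

# Naive comparison of raw totals: two extra days inflate the total,
# producing a spurious ~7.1% "improvement" from no real change.
naive = pct_change(sum(feb_scores), sum(mar_scores))

# Normalised comparison of per-day averages: correctly reports 0%.
normalised = pct_change(sum(feb_scores) / len(feb_scores),
                        sum(mar_scores) / len(mar_scores))

print(f"naive:      {naive:.1f}%")       # inflated by 30/28
print(f"normalised: {normalised:.1f}%")  # no change
```

The point is not that the arithmetic is hard; it is that a user reading "23% improved" has no way to tell which of these two computations the agent actually performed.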