Large Language Models (LLMs) now write with a fluency and persuasiveness that can subtly steer users’ choices. When their outputs lack clear and comprehensible explanations, this persuasive power risks undermining human decision-making capacity, raising serious ethical concerns. Current explainable artificial intelligence (XAI) techniques focus primarily on technical transparency for epistemic purposes (how a model works); they are rarely intended to reveal to the user the kind of influence they are subject to. Drawing on the Indifference View of manipulation, we advance a preliminary framework that reconceives explainability as both an epistemic and an ethical imperative. The core idea is based on explanatory metadata: layered annotations that accompany model outputs with four complementary types of explanation (informative, justificatory, causal, and precautionary), which give models the ability to detail the reasons underlying the influence they exert. Doing so shifts the XAI goal from mere transparency to responsible influence. It positions explanations as a safeguard against the manipulative behavior of generative AI systems, laying the groundwork for future methods that measure, audit, and actively constrain ethically problematic influence.