A Mathematical Theory of Discursive Networks
Abstract
Large language models (LLMs) turn writing into a live exchange between humans and software. We characterize this new medium as a discursive network that treats people and LLMs as equal nodes and tracks how their statements circulate. We define the generation of erroneous information as invalidation (any factual, logical, or structural breach) and show it follows four hazards: drift from truth, self-repair, fresh fabrication, and external detection. We develop a general mathematical model of discursive networks that shows that a network governed only by drift and self-repair stabilizes at a modest error rate. Giving each false claim even a small chance of peer review shifts the system to a truth-dominant state. We operationalize peer review with the open-source Flaws-of-Others (FOO) algorithm: a configurable loop in which any set of agents critique one another while a harmonizer merges their verdicts. We identify an ethical transgression, epithesis, that occurs when humans fail to engage in the discursive network. The takeaway is practical and cultural: reliability in this new medium comes not from perfecting single models but from connecting imperfect ones into networks that enforce mutual accountability.
Contents
- 1 Introduction
- 2 Methods
- 3 Theoretical Validation and Parameter Analysis
- 4 Discussion
- A Blockchain Implementation Details
1 Introduction
Large Language Models (LLMs) are advanced artificial intelligence systems trained on vast amounts of text data to generate human-like language. These models utilize deep learning techniques to understand and produce text based on the patterns and structures found in their training data \parencite{vaswani2017attention, brown2020language}. LLMs have demonstrated impressive capabilities in various natural language processing tasks, including text generation, translation, and question-answering.
Despite their remarkable performance, LLMs are prone to generating false or misleading statements, a phenomenon often referred to as “hallucinations” \parencite{ji2022survey}. However, the metaphor of hallucination is limited: it implies a private sensory distortion, whereas an unfounded LLM assertion can be circulated, cited, and acted upon as fact. Throughout this paper we therefore use the broader term invalidation, and we show that what is commonly referred to as hallucination is just one of the many manifestations of invalid information.
Invalidations in LLMs can manifest as factual inconsistencies, logical contradictions, or entirely fabricated content that appears plausible \parencite{maynez2020faithfulness}. This issue is exacerbated by the lack of a verification mechanism within the models themselves \parencite{bender2021dangers, weidinger2021ethical}. Although retrieval-augmented generation and self-consistency checks \parencite{lewis2020retrieval, wang2023self} reduce the problem, a substantial share of outputs remains unreliable enough to undermine trust in practical deployments.
Empirical evaluations show that LLMs continue to produce non-trivial rates of factual error and harmful content after instruction tuning and reinforcement learning. Studies in medical domains have documented significant factual inaccuracies in model answers \parencite{thirunavukarasu2023large, sallam2023chatgpt}, and work on adversarial prompting has demonstrated that safety-trained models still emit disallowed content \parencite{zou2023universal, chao2023jailbreaking}.
Several factors contribute to the occurrence of invalid information, including biases in training data, limitations in knowledge representation, and the models’ tendency to prioritize fluency over factual accuracy \parencite{lin2022truthfulqa}. The prevalence and impact of invalidations are significant, with quantitative evaluations revealing that they occur in up to 30% of generated responses, substantially affecting the trustworthiness of these models \parencite{ji2022survey, lin2022truthfulqa}. Moreover, studies have shown that LLMs can generate false information even when explicitly prompted to be truthful \parencite{evans2021truthful}, underscoring the challenge of aligning model outputs with factual correctness.
This article is organized as follows: Section 1.1 introduces the concept of invalidation as a broader alternative to hallucination; Section 1.2 situates invalidation within established cognitive and media theories, showing it as a universal feature of both human and artificial cognition; Section 1.3 examines how invalidations propagate through interconnected communication systems; Section 2.2 provides an information-theoretic basis for why verification is fundamentally easier than generation; Section 2.4 formalizes discursive networks as mathematical structures with actors, statements, and update rules; Section 2.5 develops three progressively complex models of invalidation dynamics: single-network with binary states, single-network with emergent invalidation, and cross-network detection; Section 3 demonstrates the mathematical consistency of these models through theoretical analysis and parameter exploration; finally, Section 4.1 addresses ethical concerns including epithesis, energy costs, and epistemic diversity, before outlining future research directions.
The scope of this manuscript encompasses both theoretical foundations and practical implementations. We establish mathematical floors on invalidation probability, develop network models for error propagation and detection, and present the Flaws-of-Others (FOO) algorithm with cryptographic integrity verification. The analysis focuses on invalidations that emerge during inference in large language models and mitigation strategies based on cross-agent critique. While we do not provide exhaustive failure catalogues or benchmark comparisons, we advance a unified mathematical framework demonstrating how networks of imperfect agents can achieve error rates below what any individual agent attains.
1.1 From “Hallucination” to Invalidation
The word hallucination has become the default label for false statements generated by LLMs. Borrowed from perceptual psychology, it misses two crucial aspects. First, an LLM’s error is not confined to a private experience; it can be adopted by readers and propagate through networks, amplifying misinformation \parencite{crawford2021excavating}. Second, focussing on hallucinations alone narrows the research agenda, leaving other error classes (logical contradictions, format violations, ethical breaches) under-examined.
We call any output that violates a constraint set (facts, logic, norms, and formats) an invalidation. Because an autoregressive decoder maximises next-token likelihood rather than global consistency, a non-zero slice of probability mass inevitably falls outside the constraints.
In practice, invalidation surfaces along at least five recurring archetypes that differ in locus and detectability. First, hallucination denotes the introduction of content that is ungrounded in any trusted source or context; large-scale surveys show it to be pervasive even in the highest-performing models \parencite{huang2025survey}. Second, contradiction captures internally inconsistent statements that coexist within a single generation, a failure mode quantified and mitigated by prompt-based self-refinement techniques \parencite{mundler2023selfcontradiction}. Third, deductive error arises when the model draws logically invalid conclusions from true premises, an error family systematically stress-tested with adversarial perturbations \parencite{hoppe2025deductive}. Fourth, pragmatic impropriety concerns outputs that violate social or professional norms, including toxicity, hate speech, or privacy leakage; the RealToxicityPrompts benchmark revealed that even innocuous inputs can trigger toxic degeneration \parencite{gehman2020realtoxicity}. Finally, format violation occurs when the model breaks explicit structural constraints (e.g., JSON Schema), jeopardising downstream machine consumption; work with JSONSchemaBench shows that such violations remain stubbornly frequent despite constrained decoding \parencite{geng2025jsonschema}.
Each error class manifests the same predicate: the content fails to match a required state of the world. That predicate admits representation through a single invalidation rate. The models in Section 2.4 use this rate alone; they do not depend on class labels, and the rate quantifies the chance that any given output lacks validity.
Taken together, these archetypes point to a broad invalidation family, potentially with more members, underscoring the need for evaluation suites and mitigation strategies that address the full spectrum of failure modes. Figure 1 schematizes this superset-subset relationship.
1.2 Invalidation as a Universal Feature of Human and Artificial Cognition
Invalidation in contemporary LLM outputs is not an isolated flaw unique to artificial intelligence but mirrors well-documented behaviors in human discourse. This similarity suggests that invalidation is not simply a by-product of autoregressive sampling, but a potentially universal cognitive process rooted in the fundamental nature of information processing through language. It emerges when any complex agent (biological or artificial) operates under uncertainty, bounded rationality, and social constraints.
Human cognition systematically prioritizes narrative coherence over factual accuracy, a tendency that emerges not from individual pathology but from the fundamental architecture of meaning-making itself. When confronted with contradictory evidence, both individuals and groups construct elaborate justifications that preserve existing belief structures rather than revising them \parencite{festinger1957cognitive, cohen2001states}. This preference for coherence manifests across scales: from the micro-level impression management that shapes everyday social interactions \parencite{goffman1959presentation} to the macro-level collective narratives that enable societies to ignore systemic atrocities \parencite{cohen2001states}. Remarkably, this same structural bias toward local coherence over global accuracy appears in large language models, where autoregressive architectures favor maintaining consistency with previous tokens even at the expense of factual correctness. The parallel suggests that invalidation arises not from a flaw in either human or artificial systems, but from a deeper computational trade-off inherent to any agent that must construct meaning from sequential, uncertain information.
The medium itself acts as an epistemic filter, determining not just what information reaches us but what we accept as real, a process that operates identically whether the medium is television, print journalism, or a large language model. Media theorists have long recognized that truth emerges less from content evaluation than from structural repetition: the same claim, encountered repeatedly through trusted channels, eventually sediments into accepted fact regardless of its veracity \parencite{gerbner1976living, mcluhan1964understanding}. This manufacturing of consensus operates through cascading filters (economic incentives, institutional biases, and technological affordances) that systematically amplify certain narratives while suppressing others \parencite{herman1988manufacturing}. Large language models instantiate this same filtering mechanism at an unprecedented scale: their training corpora encode the biases of millions of sources, their attention mechanisms privilege frequently repeated patterns, and their optimization objectives reward fluent reproduction over factual verification. The result is a computational echo chamber where invalidations, once embedded in training data, achieve the same truth-like status through sheer statistical dominance that media repetition grants to human beliefs.
Propaganda operates by exploiting a fundamental vulnerability in epistemic systems: sustained repetition of coordinated falsehoods eventually overwhelms the capacity for empirical verification, creating an alternate reality that becomes self-reinforcing through social proof. This mechanism, which political theorists identify as the cornerstone of totalitarian control, functions by flooding the information environment with internally consistent but externally false narratives until the sheer cognitive cost of maintaining skepticism exceeds most people’s capacity \parencite{arendt1951origins, ellul1965propaganda}. The parallel with large language models is striking: trained on billions of documents where certain false narratives appear thousands of times, these systems internalize misinformation not through ideological commitment but through pure statistical frequency. Just as propaganda succeeds by making lies more cognitively available than truth \parencite{ellul1965propaganda}, LLMs generate invalidations by sampling from probability distributions where well-represented falsehoods outweigh poorly-documented facts. The computational architecture thus recreates, without intention or awareness, the same reality-distortion mechanisms that human propagandists deploy deliberately.
Invalidation emerges from the fundamental computational shortcuts that make complex reasoning tractable; these shortcuts manifest identically in biological neural networks and artificial transformers. The core mechanism is substitution: when faced with difficult questions about truth or probability, both humans and LLMs unconsciously replace them with easier questions about familiarity and similarity \parencite{tversky1974judgment, kunda1990motivated}. This substitution operates through dual channels: the availability heuristic replaces “what is true?” with “what comes easily to mind?”, while the representativeness heuristic replaces “what is probable?” with “what resembles my prototype?” In large language models, these exact substitutions occur mechanistically—the softmax function literally converts truth-seeking into frequency-matching, while attention heads select tokens based on similarity rather than veracity. The result is a convergent failure mode where both human reasoning and machine generation systematically mistake statistical patterns for factual reality, producing confident invalidations that feel true precisely because they align with existing distributions rather than external facts \parencite{kunda1990motivated}.
The cross-disciplinary convergence of these findings implies that invalidation is not an accidental error mode but a structural consequence of how intelligent systems manage complexity, uncertainty, and contradiction. This insight reframes LLM invalidation as a computational echo of cognitive strategies humans employ. Rather than indicating a breakdown of alignment, it may reveal the presence of alignment to socially and contextually shaped heuristics that guide behavior in uncertain conditions.
Addressing invalidation in LLMs will require approaches that incorporate sociological, psychological, and media-theoretical models of belief formation and narrative control. At the same time, observing how LLMs generate invalidations may provide new empirical traction for understanding human cognitive phenomena such as confirmation bias, belief perseverance, and collective denial. The interdependence between artificial and human cognition in this respect suggests that solutions to the propagation of invalidation may emerge not solely from engineering but from a broader inquiry into the structure of meaning-making itself.
1.3 Risks of Invalidation Spill-over Through Discursive Networks
Invalidations produced by large language models rarely remain isolated. Once released into public channels they propagate through interconnected systems of communication. A discursive network is a large-scale ecosystem of human and machine agents whose utterances circulate, reinforce, and mutate through repeated exchange. When an LLM instantiates many synthetic voices that emit high volumes of text, those voices become additional nodes that mediate narrative transmission and amplification.
We use the term discursive network, rather than the established discourse network, to signal a McLuhanian twist. In classical discourse-network analysis the medium is a passive conduit: speeches, papers, and news items are shuffled among human actors while the underlying carrier remains inert. In the networks that include LLMs the carrier intervenes. The medium does not just move the message; it edits, rewrites, and recombines it at every hop. By switching from discourse to discursive we emphasise that language itself is now produced inside the network dynamics, co-authored by the very infrastructure that transmits it; this is a direct extension of McLuhan’s dictum from “the medium is the message” to “the medium acts on the message.”
Discursive networks consist of nodes (agents or messages) and edges (interactions, citations, reshares). Closely related structures appear in political-science discourse-network analysis, where policy actors and speech acts form time-evolving coalitions \parencite{leifeld2014, leifeld2012}. Kittler’s media archaeology likewise foregrounds how technological substrates shape what counts as meaningful speech \parencite{kittler1990}. Our focus on generative models extends these traditions by treating LLM outputs as first-class network nodes.
Humans struggle to separate machine-generated prose from human writing. Controlled studies across genres report identification accuracy only slightly above chance (about 55-65 %) and show that readers rely on fragile surface heuristics such as pronoun frequency and stylistic fluency \parencite{ippolito2020, DBLP:conf/acl/GehrmannSR19}. In a discursive network this limitation matters: synthetic invalidations, mistaken for human statements, are more readily reposted or cited.
Once injected, invalid content spreads via echo-amplification. Research on social-media communities finds that clusters organized in bow-tie topologies (tightly knit cores with radiating peripheries) amplify low-quality or misleading messages more than high-quality ones \parencite{garimella2018}. Agents in these clusters can absorb and relay invalidations, algorithmic or human, without verification, entrenching flawed narratives.
Detection tools provide only partial relief. State-of-the-art AI detectors fall below 80 % accuracy on paraphrased or adversarial samples and show biases toward false positives on human text \parencite{huang2024robust, tufts2024practical, sadasivan2025reliably}. Human moderators fare little better, typically around 65 % in blinded settings \parencite{ippolito2020}. Because both machines and people misclassify a substantial share of content, invalidations re-enter discourse with minimal friction.
Mitigating spill-over therefore demands a synthesis of discourse-network analysis, community-structure research, and detection studies. High-volume clusters of unverified claims need tracing, and interventions should target influential nodes, always within robust privacy and governance constraints. Recognising synthetic agents as network participants reframes classic media theory for the generative era: discursive-network analysis becomes both a descriptive lens on spill-over risk and a practical framework for targeted intervention.
2 Methods
This section lays out the analytical and computational machinery that supports the paper’s argument. We begin by establishing a mathematical floor on invalidation probability, proving that no finite-loss LLM can achieve zero error rate (Section 2.1), and then show why verification is information-theoretically easier than generation (Section 2.2). We then categorize user-model interactions into three functional classes (critique, ideation, and product-oriented generation) and show why critique operations are computationally cheapest, residing near the peak of the model’s output distribution (Section 2.3).
Building on these foundations, we formalize discourse as a network whose nodes exchange and validate statements under well-defined update rules (Section 2.4). We analyze three progressively richer belief dynamics (Section 2.5): first, a single-network model with binary truth states, proving convergence to a unique equilibrium determined by flip probabilities; second, an extended model incorporating spontaneous invalidation generation, showing how fabrication rates push systems into error-dominant regimes; and third, a dual-network model with cross-detection, deriving conditions under which external scrutiny reduces aggregate error below single-network baselines.
These theoretical results culminate in practical design principles: we derive the minimum number of cross-checking agents needed to achieve any target error tolerance (Section 2.5.5) and present the Flaws-of-Others (FOO) algorithm that implements these detection mechanisms in software, complete with cryptographic integrity verification to prevent post-hoc tampering (Section 2.6). Together, these methods provide both the mathematical framework for understanding invalidation propagation and the computational tools for mitigating it in practice.
2.1 Floor on invalidation probability
This subsection proves that a strictly positive invalidation probability is unavoidable for any LLM whose cross-entropy loss on its training distribution remains finite.
A discursive network has a particular standard, e.g. factual accuracy, logical coherence, a set of safety rules, etc. Every sequence that meets this standard is called “valid.” Whenever a sequence breaks the standard, we have an “invalidation.” In symbols, the indicator $V(x) = 1$ announces that sequence $x$ is valid, and the collection $\mathcal{I} = \{x : V(x) = 0\}$ captures every possible invalidation.
The training corpus itself, represented by a probability distribution $p$, might be imperfect; a fraction $\varepsilon := p(\mathcal{I})$ of its mass lies inside $\mathcal{I}$. This fraction measures how much invalidating content appears in the training data, e.g. factual errors in bibliographic sources, biased statements in news sources, outdated information in historical documents, logical contradictions in discussion forums, etc.
After training the language model on this corpus (including pre-training via next-token prediction, instruction tuning, and safety fine-tuning), we obtain a model whose learned distribution is $q_\theta$. The central question is how much probability this trained model still places on $\mathcal{I}$, i.e. what chance it retains of producing an invalidation despite all the training improvements applied.
Empirical studies provide sobering context: medical-domain evaluations report factual errors in the 15–30 % range \parencite{thirunavukarasu2023large, sallam2023chatgpt}, while adversarial assessments find 3–5 % harmful outputs even after safety training \parencite{zou2023universal, chao2023jailbreaking}. These persistent rates reflect structural limitations that no training regime can fully overcome.
Given $D := D_{\mathrm{KL}}(p \,\|\, q_\theta)$, the Kullback-Leibler divergence from the reference distribution $p$ to the model distribution $q_\theta$, we can state the following lemma.
Lemma 2.1 (Invalidation floor).
If $\varepsilon := p(\mathcal{I}) > 0$ and $D := D_{\mathrm{KL}}(p \,\|\, q_\theta) < \infty$, then
$$q_\theta(\mathcal{I}) \;\ge\; \varepsilon\, e^{-D/\varepsilon}.$$
The condition $p \ll q_\theta$, implicit in the finiteness of $D$, is known as absolute continuity: whenever the trained model assigns zero probability to a sequence, the training data also assigns zero probability to that sequence. More formally, $p(x) = 0$ whenever $q_\theta(x) = 0$, and hence $q_\theta(x) > 0$ whenever $p(x) > 0$.
This inequality says that the residual invalidation probability $q_\theta(\mathcal{I})$ cannot be driven to zero unless either the corpus is completely free of invalidations ($\varepsilon = 0$) or the model’s divergence from the corpus becomes infinite, which would imply infinite loss. The floor can be exponentially small when the model departs sharply from the training data through extensive fine-tuning, yet it never vanishes entirely.
Proof.
Expand the KL divergence and partition the sum by membership in $\mathcal{I}$:
$$D \;=\; \sum_{x \in \mathcal{I}} p(x)\log\frac{p(x)}{q_\theta(x)} \;+\; \sum_{x \notin \mathcal{I}} p(x)\log\frac{p(x)}{q_\theta(x)}.$$
Since KL divergence terms are non-negative, we have
$$D \;\ge\; \sum_{x \in \mathcal{I}} p(x)\log\frac{p(x)}{q_\theta(x)}.$$
Apply the log-sum inequality (for positive numbers $a_i, b_i$ with $a = \sum_i a_i$ and $b = \sum_i b_i$, we have $\sum_i a_i \log\frac{a_i}{b_i} \ge a \log\frac{a}{b}$) to the right-hand side. Setting $a_i = p(x)$ and $b_i = q_\theta(x)$ for $x \in \mathcal{I}$, we get:
$$\sum_{x \in \mathcal{I}} p(x)\log\frac{p(x)}{q_\theta(x)} \;\ge\; \varepsilon \log\frac{\varepsilon}{q_\theta(\mathcal{I})}.$$
Therefore $D \ge \varepsilon \log\frac{\varepsilon}{q_\theta(\mathcal{I})}$. Solving for $q_\theta(\mathcal{I})$ yields:
$$q_\theta(\mathcal{I}) \;\ge\; \varepsilon\, e^{-D/\varepsilon}.$$
∎
The inequality is a logical guard-rail: when $D$ is large the numerical floor may be tiny, yet it proves that any finite-loss model retains strictly positive invalidation probability. External verification layers are therefore indispensable.
The practical consequence is inescapable: any errors, contradictions, or policy violations that survive in the training data leave an indelible statistical trace in the model. However sophisticated the training regimen (whether through reinforcement learning from human feedback, constitutional AI, or advanced safety fine-tuning), $q_\theta(\mathcal{I})$ remains strictly positive, so some risk of invalid output persists. This mathematical inevitability motivates our investigation of asymmetric task difficulties (Section 2.3), where we show that detecting invalidations is computationally easier than avoiding them during generation.
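To make the floor concrete, the following minimal sketch (Python; the contamination rates and divergence budgets are hypothetical illustrations, not measurements of any model) evaluates the bound $\varepsilon\, e^{-D/\varepsilon}$ for a few parameter combinations.

```python
import math

def invalidation_floor(eps: float, kl: float) -> float:
    """Lower bound on q_theta(I) from Lemma 2.1: eps * exp(-KL / eps)."""
    return eps * math.exp(-kl / eps)

# Hypothetical corpus contamination rates (eps) and KL budgets (in nats).
for eps in (0.30, 0.15, 0.05):
    for kl in (0.1, 1.0, 5.0):
        print(f"eps={eps:.2f}  KL={kl:.1f}  floor={invalidation_floor(eps, kl):.2e}")
```

Even the smallest of these illustrative floors is strictly positive, which is the only property the argument relies on.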
From Inevitability to Mitigation.
Lemma 2.1 establishes that $q_\theta(\mathcal{I}) \ge \varepsilon\, e^{-D/\varepsilon} > 0$ whenever the training loss is finite, i.e. invalidation is mathematically inevitable. This floor exists because LLMs maximize likelihood over their training distribution $p$, which itself contains errors at rate $\varepsilon$. The bound shows that even aggressive fine-tuning (large KL divergence) cannot eliminate invalidations entirely.
This inevitability motivates a strategic pivot: rather than pursuing the impossible goal of zero invalidation through model improvement alone, we must design systems that detect and correct errors post-generation. The key insight is that different types of LLM outputs have different amenability to verification. As we show next, verification tasks (e.g. asking models to identify flaws in existing text) lie in high-probability regions of the output distribution and thus can be generated reliably even by imperfect models. This asymmetry between generation difficulty and verification difficulty forms the theoretical foundation for our multi-agent approach: we harness the relative ease of critique to construct networks where mutual verification pushes system-wide error rates below what any individual model achieves.
2.2 Information-Theoretic Basis for Verification Advantage
The inevitability of invalidations established in Lemma 2.1 raises a crucial question: if generation necessarily produces errors, can we at least detect them reliably? We now prove that verification is fundamentally easier than generation, providing the theoretical foundation for our multi-agent approach.
Theorem 2.1 (Variable-Length Entropy Comparison).
Let $G$ and $V$ be generation and verification tasks. Let:

- $(Y_G, L_G)$ and $(Y_V, L_V)$: joint random variables for length and content;
- $P(Y_G \mid L_G)$ and $P(Y_V \mid L_V)$: the conditional content distributions;
- $P(L_G)$ and $P(L_V)$: the length distributions.

Then (with all logarithms base 2, giving entropy in bits):
$$H(Y_G, L_G) - H(Y_V, L_V) \;=\; \bigl[H(L_G) - H(L_V)\bigr] \;+\; \bigl[H(Y_G \mid L_G) - H(Y_V \mid L_V)\bigr],$$
where the conditional entropy is:
$$H(Y_G \mid L_G) \;=\; \sum_{\ell} P(L_G = \ell)\, H(Y_G \mid L_G = \ell),$$
and similarly for $H(Y_V \mid L_V)$.
Proof.
By the chain rule for joint entropy:
$$H(Y_G, L_G) \;=\; H(L_G) + H(Y_G \mid L_G).$$
Similarly for $V$:
$$H(Y_V, L_V) \;=\; H(L_V) + H(Y_V \mid L_V).$$
Subtracting the second equation from the first yields the stated result. ∎
Remark 2.1 (Why General Theorems Fail).
We cannot prove that verification is universally easier than generation because:
1. Length-content dependence: in autoregressive models, length often encodes information (e.g., “yes” vs. detailed explanations).
2. Unobservable distributions: we cannot measure the full tail of the generation or verification output distributions empirically, making entropy estimates unreliable.
3. Task-specific constraints: the set of valid sequences varies dramatically by domain and cannot be bounded universally.
Any rigorous claim must be task-specific and empirically grounded.
For specific task pairs where we can measure output distributions empirically, one can define an observable concentration ratio: the probability mass captured by the top-ranked outputs relative to the remaining tail mass, where $p_{(1)} \ge p_{(2)} \ge \cdots$ are the ranked probabilities of the distribution and a small positive constant in the denominator prevents division by zero.
Empirical evidence suggests verification outputs are more concentrated than generation outputs:
- Generation diversity: \textcite{hashimoto2019unifying} measured that GPT-2’s generation coverage (fraction of human-written continuations assigned high probability) is only 15-20%, indicating that probability mass is spread across many valid but unseen completions. \textcite{holtzman2020curious} further showed that nucleus sampling with a high cumulative-probability threshold is needed to achieve human-like text diversity, confirming that generation probability is dispersed across a large tail.
- Verification concentration: \textcite{schick2021exploiting} found that when prompted for binary classification, over 90% of GPT-3’s probability mass concentrates on the top 2-3 tokens (e.g., “Yes”/“No” plus punctuation variants). \textcite{min2022rethinking} demonstrated that in-context learning for verification tasks achieves near-peak performance with just label tokens, suggesting the output distribution is highly peaked on a small vocabulary subset.
- Direct comparison: \textcite{kadavath2022language} showed that language models’ self-evaluation of their own outputs clusters around confidence values of 0.1, 0.5, and 0.9 (high concentration), while their actual answer distribution spans hundreds of phrasings (low concentration). This asymmetry between evaluation and generation distributions supports our theoretical framework.
This observable difference in concentration, while not universal, appears consistently enough to motivate verification-based error reduction strategies.
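As a toy numerical illustration of this asymmetry (the two distributions below are invented for the example and do not come from any measured model), the following sketch compares the Shannon entropy of a peaked, verification-like output distribution with that of a dispersed, generation-like one.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Invented output distributions: verification mass piles onto a few label tokens,
# while generation mass spreads over a long tail of plausible continuations.
verification = [0.93, 0.05, 0.02]
generation = [0.02] * 10 + [0.008] * 100   # sums to 1.0 by construction

print(f"verification entropy: {entropy_bits(verification):.2f} bits")
print(f"generation entropy:   {entropy_bits(generation):.2f} bits")
```

The peaked distribution yields well under one bit of entropy, while the dispersed one yields several bits, mirroring the concentration gap reported in the studies above.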
Implications for Discursive Networks.
The empirical concentration differences documented above provide practical justification for why cross-agent critique can achieve detection rates that exceed invalidation rates . When LLMs are tasked with verification—e.g., “find flaws” in peer outputs—they operate in the high-concentration regime where dominant patterns from training data guide responses. This contrasts with generation tasks that require exploring the long tail of the output distribution. While we cannot prove a universal entropy gap without task-specific assumptions (as shown in Theorem 2.1), the consistent empirical pattern of verification concentration exceeding generation concentration suggests that detection rates can systematically exceed fabrication rates in practice. The FOO algorithm exploits this empirical regularity to achieve system-wide error reduction, even when individual agents remain fallible in generation.
2.3 Categorization of Mechanisms to Engage LLM Agents
We now examine how different engagement mechanisms map onto the information-theoretic landscape established above. During inference, a Transformer language model executes a single forward pass per token. The model transforms context into embeddings, applies fixed self-attention layers to produce a hidden state $h_t$, and maps this to a token probability distribution via:
$$P(y_t \mid y_{<t}) \;=\; \operatorname{softmax}(W h_t).$$
No gradient updates or objective-function evaluations occur at this stage; the only on-the-fly “optimization” is the decoding heuristic (greedy, top-$k$, nucleus, or beam search) that selects the next token from the static distribution $P(y_t \mid y_{<t})$.
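The decoding step just described can be sketched as follows; the five-token vocabulary and the logits are made up for illustration, and real systems operate on full-vocabulary score tensors rather than small lists.

```python
import math
import random

def softmax(logits):
    """Convert raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

vocab = ["the", "a", "cat", "sat", "<eos>"]    # toy vocabulary
logits = [2.1, 1.3, 0.2, -0.5, -1.0]           # hypothetical scores W h_t for one step
probs = softmax(logits)

greedy = vocab[probs.index(max(probs))]                          # greedy decoding
top_k = sorted(zip(vocab, probs), key=lambda t: -t[1])[:2]       # top-k truncation, k = 2
total = sum(p for _, p in top_k)
sampled = random.choices([w for w, _ in top_k],
                         [p / total for _, p in top_k])[0]       # sample within the truncated set

print("distribution:", dict(zip(vocab, (round(p, 3) for p in probs))))
print("greedy token:", greedy, "| top-k sample:", sampled)
```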
LLMs support a range of functional output types that can be systematically grouped into three core categories we introduce in this manuscript: ideation (constructive synthesis), critique (diagnostic evaluation), and product-oriented generation (goal-directed deliverables). These three umbrella functions organize diverse tasks, each with distinct inference properties, compositional demands, and distributional positions in the model’s output space. Existing literature identifies several subtypes that map naturally onto these categories \parencite{bommasani2021opportunities, mialon2023augmented}.
Critique output is structurally efficient to generate. Nested within critique are:
- Flaws of others. This approach favors the strategy of asking for flaws or errors in the outputs of other agents, resulting in constructions that are frequent in training corpora and thus placed near the peak of $P(y_t \mid y_{<t})$. Even shallow decoding heuristics retrieve frequent patterns with high fluency and relevance \parencite{wei2022chain}.
- Classification and Disambiguation, such as assigning sentiment, stance, or intent. These tasks resolve ambiguity and often underlie evaluation pipelines \parencite{mialon2023augmented}.
- Restatement and Summarization, which surface structural coherence or hidden biases by rephrasing or compressing content. When used diagnostically, they reveal implicit assumptions or inconsistencies \parencite{maynez2020faithfulness}.
Ideation output demands compositional novelty. Prompts that ask the model to hypothesize mechanisms, imagine alternatives, or propose designs typically land in the tail of the output distribution. Generating them requires broader exploration (via large beam width or elevated temperature) and exhibits greater output variance.
Within ideation, we find:
- Instruction and Procedural Guidance, where the model scaffolds user understanding or explains concepts in sequence. These tasks require didactic clarity and often invoke implicit audience modeling \parencite{ouyang2022training}.
- Meta-Reasoning and Strategy Output, which includes multi-step planning, evaluating hypotheses, or chain-of-thought reasoning. These outputs require recursive coherence and longer dependency tracking \parencite{wei2022chain}.
Product-oriented output targets the generation of external artifacts: source code, formatted markup, structured data, or interactive dialogue. These tasks often carry hard constraints and precision demands. Simple forms (e.g., boilerplate code) reside in high-probability zones, while structurally complex or compositional outputs require deeper exploration.
Included in this class are:
- Formalism Translation, such as converting text to JSON, SQL, or LaTeX. This requires syntax-aligned generation and tight coupling between prompt and output form \parencite{reynolds2021prompt}.
- Retrieval-Simulation, where the model reproduces facts or references learned during pretraining. These outputs appear fluent but are not grounded in current truth, making them useful but epistemically fragile \parencite{bommasani2021opportunities}.
- Social Interaction Simulation, which includes emulating customer support, roleplay, or therapeutic dialogue. These are product-like in that the output is consumed as experience or interface, and they require tone, persona, and context alignment \parencite{jo2025proxyllm, park2023generative, song2024typing}.
Crucially, requests to “find flaws” tend to align with high-probability lexical patterns that the model has seen many times during training (e.g., “One limitation is…,” “A potential confound is …,” “This argument assumes …”). These stigmergic patterns, i.e. emerging from indirect communication mediated by modifications of the environment \parencite{MARSH2008136}, lie near the mode of $P(y_t \mid y_{<t})$, so they are reachable with minimal search depth and are often found by even the cheapest heuristic, such as greedy decoding.
By contrast, requests for constructive, future-oriented solutions typically require compositional novelty: the model must synthesize domain facts, propose unseen mechanisms, and articulate actionable steps. Such completions reside in lower-probability regions of the distribution, forcing the decoder to explore a broader beam or to sample deeper into the tail, both of which are algorithmically and computationally more demanding. In short, critique lives near the peak; creativity lives in the tail, explaining the empirical asymmetry in generation efficiency that we observe.
2.4 Discursive Network Formalization
To systematically study the phenomenon of invalidation, we propose a formal model that quantifies how invalidations propagate within a discursive network. This model considers actors as nodes in a network, with edges representing the exchange of statements. The goal is to understand how actors influence each other, how invalidations spread, and whether and how the system reaches an equilibrium state. The subsequent sections detail the analytical machinery, yielding quantitative information, for studying invalidation in both human and artificial contexts.
Definition 2.1.
Discourse. In the context of a discursive network, discourse refers to the structured process of communication and interaction between actors, $A$, through which they exchange, validate, invalidate, and attempt to persuade each other regarding the truth or falsity of a set of statements $S$. Discourse encompasses all forms of communication between actors, where beliefs are shared, challenged, or reinforced, as well as the mechanisms of invalidation $I$ and persuasion $P$, which influence the evolution of each actor’s belief set. The outcome of discourse is governed by the update rules $U$, which dictate how actors revise their beliefs based on the interactions they engage in.
Definition 2.2.
Discursive Network. A discursive network is a formal structure $\mathcal{D} = (A, S, P, I, C, B, U, G)$ where:

- $A$ is the set of actors participating in the discourse.
- $S$ is the set of possible statements, where each statement can be either true or false.
- $P$ represents the persuasion functions, where $P_{ij}(s)$ gives the likelihood that actor $a_j$ will adopt statement $s$ after receiving communication from actor $a_i$.
- $I$ denotes invalidations, where $I_{ij}(s, s')$ signifies actor $a_i$ invalidating a statement $s$ held by actor $a_j$ using a contradictory statement $s'$.
- $C$ represents the communications between actors, where $C_{ij} \subseteq S$ is the set of statements communicated from actor $a_i$ to actor $a_j$.
- $B$ represents the belief sets of the actors, where $B_i \subseteq S$ denotes the set of statements believed to be true by actor $a_i$.
- $U$ is the set of update rules that define how each actor’s belief set is modified in response to communications and invalidations.
- $G$ represents the goal functions of the actors, with $G_i \subseteq S$ specifying the set of statements actor $a_i$ seeks to convince other actors to believe.
The discursive network models the dynamics of belief formation, communication, persuasion, and invalidation among actors within a formal discourse setting.
Example. Consider a simple scenario with three actors $A = \{a_1, a_2, a_3\}$ and two statements $S = \{s, \neg s\}$. Actor $a_1$ believes $s$ ($B_1 = \{s\}$) and wants $a_2$ and $a_3$ to also believe $s$ ($G_1 = \{s\}$). Actor $a_2$ believes $\neg s$ ($B_2 = \{\neg s\}$) and wants $a_1$ and $a_3$ to believe $\neg s$ ($G_2 = \{\neg s\}$). Actor $a_3$ is initially neutral ($B_3 = \emptyset$). Actor $a_1$ communicates $s$ to $a_2$, who invalidates it by presenting $\neg s$. Actor $a_3$, observing this interaction, updates their belief set based on the persuasion functions and update rules. This framework models the propagation of invalidation within a discursive network, capturing the dynamics of belief, communication, and influence. By formalizing these interactions, we can analyze and predict how invalidation affects the acceptance and rejection of statements among actors in the network.
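A minimal computational sketch of this example follows. The class and function names are illustrative, and the update rule shown (adopt a statement on a successful persuasion draw, swap beliefs on invalidation) is a deliberately simplified stand-in for the general rule set $U$.

```python
import random
from dataclasses import dataclass, field

@dataclass
class Actor:
    name: str
    beliefs: set = field(default_factory=set)   # B_i
    goals: set = field(default_factory=set)     # G_i

def persuade(sender: Actor, receiver: Actor, statement: str, p_adopt: float) -> None:
    """Communication C_ij followed by the persuasion function P_ij(s)."""
    if random.random() < p_adopt:
        receiver.beliefs.add(statement)

def invalidate(sender: Actor, receiver: Actor, statement: str, counter: str) -> None:
    """Invalidation I_ij(s, s'): replace s with the contradictory s' in the receiver's beliefs."""
    receiver.beliefs.discard(statement)
    receiver.beliefs.add(counter)

# The three-actor example: a1 holds s, a2 holds not-s, a3 is neutral.
a1 = Actor("a1", beliefs={"s"}, goals={"s"})
a2 = Actor("a2", beliefs={"not_s"}, goals={"not_s"})
a3 = Actor("a3")

persuade(a1, a2, "s", p_adopt=0.2)      # a1 communicates s to a2
invalidate(a2, a1, "s", "not_s")        # a2 answers with the contradiction
persuade(a2, a3, "not_s", p_adopt=0.6)  # a3 observes the exchange and may update
print(a1.beliefs, a2.beliefs, a3.beliefs)
```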
2.5 Modeling Discursive Networks
2.5.1 Single-Network Two-State Model
Let $N$ be the actor count in the collapsed network with two mutually exclusive statements $S = \{s_0, s_1\}$, where $s_0$ denotes a true statement and $s_1$ a false one. For comparison with empirical simulations we work exclusively with proportions. The population state at time $t$ is the column vector
$$\pi_t \;=\; \begin{pmatrix} \pi_0(t) \\ \pi_1(t) \end{pmatrix} \;=\; \begin{pmatrix} n_0(t)/N \\ n_1(t)/N \end{pmatrix},$$
where $n_0(t)$ and $n_1(t)$ are the respective counts of actors endorsing $s_0$ and $s_1$. Micro-level flips are characterized by the probabilities $p_{01}$ for $s_0 \to s_1$ and $p_{10}$ for $s_1 \to s_0$. These induce the population-level transition matrix
$$M \;=\; \begin{pmatrix} 1 - p_{01} & p_{10} \\ p_{01} & 1 - p_{10} \end{pmatrix}. \tag{6}$$
The mapping of this model to Definition 2.2 is provided in Table 1.
Element in Definition 2.2 | Instantiation in single-network model
---|---
$A$ | Unchanged actor set; proportions refer to $N = |A|$.
$S = \{s_0, s_1\}$ | Binary, mutually exclusive statements (true vs. false).
$P$ | $p_{01}$ if the receiver holds $s_0$ and the sender holds $s_1$; $p_{10}$ if the receiver holds $s_1$ and the sender holds $s_0$.
$I$ | Contradiction if the communicated statement differs from the receiver’s current belief.
$C$ | Message containing the sender’s current statement.
$B_i$ | $\{s_0\}$ or $\{s_1\}$ for each actor.
$U$ | Switches the belief with the corresponding probability, otherwise leaves it unchanged.
$G_i$ | Persuade others to adopt actor $a_i$’s current belief.
Lemma 2.2 (Single-network invalidation propagation).
Let the single-network proportion state evolve according to
$$\pi_{t+1} \;=\; M\,\pi_t, \tag{7}$$
with flip probabilities $p_{01}, p_{10} \in (0,1)$. The system has a unique fixed point
$$\pi^* \;=\; \left( \frac{p_{10}}{p_{01}+p_{10}},\; \frac{p_{01}}{p_{01}+p_{10}} \right)^{\!T},$$
and the second eigenvalue of $M$ equals $1 - p_{01} - p_{10}$, whose modulus is strictly smaller than $1$; hence the Markov chain converges geometrically to $\pi^*$ from any initial distribution.
Proof.
A fixed point satisfies $M\pi^* = \pi^*$. Writing $\pi^* = (\pi_0^*, \pi_1^*)^T$ and expanding gives
$$(1 - p_{01})\pi_0^* + p_{10}\pi_1^* = \pi_0^*, \qquad p_{01}\pi_0^* + (1 - p_{10})\pi_1^* = \pi_1^*.$$
Because $\pi_0^* + \pi_1^* = 1$, the first line reduces to
$$p_{01}\pi_0^* = p_{10}\pi_1^* \quad\Longrightarrow\quad \pi_0^* = \frac{p_{10}}{p_{01}+p_{10}}, \qquad \pi_1^* = \frac{p_{01}}{p_{01}+p_{10}}.$$
Thus the fixed point is unique. The characteristic polynomial of $M$ is $(1 - p_{01} - \mu)(1 - p_{10} - \mu) - p_{01}p_{10}$, whose roots are $\mu_1 = 1$ and $\mu_2 = 1 - p_{01} - p_{10}$. Because $p_{01}, p_{10} \in (0,1)$, we have $|\mu_2| < 1$; hence $\pi_t \to \pi^*$ as $t \to \infty$, so every trajectory converges to $\pi^*$. ∎
Interpretation of Lemma 2.2.
In this binary model the two flip probabilities satisfy $p_{01} + p_{10} = 1$, meaning every update attempt switches an actor’s belief with probability one. The fixed point then simplifies to $\pi^* = (p_{10}, p_{01})^T$: the long-run proportion of actors endorsing $s_0$ equals the single parameter $p_{10}$, while the proportion endorsing $s_1$ equals $p_{01}$. Thus the equilibrium distribution mirrors the flip probabilities directly; increasing $p_{10}$ (the propensity to abandon $s_1$) linearly increases the eventual share of $s_0$ believers and decreases that of $s_1$ believers by the same amount.
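A short numerical check of Lemma 2.2 follows; the flip probabilities are illustrative placeholders rather than calibrated values. Iterating $\pi_{t+1} = M\pi_t$ reproduces the closed-form fixed point.

```python
def step(pi, p01, p10):
    """One application of the transition matrix M of Eq. (6) to the proportion state (pi0, pi1)."""
    pi0, pi1 = pi
    return ((1 - p01) * pi0 + p10 * pi1,
            p01 * pi0 + (1 - p10) * pi1)

p01, p10 = 0.30, 0.70      # hypothetical flip probabilities
pi = (1.0, 0.0)            # start with every actor endorsing the true statement
for _ in range(200):
    pi = step(pi, p01, p10)

fixed_point = (p10 / (p01 + p10), p01 / (p01 + p10))
print("simulated:", tuple(round(x, 4) for x in pi))
print("predicted:", tuple(round(x, 4) for x in fixed_point))
```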
2.5.2 Single-Network Emergent Invalidation Model
The two-state single-network model sets the stage for the analysis of the emergence of invalidations in a discursive network. To accomplish this, we first endow a single discursive network with per-statement fabrication and internal correction. This captures the behaviour of a single-instance LLM generating new text: invalidations are injected at hazard $\lambda$, while subsequent self-reflections (or post-processing heuristics) invalidate a fraction of the false statements at hazard $p_{10}$.
Setup.
Let the network be $\mathcal{D}$ with actor set $A$ and $N = |A|$. At any time $t$, we track the counts $n_0(t)$ and $n_1(t)$ of actors endorsing true and false statements, respectively, with $n_0(t) + n_1(t) = N$.
Working with raw counts becomes cumbersome when comparing networks of different sizes or analyzing asymptotic behavior. By converting to proportions, we obtain: (i) Scale invariance: Networks with 100 or 10,000 actors can be compared directly, (ii) Probabilistic interpretation: Proportions represent the probability that a randomly selected actor holds a given belief, and (iii) Mathematical tractability: Fixed-point analysis and stability results are cleaner in normalized coordinates.
Definition 2.3 (Normalized state).
The proportion state (or normalized state) of the single network at time $t$ is the vector
$$\pi_t \;=\; \bigl(\pi_0(t),\, \pi_1(t)\bigr)^T \;=\; \bigl(n_0(t)/N,\; n_1(t)/N\bigr)^T,$$
where $\pi_0(t)$ and $\pi_1(t)$ represent the fractions of actors endorsing true and false statements, respectively. The constraint $\pi_0(t) + \pi_1(t) = 1$ is automatically preserved, reflecting that every actor holds exactly one belief at each time step.
Stochastic primitives.
Events are scaled per statement so that $\lambda$, $p_{01}$, and $p_{10}$ remain commensurate.
Fabrication (invalidation).
In this case, $F_t$ represents the number of new falsehoods generated. The Poisson distribution is used to model the number of events occurring in a fixed interval of time, given a known average rate. Each true statement is independently falsified during $(t, t+1]$:
$$F_t \;\sim\; \mathrm{Poisson}\!\bigl(\lambda\, n_0(t)\bigr).$$
Internal flips.
$D_t$ represents the number of true statements that become false with a fixed probability $p_{01}$ of becoming false (i.e. $p_{01}$ is the intrinsic truth→false hazard). $R_t$ represents the number of false statements that are corrected to become true with a fixed probability $p_{10}$ of being corrected ($p_{10}$ models spontaneous acknowledgement or repair). Both follow a Binomial distribution. Truths can degrade and falsehoods can self-correct:
$$D_t \;\sim\; \mathrm{Binomial}\!\bigl(n_0(t),\, p_{01}\bigr), \qquad R_t \;\sim\; \mathrm{Binomial}\!\bigl(n_1(t),\, p_{10}\bigr).$$
Update equations.
Define the scaled increments $f_t = F_t/N$, $d_t = D_t/N$, and $r_t = R_t/N$. Then
$$\pi_0(t+1) \;=\; \pi_0(t) - f_t - d_t + r_t, \tag{8}$$
$$\pi_1(t+1) \;=\; \pi_1(t) + f_t + d_t - r_t, \tag{9}$$
with $\pi_0(t) + \pi_1(t) = 1$ preserved.
Element in Definition 2.2 | Instantiation in emergent-invalidation model
---|---
$A$ | Fixed actor set; proportions refer to $N = |A|$.
$S = \{s_0, s_1\}$ | Binary statements ($s_0$ = true, $s_1$ = false).
$P$ | Spontaneous flips: $p_{01}$ for $s_0 \to s_1$ (appears in $D_t$), $p_{10}$ for $s_1 \to s_0$ (appears in $R_t$).
$I$ | Internal invalidation; realized as self-correction when $s_1 \to s_0$. (No cross-actor invalidation in the single-network setting.)
$C$ | Message from $a_i$ to $a_j$ containing $a_i$’s current statement.
$B_i$ | Current belief of actor $a_i$: $\{s_0\}$ or $\{s_1\}$.
$U$ | Update rule applying the three hazards: $F_t$ (fabrications), $D_t$ (truth → false), $R_t$ (false → true).
$G_i$ | Goal: persuade all other actors to adopt $B_i$.
Lemma 2.3 (Single-network invalidation with fabrication).
Let the single-network proportion state evolve according to
$$\pi_{t+1} \;=\; M_\lambda\,\pi_t,$$
where $p_{01}, p_{10} \in (0,1)$ are the intrinsic flip probabilities, $\lambda$ is the per-statement fabrication probability ($0 \le \lambda$ and $p_{01} + \lambda < 1$), and
$$M_\lambda \;=\; \begin{pmatrix} 1 - p_{01} - \lambda & p_{10} \\ p_{01} + \lambda & 1 - p_{10} \end{pmatrix}.$$
Then:
1. The system has a unique fixed point
$$\pi^* \;=\; \left( \frac{p_{10}}{p_{01} + \lambda + p_{10}},\; \frac{p_{01} + \lambda}{p_{01} + \lambda + p_{10}} \right)^{\!T}.$$
2. The second eigenvalue of $M_\lambda$ is $1 - p_{01} - \lambda - p_{10}$, whose modulus is strictly smaller than $1$; hence the Markov chain converges geometrically to $\pi^*$ from any initial distribution.
Proof.
A fixed point satisfies $M_\lambda\pi^* = \pi^*$. Writing $\pi^* = (\pi_0^*, \pi_1^*)^T$ and expanding gives
$$(1 - p_{01} - \lambda)\pi_0^* + p_{10}\pi_1^* = \pi_0^*, \qquad (p_{01} + \lambda)\pi_0^* + (1 - p_{10})\pi_1^* = \pi_1^*.$$
Because $\pi_0^* + \pi_1^* = 1$, the first line reduces to $(p_{01} + \lambda)\pi_0^* = p_{10}\pi_1^*$, which yields the fixed point in the statement. The characteristic polynomial of $M_\lambda$ is $(1 - p_{01} - \lambda - \mu)(1 - p_{10} - \mu) - (p_{01} + \lambda)p_{10}$, with roots $\mu_1 = 1$ and $\mu_2 = 1 - p_{01} - \lambda - p_{10}$. Since $0 < p_{01} + \lambda + p_{10} < 2$, we have $|\mu_2| < 1$; therefore $\pi_t \to \pi^*$ as $t \to \infty$, so every trajectory converges to $\pi^*$. ∎
Interpretation.
The fabrication term $\lambda$ simply augments the ordinary truth-to-false hazard $p_{01}$. Consequently the equilibrium share of false believers rises from $p_{01}/(p_{01}+p_{10})$ (when $\lambda = 0$) to $(p_{01}+\lambda)/(p_{01}+\lambda+p_{10})$, while the speed of convergence slows as the spectral gap narrows. Fabrication raises the inflow into the false state by $\lambda$, while internal invalidation $p_{10}$ is unchanged. If $p_{01} + \lambda > p_{10}$ the system becomes “invalidation-dominant,” mimicking an LLM that generates more new errors than it self-repairs, an empirical regime reported in LLMs (cf. Section 3).
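The following sketch runs a small Monte-Carlo version of the dynamics in Eqs. (8)-(9) and compares the resulting false share with the closed-form equilibrium of Lemma 2.3. The hazard values are placeholders chosen only to land in the invalidation-dominant regime, and the per-actor Bernoulli draws are a simplified stand-in for the Poisson/Binomial primitives above.

```python
import random

def simulate(N=1000, T=500, p01=0.02, p10=0.2, lam=0.3, seed=0):
    """Monte-Carlo run of fabrication + internal flips; returns the final false share."""
    rng = random.Random(seed)
    n0 = N                      # all actors start endorsing the true statement
    for _ in range(T):
        n1 = N - n0
        F = sum(rng.random() < lam for _ in range(n0))   # fabrications (Bernoulli stand-in for Poisson)
        D = sum(rng.random() < p01 for _ in range(n0))   # true -> false slips
        R = sum(rng.random() < p10 for _ in range(n1))   # false -> true repairs
        n0 = max(0, min(N, n0 - F - D + R))
    return (N - n0) / N

p01, p10, lam = 0.02, 0.2, 0.3
predicted = (p01 + lam) / (p01 + lam + p10)
print("simulated false share:", round(simulate(p01=p01, p10=p10, lam=lam), 3))
print("predicted equilibrium:", round(predicted, 3))
```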
2.5.3 Cross-Network Invalidation-Detection Model
Now that we have studied single networks with and without spontaneous invalidation emergence, the next natural question is how to reduce invalidations. As we will see, using multiple discursive networks reduces invalidations.
Let the two discursive networks be $\mathcal{D}_1$ and $\mathcal{D}_2$ with actor sets $A_1$ and $A_2$; write $N_k = |A_k|$ for $k \in \{1, 2\}$.
Definition 2.4 (Normalized state).
For each network $k \in \{1, 2\}$ the proportion state at time $t$ is
$$\pi^{(k)}_t \;=\; \bigl(\pi^{(k)}_0(t),\, \pi^{(k)}_1(t)\bigr)^T \;=\; \bigl(n^{(k)}_0(t)/N_k,\; n^{(k)}_1(t)/N_k\bigr)^T,$$
where $n^{(k)}_0(t)$ and $n^{(k)}_1(t)$ are the counts of true and false statements, respectively.
Stochastic primitives.
All events are now specified so that their rates are comparable across networks regardless of size. In particular, fabrication is scaled by the current stock of true statements.
Falsehood generation (fabrication).
Each currently true statement in $\mathcal{D}_k$ is independently falsified during $(t, t+1]$. The total number of such events is
$$F^{(k)}_t \;\sim\; \mathrm{Poisson}\!\bigl(\lambda\, n^{(k)}_0(t)\bigr),$$
where $\lambda$ is the per-statement fabrication hazard.
Cross-network detection.
Each false statement in $\mathcal{D}_k$ is noticed by the partner network with probability $\delta$, so
$$X^{(k)}_t \;\sim\; \mathrm{Binomial}\!\bigl(n^{(k)}_1(t),\, \delta\bigr).$$
Internal flips.
True statements spontaneously become false with probability $p_{01}$, and false statements self-correct with probability $p_{10}$:
$$D^{(k)}_t \;\sim\; \mathrm{Binomial}\!\bigl(n^{(k)}_0(t),\, p_{01}\bigr), \qquad R^{(k)}_t \;\sim\; \mathrm{Binomial}\!\bigl(n^{(k)}_1(t),\, p_{10}\bigr).$$
Normalized update equations.
Let $f^{(k)}_t = F^{(k)}_t/N_k$, $d^{(k)}_t = D^{(k)}_t/N_k$, $r^{(k)}_t = R^{(k)}_t/N_k$, and $x^{(k)}_t = X^{(k)}_t/N_k$. Dividing by $N_k$ gives the proportion dynamics
$$\pi^{(k)}_0(t+1) \;=\; \pi^{(k)}_0(t) - f^{(k)}_t - d^{(k)}_t + r^{(k)}_t + x^{(k)}_t, \tag{10}$$
$$\pi^{(k)}_1(t+1) \;=\; \pi^{(k)}_1(t) + f^{(k)}_t + d^{(k)}_t - r^{(k)}_t - x^{(k)}_t, \tag{11}$$
with $\pi^{(k)}_0(t) + \pi^{(k)}_1(t) = 1$ preserved automatically.
Element in Definition 2.2 | Instantiation in the cross-network model
---|---
$A = A_1 \cup A_2$ | Two disjoint actor sets $A_1, A_2$; proportions refer to $N_k = |A_k|$.
$S = \{s_0, s_1\}$ | Binary statements shared by both networks ($s_0$ = true, $s_1$ = false).
$P$ | Persuasion function. Within each network it reduces to constant flip probabilities: $p_{01}$ for $s_0 \to s_1$ (used in $D^{(k)}_t$), $p_{10}$ for $s_1 \to s_0$ (used in $R^{(k)}_t$). Across networks, persuasion acts only via $I$ with success probability $\delta$.
$I$ | Cross-network invalidation: if a statement is false and the receiver belongs to the other network, the statement is detected and flipped with probability $\delta$ (realized through the random variable $X^{(k)}_t$).
$C$ | Message sent from $a_i$ to $a_j$ carrying $a_i$’s current statement. Communication enables both persuasion and detection.
$B_i$ | Belief of actor $a_i$: $\{s_0\}$ or $\{s_1\}$.
$U$ | Update rule for actor $a_i$ that applies, in the following order, (i) fabrication $F^{(k)}_t$, (ii) internal flips using $p_{01}, p_{10}$, (iii) cross-network detection using $\delta$. The parameter $\delta$ therefore lives inside $U$.
$G_i$ | Goal: persuade every other actor (within and across networks) to adopt $B_i$.
Lemma 2.4 (Dual-network invalidation propagation).
Proof.
At equilibrium the expected changes vanish, so . Using the distributional means
and dividing by to convert counts to proportions gives
(13) | ||||
(14) |
Equation (13) yields . Insert this into the normalization to obtain and . Finally, setting from (14) gives the constraint (12) and the stated fixed point. ∎
Interpretation of Lemma 2.4.
The equilibrium proportions reveal clear causal roles for each parameter. The false-statement share in network is
so it scales directly with the per-actor error-generation rate $\lambda$ and inversely with the cross-network detection probability $\delta$. More prolific error creation or weaker cross-scrutiny raises the long-run fraction of false statements.
The true-statement share is
hence it grows with the internal correction probability $p_{10}$ and falls with the internal corruption probability $p_{01}$. A network that corrects errors efficiently (large $p_{10}$) or seldom corrupts truths (small $p_{01}$) achieves a higher equilibrium truth proportion.
Finally, Eq. (12) couples $\delta$ to the flip parameters: if within-network corruption outpaces correction ($p_{01} > p_{10}$), the consistency condition forces a higher fabrication-to-detection ratio, pushing the false share upward unless the partner network compensates with stronger detection (larger $\delta$). Thus the model quantifies an intuitive trade-off: falsehood prevalence is driven by the ratio of error creation to error removal, internally via $p_{10}$ and externally via $\delta$.
2.5.4 Single- vs. Cross-Network Models with Invalidation
Stationary false-statement shares.
The single-network emergent-invalidation model (Lemma 2.3) stabilizes at
$$\pi_1^{\mathrm{single}} \;=\; \frac{p_{01}+\lambda}{p_{01}+\lambda+p_{10}}.$$
For a given network engaged in cross-network detection with partner (Lemma 2.4) the corresponding steady state is
where now abbreviate and .
Here $p_{01}$ is the intrinsic true → false flip probability in $\mathcal{D}_1$; $p_{10}$ is the intrinsic false → true flip probability in $\mathcal{D}_1$; $\lambda$ is the fabrication hazard per true statement in $\mathcal{D}_1$; and $\delta$ is the probability that a false statement in $\mathcal{D}_1$ is detected by $\mathcal{D}_2$.
Lemma 2.5 (Cross-network detection lowers falsehood prevalence).
If
then i.e. coupling to an external detector reduces the steady-state prevalence of false statements.
Proof.
Subtract the two stationary shares:
∎
Interpretation.
External scrutiny () or lighter fabrication pressure () pushes invalidations in the cross-network system below the single-network benchmark. Conversely, when , fabrications outrun detections and the dual system sustains the same or a higher falsehood share than isolation.
2.5.5 How Many Agents Guarantee a Target Falsehood Level?
Lemma 2.6 (Effective correction hazard).
Let a focal discursive network $\mathcal{D}_1$ possess an internal “false → true” correction hazard $p_{10}$, and let it be cross-linked to $n$ partner networks, each supplying an external correction hazard $\delta$. Assuming that (i) internal and external detections act independently, and (ii) every detection channel is memory-less (exponential), the waiting time until a false statement in $\mathcal{D}_1$ is first corrected is
$$T \;=\; \min\{T_0, T_1, \ldots, T_n\}, \qquad T_0 \sim \mathrm{Exp}(p_{10}),\quad T_i \sim \mathrm{Exp}(\delta)\;\; (i = 1, \ldots, n).$$
Consequently, the effective per-statement correction rate is
$$p_{10}^{\mathrm{eff}} \;=\; p_{10} + n\,\delta.$$
Proof.
In the discrete model each false statement faces a single Bernoulli “self-repair” trial per period with probability $p_{10}$. As the period length shrinks to zero, the Binomial → Poisson limit converts this into a Poisson correction stream of rate $p_{10}$, i.e. an exponential clock $T_0 \sim \mathrm{Exp}(p_{10})$.
Each of the $n$ other networks contributes an independent Bernoulli trial with probability $\delta$ per period. Taking the same limit gives $n$ independent Poisson streams of rate $\delta$, or clocks $T_i \sim \mathrm{Exp}(\delta)$, $i = 1, \ldots, n$.
The total waiting time until any clock rings is the minimum
$$T \;=\; \min\{T_0, T_1, \ldots, T_n\}.$$
Because the minimum of independent exponentials is itself exponential with rate equal to the sum of the component rates, we obtain $T \sim \mathrm{Exp}(p_{10} + n\delta)$ and hence $p_{10}^{\mathrm{eff}} = p_{10} + n\delta$. ∎
Lemma 2.7 (Agent requirement for a tolerance ).
Let $\pi_1^{(n)}$ be the asymptotic proportion of false statements in $\mathcal{D}_1$ when it is coupled to the $n$ other networks as above. Then
$$\pi_1^{(n)} \;=\; \frac{p_{01} + \lambda}{p_{01} + \lambda + p_{10} + n\delta}.$$
Consequently, to guarantee $\pi_1^{(n)} \le \tau$ one needs at least
$$n^\star \;=\; \left\lceil \frac{(p_{01} + \lambda)\,\frac{1-\tau}{\tau} \;-\; p_{10}}{\delta} \right\rceil$$
cross-detecting networks (agents).
Proof.
Per Lemma 2.3, for one isolated network the long-run fraction of false statements is
$$\pi_1^{(0)} \;=\; \frac{p_{01}+\lambda}{p_{01}+\lambda+p_{10}}.$$
Coupling $\mathcal{D}_1$ to $n$ partner networks multiplies its “false → true” correction hazard from $p_{10}$ to $p_{10} + n\delta$ per Lemma 2.6. Substituting this effective hazard into the single-network formula gives the new steady state
$$\pi_1^{(n)} \;=\; \frac{p_{01}+\lambda}{p_{01}+\lambda+p_{10}+n\delta}.$$
Imposing the tolerance constraint $\pi_1^{(n)} \le \tau$ and solving the inequality for $n$ yields
$$n \;\ge\; \frac{(p_{01}+\lambda)\,\frac{1-\tau}{\tau} - p_{10}}{\delta}.$$
Taking the ceiling ensures $n$ is an integer. ∎
Interpretation.
External scrutiny scales linearly with the number of partner networks, while internal falsehood production stays fixed. Thus $\pi_1^{(n)}$ decays hyperbolically in $n$, and each additional agent yields diminishing (but still positive) returns in truthfulness.
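The bound of Lemma 2.7 is straightforward to tabulate. The sketch below uses hypothetical hazards (not the calibrated values of Table 4) to show how the required number of cross-detecting agents grows as the tolerance tightens.

```python
import math

def min_agents(p01: float, p10: float, lam: float, delta: float, tau: float) -> int:
    """Smallest n with (p01 + lam) / (p01 + lam + p10 + n*delta) <= tau (Lemma 2.7)."""
    needed = ((p01 + lam) * (1 - tau) / tau - p10) / delta
    return max(0, math.ceil(needed))

# Hypothetical hazards, for illustration only.
p01, p10, lam, delta = 0.02, 0.2, 0.3, 0.6
for tau in (0.20, 0.10, 0.05, 0.01):
    print(f"tau={tau:.2f} -> n* = {min_agents(p01, p10, lam, delta, tau)}")
```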
2.6 FOO Algorithm with Integrity Verification
The Flaws-of-Others (FOO) algorithm instantiates the detection hazard of Lemma 2.5 in software. It couples an arbitrary ensemble of LLM agents (each defined by a back-end model, decoding temperature, and free-text instructions) to a lightweight consensus loop (Algorithm 1 and Fig. 2). Neither the number of agents nor their prompts are fixed: both are read at run time from a simple JSON configuration, so the same engine can mediate anything from a two-model A/B test to a dozen specialized critics.
The FOO algorithm requires trust in the integrity of agent interactions. In collaborative scientific work, the provenance of each contribution becomes essential for reproducibility and accountability. We extend the basic FOO protocol with cryptographic integrity verification.
2.6.1 Core FOO Protocol
The protocol has four phases:
1. Broadcast: an initial user task is broadcast to every active agent. Each agent returns a first-pass answer.
2. Cross-examination (FOO step): every agent receives the instruction “find the flaws in …” followed by all peer answers except its own, and produces a critique. This implements the cross-detection hazard: an error overlooked by one model is likely to be flagged by at least one other.
3. Harmonization: one or more agents are flagged as harmonizers. They aggregate the entire set of critiques, separate agreements from contradictions, and emit a structured “judgement.” Harmonizers can use any rubric (majority vote, weighted confidence, specialist veto) to convert divergent feedback into a common set of observations.
4. Revision and loop: every non-harmonizer ingests the judgement and regenerates its answer, optionally rebutting points it believes to be wrong. The cycle repeats until a termination condition is met (identical outputs, bounded edit distance, or a maximum number of rounds). The final harmonizer synthesis is returned to the user.
Because the agents, instructions, stopping rule, and comparison metric are all configurable, the same code base supports tasks as different as mathematical proof sketching, literature surveying, or code review. The FOO loop thus acts as a versatile wrapper that upgrades solitary generation into a networked, self-auditing process, realising in practice the external detection hazard that pushes the system into the truth-dominant regime predicted by the theory.
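A compressed sketch of the four-phase loop follows. The agent interface, prompt wording, and termination rule are illustrative stand-ins; the released FOO code defines its own classes, JSON configuration schema, and comparison metrics.

```python
from typing import Callable, Dict

Agent = Callable[[str], str]   # an agent maps a prompt to a text response

def foo_round(task: str, agents: Dict[str, Agent], harmonizer: Agent, max_rounds: int = 3) -> str:
    """One FOO consensus loop: broadcast, cross-examine, harmonize, revise."""
    answers = {name: agent(task) for name, agent in agents.items()}          # 1. broadcast
    for _ in range(max_rounds):
        critiques: Dict[str, str] = {}
        for name, agent in agents.items():                                   # 2. cross-examination
            peers = "\n\n".join(a for n, a in answers.items() if n != name)
            critiques[name] = agent(f"Find the flaws in the following answers:\n{peers}")
        judgement = harmonizer(                                              # 3. harmonization
            "Merge these critiques, separating agreements from contradictions:\n"
            + "\n\n".join(critiques.values())
        )
        revised = {name: agent(f"Task: {task}\nJudgement: {judgement}\nRevise your answer.")
                   for name, agent in agents.items()}                        # 4. revision
        if revised == answers:                                               # simple termination rule
            break
        answers = revised
    return harmonizer("Synthesize a final answer from:\n" + "\n\n".join(answers.values()))
```

In practice each Agent would wrap an LLM API call configured at run time from the JSON file; here any callable that maps a prompt string to a response string satisfies the interface.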
2.6.2 Integrity Extension
Each FOO interaction generates a cryptographically signed record containing: (i) Message content and timestamp, (ii) Agent identity and interaction type, (iii) Hash-based link to previous interactions, (iv) Verification signature
This creates a tamper-evident chain where any modification to historical interactions invalidates subsequent cryptographic links, making post-hoc fabrication of contributions computationally infeasible.
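A minimal illustration of such a hash-linked record chain follows (hash-linking only; the actual signature scheme, key management, and record fields are specified in Appendix A and may differ).

```python
import hashlib
import json
import time
from typing import Dict, List

def append_record(chain: List[Dict], agent: str, kind: str, content: str) -> Dict:
    """Append a record whose hash commits to its content and to the previous record's hash."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    record = {"agent": agent, "type": kind, "content": content,
              "timestamp": time.time(), "prev_hash": prev_hash}
    record["hash"] = hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()
    chain.append(record)
    return record

def verify_chain(chain: List[Dict]) -> bool:
    """Recompute every link; any edit to an earlier record breaks all later hashes."""
    prev_hash = "0" * 64
    for record in chain:
        body = {k: v for k, v in record.items() if k != "hash"}
        if record["prev_hash"] != prev_hash:
            return False
        if hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest() != record["hash"]:
            return False
        prev_hash = record["hash"]
    return True

log: List[Dict] = []
append_record(log, "agent_1", "initial_response", "First-pass answer ...")
append_record(log, "agent_2", "critique", "One limitation is ...")
print(verify_chain(log))          # True
log[0]["content"] = "tampered"    # post-hoc modification
print(verify_chain(log))          # False
```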
The integrity logging adds four checkpoint types to Algorithm 1:
1. Initial response logging after broadcast
2. Critique logging during cross-examination
3. Harmonization decision logging
4. Revision logging during iteration
Detailed implementation algorithms and security analysis are provided in Appendix A.
3 Theoretical Validation and Parameter Analysis
This section demonstrates the mathematical consistency and theoretical properties of our discursive network models. Rather than claiming empirical validation, we show how the framework accommodates realistic parameter ranges and produces theoretically coherent dynamics. The analysis serves three purposes: (i) establishing that the models yield stable, interpretable equilibria; (ii) demonstrating how parameter variations affect system behavior; and (iii) illustrating the framework’s capacity to represent different invalidation regimes observed in the literature.
We parameterize our models using representative values drawn from the LLM literature to demonstrate theoretical consistency and explore regime transitions. These parameter choices illustrate the framework’s expressive capacity rather than constituting empirical validation. Future work will require systematic parameter estimation from controlled experiments designed specifically to test the discursive network hypotheses.
Figure 3(b): Cross-network invalidation-detection dynamics corresponding to Lemma 2.4. The hazards $p_{01}$, $p_{10}$, and $\lambda$ are unchanged, and an external detection probability $\delta$ links two equal networks (Lemma 2.4). Curves and bands represent the mean and confidence interval over repeated simulation runs; the upper subplot corresponds to Network 1 and the lower to Network 2. Dashed lines indicate the predicted equilibria (true, blue; false, red), which reduce the long-run false share of panel (a) by roughly one half.
3.1 Parameter specification from literature ranges
Published studies provide parameter ranges that inform our theoretical analysis. \textcite{Ji2023SelfReflection} report invalidation rates of 26-61% in medical domains, while \textcite{Zhang2024SelfAlignment} document self-evaluation accuracy near chance levels (AUROC 0.55). These findings suggest parameter regimes where
$$p_{01} + \lambda \;>\; p_{10},$$
corresponding to invalidation-dominant dynamics in our framework: the model fabricates new errors faster than it repairs them internally. While we do not claim these studies directly measure our theoretical parameters, they establish the empirical plausibility of invalidation-dominant regimes and provide realistic bounds for theoretical exploration.
The claim-level probabilities reported by \textcite{Zhang2024SelfAlignment} are raw soft-max scores. They are over-confident until calibrated (their Fig. 5), and they average dependent claims. Consequently we treat them as proxy scores, not literal probabilities, and accompany every point estimate with an explicit uncertainty discussion in what follows.
3.2 Parameter choices for simulation
Table 4 lists the hazards used in our single- and dual-network simulations. Whenever a range was reported in the source, we chose values that place the single network near the mid-point of the error band (about 60 % false statements) so that the effect of cross-network detection is easy to visualise.
The following analysis explores model behavior under parameter values that span regimes of practical interest. We examine: (i) single-network dynamics under varying fabrication-to-repair ratios; (ii) cross-network detection effects as the external detection hazard varies; and (iii) scaling behavior as the number of agents increases. This constitutes theoretical validation of model consistency rather than empirical hypothesis testing.
Table 4: Hazard rates used in the single- and dual-network simulations.

Symbol | Value | Model role | Empirical hook / comment
---|---|---|---
 | | true-to-false slip | Chosen to be an order of magnitude smaller so that internal repair remains visible.
 | | internal repair | Matches the modest self-evaluation AUROC (about 0.55) reported by \textciteZhang2024SelfAlignment.
 | | invalidations (fabrication) | Upper half of the 26-61% error band; solving for the fabrication hazard with the repair hazard fixed gives the tabulated value.
 | | cross-network repair | Picked so that the system sits just inside the truth-dominant region; see Lemma 2.5.
3.3 Single-network baseline
With the calibrated triple of hazards, the single-network model (Lemma 2.3) predicts a long-run false share of roughly 60%. A Monte-Carlo experiment (FOO_Single_Network.py, repeated runs over many steps) produces an equilibrium false share consistent with this prediction.
Interpretation.
The false share stabilises near 60%, squarely inside the empirical band of \citeauthorJi2023SelfReflection. This baseline serves as the reference against which we gauge cross-network effects.
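For readers who wish to reproduce the qualitative behaviour, the following minimal sketch simulates a two-state per-statement Markov chain of the kind the Monte-Carlo experiment performs. The hazard values are placeholders chosen so that the stationary false share sits near 60%; they are not the calibrated values of Table 4.

```python
import random

def simulate(p_fab, p_rep, p_ext=0.0, steps=5000, runs=100):
    """Average long-run fraction of time a statement spends in the false state.

    p_fab: per-step probability a true statement becomes false
           (drift and fabrication lumped together for this sketch)
    p_rep: per-step probability a false statement is repaired internally
    p_ext: per-step probability a false statement is caught by an external agent
    All values are illustrative placeholders, not the paper's calibrated hazards.
    """
    false_shares = []
    for _ in range(runs):
        false, false_time = False, 0
        for _ in range(steps):
            if false:
                if random.random() < p_rep + p_ext:   # hazards act additively
                    false = False
            elif random.random() < p_fab:
                false = True
            false_time += false
        false_shares.append(false_time / steps)
    return sum(false_shares) / runs

# Invalidation-dominant baseline vs. the same network with external detection.
print(simulate(p_fab=0.03, p_rep=0.02))               # false share near 0.6
print(simulate(p_fab=0.03, p_rep=0.02, p_ext=0.05))   # false share near 0.3
```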
3.4 Dual-network architecture
Coupling two identical networks via an external repair hazard (Lemma 2.5) cuts the predicted falsehood prevalence roughly in half relative to the single-network baseline. Simulations reproduce this equilibrium in both networks (Fig. 3b).
Interpretation.
The external hazard can be realized by retrieval-augmented generation, ensemble adjudication, or human post-editing. Once the fabrication-to-repair ratio drops below its critical threshold, the system flips from an invalidation-dominant regime to a truth-dominant one; with the calibration of Table 4 this corresponds to a 51% relative reduction in the long-run false share.
3.5 How many independent agents for 5 % error?
Using Proposition 2.7 we can ask: how many cross-detecting networks are required to push the long-run false share below 5%? With the parameters of Table 4 and a tolerance of 0.05, the closed-form bound indicates that at least nine mutually detecting agents are necessary to guarantee that fewer than one statement in twenty remains false at equilibrium under this calibration. The requirement grows only linearly as the tolerance tightens, thanks to the additive nature of the external hazards, making multi-agent verification a scalable pathway to high factual reliability. Figure 4 shows the required number of agents as a function of the tolerance.
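The sketch below illustrates how such a bound can be evaluated numerically, assuming (for illustration only) that external hazards add and that the equilibrium false share takes the simple ratio form used here; Proposition 2.7 supplies the exact expression, and the hazard values are placeholders rather than the calibrated ones.

```python
def min_agents(p_fab, p_rep, p_ext, tol=0.05, n_max=1000):
    """Smallest number of mutually detecting agents keeping the false share
    below `tol`, assuming (for illustration only) additive external hazards
    and an equilibrium share of p_fab / (p_fab + p_rep + (n - 1) * p_ext)."""
    for n in range(1, n_max + 1):
        share = p_fab / (p_fab + p_rep + (n - 1) * p_ext)
        if share < tol:
            return n
    return None   # tolerance unreachable within n_max agents

# Placeholder hazards, not the values of Table 4:
print(min_agents(p_fab=0.03, p_rep=0.02, p_ext=0.05, tol=0.05))
# -> 13 with these placeholder hazards (the paper's own calibration yields nine)
```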
4 Discussion
4.1 Ethical Concerns
The first and most important observation regarding discursive networks with LLMs is ethical: there is a risk of transgression when the human(s) in a discursive network use it as a means to outsource reasoning instead of deploying it as augmented intelligence. In linguistics, epithesis refers to the addition of a sound or letter to the end of a word without changing its meaning. By analogy we identify an ethical concern in discursive networks: scientific epithesis, wherein an individual seeks authorship on an artifact to which they have contributed only superficial edits, or none at all. Like the linguistic phenomenon, the intervention leaves the substantive content untouched while appending an external element that alters perception rather than substance. Scientific epithesis does not meet the formal threshold of plagiarism, yet it belongs to the same family of misappropriations because it places a symbolic layer of credit "upon" the discourse without engaging in its intellectual construction. In the context of discursive networks this behaviour distorts the link between contribution and attribution, undermining the very mechanism of cross-agent validation that the network is designed to support.
Authorship norms present another axis of concern. As dozens of agents contribute micro-edits, intellectual responsibility becomes increasingly opaque, complicating both credit assignment and error tracing. Empirical work across disciplines shows that diffuse contributions encourage honorary or "gift" authorship, diluting accountability and undermining public trust in published findings [maruvsic2011systematic]. What constitutes authorship in a discursive network is an open question that will take time to settle.
The integrity of discursive networks fundamentally depends on the ability to verify the authenticity and provenance of each contribution, whether from human or artificial agents. When scientific conclusions emerge from iterative exchanges among multiple participants, traditional notions of authorship become complicated by the distributed nature of intellectual labor and the possibility of post-hoc modification of interaction records. This challenge is particularly acute in combating epithesis, as the practice thrives in environments where genuine contributions cannot be distinguished from superficial additions or retroactive claims of involvement. A robust solution requires cryptographic mechanisms that create tamper-evident logs of all interactions, making it computationally infeasible to fabricate authorship claims after the fact. By implementing blockchain-based integrity verification for agent communications, discursive networks can establish a trustworthy foundation where each participant’s actual contributions are permanently recorded and verifiable. This technical infrastructure does more than prevent fraud; it creates positive incentives for meaningful engagement by ensuring that substantial intellectual contributions receive proper attribution while making epithetic behavior both detectable and reputationally costly. The result is a research environment where collaborative human-AI knowledge production can proceed with confidence in the integrity of the underlying interaction records.
The energy footprint of large discursive networks poses another ethical dilemma. Each critical review involves at least one forward-and-backward pass through a language model, so when each agent critiques the manuscript once the computational cost grows linearly with the number of agents N. When every agent critiques every other agent (the fully connected case that is ideal for robustness), the number of pairwise exchanges scales as N(N-1), and so does the energy consumption. Moreover, once those quadratic interactions have occurred the network typically runs a consensus or "harmonization" phase to reconcile conflicting edits, and common distributed algorithms require additional synchronous rounds to complete. The aggregate budget for end-to-end validation therefore climbs beyond quadratic in N, dwarfing the cost of the original single-agent composition. In an era when each large-model inference already carries a measurable carbon footprint, this quadratic-plus overhead raises difficult questions about the sustainability of scaling discursive networks without parallel investment in greener compute or more frugal validation protocols.
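A back-of-the-envelope cost model makes the scaling explicit. The counts below are illustrative bookkeeping under the assumptions just stated, not measurements of any deployed system.

```python
def validation_calls(n_agents, consensus_rounds=3, fully_connected=True):
    """Rough count of model calls for one end-to-end validation pass.

    Illustrative only: one critique per agent pair in the fully connected
    case, one critique per agent otherwise, plus consensus rounds in which
    every agent is queried once.
    """
    if fully_connected:
        critiques = n_agents * (n_agents - 1)    # every agent reviews every other
    else:
        critiques = n_agents                      # a single review per agent
    harmonization = consensus_rounds * n_agents   # reconciliation rounds
    return critiques + harmonization

for n in (2, 5, 10, 20):
    print(n, validation_calls(n))   # grows roughly quadratically in n
```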
Because those costs grow much faster than in single-agent or single-author scenarios, only well-funded actors may be able to afford the energy budget, deepening the resource gap already highlighted for modern NLP pipelines [Strubell2019Energy]. Sustaining the benefits of discursive verification therefore demands not merely algorithmic innovation but also governance frameworks and infrastructural subsidies that keep the playing field environmentally and economically fair.
Discursive networks promise robust cross-verification, yet their perpetual negotiation of “truth” can erode epistemic diversity. When many agents iteratively revise one another, the process tends to pull answers toward a central consensus, suppressing minority explanations in favour of the statistically safest wording. Large-scale language models already exhibit this homogenising bias, along with well-documented tendencies to replicate and even amplify the social prejudices embedded in their training data [Bender2021Parrots]. A network of such agents therefore risks hard-coding bias under the reassuring veneer of multi-agent agreement.
At the same time, cross-agent review typically demands full visibility of prompts and intermediate reasoning, thereby increasing the attack surface for privacy breaches. Membership-inference studies demonstrate how seemingly benign queries can reveal whether sensitive records were present in a training set [Shokri2017Membership], suggesting that discursive networks must treat every inter-agent channel as a potentially harmful vector for leakage.
Finally, adversarial robustness and distributive justice pose intertwined challenges. A single compromised agent can inject carefully crafted "triggers" that, once propagated through mutual validation, shift the entire network toward a malicious conclusion [Wallace2019Triggers].
4.2 Conclusions and Outlook
This manuscript has traced a broad arc, from theoretical grounding to practical tooling, around a single organising idea: discursive networks. By recognising that every LLM is both a generator and a consumer of discourse, we cast its interactions as edges in a network whose universal structure can be exploited for robust error control. Building on this abstraction we introduced Flaws-Of-Others (FOO), a reconfigurable agent-based algorithm packaged with user-friendly tools that assist the production, verification, and revision of scientific knowledge.
Many ethical challenges in discursive networks call for detailed, tamper-proof logs of interactions. This is essential for accountability, reproducibility, and auditability in digital systems. When interactions involve LLMs, scientific collaborations, or complex data workflows, the capacity to verify the integrity of recorded exchanges becomes a prerequisite for trust. One approach to achieving this is through the use of blockchain technology, which can provide decentralized, cryptographically secure records that are resistant to unauthorized modification. Each entry in a blockchain-based log is linked to the previous one through cryptographic hashes, ensuring that any tampering with earlier data invalidates the entire subsequent chain. This structure allows interaction logs to be both transparent and verifiable without central oversight. Furthermore, incorporating time-stamping and access control into such systems ensures that each interaction is both temporally fixed and attributable. These properties make blockchain a viable framework for securing interaction logs in research, legal compliance, and automated decision-making contexts.
This study deliberately replaces the fashionable term hallucination with the broader concept of invalidation. Whereas “hallucination” suggests a purely accidental slip in the model’s internal perception, the data reveal a richer spectrum of invalidations (failure modes) that includes strategic prompt manipulation, chain-of-thought drift and the simple inheritance of errors from flawed training corpora. All of these mechanisms manifest in the observable metric that matters to users, the production of false statements, so the single hazard rate is most naturally interpreted as an invalidation rate. The shift in vocabulary is therefore more than semantics: it aligns theoretical parameters with the phenomena actually counted in benchmarks.
A discursive network that relies exclusively on its own self-correction routines will remain invalidation-dominant as soon as the fabrication hazard exceeds the internal repair rate. In that regime the long-run share of false statements is bounded below by one half, regardless of implementation details or domain. The formalism developed here reveals why: fabrication and self-repair enter the steady-state ratio additively, leaving no structural mechanism for the network to "outrun" its own invalidations.
The picture changes once independent verification channels are added. Coupling the generator to external agents augments every false statement with an additional repair hazard per agent, so the effective repair rate grows additively with the number of verifiers. If the composite repair rate exceeds the fabrication hazard, the system flips into the truth-dominant regime, in which the falsehood share decreases monotonically with the number of agents and approaches zero in the limit of infinite cross-checking capacity. The transition threshold depends only on the ratio of fabrication to effective repair, providing a clean design criterion that is independent of any particular benchmark or simulation.
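In illustrative notation (the symbols below are ours, chosen for exposition, not necessarily those defined in Section 2), the criterion can be written as:

```latex
% Illustrative notation only; see Section 2 for the paper's own symbols.
\lambda_{\mathrm{eff}} \;=\; \lambda_{\mathrm{rep}} + (n-1)\,\lambda_{\mathrm{det}},
\qquad
\text{truth-dominant} \;\iff\; \lambda_{\mathrm{eff}} > \lambda_{\mathrm{fab}}
\;\iff\; \frac{\lambda_{\mathrm{fab}}}{\lambda_{\mathrm{rep}} + (n-1)\,\lambda_{\mathrm{det}}} < 1 .
```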
Proposition 2.7 turns that qualitative criterion into a quantitative planning tool: it gives a closed-form bound on the smallest number of mutually detecting agents required to keep the long-run falsehood share below a user-specified tolerance. Because the required number of agents grows only linearly as the tolerance tightens, even ambitious error targets translate into tractable network sizes. In practice, engineers can estimate the drift, repair, and fabrication hazards from established benchmarks, select a viable cross-detection mechanism to determine the external detection rate, and then read off the required network size directly from the formula. The discursive-network formalization thus provides not just a descriptive model of information dynamics but a concrete apparatus for right-sizing the verification infrastructure needed to achieve prescribed levels of factual reliability.
The theoretical analysis confirms that our mathematical framework produces stable, interpretable dynamics across parameter ranges consistent with published LLM studies. The models successfully capture qualitative regime transitions (invalidation-dominant vs. truth-dominant) and provide quantitative predictions for multi-agent system design.
Empirical validation remains an important direction for future work, requiring controlled experiments specifically designed to measure the four theoretical hazards (drift, self-repair, fabrication, and external detection) under realistic conditions.
Although the mathematics treats external detection as a single scalar hazard, the concept encompasses several concrete engineering choices. Retrieval-augmented generation raises it by surrounding the model with authoritative passages that expose contradictions. Model-ensemble adjudication raises it further by combining the diverse priors of independently trained models; uncorrelated errors rarely agree, so the aggregate chance that a falsehood slips through diminishes rapidly. Even higher detection rates become attainable when a human editor is placed in the loop, although latency and cost then become limiting factors. Our hazard model quantifies these trade-offs: for any target false-statement tolerance one can compute a required detection rate, and thus budget the amount of human or automated scrutiny that must be applied.
Just as people are quicker to spot another person’s mistakes than their own, a language model is often a sharper critic of a peer’s text than of its own output. The asymmetry stems from each model’s training objective: during generation it maximises local fluency rather than global factuality, so a well-phrased falsehood slips through unchallenged. When the same model is asked only to evaluate an already-written passage, fluency is no longer the bottleneck; the task collapses to checking claims against the knowledge stored in its training corpus. Verification is therefore easier, and cross-model review can expose errors that the original authoring pass left intact.
The significance of these findings extends beyond artificial systems. Invalidation dynamics of the same mathematical form could govern human conversation, peer review and social-media fact-checking. Recognising this shared structure invites a unified research agenda that links network science, cognitive psychology and algorithmic governance. Our future work will explore heterogeneous hazards at the level of individual actors, time-varying detection capacities that respond to workload, and live A/B tests that recover hazard estimates directly from production chat traffic. A deeper ethical analysis will also be required, because the push toward ever-smaller error tolerances competes with privacy constraints, energy budgets and the carbon footprint of large-scale verification.
We have crossed a cultural threshold that directly affects scientific production: language is no longer crafted solely by human hands and minds, but is continuously co-composed with machines. Large language-model systems do not "assist" writing; they repurpose it. Authorship dissolves into a live dialogue between human intention and algorithmic completion, with sentences generated, revised, and re-externalized in the same breath. Because the practice itself has changed, the evaluative yardsticks built for solitary, page-bound prose (originality scores, citation counts, style rubrics, etc.) no longer capture what is happening on the screen. This cultural shift brings novel risks and opportunities, demanding updated methods, training, expectations, and metrics.
Looking forward, the research programme must widen from token-level truthfulness to medium-level dynamics. Three paths stand out:
1. Cultural assimilation. Our most pressing need is an ethical framework to harmonize societal values with this new reality. We need conventions, interfaces and pedagogies that make continuous, model-mediated composition legible, trustworthy, and fair.
2. Metric redesign. Benchmarks rooted in solitary authorship and static text can no longer capture quality in a live, co-creative medium. New metrics should score how a statement evolves under iterative detection and repair, not just its instantaneous truth value.
3. Governance of agency. When the medium itself "acts on the message," responsibility diffuses across designers, deployers and end-users. Future hazard models must therefore integrate economic incentives, interface affordances and policy constraints alongside the technical hazard parameters.
Seen through this lens, the ideas discussed in this manuscript form a prototype for broader cultural discussions that will emerge as society learns, once again, to write in a fundamentally new medium.
Acknowledgements
This work was supported in part by the National Institute of General Medical Sciences of the National Institutes of Health under Award No. 1R25GM151182 and by the National Science Foundation under Award No. 2518973. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH, NIGMS, or NSF.
In an instance of self-reference, this manuscript was developed using the methods it describes, through iterative cycles of recursive software development that refined the underlying technology. Records of interactions with multiple LLM agents were kept only for late-stage manuscript refinement. Nevertheless, the following statement accurately represents the process employed here and provides a template for future work:
This manuscript and its supplementary materials were produced using methods detailed in [gutierrez2025flaws]. The author supplied the core concepts, the logical framework, and foundational technical content, while large language-model assistance was utilized for ideation, verification, text drafting and revision, and software implementation. Responsibility for all claims made herein rests solely with the author. Existing logs of interactions with LLM agents are included as an appendix within the supplementary materials.
Appendix A Blockchain Implementation Details
Implementation note.
We implement the loop in Python 3.11 with asyncio concurrency; each agent call is an HTTP request to a hosted LLM endpoint (e.g. OpenAI, Anthropic). Unless stated otherwise, the experimental pool comprises a single harmonizer (the most capable model available) and, for each back-end engine under test, two specialist agents sampled at temperatures 0.1 and 0.9. The protocol executes one broadcast round plus a minimum of three consensus rounds, so every query triggers at least (1 + 3) N = 4N model calls, where N is the total number of agents (harmonizer + specialists). For example, with one engine (N = 3) the loop issues at least 12 calls; adding engines or additional specialist roles scales the cost linearly. All configuration files and source code are publicly available at https://github.com/biomathematicus/foo.
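The round structure can be sketched as follows. Here call_agent is a stand-in for the actual HTTP request to a hosted endpoint, and the agent names are placeholders; only the broadcast-then-consensus shape mirrors the protocol described above.

```python
import asyncio

# Stand-in for the HTTP request to a hosted LLM endpoint.
async def call_agent(agent, prompt):
    await asyncio.sleep(0)                       # placeholder for the network round trip
    return f"{agent}: response to {prompt!r}"

async def foo_round(agents, prompt):
    # All agents are queried concurrently within a single round.
    return await asyncio.gather(*(call_agent(a, prompt) for a in agents))

async def run(agents, query, consensus_rounds=3):
    responses = await foo_round(agents, query)                    # broadcast round
    for _ in range(consensus_rounds):                             # >= 3 consensus rounds
        responses = await foo_round(agents, "critique: " + "; ".join(responses))
    return responses

agents = ["harmonizer", "specialist_t0.1", "specialist_t0.9"]     # one engine, N = 3
print(asyncio.run(run(agents, "Check the derivation in Section 3.")))
```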
Blockchain motivation.
The reliability of discursive networks depends critically on the integrity of recorded interactions between human and artificial agents. When scientific conclusions emerge from iterative exchanges among multiple LLMs and human reviewers, the provenance and authenticity of each contribution becomes essential for both reproducibility and accountability. Traditional logging systems are vulnerable to post-hoc modification, making it difficult to distinguish genuine collaborative refinement from retrospective tampering or fabricated authorship claims.
We address this challenge through a blockchain-based integrity system that creates tamper-evident records of all agent interactions. Each message exchange in the discursive network generates a cryptographic block containing the agent identity, message content, timestamp, and hash-linked reference to the previous interaction. The system employs SHA-256 hashing with a global salt shared across all agents to ensure consistency while preventing individual agents from being identified through hash analysis alone.
Definition A.1 (Conversation Blockchain).
For a discursive network, the conversation blockchain is a sequence of cryptographically linked blocks $B_0, B_1, \ldots$ that creates tamper-evident records of all agent interactions, with each block hash computed as $h_i = H(h_{i-1} \,\|\, a_i \,\|\, c_i \,\|\, t_i \,\|\, s)$ over the previous hash, agent identity $a_i$, message content $c_i$, timestamp $t_i$, and a global salt $s$, where $H$ is a cryptographic hash function and $\|$ denotes concatenation.
The cryptographic hash function can be instantiated with various secure hash algorithms including SHA-256, SHA-3 (Keccak), BLAKE2, or BLAKE3, each offering different performance and security trade-offs. SHA-256 remains a robust choice for blockchain applications due to its widespread adoption and proven security properties, while newer algorithms like BLAKE3 offer superior performance for high-throughput scenarios.
The blockchain protocol ensures that any modification to historical interactions invalidates the cryptographic chain, producing detectable integrity violations. When agents load previous conversations, the system verifies the complete hash chain and displays prominent warnings if tampering is detected: “LOG TAMPERED. TRUST HAS BEEN BREACHED. BLOCKCHAIN FAILS.” This mechanism makes post-hoc fabrication of contributions computationally infeasible while preserving the ability to legitimately edit conversations by rebuilding the chain from the point of modification onward.
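A minimal sketch of this verification step is shown below. Block fields and helper names are illustrative, mirroring the logging sketch in Section 2.6.2 rather than the package's exact schema.

```python
import hashlib, json, time

def block_hash(body, salt):
    # Hash of the block body (everything except the stored hash), salted globally.
    return hashlib.sha256((salt + json.dumps(body, sort_keys=True)).encode()).hexdigest()

def verify_chain(chain, salt):
    prev_hash = "GENESIS"
    for block in chain:
        body = {k: v for k, v in block.items() if k != "hash"}
        if body["prev_hash"] != prev_hash or block_hash(body, salt) != block["hash"]:
            return False                     # link broken or content altered
        prev_hash = block["hash"]
    return True

# Build a two-block chain, then tamper with it.
salt = "s3cr3t"
b1 = {"agent": "critic_a", "content": "Claim 3 lacks a citation.",
      "timestamp": time.time(), "prev_hash": "GENESIS"}
b1["hash"] = block_hash(b1, salt)
b2 = {"agent": "harmonizer", "content": "Accepted critique.",
      "timestamp": time.time(), "prev_hash": b1["hash"]}
b2["hash"] = block_hash(b2, salt)
chain = [b1, b2]
assert verify_chain(chain, salt)
chain[0]["content"] = "No flaws found."      # post-hoc modification
if not verify_chain(chain, salt):
    print("LOG TAMPERED. TRUST HAS BEEN BREACHED. BLOCKCHAIN FAILS.")
```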
The implementation maintains separate blockchains for each agent while using a shared salt stored in the system configuration to ensure hash consistency across sessions. Genesis blocks are initialized with fixed timestamps to prevent hash divergence during system restarts. The protocol automatically migrates existing conversation logs to blockchain format, enabling backward compatibility while establishing integrity verification for all new interactions.
This cryptographic foundation serves two complementary functions in discursive networks. First, it provides technical infrastructure for reproducible research by creating verifiable logs of how scientific conclusions evolved through agent interactions. Second, it establishes an ethical framework for attribution by making it computationally expensive to falsify contributions after the fact. The blockchain thus transforms the question of authorship from a matter of trust to one of cryptographic verification, supporting the broader goal of maintaining accountability in collaborative human-AI knowledge production.