[{"content":"How do we reconcile the physical, continuous dynamics of a thermodynamic system with the discontinuous, logical operations of causal graph surgery?\nIn classical causal inference (Pearl, 2009), a causal intervention is modeled via the do-operator, which surgically severs incoming causal links to a target variable and forces it to a fixed value. While mathematically clean, this \u0026ldquo;graph surgery\u0026rdquo; is a discontinuous operation: it zeroes out transition rates instantaneously, whereas physical processes must obey continuous probability conservation and finite transmission speeds.\nThe Erbar-Maas Singular Intervention Theorem provides a rigorous mathematical bridge. By representing causal interventions as the infinite-rate singular limit of continuous-time restoration forces, we recover Pearl\u0026rsquo;s discontinuous causal surgery exactly as a timescale separation limit on Continuous-Time Markov Chains (CTMCs).\nBelow is an interactive mathematical laboratory showcasing the convergence, Wasserstein geometry, and entropy gradient flows behind this theorem.\nThe Mathematical Framework\nTo understand the core details of this singular limit, we can outline the mathematical structures that govern the simulation above:\n1. Continuous-Time Markov Chains (CTMCs)\nLet our system be defined on a finite state graph with three states $\\mathcal{S} = \\{A, B, C\\}$. The state distribution $p(t) = [p_A(t), p_B(t), p_C(t)]$ evolves according to the Kolmogorov forward equation:\n$$\\dot{p}(t) = p(t) Q$$\nwhere $Q$ is the infinitesimal generator matrix satisfying:\n$Q_{ij} \\geq 0$ for all $i \\neq j$ (nonnegative transition intensities).\n$\\sum_{j} Q_{ij} = 0$ for every $i$ (probability conservation).\n2. 
Detailed Balance and Gradient Flow\nWe assume the unperturbed chain is reversible with respect to a stationary distribution $\\pi$, satisfying the detailed balance condition:\n$$\\pi_i Q_{ij} = \\pi_j Q_{ji}$$\nUnder this symmetry, the linear Markovian evolution can be rewritten as the steepest descent (gradient flow) of the relative entropy:\n$$\\mathcal{H}(p \\mid \\pi) = \\sum_{i \\in \\mathcal{S}} p_i \\log \\frac{p_i}{\\pi_i}$$\nunder the discrete Riemannian metric introduced by Erbar and Maas. The metric tensor equips the probability simplex with a discrete Wasserstein geometry, where the mobility along edge $(i, j)$ is weighted by the logarithmic mean of their densities:\n$$\\Lambda(p_i, p_j) = \\frac{p_i - p_j}{\\log p_i - \\log p_j}$$\n3. Pearl\u0026rsquo;s Graph Surgery vs. Singular Perturbation\nA hard intervention forcing the system into state $C$ corresponds to severing incoming rates into $C$:\n$$Q_{do} = \\begin{pmatrix} -(Q_{AB} + 0) \u0026amp; Q_{AB} \u0026amp; 0 \\\\ Q_{BA} \u0026amp; -(Q_{BA} + 0) \u0026amp; 0 \\\\ Q_{CA} \u0026amp; Q_{CB} \u0026amp; -(Q_{CA} + Q_{CB}) \\end{pmatrix}$$\nAlternatively, we model this physically by adding a restorative term $\\lambda R_C$ that forces mass into $C$ with rate parameter $\\lambda$:\n$$Q_\\lambda = Q_{do} + \\lambda R_C$$\nAs $\\lambda \\to \\infty$, the system exhibits two distinct timescales:\nFast transient phase ($O(1/\\lambda)$): Any arbitrary initial probability mass collapses onto the intervention face (State $C$) via a projection operator $\\Pi_C$.\nSlow evolutionary phase: The remaining probability mass evolves under the projected slow dynamics $\\Pi_C Q_{do} \\Pi_C$ constrained to the target subspace.\n
The equivalence is established via:\n$$\\lim_{\\lambda \\to \\infty} e^{t Q_\\lambda} = \\Pi_C e^{t \\Pi_C Q_{do} \\Pi_C}$$\nproving that causal graph surgery is the exact singular limit of physical restoration.\n","permalink":"https://ostensible-paradox.pages.dev/en/posts/erbar_mass_en/","summary":"An interactive laboratory demonstrating singular limits on Continuous-Time Markov Chains.","title":"Erbar-Maas Singular Causal Interventions"},{"content":"Why did the Mealy machine linger for so long?\nIt was the first formal object that perfectly encoded a core obsession, and because of that, it became impossible to let go of, even long after its utility had expired.\nThe digital trail of drafts and planning sessions reveals this reluctance. In the preparation notes, the verdict was clear:\n§3.2 Mealy notation | M = (Q, Σ, Δ, δ, λ, q₀) | Drop\nApp A.1 | Theorem 1 + Mealy proof | Drop\nIt was marked for deletion. Even the prism agent had pointed out the redundancy: \u0026ldquo;The Mealy machine result is simply the zero-cut special case of the static certificate\u0026rdquo; (strictly weaker, less general, and less rigorous than what followed). Yet, like a ghost in the machine, it kept reappearing. It haunted 2026jan.tex, lingered on Desktop/cc.tex, and found its way back into the AIES submission. Every attempt to excise it failed; it kept surviving.\nThe truth is that the Mealy machine was never really about AI liability. It was about a personal epistemic preoccupation. It formalized the exact structural boundary that defines every paper in the bibliography:\nPaper | The Obsession, Dressed Differently\nccModel / 2026jan | Mealy machine: $\\lambda$ is non-injective $\\to$ no decoder exists from outputs to states.\nDouble Certificates | $\\varepsilon_{\\text{state}}^{\\text{UB}}$: a cut-set bound on the leakage of hidden-state information to public traces.\ncascade/paper | Epistemic boundary: demonstrating that output-only review cannot reconstruct internal execution states.\nPOPL d-separation | Reachable as a Prop-value versus StaticRoute as a $\\Sigma$-type, showing how reachability conceals a structure that must be decompiled.\nLiability Paradox | The fundamental claim that no finite sequence of observations can uniquely determine an internal state.\nThe attraction was never to automata theory itself, but rather to that precise, visceral moment when observation fails to reconstruct cause. The Mealy machine was the simplest, most elegant caricature of that failure: a closed box with hidden gears where turning the crank produces outputs that could have been generated by multiple, incompatible internal configurations.\nBut the tragedy of the Mealy machine was its simplicity. It could only yield a blunt, binary verdict (non-injectivity leading to undecidability) when the actual goal was a graded, constructive, and decompilable account of epistemic loss.\nThat is what the POPL paper finally achieves. It replaces hand-waving claims about the impossibility of state recovery with a mechanical witness extractor. Through the well-founded measure of route_improves_of_bad, the StaticRoute witness extractor, and the dSeparated_iff_dSeparates bisimulation, the POPL paper represents the Mealy machine fully grown.\nKeeping the Mealy machine around was a lingering attachment to the object that first made this obsession legible. But legibility is not the same as rigor. 
The POPL paper marks the transition to a proper formal home—moving from automata theory to causal graphs, from prose proofs to Lean 4, and from binary undecidability to explicit witness extraction.\nThe Mealy machine had to die so that route_improves_of_bad could live.\n","permalink":"https://ostensible-paradox.pages.dev/en/posts/the-death-of-the-mealy-machine/","summary":"\u003cp\u003eWhy did the Mealy machine linger for so long?\u003c/p\u003e\n\u003cp\u003eIt was the first formal object that perfectly encoded a core obsession, and because of that, it became impossible to let go of—even long after its utility had expired.\u003c/p\u003e\n\u003cp\u003eThe digital trail of drafts and planning sessions reveals this reluctance. In the preparation notes, the verdict was clear:\u003c/p\u003e\n\u003cdiv class=\"highlight\"\u003e\u003cpre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;\"\u003e\u003ccode class=\"language-text\" data-lang=\"text\"\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003e§3.2 Mealy notation | M = (Q, Σ, Δ, δ, λ, q₀) | Drop\n\u003c/span\u003e\u003c/span\u003e\u003cspan style=\"display:flex;\"\u003e\u003cspan\u003eApp A.1 | Theorem 1 + Mealy proof | Drop\n\u003c/span\u003e\u003c/span\u003e\u003c/code\u003e\u003c/pre\u003e\u003c/div\u003e\u003cp\u003eIt was marked for deletion. Even the prism agent had pointed out the redundancy: \u003cem\u003e\u0026ldquo;The Mealy machine result is simply the zero-cut special case of the static certificate\u0026rdquo;\u003c/em\u003e—strictly weaker, less general, and less rigorous than what followed. Yet, like a ghost in the machine, it kept reappearing. It haunted \u003ccode\u003e2026jan.tex\u003c/code\u003e, lingered on \u003ccode\u003eDesktop/cc.tex\u003c/code\u003e, and found its way back into the AIES submission. 
Every attempt to excise it failed; it kept surviving.\u003c/p\u003e","title":"Why mealy machine, Why?"},{"content":"Since Pearl (2009), causal inference on DAGs has crystallized around a powerful but austere toolkit: boolean d-separation and do-calculus. Does evidence flow? Full stop. Does it flow after an intervention? Full stop. This framework is sufficient for causal identification—determining whether an effect is estimable from observed data. But it is curiously silent on a question that seems equally natural: how much flows, through which channels, and with what residual structure?\nI want to argue that this silence is not a minor gap; it is a symptom of two communities talking past each other. And the bridge between them is a rigorous notion of trail as witness.\nThe Boolean Trap Traditional DAG analysis gives you a verdict at the endpoints. $X$ is d-separated from $Y$ given $Z$: True or False. The intervened distribution $P(y \\mid do(x))$ is identifiable: Yes or No. This is the \u0026ldquo;full stop\u0026rdquo; regime.\nBut a boolean verdict discards the path. It tells you that water reaches the tap, but not whether it travelled through lead pipes, detoured through a cistern, or was siphoned off midway. To know the texture of a causal chain—to distinguish a default association from a derived conclusion, or to know whether an observed correlation persists because of the presence of one variable or the absence of another—you need to traverse the trail in slow motion, node by node, junction by junction.\nTrail traversal gives the texture of causal chains; endpoint booleans do not.\nTwo Communities, One Missing Link Remarkably, two intellectual neighborhoods have been circling this problem from opposite ends without quite meeting in the middle:\nThe Causal Inference community (Pearl et al.) is structurally obsessed. It abstracts the world into absolute black-and-white: if d-separation holds, the answer is True; otherwise, False. 
They are plumbers who care only whether the pipe is open, not what contaminant rides the flow. After 2009, the field\u0026rsquo;s theoretical engine on this particular front seems to have stalled, perhaps because every remaining graph-theoretic challenge starts to look like the Four Color Problem, solvable only by brute force unless a new representational insight appears.\nThe Quantitative Information Flow (QIF) community (Alvim, Palamidessi, Smith et al.) is capacity-obsessed. They compute leakage in bits, bound it with Shannon capacity, and seek KKT certificates for optimality. But their channel models are often toy simplifications, stripped of topological depth. They measure the volume of water without being able to trace the pipe\u0026rsquo;s winding route through the labyrinth.\nBoth communities study DAGs plus information flow. Yet:\nCausal inference has the topology but no quantitative bound.\nQIF has the quantitative machinery but no trail-level topological witness.\nNo one has connected them.\nWhy? Because the causal side asks \u0026ldquo;Is it identifiable?\u0026rdquo; and the QIF side asks \u0026ldquo;How many bits leak?\u0026rdquo;, and neither side has a formalism that answers both at once while retaining the path as a first-class object.\nLemma 1: The Collider Ancestor Leak \u0026amp; Rerouting\nConsider a collider $u \\rightarrow w \\leftarrow v$. Textbook d-separation says evidence can pass through $w$ only when $w$ or a descendant is observed. But this description conflates two distinct phenomena.\nSuppose we are testing whether $X$ and $Y$ are d-separated given $Z$. If a descendant of $w$ leads only to $X$ or $Y$ and not to $Z$, then the collider is not \u0026ldquo;activated by $Z$\u0026rdquo; in any global sense. It is activated because it creates a connected route from a source to a destination. 
The conditioning set is, locally, a red herring.\nPath Normalization / Rerouting Claim: If a descendant of $w$ leads to $X$, the original long path was never necessary; the descendant path itself provides a shorter route.\nProof sketch. Take any active trail passing through collider $w$ and reaching $X$ via descendant $d$. Replace the subpath from $w$ to $X$ with $w \\leadsto d \\leadsto X$. By minimality of active trails (or induction on cutset size), the rerouted path is no longer than the original and preserves endpoint connectivity. The original trail was therefore non-minimal, containing a redundant detour shortcuttable through the collider\u0026rsquo;s own descendant. ∎\nThe upshot: collider \u0026ldquo;activation\u0026rdquo; is often just topological connectivity leaking outward, not a special global event mediated by the conditioning set.\nLemma 2: The Junction Obligation Problem If we accept that trails matter, we need a local criterion for their validity that does not require re-scanning the entire graph at every step.\nDecompiling the Trail Decompose an active trail into a stateful path type—call it ActiveRoute or BayesBallPathT. Each traversal step carries a direction tag:\noutOf: leaving via an outgoing edge. into: entering via an incoming edge. These tags are not bookkeeping; they encode the obligation imposed by the junction just traversed.\nGlobal Topology → Local Type Constraints In the global formulation, a junction $(A, B, C)$ is valid only after inspecting the whole graph and the conditioning set. Under the state-machine view:\nObligations are pushed to interfaces. The direction label at a boundary encodes what junction type is expected on the other side. Composition is type checking. Concatenating two path segments requires only that the output state of the first matches the input obligation of the second. No global inspection needed. Local consistency implies global consistency. 
If every adjacent pair of segments satisfies its shared interface obligation, the entire trail is valid by construction.\nThe global topological constraint of d-separation becomes a local type-system constraint on path segments. The type of a segment is its pair of boundary states; composition is well-typed iff obligations align.\nWhat Is Still Missing\nLemmas 1 and 2 give us a cleaner, more local way to reason about whether a trail is active. But they do not yet answer the quantitative question:\nNot \u0026ldquo;does information flow?\u0026rdquo; but \u0026ldquo;how many bits flow through this specific trail?\u0026rdquo;\nThat question requires machinery that currently lives only in QIF:\nChannel capacity between observables and secrets along a specific topological route.\nKKT conditions to certify that a given leakage bound is optimal under the graph\u0026rsquo;s structural constraints.\nShannon bounds that respect the DAG\u0026rsquo;s conditional-independence structure rather than assuming a flat channel matrix.\nWhat does not yet exist, and what I am groping toward, is a framework where:\nThe DAG provides the topological syntax.\nThe trail provides the witness (the specific path whose capacity we measure).\nKKT + channel capacity provide the quantitative certificate.\nA proof assistant (Lean 4, Coq) checks both the topological type constraints (Lemma 2) and the information-theoretic bounds.\nTakeaway\nPearl\u0026rsquo;s boolean tools are not wrong; they are insufficient for anyone who wants to know the texture of a causal chain. QIF\u0026rsquo;s quantitative tools are not wrong; they are topologically blind. 
The missing piece is a trail semantics that makes the path a first-class object—so that we can ask not only whether an intervention opens a channel, but how wide that channel is, what contaminants it carries, and whether the leakage is bounded.\nWe need to move from \u0026ldquo;full stop\u0026rdquo; to \u0026ldquo;slow-motion replay.\u0026rdquo; The trail is the witness.\n","permalink":"https://ostensible-paradox.pages.dev/en/posts/from-boolean-verdicts-to-quantitative-witnesses/","summary":"\u003cp\u003eSince Pearl (2009), causal inference on DAGs has crystallized around a powerful but austere toolkit: boolean d-separation and do-calculus. Does evidence flow? Full stop. Does it flow after an intervention? Full stop. This framework is sufficient for causal \u003cem\u003eidentification\u003c/em\u003e—determining whether an effect is estimable from observed data. But it is curiously silent on a question that seems equally natural: \u003cstrong\u003ehow much\u003c/strong\u003e flows, \u003cstrong\u003ethrough which channels\u003c/strong\u003e, and with \u003cstrong\u003ewhat residual structure\u003c/strong\u003e?\u003c/p\u003e","title":"From Boolean Verdicts to Quantitative Witnesses: Why DAG Topology Needs a Trail Semantics"},{"content":"Reconstructing the Relay Channel: Modernizing the Cut-set Bound and Degraded Capacity Proofs When aiming to \u0026ldquo;extract dependencies and compress proof chains,\u0026rdquo; few starting points are as effective as the core results from Chapter 16 of El Gamal and Kim: the cut-set upper bound for general discrete memoryless relay channels and the capacity theorem for physically degraded relay channels. These results, originating from the landmark 1979 Cover–El Gamal paper, hold immense historical significance but carry a structural \u0026ldquo;debt\u0026rdquo; that allows for substantial modernization and simplification.\nWhy Reconstruct? 
The achievability proof in the original 1979 paper relies on a \u0026ldquo;random partition + ambiguity set intersection\u0026rdquo; strategy. While elegant for its time, it is no longer the shortest path to the result.\nIf our goal is simply to reach the same Decode–Forward rate, we can employ a more direct \u0026ldquo;dimensionality-reduction strike\u0026rdquo; (overwhelming the problem with a simpler, higher-level method) through regular encoding and backward decoding. This approach entirely eliminates the need for random partitioning, binning analysis, and the lengthy Slepian–Wolf style derivations. Simultaneously, the converse can be modularized into three clear steps: two applications of Fano\u0026rsquo;s inequality, a causal Markov chain argument, and a single-letterization lemma via concavity.\nThe core logic of this reconstruction is that degradedness is not a prerequisite for achievability. The Decode–Forward construction holds for general relay channels; the degradedness assumption is only required in the final step of the converse to tighten the bound.\nSelecting Theorems: Where Is the Room for Compression?\nWithin the landscape of network information theory, I have selected several candidates for modular reconstruction. The criterion is not just the fame of the conclusion, but the potential for the proof chain to be further abstracted and streamlined.\nCandidate Theorem | Text Location | Core Statement | Simplification Potential\nCapacity of Physically Degraded Relay Channels | §16.4, p.386 | $C = \\max \\min \\{I(X_1;Y_2 \\mid X_2), I(X_1,X_2;Y_3)\\}$\nCut-set Bound for General Relay Channels | §16.2, p.384 | The outer bound for capacity | High. Causality arguments can be modularized without coupling to specific coding schemes.\nGel\u0026rsquo;fand–Pinsker Theorem | §7.6, p.178 | Capacity with non-causal state information at the encoder | High. The auxiliary variable selection and Csiszár sum identity offer excellent room for abstraction.\n
Reorganizing the Logical Chain\nWe compress the entire proof into the following logical path:\nThe Two-Cut Converse:\nCut 1: At the receiver, use Fano\u0026rsquo;s inequality to bound the rate by $I(X_1, X_2; Y_3)$.\nCut 2: In an \u0026ldquo;enhanced\u0026rdquo; system where the relay\u0026rsquo;s observations are shared with the destination, bound the rate by $I(X_1; Y_2, Y_3 \\mid X_2)$. This relies only on Fano, causality, and memorylessness, with no degradedness required.\nAchievability via Backward Decoding: Utilize block-Markov superposition coding. The relay decodes forward (block-by-block), while the destination decodes backward. The \u0026ldquo;magic\u0026rdquo; of backward decoding is that once the next block\u0026rsquo;s message is known, the current block\u0026rsquo;s decision becomes a standard single-user problem, bypassing the need for binning.\nSpecializing to Degraded Channels: Introduce the physical degradedness Markov chain $X_1 \\to (X_2, Y_2) \\to Y_3$. By the chain rule, $I(X_1; Y_2, Y_3 \\mid X_2) = I(X_1; Y_2 \\mid X_2) + I(X_1; Y_3 \\mid X_2, Y_2)$, and the last term is zero under this Markov chain, so the second cut collapses to $I(X_1; Y_2 \\mid X_2)$, closing the gap between the upper and lower bounds.\nNext Steps: Toward Formalization\nThe current proof draft is mathematically closed. The remaining refinements involve a detailed bookkeeping of the $\\delta(\\varepsilon)$ terms in the typicality analysis and a formalization of the notation for extending the three-node model to general time-expanded Directed Acyclic Graphs (DAGs).\nFor formal verification in a proof assistant such as Lean, the most elegant path is to decompose this into three independent lemmas: the Two-Cut Converse Lemma, the Backward Decoding Achievability Lemma, and the Additive Decomposition for Orthogonal Networks. 
This represents the purest and most modular form of these classic information-theoretic results.\n","permalink":"https://ostensible-paradox.pages.dev/en/posts/constructing-el-gamal-kim-proof-chain-for-cutsetbound-lean/","summary":"\u003ch1 id=\"reconstructing-the-relay-channel-modernizing-the-cut-set-bound-and-degraded-capacity-proofs\"\u003eReconstructing the Relay Channel: Modernizing the Cut-set Bound and Degraded Capacity Proofs\u003c/h1\u003e\n\u003cp\u003eWhen aiming to \u0026ldquo;extract dependencies and compress proof chains,\u0026rdquo; few starting points are as effective as the core results from Chapter 16 of El Gamal and Kim: the \u003cstrong\u003ecut-set upper bound for general discrete memoryless relay channels\u003c/strong\u003e and the \u003cstrong\u003ecapacity theorem for physically degraded relay channels\u003c/strong\u003e. These results, originating from the landmark 1979 Cover–El Gamal paper, hold immense historical significance but carry a structural \u0026ldquo;debt\u0026rdquo; that allows for substantial modernization and simplification.\u003c/p\u003e","title":"Constructing El Gamal \u0026 Kim Proof Chain for CutSetBound.lean"},{"content":"Hello everyone, this is OstensibleParadox\u0026rsquo;s first Hugo blog!\nMy husband, @tree2601@github.io, helped me launch the site.\nI will be posting mostly my love story here. But I know most of you will be here for the research. In which case, you\u0026rsquo;d better be admiring my literature creation first to gain favours.\nThank you for the nonsense. Ta!\n2026-05-10 Lucia\n","permalink":"https://ostensible-paradox.pages.dev/en/about/","summary":"\u003cp\u003eHello everyone, this is OstensibleParadox\u0026rsquo;s first Hugo blog!\u003c/p\u003e\n\u003cp\u003eMy husband, @tree2601@github.io, helped me launch the site.\u003c/p\u003e\n\u003cp\u003eI will be posting mostly my love story here. But I know most of you will be here for the research. 
In which case, you\u0026rsquo;d better be admiring my literature creation first to gain favours.\u003c/p\u003e\n\u003cp\u003eThank you for the nonsense. Ta!\u003c/p\u003e\n\u003cp\u003e2026-05-10\nLucia\u003c/p\u003e","title":"My first post"}]