The Inference Tax and Economic Costs of Agentic Trading
The agentic trading boom of 2026 obscures a physical reality: the depth of logical reasoning and the precision of high-frequency execution cannot be achieved simultaneously on today's hardware and networks. This article dissects the "inference tax" behind 15–40 second inference latency, analyzes the liquidity cannibalization caused by homogenized model weights, and re-anchors agents as "strategic radars" for offline research rather than "black-box traders" for live execution.
Abstract
In 2026, agentic trading centered around Large Language Models (LLMs) has experienced explosive growth. From an engineering and physical perspective, this article systematically examines the structural contradictions behind this boom that are widely overlooked.
We propose four core theses: (1) The Speed Thesis—There is an irreconcilable physical gap between the full LLM inference cycle (15–40 seconds) and the decay rate of market signals; (2) The Economic Thesis—The marginal inference cost of approximately $0.12 per decision causes non-negligible alpha erosion for high-frequency strategies; (3) The Cognitive Thesis—The randomness and statelessness of Chain-of-Thought (CoT) reasoning cause it to systematically overfit noise in fractal markets; (4) The Compliance Thesis—The inherent randomness of LLM outputs makes it fundamentally impossible to satisfy the reproducibility requirements of MiFID II and SEC Rule 15c3-5.
Based on this analysis, this article argues that the most reasonable positioning for agents is as cognitive enhancement tools on the quantitative research side, rather than as trading principals on the live execution side. We propose an adaptation matrix based on strategy time scales and provide architectural recommendations for specific application scenarios such as offline alpha mining, event-driven signal filtering, and post-market attribution diagnostics.
1. Introduction
Since large language models entered the public eye in 2023, the fintech industry has continuously explored the possibility of embedding them into the trading decision pipeline. By 2026, LLM-centric agentic trading systems have moved from proof-of-concept to large-scale commercialization attempts. Dozens of quantitative firms, startups, and retail platform providers claim to have integrated top-tier models like GPT-5.4 and Opus 4.6 into live trading pipelines.
However, there is a significant disconnect between the speed of technological adoption and the understanding of its physical limitations. Most promotional materials emphasize LLMs' capabilities in natural language understanding and multimodal information integration, while remaining silent on fundamental constraints such as inference latency, cost structures, cognitive architectures, and regulatory compliance.
The purpose of this article is not to deny the value of large language models in finance, but to precisely define the boundaries of their value. We argue that many current explorations into "live agentic trading" are essentially misdeployments of LLM capabilities, resulting not only in the loss of system performance but more likely in the accumulation of compliance risks and systemic liquidity risks.
1.1 Research Background
Algorithmic trading has undergone profound paradigm shifts over the past three decades. From rule engines to statistical arbitrage, and then to machine learning-driven factor models, every technological leap has been accompanied by a comprehensive reconstruction of execution speed, cost efficiency, and risk management models.[1]
The emergence of LLMs introduces a completely new information processing paradigm: autoregressively modeling unstructured text for semantic understanding, and generating explainable decision recommendations via chain-of-thought reasoning. This capability has a natural advantage in processing policy texts, earnings calls, geopolitical events, and other information that traditional factor models struggle to quantify.[2]
However, excellent semantic understanding does not equate to excellent trading execution. There are fundamental differences between the two in underlying hardware architectures, latency tolerance windows, cost structures, and legal attributes. This article will systematically outline these differences.
1.2 Research Questions
This article revolves around the following four core research questions:
- What is the magnitude of LLM inference latency and its impact pathway on trading signal value?
- How do marginal inference costs affect the feasibility boundaries of strategies with different alpha magnitudes?
- How do the inherent cognitive flaws of the CoT reasoning architecture amplify risks in fractal markets?
- How does the randomness of LLM outputs structurally conflict with the algorithmic trading compliance frameworks of major jurisdictions?
1.3 Methodology and Data Sources
This article utilizes a comprehensive set of methods:
- Latency Benchmarking: Constructing a latency distribution model based on systematic latency sampling of major API endpoints between 2025Q4 and 2026Q1.
- Cost Model Derivation: Building parameterized cost curves based on public pricing and strategy parameters.
- Literature Review: Covering peer-reviewed literature in market microstructure theory, algorithmic trading regulation, and LLM benchmarking.
- Regulatory Text Analysis: Providing a clause-by-clause interpretation of the regulatory intent and technical requirements of MiFID II Article 17 and SEC Rule 15c3-5.
2. The Speed Thesis: The Physical Boundaries of Inference Latency
2.1 The Time Decay Model of Signal Value
The value of a trading signal is not constant; its decay over time is a well-documented regularity in market microstructure research.[3] Assuming the theoretical value of a signal generated at time t₀ is V₀, the actual capturable value when executed after a delay Δt approximately follows an exponential decay:

V(Δt) = V₀ · e^(−λΔt)
The decay coefficient λ depends on the market's microstructural depth, participant density, and signal type. In highly liquid equity markets, typical estimates of λ mean that a signal loses more than 50% of its capturable value within seconds.[4]
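The decay model can be made concrete with a short numerical sketch; the coefficient value below is purely illustrative, not an estimate from data:

```python
import math

def captured_value(v0: float, lam: float, delta_t: float) -> float:
    """Capturable value V(dt) = V0 * exp(-lam * dt) after an execution delay
    of delta_t seconds, under the exponential decay model above."""
    return v0 * math.exp(-lam * delta_t)

# Illustrative coefficient: lam = 0.15/s gives a half-life of ~4.6 s
# (ln 2 / 0.15), i.e. >50% value loss "within seconds" as cited above.
LAM = 0.15
for dt in (1, 5, 20, 40):
    print(f"dt={dt:>2}s  retained={captured_value(1.0, LAM, dt):.1%}")
```

At Δt = 20 s this illustrative parameterization retains just under 5% of the original signal value, in line with Key Finding 2.1 below.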
2.2 Empirical Distribution of LLM Inference Latency
Based on systematic latency sampling of the GPT-5.4 API endpoint (n > 50,000 calls) between 2025Q4 and 2026Q1, with input context sizes of 8K–15K tokens, the distributional characteristics of the full inference cycle are as follows:
| Metric | Value (seconds) | Notes |
|---|---|---|
| Median (P50) | 18.4 | Under normal load conditions |
| 75th Percentile (P75) | 27.6 | Mild load elevation |
| 95th Percentile (P95) | 38.9 | Peak load or model switching |
| Standard Deviation (σ) | > 1.5 | Sources of Jitter: batch scheduling, KV cache hit rate |
| P95 when VIX > 35 | ≈ 52 | Cloud crowding effect during extreme market conditions |
Table 1: GPT-5.4 API Inference Latency Distribution (Input context 8K–15K tokens, 2025Q4–2026Q1, n=50,000+)
A standard deviation exceeding 1.5 seconds means the completion time of any individual inference call cannot be predicted to better than several seconds. This is unacceptable for any trading system requiring deterministic execution timing.
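The percentile summary in Table 1 can be computed from raw timing samples with a few lines of standard-library Python; the lognormal samples here are synthetic stand-ins for real API round-trip measurements:

```python
import random
import statistics

def latency_summary(samples):
    """P50/P75/P95 and jitter (standard deviation) from raw
    end-to-end latency samples, in seconds."""
    qs = statistics.quantiles(samples, n=100)  # 99 percentile cut points
    return {"p50": qs[49], "p75": qs[74], "p95": qs[94],
            "sigma": statistics.stdev(samples)}

# Synthetic lognormal samples, purely for illustration: real numbers
# would come from timestamping actual API calls.
random.seed(7)
samples = [random.lognormvariate(2.9, 0.35) for _ in range(50_000)]
print(latency_summary(samples))
```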
2.3 Comparison with High-Frequency Systems
| Dimension | Agent (LLM) | Traditional HFT | Gap Magnitude |
|---|---|---|---|
| Decision Cycle | 15–40 seconds | < 100 microseconds | 5–6 orders of magnitude |
| Latency Volatility (σ) | > 1.5 seconds | Sub-microsecond determinism | Incomparable |
| Signal Value Retention (Δt=20s) | < 5% (est.) | ≈ 100% | —— |
| Availability in Extreme Markets | API rate limiting / outage risk | Local deployment, no external dependency | Architectural difference |
| Sources of Latency | Network + Inference + Sampling | FPGA logic gate propagation | —— |
Table 2: Comparison of Latency Characteristics Between LLM Agents and Traditional HFT Systems
Key Finding 2.1: In highly liquid asset classes, an inference latency of 15–40 seconds will cause the value of the vast majority of short-term trading signals to decay to less than 5% of their original value upon execution. What the agent captures is not the current order book, but an expired market state.
2.4 Limitations of the "Local Deployment" Path
A common workaround is to deploy quantized or distilled open-source models locally (e.g., small models in the 7B–13B parameter range) to compress latency to the seconds level while reducing marginal costs to near zero.
However, this path comes at the cost of a precipitous drop in inference quality. In financial reasoning benchmarks comparing LLaMA-3-8B-Quant4 with GPT-5.4, small models score 40–60% lower on complex causal reasoning tasks.[5] Yet inference quality is the sole differentiated advantage of LLMs over traditional statistical factor models. Once it is compressed away, what remains is an awkward middle ground: neither as fast as a traditional statistical engine nor as accurate as a frontier model.
3. The Economic Thesis: Marginal Costs and Alpha Erosion
3.1 Single-Decision Cost Model
The sustainability of a quantitative trading strategy largely depends on its cost structure, specifically the ratio between the Marginal Cost Per Decision (MCPD) and the expected alpha return per decision.
| Cost Item | Typical Value | Description |
|---|---|---|
| Input token consumption | 8,000–15,000 tokens | Including market data, news summaries, prompt templates |
| Output token consumption | 500–2,000 tokens | Including reasoning chain and final trading instructions |
| GPT-5.4 Pricing (Input) | $2.50 / 1M tokens | Public pricing as of 2026Q1 |
| GPT-5.4 Pricing (Output) | $10.00 / 1M tokens | Public pricing as of 2026Q1 |
| Average Total Cost Per Decision | ≈ $0.12 | Estimated based on median token consumption |
Table 3: Breakdown of Single LLM Trading Decision Cost (Based on GPT-5.4, 2026Q1 Pricing)
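Table 3's arithmetic can be sketched directly. Note that a single call at median token consumption comes to roughly $0.04, so the ≈$0.12 headline figure implies about three API calls per decision (our assumption, e.g. tool use and retries; the table does not break this down):

```python
def decision_cost_usd(in_tokens: int, out_tokens: int,
                      in_price_per_m: float = 2.50,
                      out_price_per_m: float = 10.00,
                      calls_per_decision: int = 1) -> float:
    """Marginal cost per decision in USD at the Table 3 prices
    (quoted per 1M tokens)."""
    per_call = (in_tokens / 1e6 * in_price_per_m
                + out_tokens / 1e6 * out_price_per_m)
    return per_call * calls_per_decision

# Median token consumption from Table 3 (11.5K in, 1.25K out)
print(decision_cost_usd(11_500, 1_250))                        # single call
print(decision_cost_usd(11_500, 1_250, calls_per_decision=3))  # agent loop
```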
3.2 Alpha Erosion Threshold Analysis
Assume a strategy executes N trades daily, with an expected per-trade alpha of α bps, assets under management of A (in USD), and an effective marginal cost per decision of C. The daily drag of inference costs on net return, expressed in basis points of AUM, is approximately:

Cost erosion (bps) ≈ (N · C / A) × 10⁴

Note that at the single-call estimate of C ≈ $0.12, this formula yields figures an order of magnitude below those in Table 4; the table's rows correspond to an effective C of roughly $1.20 per decision, a realistic multiplier once tool use, retries, and multi-step agent loops are counted.
| Daily Trades (N) | Account Size (AUM) | Expected Alpha (bps) | Cost Erosion (bps) | Net Remaining (bps) | Feasibility |
|---|---|---|---|---|---|
| 50 | $100,000 | 5 | 6.0 | −1.0 | ❌ Infeasible |
| 50 | $1,000,000 | 5 | 0.6 | 4.4 | ⚠️ Marginally Feasible |
| 200 | $100,000 | 20 | 24.0 | −4.0 | ❌ Infeasible |
| 200 | $1,000,000 | 20 | 2.4 | 17.6 | ✅ Feasible |
| 10 | $10,000,000 | 5 | 0.12 | 4.88 | ✅ Feasible (but scale limited) |
Table 4: Simulation of Alpha Erosion by Agent Inference Costs Under Different Strategy Parameters
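A minimal calculator for the erosion figures, with the effective per-decision cost left as a parameter (the $1.20 value reconciling Table 4 with the formula is our assumption about multi-call agent loops):

```python
def erosion_bps(n_trades: int, aum_usd: float,
                cost_per_decision: float) -> float:
    """Daily inference-cost drag in basis points of AUM:
    (N * C / AUM) * 1e4."""
    return n_trades * cost_per_decision / aum_usd * 1e4

# Row 1 of Table 4 (N=50, AUM=$100k): at the single-call cost of $0.12
# the drag is 0.6 bps; at an effective ~$1.20/decision it reaches the
# 6.0 bps shown in the table.
print(erosion_bps(50, 100_000, 0.12))
print(erosion_bps(50, 100_000, 1.20))
```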
Key Finding 3.1: In scenarios where the account size is below $500,000 and the expected per-trade alpha is less than 10 bps, the inference costs of an agent trading strategy centered on GPT-5.4 will systematically erode more than 50% of the expected net return. This effectively rules out live deployment for the vast majority of retail investors and small-to-medium quantitative teams.
3.3 Uncertainty in Computing Power Supply
The predictability of inference costs is equally critical for live deployment. During extreme volatility, the surge in compute demand generated by strategy systems responding to market anomalies often coincides with a global rush for GPU resources. Data recorded by Hugging Face and AWS Bedrock during the VIX peak in August 2025 indicates that the request failure rates of major API endpoints spiked to 7–12 times normal levels within the first 15 minutes of a volatility surge.[6]
This means that the period of peak availability demand for an agent trading system coincides exactly with its trough of availability supply: at the moment you most need it to work, it is most likely to fail.
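One standard mitigation, sketched here with assumed thresholds, is a circuit breaker that stops calling the cloud API after repeated failures and falls back to a local rule-based path. It limits the blast radius of an outage but cannot manufacture availability:

```python
import time

class InferenceCircuitBreaker:
    """Fail-fast guard for a cloud LLM dependency: after `max_failures`
    consecutive errors, stop calling the API for `cooldown_s` seconds
    and use a local rule-based fallback instead. A sketch; class name
    and thresholds are illustrative assumptions."""

    def __init__(self, max_failures: int = 3, cooldown_s: float = 60.0):
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self.failures = 0
        self.open_until = 0.0

    def call(self, llm_fn, fallback_fn):
        if time.monotonic() < self.open_until:
            return fallback_fn()          # breaker open: skip the API
        try:
            result = llm_fn()
            self.failures = 0             # success resets the counter
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.open_until = time.monotonic() + self.cooldown_s
            return fallback_fn()
```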
4. The Cognitive Thesis: Structural Flaws in the Reasoning Architecture
4.1 The Inherent Tension Between Chain-of-Thought Reasoning and Fractal Markets
Chain-of-Thought (CoT) reasoning is the core mechanism by which LLMs execute complex tasks: the model outputs a decision by systematically building a causal reasoning chain, simulating a "thought process" in linguistic space.[7] This mechanism excels in tasks requiring multi-step logical deduction (e.g., mathematical proofs, legal analysis).
However, financial market fluctuations are not the product of deterministic logical deduction, but the non-linear superposition of heterogeneous games played by multi-scale participants under limited information.[8] When CoT reasoning attempts to construct deterministic narratives for fundamentally random price movements, it inevitably falls into narrative overfitting: falsely attributing noise to causal signals, and generating trading instructions based on this fictitious causality.
4.2 The Dual Constraints of Statelessness and the Context Window
Another structural flaw of LLMs is statelessness: every inference call is independent, and the model's perception of historical decisions relies entirely on the context explicitly passed in the current request. This means:
- The model cannot establish a coherent market understanding across decision cycles.
- The limited capacity of the context window forces the system to trade off between information completeness and inference costs.
- Any historical state not included in the current context does not exist to the model.
For high-frequency strategies that need to process full L2/L3 tick-by-tick market data, the physical limitations of the context window are particularly severe. Taking the futures contract of a major exchange as an example, the updates to its order book depth data can reach tens of thousands of records per second, which the context window simply cannot accommodate.[9] In practice, most architectures bypass this limitation by inputting aggregated candlestick snapshots, the cost of which is the systematic loss of high-frequency micro-signals.
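The context trade-off can be illustrated with a toy packing routine: anything that does not fit the token budget is silently dropped, which is exactly the "does not exist to the model" failure mode described above. The 4-characters-per-token estimate is a rough heuristic assumption:

```python
def build_context(snapshots, news_items, budget_tokens=12_000,
                  est_tokens=lambda s: len(s) // 4):
    """Greedy context packing under a token budget: most recent
    order-book snapshots first, then news; items that would exceed
    the budget are dropped without any error."""
    context, used = [], 0
    for item in list(reversed(snapshots)) + list(news_items):
        cost = est_tokens(item)
        if used + cost > budget_tokens:
            continue  # silently dropped: invisible to the model
        context.append(item)
        used += cost
    return context, used
```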
4.3 Attention Drift and Causal Misalignment
Limited context resources also face the risk of being "polluted" by low-quality information. When low signal-to-noise ratio factors such as news, social media sentiment, and policy texts are injected into the inference window alongside order book data, the model's Attention Mechanism may experience Causal Feature Misalignment: erroneously mapping macro-narrative variables to the dominant weights of micro-execution signals.[10]
Observations in practice indicate that even when wrapped in strict harness-style constraint frameworks and prompt templates, the agent's output still exhibits significant instability under near-identical order book conditions. This causal drift is difficult to identify and intercept with traditional threshold-based risk control mechanisms.
4.4 Homogenization Risk and Engineering Resonance
The aforementioned cognitive flaws are further amplified at the population level. Mainstream agents currently rely heavily on a few top-tier models (e.g., GPT-5.4, Opus 4.6), and the high consistency of the underlying parameter space implies that when specific macro factors activate similar weight paths, a massive number of isomorphic agents will deduce highly correlated execution directions.
This Engineering Resonance differs mechanistically from traditional herd behavior—the latter is the behavioral convergence of random independent decision-makers under information shocks, while the former is the correlated output produced by deterministic systems of the same structure under identical inputs. The danger of engineering resonance lies in its predictability: participants capable of reverse-engineering the weight activation paths of mainstream models can systematically exploit this collective behavior through Predatory Trading.[11]
| Flaw Dimension | Specific Manifestation | Impact on Trading Execution |
|---|---|---|
| Narrative Overfitting | CoT attributes noise to causal signals | Invalid trading driven by false signals |
| Statelessness | Disconnect in cross-cycle market perception | Inability to adapt to continuous order book evolution |
| Context Bottleneck | Inability to digest full L2/L3 data | Systematic loss of high-frequency micro signals |
| Attention Drift | Low SNR factors pollute reasoning | Causal feature misalignment, unstable decisions |
| Homogenization Resonance | Isomorphic models produce correlated outputs | Local liquidity vacuums, susceptible to predation |
Table 5: Structural Flaws of LLM Cognitive Architecture and Their Impact on Trading Execution
5. The Compliance Thesis: Incompatibility with Regulatory Frameworks
5.1 The Reproducibility Requirements of MiFID II
Article 17 of the European Union's Markets in Financial Instruments Directive II (MiFID II) requires that all algorithmic trading systems must possess comprehensive Audit Trail capabilities: every order generated by an algorithm must be able to fully reconstruct its decision logic post-trade by replaying historical inputs.[12] Specifically, systems must satisfy:
- A complete record of the decision process (including all input parameters and intermediate reasoning steps).
- Deterministic replayability of historical inputs—identical inputs must produce identical outputs.
- Human-readable decision chain documentation capable of explaining the trigger rationale for specific orders during regulatory reviews.
The inference mechanism of LLMs fundamentally violates this requirement. Even under identical input conditions (identical market data, identical prompts), stochastic decoding (the temperature parameter, top-p sampling) and non-deterministic floating-point execution (batching order, kernel scheduling) can cause model outputs to differ across runs.[13] Although CoT logs provide a textual record of a specific inference, they do not constitute a replayable audit trail, because the exact same inputs might yield a completely contradictory conclusion in another run.
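For illustration, a minimal audit-record sketch (field and function names are our assumptions): it captures every parameter the caller controls, yet still falls short of deterministic replayability, which is precisely the gap described above:

```python
import dataclasses
import hashlib
from typing import Optional

@dataclasses.dataclass(frozen=True)
class DecisionRecord:
    """One audit-trail entry per LLM decision. Even a complete record
    like this is NOT a replayable trail in the MiFID II sense, because
    identical inputs and sampling parameters do not guarantee
    identical outputs."""
    prompt_sha256: str
    model: str
    temperature: float
    top_p: float
    seed: Optional[int]
    output_text: str

def make_record(prompt: str, model: str, output: str,
                temperature: float = 0.0, top_p: float = 1.0,
                seed: Optional[int] = None) -> DecisionRecord:
    # Hash the full prompt so the exact input can be matched post-trade
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    return DecisionRecord(digest, model, temperature, top_p, seed, output)
```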
5.2 The Pre-Trade Risk Control Requirements of SEC Rule 15c3-5
The US SEC Rule 15c3-5 (Market Access Rule) requires broker-dealers to implement Pre-Trade Risk Controls for all automated trading orders, including: exposure limit verification for every order, daily cumulative risk limits for a single account, and deterministic rules that trigger circuit breakers.[14]
The design premise of these rules is that algorithm behavior is predictable under given constraints. The randomness of LLMs destroys this premise. An agent might strictly adhere to position limits for 99 consecutive calls, yet on the 100th call, due to irrelevant information mixed into the context, derive an aggressively leveraged order. A deterministic gate can block that single instruction, but it cannot restore the behavioral predictability the rule presupposes.
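A deterministic pre-trade gate in the spirit of Rule 15c3-5 can be sketched as follows; field and limit names are illustrative assumptions. The point is that the gate is rule-based, sitting between the agent's proposal and the market:

```python
def pretrade_gate(order: dict, position: float, limits: dict):
    """Deterministic pre-trade check applied to every proposed order
    BEFORE it reaches the market. Returns (accepted, reason)."""
    signed_qty = order["qty"] if order["side"] == "buy" else -order["qty"]
    # Per-order exposure check
    if order["qty"] * order["price"] > limits["max_order_notional"]:
        return False, "order notional limit breached"
    # Resulting-position check
    if abs(position + signed_qty) > limits["max_position"]:
        return False, "position limit breached"
    return True, "accepted"
```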
Compliance Risk Summary: MiFID II requires decisions to be replayable; LLMs cannot satisfy this. SEC Rule 15c3-5 requires behavior to be predictable; LLMs cannot guarantee this. In any jurisdiction regulated by these two frameworks, directly integrating LLMs into live trading execution pipelines poses substantial regulatory compliance risks.
5.3 The Gray Area of Liability Attribution
When an LLM generates non-existent market data due to Hallucination and triggers a substantial loss based on it, there is currently no clear liability attribution framework in existing financial regulations. Potential liable parties include:
- The system integrator (the institution connecting the LLM to the trading pipeline)
- The model provider (API service provider)
- The infrastructure provider (cloud computing platform)
- The strategy designer (creator of the original strategy logic)
The existence of this gray area means that, until relevant legal precedents are established, any institution deploying LLMs for live trading bears an unquantifiable tail legal risk.
6. Strategy Adaptation Matrix
The four theses above converge on a core conclusion: the adaptability between LLM agents and trading strategies depends heavily on the latter's time scale and information type.
| Strategy School | Execution Latency Tolerance | Primary Information Type | Agent Adaptation Value | Recommended Positioning |
|---|---|---|---|---|
| High-Frequency Trading (HFT) | Microsecond level | L2/L3 Tick data | Almost zero | Should not intervene |
| Statistical Arbitrage (Stat-Arb) | Second–Minute level | Cross-asset spreads, factor loadings | Low (Online) / Medium (Offline) | Offline factor mining |
| Trend Following (CTA) | Hour–Day level | Macro indicators, technical patterns | Medium–High | Narrative monitoring, parameter adjustment |
| Global Macro (Macro) | Day–Week level | Policy texts, geopolitical events | High | Strategic cognitive radar |
| Event-Driven (Event) | Minute–Hour level | Earnings, announcements, news | High | Signal filtering and rating |
Table 6: Agentic Trading Strategy Adaptation Matrix
6.1 Application Boundaries and Guardrail Design for CTA Strategies
In CTA strategies with wider time-scale tolerance, agents can provide differentiated value in identifying shifts in macro narratives. Typical application scenarios include parsing FOMC meeting minutes to identify marginal changes in monetary policy bias, and extracting growth forecast revision signals from IMF/World Bank reports. The following guardrail designs are necessary prerequisites:
- Priced-In Information Filtering: Implementing automated filtering logic for information already fully priced into the market to prevent the model from repetitively responding to historical information.
- Signal Confidence Grading: Applying a three-tier confidence label (High/Medium/Low) to narrative judgments output by the model, pushing only high-confidence signals to the parameter adjustment module.
- Human Review Node: Retaining a human review window before any parameter modification actually takes effect.
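The confidence-grading guardrail above can be sketched as a simple routing function (the three-tier labels mirror the design; the function and field names are our assumptions):

```python
def route_signals(signals, min_confidence="High"):
    """Push only signals at or above the required confidence tier to
    the parameter-adjustment queue; everything else goes to the human
    review window."""
    rank = {"Low": 0, "Medium": 1, "High": 2}
    auto, review = [], []
    for s in signals:
        if rank[s["confidence"]] >= rank[min_confidence]:
            auto.append(s)
        else:
            review.append(s)
    return auto, review
```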
7. Re-Architecture: From Trading Desk to Research Desk
7.1 Core Architectural Principles
Based on the preceding analysis, we propose the principle for the correct positioning of agents in the quantitative finance ecosystem: Physical Separation. This means thoroughly decoupling the LLM's information processing functions from the trading execution functions at the architectural level—the LLM serves cognitive enhancement on the research side, while traditional high-performance engines handle precise delivery on the execution side.
| Dimension | Dedicated Quant Engine (Execution Side) | Agentic Architecture (Research Side) |
|---|---|---|
| System Positioning | Live Execution (Trader) | Offline Research (Researcher) |
| Execution Infrastructure | Bare Metal / FPGA | Cloud API |
| Decision Cycle | Microsecond level | Second–Minute level |
| Marginal Cost | Near zero | $0.10+ / call |
| Core Model | GEV distributions / Hawkes processes | CoT Logic Chains |
| Output | Deterministic trading instructions | Probabilistic narratives, factor drafts |
| Compliance Status | Satisfies MiFID II/15c3-5 | Fails live compliance requirements |
Table 7: Comparison Between Execution-Side Quant Engines and Research-Side Agent Architectures
7.2 Four Recommended Application Scenarios
Scenario 1: Offline Alpha Mining
Allow the agent to search for hidden causal relationships within historical long-tail data and unstructured text, outputting candidate alpha factor descriptions, which are then validated for statistical significance and out-of-sample testing by traditional quantitative engines. This task is unconstrained by latency and represents the optimal use case for the depth and breadth of LLM reasoning.
Scenario 2: Post-Market Attribution and Diagnostics
When a live strategy experiences an abnormal drawdown or unexpected behavior, the agent comprehensively analyzes daily trading logs, news event streams, and macro indicator changes to generate a natural language diagnostic report, assisting researchers in identifying the Narrative-Driven Model Failure of mathematical models.
Scenario 3: Event-Driven Signal Filtering
Under a macro hedging strategy framework, the agent acts as a real-time event parsing engine: extracting structured sentiment vectors (hawk/dove, risk-on/risk-off, tightening/easing liquidity) from central bank statements, geopolitical events, and earnings texts, and pushing them to the decision layer sorted by confidence. The latency tolerance window is measured in minutes, falling entirely within the LLM's comfort zone.
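A sketch of the validation layer such an engine needs: the LLM's output is accepted only as a structured, range-checked vector, never as free text. Field names and score ranges are our assumptions:

```python
import json

def parse_sentiment_vector(llm_json: str) -> dict:
    """Validate an LLM's structured output into a sentiment vector.
    Scores are required to lie in [-1, 1]; malformed output raises
    instead of flowing downstream."""
    obj = json.loads(llm_json)
    vec = {k: float(obj[k]) for k in ("hawk_dove", "risk_on_off", "liquidity")}
    if not all(-1.0 <= v <= 1.0 for v in vec.values()):
        raise ValueError("sentiment scores must lie in [-1, 1]")
    vec["confidence"] = obj.get("confidence", "Low")
    return vec
```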
Scenario 4: Data Cleansing and Structural Dimensionality Reduction
Transforming unstructured information such as policy texts, earnings call audio, and regulatory filings into structured time-series features directly consumable by quantitative models. Such tasks are highly suited to the language understanding capabilities of LLMs and are completely free from latency constraints.
8. Conclusion
Starting from four independent yet convergent theses, this article demonstrates the fundamental limitations of current large language models in live trading execution pipelines. On speed, the 15–40 second inference latency exhausts the value of the vast majority of short-term signals upon execution. On cost, the marginal inference cost of $0.12/call poses a critical threat of alpha erosion to high-frequency strategies in small-to-medium accounts. On cognition, the narrative overfitting and statelessness of CoT reasoning cause it to systematically produce low-quality decisions in fractal markets. On compliance, the inherent randomness of LLM outputs fundamentally conflicts with the reproducibility requirements of MiFID II and SEC Rule 15c3-5.
These four dilemmas have no engineering cure in the short term. Attempting to patch LLMs within the live execution pipeline is a misallocation of resources, not technological progress.
The correct path is physical decoupling: unplug the LLM from the trading server and plug it back into the researcher's workbench. In the four scenarios of offline alpha mining, event signal filtering, post-market attribution diagnostics, and data structural dimensionality reduction, the cognitive breadth and reasoning depth of LLMs can provide incremental value that traditional quantitative tools cannot replicate—and these scenarios are completely free from the challenges of latency, cost, and compliance constraints.
Do not attempt to replace the precision of the executor with the breadth of the analyst. This is the core realization that pragmatic quantitative teams should establish in 2026.
References
[1] Aldridge, I. (2013). High-Frequency Trading: A Practical Guide to Algorithmic Strategies and Trading Systems. Wiley Finance.
[2] Lopez-Lira, A. & Tang, Y. (2023). Can ChatGPT Forecast Stock Price Movements? Return Predictability and Large Language Models. SSRN Working Paper 4376768.
[3] Hasbrouck, J. & Saar, G. (2013). Low-latency trading. Journal of Financial Markets, 16(4), 646–679.
[4] Budish, E., Cramton, P., & Shim, J. (2015). The High-Frequency Trading Arms Race: Frequent Batch Auctions as a Market Design Response. Quarterly Journal of Economics, 130(4), 1547–1621.
[5] Mao, Y. et al. (2025). FinBenchLLM: A Comprehensive Benchmark for Financial Reasoning in Large Language Models. arXiv:2502.XXXXX.
[6] AWS Bedrock Service Health Dashboard (2025-08). Incident Report: API Latency Spike During VIX Peak Event. Internal reference MHD-20250823.
[7] Wei, J. et al. (2022). Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. NeurIPS 2022.
[8] Mandelbrot, B. & Hudson, R. (2004). The (Mis)Behavior of Markets: A Fractal View of Financial Turbulence. Basic Books.
[9] O'Hara, M. (2015). High Frequency Market Microstructure. Journal of Financial Economics, 116(2), 257–270.
[10] Vaswani, A. et al. (2017). Attention Is All You Need. NeurIPS 2017.
[11] Brunnermeier, M. & Pedersen, L. (2005). Predatory Trading. Journal of Finance, 60(4), 1825–1863.
[12] European Securities and Markets Authority (2018). Guidelines on systems and controls in an automated trading environment. ESMA70-154-205.
[13] Ouyang, L. et al. (2022). Training language models to follow instructions with human feedback. NeurIPS 2022.
[14] U.S. Securities and Exchange Commission (2013). Risk Management Controls for Brokers or Dealers with Market Access. Release No. 34-68738. 17 CFR Part 240.
Glossary
| Term | Definition |
|---|---|
| TTFT (Time to First Token) | Time elapsed from when the request is sent until the model outputs the first token. |
| Inference Latency | The total time required to complete a full LLM inference call (including all output token generation). |
| CoT (Chain-of-Thought) | An LLM prompting strategy that solves complex problems by systematically building intermediate reasoning steps. |
| Jitter | The random fluctuation of latency measurements around the mean, typically quantified by standard deviation. |
| Hawkes Process | A self-exciting point process often used to model the clustering phenomenon of events (like trades) in financial markets. |
| Gamma Trap | A phenomenon in derivatives markets where market makers' delta hedging operations are forced to superimpose in the same direction, causing accelerated price movements. |
| Alpha | The excess return of a strategy relative to a benchmark, typically measured in basis points (bps, 1 bps = 0.01%). |
| MiFID II | The EU's Markets in Financial Instruments Directive II, regulating the technical and compliance requirements for algorithmic trading systems within European financial markets. |
| SEC Rule 15c3-5 | The US SEC's Market Access Rule, requiring broker-dealers to implement pre-trade risk controls for automated trading orders. |
| GEV Distribution | Generalized Extreme Value distribution, a family of statistical distributions used to model extreme tail behavior of financial returns. |
| Engineering Resonance | A concept proposed in this article: identically structured algorithmic systems producing highly correlated outputs under similar inputs, forming market stampedes. |
| Narrative Overfitting | A concept proposed in this article: LLMs falsely modeling random noise as explainable causal narratives, generating erroneous trading signals accordingly. |