Abductive Reasoning
A form of reasoning that starts with an observation and seeks the most plausible explanation, even if it cannot be guaranteed to be true. Abduction is about forming hypotheses—“what could explain this?” Sherlock Holmes famously described his method as deduction, but in reality it combined induction (spotting patterns), abduction (hypothesizing the cause), and deduction (testing the implications). Abductive reasoning is central to problem solving under uncertainty: it favors plausible inference when complete proof is not available. See Peircian Triad.
AGI (Artificial General Intelligence)
AGI refers to a machine’s ability to understand, learn, and apply knowledge across a wide range of tasks at human-level competence. Unlike narrow AI, which excels at one domain, AGI can flexibly transfer insights and skills from one context to another.
Aggregation design (OLAP)
The methodical selection of which attribute combinations to pre-aggregate and store so most queries hit fast summaries while staying within build time and storage budgets. It weighs your workload (query log), hierarchies, and partitions to pick a small set of high-value aggregations (e.g., Month × State × Category) and skip redundant ones you can roll up to (so you don’t also pre-aggregate Quarter if Month exists).
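The roll-up rule above can be sketched in code. This is a toy illustration, not a real aggregation designer: the hierarchy map and candidate aggregations are invented, and real engines also weigh query logs and storage budgets.

```python
# Sketch: prune candidate pre-aggregations that can be derived from
# finer ones (don't pre-aggregate Quarter x State if Month x State exists).
# The ROLLS_UP_TO hierarchy below is an invented example.

ROLLS_UP_TO = {
    "Month": "Quarter",
    "Quarter": "Year",
    "City": "State",
}

def derivable_from(coarse, fine):
    """True if every attribute in `coarse` is an ancestor of (or equal to)
    some attribute in `fine`, so `coarse` can be rolled up from `fine`."""
    def ancestors(attr):
        seen = {attr}
        while attr in ROLLS_UP_TO:
            attr = ROLLS_UP_TO[attr]
            seen.add(attr)
        return seen
    fine_covers = set()
    for a in fine:
        fine_covers |= ancestors(a)
    return set(coarse) <= fine_covers

def prune(candidates):
    """Drop any candidate aggregation derivable from another candidate."""
    return [agg for agg in candidates
            if not any(agg != other and derivable_from(agg, other)
                       for other in candidates)]
```

For example, `prune([("Month", "State"), ("Quarter", "State")])` keeps only `("Month", "State")`, since the Quarter-level summary rolls up from the Month-level one.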
AI Agent
An autonomous software entity that can perceive its environment, plan actions, and execute tasks—often by orchestrating multiple AI components. In a RAG pipeline, an AI agent might issue queries to a vector store or graph database, feed the results into an LLM, then post-process and deliver answers or trigger downstream workflows.
ASI (Artificial Super Intelligence)
ASI denotes a hypothetical intelligence that far surpasses the brightest human minds in every field—creativity, scientific reasoning, social skills, and more. It represents the point at which AI not only matches but greatly exceeds human capability across the board.
Balanced Scorecard (BSC)
A management framework for turning strategy into action by choosing a small set of objectives, measures (KPIs), targets, and initiatives across four lenses—typically Financial, Customer, Internal Processes, and Learning & Growth. The BSC gives leaders a recurring cadence (monthly/quarterly reviews) to see what’s working, fix what isn’t, and align teams and budgets to the plan. Think of it as the organization’s instrument panel and governance loop: what we aim to achieve, how we’ll measure it, who owns it, and when we’ll adjust.
In contrast, a strategy map is the storyboard—a visual of objectives linked in cause-and-effect (“improve capacity → better service → more referrals → growth”). The Balanced Scorecard is how you run that story: it assigns KPIs, targets, owners, and initiatives to those objectives and keeps the review cycle honest. In short, the map explains why outcomes should happen; the BSC ensures we measure, manage, and learn our way to them.

Chain of Strong Correlations (CSC)
A CSC is the data storyboard built inside the Enterprise Knowledge Graph—specifically derived from the Tuple Correlation Web (TCW) and Bayesian conditional probabilities, with context from the KG’s ontology/taxonomy, the Insight Space Graph (ISG), and the Data Catalog. Each edge carries strength, lag, and window, sketching plots like X uptick → Y follows → Z eases. CSCs don’t claim causation. Rather, they prioritize what to test and operationalize (alerts, Markov links, or rules) once the strongest scenes prove stable.
Complex (vs. Complicated and Simple)
Complex systems have many interacting parts whose relationships change over time. Their behavior emerges from feedback loops and context. Business organizations, living ecosystems, and production data platforms are complex — not just difficult, but dynamic. You can’t solve them once and for all; you can only observe, adapt, and learn as patterns evolve.
In comparison, simple systems have few parts and direct cause-and-effect relationships. Their behavior is predictable — like a light switch or a basic SQL query. If something goes wrong, the cause is easy to trace. Complicated systems have many parts, but those parts interact in fixed, knowable ways. A jet engine or a database optimizer is complicated — hard to understand, but ultimately analyzable with enough expertise and documentation.
Context Window
The context window is the span of text or tokens an AI model can “see” and consider at once when generating a response. It defines how much recent or retrieved information influences each prediction. A small context window limits awareness to short passages or single questions; a large one allows the model to reason across entire documents or multi-step conversations. Expanding the context window increases situational awareness but also demands more computation and careful context engineering to keep inputs relevant and frugal.
Cypher
A declarative graph query language designed for property graph databases, most notably Neo4j. Cypher uses pattern matching to traverse nodes and relationships, allowing users to express complex queries in a readable, SQL-like syntax. Unlike SPARQL, Cypher operates on labeled property graphs (LPGs), which include labels on nodes and key-value pairs on both nodes and relationships. Cypher is intuitive and fast for application developers, but is not part of the Semantic Web standards.
Data Frame
A data frame is a two-dimensional, tabular data structure (rows and columns) where each column can hold a different type (numbers, text, dates). It’s the go-to format in analytics and languages like R or Python (Pandas) for slicing, dicing, and transforming datasets before modeling or visualization.
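A minimal pandas sketch of the idea: mixed column types, a derived column, row slicing, and a grouped summary. The column names and values here are invented for illustration.

```python
# A small pandas data frame: each column holds its own type, and rows
# can be filtered, transformed, and aggregated before modeling.
import pandas as pd

df = pd.DataFrame({
    "region": ["East", "West", "East", "West"],
    "units":  [10, 7, 3, 5],
    "price":  [2.5, 4.0, 2.5, 4.0],
})
df["revenue"] = df["units"] * df["price"]       # derived column
east = df[df["region"] == "East"]               # slice rows by predicate
totals = df.groupby("region")["revenue"].sum()  # aggregate per group
```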
Deductive Reasoning
Reasoning from the general to the specific. If the premises are true and the logic is valid, the conclusion must be true. Deduction applies established rules or principles to reach certain outcomes (e.g., All humans are mortal; Socrates is a human; therefore, Socrates is mortal). In computing, Prolog exemplifies deductive reasoning—rules and facts yield guaranteed logical conclusions. See Peircian Triad.
DIKW / DIKUW
The DIKW hierarchy is a well-known model of cognition and learning: Data → Information → Knowledge → Wisdom. Data are raw signals, information organizes them into patterns, knowledge encodes models and structures, and wisdom applies judgment in context. An extended version, DIKUW, adds Understanding between knowledge and wisdom. Understanding interprets why patterns and rules hold, providing the bridge from structured knowledge to wise action. This additional layer highlights that intelligence requires not just storing and applying rules, but grasping their underlying meaning.
Drill-Down
In OLAP, drill-down means navigating from a higher-level summary to more detailed data along a hierarchy. For example, from Year → Quarter → Month → Day. Drill-down refines the view by expanding members into their child members.
Drill-Through
Drill-through jumps out of the cube entirely to view the underlying fact records that contributed to an aggregated cell. For example, clicking a sales total in an OLAP report to see the individual transaction rows from the source system. Drill-through bridges summarized OLAP data and raw case-level detail.
Drill-Up
The reverse of drill-down. Drill-up rolls detailed members back into their parent level in the hierarchy. For example, moving from Day → Month → Quarter → Year. Drill-up provides broader context and reduces detail.
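Both directions can be sketched with plain dictionaries over a toy Year → Quarter hierarchy (the fact rows below are invented sample data):

```python
# Drill-down and drill-up along a Year -> Quarter hierarchy.
from collections import defaultdict

facts = [  # (year, quarter, sales)
    (2024, "Q1", 100), (2024, "Q2", 120),
    (2025, "Q1", 130), (2025, "Q2", 90),
]

def rollup(rows, keyfn):
    """Sum sales by whatever key the caller's level requires."""
    totals = defaultdict(int)
    for row in rows:
        totals[keyfn(row)] += row[2]
    return dict(totals)

by_year = rollup(facts, lambda r: r[0])             # summary level
by_quarter = rollup(facts, lambda r: (r[0], r[1]))  # drill-down: expand years

# Drill-up: re-aggregate quarter cells back into their parent year.
drilled_up = defaultdict(int)
for (year, _quarter), sales in by_quarter.items():
    drilled_up[year] += sales
```

Drilling back up from the quarter level reproduces the year-level totals exactly, which is why aggregated levels are safe to derive from finer ones.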
Edge Computing
A computing architecture where data processing occurs close to the source of data—such as IoT sensors, mobile devices, or industrial equipment—rather than being sent to a centralized cloud or data center. The “edge” refers to the edge of the network, where latency is lower and response times are faster. This approach reduces bandwidth usage, improves real-time responsiveness, and enables autonomous behavior in disconnected or bandwidth-constrained environments. Edge computing is especially important for scenarios like industrial automation, autonomous vehicles, and real-time analytics where waiting for round-trips to the cloud isn’t practical.
Enterprise Knowledge Graph (EKG)
A connected, machine-readable map of what the business is—its people, products, customers, processes, rules, metrics, and the relationships among them—expressed with web identifiers (IRIs) so everything can be linked, queried, and reused across systems. Unlike a single database or a static glossary, an EKG unifies data + meaning: operational records, definitions, KPIs, policies, lineage, and external knowledge all live as nodes and edges that applications and analysts can traverse.
An EKG goes beyond ontologies and taxonomies. It also docks the organization’s stories—strategy maps, SBAR frames, causal claims, scenarios—as first-class, versioned artifacts, alongside pattern layers like ISG (what analysts notice), TCW (the Tuple Correlation Web), and Time Molecules (event sequences). That makes the graph not just descriptive but executable: you can diff proposed strategy changes, attach evidence, trace impacts to KPIs, and learn from outcomes.
The EKG becomes the enterprise’s shared semantic layer and memory: a place where data, definitions, and decisions connect, so people and machines can ask better questions, make faster decisions, and continuously improve the story the business runs on.
Executive Function
In neurology, executive function refers to a cluster of higher-order cognitive processes orchestrated by the prefrontal cortex that govern goal-directed behavior, self-regulation, and adaptive problem-solving. These “command center” skills enable planning, impulse control, and flexible thinking in complex, changing environments.
Key Components:
- Working memory: Temporarily holding and manipulating information.
- Inhibitory control: Suppressing distractions or impulsive actions.
- Cognitive flexibility: Switching strategies or perspectives as needed.
Executive function primarily involves the prefrontal cortex, with support from the basal ganglia, thalamus, and anterior cingulate; impairments (e.g., from TBI or ADHD) disrupt daily functioning.
Inductive Reasoning
Reasoning from specific observations to broader generalizations. Induction looks at repeated patterns and infers rules or probabilities (e.g., the sun has risen every day of my life, therefore it will rise tomorrow). Inductive reasoning is the basis of machine learning, where models are trained on historical data to predict future outcomes. Unlike deduction, induction does not guarantee certainty, but it provides evidence-based likelihoods. See Peircian Triad.
Key Performance Indicator (KPI)
A performance indicator is any metric that tells us how some part of a system is doing: page-load time, call-center wait time, daily ad spend, fuel consumption, error rate, and so on.
A Key Performance Indicator (KPI) is a performance indicator that’s been “promoted”. It is explicitly tied to a strategic goal, is important enough to watch continuously, and will actually trigger decisions if it drifts. In other words, all KPIs are performance indicators, but only a small subset of performance indicators are truly “key.”
I choose to generally refer to performance indicators as KPIs because “pi” is overloaded (the more famous pi). More importantly, I use “KPI” as a blanket term for any performance indicator that makes it onto the strategy map. Metrics, key or not, and even goals and objectives are performance indicators. For example, achieving higher net profit is a goal, but it’s also something we measure. If it’s on the map, it’s “key” by definition, because it participates in the planning and trade-off structure rather than just being a background metric.
Knowledge Graph
A structured network of entities (nodes) and their interrelations (edges), often enriched with attributes and semantic context. Knowledge graphs enable machines to understand and traverse complex domains by encoding facts and their relationships in a graph format, powering search, recommendation, and reasoning applications.
Leaf-Level
In OLAP cubes, the leaf-level refers to the lowest level of detail stored in the cube—the point at which no aggregations have been applied. Each row at the leaf corresponds directly to a fact record at the cube’s base grain (e.g., individual transactions, line items, or events). From the leaf-level upward, higher-level summaries are derived through aggregations along hierarchies.
At leaf-level, the cube can be as large as the underlying fact table, since it retains all raw dimensional keys and measures. Querying directly at this level is equivalent to working with the base fact detail, while aggregated levels above provide faster, smaller, and reusable summaries.
MOLAP (Multidimensional OLAP)
MOLAP stands for Multidimensional Online Analytical Processing. In this approach, data is pre-aggregated and stored in a specialized multidimensional structure (often called a “cube”) rather than a standard relational database. These cubes are optimized for fast query performance, typically enabling split-second responses for complex, multi-level aggregations across multiple dimensions.
Because the aggregations are materialized ahead of time, MOLAP excels at query speed and predictable performance, even for large or complex queries. However, it requires significant processing time up front to build and refresh the cubes, and it introduces storage overhead—you’re effectively creating and maintaining a duplicate, summarized version of your data.
Tools like Microsoft SSAS (Multidimensional mode) and legacy enterprise BI platforms are examples of MOLAP implementations.
Key trade-off: MOLAP delivers blazing-fast queries and rich multidimensional capabilities, but at the cost of data freshness, flexibility, and maintenance complexity.
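The MOLAP trade-off can be sketched in a few lines: pay the build cost once, then answer queries by lookup instead of scanning. The fact rows and dimension names below are invented, and a real engine would materialize only a selected subset of groupings.

```python
# MOLAP-style sketch: materialize every grouping up front (the "cube
# build"), so query time becomes a dictionary lookup, not a scan.
from collections import defaultdict
from itertools import combinations

facts = [
    {"month": "Jan", "region": "East", "sales": 10},
    {"month": "Jan", "region": "West", "sales": 7},
    {"month": "Feb", "region": "East", "sales": 4},
]

def build_cube(rows, dims):
    """Pre-aggregate sales for every grouping of `dims`."""
    cube = {}
    for r in range(len(dims) + 1):            # from grand total () to full detail
        for keyset in combinations(dims, r):
            agg = defaultdict(int)
            for row in rows:
                agg[tuple(row[d] for d in keyset)] += row["sales"]
            cube[keyset] = dict(agg)
    return cube

cube = build_cube(facts, ("month", "region"))
jan_total = cube[("month",)][("Jan",)]   # answered from the prebuilt summary
grand_total = cube[()][()]               # the root (All/Total) cell
```

Note the cost side of the trade-off is visible too: the cube stores a copy of the data at every grouping, and must be rebuilt when the facts change.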
NoLLM (Not Only LLM)
NoLLM says LLMs aren’t all of AI—they’re one powerful thread in a tapestry woven from earlier “AI summers” that still matter: rules/Expert Systems, Semantic Web/KGs, classical ML, CEP/stream processing, planning/optimization, and more. In The Assemblage of Artificial Intelligence, NoLLM means composing these strands so each does what it’s best at: LLMs translate intent and bridge humans and systems; CEP reacts in real time; rules enforce policy; KGs carry meaning; classical models quantify risk; optimizers choose actions. The point isn’t nostalgia—it’s complementarity: by orchestrating proven parts with LLMs (not replacing them), you get systems that are faster, more governable, and less brittle than LLM-only stacks.
OLAP (OnLine Analytical Processing)
A read-optimized layer in a data warehouse built on dimensional models and data marts, where:
- Dimensions (e.g. Time, Product, Region) slice the data cube
- Measures (e.g. Sales, Profit) live at each cell
- You slice, dice, drill and pivot for fast, ad-hoc analytics
OLAP systems are denormalized, batch-loaded, and tuned for complex queries over large datasets; OLTP systems are normalized, handle row-level transactions, and optimize for high-volume inserts/updates. OLAP is the analysis of transactions by dimensional slicing and dicing, while OLTP is the input and maintenance of transactions and entities of a database.
O(n) (Linear Time Complexity)
Denotes an algorithm whose running time (or space usage) grows proportionally with the size of its input, n. In practical terms, if you double the amount of data, an O(n) process will take roughly twice as long—common examples include single-pass loops and simple scans through a list.
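The proportional growth is easy to see by counting operations in a single-pass scan (the step counter below is instrumentation for illustration, not something a real implementation would carry):

```python
# O(n) illustrated by counting work: a single-pass maximum scan touches
# each element exactly once, so doubling n doubles the step count.

def max_with_count(values):
    best, steps = values[0], 0
    for v in values:
        steps += 1          # one comparison per element: linear growth
        if v > best:
            best = v
    return best, steps

_, steps_small = max_with_count(list(range(1000)))
_, steps_big = max_with_count(list(range(2000)))
# steps_big is exactly twice steps_small: work grew with input size
```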
On-the-Fly Aggregation
An ephemeral aggregation computed on the OLAP node, from the root aggregation or the source data, when no persisted aggregation exists; it is cached locally only until the server restarts.
OODA
An acronym for Observe, Orient, Decide, Act, first developed by U.S. Air Force Colonel John Boyd to describe the cycle of decision-making in combat. In business and AI contexts, the OODA loop represents a dynamic model of intelligence under pressure: observing signals, orienting by framing them against context, deciding on a course of action, and acting to change the environment. The cycle then repeats, with each action creating new observations. OODA is powerful because it is recursive and adaptive—it captures not just reaction but continuous learning, making it a natural structure for knowledge graphs and reasoning systems that must operate in real time.
OLTP (Online Transaction Processing)
High-volume, fine-grained, write-heavy systems that handle day-to-day transactions (e.g., adding an order, updating a balance) with strict consistency and low latency. In contrast, OLAP scans and joins large data volumes and returns a relatively small result set (aggregates, summaries), whereas OLTP reads a small, fixed set of rows, updates them, and writes back immediately.
Parameters
In a language model, parameters are the internal numerical values—essentially weights—that determine how the model maps input text to output text. Each parameter represents a learned connection between features of language, adjusted during training through gradient descent. The more parameters a model has, the richer and more nuanced its representation of relationships between words, ideas, and contexts—but also the greater its computational and energy cost.
Pearson Correlation
A statistical measure of the linear relationship between two continuous variables, denoted by r. It ranges from –1 (perfect negative linear association) through 0 (no linear association) to +1 (perfect positive linear association). Computed in O(n) time by comparing paired deviations from each variable’s mean, Pearson’s r tells you how strongly—and in which direction—two series move together.
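The definition translates directly into a single O(n) pass. This is a from-scratch sketch using only the standard library (production code would typically call a stats package instead):

```python
# Pearson's r from its definition: covariance of the paired deviations,
# normalized by each series' spread.
import math

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

A perfectly linear series (e.g., y = 2x + 1) gives r = 1.0; reversing it gives r = –1.0.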
Peircian Triad
Charles S. Peirce’s three modes of inference that drive inquiry:
- Abduction — hypothesis formation (If A were true, B would be expected; observe B ⇒ maybe A).
- Deduction — derive testable consequences from a hypothesis (If A then B; A ⇒ B).
- Induction — evaluate and update credibility from data (How strongly does evidence support A?).
Practical prompts:
- Abduction: “What are the top 3 plausible explanations that would make the observation unsurprising?”
- Deduction: “For each, what unique prediction could distinguish it from the others?”
- Induction: “Given the data, how much did my belief move? What error bars remain?”
Performance management
An ongoing, organization-wide system for turning strategy into results by setting goals, measuring what matters, learning from outcomes, and adjusting course. It links goals → measures (KPIs) → reviews → decisions → actions across levels (enterprise, team, individual) so work stays aligned and improves over time.
Core elements:
- Direction: clarify strategy and translate it into objectives, success criteria, and ownership.
- Measurement: define a small set of KPIs (leading and lagging), targets, and data sources.
- Cadence: run regular reviews (weekly/monthly/quarterly) to assess progress and remove blockers.
- Decisions & actions: choose interventions, allocate resources, and record why changes are made.
- Learning loop: compare expected vs. observed results, update assumptions, and refine the plan.
Common frameworks & tools:
- Balanced Scorecard (BSC) with strategy maps to visualize cause-and-effect among objectives.
- Objectives and Key Results (OKRs) — Objectives = qualitative goals; Key Results = measurable outcomes that signal success.
- Management dashboards, reviews (QBRs), retrospectives, and issue/risk logs.
Good performance management creates alignment, accountability, and learning: people know what matters, see progress, and can act on evidence—not just opinion. Poor practice reduces it to score-keeping; good practice makes it a feedback system that continuously improves execution.
PMML (Predictive Model Markup Language)
An older, vendor-neutral XML wrapper for classic ML models: you train in Tool A (SPSS/SAS/KNIME, etc.), export PMML, and score in System B without retraining. It carries the schema, feature transforms, and the model itself—great for audit trails and for the “don’t drift between train and serve” problem. PMML shines with regressions, trees, scorecards, and other traditional algorithms, but it never kept pace with Pythonic pipelines and modern deep learning. Today it’s still alive but niche—you’ll see it in banks/insurers with long-lived stacks; new projects usually pick ONNX, MLflow flavors, or just ship a containerized Python scorer. Net: solid for legacy governance, not the default for greenfield.
Premature Convergence
A condition in evolution, optimization, or technology where systems settle too early on a “good enough” solution. In biology, it describes species that adapt narrowly and lose future flexibility. In AI, it refers to the widespread adoption of early-stage methods—such as Large Language Models—that work well enough to dominate, but risk locking out richer or more balanced approaches. Premature convergence is not failure; it is limitation. It warns us that progress may stall when success arrives too soon.
RAG (Retrieval-Augmented Generation)
A hybrid AI approach that combines a language model with an external knowledge store (documents, databases, or a knowledge graph). When given a prompt, the system first retrieves relevant information and then augments the model’s output with those facts—reducing hallucinations and grounding responses in up-to-date, sourceable data.
Resource Description Framework (RDF)
A foundational Semantic Web model for representing information as subject–predicate–object triples. RDF provides a flexible, graph-based syntax (e.g., Turtle, RDF/XML) to encode facts about resources in a machine-readable way.
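The triple model is small enough to sketch without any RDF library. The identifiers below are invented stand-ins (real RDF uses full IRIs and standard vocabularies), and `None` plays the role of a query variable:

```python
# The RDF model in miniature: facts as (subject, predicate, object)
# triples, with a tiny pattern matcher (None = wildcard).

triples = {
    ("ex:Socrates", "rdf:type", "ex:Human"),
    ("ex:Human", "rdfs:subClassOf", "ex:Mortal"),
    ("ex:Socrates", "ex:teacherOf", "ex:Plato"),
}

def match(s=None, p=None, o=None):
    """Return every triple agreeing with the non-None positions."""
    return {t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)}

about_socrates = match(s="ex:Socrates")   # everything asserted about one subject
```

Pattern matching over triples like this is exactly the shape of query that SPARQL standardizes.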
ROLAP (Relational OLAP)
ROLAP stands for Relational Online Analytical Processing. Unlike MOLAP (Multidimensional OLAP), which stores pre-aggregated cubes in proprietary formats, ROLAP operates directly on relational databases—calculating aggregations on the fly using SQL at query time.
ROLAP leverages standard SQL engines to serve dimensional queries, often translating cube-like structures (dimensions, hierarchies, measures) into joins and GROUP BY operations. This avoids pre-processing and cube build times but can result in slower initial queries, especially without proper indexing or caching.
In practice, ROLAP queries often hit summary tables or intermediate caches—in memory or on disk—especially for frequently accessed aggregations. However, these cached layers are usually transient, meaning they disappear after a server restart unless explicitly persisted. Many BI engines and semantic layers blend ROLAP behavior behind the scenes, using smart query generation and temporary caching to mimic cube performance.
Key trade-off: ROLAP sacrifices some query speed in exchange for greater flexibility, reduced storage overhead, and real-time reflection of changes in source data.
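ROLAP's "aggregate at query time with SQL" behavior can be shown end to end with the standard-library sqlite3 module. The table and column names are invented sample data:

```python
# ROLAP in miniature: the "cube" is just SQL over a relational fact
# table, and each dimensional query becomes a GROUP BY at query time.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (month TEXT, state TEXT, amount REAL)")
con.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("Jan", "NY", 100.0), ("Jan", "CA", 80.0), ("Feb", "NY", 60.0),
])

# No prebuilt cube: the aggregation is computed when the query runs.
rows = con.execute(
    "SELECT month, SUM(amount) FROM sales GROUP BY month ORDER BY month"
).fetchall()
```

Nothing was precomputed, so the first run pays the full scan cost; a MOLAP engine would instead have materialized the month-level totals during cube processing.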
Root aggregation
The grand total of a measure across all dimensions—i.e., the cube’s All/Total level. In the aggregation lattice it’s the root node (∅ attribute set): a single value like “Total Sales (all products, regions, dates).” Engines often compute/store it implicitly (ROLAP: GROUP BY (); MOLAP: a stored cell). It’s the baseline cell every query can roll up from, useful for caching, quick totals, and sanity checks. Think of it as a flattened cube.
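The sanity-check role of the root cell is easy to demonstrate (the sample facts below are invented): every grouped summary must sum back to the same grand total.

```python
# The root aggregation is the empty grouping: one cell totaling every
# fact. Any rollup must reconcile to it, making it a cheap sanity check.

facts = [("Jan", "NY", 100.0), ("Jan", "CA", 80.0), ("Feb", "NY", 60.0)]

root = sum(amount for _month, _state, amount in facts)   # the All/Total cell

by_month = {}
for month, _state, amount in facts:
    by_month[month] = by_month.get(month, 0.0) + amount

assert sum(by_month.values()) == root   # every rollup meets at the root
```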
SBAR (Situation-Background-Assessment-Recommendation)
A structured communication framework originally developed in healthcare to standardize handoffs and urgent escalations. Users succinctly describe the Situation, provide relevant Background, share their Assessment of the problem, and offer a clear Recommendation—ensuring concise, focused, and actionable dialogue.
Self-Supervised Learning
A technique where the system creates its own learning signals from unlabeled data by hiding part of the input and training itself to predict the missing part. This is how modern language models, vision models, and audio models scale: predict the next word, fill in the masked patch of an image, or reconstruct missing audio. It’s not supervised by humans, but the model is “supervising itself” by turning raw data into prediction tasks. In human terms, it’s like learning by completing patterns—predicting what comes next in a sentence, or guessing what’s behind an occluded object—long before anyone explains the rules.
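The "data labels itself" idea can be shown with a bigram counter standing in for the model: raw, unlabeled text becomes a next-word prediction task with no human annotation. The one-line corpus is invented, and a real model learns far richer representations than these counts.

```python
# Self-supervision in miniature: each word pair in unlabeled text is a
# free (input, target) training example for "predict the next word".
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ran".split()

next_word = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    next_word[current][following] += 1   # the data supplies its own label

def predict(word):
    """Most frequent continuation observed after `word`."""
    return next_word[word].most_common(1)[0][0]
```

Here `predict("the")` returns `"cat"`, the continuation seen most often; scaled up by many orders of magnitude, this is the same training signal behind next-token prediction in LLMs.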
Semantic Web
An extension of the World Wide Web that adds formal semantics to data, allowing information to be shared and reused across application, enterprise, and community boundaries. It relies on standardized data models and vocabularies so machines can interpret and integrate information from heterogeneous sources.
Slice and Dice (Query Pattern)
Slice and dice is a fundamental query pattern in Business Intelligence (BI) and OLAP (Online Analytical Processing) that refers to filtering (slicing) and regrouping (dicing) multidimensional data to explore it from different perspectives. Technically, it is often implemented through SQL GROUP BY queries and filtering conditions, forming the basis of most OLAP cube interactions.
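In pandas terms, slicing is a row filter and dicing is a regrouping; the sample data below is invented:

```python
# Slice (filter on a dimension) and dice (regroup by other dimensions),
# the pandas analogue of SQL filtering plus GROUP BY.
import pandas as pd

df = pd.DataFrame({
    "year":    [2024, 2024, 2025, 2025],
    "region":  ["East", "West", "East", "West"],
    "product": ["A", "A", "B", "B"],
    "sales":   [10, 7, 4, 6],
})

sliced = df[df["year"] == 2024]                           # slice: fix one dimension
diced = df.groupby(["region", "product"])["sales"].sum()  # dice: regroup the rest
```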
SPARQL (SPARQL Protocol and RDF Query Language)
A W3C-standard query language for retrieving and manipulating data stored in RDF (Resource Description Framework) format. SPARQL operates over triple stores and is designed to extract patterns from semantic graphs, similar to how SQL works with relational databases. It supports filtering, aggregation, subqueries, and federated queries across multiple RDF sources. While powerful for declarative graph querying, it lacks native support for recursion or rule-based reasoning.
SSAS MD (SQL Server Analysis Services Multidimensional)
The original OLAP engine from Microsoft SQL Server Analysis Services, built on the multidimensional (MD) model. SSAS MD organizes data into cubes, dimensions, measures, and hierarchies, and uses the MDX query language. It supports MOLAP, HOLAP, and ROLAP storage modes, and is optimized for drill-down style exploration—sums, counts, averages, and other aggregations across large datasets. While later replaced in many deployments by SSAS Tabular (using DAX and columnar storage), SSAS MD remains a powerful engine for complex cube designs, advanced calculations, and traditional OLAP-style analytics.
Supervised Learning
A form of machine learning where the algorithm is trained using labeled examples: inputs paired with the correct answers. The model’s job is to learn the mapping from input to label. Classic examples include “cat vs dog” image classifiers, sentiment labels on text, and medical diagnosis models. It resembles how humans learn when someone explicitly tells us “This sound means ‘dog,’ this thing is hot, don’t touch that.” Supervision provides the target.
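A toy sketch of the labeled-examples idea: a 1-nearest-neighbor rule that maps a new input to the label of its closest training point. The 2-D points and labels are invented, and real classifiers learn far more structure than a distance lookup.

```python
# Supervised learning in miniature: labeled (features, label) pairs,
# and a 1-nearest-neighbor rule as the simplest possible "model".

train = [((1.0, 1.0), "cat"), ((1.2, 0.9), "cat"),
         ((5.0, 5.0), "dog"), ((5.3, 4.8), "dog")]

def classify(point):
    def dist2(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    # The label comes straight from the nearest labeled example.
    _, label = min(train, key=lambda ex: dist2(ex[0], point))
    return label
```

The supervision is the label column itself: without it, the same points could only be clustered, not classified.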
SWRL (Semantic Web Rule Language)
An extension to OWL (Web Ontology Language) that allows users to define Horn-like rules (if-then statements) over RDF and OWL ontologies. SWRL enables basic inference on top of ontologies—such as classifying individuals or inferring new relationships—but is limited in expressivity (no recursion, no negation-as-failure) and computationally expensive at scale. Unlike Prolog, which is a full logic programming language, SWRL is constrained by OWL’s open-world and monotonic reasoning assumptions.
Transference of Cost
In performance management, this refers to improving one measure by offloading its burden onto another part of the organization. For example, reducing call-center handle time may look like efficiency, but if it leaves more issues unresolved, the cost reappears later in escalations, churn, or diminished customer satisfaction. Transference of cost exposes the hidden trade-offs behind KPI gains and reminds us that true performance improvement comes from systemic balance, not shifting burdens.
Unified Dimensional Model (UDM)
A conceptual layer introduced in SQL Server Analysis Services Multidimensional (SSAS MD) that presents enterprise data as a single, consistent dimensional model. The UDM integrates disparate relational sources into an OLAP cube structure, exposing measures, hierarchies, and KPIs through dimensions and facts. It allows users to query data using MDX as if all the information were contained in one unified cube, regardless of the underlying sources.
The UDM was more than just a modeling layer; it acted as an early semantic layer, abstracting complex schemas into business-friendly terms while providing the performance benefits of MOLAP. Many modern semantic layer approaches, especially in MOLAP systems like Kyvos, can trace their lineage back to the UDM concept—using pre-aggregated cube structures not only for speed but also to enforce consistent business logic across tools.
Unsupervised Learning
A type of learning where the model receives data without labels and must find structure on its own. It clusters, groups, compresses, or discovers patterns purely from the shape of the data. Examples include grouping customers by behavior, identifying anomalies, or discovering latent topics in documents. It’s similar to how an infant observes the world before understanding language: seeing shapes, hearing sounds, noticing similarities without anyone naming them.
Web Ontology Language (OWL)
A richer ontology language built on RDF that adds formal logic constructs—classes, properties, restrictions, and axioms—for defining complex vocabularies and enabling automated reasoning. OWL lets you specify hierarchies, cardinalities, and constraints, making it possible to infer new knowledge from existing graph data.