Key Takeaways from My Two Upcoming DMZ 2025 Sessions

I’m in the middle of prepping for my two sessions at Data Modeling Zone 2025 (March 4-6, 2025). Both sessions are very tightly packed, with still so much more to say than the time allows. So I thought I’d write a blog on key takeaways for attendees to read prior to the sessions.

One of my sessions is essentially a three-hour live version of my book, Enterprise Intelligence. However, the three hours only touches on the main points. A thorough coverage of the book would take at least a full week … and nobody wants that … hahaha. But the session covers enough that attendees walk away with a strong sense of direction for applying AI in the enterprise.

In a nutshell, Enterprise Intelligence advocates implementing AI in the enterprise on top of a solid BI foundation. The following blogs were posted after the book’s publication (June 21, 2024) in support of it:

The other session is a one-hour presentation titled Knowledge Graph Structures Beyond Taxonomies and Ontologies. It’s mostly tied to my upcoming book, Time Molecules, which should be out in the late April to early May 2025 timeframe. Like Enterprise Intelligence, which describes the creation of an Enterprise Knowledge Graph (EKG), Time Molecules is also about knowledge graph structures.

I’ve selected a set of powerful structures to present in the session as examples of graph structures “beyond taxonomies and ontologies”. But they aren’t randomly selected. They fit into a fairly well-known framework applied in critical, high-performance situations. I’ve posted a few blogs as previews of some of the structures we’ll talk about:

There are many more such structures, but these cover the major issues that have driven me since 2004, maybe since 1998. As I write this (Feb 21, 2025), there’s still a week to go before Data Modeling Zone 2025.

Here are the major over-arching takeaways of my two sessions.

Knowledge is Cache

Knowledge is the encoding of what we’ve learned. It could range from something as simple as witnessing an event, to patterns we’ve observed, to entire processes we’ve developed. We encode that knowledge and save it for future application by ourselves, and hopefully share it with others so they don’t need to stumble upon the same knowledge on their own.

We cache our hard-won knowledge by:

  • Encoding it as a web of synapses in our brains—automatic encoding.
  • Teaching it to someone else (where it’s now encoded in their brain).
  • Encoding it in art—ex. cave art, petroglyphs, icons/memes/logos, …
  • Encoding it in a natural language—books, articles, emails, X posts, Post-it notes…
  • Capturing it as software code.
  • Letting modern ML architectures like transformers (LLMs) encode it by mapping relationships in data (structured and unstructured) into models.
  • Mapping it out in some sort of node-relationship graph (i.e., a knowledge graph).

The last item is the main theme of the two sessions. It’s about building knowledge graphs—versatile (very extensible) and deterministic (meaning, directly engineered/designed/authored/programmed) structures in which we encode knowledge. This is the opposite of LLMs—probabilistic and indirectly trained, not programmed. But they aren’t opponents—rather, they have a powerful symbiotic relationship.
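To make “directly authored” concrete, here’s a minimal sketch of a knowledge graph as a node-relationship structure. It’s my own illustration, not something from the book; it assumes Python with the networkx library, and the node and relation names are hypothetical.

```python
# A tiny, hand-authored knowledge graph: nodes connected by typed relationships.
# Hypothetical example; assumes the networkx library (pip install networkx).
import networkx as nx

kg = nx.DiGraph()

# Each edge is an explicit, human-authored statement of knowledge.
kg.add_edge("Order", "Customer", relation="placed_by")
kg.add_edge("Order", "Product", relation="contains")
kg.add_edge("Product", "Supplier", relation="sourced_from")

# Deterministic retrieval: the answer is engineered, not predicted.
for subject, obj, attrs in kg.edges(data=True):
    print(f"{subject} --{attrs['relation']}--> {obj}")
```

The point is the determinism: every statement in the graph is there because someone put it there, which is exactly what makes it a trustworthy complement to a probabilistic LLM.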

Knowledge graphs have been around for a long time. The problem is that it’s very difficult to serialize the complex knowledge developed in our complex brains into a fairly “flat” structure. Knowledge graphs are even harder to maintain, since the world is constantly changing and so should our knowledge of the world. But recent technologies have converged to drastically ease their authoring and maintenance:

  • Enterprise-class graph databases – Enable the storage and querying of huge knowledge graphs with billions of nodes. This opens the door to storing knowledge beyond taxonomies and ontologies.
  • Semantic web – The adoption of OWL, RDF, SPARQL, etc. enables knowledge graphs authored by millions of different parties to link to each other (a small RDF/SPARQL sketch follows this list).
  • LLMs – LLMs are orders of magnitude better than we are at reading very large knowledge graphs. Anything more than a couple dozen nodes is too much for us to take into our brains.
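Here’s that promised RDF/SPARQL sketch. It’s my own illustration using the rdflib Python library; the namespace, resources, and predicates are hypothetical, purely to show the mechanics of triples and querying.

```python
# Hypothetical triples describing an enterprise data asset, plus a SPARQL query.
# Assumes the rdflib library (pip install rdflib).
from rdflib import Graph, Namespace, Literal

EX = Namespace("http://example.org/enterprise/")
g = Graph()

# Each triple is (subject, predicate, object) -- the semantic-web "atom" of knowledge.
g.add((EX.SalesMart, EX.derivedFrom, EX.OrdersTable))
g.add((EX.OrdersTable, EX.ownedBy, EX.FinanceTeam))
g.add((EX.SalesMart, EX.description, Literal("Curated sales facts")))

# SPARQL lets any party query graphs they didn't author.
results = g.query("""
    PREFIX ex: <http://example.org/enterprise/>
    SELECT ?asset ?owner WHERE {
        ?asset ex:derivedFrom ?source .
        ?source ex:ownedBy ?owner .
    }
""")
for asset, owner in results:
    print(asset, owner)
```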

Knowledge is something that some party spent time, energy, and other resources to figure out. If we didn’t somehow cache that knowledge, we’d need to reinvent it every time. But we and most other creatures have the ability to learn to varying extents. If we couldn’t learn, we’d never move forward. We’d spend our energy and resources reinventing the wheel countless times over.

What I stated is very obvious, very duh. But I think most people who haven’t worked on developing AI technologies and/or knowledge graphs subconsciously assume that knowledge is the domain of human brains, and that databases (even the Internet could be thought of as a database) are just glorified libraries in which information is housed. In other words, when we have a problem to solve, the heavy intellectual lifting is up to us, but we access databases, the Internet, books, and other people for information.

Don’t Hand Over the Keys to AI Just Yet

At this time, AI, specifically LLMs, aren’t ready to take on most roles that deal with any sort of ambiguity. In other words, any role that isn’t a good candidate for automation. As duh as this sounds, it’s a fairly safe bet that if a human is performing a job today, it’s because the job is unpredictably ambiguous and therefore not yet automatable.

LLMs at the time of writing are just smart enough to be dangerous, as the saying goes. For example, since ChatGPT burst onto the scene back in November 2022, I’ve written two books on AI applied to BI. Almost every day, I ask it for sanity and fact checks. This is tricky because these books aren’t about documenting history or an existing product. It easily loses the subtlety of what I’m after and responds from whatever “bucket” it has currently been trained with. It can’t think far outside of its training data.

Worse than that, it doesn’t seem to understand that its training data could be flawed. I’ve asked ChatGPT and Grok whether they understand that their training data could be flawed. They insist that they’re “trained on very many terabytes of high-quality data and so, statistically, blah, blah, blah”.

It’s trained on a data set comprising a very incomplete truth—in fact, I assume all information is imperfect. A very small percentage of people have published books and articles. Before the advent of avenues such as Kindle, anything published had to have mass appeal so that it was worth publishing. That’s certainly not conducive to complete truth. And that mentality goes for just about any product making it into production at a big company. That’s not bad; it completely makes sense. Why would a company make a product that probably won’t sell many units when it’s constrained by time and resources?

Granted, what makes it into published books must be very relevant. But each book is just one point of view or solution. It’s what went viral for whatever reasons. For every published resource there are countless other valid views and solutions that probably would have won in another place and time. Nonetheless, we’ve all said or at least thought some version of, “If it’s in print, it must be true.”

I think we’ve already reached AGI. How many people can ace the bar exam and the medical boards, write very good code, and know so much about so many things? As I mentioned, it still lags when I ask it questions that push it beyond its training. But so do we.

At this time, it’s good enough to be of tremendous assistance. For example, I’ve written about a SQL Server performance tuning “knowledge graph” that I developed back in 2004. It took me months to create what I’d even call an alpha version. I suspect that if I had had the means to create an LLM back then, trained on the substantially smaller amount of data available at the time, it would have reduced my development time to a few weeks. Yes, the available data was much smaller, but it was high in quality—ex. one of the best technical books ever, published in 2003: The Guru’s Guide to SQL Server Architecture and Internals, by Ken Henderson.

AI in the Enterprise Should Be Applied on Top of a Solid BI Foundation

Over the past few decades, Business Intelligence has honed the art of processing data from raw OLTP form into refined forms conducive to analytics. BI is a cross-functional discipline involving subject matter experts, business stakeholders, analysts, and multiple types of highly specialized data engineers. Since the foundations set by Bill Inmon and Ralph Kimball, and more recently Dan Linstedt and Zhamak Dehghani, the BI world has encountered and dealt with many new issues as the world evolved.

BI may sound antiquated since it’s been overshadowed since around 2010 by Big Data, data science and machine learning, and now AI. But it’s still there. It’s still the highly curated, and therefore most trusted, data source from which important decisions are made. It’s just the term “BI” that’s been overshadowed by the newer buzzwords.

A big complaint about BI is that it takes too long to surface data to analysts. That has two very different meanings. The first is that there is a lag before new data makes it into the BI database, because it has to run through complex ETL processes that might only run at given intervals, such as every hour or every day. So the users don’t have “real time data”. This is a valid problem, but with today’s hardware and advanced tools, ETL can often be run in micro-batches, every few minutes or so, which can provide at least “near enough to real time” information, as sketched below.
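Here’s the micro-batch sketch I just mentioned. It’s a rough illustration only, assuming a DB-API-style connection and hypothetical oltp_orders/bi_orders tables with an updated_at watermark column; a real pipeline would be owned by your ETL tool or orchestrator of choice.

```python
# A rough micro-batch ETL loop: extract only rows changed since the last run.
# Hypothetical tables and columns; a DB-API connection (conn) is assumed.
import time

BATCH_INTERVAL_SECONDS = 300  # every five minutes => "near enough to real time"

def run_micro_batch(conn, last_watermark):
    """Extract rows changed since the last watermark, transform, and load them."""
    cur = conn.cursor()
    cur.execute(
        "SELECT id, amount, updated_at FROM oltp_orders WHERE updated_at > %s",
        (last_watermark,),
    )
    rows = cur.fetchall()

    # Transform: trivial example -- normalize dollar amounts to cents.
    transformed = [(row[0], int(row[1] * 100), row[2]) for row in rows]

    cur.executemany(
        "INSERT INTO bi_orders (id, amount_cents, updated_at) VALUES (%s, %s, %s)",
        transformed,
    )
    conn.commit()

    # Advance the watermark to the newest change we just processed.
    return max((row[2] for row in rows), default=last_watermark)

# Scheduling sketch (in practice an orchestrator would own this):
# while True:
#     watermark = run_micro_batch(conn, watermark)
#     time.sleep(BATCH_INTERVAL_SECONDS)
```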

The other meaning, the tougher meaning, refers to the time between when a business realizes it needs to onboard new data into the BI system and when that can go online—the time it takes for SMEs and business owners to agree on what they want, then for data engineers to rig up the ETL processes.

Several relatively recent technologies and methodologies have emerged to help address this. The first is ELT (the T and L switched), which punts the tough part (the transformations that put the “t” in tough) from the BI team to the end consumers; a rough sketch follows this paragraph. Data Mesh and Data Vault have emerged as well. I’ve written a blog about the one-two punch of embedding a data vault in a data mesh.
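And here’s the rough ELT sketch. The schemas, table names, and COPY syntax below are hypothetical (loading syntax varies by warehouse); the point is simply that raw data is landed first and the transformations are owned downstream by the consumers.

```python
# ELT in miniature: land the raw data untouched (E + L), then let consumers
# define the transformation (T) as SQL against the raw layer.
# Hypothetical names; COPY syntax varies by warehouse platform.

EXTRACT_AND_LOAD = """
    COPY raw.orders FROM 's3://landing/orders/';
"""

CONSUMER_OWNED_TRANSFORM = """
    CREATE VIEW analytics.daily_revenue AS
    SELECT order_date, SUM(amount) AS revenue
    FROM raw.orders
    WHERE status = 'complete'
    GROUP BY order_date;
"""

# The BI team runs the load; the analytics consumer owns (and can iterate on)
# the view without waiting for a new ETL pipeline to be engineered.
```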

Most Enterprise Knowledge is Trapped in People’s Heads

Strategies might be set by executives and tactics picked by managers—but knowledge of how things must work in the evolving real world is in the heads of the army of workers. Unfortunately, the vast majority of those workers will not write comprehensive reports, articles, PowerPoint decks, or books on what they know. Most will write emails, notes, and their performance reviews. However, that text probably won’t provide the context and details that would make it meaningful to someone (or an AI) stepping into the middle of a situation. All of that is locked in the heads of the workers.

As enterprises dive further into data-driven strategies, tactics, and operations, many more people (beyond analysts and managers) will become BI consumers. Each will have problems to solve in their own far reaches of the enterprise. In their quests to resolve the problems at hand, they too will access highly trustworthy BI databases, view results in visualizations (line graphs, bar charts, etc.), and notice insights in a massive insight space spawned from a massive data space.

These insights are knowledge they will apply towards the solution to their problem. Some insights will be ignored because they’re not applicable, but they’re still knowledge. An insight could be valuable to any of thousands of other BI consumers who may never have stumbled upon that discovery within the vast insight space. So it should be cached. That is, cached into a vast enterprise knowledge graph—a database of discovered knowledge combined with the authored knowledge of SMEs and a map of all data sources in the enterprise (a data catalog). A small sketch of such caching follows.
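This sketch continues the earlier networkx illustration. It’s my own construction, not from the session or the book, and all identifiers are hypothetical; it just shows what “caching a discovered insight” could look like structurally.

```python
# Caching a discovered insight into an enterprise knowledge graph (EKG),
# linking it to the dataset it came from and the person who discovered it.
# Hypothetical identifiers; assumes the networkx library.
import networkx as nx
from datetime import date

ekg = nx.DiGraph()

def cache_insight(graph, insight_id, description, source_dataset, discovered_by):
    """Record a discovered insight and connect it to its source and author."""
    graph.add_node(insight_id, kind="insight", description=description,
                   discovered_on=str(date.today()))
    graph.add_edge(insight_id, source_dataset, relation="derived_from")
    graph.add_edge(insight_id, discovered_by, relation="discovered_by")

cache_insight(
    ekg,
    "insight:returns_spike_q3",
    "Q3 product returns spike correlates with a packaging supplier change",
    "dataset:returns_fact",
    "person:regional_analyst_042",
)
```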

With that graph, knowledge graph reasoners and/or LLMs can (relatively) readily perform inductive and deductive reasoning. But in this era mercilessly accelerated by AI, abductive reasoning becomes a necessity for navigating the complexity of today’s scale. It’s the sort of reasoning Sherlock Holmes is great at. When analysts, managers, or even frontline employees stumble upon an unexpected insight—a novel anomaly in sales patterns, a sudden operational bottleneck, or an emerging market shift—they must engage in abductive reasoning. They observe something unusual and piece together fragments of clues until the solution can be deduced. A toy sketch of how a knowledge graph can at least supply candidate explanations follows.
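This toy sketch only shows the graph’s contribution: surfacing authored candidate causes for an observed anomaly, leaving the actual hypothesizing and verification to the human (or the LLM assisting them). All node and relation names are hypothetical, and it reuses networkx as in the earlier sketches.

```python
# Graph-assisted abductive reasoning, in miniature: given an observed anomaly,
# surface the authored "can_cause" relationships as candidate hypotheses.
# Hypothetical nodes and relations; assumes the networkx library.
import networkx as nx

kg = nx.DiGraph()
kg.add_edge("supplier_change", "returns_spike", relation="can_cause")
kg.add_edge("packaging_defect", "returns_spike", relation="can_cause")
kg.add_edge("shipping_delay", "late_deliveries", relation="can_cause")

def candidate_explanations(graph, observation):
    """Return authored causes that could explain the observation --
    hypotheses for a human (or an LLM) to investigate, not conclusions."""
    return [
        cause
        for cause, _effect, attrs in graph.in_edges(observation, data=True)
        if attrs.get("relation") == "can_cause"
    ]

print(candidate_explanations(kg, "returns_spike"))
# -> ['supplier_change', 'packaging_defect']
```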

The problem is, without a comprehensive map of enterprise-wide knowledge, even the sharpest minds are left with fragments. They’re forced to piece together insights from siloed data, anecdotal evidence, and their personal experience. This slows down discovery, leads to missed opportunities, and keeps much of the organization’s potential intelligence trapped in silos or, worse, locked away in individual minds.

Enterprise Knowledge Graphs can help overcome this limitation by connecting scattered data points, relationships, and context. They enable enterprises to build a web of interconnected insights that make it easier for humans and AI to spot relationships and patterns that aren’t immediately obvious.

But this is where AI and human intelligence must work together. AI is not yet capable of true abductive reasoning—it’s still not that good at hypothesizing too far beyond its training or drawing meaningful inferences from fragmented or obscure connections without cautious human guidance. However, by leveraging AI to assist in surfacing potential patterns and anomalies and combining this with human intuition and reasoning, organizations can unlock a new level of intelligence.

This hybrid approach—grounding AI within a solid BI foundation, enriched with knowledge graphs and fueled by human reasoning—is at the heart of Enterprise Intelligence. It’s not about handing over the keys to AI but about using AI as an accelerator for human insight, creativity, and strategy. This ensures that the knowledge once trapped in scattered data sources, or worse, in people’s heads, becomes a shared asset that drives smarter, faster, and more informed decision-making across the enterprise.

In the AI era, the organizations that thrive will be those that don’t just collect data but connect disparate, fragmented knowledge, enabling both human and machine intelligence to work in harmony. This is the true intent behind Enterprise Intelligence: not to replace human reasoning, but to amplify it—harnessing AI and structured systems of knowledge to drive clarity, strategy, and innovation in the age of merciless complexity.

If you’re attending Data Modeling Zone 2025, come ready to challenge how you think about knowledge, AI, BI and the future of enterprise intelligence.
