Where Do Rules Come From?

This weekend I’m preparing for a Webcast where I will present further thoughts on expanding the Strategy Map of Performance Management and bridging it to Predictive Analytics. I discussed that notion in my earlier blog, Bridging Performance Management and Predictive Analytics. The Strategy Map is the blueprint for an organization’s (or organism’s) execution of actions towards accomplishing a goal. It is a cause-and-effect graph that documents the theory behind the strategy: the logical rules describing the actions we take and the chain of effects we expect to ensue.
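To make that concrete, here is a minimal sketch of a strategy map as a directed cause-and-effect graph. It's plain Python rather than SCL, and the node names are made up for illustration:

```python
# A hypothetical strategy map: each action or outcome points to the effects
# we expect it to cause. Node names are invented for this example.
strategy_map = {
    "improve_staff_training": ["faster_order_handling", "fewer_errors"],
    "faster_order_handling": ["higher_customer_satisfaction"],
    "fewer_errors": ["lower_rework_cost", "higher_customer_satisfaction"],
    "higher_customer_satisfaction": ["revenue_growth"],
}

def downstream_effects(cause, graph, seen=None):
    """Walk the chain of effects an action is expected to set off."""
    seen = set() if seen is None else seen
    for effect in graph.get(cause, []):
        if effect not in seen:
            seen.add(effect)
            downstream_effects(effect, graph, seen)
    return seen

print(downstream_effects("improve_staff_training", strategy_map))
```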
 
These strategy maps can be very tedious to author and even worse to maintain. It’s hard enough to map out a system such as a strategy or a manufacturing plant. Hard, but possible, provided we assume they are closed systems. The problem is that in reality there are hardly any closed systems. Every outcome on this planet is the result of interactions among an unimaginably large number of entities, resulting in a tangled web of cause and effect. We could never hope to come even close to mapping all of those relationships.
 
We can’t map everything, and yet we’re able to predict the future to some extent. That ability to predict the future is the basis of our human intelligence. We’ve mapped models of recognition (if it looks like a duck and quacks like a duck, it is a duck) and cause and effect (if I go to bed late, I’ll be tired tomorrow, and I won’t do my work well) in our brains. We use those models to recognize something, then simulate actions and their effects in the safety of our brains before committing ourselves to irreversible physical action.
 
Each recognition and each cause-and-effect relationship comprises a rule. The bad thing is that in our society, there are hardly any hard and fast rules. They are all probabilistic. If we say that when we lower the price of fruit by 20% at the end of the day, there is a 70% chance we will sell out the fruit, we really mean that there are other factors we haven’t plugged into the rule: either we’re unaware of them, obtaining the necessary data is too tedious, or the data is too unreliable.
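For illustration, here is one way such a probabilistic rule could be written down. This is a hypothetical Python sketch, not how SCL or any particular rule engine represents rules:

```python
# A sketch of a probabilistic rule, using the fruit example above.
# The structure and the 70% figure are purely illustrative.
from dataclasses import dataclass
from typing import Callable

@dataclass
class ProbabilisticRule:
    name: str
    condition: Callable[[dict], bool]  # recognition: does the situation match?
    outcome: str                       # the effect we expect
    probability: float                 # how often the effect actually follows

fruit_rule = ProbabilisticRule(
    name="end_of_day_markdown",
    condition=lambda s: s["hour"] >= 17 and s["discount"] >= 0.20,
    outcome="sell_out_fruit",
    probability=0.70,  # all the unmodeled factors hide inside this number
)

situation = {"hour": 18, "discount": 0.25}
if fruit_rule.condition(situation):
    print(f"{fruit_rule.outcome}: about {fruit_rule.probability:.0%} likely")
```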
 
So a Strategy Map is a great thing. It is a linked set of rules. It is of immense value because it reflects a crucial aspect of human intelligence in a form that is "machine readable". But how do we author and maintain it?
 
More than a few years ago, a colleague of mine told me that back in the 1980s he had played with Prolog (an old AI language, and the "base" language of my SCL language) to create expert systems. Prolog itself was fine. It was able to make inferences, even though it was very slow (remember, this was 1980s hardware). But the real challenge was the impossibly tedious task of authoring the rules.
 
That statement stuck with me. There are "Mind Mapping" tools, most of which are derivations of the fishbone (Ishikawa) diagram. But they are still not good enough to make modeling and maintaining complex systems anything less than excruciating. To address this, I architected the SCL language to have tight integration with the data mining and OLAP features of SQL Server Analysis Services. The data mining-derived rules, which are semi-automatically generated (the mining structures are still created by a developer), can automatically contribute rules to strategy maps and keep those rules maintained.
 
In a nutshell, Data Mining/Predictive Analytics greatly eases the burden of authoring by providing a measured score of likelihood for our recognitions, or of the likelihood that a certain effect will result from an action we take.
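As a rough illustration of where that likelihood score comes from, here is a sketch that uses scikit-learn as a stand-in for the SQL Server Analysis Services mining structures mentioned above. The toy data is fabricated and the resulting number means nothing beyond the example:

```python
# A mining model, rather than a human author, supplies the probability
# attached to the rule. scikit-learn stands in for SSAS data mining here.
from sklearn.tree import DecisionTreeClassifier

# Columns: [hour_of_day, discount]; label: 1 = fruit sold out, 0 = did not.
X = [[10, 0.00], [12, 0.10], [17, 0.20], [18, 0.25], [19, 0.30], [18, 0.00]]
y = [0, 0, 1, 1, 1, 0]

model = DecisionTreeClassifier(max_depth=2).fit(X, y)

p_sell_out = model.predict_proba([[18, 0.20]])[0][1]
print(f"P(sell out | end of day, 20% discount) = {p_sell_out:.0%}")
```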
 
Figure 1 illustrates the sources of rules and their relative volume of contribution. If we employ Predictive Analytics/Data Mining effectively, it can shoulder a great deal of the burden of rule generation.
 
 
Figure 1 – Pyramid of the source of rules. 
 
First, let’s get rid of the bottom tier of the pyramid (the brown layer), which also happens to occupy the vast majority of the volume. This layer reflects the fact that most of us don’t really "author our own rules". We mostly adopt the behaviors of others, whether through the effectiveness of advertisements, social engineering of some form, or simply the fact that none of us have the time and energy to thoroughly think through every single thing we do. With this layer gone, we’re looking at a much smaller pyramid from the green layer on up.
 
At the top of the pyramid (the yellow layer), we see a very small volume of rules contributed by a small group of hard-core subject matter experts (SMEs). These are folks who are well-trained with an AI toolset (such as SCL) and have deep subject matter expertise. These are the sort who write our technical books. Is it really that much harder to author rules for a complex system than to write a 1,000-page, deeply technical book?
 
Next comes an army of novice AI people (the mustard layer). Perhaps there is a high-level UI that enables authoring of simpler rules. These folks are greater in number than the SMEs and generate and maintain a larger volume of new rules, in an open-source fashion. Think of the difference between the top layer and this second layer as analogous to the difference between a database administrator who knows the internals of the DBMS, as well as the infrastructure necessary to provide scalable, reliable service, and a user who creates Excel spreadsheets.
 
The tan layer represents rules that have been painstakingly authored into our software code (captured as IF-THEN statements) and, for the most part, imprisoned there. What if these rules could be decoupled from the procedural code, much like XML and XSLT (and XAML) decouple data from presentation, in a separation of logic and procedure? Think of the tens of thousands of Facebook applications, many of which could leverage rules from each other.
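Here is a hypothetical sketch of what that decoupling might look like: the IF-THEN pairs live in a data document and a small interpreter applies them, so the rules can be updated, or shared across applications, without touching the code. The field names are invented for the example:

```python
# Rules expressed as data (JSON) instead of hard-coded IF-THEN statements.
import json

rules_json = """
[
  {"if": {"segment": "student", "cart_total_over": 50}, "then": "offer_free_shipping"},
  {"if": {"segment": "new",     "cart_total_over": 0},  "then": "show_welcome_coupon"}
]
"""

def apply_rules(rules, facts):
    """A tiny interpreter: return the actions whose conditions the facts satisfy."""
    actions = []
    for rule in rules:
        cond = rule["if"]
        if facts["segment"] == cond["segment"] and facts["cart_total"] > cond["cart_total_over"]:
            actions.append(rule["then"])
    return actions

rules = json.loads(rules_json)
print(apply_rules(rules, {"segment": "student", "cart_total": 75.0}))
```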
 
What these top three layers have in common is that they are all deterministic; that is, they are all hard and fast rules. Sure, they may be robust in that a rule can be as specific as we want it to be. For example, if it looks like a duck and quacks like a duck, but it is made of plastic and metal, and we consider machines that represent something to be that thing, then yes, it is a duck. But these are still hard and fast rules that require a skilled human to modify. And modification of rules is inevitable, since we live in a world where things are constantly changing.
 
Now we come to the green layer, the largest layer aside from the huge brown layer we discarded earlier. This layer comprises automatically generated and updated rules derived from predictive analytics/data mining models. Unlike the rules in the top three layers, these rules are not hard and fast. They come with a probability of accuracy. This is the trade-off we must make for ameliorating the burden of manually generating and maintaining rules.
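As a toy illustration of that trade-off, imagine the rule’s probability being recomputed from fresh observations on a schedule rather than edited by a person. The counting below is illustrative only, not a real data mining algorithm:

```python
# Regenerate a rule's confidence from new data instead of editing it by hand.
def refresh_rule_probability(observations):
    """observations: list of (condition_was_met, effect_occurred) pairs."""
    relevant = [effect for met, effect in observations if met]
    return sum(relevant) / len(relevant) if relevant else None

# Last month's markdown history: did a >=20% end-of-day discount sell out the fruit?
history = [(True, True), (True, True), (True, False), (False, False), (True, True)]
print(f"Updated rule confidence: {refresh_rule_probability(history):.0%}")
```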
 
My old blog, SCL is a Thin Layer on Top of a Solid BI Foundation, discusses the notion of minimizing the burden of rule encoding by leveraging the work of a well-architected BI system, which is fairly mainstream. This green layer is also the only layer that combines what humans do best (thinking through how to frame a mining structure) with what computers do best (processing large amounts of data). It’s here that the effective integration of complementary human and computer intelligence will take place. It will happen as a natural progression of the relatively mature Business Intelligence infrastructure.
