This is Part 4 of 5 of the Map Rock Problem Statement. Strategy, complexity, competition, and the limitations of logic make up the soup that lead to humans being as smart as we are the way that we are. We’ve obviously done very well for ourselves. However, I feel there is an over-emphasis on speed, simplicity, and control that will essentially lead us to lose these “powers”. The previous installments can be viewed here:
- Part 1 – Preface to the 5-part series.
- Part 2 – I describe Map Rock’s target audience and the primary business scenario for Version 1. It is not just a tool for quants, wonks, and power-users.
- Part 3 – We delve into a high-level description of the Map Rock software application, where it fits in the current BI framework, and how it differentiates from existing and emerging technologies. This is really the meat of the series.
Map Rock’s Main Approach
We live in a complex world powered by the relentless struggle for survival of all things at all levels (individual, herd/tribe/country, species), each following relatively simple rules, with no top-down control. However, we humans have an ability to manipulate our surroundings to our liking, at least in the short-term (seconds to days), by applying logic which works within restricted space and time. In the moderate term (months to years), we can have our way to a lesser extent through the development and implementation of malleable strategies. Beyond the timeframe of a couple of years, even a workable prediction is useless.
Map Rock’s goal is ambitious, to say the least. As I illustrated in the Part 3 section, “How is Map Rock Different?”, it touches so many things. The biggest challenge was to avoid developing a hodge-podge, “chop suey” application analogous to the “Homer”, the car with every feature imaginable designed by Homer Simpson. My approach was to take many steps back to see the common threads tying all of those things listed in the previous section. Instead of looking for an opportunity to fill an underserved aspect of BI, I wanted to see if there is a way to tie the pieces of the BI world together.
In the end, we want to be smarter, we want to make better decisions. A good place to start is to compare why we humans are smarter than other animals. The world has done very well for a few billion years without BI. Simple rules, followed by trillions or so agents, result in the life around us. It’s certainly not just our huge brains which in themselves are just huge hard drives (but with a more intricate structure). But our intelligence involves more complex desires towards success than simply hunger and reproduction. We’ve become symbolic thinkers, which means we can play what-if games in our heads; virtual experiments to test an outcome before we take physically irreversible actions.
At the lowest levels of Map Rock’s design are the notions of rules and troubleshooting. Rules are really about recognizing things both tangible (like a rock) or intangible (like an impending accident). Troubleshooting is the process of the resolution of problems: identifying symptoms, recognizing a diagnosis and applying a treatment.
Troubleshooting isn’t something restricted to “technical” people such as your mechanic or SQL Server performance tuning expert. It’s just the term used by those technical people for “figuring out what’s wrong”, which we all do every day. We’re barely conscious of many of our troubleshooting efforts which can be as mundane as recalling where we left the toothpaste or as complex as figuring out why my poha hasn’t yet fruited.
Identifying symptoms is relatively easy, simply recognizing sets of attributes or they are answers to relatively simple questions. The biggest challenge with identifying symptoms isn’t the answering of the question itself. It is that maybe we aren’t looking for the right things and/or looking for the wrong things, in other words, asking the wrong questions. For example, while the amateur investors are looking for solid performance numbers, the professionals are looking for bubbles about to burst. And, the right and wrong things are different under different contexts.
After we’ve taken inventory of our situation (identified the symptoms), we can “label” the “situation”, consider its own macro object, which is a diagnosis. Has anyone ever seen this set of symptoms before? Yes. Does it have a name? Hodgkin’s Disease?
If we’re fortunate enough to find that someone else has seen these symptoms, we can leverage their experience by applying a treatment used in those previous cases or at least to pick up a few more clues around that previous case. Declaring a diagnosis is also relatively easy, but it’s important to note a couple of addition things about symptoms, a components of a diagnosis. The symptom itself could itself be the result of a diagnosis and our certainty about each symptom may not be as plain as day (meaning, it could just be a best guess).
Treatment is the most difficult part. If we’re lucky, what we are treating has happened many times before and has been rectified through a tried and true process. But out in the wild, because life is a complex system, nothing ever happens exactly the same way. Two events may look very similar, but they are only similar to some extent, not exact. Therefore:
- This inherently means that a diagnosis, no matter how many times it has worked in the past, is always at the risk of being incorrect. The devil is in the details. If it looks like a duck and quacks like a duck, it may just be a decoy deployed by a hunter.
- We must also consider the cost for being wrong. This consideration is too often just a side-note since what could go wrong hasn’t yet happened, and therefore doesn’t seem as important as what is happening right now. And, we’re very good at twisting facts in our head or conveniently sweeping facts under the rug to justify (at least in our minds) why we shouldn’t worry about these things.
- There may be important data points unknown to us that are required to mitigate risk or at least figure out how to deal with the risk. It’s not that we are negligent, but that it’s fair to say no one in their “right mind” would have thought about it.
If we’re facing a completely novel situation, inventing a treatment is usually more involved than simply applying some IF-THEN logic. Even more, we need to be mindful of what could go wrong and “what am I missing”?
There is an elegant process by which our symbolic thinking works that I attempted to implement in my SCL language, a language I developed based on the Prolog AI language attempting to reflect not just the essence of logic, but the distributed nature of knowledge and effort. I discuss it in general terms in my mini-blog, The Four Levels of SCL Intelligence. Map Rock could be thought of as a more specialized user interface for SCL than a more general version I had been working on that I named “Blend” (as in the blending of rules).
At the core of the processes by which we use Map Rock are three main questions:
How are things related? Relationship is the core of our brains. As we go through life, the things we encounter together at any moment, our experiences, are recorded in our brains as related. A web of relationships of many types (correlation, attributes of, etc) are the protein molecules (I chose to use “protein” to convey complexity and variation) of our applied logic.
How are things similar? This question is the basis for metaphor, which is what opens the door to our thought versatility. Metaphor is our ability to recognize something that is merely similar to another thing. A direct match isn’t necessary. The idea is that if something is similar, we can cautiously infer similar behavior. Without this flexible capability, for anything to happen, there would need to be a direct, unambiguous recognition, which is a very brittle system.
What has changed? Noticing change is what engages many mechanisms. All animals respond to change. Birds sit up high looking for the slightest change, movement, in the scene before them. When attempting to troubleshoot a problem, such as a doctor attempting to resolve an issue, one of the first questions after “How can I help you?” is “What has changed recently?”
My goal with Map Rock is to put the “I” back into BI. This notion reflects my career’s roots that began shortly before the AI and Expert System craze of the 1980s. However, the context in which I think of “I” is not the same as truly replicating human intelligence. Back then I was still naïve enough to think that implementation of such concepts were then feasible. So maybe I’m a little unfair since BI was perhaps really never thought of as the corporate version of the sort of software I imagine the CIA must use to facilitate their primary functions. See Note #1. But I’m also referring to moving beyond simply gathering data for analysis by human analysts. As I mentioned in Part 3, I’m after what I call “pragmatic AI”.
With that said, BI has seemed somewhat lackluster to me since the dot-com bust. The “doing more with less” mantra is more about not losing than about winning. We’re also very fearful of failure (and lately even seem to look down on winning). Any mistakes we make become very public and will haunt us forever as rigid data mining algorithms filter us out due to key words on our record that superficially don’t take into account that we all make mistakes and that the only reliable way to uncover vulnerabilities is through mistakes. It’s one thing to take criminal or even negligible (and that’s a questionable word)action, but not well-intentioned risks towards the goal of winning fairly.
Every single conscious action we decide to take is calculated within the context of a strategy. Strategies are a path meandering through a web of cause and effect that takes a situation from one presumably undesirable state to another desired state. A massive web of webs of cause and effect build in our brains from our experiences. Some “causes” are things we can directly alter like the volume dial and some are out of our control, at least indirectly. Some effects are good and some are bad. So all day long we try to hit the good effects and avoid the bad ones through logic, these paths, these cascading links of cause and effect.
On the other hand, subconsious actions (like driving) are not in the context of a strategy but in a context of sequences of recognize/react pairs determined through sheer statistical weights. We drive until we hit an “exception”, something out of bounds of the predictive analytics models running in our head. That engages our thinking and formulating of strategies.
It’s important to realize as well that “bad” effects are not to be avoided at all cost. Remember, almost every strategy involves costs. Some can be relatively painful. For example, curing cancer usually involves trading in several “merely” major pains in exchange for one severely major pain. This is called “investment” or “sacrifice”. The reason I mention this is because “Scorecards” sometimes fail to illustrate that some “yellow” KPIs (ex: the typical traffic light showing a yellow light) reflect some pain we actually intended to bear as an investment towards a goal. It is not something that should be rectified. This is very similar to how we may be inadvertently subverting nature’s reactions to a sprained ankle by taking measures to bring down the swelling.
Now is a good time to note that immediately after I say, “cause and effect”, someone reminds me that “correlation does not necessarily imply causation”. It’s rare to find two or more completely unrelated phenomenon that correlate. Usually, strong correlations do share a common driver, even though one may not cause the other. For example, higher levels of disease and mental stress may correlate with higher population densities.
In fact, one of the primary functions of Map Rock is to help assess the probability for causation, part of a procedure I call “Stressing the Correlation”. This procedure, which illuminates and eliminates false positives, includes tests for signs of bias, consistency of the correlation, chronological order of the two events, and identifying common factors.
Please keep in mind that excessive false positives (the TSA screening practically everyone) is the major negative side-effect of avoiding false negatives (missing the true terrorist). At least we can deal with what we see (false positives). One of the major goals of Map Rock is to expose things we wouldn’t think about (false negatives). If we had to choose, I’d say I’d rather deal with excessive false positives than miss a false negative when the cost for being wrong is extreme.
I’m often told that there are patterns out there, that numbers don’t lie. Yes, nature has done very well for herself without human strategy. Getting data faster is a crucial element to the execution of a strategy. The point is, those patterns work beautifully well as long as you understand that at the level of detail below those patterns, a percentage of things are mercilessly plowed into the field for recycling.
Complicated and Complex Systems
The key to appreciating the value of Map Rock is to recognize the fundamental difference between a complicated system and a complex system. The September 2011 edition of the Harvard Business Review included several very nice articles on “embracing complexity”. Paraphrasing one of the articles, the main reason why our solutions still fail or perhaps work but eventually fall apart is that we apply a solution intended for a complicated system to a problem that is really complex. I think we generally use these terms interchangeably, usually using either term to refer to what is really “complicated”.
Machines and the specific things to which they apply are complicated. A screw driver, a screw, the hole in which the screw is applied and the parts it’s fastening are a complicated machine. On the other end of the spectrum, even something as sophisticated as a hive of Hadoop servers is a complicated system. What makes a system complicated and not complex is that we can predict an outcome with a complicated system, even if it takes Newton to run the calculations. We’re able to predict things because all the parts of a complicated system have a specific, tightly-coupled relationship to each other.
The Industrial Revolution is about qualities such as precision, speed, endurance, all of which machines are infinitely better at than humans. We build machines ranging from bicycle tire pumps to semi-conductor fab plants that can output products (air bursts and chips, respectively) with greater productivity than is possible with just the hands of people. Today, we still continue an obsession with optimization of these systems by eliminating all variance (most of which is human error), minimizing waste, especially defects and down time.
This distinction between complicated and complex is incredibly profound when making business decisions because:
We cannot settle for “just one number, please” predictions in a complex system. We can make accurate predictions within a consistent context. Complex systems are not a consistent context. For example, we can develop data mining algorithms to predict how to attract a patient into a dental office based on the patterns of that office’s patients. However, that same model probably will fail miserably for a practice in another neighborhood, state, or country. The best we can do is hope that the context changes slowly enough so our models at least work to some extent at least for a while.
Strictly speaking, there are probably no complicated systems in real life. Really, I can’t think of anything on earth that operates in a vacuum. Everything is intertwined in a “Butterfly Effect” way. Even a vacuum as we generally mean is a vacuum in that it is devoid of matter, but not things like gravity and light passing through. Every complicated system I can think of is only an illusion we set up. We draw a box around it and limit our predictions to a limited space and time hoping that nothing within that time will change enough to affect our predictions.
Figure 1 illustrates how we create the illusion of a closed system. We encase the system (the green circle representing a car) within a layer of protective systems (the blue layer) protecting it from the complexity of the world. I purposefully chose a dark gray background in Figure 1 to convey the notion of an opaque complex world, a “dark-gray” box, not quite a “black box”.
Figure 1 – Complicated systems are closed systems. We create “virtual” closed systems by protecting it through various mechanisms from the complexity of the real world.
Of course, the protective systems cannot protect the car from everything conceivable. It will not protect it from a bridge falling out from under it or a menacing driver in another car.
The Complexity is Only Getting Worse
A system’s complexity grows with the addition of moving parts and obstacles which complicate the movement of the moving parts. Following are a few example of Forces adding moving parts, which directly increases complexity:
Globalization. Each country has its own sets of laws, customs, and culture. Working with different sets of rules means we need to be agile and compromise, which complicates our efforts.
Accumulating regulations and Tightening Controls. Constraints only add to complexity. They may not be a moving part, but act as a roadblock to direct options. There are so many regulations, collectively millions of them at all levels of government in the US alone) in play that most of them must be in conflict with others. I wouldn’t be surprised if we all inadvertently broke some law every day. Ironically, regulations are an attempt to simplify things by removing variance and events that can cause a lot of trouble.
Growing population and affluence. Each person is a moving part of our global economy. More affluence means more people make more decisions with wider scope, are more active, touching more things, whether it’s as a consumer or as a worker.
The number of “smart” devices which can even be considered semi-autonomous. These devices that make decisions (even though they may be mundane) is also a moving part. See note #2 for an example of one that happened to me today.
Increasing Pace of the world. Even if no moving parts were added, the increasing pace of things adds as much to growing complexity as the number of moving parts. The faster things are going, the more spectacularly they will crash. Not too many things scale linearly and increased load will add complication as different rules for different scales engage.
More demands on us. With more and more regulations and responsibilities hoisted upon us, we’re forced to prioritize things, which opens many can of worms. In essence, priotization means we are choosing what may not get done or at best will be done half-heartedly with probably less than minimal effort. That can result in resentment from the folks we threw under the bus or other sorts of things that add more problems. It forces us to spend the least amount of energy and resources as possible so we can take on the other tasks. We learn to multi-task but that may lower the quality of our efforts, at least for the tougher tasks (easy tasks may not degrade in quality of effort).
De-hierarchization/de-centralization of corporate life. Last, but definitely not least. This leads to more and more moving parts as decision control is placed into more people (and even machines) who are now better trained and tied in through effective collaboration software. However, decentralization is really a good thing that mitigates, if not removes, bottlenecks, enriches the pool of knowledge from which decisions within the enterprise are made, and drastically improves the agility of a corporation. Decentralization is really the distribution of knowledge across an array of people who can proceed with sophisticated tasks minimally impeded by bottlenecks. See Note #3 for more on this concept and Note #4 on Slime Mold.
When I’m in a room with other engineers brainstorming a solution, we’ll agree that a suggestion is too complicated or complex. We then back away from that suggestion, usually ending up sacrificing feature requests of varying importance to the “nice to have” pile (and never actually getting to it). I have no problem with that at all.
Although I believe many who know me will disagree, I don’t like complicated answers (see Note #5). Complications mean more restrictions, which means brittle. Complications happen when you try to have your cake and eat it too, which is a different issue from getting too fancy with all sorts of bells and whistles. What I mean is when we want to accomplish something, but there are constraints, need to include safeguards to protect those constraints. Constraints handicap our options. We can usually engineer some way to have our cake and eat it too, but eventually we will not be able to patch things up and the whole thing blows up.
When I began developing SCL way back when, my thought was how to embrace complexity, tame it, and conjure up ways to deal with the side-effects. The problem is that to truly embrace complexity, we need to be willing to let go of things and often have no choice as to which of those things go. But it’s one thing to be a non-self-aware species of bird that goes extinct as species that are more fit to current circumstances thrive and a self-aware person fighting for survival among billions of other self-aware beings. In a sense, everyone is a species of one.
I am incredibly far from having the answers. But what I do claim (at least I think I do) is that I have a starting point. It involves decentralizing control to networks of “smarter” information workers acting as a “loosely coupled” system (which works very well for developing complicated software systems). Most importantly, at least for Map Rock Version 1, is to accept and deal with the limitations of logic.
The Limitations of Logic
Whatever we personally know (knowledge in our individual head) are the things we’ve experienced: We only know what we know. Obviously, we cannot know things we haven’t experienced personally or that hasn’t been conveyed to us (learned directly) through some trusted mechanism (ex, someone we trust). Everything we know is a set of relationships. For example, an apple is recognized when we see a fruit with the classic shape, smell, color, etc.
That’s all fine and dandy until we attempt to infer new knowledge from our current knowledge. Meaning we take what we know, apply logic, and indirectly come up with a new piece of knowledge. How many times have we found something we were sure about to be wrong and when we figure out what went wrong we say, “I did not know that!” Meaning, had we known that, we would have come to a different conclusion.
Our mastery of the skill of logic relative to other animals is the secret sauce to our so-called “superiority” over them. However, for inter-human competition, all of whom have this power of logic, one needs superior logical capability as well as superior knowledge from which we can draw inferences. Logic is great as we use it to invent ways to outsmart nature (at least for the moment) who isn’t preying on us (nature herself isn’t out to get us). But as Superman was nothing when facing his enemies with the same powers in Superman 2 (General Zod, et al), we need to realize our logic can be purposefully tripped up by our fellow symbolically-thinking creatures. As we wrap a worm in a hook to catch a fish, our own kind does the same to us out in the world of politics and commerce. I wrote about this in my blog, Undermined Predictive Analytics.
The limitations of our beloved logic stem from the fact that we cannot possibly know everything about a system. There is no such thing as perfect information. The complexity of the world means things are constantly changing, immediately rendering much of what we “know” obsolete. However, our saving grace is that for the most part in our everyday world, a system will be stable enough over a limited volume of space and time for something we’ve learned to apply from one minute or day or year to the next.
When I mention this, people usually quickly quip (say that five times), “Garbage in, garbage out”, which entirely misses the point. Of course bad information leads to bad decisions. But even perfectly “correct”, perfectly accurate data points (perfect information) can lead to naïve decisions in the future. The inferences our logical minds make is limited to the relationships accumulated in our brains over the course of our lives; our experiences.
We usually think of things in terms of a complicated system even if the system is complex because animal brains evolved to be effective within a limited space and time. That limited space and time is all that’s needed for most creatures just out to make it through another day. Decisions still seem to work because in the limited space and time, underlying conditions can remain relatively static, meaning something that worked two days ago has a good probability of working today. Additionally, the basic interests of humans are relatively stable and thus provide some level of inertia against relentless change, which adds to the probability that what worked yesterday still has a chance to work a year from now
Our brains evolved to solve problems with a horizon not much further than until the next time we’re hungry. For anything we do, we can pretty much ignore most things outside the immediate problem as just extraneous noise. Thinking long term is unnatural, so we don’t care about any butterfly effect. Thus we really don’t have a genuine sense of space spanning more what we encounter in our day to day lives nor of time spans much beyond a few human lifespans.
Software Sclerosis – An acute condition of software whereby the ability for it to be adapted to the changing needs of the domain for which it was written is severely hindered by the scarring of the excessive addition of logic over time.
As the name of my company, Soft Coded Logic, implies, the primary focus of mine is how to write software that can withstand inevitable changes through built-in malleability. I’m not talking about just change requests or new features added in a service pack or “dot release” (like Version 1.5). I’m talking about logic, those IF-THEN rules that are the basis of what “code” is about. Changes are inevitable because we live in a complex world and logic has readily apparent limitations in a complex world. How can software adjust to rule changes without ending up a victim of “Software Sclerosis”, a patchwork of rules, a domain so brittle that most rules probably contradict something? On the other hand, flexibility can sometimes be “wishy-washy”, which means software cannot perform as optimally as it can.
Soft-coded logic, had always been my passion. I mentioned earlier that I began my software development career in 1979 and was heavily influenced by the Expert System craze of the 1980s. But software projects became modeled under the same paradigms as those to build a bridge or building. The bridge or building must fulfill a set of requirements, which beyond functional requirements includes regulatory requirements, a budget, and a timeframe. Software has similar requirements except that because rather ethereal, not as tangible and rigid as a bridge, it is the path of least resistance and so is the natural choice for what must yield when things change. It’s easier to modify software to operate on another operating system than it would be to retrofit a bridge to act as an airplane runway as well.
Short of developing a genuine AI system, one that genuinely learns and adjusts its logic (the latter much harder than the former), we can only build in systems to ameliorate the sclerosis. The problem is that the value of these systems or methods is not readily apparent and just as importantly they weigh the system down when it’s running (not streamlined). So such systems/methods are quickly deemed “nice to haves” and are the first things to be cut in a budget or time crunch.
BI systems are rather rigid too:
- OLAP cubes reflect a fixed set of data, which means it can pre-aggregate in a predictable manner, thus fulfilling its prime mission of snappy (usually sub-second) query response.
- Data Marts and Data Warehouses are still based primarily on relational databases which store entities as a fixed set of attributes (tables).
- “Metadata” still primarily refers to things like the database, table, and field names of an entity attribute, as opposed to the “Semantics” of an attribute.
- Definitions of calculations and business rules are still hard-coded. The great exception are data mining models where the selection of factors and their relationships can be automatically updated to reflect new conditions … at least to an extent.
- Users still mostly consume BI data as pre-authored reports, not through analytics tools – based on the feedback I get about practically any analytics tool being a quant’s tool.
- Basic concepts such as slowly-changing dimensions is still more of an afterthought.
Technologies I mentioned in the Part 3 topic, “Why is Map Rock Different?” , such as metadata management and predictive analytics as well as technologies like columnar databases and the Semantic Web will help to reduce the “plaque of quick fixes” in today’s software. But I hope Map Rock can “surface” the notions of malleability higher up the “stack” to information workers, that is, beyond current Self-Service BI. Developing Map Rock, I did my best to incorporate these things into its DNA while at the same time avoiding too much overhead by going “metadata crazy” and more importantly, developing systems that ameliorate the terrible side-effects of being metadata-driven.
- Part 5 – We close the Problem Statement with a discussion on imagination, which is how we overcome the limitations of logic, and how it is incorporated into Map Rock.
- Map Rock Proof of Concept – This blog, following the Problem Statement series, will describe how to assess the need for Map Rock, readiness, a demo, and what a proof-of-concept could look like.
- Obviously, I’ve never worked for the CIA because I seriously doubt I’d be able to even publicly suggest what sort of software they use. I would imagine their needs are so unique and secret that their mission critical applications are home-grown. But then, it’s like not I’ve never been surprised by learning a BI system consists of hundreds of Excel spreadsheets.
- Junk mail filters are one of these semi-autonomous decision makers. Today it made a decision that could have profoundly affected my life. It placed a legitimate response to a position for which I was sincerely interested into my junk mail. It was indeed a very intriguing position. I don’t usually scan my junk mail carefully and so it could very easily have been deleted. My point is that such semi-autonomous software applications or devices do affect things, adding to the complexity of the world.
- Dehierarchization, distribution of decision-making, is very analogous to a crucial design concept in software architecture known as “loosely coupled”. Instead of a monolithic, top-down, controlled software application, functions are delegated to independent components each with the freedom to carry out their function however the programmer wishes and as unobtrusively as possible (plays well with the other components). Each is responsible for its maintenance and improvement. Without this architectural concept, the capabilities of software would be hampered due to the inherent complexity of a centrally controlled system.
- Slime mold is one of the most fascinating things out there. You’ve probably seen it in your yard or on a hike at some time and thought it to be some animal vomit. It is a congregation of single-celled creatures that got together into what could be called a single organism. When food is plentiful, these cells live alone and are unnoticed by our naked eye. When food is scarce, they organize into this mass and can even develop mechanisms to hunt for food.
- I think engineers are often thought to over-think things and are prone to over-engineering – which I think is a good thing. But it’s often because we are aware of many things that can go wrong even if things may seem great on paper. I believe we also innately realize that there is an underlying simplicity to things and that if something is too hard, it’s often not right. When faced with something I need to engineer, I can only start with the technique I know works best. I may find it’s not good enough (or suspect it’s not good enough), which will lead me to research a better way or I may need to invent a better way. In any case, engineering involves the dimension of time, a limited resource. So we engineers weigh “good enough for now” with “so bad that there must be a better way”.