Map Rock Problem Statement – Part 4 of 5

This is Part 4 of 5 of the Map Rock Problem Statement. Strategy, complexity, competition, and the limitations of logic make up the soup that led to humans being as smart as we are, in the way that we are. We’ve obviously done very well for ourselves. However, I feel there is an over-emphasis on speed, simplicity, and control that will essentially lead us to lose these “powers”. The previous installments can be viewed here:

  • Part 1 – Preface to the 5-part series.
  • Part 2 –  I describe Map Rock’s target audience and the primary business scenario for Version 1. It is not just a tool for quants, wonks, and power-users.
  • Part 3 – We delve into a high-level description of the Map Rock software application, where it fits in the current BI framework, and how it differentiates from existing and emerging technologies. This is really the meat of the series.

Map Rock’s Main Approach

We live in a complex world powered by the relentless struggle for survival of all things at all levels (individual, herd/tribe/country, species), each following relatively simple rules, with no top-down control. However, we humans have an ability to manipulate our surroundings to our liking, at least in the short-term (seconds to days), by applying logic which works within restricted space and time. In the moderate term (months to years), we can have our way to a lesser extent through the development and implementation of malleable strategies. Beyond the timeframe of a couple of years, even a workable prediction is useless.

Map Rock’s goal is ambitious, to say the least. As I illustrated in the Part 3 section, “How is Map Rock Different?”, it touches so many things. The biggest challenge was to avoid developing a hodge-podge, “chop suey” application analogous to the “Homer”, the car with every feature imaginable designed by Homer Simpson. My approach was to take many steps back to see the common threads tying all of those things listed in the previous section.  Instead of looking for an opportunity to fill an underserved aspect of BI, I wanted to see if there is a way to tie the pieces of the BI world together.

In the end, we want to be smarter, we want to make better decisions. A good place to start is to ask why we humans are smarter than other animals. The world has done very well for a few billion years without BI. Simple rules, followed by trillions of agents, result in the life around us. It’s certainly not just our huge brains, which in themselves are just huge hard drives (albeit with a more intricate structure). But our intelligence involves more complex desires towards success than simply hunger and reproduction. We’ve become symbolic thinkers, which means we can play what-if games in our heads; virtual experiments to test an outcome before we take physically irreversible actions.

At the lowest levels of Map Rock’s design are the notions of rules and troubleshooting. Rules are really about recognizing things, both tangible (like a rock) and intangible (like an impending accident). Troubleshooting is the process of resolving problems: identifying symptoms, recognizing a diagnosis, and applying a treatment.

Troubleshooting isn’t something restricted to “technical” people such as your mechanic or SQL Server performance tuning expert. It’s just the term used by those technical people for “figuring out what’s wrong”, which we all do every day. We’re barely conscious of many of our troubleshooting efforts which can be as mundane as recalling where we left the toothpaste or as complex as figuring out why my poha hasn’t yet fruited.

Identifying symptoms is relatively easy: it is simply recognizing sets of attributes, or answering relatively simple questions. The biggest challenge with identifying symptoms isn’t the answering of the question itself. It is that maybe we aren’t looking for the right things and/or are looking for the wrong things; in other words, asking the wrong questions. For example, while amateur investors are looking for solid performance numbers, the professionals are looking for bubbles about to burst. And the right and wrong things are different under different contexts.

After we’ve taken inventory of our situation (identified the symptoms), we can “label” the situation, considering it as a macro object in its own right, which is a diagnosis. Has anyone ever seen this set of symptoms before? Yes. Does it have a name? Hodgkin’s Disease?

If we’re fortunate enough to find that someone else has seen these symptoms, we can leverage their experience by applying a treatment used in those previous cases, or at least pick up a few more clues around those previous cases. Declaring a diagnosis is also relatively easy, but it’s important to note a couple of additional things about symptoms, the components of a diagnosis. A symptom could itself be the result of a diagnosis, and our certainty about each symptom may not be as plain as day (meaning, it could just be a best guess).

Treatment is the most difficult part. If we’re lucky, what we are treating has happened many times before and has been rectified through a tried and true process. But out in the wild, because life is a complex system, nothing ever happens exactly the same way. Two events may look very similar, but they are only similar to some extent, not exact. Therefore:

  • This inherently means that a diagnosis, no matter how many times it has worked in the past, is always at the risk of being incorrect. The devil is in the details. If it looks like a duck and quacks like a duck, it may just be a decoy deployed by a hunter.
  • We must also consider the cost for being wrong. This consideration is too often just a side-note since what could go wrong hasn’t yet happened, and therefore doesn’t seem as important as what is happening right now. And, we’re very good at twisting facts in our head or conveniently sweeping facts under the rug to justify (at least in our minds) why we shouldn’t worry about these things.
  • There may be important data points unknown to us that are required to mitigate risk or at least figure out how to deal with the risk. It’s not that we are negligent, but that it’s fair to say no one in their “right mind” would have thought about it.

If we’re facing a completely novel situation, inventing a treatment is usually more involved than simply applying some IF-THEN logic. Even more importantly, we need to be mindful of what could go wrong and ask, “what am I missing?”

There is an elegant process by which our symbolic thinking works that I attempted to implement in my SCL language, a language I developed based on the Prolog AI language attempting to reflect not just the essence of logic, but the distributed nature of knowledge and effort. I discuss it in general terms in my mini-blog, The Four Levels of SCL Intelligence. Map Rock could be thought of as a more specialized user interface for SCL than a more general version I had been working on that I named “Blend” (as in the blending of rules).

At the core of the processes by which we use Map Rock are three main questions:

How are things related? Relationships are at the core of our brains. As we go through life, the things we encounter together at any moment, our experiences, are recorded in our brains as related. A web of relationships of many types (correlation, attribute of, etc.) forms the protein molecules (I chose “protein” to convey complexity and variation) of our applied logic.

How are things similar? This question is the basis for metaphor, which is what opens the door to our thought versatility. Metaphor is our ability to recognize something that is merely similar to another thing. A direct match isn’t necessary. The idea is that if something is similar, we can cautiously infer similar behavior. Without this flexible capability, for anything to happen, there would need to be a direct, unambiguous recognition, which is a very brittle system.

What has changed? Noticing change is what engages many mechanisms. All animals respond to change. Birds sit up high looking for the slightest change, movement, in the scene before them. When attempting to troubleshoot a problem, such as a doctor attempting to resolve an issue, one of the first questions after “How can I help you?” is “What has changed recently?”

Strategy

My goal with Map Rock is to put the “I” back into BI. This notion reflects my career’s roots, which began shortly before the AI and Expert System craze of the 1980s. However, the context in which I think of “I” is not the same as truly replicating human intelligence; back then I was still naïve enough to think that implementing such concepts was feasible. So maybe I’m a little unfair, since BI was perhaps never really thought of as the corporate version of the sort of software I imagine the CIA must use to facilitate their primary functions. See Note #1. But I’m also referring to moving beyond simply gathering data for analysis by human analysts. As I mentioned in Part 3, I’m after what I call “pragmatic AI”.

With that said, BI has seemed somewhat lackluster to me since the dot-com bust. The “doing more with less” mantra is more about not losing than about winning. We’re also very fearful of failure (and lately even seem to look down on winning). Any mistakes we make become very public and will haunt us forever, as rigid data mining algorithms filter us out due to key words on our record, superficially failing to account for the fact that we all make mistakes and that the only reliable way to uncover vulnerabilities is through mistakes. It’s one thing to take criminal or even negligent (and that’s a questionable word) action; well-intentioned risks taken towards the goal of winning fairly are another matter.

Every single conscious action we decide to take is calculated within the context of a strategy. A strategy is a path meandering through a web of cause and effect that takes a situation from one presumably undesirable state to another, desired state. A massive web of webs of cause and effect builds in our brains from our experiences. Some “causes” are things we can directly alter, like the volume dial, and some are out of our direct control. Some effects are good and some are bad. So all day long we try to hit the good effects and avoid the bad ones through logic, these paths, these cascading links of cause and effect.

On the other hand, subconscious actions (like driving) are not in the context of a strategy but in a context of sequences of recognize/react pairs determined through sheer statistical weights. We drive until we hit an “exception”, something out of bounds of the predictive analytics models running in our heads. That engages our thinking and formulating of strategies.

It’s important to realize as well that “bad” effects are not to be avoided at all cost. Remember, almost every strategy involves costs. Some can be relatively painful. For example, curing cancer usually involves trading in several “merely” major pains in exchange for one severely major pain. This is called “investment” or “sacrifice”. The reason I mention this is because “Scorecards” sometimes fail to illustrate that some “yellow” KPIs (ex: the typical traffic light showing a yellow light) reflect some pain we actually intended to bear as an investment towards a goal. It is not something that should be rectified. This is very similar to how we may be inadvertently subverting nature’s reactions to a sprained ankle by taking measures to bring down the swelling.

Now is a good time to note that immediately after I say “cause and effect”, someone reminds me that “correlation does not necessarily imply causation”. It’s rare to find two or more completely unrelated phenomena that correlate. Usually, strong correlations do share a common driver, even though one may not cause the other. For example, higher levels of disease and mental stress may correlate with higher population densities.

In fact, one of the primary functions of Map Rock is to help assess the probability for causation, part of a procedure I call “Stressing the Correlation”. This procedure, which illuminates and eliminates false positives, includes tests for signs of bias, consistency of the correlation, chronological order of the two events, and identifying common factors.

Please keep in mind that excessive false positives (the TSA screening practically everyone) are the major negative side-effect of avoiding false negatives (missing the true terrorist). At least we can deal with what we see (false positives). One of the major goals of Map Rock is to expose things we wouldn’t think about (false negatives). If we had to choose, I’d say I’d rather deal with excessive false positives than miss a false negative when the cost for being wrong is extreme.

I’m often told that there are patterns out there, that numbers don’t lie. Yes, nature has done very well for herself without human strategy. Getting data faster is a crucial element to the execution of a strategy. The point is, those patterns work beautifully well as long as you understand that at the level of detail below those patterns, a percentage of things are mercilessly plowed into the field for recycling.

Complicated and Complex Systems

The key to appreciating the value of Map Rock is to recognize the fundamental difference between a complicated system and a complex system. The September 2011 edition of the Harvard Business Review included several very nice articles on “embracing complexity”. Paraphrasing one of the articles, the main reason why our solutions still fail or perhaps work but eventually fall apart is that we apply a solution intended for a complicated system to a problem that is really complex. I think we generally use these terms interchangeably, usually using either term to refer to what is really “complicated”.

Machines and the specific things to which they apply are complicated. A screwdriver, a screw, the hole in which the screw is applied, and the parts it’s fastening are a complicated machine. On the other end of the spectrum, even something as sophisticated as a hive of Hadoop servers is a complicated system. What makes a system complicated and not complex is that we can predict an outcome with a complicated system, even if it takes Newton to run the calculations. We’re able to predict things because all the parts of a complicated system have a specific, tightly-coupled relationship to each other.

The Industrial Revolution is about qualities such as precision, speed, and endurance, all of which machines are infinitely better at than humans. We build machines ranging from bicycle tire pumps to semiconductor fab plants that can output products (air bursts and chips, respectively) with greater productivity than is possible with just the hands of people. Today, we still continue an obsession with optimizing these systems by eliminating all variance (most of which is human error) and minimizing waste, especially defects and downtime.

This distinction between complicated and complex is incredibly profound when making business decisions because:

We cannot settle for “just one number, please” predictions in a complex system. We can make accurate predictions within a consistent context. Complex systems are not a consistent context. For example, we can develop data mining algorithms to predict how to attract a patient into a dental office based on the patterns of that office’s patients. However, that same model probably will fail miserably for a practice in another neighborhood, state, or country. The best we can do is hope that the context changes slowly enough so our models at least work to some extent at least for a while.

Strictly speaking, there are probably no complicated systems in real life. Really, I can’t think of anything on earth that operates in a vacuum. Everything is intertwined in a “Butterfly Effect” way. Even a vacuum, as we generally mean it, is a vacuum only in that it is devoid of matter; things like gravity and light still pass through. Every complicated system I can think of is only an illusion we set up. We draw a box around it and limit our predictions to a limited space and time, hoping that nothing within that time will change enough to affect our predictions.

Figure 1 illustrates how we create the illusion of a closed system. We encase the system (the green circle representing a car) within a layer of protective systems (the blue layer) protecting it from the complexity of the world. I purposefully chose a dark gray background in Figure 1 to convey the notion of an opaque complex world, a “dark-gray” box, not quite a “black box”.


Figure 1 – Complicated systems are closed systems. We create “virtual” closed systems by protecting it through various mechanisms from the complexity of the real world.

Of course, the protective systems cannot protect the car from everything conceivable. It will not protect it from a bridge falling out from under it or a menacing driver in another car.

The Complexity is Only Getting Worse

A system’s complexity grows with the addition of moving parts and of obstacles which complicate the movement of those parts. Following are a few examples of forces adding moving parts, which directly increases complexity:

Globalization. Each country has its own sets of laws, customs, and culture. Working with different sets of rules means we need to be agile and compromise, which complicates our efforts.

Accumulating regulations and tightening controls. Constraints only add to complexity. They may not be a moving part, but they act as a roadblock to direct options. There are so many regulations in play (collectively millions of them at all levels of government in the US alone) that many of them must be in conflict with others. I wouldn’t be surprised if we all inadvertently broke some law every day. Ironically, regulations are an attempt to simplify things by removing variance and events that can cause a lot of trouble.

Growing population and affluence. Each person is a moving part of our global economy. More affluence means more people making more decisions of wider scope, being more active, and touching more things, whether as consumers or as workers.

The number of “smart” devices, which can even be considered semi-autonomous. Each device that makes decisions (even mundane ones) is also a moving part. See Note #2 for an example of one that happened to me today.

Increasing pace of the world. Even if no moving parts were added, the increasing pace of things adds as much to growing complexity as the number of moving parts. The faster things are going, the more spectacularly they will crash. Not too many things scale linearly, and increased load will add complication as different rules for different scales engage.

More demands on us. With more and more regulations and responsibilities hoisted upon us, we’re forced to prioritize things, which opens many cans of worms. In essence, prioritization means we are choosing what may not get done, or at best will be done half-heartedly with probably less than minimal effort. That can result in resentment from the folks we threw under the bus, or other sorts of things that add more problems. It forces us to spend the least amount of energy and resources possible so we can take on the other tasks. We learn to multi-task, but that may lower the quality of our efforts, at least for the tougher tasks (easy tasks may not degrade in quality of effort).

De-hierarchization/de-centralization of corporate life. Last, but definitely not least. This leads to more and more moving parts as decision control is placed in the hands of more people (and even machines) who are now better trained and tied in through effective collaboration software. However, decentralization is really a good thing: it mitigates, if not removes, bottlenecks, enriches the pool of knowledge from which decisions within the enterprise are made, and drastically improves the agility of a corporation. Decentralization is really the distribution of knowledge across an array of people who can proceed with sophisticated tasks minimally impeded by bottlenecks. See Note #3 for more on this concept and Note #4 on slime mold.

Embracing Complexity

When I’m in a room with other engineers brainstorming a solution, we’ll agree that a suggestion is too complicated or complex. We then back away from that suggestion, usually ending up relegating feature requests of varying importance to the “nice to have” pile (and never actually getting to them). I have no problem with that at all.

Although I believe many who know me will disagree, I don’t like complicated answers (see Note #5). Complications mean more restrictions, which means brittleness. Complications happen when you try to have your cake and eat it too, which is a different issue from getting too fancy with all sorts of bells and whistles. What I mean is that when we want to accomplish something but there are constraints, we need to include safeguards to protect those constraints. Constraints handicap our options. We can usually engineer some way to have our cake and eat it too, but eventually we will not be able to patch things up and the whole thing blows up.

When I began developing SCL way back when, my thought was how to embrace complexity, tame it, and conjure up ways to deal with the side-effects. The problem is that to truly embrace complexity, we need to be willing to let go of things, and we often have no choice as to which of those things go. But it’s one thing to be a non-self-aware species of bird that goes extinct as species more fit to current circumstances thrive, and quite another to be a self-aware person fighting for survival among billions of other self-aware beings. In a sense, everyone is a species of one.

I am incredibly far from having the answers. But what I do claim (at least I think I do) is that I have a starting point. It involves decentralizing control to networks of “smarter” information workers acting as a “loosely coupled” system (which works very well for developing complicated software systems). Most importantly, at least for Map Rock Version 1, is to accept and deal with the limitations of logic.

The Limitations of Logic

Whatever we personally know (the knowledge in our individual heads) consists of the things we’ve experienced: we only know what we know. Obviously, we cannot know things we haven’t experienced personally or that haven’t been conveyed to us (learned directly) through some trusted mechanism (e.g., someone we trust). Everything we know is a set of relationships. For example, an apple is recognized when we see a fruit with the classic shape, smell, color, etc.

That’s all fine and dandy until we attempt to infer new knowledge from our current knowledge. Meaning we take what we know, apply logic, and indirectly come up with a new piece of knowledge. How many times have we found something we were sure about to be wrong and when we figure out what went wrong we say, “I did not know that!” Meaning, had we known that, we would have come to a different conclusion.

Our mastery of the skill of logic relative to other animals is the secret sauce to our so-called “superiority” over them. However, for inter-human competition, all of whom have this power of logic, one needs superior logical capability as well as superior knowledge from which we can draw inferences. Logic is great as we use it to invent ways to outsmart nature (at least for the moment) who isn’t preying on us (nature herself isn’t out to get us). But as Superman was nothing when facing his enemies with the same powers in Superman 2 (General Zod, et al), we need to realize our logic can be purposefully tripped up by our fellow symbolically-thinking creatures. As we wrap a worm in a hook to catch a fish, our own kind does the same to us out in the world of politics and commerce. I wrote about this in my blog, Undermined Predictive Analytics.

The limitations of our beloved logic stem from the fact that we cannot possibly know everything about a system. There is no such thing as perfect information. The complexity of the world means things are constantly changing, immediately rendering much of what we “know” obsolete. However, our saving grace is that for the most part in our everyday world, a system will be stable enough over a limited volume of space and time for something we’ve learned to apply from one minute or day or year to the next.

When I mention this, people usually quickly quip (say that five times), “Garbage in, garbage out”, which entirely misses the point. Of course bad information leads to bad decisions. But even perfectly “correct”, perfectly accurate data points (perfect information) can lead to naïve decisions in the future. The inferences our logical minds make are limited to the relationships accumulated in our brains over the course of our lives; our experiences.

We usually think of things in terms of a complicated system even if the system is complex, because animal brains evolved to be effective within a limited space and time. That limited space and time is all that’s needed for most creatures just out to make it through another day. Decisions still seem to work because in that limited space and time, underlying conditions can remain relatively static, meaning something that worked two days ago has a good probability of working today. Additionally, the basic interests of humans are relatively stable and thus provide some level of inertia against relentless change, which adds to the probability that what worked yesterday still has a chance to work a year from now.

Our brains evolved to solve problems with a horizon not much further than the next time we’re hungry. For anything we do, we can pretty much ignore most things outside the immediate problem as just extraneous noise. Thinking long term is unnatural, so we don’t care about any butterfly effect. Thus we really don’t have a genuine sense of space spanning more than what we encounter in our day-to-day lives, nor of time spans much beyond a few human lifespans.

Software Sclerosis

Software Sclerosis – An acute condition of software whereby the ability for it to be adapted to the changing needs of the domain for which it was written is severely hindered by the scarring of the excessive addition of logic over time.

As the name of my company, Soft Coded Logic, implies, my primary focus is how to write software that can withstand inevitable changes through built-in malleability. I’m not talking about just change requests or new features added in a service pack or “dot release” (like Version 1.5). I’m talking about logic, those IF-THEN rules that are the basis of what “code” is about. Changes are inevitable because we live in a complex world, and logic has readily apparent limitations in a complex world. How can software adjust to rule changes without ending up a victim of “Software Sclerosis”, a patchwork of rules, a domain so brittle that most rules probably contradict something? On the other hand, flexibility can sometimes be “wishy-washy”, which means the software cannot perform as optimally as it otherwise could.

Soft-coded logic has always been my passion. I mentioned earlier that I began my software development career in 1979 and was heavily influenced by the Expert System craze of the 1980s. But software projects became modeled under the same paradigms as those used to build a bridge or a building. The bridge or building must fulfill a set of requirements, which beyond functional requirements includes regulatory requirements, a budget, and a timeframe. Software has similar requirements, except that because it is rather ethereal, not as tangible and rigid as a bridge, it is the path of least resistance and so is the natural choice for what must yield when things change. It’s easier to modify software to operate on another operating system than it would be to retrofit a bridge to act as an airplane runway as well.

Short of developing a genuine AI system, one that genuinely learns and adjusts its logic (the latter much harder than the former), we can only build in systems to ameliorate the sclerosis. The problem is that the value of these systems or methods is not readily apparent and just as importantly they weigh the system down when it’s running (not streamlined). So such systems/methods are quickly deemed “nice to haves” and are the first things to be cut in a budget or time crunch.

BI systems are rather rigid too:

  • OLAP cubes reflect a fixed set of data, which means they can pre-aggregate in a predictable manner, thus fulfilling their prime mission of snappy (usually sub-second) query response.
  • Data Marts and Data Warehouses are still based primarily on relational databases which store entities as a fixed set of attributes (tables).
  • “Metadata” still primarily refers to things like the database, table, and field names of an entity attribute, as opposed to the “Semantics” of an attribute.
  • Definitions of calculations and business rules are still hard-coded. The great exception is data mining models, where the selection of factors and their relationships can be automatically updated to reflect new conditions … at least to an extent.
  • Users still mostly consume BI data as pre-authored reports, not through analytics tools – based on the feedback I get about practically any analytics tool being a quant’s tool.
  • Basic concepts such as slowly-changing dimensions are still more of an afterthought.

Technologies I mentioned in the Part 3 topic, “Why is Map Rock Different?”, such as metadata management and predictive analytics, as well as technologies like columnar databases and the Semantic Web, will help to reduce the “plaque of quick fixes” in today’s software. But I hope Map Rock can “surface” the notions of malleability higher up the “stack” to information workers, that is, beyond current Self-Service BI. Developing Map Rock, I did my best to incorporate these things into its DNA, while at the same time avoiding too much overhead from going “metadata crazy” and, more importantly, developing systems that ameliorate the terrible side-effects of being metadata-driven.

Coming Up:

  • Part 5 – We close the Problem Statement with a discussion on imagination, which is how we overcome the limitations of logic, and how it is incorporated into Map Rock.
  • Map Rock Proof of Concept – This blog, following the Problem Statement series, will describe how to assess the need for Map Rock, readiness, a demo, and what a proof-of-concept could look like.

Notes:

  1. Obviously, I’ve never worked for the CIA because I seriously doubt I’d be able to even publicly suggest what sort of software they use. I would imagine their needs are so unique and secret that their mission-critical applications are home-grown. But then, it’s not like I’ve never been surprised by learning a BI system consists of hundreds of Excel spreadsheets.
  2. Junk mail filters are one of these semi-autonomous decision makers. Today one made a decision that could have profoundly affected my life. It placed a legitimate response regarding a position in which I was sincerely interested into my junk mail. It was indeed a very intriguing position. I don’t usually scan my junk mail carefully, and so it could very easily have been deleted. My point is that such semi-autonomous software applications or devices do affect things, adding to the complexity of the world.
  3. Dehierarchization, distribution of decision-making, is very analogous to a crucial design concept in software architecture known as “loosely coupled”. Instead of a monolithic, top-down, controlled software application, functions are delegated to independent components each with the freedom to carry out their function however the programmer wishes and as unobtrusively as possible (plays well with the other components). Each is responsible for its maintenance and improvement. Without this architectural concept, the capabilities of software would be hampered due to the inherent complexity of a centrally controlled system.
  4. Slime mold is one of the most fascinating things out there. You’ve probably seen it in your yard or on a hike at some time and thought it to be some animal vomit. It is a congregation of single-celled creatures that got together into what could be called a single organism. When food is plentiful, these cells live alone and are unnoticed by our naked eye. When food is scarce, they organize into this mass and can even develop mechanisms to hunt for food.
  5. I think engineers are often thought to over-think things and are prone to over-engineering – which I think is a good thing. But it’s often because we are aware of many things that can go wrong even if things may seem great on paper. I believe we also innately realize that there is an underlying simplicity to things and that if something is too hard, it’s often not right. When faced with something I need to engineer, I can only start with the technique I know works best. I may find it’s not good enough (or suspect it’s not good enough), which will lead me to research a better way or I may need to invent a better way. In any case, engineering involves the dimension of time, a limited resource. So we engineers weigh “good enough for now” with “so bad that there must be a better way”.

Polynomial Regression MDX

About a year and a half ago I posted a blog on the value of correlations and the CORRELATION MDX function titled, Find and Measure Relationships in Your OLAP Cubes. However, the CORRELATION function calculates only linear relationships, which means that the polynomial nature of many of the juicier relationships out there is somewhat poorly measured.
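(For reference, a call to that native function looks roughly like the minimal sketch below. This is just an illustrative query against the AdventureWorks sample, not the script from that earlier post; the month member keys are assumptions. Note that CORRELATION returns the correlation coefficient r, whose square gives a linear R².)

    WITH
      // Native linear correlation of two measures over a set of months (member keys assumed)
      MEMBER [Measures].[Linear r] AS
        CORRELATION(
          [Date].[Calendar].[Month].&[2003]&[8] : [Date].[Calendar].[Month].&[2004]&[7],
          [Measures].[Internet Sales Amount],
          [Measures].[Sales Amount] )
    SELECT { [Measures].[Linear r] } ON COLUMNS
    FROM [Adventure Works]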

Most real relationships change as the values scale to larger or smaller extremes. For example, working sixteen hours per day will not be twice as productive as only eight hours. It’s only natural as hardly anything is ever allowed to grow indefinitely. There are tempering and exacerbating factors, some form of “diminishing return” or on the other hand a snowball effect. Figure 1 illustrates the polynomial (red line) and linear (green line) relationship between the fuel consumption (mpg) and your speed on the freeway (mph). It shows that fuel efficiency rises until it hits a peak at about 45 mph, then declines after about 60 mph, in significant part due to changes in aerodynamics at those higher speeds.


Figure 1 – Miles per Gallon vs Miles per Hour. Sections are linear enough.

The polynomial relationship is very tight, with a correlation strength (R²) of .9382. However, the linear relationship shows up rather weak, with an R² of .2782. The polynomial and linear relationships are so different that they actually contradict! From the graph, it’s easy to see that the polynomial figure makes more sense. The need for polynomial measurement can often be avoided if we stick to a limited range of measure. In Figure 1, the orange circles show that the correlation is pretty linear between 5 and 35 mph, and again between about 55 and 75. But from 5 to 75, it follows a fairly tight polynomial curve.

So polynomial relationship calculations are superior, but the problem is that they are more calculation-intensive. And for Analysis Services folks, there isn’t a native MDX function for polynomial regression as there is for linear relationships (CORRELATION). We need to reach back to high school algebra and write out an old-fashioned polynomial (y = ax² + bx + c) using calculated measures. The calculations are all pretty simple, mostly just a bunch of SUMing and squaring of x and y in all sorts of manners.
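For reference, this is the standard least-squares algebra those calculated measures need to implement: fitting y = ax² + bx + c to n points reduces to solving the normal equations, and the relationship strength reported later is the usual coefficient of determination (the same R² Excel’s trendline reports). In LaTeX:

    \begin{aligned}
    a\sum x_i^4 + b\sum x_i^3 + c\sum x_i^2 &= \sum x_i^2 y_i \\
    a\sum x_i^3 + b\sum x_i^2 + c\sum x_i   &= \sum x_i y_i \\
    a\sum x_i^2 + b\sum x_i   + c\,n        &= \sum y_i
    \end{aligned}
    \qquad
    R^2 = 1 - \frac{\sum_i \left(y_i - \hat{y}_i\right)^2}{\sum_i \left(y_i - \bar{y}\right)^2}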

There are certainly tons of other methods, most better, for calculating polynomial regressions. However, this method pushes SSAS and MDX about as far as I’d feel comfortable pushing them, and it still performs fairly well. I should also point out that this blog is focused on finding relationships between combinations of measures (or even members) and doesn’t go to the next step of using them for forecasting (plugging an x into the y = ax² + bx + c formula to get y) – which is better served using data mining models.

Incidentally, as a side note, the blog I mentioned earlier, Find and Measure Relationships in Your OLAP Cubes, was what inspired me to develop Map Rock. I realized that once one plays with exploring correlations, some of the things you would want to do aren’t straight-forward in what were then the typical OLAP browsers. An example is how easily these correlations can supply misinformation, as we saw above. Those issues were the subject of a follow-up blog which I didn’t end up posting, as I realized they would make a pretty neat application.

The MDX

The MDX used to describe this technique is in the form of a SELECT statement that can be downloaded and run in SQL Server Management Studio. Here are a few points before we get into a walk-through of the MDX:

  • The technique is equivalent to Excel’s Polynomial to the 2nd Order Trend Line, as Figure 2 illustrates.
  • Because this is just algebra, I will not go deeply into an explanation of the calculations as it is fairly elementary – although I needed to think it through myself after not seeing the actual equations for so long.
  • This MDX sample uses the AdventureWorks sample cube.
  • I’m using SQL Server 2008 R2 and Excel 2010.


Figure 2 – Excel Trendline Option, Polynomial to the 2nd order.

Figure 3 illustrates where we will end up with this walk-through. It shows the relationship between [Internet Sales Amount] and [Sales Amount] for the Product Subcategory, Shorts.


Figure 3 – Measure correlation for “Shorts”.

The R² value of .6615 demonstrates moderate correlation, but it is deceptive, as there is a clear outlier toward the bottom (Jul-04) that is skewing the result somewhat. I left the outlier in because I can’t stress enough that this technique doesn’t take into account the removal of outliers. Figure 4 shows that removing the outlier yields an almost non-existent correlation.


Figure 4 – Measure correlation for “Shorts without the outlier”. The correlation isn’t good at all without that outlier.

If you’d like to play along, open up an MDX window in SQL Server Management Studio (I’m using SQL Server 2008 R2) and open this script, which will be described in the following paragraphs.

There are three main sets of calculations in the MDX. Figure 5 shows what I call the parameters. The three parameters mean that we are looking for the relationship between [Internet Sales Amount] and [Sales Amount] based on the months from August 2003 through July 2004. Notice that there is a line commented out for [Internet Tax Amount]. That is there to test a different measure for “Y”, [Internet Tax Amount] instead of [Sales Amount]. (If you do try [Internet Tax Amount], you will see a perfect correlation since the tax amount is directly proportional to the sales.)

Figure 5 – “Parameters” of the MDX demonstrating polynomial relationship calculation.
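In case the screenshot doesn’t come through, here is a rough stand-in for what those parameters look like as a named set plus two aliasing calculated members. The downloadable script is the authoritative version; the [Corr Months], [X], and [Y] names and the month member keys here are just illustrative:

    WITH
      // The months to correlate over: August 2003 through July 2004 (member keys assumed)
      SET [Corr Months] AS
        [Date].[Calendar].[Month].&[2003]&[8] : [Date].[Calendar].[Month].&[2004]&[7]

      // The two measures whose relationship we are measuring
      MEMBER [Measures].[X] AS [Measures].[Internet Sales Amount]
      MEMBER [Measures].[Y] AS [Measures].[Sales Amount]
      // The commented-out test: a Y that is perfectly correlated with itself being a proportion of sales
      // MEMBER [Measures].[Y] AS [Measures].[Internet Tax Amount]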

Figure 6 shows some of the intermediate calculations for finding the “a, b, and c” (remember, y = ax² + bx + c) values of the polynomial. Again, it’s just a bunch of squaring and summing, the same old stuff I’m sure people have implemented many times, Excel included. Figure 6 doesn’t show the more important “a, b, and c” calculations because they are rather verbose and I didn’t want to include such a large snapshot.

Figure 6 – Intermediate calculations for determining the a, b, and c values of the polynomial.
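A rough stand-in for those intermediate members follows, continuing the illustrative [X], [Y], and [Corr Months] names from the sketch above. The verbose a, b, and c members are shown here solved from the normal equations via Cramer’s rule; again, the downloadable script is the authoritative version, and this is only one way to lay it out:

      // Sums over the correlation months (x = [X], y = [Y] for each month)
      MEMBER [Measures].[n]       AS COUNT([Corr Months])
      MEMBER [Measures].[Sum X]   AS SUM([Corr Months], [Measures].[X])
      MEMBER [Measures].[Sum Y]   AS SUM([Corr Months], [Measures].[Y])
      MEMBER [Measures].[Sum X2]  AS SUM([Corr Months], [Measures].[X] * [Measures].[X])
      MEMBER [Measures].[Sum X3]  AS SUM([Corr Months], [Measures].[X] * [Measures].[X] * [Measures].[X])
      MEMBER [Measures].[Sum X4]  AS SUM([Corr Months], [Measures].[X] * [Measures].[X] * [Measures].[X] * [Measures].[X])
      MEMBER [Measures].[Sum XY]  AS SUM([Corr Months], [Measures].[X] * [Measures].[Y])
      MEMBER [Measures].[Sum X2Y] AS SUM([Corr Months], [Measures].[X] * [Measures].[X] * [Measures].[Y])

      // a, b, c of y = ax^2 + bx + c, from the normal equations via Cramer's rule
      MEMBER [Measures].[Det] AS
          [Measures].[Sum X4] * ([Measures].[Sum X2] * [Measures].[n] - [Measures].[Sum X] * [Measures].[Sum X])
        - [Measures].[Sum X3] * ([Measures].[Sum X3] * [Measures].[n] - [Measures].[Sum X] * [Measures].[Sum X2])
        + [Measures].[Sum X2] * ([Measures].[Sum X3] * [Measures].[Sum X] - [Measures].[Sum X2] * [Measures].[Sum X2])
      MEMBER [Measures].[a] AS
        (   [Measures].[Sum X2Y] * ([Measures].[Sum X2] * [Measures].[n] - [Measures].[Sum X] * [Measures].[Sum X])
          - [Measures].[Sum X3]  * ([Measures].[Sum XY] * [Measures].[n] - [Measures].[Sum X] * [Measures].[Sum Y])
          + [Measures].[Sum X2]  * ([Measures].[Sum XY] * [Measures].[Sum X] - [Measures].[Sum X2] * [Measures].[Sum Y])
        ) / [Measures].[Det]
      MEMBER [Measures].[b] AS
        (   [Measures].[Sum X4]  * ([Measures].[Sum XY] * [Measures].[n] - [Measures].[Sum Y] * [Measures].[Sum X])
          - [Measures].[Sum X2Y] * ([Measures].[Sum X3] * [Measures].[n] - [Measures].[Sum X] * [Measures].[Sum X2])
          + [Measures].[Sum X2]  * ([Measures].[Sum X3] * [Measures].[Sum Y] - [Measures].[Sum XY] * [Measures].[Sum X2])
        ) / [Measures].[Det]
      MEMBER [Measures].[c] AS
        (   [Measures].[Sum X4]  * ([Measures].[Sum X2] * [Measures].[Sum Y] - [Measures].[Sum X] * [Measures].[Sum XY])
          - [Measures].[Sum X3]  * ([Measures].[Sum X3] * [Measures].[Sum Y] - [Measures].[Sum X2] * [Measures].[Sum XY])
          + [Measures].[Sum X2Y] * ([Measures].[Sum X3] * [Measures].[Sum X] - [Measures].[Sum X2] * [Measures].[Sum X2])
        ) / [Measures].[Det]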

Figure 7 shows the actual SELECT part of the MDX along with the star calculation, “Relationship” (R²). This MDX will show the strength of the correlation between [Internet Sales Amount] and [Sales Amount] for each product subcategory.

Figure 7 – The business end of the MDX.
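As a rough stand-in for that screenshot (member names still illustrative), the “Relationship” member is the R² of the fitted curve, and the SELECT puts it on columns with product subcategories on rows:

      // Residual per month: actual Y minus the fitted a*X^2 + b*X + c
      MEMBER [Measures].[Resid] AS
        [Measures].[Y] - ( [Measures].[a] * [Measures].[X] * [Measures].[X]
                         + [Measures].[b] * [Measures].[X]
                         + [Measures].[c] )
      // Deviation of Y from its mean over the correlation months
      MEMBER [Measures].[Y Dev] AS
        [Measures].[Y] - [Measures].[Sum Y] / [Measures].[n]
      // R-squared: 1 - (sum of squared residuals) / (total sum of squares)
      MEMBER [Measures].[Relationship] AS
        1 - SUM([Corr Months], [Measures].[Resid] * [Measures].[Resid])
          / SUM([Corr Months], [Measures].[Y Dev] * [Measures].[Y Dev])
        , FORMAT_STRING = '0.0000'

    SELECT
      { [Measures].[Relationship] } ON COLUMNS,
      [Product].[Subcategory].[Subcategory].MEMBERS ON ROWS
      // or the commented-out alternative (hierarchy names assumed):
      // [Customer].[Education].[Education].MEMBERS * [Customer].[Gender].[Gender].MEMBERS ON ROWS
    FROM [Adventure Works]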

Notice as well that there is a line commented out for the customer Gender/Education level. You can try this out after this walkthrough, which is focused on product subcategory.

If you are playing along, I should mention there are two queries in the script, this query and a test query. Be sure to highlight the one you intend to run.

Executing the MDX will yield what is shown in Figure 8 (partial results). The Relationship column shows a value from 0 through 1, where 0 is absolutely no correlation and 1 is a perfect correlation.

Figure 8 – Result of the MDX. The “Relationship” column shows the R² value.

I’ve highlighted (red circle) “Shorts” as the product subcategory we will test. Notice though that there are many quirky values:

  • Lights and Locks show a value of 1.000, a perfect correlation. However, that’s because all of the values are null.
  • Mountain Frames shows -1.#IND. In this case, the Internet Sales Amount for all months are null, but there are values for Sales Amount.
  • You can’t see it here, but some of the Relationship values will not match Excel. That is because for some of the product subcategories, the values for Jul-04 are null. “Mountain Bikes” are an example.

Figure 9 shows the MDX used in the first step to test the Relationship values against what Excel will calculate (as illustrated in Figure 3). Notice that I slice (WHERE clause) to return values for “Shorts”.

Figure 9 – Test the R² value for Shorts.
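That test query is roughly of the following shape (again illustrative, not the exact script; the Shorts member is referenced by name here, while the actual script may use its key):

    SELECT
      { [Measures].[Internet Sales Amount], [Measures].[Sales Amount] } ON COLUMNS,
      // The same August 2003 through July 2004 month range used for the fit (keys assumed)
      [Date].[Calendar].[Month].&[2003]&[8] : [Date].[Calendar].[Month].&[2004]&[7] ON ROWS
    FROM [Adventure Works]
    WHERE ( [Product].[Subcategory].[Shorts] )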

Figure 10 shows the month by month values used to derive the polynomial used to calculate the relationship strength.

Figure 10 – The Internet Sales Amount and Sales Amount for Shorts by month.

Follow these steps to duplicate what is shown in Figure 3 (well, almost – I did do some cleaning up):

  1. Copy/Paste the entire contents of the Result pane into an Excel spreadsheet.
  2. In the Excel spreadsheet, select just the Internet Sales Amount and Sales Amount columns.
  3. Click on the Insert tab, click the Scatter icon, and select the plain Scatter plot (the one in the upper-left corner).
  4. Right-click on any of the plotted points and select “Add Trendline”.
  5. Select Polynomial and check the “Display R-Squared value on Chart” and “Display Equation on chart” items. Close.

Limitations

Implementing these calculations into the MDX script is easy for the most part. Just add the calculations, setting the appropriate visibility. What will be clumsy is dynamically selecting the measures for X and Y. There isn’t a straight-forward way to select two measures from most cube browsers. My only thought right now would be to set up two pseudo Measures dimensions where each member is mapped to a real measure (using SCOPE). Then we can select x and y from those dimensions. That’s a blog in itself.

Additionally, in “If You Give a Mouse a Cookie” fashion, after you begin playing with relationships, you’re going to want to:

  • Drill down to the details of the correlation set, as we did in the example.
  • Select the same hierarchy across rows and columns. For example, we may want to look for the correlation between the [Internet Sales Amount] for each product and the associated ad campaign cost (assuming product-level costs exist).
  • Handle outliers.
  • Have more control over the nuances of the correlation algorithm (from a calculation and performance point of view) than is allowed through MDX.

Those are in fact among the initial thoughts I had a year and a half ago when I first created the Visual Studio 2010 project for Map Rock. Please do take a look at the Map Rock Problem Statement for much more on those thoughts.

From a performance point of view, each cell involves many calculations, so the number of cell calculations is large. The good news is that the calculations aren’t the sort that generate thousands of “Query Subcube” events. Currently, the MDX is pretty snappy (even on a cold cache), but modifications to handle the quirks I described in the walk-through would have noticeable effects.


Map Rock Problem Statement – Part 2 of 5

This is Part 2 of 5 of the Map Rock Problem Statement. After the discussion in Map Rock Problem Statement – Part 1 of 5 on how I decided to develop Map Rock, Part 2 describes the target audience as well as the primary scenario for this Version 1.

Map Rock’s Target Audience

Map Rock’s target audience is those held accountable for meeting objectives in an ambiguously defined manner. Their objectives are ambiguous in that there isn’t a clear path to success, and thus there is an art to meeting them. This traditionally includes the managers: the CEOs through middle managers to team leaders. But it’s crucial to point out that even if you are one of the guys with no direct reports, as a modern, commando-like information worker you still have a growing level of responsibility for designing how you achieve your objectives. Map Rock is a software application intended to help such information workers through the strategy invention process and the execution of that strategy.

Map Rock isn’t just a “high-end tool” limited to a niche audience of quants, wonks, and power-users all with highly-targeted needs. For example, I’m aware of some of the marketing problems associated with the abrupt dropping of the PerformancePoint Server Planning application back in 2009. It was a very complex tool for a very small audience of managers. Such a product may not be a good fit for a Microsoft that develops blockbuster software for a wide audience (“whatever for the masses”), but it can be a great opportunity for a startup willing and able to crack a few nuts. However, my goal isn’t to develop and sell a blockbuster software product that goes “Gold” or “Platinum”. While financial success is naturally one of my goals, more importantly, I have a vision for BI which includes – but is certainly not limited to – notions such as these:

  • Assist information workers with finding relationships they don’t already know.
  • Blur and smooth out the rough, abrupt hand-off points between human and machine intelligence.
  • Do better than keep on shoving square pegs into round holes because things need to be neatly categorized.
  • When making a decision, what if I’m wrong and what can I do about it?
  • Help humans to regain the appreciation for our rather unique quality of imagination and its artful application. (See Note #1 for my comment on “art”.)

The answer isn’t another high-end, power-user tool, or on the other extreme a watered-down, lowest-common-denominator, single-task tool. I built an integration technology that will tie together the efforts of managers, IT, wonks, and commando information workers. Towards that goal, I needed to dive deeper than the planning activity that the PerformancePoint Planning application addressed, into the more abstract, lower level of strategy, which is an activity common to all people trying to make their way in the world.

I’ve pondered whether Map Rock is a “data scientist’s” tool. I think of a data scientist as someone with the wide-ranging skill required to procure and analyze data. My view of a data scientist is someone with an “A” or “B” (in the context of school grades A through F) level of skill with databases, analytics, and business (all broad categories in themselves). This is different from quants and wonks who generally have a narrower and more focused “A+” level of typically math-oriented skill, utilizing a narrow range of highly specialized tools such as SPSS.

I see the data scientist as the high-end, but not exclusive, audience of Map Rock. So I need to be clear that Map Rock is intended to work best as a collaborative application, not a single-user, stand-alone tool. As I mentioned above, Map Rock integrates the efforts of human managers, IT, wonks, and commando information workers. The data scientist plays a crucial role in a Map Rock ecosystem, but not like a DBA who manages the database system yet is not herself a consumer of the data. The data scientists using Map Rock are more in the spirit of the entrepreneurs of an economy, who keep the economy robust through their push for change and are also consumers themselves. And in some ways they are like the Six Sigma black belt who actively nurtures the system.

Additionally, and equally importantly, Map Rock integrates these human intelligences with the machine intelligence captured in our various BI implementations: Data Warehouses/Marts, OLAP cubes, metadata repositories, and predictive analytics models. From this point of view, data scientists are also like the ambassadors of the human world to the machine world, as they speak machine almost as well as they speak human.

At a higher level (the organization level), the target audience also considers the plight of the corporate “Davids” (of David and Goliath fame) who have come to terms with the fact that they will not beat the Goliaths at the very top of their market (ex: McDonalds and Burger King, the Big Three Automakers) by playing their game. Any entity at the top is at the top because current conditions favor them. Those conditions may have been imposed on the world by them a while ago through crafty engineering, or the conditions already existed and they saw a way to capitalize on them, or, most likely, a happy series of events led to dominance (somebody has to roll heads ten times in a row). In any case, it seemingly makes sense for the members of that “royalty” to do all they can to preserve those conditions, at worst controlling the relentless wave of change as best they can. They play primarily a defensive game. They have at their disposal the sheer force of their weight, exploiting powers such as economies of scale and brand-name recognition, lobbying for laws, and tripping up the promising startups. This isn’t bad; it makes perfect sense. But eventually, controlling change and plugging one hole leads to two more, then others, until finally the system explodes or implodes.

The scrappy “sub-royalty” corporations below the royalty of #1 and #2 in their market at the same time keep exerting pressure by finding novel ways to topple #1 and #2, playing a wiry offensive game that capitalizes on the virtues of being small. They find the Goliath’s weak spots and hit it there. This audience is the one that would have incentive to take Performance Management to the next step, beyond a “nervous system” to a focus on strategy.

Just as compelling are the thousands of relatively small “conglomerate” corporations consisting of a number of franchises such as hotels, fast food restaurants, and medical practices. These mini-businesses, of which there could be a few to hundreds in the conglomerate, span a range of markets and localities, meaning each piece involves unique characteristics, rendering what works in one place useless in another. For example, it would be very useful to be able to test the risk of applying an initiative that worked in a pilot project across the board to all medical practices.

The most common negative feedback I get on Map Rock is some form of, “People would not want to do that.” That is, they will choose to rely on their gut instincts as opposed to turning to a software application to validate their beliefs, to find new levers of innovation, to explore for unintended consequences, to clarify a mess of data points back into a coherent picture. Curiously, most of the people who tell me this are themselves the player/managers I will talk about more below. Remember, Map Rock is in great part about blurring the line between human gut instincts and the cold hard facts of our BI assets, meaning things don’t need to be “this or that”.

Yes, Map Rock goes beyond a tip calculator or the sort of “there is an app for that” software. While it is painful to learn anything new, enough incentive can get us to do all sorts of things. In other words, the pain of learning is less than (actually much less than) another pain the learning is attempting to resolve. The incentive here is to win. Corporate entities don’t exist simply so we have somewhere to do things. Map Rock is about enhancing the abilities of information workers, which is beyond just speeding up a process to get the work done faster to make room for other work (or even play). So I ask:

  • Are you willing to limit your options to only what you know? Are you going to limit yourself to that closed box which you may have mined to death over the years, depending solely on the existence of some gem you have yet to find?
  • Even if you feel confident in your knowledge, say from being in that position for decades, are you really sure that you know everything about your job? There is nothing else to learn? Are you 100% confident you kept up with the changes? Your mastery at your position hasn’t blinded you from things going on outside? Are you confident you’ll always stay in that job, never needing to take a few steps back in the mastery of your subject area?
  • Your actions don’t exist in a vacuum. Do you know how your actions affect other departments? Inadvertently, or even knowingly – like when there is so much going on and a deadline is looming, we realize we can’t save everyone. On the other side of the coin, at the risk of sounding cynical, are you certain all of the other managers will not jump on an opportunity to excel, even at your expense?

One version of the “people aren’t going to do that” feedback I’ve heard a few times is the notion that people only want to do the least amount possible to accomplish their tasks. The reasoning is that our plates are already so full there is just no room for anything else; there isn’t enough time to even take care of what is already on our plates. While this may make logical “business sense” from a common-sense point of view, I find it very sad and a bit insulting to the many who consistently go beyond the call of duty, not just at work, but in many facets of their lives. While it may be true that many people would not want to think more than they need to and will do the least amount possible to satisfy most of their obligations, when it comes to one’s livelihood, I would think that is incentive enough to buck up and fight.

Perhaps you feel you are in a job where you think you aren’t paid to strategize (one of those “We don’t pay you to think” jobs); you’re paid to simply execute instructions as flawlessly as is humanly possible. I’m not sure how comfortable I’d feel if I considered the implications of what the consistent improvement of Artificial Intelligence and robots holds in store for the value of such roles. That is, especially after conversations with an old auto worker comparing the assembly lines of the 1960s to the assembly lines of today. If outsourcing jobs to countries with low wages seems bad, think about outsourcing to a robot that doesn’t complain and can do your job flawlessly. See Note #2 on Robots.

Business Problem Scenario

At most feisty corporations, meaning those that are playing to win (as opposed to playing “not to lose”), a periodic (at least annual, but to lesser intensities, quarterly or even monthly) ritual takes place called the Performance Review. Every employee, both managers and worker bees, is evaluated on their performance during the previous period and assigned targets for the next period. Those targets originate from corporate goals and the high-level strategies designed by the corporate generals to meet those goals, and then trickle down the org chart. As the high-level goals and strategy trickle down to wider and wider breadths of people with increasingly specialized roles, those people receive increasingly specific goals and objectives.

That trickling down goes something like this. A CEO deems that we must focus on our ability to retain customers and devises a strategy to “delight customers”. The VPs are given objectives specific to their departments. For example, the VP of Manufacturing works to improve quality to Six Sigma standards and the VP of Support works to improve the scores of support surveys. These assigned objectives are just the “what”, not the “how”. It’s up to each VP to devise the how, a strategy, to meet their given objectives (as well as the key performance indicators to measure the progress of their efforts). Once they devise their strategies, they hand down objectives (components of their strategy) to the directors under them. The directors in turn devise strategies and hand objectives down to managers, and on and on until finally we reach those with no direct reports, the majority of the employees.

In this day of the information worker, decentralization of control, and the delegation of responsibility, it’s more up to each worker to devise how to accomplish their objectives. It’s far from anarchy. We’re still expected to stick within myriad government and corporate guidelines and are usually expected to prove that our “how” has been tried many times before with great success (meaning, trying new things doesn’t happen very often because no one is willing to be the first). But we still have much more wiggle room for creativity than if things were severely top-down, unambiguous, micro-managed directives.

At one corporation I’ve worked with, one with a highly competitive culture, there are about 3,000 managers out of a total of about 10,000 employees. That’s not as bad as it sounds, since most of those managers are player/managers, like a sergeant who shoots a gun as well as directs the actions of a squad. But that’s 3,000 managers struggling every quarter to devise strategies to meet objectives that are usually tougher than the previous quarter’s. That’s 3,000 different personalities, skillsets, points of view, points of pain, environments, social networks, etc. These assigned objectives aren’t tougher simply because they involve a higher target for the same value, such as Sales. They could be tougher because they represent strategic shifts from above that require the manager to change as well.

So there are the 3,000 managers banging their heads against the wall figuring out how to meet these challenging assigned objectives, which are never as cut and dried as one would hope. After much booze and crying, they come up with a few ideas. Each of these ideas is a theory, a strategy. They consist of chains of cause and effect. For example: to achieve higher customer satisfaction ratings, I will improve the knowledge of the support personnel by placing them into a smaller room where they can overhear each other’s conversations, increasing their awareness of problems that pop up for our products and reducing the steps needed to find someone who may have already encountered a given problem. For now, just note the negative side-effects this strategy could have.

Remember, this strategy is just a theory. Many funny things can happen during that trickling down of objectives from superior to subordinate. For example, as the trickling down gets wider and wider:

  • The spirit of the original strategy can get lost in translation. This is no different than how a story evolves as it’s told from person to person.
  • They can begin to conflict with or even undermine each other, hopefully always inadvertently. In a corporation with thousands of employees, the efforts are bound to conflict or have contention with others.
  • The strategies at whatever level could be wrong, whether too risky or too naïve. A strategy is just a theory.
  • The chosen KPIs could be the wrong things to be watching.

But it gets worse. During the execution of the strategies, many things can happen that thwart the corporation’s efforts:

  • If the “theory” behind a strategy at any level is wrong, then once that bad strategy is selected, the subsequent strategies devised by subordinates will address the wrong things.
  • Things fly in from left field, throwing the game plan out of control. “Black Swan” events (if it hasn’t yet happened, it couldn’t possibly happen) are not as rare as we seem to think (see Note #3). As the effects cascade through the enterprise, execution degrades into a free-for-all, with employees retreating to the comfort zone of old habits that don’t work anymore.
  • Employees could game the KPIs. Meaning, they are able to meet or even rocket past their goals in a way that is easy for them but probably has some side-effect on someone else. It’s amazing how clever people can be. So it looks like they are doing well, but in reality they are not.
  • Employees may leave, or even give up on reaching their goals, just as a sports team becomes discouraged once it knows it is mathematically out of the playoffs.

The points I list above (there are many more) make me wonder how corporations in the end do succeed. For one thing, they often don’t, and die a quiet death. Those that do survive do so because people are intelligent and somehow figure out how to make things work, even if it is very painful.

The moving parts within a corporation, over which it has at least some control, are so numerous that corporations are indeed complex systems. Additionally, there are countless factors outside the corporation, over which it has little or no control, which add exponentially to the complexity. How things actually work, or at least appear to actually work, is beyond the scope of this blog. I’m attempting to make the case for the need to take the process I describe above, Performance Management, to the next level.

Performance Management is a framework for establishing at least some level of order in that process. The most widely recognized tool is currently the Dashboard, a set of relatively simple charts and graphs, accessible through collaborative software such as SharePoint, where information workers can monitor the state of affairs. The primary object on a Dashboard is the Scorecard, a list of key performance indicators. A Dashboard is tailored to each information worker, reflecting their KPIs and relevant charts.

F1 – Example of a Dashboard.
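In the Microsoft BI stack such a dashboard typically sits on, those KPIs usually live in an Analysis Services cube, and a Scorecard row is essentially a query against the cube’s KPI functions. Here is a minimal MDX sketch of that idea; the cube name [Sales], the KPI name “Customer Satisfaction”, and the [Date].[Calendar Quarter] hierarchy are hypothetical stand-ins for whatever a given cube actually defines:

  // Value, target, and status for one hypothetical KPI, broken out by quarter.
  SELECT { KPIValue("Customer Satisfaction"),
           KPIGoal("Customer Satisfaction"),
           KPIStatus("Customer Satisfaction") } ON COLUMNS,
         [Date].[Calendar Quarter].Members ON ROWS
  FROM [Sales]

Each row of that result is one line of a Scorecard: the actual value, the target, and a status (typically normalized between -1 and 1) that drives the familiar red/yellow/green indicator.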

Another part of a typical dashboard is a still under-developed one called the “Strategy Map”. A Strategy Map is a graphic of a network of initiatives, each pointing to what it is intended to affect. For example: happy employees lead to higher product quality, which leads to satisfied customers, which leads to more purchases, which leads to higher profit. It’s currently implemented as an almost static visual created in a tool such as Visio. Nodes on the Strategy Map may do things like match the color of the status of the corresponding KPIs, but other than that, it is just a picture. Like Dashboards, the Strategy Map is tailored to the role of each information worker.

F2 – Example of a Strategy Map. This is for a dental practice.
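Since the nodes of a Strategy Map are typically colored by the status of their underlying KPIs, wiring the picture to the cube is, at a minimum, a matter of pulling the status of every KPI that backs a node. A sketch along the same hypothetical lines as above (the KPI names mirror the happy-employees-to-profit chain, and the [Sales] cube is again an assumption):

  // Status of each KPI backing a node of the strategy map.
  SELECT { KPIStatus("Employee Satisfaction"),
           KPIStatus("Product Quality"),
           KPIStatus("Customer Satisfaction"),
           KPIStatus("Gross Profit") } ON COLUMNS
  FROM [Sales]

Note what such a query can color but cannot capture: the arrows. The causal links between the nodes live only in the Visio drawing, which is exactly the decoupling discussed next.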

However, this Strategy Map is really the brain of the corporation, whereas the Dashboard is simply the nerves indicating the state of being. It could be said that Map Rock is about bringing the Strategy Map to life. A brain is a much tougher thing to implement than a nervous system. The current decoupling of the Dashboard from the Strategy Map is dangerous, since we will supposedly take actions based on what we see on the Dashboard, and just about every action has side-effects. Without a flexible connection to cause and effect, which as of now exists pretty much solely in our heads, at best we jump in with faith that those side-effects will be innocuous, or that we will cross that bridge when we get to it.

This “Business Problem Scenario” may be set in a corporate environment, but any endeavor is executed in a similar manner. The ultimate goal may be something other than financial profit though. This could be a non-profit devising strategies for implementing its cause or a government attempting to understand the effects of laws that it passes (yes, I do have a sense of humor – and understand I may have just flushed all my credibility down the toilet).

Coming up:

  • Part 3 – We delve into a high-level description of the Map Rock software application, where it fits in the current BI framework, and how it differentiates from existing and emerging technologies. This is really the meat of the series.
  • Part 4 – We explore strategy, complexity, competition, and the limitations of logic.
  • Part 5 – We close the Problem Statement with a discussion on imagination, which is how we overcome the limitations of logic, and how it is incorporated into Map Rock.
  • Map Rock Proof of Concept – This blog, following the Problem Statement series will describe how to assess the need for Map Rock, readiness, a demo, and what a proof-of-concept could look like.

Notes:

  1. The term “art” has been used in a disparaging context. People snidely refer to an endeavor that is not just repeating a successful process with, “It is more art than science …”. To me, art is a combination of imagination and a high degree of skill to manifest what is imagined. And imagination isn’t something we outgrow after childhood; it’s really what separates humans from the birds and reptiles. Art is not just fine art, which does require a great amount of skill and imagination. Everything that exists now but didn’t at some point required imaginative and highly skilled people to navigate an unclear path.
  2. Rise of AI and Robots. This is probably the most relevant force to Map Rock. Many jobs seemingly outsourced to other countries or lost through mergers are never coming back. Why would anyone hire an army of people to dig a ditch when a backhoe can do the job faster, better, and more cheaply (by today’s wage standards)? A friend of mine who worked in an auto assembly plant in Grand Rapids during the 1960s, but left to do other things, recently visited an auto assembly plant. He was shocked to see the seemingly endless line of people replaced by a line of robots and relatively few operators. People have been predicting this for decades and we aren’t yet ruled by robots. But the trend is certain: every day, AI and robotics impinge further upon the work available to humans.
  3. I probably don’t need to mention that “Black Swans” are a notion popularized by the book The Black Swan. They are events that no one would have even known to attempt to predict, and that have a large impact on things. However, in a world as complex as ours, such events happen much more frequently than we’d like to think at smaller, personal scales. I believe this underestimation is because we generally only think of the spectacular ones like 9/11 and the effects of Hurricane Katrina. However, a stroke or heart attack on our way to work, on what is otherwise a day like any other, is just as impactful to us individually and to those close to us. I recall my father, in the hospital for his bone marrow transplant just before the first Gulf War, turning away from the TV saying he had his own battle to deal with.

Map Rock Problem Statement – Part 1 of 5

Preface

Map Rock addresses the need to manage competitive fitness in an increasingly complex world through superior development and management of versatile strategies. There, that is the 25-words-or-less, Twitter-able, sound-bite “Problem Statement” for Map Rock. I somewhat facetiously refer to this series of blogs as the “Map Rock Problem Statement” when it really is a “Problem Essay”. So out of professional courtesy to everyone, before I begin, here is the Elevator Pitch:

The Problem Map Rock is trying to solve: Take Performance Management to the next level by “bringing the Strategy Map to life”. Performance Management initiatives would benefit from Business Intelligence systems that focus on presenting relationships rather than primarily returning data, sums, and calculations via what are still just reports. Business Intelligence packages many data points into aggregated “information” (data to information), but eventually there will be so many pieces of “information” that it again becomes data. Additionally, the data from which we can calculate relationships exists in a number of formats and in a great number of isolated sources. What is needed is a system that can integrate these sources, sort the information into a hierarchy, and maintain the validity of the information.


Current Solutions and Where Map Rock Fills Some Holes: The Business Intelligence areas such as Data Warehouses, Performance Management, and Predictive Analytics, as they stand today, have added tremendous value to the decision-making capability of enterprises, but haven’t lived up to their full potential. The vision of a Centralized Data Warehouse is elusive due to factors such as the complexity of integrating semantics across dozens if not thousands of data sources. Performance Management fails as it only tells us what is wrong with KPIs that we aren’t even sure are the right things to measure. Additionally, KPIs are disconnected, allowing workers ample room to game the system, which actually makes things worse. Predictive Analytics falls short in that the models make predictions based on historic patterns that are severely prone to skewing by one-off events. Simply removing a one-off event as an outlier could fail to detect what is really the birth of a new trend.

Map Rock’s Added Value: New initiatives such as Self-Service BI, Master Data Management, Metadata Management, Semantic Webs, and of course, Big Data are significant steps in the right direction. But at the same time, these initiatives can further complicate matters if they are not united. For example, Big Data in itself mostly adds more data points, and sometimes simply more isn’t the answer. What is required is a way to integrate, from a higher level, the heterogeneous array of technologies attempting to help us make better decisions. Additionally, we need to bridge the wide communication chasm between where BI leaves off (ex: the OLAP cubes and Predictive Analytics models) and the human brain.

Oh good, you’re still here.

If Map Rock sounds intriguing enough from the elevator pitch, I will be posting a marketing-oriented blog on February 8, 2013 on how to inquire about a demo. Please look out for it. Otherwise, I offer this essay on all that is behind Map Rock. It will take many more than 25 words or a one-minute speech to lay out the primary concepts underlying Map Rock, which I will do by discussing:

  • Embedding the concepts into the well-known Performance Management framework.
  • Building on top of the efforts of what is traditional BI and Predictive Analytics.
  • The strategy of building a “pidgin” to bridge human and machine intelligence versus building a genuine AI system.
  • The fundamental place of competitiveness, strategy, and imagination in a complex world.
  • Understanding the difference between complicated and complex systems for insight into why the results of current BI projects are still often only marginally helpful, or at worst, why we still make a lot of bad decisions.

This series of blogs is the “Why”, the reasoning behind Map Rock, not the “How”. This blog isn’t intended to be the “marketing” blog. I look at this article more as Map Rock’s “Federalist Papers”, from which the more consumer-friendly and poignant United States Constitution is derived. Actually, for Map Rock there is a journal of about 500 pages (in Word) dating back to 2003, which I’ve condensed down to these approximately 20 pages. This theme of “why” is actually in itself very “Map Rock”, as “why” is really a set of relationships, and relationships are what Map Rock is all about. We can be taught how to do something, but if we don’t know why, we will be lost when (not if) conditions for that “how” change.

The slogan driving the development of Map Rock is: the better we understand relationships, the more effective we can be at manipulating our surroundings. Humans have an enhanced ability to learn, that is, to assimilate and process relationships throughout our lives. When we understand why something happens, how things are related to each other, we can then engineer a solution to achieve a desired state in a system even if the starting points are different each time. A “solution” is a set of manipulations to pieces of a system. Over the last couple hundred thousand years, we’ve done very well in taking ourselves from a relatively weak, “jack of all trades” animal to the apex of the apex.

About ten years ago I read the great book, The Ingenuity Gap, by Thomas Homer-Dixon. In a nutshell, the thesis is that eventually the increasing complexity of the world will overtake humankind’s ability to engineer our way towards our dreams and out of the messes we individually and collectively get ourselves into. The world is becoming more complex by magnitudes, but the innate intellectual capacity of humans is rather constant, or at best improves incrementally through superior education techniques. I thought then that the popularity of this book would turn what would eventually become SCL from a “solution looking for a problem” into a “solution to a recognized problem”. Ten years later, we’re somewhere in between, but I optimistically think leaning toward the latter.

That means I still have a significant “solution looking for a problem” issue to overcome – which by the way isn’t necessarily a bad thing. The big obstacle I feel stems from society having grown too comfortable with the seductive simplicity of the sound-bite, non-competitive, tips and tricks, best practices, bullet point, PowerPoint, quick fix, instant gratification, elevator pitch, Tweet quips, risk averse, single-function, multi-tasking, lowest-common-denominator culture that we’ve made for ourselves.

Don’t get me wrong. Believe me, I partake in and greatly appreciate all the ease and convenience the sound-bite culture provides. In fact, innovation in large part is about making the mundane of life as quick, effective, and painless as possible. But I feel the art of “American Ingenuity” (which can exist anywhere in the world where the conditions for innovation exist), and the appreciation for it, is slipping through our fingers, and I don’t believe it’s something that is easily re-learned or re-taken.

Innovation is about delayed gratification. It involves thinking deeply and widely, allowing for and learning from mistakes, and being allowed to be a little bit playful and crazy. It’s what differentiates humans from other creatures that do live solely by simple rules. When it comes to the chores of life, of which there are more imposed on us every day at home and work, I’d like them to be as simple and painless as possible. But when it comes to creating new things and competing, at the risk of sounding sadistic, we need to embrace the opposite. “Embracing” in this case means instead of rejecting complexity, we face it and tame it. Towards that goal, I think of my development of Map Rock over the past few years as having fought an epic battle with a grizzly bear that I’ve now tamed. Maybe we’re not yet BFFs, but at least we can have a working relationship, which is a start.

The complexity of life is growing at an accelerated rate at this time, for many reasons which I’ll list later. Complexity means there is an unpredictable aspect to the outcomes of all movement involved in a complex system. In the course of all this movement, things are naturally destroyed and new things are created. But we humans have attachments to things and a natural tendency to seek stability, valiantly resisting the relentless change.

No, the world hasn’t come screeching to a halt due to the growing complexity of human activity. Life on Earth is still much too powerful to come to an end from our comparatively puny efforts; it has endured all sorts of much bigger catastrophes over a few-billion-year span. Humans are innovative and resilient creatures. The question is, how can we mitigate the risks and capitalize on the constantly changing conditions? Maybe we think we are handling it just fine. But maybe there is a boiling-frog problem. Maybe we just haven’t yet reached a scalability tipping point where drastic change can come very abruptly. Any more clichés? Hahaha.

At the end of the day, my intent for Map Rock is to help answer these three powerful questions:

  • How could this have happened?
  • What could possibly happen?
  • How can I make this happen?

Coming up:

  • Part 2 –  I describe Map Rock’s target audience and the primary business scenario for Version 1. It is not just a tool for quants, wonks, and power-users.
  • Part 3 – We delve into a high-level description of the Map Rock software application, where it fits in the current BI framework, and how it differentiates from existing and emerging technologies. This is really the meat of the series.
  • Part 4 – We explore strategy, complexity, competition, and the limitations of logic.
  • Part 5 – We close the Problem Statement with a discussion on imagination, which is how we overcome the limitations of logic, and how it is incorporated into Map Rock.
  • Map Rock Proof of Concept – This blog, following the Problem Statement series will describe how to assess the need for Map Rock, readiness, a demo, and what a proof-of-concept could look like.

Related Blogs

It may be beneficial to peruse material I’ve posted over the years that is collectively the soul of Map Rock. In a sense, almost all of my posts have something to do with Map Rock, but these posts strike me as the most relevant at this point. Map Rock is the manifestation of all these concepts. However, I will write this “Problem Statement” with the assumption that none of these posts have been read.

Please keep in mind that these blogs were written over a few years (2005 through 2012) and may be a bit, or more than a bit, outdated at times, as things have moved on over the years and my thoughts on the subjects have evolved as well.

Find and Measure Relationships in Your OLAP Cubes The first two blogs listed here set the direction for my efforts leading to Map Rock. This one really represents the foundation of Map Rock: the ability to “cast a wide net” for correlations, or even the lack of correlations. The main idea is to look for relationship measures to begin with, as opposed to looking for aggregate measures as is normal when browsing an OLAP cube.
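To give a taste of what a “relationship measure” can look like, MDX does include a Correlation function, so a correlation between two measures across a set can itself be queried like any ordinary measure. A minimal sketch, assuming a hypothetical [Sales] cube with [Support Calls] and [Customer Churn] measures and a monthly attribute hierarchy:

  // Correlation between two hypothetical measures, computed across months.
  WITH MEMBER [Measures].[Calls vs Churn Correlation] AS
    Correlation( [Date].[Month].[Month].Members,
                 [Measures].[Support Calls],
                 [Measures].[Customer Churn] )
  SELECT { [Measures].[Calls vs Churn Correlation] } ON COLUMNS
  FROM [Sales]

Casting the “wide net” is then a matter of generating queries like this across many pairs of measures and many slices, rather than hand-picking one pair at a time.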

Bridging Predictive Analytics and Performance Management Performance Management usually centers around the Scorecard, a report on the Key Performance Indicators. It is just a report, the nerves reporting pain. But imagine if the pain in your nerves reported to a brain with no awareness of pain from other parts of your body, no awareness of what is going on, no catalog of things it could do to alleviate the pain, and so on.

Undermined Predictive Analytics This blog was meant to be a reminder that it’s a jungle out there. There is a big difference between mining data about people as they just go about their daily business and mining data when there is actually an opposing intelligence involved, or when people know they are being watched. In business, a big problem with performance management is that workers are clever in gaming the system.

Cutting Edge BI is About Imperfect Information There is no “one number” answer, and practically all answers must be preceded by a series of “it depends” questions.

Why Does a Lt. General Outrank a Major General? This blog attempts to illustrate the role of strategy and tactics at different levels of jobs. But as companies trend towards decentralization of responsibility, delegating the “how” to people at all levels, what emerges is the modern information worker, the commando. That commando, who is often a player/manager, must be strategically, tactically, and operationally proficient.

Data to Information to Data to Information One of the main notions of Map Rock is that it’s the relationships between data points that provide the really juicy, meaty insights. More data, as in Big Data, isn’t in itself the answer. A focus on Big Data still sidesteps tackling the challenges of embracing complexity.

Why Isn’t Predictive Analytics a Big Thing? At the time of the writing of this blog, Predictive Analytics was still frustratingly fringe. Since 2009, it is perhaps still not a household word, but almost an “officehold” word. I positioned Predictive Analytics then similarly to how I’m positioning Map Rock: as a bridge across the chasm between most BI implementations and the human brain.

Predictive Analytics is Science for the Masses The first feedback I usually get on Map Rock is that it is a quant’s tool. It is a tool intended to make non-quants a little bit “quantier”. It amazes me how people casually tell me they have “non-thinker” roles, as if thinking is reserved for scientists and quants. Who has never strategized about something? Who as a kid hasn’t schemed about something like getting a Red Ryder BB gun for Christmas? I’ve encountered so many people who say they know nothing about data mining but yet provide fantastically profound arguments for why Barry Bonds may or may not be better than Babe Ruth.

Where Do Rules Come From? A major factor in the evolution of SCL was a comment made by a friend of mine way back when I first began developing it. He said that the really hard part was encoding the rules. He is absolutely correct. That led me down the path of finding sources of rules that already exist or are produced as naturally as possible (ex: clickstream analysis gleaning insight from something people already do) and integrating those rules.

Exponentially Growing Complexity. There are many powerful trends adding to the complexity of our lives. It’s important to recognize them.

Things Quickly Become Complex. A short true story of how complexity slapped me in the face.


Map Rock is Almost There!

Happy New Year!

I’m very happy to say that the development of Map Rock V1.0 is just about there. In fact, over the past month, my effort has shifted from primarily development of the actual product to primarily authoring rollout material.

Map Rock is the encapsulation of how I’ve always envisioned BI. In a nutshell, there is a big chasm between where BI ends and our human brains begin. Bridging that chasm is quite a challenge that I’ve tackled over the past few years. In some ways, it may have been easier to fight and then tame a grizzly bear.

Beginning in a week or so, I’ll start releasing a series of blogs stepping through Map Rock’s purpose, value, its differentiation from emerging technologies, and most importantly, how enterprises can engage that power. For now, here is a short blog on one of the slides in the Map Rock presentation that seems to resonate: Business Problem Silos

Please also see how Map Rock Got Its Name.


Securing a Dimension with Members Having the Same Name

Here’s an SSAS security issue that doesn’t come up very much at all. In fact, when this recently came up, I had forgotten about a solution I provided way back in 2005 (which was the only other time I’ve encountered it). The problem: when securing an attribute member (on a role’s Dimension Data tab) using the member’s name rather than its key (in the “Allowed Member Set” text box of the “Advanced” tab), and there is more than one member with that same name, only the first one will be secured.

For example, if I have two customers named “John Smith” (but with different key IDs) and I place the MDX [Customers].[Customer].[John Smith] in the “Allowed Member Set” text box, only the first John Smith will appear. This is consistent with the behavior of a SELECT statement that refers to members by name: if there is more than one, only the first will show. Please note that this is different from what will happen if your axis set references the members’ key IDs (which means using functions such as Members, Children, etc., for example [Customers].[Customer].Members). In that case, “John Smith” will show up twice.
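To make the two behaviors concrete, here they are as plain queries rather than as an Allowed Member Set (the [Sales] cube and the [Measures].[Sales Amount] measure are hypothetical; run each query on its own):

  // By name: only the first of the two John Smiths is returned.
  SELECT { [Measures].[Sales Amount] } ON COLUMNS,
         { [Customers].[Customer].[John Smith] } ON ROWS
  FROM [Sales]

  // By set function (which resolves members by key): both John Smiths are returned.
  SELECT { [Measures].[Sales Amount] } ON COLUMNS,
         { [Customers].[Customer].Members } ON ROWS
  FROM [Sales]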

Before continuing to the solution, I should explain why the names instead of the keys were referenced (which also explains why this is such a rare occurrence). The underlying data sources of the dimensions did not maintain static dimension keys: the underlying data source (data warehouse or data mart) was completely repopulated each time it refreshed. This means that any MDX referring to those members by key (ex: [Customers].[Customer].&[1]) is no longer valid; or worse, that key now refers to another member to which the security should not be applied. Therefore, keys, wherever they are referenced (security, calculations, KPIs), must be manually changed.

Additionally, this also means that the SSAS dimensions must be fully processed, as opposed to incrementally processed with ProcessUpdate, since SSAS can no longer map those dimension keys to the internal keys it creates when the dimensions are processed.

It’s these terrible side-effects of not maintaining consistent dimension keys, combined with the relative infrequency of two members having the same name (especially for companies, products, etc.), that make the situation I describe so rare.

The solution is to refer to the member as the result of a FILTER expression. For our John Smith example, instead of simply composing the MDX [Customers].[Customer].[John Smith], we would compose something like:

FILTER([Customers].[Customer].Members, INSTR([Customers].[Customer].CurrentMember.Name, "John Smith") <> 0)

This will return a set consisting of the two John Smiths. If we wanted to secure John Smith and Eugene Asahara, even though there is unlikely to be more than one Eugene Asahara and we could just state [Customers].[Customer].[Eugene Asahara], just to be safe we could write:

FILTER([Customers].[Customer].Members, INSTR([Customers].[Customer].CurrentMember.Name, "John Smith") <> 0)
+ FILTER([Customers].[Customer].Members, INSTR([Customers].[Customer].CurrentMember.Name, "Eugene Asahara") <> 0)

I’d like to note as well that another reason I suspect I don’t see some situations very often is not just that they don’t happen often for one reason or another, but that customers may give up on the technology when they run into a wall. That’s a terrible shame, since there is hardly ever a perfect product and often a solution is just one little insight away.
