How can you relate ML algorithms to your business?

When you start talking about Machine Learning algorithms, most people’s eyes glaze over: it sounds like higher-order math, something cool and distant that they would rather not be bothered with.

So how can you engage in a meaningful conversation about these algorithms and demonstrate how and why they add value to make the case for implementing them? 

We have had success showing measurable, quantifiable results: predictions from these algorithms that outperform the current-state process. Not everything is as easy to measure or demonstrate, however, so it helps to classify your ML models into categories and build a measurement framework around each category.

The following elements are critical in establishing such a measurement system: 

  • A continuously executable platform that can evaluate rule sets and persist results.
  • Rule set authoring, configuration, promotion, and execution that are configurable and available on demand.
  • Measurable results that are persisted and can be audited.
  • Continuous improvement: a champion-challenger setup, where the production population uses the “champion” rule set while a pilot population uses the “challenger”. When the challenger performs statistically better than the champion, we can flip the two (a minimal sketch of such a comparison follows this list).
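As a concrete illustration of the champion-challenger flip, here is a minimal sketch in Python of the statistical comparison, assuming we track how often users act on each rule set’s output. The function name, the two-proportion z-test, and the sample counts are illustrative assumptions, not our production implementation.

```python
import math

def challenger_beats_champion(champ_hits, champ_total, chall_hits, chall_total, alpha=0.05):
    """Two-proportion z-test (illustrative): does the challenger rule set show a
    statistically higher hit rate than the current champion?"""
    p_champ = champ_hits / champ_total
    p_chall = chall_hits / chall_total
    # Pooled hit rate under the null hypothesis that both rule sets perform equally.
    pooled = (champ_hits + chall_hits) / (champ_total + chall_total)
    se = math.sqrt(pooled * (1 - pooled) * (1 / champ_total + 1 / chall_total))
    z = (p_chall - p_champ) / se
    # One-sided p-value from the standard normal survival function.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return p_value < alpha

# Hypothetical counts: predictions acted on vs. total predictions served.
if challenger_beats_champion(1200, 10000, 340, 2500):
    print("Promote the challenger to champion.")
```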

I have found the following classification useful for the work my team is doing. It is not a comprehensive taxonomy of all available ML models, just one that has worked for us:

  • Clustering: The technique of dividing a set of input data into possibly overlapping subsets whose elements are considered related by some similarity measure. The measures we have used here are commonality by department, division, cost center, etc. Typical implementations have used DBSCAN (Density-Based Spatial Clustering of Applications with Noise), k-spanning tree, kernel k-means, and shared-nearest-neighbor clustering algorithms. Typical use cases we have applied clustering to are role classification, entitlement clustering, etc.
  • ARM (Association Rule Mining): A technique to uncover how items are associated with one another. We typically calculate three measures: support (support(A) = occurrences(A) / total transactions), confidence (confidence(A→B) = support(A,B) / support(A)), and lift (lift(A→B) = confidence(A→B) / support(B)). We have used it to predict which rights should be offered to a new employee who joins a group or department (see the sketch after this list).
  • Recommender Systems: We have seen success with collaborative filtering, both item-based and user-based. UBCF assumes that a user will rate an item similarly to its neighbors if those users are similar, while IBCF focuses on which items are most similar to the items a user already enjoys, which lets us compute similarity directly between co-rated items and skip the k-neighborhood search. A key measure we have used to evaluate algorithms in this space is how often a user acted on a suggestion surfaced by the recommender. We have used this to recommend other items a user could request along with their original request.
  • Market Basket Analysis: A technique to uncover associations between items by looking at combinations that frequently occur together in transactions, in order to predict the occurrence of one event given the occurrence of another.
  • Neural Networks: Models trained to mimic the behavior of a system. The weights in the model are tuned against the training set until we start getting realistic predictions for new data the model has not seen. Useful measures here include keeping a record of actual vs. predicted events to measure variance and using continuous feedback to improve predictions. We plan to use this to learn from system events on the network and infer which processes to run in response.
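To make the ARM measures above concrete, here is a minimal sketch that computes support, confidence, and lift over a toy set of entitlement “transactions”; the data and the example rule are made up for illustration.

```python
# Toy entitlement "transactions": the rights each employee holds (illustrative only).
transactions = [
    {"email", "vpn", "jira"},
    {"email", "vpn", "payroll"},
    {"email", "jira"},
    {"email", "vpn", "jira", "wiki"},
    {"vpn", "jira"},
]

def support(itemset):
    """Fraction of transactions containing every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """confidence(A -> B) = support(A union B) / support(A)."""
    return support(antecedent | consequent) / support(antecedent)

def lift(antecedent, consequent):
    """Confidence of the rule divided by the baseline support of the consequent;
    values above 1 suggest the antecedent genuinely raises the odds of the consequent."""
    return confidence(antecedent, consequent) / support(consequent)

rule = ({"email", "vpn"}, {"jira"})
print("support:", support(rule[0] | rule[1]))
print("confidence:", confidence(*rule))
print("lift:", lift(*rule))
```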

Will keep you posted on the results. Happy to hear about alternate frameworks and paradigms that folks have used to gain acceptance. 

Evolution Of AI And The Trust Frameworks We Need to Support It

We have seen AI systems evolve from the simple to the complex: from simple correlations and causation, to model creation, training, and advanced prediction, and finally to unsupervised learning and autonomous systems.

Trust Models Required At Each Stage Of Evolution

Human society has been comfortable assigning accountability to one of its own, i.e. a human actor who creates, authors, or mentors these models and can assume accountability and responsibility. If you trace the evolution of our justice systems from the time of Hammurabi (sixth king of the First Babylonian Dynasty, reigning from 1792 BC to 1750 BC) to the modern nation states of today, we expect good behavior within a well-defined system of laws and rules, and transgressions against them attract punishment meant to drive compliance.

[Image: the Code of Hammurabi stele, Louvre. Photo by Mbzt, CC BY 3.0]

But will this always be true? Do we need new trust models and enforcement mechanisms?

Some Questions To Answer Before Advent Of Completely Autonomous Systems:

How do you program ethics?

Morals are the objective, transcendent ideals we base our ethics upon. Jonathan Haidt, in his exploration of conservative and liberal morality, describes five key foundations: harm, fairness, authority, in-group loyalty, and purity. Per his TED talk, liberals value the first two and score low on the other three, while conservatives value the latter three more than liberals do.

Ethics are the subjective rules by which we govern our behavior and relate to each other in an acceptable manner. So which of these moral principles and in what measure should our ethical rules be based upon? And who chooses?

These ethics rules determine the system’s behavior in any situation and thus form the basis for the trust system we will operate upon with the autonomous system. (See the definitions of trust in my earlier post here)

Would the creator of a model be held responsible for all its future actions?

Think about a newborn infant. He or she usually has a base set of moral frameworks hard-wired into the brain, and it is life experiences that shape how that model further develops: which behaviors are acceptable in society, which ones are not, what is considered good vs. evil, and so on. The only thing a creator can be held responsible for is the base template used to create the autonomous AI system. Anything learned after “birth” falls under the nurture argument, for which it would be very difficult to assign accountability.

Can you set up a reward and punishment system for AI models?

If we consider an AI system to be similar, how do you provide it with a moral compass? Would you expose it to religion (and which one?) to teach it the basics of right and wrong, or set up reward and punishment systems to train it to distinguish desirable from undesirable behavior? And again, who determines what is desired and what is not: is it us humans, or do we leave this up to the autonomous AI system?

Who decides on when and how we go to Autonomous AI?

When will we as a society be ready to take the leap? A number of thought leaders, including Stephen Hawking and Elon Musk, have warned us about this. Are we ready to heed those warnings and muzzle our explorations into truly autonomous systems? Or is this an arms race where, even if we showed restraint, someone somewhere might not act with the same constraints? And finally, was the purpose of our species to develop something more intelligent than us, able to outpace us, out-compete us, and eventually sunset our civilization?

I guess only time will tell, but meanwhile it is important to at least model ethical rules as we know them into the systems and autonomous programs we create, similar to Isaac Asimov’s Three Laws of Robotics: a robot may not injure a human being or, through inaction, allow a human being to come to harm; a robot must obey the orders given to it by human beings except where such orders would conflict with the First Law; a robot must protect its own existence as long as such protection does not conflict with the First or Second Law. We must also realize that our biases, desires, and ambitions will always be a part of our creation…

A Possible Way Forward…

We need a framework to establish certain base criteria for the evolution of AI, something that will be the basis of all decision-making capabilities. This core ROM, which cannot be modified, should form the basis of trust between humans and autonomous AI systems.

This basic contract is enforced as a price of entry to the human world and becomes a fundamental tenet for trust between us humans and autonomous systems allowed to operate in our realm.  

As long as humans trust that decisions are made upon these core principles (like Asimov’s Three Laws of Robotics described above), we will operate from a position of mutual trust and should be able to reach a mutually beneficial equilibrium that maximizes benefits all around.

Given today’s announcement about two babies being born with their genes edited using CRISPR-Cas9, it is all the more urgent for us to establish this common framework before the genie is out of the bottle…

How To Determine Which Machine Learning Technique Is Right For You?

Machine Learning is a vast field with various techniques available to a practitioner. This blog is about how to navigate this space and apply the right methods for your problem.

What is Machine Learning?

Tom Mitchell provides a very apt definition: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”

E = the experience of playing many games.

T = the task of playing an individual game.

P = the probability that the program will win the next game.

For example, a machine playing Go was able to beat the world’s best Go player. Earlier machines depended on humans to provide the example learning set, but in this instance the machine was able to play against itself and learn the basic Go techniques.

The broad classes of Machine Learning techniques are:

Supervised Learning: A set of problems where there is a relationship between input and output. Given a data set where we already know the correct output, we can train a machine to derive this relationship and use the model to predict outcomes for previously unseen data points. These problems are broadly classified into “regression” and “classification” problems.

  • Regression: When we try to predict results within a continuous output, i.e. map input variables to some continuous function. For example, given the picture of a person, predicting the age of that person.
    1. Gradient Descent: also called steepest descent, an optimization technique that repeatedly steps along the negative gradient to reach a local or global minimum. It is often used in machine learning to fit the coefficients of a regression curve over a training data set; using these fitted coefficients, the program can then predict a continuous-valued output for new data presented to it (see the sketch after this list).
    2. Normal Equation: θ = (XᵀX)⁻¹Xᵀy, the set of simultaneous equations in the unknown coefficients derived from the observation equations by least-squares adjustment; solving it gives the regression coefficients in closed form.
    3. Neural Networks: systems of connected nodes that mimic biological neural networks (our brains). Such systems learn the model coefficients by observing real-life data and, once tuned, can predict outputs for observations outside the training set.
  • Classification: When we try to predict results in a discrete output, i.e. map input variables into discrete categories. For example, given a patient with a tumor, predicting whether it is benign or malignant. Related techniques include:
    1. Large-Margin Classification
    2. Kernels
    3. Support Vector Machines
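As referenced above, here is a minimal sketch of both regression approaches, fitting the same line to synthetic data with gradient descent and with the normal equation; the learning rate, iteration count, and data-generating parameters are arbitrary choices.

```python
import numpy as np

# Synthetic training data: y = 4 + 3x plus noise (illustrative only).
rng = np.random.default_rng(0)
x = rng.uniform(0, 2, size=(100, 1))
y = 4 + 3 * x + rng.normal(0, 0.5, size=(100, 1))
X = np.hstack([np.ones_like(x), x])          # add an intercept column

# 1. Gradient descent: repeatedly step against the gradient of the squared-error cost.
theta = np.zeros((2, 1))
lr, iterations = 0.1, 2000
for _ in range(iterations):
    gradient = (2 / len(X)) * X.T @ (X @ theta - y)
    theta -= lr * gradient

# 2. Normal equation: closed-form solution theta = (X^T X)^-1 X^T y.
theta_closed = np.linalg.inv(X.T @ X) @ X.T @ y

print("gradient descent:", theta.ravel())
print("normal equation:", theta_closed.ravel())
```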


Unsupervised Learning: When we derive the structure by clustering the data based on relationships among the variables in the data. With unsupervised learning there is no feedback based on the prediction results.


  • Clustering: The process of dividing a set of input data into possibly overlapping subsets, where the elements of each subset are considered related by some similarity measure. Take a collection of data and find a way to automatically group the items that are similar or related by different variables, for example the clustering of news stories on the Google News home page.

Some classic graph clustering algorithms are the following:

  1. Kernel k-means: select k data points from the input as centroids, assign each data point to its nearest centroid, and recompute the centroid of each cluster until the centroids stop changing (a minimal sketch of this loop, in its plain k-means form, follows this list).
  2. K-spanning tree: obtain the minimum spanning tree (MST) of the input graph; removing k-1 edges from the MST results in k clusters.
  3. Shared nearest neighbor: obtain the shared nearest neighbor (SNN) graph of the input graph; removing edges from the SNN graph with weight less than τ results in groups of non-overlapping vertices.
  4. Betweenness-centrality based: quantifies the degree to which a vertex (or edge) occurs on the shortest paths between all other pairs of nodes.
  5. Highly connected components: repeatedly remove the minimum set of edges whose removal disconnects the graph until the remaining subgraphs are highly connected subgraphs (HCS).
  6. Maximal clique enumeration: a clique is a subgraph C of graph G with edges between all pairs of nodes; a maximal clique is a clique that is not part of any larger clique.
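As a concrete illustration of the assign-and-recompute loop in item 1, here is a minimal sketch of plain k-means (the un-kernelized variant) on synthetic two-dimensional data; the blob centers and the convergence test are illustrative choices.

```python
import numpy as np

def kmeans(points, k, iterations=100, seed=0):
    """Plain k-means: pick k points as centroids, assign each point to its
    nearest centroid, recompute centroids, and stop when they no longer move."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iterations):
        # Distance of every point to every centroid, then nearest-centroid labels.
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        new_centroids = np.array([points[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Two obvious blobs around (0, 0) and (5, 5) for illustration.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])
labels, centroids = kmeans(data, k=2)
print(centroids)
```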


  • Non-Clustering: Allows you to find structure in a chaotic environment.
    1. Reinforcement Learning: where software agents automatically determine the ideal behavior to maximize performance.
    2. Recommender Systems: information filtering systems that seek to predict a user’s preference for an item by watching and learning from the user’s behavior (a minimal sketch follows this list).
    3. Natural Language Processing: a field that deals with machine interaction with human languages, specifically the three challenges of speech recognition, understanding, and response generation.
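For the recommender-system item above, here is a minimal sketch of the item-based collaborative-filtering idea: unseen items are scored by cosine similarity between item rating columns, weighted by the user’s own ratings. The rating matrix is made up, and real systems would handle missing ratings far more carefully.

```python
import numpy as np

# Toy user-by-item rating matrix (0 = not yet rated); purely illustrative.
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

def cosine_sim(a, b):
    """Cosine similarity between two item rating columns."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return a @ b / denom if denom else 0.0

def recommend(user_idx, top_n=1):
    """Score each unrated item as a similarity-weighted average of the user's own ratings."""
    n_items = ratings.shape[1]
    sims = np.array([[cosine_sim(ratings[:, i], ratings[:, j]) for j in range(n_items)]
                     for i in range(n_items)])
    user = ratings[user_idx]
    rated = np.where(user > 0)[0]
    scores = {}
    for item in np.where(user == 0)[0]:        # only items the user has not rated
        weights = sims[item, rated]
        scores[item] = weights @ user[rated] / weights.sum()
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

print(recommend(user_idx=0))   # suggests an item similar to what user 0 already likes
```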


And finally, remember that the seven essential steps of any machine learning project are the following (a minimal end-to-end sketch follows the list):

  • Gathering the data
  • Preparing the data
  • Choosing a Model
  • Training your Model
  • Evaluating your model
  • Hyperparameter tuning
  • And finally, prediction
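As a minimal end-to-end sketch of these seven steps, the following uses scikit-learn’s bundled iris data; the choice of logistic regression and the parameter grid are arbitrary, and step 1 is stubbed with a sample dataset rather than real data gathering.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# 1. Gather the data (here, a bundled sample dataset).
X, y = load_iris(return_X_y=True)

# 2. Prepare the data: hold out a test set and scale features inside a pipeline.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# 3. Choose a model.
pipeline = Pipeline([("scale", StandardScaler()), ("clf", LogisticRegression(max_iter=1000))])

# 4. Train the model.
pipeline.fit(X_train, y_train)

# 5. Evaluate the model on held-out data.
print("baseline accuracy:", accuracy_score(y_test, pipeline.predict(X_test)))

# 6. Hyperparameter tuning via cross-validated grid search.
search = GridSearchCV(pipeline, {"clf__C": [0.1, 1.0, 10.0]}, cv=5)
search.fit(X_train, y_train)

# 7. Predict with the tuned model.
print("tuned accuracy:", accuracy_score(y_test, search.predict(X_test)))
```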