AI Product Management as a Markov Chain

Product Management as a discipline sits at the intersection of business need, user experience and technology. A consistent Product Management strategy is necessary for delivering a viable and sustainable product. When the Product Management strategy changes with every Product Manager hire, focus and investment can become confused, and you end up a bit like Manchester United. With Machine Learning the requirement for a multi-disciplinary team becomes greater and necessitates ML Ops, Data Science and hybrid development skills. Again, like Manchester United, hiring ageing ‘superstars’ is never a good strategy. To be a successful Product Manager with Machine Learning requires flexibility and faith in an iterative process. The following is part of what I have learned developing ML products as a CTO, with a specific focus on Markov Chains, Decision Trees, and Genomics.

A Markov Chain is a mathematical system that moves from one state to another according to probabilistic rules. Markov Chains are regularly used in Reinforcement Learning, specifically as the foundation of Markov Decision Processes. They also make a very good analogy and structure for comparing the responsibilities of a ‘Product Manager’ working on a traditional non-AI delivery model with those of one working on a Machine Learning, data-centric product. This is because, like a UML State Diagram, each node represents a state and each flow carries a probability of travelling to another state, and that probability depends solely on the current state, not on how the system got there. A cup competition is a good example of a Markov Chain: a team may have an even probability of moving to the next round (state) or exiting the competition, and each round is not affected by the previous rounds.
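As a minimal sketch of the cup example, the snippet below models a hypothetical three-round knockout in Python. The round names and the assumption that every tie is an even 50/50 are mine, purely for illustration. Each step only looks at the current state, which is the Markov property described above.

```python
# Hypothetical three-round knockout cup. Every tie is ASSUMED to be a 50/50
# coin flip, and "Winner" / "Eliminated" are absorbing states.
TRANSITIONS = {
    "Round 1": {"Round 2": 0.5, "Eliminated": 0.5},
    "Round 2": {"Final": 0.5, "Eliminated": 0.5},
    "Final": {"Winner": 0.5, "Eliminated": 0.5},
    "Winner": {"Winner": 1.0},
    "Eliminated": {"Eliminated": 1.0},
}


def step(distribution):
    """Advance a probability distribution over states by one round."""
    nxt = {state: 0.0 for state in TRANSITIONS}
    for state, mass in distribution.items():
        # Only the current state matters, never the route taken to reach it.
        for target, probability in TRANSITIONS[state].items():
            nxt[target] += mass * probability
    return nxt


def distribution_after(start, rounds):
    """Start with certainty in `start` and play the given number of rounds."""
    distribution = {state: 0.0 for state in TRANSITIONS}
    distribution[start] = 1.0
    for _ in range(rounds):
        distribution = step(distribution)
    return distribution


after_cup = distribution_after("Round 1", 3)
print(after_cup["Winner"])  # 0.125, i.e. 0.5 * 0.5 * 0.5
```

Under these assumptions a team entering at Round 1 lifts the cup one time in eight, and the remaining probability all ends up in the Eliminated state.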

A Simple Software Delivery Model represented as a Markov Chain

The primary role of any Product Owner is to manage the entire product lifecycle, both the internal, engineer-facing side and the external, customer- and market-facing side. Note that the terminology is a bit confusing in ML, which is normally viewed as a service, but we will keep using the term ‘Product Owner’ rather than ‘Service Owner’, which always sounds a bit more like fixing stuff.

A traditional Product Owner lifecycle drawn as a Markov diagram would look something like the diagram, with states such as Design, Development and Release. Software delivery processes like Lean, Agile and Kanban all have states and transitions. In your own delivery model you may choose to add more states, for example Acceptance Test, and between each of these states you would define transitions.

In the example of a non-ML software development model, designs are released to Development as stories 70% of the time, while complex design items stay in Design the other 30% of the time. Stories not understood by the developers return to Design 40% of the time. The other transitions are shown by the arrows in the diagram, each weighted by a percentage. I have not included a direct flow from Design to Release that bypasses Development. I have included a link from Release to Design, as some items might not meet the customer requirements on Release and so could go straight back to Design rather than back to Development.
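The delivery model can be written down as a transition table. In the sketch below only three numbers come from the text (Design to Development 70%, Design staying in Design 30%, Development back to Design 40%). The Development to Release and Release to Design values are placeholders I have assumed so that each row sums to 1, and the helper names are hypothetical.

```python
import random

# Transition probabilities for the simple delivery model.
TRANSITIONS = {
    "Design": {"Design": 0.3, "Development": 0.7},   # from the text
    "Development": {"Design": 0.4, "Release": 0.6},  # 0.4 from the text, 0.6 ASSUMED
    "Release": {"Design": 0.2, "Release": 0.8},      # both values ASSUMED
}


def validate(transitions):
    """Every row of a Markov Chain must be a probability distribution."""
    for state, row in transitions.items():
        assert abs(sum(row.values()) - 1.0) < 1e-9, f"{state} does not sum to 1"
        assert all(target in transitions for target in row), f"{state} has unknown target"


def simulate_item(rng, start="Design", target="Release", max_steps=1000):
    """Follow one work item from `start` until it first reaches `target`."""
    path = [start]
    while path[-1] != target and len(path) <= max_steps:
        options = TRANSITIONS[path[-1]]
        next_state = rng.choices(list(options), weights=list(options.values()))[0]
        path.append(next_state)
    return path


validate(TRANSITIONS)
path = simulate_item(random.Random(42))  # seeded so the walk is repeatable
print(" -> ".join(path))
```

Note that there is no Design to Release entry, which mirrors the decision not to allow a flow that bypasses Development.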

With a Markov Chain the probabilities provide a weighting which allows a simple calculation of the efficiency of the model. This makes it easy to quantify what percentage of design goes into release in the fewest possible steps. The same can apply to an ML software delivery model with additional states such as Machine Learning Operations and Data Science.
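One way to put numbers on that efficiency, sketched here under my own assumptions, is to compute two things: the share of designs that reach Release by the shortest route (Design, Development, Release), and the expected number of transitions before a design first reaches Release. The Development to Release and Release to Design probabilities are assumed placeholders, and both measures are my reading of "efficiency" rather than a standard definition.

```python
# Same simple delivery model; values marked ASSUMED are illustrative only.
TRANSITIONS = {
    "Design": {"Design": 0.3, "Development": 0.7},
    "Development": {"Design": 0.4, "Release": 0.6},  # 0.6 is ASSUMED
    "Release": {"Design": 0.2, "Release": 0.8},      # both values ASSUMED
}

# Share of designs that take the shortest route: Design -> Development -> Release.
shortest_path_share = (
    TRANSITIONS["Design"]["Development"] * TRANSITIONS["Development"]["Release"]
)


def expected_steps_to(target, transitions, sweeps=200):
    """Expected transitions before first reaching `target` from each state.

    Repeatedly applies E[s] = 1 + sum(p * E[next]) with E[target] fixed at 0.
    This converges here because every state can eventually reach the target.
    """
    steps = {state: 0.0 for state in transitions}
    for _ in range(sweeps):
        steps = {
            state: 0.0 if state == target else 1.0 + sum(
                probability * steps[nxt]
                for nxt, probability in transitions[state].items()
            )
            for state in transitions
        }
    return steps


steps = expected_steps_to("Release", TRANSITIONS)
print(f"Shortest-route share: {shortest_path_share:.0%}")
print(f"Expected steps from Design to Release: {steps['Design']:.2f}")
```

With these placeholder numbers roughly four in ten designs go straight through, and a design takes about four transitions on average to reach Release. Adding states such as ML Ops or Data Science only means adding rows to the table.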

A Machine Learning Software Development Lifecycle represented as a Markov Chain

Machine Learning Operations, including the management of training data and federated learning, is a discipline with its own best practices and career progressions. It requires specific skills, including DevOps environment management to keep training and production separate, alongside others such as Information Governance processes for training data. Solutions may have ML Ops as their final state before Release, especially in early product releases. ML Ops may also take responsibility for security, and Security by Design is important when working with any form of personal data. Privacy-enhancing technologies (PETs) and Zero-Trust Frameworks are useful here, because algorithms can fall victim to model inversion attacks. When working with genetic data (for example from GWAS) and/or Polygenic Risk Scores, it is important to build a secure platform which encrypts personal data and uses PETs to prevent further personal data being deduced from the training data and the algorithm.

The Data Science capability can sit within any part of the framework, from architecture design, product prototyping and data wrangling through to software development. Personally, I have always tried to be both a Data Scientist and a Software Developer throughout my career, and I always try to cross-train development skills with data skills in my teams. Too frequently organisations silo their engineers. As a CTO I always recommend breaking down artificial barriers between Development and Data. It might seem obvious, but a successful ML Product Manager needs to be good at envisioning the problem and how it can be solved by Machine Learning. The Product Manager must be able to translate the business problem into one that Machine Learning can solve. This requires trust with the Data Engineering and Research Science capabilities.

The Design phase of any ML implementation requires a strong and flexible architecture. The same concepts of componentisation, architectural separation and APIs apply equally to an ML service. Persistence is important for more complex machine learning solutions, and wrangling data into the appropriate structure in advance will always be advantageous in a closed-loop system. When dealing with petabytes of genomic data, for example, an appropriate columnar data structure with metadata stored in a graph or hashmap structure can improve the speed of machine learning.
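To make the columnar-plus-hashmap idea concrete, here is a toy in-memory sketch in Python. It is nowhere near petabyte scale and is not a real genomics format. The class name, field names and variant identifiers are all hypothetical. It only shows the shape: each field lives in its own contiguous array, and a dictionary maps an identifier to its row.

```python
from array import array


class VariantTable:
    """Toy column store: one contiguous array per field plus a hashmap index."""

    def __init__(self):
        self.position = array("q")     # base-pair positions (signed 64-bit ints)
        self.allele_freq = array("d")  # allele frequencies (doubles)
        self.row_of = {}               # variant id -> row number (the metadata)

    def append(self, variant_id, position, allele_freq):
        self.row_of[variant_id] = len(self.position)
        self.position.append(position)
        self.allele_freq.append(allele_freq)

    def lookup(self, variant_id):
        """Constant-time access to one variant via the hashmap."""
        row = self.row_of[variant_id]
        return self.position[row], self.allele_freq[row]

    def mean_allele_freq(self):
        """A column scan touches only the one array it needs."""
        return sum(self.allele_freq) / len(self.allele_freq)


table = VariantTable()
table.append("variant_a", 1000, 0.25)  # made-up values for illustration
table.append("variant_b", 2000, 0.75)
print(table.lookup("variant_b"))
print(table.mean_allele_freq())
```

The design choice is that feature extraction for training usually reads whole columns, while lookups by identifier go through the index, so neither path has to walk row-shaped records.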

Lastly, the philosophies of Behaviour-Driven Development (BDD) and “Given, When, Then” testing still apply to Machine Learning services, though it becomes even more incumbent upon the ML Product Manager to work on the problem mapping together with the architects, figuring out the present problems and how many of them can be solved using machine learning. You can’t solve all your problems and issues using machine learning, so a machine learning product manager must be able to distinguish those that can be. I personally recommend starting with robust and broad acceptance criteria when training a supervised learning model, such as a decision tree, and then finessing the test cases and the model together with the data scientists. With genomic datasets and Polygenic Risk Scores the test is a correlation for genetic mappings between SNPs, so existing test and peer review frameworks then come into play.
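As a rough illustration of “Given, When, Then” applied to a supervised model, the sketch below trains a one-split decision tree (a decision stump) written by hand in plain Python and checks it against a deliberately broad acceptance criterion. The data, the 80% threshold and the function names are all made up for the example. In practice you would swap in your real model and agree the criteria with the data scientists.

```python
def fit_stump(samples):
    """Pick the single threshold that classifies the most training samples correctly."""
    best_threshold, best_correct = None, -1
    for threshold, _ in samples:
        correct = sum((value >= threshold) == label for value, label in samples)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def predict(threshold, value):
    """A decision stump: one split, two leaves."""
    return value >= threshold


def check_stump_acceptance(minimum_accuracy=0.8):
    # Given: labelled training data and a held-out set (both invented)
    training = [(1.0, False), (2.0, False), (3.0, False),
                (6.0, True), (7.0, True), (8.0, True)]
    held_out = [(1.5, False), (2.5, False), (4.0, False),
                (6.5, True), (7.5, True)]

    # When: the model is trained and asked to score the held-out set
    threshold = fit_stump(training)
    hits = sum(predict(threshold, value) == label for value, label in held_out)
    accuracy = hits / len(held_out)

    # Then: it meets the broad acceptance criterion agreed up front
    assert accuracy >= minimum_accuracy, f"accuracy {accuracy:.2f} below target"
    return accuracy


accuracy = check_stump_acceptance()
print(f"Held-out accuracy: {accuracy:.0%}")
```

Starting with a loose threshold like this and tightening it as the model and test cases mature keeps the acceptance criteria honest without blocking early iterations.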