The Agile Architecture Guide (part 1: Product Development)

Introducing A New Organisation to Agile Product Development

As the new CTO, Chief Product Owner, Chief Architect, Lead Engineer or just a natural leader, it may be your responsibility to introduce an Agile way of working to a new Organisation. We are going to walk through the common steps to successfully introduce Agile methodology, and we are going to do it at the most suitable place to introduce Agile, which is the launch of a new Product.

I have worked in multiple companies, like BT, Bupa & Vodafone, who have tried multiple times to introduce Agile or one of its familial methodologies. These have varied in success and reach, but the thing that most concentrates an organisation is the launch of a new product. This can range from SIM-only propositions at Vodafone and 5G network releases at BT to dynamically priced PAYG treatments at Bupa. In each of these cases the product was a massively complex project that could not be delivered without breaking it down into its component parts. Also, no single individual could envisage all the necessary change required and what the end state would actually look like. These products lend themselves to Agile delivery.

I am quite loyal to Atlassian Products so I will be referencing Confluence and Jira liberally in this guide.

Starting with the Product Brief

There are many product brief frameworks and templates available on the internet. Sometimes it’s just easier to search by image to find what you’re looking for. However, I have included some templates I have used before as examples.

The core concept of the product brief is that it literally must be brief. It is the elevator pitch of product opportunities. It’s not a manifesto or unchangeable constitutional document that defines the product going forward but the gist of what the product may be. It can be rejected very early on so don’t invest too much effort in it. Though I always recommend keeping a personal record of all your ideas.

The following sections describe the use of PowerPoint and Atlassian Confluence for documenting Product Briefs. Each has its own benefits.

A Simple PowerPoint Product Brief Template

I generally avoid using static documentation tools as they represent a pre-internet way of thinking. Product Briefs, however, fit quite well in a single-slide PowerPoint template. The later Confluence example is more complex, but a single page can capture the concept of the product.

The Brief template should always first capture the Product or Service Vision. This should be a concise description of the innovation (often the technology) and the benefit which it will bring.

Further sections can include:

  • detail of the Target Group describing the target market segment (B2C, B2B, B2B2C etc) and who would be the users (often as personas)
  • detail on the Needs that this product meets by answering the question ‘what problem is solved by this product’
  • detail on the Product or Service and a description of how the product aligns with the Business Goals
  • a high-level competitor analysis is useful whilst remembering that the product can be an improvement of an existing product
  • any cost estimates at a high level such as potential Revenue and expected Cost Factors can be very useful at an early stage but a Product Brief is never expected to be fully costed as that activity can come out in a later articulation stage.
  • as I work in R&D some explanation of the Science behind the product helps explain the novelty and costs of the product
Example Product Brief Template

Atlassian Confluence Living Document Approach to a Product Brief

I personally use Atlassian Confluence for all my product design work. I maintain a folder structure in Confluence following The Open Group Architecture Forum’s (TOGAF) Business, Data, Application & Technology format to describe a complete enterprise architecture.

Product Briefs go under the Business Architecture folder, where I provide standard “Templates to be Completed” for all new Product Briefs. Because it is Confluence, users have to copy the template and then create a new page under the Product Briefs folder in the name of the product. Always remind your users

File structure providing Templates for a Product Brief

The organisations I work with are data centric and a lot of new products have an insights or machine learning capability. For this reason, I include sections in the template to capture the Data and algorithmic parts of the Product. It is beneficial though to keep the product description agnostic to the technology.

Example Product Brief Template Table

Product or Service

An important distinction in a SaaS environment is determining if the product is a single charge product or a recurring charge service. This does not have to be defined in the Product Brief stage but it is useful to get an idea of the nature of the product. A useful lesson I learned whilst working with R&D science start-ups is that the distinction between a product and a service is not clear cut. A science product, like a lab testing function, can expose a set of products, each with a shipped testing kit; imagine lateral flow testing kits. These products are crucial for controlling the rate of infection in a community from Covid-19. The data captured from the mass recording of lateral flow tests provides a set of insights which NHS Digital used to analyse the R-value transmission rate in the UK. Genetic sequencing provided by NHS labs was able to provide more accurate R-value rates for different Covid variants, and these insights were used to inform the UK Government of the need for lockdown periods. The insights from these -omics analyses provided crucial insight services and show that a data service can be built on top of a science product.

The What, Why & Who of a Product Vision

The vision of a product does not have to be some lofty ambitious epic of a transformational product. But it needs a definition of a What, Why & Who as early as possible. This is really important as otherwise you can rush into a wasted investment.

Personas are a good way of defining the interests of your users, and a simple bit of celebrity alliteration (Stormzy the Scientist, Elton the Ecologist) can be a useful way of using characters in your stories. It’s useful to add some further colour to your personas by defining some things that they like and dislike. So for Stormzy the Scientist we added that they did not like having to scan barcodes on every sample and liked single-click purchase solutions. For Elton we added that he did not like excessive packaging and preferred to order in bulk.

The Why of the Product is critical for understanding the benefit of the product. A recent example of poor understanding of the benefit of a product relates to an international hospital service provider in the UK. This hospital group made the decision to order one million Covid testing kits and four qPCR machines to provide a large testing capability for all doctors, nurses and visiting patients. This procurement decision was made without understanding the digital process for testing. When the solution was launched, emails went out to internal staff, who all arrived at the testing point at the same time, causing a large queue and a potential mass spreading event. They had to go back quickly to the design process to arrange an end-to-end registration, booking, sampling and results process to ensure that incoming patients could be properly tested. This design process took a month out of hospital operations during the early stage of the pandemic.

The What of the Product is an articulation of the deliverables and operations of the product. Examining and testing this early will help identify gaps in the product. Gaps are to be expected at this early stage of investigation. Modelling the end-to-end process in a series of workshops will help fill in these gaps. Simple swim-lane process diagrams in Confluence can help articulate the end-to-end processes that are necessary for linking together stories in Jira at a later stage.

Gating the Product Brief Phase

The product design process is a continual activity and new concepts may arise from all layers of the organisation at any time. The product design process should not be the remit of a select few members of your organisation. Imagination should not be restricted to a strategy department, neither should anything else for that matter.

A gating process is required for Product Briefs, where they are reviewed and handled as they are submitted. The gate then decides whether to progress the brief to the next stage or to reject it early on. The whole process needs to be fast and transparent so that submitters get a clear response as soon as possible. The submitter should be invited to the submissions process, as otherwise the whole process can seem secret and bureaucratic.

In an agile methodology the aim is to determine success in as few iterations as possible in order to come to the appropriate conclusion. The aim of the product brief gating phase is to select those product briefs with the best hope of success that can be progressed to the subsequent articulation phase. The overhead of the articulation phase is that available resources are provided to support defining the next level of detail.

Articulating the Benefit

In Confluence I provide an articulation template for the next set of detail required. This provides the source of the first set of stories by highlighting and clicking text in Confluence to create Stories under the Product name Epic.

Many organisations skip a formal articulation phase and go straight to Story capture. There is nothing wrong with jumping this stage. My personal preference from working in scientific organisations is that an articulation stage is required to explain the science to the business and the business to the science. This also helps make Confluence more of a document store rather than maintaining assets in Office products. This increasingly becomes useful when Atlassian is your service desk and Confluence becomes your knowledge base for help issues.

The articulation template is more of an architectural high-level design in that it requests details around the Technology, Science (if you work for a lab science business like me), People, Operations, Data & Machine Learning requirements (grouped under Insights) and Finance. Diagrams including wireframes and flows are also useful at this stage, so include any links to diagramming tools like Miro or Lucidchart.

A reiteration of the concept, like that in the Product Brief, may seem repetitive at this stage and if the concept has not changed then a link to the Brief can just be provided. Some product concepts may have evolved, and this therefore is a good opportunity to capture that change. Also any further details will help with the articulation of stories which can come from this document.

Articulation Template for Capturing More Detail

Describing Data & Machine Learning Requirements

Agile Machine Learning is a bit of a contradiction, as design, training, testing and launch fit a more traditional waterfall approach. Product Management as a discipline sits at the intersection of business need, user experience and technology. A consistent Product Management strategy is necessary for delivering a viable and sustainable product. When the Product Management strategy deviates with every Product Manager hire then the focus and investment can become confused, and you end up a bit like Manchester United. With Machine Learning the requirement for a multi-disciplinary team becomes greater and necessitates ML/Ops, Data Science and hybrid development skills. Again, like Manchester United, the hiring of ageing ‘superstars’ is never a good strategy. To be a successful Product Manager with Machine Learning requires flexibility and faith in an iterative process.

I have written on the modelling of Machine Learning operations as a Markov Chain here, as the software delivery model for Machine Learning has a greater number of state transition points than an agile digital delivery.

The epic and stories can frame the Machine Learning problem. The epic must explain the user-centric problem that the Machine Learning solution is trying to solve. I have worked with 5G radio mast planning designs whilst at BT / EE in the UK. The first 5G sites were costing nearly £500k and had to provide considerable quality of service in dense urban environments. Mobile network planning uses reinforcement learning techniques for training and predicting the best deployment model of multiple mobile masts. This is a very human and compute resource intensive process, so any optimisation offers considerable benefit.

Machine Learning algorithm selection, which in our case included Artificial Bee Colony algorithms, was the output of a story testing and selecting the most appropriate algorithms during PoC stages. The selection of algorithms was based on comparisons with in-field tests and previous 4G model comparisons. All of this test data was then fed back into the learning environment.

The nuance from an Agile point of view is that the time taken to attain an optimal machine learning model cannot be easily predicted and that certain key stories such as model selection and training will run across multiple sprints. Sub-tasks are a good way of documenting the activities for a Machine Learning epic.

One last point to note in any Machine Learning delivery is that research scientists are generally unfamiliar with project management or Agile. In a research institute the time to discovery does not have a regular cadence. But in a commercial organisation a regular, manageable approach is required which can bite chunks out of the greater whole. For this reason, AWS and Azure offer improved visualisation tools for their machine learning capabilities, as these lift the point of science away from the necessary infrastructure. If you can separate your ML Epics into those that are infrastructure and data based and those that are training and proof based, then you will be able to achieve success quicker.

Describing Operational Requirements

Products and services require operational support to deal with imperfections and to keep the customers happy. Don’t launch a product without an operational service wrap but also make sure you start capturing operational requirements at the Product Brief stage, because if you can’t support it then you can’t sell it. Capturing these requirements at an early stage is quite complex if you do not have an existing service wrap. If that is the case then simply document how customer issues will be captured, triaged and supported.

The Operations section of the template asks how the product will be supported. In a lab operations environment, the operational support model includes the processes implemented in the ERP and LIMS (Lab Information Mgmt System). These systems should have their own Standard Operating Procedures. So this section should not be a new domain


Introducing an Agile architecture to an organisation is actually very exciting. One of the most satisfying feelings I’ve ever had at work, when working with floating-brain-in-a-vat scientists, is the moment they realise the benefits of Agile for working on a complex problem. There’s an enjoyment, no matter your background, in drawing UX designs and articulating simple stories. A good buddying system can work well. As an example, I have paired an ecology university lecturer with a UX designer to define a geospatial planning application, and they paper prototyped a highly intuitive solution. Scientists are very competitive for discovery, so adding a quantified competitive element like number of story points designed drives the initial cadence and avoids inertia.

AI Product Management as a Markov Chain

Product Management as a discipline sits at the intersection of business need, user experience and technology. A consistent Product Management strategy is necessary for delivering a viable and sustainable product. When the Product Management strategy deviates with every Product Manager hire then the focus and investment can become confused, and you end up a bit like Manchester United. With Machine Learning the requirement for a multi-disciplinary team becomes greater and necessitates ML/Ops, Data Science and hybrid development skills. Again, like Manchester United, the hiring of ageing ‘superstars’ is never a good strategy. To be a successful Product Manager with Machine Learning requires flexibility and faith in an iterative process. The following is part of my learnings on developing ML products as a CTO, with specific focus on Markov Chains, Decision Trees, and Genomics.

A Markov Chain is a mathematical system that experiences transitions from one state to another according to certain probabilistic rules. Markov Chains are regularly used in Algorithms for Reinforcement Learning, specifically within Markov Decision Processes, and appear alongside Neural Networks and Supervised Learning. They also make a very good analogy / structure for the different responsibilities of a ‘Product Manager’ working on a traditional non-AI based delivery model versus a Machine Learning Data Centric Product. This is because, like a UML State Diagram, each node represents a position and each flow has a weighted percentage for travelling to another state, and the probability of transitioning to another state is dependent solely on the current state and time elapsed. A cup competition is a good example of a Markov Chain: a team may have an even probability of moving to the next round (state) or exiting the competition, and each round is not affected by the previous rounds.
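
As a minimal sketch of the cup competition example, the Python below treats a team as being in one of two states, "in" or "out", and pushes a probability distribution through the chain one round at a time. The state names, the helper functions and the even 50% odds are illustrative assumptions, not figures from a real competition.

```python
# Hypothetical two-state chain for a knockout cup: a team is either still
# "in" the competition or "out" of it. "out" is absorbing (no way back).
TRANSITIONS = {
    "in": {"in": 0.5, "out": 0.5},  # assumed even chance of winning each round
    "out": {"out": 1.0},
}


def step(distribution, transitions):
    """Advance a {state: probability} distribution by one round."""
    nxt = {state: 0.0 for state in transitions}
    for state, prob in distribution.items():
        for target, weight in transitions[state].items():
            nxt[target] += prob * weight
    return nxt


def survival_probability(rounds, transitions=TRANSITIONS):
    """Probability that a team is still in the cup after `rounds` rounds."""
    distribution = {"in": 1.0, "out": 0.0}
    for _ in range(rounds):
        # Only the current distribution is used: earlier rounds are forgotten.
        distribution = step(distribution, transitions)
    return distribution["in"]


if __name__ == "__main__":
    for rounds in range(1, 5):
        print(rounds, survival_probability(rounds))
```

With even odds, the chance of surviving three rounds is 0.5 × 0.5 × 0.5, or 12.5%. Each call to `step` looks only at where the team is now, which is the property that makes this a Markov Chain.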

A Simple Software Delivery Model represented as a Markov Chain

The primary role of any Product Owner is to manage the entire product lifecycle, including both the internal engineer-facing and the external customer / market-facing sides. Note that the terminology is a bit confusing in ML, which is normally viewed as a service, but we will keep using the term ‘Product Owner’ rather than ‘Service Owner’, which always sounds a bit more like fixing stuff.

A traditional Product Owner lifecycle as a Markov Diagram would look something like the diagram, with states such as Design, Development and Release. Software delivery processes like Lean, Agile and Kanban all have states and transitions. In your personal delivery model you may choose to add more states, for example Acceptance Test, and between each of these states you would define transitions.

In the example of a non-ML software development model, designs would be released to development as stories 70% of the time, but complex design items would stay in design 30% of the time, and stories not understood by the developers would return to design 40% of the time. The other transitions are shown by the arrows in the diagram, each weighted by a percentage. In the example I have not included a direct flow from Design to Release without going through Development. I have included a link between Release and Design, as some items might not meet the customer requirements on Release and so could go straight back to Design rather than going back to Development.

With a Markov Chain the probabilities provide a weighting which allows a simple calculation of the efficiency of the model. This makes it easy to quantify what percentage of design goes into release with the fewest possible steps. The same can apply to a ML software delivery model with additional states such as Machine Learning Operations and Data Science.
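
The efficiency calculation can be sketched in a few lines of Python. The Design and Development percentages below come from the example in the text. The Development to Release, Release to Design and Release to Release weights are assumptions added purely so that each row sums to 1, and the function names are hypothetical.

```python
# Transition weights for the simple delivery model.
# From the text: Design -> Development 0.7, Design -> Design 0.3,
# Development -> Design 0.4.
# ASSUMED for illustration only: Development -> Release 0.6,
# Release -> Design 0.1, Release -> Release 0.9.
TRANSITIONS = {
    "Design": {"Design": 0.3, "Development": 0.7},
    "Development": {"Design": 0.4, "Release": 0.6},
    "Release": {"Design": 0.1, "Release": 0.9},
}


def validate(transitions):
    """Each state's outgoing probabilities must sum to 1."""
    for state, row in transitions.items():
        if abs(sum(row.values()) - 1.0) > 1e-9:
            raise ValueError(f"Probabilities out of {state} do not sum to 1")


def path_probability(transitions, path):
    """Probability of following `path` exactly, one transition per step."""
    prob = 1.0
    for current, nxt in zip(path, path[1:]):
        prob *= transitions[current].get(nxt, 0.0)
    return prob


def expected_steps_to(transitions, target, sweeps=1000):
    """Expected number of transitions before first reaching `target`.

    Solves t[s] = 1 + sum(p(s, n) * t[n]) for every non-target state by
    repeated substitution, with t[target] held at 0.
    """
    steps = {state: 0.0 for state in transitions}
    for _ in range(sweeps):
        for state, row in transitions.items():
            if state != target:
                steps[state] = 1.0 + sum(p * steps[n] for n, p in row.items())
    return steps


if __name__ == "__main__":
    validate(TRANSITIONS)
    shortest = ["Design", "Development", "Release"]
    print(round(path_probability(TRANSITIONS, shortest), 4))
    print(expected_steps_to(TRANSITIONS, "Release"))
```

Under these assumed weights, 42% of design items reach Release by the shortest route (Design, Development, Release), and an item starting in Design takes about four transitions on average before it is first released. Changing any weight, or adding a state such as Acceptance Test, only means editing the `TRANSITIONS` table.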

Some learnings in AI Product Management and Markov Chains
A Machine Learning Software Development Lifecycle represented as a Markov Chain

Machine Learning operations, including the management of training data and federated learning, is a discipline with its own best practices and career progressions. Machine Learning Operations requires specific skills, including DevOps environment management for training and production separation, alongside other skills such as Information Governance processes for training data. Solutions may have ML Ops as their final state before Release, especially in early product releases. ML Ops may also take responsibility for security, and Security by Design is important when working with any form of personal data. Privacy-enhancing technologies (PETs) and Zero-Trust Frameworks are useful here when synchronous algorithms can be victim to model inversion attacks. When working with Genetic Data (GWAS) and/or Polygenic Risk Scores it is important to build a secure platform which encrypts personal data and uses PETs to avoid deducing further personal data from training data and the algorithm.

The Data Science capability can sit within any part of the framework, from architecture design, product prototyping and data wrangling through to software development. Personally, I have always tried to be both a Data Scientist and a Software Developer throughout my career, and I always try to cross-convert development skills with data skills in my teams. Too frequently organisations will silo their engineers. As a CTO I always recommend breaking down artificial barriers between Development and Data. It might seem obvious, but a successful ML Product Manager needs to be good at envisioning the problem and how the ‘Problem’ can be solved by Machine Learning. The Product Manager must be able to translate the business problem into one that can be solved by Machine Learning. This requires trust with the Data Engineering and Research Science capabilities.

The Design phase of any ML implementation requires a strong and flexible architecture. The same concepts of componentisation, architectural separation and APIs apply equally to a ML service. Persistence is important for more complex machine learning solutions, and wrangling data into the appropriate structure in advance will always be advantageous in closed-loop systems. When dealing with petabytes of genomic data, for example, an appropriate columnar data structure with metadata stored in a graph or hashmap structure can improve the speed of machine learning.
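
To illustrate the columnar idea on a toy scale, the sketch below keeps each attribute of a set of variants in its own list and uses a dictionary (a hashmap) as the metadata index from an identifier to a row position. The class name, field names and sample values are all hypothetical, and a real petabyte-scale store would use a dedicated columnar format rather than Python lists.

```python
class VariantColumns:
    """Toy column-oriented store: one list per attribute plus a hashmap index."""

    def __init__(self):
        self.columns = {"chromosome": [], "position": [], "effect_size": []}
        self.row_of = {}  # hashmap: variant id -> row number

    def add(self, variant_id, chromosome, position, effect_size):
        """Append one variant, writing each field to its own column."""
        self.row_of[variant_id] = len(self.columns["position"])
        self.columns["chromosome"].append(chromosome)
        self.columns["position"].append(position)
        self.columns["effect_size"].append(effect_size)

    def lookup(self, variant_id):
        """Fetch one record via the hashmap index, without scanning columns."""
        row = self.row_of[variant_id]
        return {name: values[row] for name, values in self.columns.items()}

    def column(self, name):
        """Return a whole feature column without touching the other fields."""
        return self.columns[name]


if __name__ == "__main__":
    store = VariantColumns()
    # Made-up identifiers and values, for illustration only.
    store.add("snp_a", "1", 1000, 0.1)
    store.add("snp_b", "2", 2000, 0.3)
    print(store.lookup("snp_b"))
    print(store.column("effect_size"))
```

The design choice is that training code usually wants one feature across all rows, which `column` returns directly, while occasional point lookups go through the index instead of a scan.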

Lastly, the philosophies of Behaviour-Driven Development (BDD) testing and “Given, When, Then” testing still apply with Machine Learning services. It becomes even more incumbent upon the ML Product Manager to work on the problem mapping, together with the architects, so that they are figuring out the present problems and how many of them can be solved using machine learning. You can’t solve all your problems and issues using machine learning. Therefore, a machine learning product manager must be able to distinguish those problems that can be solved this way from those that cannot. I personally recommend starting with robust and broad acceptance criteria when training a supervised learning model, such as a decision tree, and then finessing the test cases and the model together with the data scientists. With Genomic datasets and Polygenic Risk Scores the test is a correlation for genetic mappings between SNPs, so existing test and peer review frameworks then come into play.
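
A “Given, When, Then” acceptance test for a supervised model could be sketched as below. The one-split decision stump stands in for a trained decision tree, and the holdout data, the 0.5 split point and the 0.8 acceptance level are invented for illustration. In practice the model would come from the data scientists and the criteria from the story.

```python
def decision_stump(feature_value, threshold=0.5):
    """Hypothetical stand-in for a trained decision tree with a single split."""
    return "positive" if feature_value >= threshold else "negative"


def accuracy(model, examples):
    """Fraction of (feature, expected_label) pairs the model gets right."""
    correct = sum(1 for feature, label in examples if model(feature) == label)
    return correct / len(examples)


def test_model_meets_acceptance_criteria():
    # Given a labelled holdout set the model has not seen (made-up values)
    holdout = [
        (0.9, "positive"),
        (0.7, "positive"),
        (0.4, "positive"),  # deliberately a case the stump gets wrong
        (0.2, "negative"),
        (0.1, "negative"),
    ]
    # When the model scores the holdout set
    score = accuracy(decision_stump, holdout)
    # Then it meets the broad acceptance level agreed for the story
    assert score >= 0.8, f"accuracy {score:.2f} is below the 0.8 acceptance level"


if __name__ == "__main__":
    test_model_meets_acceptance_criteria()
    print("acceptance criteria met")
```

Starting with a broad criterion like this and tightening it over later sprints mirrors the approach of finessing the test cases and the model together.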