Edge SDN as a Service


Not all micro-services can be stateless lambda functions. Some services must maintain state. A good example is the management of autonomous vehicle platooning functions across multiple radio network sites.

A challenge for this distributed statefulness is that if stateful micro-services run in specific containers, how does the SDN controller manage networking to an individual container? This requires attaching the SDN networking at the container rather than the host level, something that is possible with Amazon EC2 Container Service.
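
As a minimal sketch of what that looks like in practice (assuming boto3 and an existing ECS cluster, subnet and security group, all of which are placeholders here), ECS's awsvpc network mode gives each task its own elastic network interface, which is the per-container attachment point an SDN controller could then manage:

```python
# Sketch: register an ECS task definition whose tasks get their own ENI
# (awsvpc network mode), so networking is attached per container/task
# rather than per host. Cluster, image, subnet and security group IDs
# are assumptions for illustration.
import boto3

ecs = boto3.client("ecs", region_name="eu-west-1")

task_def = ecs.register_task_definition(
    family="stateful-platooning-service",        # hypothetical service name
    networkMode="awsvpc",                        # ENI per task, not per host
    requiresCompatibilities=["FARGATE"],
    cpu="512",
    memory="1024",
    containerDefinitions=[{
        "name": "platooning-state",
        "image": "example/platooning-state:latest",   # placeholder image
        "essential": True,
        "portMappings": [{"containerPort": 8080, "protocol": "tcp"}],
    }],
)

# Each running task then receives its own ENI in the chosen subnet,
# addressable and securable independently of the host.
ecs.run_task(
    cluster="edge-cluster",                      # hypothetical cluster
    launchType="FARGATE",
    taskDefinition=task_def["taskDefinition"]["taskDefinitionArn"],
    networkConfiguration={
        "awsvpcConfiguration": {
            "subnets": ["subnet-0123456789abcdef0"],
            "securityGroups": ["sg-0123456789abcdef0"],
            "assignPublicIp": "DISABLED",
        }
    },
)
```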

If Tier-1 telcos are serious about providing Network as a Service or Edge Compute as a Service then they must provide the join between data centre and network operator. To do this they can either be the edge landlord to Amazon, Google and Facebook, or, if they are truly ambitious, provide an SDN Edge themselves.

Charles Gibbons is talking about the Future of NFV / SDN at Digital Transformation World this week in Nice.

Cloud Migration of a Legacy IT Estate


There are many things to consider when migrating a legacy IT estate to the cloud. The first, though, must be the motivations and expected benefits. Many organisations have decades of developed software running on private infrastructure, and migration to the cloud is something they feel they ought to do.

Migrating an estate to the cloud incurs a significant cost hurdle, as new functions are required just to support the migration activities. Often the benefit is minimal because only limited efficiencies can be found from closing (or worse, partially closing) legacy applications and data centres.

What is needed is a target systems architecture aligned to business benefits and to the IT stacks that support vertical products.

The systems architecture should reflect the management of intermediary states between internal hosting and public cloud. Managing these intermediary estates can easily increase an organisation's run costs. For example, if Corporation A decides to migrate all of its channels' IT to a public cloud, it will need to build an integration from public to private infrastructure, lease connectivity between new and old sites, provide a security wrap and identity management function across internal and external clouds, and finally support the operations for managing these new systems.

The benefits required to justify all of these new cloud enablement functions will therefore need to be high. This does not mean it should never be done, but the business must address how benefits such as improved time to market will be substantively realised.

A TOGAF business architecture should be produced before migrating, as migration for the sake of hosting will only ever be a platform change. The balance has to be how much change your organisation can stomach in a single move. Remember that the SaaS services you are considering will probably be more configurable than your legacy estate, so don't fall for the myth that business change always has to be front-loaded.

A Reference Architecture for Cloud Operational Support Systems


Most telecoms operators have multiple stove-piped networks, each with a specific, historically associated OSS. All CSPs want the agility of web-scale firms and view OSS and cloud provision as complementary technologies. The challenge for CSPs is to move from legacy vertical pillars to a horizontal platform model. Trying to achieve this with a simple OSS refresh will be a mere shim. For CSPs to be revolutionary they must consider the viability of a Cloud OSS as a way of externalising the orchestration and management of their network resources.

Currently it’s quite easy to find major components of a SaaS BSS (for example Salesforce). However, it is much harder to find an equivalent within the OSS domain. The primary reason for the lack of SaaS in this domain is the nicheness of OSS (discussed previously in IoT Don’t Need No OSS). This nicheness is changing, as AWS, GCP and Azure now essentially offer an IoT OSS. There’s currently no ONAP SaaS, but I wouldn’t be surprised if ONAP matured into a SaaS offering at some point. The other major area of concern is security, which can be mitigated through policy and control. Lastly there are concerns around the throughput and latency of Resource Performance Management, a specific topic covered later.

There’s also increasing CSP interest in Open Source OSS (OSS2, maybe?) with Open Source Mano, ONAP and the new TM Forum ODA architecture (for which I’m partly responsible). These OSSs provide functions that are componentised in their design.

I’ve personally been looking at putting together a best-of-breed architecture based on OSM, ONAP and some Netflix OSS in a cloud-hosted environment to support multiple operational networks. In doing this work I’m trying to understand the following questions:

  1. What is a suitable logical architecture for a Cloud OSS?
  2. And if it can’t all be externally hosted then what would be a suitable logical hybrid architecture?

In order to answer these questions let’s decompose the functions of OSS and compare which parts are most suitable for being cloud hosted. Let’s break it down (using eTOM’s service and resource domains) into nine logical packages for further investigation.

Functions for Cloud OSS

I’ve categorised by Cloud Nativeness (how easy it is to port these functions to the cloud, and how many SaaS offerings are available) against Network Interactivity (be it throughput of data or proximity to element managers). It is fairly self-evident that certain functions are cloud native (service management) whilst others (order to activation) both require close deployment to the network and have specific security constraints.

By grouping the logical functions we end up with three groups: Cloud Native Solutions (those that already run well in the cloud), Not Cloud Native Solutions (those that can’t be externalised to the Cloud) and a middle group of Either / Or that could be either internally managed or externalised.

OSS functions plotted by Cloud Nativeness against Network Interactivity
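
Purely as an illustrative sketch of this grouping (the scores below are placeholder judgements, not measurements), the categorisation could be expressed as a simple scoring exercise:

```python
# Illustrative sketch only: bucket the nine OSS functional packages by a
# rough cloud-nativeness vs network-interactivity score. The scores are
# placeholder judgements for illustration, not measurements.
FUNCTIONS = {
    # name: (cloud_nativeness 0-10, network_interactivity 0-10)
    "Service & Incident Management":   (9, 2),
    "Network & Service Operations":    (8, 3),
    "Field Management":                (8, 1),
    "Resource Plan & Build":           (8, 2),
    "Strategy & Lifecycle":            (7, 2),
    "MANO for NFV / SDN":              (5, 6),
    "Machine Learning / Autonomics":   (5, 5),
    "Order to Activation":             (2, 9),
    "Resource Performance Management": (2, 9),
}

def bucket(cloud_nativeness: int, network_interactivity: int) -> str:
    """Assign a package to one of the three groups discussed above."""
    if cloud_nativeness >= 7 and network_interactivity <= 3:
        return "Cloud Native"
    if cloud_nativeness <= 3 or network_interactivity >= 8:
        return "Not Cloud Native"
    return "Either / Or"

for name, (cn, ni) in FUNCTIONS.items():
    print(f"{name:34s} -> {bucket(cn, ni)}")
```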

The Either / Or group is the newest area, covering Machine Learning, Autonomics and MANO for NFV / SDN. These could be either natively deployed (for example a local deployment of FlinkML on top of a performance management solution) or cloud hosted (e.g. a TensorFlow deployment on Google Cloud Platform).

Cloud Native Solutions: 

Service and Incident Management systems include the perennial favourites ServiceNow, BMC Remedy and Cherwell. These tools, as cloud-hosted solutions, require feeds from alarm management systems. As the architecture orients itself towards data streaming and machine learning, the incident management system handles fewer tickets and relies more on auto-remediation. This model necessitates the closed-loop remediation function sitting within the network. I would expect a streaming flow in and out of the network boundary, and this will obviously be the biggest of the pipes (and the most risky). Network & Service Operations provides a specialism for service and incident management and includes resource alarm management; solutions like EMC Smarts and IBM Netcool increasingly offer cloud-based operations consoles for alarm management.
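
As a minimal sketch of such a feed (the instance name, credentials and field mapping are assumptions), an alarm crossing the network boundary might raise an incident through the ServiceNow REST Table API:

```python
# Sketch: raise an incident in a cloud-hosted ServiceNow instance from an
# alarm record crossing the network boundary. Instance, credentials and
# field mapping are assumptions for illustration.
import requests

SNOW_INSTANCE = "https://example.service-now.com"   # hypothetical instance
AUTH = ("integration_user", "secret")               # placeholder credentials

def raise_incident(alarm: dict) -> str:
    """Create an incident from an alarm dict produced by alarm management."""
    payload = {
        "short_description": f"{alarm['resource']}: {alarm['summary']}",
        "urgency": "1" if alarm.get("severity") == "critical" else "3",
        "category": "network",
    }
    resp = requests.post(
        f"{SNOW_INSTANCE}/api/now/table/incident",
        auth=AUTH,
        headers={"Accept": "application/json", "Content-Type": "application/json"},
        json=payload,
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["result"]["number"]

print(raise_incident({"resource": "vRouter-12", "summary": "BGP session down",
                      "severity": "critical"}))
```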

Field Management systems, together with Resource Plan and Build, are easily managed from a public cloud. These systems have limited access to the operational network and normally have to manage internal and third-party resources to complete field operations. Systems like ESRI and Trimble fit in this space. They predominantly need access to resource and service inventories, and to resourcing tools (such as HR systems, maps and skills bases).

Strategy systems are an interesting case of specialist planning, delivery and product lifecycle tools within eTOM. They cover service development and retirement, capability delivery and strategic planning. These functions are all loosely coupled to the network: they require inventory detail, resource detail and a big data store of network performance, but they can be hosted externally and are not mission-critical systems. So for our OSS these should be Cloud Native.

Not Cloud Native Solutions:

Order to Activation covers the activation systems for management of the network, whether subscription-based or resource activation. The distinction here is between the provisioning controller and the intent-based network choreographer (passing intent and policy to the network).

Performance Management covers real-time operational systems predominantly taking data streams from the network. They require local deployment, as network functions predominantly require low latency if incidents are to be managed immediately.

The Interesting Either / Or Group:

MANO for NFV / SDN can either be a localised solution or can be cloud hosted in the case of a master orchestrator implementing intent-based models. This model makes sense when the orchestration involves third-party service orchestration, and it is partially covered by the TM Forum ODA. The challenge would be organising the split of VNF Management from NFV Orchestration; the security controls will need to close off the attack vector to the client VNF Manager running inside a CSP's network.
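
As a sketch of what an intent handed to a cloud-hosted master orchestrator might look like (the endpoint and payload shape are hypothetical, not taken from any standard), the point is that only intent crosses the boundary while VNF-level management stays inside the CSP:

```python
# Hypothetical sketch: a cloud-hosted master orchestrator receives an intent
# and leaves VNF-level management to the VNF Manager inside the CSP network.
# The endpoint and payload are illustrative, not a standard API.
import requests

INTENT = {
    "service": "enterprise-vpn",
    "intent": {
        "sites": ["london-01", "manchester-03"],
        "bandwidth_mbps": 500,
        "latency_ms_max": 20,
        "resilience": "dual-path",
    },
    # The orchestrator decomposes this into NFVO requests; the client-side
    # VNF Manager stays behind the CSP's security boundary.
}

resp = requests.post(
    "https://orchestrator.example.com/api/v1/intents",  # hypothetical endpoint
    json=INTENT,
    headers={"Authorization": "Bearer <token>"},         # placeholder token
    timeout=10,
)
resp.raise_for_status()
print(resp.json())
```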

It is likely that CSPs will investigate this model going forward as they look to benefit from the opportunity of providing Mobile Edge Compute as an integrated PaaS.

Machine Learning & Autonomic Remediation is partially dependent upon the NFVO cloud architecture, as services need to be exposed in order to implement remediation. If the NFVO is already cloud hosted then remediation is a natural continuation of its capabilities. The Machine Learning capability is a driver for the remediation engine, constantly looking for situational improvements for specific conditions. Machine Learning can be deployed locally on a CSP's own infrastructure or use the scaling capabilities of tools like TensorFlow on GCP. The decision CSPs make here will be about scaling the intelligence to produce usable conditions that can be implemented within the remediation engine. A CSP with good skills in this area will have a technology advantage.

Next Steps:

I will be updating this stream as I believe there is a genuine future for a Cloud Native OSS. So please keep following this blog and ping me @apicrazy if you’re on the same journey.


12 Reasons Why Cloud OSS hasn’t happened so far


I am regularly asked why there are so few Cloud OSS, or OSS as a Service, options when AWS, GCP and Azure all have IoT plays. I have also wondered why no systems integrator has deployed ONAP on AWS (or another public cloud). The following are the main reasons why I think such an option has not yet become popular with CSPs and vendors.

12 Reasons Why Cloud OSS hasn’t happened so far:

  1. Network operators are risk averse
    • That’s a very good thing, as CSPs protect your data in flight and at rest. Security is critical for CSPs. However, this does not mean that a Cloud OSS cannot be used, just that the appropriate security measures need to be in place.
  2. Network operators have customers that are even more risk averse
    • That’s a very good thing too, and CSPs have to take account of their customers’ requirements. However, a private cloud or a public cloud can be secured in the same way as a private data centre. The OSS must make sure that it is not persisting customer data or exposing network functions.
  3. Cloud OSS creates another attack vector and dude we’ve got enough of those
    • We sure do. But internally hosted OSS is itself a risk / attack vector. The benefit of Cloud OSS is that it should allow a simplification and reduction of the number of OSS stacks within the CSP.
  4. OSS must be internal because of Data regulation and on-shoring / safe-harbouring of data
    • OSS systems should not be persisting customer data (EVER, even static IP addresses!). So data regulation requirements will only have limited application. OSS data must be secured at rest and in transit, and the low-latency requirements of OSS will require hosting nearby.
  5. Few network operators have sufficient levels of virtualised network functions
    • This is changing rapidly and 5G technologies will be predominantly virtualised
  6. The cost of the OSS is always a low proportion of the costs of the network
    • This is true but does not stop the need to gain greater platform efficiencies.
  7. Moving to the cloud will not wipe away the legacy
    • Of course it won’t, but it will help focus on the future and pass management of VNFs to a single master. PNF management will always be a challenge.
  8. The OPEX model is not always beneficial
    • This is true but OSS stovepipes are not cheap. Best of breed SaaS will help spread the cost and not create a lock in to a single technology version.
  9. It’s the OSS, those guys don’t move quickly
    • A classic refrain but not a reason not to move to a Cloud OSS
  10. The streaming data pipe will be too fat and the latency will be too slow to fix items quickly
    • This is a genuine concern and will require a data pipeline architecture with streaming inside the network and OSS components residing outside. Intent-based programming, with specific levels of management at the different layers, will be key to answering the low-latency requirement, especially when control is part of a network slice management function.
  11. The BSS will never be in the Cloud
    • Salesforce, GCP, AWS, Pega, Oracle Cloud, Azure are all changing that model. Especially in the IoT space.
  12. The OSS will never be in the Cloud
    • Watch this space….


Bringing IT (OSS) all together


I try to fit components together logically so that they can make the most of what the technology offers. I work predominantly in the OSS world on new access technologies like 5G and implementations like the Internet of Things. I want to achieve not just the deployment of these capabilities but also to let them operate seamlessly. The following is my view of the opportunity of closed-loop remediation.

For closed-loop remediation there are two main tenets: 1. you can stream all network event data into a machine learning engine and apply an algorithm like K-Nearest Neighbours; 2. you can expose remediation APIs on your programmable network.

All of this requires a lot of technology convergence, but what's actually needed to make everything converge?

Closed-loop remediation architecture

Let’s start with streaming. Traditionally we used SNMP for event data, traps and alarms, and when that didn’t work we deployed physical network probes. Now it’s Kafka "stream once" implementations, where streams of logs from the virtualised infrastructure and virtualised functions are parsed through a data streaming architecture into different big data persistence stores.
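
A minimal consumer sketch of that streaming step (topic name, brokers and event shape are assumptions, using the kafka-python client):

```python
# Sketch: consume virtualised-function log events from a Kafka topic and
# parse them for downstream persistence. Topic, brokers and event fields
# are assumptions for illustration.
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "vnf-events",                                   # hypothetical topic
    bootstrap_servers=["broker-1:9092", "broker-2:9092"],
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    group_id="oss-streaming",
)

for message in consumer:
    event = message.value
    # Normalise into the fields the big data store expects
    record = {
        "timestamp": event.get("ts"),
        "source": event.get("vnf_id"),
        "metric": event.get("name"),
        "value": event.get("value"),
    }
    print(record)   # in practice: write to the big data persistence layer
```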

The Machine Learning engine (I’m keenest on FlinkML at the moment) works on the big data persistence, providing the largest possible corpus of event data. A K-NN model can analyse network behaviour and spot patterns that are harder for human operations teams to see. It can also predict timed usage behaviours and scale the network accordingly.
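
A toy illustration of the K-NN idea (scikit-learn stands in here for readability; the FlinkML equivalent would run inside a Flink job, and the feature vectors below are made up):

```python
# Sketch of the k-nearest-neighbour idea on summarised network features.
# Training data and thresholds are invented purely for illustration.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Each row: [cpu_util, packet_loss_pct, latency_ms] summarised per interval
X_train = np.array([
    [0.35, 0.0, 12.0],
    [0.40, 0.1, 14.0],
    [0.92, 2.5, 95.0],
    [0.88, 1.8, 80.0],
])
y_train = ["normal", "normal", "degraded", "degraded"]

model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)

# Classify the latest interval streamed from the network
latest = np.array([[0.90, 2.0, 88.0]])
print(model.predict(latest))   # e.g. ['degraded'] -> candidate for remediation
```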

I am increasingly looking at OpenStack and Open Source Mano as an NFVO platform orchestrating the available virtualised network functions. The NFVO can expose a customer-facing service or the underlying RFSs, but to truly operate, the ML should have access to the RFS layer. This is the hardest part and is dependent upon the underlying design pattern implementation of the Virtual Network Functions. That, though, is a topic for another blog post.
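
The final, acting step of the loop might look something like the sketch below, where the remediation engine calls a scale operation exposed by the NFVO at the RFS layer; the endpoint, payload and auth are hypothetical, as real northbound APIs differ by product and release:

```python
# Hypothetical sketch: once the ML flags a condition, the remediation engine
# calls a scale operation exposed by the NFVO. Endpoint, auth and payload are
# illustrative only.
import requests

def scale_out(rfs_id: str, steps: int = 1) -> None:
    """Ask the NFVO to scale out the resource-facing service by N steps."""
    resp = requests.post(
        f"https://nfvo.example.com/api/rfs/{rfs_id}/scale",   # hypothetical URL
        json={"direction": "out", "steps": steps},
        headers={"Authorization": "Bearer <token>"},          # placeholder token
        timeout=10,
    )
    resp.raise_for_status()

scale_out("vfw-cluster-7")
```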


5G, IaaS and Mobile Edge Computing


Mobile Edge Computing (MEC) is a key piece of the 5G architecture (or of 5G-type claims on a 4G RAN). MEC can already make a huge difference to latency and quality when streaming multiple video feeds within a sporting environment; for example, Intel, Nokia and China Mobile streamed video of the Grand Prix at the Shanghai International Circuit.

A 5G mobile operator will be introducing virtualised network functions as well as mobile edge computing infrastructure. This creates both opportunities and challenges. The opportunities are the major MEC use cases, including context-aware services, localised content and computation, low-latency services, in-building use cases and venue revenue uplift.

The challenges include providing the Mobile Edge Compute Platform in a virtualised 5G world. Mobile operators are not normally IaaS / PaaS providers so this may become a challenge.

The ETSI 2018 group report Deployment of Mobile Edge Computing in an NFV environment describes an architecture based on a virtualised Mobile Edge Platform and a Mobile Edge Platform Manager (MEPM-V). The Mobile Edge Platform runs on NFVI managed by a VIM. This in turn hosts the MEC applications.

ETSI MEC-in-NFV reference architecture (MEAO, MEPM-V, NFVO, VIM)

The ETSI architecture seems perfectly logical and reuses the NFVO and NFVI components familiar from all virtualisations. In this architecture the NFVO and MEPM-V act as what ETSI calls the Mobile Edge Application Orchestrator (MEAO) for managing MEC applications. The MEAO uses the NFVO for resource orchestration and for element manager orchestration.

The difficulty still lies in implementing the appropriate technologies to suit the MEC use cases. OpenStack (or others) may provide the NFVI and Open Source Mano (or others) may provide the NFVO; however, what doesn’t exist yet is the service exposure, image management and software promotion necessary for a company to on-board MEC applications.

If MEC does take off what is the likelihood that AWS, GCP and Azure will extend their footprint into the telecom operators edge?


Mobile Operators Guide to European Payment Services Directive


European Payment Services Directive 2

The European Payment Services Directive (PSD2) will be transposed into member state law by 2018 and will have a transformative effect on national and cross-border electronic payments. The Directive aims to increase the convenience and security of electronic payments. This is achieved by promoting payment innovation, for example through Open APIs, and by deregulating financial service roles. PSD2 will allow new payment service providers to enter the market; technology firms and mobile operators may be the greatest beneficiaries.

The Directive will transform the way users access their bank accounts during digital commerce. For example, the user may choose a mobile network operator’s payment mechanism as part of a contactless payment.

Opportunity for Mobile Operators

PSD2 mandates the use of robust authentication standards. Any technology provider with authentication and authorisation capabilities can take advantage of PSD2.

The advantage for Mobile Operators is their ability to support network authentication and service location functions. These functions are all particular to mobile networks, making operators a valuable partner in the development of new identity and authentication solutions.

  • 100.1 million contactless credit & debit cards in issue in the UK (Q1 2016)
  • £39.2 billion – UK domestic spending on debit cards (October 2016)
  • £2,903.2 million – UK contactless card spend (November 2016)
  • £249.9 million – payment card gross fraud in the six months to June 2016
  • 12 million Apple Pay monthly users globally (Q1 2016)
  • 71% – proportion of UK adults with a smartphone (Q1 2016)

Electronic Identification and Trust Services

Electronic Identification and Trust Services (eIDAS) regulation is a tenet of the EU’s Digital Single Market. Mobile Operators have already launched pilots of eIDAS-compliant cross-border authentication solutions for use with public sector services.

PSD2 together with eIDAS gives operators a unique opportunity to support identity for both the private and public sectors. This identity management capability will be critical to all Open APIs in any new PSD2 mobile banking platform.

Some likely use cases

The freedom to “delegate” bank account access is the first major shift that users will see. Under PSD2 an account holder will be able to allow a licensed Payment Initiation Service Provider (PISP) or Account Information Service Provider (AISP) access to their bank account for the purposes of initiating a payment or evaluating the user’s ability to pay.

Online commerce is likely to become simpler through such rules as it will allow all banked consumers to buy online using just their bank account, removing the reliance on debit or credit card ownership. This represents a leap forward for consumer and merchant alike, since direct bank transfers can typically clear in two hours or less with some services offering instant settlement.

Discounts for mobile cash?

For merchants wanting to ease cash flow this is a benefit and a service for which they may be willing to offer incentives. Direct bank transfers and instant settlement provide simplicity for the user and immediacy for the merchant, perhaps the modern equivalent of merchants offering “discounts for cash”.

The power to delegate bank account access is set to trigger major changes in the way digital commerce is conducted. The appearance of new innovative payment services that rely on the powers conveyed by PSD2 is highly likely; as is the anticipated reaction from traditional card schemes whose profitability may well be curtailed by PSD2’s cap on interchange fees and merchant surcharging. Either way the consumer will benefit.

Payment security

With increased openness come issues relating to security. To address these, PSD2 demands the use of strong authentication. The European Banking Authority (EBA) has been tasked with defining a standard that achieves this, and the first drafts are out for review now. From the application designer’s perspective, traditional authentication systems that employ one-time passwords (OTPs) or static personal identification numbers (PINs) may be deemed unfit for use within future digital commerce applications as the banks and other service providers latch on to the EBA’s regulatory technical standard.

The EBA is asking for two-factor authentication, where the user has to present two things, for instance a password and an access token, to prove their identity. Mobile phone based services like GSMA Mobile Connect will become more prevalent in the future digital market. Advances in smartphones will also increase the use of biometrics as an authentication factor.

Direct Carrier Billing

Not all the impacts of PSD2 will come from easier access to bank accounts or added security. PSD2 has tightened the rules on Direct Carrier Billing (DCB). Consumers accustomed to buying digital content via their mobile phone and charging it to their phone bill will see their options curtailed.

Under PSD2 single DCB transactions will be capped at a maximum of €50 per transaction, with a maximum monthly limit of €300. PSD2 continues to allow Electronic Money Institutions (EMIs) to extend the reach of DCB from digital content to the purchase of physical goods.
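
Those limits are simple to express; the sketch below is illustrative only, since the actual controls sit in operators' charging systems:

```python
# Sketch of the PSD2 Direct Carrier Billing limits described above:
# a single transaction capped at EUR 50 and a monthly total of EUR 300.
# Illustrative only; real enforcement lives in the operator's charging system.
SINGLE_TXN_CAP_EUR = 50.0
MONTHLY_CAP_EUR = 300.0

def dcb_allowed(amount_eur: float, spent_this_month_eur: float) -> bool:
    """Return True if a DCB purchase fits within the PSD2 caps."""
    if amount_eur > SINGLE_TXN_CAP_EUR:
        return False
    return spent_this_month_eur + amount_eur <= MONTHLY_CAP_EUR

print(dcb_allowed(9.99, 120.0))    # True
print(dcb_allowed(60.00, 0.0))     # False: over the per-transaction cap
print(dcb_allowed(45.00, 280.0))   # False: would exceed the monthly cap
```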

Mobile Operator Opportunities and Partnerships

Mobile Operators have a PSD2 advantage through service location functions and authentication. SIM & eSIM based authentication can be extended to provide security for customers and merchants by implementing Electronic Identification and Trust Services. With 5G, new network slices may be able to provide a Quantum Encryption Network Slice that would guarantee merchant to bank transactional security.

The greatest opportunity may be through partnerships. The GSMA Mobile Connect and mobile payments projects are likely avenues for greater partnerships. The advent of contactless payment cards in the late 2000s saw early attempts by UK mobile operators to act in partnership as a bank. The advantage of PSD2 is that it removes the requirement for mobile operators to become banks as they can instead focus on interactions with payment processing companies.

Finally, any potential European Commission anti-trust regulation on mobile device payment solutions could further open the market for mobile operators (or mobile industry bodies) to provide payment solutions. Such a change in regulation may also allow the handset vendor to offer their services as part of the initial contract sale.

Identity for the Internet of People

The Internet of Things, as distinct from the internet of people, requires communication between devices to enable tracking, monitoring, metering and so on. This intercommunication is dependent upon semantically structured and shared data to enable functions such as identification, authentication, authorisation, bootstrapping and provisioning. Standardising both the semantically structured data and the enabling functions across M2M applications and devices would reduce the cost and extend the life of M2M devices. Standardisation for the Internet of Things is the aim of a common service layer for M2M.

The oneM2M group aims to develop technical specifications that address the need for a common M2M Service Layer that can be readily embedded within various hardware and software, and relied upon to connect the myriad of devices in the field with M2M application servers worldwide.   The common M2M Service Layer should be agnostic to underlying network technology (yet leveraging the unique features of these underlying networks), and it will use network layer services (such as security (encryption and authentication), QoS, policy, provisioning, etc.) through an adaptation layer/APIs.

In order for an embedded common M2M service layer to operate, it must support AAA (authentication, authorisation and accounting) for smart devices in a way that is agreeable to multiple device manufacturers and network operators. The Telecommunications Industry Association (http://www.tiaonline.org) is defining a functional standard for Authentication, Authorization and Accounting for Smart Devices (AAA-SD). The functions proposed by the common M2M service layer include Policy & Resource Management:

  • Authentication and Registration (Identity Management)
  • Establish communications session (Add/Delete/Modify)
  • QoS/SLA for communication session
  • Billing, Charging, and Rating rules
  • Group Management
  • Security Management (Data confidentiality, integrity, abuse prevention, privacy)

TIA TR-50, Functional Architecture for M2M Smart Device Communication System Architecture, describes AAA-SD as providing "authentication, authorization and accounting services to other entities in the network to establish and enforce security policies. The services may include generation of keys, generation and validation of certificates, validation of signatures, etc."
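
As an illustrative sketch only (not the TIA-defined protocol), this is the flavour of signature validation an AAA-SD function performs when a smart device reports in, with a provisioned per-device key standing in for full certificate handling:

```python
# Illustrative sketch only: validate that a smart-device report was signed
# with the device's provisioned key. Device IDs and keys are hypothetical;
# a real AAA-SD would also cover certificates, key generation and accounting.
import hashlib
import hmac

DEVICE_KEYS = {"meter-0042": b"per-device-provisioned-secret"}   # hypothetical store

def validate_report(device_id: str, payload: bytes, signature_hex: str) -> bool:
    """Authenticate a device report against its provisioned shared key."""
    key = DEVICE_KEYS.get(device_id)
    if key is None:
        return False                      # unknown device: fail authentication
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)

payload = b'{"reading_kwh": 18.4}'
sig = hmac.new(DEVICE_KEYS["meter-0042"], payload, hashlib.sha256).hexdigest()
print(validate_report("meter-0042", payload, sig))   # True
```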

JSON Web Token (JWT) and JSON Object Signing and Encryption (JOSE)

This blog is part of a series comparing the implementation of identity management patterns in SAML and OpenID Connect.

JSON Web Token (JWT)

OpenID Connect uses the JSON Web Token (JWT) format.

The OpenID Connect protocol [OpenID.Core] is a simple, REST/JSON-based identity federation protocol layered on OAuth 2.0. It uses the JWT and JOSE formats both to represent security tokens and to provide security for other protocol messages (performing signing and, optionally, encryption). OpenID Connect negotiates the algorithms to be used and distributes information about the keys to be used using protocol elements that are not part of the JWT and JOSE header parameters. The principal ID Token claims are listed below, with a validation sketch after the list:

  • iss REQUIRED. Issuer Identifier for the Issuer of the response
  • sub REQUIRED. Subject Identifier
  • aud REQUIRED. Audience(s) that this ID Token is intended for. It MUST contain the OAuth 2.0 client_id of the Relying Party as an audience value
  • exp REQUIRED. Expiration time on or after which the ID Token MUST NOT be accepted for processing
  • iat REQUIRED. Time at which the JWT was issued
  • auth_time Time when the End-User authentication occurred
  • nonce String value used to associate a Client session with an ID Token, and to mitigate replay attacks
  • acr OPTIONAL. Authentication Context Class Reference. String specifying an Authentication Context Class Reference value that identifies the Authentication Context Class that the authentication performed satisfied
  • amr OPTIONAL. Authentication Methods References. JSON array of strings that are identifiers for authentication methods used in the authentication. For instance, values might indicate that both password and OTP authentication methods were used
  • azp OPTIONAL. Authorized party – the party to which the ID Token was issued. If present, it MUST contain the OAuth 2.0 Client ID of this party
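
A minimal validation sketch with the PyJWT library (HS256 and a shared secret keep it self-contained; an OpenID Provider would normally sign with RS256 and publish its keys, and the issuer, audience and secret here are placeholders):

```python
# Sketch: mint and then validate an ID-Token-like JWT with PyJWT.
# HS256 and a shared secret are used purely so the example is self-contained.
import time
import jwt   # PyJWT

SECRET = "shared-secret-for-illustration"     # placeholder key material
ISSUER = "https://op.example.com"             # hypothetical OpenID Provider
CLIENT_ID = "my-relying-party"                # hypothetical client_id

id_token = jwt.encode(
    {
        "iss": ISSUER,
        "sub": "248289761001",
        "aud": CLIENT_ID,
        "exp": int(time.time()) + 300,
        "iat": int(time.time()),
        "nonce": "n-0S6_WzA2Mj",
        "amr": ["pwd", "otp"],
    },
    SECRET,
    algorithm="HS256",
)

# The Relying Party validates signature, expiry, issuer and audience in one call
claims = jwt.decode(
    id_token,
    SECRET,
    algorithms=["HS256"],
    audience=CLIENT_ID,
    issuer=ISSUER,
)
print(claims["sub"], claims["amr"])
```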

JSON Object Signing and Encryption (JOSE)

From the JSON Object Signing and Encryption (JOSE) specifications:

In the OpenID Connect context, it is possible for the recipient of a JWT to accept it without integrity protection in the JWT itself. In such cases, the recipient chooses to rely on transport security rather than object security. For example, if the payload is delivered over a TLS-protected channel, the recipient may regard the protections provided by TLS as sufficient, so JOSE protection would not be required.

However, even in this case, it is desirable to associate some metadata with the JWT payload (claim set), such as the content type, or other application-specific metadata. In a signed or encrypted object, these metadata values could be carried in a header with other metadata required for signing or encryption. It would thus simplify the design of OpenID Connect if there could be a JOSE object format that does not apply cryptographic protections to its payload, but allows a header to be attached to the payload in the same way as a signed or encrypted object.