Internet-Draft | Securing hybrid network criteria and req | November 2024
OIWA | Expires 8 May 2025
This document analyzes requirements for ensuring and monitoring the security status of networks used in complex network environments such as hybrid cloud or mixed cloud settings.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 8 May 2025.¶
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
Recently, virtualized resources such as cloud computing infrastructure have been rapidly replacing traditional network and computing environments such as local servers or on-premise computer clusters. In this kind of infrastructure, information about physical resources such as servers, local networks, and network routers is hidden from users in exchange for flexibility, service redundancy, and lower costs. Cryptographic protocols such as TLS and IPsec are typically used to protect communication into and out of such systems from eavesdropping and tampering.¶
However, there are many use cases where a service still depends on the security properties of the underlying physical resources, and where encrypting the communication alone is not sufficient:¶
Traffic analysis on encrypted communication may reveal partial information about the payload;¶
Legal requirements (such as personal data protection) demand that specific properties (such as governing laws, geographical locations, and operators) be checked;¶
Denial-of-service and several other attacks may not be preventable by encryption alone.¶
For such high-security applications, we need a technical infrastructure for continuously checking the properties and statuses of the underlying network and intermediate nodes. In non-virtualized, self-managed settings, there are several existing technologies (e.g. NETCONF, path validation, etc.) for acquiring such statuses. However, these are not sufficient for the virtualized, multi-stakeholder setting of modern cloud infrastructure.¶
This document gives a first-stage problem analysis for ensuring and monitoring the security status of networks used in complex network environments such as hybrid cloud or mixed cloud settings. It also proposes a brief, straw-man view of an enabling architecture for possible monitoring systems.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
Concepts of multi-cloud and hybrid clouds are defined in ISO/IEC 5140:2024; in short, multi-cloud is a system where a single service is implemented using two or more independently operated cloud services. A hybrid cloud system comprises two or more computation environments that differ in nature of operation, security level, or other aspects, at least one of which is typically a public cloud service. Often, subsystems on privately operated clouds, on-premise, or edge networks are connected with public cloud infrastructure by a network to construct a single hybrid cloud system.¶
Hybrid cloud systems are, in general, constructed when the security or other provisions of public cloud systems are not sufficient for a part of the information or for a subsystem component (if they were, a simple public or multi-cloud environment would suffice). At the same time, there is often a requirement for some of the benefits of public cloud systems, such as scalability, costs, resilience, and maintainability (if there were not, a simple on-premise deployment would be enough). These mixed, seemingly conflicting requirements make it difficult to ensure and monitor the security of hybrid cloud systems.¶
Multi-cloud and hybrid cloud systems require system-internal communications that flow beyond the boundary of a single cloud system. In the simplest case, this can be implemented using authenticated TLS or HTTPS communications over public Internet infrastructure. For high-security systems, it is often implemented using dedicated communication channels, such as VPNs, private peering, or even dedicated optical fiber channels. To maintain the security of the whole system, monitoring the integrity of such dedicated channels is mandatory.¶
Furthermore, with IP-based software systems, securing such communications involves many more dependencies. In other words, there are many more attack surfaces. For example, if a DNS record is either tampered with or misconfigured, communication intended to go through a secure channel might be routed to public channels. If routing is misconfigured, the traffic might go over public networks. Means for enumerating such dependencies and collecting their status are currently underdeveloped.¶
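As an illustration of the DNS dependency described above, the following minimal Python sketch checks whether the addresses obtained for a peer fall inside the prefixes that the secure channel is expected to cover. The function name `addresses_outside_prefixes`, the prefix, and the sample addresses are assumptions made for this example only; the resolution step itself is left out, so the caller supplies already-resolved addresses.

```python
import ipaddress


def addresses_outside_prefixes(resolved_addresses, allowed_prefixes):
    """Return the resolved addresses that fall outside every allowed prefix."""
    networks = [ipaddress.ip_network(prefix) for prefix in allowed_prefixes]
    outside = []
    for text in resolved_addresses:
        address = ipaddress.ip_address(text)
        # An address of a different IP version is never "in" a network.
        if not any(address in network for network in networks):
            outside.append(text)
    return outside


# Assumed example values: the secure channel is expected to end in 192.0.2.64/28.
SECURE_PREFIXES = ["192.0.2.64/28"]

expected_case = addresses_outside_prefixes(["192.0.2.70"], SECURE_PREFIXES)
tampered_case = addresses_outside_prefixes(
    ["192.0.2.70", "203.0.113.9"], SECURE_PREFIXES
)
print(expected_case)
print(tampered_case)
```

A non-empty result would only indicate that some traffic may leave the intended channel; it says nothing about why the record changed.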
Many technologies that are useful for such purposes are already available.¶
The NASR activity (Network Attestation For Secure Routing) provides a capability for recording and monitoring the forwarding paths of network packets.¶
SAVNET (Source Address Validation in Intra-domain and Inter-domain Networks) provides a way to ensure the validity of incoming traffic and possibly to block rogue packets.¶
SRv6 provides control over the intended routes of individual IPv6 packets between networks.¶
RPKI provides control and trust anchors for the security of inter-domain routing.¶
However, to ensure the security of the whole hybrid cloud infrastructure, we still have to address the following aspects, which currently seem to lack solutions.¶
Hybrid cloud systems depend on many resources that are not under the control of the application system operators. Needless to say, public clouds (both IaaS and SaaS) are operated by external service providers. They have their own operational policies, and they make their own decisions about maintaining or replacing any of the hardware or software resources they provide, as long as their service-level agreements (SLAs) are met.¶
This makes it unsatisfactory to expose information about all intermediate network nodes to the final application operators. First, detailed information on the design and implementation of the cloud infrastructure is confidential and an important asset of the cloud providers. Moreover, some degree of independence between application operators (users of cloud infrastructure) and cloud service providers is critical for maintaining the cost effectiveness, maintainability, security, etc. of the cloud services.¶
In a small-scale, hand-crafted network, determining whether the current running state of the network is as intended is a relatively simple question. However, in complex multi-cloud systems, this is quite hard or even impossible to determine, even if we were able to know every detail of the running state of the whole global network. To determine it, we also have to know the design principles and hidden assumptions behind the operation of each individual network.¶
Many cloud resources, not only computation nodes but also network routers, switches, VPN endpoints, etc., are virtualized and provided via infrastructure-as-code (IaC) systems. Unlike with physical routers and switches, identifying the virtual intermediate nodes in a traffic path does not reveal their physical locations, physical properties, or security properties. (Imagine how we could analyze the results of traceroute or ICMP ping over a virtual private network.)¶
If there are any virtual nodes, the physical properties of their underlying infrastructure may have to be traced and checked to ensure security and integrity. This requires the cooperation of virtual resource providers or cloud providers and integration with their infrastructure management systems.¶
Today, many network systems are managed via complex management systems. This means that any intrusion into the IT-side assets of those management systems will cause severe risks to the network layers. Relevant aspects include (but are not limited to) software asset management, software vulnerabilities, and identity management.¶
To correctly evaluate the risks of the whole network operation, we must also consider the risks of these management systems.¶
To overcome these problems, we propose to design a distributed architecture for assuring the integrity of network operation for mixed and hybrid cloud applications. Such a system should:¶
Model the network infrastructure in two dimensions: one axis parallel to the network paths and forwarding directions, and the other axis for the protocol layers.¶
Have enough knowledge of the complex dependencies of software and protocols; not only the network packet-forwarding technologies but also surrounding areas such as addressing and DNS must be covered.¶
Handle tunneling and virtualization aspects explicitly, both at the protocol level (e.g. VPNs, IPIP, IPsec) and at the infrastructure level (IaC, Network-as-a-Service, Wavelength Division Multiplexing, etc.).¶
Consolidate operation information at each operator's level and consider their pre-determined operation principles for evaluating integrity.¶
Address management-oriented risks of infrastructure management, including non-network aspects.¶
A possible implementation of such a system is a distributed system for network security coordination between operators and users of cloud and network infrastructure. In contrast to a "disclose all" approach, such a design might preserve both the flexibility and the efficiency of multi-cloud applications.¶
In particular, such a system will:¶
Have the ability to state network security requirements from an infrastructure user to infrastructure providers. In hybrid cloud or layered systems, this will include communications between operators of infrastructure and cloud systems.¶
Have the ability to return assertions about the current provisional status against the given requirements.¶
Provide some choice of transparency levels regarding the internals of the cloud-service infrastructure.¶
Have some traceability provisions for troubleshooting, in case network status assertion replies are opaque.¶
Give sufficient consideration to various tunneling and virtualization technologies.¶
Have a bidirectional interface to system-level security management systems, such as Continuous Diagnostics and Mitigation (CDM) dashboards.¶
A service called "Path Characteristics Service" (PCS) provides an endpoint for requesting and answering assertions about the "characteristics" of network paths. Typically, it is deployed by each network operator or connectivity provider and answers real-time assertions about the network status for their contracted clients.¶
In complex commercial networking, network operators provide connectivity not only to the Internet but also to other providers' dedicated services, such as public and private clouds. Also, the connectivity they provide may use tunneling technology on top of networks provided by other operators. In such multi-stakeholder settings, a PCS will gather information from the PCSs of the other providers and return summary information to its clients.¶
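The cascading behaviour described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the class `Pcs`, its fields, the method `assert_region`, and the use of a region label as the only checked property are all hypothetical and are not defined by this document.

```python
from dataclasses import dataclass, field


@dataclass
class Pcs:
    """Hypothetical PCS: answers for its own segment and asks upstream PCSs."""

    operator: str
    region: str  # assumed geo-location label of this operator's segment
    upstream: list = field(default_factory=list)  # PCSs of underlying providers

    def assert_region(self, allowed_regions):
        """Return (satisfied, details) summarising this PCS and all upstream PCSs."""
        details = {self.operator: self.region in allowed_regions}
        for other in self.upstream:
            # Cascaded query: the upstream PCS reports on its own providers too.
            _, sub_details = other.assert_region(allowed_regions)
            details.update(sub_details)
        return all(details.values()), details


# Assumed example: operator A tunnels over a carrier operated elsewhere in the EU.
carrier = Pcs(operator="carrier-B", region="EU")
operator_a = Pcs(operator="operator-A", region="FR", upstream=[carrier])

satisfied, details = operator_a.assert_region({"FR", "EU"})
print(satisfied, details)
```

Whether a client would see the per-operator `details` or only the summary flag depends on the transparency level chosen; the sketch returns both purely for readability.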
(TBA: figures: see IETF 120 NASR BoF presentation)¶
The rest of this section provides an unconsolidated list of requirements for the functionality of PCS.¶
The PCS service will be access-controlled and confidentiality-protected.¶
The service will be authenticated e.g. by OAuth or similar strong authentication mechanisms.¶
The authenticated identity will be associated with a single connectivity or network access channel. Some examples of a single connectivity are:¶
TBD: if there are multiple connectivity channels under a single business contract, multiple IDs might be associated with a single authentication.¶
A query will be a set of assertion requests.¶
Each assertion request will contain a destination and a list of desired connectivity properties on the communication path to the specified destination.¶
A destination is a subset of the Internet, specified as an AS, a subset of IP addresses, or a DNS name.¶
Providers of PCS are not required to support all types of queries, and may impose their own limitations on them. However, they have to return negative responses for any queries they do not support.¶
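The query structure described above might be modelled as in the following Python sketch. All names here (`Destination`, `AssertionRequest`, `answer_query`, the `kind` strings, and the response fields) are assumptions for illustration; this document does not define a wire format. The sketch only shows how unsupported destination types receive an explicit negative response, and it leaves the actual evaluation of supported requests out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Destination:
    kind: str   # assumed labels: "as", "prefix", or "dns"
    value: str


@dataclass(frozen=True)
class AssertionRequest:
    destination: Destination
    properties: tuple  # desired connectivity properties, e.g. (("geo-location", "EU"),)


def answer_query(requests, supported_kinds):
    """Answer a query (a set of assertion requests), one response per request."""
    responses = []
    for request in requests:
        if request.destination.kind not in supported_kinds:
            # Unsupported query types must still get a negative response.
            result = {"result": "negative", "reason": "unsupported"}
        else:
            # Placeholder: evaluating the properties is outside this sketch.
            result = {"result": "accepted"}
        result["destination"] = request.destination.value
        responses.append(result)
    return responses


query = [
    AssertionRequest(Destination("prefix", "198.51.100.4/30"), (("geo-location", "EU"),)),
    AssertionRequest(Destination("dns", "service.example"), (("geo-location", "EU"),)),
]
responses = answer_query(query, supported_kinds={"as", "prefix"})
print(responses)
```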
The list of desired connectivity properties declares what kinds of network elements (both nodes and edges) the communication packets will be allowed to flow over.¶
Possible property requests for a network node will include at least:¶
Network edges may be categorized into:¶
Possible property requests for a physical network edge will include at least:¶
operator¶
geo-location¶
the protocol type of the physical network¶
the security status of the operator¶
required assurance level (see below)¶
Possible property requests for a network tunnel will include at least:¶
operator¶
geo-location¶
(nested) path property request for the underlying network¶
the identification of the tunnel¶
the protocol type¶
the strength of the integrity/confidentiality protection¶
the security status of the tunnel¶
the name and version of the software realizing the tunnel¶
the security status of the operator¶
required assurance level (see below)¶
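The nested path property request for a tunnel's underlying network suggests a recursive structure. The following Python sketch shows one possible shape, assuming a hypothetical `PathPropertyRequest` type whose field names loosely follow the property lists above; only a few of the listed properties are included to keep the example short.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PathPropertyRequest:
    """Hypothetical property request for one layer of a path."""

    operator: Optional[str] = None
    geo_location: Optional[str] = None
    protocol_type: Optional[str] = None
    assurance_level: Optional[str] = None
    # Nested request for the network underneath a tunnel, if any.
    underlying: Optional["PathPropertyRequest"] = None


def flatten(request):
    """List the requests from the outermost tunnel down to the innermost network."""
    layers = []
    while request is not None:
        layers.append(request)
        request = request.underlying
    return layers


# Assumed example: an IPIP tunnel whose underlying network must also stay in the EU.
tunnel_request = PathPropertyRequest(
    protocol_type="IPIP",
    geo_location="EU",
    underlying=PathPropertyRequest(geo_location="EU"),
)
layers = flatten(tunnel_request)
print([layer.protocol_type for layer in layers])
```

Keeping the underlying request as a full request of the same type means a tunnel carried over another tunnel needs no special handling.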
Possible property requests for a software-defined network will include at least:¶
An assertion, which is a response to the query, will contain either evidence or a guarantee of the required network properties. There will be several assurance levels or types of assertions to be returned. Every response will be signed by the PCS with the identification of the PCS software.¶
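To make the signing requirement concrete, the sketch below attaches an integrity tag to an assertion and verifies it. It is a stand-in only: it uses a shared-key HMAC from the Python standard library, whereas a real PCS would presumably use an asymmetric signature scheme that this document does not yet specify. The field names, the key, and the `pcs_software` value are assumptions for illustration.

```python
import hashlib
import hmac
import json


def _canonical(assertion):
    """Serialize the assertion deterministically so both sides tag the same bytes."""
    return json.dumps(assertion, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_assertion(assertion, key):
    """Return the assertion together with an HMAC-SHA256 tag (signature stand-in)."""
    tag = hmac.new(key, _canonical(assertion), hashlib.sha256).hexdigest()
    return {"assertion": assertion, "signature": tag}


def verify_assertion(signed, key):
    """Check that the tag matches the assertion content."""
    expected = hmac.new(key, _canonical(signed["assertion"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed["signature"])


# Assumed example values.
KEY = b"demo-key-not-for-production"
signed = sign_assertion(
    {"pcs_software": "example-pcs/0.1", "destination": "192.0.2.64/28", "result": "positive"},
    KEY,
)
tampered = {
    "assertion": dict(signed["assertion"], result="negative"),
    "signature": signed["signature"],
}
print(verify_assertion(signed, KEY), verify_assertion(tampered, KEY))
```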
For traced assertions, the query will typically contain a requirement for specific node suppliers and types. The answer will contain a recorded trace of the path, signed by each traversed network node with its identification. The information will ensure that the property is satisfied only at the present time.¶
This type of assertion will require dedicated support for packet traces in every network node.¶
For transparent assertions, the response will contain a list of traversed nodes and edges with their properties (as requested in the query). If the query contains requirements for networks operated by third parties (i.e. involving cascaded queries to other PCSs), the assertion will contain sub-assertions received from the third parties. The information will ensure that the property is satisfied only at the present time.¶
For traceable opaque assertions, the response will contain an opaque ID for the response. That ID has to correspond to trace information that operators can use to identify the records for troubleshooting at a later time. The information will ensure that the property is satisfied only at the present time.¶
For opaque assertions, the response will contain just a positive or negative answer to the query. The information will ensure that the property is satisfied only at the present time.¶
For traceable opaque future assertions, the response will contain an opaque ID for the response. That ID has to correspond to trace information that operators can use to identify the records for troubleshooting at a later time. The information will ensure that the network is controlled in such a way that the required property is kept satisfied, even when dynamic routing has changed.¶
For opaque future assertions, the response will contain just a positive or negative answer to the query. The information will ensure that the network is controlled in such a way that the required property is kept satisfied, even when dynamic routing has changed.¶
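The assertion types described above differ along three axes: whether path details are disclosed, whether an opaque trace ID is returned, and whether the guarantee extends beyond the present time. The Python sketch below tabulates them under that reading. The type `AssertionKind`, the string labels, and the helper `select_kinds` are assumptions for illustration, including the label "opaque-future" for the last type.

```python
from typing import NamedTuple


class AssertionKind(NamedTuple):
    discloses_path: bool    # response lists or traces the traversed nodes
    has_trace_id: bool      # response carries an opaque ID for troubleshooting
    holds_in_future: bool   # property is kept even when dynamic routing changes


ASSERTION_KINDS = {
    "traced": AssertionKind(True, False, False),
    "transparent": AssertionKind(True, False, False),
    "traceable-opaque": AssertionKind(False, True, False),
    "opaque": AssertionKind(False, False, False),
    "traceable-opaque-future": AssertionKind(False, True, True),
    "opaque-future": AssertionKind(False, False, True),
}


def select_kinds(require_future=False, require_trace_id=False, allow_path_disclosure=True):
    """Return the labels of assertion kinds compatible with the given constraints."""
    return [
        name
        for name, kind in ASSERTION_KINDS.items()
        if (kind.holds_in_future or not require_future)
        and (kind.has_trace_id or not require_trace_id)
        and (allow_path_disclosure or not kind.discloses_path)
    ]


future_kinds = select_kinds(require_future=True)
hidden_but_traceable = select_kinds(require_trace_id=True, allow_path_disclosure=False)
print(future_kinds)
print(hidden_but_traceable)
```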
How to assert security level of operators¶
Standards or de-facto standards for status sharing with security dashboards¶
Details on the specification of real-world properties such as operators, suppliers, models and geo-locations¶
How to integrate and monitor application-level dynamic routing (e.g. DNS)¶
Possible more-detailed specification for network topology requirements¶
More detailed integration with other NASR activities¶
Possible integration with RPKI and other global-level management systems¶
The following is an informal example of a possible query to the PCS operated by operator A.¶
                     Internet
                        |
+------+    +-----------------+      +-----------------+
| User |----| Operator A (FR) |      | Operator C (DE) |
+------+    |  192.0.2.64/28  |      | 198.51.100.4/30 |
            +-----------------+      +-----------------+
                     |                        |
                     `===== IPIP VPN tunnel ====='
                   via some network operator in EU

Note: Areas FR, DE, and EU are just chosen as examples for nested areas.¶
The path to 192.0.2.64/28 (on operator A) will be composed of:¶
The path to 198.51.100.4/30 (on operator C) will be composed of:¶
Security SHALL be deeply considered and discussed during the ongoing standardization process.¶
This document has no IANA actions.¶
This work is supported by NEDO grant P23013 from the New Energy and Industrial Technology Development Organization.¶