Global Lambda Integrated Facility

Subject Re: [GLIF controlplane] RE: Network Control Architecture
From Jerry Sobieski <jerrys@xxxxxxxxxxxxxx>
Date Fri, 20 Apr 2007 14:47:46 -0400

Good comments both Steve and Bert...let me chime in: (this is a bit long, but I think it is relevant)

I too think the reservation phase in each domain must be atomic - there are effective ways to do this. The overall process, though, becomes two-phase: HOLD a resource for some finite holding time and provide an ACK to the requester. At some later time the RM will receive a CONFIRM from the requester, or a RELEASE. If the hold time expires, the resource is released unilaterally. On a macro basis, the reservation of the entire end-to-end lightpath must also be kept in the HOLD state while the rest of the application resources are reserved, as there may be a dependency between the availability of non-network resources and the reserved lightpath. As Steve suggests, this atomic two-phase mechanism is used in many other similar reservation systems.
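As a concrete illustration, here is a minimal sketch of that HOLD/CONFIRM/RELEASE cycle with a unilateral expiry. The class and method names (hold, confirm, release) are my own shorthand, not any real GLIF or RM interface:

```python
import threading
import time
import uuid

class ResourceManager:
    """Sketch of a two-phase reservation: phase 1 atomically HOLDs a
    resource and ACKs with a ticket; phase 2 CONFIRMs before the hold
    time expires, otherwise the resource is released unilaterally."""

    def __init__(self, hold_timeout=30.0):
        self.hold_timeout = hold_timeout
        self.holds = {}        # ticket -> expiry deadline
        self.confirmed = set()
        self.lock = threading.Lock()

    def hold(self, resource):
        """Phase 1: place the resource in HOLD; the returned ticket
        is the ACK sent back to the requester."""
        with self.lock:
            ticket = str(uuid.uuid4())
            self.holds[ticket] = time.monotonic() + self.hold_timeout
            return ticket

    def confirm(self, ticket):
        """Phase 2: lock in the reservation, but only if the hold
        has not already expired."""
        with self.lock:
            deadline = self.holds.pop(ticket, None)
            if deadline is None or time.monotonic() > deadline:
                return False   # expired or unknown: released unilaterally
            self.confirmed.add(ticket)
            return True

    def release(self, ticket):
        """Requester gives up the hold explicitly."""
        with self.lock:
            self.holds.pop(ticket, None)
```

A lock around each transition is what makes the per-domain step atomic; the macro-level end-to-end hold is just this same pattern applied to every domain's ticket at once.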

The issue I am concerned about is the roles of the RB and the RM. I think the RBs will be numerous - possibly one for every user. I believe we must assume that all networks will default to a stringent "self-secure" stance and will only allow access to their RM from known and trusted peers. It doesn't scale for every network to "know" about every other RB in the world (RBs are agents of the user - not of the network). Therefore, for scalability and security reasons, these resource reservation requests must be made between directly peering networks, and each network is responsible for recursively reserving the resources forward toward the destination. This is still the two-phase commit described above, but it solves two problems: a) it scales much better, as each network only needs to expect queries from its direct peers (and customers), and b) it allows each network to negotiate aggregation policies with its peers for services (enabling economies of scale and global reach). This is not unlike how we place a phone call to anywhere in the world - we don't go asking each network if we can use it; we ask our service provider, they ask theirs, and so on, and so on...
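The recursive forwarding can be sketched as follows. Each network reserves its own segment and then asks its next-hop peer to do the same toward the destination, unwinding (releasing) on failure. This is purely illustrative - the Network class, peer table, and return values are assumptions, not a defined protocol:

```python
class Network:
    """Sketch of recursive reservation between directly peering
    networks: each RM holds its local segment, then forwards the
    remainder of the request to its next-hop peer toward dest."""

    def __init__(self, name, peers=None):
        self.name = name
        self.peers = peers or {}  # destination name -> next-hop Network
        self.held = []            # segments currently in HOLD

    def reserve(self, dest):
        """Hold a local segment, then recurse toward dest.
        Returns the chain of networks holding segments, or None (NACK)."""
        self.held.append(dest)
        if dest == self.name:
            return [self.name]           # reached the destination network
        nexthop = self.peers.get(dest)
        if nexthop is None:
            self.held.pop()
            return None                  # no route: NACK back upstream
        tail = nexthop.reserve(dest)
        if tail is None:
            self.held.pop()              # release our hold on failure
            return None
        return [self.name] + tail
```

The key property is that every RM only ever talks to its direct peers, yet the requester still ends up with an end-to-end chain of holds - or a clean NACK with nothing left held.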

The above scenario assumes the RB poses the service request to the RM serving the source end of a path. There is a [common?] case where the RB is not at the endpoint(s) and does not know of any RMs at the endpoints (or in the middle, for that matter). This brings us to another assumption I think we must make: an RB only knows its *local* network RM. An appropriately designed algorithm should/could forward the request to the source-address RM using the same forwarding process as the reservation (but against the grain, back toward the source), and then the request can be serviced forward normally as described above. (This is the "third party" provisioning scenario.) An alternative model assumes a "minion" agent at the path endpoints that is owned by the end user and knows of its local RM - the minion agent acts as a proxy for the RB and makes the reservation request to the minion's RM. (Got that? :-) I think we *can* assume that the RB knows of these minions, since they reside at the endpoints (source or destination) at a well-known port.

It is important to note that this process relies on each network RM (not the RB) knowing constrained reachability of all endpoints - not unlike current interdomain routing protocols. This allows the RM to postulate which "nexthop" network will provide the best path and try that first. If the RM knows more than just reachability - i.e. if it knows topology - then the RM can select a more specific candidate path and, via authorized recursive queries, reserve the resources. Only the RM responsible for a network knows the state and availability details associated with the internal network resources, and therefore only the local RM can authoritatively and atomically reserve the resources in that network.
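The "postulate the best nexthop and try that first" behavior might look like this. The reachability structure, cost metric, and `try_reserve` callback are all hypothetical stand-ins for whatever the routing exchange and recursive peer query actually provide:

```python
def reserve_via_best_peer(reachability, dest, try_reserve):
    """Sketch: rank candidate next-hop networks for dest using
    constrained-reachability info (as in interdomain routing), try the
    best candidate first, and fall back to the next on a NACK.

    reachability: dict mapping dest -> list of {"name", "cost"} peers
    try_reserve:  callable(peer_name, dest) -> ticket or None (NACK),
                  standing in for the authorized recursive query to
                  the peer RM."""
    for peer in sorted(reachability.get(dest, []), key=lambda p: p["cost"]):
        ticket = try_reserve(peer["name"], dest)
        if ticket is not None:
            return ticket   # peer's RM reserved its resources
    return None             # NACK: no candidate peer could reach dest
```

Note that this function never touches another network's internal state - it only asks; the peer's RM remains the sole authority over its own resources.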

The beauty of this process is that from the RB perspective, the RB need only ask one RM for the entire end-to-end network path. The RM will either return a ticket indicating a path was successfully reserved that meets the requested service characteristics, or a NACK indicating that the resource was not available for some reason. The user must change the requested service parameters somehow before trying again (i.e. change the source or destination address, the start time, the capacity, etc.).

As Gigi states, once all application resources are reserved in the HOLD state, all must then be CONFIRMed, which locks in the reservation. At some delta-t later (which could be 0), a separate process causes the reconfiguration of the network elements to make the reserved resources available for actual use (i.e. the provisioning or signaling process). This process must be correlated to a previous reservation, so the provisioning request (separate from the reservation request) must contain some indicator that is trusted by the network and identifies which reservation is being placed into service (see Leon's work on AAA).
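One way to make that indicator trustworthy is to bind the reservation ID to a secret held by the network, so the later provisioning request can be verified without a lookup round-trip. This is only a sketch of the idea - the real mechanism would come from the AAA framework referenced above, and the secret, token format, and function names here are all assumptions:

```python
import hashlib
import hmac

# Hypothetical secret known only to this network's RM.
SECRET = b"network-rm-secret"

def issue_token(reservation_id):
    """At CONFIRM time: return (id, tag), where the tag is an
    unforgeable MAC over the reservation ID."""
    mac = hmac.new(SECRET, reservation_id.encode(), hashlib.sha256)
    return reservation_id, mac.hexdigest()

def verify_token(reservation_id, tag):
    """At provisioning time: check the tag before reconfiguring any
    network elements for this reservation."""
    mac = hmac.new(SECRET, reservation_id.encode(), hashlib.sha256)
    return hmac.compare_digest(mac.hexdigest(), tag)
```

The point is simply that the provisioning request carries proof it refers to a reservation this network actually granted, keeping the reservation and provisioning processes separate but correlated.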

Note that none of the above is predicated on any particular routing or signaling protocol... That being said (:-), DRAGON has implemented much of this functionality using GMPLS protocols:

- The DRAGON Network Aware Resource Broker (NARB) is analogous to the network RM and performs the path computation, recursively reserving the resources along the way. It returns a path reservation in the form of an Explicit Route Object (ERO) to the source requester. This loose-hop ERO specifies a path consisting of ingress and egress points at each network boundary.
- RSVP then uses this ERO to provision the multi-domain end-to-end path.
- The DRAGON Application Specific Topology "Master" is an agent analogous to the RB mentioned above. The AST Master queries all the various resource managers (compute nodes, storage, instruments, network, etc.) to reserve groups of dependent resources. There is a significant protocol exchange defined for ASTs to construct a workable physical resource grid for the application.

What DRAGON has not yet implemented: We have implemented scheduling and policy constraints in the traffic engineering database, but we have not yet implemented the path computation to use those constraints (this will be coming soon). We have atomic reservations, but have not implemented the two-phase commit - though we have long recognized it as critical to the book-ahead capability and a robust integrated resource scheduling process.

Thanks for sticking with me on this ...:-)