Global Lambda Integrated Facility

Subject Re: [GLIF controlplane] RE: Network Control Architecture
From Joe Mambretti <j-mambretti@xxxxxxxxxxxxxxxx>
Date Sun, 06 May 2007 9:32:57 -0500

Hello:

I agree with your suggestion that it is important to start with small steps. However, with any
steps, there must be some assumptions behind the design. One reason that these designs have been
challenging is that different communities have varying ideas about resource costs: the higher the
cost, the greater the consideration for advance scheduling (e.g., airline travel vs. the local metro
- note that UvA has created a token-based ticketing system). Some communities where resources are
ubiquitous do not want to have major considerations about scheduling at all. Also, there are
different design approaches, such as chained authorization vs. simultaneously pushing or pulling
credentials across domains. (There are many other issues as well.) These types of issues have
slowed progress toward an actual prototype implementation. I suggest that during your proposed
call, the participants agree to design and implement "a prototype" (vs. perhaps the ultimate
prototype) by agreeing on some of these basic concepts - as the IETF says, "rough consensus and
running code."

Thanks.


==============Original message text===============
On Sun, 06 May 2007 8:18:59 am CDT Gigi Karmous-Edwards wrote:

All,

I forgot to mention one more thing: As was discussed in the meeting in 
February, both strategies can co-exist. We drew this up on the 
whiteboard the first day and then decided not to have it initially as 
part of the architecture. Those who were present may remember that we 
drew two separate network domain clouds, DNRM-A and DNRM-B (DNRM = 
Domain Network Resource Manager). Then we discussed that if they had an 
agreement with each other, such as the "inter-domain Dragon" testbed, 
we could have another DNRM-AB (one cloud that encapsulates the two 
smaller ones) for advertising and therefore configuring. In this case, 
if a user request comes in that requires a lightpath across domains A 
and B, the RB, on behalf of the user, can make a single request to 
DNRM-AB. Let me know what the community's thoughts are ....
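The encapsulating-manager idea above can be sketched in a few lines. This is only an illustration of the scoping rule (all class and method names here are hypothetical, not from any GLIF implementation): each DNRM accepts a request only when both endpoints fall inside the set of domains it can configure, so a cross-domain lightpath needs a single call to DNRM-AB rather than two coordinated calls.

```python
# Hedged sketch of the aggregated DNRM-AB idea: DNRM-AB fronts the two
# per-domain managers, so a request spanning A and B needs only one call.
# All names are illustrative, not from any real implementation.

class DNRM:
    def __init__(self, name, domains):
        self.name = name
        self.domains = set(domains)   # domains this manager can configure

    def reserve_lightpath(self, src_domain, dst_domain):
        """Accept the request only if both endpoints fall in our scope."""
        return {src_domain, dst_domain} <= self.domains

dnrm_a = DNRM("DNRM-A", ["A"])
dnrm_b = DNRM("DNRM-B", ["B"])
dnrm_ab = DNRM("DNRM-AB", ["A", "B"])   # encapsulates both smaller clouds

# A single RB request for a lightpath across A and B goes to DNRM-AB:
assert dnrm_ab.reserve_lightpath("A", "B")
assert not dnrm_a.reserve_lightpath("A", "B")
```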

Kind regards,
Gigi

--------------------------------------------

Gigi Karmous-Edwards
Principal Scientist
Advanced Technology Group
http://www.mcnc.org
MCNC
RTP, NC, USA
+1 919-248-4121
gigi@xxxxxxxx
--------------------------------------------



Gigi Karmous-Edwards wrote:
> Hi Jerry and All,
>
> Ok Jerry, I stuck with you through your insightful email (I started 
> your email a couple of weeks ago and just finished it this morning 
> :-) ). If I can summarize your assertions: when an interdomain 
> lightpath is requested, the resource broker (RB) (which is a servant 
> of a user rather than a domain) talks only to the first domain's NRM 
> (network resource manager), and then that NRM talks to the second 
> NRM, and so on until the destination. This requires each domain to 
> have established some sort of agreement with all adjacent domains. In 
> your second scenario, it seems the user's requested source RM is not 
> in the RB's "domain", and the RB will have to forward the request to 
> the right RM; then the above process repeats.
>
> I think what you described is the ultimate goal of the community, 
> however, due to complexities of the current infrastructures (NRENs, 
> Research testbeds, Global government networks, etc) that require 
> interoperation, it seems that we first need to take small "baby 
> steps".  Existing infrastructures include a variety of  technologies, 
> different management (TL1, SNMP, CLI, etc.) and control plane  (very 
> few deployments of GMPLS) tools for configuration and fault 
> management, also current procedures for information exchange between 
> network domains range from protocols to phone calls/emails. These 
> complexities and other "policy" related challenges force us to break 
> the problem up into smaller functional blocks. I think the framework 
> presented will give us a path forward based on "baby steps" to finally 
> reach the scenario you describe.
>
> I see the problem as having three key challenges:
> 1) Information dissemination (where is what resource? what are its 
> characteristics? what are its policies for use?)
> 2) Capability to request reservations on resources globally once 
> discovered (standard interfaces to query resource managers, with "NO" 
> restrictions on how each resource manager accommodates each request; 
> reuse of existing implementations)
> 3) Scalability (division of labor among functional components and 
> responsibilities per domain)
>
>
> The assumption in the framework sent out has been that an RB takes 
> requests from a particular domain's user/application but behaves as a 
> servant of the domain, not of a single user. In this case there will 
> be several RBs worldwide, but not one for each user; rather, one or 
> two per domain. It is assumed that knowledge of the different 
> resources globally will be published per domain in a very distributed 
> fashion (each RB will publish the resources and their characteristics, 
> hopefully using the schema from the OGF Network Markup Language 
> working group). A query from one RB to the "distributed GLIF 
> resources" will use a type of crawl mechanism to match the requested 
> resources with the "published" resource information that each domain 
> RB publishes on behalf of its RMs. The assumption is that the 
> information published by the RBs is not static and will be updated by 
> each RB when necessary. This email is already getting too long; I 
> suggest that we have a conference call and use a web-based 
> slide-sharing application to go through some scenarios. Any interest?
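The per-domain publish / crawl-and-match idea above can be illustrated with a small sketch. The names, the flat dictionary standing in for the distributed publications, and the matching rule are all assumptions for illustration; a real system would use the NML schema rather than these ad hoc fields:

```python
# Illustrative sketch: each domain RB "publishes" its resources, and a
# querying RB crawls the published documents to match a request. The
# data layout and field names here are hypothetical.

published = {
    "domain-A": [{"type": "lightpath", "capacity_gbps": 10}],
    "domain-B": [{"type": "lightpath", "capacity_gbps": 40}],
}

def crawl_match(request):
    """Return the domains whose published resources satisfy the request."""
    matches = []
    for domain, resources in published.items():
        for r in resources:
            if (r["type"] == request["type"]
                    and r["capacity_gbps"] >= request["capacity_gbps"]):
                matches.append(domain)
                break   # one match per domain is enough
    return matches
```

Because the published information is not static, each domain RB would re-publish its entries whenever its RMs report a change.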
>
>
> To summarize, the strategy in your email will be the goal of the 
> community, but it will take a while. I think that, as a community, we 
> can start to develop standard interfaces for the various RMs, such as 
> the Generic Network Interface (GNI); this will help us toward 
> interoperability in today's environment.
>
> Please let me know if we should have a GLIF control plane conference 
> call in the next few weeks.
>
> Kind regards,
> Gigi
>
> --------------------------------------------
>
> Gigi Karmous-Edwards
> Principal Scientist
> Advanced Technology Group
> http://www.mcnc.org
> MCNC
> RTP, NC, USA
> +1 919-248-4121
> gigi@xxxxxxxx
> --------------------------------------------
>
>
>
> Jerry Sobieski wrote:
>> Good comments both Steve and Bert...let me chime in:     (this is a 
>> bit long, but I think it is relevant)
>>
>> I too think the reservation phase in each domain must be atomic - 
>> there are effective ways to do this. The overall process, though, 
>> becomes two-phase: HOLD a resource for some finite holding time and 
>> provide an ACK to the requester. At some later time the RM will 
>> receive a CONFIRM from the requester, or a RELEASE. If the hold time 
>> expires, the resource is released unilaterally. On a macro basis, 
>> the reservation of the entire end-to-end lightpath must also be held 
>> in the HOLD state while the rest of the application resources are 
>> reserved, as there may be a dependency between availability of 
>> non-network resources and the reserved lightpath.
>> As Steve suggests, this atomic two-phase mechanism is used in many 
>> other similar reservation systems.
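The HOLD/CONFIRM/RELEASE cycle with a unilateral timeout can be sketched as a small state machine. This is a minimal illustration under assumed names (nothing here comes from GLIF or DRAGON code); a real RM would also persist state and serialize concurrent holds:

```python
import time

# Illustrative sketch of the two-phase reservation described above:
# HOLD returns a ticket (the ACK); CONFIRM locks the reservation in
# unless the hold has already timed out; RELEASE gives it up early.

HOLD_TIME = 30.0  # seconds a HOLD is honored before unilateral release

class ResourceManager:
    def __init__(self):
        self.holds = {}        # ticket -> (resource, expiry timestamp)
        self.confirmed = {}    # ticket -> resource, locked-in reservations
        self._next_ticket = 0

    def hold(self, resource):
        """Phase 1: atomically place the resource in HOLD, ACK with a ticket."""
        self._next_ticket += 1
        ticket = self._next_ticket
        self.holds[ticket] = (resource, time.monotonic() + HOLD_TIME)
        return ticket  # the ACK sent back to the requester

    def confirm(self, ticket):
        """Phase 2: lock in the reservation, if the hold has not expired."""
        entry = self.holds.pop(ticket, None)
        if entry is None:
            return False               # unknown ticket, or already released
        resource, expiry = entry
        if time.monotonic() > expiry:
            return False               # hold expired: released unilaterally
        self.confirmed[ticket] = resource
        return True

    def release(self, ticket):
        """The requester explicitly gives up the hold."""
        self.holds.pop(ticket, None)
```

On the macro level described above, every per-domain hold for an end-to-end lightpath would stay in this HOLD state until all the application's resources are held, and only then be CONFIRMed together.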
>>
>> The issue I am concerned about is the roles of the RB and RM. I think 
>> the RBs will be numerous - possibly one for every user. I believe 
>> we must assume that all networks will default to a stringent "self 
>> secure" stance and will only allow access to their RM from known and 
>> trusted peers. It doesn't scale for every network to "know" about 
>> every other RB in the world (RBs are agents of the user - not of the 
>> network). Therefore, for scalability and security reasons, these 
>> resource reservation requests must be made between directly peering 
>> networks, and each network is responsible for recursively reserving 
>> the resources forward toward the destination. This is still a 
>> two-phase commit as described above, but it solves two problems: a) it 
>> scales much better, as each network only needs to expect queries from 
>> its direct peers (and customers), and b) it allows each network to 
>> negotiate aggregation policies with its peers for services (enabling 
>> economies of scale and global reach). This is not unlike how we 
>> place a phone call to anywhere in the world - we don't go asking each 
>> network if we can use it; we ask our service provider to do so, they 
>> ask theirs, and so on, and so on ...
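The recursive hop-by-hop reservation can be sketched as follows. This is an assumed, simplified model (single next-hop peer per domain, no scheduling, illustrative names only): each domain holds its own segment, then asks its directly peering neighbor to reserve forward, and releases its local hold if anything downstream fails:

```python
# Hedged sketch of recursive reservation between directly peering
# networks: each domain reserves only its own segment and delegates the
# rest to its next-hop peer. All names are illustrative.

class Domain:
    def __init__(self, name, next_hop=None):
        self.name = name
        self.next_hop = next_hop  # the directly peering Domain toward dst
        self.held = []

    def reserve(self, dst):
        """HOLD the local segment, then recurse toward the destination."""
        self.held.append(dst)           # phase-1 HOLD on the local segment
        if self.name == dst:
            return [self.name]          # reached the destination domain
        if self.next_hop is None:
            self.held.pop()             # no route forward: release and NACK
            return None
        tail = self.next_hop.reserve(dst)
        if tail is None:
            self.held.pop()             # downstream failed: release local hold
            return None
        return [self.name] + tail       # end-to-end path built up on return
```

With a chain A - B - C, the RB asks only A; `a.reserve("C")` yields the path `["A", "B", "C"]`, and A never needs to know any RM beyond its direct peer.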
>>
>> The above scenario assumes the RB poses the service request to the RM 
>> serving the source end of a path. There is a [common?] case where 
>> the RB is not at the endpoint(s) and does not know of any RMs at the 
>> endpoint (or in the middle, for that matter). This brings us to 
>> another assumption I think we must make: an RB only knows its *local* 
>> network RM. An appropriately designed algorithm should/could 
>> forward the request to the source-address RM using the same 
>> forwarding process as the reservation (but crossgrain, toward the 
>> source), and then the request can be serviced forward normally as 
>> described above. (This is the "third party" provisioning scenario.)
>> An alternative model assumes a "minion" agent at the path endpoints 
>> that is owned by the end user and knows of its local RM - the minion 
>> agent acts as proxy for the RB and makes the reservation request to 
>> the minion's RM. (Got that? :-) I think we *can* assume that the 
>> RB knows of these minions, since they reside at the endpoints (source 
>> or destination) at a well-known port.
>>
>> It is important to note that this process relies on each network RM 
>> (not the RB) knowing constrained reachability of all endpoints - not 
>> unlike current interdomain routing protocols. This allows the RM to 
>> postulate which "nexthop" network will provide the best path and try 
>> that first. If the RM knows more than just reachability - i.e., if it 
>> knows topology - then the RM can select a more specific candidate 
>> path and, via authorized recursive queries, can reserve the 
>> resources. Only the RM responsible for a network knows the state and 
>> availability details associated with the internal network resources, 
>> and therefore only the local RM can authoritatively and atomically 
>> reserve the resources in that network.
>>
>> The beauty of this process is that, from the RB perspective, the RB 
>> need only ask one RM for the entire end-to-end network path. The RM 
>> will either return a ticket indicating a path was successfully 
>> reserved that meets the requested service characteristics, or a NACK 
>> indicating that the resource was not available for some reason. The 
>> user must change the requested service parameters somehow before 
>> trying again (i.e., change the source or destination address, the 
>> start time, the capacity, etc.).
>>
>> As Gigi states, once all application resources are reserved in the 
>> HOLD state, then all must be CONFIRMed, which will lock in the 
>> reservation.
>> At some delta-t later (which could be 0) there is a separate process 
>> that causes the reconfiguration of the network elements to make the 
>> reserved resources available for actual use (i.e., the provisioning 
>> or signaling process). This process must be correlated to a previous 
>> reservation, and so the provisioning request (separate from the 
>> reservation request) must contain some indicator that is trusted by 
>> the network and indicates which reservation is being placed into 
>> service (see Leon's work on AAA).
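One way the trusted indicator could work is sketched below. This is purely an assumed illustration (the key scheme, function names, and ticket format are mine, not from the AAA work referenced above): the network signs the reservation identifier when it is confirmed, and later provisions only requests carrying a ticket it can verify it issued itself:

```python
import hashlib
import hmac

# Hypothetical sketch of correlating a provisioning request with a prior
# reservation: the network reconfigures elements only for a ticket it
# issued and trusts. Key handling and naming are illustrative only.

SECRET = b"domain-secret"   # known only inside this network's own domain

def issue_ticket(reservation_id):
    """Issued at CONFIRM time; proves the reservation to this network."""
    tag = hmac.new(SECRET, reservation_id.encode(), hashlib.sha256).hexdigest()
    return f"{reservation_id}:{tag}"

def provision(ticket):
    """Place a reservation into service only if we issued this ticket."""
    reservation_id, tag = ticket.rsplit(":", 1)
    expected = hmac.new(SECRET, reservation_id.encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(tag, expected)  # True -> reconfigure elements
```

The point of the sketch is only the separation: the reservation request and the provisioning request are distinct messages, tied together by an indicator the network itself can validate.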
>>
>> Note that none of the above is predicated on any particular routing 
>> or signaling protocol... That being said (:-), DRAGON has 
>> implemented much of this functionality using GMPLS protocols.
>> - The DRAGON Network Aware Resource Broker (NARB) is analogous to 
>> the network RM and performs the path computation, recursively 
>> reserving the resources along the way. It returns a path reservation 
>> in the form of an Explicit Route Object (ERO) to the source 
>> requestor. This loose-hop ERO specifies a path consisting of ingress 
>> and egress points at each network boundary.
>> - RSVP then uses this ERO to provision the multi-domain end-to-end 
>> path.
>> - The DRAGON Application Specific Topology "Master" is an agent 
>> analogous to the RB mentioned above. The AST Master queries all the 
>> various resource managers (compute nodes, storage, instruments, 
>> network, etc.) to reserve groups of dependent resources. There is a 
>> significant protocol exchange defined for ASTs to construct a 
>> workable physical resource grid for the application.
>> What DRAGON has not yet implemented: we have implemented scheduling 
>> and policy constraints in the traffic engineering database, but we 
>> have not yet implemented the path computation to use those 
>> constraints (this will be coming soon).
>> We have atomic reservations, but have not implemented the two-phase 
>> commit - though we have long recognized it as critical to the 
>> book-ahead capability and a robust integrated resource scheduling 
>> process.
>>
>> Thanks for sticking with me on this ...:-)
>> Jerry
>>
===========End of original message text===========