Global Lambda Integrated Facility

Subject Re: [GLIF controlplane] RE: Network Control Architecture
From Cees de Laat <delaat@xxxxxxxxxxxxxx>
Date Sun, 6 May 2007 17:03:57 +0200


What I see here is a discussion between two fundamentally different systems: a tree-like allocation and authorization method versus a chain-like model. In Phosphorus we identify those and also need to work out how to bridge between neighboring domains that implement different models.
We need big steps and rough code there.

Best regards,

At 09:32 -0500 06-05-2007, Joe Mambretti wrote:

I agree with your suggestion that it is important to start with small steps. However, with any steps, there must be some assumptions behind the design. One reason that these designs have been challenging is that different communities have varying ideas about resource costs - the higher the cost, the greater the consideration for advance scheduling (e.g., airline travel vs. the local metro - note that UvA has created a token-based ticketing system). Some communities where resources are ubiquitous do not want to have major considerations about scheduling at all. Also, there are different design approaches, such as chained authorization vs. simultaneously pushing or pulling credentials across domains. (There are many other issues as well.) These types of issues have slowed progress toward an actual prototype implementation. I suggest that during your proposed call, the participants agree to design and implement "a prototype" (vs. perhaps the ultimate prototype) by agreeing on some of these basic concepts - as the IETF says, "rough consensus and running code."


==============Original message text===============
On Sun, 06 May 2007 8:18:59 am CDT Gigi Karmous-Edwards wrote:


I forgot to mention one more thing: as was discussed in the meeting in
February, both strategies can co-exist. We drew this up on the
whiteboard the first day and then decided not to have it initially as
part of the architecture. Those who were present may remember that we
drew two separate network domain clouds, (Domain Network Resource
Manager) DNRM-A and DNRM-B. Then we discussed that if they had an
agreement between each other, such as the "inter-domain DRAGON"
testbed, then we can have another DNRM-AB (one cloud that encapsulates
the two smaller ones) for advertising and therefore configuring. In
this case, if a user request comes in that requires a lightpath across
domains A and B, the RB on behalf of the user can make a single
request to DNRM-AB. Let me know what the community's thoughts are...
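To make the DNRM-AB idea concrete, here is a minimal Python sketch (editor's illustration, not GLIF or DRAGON code; the class and method names, link identifiers, and the boolean "reserve" stand-in are all assumptions) of a composite manager that encapsulates two domain NRMs so the RB can make a single cross-domain request:

```python
# Illustrative sketch of DNRM-AB: a composite manager wrapping DNRM-A
# and DNRM-B. All names here are made up for illustration.
class DNRM:
    def __init__(self, name, links):
        self.name = name
        self.links = set(links)       # intra-domain links this NRM controls

    def reserve(self, link):
        # Stand-in for a real reservation: succeed iff the link is ours.
        return link in self.links

class CompositeDNRM:
    """DNRM-AB: one cloud that advertises and configures across members."""
    def __init__(self, members):
        self.members = members

    def reserve(self, link):
        # Delegate to whichever member domain owns the link.
        return any(m.reserve(link) for m in self.members)

dnrm_ab = CompositeDNRM([DNRM("A", ["a1-a2"]), DNRM("B", ["b1-b2"])])
# A lightpath with legs in both domains needs only requests to DNRM-AB:
# dnrm_ab.reserve("a1-a2") -> True, dnrm_ab.reserve("b1-b2") -> True
```

The point of the sketch is only the delegation: the RB sees one manager, while each leg is still reserved by the domain that owns it.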

Kind regards,


Gigi Karmous-Edwards
Principal Scientist
Advanced Technology Group
+1 919-248-4121

Gigi Karmous-Edwards wrote:
 Hi Jerry and All,

 Ok Jerry, I stuck with you on your insightful email (I started your
 email a couple of weeks ago and just finished it this morning :-) ).
 If I can summarize your assertions: when an interdomain lightpath is
 requested, the resource broker (RB) (which is a servant of a user
 rather than a domain) talks only to the first domain's NRM (network
 resource manager), and then that NRM talks to the second NRM, and so
 on until the destination. This requires each domain to have
 established some sort of agreement with all adjacent domains. In your
 second scenario it seems the user requested a source RM that is not
 in the RB's "domain", and that the RB will have to forward it to the
 right RM, then a repeat of the above process.

 I think what you described is the ultimate goal of the community;
 however, due to complexities of the current infrastructures (NRENs,
 research testbeds, global government networks, etc.) that require
 interoperation, it seems that we first need to take small "baby
 steps". Existing infrastructures include a variety of technologies,
 different management (TL1, SNMP, CLI, etc.) and control plane (very
 few deployments of GMPLS) tools for configuration and fault
 management; also, current procedures for information exchange between
 network domains range from protocols to phone calls/emails. These
 complexities and other "policy" related challenges force us to break
 the problem up into smaller functional blocks. I think the framework
 presented will give us a path forward based on "baby steps" to finally
 reach the scenario you describe.

 I see the problem as having three key challenges:
 1) Information dissemination (where is what resource? what are its
 characteristics? what are its policies for use?)
 2) Capability to request reservations on resources globally once
 discovered (standard interfaces to query resource managers, with "NO"
 restrictions on how each resource manager accommodates each request;
 reuse of existing implementations)
 3) Scalability (division of labor among functional components and
 responsibilities per domain)

 The assumption in the framework sent out has been that an RB takes
 requests from a particular domain's user/application but behaves as a
 servant of the domain, not a single user. In this case there will be
 several RBs worldwide, but not one for each user - rather one or two
 per domain. It is assumed that the knowledge of the different
 resources globally will be published per domain in a very distributed
 fashion (each RB will publish the resources and their characteristics,
 hopefully using the schema from the OGF Network Markup Language
 working group). A query from one RB to the "distributed GLIF
 resources" will use a type of crawl mechanism to match the requested
 resources with the "published" resource information that each domain
 RB publishes on behalf of its RMs. The assumption is that the
 information published by the RBs is not static and will be updated by
 each RB when necessary. This email is already getting too long; I
 suggest that we have a conference call and use a web-based slide
 sharing application to go through some scenarios. Any interest?
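The publish-and-crawl idea above can be sketched in a few lines of Python (editor's illustration; the record fields, domain names, and the matching rule are assumptions, not the OGF NML schema):

```python
# Rough sketch: each domain RB publishes resource records on behalf of
# its RMs; a querying RB crawls the published sets looking for matches.
# Field names and domains here are invented for illustration.
published = {
    "domain-A": [{"type": "lightpath", "capacity_gbps": 10, "endpoint": "hostX"}],
    "domain-B": [{"type": "lightpath", "capacity_gbps": 1,  "endpoint": "hostY"}],
}

def crawl_query(want):
    """Walk each domain's published records; return (domain, record) hits."""
    hits = []
    for domain, records in published.items():
        for rec in records:
            if (rec["type"] == want["type"]
                    and rec["capacity_gbps"] >= want["capacity_gbps"]):
                hits.append((domain, rec))
    return hits

# crawl_query({"type": "lightpath", "capacity_gbps": 10}) matches only domain-A
```

Because the published records are updated by each RB when necessary, the crawl sees a reasonably fresh (though not authoritative) view; the authoritative answer still comes from the owning RM at reservation time.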

 To summarize, the strategy in your email will be the goal of the
 community, but it will take a while. I think that, as a community, we
 can start to develop standard interfaces for the various RMs, such as
 the Generic Network Interface (GNI); this will help us towards
 interoperability in today's environment.

 Please let me know if we should have a GLIF control plane conference
 call in the next few weeks.

 Kind regards,


 Gigi Karmous-Edwards
 Principal Scientist
 Advanced Technology Group
 MCNC RTP, NC, USA
 +1 919-248-4121

 Jerry Sobieski wrote:
 Good comments both Steve and Bert... let me chime in: (this is a
 bit long, but I think it is relevant)

 I too think the reservation phase in each domain must be atomic -
 there are effective ways to do this. The overall process though
 becomes two-phase: HOLD a resource for some finite holding time and
 provide an ACK to the requester. At some later time the RM will
 receive a CONFIRM from the requester, or a RELEASE. If the hold time
 expires, the resource is released unilaterally. On a macro basis,
 the reservation of the entire end-to-end lightpath must also be held
 in the HOLD state while the rest of the application resources are
 reserved, as there may be a dependency between availability of
 non-network resources and the reserved lightpath.
 As Steve suggests, this atomic two-phase mechanism is used in many
 other similar reservation systems.
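To make the HOLD/CONFIRM/RELEASE cycle concrete, here is a minimal Python sketch (editor's illustration, assuming a single RM with a per-resource hold timer; the class name, `hold_seconds`, and the resource identifiers are not a proposed GLIF API):

```python
import time

# Sketch of the two-phase reservation: HOLD with a finite holding time,
# then CONFIRM or RELEASE; expired holds are released unilaterally.
class ResourceManager:
    def __init__(self, hold_seconds=60):
        self.hold_seconds = hold_seconds
        self.holds = {}          # resource -> expiry timestamp
        self.confirmed = set()

    def hold(self, resource):
        """Phase 1: atomically place a finite-time HOLD; True means ACK."""
        self._expire()
        if resource in self.holds or resource in self.confirmed:
            return False                      # NACK: already taken
        self.holds[resource] = time.monotonic() + self.hold_seconds
        return True                           # ACK

    def confirm(self, resource):
        """Phase 2: the requester locks in the reservation before expiry."""
        self._expire()
        if resource not in self.holds:
            return False                      # hold expired or never placed
        del self.holds[resource]
        self.confirmed.add(resource)
        return True

    def release(self, resource):
        self.holds.pop(resource, None)
        self.confirmed.discard(resource)

    def _expire(self):
        now = time.monotonic()
        for r in [r for r, t in self.holds.items() if t < now]:
            del self.holds[r]   # hold time expired: release unilaterally
```

On a macro basis, the same pattern applies end to end: the whole lightpath stays in HOLD until every dependent application resource is also held, and only then is everything CONFIRM'ed.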

 The issue I am concerned about is the roles of the RB and RM. I think
 the RBs will be numerous - possibly one for every user. I believe
 we must assume that all networks will default to a stringent "self
 secure" stance and will only allow access to their RM from known and
 trusted peers. It doesn't scale for every network to "know" about
 every other RB in the world (RBs are agents of the user - not of the
 network). Therefore, for scalability and security reasons, these
 resource reservation requests must be made between directly peering
 networks, and each network is responsible for recursively reserving
 the resources forward toward the destination. This is still a
 two-phase commit as described above, but it solves two problems: a)
 it scales much better as each network only needs to expect queries
 from its direct peers (and customers), and b) it allows each network
 to negotiate aggregation policies with its peers for services
 (enabling economies of scale and global reach). This is not unlike
 how we place a phone call to anywhere in the world - we don't go
 asking each network if we can use it; we ask our service provider to
 do so, they ask theirs, and so on, and so on...
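The chained model above can be sketched as a simple recursion (editor's illustration; the domain names, peer map, and the list-of-domains return value standing in for a real per-domain HOLD are all assumptions):

```python
# Sketch of chained reservation: each domain's RM accepts requests only
# from direct peers and recursively reserves toward the destination.
class DomainRM:
    def __init__(self, name, endpoints, peers=None):
        self.name = name
        self.endpoints = set(endpoints)   # endpoints reachable in this domain
        self.peers = peers or []          # directly peering RMs (trusted)

    def reserve_path(self, dst, visited=None):
        """Hold local resources, then recurse to a peer toward dst.
        Returns the chain of domains holding the path, or None (NACK)."""
        visited = (visited or set()) | {self.name}
        segment = [self.name]             # stand-in for a real local HOLD
        if dst in self.endpoints:
            return segment                # destination is local: done
        for peer in self.peers:           # try candidate nexthops in turn
            if peer.name in visited:
                continue                  # avoid looping back
            rest = peer.reserve_path(dst, visited)
            if rest is not None:
                return segment + rest
        return None                       # no peer could reach dst: NACK

# Example: A peers with B, B peers with C; the RB asks only A's RM.
c = DomainRM("C", ["hostZ"])
b = DomainRM("B", [], peers=[c])
a = DomainRM("A", ["hostX"], peers=[b])
# a.reserve_path("hostZ") -> ["A", "B", "C"]
```

Note how the scaling argument shows up directly: C is never contacted by the RB, only by its direct peer B, exactly like the phone-call analogy.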

 The above scenario assumes the RB poses the service request to the RM
 serving the source end of a path. There is a [common?] case where
 the RB is not at the endpoint(s) and does not know of any RMs at the
 endpoint (or in the middle, for that matter). This brings us to
 another assumption I think we must make: an RB only knows its *local*
 network RM. An appropriately designed algorithm should/could forward
 the request to the source-address RM using the same forwarding
 process as the reservation (but crossgrain, toward the source), and
 then the request can be serviced forward normally as described above.
 (This is the "third party" provisioning scenario.) An alternative
 model assumes a "minion" agent at the path endpoints that is owned by
 the end user and knows of its local RM - the minion agent acts as
 proxy for the RB and makes the reservation request to the minion's
 RM. (Got that? :-) I think we *can* assume that the RB knows of
 these minions since they reside at the end points (source or
 destination) at a well-known port.

 It is important to note that this process relies on each network RM
 (not the RB) knowing constrained reachability of all endpoints - not
 unlike current interdomain routing protocols. This allows the RM to
 postulate which "nexthop" network will provide the best path and try
 that first. If the RM knows more than just reachability - i.e. if it
 knows topology - then the RM can select a more specific candidate
 path and, via authorized recursive queries, can reserve the
 resources. Only the RM responsible for a network knows the state and
 availability details associated with the internal network resources,
 and therefore only the local RM can authoritatively and atomically
 reserve the resources in that network.

 The beauty of this process is that from the RB perspective, the RB
 need only ask one RM for the entire end-to-end network path. The RM
 will either return a ticket indicating a path was successfully
 reserved that meets the requested service characteristics, or a NACK
 indicating that the resource was not available for some reason. The
 user must change the requested service parameters somehow before
 trying again - i.e. change the source or destination addr, the start
 time, the capacity, etc.

 As Gigi states, once all application resources are reserved in the
 HOLD state, then all must be CONFIRM'ed, which will lock in the
 reservation. At some delta-t later (which could be 0) there is a
 separate process that causes the reconfiguration of the network
 elements to make the reserved resources available for actual use
 (i.e. the provisioning or signaling process). This process must be
 correlated to a previous reservation, and so the provisioning request
 (separate from the reservation request) must contain some indicator
 that is trusted by the network and indicates which reservation is
 being placed into service (see Leon's work on AAA).
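One simple way to realize such a trusted indicator is a signed ticket; the Python sketch below (editor's illustration in the spirit of the token-based AAA work mentioned above, not Leon's actual design - the shared secret, ticket fields, and HMAC construction are all assumptions) shows the RM issuing a ticket at CONFIRM time and the network verifying it at provisioning time:

```python
import hmac, hashlib

# Sketch: correlate a provisioning request to a prior reservation via a
# token the network trusts. All names here are illustrative.
SECRET = b"shared-domain-secret"   # assumed shared between RM and elements

def issue_ticket(reservation_id):
    """RM signs the reservation id when the reservation is CONFIRM'ed."""
    mac = hmac.new(SECRET, reservation_id.encode(), hashlib.sha256).hexdigest()
    return {"reservation": reservation_id, "mac": mac}

def provision(ticket):
    """The provisioning process presents the ticket; the network verifies
    it before reconfiguring elements for that reservation."""
    expected = hmac.new(SECRET, ticket["reservation"].encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, ticket["mac"])

t = issue_ticket("resv-42")
# provision(t) -> True; a tampered or forged ticket fails verification
```

The design point is the decoupling: reservation and provisioning remain separate requests, possibly delta-t apart, linked only by the ticket.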

 Note that none of the above is predicated on any particular routing
 or signaling protocol... That being said (:-), DRAGON has implemented
 much of this functionality using GMPLS protocols.
 - The DRAGON Network Aware Resource Broker (NARB) is analogous to the
 network RM and performs the path computation, recursively reserving
 the resources along the way. It returns a path reservation in the
 form of an Explicit Route Object (ERO) to the source requester. This
 loose-hop ERO specifies a path consisting of ingress and egress
 points at each network boundary.
 - RSVP then uses this ERO to provision the multi-domain end-to-end
 path.
 - The DRAGON Application Specific Topology "Master" is an agent
 analogous to the RB mentioned above. The AST Master queries all the
 various resource managers (compute nodes, storage, instruments,
 network, etc.) to reserve groups of dependent resources. There is a
 significant protocol exchange defined for ASTs to construct a
 workable physical resource grid for the application.

 What DRAGON has not yet implemented: We have implemented scheduling
 and policy constraints in the traffic engineering database, but we
 have not yet implemented the path computation to use those
 constraints (this is still to come). We have atomic reservations, but
 have not implemented the two-phase commit - though we have long
 recognized it as critical to the bookahead capability and a robust
 integrated resource scheduling capability.

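As a reading aid for the loose-hop ERO described above, here is a toy rendering in Python (editor's illustration, not DRAGON code or the RSVP-TE wire format; the hop names and dictionary fields are invented):

```python
# Toy model of a loose-hop ERO: the path is given as ingress/egress
# points at each network boundary, and signaling expands the detail
# inside each domain. All identifiers below are made up.
ero = [
    {"domain": "A", "ingress": "A-in",  "egress": "A-eg1"},
    {"domain": "B", "ingress": "B-in2", "egress": "B-eg3"},
    {"domain": "C", "ingress": "C-in1", "egress": "dst-host"},
]

def boundary_hops(ero):
    """Flatten the ERO into the ordered boundary points signaling visits."""
    hops = []
    for entry in ero:
        hops.extend([entry["ingress"], entry["egress"]])
    return hops

# boundary_hops(ero) -> ['A-in', 'A-eg1', 'B-in2', 'B-eg3', 'C-in1', 'dst-host']
```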
 Thanks for sticking with me on this ...:-)

===========End of original message text===========