GLIF: Re: [GLIF controlplane] RE: Network Control Architecture

Subject	Re: [GLIF controlplane] RE: Network Control Architecture
From	Harvey Newman <Harvey.Newman@xxxxxxx>
Date	Sat, 28 Apr 2007 08:46:56 -0700

Gigi Karmous-Edwards wrote:

Harvey and All,
I like your list of points on service-oriented architecture. I think the framework presented provides a strategy to accomplish each of your stated points. The first two are really policies of the resources, which need to be defined by the resource manager and honored by the resource broker. The rest of your points relate mainly to the monitoring system's ability to check for SLA violations, topology discovery and performance, and the feedback loop between the monitored information and the resource allocation.

Yes, this is part of it.
Many of the characteristics are required to ensure the
degree of scalability needed.
The ability to measure end-to-end performance enables
one to devise applications that use the network as needed,
including simply filling it up with TCP traffic, with a moderate
number of flows.

Regarding ML vs. PerfSONAR, I think they will both co-exist and we must assure interoperation between them for global interoperability to work.

Yes, this is true.
However, the relationship between them will depend on what can and
cannot be done in each. I don't think PerfSONAR will probe and
characterize end-systems, for example, nor evaluate
performance on behalf of VOs in aggregate and make
long-term adjustments.

In my opinion, based on the use of MonALISA in the Enlightened Computing project, it works very well, and we were able to request the addition of some features for checking lightpath connectivity on our testbed (they were implemented by the ML team). I do think ML has an extremely well developed architecture and implementation and is feature rich. However, we found that moving forward, the new features we require will need to be implemented by ourselves rather than having others (the ML team) do it for us, due to time constraints.

We have areas we do not intend to develop, such as GMPLS and
interdomain path building using GMPLS.

Otherwise we would have to see what needs to be done and who could
effectively do it faster. Our first impression is that, unless there
is a truly large set of developers with a very high level of expertise
and familiarity with such extensible, loosely coupled, intelligent
real-time systems, it is not possible to write efficient and
fault-free services of this type in a short time.

If there are a small number of capable developers, then they can come
on board by signing over IP rights for this to Caltech, and then work
on APIs to allow (almost) anything that is needed further to be done
by the community. As below, such developments would need to be tested
and qualified before being included in any release.

We therefore concluded that an open source monitoring solution will be necessary for us. Within the GLIF community, the solution for monitoring will be one that is constantly evolving to meet the needs of the emerging applications, and emerging network architectures and technologies. An open source solution will allow for many teams to develop rich feature sets which then can be shared by others for their specific needs. It will be really helpful if ML can be made to be open source for the GLIF community.

We find we need to restructure MonALISA to shield only what is
necessary in the core. ML underlies other systems, notably EVO, which
is being commercialized for use outside the R&E community.
We would need resources to establish and
maintain ML in a(n almost entirely) open source model. Also, a
system as capable as this is not trivial to develop. We know from
experience that doing so in an effective way will require supervision
and training, qualification, organized operations, and filtering and
testing potential new developments before including them in the next
release (or not), which is again a matter of sufficient resources, to
maintain a somewhat larger and sufficiently expert team.

Many of the services the community would like to deploy don't really
concern the core or main services, but work at the edges, or on
particular service-components (e.g. GMPLS as one of the technologies
in Layer 1-2-3 path building).

Edge services are effectively done in the APMon client and/or the
LISA agent, and components can be interfaced to the MonALISA services.

Iosif may wish to comment further.

Best regards
Harvey

Kind regards,
Gigi

--------------------------------------------

Gigi Karmous-Edwards
Principal Scientist
Advanced Technology Group
http://www.mcnc.org
MCNC RTP, NC, USA
+1 919-248-4121
gigi@xxxxxxxx
--------------------------------------------



Harvey Newman wrote:
This is a limited view that will run into the same problems as are
well-known from RSVP. One will never get to reserve a multi-domain
path this way.

Operational steps in a services-oriented architecture
(reservations are stateful, time-dependent, and responsive to
capability to use the allocated resource):

(1) AAA, with priority schemes and policy expressed by each VO.
(2) Inter-VO allocations according to quotas, coupled to tracking of
    what has been used during a specified time period.
(3) Service to verify end-system capability and load as being
    consistent with the request.
(4) Agents to build the path and verify its state (up, down,
    which segment(s) are down or impaired); also agents to
    verify end-system capability (hardware, system and kernel
    config., network interface and settings); verification
    of end-to-end capability with an active probe (viz.
    FDT); build or tear down circuits in parallel in a
    time < the TCP timeout.
(5) Tracking of capability (if relevant, as in large scale data
    transfer).
(6) Adjustment of channel capability if allowed, according to
    performance end-to-end, for example with LCAS
    [allocation of a non-adjustable channel takes longer,
    and becomes an economic question].
(7) Adjustments driven by (a) entry of higher priority
    allocation-requests (these could affect many or even
    all channels), (b) re-routing of certain flows if better
    paths become available, or (c) optimization of workflow
    according to deadline scheduling for certain flows.

Except for the higher-level "strategic" parts above (policy and
quotas; which need to come from the VOs), many of the technical pieces
above exist, and will be hard to match.
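
As an illustration of step (2) only, here is a minimal sketch of inter-VO allocation checked against a quota, coupled to tracking of what was used during a recent time period. Everything in it is an assumption made for the example: the `QuotaLedger` class, the VO names, the abstract "units" and the sliding-window accounting are hypothetical and are not taken from MonALISA or any other system discussed in this thread.

```python
from collections import defaultdict


class QuotaLedger:
    """Hypothetical per-VO quota check over a sliding accounting period."""

    def __init__(self, quotas, period):
        self.quotas = dict(quotas)        # VO -> units allowed per period
        self.period = period              # period length, same unit as `now`
        self.usage = defaultdict(list)    # VO -> [(timestamp, units)]

    def used(self, vo, now):
        """Units the VO was granted within the last `period`."""
        start = now - self.period
        return sum(units for t, units in self.usage[vo] if t > start)

    def request(self, vo, units, now):
        """Grant the allocation only if it fits the VO's remaining quota."""
        if self.used(vo, now) + units > self.quotas.get(vo, 0):
            return False
        self.usage[vo].append((now, units))
        return True
```

A ledger created as `QuotaLedger({"VO-1": 100, "VO-2": 50}, period=24)` would grant `request("VO-1", 80, now=0)`, refuse a further `request("VO-1", 30, now=1)`, and grant the same request again at `now=25`, once the earlier usage has aged out of the period. Priorities and policy (step 1) and preemption (step 7) are deliberately left out.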

Harvey



Steve Thorpe wrote:
Hello Bert, everyone,
The point Bert made "...if the pre-reservation of resources is not an atomic action..." is very important.
My belief is the pre-reservation of resources, or Phase 1 of a 2-phase commit protocol, *must* be atomic. That is, there must be a guarantee that at most one requestor will ever be granted a pre-reservation of a given resource. Then, the requestor should come back with a subsequent "Yes, commit the pre-reservation", or "No, I release the pre-reservation". In the case where the requestor does not come back within a certain amount of time, then the pre-reservation could expire and some other requestor could then begin the 2-phase commit process on the given resource.
There may be situations where a resource broker cannot get the desired resource reservation(s) booked. But I don't see deadlocking here, where both resources can *never* be booked. Unless, of course, a resource broker books them once and is allowed to hold on to them forever.
The atomicity of the pre-reservation (phase 1) stage of the 2-phase commit process is a very critical part for this to work.
Steve
PS I have also added Jon MacLaren to this thread, as I'm not sure he's on the GLIF email list(s).
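
The atomic pre-reservation with expiry described in Steve's message could be sketched roughly as follows. This is a minimal single-process illustration, not an implementation from the thread: the `ResourceManager` class, its method names, the `ttl` parameter and the injectable `clock` are all hypothetical, and a lock stands in for whatever mechanism a real resource manager would use to make "check and hold" one indivisible step.

```python
import threading
import time


class ResourceManager:
    """Hypothetical manager granting at most one pre-reservation per resource."""

    def __init__(self, clock=time.monotonic):
        self._lock = threading.Lock()   # makes check-and-hold one atomic step
        self._holds = {}                # resource -> (requester, expiry time)
        self._booked = {}               # resource -> requester (committed)
        self._clock = clock

    def prereserve(self, resource, requester, ttl=30.0):
        """Phase 1: atomically hold the resource, or refuse."""
        with self._lock:
            if resource in self._booked:
                return False
            hold = self._holds.get(resource)
            if hold is not None and hold[1] > self._clock():
                return False            # an unexpired hold already exists
            self._holds[resource] = (requester, self._clock() + ttl)
            return True

    def commit(self, resource, requester):
        """Phase 2 'yes': turn a still-valid hold into a booking."""
        with self._lock:
            hold = self._holds.get(resource)
            if hold is None or hold[0] != requester or hold[1] <= self._clock():
                return False
            del self._holds[resource]
            self._booked[resource] = requester
            return True

    def release(self, resource, requester):
        """Phase 2 'no': give the hold back so others can try."""
        with self._lock:
            hold = self._holds.get(resource)
            if hold is not None and hold[0] == requester:
                del self._holds[resource]
                return True
            return False
```

With this shape, a second requester asking for a held resource is refused immediately, and can succeed only after the first requester releases the hold or lets it expire, which is the expiry behaviour the message argues for.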
Bert Andree wrote:
Hi Gigi,

What exactly do you mean by one RB per request?
Suppose there are two independent RBs, RB-A and RB-B, and two resources, RS-1 and RS-2. Suppose that there is a request to RB-A to book both resources and a request to RB-B to do the same. Now, if the pre-reservation of resources is not an atomic action, two different strategies may introduce specific problems.
Strategy 1: an availability request does not reserve the resource:
RB-A asks for RS-1 (available)
RB-B asks for RS-2 (available)
RB-A asks for RS-2 (available)
RB-B asks for RS-1 (available)

RB-A confirms RS-1 (success)
RB-B confirms RS-2 (success)
RB-A confirms RS-2 (fail)
RB-B confirms RS-1 (fail)
The obvious solution would be to free all resources and try again. In complex systems there is a fair chance that both resources can never be booked (deadlock).
Strategy 2: an availability request reserves the resource:
RB-A asks for RS-1 (available)
RB-B asks for RS-2 (available)
RB-A asks for RS-2 (not available)
RB-B asks for RS-1 (not available)
RB-A and RB-B free all resources and try again. In complex systems there is a fair chance that both resources can never be booked (deadlock).
The only way to prevent this is to have some queuing of requests, and even then "individual starvation" (e.g. RB-A can never book any resources) is possible in complex systems.
Best regards,
Bert
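
The two interleavings in Bert's message can be replayed with a small deterministic simulation. This is only a sketch of the scenario as written: the function names and the `ORDER` list are hypothetical, and the code replays one fixed ordering rather than modelling real concurrency, retries, or queuing.

```python
def run_strategy_1(order):
    """Availability checks do not reserve; only 'confirm' books."""
    booked = {}                       # resource -> broker
    log = []
    for broker, resource in order:    # every check still sees 'available'
        log.append((broker, "asks", resource, resource not in booked))
    for broker, resource in order:    # confirms: first one per resource wins
        ok = resource not in booked
        if ok:
            booked[resource] = broker
        log.append((broker, "confirms", resource, ok))
    return booked, log


def run_strategy_2(order):
    """Availability checks reserve the resource immediately."""
    held = {}                         # resource -> broker
    log = []
    for broker, resource in order:
        ok = resource not in held
        if ok:
            held[resource] = broker
        log.append((broker, "asks", resource, ok))
    return held, log


def fully_booked(brokers, resources, owner):
    """Brokers that ended up owning every resource they asked for."""
    return [b for b in brokers if all(owner.get(r) == b for r in resources)]


# The request order used in both strategies above.
ORDER = [("RB-A", "RS-1"), ("RB-B", "RS-2"), ("RB-A", "RS-2"), ("RB-B", "RS-1")]
```

Under either strategy this ordering leaves RB-A owning only RS-1 and RB-B owning only RS-2, so `fully_booked(["RB-A", "RB-B"], ["RS-1", "RS-2"], owner)` is empty and both brokers have to back off and retry.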

Gigi Karmous-Edwards wrote:
Hi Admela,
I agree, there are two phases: 1) check availability from the xRMs, and 2) if all xRMs give an ack, then go to the second phase of commit; 2') if one or more xRMs give a nack, then do not proceed to the phase-two commit. In the architecture sent out, the responsibility of coordinating and administering the two phases is in ONE RB per request. Each xRM will rely on the RB to tell it whether to proceed to a commit or not. If an xRM gets a commit from an RB, it then becomes the xRM's responsibility to make the reservation and allocation in the actual resources. I think if, for example, RB-A talks to an xRM in domain "B", then it may be the responsibility of xRM-B to tell its own RB-B of its interaction with RB-A. Is this in line with your thoughts?
Gigi
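
The division of labour described in Gigi's message, with one RB coordinating both phases for a request and each xRM performing the actual allocation, might look roughly like this. The `XRM` class, the `broker_request` function and the resource names in the usage note are hypothetical, chosen only to illustrate the ack/nack flow.

```python
class XRM:
    """Hypothetical resource manager for one domain."""

    def __init__(self, name, free):
        self.name = name
        self.free = set(free)      # resources currently available
        self.allocated = set()     # resources committed on an RB's instruction

    def check(self, resource):
        """Phase 1: ack (True) if the resource is available, else nack."""
        return resource in self.free

    def commit(self, resource):
        """Phase 2: the xRM itself performs the reservation/allocation."""
        self.free.remove(resource)
        self.allocated.add(resource)


def broker_request(request):
    """One RB handles one request: a list of (xrm, resource) pairs."""
    acks = [xrm.check(resource) for xrm, resource in request]
    if not all(acks):
        return False               # 2': at least one nack, no commit is sent
    for xrm, resource in request:
        xrm.commit(resource)       # 2: all acks, tell every xRM to commit
    return True
```

For example, with `a = XRM("xRM-A", {"lambda-1"})` and `b = XRM("xRM-B", {"lambda-7"})`, the call `broker_request([(a, "lambda-1"), (b, "lambda-7")])` commits on both xRMs, while any request containing an unavailable resource commits nothing. Note that the phase 1 check in this sketch does not hold the resource, so with several RBs it is exposed to the race discussed elsewhere in this thread.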
