Regions: A New Capability for Networks
Karen R. Sollins
Introduction
Naive pictures of the Internet frequently portray a small collection
of hosts or LAN's connected by a "cloud" of connectivity. The truth is
more complex. The IP-level structure of the Internet is composed from
a large number of constituent networks, each of which differs in some
or all of transmission technologies, routing protocols, administrative
models, security policies, QoS capabilities, pricing mechanisms, and similar
attributes. On top of this, a whole new structure of application-layer
overlays and content distribution networks, equally diverse in the sorts
of ways mentioned above, is rapidly evolving. Virtually any horizontal
slice through the current Internet structure reveals a loosely coupled
federation of separately defined, operated, and managed entities, interconnected
to varying degrees, and often differing drastically in internal requirements
and implementation. Intuitively, it is natural to think of each of these
entities as existing in a region of the network, with each region having
coherent internal technology and policies, and each region managing its
interactions with other regions of the net according to some defined set
of rules and policies.
The thesis of this project is that this basic, general notion of a region
is a powerful tool for managing the increasingly complex demands on the
Internet and its successors, and thus should made into an explicit, first-class
component of the network architecture. We suggest that the many uses of
the region concept share a well-defined set of core ideas and operations,
and that it is therefore useful to separate the concept of "region" from
any specific use of the concept and provide it as an independent, reusable
abstraction. We further suggest that the region abstraction can be implemented
as a concrete software entity, and that doing so will provide protocol
designers and implementers at all levels of the stack with logically structured,
scalable, high-performance access to an important class of functionality.
The Approach
The region is a generalization of a group and a bound in some space.
The two defining concepts of a region are a set of contained members,
which share some common characteristics, and a boundary, which allows
us to capture the notion of actions taken when entering and leaving the
region. Each characteristic defines a set of values defined over an attribute.
As a result, we can view a region as a subspace of the multidimensional
space defined by the set of attribute types used for this region. This
allows for a formal definition of the boundary of a region. In addition
to our formalism, the implementation of the region class permits us to
provide self-adaptation to changes in size or usage patterns, allowing
the same abstraction to apply under widely differing demands on scalability,
performance, and efficiency. We find these ideas useful in many places
in networking to address problems such scaling, distribution, efficiency,
heterogeneity, cost recovery, and many others. We have proposed in this
research to define and develop the region abstraction and explore its
utility as a general mechanism in networking. To do this each student
has studied or is studying an aspect of the region concept.
- Interlayer information control: In this research, Ji Li studied
regions as the basis for communication among regions of differing sorts.
In this case, we considered two types of regions, Autonomous Systems
or ASs, used as the basis for routing at the IP layer in the Internet,
and overlay networks that do their own routing at the application layer.
The question we addressed in this research was whether we could use
information from the lower level regions, ASs, to inform and improve
routing in the higher level regions, or overlay networks. The performance
problem that generally has been an issue for structured or DHT-based
peer-to-peer systems is latency. Such a P2P system initially places
its content even distributed around the network. If we could use the
DHT system to look up information that had been placed more efficiently,
latency could be reduced significantly. The common approach that has
been taken to achieve this is for each P2P node to learn the latency
to each of its peers, initially by use of "ping", and in some cases
now using beacons conveniently placed in the network. By approximating
latency using AS-path (the path or length of the path in terms of ASs)
we find that we have significantly reduced two factors, network traffic
for learning about latency and amount of storage required in the nodes.
The tradeoff is that although we improved performance significantly,
it was not quite as good as with the mechanisms that reside only in
the application layer. We are still convinced by the engineering tradeoff,
that there is significant value in utilizing lower layer information
effectively, and that regions allow us to provide a valuable abstraction
for this.
- Garbage Collection: Since there will be many regions and an
entity may participate in more than one, and additionally the implementation
of a region may be distributed, Clyde Law took on the question of whether
and if so where garbage collection would be necessary. The problem for
which garbage collection seemed most important is within a single region,
specifally, if the implementation of the region is distributed. For
the semantics of some of the region functions to be correct, it will
be important that the distributed components that comprise the infrastructure
of a region be as consistent with each other as possible. In part, that
means that it will be important for each such component to know when
a member of a region has departed from that region, either simply because
it is no longer part of the region or because the element itself no
longer exists. In this thesis, Law explored appropriate algorithms and
built a proof-of-concept implementation.
- Security at the boundaries: One question that we considered
to be particularly important was whether the concept of the boundary
could provide a security perimeter as well as something more abstract.
One of the problems that network and security managers face is that
distributed firewalls are extremely fragile and difficult to manage.
Joshua Baratz took on the question of whether region boundaries might
not provide a more manageable abstraction than distributed firewalls.
A key component of this problem space is how to guarantee that traffic
destined to travel between source and destination can only reach the
destination by passing through an appropriate component of the boundary.
To this end, he built a proof-of-concept implementation and found that
the region abstraction because it can be viewed at the application layer
is significantly easier and therefore more manageable. With the appropriate
certificate requirements and management one can guarantee that traffic
will handled correctly.
- Adaptability of Region Membership in the Face of Anarchy: Rob
Beverly took on the question of adaptation of membership in regions.
The issue is how members might cluster themselves effectively into regions
in order to achieve their tasks with most utility to themselves. The
approach taken concentrates on the fact that in an economically viable
environment it is hard to believe that altruistic superpeers, as proposed
for Gnutella will survive. Instead, we can ask whether an unstructured
system such as Gnutella can self-organize into useful clusters based
exclusively on anarchic evaluations of utility to each peer. The answer
demonstrated in this project was that one could do that, as measured
by widely accepted criteria. As a demonstration and validation of his
work, Beverly collected Gnutella data, considered the utility of a peering
relationship between two nodes based on their mutual interests as identified
by which files they already contained (through analysis of files names,
where available) and their queries.
- Region Boundaries: As suggested above, we have explored particular
uses of region boundaries, but Neha Singh set out to study them more
generally. The questions one needs to ask here are 1) how does one figure
out which region to visit, 2) how does it find the regions, and 3) what
happens at region boundaries that is distinctive and important. For
purposes of this study Singh worked in a mobile agent system in which
agents develop an itinerary based on their objectives, in order to determine
which regions may provide the services they need to reach their objectives.
We chose an agent system as a generalization of other more limited forms
of mobile entities that might be moving through the region system, such
as packets, other data, or pure code such as applets. In this case,
each member of a region advertises to the region its functions constraints.
In turn the region summarizes this information as well as its own constraints,
and advertises them to a region directory. Thus, the information made
widely available is only approximate for each region. The agent then
is given its objectives, to be compared with the contents of the region
directory to determine candidate regions to visit and an ordering on
them. By this means an itinerary is defined for the agent in terms of
URNs or globally unique identifiers for the regions, each of which can
be translated into an IP address when needed. At any time during the
travels of the agent this itinerary can be modified or updated. With
the itinerary formed the agent moves to the entry point of the first
region on its itinerary. There it determines whether an exact match
can be made within the region, as well as any determination about authentication,
authorization and payment policies. The agent may make one or a number
of stops within the region and may recompute its itinerary at any time.
It will then move on to another region, possibly stopping again at the
region boundary for final reconciliation wiht the region on its way
out. This study has explored how to support scaling (through nested
advertising and hence nested decision-making about suitability), as
well as the functions of policy and pricing models at and within the
boundaries of regions. The thesis will be completed in May 2005.
- Supporting Efficient Functions in Ad Hoc Networks: Apostolos
Fertis is exploring the use of regions to improve the performance of
functions in ad hoc wireless networks. One of the key features of an
ad hoc wireless network is that it can reorganize itself dynamically.
Typically these networks are organized into clusters in order to reduce
either network transmissions or power consumption, which are closely
related. In this research an additional criterion is used, clustering
in order to improve performance of functions over subsets or regions
of the ad hoc network. The first step is to determine an efficient partitioning
of the function, in order that more isolated subsets of nodes can be
accessed. Thus, for example, an averaging function can be partitioned
into a series of paired functions for the count and sum of a subset.
These can be merged, by summing the sums and summing the counts, up
through the nested regions until the root of the nesting is reached,
at which point the sum of the sums can be divided by the sum of the
counts. A function such as this can be partitioned to any desired depth,
down to pairs of nodes. In order to create the structure of subregions,
Fertis has devised an algorithm which we believe has significantly improved
performance characteristics to those in existence and can be performaed
in a distributed fashion. In addition, he intends to apply several addition
performance improvements. This thesis will be completed by June, 2005.
Dissemination of Results
All these projects have resulted in or will result in theses submitted
to the Department of Electrical Engineering and Computer Science at MIT.
In addition, there have been a number of published papers on these and
related topics. See http://krs.lcs.mit.edu/regions for the theses and
papers. For a more complete introduction to regions, see
[1]
References:
[1] Karen R. Sollins. Designing for Scale and Differentiation.
In Proc. ACM SIGCOMM 2003 Workshop on Future Directions in Network,
Karlsruhe, Germany, August 2003.
|