Final Project Presentation Schedule
Thursday, April 29, 2004
Note: Presentations will be strictly limited to 6.5 minutes each and should begin on time. There will be two projectors setup, so the next group should be done setting up well before their slot begins. We'll cut you off if you run over.
1:05 - 1:11 Prototyping a DHT Oriented Architecture
Maxwell Krohn, Jeremy Stribling and Michael Walfish
We attempt to decouple identity from location in Internet hosts.
In our proposal, hosts receive flat identifiers in a large
and sparse namespace, and an Internet-wide distributed hash table
(DHT) acts as a resolver by mapping these flat identifiers
to IP addresses in analogy with today's DNS. Unlike
DNS names, the source and destination identifiers appear in
packets, in a shim layer after the IP header. Our proposal
would change all host software but leave the core
routers untouched. The advantages of our proposal are: (1) the
benefits of decoupling location from identity and (2) enhanced
middlebox functions.
1:12 - 1:18 Automated Testing of Distributed Systems
Nathan Boy, Jared Casper, Carlos Pacheco, Amy Williams
We present a technique to test servers that interact with clients using
the Sun RPC protocol. The technique requires the user to provide two things:
a list of RPC calls for the server being tested, and a set of invariants
that are required to hold over the RPC communications trace. The tester
works by generating random sequences of RPC calls and checking that the
invariants hold over the tests' traces. If an invariant is violated,
the violating sequence of RPC calls is reported to the user. To illustrate
the validity of our system, we use it to test a block server and a lock
server.
1:19 - 1:25 Global Optimization by Building Better Peer Neighbourhoods in BitTorrent
Asfandyar Qureshi
BitTorrent [1] is popular file-sharing tool, accouting for a surprisingly large proportion of Internet traffic. Files are divided into fragments and transferred out of order among nodes
trying to download them. Individual nodes penalize and reward other nodes, adjacent in the BitTorrent overlay, depending on how willing they are to share data. This incentives scheme and the subsequent enforcement of sharing is credited with making BitTorrent outperform other contemporary file-sharing systems.
Nonetheless, BitTorrent builds its overlays by
selecting random peers for newly joining nodes,
a fact that can seriously handicap both
individual performance and waste global network resources.
This paper investigates schemes which
attempt to build a more intelligent overlay network,
particularly using network delays to select
overlay peers that are close by in the underlying
network. We evaluate our techniques both from
the network's perspective (resources used) and
from the individual's perspective (average time
to complete a download). In both cases, our
results show that better peer selection can lead to
improved performance with no other changes to
the BitTorrent protocol.
1:26 - 1:32 Portable Reputations with EgoSphere
Keith Bonawitz, Chaitra Chandrasekhar, Rui Viana
Many online services require some form of trust between users trust that a seller will deliver for goods as advertised, trust that an author's thoughts are worth the time spent on reading them. To accommodate an internet community where users are constantly interacting with strangers, online services often construct proprietary reputation management systems for their community, with the side effect of locking users into that service if they wish to maintain their reputation. In contrast, this paper outlines EgoSphere, a system for portable Internet reputations, so that reputations built on one service can be used elsewhere. EgoSphere hinges on the use correlational statistics to automatically project reputations from one service to those services which have related reputations. To achieve these goals, EgoSphere must gather reputation information from webservices. EgoSphere avoids the unreasonable expection that all webservices to publish their reputation databases, while also avoiding violating robots.txt restrictions and incurring additional website load, by gathering its reputation data with using a distributed passive robot system which simulates the function of a standard webcrawling robot by using webproxies on users' computers to analyze the responses to standard webservice requests. This paper outlines the design and proof-of-concept implementation of EgoSphere, targeted specifically at providing portable reputations for bulletin-board style internet services.
1:33 - 1:39 DIBS: Distributed Backup for Local Area Networks
Eugene Hsu, Jeff Mellen and Pallavi Naresh
DIBS (Distributed Intranet Backup System) is a distributed file backup system for computers in local area networks that lack dedicated or secondary data storage. DIBS exploits the unused disk space among a set of connected machines and the high speed of local area networks to provide a transparent peer-to-peer backup solution. Each node in the DIBS system contains a module that manages its list of critical files and the files it stores for other nodes, a server that responds to requests from other nodes, and a client interface to perform backup operations. Each node is responsible for its own files, and, given adequate space on the network, that it is replicated on at least one additional machine. The system is designed to provide complete recovery for a crashed node, recovery from transient machine failures, protection for mistaken file overwrites and deletion, and flexibility in the face of changing network-wide storage conditions.
1:40 - 1:46 Peer to Peer Multicast for Live Streaming Video
Maya Dobuzhskaya, Rose Liu, Jim Roewe and Nidhi Sharma
The high bandwidth requirements necessary for serving live streaming video and audio content restrict normal Internet users on cable or dialup modems from distributing such content to multiple viewers. With the growing popularity of hosting live video content, this high bandwidth requirement is becoming a widespread problem. Unlike with pre-recorded video, people viewing live video all receive the same content at approximately the same time. Our system exploits this temporal locality in viewer streams to redistribute the bandwidth load on the server among the clients. We use a peer-to-peer multicast scheme of content distribution to dramatically reduce the server's bandwidth requirement, enabling low bandwidth home users to serve high quality live video to 50-100 clients.
1:47 - 1:53 Robust BitTorrent
Nirav Dave, Albert Huang, Yuan Shen, Edmund Wong
BitTorrent was designed to distribute a single set of data to a large
set of nodes quickly. To accomplish this, a client downloads the file
from both nodes that have the entire file (the seed) and other
downloaders (peers) who have parts of the file. A centralized tracker is
used to find other peers. As a result, this tracker is vulnerable to
DDoS attacks, since this tracker is statically specified.
Our goal is to make the BitTorrent system more robust against
trackerdirected attacks. We intend to accomplish this by adding the
ability to dynamically designate new trackers should one fail, and by
replicating the tracker k times in order to guarantee a graceful
rollover.
Much like Gnutella, nodes connect to the network by means of searching a
list of previously observed nodes on the network. Once on the general
network, nodes can initiate a search, which indirectly queries a
distributed hash table (DHT) to obtain a tracker for the desired file.
Through this, a client can connect to the BitTorrent system to download
the desired file.
1:54 - 2:00 A Library for Parallel Genetic Algorithms
Ariel Rideout, Ryan Williams and David Ziegler
Genetic Algorithms are favored for optimization problems because they lend themselves to both continuous and discrete combinatorial problems and are less susceptible to getting 'stuck' at local optima in comparison to gradient search methods. Unfortunately, GAs are typically computationally expensive. A distributed toolkit is offered which will allow GAs to be processed across multiple machines. The toolkit ensures an equivalent result as when the problem is approached locally, but significantly diminishes the required computation time. An analysis of the system's projected performance is described.
2:01 - 2:07 WebTorrent: a BitTorrent Extension for High Availability Servers
Gary Sivek, Steven Sivek, Jonathan Wolfe, Michael Zhivich
Achieving content high-availability is one of the most important goals
of a webserver system.In order to achieve high-availability in the traditional server-client setting, the server must have the bandwidth and the hardware needed to handle any peak load that might occur.
However, this is a very costly and rarely practical solution, especially for servers that are subjected to the Slashdotting effect. We propose a WebTorrent system that is based on BitTorrent and willleverage the resources of the clients to help the server make the content more available. Such a system will alleviate the load on the server and reduce the client download times.
2:08 - 2:14 P2P Objects
Wenyan Dong, Sonam Dorji, Sachin Katti, Dimitris Vyzovitis
In a peer to peer system autonomous systems pool their resources (e.g.
files, storage, compute cycles) in order to inexpensively handle tasks
that would normally require large costly servers. Distributed applications
are then built on top of this which take advantage of the resource scaling
(music file sharing, co-operative computation of complex tasks etc). In
most such distributed systems communication is essentially one to many.
Retrieving a particular object involves finding the set of nodes who hold
that object, querying a set of nodes holding particular types of resources
means communicating with all those nodes and so on. Thus the problems boil
down to efficient object location and grouping of objects based on objects
types. Thus a object based multicast primitive is a fundamental building
block for almost every distributed application.
DHTs have recently emerged as an elegant and scalable building block for
P2P systems. DHTs provide a get and put interface for keys, which provides
applications with an efficient routing substrate. But they are designed
for exact lookups i.e. given a key a DHT can efficiently find the node
responsible for that key. Combining this with a mechanism to efficiently
group nodes based on the type of resources they hold and enabling nodes to
communicate with these groups using multicast, a powerful and flexible API
for application developers can be provided.
In this paper we present an efficient way to build multicast trees based
on object types and query them. The multicast group formation algorithm is
scalable and has the inherent strengths a DHT has: low control overhead,
logarithmic bound on routing etc. We intend to provide an
implementation and use it to build a querying mechanism to locate files
efficiently based on complex predicates and not just keywords.
2:15 - 2:21 Advanced Backup Service
Joseph Cooley, Alen Peacock, Chris Taylor
Many computer users maintain machines with no backup scheme for protecting
their data in the event of a loss or failure. At the same time, many
user's computers are likely to contain spare disk space and unused
networking resources. We present the Apportioned Backup System (ABS),
which provides a reliable collaborative backup resource by leveraging
independent, distributed resources. With ABS, procuring and maintaining
specialized backup hardware is unnecessary. ABS makes efficient use of
network and storage resources through use of coding techniques, convergent
encryption, and difference versioning. The system also painlessly
accommodates dynamic expansion of system compute, storage, and network
resources, and is tolerant of catastrophic node failures.
2:22 - 2:28 An Instant Messager Application based on OpenHash
Tamara Yu, Ryan Manuel, Joanna Liang, Timothy Chan
We propose an OpenHash-based architecture that avoids heavyweight web service tools and that is suitable for building a lightweight instant messaging (IM) application. OpenHash provides the backbone to a distributed storage medium using a simple get() / put() interface. Our design emphasizes scalability and reliability in order to achieve an application that is competitive with existing commercial IM applications.