Monday, April 14, 2008

Google Summer of Code

Google Summer of Code 2008 is on! Over the past three years, the program has brought together over 1500 students and 2000 mentors from 90 countries worldwide, all for the love of code. We look forward to welcoming more new contributors and projects this year.


** Globus: Google Summer of Code 2008 Ideas

New execution and data transfer providers

Globus project: Swift

Description: The Java CoG Kit provides an abstraction for process execution and file transfer (for example, local execution, local filesystem copies, ssh/scp, GridFTP, GRAM2, GRAM4, and direct submission to the PBS scheduler). Higher-level applications such as Swift use these execution and transfer providers to move data to execution sites and to run applications there without needing to know how that execution and transfer actually happens. An interesting project would be to implement a provider for an existing execution or transfer mechanism so that it can be used from the CoG Kit.

Requires: Decent Java programming skills and a favourite execution or data transfer mechanism.


Mentor: Ben Clifford
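
To make the provider idea a little more concrete, here is a rough sketch of the pattern the description above refers to: a single execution/transfer interface with interchangeable back ends, so the caller never needs to care which mechanism does the work. It is written in C purely for illustration and is not the Java CoG Kit's actual (Java) API; the names provider, execute and transfer, and the host example.org, are assumptions made up for the example.

    /* Illustrative sketch only: these names are hypothetical and do not
     * correspond to the Java CoG Kit API, which is written in Java. */
    #include <stdio.h>
    #include <stdlib.h>

    /* A "provider" bundles an execution and a transfer mechanism behind
     * one interface, so callers need not know how the work is done. */
    typedef struct {
        const char *name;
        int (*execute)(const char *command);                /* run a job   */
        int (*transfer)(const char *src, const char *dst);  /* move a file */
    } provider;

    /* Local provider: run commands and copy files on this machine. */
    static int local_execute(const char *command) {
        return system(command);
    }
    static int local_transfer(const char *src, const char *dst) {
        char buf[1024];
        snprintf(buf, sizeof buf, "cp %s %s", src, dst);
        return system(buf);
    }

    /* ssh/scp provider: the same interface, but a remote mechanism.
     * The host name example.org is a placeholder. */
    static int ssh_execute(const char *command) {
        char buf[1024];
        snprintf(buf, sizeof buf, "ssh example.org %s", command);
        return system(buf);
    }
    static int scp_transfer(const char *src, const char *dst) {
        char buf[1024];
        snprintf(buf, sizeof buf, "scp %s example.org:%s", src, dst);
        return system(buf);
    }

    int main(void) {
        provider providers[] = {
            { "local", local_execute, local_transfer },
            { "ssh",   ssh_execute,   scp_transfer   },
        };

        /* A higher-level tool (Swift, in the real project) simply picks a
         * provider and calls the same two operations regardless of back end. */
        provider *p = &providers[0];
        p->transfer("input.dat", "/tmp/input.dat");
        p->execute("wc -l /tmp/input.dat");
        return 0;
    }

Writing a new provider then amounts to supplying another pair of functions for whatever execution or transfer mechanism is of interest, without touching the code that calls them.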


Integration of GridFTP with the FreeLoader storage system

Globus project: GridFTP

Description: GridFTP is a high-performance, secure, reliable data transfer protocol optimized for high-bandwidth wide-area networks. It is based on the Internet FTP protocol and defines extensions for high-performance operation and security. Striped data transfer (also known as cluster-to-cluster data transfer) is a key feature that uses multiple CPUs and NICs to achieve higher performance. In striped mode, however, GridFTP assumes the support of a high-performance parallel file system, a relatively expensive resource. FreeLoader is a storage system that aggregates the idle storage space of workstations connected to a local area network into a high-performance data store. FreeLoader breaks files into chunks and distributes these chunks across the storage nodes, which accelerates reads and writes because they benefit from parallel access to multiple disks. This project aims to integrate GridFTP and FreeLoader to reduce the cost and increase the performance of GridFTP deployments.

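As a rough illustration of the FreeLoader side of this project (this is not FreeLoader or GridFTP code, and the chunk size and node count are assumed values), the sketch below splits a file into fixed-size chunks and assigns them round-robin to storage nodes; that distribution is what allows reads and writes to proceed against several disks in parallel.

    /* Sketch of FreeLoader-style striping: split a file into fixed-size
     * chunks and assign them round-robin to storage nodes.  The parameters
     * are hypothetical; this is not actual FreeLoader or GridFTP code. */
    #include <stdio.h>

    #define CHUNK_SIZE (1 << 20)   /* 1 MiB chunks (assumed) */
    #define NUM_NODES  4           /* storage nodes in the pool (assumed) */

    int main(int argc, char **argv) {
        if (argc != 2) {
            fprintf(stderr, "usage: %s <file>\n", argv[0]);
            return 1;
        }

        FILE *f = fopen(argv[1], "rb");
        if (!f) { perror("fopen"); return 1; }

        static char buf[CHUNK_SIZE];
        size_t n;
        long chunk = 0;
        while ((n = fread(buf, 1, sizeof buf, f)) > 0) {
            /* In FreeLoader the chunk would now be shipped to a storage
             * node; here we only report the assignment. */
            printf("chunk %ld (%zu bytes) -> node %ld\n",
                   chunk, n, chunk % NUM_NODES);
            chunk++;
        }
        fclose(f);
        return 0;
    }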


** Project Ideas from the Ohio Supercomputer Center



  1. Improved scalability in pbsdcp scatter implementation
    Mentor: Troy Baer
    Programming Language(s): Perl, C with MPI
    License: GPL

    pbsdcp is a distributed copy command for PBS and TORQUE batch environments that is part of OSC's pbstools package. It is used to copy files between shared directories (e.g. NFS home directories) and local storage on a set of compute nodes (e.g. /tmp). It has two modes of operation: scatter, in which files in a shared directory are copied into the local file systems on each of the compute nodes; and gather, in which files in the local file systems on each of the compute nodes are collected into a shared directory.

    The scatter mode in pbsdcp is currently implemented in a rather naive fashion: for each node, it forks an rcp of the source files with a destination directory on that node's local storage. This means that the amount of data that must be transferred from the shared storage scales linearly with the number of nodes. We would like to replace that implementation with something more scalable, such as a tree-based or store-and-forward distribution scheme; a rough sketch of the tree-based approach is given below. Moreover, we would like this to use MPI for communication between nodes if possible, so that the high-performance InfiniBand and Myrinet networks in our (and similar) clusters are used for as much of the data transfer as possible.
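
    As a starting point, the sketch below shows the general shape of an MPI-based distribution (this is not pbsdcp itself, and the file paths are placeholders): rank 0 reads the source file from shared storage once, MPI_Bcast (which MPI libraries typically implement over a tree) delivers it to every rank, and each rank writes a node-local copy. A real replacement would of course have to handle many files, very large files, and errors.

      /* Minimal sketch of a tree-style scatter with MPI; not the actual
       * pbsdcp code.  Rank 0 reads the source file and MPI_Bcast, which
       * MPI libraries typically implement as a tree, delivers it to all
       * ranks, each of which writes a copy into node-local storage. */
      #include <mpi.h>
      #include <stdio.h>
      #include <stdlib.h>

      int main(int argc, char **argv) {
          MPI_Init(&argc, &argv);
          int rank;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          /* Placeholder paths: a file in a shared directory and its
           * destination on node-local storage. */
          const char *src = "/home/user/input.dat";
          const char *dst = "/tmp/input.dat";

          long size = 0;
          char *data = NULL;

          if (rank == 0) {            /* only rank 0 touches shared storage */
              FILE *f = fopen(src, "rb");
              if (!f) MPI_Abort(MPI_COMM_WORLD, 1);
              fseek(f, 0, SEEK_END);
              size = ftell(f);
              fseek(f, 0, SEEK_SET);
              data = malloc(size);
              if (fread(data, 1, (size_t)size, f) != (size_t)size)
                  MPI_Abort(MPI_COMM_WORLD, 1);
              fclose(f);
          }

          /* Broadcast the size, then the contents, over MPI's tree. */
          MPI_Bcast(&size, 1, MPI_LONG, 0, MPI_COMM_WORLD);
          if (rank != 0) data = malloc(size);
          MPI_Bcast(data, (int)size, MPI_BYTE, 0, MPI_COMM_WORLD);

          /* Every rank writes its copy to local storage. */
          FILE *out = fopen(dst, "wb");
          if (out) { fwrite(data, 1, (size_t)size, out); fclose(out); }

          free(data);
          MPI_Finalize();
          return 0;
      }

    Launched across the allocated nodes (for example with mpiexec), this performs a single read from shared storage regardless of node count.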

  2. Improved scalability in all
    Mentor: Rick Mohr
    Programming Language(s): C
    License: GPL

    all is a distributed shell command, built on top of rsh, that is used by OSC and other sites. It allows commands to be run on either all or an arbitrary subset of the nodes in a cluster, either sequentially or in parallel.

    The parallel mode of all currently has a scalability problem on clusters larger than about 200 nodes. Because all uses rsh, and rsh wants to use privileged ports (i.e. port numbers below 1024), parallel executions of all run out of the necessary ports at node counts above 200 or so. One solution to this problem would be "chunking" or "batching"; that is, starting up at most a fixed number (say 128) of rsh connections and starting more only once the first few rshes have completed. A sketch of this idea appears below. (Similar logic can be seen in OSC's parallel command processor.)

    Alternatively, a project to add some of all's features, such as its relatively simple syntax and PBS/TORQUE integration, to another distributed shell command such as pdsh would also be considered.
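
    For the chunking/batching idea described above, here is a minimal sketch (this is not the actual all code; the node names, node count, the cap of 128 concurrent connections and the uptime command are all placeholders): never have more than a fixed number of rsh children alive at once, and start another only when one finishes, so that the supply of privileged ports is never exhausted.

      /* Sketch of "batching" remote shell connections; not the actual all
       * code.  At most MAX_ACTIVE rsh processes run at once, so the pool
       * of privileged ports is not exhausted on large clusters. */
      #include <stdio.h>
      #include <sys/types.h>
      #include <sys/wait.h>
      #include <unistd.h>

      #define MAX_ACTIVE 128   /* assumed cap on concurrent rsh connections */
      #define NUM_NODES  400   /* pretend 400-node cluster (placeholder)    */

      int main(void) {
          const char *command = "uptime";   /* placeholder command */
          char node[64];
          int active = 0;

          for (int i = 0; i < NUM_NODES; i++) {
              snprintf(node, sizeof node, "node%03d", i);  /* placeholder names */

              if (active >= MAX_ACTIVE) {   /* wait for one rsh to finish  */
                  wait(NULL);               /* before launching another    */
                  active--;
              }

              pid_t pid = fork();
              if (pid == 0) {
                  execlp("rsh", "rsh", node, command, (char *)NULL);
                  _exit(127);               /* exec failed */
              } else if (pid > 0) {
                  active++;
              }
          }

          while (wait(NULL) > 0)            /* collect the remaining children */
              ;
          return 0;
      }

    A fuller version would also capture each node's output and exit status rather than letting them interleave on the terminal.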



** Sahana: Google Summer of Code 2008 Ideas (dev:sahana_gsoc08_ideas)


