Grid FTP Tests

From BingGridWiki

OnurDemir and MichaelHead have been working on this project since July 2005.

GridFTP setup

  1. log in as globus
  2. download http://firefighter.cs.binghamton.edu/~burner/gt4/gt4.0.1-smaller.tgz
  3. extract in /home/globus
  4. Open a terminal and run these commands:
    1. sudo apt-get install build-essential cvs
    2. cvs -d:ext:globus@10.0.0.51:/home/globus/cvsroot co gt4.0.1-small
    3. cvs -d:ext:head@nsrg.cs.binghamton.edu:/usr/local/cvsroot co -r AUTHENTICATION_SERVER gt4.0.1-all-source-installer
    4. unset GLOBUS_LOCATION
    5. rm -rf globus
    6. cd gt4.0.1-all-source-installer
    7. ./configure --prefix=$HOME/globus --disable-prewsgram --disable-rls --disable-wsjava --disable-wsmds --disable-wsdel --disable-wsrft --disable-wsgram --disable-rndvz --disable-wscas --disable-wsc --disable-tests --disable-wstests --disable-webmds
    8. make clean
    9. make globus_libtool # This makes libltdl for gcc64dbg
    10. make all
    11. make install
  5. add export GLOBUS_LOCATION=$HOME/globus to .bashrc
  6. Follow the instructions on http://www-unix.globus.org/toolkit/docs/4.0/security/simpleca/admin-index.html#s-simpleca-admin-installing
    1. make sure the first 'ou' in the simple CA's subject name is the proper hostname for the CA server (I think?)
    2. use 'grid-mapfile-add-entry -f ~/.gridmap -dn <unknown DN> -ln <username>' to add the user to ~/.gridmap. This is the file that globus-gridftp-server reads when run in nonroot mode.
    3. run setup-gsi (as instructed by $GLOBUS_LOCATION/setup/globus/setup-simple-ca) with the -nonroot option.

Set up a CA

For reference, here are the steps to set up the CA (cook is the CA):

  1. $GLOBUS_LOCATION/setup/globus/setup-simple-ca
  2. $GLOBUS_LOCATION/setup/globus_simple_ca_c7881362_setup/setup-gsi -nonroot -default Note: use the exact command line suggested by the output of the previous command (the hash will differ)
  3. $GLOBUS_LOCATION/bin/grid-cert-request -host 'IP address' Note: replace 'IP address' with the host's address (a proper hostname is best)
  4. $GLOBUS_LOCATION/bin/grid-ca-sign -in $GLOBUS_LOCATION/etc/hostcert_request.pem -out $GLOBUS_LOCATION/etc/hostcert.pem


Deploying the CA

To share the certificate authority with another machine, run these steps on cook (the AS):

  1. scp $HOME/.globus/simpleCA/globus_simple_ca_c7881362_setup-0.18.tar.gz othermachine: Note: substitute your CA's hash for c7881362 and the target host for othermachine
  2. ssh othermachine \$GLOBUS_LOCATION/sbin/gpt-build globus_simple_ca_c7881362_setup-0.18.tar.gz
  3. ssh othermachine \$GLOBUS_LOCATION/sbin/gpt-postinstall
  4. ssh othermachine \$GLOBUS_LOCATION/setup/globus_simple_ca_c7881362_setup/setup-gsi -nonroot -default


Making a User Cert

To make a user cert:

  1. $GLOBUS_LOCATION/bin/grid-cert-request -nopassphrase
  2. $GLOBUS_LOCATION/bin/grid-ca-sign -in $HOME/.globus/usercert_request.pem -out $HOME/.globus/usercert.pem

If the request was made on a remote machine, sign it on cook instead:

  1. scp .globus/usercert_request.pem cook:
  2. ssh cook \$GLOBUS_LOCATION/bin/grid-ca-sign -in usercert_request.pem -out usercert.pem
  3. scp cook:usercert.pem .globus/usercert.pem

Requesting and signing a cert remotely

To request a host cert for another machine, run these steps on that remote machine, not on cook:

  1. $GLOBUS_LOCATION/bin/grid-cert-request -host 'IP address'
  2. scp /home/globus/globus/etc/hostcert_request.pem cook: Note: cook should be the name of the SimpleCA machine
  3. ssh cook \$GLOBUS_LOCATION/bin/grid-ca-sign -in hostcert_request.pem -out hostcert.pem
  4. scp cook:hostcert.pem $GLOBUS_LOCATION/etc/


Lab Setup

Hacking GridFTP

  • In globus_gridftp_server_control_commands.c::globus_l_gsc_auth_cb(), if the response was a success, notify the active NIC with a UDP packet containing the IP, username, and remote port (if possible).
  • Struct for server -> active NIC communication:
    ip address      : 32 bit (result of htonl)
    timestamp       : 64 bit (long)
    nBytes          : 32 bit
    nBytesRemaining : 32 bit
    username        : null-terminated char*
    filename        : null-terminated char* (or a hash value of some kind)
    port            : short; use 0 if unknown
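
A minimal sketch of how the notification fields listed above could be serialized into one UDP payload, written in Python for brevity rather than in the server's C. The field order, network byte order for every integer, UTF-8 strings, and the function names pack_notification and unpack_notification are all assumptions for illustration; the notes do not fix a wire format.

```python
import socket
import struct

# Assumed fixed-size header: IPv4 address, timestamp, nBytes, nBytesRemaining,
# all in network byte order ("!"), which also disables padding.
HEADER = struct.Struct("!IqII")
PORT = struct.Struct("!H")


def pack_notification(ip, timestamp, n_bytes, n_remaining, username, filename, port=0):
    """Serialize one server -> active NIC notification (hypothetical layout)."""
    ip_as_int = struct.unpack("!I", socket.inet_aton(ip))[0]
    header = HEADER.pack(ip_as_int, timestamp, n_bytes, n_remaining)
    # Strings are null-terminated as in the notes; UTF-8 is an assumption.
    strings = username.encode("utf-8") + b"\0" + filename.encode("utf-8") + b"\0"
    # Port comes last; 0 means "unknown".
    return header + strings + PORT.pack(port)


def unpack_notification(payload):
    """Inverse of pack_notification; returns a dict of the decoded fields."""
    ip_as_int, timestamp, n_bytes, n_remaining = HEADER.unpack_from(payload, 0)
    # Split only at the first two terminators so zero bytes in the port survive.
    username, filename, tail = payload[HEADER.size:].split(b"\0", 2)
    (port,) = PORT.unpack(tail)
    return {
        "ip": socket.inet_ntoa(struct.pack("!I", ip_as_int)),
        "timestamp": timestamp,
        "n_bytes": n_bytes,
        "n_remaining": n_remaining,
        "username": username.decode("utf-8"),
        "filename": filename.decode("utf-8"),
        "port": port,
    }
```

The payload could then be sent with a plain UDP socket (socket.SOCK_DGRAM and sendto); the address of the active NIC is left out here because the notes do not give one.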

Journal Paper

Enhancing GridFTP Performance Using Intelligent Gateways

IJHPCN

The paper is due January 31, 2006. We are running a number of experiments. They all involve calling globus-url-copy from a number of clients. The control connection routes through an ActiveNic router, which can drop and massage the different connections as decided by a program running on the router.

We had a discussion and made some notes about The Graphs

The .doc version of the paper is in CVS now. I still need to do more on the experiments section.

Journal Paper as Word Document

The references should be completed. Bios should be added.

HPDC Workshop Paper

HPDC Workshop on Next-Generation Distributed Data Management; Due February 28, 2006

For this workshop, we should improve the test scripts so that instead of running N times, the tests run for T seconds. This will make the output a bit more comparable.
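
A rough sketch of the "run for T seconds" idea, assuming the test driver is (re)written in Python. The helper name run_for and its clock parameter are hypothetical; the existing test scripts are not shown on this page, so this is not a description of them.

```python
import time


def run_for(duration_s, transfer, clock=time.monotonic):
    """Call transfer() repeatedly until duration_s seconds have elapsed.

    Returns the per-call durations in seconds, so runs of equal length
    can be compared by how many transfers completed and how long each took.
    """
    durations = []
    deadline = clock() + duration_s
    while clock() < deadline:
        started = clock()
        transfer()
        durations.append(clock() - started)
    return durations


# Hypothetical usage; the URLs are placeholders, not real endpoints:
#   import subprocess
#   timings = run_for(60, lambda: subprocess.run(
#       ["globus-url-copy", "gsiftp://server/path/file", "file:///tmp/file"],
#       check=True))
```

A transfer that is already in flight when the deadline passes is allowed to finish, so a run can overshoot T by up to one transfer time.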

We should also add concurrency to the client scripts, so multiple clients on a machine can be downloading at the same time.
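
One possible shape for client-side concurrency, sketched in Python with a thread pool. Threads are assumed to be sufficient because each worker would mostly block while waiting on an external download process; run_concurrent is a hypothetical name, not part of the current scripts.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def run_concurrent(transfers, max_workers):
    """Run each callable in transfers on a thread pool.

    Returns the elapsed seconds of each transfer, in input order.
    """
    def timed(transfer):
        started = time.monotonic()
        transfer()
        return time.monotonic() - started

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order even if transfers finish out of order.
        return list(pool.map(timed, transfers))
```

Each callable would typically wrap one globus-url-copy invocation, so max_workers sets how many downloads a single client machine runs at once.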

The download test script should do a better job of timing the download process.

We should attempt to do authentication on the host or activenic. Then we can do load balancing and lots of other cool stuff.

Grid Workshop paper

Our plan here was to separate the data servers from the authentication node and give the active NIC a backchannel, yielding a real working solution for GridFTP providers.

Cluster Workshop paper

Repeat the Grid Workshop plan.

Compare server realized throughput, client wait time until data starts (== response time?), and reliability(?) across these configurations:

  1. No active NIC
  2. Active NIC with one server (previous experiment)
  3. Active NIC with remote AS
  4. Active NIC with local AS (AS is on ANIC host)

Test clients' effective bandwidth(?) and server realized throughput against several policies. Check the number of requests completed per minute and the response time.

  1. let small files through first
  2. smallest percentage remaining
  3. smallest bytes remaining
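
The three policies above can each be read as a sort key over the pending transfers. The sketch below assumes a simple per-transfer record with a total size and a remaining byte count; the Transfer class, the policy names, and the order helper are illustrative only and do not describe the router program.

```python
from dataclasses import dataclass


@dataclass
class Transfer:
    name: str
    total_bytes: int
    remaining_bytes: int


def fraction_remaining(t):
    # Treat an empty file as fully transferred to avoid dividing by zero.
    return t.remaining_bytes / t.total_bytes if t.total_bytes else 0.0


# Each policy maps a transfer to a key; lower keys are served first.
POLICIES = {
    "small_files_first": lambda t: t.total_bytes,
    "smallest_percentage_remaining": fraction_remaining,
    "smallest_bytes_remaining": lambda t: t.remaining_bytes,
}


def order(transfers, policy):
    """Return the transfers in the order the named policy would serve them."""
    return sorted(transfers, key=POLICIES[policy])
```

Keeping the policies as key functions means a test run can switch policy by name while the rest of the harness stays the same.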

Outbound Traffic Shaping with an ActiveNIC-based Egress Switch

Future stuff

If we have an outgoing Active NIC

  • Multiple services using same outbound NIC
    • Each app sends client -> IP+Port mapping to active NIC
    • Active NIC queues packets before forwarding. When packets must be dropped and one client is using a high percentage of the queue slots, prefer to drop that client's packets.
    • Works for multiple grid FTP servers when a certain client opens many channels and starves other clients
    • Can we discover bottlenecks in the network cloud by looking at the number of retransmits? If so, and client1 has 10 connections and client2 has one connection and client1 is retransmitting at 10x the rate of client2 (but the outbound interface isn't saturated) prefer client2's packets a little.
    • Also works when outbound interface is saturated with gridftp data and ssh packets want to get through, so look also at the packets per connection for the high usage client
      • Even consider the size of the packet. Small packets get priority (ssh vs. scp).
  • Software only Server optimizations
    • Implement fast authentication using pre-shared public keys when possible
    • If a connection has multiple timeouts, cache its file for a while so that the transfer can resume faster.
    • Improve scheduling based on file usage/disk layout. Attempt to reduce the need to seek around the disk at the application level.
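
The queue-drop preference from the outbound Active NIC list above could look roughly like the following Python sketch. It simplifies the idea: there is no "high percentage" threshold, the client holding the most slots always loses a packet on overflow, and that client's newest packet is the one dropped. The class name FairDropQueue and these choices are assumptions, not a design taken from the notes.

```python
from collections import Counter, deque


class FairDropQueue:
    """Bounded FIFO that, on overflow, drops from the heaviest client."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.packets = deque()       # (client, payload) in arrival order
        self.per_client = Counter()  # queue slots currently held per client

    def enqueue(self, client, payload):
        """Queue a packet; return the dropped (client, payload) or None."""
        self.packets.append((client, payload))
        self.per_client[client] += 1
        if len(self.packets) > self.capacity:
            return self._drop_from_heaviest()
        return None

    def _drop_from_heaviest(self):
        heaviest = max(self.per_client, key=self.per_client.get)
        # Scan from the tail so the heaviest client's newest packet is dropped.
        for i in range(len(self.packets) - 1, -1, -1):
            if self.packets[i][0] == heaviest:
                victim = self.packets[i]
                del self.packets[i]
                self.per_client[heaviest] -= 1
                return victim

    def dequeue(self):
        """Remove and return the oldest queued (client, payload)."""
        client, payload = self.packets.popleft()
        self.per_client[client] -= 1
        return client, payload
```

With a capacity of 3, two packets from client1 and one from client2 already queued, a new packet from client3 is admitted and client1's second packet is dropped instead.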

General GridFTP Implementation Notes

α) The Globus Striped GridFTP Framework and Server

β) File and Object Replication in Data Grids

γ) Data Management and Transfer in High-Performance Computational Grid Environments

δ) More bits about our GridFTP setup
