From b9d2448475da159c914b79eab07b5c057905d1bf Mon Sep 17 00:00:00 2001
From: Mike McKerns
Date: Thu, 27 Jun 2013 23:31:38 -0700
Subject: [PATCH 1/5] Initial commit

---
 README.md | 4 ++++
 1 file changed, 4 insertions(+)
 create mode 100644 README.md

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..c942fa4
--- /dev/null
+++ b/README.md
@@ -0,0 +1,4 @@
+pathos
+======
+
+a framework for parallel graph management and execution in heterogeneous computing
From 4d8a0a02730587fa126a63800f0cc58b6aef509d Mon Sep 17 00:00:00 2001
From: Mike McKerns
Date: Thu, 11 Jul 2013 10:28:49 -0700
Subject: [PATCH 2/5] merged changes to README.md from svn

---
 README.md | 132 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 131 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index c942fa4..37ae9a4 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,134 @@
 pathos
 ======
-
 a framework for parallel graph management and execution in heterogeneous computing
+
+About Pathos
+------------
+Pathos is a framework for heterogenous computing. It primarily provides
+the communication mechanisms for configuring and launching parallel
+computations across heterogenous resources. Pathos provides stagers and
+launchers for parallel and distributed computing, where each launcher
+contains the syntactic logic to configure and launch jobs in an execution
+environment. Some examples of included launchers are: a queue-less
+MPI-based launcher, a ssh-based launcher, and a multiprocessing launcher.
+Pathos also provides a map-reduce algorithm for each of the available
+launchers, thus greatly lowering the barrier for users to extend their
+code to parallel and distributed resources. Pathos provides the ability
+to interact with batch schedulers and queuing systems, thus allowing large
+computations to be easily launched on high-performance computing resources.
+One of the most powerful features of pathos is "tunnel", which enables a
+user to automatically wrap any distributed service calls within a ssh-tunnel.
+
+Pathos is divided into four subpackages::
+ * dill: a utility for serialization of python objects
+ * pox: utilities for filesystem exploration and automated builds
+ * pyina: a MPI-based parallel mapper and launcher
+ * pathos: distributed parallel map-reduce and ssh communication
+
+
+Pathos Subpackage
+-----------------
+The pathos subpackage provides a few basic tools to make distributed
+computing more accessable to the end user. The goal of pathos is to
+allow the user to extend their own code to distributed computing with
+minimal refactoring.
+
+Pathos provides methods for configuring, launching, monitoring, and
+controlling a service on a remote host. One of the most basic features
+of pathos is the ability to configure and launch a RPC-based service
+on a remote host. Pathos seeds the remote host with a small `portpicker`
+script, which allows the remote host to inform the localhost of a port
+that is available for communication.
+
+Beyond the ability to establish a RPC service, and then post requests,
+is the ability to launch code in parallel. Unlike parallel computing
+performed at the node level (typically with MPI), pathos enables the
+user to launch jobs in parallel across heterogeneous distributed resources.
+Pathos provides a distributed map-reduce algorithm, where a mix of
+local processors and distributed RPC services can be selected. Pathos
+also provides a very basic automated load balancing service, as well as
+the ability for the user to directly select the resources.
+
+The high-level "pp_map" interface, yields a map-reduce implementation that
+hides the RPC internals from the user. With pp_map, the user can launch
+their code in parallel, and as a distributed service, using standard python
+and without writing a line of server or parallel batch code.
+
+RPC servers and communication in general is known to be insecure. However,
+instead of attempting to make the RPC communication itself secure, pathos
+provides the ability to automatically wrap any distributes service or
+communication in a ssh-tunnel. Ssh is a universally trusted method.
+Using ssh-tunnels, pathos has launched several distributed calculations
+on national lab clusters, and to date has performed test calculations
+that utilize node-to-node communication between two national lab clusters
+and a user's laptop. Pathos allows the user to configure and launch
+at a very atomistic level, through raw access to ssh and scp.
+
+Pathos is in the early development stages, and any user feedback is
+highly appreciated. Contact Mike McKerns [mmckerns at caltech dot edu]
+with comments, suggestions, and any bugs you may find. A list of known
+issues is maintained at http://trac.mystic.cacr.caltech.edu/project/pathos/query.
+
+
+Major Features
+--------------
+Pathos provides a configurable distributed parallel-map reduce interface
+to launching RPC service calls, with::
+ * a map-reduce interface that extends the python 'map' standard
+ * the ability to submit service requests to a selection of servers
+ * the ability to tunnel server communications with ssh
+ * automated load-balancing between multiprocessing and RPC servers
+
+The pathos core is built on low-level communication to remote hosts using
+ssh. The interface to ssh, scp, and ssh-tunneled connections can::
+ * configure and launch remote processes with ssh
+ * configure and copy file objects with scp
+ * establish an tear-down a ssh-tunnel
+
+To get up and running quickly, pathos also provides infrastructure to::
+ * easily establish a ssh-tunneled connection to a RPC server
+
+
+Current Release
+---------------
+The latest released version of pathos is available from::
+ http://trac.mystic.cacr.caltech.edu/project/pathos
+
+Pathos is distributed under a modified BSD license.
+
+Development Release
+-------------------
+You can get the latest development release with all the shiny new features at::
+ http://dev.danse.us/packages.
+
+or even better, fork us on our github mirror of the svn trunk::
+ https://github.com/uqfoundation
+
+Citation
+--------
+If you use pathos to do research that leads to publication, we ask that you
+acknowledge use of pathos by citing the following in your publication::
+
+ M.M. McKerns, L. Strand, T. Sullivan, A. Fang, M.A.G. Aivazis,
+ "Building a framework for predictive science", Proceedings of
+ the 10th Python in Science Conference, 2011;
+ http://arxiv.org/pdf/1202.1056
+
+ Michael McKerns and Michael Aivazis,
+ "pathos: a framework for heterogeneous computing", 2010- ;
+ http://trac.mystic.cacr.caltech.edu/project/pathos
+
+More Information
+----------------
+Probably the best way to get started is to look at a few of the
+examples provided within pathos. See `pathos.examples` for a
+set of scripts that demonstrate the configuration and launching of
+communications with ssh and scp. The source code is also generally well documented,
+so further questions may be resolved by inspecting the code itself, or through
+browsing the reference manual. For those who like to leap before
+they look, you can jump right to the installation instructions. If the aforementioned documents
+do not adequately address your needs, please send us feedback.
+
+Pathos is an active research tool. There are a growing number of publications and presentations that
+discuss real-world examples and new features of pathos in greater detail than presented in the user's guide.
+If you would like to share how you use pathos in your work, please send us a link.
From 8dec5059a7e6edcd35823cf127bc0eefe121adf8 Mon Sep 17 00:00:00 2001
From: Mike McKerns
Date: Thu, 11 Jul 2013 11:24:55 -0700
Subject: [PATCH 3/5] fixed formatting in README.md

---
 README.md | 28 ++++++++++++++++------------
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/README.md b/README.md
index 37ae9a4..9b5aace 100644
--- a/README.md
+++ b/README.md
@@ -20,10 +20,11 @@ One of the most powerful features of pathos is "tunnel", which enables a
 user to automatically wrap any distributed service calls within a ssh-tunnel.
 
 Pathos is divided into four subpackages::
- * dill: a utility for serialization of python objects
- * pox: utilities for filesystem exploration and automated builds
- * pyina: a MPI-based parallel mapper and launcher
- * pathos: distributed parallel map-reduce and ssh communication
+
+* dill: a utility for serialization of python objects
+* pox: utilities for filesystem exploration and automated builds
+* pyina: a MPI-based parallel mapper and launcher
+* pathos: distributed parallel map-reduce and ssh communication
 
 
 Pathos Subpackage
@@ -74,19 +75,22 @@ Major Features
 --------------
 Pathos provides a configurable distributed parallel-map reduce interface
 to launching RPC service calls, with::
- * a map-reduce interface that extends the python 'map' standard
- * the ability to submit service requests to a selection of servers
- * the ability to tunnel server communications with ssh
- * automated load-balancing between multiprocessing and RPC servers
+
+* a map-reduce interface that extends the python 'map' standard
+* the ability to submit service requests to a selection of servers
+* the ability to tunnel server communications with ssh
+* automated load-balancing between multiprocessing and RPC servers
 
 The pathos core is built on low-level communication to remote hosts using
 ssh. The interface to ssh, scp, and ssh-tunneled connections can::
- * configure and launch remote processes with ssh
- * configure and copy file objects with scp
- * establish an tear-down a ssh-tunnel
+
+* configure and launch remote processes with ssh
+* configure and copy file objects with scp
+* establish an tear-down a ssh-tunnel
 
 To get up and running quickly, pathos also provides infrastructure to::
- * easily establish a ssh-tunneled connection to a RPC server
+
+* easily establish a ssh-tunneled connection to a RPC server
 
 
 Current Release
From 535c7f4d27a98f8d8e9b3c90c6e35ef809f34e05 Mon Sep 17 00:00:00 2001
From: nrhine1
Date: Thu, 24 Mar 2016 17:53:16 -0400
Subject: [PATCH 4/5] created multiprocessing with class & cache example

---
 examples/mp_class_example.py | 51 ++++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)
 create mode 100644 examples/mp_class_example.py

diff --git a/examples/mp_class_example.py b/examples/mp_class_example.py
new file mode 100644
index 0000000..fc604a3
--- /dev/null
+++ b/examples/mp_class_example.py
@@ -0,0 +1,51 @@
+from pathos.multiprocessing import ProcessingPool, ThreadingPool
+import pathos.multiprocessing
+import logging
+log = logging.getLogger(__name__)
+
+class PMPExample(object):
+    def __init__(self):
+        self.cache = {}
+
+    def compute(self, x):
+        self.cache[x] = x ** 3
+        return self.cache[x]
+
+    def threadcompute(self, xs):
+        pool = ThreadingPool(4)
+        results = pool.map(self.compute, xs)
+        return results
+
+    def processcompute(self, xs):
+        pool = ProcessingPool(4)
+        results = pool.map(self.compute, xs)
+        return results
+
+def parcompute_example():
+    dc = PMPExample()
+    dc2 = PMPExample()
+    dc3 = PMPExample()
+    dc4 = PMPExample()
+
+    n_datapoints = 100
+    inp_data = range(n_datapoints)
+    r1 = dc.threadcompute(inp_data)
+    assert(len(dc.cache) == n_datapoints)
+
+    r2 = dc2.processcompute(inp_data)
+    assert(len(dc2.cache) == 0)
+    assert(r1 == r2)
+
+    r3 = ProcessingPool(4).map(dc3.compute, inp_data)
+    r4 = ThreadingPool(4).map(dc4.compute, inp_data)
+    assert(r4 == r3 == r2)
+    assert(len(dc3.cache) == 0)
+    assert(len(dc4.cache) == n_datapoints)
+
+    log.info("Size of threadpooled class caches: {}, {}".format(len(dc.cache), len(dc4.cache)))
+    log.info("Size of processpooled class caches: {}, {}".format(len(dc2.cache), len(dc3.cache)))
+
+if __name__ == '__main__':
+    logging.basicConfig()
+    log.setLevel(logging.INFO)
+    parcompute_example()
From 357df730efa87010f133c6e8e21fedc64080660b Mon Sep 17 00:00:00 2001
From: mmckerns
Date: Tue, 1 Aug 2017 09:27:35 -0400
Subject: [PATCH 5/5] fix typo

---
 README.md | 2 +-
 setup.py | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 8f3bf2b..3f0b77f 100644
--- a/README.md
+++ b/README.md
@@ -40,7 +40,7 @@ The `pathos` framework is composed of several interoperating packages::
 About Pathos
 ------------
 The `pathos` package provides a few basic tools to make parallel and
-distributed computing more accessable to the end user. The goal of `pathos`
+distributed computing more accessible to the end user. The goal of `pathos`
 is to enable the user to extend their own code to parallel and distributed
 computing with minimal refactoring.
 
diff --git a/setup.py b/setup.py
index d13ed8c..121f9af 100644
--- a/setup.py
+++ b/setup.py
@@ -92,7 +92,7 @@
 ============
 
 The `pathos` package provides a few basic tools to make parallel and
-distributed computing more accessable to the end user. The goal of `pathos`
+distributed computing more accessible to the end user. The goal of `pathos`
 is to enable the user to extend their own code to parallel and distributed
 computing with minimal refactoring.
 