Refactoring the Oncoscape Core #164

Open
pshannon-bioc opened this issue Dec 14, 2015 · 1 comment

Comments

@pshannon-bioc
Collaborator

Refactoring the Oncoscape core (14 Dec 2015)

These notes propose some simple changes to Oncoscape that make the server more flexible. The principal change is the use of the Factory pattern, so that data and analyses can be made available to the server in many ways, ranging from directly in-process (as we do now) to remote, shared and secured in a service-oriented architecture (broadly understood), which is clearly needed. The creation and provision of more sophisticated data and analysis services is not specified here. Instead, these notes describe a simple refactoring of the Oncoscape server that will support open-ended forms of distributed and secure computation in the future.

current constructor:

onco <- OncoDev14(port=port, scriptDir=scriptDir, userID=userID, datasetNames=current.datasets)

new form:

app <- Oncoscape(port, analysisPackages, datasets, browserFile, userCredentials)

 analysisPackages: a list of R package names, each of which is derived from
                   the SttrAnalysisPackage base class
 datasets: a list of R package names, each of which is derived from the
           SttrDataPackage base class

 browserFile: name of a file combining HTML, CSS and Javascript
 userCredentials: an instance of the UserCredentials class (or a subclass)

Three abstract base classes are needed:

   SttrDataPackage: needs to add open-ended support for indirect data (local database, remote
      database, cloud, etc.)
   SttrAnalysisPackage: provides a template and some shared methods for, e.g., PCA, PLSR, and
      future additions
   UserCredentials:  open-ended design, from a simple userID and no password to LDAP, AD, etc.
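
A minimal sketch of how the three base classes might look, assuming S4 virtual classes are used. The slot names, the generics `getTable` and `isAuthorized`, and the concrete classes `InProcessDataPackage` and `SimpleCredentials` are all hypothetical and appear nowhere in the existing code; they only illustrate the "abstract base class plus interchangeable subclasses" idea.

```r
library(methods)

# Virtual (abstract) base classes: they cannot be instantiated directly.
# Slot names are assumptions made for this sketch.
setClass("SttrDataPackage",
         representation = representation("VIRTUAL", name = "character"))
setClass("SttrAnalysisPackage",
         representation = representation("VIRTUAL", name = "character"))
setClass("UserCredentials",
         representation = representation("VIRTUAL", userID = "character"))

# Hypothetical generics that every concrete subclass would implement.
setGeneric("getTable", function(obj, tableName) standardGeneric("getTable"))
setGeneric("isAuthorized", function(obj) standardGeneric("isAuthorized"))

# A concrete in-process data package: tables live in an ordinary list.
setClass("InProcessDataPackage", contains = "SttrDataPackage",
         representation = representation(tables = "list"))
setMethod("getTable", "InProcessDataPackage",
          function(obj, tableName) obj@tables[[tableName]])

# The simplest credentials: a userID and no password.
setClass("SimpleCredentials", contains = "UserCredentials")
setMethod("isAuthorized", "SimpleCredentials",
          function(obj) length(obj@userID) == 1 && nzchar(obj@userID))

demo  <- new("InProcessDataPackage", name = "DEMOdz",
             tables = list(mrna = matrix(1:4, nrow = 2)))
creds <- new("SimpleCredentials", userID = "guest")
```

A remote or cloud-backed data package would be another subclass of `SttrDataPackage` with its own `getTable` method; callers would not need to change.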

Both SttrDataPackage and SttrAnalysisPackage follow the loose definition of SOA, service oriented
architecture (https://en.wikipedia.org/wiki/Service-oriented_architecture):

   "a component that is encapsulated behind an interface"

The PCA analysis package behaves like this (these calls are made by the Oncoscape server)

   package.name <- "PCA"    # or "PCA.SOA.AmazonS3" or ...

   # load the code, which may be self-contained, or a facade to an
   # adaptive distributed system deployed in the cloud, or ...
   # crucial: the server has no idea how the PCA calculations are actually
   # performed, nor where the data actually is
   library(package.name, character.only=TRUE)

   eval(parse(text=sprintf("pkg <- %s(server)", package.name)))
   register(pkg)           # the pkg tells the server the websocket messages it wants to receive

   The server provides data and message-passing services to the pkg.
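
The registration step could be sketched as follows, under stated assumptions: `newServer`, `addHandler`, `dispatch` and the message name `calculatePCA` are hypothetical stand-ins, not the real Oncoscape server API, and the "package" here is just a constructor function defined inline rather than an installed R package.

```r
# Hypothetical stand-in for the server: a table of handlers keyed by
# websocket command name.
newServer <- function() {
  handlers <- new.env()
  list(
    addHandler = function(cmd, fn) assign(cmd, fn, envir = handlers),
    dispatch = function(msg) {
      if (!exists(msg$cmd, envir = handlers, inherits = FALSE))
        stop("no handler registered for ", msg$cmd)
      get(msg$cmd, envir = handlers)(msg$payload)
    }
  )
}

# Hypothetical in-process PCA package constructor; it keeps a reference
# to the server so it can use the server's services later.
PCA <- function(server) structure(list(server = server), class = "PCA")

# register() is an S3 generic: each package tells the server which
# messages it wants to receive.
register <- function(pkg) UseMethod("register")
register.PCA <- function(pkg) {
  pkg$server$addHandler("calculatePCA", function(payload) prcomp(payload)$sdev)
  invisible(pkg)
}

server <- newServer()
package.name <- "PCA"
pkg <- do.call(package.name, list(server))  # look up the constructor by name
register(pkg)

result <- server$dispatch(list(cmd = "calculatePCA",
                               payload = as.matrix(iris[, 1:4])))
```

`do.call` with a character name is used here in place of `eval(parse(...))`; both look the constructor up by name, so the server never refers to a specific package directly.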

Two app constructor examples to demonstrate the spectrum of uses:

1) reproduce current style of use:

    app <- Oncoscape(7001, c("PCA", "PLSR"), c("DEMOdz", "TCGAbrain"),
                      "index.html", "[email protected]");

2) demonstrate distributed, shared data and analysis with high security:

    app <- Oncoscape(7001, c("PCA.SOA.AmazonS3", "PLSR.immediate"),
                     c("DEMOdz.immediate", "TCGAbrain.Amazon"), 
                     "index.html",
                     "HutchPHI")

    Note that the actual value of the user's credentials is deferred to
    an as-yet unspecified but arbitrarily complex, arbitrarily secure class.

Data and analysis packages, and credentials, all depend upon the Factory design
pattern, in which a character string is passed to the appropriate factory,
which returns a (possibly intricate, possibly simple) object of the appropriate
derived class.  Each of these concrete objects (an SttrDataPackage, an SttrAnalysisPackage,
a UserCredentials instance) supports the methods of its base class, so each
can be used in Oncoscape interchangeably.
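
A minimal sketch of such a factory for credentials, assuming S4 classes. The concrete classes `SimpleCredentials` and `PHICredentials`, the generic `securityLevel`, and the rule that the level is parsed from the trailing number in the string are all assumptions made for illustration; the actual credentials design is left open above.

```r
library(methods)

# Abstract base class and a hypothetical generic that all subclasses support.
setClass("UserCredentials",
         representation = representation("VIRTUAL", userID = "character"))
setGeneric("securityLevel", function(obj) standardGeneric("securityLevel"))

# Simple case: a userID and nothing else.
setClass("SimpleCredentials", contains = "UserCredentials")
setMethod("securityLevel", "SimpleCredentials", function(obj) 0L)

# Hypothetical high-security case carrying a numeric PHI level.
setClass("PHICredentials", contains = "UserCredentials",
         representation = representation(level = "integer"))
setMethod("securityLevel", "PHICredentials", function(obj) obj@level)

# The factory: a character string in, an object of a derived class out.
credentialsFactory <- function(spec) {
  if (grepl("^HutchCredentials\\.PHI\\.level\\.[0-9]+$", spec)) {
    level <- as.integer(sub(".*\\.", "", spec))  # trailing number in the name
    new("PHICredentials", userID = NA_character_, level = level)
  } else {
    new("SimpleCredentials", userID = spec)      # treat the string as a userID
  }
}

simple <- credentialsFactory("guest")
secure <- credentialsFactory("HutchCredentials.PHI.level.10")
```

Both objects answer `securityLevel()`, so code that receives a `UserCredentials` instance does not need to know which subclass the factory chose.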

For example, imagine the use of a private BRCA data set stored with many layers of security
on the Amazon cloud.

  app <- Oncoscape(7001,
                   c("PCA.immediate", "HOBO.hutchCluster"),
                   c("TCGAbrca.SOA.AmazonS3", "BRCA4013.SOA.AmazonS3.PHIlevel.10"),
                   "index.html",
                   "HutchCredentials.PHI.level.10")

 As the app starts up:
    1) the specified credentials object is created, and the user must establish
           a) she has a secure connection
           b) she is authorized 
    2) TCGAbrca.SOA.AmazonS3 is created, needs no credentials (or maybe just enough
       for billing purposes)
    3) BRCA4013.SOA.AmazonS3.PHIlevel.10 is created.  The high-security credentials
       from step 1 must be supplied
    4) a PCA package is loaded and initialized; it runs in-process with Oncoscape
    5) a HOBO similarity calculator, perhaps already running on the Hutch cluster,
       is contacted.  Maybe credentials are needed, if only to track which lab
       is using the cluster.
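
The data-related startup steps above (1 to 3) might be sketched like this. Everything here is an assumption made for illustration: credentials are modeled as a plain list, the required security level is parsed from a `PHIlevel.N` suffix in the dataset name, and `requiredLevel`, `openDataset` and `startup` are hypothetical helpers rather than existing Oncoscape functions.

```r
# Hypothetical rule: a dataset name ending in "PHIlevel.N" demands level N;
# any other dataset needs no special credentials.
requiredLevel <- function(datasetName) {
  m <- regmatches(datasetName, regexpr("PHIlevel\\.[0-9]+$", datasetName))
  if (length(m) == 0) 0L else as.integer(sub("PHIlevel\\.", "", m))
}

# Create a dataset handle only if the credentials are sufficient.
openDataset <- function(datasetName, credentials) {
  needed <- requiredLevel(datasetName)
  if (credentials$level < needed)
    stop("credentials insufficient for ", datasetName)
  list(name = datasetName, level = needed)
}

startup <- function(datasetNames, credentials) {
  # step 1: a secure connection and authorization come first
  stopifnot(credentials$secureConnection, credentials$authorized)
  # steps 2 and 3: each dataset is created, with credentials checked as needed
  lapply(datasetNames, openDataset, credentials = credentials)
}

creds <- list(level = 10L, secureConnection = TRUE, authorized = TRUE)
datasets <- startup(c("TCGAbrca.SOA.AmazonS3",
                      "BRCA4013.SOA.AmazonS3.PHIlevel.10"), creds)
```

With lower-level credentials the second dataset would fail to open while the first would still succeed, which matches the idea that public data needs little or no authorization.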
@canaantt
Contributor

canaantt commented Jan 5, 2016

@grettygoose @canaantt
need to learn from Paul's design and work through all the remaining datasets.
