Accessing "staffwww.dcs.sheffield.ac.uk/people/J.Hensman" data #8
@jameshensman do you still have the files?
I tried to move most of these types of things across as I found them. It's a good example of why we developed pods! If we can recover the datasets let's try and get them integrated.
Is there any news on the drosophila data?
No. I established that using pods is better than using GPy.util for accessing the dataset files (this is with GPy-devel). All in all, I managed to assemble a complete "datasets" folder from various sources and packages in SheffieldML. That let me build the drosophila.knirps file required by Hierarchical.ipynb and eliminate the direct access to Lab3 in that notebook.
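For reference, the kind of access pattern this enables looks roughly like the sketch below. The accessor name `drosophila_knirps` and the dictionary keys are my assumptions (pods exposes datasets as functions returning dictionaries, but the exact names should be checked against `pods.datasets`):

```python
import pods

# Assumed accessor name -- check dir(pods.datasets) for what is actually registered.
data = pods.datasets.drosophila_knirps()

# pods dataset functions conventionally return a dictionary of arrays plus
# metadata; the key names used here ('Y', 'details') are assumptions.
Y = data['Y']
print(data['details'])
```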
That's great. Yes, pods is the right place to do this. Did you do a pull request for an updated version of the notebook?
Here's the drosophila data if someone wants to add it.
Thank you, James, for this data file. With the kalinka09_mel.csv and kalinka09_mel_pdata.csv files extracted into the compbio folder, Hierarchical.ipynb now runs fine. Note that kalinka09_mel.csv is a lighter version than the one I downloaded from the original source using pods. To recap the fix:
```python
# Original download URLs, kept for reference (commented out):
# import urllib
# urllib.urlretrieve('http://staffwww.dcs.sheffield.ac.uk/people/J.Hensman/data/kalinka09_mel.csv', 'kalinka_data.csv')
# urllib.urlretrieve('http://staffwww.dcs.sheffield.ac.uk/people/J.Hensman/data/kalinka09_mel_pdata.csv', 'kalinka_pdata.csv')

# Load the expression data and the experiment design from the local files instead.
expression = np.loadtxt('kalinka09_mel.csv', delimiter=',', usecols=range(1, 57))
gene_names = np.loadtxt('kalinka09_mel.csv', delimiter=',', usecols=[0], dtype=np.str)
replicates, times = np.loadtxt('kalinka09_mel_pdata.csv', delimiter=',').T

# Normalize the data row-wise.
expression -= expression.mean(1)[:, np.newaxis]
expression /= expression.std(1)[:, np.newaxis]
```

Running the complete compbio folder (8 notebooks out of 8) requires a similar data file for `Y = np.load("/users/suraalrashid/expression.npy")` in TFA_with_Coregion-1.ipynb. I could not locate the suraalrashid data anywhere.
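Until the suraalrashid file turns up, a small guard like the sketch below (the path is copied from the notebook; the message is mine) would at least make TFA_with_Coregion-1.ipynb fail with an explicit explanation instead of a bare exception:

```python
import os
import numpy as np

# Path exactly as it appears in TFA_with_Coregion-1.ipynb.
DATA_PATH = "/users/suraalrashid/expression.npy"

if not os.path.exists(DATA_PATH):
    raise IOError("Missing data file %s: this dataset has not been recovered yet, "
                  "so the rest of the notebook cannot run." % DATA_PATH)

Y = np.load(DATA_PATH)
```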
Hello James, as a logical next step after running hierarchical.ipynb in deepGPy (configuration: Linux (Ubuntu) on VirtualBox, Python 2.7, Anaconda 2.5), two questions arise about plotting:
It would be nice if you could make the code for these two plots available, because they convey a telling message about otherwise complex processes. Thank you.
There is a common problem in accessing compbio and other datasets: drosophila, Spellman yeast, Lab3.zip and others. This is in addition to migrating matplotlib and pods to Python 3. Shouldn't these datasets be integrated nicely into pods to provide a homogeneous set of testing notebooks (gprs, gpss) and a "datasets" folder?
The error is:

```
C:\Users\Denis\Anaconda3\lib\urllib\request.py in http_error_default(self, req, fp, code, msg, hdrs)
    587 class HTTPDefaultErrorHandler(BaseHandler):
    588     def http_error_default(self, req, fp, code, msg, hdrs):
--> 589         raise HTTPError(req.full_url, code, msg, hdrs, fp)
    590
    591 class HTTPRedirectHandler(BaseHandler):

HTTPError: HTTP Error 403: Forbidden
```
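In case the 403 comes from the server rejecting Python's default user agent rather than from the file having been removed (I have not verified which it is here), a download with an explicit User-Agent header is worth trying before concluding the data is gone. A sketch for Python 3 (on Python 2 the equivalent calls live in `urllib`/`urllib2`):

```python
import urllib.error
import urllib.request

url = 'http://staffwww.dcs.sheffield.ac.uk/people/J.Hensman/data/kalinka09_mel.csv'
req = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})

try:
    # Stream the response straight into a local file.
    with urllib.request.urlopen(req) as response, open('kalinka09_mel.csv', 'wb') as out:
        out.write(response.read())
except urllib.error.HTTPError as e:
    # If this still fails with 403/404, the file really is no longer served
    # and the dataset has to come from somewhere else (e.g. pods or this issue).
    print('Download failed: %s' % e)
```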