-
parallel vs non-parallel
-
ucb-gp - compare acquisition functions
-
averaging, and switching between the GP and the average, in 2d
-
random search x2 and more than 50
-
without parallel evaluation
-
ucb-gp
-
buckets for rounding
- we want the endpoints
- redistribute the rest of the buckets evenly
- round to the bucket center
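One way the bucket scheme above could look, as a sketch (the function name and signature are mine, not bopt's): the two endpoints own the outer half-width buckets so they stay reachable, and every other value is rounded to the center of its bucket.

```python
def round_to_bucket(x, low, high, n_buckets):
    """Round x in [low, high] to a bucket representative.

    The endpoints keep their exact values (they own the outer
    half-width buckets); the rest of the range is split into
    n_buckets even buckets and x is rounded to the center of the
    bucket it falls into.
    """
    width = (high - low) / n_buckets
    if x < low + width / 2:
        return low
    if x > high - width / 2:
        return high
    index = min(int((x - low) / width), n_buckets - 1)
    return low + (index + 0.5) * width
```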
-
multi-output + how the objective is computed + show the outputs in tables
-
--criterium='accuracy/size'
- RESULT(_.)?=(.)
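Assuming the regex above is meant to pick up lines like `RESULT=0.95` or `RESULT_accuracy=0.95` from job stdout, a sketch of the parser (the widened pattern, the helper name, and the `objective` key are my guesses, not the actual bopt code):

```python
import re

# Widened form of the noted pattern (an assumption): optional output
# name after RESULT_, numeric value after '='.
RESULT_RE = re.compile(r"^RESULT(?:_(\w+))?=(\S+)")

def parse_results(stdout):
    """Collect {output_name: float} from job stdout; the unnamed
    RESULT= line is stored under the made-up key 'objective'."""
    results = {}
    for line in stdout.splitlines():
        match = RESULT_RE.match(line)
        if match:
            name = match.group(1) or "objective"
            results[name] = float(match.group(2))
    return results
```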
-
stronger prior on the lengthscale when buckets are used (I don't want a lengthscale smaller than a bucket)
-
parallel coords plot
-
random search x2
-
random search
-
always add one extra experiment from all the initial random searches that were run
-
run the next ones on the first 50 (all) random searches from the previous experiments
- alternatively, see all samples from all the others
-
priors on hyperparams
-
is capacity modeled explicitly?
-
write to discrete bopt about the implementation
issues:
-
SHOW THE DISTRIBUTION + bootstrap
- either it's ugly, or it's a Gaussian
- https://plot.ly/python/distplot/
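For the bootstrap part, a minimal stdlib sketch (no plotly, names are mine): resample the observed values with replacement and collect the resampled means - the distribution one would then feed to distplot.

```python
import random

def bootstrap_means(values, n_resamples=1000, seed=0):
    """Distribution of bootstrap-resampled means of `values`."""
    rng = random.Random(seed)
    n = len(values)
    return [
        sum(rng.choice(values) for _ in range(n)) / n
        for _ in range(n_resamples)
    ]
```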
-
rounding kernels
-
bopt-multi submits an array job that gets 10 5 50 in env and the directories in args
- bopt-multi-arrayjob -v BOPT_ARGS="10 5 50" foo{1..10}
-
bopt run -C experiments/1 --continue --n_parallel 5 --n_iter 50
-
bopt-multi experiments/{1..100} 10 5 50
-
bandits
-
parallel exploit at the end
-
1 or 5 or 10 in parallel
-
let fictree run to 100 and watch where it settles
-
ERROR root:hyperparam_values.py:30 Invalid hyperparam value 0.09999999999999998 for Hyperparameter(name='y', range=Float(0.1, 6.0))
INFRASTRUCTURE !!
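The 0.09999999999999998 in the error above is classic floating-point drift just below the lower bound. A sketch of a defensive clamp that could run after any rounding arithmetic (the function is mine, not the actual fix):

```python
def clamp_to_range(value, low, high, eps=1e-9):
    """Snap values that drifted within eps of a bound back onto the
    bound, and clip anything else into [low, high]."""
    if abs(value - low) < eps:
        return low
    if abs(value - high) < eps:
        return high
    return min(max(value, low), high)
```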
-
a simple function with noise; watch what overfitting looks like before it fits the right noise
-
cs_fictree + cs_pdt but with treebanks (.orig)
-
reinforce again with 100
-
seeds (bopt's) for the random searches
-
batch -> task -> (multi)experiment -> sample -> (multi)job
-
task and batch into the experiment
-
filesystem :((
- reinforce-18/broken-job
- almost all the copied /lnet/depot scripts still have the same issue
- :(
-
option to run only random search
-
display time at a reasonable scale (hours etc.)
-
add the gamma prior parameters to args
-
add the EI parameters (xi) to args
-
tail -F bopt.o5400313
Traceback (most recent call last):
  File "/home/arnold/.venvs/bopt/bin/bopt", line 11, in <module>
    load_entry_point('bopt', 'console_scripts', 'bopt')()
  File "/lnet/spec/work/people/arnold/bopt/bopt/cli/cli.py", line 205, in main
    args.func(args)
  File "/lnet/spec/work/people/arnold/bopt/bopt/cli/run.py", line 22, in run
    experiment = bopt.Experiment.deserialize()
  File "/lnet/spec/work/people/arnold/bopt/bopt/experiment.py", line 348, in deserialize
    with open("meta.yml", "r") as f:
FileExistsError: [Errno 17] File exists: 'meta.yml'
======= EPILOG: Mon Apr 8 00:54:02 CEST 2019
== Limits:
== Usage: cpu=00:00:00, mem=0.00000 GB s, io=0.00000 GB, vmem=N/A, maxvmem=N/A
== Duration: 00:00:27 (27 s)
== Server name: paris4
-
mf=8G,amf=8G,h_vmem=12G vs mem_free=8G,act_mem_free=8G,h_data=12G (/grid vs /gpu)
-
how to pass --threads?
-
show the output of a failed job
-
check the EI implementation
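For reference while checking, the closed-form EI for maximization as a stdlib-only sketch: with posterior mean `mu`, std `sigma`, incumbent `best` and exploration offset `xi`, EI = (mu - best - xi) * Phi(z) + sigma * phi(z), where z = (mu - best - xi) / sigma.

```python
import math

def expected_improvement(mu, sigma, best, xi=0.01):
    """Closed-form EI for maximization at a single point."""
    improvement = mu - best - xi
    if sigma <= 0.0:
        # Degenerate posterior: EI collapses to plain improvement.
        return max(improvement, 0.0)
    z = improvement / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return improvement * cdf + sigma * pdf
```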
-
option to enter a manual slice point (pre-filled from the selection)
-
option to toggle whether the slice is at the max or at the next point, or something like that
-
seeds!
-
does manual-run work?
-
fit the mean function
-
option to use either a gamma prior or a constraint
-
ARD=True for the kernel
- option to turn it on
-
number of optimize restarts as a param
-
skip the first num_random + 1 samples in the kernel param plot
-
option to run only random search
-
timeline
-
fix rounding so it only goes in the right directions :)
-
kernel as an init param
-
don't hardcode the kernel name in the web UI
-
qstat all jobs at once when counting running ones
-
remove the mu_pred 0/1 for random search
-
web
-
NxN grid mean + acq
- 1d
- 2d
-
interpolation between 2 hyperparam points + plot of slices along an arbitrary 0d, 1d, 2d axis
-
burty!
-
margins in the plots
-
plot of the acq fn
-
plots at the next point
-
-
expected improvement per second
-
rounding goes wrong for ints for some reason
-
collect takes a long time
-
option to define qsub params (or launcher params in general, added by the runner)
-
logscale int is wrong because it only gives 2^n and nothing in between?
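One possible fix, sketched under my reading of the note: sample the int uniformly in log space and round, instead of sampling an integer exponent, so the integers between powers become reachable. The function name and scheme are assumptions, not bopt's code.

```python
import math
import random

def sample_logscale_int(low, high, rng=random):
    """Log-uniform int in [low, high]: uniform in log space, rounded
    to the nearest int, then clipped into the range. Needs low >= 1."""
    u = rng.uniform(math.log(low), math.log(high))
    return min(max(int(round(math.exp(u))), low), high)
```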
-
log how long a job has been running in the bopt experiment
-
log timings
-
parse bash -c time if present; otherwise take finished_at as now
-
see how the hyperparams change over time (with each new point + hyperparam convergence)
- compare with optimizer_restarts()
-
convergence plot
-
back up the yml before overwriting it
-
qstat only the jobs without a result
-
gpy numpy error
-
figure out why propose location returns NaN
-
rounding kernel
- discrete hyperparams - one-hot, or a fixed lengthscale per hyperparam
- sigmoid vs round?
-
logscale
-
bopt delete job_id
-
bopt resubmit job_id
-
seeds
-
the job should be optional
-
a manual sample has no job
-
move run_params under sample
-
mark the WAITING_FOR_SIMILAR status
-
take WAITING_FOR_SIMILAR into account during bopt
-
process WAITING_FOR_SIMILAR in collect_results
-
manual_run - I pick the hyperparams by hand
-
manual_sample - I report the result "by hand"
-
-
check that mu_pred + sigma_pred are used in the right places
- for running jobs I should use mu_pred instead of result (I don't have it yet)
-
collect in all the right places
- never use meta_dir (not needed, we cd)
-
replace print with logging
-
garbage collection of results
-
job status (sample status?)
-
regex that parses the result
-
collect results into the yaml
- while at it, also obtain finished_at
-
failed jobs are not handled
- I assume the next run succeeds - I don't do that, I never retry, I don't account for transient errors
- check the rounded value so I don't run it again
- create a manual sample with mean_pred - sigma_pred (somehow visible) and don't run it further
-
bopt debug - ipdb with bopt imports
-
use the mean when re-running the same point
-
ignore underflows until they cause problems :)
-
detect duplicate IDs and fail
-
delete benchmarks/ until there is time to do it properly
-
manual run doesn't work
-
JOB_ID env variable - not needed, SGE sets it, and for local runs it doesn't matter
-
int range high
-
test everything on MNIST, not on RL MC
-
for manual-run, check that the value is inside the range
- suggest returns values above the upper range
-
logging !!!!
-
show all points in a 2d viz
- pca
-
kernel type: trivial, except it isn't :)
-
acq fn in the yml
-
parallel evaluation: trivial, except it isn't :)
- for unfinished jobs, assume the result is their mean
- option to run with -j 10
- check that we don't evaluate the same point twice
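The "assume the mean for unfinished jobs" idea is essentially the kriging-believer heuristic. A sketch of the imputation step, abstracted away from GPy (all names here are mine):

```python
def augment_with_pending(X, y, pending, posterior_mean):
    """Add still-running points to the training set using their
    posterior mean as a fake observation, so the next proposal is
    pushed away from points that are already being evaluated.

    posterior_mean is any mean-prediction callable fitted on (X, y).
    """
    X_aug = list(X) + list(pending)
    y_aug = list(y) + [posterior_mean(x) for x in pending]
    return X_aug, y_aug
```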
-
int hyperparams: GPy?
-
discrete/categorical parameters
-
SGE Runner: not broken, just not updated to the newer API
-
plot convergence
- kernels! we still don't know which one we like
- acq fn ... same thing
-
parallel evaluation: can we do better than using the mean?
-
duplicate PIDs
-
multijobs
-
intermediate results !!!
- GPy
- cmdline: manual-run, run, run-single, plot, suggest
- serialize the model at every step
- plot all the steps
- consistent -C everywhere
-
where to generate the slice in the plot for EI and for 1d/2d?
- probably an option for bopt plot
- or generate a lot of slices? see jupyter
-
GPy priors (jupyter notebook)
- do we want a prior or just a bound?
- SheffieldML/GPy#735
-
GPyOpt? https://github.com/SheffieldML/GPyOpt
- probably just raid it for ideas
GPSS.CC http://deepbayes.ru
-
plot current max
-
fix vmin/vmax
-
plot STD?
-
yaml sort keys? (custom key order)
-
what if I get a duplicate PID?
- job-PID-1
- use relative paths
-
locking - flock, lockfile
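A minimal flock-based sketch for the locking item (the lockfile name and the helper are made up; fcntl locks are advisory and POSIX-only):

```python
import fcntl
import os
from contextlib import contextmanager

@contextmanager
def experiment_lock(meta_dir):
    """Hold an exclusive advisory lock on <meta_dir>/.lock so that
    concurrent processes don't rewrite meta.yml at the same time."""
    path = os.path.join(meta_dir, ".lock")
    with open(path, "w") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            yield
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```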
-
bopt cmdline
- creating an experiment
- launching jobs
- single shot run
- forever
- lock
- sync
- run
- sync
- unlock
- sleep
- suggest
- a formatted command with newlines: bopt manual-run --x=1 --y=3
- bopt manual-run
- bopt web
- sync + render
- bopt job -c DIR ID
- bopt exp -c DIR
- bopt plot -c DIR
-
I can't ask for a sample's results without knowing the job's output directory
- how and when should I pour job outputs into samples? should I do it at all?
-
storing meta_dir? right now I have to pass it everywhere, but once I serialize it, nothing can be moved
-
where does the result parser belong? see stealing stdout
-
how to launch/schedule multiple runs for a multijob?
-
cmdline arg format for init? vs template.yml
bopt init results/mc ./.venv/bin/python ./experiments/rl/monte_carlo.py
bopt init -p parser neur
bopt init -p file_parser[fname] neur
mkfifo p
neur 2>p | parser
parser < p
bopt init --param "gamma:float:0:1" --param "epsilon:float:0:1" --dir results/mc ./.venv/bin/python ./experiments/rl/monte_carlo.py
-
logging? plots when an assert fails ... soft assert?
-
put only the max point into the slice (we can pick it)
- the sample number
-
measure MI between dimensions?
-
double fork pajp.py
-
where do I get the finish date from?
-
flake8 + black
-
experiment
- hyperparams
- runner
- samples (noise?)
-
params + result
-
per-sample noise?
-
model
-
kernel
-
random search
-
gp
-
-
job
- computation
- result parser
-
- last model
-
the acq function gets an optimizer
- max f(posterior(R|data))
-
plots for the kernel params
-
levenberg marquardt
-
test x^2
-
marginal & conditional plots
-
share z-axis in plots
-
doublefork children so they don't need to be awaited & can survive crash of parent
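A sketch of the double-fork item: the middle child exits immediately, so the grandchild is reparented to init, never needs to be waited on, and keeps running even if the parent crashes. The helper name is mine.

```python
import os

def spawn_detached(target, *args):
    """Run target(*args) in a double-forked grandchild process."""
    pid = os.fork()
    if pid > 0:
        os.waitpid(pid, 0)  # reap the short-lived middle child
        return
    os.setsid()             # middle child: detach into a new session
    if os.fork() > 0:
        os._exit(0)         # middle child exits right away
    try:
        target(*args)       # grandchild does the real work
    finally:
        os._exit(0)         # never fall back into the parent's code
```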
-
noise optimization
-
time slider
-
discrete hyperparameters
-
priors
-
UCB acquisition function
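The UCB item reduces to UCB(x) = mu(x) + kappa * sigma(x) for maximization; a sketch with a toy candidate argmax on top (the kappa default and all names are mine):

```python
def ucb(mu, sigma, kappa=2.0):
    """GP-UCB acquisition for maximization: exploit the posterior mean,
    explore via kappa times the posterior std."""
    return mu + kappa * sigma

def propose_ucb(candidates, posterior, kappa=2.0):
    """Pick the candidate maximizing UCB; posterior(x) -> (mu, sigma)."""
    return max(candidates, key=lambda x: ucb(*posterior(x), kappa))
```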
-
parallel optimization without cmdline target
-
expected improvement per second (hyperparam affects training time)
-
look into approximate GP inference ... at which point would we need it?
-
predicting training curves
- ability to stop a job when it looks like it won't work out
-
display the max correctly
-
see the hyperparam values at the points in the slices
-
show the fixed params for each slice
-
show the params of the best point
-
show the points currently being evaluated
- their value from the GP (and how will it turn out?)
-
a slider over the whole visualization
-
timestamp job starts and ends
-
multijobs
- megajob and single separately
-
seeds for jobs
- the name of the parameter through which the seed is passed