restrucured in the readmes
dekvall committed May 8, 2019
1 parent d5e1652 commit cb7315b
Showing 3 changed files with 28 additions and 31 deletions.
36 changes: 5 additions & 31 deletions README.md
@@ -4,27 +4,23 @@

Make sure you have `virtualenv` and `python 3.5+` installed


```bash
bash install.sh
```
This will activate the virtualenv and install the proper packages.

You will probably also need [Git LFS](https://git-lfs.github.com/) to track various datasets

To then launch the jupyter instance use
```bash
jupyter notebook
```
And you should be directed to `localhost:8888`. In the future we should set up the gcloud instance with the same thing. But it seems some bureaucracy got in the way for now.
And you should be directed to `localhost:8888`.

### Project structure
* [Models](models) - The resulting generated models
* [Utilities](utils) - Various tools such as visualization etc.
* [Scripts](scripts) - The scripts used for the project
* [Datasets](datasets) - The datasets used for the project


## Resources

### Voice to text
@@ -46,23 +42,9 @@ And you should be directed to `localhost:8888`. In the future we should set up t

* Adversarial loss [short](https://www.quora.com/What-is-adversarial-loss-in-machine-learning) [paper](https://arxiv.org/pdf/1901.08753.pdf)

### Datasets

* Conceptual Captions [link](https://ai.google.com/research/ConceptualCaptions/download)

* Flickr 30k [link](https://www.kaggle.com/hsankesara/flickr-image-dataset/version/1)

* TIMIT Speech corpus [link](https://catalog.ldc.upenn.edu/LDC93S1)

I think that we should use the flickr dataset as the 30k images should really be enough in the limited time we have.
* Training GANs, [Tips and Tricks](https://github.com/soumith/ganhacks)

Once you have downloaded the Flickr dataset, extract it and run the resize script
located in the `flickr30k_images` folder, from within that folder:
```bash
bash resize_images.sh
```

## Further notes
## Further information

### Report
The report Overleaf is available [here](https://www.overleaf.com/4488118745cjmprgwyfxcw)
@@ -72,17 +54,8 @@ The report Overleaf is available [here](https://www.overleaf.com/4488118745cjmpr
We might have to use a bag-of-words model or some other form of context presentation to simplify what the sentence says; look into this further.

### Training GANS
* [Tips and Tricks](https://github.com/soumith/ganhacks)

## GCP
### Running StackGAN
Run StackGAN on GCP from the code folder with
```bash
python2 main.py --cfg cfg/coco_eval.yml --gpu 0
```
Contrary to popular belief, setting `--gpu 0` here actually refers to the id of the GPU. In most other cases `gpu 0` refers to CPU mode. Weird.

The generated images will be stored in the `models/coco/netG_epoch_90` directory.
## Google Cloud Platform

### Jupyter notebooks
To use Jupyter notebooks, run this on the remote VM:
@@ -95,6 +68,7 @@ Then tunnel your connection through
david@fridge:~$ ssh -N -L localhost:8888:localhost:8888 david@<EXTERNAL_IP_OF_VM>
```
Then open a browser at `localhost:8888` and, to connect, enter the token shown in the command-line window on the VM.
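
As a rough sketch of what the tunnel command does: `-N` tells `ssh` not to run a remote command, and `-L` forwards a local port to a port on the VM. The helper name `tunnel_cmd` and the example IP `203.0.113.7` are made up for illustration and are not part of this repository. The function only prints the command, so it can be checked before running it.

```bash
# Hypothetical helper (not part of the repo): print the ssh tunnel command
# for a given user, VM address and port. The port defaults to 8888.
tunnel_cmd() {
  local user="$1" host="$2" port="${3:-8888}"
  # -N: do not execute a remote command, only forward ports
  # -L: forward localhost:<port> on this machine to localhost:<port> on the VM
  printf 'ssh -N -L localhost:%s:localhost:%s %s@%s\n' "$port" "$port" "$user" "$host"
}

# Example with a placeholder address; substitute the external IP of the VM.
tunnel_cmd david 203.0.113.7
```

Passing a third argument changes the forwarded port, which may help if `8888` is already in use locally.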

### Show results
The images are viewable in Python notebooks and can also be downloaded from there.

15 changes: 15 additions & 0 deletions datasets/README.md
@@ -0,0 +1,15 @@
# Datasets

* Conceptual Captions [link](https://ai.google.com/research/ConceptualCaptions/download)

* Flickr 30k [link](https://www.kaggle.com/hsankesara/flickr-image-dataset/version/1)

* TIMIT Speech corpus [link](https://catalog.ldc.upenn.edu/LDC93S1)

I think that we should use the flickr dataset as the 30k images should really be enough in the limited time we have.

Once you have downloaded the Flickr dataset, extract it and run the resize script
located in the `flickr30k_images` folder, from within that folder:
```bash
bash resize_images.sh
```
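
Since the script has to be run from inside the dataset folder, a small wrapper along these lines might make that harder to get wrong. This is only a sketch: the function name `run_resize` is hypothetical, and it assumes the folder is called `flickr30k_images` as described above. What `resize_images.sh` itself does is not shown here.

```bash
# Hypothetical wrapper (not part of the repo): run resize_images.sh from
# inside the dataset folder, wherever the wrapper itself is called from.
run_resize() {
  local dir="${1:-flickr30k_images}"
  if [ ! -f "$dir/resize_images.sh" ]; then
    echo "resize_images.sh not found in $dir" >&2
    return 1
  fi
  # The subshell keeps the caller's working directory unchanged.
  (cd "$dir" && bash resize_images.sh)
}
```

Calling `run_resize` with no argument looks for `flickr30k_images` in the current directory; a different path can be passed as the first argument.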
8 changes: 8 additions & 0 deletions inspiration/README.md
@@ -0,0 +1,8 @@
## Running StackGAN
Run StackGAN on GCP from the code folder with
```bash
python2 main.py --cfg cfg/coco_eval.yml --gpu 0
```
Contrary to popular belief, setting `--gpu 0` here actually refers to the id of the GPU. In most other cases `gpu 0` refers to CPU mode. Weird.

The generated images will be stored in the `models/coco/netG_epoch_90` directory.
