LOSS is not declining #24

Open
YerongLi opened this issue Aug 29, 2024 · 1 comment

@YerongLi

I found that with the original training workflow the loss is not declining. I am not sure whether this is because I am using a subset of the training set.

# File modified by authors of InstructDiffusion from original (https://github.com/CompVis/stable-diffusion).
# See more details in LICENSE.

model:
  base_learning_rate: 1.0e-04
  weight_decay: 0.01
  target: ldm.models.diffusion.ddpm_edit.LatentDiffusion
  params:
    fp16: True
    deepspeed: 'deepspeed_1'
    ckpt_path: stable_diffusion/models/ldm/stable-diffusion-v1/v1-5-pruned-emaonly-adaption-task.ckpt
    linear_start: 0.00085
    linear_end: 0.0120
    num_timesteps_cond: 1
    log_every_t: 200
    timesteps: 1000
    first_stage_key: edited
    cond_stage_key: edit
    image_size: 32
    channels: 4
    cond_stage_trainable: false   # Note: different from the one we trained before
    conditioning_key: hybrid
    monitor: val/loss_simple_ema
    scale_factor: 0.18215

    scheduler_config: # 10000 warmup steps
      target: ldm.lr_scheduler.LambdaLinearScheduler
      params:
        warm_up_steps: [ 0 ]
        cycle_lengths: [ 10000000000000 ] # incredibly large number to prevent corner cases
        f_start: [ 1.e-6 ]
        f_max: [ 1. ]
        f_min: [ 1. ]

    unet_config:
      target: ldm.modules.diffusionmodules.openaimodel.UNetModel
      params:
        image_size: 32 # unused
        in_channels: 8
        out_channels: 4
        model_channels: 320
        attention_resolutions: [ 4, 2, 1 ]
        num_res_blocks: 2
        channel_mult: [ 1, 2, 4, 4 ]
        num_heads: 8
        use_spatial_transformer: True
        transformer_depth: 1
        context_dim: 768
        use_checkpoint: True
        legacy: False
        force_type_convert: True

    first_stage_config:
      target: ldm.models.autoencoder.AutoencoderKL
      params:
        embed_dim: 4
        monitor: val/rec_loss
        ddconfig:
          double_z: true
          z_channels: 4
          resolution: 256
          in_channels: 3
          out_ch: 3
          ch: 128
          ch_mult:
          - 1
          - 2
          - 4
          - 4
          num_res_blocks: 2
          attn_resolutions: []
          dropout: 0.0
        lossconfig:
          target: torch.nn.Identity

    cond_stage_config:
      target: ldm.modules.encoders.modules.FrozenCLIPEmbedder

data:
  target: main.DataModuleFromConfig
  params:
    batch_size: 2
    num_workers: 4
    train: 
      - ds1:
        target: dataset.editing.edit_zip_dataset.GIERDataset
        params:
          path: data/GIER_editing_data/
          split: train
          min_resize_res: 256
          max_resize_res: 256
          crop_res: 256
          flip_prob: 0.0
          zip_start_index: 0
          zip_end_index: 100
          sample_weight: 2.0

    validation:
      target: dataset.pose.pose.COCODataset
      params:
        root: data/coco/
        image_set: val2017
        is_train: False
        max_prompt_num: 5
        min_prompt_num: 1
        radius: 10
trainer:
  initial_scale: 13
  max_epochs: 200
  save_freq: 20
  accumulate_grad_batches: 1
  clip_grad: 0.0
  optimizer: adamw

[Screenshot: wandb.ai training curves, 29 Aug 2024]
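
For context, if I am reading scheduler_config correctly, warm_up_steps: [0] together with f_max = f_min = 1.0 gives a constant multiplier, so the effective LR should stay at base_learning_rate for the whole run. A minimal sketch of what I assume those values imply (not the actual ldm.lr_scheduler.LambdaLinearScheduler code):

def lr_multiplier(step, warm_up_steps=0, cycle_length=10_000_000_000_000,
                  f_start=1e-6, f_max=1.0, f_min=1.0):
    # Assumed behaviour: linear warmup from f_start to f_max, then a linear
    # ramp from f_max down to f_min over the (huge) cycle length.
    if step < warm_up_steps:
        return f_start + (f_max - f_start) * step / warm_up_steps
    return f_min + (f_max - f_min) * (cycle_length - step) / cycle_length

# With the values from the config above the multiplier is 1.0 everywhere,
# so the optimizer runs at base_learning_rate = 1.0e-04 throughout.
print(lr_multiplier(0), lr_multiplier(500_000))  # -> 1.0 1.0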

@YerongLi (Author)

I also don't understand why you are not logging the loss in

def train_one_epoch(config, model, model_ema, data_loader, val_data_loader, optimizer, epoch, lr_scheduler, scaler):
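
For reference, here is a rough sketch (not the repo's actual code) of where I would have expected per-step logging to live; compute_diffusion_loss and the 50-step interval are placeholders of mine:

import wandb  # the run already reports to wandb.ai, judging from the screenshot

def train_one_epoch(config, model, model_ema, data_loader, val_data_loader,
                    optimizer, epoch, lr_scheduler, scaler):
    model.train()
    for step, batch in enumerate(data_loader):
        optimizer.zero_grad()
        loss = compute_diffusion_loss(model, batch)  # hypothetical helper standing in
                                                     # for however the loss is computed
        scaler.scale(loss).backward()                # fp16 training per the config
        scaler.step(optimizer)
        scaler.update()
        lr_scheduler.step()
        if step % 50 == 0:                           # arbitrary logging interval
            wandb.log({"train/loss": loss.item(),
                       "train/lr": optimizer.param_groups[0]["lr"],
                       "epoch": epoch})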
