Fixing invalid timestamps identified by maintenance tool (#1887)
As part of the work for bootstrapping our maintenance sprint, a series
of articles was identified with invalid timestamps. This PR fixes those
timestamps so we can get reliable information about last-updated dates.

In addition to the timestamp changes, my editor automatically removed
trailing whitespace from the files.
erikaheidi authored Nov 7, 2024
1 parent 58b79d8 commit c31995b
Showing 20 changed files with 186 additions and 181 deletions.
@@ -3,8 +3,8 @@ title: "Differences Between Development and Production Images"
linktitle: " Dev vs. Production Images"
type: "article"
description: "Learn about the differences between development and production Chainguard Images"
-date: 2024-11-01:07:52+02:00
-lastmod: 2024-11-01:07:52+02:00
+date: 2024-11-01T07:52+02:00
+lastmod: 2024-11-01T07:52+02:00
draft: false
tags: ["Chainguard Images", "Product", ]
images: []
@@ -17,7 +17,7 @@ toc: true

Chainguard Images follow a distroless philosophy, meaning that only software absolutely necessary for a specific workload is included in an image. For this reason, most Chainguard Images come in two variants:

- **Production**: These images provide a runtime for production workloads. Designed to be as minimal as possible, production images do not contain package managers such as apk, shells such as bash, or development utilities such as Git or text editors. Production Chainguard Images are tagged `:latest`.
- **Development**: These images are designed for development tasks such as building, testing, or debugging. They can be used to build software artifacts that are then copied into production images as part of a multi-stage build, or to test workflows interactively in an environment similar to a production image. Development images contain familiar utilities such as package managers and shells. While our production images have advantages related to security, development images are also secure and production-ready. Development images are tagged `:latest-dev`.

While we encourage you to use production images in your live deployments, development images are useful for many parts of the dev lifecycle. This article explains some of the key differences between these variants and outlines ways these variants come together in creating a secure deployment.
@@ -26,9 +26,9 @@ While we encourage you to use production images in your live deployments, develo

Our production images have the following advantages:

- Production images contain fewer packages. While Chainguard moves quickly to patch CVEs in all images, production images still experience fewer CVEs overall. Reducing the number of packages also reduces the potential number of unknown vulnerabilities that might apply to an image.
- Not all executables are created equal. Shells such as bash, package managers such as apk, and communication-ready utilities such as Git and curl are general-purpose tools that are broadly exploitable.
- A smaller image can use fewer resources and reduce deployment time. In some cases, especially with already-large images, a smaller version can make a deployment more stable or robust.
- Removing unnecessary components increases the observability and transparency of the image. Reducing the number of components can facilitate risk assessment or post-incident reporting.

While our production images can be considered to have advantages for security, the development variants of Chainguard Images are also low-to-no CVE, include useful attestations such as SLSA provenance and SBOMs, and follow other security best practices. You should feel comfortable using these secure images in production if they better fit your use case.
@@ -38,15 +38,20 @@ While our production images can be considered to have advantages for security, t
Though we encourage the use of production images in your final deployment, development images have many use cases. These include:

- **Building**: In many Dockerfile builds, you will need to generate software artifacts such as static binaries or virtual environments as part of the build process. Development images are ideal for this use case; after generation, artifacts can be copied into a production image for use, as in the sketch following this list. See [How to Port a Sample Application to Chainguard Images](/chainguard/migration/porting-apps-to-chainguard/) for a detailed example.
- **Debugging**: Our development images contain a number of useful utilities, but are otherwise designed to be as close as possible to the production variant. This makes them useful for debugging, since you can test out build steps or the build environment using interactive shells and package managers. See [Debugging Distroless Images](/chainguard/chainguard-images/debugging-distroless-images/) for more on this use case.
- **Training**: In the case of AI images, you can use a development variant to train a model, then run the model in inference using a production image.
- **Deploying**: Though we encourage you to use our production images in your live deployment where possible, our development images are low-to-no CVE and are suitable for production.
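
A minimal sketch of this build-then-copy pattern (the image references, paths, and filenames here are illustrative, not taken from the article):

```Dockerfile
# Illustrative multi-stage build: install dependencies in the development
# variant, then copy the virtual environment into the production variant.
FROM cgr.dev/chainguard/python:latest-dev AS builder
WORKDIR /app
RUN python -m venv /app/venv
ENV PATH="/app/venv/bin:$PATH"
COPY requirements.txt .
RUN pip install -r requirements.txt

FROM cgr.dev/chainguard/python:latest
WORKDIR /app
COPY --from=builder /app/venv /app/venv
ENV PATH="/app/venv/bin:$PATH"
COPY main.py .
ENTRYPOINT ["python", "main.py"]
```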

## Special Considerations

It’s likely already clear that switching to our production images requires a few changes in development and deployment. Here are a few additional considerations:

* Since we don’t include general-purpose shells in our production images, the entrypoint to these images will vary by each image’s use case. Check the documentation for each image, and note that Dockerfile commands such as `CMD` will be directed to the image-specific entrypoint. Because we aim to keep our development images as close as possible to our production images, these changes to entrypoint also affect development images.
* Chainguard Images use a less privileged user by default. When using our development images, you will need to explicitly access the image with the root user (for example, by using the `--user root` option) to perform tasks such as installing packages with apk, as in the example below.
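
A hypothetical example of the root-access pattern (the image name here is illustrative):

```bash
# Start a shell in a development image as the root user so that apk
# can install packages; the image reference is an assumption.
docker run -it --user root --entrypoint /bin/sh cgr.dev/chainguard/python:latest-dev

# Then, inside the container:
apk update && apk add git
```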

## Conclusion
@@ -55,7 +60,7 @@ Taking the step into distroless by using our production Chainguard Images can be

## Resources

* [Blog: Minimal container images: Towards a more secure future](https://www.chainguard.dev/unchained/minimal-container-images-towards-a-more-secure-future)
* [Chainguard Academy: Overview of Chainguard Images](/chainguard/chainguard-images/overview#why-distroless)
* [Chainguard Academy: Debugging Distroless Images](/chainguard/chainguard-images/debugging-distroless-images/)

@@ -2,11 +2,11 @@
title: "Getting Started with the NeMo Chainguard Image"
type: "article"
linktitle: "NeMo"
aliases:
- /chainguard/chainguard-images/getting-started/nemo
description: "Get started with the NeMo Chainguard Image for generative deep learning"
-date: 2024-05-16:08:00+02:00
-lastmod: 2024-05-16:08:00+02:00
+date: 2024-05-16T08:00:00+02:00
+lastmod: 2024-05-16T08:00:00+02:00
tags: ["Chainguard Images", "Products"]
draft: false
images: []
@@ -29,7 +29,7 @@ This guide is primarily designed for use in an environment with access to one or

## Prerequisites

If Docker Engine (or Docker Desktop) is not already installed, follow the [instructions for installing Docker Engine on your host machine](https://docs.docker.com/engine/install/).

To take advantage of connected GPUs, you'll need to install CUDA Toolkit on your host machine.
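
Before testing access from inside a container, you can confirm on the host (assuming the NVIDIA driver and CUDA Toolkit are installed) that the GPUs are visible:

```bash
# Verify that the NVIDIA driver can enumerate the attached GPUs.
nvidia-smi

# Verify the installed CUDA Toolkit compiler version.
nvcc --version
```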

@@ -81,7 +81,7 @@ Once you've determined that your environment has access to CUDA and connected GP

## NeMo Overview

NeMo is a generative AI toolkit and framework with a focus on conversational AI tasks such as NLP, ASR, and TTS, as well as large language models (LLM) and multimodal (MM) models. NeMo uses a system of [neural modules](https://docs.nvidia.com/nemo-framework/user-guide/latest/nemotoolkit/core/neural_modules.html), an abstraction over a variety of common elements in model training and inference such as encoders, decoders, loss functions, layers, or models. NeMo also provides [collections of modules](https://docs.nvidia.com/nemo-framework/user-guide/latest/nemotoolkit/collections.html) targeting specific areas of concern in conversational and generative AI, such as LLMs, speech AI / NLP, and TTS.

NeMo is built on [PyTorch Lightning](https://lightning.ai/docs/pytorch/stable/), a high-level interface to PyTorch with a focus on scalability, and uses the [Hydra](https://hydra.cc/) library for configuration management.

@@ -128,11 +128,11 @@ docker run -it --rm \
```
Note that we ran the above script as root. This allows us to share the script and output `.wav` file between the host and container. Remember not to run your image as root in a production environment.

If your host machine does not have attached GPUs and you'd like to run the above on your CPU, omit the ` --gpus all \` line. The script tests for availability of the CUDA platform and sets the accelerator to CPU if CUDA is not detected, so the script will also function on CPU.
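
The fallback logic is likely similar to this sketch (not the script's exact code):

```python
# Sketch: select the accelerator based on CUDA availability, as the
# script does before running the text-to-speech pipeline.
import torch

accelerator = "gpu" if torch.cuda.is_available() else "cpu"
print(f"Using accelerator: {accelerator}")
```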

Since we're using pretrained models to perform text-to-speech, this example should take only a few minutes even on a CPU. However, other tasks such as model training and fine-tuning may take significantly longer without connected GPUs.

Note that NeMo collections are large, and initial imports can take up to a minute depending on your environment. The script may appear to hang during that time.

After imports are complete, you should see a large amount of output as NeMo pulls models and works through the steps in the script (tokenizing, generating a spectrogram, generating audio, and writing audio to disk). On completion, the script outputs a `test.wav` file. Because we mounted a volume, this file should now be present in the working directory of your host machine.

@@ -152,7 +152,7 @@ The `test.wav` file should contain audio similar to this output:

This section will consider next steps for applying the NeMo Chainguard Image to other tasks in conversational AI.

In the [tts.py](https://github.com/chainguard-dev/nemo-examples/blob/main/tts.py) script run above, we used two models provided by NeMo, both contained within the TTS collection; the sketch following this list shows how the two stages fit together.

- [Tacotron2 speech synthesis model](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/tts_en_tacotron2)
- [HiFi-GAN speech synthesis model](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/speechsynthesis_hifigan)
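
As a rough sketch of how these two models combine into a text-to-speech pipeline (the pretrained model names and sample rate are assumptions; verify them with each class's `list_available_models()` method):

```python
# Sketch of the two-stage TTS pipeline: Tacotron2 produces a spectrogram
# from text, and HiFi-GAN converts the spectrogram to audio.
import soundfile as sf
from nemo.collections.tts.models import HifiGanModel, Tacotron2Model

spec_generator = Tacotron2Model.from_pretrained("tts_en_tacotron2")
vocoder = HifiGanModel.from_pretrained("tts_hifigan")

tokens = spec_generator.parse("A whale of a tale.")
spectrogram = spec_generator.generate_spectrogram(tokens=tokens)
audio = vocoder.convert_spectrogram_to_audio(spec=spectrogram)

# 22050 Hz is the sample rate these models are commonly trained at.
sf.write("test.wav", audio.to("cpu").detach().numpy()[0], 22050)
```
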
@@ -2,12 +2,12 @@
title: "Getting Started with the PyTorch Chainguard Image"
type: "article"
linktitle: "PyTorch"
aliases:
- /chainguard/chainguard-images/getting-started/getting-started-pytorch-cuda12
- /chainguard/chainguard-images/getting-started/getting-started-pytorch
description: "Tutorial on the PyTorch Chainguard Image"
-date: 2024-04-25:08:00+02:00
-lastmod: 2024-04-25:08:00+00:00
+date: 2024-04-25T08:00:00+02:00
+lastmod: 2024-04-25T08:00:00+00:00
tags: ["Chainguard Images", "Products"]
draft: false
images: []
@@ -34,7 +34,7 @@ This guide is designed for use in an environment with access to one or more NVID

Our first step is to check whether our PyTorch-CUDA environment has access to connected GPUs.

If you don't already have Docker Engine installed, follow the [instructions for installing Docker Engine on your host machine](https://docs.docker.com/engine/install/).

Run the below command to pull the image, run it with GPU access, and start a Python interpreter inside the running container.
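
An illustrative version of that command (the image reference is an assumption, since the diff collapses the original invocation):

```bash
# Pull the image, expose all GPUs to the container, and start the
# image's interactive Python entrypoint. Image reference is illustrative.
docker run -it --rm --gpus all cgr.dev/chainguard/pytorch:latest
```

Inside the interpreter, `torch.cuda.is_available()` should return `True` when the container has GPU access.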

@@ -146,19 +146,19 @@ In the below steps, the prompt of your host machine will be denoted as `(host) $
5. Run the model-training script:
```bash
(container) $ python image_classification.py
Downloading: "https://download.pytorch.org/models/resnet18-f37072fd.pth" to /root/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth
100.0%
πŸ™ Epoch 0/24
🐳 train Loss: 0.9276 Acc: 0.5583
🐧 val Loss: 0.2275 Acc: 0.9500
[...]
πŸ™ Epoch 24/24
🐳 train Loss: 0.1940 Acc: 0.9167
🐧 val Loss: 0.0248 Acc: 1.0000
Training complete in 1m 39s
Best val Acc: 1.000000
```
@@ -212,7 +212,7 @@ Feel free to try the above inference on other images of octopuses, whales, and p
In this section, we'll review the script provided in the above steps, highlighting some common options and approaches and a few ways the script might be adapted to other use cases. Deep learning is a complex and emerging field, so this section can only provide a high-level overview and a few recommendations for moving forward.

To fine-tune a model for image classification as we did here, you can replace the provided training and validation data with your own. The script examines the number of folders in the training set to determine the targeted number of classes. The folder names are used as class labels. We used 40 training and 20 validation images for each class, but a ratio of 5:1 training to validation may also produce good results.
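
A hypothetical layout for custom data, following the folder-per-class convention described above (the directory names and class labels here are invented for illustration):

```
data/
├── train/
│   ├── octopus/   # 40 training images per class
│   └── whale/
└── val/
    ├── octopus/   # 20 validation images per class
    └── whale/
```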

By fine-tuning a pretrained model, we took advantage of transfer learning, meaning that the pretrained model (resnet18) was already trained on inputs with relevance to our classification task. Because we used transfer learning, the relatively small amount of input data was still sufficient for good accuracy in our fine-tuned model. If you're working with a large amount of input data, you might consider using a larger pretrained model, such as resnet34. In addition, if you're training on significantly more data, or with limited computation relative to the task, you might consider the more efficient "convolutional neural network as fixed feature extractor" approach, which trains only a newly attached layer rather than updating the original model.
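
A sketch of that fixed-feature-extractor variant (assuming resnet18 and two classes purely for illustration):

```python
# Freeze the pretrained backbone and train only a newly attached final
# layer; cheaper than fine-tuning every weight in the network.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False  # backbone weights stay fixed

num_classes = 2  # illustrative; the script infers this from the folder count
model.fc = nn.Linear(model.fc.in_features, num_classes)  # trainable head

# Only the new layer's parameters are passed to the optimizer.
optimizer = torch.optim.SGD(model.fc.parameters(), lr=0.001, momentum=0.9)
```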