# Release 2.0.0 - Video Challenges V1
## Update Steps
### Validator Update Steps
- Ensure you are running on a machine that meets the requirements specified in `min_compute.yaml`
  - Note the 80 GB VRAM GPU and the recommended 600 GB of storage
- If on autoupdate, no other action needed
- If not on autoupdate, run:

```bash
git pull
./setup_env.sh
pm2 delete bitmind_validator
pm2 delete run_neuron  # if using our run script
pm2 start run_neuron.py -- --validator
```
If you wish to turn off autoupdate or self-healing restarts, you can instead start your validator with either

```bash
pm2 start run_neuron.py -- --validator --no-auto-update --no-self-heal
```

or

```bash
./start_validator.sh
```
**NOTE:** Our startup script `run_neuron.py` (which calls `start_validator.py`, which is responsible for spawning the pm2 processes) now starts the processes shown below. Do not manually run `neurons/validator.py`.
```
┌────┬────────────────────────┬───────────┬─────────┬──────┬─────────┬────────┬───┬────────┬─────┬────────┬──────┬──────────┐
│ id │ name                   │ namespace │ version │ mode │ pid     │ uptime │ ↺ │ status │ cpu │ mem    │ user │ watching │
├────┼────────────────────────┼───────────┼─────────┼──────┼─────────┼────────┼───┼────────┼─────┼────────┼──────┼──────────┤
│ 2  │ bitmind_cache_updater  │ default   │ N/A     │ fork │ 3354129 │ 14s    │ 0 │ online │ 0%  │ 1.7gb  │ user │ disabled │
│ 3  │ bitmind_data_generator │ default   │ N/A     │ fork │ 3354163 │ 13s    │ 0 │ online │ 0%  │ 1.8gb  │ user │ disabled │
│ 1  │ bitmind_validator      │ default   │ N/A     │ fork │ 3354098 │ 15s    │ 0 │ online │ 0%  │ 1.8gb  │ user │ disabled │
│ 0  │ run_neuron             │ default   │ N/A     │ fork │ 3353994 │ 52s    │ 0 │ online │ 0%  │ 10.5mb │ user │ disabled │
└────┴────────────────────────┴───────────┴─────────┴──────┴─────────┴────────┴───┴────────┴─────┴────────┴──────┴──────────┘
```
- `run_neuron` manages self-heal restarts and auto-update
- `bitmind_validator` is your validator process
- `bitmind_data_generator` populates `~/.cache/sn34/synthetic` with outputs from text-to-video and text-to-image models. This is the only sn34 validator process that uses the GPU.
- `bitmind_cache_updater` manages the cache of real images and videos at `~/.cache/sn34/real`
### Miner Update Steps
- For deployed miner instances, no immediate action is required
- For orientation around video detection model training and deployment, please see the Miner Updates section below
- Note that reward distribution will initially be split 90% to image challenges, 10% to video challenges to allow miners the opportunity to iterate without being significantly impacted.
## Overview
v2.0.0 features our initial version of video challenges, with three text-to-video models and two large real video datasets amounting to nearly 10 TB of video data. It also contains significant refactors of core subnet components, with an emphasis on our approach to sampling and generating data for validator challenges. More on this below.
It also includes code to train and deploy a deepfake video detection model called TALL (whitepaper) as an SN34 miner.
Reward distribution will initially be split 90% to image challenges, 10% to video challenges. This is meant to give miners the freedom to explore and experiment with different video detection models without a significant impact on incentive distribution, while still allowing strong video performance to provide upward mobility.
To test miner generalizability, video challenges include a chronologically ordered, randomly located set of video frames sampled at a variable frame rate. Frames are extracted with a PNG codec, with JPEG compression randomly applied as part of our pre-existing augmentation pipeline.
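For intuition, here is a minimal sketch of this style of sampling using OpenCV; the function name, frame-count bounds, and stride logic are illustrative assumptions, not the subnet's exact implementation:

```python
# Minimal sketch of variable-frame-rate challenge sampling (illustrative only).
import random
import cv2

def sample_challenge_frames(mp4_path: str, min_frames: int = 8, max_frames: int = 24):
    """Sample a chronologically ordered run of frames at a random stride."""
    cap = cv2.VideoCapture(mp4_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    num_frames = random.randint(min_frames, max_frames)
    stride = random.randint(1, max(1, total // num_frames))  # variable effective frame rate
    start = random.randint(0, max(0, total - num_frames * stride))
    frames = []
    for i in range(num_frames):
        cap.set(cv2.CAP_PROP_POS_FRAMES, start + i * stride)
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)  # frames remain in chronological order
    cap.release()
    return frames
```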
This release is our initial version of video challenges. Upcoming releases will include:
- Improved output quality from existing text-to-video models
- Additional SoTA text-to-video models, like open-sora
- Additional real image dataset sources
- Improved VideoSynapse efficiency
### Models
- https://huggingface.co/genmo/mochi-1-preview
- https://huggingface.co/THUDM/CogVideoX-5b
- https://huggingface.co/ByteDance/AnimateDiff-Lightning
### Datasets
Initial selection for providing real videos. More to come; these are subject to change.
- https://huggingface.co/datasets/nkp37/OpenVid-1M
- https://huggingface.co/datasets/shangxd/imagenet-vidvrd
Replacement of the open-images-v7 URL dataset with a 256x256 JPEG subset (1.9M images):
- https://huggingface.co/datasets/bitmind/open-image-v7-256
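Assuming the standard Hugging Face datasets API (the split and column names here are assumptions), the new subset can be pulled directly rather than fetched image-by-image over URLs:

```python
# Sketch: load the JPEG subset with the standard datasets API.
from datasets import load_dataset

ds = load_dataset("bitmind/open-image-v7-256", split="train")  # split name assumed
print(ds[0]["image"])  # assumes an 'image' column; a PIL image if so
```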
## Miner Updates
### TALLDetector + VideoDataset
#### Model Training
- Miners can use `base_miner/datasets/create_video_dataset.py` (example usage in `create_videos_dataset_example.sh`) to transform a directory of mp4 files into a train-ready video frames dataset. This involves extracting individual frames from the mp4s and creating a local Hugging Face dataset that references the extracted frames during training. A rough sketch of this transformation is shown below.
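This sketch assumes OpenCV for frame extraction and the Hugging Face datasets API; the function, path, and column names are illustrative, so see the actual script for its real interface:

```python
# Hypothetical sketch of an mp4-directory -> frame-dataset transformation.
from pathlib import Path
import cv2
from datasets import Dataset, Image

def build_frame_dataset(mp4_dir: str, out_dir: str) -> Dataset:
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    records = []
    for video in sorted(Path(mp4_dir).glob("*.mp4")):
        cap = cv2.VideoCapture(str(video))
        idx = 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            frame_path = out / f"{video.stem}_{idx:05d}.png"
            cv2.imwrite(str(frame_path), frame)  # extract each frame to disk
            records.append({"image": str(frame_path), "video_id": video.stem, "frame_idx": idx})
            idx += 1
        cap.release()
    # Build a local HF dataset whose 'image' column references the extracted frames
    ds = Dataset.from_list(records).cast_column("image", Image())
    ds.save_to_disk(str(out / "hf_dataset"))
    return ds
```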
#### Miner Deployment
- `miner.env` now has separate env vars for configuring both an image detection model and a video detection model.
- These models reside within the same miner process, listening on the single miner axon port, and respectively respond to requests of type `ImageSynapse` and `VideoSynapse` (see the sketch below).
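As a sketch of this single-process, dual-model setup (the synapse field names, stand-in detectors, and protocol import path are assumptions for illustration, not the repo's exact API):

```python
# Illustrative sketch only; field names and helpers are assumptions.
import bittensor as bt
from bitmind.protocol import ImageSynapse, VideoSynapse  # assumed import path

def image_detector(image) -> float:
    return 0.5  # stand-in for the image model configured in miner.env

def video_detector(video) -> float:
    return 0.5  # stand-in for the video model configured in miner.env

async def forward_image(synapse: ImageSynapse) -> ImageSynapse:
    synapse.prediction = image_detector(synapse.image)  # 'image'/'prediction' fields assumed
    return synapse

async def forward_video(synapse: VideoSynapse) -> VideoSynapse:
    synapse.prediction = video_detector(synapse.video)  # 'video' field assumed
    return synapse

# One axon on a single port serves both request types; bittensor routes each
# incoming synapse to the handler whose type annotation matches.
axon = bt.axon(wallet=bt.wallet())
axon.attach(forward_fn=forward_image).attach(forward_fn=forward_video)
axon.serve(netuid=34, subtensor=bt.subtensor()).start()
```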
## Validator Optimizations and Refactors
### SyntheticDataGenerator
- Formerly known as SyntheticImageGenerator, this class handles generation of videos in addition to its original functionality of prompt and image generation
- New functionality to run independently of the validator, continually generating and caching new synthetic videos and images in `~/.cache/sn34/synthetic`.
- `start_validator.sh` now starts an additional pm2 process called `bitmind_data_generator`.
- This is an improvement over the previous "as-needed" approach to synthetic data generation, providing greater flexibility and higher generation throughput
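The pattern is roughly the following generate-ahead loop; the generation callable and throttle interval are stand-ins, while the cache path is the one named in this release:

```python
# Sketch of a generate-ahead-of-time loop writing into the synthetic cache.
import time
from pathlib import Path

SYNTHETIC_CACHE = Path("~/.cache/sn34/synthetic").expanduser()

def run_generator_loop(generate_fn, interval_s: float = 0.0):
    """Continually generate synthetic media and write it into the cache."""
    SYNTHETIC_CACHE.mkdir(parents=True, exist_ok=True)
    while True:
        media_bytes, suffix = generate_fn()          # e.g. a t2v or t2i model call
        name = f"{int(time.time() * 1000)}{suffix}"  # unique-ish filename, e.g. ....mp4
        (SYNTHETIC_CACHE / name).write_bytes(media_bytes)
        time.sleep(interval_s)                       # optional throttle
```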
### ImageAnnotationGenerator
- Simplified: stripped of utility code used internally by the BitMind team for full synthetic dataset generation. The utility code will be updated and moved to our bitmind-utils repository.
### ImageCache and VideoCache
- Two new classes have been introduced to reduce storage requirements for validators (the video files from OpenVid-1M alone take up several TB)
- These classes are used to keep a fresh cache of compressed data sources (parquet for images, zips for videos), and a corresponding cache of extracted images and videos.
- `start_validator.sh` now starts an additional pm2 process called `bitmind_cache_updater` that manages a real video cache and a real image cache. Each has its own asynchronous tasks to download new zip/parquet files on a regular interval and to extract random images and videos on a shorter interval (see the sketch below).
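The two-interval refresh pattern can be sketched with asyncio as follows (the intervals and callables here are illustrative assumptions, not the actual class internals):

```python
# Sketch of the cache updater's two asynchronous refresh loops.
import asyncio

async def refresh_compressed(download_fn, interval_s: int = 3600):
    """Regular interval: download fresh zip/parquet archives."""
    while True:
        await download_fn()
        await asyncio.sleep(interval_s)

async def refresh_extracted(extract_fn, interval_s: int = 300):
    """Shorter interval: extract random images/videos from the archives."""
    while True:
        await extract_fn()
        await asyncio.sleep(interval_s)

async def run_cache_updater(download_fn, extract_fn):
    # One pair of tasks per cache (real images, real videos)
    await asyncio.gather(
        refresh_compressed(download_fn),
        refresh_extracted(extract_fn),
    )
```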
### Validator.forward
- Challenge generation has been updated to take advantage of the asynchronous data sampling/generation described in the previous two sections.
- Rather than generating synthetic media on an as-needed basis and downloading entire image datasets, the BitMind Validator's `forward` function now samples random data from a local cache of real and synthetic images and videos. For videos, a random number of frames are sampled from a random mp4 file. This logic is handled by the `VideoCache` class.
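Conceptually, the cache-backed sampling looks something like the following (the modality probability and cache interfaces are assumptions for illustration; the 10% figure mirrors the reward split, not necessarily the actual challenge frequency):

```python
# Illustrative sketch of cache-backed challenge sampling in forward().
import random

def sample_challenge(image_cache, video_cache, p_video: float = 0.1):
    """Draw one challenge from the local real/synthetic caches."""
    modality = "video" if random.random() < p_video else "image"
    label = random.choice([0, 1])  # 0 = real, 1 = synthetic (assumed labeling)
    cache = video_cache if modality == "video" else image_cache
    media = cache.sample(label=label)  # for video: random frames from a random mp4
    return modality, label, media
```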
## Additional Changes
- `bitmind/constants.py` had become quite a monolith, with the majority of its contents pertaining to validator operations. These variables have been moved to the more aptly named `bitmind/validator/config.py`, and model metadata dictionaries have been restructured to better suit how the codebase interacts with them.
- Upgraded library versions, accommodating changes in the behavior of Hugging Face file downloads.
- Added functionality to the data augmentation pipeline to apply image transforms uniformly across all frames of a video (see the sketch after this list)
- Consolidated `requirements.txt` and `setup_env.sh` to avoid out-of-sync dependency versions
- For image challenges, we swapped out the open-images-v7 URL dataset for a JPEG subset of open-images-v7 to comply with our new caching system. This will also improve sampling reliability, as fetching images by URL sometimes fails.
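For the uniform video augmentation noted above, the usual trick is to sample transform parameters once and reuse them for every frame. A minimal sketch with torchvision's functional API (the specific transforms and parameter ranges are illustrative, not the pipeline's actual set):

```python
# Sketch: sample augmentation parameters once, apply them to all frames.
import random
import torchvision.transforms.functional as TF

def augment_video(frames):
    """frames: list of PIL images (or tensors); returns consistently augmented frames."""
    angle = random.uniform(-15.0, 15.0)  # parameters sampled once per video...
    flip = random.random() < 0.5
    out = []
    for frame in frames:                 # ...then applied identically to each frame
        f = TF.rotate(frame, angle)
        if flip:
            f = TF.hflip(f)
        out.append(f)
    return out
```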