docs add initialize docs and deploy

jarbus · Jul 8, 2024 · 42f73b1
1 parent 5edab05
commit 42f73b1
Showing 40 changed files with 690 additions and 186 deletions.
diff --git a/.github/workflows/Documentation.yml b/.github/workflows/Documentation.yml
@@ -0,0 +1,29 @@
+name: Documentation
+
+on:
+  push:
+    branches:
+      - master # update to match your development branch (master, main, dev, trunk, ...)
+    tags: '*'
+  pull_request:
+
+jobs:
+  build:
+    permissions:
+      contents: write
+      pull-requests: read
+      statuses: write
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: julia-actions/setup-julia@v2
+        with:
+          version: '1.10'
+      - uses: julia-actions/cache@v1
+      - name: Install dependencies
+        run: julia --project=docs/ -e 'using Pkg; Pkg.develop(PackageSpec(path=pwd())); Pkg.instantiate()'
+      - name: Build and deploy
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} # If authenticating with GitHub Actions token
+          DOCUMENTER_KEY: ${{ secrets.DOCUMENTER_KEY }} # If authenticating with SSH deploy key
+        run: julia --project=docs/ docs/make.jl
diff --git a/README.md b/README.md
@@ -5,10 +5,8 @@
 Jevo is a high-performance, distributed, and modular (co-)evolutionary algorithm framework written in Julia. It is designed to be flexible and easy to use, with a focus on deep neuroevolutionary applications using [Flux.jl](https://fluxml.ai/Flux.jl/stable/). Jevo is designed to be easy to use, with a simple API that allows users to quickly define custom evolutionary algorithms and run them on distributed systems.
 
 
-```markdown
-| ⚠️ **Warning:** Jevo is currently alpha software and is under active development. |
+| **Warning:** Jevo is currently alpha software and is under active development. |
 | ----------------------------------------- |
-```
 
 # Install
 

diff --git a/docs/make.jl b/docs/make.jl
@@ -10,16 +10,17 @@ makedocs(
         "Overview" => "overview.md",
         "Operators" => "operators.md",
         "Phylogeny" => "phylogeny.md",
-        "Examples" => "examples.md",
+        "Miscellaneous" => "miscellaneous.md",
         "API" => "api.md",
     ],
+    warnonly=true,
 
 
 )
 
 # Documenter can also automatically deploy documentation to gh-pages.
 # See "Hosting Documentation" and deploydocs() in the Documenter manual
 # for more information.
-#=deploydocs(
-    repo = "<repository url>"
-)=#
+deploydocs(
+    repo = "github.com/jarbus/Jevo.jl.git",
+)
diff --git a/docs/src/api.md b/docs/src/api.md
@@ -1,26 +1,3 @@
-
-```@docs
-Individual
-Population
-CompositePopulation
-```
-
-```@docs
-Counter
-PopulationRetriever
-InitializePhylogeny
-CreateMissingWorkers
-InitializeAllPopulations
-map!
-```
-
-```@docs
-Jevo.GenerationIncrementer
-Jevo.skip
-Jevo.PopulationCreatorRetriever
-Jevo._WeightCache
-Jevo.get_opponent_ids_2player
-Base.show
-Jevo.MaxMRs
-Jevo.copyarchitecture
+```@autodocs
+Modules = [Jevo]
 ```
diff --git a/docs/src/index.md b/docs/src/index.md
@@ -1,10 +1,11 @@
 # Jevo.jl
 
-Documentation for Jevo.jl, a julia package for distributed, high-performance (co-)evolutionary computation. While this framework is flexible enough to support most evolutionary/co-evolutionary algorithms and organisms, it is designed specifically for distributed deep neuroevolution. Only one selection method has been implemented so far. 
+Documentation for Jevo.jl, a Julia package for distributed, high-performance (co-)evolutionary computation, inspired by [CoEvo](https://github.com/twillkens/coevo). While Jevo is flexible enough to support most evolutionary/co-evolutionary algorithms and organisms, it is designed specifically for distributed deep neuroevolution. Only one selection method has been implemented so far. 
 
-We highly recommend reviewing [Overview](@ref) before starting for a high-level introduction to the package's core concepts, functionality, and design.
+We highly recommend reviewing [Overview](@ref) for a high-level introduction to the package's core concepts, functionality, and design.
 
 ## Table of Contents
 
 ```@contents
+Pages = ["overview.md", "operators.md", "phylogeny.md", "miscellaneous.md"]
 ```
diff --git a/docs/src/miscellaneous.md b/docs/src/miscellaneous.md
@@ -0,0 +1,22 @@
+# Miscellaneous Bits
+
+## Checkpointing
+
+The [Checkpointer](@ref) operator serializes the state to disk. As the state contains every part of an evolutionary simulation (except per-worker caches), this is sufficient to resume from a checkpoint.
+
+For checkpointing, you can use the following pattern for state creation:
+
+```julia
+checkpointname = "check.jls"
+state = isfile(checkpointname) ? restore_from_checkpoint(checkpointname) :
+          State("example", rng, creators::Vector{<:AbstractCreator}, 
+            [Checkpointer(checkpointname, interval=25),
+            # other operators...
+            ])
+```
+
+## SLURM
+
+Jevo.jl supports distributed computing on [SLURM](https://slurm.schedmd.com/overview.html) clusters. Jevo currently only supports GPU workers on a single node, but will support distributed computing across nodes in the future.
+
+[CreateMissingWorkers
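Note on the checkpointing pattern in `docs/src/miscellaneous.md` above: the resume-or-create idea can be sketched with Julia's `Serialization` standard library alone. This is a minimal illustration under stated assumptions, not Jevo's implementation: `ToyState`, `load_or_create`, and `maybe_checkpoint` are hypothetical names, and the real `Checkpointer` and `restore_from_checkpoint` may behave differently.

```julia
using Serialization

# Hypothetical stand-in for a simulation state; this is NOT Jevo's `State`.
mutable struct ToyState
    generation::Int
    values::Vector{Float64}
end

# Resume from `path` if a checkpoint exists, otherwise build a fresh state.
load_or_create(create, path::AbstractString) =
    isfile(path) ? deserialize(path) : create()

# Write the whole state to disk every `interval` generations.
function maybe_checkpoint(state::ToyState, path::AbstractString; interval::Int=25)
    state.generation % interval == 0 && serialize(path, state)
    return state
end

path = joinpath(mktempdir(), "check.jls")
state = load_or_create(() -> ToyState(0, zeros(3)), path)
for _ in 1:50
    state.generation += 1
    state.values .+= 1.0
    maybe_checkpoint(state, path)
end

# A second call finds the file written at generation 50 and resumes from it.
resumed = load_or_create(() -> ToyState(0, zeros(3)), path)
```

Because the entire state is serialized, resuming needs nothing beyond the file itself, which mirrors the claim in the text that the state is sufficient to restart a run.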
diff --git a/docs/src/operators.md b/docs/src/operators.md
@@ -1,6 +1,72 @@
+# Operators
 
-```@docs
-Operator
-Jevo.@define_op
-create_op
-```
+Each step of an evolutionary algorithm is represented by an [Operator](@ref). The [@define_op](@ref) macro defines new operator *structs* with fields specified in the [operator documentation](@ref Operator). The [create_op](@ref) function generates new operator *objects* with default values for unspecified operator fields. This section provides an overview of the operators implemented so far in Jevo.jl.
+
+## Retrievers
+
+Retrievers are structs or functions that retrieve data from the state.
+
+* [Jevo.PopulationCreatorRetriever](@ref), used in [InitializeAllPopulations](@ref)
+* [PopulationRetriever](@ref), used in [AllVsAllMatchMaker](@ref), and many others
+* [get_individuals](@ref), used in [ClearInteractionsAndRecords](@ref)
+
+## Updaters
+
+* [Jevo.ComputeInteractions!](@ref), used in [Performer](@ref)
+* [PopulationAdder](@ref), used in [InitializeAllPopulations](@ref)
+* [PopulationUpdater](@ref), currently unused because I forgot to use it 
+* [Jevo.add_matches!](@ref), used in [AllVsAllMatchMaker](@ref) and [SoloMatchMaker](@ref)
+* [Jevo.RecordAdder](@ref), used in [ScalarFitnessEvaluator](@ref)
+
+## Matchmaker
+
+* [AllVsAllMatchMaker](@ref)
+* [SoloMatchMaker](@ref), individuals play a match alone, used for evolutionary computing
+
+## Evaluators
+
+* [ScalarFitnessEvaluator](@ref)
+* [RandomEvaluator](@ref)
+
+## Selectors
+
+* [TruncationSelector](@ref)
+
+## Reproducers
+
+* [CloneUniformReproducer](@ref)
+
+## Performer
+
+* [Performer](@ref)
+
+
+## Mutators
+
+* [Mutator](@ref), uses [Jevo.mutate](@ref) as its `.operator`.
+
+## Assertors
+
+Assertors are operators that you can add at any point in the pipeline to check that certain aspects of the state are as expected.
+
+* [PopSizeAssertor](@ref)
+
+## Reporters
+
+* [Reporter](@ref), logs data using [`Jevo.measure`](@ref) for a specified [Jevo.AbstractMetric](@ref) as its `.operator`.
+
+## Checkpointer
+
+* [Checkpointer](@ref)
+
+## Initializers
+
+* [InitializeAllPopulations](@ref), uses [Jevo.create](@ref) as its `.operator`.
+
+## Miscellaneous
+
+* [CreateMissingWorkers](@ref). SLURM compatible, but only for a single node.
+
+## Phylogenies
+
+* See [Phylogeny](@ref)
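Note on `docs/src/operators.md` above: the idea of an evolutionary algorithm as an ordered list of operators applied to a shared state can be sketched in plain Julia. Everything here is a hypothetical toy (`ToyState`, `evaluate`, `keep_top`, `clone_to`, `mutate_all`, `assert_size`, `run_toy!`); it does not use `@define_op` or `create_op`, and Jevo's own operators are richer than these closures.

```julia
using Random

# Hypothetical toy state; this is NOT Jevo's `State`.
mutable struct ToyState
    rng::AbstractRNG
    population::Vector{Vector{Float64}}
    fitness::Vector{Float64}
end

# Evaluator: fitness is the sum of the vector's entries.
function evaluate(s::ToyState)
    s.fitness = [sum(ind) for ind in s.population]
    return s
end

# Selector: truncation selection keeps the k fittest individuals.
keep_top(k) = function (s::ToyState)
    order = sortperm(s.fitness, rev=true)[1:k]
    s.population = s.population[order]
    s.fitness = s.fitness[order]
    return s
end

# Reproducer: clone uniformly sampled survivors until there are n individuals.
clone_to(n) = function (s::ToyState)
    parents = copy(s.population)
    while length(s.population) < n
        push!(s.population, copy(rand(s.rng, parents)))
    end
    return s
end

# Mutator: add Gaussian noise to every individual (survivors included).
mutate_all(sigma) = function (s::ToyState)
    for ind in s.population
        ind .+= sigma .* randn(s.rng, length(ind))
    end
    return s
end

# Assertor: check an invariant at a chosen point in the pipeline.
assert_size(n) = function (s::ToyState)
    @assert length(s.population) == n
    return s
end

# One generation applies every operator once, in order.
function run_toy!(s::ToyState, operators, n_gens::Int)
    for _ in 1:n_gens, op in operators
        op(s)
    end
    return s
end

rng = MersenneTwister(1)
state = ToyState(rng, [randn(rng, 2) for _ in 1:10], Float64[])
operators = [evaluate, keep_top(2), clone_to(10), mutate_all(0.1), assert_size(10)]
run_toy!(state, operators, 10)
```

Order matters in this sketch for the same reason it does in the operator list passed to `State`: selection reads the fitness values that the evaluator wrote earlier in the same generation.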
diff --git a/docs/src/overview.md b/docs/src/overview.md
@@ -1,35 +1,78 @@
 # Overview
 
-Jevo is designed to be simple yet flexible, with the core package under 2k LOC. The framework is broken up into four core concepts:
+Jevo is designed to be a simple yet flexible (co-)evolutionary computing framework, with the core package under 2500 LOC. The framework is broken up into four core concepts:
 
 1. A [State](@ref) is a structure that holds all information about the evolutionary process.
-2. [Operator](@ref)s are functions/structs that update the state. Jevo breaks down evolutionary algorithms into a series of sequential operators which are applied to the simulator state.
+2. [Operators](@ref Jevo.Operator) are functions/structs that update the state. Jevo breaks down evolutionary algorithms into a series of sequential operators which are applied to the simulator state.
 3. A [Creator](@ref) is a structure that generates new objects with specified parameters and is called by operators. Creators spawn populations, individuals, genotypes, environments, other creators, and more.
 4. All other standard evolutionary objects ([`Individual`](@ref), [`Population`](@ref), genotypes, phenotypes, environments, etc.) which are contained in the [`State`](@ref).
 
-## State
+## Numbers Game Example
 
-```@docs
-State
-State(rng::AbstractRNG, creators::Vector{<:AbstractCreator}, operators::Vector{<:AbstractOperator})
-run!
-```
+The simplest way to get started with Jevo is by example. We'll use [The Numbers Game](http://www.demo.cs.brandeis.edu/papers/gecco2001_cdms.pdf) as our example domain, which consists of two co-evolving populations of vectors. For an example which focuses on neuroevolution, see the TODO example.
 
+```julia
+using Jevo
+using Logging
+using StableRNGs
 
-```@docs
-Creator
-PassThrough
-```
+# Jevo provides a custom logger to easily store statistical measurements to 
+# an HDF5 file, print to console, and log text to a file.
+global_logger(JevoLogger())
+rng = StableRNG(1)
 
-## Design Philosophy
+k = 2          # How many individuals to keep each generation
+n_dims = 2     # Number of dimensions in the vector
+n_inds = 10    # Number of individuals in each population
+n_species = 2  # Number of species
+n_gens = 10    # Number of generations
+
+# We create a list of counters which are incremented 
+# to track individuals, genes, generations, and matches.
+# This is passed to the state constructor.
+counters = default_counters()
+
+# Instead of creating genotypes and phenotypes directly, we create a
+# genotype creator, which generates genotypes, and a phenotype creator,
+# which "develops" a genotype into a phenotype.
+ng_genotype_creator = Creator(VectorGenotype, (n=n_dims,rng=rng))
+ng_developer = Creator(VectorPhenotype)
 
+# Likewise, instead of creating populations directly, we create a population
+# creator, which generates populations. Here, we create a composite population,
+# which is a population of sub-populations. Each sub-population is a species.
+comp_pop_creator = Creator(CompositePopulation, ("species", [("p$i", n_inds, ng_genotype_creator, ng_developer) for i in 1:n_species], counters))
 
-## Guidelines:
+# An environment creator can be used to generate instances of an environment,
+# particularly useful for randomizing the environment.
+env_creator = Creator(CompareOnOne)
+
+# We create a state called "ng_phylogeny" with the RNG object we passed to our creators.
+# We initialize the state with a list of creators (order does not matter), and
+# a list of operators (order does matter). Operators will look for creators by
+# type when needed to generate new objects when appropriate.
+state = State("ng_phylogeny", rng,[comp_pop_creator, env_creator],
+    [InitializeAllPopulations(),
+     InitializePhylogeny(),
+    AllVsAllMatchMaker(),
+    Performer(),
+    ScalarFitnessEvaluator(),
+    TruncationSelector(k),
+    CloneUniformReproducer(n_inds),
+    Mutator(),
+    PopSizeAssertor(n_inds),
+    ClearInteractionsAndRecords(),
+    Reporter(GenotypeSum, console=true)], counters=counters)
+
+# We run the state for n_gens generations.
+run!(state, n_gens)
+```
+
+## Design Philosophy
 
 - Minimize architectural complexity, maximize code reuse.
-- Trade-off efficiency for simplicity whenever possible, except for performance-critical code. 
-    - Don't be afraid to recompute things to avoid bookkeeping or use existing inefficient solution if it doesn't noticably impact performance.
-- All Counters use the highest level appropriate type (AbstractIndividual, AbstractGene, etc).
-- randn is faster for float32 than float16, so we use float32 for weights, despite the added memory cost.
-- Any multi-threaded operation that uses `state.rng` should generate a new RNG object for each iteration/thread in the main process, to ensure that the same random numbers are not used in different threads.
-- When using Distributed, all evaluations are done on workers and evolutionary operations are done on the main process.
+- Trade off efficiency for simplicity whenever possible, **except for performance-critical code**. Don't be afraid to recompute things to avoid bookkeeping, or to use an existing inefficient solution if it doesn't noticeably impact performance.
+- All [`Counters`](@ref Counter) use the highest level appropriate type (`AbstractIndividual`, `AbstractGene`, etc).
+- `randn` is faster for `Float32` than `Float16`, so we use `Float32` for weights, despite the added memory cost.
+- Any multi-threaded operation that uses `state.rng` should generate a new RNG object for each iteration or thread in the main process, to ensure that the same random numbers are not used in different threads.
+- When using Distributed, all evaluations are done on workers and other evolutionary operations are done on the main process. When not using Distributed, all operations are done on the main process.
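Note on the RNG guideline in `docs/src/overview.md` above: one way to give each iteration its own RNG is to draw the seeds sequentially on the main task and construct a private generator inside the threaded loop. This is a sketch of the general pattern, not code taken from Jevo; `noisy_evaluations` is a hypothetical helper, and `Xoshiro` requires Julia 1.7 or later.

```julia
using Random

# Draw one seed per work item from the shared RNG on the main task,
# then build a private RNG inside each threaded iteration.
function noisy_evaluations(rng::AbstractRNG, n::Int)
    seeds = rand(rng, UInt64, n)           # sequential draws from the shared RNG
    results = Vector{Float32}(undef, n)
    Threads.@threads for i in 1:n
        local_rng = Xoshiro(seeds[i])      # never shared between iterations
        results[i] = randn(local_rng, Float32)
    end
    return results
end

a = noisy_evaluations(MersenneTwister(42), 8)
b = noisy_evaluations(MersenneTwister(42), 8)
```

Since the shared RNG is only touched outside the threaded loop, the results depend on the seed alone and not on how iterations are scheduled, so `a` and `b` match regardless of the thread count.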
diff --git a/docs/src/phylogeny.md b/docs/src/phylogeny.md
@@ -1,8 +1,34 @@
 # Phylogeny
 
-## Design of phylo
+We use [PhylogeneticTrees.jl](https://github.com/jarbus/PhylogeneticTrees.jl) for phylogeny tracking. This package includes efficient algorithms for calculating distances between taxa and pruning the tree of taxa with no living descendants to free up memory.
 
-- we store phylo tree and delta cache with population, because those are NOT LRU
-- we store genotype cache and weight cache outside of population because those ARE LRU
+## Why do we care about Phylogeny?
 
-## Design of gene pool
+In short, phylogeny is the evolutionary history of a group of organisms. This has myriad uses:
+
+* *Analysis*: We can easily observe how population dynamics change throughout evolution and visualize extinctions and sub-species.
+* *Optimization*: We can estimate performance of individuals based on their phylogenetic distance from other individuals to achieve substantial reductions in compute required. See [evolutionary](https://arxiv.org/abs/2306.03970) and [co-evolutionary](https://arxiv.org/abs/2404.06588) applications of this idea.
+
+
+## Phylogenetic Operators
+
+Phylogeny is managed using four operators:
+
+* [InitializePhylogeny](@ref): Adds current members of the population as roots of a phylogenetic tree. Runs on the first generation.
+* [UpdatePhylogeny](@ref): Updates the phylogeny for the current population. Runs on all generations and should run after children are produced.
+* [LogPhylogeny](@ref): A bit misnamed for now, this operator writes phylogeny data to disk in the ALIFE Data Standard format. Should run before pruning individuals.
+* [PurgePhylogeny](@ref): Removes individuals from the phylogeny that have no living descendants. Should run after children are produced and, optionally, after all individuals have been written to disk. Essential for reducing memory usage.
+
+
+## Deltas and DeltaCaches
+
+A delta cache stores the difference between an individual and its parent for every edge in the tree.
+
+* [InitializeDeltaCache](@ref): Initializes the delta cache for the current population. Runs on the first generation.
+* [UpdateDeltaCache](@ref): Updates the delta cache for the current population. Should run after children are produced.
+
+## Gene Pool
+
+The [GenePool](@ref) is a subset of genes in the population, typically the most recent genes. It is used for runtime techniques that leverage information about the population, like adaptive mutation rates.
+
+* [UpdateGenePool](@ref): Creates/updates the gene pool for the current population.
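Note on `docs/src/phylogeny.md` above: purging taxa with no living descendants, and rebuilding a genotype from per-edge deltas, can be sketched with a parent map. This is a toy under stated assumptions: `ToyPhylogeny`, `add_root!`, `add_child!`, `purge!`, and `reconstruct` are hypothetical names, deltas are assumed to be additive vectors, and none of this reflects the actual PhylogeneticTrees.jl or Jevo APIs.

```julia
# Hypothetical toy phylogeny: maps each id to its parent id (`nothing` for roots).
struct ToyPhylogeny
    parent::Dict{Int,Union{Int,Nothing}}
end
ToyPhylogeny() = ToyPhylogeny(Dict{Int,Union{Int,Nothing}}())

add_root!(t::ToyPhylogeny, id::Int) = (t.parent[id] = nothing; t)
add_child!(t::ToyPhylogeny, id::Int, parent_id::Int) = (t.parent[id] = parent_id; t)

# Keep only living individuals and their ancestors; drop everything else.
function purge!(t::ToyPhylogeny, living)
    keep = Set{Int}()
    for id in living
        node = id
        # Walk towards the root, stopping early at an already-kept ancestor.
        while node !== nothing && !(node in keep)
            push!(keep, node)
            node = t.parent[node]
        end
    end
    filter!(p -> first(p) in keep, t.parent)
    return t
end

# Rebuild a genotype as the root genotype plus every delta on the path to `id`.
function reconstruct(t::ToyPhylogeny, roots, deltas, id::Int)
    t.parent[id] === nothing && return copy(roots[id])
    return reconstruct(t, roots, deltas, t.parent[id]) .+ deltas[id]
end

tree = ToyPhylogeny()
add_root!(tree, 1)
add_child!(tree, 2, 1)
add_child!(tree, 3, 1)
add_child!(tree, 4, 2)

roots = Dict(1 => [0.0, 0.0])
deltas = Dict(2 => [1.0, 0.0], 3 => [0.0, 1.0], 4 => [0.5, 0.5])

purge!(tree, [4])                              # only individual 4 is alive
genotype4 = reconstruct(tree, roots, deltas, 4)
```

After the purge, individual 3 is gone because nothing alive descends from it, while 1 and 2 stay because they are ancestors of 4.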
diff --git a/example/ng.jl b/example/ng.jl
@@ -31,7 +31,7 @@ state = State("ng_phylogeny", rng,[comp_pop_creator, env_creator],
     CloneUniformReproducer(n_inds),
     Mutator(),
     UpdatePhylogeny(),
-    TrackPhylogeny(),
+    LogPhylogeny(),
     PurgePhylogeny(),
     PopSizeAssertor(n_inds),
     ClearInteractionsAndRecords(),