Merge branch 'release-v0.102'
============================== Release Notes: v0.102 ==============================
Support for new training algorithms:
 - LTFB is now a first-class training algorithm.
 - LTFB now allows multiple metrics. The local algorithm is favored by
   each trainer and a partner model must win every metric to be declared
   the tournament winner.
 - The batched iterative optimizer (sgd_training_algorithm) was
   refactored for consistency.
 - Improved documentation of training algorithm infrastructure.
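
The multi-metric tournament rule described above can be sketched as follows. This is a minimal illustration, not LBANN's implementation: the function name, the dictionary-based interface, and the choice to treat ties as a win for the local model are all assumptions made for the example.

```python
def tournament_winner(local_scores, partner_scores, higher_is_better):
    """Return "partner" only if the partner wins every metric, else "local".

    All three arguments are dicts keyed by metric name (hypothetical
    interface). Ties count in favor of the local model, which is an
    assumption based on the note that the local model is favored.
    """
    if not local_scores:
        # With nothing to compare, keep the local model.
        return "local"
    for name, local_value in local_scores.items():
        partner_value = partner_scores[name]
        if higher_is_better[name]:
            partner_wins = partner_value > local_value
        else:
            partner_wins = partner_value < local_value
        if not partner_wins:
            # A single lost (or tied) metric is enough to keep the local model.
            return "local"
    return "partner"
```

For instance, a partner that improves accuracy but has a worse loss would not replace the local model under this rule.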

Support for new network structures:
 - ATOM WAE model - character-based Wasserstein Autoencoder
 - Community GAN model for graph data sets

Support for new layers:
 - "DFTAbs" layer that computes the absolute value of the channel-wise
   DFT of the input data
 - Added support for 3D matrix multiplication
 - Added scatter and gather neural network layers
 - CPU-based GRU layers using oneDNN
 - Added batch-wise reduce-sum
 - ArcFace loss
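
As a rough illustration of what a "DFTAbs"-style operation computes, the sketch below takes the magnitude of a naive 1D discrete Fourier transform independently for each channel. The function name and the restriction to 1D signals are assumptions for the example; the actual layer may operate on multi-dimensional data and will not use this O(n^2) formulation.

```python
import cmath


def dft_abs(channels):
    """Return |DFT| of each channel.

    `channels` is a list of equal-length lists of numbers, one list per
    channel (hypothetical layout). A naive O(n^2) DFT is used for clarity.
    """
    result = []
    for signal in channels:
        n = len(signal)
        magnitudes = []
        for k in range(n):
            # k-th DFT coefficient: sum_t x[t] * exp(-2*pi*i*k*t/n)
            coeff = sum(
                x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal)
            )
            magnitudes.append(abs(coeff))
        result.append(magnitudes)
    return result
```

Each channel is transformed on its own, so the output has the same shape as the input.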

Python front-end:
 - Added 3D U-Net Model
 - Added Cosmoflow Model
 - Ported CANDLE Pilot1 models
 - Support nvprof
 - Added channelwise fully connected layer
 - Added support for non square kernels, padding, stride, and
   dilation for the convolution module
 - Support for OpenMPI launcher

Performance optimizations:
 - Use cuDNN 8 RNN API and CUDA Graphs in GRU layer
 - Cache CUDA Graphs for each active mini-batch size
 - Tuned performance of slice, concatenate, and tessellate layers on
   ARM processors
 - Parallelize computation of Gaussian random numbers
 - Optimized tessellate, concatenate, and slice layers on CPU
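
The idea of caching one graph per active mini-batch size can be sketched with a small keyed cache. This is a language-neutral illustration only: `GraphCache` and `build_graph` are hypothetical names, no CUDA API is called, and the real implementation lives in LBANN's C++ wrapper classes.

```python
class GraphCache:
    """Cache one captured graph per mini-batch size (illustrative only)."""

    def __init__(self, build_graph):
        # `build_graph` is a hypothetical callable that captures and
        # returns a graph object for a given mini-batch size.
        self._build_graph = build_graph
        self._graphs = {}

    def get(self, mini_batch_size):
        # Build the graph the first time a size is seen, then reuse it.
        if mini_batch_size not in self._graphs:
            self._graphs[mini_batch_size] = self._build_graph(mini_batch_size)
        return self._graphs[mini_batch_size]
```

The benefit of such a cache is that a smaller final mini-batch in an epoch triggers at most one extra capture rather than a re-capture every time the size changes.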

Experiments & Applications:
 - Added experiment scripts for ATOM cWAE Gordon Bell simulations
 - LBANN-ATOM model inference and analysis

Internal features:
 - Wrapper classes for CUDA Graphs API
 - Elementary examples of using complex numbers
 - cuDNN handles are now wrapped in RAII management classes
 - Improved HWLOC compatibility for v1.11 and v2.x
 - Added an enum type of visitor hooks that will eventually be used to
   allow callbacks or other visitors to operate at user defined hook
   points
 - Changed checkpoint logic to checkpoint at the start of epochs
   and changed the naming scheme to use the callback phase (visitor
   hook) in the name rather than the current execution context.
 - Added in-memory binary model exchange for LTFB.
 - Added support for ROCm and MIOpen
 - Added support for oneDNN
 - Updated the bamboo test environment to use local executable rather
   than hard coded executables
 - Overhauled and refactored serialization throughout code to use
   Cereal serialization library
 - Significant cleanup and refactoring of the code base to improve
   compile times. Moved toward the standard split of headers between
   declarations and implementations (for templated code), focusing
   specifically on the serialization functions and the comm class.
   Reduced dependencies caused by overreaching header inclusions.
 - The relationship of execution_contexts and training_algorithms was
   clarified. There is still work to do here.
 - Added DistConv tests for both convolution and pooling layers
 - Support padding in distributed embedding layer
 - Added dump model graph callback
 - Added perturb learning rate callback
 - Added batched inference algorithm
 - Switched ATOM tests to use CPU embedding and tessellate layers to
   minimize noise

I/O & data readers:
 - Experimental data reader that generates graph random walks with
   HavoqGT
 - Added explicit tournament execution mode
 - Added support to split training data reader into validation and
   tournament readers
 - node2vec data reader
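
One way to picture splitting a training set into training, validation, and tournament subsets is shown below. It is a sketch under stated assumptions: the function name, the fraction parameters, and the shuffle-then-slice strategy are hypothetical and are not taken from LBANN's data reader code.

```python
import random


def split_indices(num_samples, validation_fraction, tournament_fraction, seed=0):
    """Partition sample indices into train/validate/tournament subsets.

    All parameter names are illustrative. Indices are shuffled with a
    fixed seed so that the split is reproducible.
    """
    if validation_fraction < 0 or tournament_fraction < 0:
        raise ValueError("fractions must be non-negative")
    if validation_fraction + tournament_fraction > 1:
        raise ValueError("fractions must sum to at most 1")
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    n_val = int(num_samples * validation_fraction)
    n_tour = int(num_samples * tournament_fraction)
    return {
        "validate": indices[:n_val],
        "tournament": indices[n_val:n_val + n_tour],
        "train": indices[n_val + n_tour:],
    }
```

The three subsets are disjoint and together cover every sample, so held-out tournament data never overlaps with the samples used for training.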

Build system:
 - Hydrogen v1.5.0+
 - Aluminum v0.5.0+
 - DiHydrogen v0.2.0 is required
 - C++14 or newer standard with CUDA (CMake: "-DCMAKE_CUDA_STANDARD=14")
 - OpenCV is now an optional dependency via CMake "LBANN_WITH_VISION"
 - CNPY is now an optional dependency via CMake "LBANN_WITH_CNPY"
 - Adds support in the build_lbann.sh script for concretizing extra
   packages with the primary LBANN installation
 - New features in the build script to setup / configure the build
   environment, but stop and allow the user to manually add extra
   packages
 - Add a set of user-focused build scripts that use the main
   build_lbann.sh script to setup good defaults on known systems
 - Added application specific build scripts for users such as ATOM
 - Added support for pulling from Spack mirrors and setting them up
 - Split embedded Python support from Python Front End
 - Switched Spack-based build script to use Spack's clingo concretizer

Bug fixes:
 - Fixed a bug where LBANN didn't set the Hydrogen RNG seed
 - Fixed the CosmoFlow and UNet models in the PFE and addressed
   issues in the data reader and data coordinator.
 - Fixed the HDF5 data reader to properly specify the supported I/O
   types
 - Fixed calculation of the linearized response size
 - Fixed the data coordinator's interface to input_layer
 - Fixed error with deterministic execution of dropout layers

Retired features:
 - Removed deprecated JAG leader mode which was made obsolete when the
   data reader moved into the data coordinator
 - Removed the deprecated partitioned data reader modes that were used
   to partition and overlap data sets for multiple models
 - Removed deprecated ActivationDescriptor class
bvanessen committed May 28, 2021
2 parents 0bb0f50 + 033cef5 commit 0433bb8
Showing 1,075 changed files with 67,030 additions and 21,053 deletions.
165 changes: 165 additions & 0 deletions .clang-format
@@ -0,0 +1,165 @@
###############################################################################
# Copyright (c) 2014-2019, Lawrence Livermore National Security, LLC.
# Produced at the Lawrence Livermore National Laboratory.
# Written by the LBANN Research Team (B. Van Essen, et al.) listed in
# the CONTRIBUTORS file. <[email protected]>
#
# LLNL-CODE-697807.
# All rights reserved.
#
# This file is part of LBANN: Livermore Big Artificial Neural Network
# Toolkit. For details, see http://software.llnl.gov/LBANN or
# https://github.com/LLNL/LBANN.
#
# Licensed under the Apache License, Version 2.0 (the "Licensee"); you
# may not use this file except in compliance with the License. You may
# obtain a copy of the License at:
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
# implied. See the License for the specific language governing
# permissions and limitations under the license.
###############################################################################

# Basic clang-format specification for LBANN.
# Based on clang-10 for LC compatibility.

---
Language: Cpp
BasedOnStyle: LLVM
AccessModifierOffset: -2
AlignAfterOpenBracket: Align
AlignConsecutiveMacros: false
AlignConsecutiveAssignments: false
AlignConsecutiveDeclarations: false
AlignEscapedNewlines: Right
AlignOperands: true
AlignTrailingComments: true
AllowAllArgumentsOnNextLine: false
AllowAllConstructorInitializersOnNextLine: true
AllowAllParametersOfDeclarationOnNextLine: false
AllowShortBlocksOnASingleLine: Never
AllowShortCaseLabelsOnASingleLine: false
AllowShortFunctionsOnASingleLine: All
AllowShortLambdasOnASingleLine: All
AllowShortIfStatementsOnASingleLine: Never
AllowShortLoopsOnASingleLine: false
AlwaysBreakAfterDefinitionReturnType: None
AlwaysBreakAfterReturnType: None
AlwaysBreakBeforeMultilineStrings: false
AlwaysBreakTemplateDeclarations: MultiLine
BinPackArguments: false
BinPackParameters: false
BraceWrapping:
  AfterCaseLabel: false
  AfterClass: true
  AfterControlStatement: false
  AfterEnum: true
  AfterFunction: true
  AfterNamespace: false
  AfterObjCDeclaration: false
  AfterStruct: true
  AfterUnion: true
  AfterExternBlock: false
  BeforeCatch: true
  BeforeElse: true
  IndentBraces: false
  SplitEmptyFunction: false
  SplitEmptyRecord: true
  SplitEmptyNamespace: true
BreakBeforeBinaryOperators: None
BreakBeforeBraces: Custom
BreakBeforeInheritanceComma: false
BreakInheritanceList: BeforeColon
BreakBeforeTernaryOperators: true
BreakConstructorInitializersBeforeComma: false
BreakConstructorInitializers: BeforeColon
BreakAfterJavaFieldAnnotations: false
BreakStringLiterals: true
ColumnLimit: 80
CommentPragmas: '^ IWYU pragma:'
CompactNamespaces: false
ConstructorInitializerAllOnOneLineOrOnePerLine: false
ConstructorInitializerIndentWidth: 2
ContinuationIndentWidth: 2
Cpp11BracedListStyle: true
DeriveLineEnding: true
DerivePointerAlignment: false
DisableFormat: false
ExperimentalAutoDetectBinPacking: false
FixNamespaceComments: true
ForEachMacros:
  - foreach
  - Q_FOREACH
  - BOOST_FOREACH
IncludeBlocks: Preserve
IncludeCategories:
  - Regex: '^"(llvm|llvm-c|clang|clang-c)/'
    Priority: 2
    SortPriority: 0
  - Regex: '^(<|"(catch|gtest|gmock|isl|json)/)'
    Priority: 3
    SortPriority: 0
  - Regex: '.*'
    Priority: 1
    SortPriority: 0
IncludeIsMainRegex: '(Test)?$'
IncludeIsMainSourceRegex: ''
IndentCaseLabels: false
IndentGotoLabels: true
IndentPPDirectives: None
IndentWidth: 2
IndentWrappedFunctionNames: false
JavaScriptQuotes: Leave
JavaScriptWrapImports: true
KeepEmptyLinesAtTheStartOfBlocks: true
MacroBlockBegin: ''
MacroBlockEnd: ''
MaxEmptyLinesToKeep: 1
NamespaceIndentation: None
ObjCBinPackProtocolList: Auto
ObjCBlockIndentWidth: 2
ObjCSpaceAfterProperty: false
ObjCSpaceBeforeProtocolList: true
PenaltyBreakAssignment: 2
PenaltyBreakBeforeFirstCallParameter: 19
PenaltyBreakComment: 300
PenaltyBreakFirstLessLess: 120
PenaltyBreakString: 1000
PenaltyBreakTemplateDeclaration: 10
PenaltyExcessCharacter: 1000000
PenaltyReturnTypeOnItsOwnLine: 60
PointerAlignment: Left
ReflowComments: true
SortIncludes: true
SortUsingDeclarations: true
SpaceAfterCStyleCast: false
SpaceAfterLogicalNot: false
SpaceAfterTemplateKeyword: true
SpaceBeforeAssignmentOperators: true
SpaceBeforeCpp11BracedList: false
SpaceBeforeCtorInitializerColon: true
SpaceBeforeInheritanceColon: true
SpaceBeforeParens: ControlStatements
SpaceBeforeRangeBasedForLoopColon: true
SpaceInEmptyBlock: false
SpaceInEmptyParentheses: false
SpacesBeforeTrailingComments: 1
SpacesInAngles: false
SpacesInConditionalStatement: false
SpacesInContainerLiterals: true
SpacesInCStyleCastParentheses: false
SpacesInParentheses: false
SpacesInSquareBrackets: false
SpaceBeforeSquareBrackets: false
Standard: c++17
StatementMacros:
  - Q_UNUSED
  - QT_REQUIRE_VERSION
TabWidth: 8
UseCRLF: false
UseTab: Never
...
36 changes: 36 additions & 0 deletions .gitlab-ci.yml
@@ -0,0 +1,36 @@

before_script:
  - echo "=== before_script section ==="

after_script:
  - echo "=== after_script section ==="

stages:
  - compiler
  - integration
  - unit

compilerRay:
  stage: compiler
  tags:
    - ray
    - shell
  script:
    - echo "=== compilerRay section ==="

integrationRay:
  stage: compiler
  tags:
    - ray
    - shell
  script:
    - echo "=== integrationRay section ==="

unitRay:
  stage: compiler
  tags:
    - ray
    - shell
  script:
    - echo "=== unitRay section ==="
    - echo "FINISHED"
21 changes: 13 additions & 8 deletions .gitmodules
@@ -1,11 +1,16 @@
-[submodule "applications/graph/snap"]
-	path = applications/graph/snap
-	url = https://github.com/snap-stanford/snap
-	ignore = dirty
-[submodule "applications/graph/largescale_node2vec"]
-	path = applications/graph/largescale_node2vec
-	url = https://lc.llnl.gov/bitbucket/scm/havoq/largescale_node2vec.git
-	ignore = dirty
 [submodule "applications/ATOM/moses"]
 	path = applications/ATOM/moses
 	url = [email protected]:samadejacobs/moses.git
+[submodule "applications/graph/node2vec/snap"]
+	path = applications/graph/node2vec/snap
+	url = https://github.com/snap-stanford/snap
+	ignore = dirty
+[submodule "applications/graph/node2vec/havoqgt"]
+	path = applications/graph/node2vec/havoqgt
+	url = https://github.com/KIwabuchi/havoqgt
+	branch = develop
+	ignore = dirty
+[submodule "applications/graph/node2vec/largescale_node2vec"]
+	path = applications/graph/node2vec/largescale_node2vec
+	url = https://lc.llnl.gov/bitbucket/scm/havoq/largescale_node2vec.git
+	ignore = dirty
50 changes: 0 additions & 50 deletions .travis.yml

This file was deleted.
