All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Changed the tests to allow IPv6 CIDR addresses
- Updated the ct tests to support the VirtualNode type
- Added support for VirtualNode Hardware entries
- Created a simple benchmark tool that can stress test SLS with many concurrent requests.
- Updated Golang to 1.19
- Introduced a caching HTTP middleware. This feature is disabled by default.
- Database interaction improvements
  - Pass contexts from the HTTP layer into the database layer to cancel database operations if the HTTP request is canceled.
  - Database functions now accept `*tx.SQL` to allow use within database transactions.
  - Created `*Context` functions that wrap the database functions within a transaction that is canceled by the provided context.
  - `SetGenericHardwareContext` and `SetNetworkContext` were added to atomically create or update hardware and network information, instead of running 2-3 database queries that could leave the database in an inconsistent state if canceled.
  - `SearchGenericHardware` was modified to perform a single query like `GetAllGenericHardware`, as it was running additional SQL queries while processing data from another query. This type of database interaction can cause SLS to become deadlocked.
  - Removed `getChildrenForXname` as it is no longer required.
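
The transaction-plus-context pattern described in the database entries above can be illustrated with a small sketch. It is not the SLS implementation: the `tx` interface, `memTx`, `inTx`, `setNetwork`, and `committed` are hypothetical names, and an in-memory store stands in for Postgres so the example runs without a database. With Go's `database/sql`, a transaction started via `BeginTx` is rolled back when its context is canceled; the sketch imitates that by rolling back as soon as a statement sees the canceled context.

```go
package main

import (
	"context"
	"fmt"
)

// tx is a hypothetical stand-in for a database transaction such as *sql.Tx.
type tx interface {
	Exec(ctx context.Context, stmt string) error
	Commit() error
	Rollback() error
}

// memTx buffers statements and only publishes them to the store on Commit.
type memTx struct {
	store   *[]string
	pending []string
}

func (t *memTx) Exec(ctx context.Context, stmt string) error {
	if err := ctx.Err(); err != nil {
		return err // a canceled context aborts the statement
	}
	t.pending = append(t.pending, stmt)
	return nil
}

func (t *memTx) Commit() error {
	*t.store = append(*t.store, t.pending...)
	return nil
}

func (t *memTx) Rollback() error {
	t.pending = nil
	return nil
}

// inTx runs fn inside the transaction: commit on success, roll back on any error.
func inTx(t tx, fn func(tx) error) error {
	if err := fn(t); err != nil {
		_ = t.Rollback()
		return err
	}
	return t.Commit()
}

// setNetwork is a hypothetical analogue of a *Context function: both writes
// land or neither does. cancelAfterFirst simulates the HTTP request being
// canceled between the two statements.
func setNetwork(store *[]string, name string, cancelAfterFirst bool) error {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	return inTx(&memTx{store: store}, func(t tx) error {
		if err := t.Exec(ctx, "upsert network "+name); err != nil {
			return err
		}
		if cancelAfterFirst {
			cancel()
		}
		return t.Exec(ctx, "upsert extra properties for "+name)
	})
}

// committed reports how many statements reached the store after one call.
func committed(cancelAfterFirst bool) int {
	var store []string
	_ = setNetwork(&store, "NMN", cancelAfterFirst)
	return len(store)
}

func main() {
	fmt.Println(committed(false)) // 2: both statements committed together
	fmt.Println(committed(true))  // 0: cancellation rolled back the first write
}
```

The point of the design is that a canceled request leaves nothing half-written, which separate, non-transactional queries cannot guarantee.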
- Datastore changes
  - Pass in contexts from the HTTP layer for use with the database layer.
  - Removed the xname parameter from `SetXname`, as it was unused; the xname was already being provided by the generic hardware object.
- SLS Client
  - Added `GetNetworks` method to call `/v1/networks`
  - Added `GetNetwork` method to call `/v1/networks/$NETWORK_NAME`
- Added
- Language linting of API spec file (no content changes)
- Removed encrypted dumpstate and loadstate APIs
- Added many API tests using Tavern.
- CASMHMS-5695 - Improved the performance of getting the hardware components
- CASMHMS-5696 - Disallow Networks with empty names
- CASMHMS-4267 - Changed loadstate to validate
- CASMHMS-5691: Added the slingshot11 network type
- CASMINST-3902: Expanded the SLS client to perform dumpstate and PUT to networks.
- CASMINST-3788: Add SLS Migrator to deal with malformed data from older CSM releases.
  - When malformed liquid-cooled Chassis data is encountered, a corresponding ChassisBMC will be created.
  - When malformed xname-derived fields (`Parent`, `Type`, `TypeString`) are encountered, the object will be PUT back into SLS to recalculate the fields.
- Added CDUMgmtSwitch as an acceptable type to the SLS CT functional tests.
- CASMHMS-5488: Changed the way the SQL is built for the search API
- CASMHMS-5291: Add Model field to the `ComptypeCabinet` structure.
- Updated the hms-xname package to 1.1.0.
- Updated references of hms-base to the v2 version of the package.
- Updated CT tests to hms-test:3.1.0 image as part of Helm test coordination.
- Updated SLS to build using GitHub Actions instead of Jenkins.
- Pull images from artifactory.algol60.net instead of arti.dev.cray.com.
- Added a runCT.sh script that can run the CT tests in a docker-compose environment.
- CASMHMS-5350 - Improved swagger documentation
- CASMHMS-5259 - Added validation for IPRanges when setting networks
- CASMHMS-4670 - PUT and POST /hardware validation and HTTP response improvements.
- CASMHMS-4671 - Hardware Search API now returns 400 for bad requests instead of the 500 HTTP status.
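
The validation entries above (CASMHMS-5259, CASMHMS-4670, CASMHMS-4671) share one idea: reject bad client input with a 400 instead of letting it surface as a 500. The following is a minimal sketch of that idea, not the SLS handler itself. The trimmed-down `network` type, the `validate`, `putNetwork`, and `statusFor` names, the example URL path, and the assumption that IP ranges are written in CIDR notation are all illustrative.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
	"strings"
)

// network is a hypothetical, trimmed-down request body; the real SLS schema
// has more fields.
type network struct {
	Name     string   `json:"Name"`
	IPRanges []string `json:"IPRanges"`
}

// validate reports the first problem found in n, or nil if n looks acceptable.
func validate(n network) error {
	if strings.TrimSpace(n.Name) == "" {
		return fmt.Errorf("network name must not be empty")
	}
	for _, r := range n.IPRanges {
		// Assumption: ranges are CIDR strings; ParseCIDR accepts IPv4 and IPv6.
		if _, _, err := net.ParseCIDR(r); err != nil {
			return fmt.Errorf("invalid IP range %q: %v", r, err)
		}
	}
	return nil
}

// putNetwork answers 400 for client mistakes rather than failing later.
func putNetwork(w http.ResponseWriter, r *http.Request) {
	var n network
	if err := json.NewDecoder(r.Body).Decode(&n); err != nil {
		http.Error(w, "malformed JSON body", http.StatusBadRequest)
		return
	}
	if err := validate(n); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// statusFor sends body to the handler and returns the resulting status code.
func statusFor(body string) int {
	req := httptest.NewRequest(http.MethodPut, "/v1/networks/example", strings.NewReader(body))
	rec := httptest.NewRecorder()
	putNetwork(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor(`{"Name":"NMN","IPRanges":["10.252.0.0/17"]}`)) // 200
	fmt.Println(statusFor(`{"Name":"NMN","IPRanges":["10.252.0.0/33"]}`)) // 400
	fmt.Println(statusFor(`{"Name":"","IPRanges":[]}`))                   // 400
}
```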
- CASMHMS-4270 - Support Cabinet and CDU hardware objects
- CASMHMS-4669 - Support HL Switch and RTR TOR FPGA hardware objects.
- CASMINST-3617 - Added PeerASN and MyASN to NetworkExtraProperties struct
- CASMNET-697 - Added MetalLBPoolName to IPV4Subnet struct
- CASMNET-692 - Added Bifurcated CAN default route toggle.
- CASMHMS-5055 - Added SLS CT test RPM.
- Changed the docker image to run as the user nobody
- Added GitHub configuration files.
- Replaced all old stash paths with github.com
- Add support for building within the CSM Jenkins.
- CASMHMS-4929 - Enable automatic postgres backups for SLS.
- CASMHMS-4898 - Updated base container images for security updates.
- Bump minor version for CSM 1.2 release branch
- Bump minor version for CSM 1.1 release branch
- Added aliases to ComptypeNodeBmc and added new struct for ComptypeChassisBmc to support aliases for that as well if necessary.
- Updated docker-compose files to pull images from Artifactory instead of DTR.
- CASMINST-2121: Added new fields to the IPV4Subnet struct to support uai_macvlan in csi
- CASMHMS-4765: Set a limit for the maximum number of database connections SLS can have open.
- Updated Dockerfile to pull base images from Artifactory instead of DTR.
- CASMHMS-4600 - Fixed an issue where the Hardware search API did not accept `comptype_hl_switch` and `comptype_cdu_mgmt_switch` as valid values to the `type` query param.
- CASMHMS-4578/CASMHMS-4749 - Update the cray-service chart to 2.4.7 to address postgres security vulnerabilities and wait-for-postgres resource limit changes.
- Fixed an issue where SLS did not have `comptype_cab_pdu_pwr_connector` properly defined.
- CASMHMS-4605 - Update the loftsman/docker-kubectl image to use a production version.
- CASMHMS-4554: Scale SLS to 3 replicas with anti-affinity to prevent multiple SLS pods running on the same worker node.
- CASMINST-1546: Improved error handling in the SLS loader job. Modified the process of determining the IP address of rgw-vip.nmn to be more robust.
- Added the runSnyk.sh script, which was missed previously.
- Update License/Copyright info, re-vendor go packages.
- CASMINST-1126: Pick up the latest cray-service base chart to pick up the wait-for-postgres job changes that prevent these jobs from getting OOMKilled.
- Updated license file.
- CASMINST-759: Use the livecd nameserver to determine the IP address of the S3 endpoint. For DNS name resolution in k8s to work properly, SLS needs to be populated with data so that the Unbound manager job can set up DNS records. However, when the SLS loader job first runs, Unbound is empty and is unable to resolve the S3 endpoint.
- CASMHMS-4266 - Added support to SLS for MgmtHLSwitch & CDUMgmtSwitch, and updated HMS Base to v1.8.4.
- CASMHMS-4148 - Update go module vendor code for security fix.
- CASMHMS-4055 - The SLS Loader job will now only upload the SLS input file once. The new default behavior of the SLS loader job is to check the SLS S3 bucket for the special file `uploaded`. If that file is not present in S3, the loader will load the SLS file into SLS; if it is present, the loader performs a no-op. After the loader performs the SLS loadstate, it will create the `uploaded` file in the SLS S3 bucket.
- CASMHMS-4163: Update cray-service-base chart to the latest 2.2.0 version.
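
The load-once behavior described under CASMHMS-4055 amounts to a marker-file check. The sketch below illustrates that logic under stated assumptions: an in-memory map stands in for the SLS S3 bucket, and `bucket`, `runLoader`, `loadCount`, and the `load` callback are hypothetical names rather than the loader's real API.

```go
package main

import "fmt"

// bucket is an in-memory stand-in for the SLS S3 bucket (object key -> contents).
type bucket map[string][]byte

// markerKey is the special marker object named in the changelog entry.
const markerKey = "uploaded"

// runLoader loads the input file only if the marker is absent, then writes
// the marker. It reports whether a load was performed. load is a hypothetical
// callback standing in for the SLS loadstate call.
func runLoader(b bucket, inputKey string, load func([]byte) error) (bool, error) {
	if _, done := b[markerKey]; done {
		return false, nil // marker present: no-op
	}
	data, ok := b[inputKey]
	if !ok {
		return false, fmt.Errorf("input file %q not found in bucket", inputKey)
	}
	if err := load(data); err != nil {
		return false, err // marker is not written, so a later run can retry
	}
	b[markerKey] = []byte{}
	return true, nil
}

// loadCount runs the loader the given number of times against one bucket and
// returns how many times the load callback was actually invoked.
func loadCount(runs int) int {
	b := bucket{"sls_input_file.json": []byte(`{}`)}
	calls := 0
	for i := 0; i < runs; i++ {
		_, _ = runLoader(b, "sls_input_file.json", func([]byte) error {
			calls++
			return nil
		})
	}
	return calls
}

func main() {
	fmt.Println(loadCount(3)) // 1: only the first run performs the load
}
```

Writing the marker only after a successful load means a failed run can be retried without manual cleanup.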
- CASMHMS-4105 - Updated base Golang Alpine image to resolve libcrypto vulnerability.
- CASMHMS-4099 - The SLS Network structures have been greatly enriched. The base Network structure has not been modified, and all new networking information has been added to the network's extra properties. Networks are now meant to represent an IPv4 Network, and each IPv4 network can describe the IPv4 subnets within the network. IP reservations can also be described within an IPv4 subnet.
- CASMHMS-4100 - Download the pre-generated `sls_input_file.json` from the SLS S3 bucket. SLS no longer generates the SLS input file within the SLS Init/Load job; instead, the SLS file is generated off of the system and then uploaded into the SLS S3 bucket. This is the new behavior in Shasta v1.4 and forward.
- Upgraded the cray-service chart to the latest version
- CASMCLOUD-1023 - Updated cray-service base chart to the latest 2.x version. This new version of the cray service chart now supports Helm v3.
- Modified containers/init containers, volume, and persistent volume claim value definitions to be objects instead of arrays
- CASMHMS-3996 - Updated hms-sls to use trusted baseOS images.
- CASMHMS-3985 - fixed switch xname generation to use destination rack rather than source.
- CASMHMS-3792 - Improved support for PDUs. Fixed Management switch connectors for PDUs to use the correct xname for the PDU.
- CASMHMS-3914 - moved CMC BMC number to 999.
- CASMHMS-3768 - made parsing tolerate unknown hardware better.
- CASMHMS-3768 - Fixed bug where the config parser would fall right over when it got to something not in a U.
- CASMHMS-3674 - fixed parsing bug.
- CASMHMS-3611 - Added CT smoke test for SLS.
- CASMHMS-3648 - fixed processing of xnames for Columbia switches.
- CASMHMS-3639 - fixed config parser bug.
- CASMHMS-3635 - made fetching files from S3 retry indefinitely.
- CASMHMS-3550 - added all logic relating to downloading files from S3, generating SLS config, and pushing that into SLS.
- CASMHMS-3466 - added a lot of the parsing logic for the new config files.
- CASMHMS-3526 - fixed job cleanup.
- CASMHMS-3456 - added ExtraProperties section to Networks object.
- Changed migration logic slightly to migrate to the requested version instead of always migrating up.
- Updated to version 1.5.0 of the base `cray-service` chart.
- CASMHMS-3263 - updated cray-service base chart to enable online install upgrade/rollback
- CASMHMS-2965 - use golang based image for build-base to align with other services.
- CASMHMS-2965 - use trusted baseOS image.
- Moved the SLS loader out of the ansible installer and into the SLS helm chart.
- Removed the use of the wait-for-postgres job, which will be removed from the base chart.
- CASMHMS-2900 - Updated swagger file to fix openapi conversion issues and include missed commands.
- Added SLS loader.
- Made SLS tolerate not having keys for load/dump state.
- Changed the Postgres configuration during unit tests to allow connections without passwords. A breaking change was made to the official postgres image to require a password by default.
- CASMHMS-2641 - added liveness, readiness, and health endpoints.
- Updated hms-common lib.
- Added encrypted dump/load of SLS. This can be used as follows:
  - Generate a public/private key pair (the command below derives the public key from an existing `private.pem`):

        openssl rsa -in private.pem -outform PEM -pubout -out public.pem

  - Dump encrypted:

        curl -X POST \
          http://localhost:8376/v1/dumpstate \
          -H 'Accept: */*' \
          -H 'Accept-Encoding: gzip, deflate' \
          -H 'Cache-Control: no-cache' \
          -H 'Connection: keep-alive' \
          -H 'Content-Length: 1034' \
          -H 'Content-Type: multipart/form-data; boundary=--------------------------089094351527063763744770' \
          -H 'Host: localhost:8376' \
          -H 'cache-control: no-cache' \
          -H 'content-type: multipart/form-data; boundary=----WebKitFormBoundary7MA4YWxkTrZu0gW' \
          -F public_key=@public_key.pem

  - Save the output above in a file. Load encrypted:

        curl -X POST \
          http://localhost:8376/v1/loadstate \
          -H 'Accept: */*' \
          -H 'Accept-Encoding: gzip, deflate' \
          -H 'Cache-Control: no-cache' \
          -H 'Connection: keep-alive' \
          -H 'Content-Length: 5443' \
          -H 'Content-Type: multipart/form-data; boundary=--------------------------767615378467519380075801' \
          -H 'Host: localhost:8376' \
          -H 'cache-control: no-cache' \
          -H 'content-type: multipart/form-data; boundary=----WebKitFormBoundary7MA4YWxkTrZu0gW' \
          -F sls_dump=@sls_test_config.json \
          -F private_key=@private_key.pem

- Liveness/readiness probes.
- Changed URLs so they do not begin with /sls/. The use of /sls internally was resulting in URLs beginning with /sls/sls/ when transiting the API gateway.
- Added GET for /hardware (gets list of all hardware components)
- Added search for hardware and networks.
- Added /hardware API set
- Added completed /loadstate and /dumpstate endpoints
- Added unit testing for database against real database instance.
- Adds support for all network API operations except for PATCH.
- This release adds the final bits necessary to support the basic operations of SLS. It also builds and functionally runs (though doesn't do anything all that useful yet.)
- This is the initial release. It contains no functionality yet.