Add Everest storage (port of seba_sqlite logic) #9763

Open: 12 commits to be merged into base branch main

@yngve-sk (Contributor) commented Jan 16, 2025

Release notes:

  • Stores constraint violation values
  • Replaces seba sqlite storage

(This reopens the storage PR for review; the previous PR, #9161, was merged accidentally.)

Issue
Resolves #8811

Base idea/documentation:

Store datasets by [batch, realization, perturbation] x [controls, objectives, constraints, objective_gradient, constraint_gradient]:

Exhaustive list of data stored PER BATCH:

  • batch.json - batch metadata: the batch_id and whether the batch is an improvement (also known as the merit flag; the concept is now unified for dakota and non-dakota runs)
  • batch_constraints - batch-wide constraint values (and violations)
  • batch_objectives - batch-wide objective values
  • realization_controls - control values per geo-realization; also includes simulation_id
  • realization_objectives - objective values per geo-realization
  • realization_constraints - constraint values per geo-realization
  • perturbation_objectives - objective and control values per perturbation
  • perturbation_constraints - constraint and control values per perturbation (discussion point: control values could be moved into a separate table to avoid redundancy)
  • batch_objective_gradient - partial derivatives of the objectives with respect to the controls. The dataset has one column per objective and one row per control value; each cell holds the partial derivative of that objective with respect to that control value.
  • batch_constraint_gradient - partial derivatives of the constraints with respect to the controls. The dataset has one column per constraint and one row per control value; each cell holds the partial derivative of that constraint with respect to that control value.
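
To make the gradient layout concrete, here is a minimal sketch of a table with one row per control and one column per objective. The control names, objective names, toy functions, and the `gradient_table` helper are all hypothetical, and plain dicts stand in for the dataframes actually stored, so this illustrates the shape only and is not the code in this PR.

```python
# Illustrative only: names and values are made up, and a list of dicts
# stands in for the stored dataframe.

def gradient_table(partials):
    """Arrange d(objective)/d(control) values with one row per control
    and one column per objective.

    `partials` maps objective name -> {control name -> partial derivative}.
    """
    controls = sorted({c for by_control in partials.values() for c in by_control})
    return [
        {
            "control_name": control,
            **{obj: by_control[control] for obj, by_control in partials.items()},
        }
        for control in controls
    ]


# Toy objectives evaluated at x=1, y=2:
#   distance = x**2 + y**2  ->  d/dx = 2x = 2, d/dy = 2y = 4
#   cost     = 3*x          ->  d/dx = 3,      d/dy = 0
partials = {
    "distance": {"point.x": 2.0, "point.y": 4.0},
    "cost": {"point.x": 3.0, "point.y": 0.0},
}

table = gradient_table(partials)
for row in table:
    print(row)
```

The constraint gradient would have the same shape, with one column per constraint instead of one per objective.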

Example data from math_func/config_advanced.yml (json format)
[Screenshot 2025-01-10 at 14 53 04]

Exhaustive list of data stored PER OPTIMIZATION

  • controls.json - control values for this batch
  • realization_weights.json - realization weights
  • nonlinear_constraints - conditions for constraints to satisfy (on average over the batch)
  • objective_functions - objective function names, weights, and normalization
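
As a rough illustration of how the per-optimization realization weights might relate the per-realization values to the batch-wide ones, the sketch below computes a weighted average over realizations. This relationship is an assumption made for illustration (the description only says constraints are satisfied "on average over the batch"); the `batch_objective` helper and the sample numbers are hypothetical.

```python
def batch_objective(realization_values, realization_weights):
    """Weighted average of per-realization objective values.

    Assumption: the batch-wide value is the realization-weighted mean.
    Both arguments map realization id -> number.
    """
    total_weight = sum(realization_weights[r] for r in realization_values)
    if total_weight <= 0:
        raise ValueError("realization weights must sum to a positive value")
    weighted_sum = sum(
        realization_weights[r] * value for r, value in realization_values.items()
    )
    return weighted_sum / total_weight


# Hypothetical objective values for three geo-realizations.
values = {0: 10.0, 1: 20.0, 2: 40.0}
weights = {0: 0.5, 1: 0.25, 2: 0.25}

result = batch_objective(values, weights)
print(result)  # 0.5*10 + 0.25*20 + 0.25*40 = 20.0
```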

Example data from math_func/config_advanced.yml
[Screenshot 2025-01-10 at 15 00 29]

Potential simplifications

The everest_data_api is currently used only for plotting. With some expansion, it could also replace the direct (polars) dataframe manipulations that are currently done elsewhere in the code.
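
One possible shape for such an expanded API is a small read-only facade, so that callers ask for results by name instead of filtering dataframes themselves. The `OptimizationResults` class, its methods, and the row format below are hypothetical and are not the real everest_data_api; plain dicts stand in for polars dataframes.

```python
class OptimizationResults:
    """Hypothetical read-only facade over stored objective rows."""

    def __init__(self, objective_rows):
        # Each row is assumed to look like:
        # {"batch_id": int, "objective_name": str, "value": float}
        self._rows = list(objective_rows)

    def batch_ids(self):
        """Return the distinct batch ids in ascending order."""
        return sorted({row["batch_id"] for row in self._rows})

    def objective_history(self, objective_name):
        """Return (batch_id, value) pairs for one objective, ordered by batch."""
        pairs = [
            (row["batch_id"], row["value"])
            for row in self._rows
            if row["objective_name"] == objective_name
        ]
        return sorted(pairs)


rows = [
    {"batch_id": 1, "objective_name": "distance", "value": 0.5},
    {"batch_id": 0, "objective_name": "distance", "value": 2.0},
    {"batch_id": 0, "objective_name": "cost", "value": 7.0},
]

results = OptimizationResults(rows)
history = results.objective_history("distance")
print(results.batch_ids())
print(history)
```

Callers such as plotting code would then depend only on the facade's methods, so the underlying storage format could change without touching them.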

@yngve-sk changed the title from "24.10.25.store everest opt results in ertstorage" to "Add Everest storage" on Jan 16, 2025

codspeed-hq bot commented Jan 16, 2025

CodSpeed Performance Report

Merging #9763 will improve performances by 11.87%

Comparing yngve-sk:24.10.25.store-everest-opt-results-in-ertstorage (66419ce) with main (2e7fba6)

Summary

⚡ 1 improvements
✅ 23 untouched benchmarks

Benchmarks breakdown

Benchmark | BASE | HEAD | Change
test_direct_dark_performance_with_storage[gen_x: 20, sum_x: 20 reals: 10-summary-get_record_observations] | 1.4 ms | 1.3 ms | +11.87%

@yngve-sk self-assigned this on Jan 16, 2025
@yngve-sk added the release-notes:breaking-change (automatically categorise as breaking change in release notes) and enhancement labels on Jan 16, 2025
@yngve-sk changed the title from "Add Everest storage" to "Add Everest storage (port of seba_sqlite logic)" on Jan 16, 2025

@DanSava (Contributor) left a comment

This looks good 👍 🏅

src/everest/api/everest_data_api.py (outdated, resolved)
src/everest/detached/__init__.py (resolved)

@StephanDeHoop (Contributor) left a comment

Looks great! Most of my comments are just asking to clarify things I don't understand :). Amazing work, well done :) !!!

src/everest/everest_storage.py (resolved)
src/everest/everest_storage.py (outdated, resolved)
src/everest/everest_storage.py (resolved)
src/everest/everest_storage.py (outdated, resolved)
src/everest/everest_storage.py (resolved)
src/everest/bin/config_branch_script.py (resolved)
tests/everest/entry_points/test_config_branch_entry.py (outdated, resolved)
tests/everest/entry_points/test_config_branch_entry.py (outdated, resolved)
@yngve-sk force-pushed the 24.10.25.store-everest-opt-results-in-ertstorage branch 11 times, most recently from 2f0127c to ab585e9, on January 21, 2025 10:29
@yngve-sk force-pushed the 24.10.25.store-everest-opt-results-in-ertstorage branch 7 times, most recently from 27d32ae to 9870a08, on January 23, 2025 06:56
@yngve-sk force-pushed the 24.10.25.store-everest-opt-results-in-ertstorage branch from 9870a08 to 66419ce on January 24, 2025 11:42
Labels: enhancement, release-notes:breaking-change (automatically categorise as breaking change in release notes)
Project status: Ready for Review
Linked issue (may be closed when this pull request is merged): Refactor communication and storage of optimization results
4 participants