
Minor release 16 of March 2021

@amc1999 amc1999 released this 16 Mar 23:36
· 1254 commits to master since this release

Minor release 16 of March 2021:

  • output tables suppression (see below for details).
  • fix to Modgen UNDEF_VALUE issue in derived tables (see below for details).
  • do not force import of std namespace into model code (this change can break compilation of existing model code, see below for details).
  • use STL filesystem instead of custom path handling functions.
  • clean up compilation warnings in model C++ code generated by openM++ compiler.

Download source code and binaries:

Download cluster version (MPI):

Linux downloads:

  • Desktop version is built on Debian-10 and expected to work on any modern Linux, including Ubuntu 20.04 and CentOS 8.
  • Cluster (MPI) Debian-10 version is expected to work on any modern Linux with Open MPI version 3+ installed, including CentOS 8.
  • Cluster (MPI) CentOS-8 version requires Open MPI 4.x and does not work on RedHat / CentOS versions below 8.2.

IMPORTANT:
The full version of the OpenM++ source code is always included in the "Download" links above.
Please do NOT use the "Source code (zip)" or "Source code (tar.gz)" archives from the "Assets" links below.
They are auto-generated by GitHub release tools and contain only half of the OpenM++ source code.

Output tables suppression

By default, a model calculates all output tables and writes them into the database as model run results. Sometimes it may be convenient to save only some output tables to reduce the time of each model run. This can be done either by suppressing model output table(s) or table group(s):

model.exe -Tables.Suppress ageSexIncome
model.exe -Tables.Suppress ageSexIncome,fullAgeSalary,A_TablesGroup

Or by suppressing output for all tables except some:

model.exe -Tables.Retain ageSexIncome
model.exe -Tables.Retain ageSexIncome,fullAgeSalary,A_TablesGroup

Suppress and Retain options are mutually exclusive and cannot be mixed. For example, this model run would fail:

model.exe -Tables.Suppress ageSexIncome -Tables.Retain fullAgeSalary

Do not force import of std namespace into model code

This release may cause C++ compilation errors in existing models. These errors can be eliminated by inserting the line

  using namespace std;

into the model source file custom_early.h.

Starting with this release, openM++ no longer forces the import of the C++ standard library namespace ‘std’ into model code, leaving that choice to the model developer. If the model developer chooses to import the std namespace, model source code can refer to C++ standard library names without a leading std:: prefix, e.g.

  string s;

If the std namespace is not imported (the default as of this release), C++ standard library names require a std:: prefix, e.g.

   std::string s;

Explicit use of the std:: prefix in model code makes clear that the symbol comes from the C++ standard library, which is thoroughly documented with examples at https://en.cppreference.com/w/ and elsewhere. The demonstration models in the openM++ distribution use the std:: prefix to refer to symbols in the C++ standard library.
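Between a blanket import and prefixing every name, C++ also allows using-declarations that import individual names. The sketch below is illustrative only: the function names and the namespace `sketch` are hypothetical and are not part of openM++ or any demonstration model.

```cpp
#include <string>
#include <vector>

// Option 1: explicit std:: prefix, no import required.
std::string join_labels(const std::vector<std::string>& labels)
{
    std::string result;
    for (const std::string& label : labels) {
        if (!result.empty()) result += ",";
        result += label;
    }
    return result;
}

// Option 2: import only selected names with using-declarations.
// The hypothetical namespace limits where the imported names are visible.
namespace sketch {
    using std::string;
    using std::vector;

    string first_label(const vector<string>& labels)
    {
        return labels.empty() ? string() : labels.front();
    }
}
```

Either style compiles without `using namespace std;`, so a model written this way should not depend on the line being added to custom_early.h.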

Fix to Modgen UNDEF_VALUE issue in derived tables

Summary:

This release adds the functions FixedGetTableValue and FixedSetTableValue, which replace GetTableValue and SetTableValue. They work around a Modgen issue which can write erroneous values (often, but not always, very large values) to cells of derived tables instead of missing (empty) values. To use this functionality in the x-development framework, add the statement:

#include "omc/fixed_modgen_api.h"

to the model file custom.h.

Addressing the issue in Modgen requires that all occurrences of GetTableValue be replaced by FixedGetTableValue and all occurrences of SetTableValue be replaced by FixedSetTableValue.

This is a recommended change and has been implemented in all demonstration models in the openM++ distribution.

If this change is implemented in a model, it is possible that some cells of some derived tables may change. If so, the old values were probably incorrect and the new values correct. See the Details section below for more information on the root cause of these differences.

This issue does not affect openM++. In openM++, the new optional functions are identical to the original functions, and either can be used.

For x-compatible models, FixedGetTableValue and FixedSetTableValue should be used to defend against erroneous derived tables in the Modgen version of a model.

This issue affects many models which use derived tables. It can be detected by a logical analysis of model code, noticing anomalous values in output tables, or by comparing Modgen results to openM++ results. It may manifest more frequently with smaller population sizes due to a higher number of missing table cells. The issue was most recently found in a mechanical comparison of openM++ and Modgen outputs in a large model. The underlying behaviour is documented in the Modgen Developer’s Guide. That said, it is difficult to avoid the underlying issue in all but simple uses of derived tables.

Details:

OpenM++ and Modgen include the concept of derived tables (called user tables in Modgen). Model code which runs after the simulation completes assigns values to a derived table based on the values of other tables, using the function GetTableValue to obtain the value of a table cell, and SetTableValue to write the value of a table cell. A table cell may have an empty (undefined) value, for example if it is a mean with no observations.

In Modgen, if GetTableValue is called for a table cell containing an empty value, the special value UNDEF_VALUE is returned, and if SetTableValue is called with UNDEF_VALUE, a missing (empty) value is recorded. This behaviour is documented in the Modgen Developer’s Guide.

However, if mathematical operations are done using UNDEF_VALUE, a different value may result. Consider the following code fragment:

   // if the table cell is empty, Modgen returns UNDEF_VALUE which is 2,147,483,647 and the variable cost is set to that value
   double cost = GetTableValue("table.measure", …);
   double deflated_cost = cost / 1.05;

   // if cell was empty, deflated_cost is 2,045,222,520.952381
   SetTableValue("table_deflated.measure", deflated_cost, …);

Because deflated_cost is not equal to UNDEF_VALUE, Modgen thinks it is a valid numeric value, and will write that value to the table cell, producing erroneous output. The value of the cell in table_deflated should be undefined (empty) if cost is undefined.

Standards for floating point math include a special value called a ‘quiet NaN’, where NaN means not-a-number, to handle this kind of situation. A mathematical operation involving a NaN will result in another NaN. In the example, cost would be NaN, and deflated_cost would also be NaN.

OpenM++ uses NaNs for undefined (missing) values, so missing values propagate correctly in model code, as in the example above.
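The difference between the two representations can be checked in plain C++, without Modgen or openM++. In this sketch `kUndefSketch` is an assumed stand-in for UNDEF_VALUE, using the value quoted in the code fragment above, and the function names are hypothetical.

```cpp
#include <cmath>
#include <limits>

// Assumption: stand-in for Modgen's UNDEF_VALUE, using the value quoted in these notes.
constexpr double kUndefSketch = 2147483647.0;

// Same deflation step as in the fragment above.
double deflate(double cost)
{
    return cost / 1.05;
}

// Sentinel approach: after arithmetic the result no longer equals the sentinel,
// so it is indistinguishable from a valid numeric value.
bool sentinel_survives_deflation()
{
    return deflate(kUndefSketch) == kUndefSketch;   // false
}

// NaN approach: the result of arithmetic on NaN is still NaN,
// so the cell remains recognizably missing.
bool nan_survives_deflation()
{
    const double missing = std::numeric_limits<double>::quiet_NaN();
    return std::isnan(deflate(missing));            // true
}
```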

The new functions in this release convert to and from Modgen’s UNDEF_VALUE and NaN, so that missing (empty) values are NaN, as they are in openM++.

Specifically, in Modgen models

FixedGetTableValue silently converts Modgen’s UNDEF_VALUE to a quiet NaN when reading a cell value, which will then propagate correctly in mathematical operations in model code.

FixedSetTableValue silently converts a quiet NaN to Modgen’s UNDEF_VALUE when writing a cell value.
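The two conversions can be pictured as a pair of small helper functions. This is only a sketch of the idea, not the contents of omc/fixed_modgen_api.h: `undef_to_nan`, `nan_to_undef`, `deflate_cell` and `kUndefSketch` are hypothetical names, and the sentinel value is the one quoted earlier in these notes.

```cpp
#include <cmath>
#include <limits>

// Assumption: stand-in for Modgen's UNDEF_VALUE (value quoted earlier in these notes).
constexpr double kUndefSketch = 2147483647.0;

// Hypothetical helper: the conversion applied when reading a cell value.
double undef_to_nan(double raw)
{
    return raw == kUndefSketch ? std::numeric_limits<double>::quiet_NaN() : raw;
}

// Hypothetical helper: the conversion applied when writing a cell value.
double nan_to_undef(double value)
{
    return std::isnan(value) ? kUndefSketch : value;
}

// Round trip for the deflation example: read, compute, write.
double deflate_cell(double raw_cell)
{
    const double cost = undef_to_nan(raw_cell);
    const double deflated_cost = cost / 1.05;   // stays NaN if cost was NaN
    return nan_to_undef(deflated_cost);
}
```

With this round trip an empty input cell yields an empty output cell, while ordinary values are deflated as before.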

Note that all comparisons with NaN return false, even ==; the one exception is !=, which returns true. In rare situations where it is necessary to determine if a value is NaN, use the C++ standard function

   bool std::isnan(x)
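A short sketch of how NaN behaves in comparisons, and of a typical use of std::isnan. The helper names are hypothetical; treating a missing cell as zero is only one possible policy and is shown purely for illustration.

```cpp
#include <cmath>
#include <limits>

// Returns true only if x compares equal to itself, which is false for NaN.
bool equals_itself(double x)
{
    return x == x;
}

// Hypothetical helper: treat a missing (NaN) cell as zero, for example when accumulating a total.
double value_or_zero(double x)
{
    return std::isnan(x) ? 0.0 : x;
}
```

Because `x == x` is false for NaN, a test such as `value == missing_value` cannot detect a missing cell; std::isnan is the reliable check.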

Possible new API for derived tables:

The existing API for derived tables has significant design and usability issues.

It might be appealing to replace it with a design which:

1. Enables model code to treat tables, both input and output tables, as normal C++ multi-dimensional arrays, dispensing with the existing string-based interface GetTableValue and SetTableValue.

2. Eliminates the need for the model developer to analyze their own code and specify dependencies among tables.

3. Eliminates the need for the model developer to correctly manage the invocation order of multiple UserTables() functions, or portions of code inside such functions, so that upstream tables are computed before computing downstream tables which use them.

4. Enables the framework to compute only the tables required (and in the right order) to produce the specific tables a user requests at run time.

Here’s a short sketch of a language extension which would do that:

1. After the derived_table statement body which gives the dimensions and measures of the table (like user_table now), but before the closing ; which finishes the derived_table statement, the model developer provides a comma-separated list of tables which are directly required to compute the derived table.

notional example:

  derived_table TheTable {
     REGION
     *
     {
       QUANTITY,
       RATE
     }
  }
  needs UpstreamTable1, UpstreamTable2
  ;

2. The openM++ compiler would generate a function declaration with arguments for all upstream tables (read-only) and for the result table (writable), with a function name based on the derived table being computed. Continuing the example, assume that UpstreamTable1 and UpstreamTable2 are tables by REGION, each with a single measure. The function declaration produced by the openM++ compiler to compute the table TheTable might look something like

  void TheTable_compute( double (&TheTable)[10][2], const double (&UpstreamTable1)[10][1], const double (&UpstreamTable2)[10][1] );

3. The model developer would have the responsibility of providing the definition (body) of that function.

The model developer might have difficulty figuring out in advance what that function declaration looks like, so might just choose to let the openM++ compiler generate the declaration, have the build fail because the function was not defined, then copy the declaration from the error message or the generated code to start off the definition of the function in an .mpp file.

The model developer’s code inside the function TheTable_compute can refer to the 3 tables as though they are normal C++ multi-dimensional arrays. The function declaration will prevent the model developer from inadvertently changing any values in the upstream tables (any attempt will produce a C++ compiler error). The downstream table would be writable.
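To make the idea concrete, here is a sketch of what a model developer’s definition might look like. It passes the tables in reference-to-array form, which is one valid C++ way to hand fixed-size arrays to a function. Everything specific here is an assumption made for illustration: 10 regions, measure 0 as QUANTITY and measure 1 as RATE, and the meaning given to the two upstream tables. No such API exists in this release.

```cpp
#include <cmath>
#include <cstddef>

// Assumptions for illustration: 10 regions; UpstreamTable1 holds a count
// and UpstreamTable2 holds a total, each with a single measure.
constexpr std::size_t kRegions = 10;

void TheTable_compute(
    double (&TheTable)[kRegions][2],
    const double (&UpstreamTable1)[kRegions][1],
    const double (&UpstreamTable2)[kRegions][1])
{
    for (std::size_t r = 0; r < kRegions; ++r) {
        const double count = UpstreamTable1[r][0];
        const double total = UpstreamTable2[r][0];
        TheTable[r][0] = total;          // QUANTITY
        TheTable[r][1] = total / count;  // RATE; a NaN input yields a NaN cell
        // UpstreamTable1[r][0] = 0.0;   // would not compile: upstream tables are const
    }
}
```

Because missing cells are NaN, the body needs no special handling for empty upstream cells: the corresponding RATE cell simply comes out as NaN.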

4. The design of the derived_table statement is such that the framework has complete knowledge of dependencies among tables. Moreover, that knowledge is accurate, because the body of the TheTable_compute() function refers only to the upstream tables named in the derived_table statement and passed as arguments, and can only modify the correct result table, passed as a writable array argument.
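Given that dependency knowledge, a framework could work out which tables to compute, and in what order, for the tables requested at run time. The sketch below uses a depth-first traversal, which is one common way to do this and is not taken from openM++. The type `Needs` and the functions `schedule` and `compute_order` are hypothetical names, and the sketch assumes the dependency graph has no cycles.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical representation of the "needs" clauses:
// table name -> tables directly required to compute it.
using Needs = std::map<std::string, std::vector<std::string>>;

// Depth-first visit: schedule everything a table needs before the table itself.
// Assumes the dependency graph has no cycles.
void schedule(const std::string& table, const Needs& needs,
              std::set<std::string>& done, std::vector<std::string>& order)
{
    if (!done.insert(table).second) return;   // already scheduled
    const auto it = needs.find(table);
    if (it != needs.end()) {
        for (const std::string& upstream : it->second) {
            schedule(upstream, needs, done, order);
        }
    }
    order.push_back(table);
}

// Returns the tables to compute, upstream first, for the requested tables only.
std::vector<std::string> compute_order(const std::vector<std::string>& requested,
                                       const Needs& needs)
{
    std::set<std::string> done;
    std::vector<std::string> order;
    for (const std::string& table : requested) {
        schedule(table, needs, done, order);
    }
    return order;
}
```

Tables that are neither requested nor needed by a requested table never appear in the result, which corresponds to computing only the tables required.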