Oms API Read Run Microdata compare runs

amc1999 edited this page May 14, 2024 · 5 revisions

Read a "page" of microdata values and compare model runs.

Compare microdata value attributes (float or integer type attributes) of [base] and [variant] model runs and group the results by dimension attributes (enum-based or bool type attributes).

  • Compare one or more microdata value attributes (float or integer type attributes). For example, two comparisons: OM_AVG(Income[variant] - Income[base]) , OM_MAX( 100 * (Salary[variant] + Pension[variant]) / Income[base]).

  • It is also possible to include aggregated value attribute(s) for each single run, for example: OM_MAX(Salary) , OM_MIN(Pension).

  • Group by one or more dimension attributes (enum-based or bool type attribute). For example, group by two dimension attributes: AgeGroup , Sex.

  • Page is a part of the aggregated microdata values, defined by a zero-based "start" row number and a row count. If row count <= 0 then all rows below the start row number are returned.

  • Dimension attribute(s) are returned as enum codes. For boolean dimensions a string value is used, e.g.: "true".

  • Method verb must be POST and the Content-Type header must be "application/json".

The following aggregation functions are available:

  • OM_AVG mean of accumulators sub-values
  • OM_SUM sum of accumulators sub-values
  • OM_COUNT count of accumulators sub-values (excluding NULL's)
  • OM_COUNT_IF count values matching condition
  • OM_MAX maximum of accumulators sub-values
  • OM_MIN minimum of accumulators sub-values
  • OM_VAR variance of accumulators sub-values
  • OM_SD standard deviation of accumulators sub-values
  • OM_SE standard error of accumulators sub-values
  • OM_CV coefficient of variation of accumulators sub-values

It is also possible to use parameter(s) in a calculation; the parameter must be a scalar of float or integer type. For example: OM_COUNT_IF((Income[variant] - Income[base]) > param.High[base]), where param.High[base] is the value of scalar parameter High in the [base] model run.

For more details please see: Model Output Expressions

The JSON body is POSTed to specify the entity name, page start row and size, filters and row order. It is expected to be a JSON representation of the db.ReadCompareMicroLayout structure from the Go library. See also: db.ReadLayout structure from the Go library.

// ReadCompareMicroLayout to compare microdata runs with base run using multiple comparison aggregations and/or calculation aggregations.
//
// Comparison aggregation must contain [base] and [variant] attribute(s), ex.: OM_AVG(Income[base] - Income[variant]).
// Calculation aggregation is attribute(s) aggregation expression, ex.: OM_MAX(Income) / OM_MIN(Salary).
type ReadCompareMicroLayout struct {
	ReadCalculteMicroLayout          // aggregation measures and group by attributes
	Runs                    []string // runs to compare: list of digest, stamp or name
}

// ReadCalculteMicroLayout describe microdata generation read layout, aggregation measures and group by attributes.
type ReadCalculteMicroLayout struct {
	ReadLayout           // entity name, run id, page size, where filters and order by
	CalculateMicroLayout // microdata aggregations
}

// CalculateMicroLayout describes aggregations of microdata.
//
// It can be comparison aggregations and/or calculation aggregations.
// Comparison aggregation must contain [base] and [variant] attribute(s), ex.: OM_AVG(Income[base] - Income[variant]).
// Calculation aggregation is attribute(s) aggregation expression, ex.: OM_MAX(Income) / OM_MIN(Salary).
type CalculateMicroLayout struct {
	Calculation []CalculateLayout // aggregation measures, ex.: OM_MIN(Salary), OM_AVG(Income[base] - Income[variant])
	GroupBy     []string          // attributes to group by
}

// CalculateLayout describes calculation expression for parameters, output table values or microdata entity.
// It can be comparison calculation for multiple model runs, ex.: Expr0[base] - Expr0[variant].
type CalculateLayout struct {
	Calculate string // expression to calculate, ex.: Expr0[base] - Expr0[variant]
	CalcId    int    // calculated expression id, calc_id column in csv,     ex.: 0, 12000, 24000
	Name      string // calculated expression name, calc_name column in csv, ex.: Expr0, AVG_Expr0, RATIO_Expro0
}

// ReadLayout describes source and size of data page to read input parameter, output table values or microdata.
//
// Row filters combined by AND and allow to select dimension or attribute items,
// it can be enum codes or enum id's, ex.: dim0 = 'CA' AND dim1 IN (2010, 2011, 2012)
type ReadLayout struct {
	Name           string           // parameter name, output table name or entity microdata name
	FromId         int              // run id or set id to select input parameter, output table values or microdata from
	ReadPageLayout                  // read page first row offset, size and last page flag
	Filter         []FilterColumn   // dimension or attribute filters, final WHERE does join all filters by AND
	FilterById     []FilterIdColumn // dimension or attribute filters by enum ids, final WHERE does join filters by AND
	OrderBy        []OrderByColumn  // order by columns, if empty then dimension id ascending order is used
}
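
The structures above embed one another, so in JSON their fields appear flattened at the top level of the request body (as in the examples further down this page). A minimal Python sketch of assembling such a body follows; the helper name compare_body and its argument layout are illustrative assumptions, not part of the oms API or the Go library:

```python
import json


def compare_body(entity, calculations, group_by, runs, offset=0, size=0):
    """Build a dict shaped like ReadCompareMicroLayout with embedded structs flattened.

    calculations is a list of (expression, calc id, name) tuples.
    compare_body itself is a hypothetical helper, used here only for illustration.
    """
    return {
        "Name": entity,
        "Calculation": [
            {"Calculate": expr, "CalcId": calc_id, "Name": name}
            for (expr, calc_id, name) in calculations
        ],
        "GroupBy": list(group_by),
        "Runs": list(runs),
        "Offset": offset,
        "Size": size,  # size <= 0 selects all rows below the start row
    }


body = compare_body(
    "Person",
    [("OM_AVG(Income[variant] - Income[base])", 2401, "Avg_Income")],
    ["AgeGroup", "Sex"],
    ["Microdata other in database"],
)
print(json.dumps(body, indent=4))
```

The printed JSON can be saved to a file and passed to curl with -d @file.json.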

Methods:

POST /api/model/:model/run/:run/microdata/compare

For example:

curl -v -X POST -H "Content-Type: application/json" http://localhost:4040/api/model/modelOne/run/Microdata%20in%20database/microdata/compare -d @read_m1_person_cmp_1.json
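
The same request can be prepared from Python with only the standard library. This is a sketch under assumptions: compare_request is a hypothetical helper name, and the host and port are those of the curl example above. The request is built but not sent, so the snippet runs without a live oms instance:

```python
import json
import urllib.parse
import urllib.request


def compare_request(base_url, model, run, body):
    """Prepare (but do not send) a POST to the microdata compare endpoint."""
    # Percent-encode model and run so names containing spaces form valid path segments.
    url = "{}/api/model/{}/run/{}/microdata/compare".format(
        base_url.rstrip("/"),
        urllib.parse.quote(model, safe=""),
        urllib.parse.quote(run, safe=""),
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = compare_request(
    "http://localhost:4040", "modelOne", "Microdata in database", {"Name": "Person"}
)
print(req.get_method(), req.full_url)

# To send it against a running web-service:
# with urllib.request.urlopen(req) as resp:
#     result = json.load(resp)
```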

Arguments:

:model - (required) model digest or model name

Model can be identified by digest or by model name. It is recommended to use digest because it uniquely identifies the model. It is possible to use model name, which is more human readable than digest, but if there are multiple models with the same name in the database then the result is undefined.

:run - (required) model run digest, run stamp or run name

Model run can be identified by run digest, run stamp or run name. It is recommended to use digest because it uniquely identifies the model run. Run stamp, if not explicitly specified as a model run option, is automatically generated as a timestamp string, ex.: 2016_08_17_21_07_55_123. It is also possible to use name, which is more human readable than digest, but if there are multiple runs with the same name in the database then the result is undefined.

JSON body arguments:

Example 1: Compare the Person entity between the [base] model run and the [variant] model run "Microdata other in database": calculate the OM_AVG() average of Income[variant] - Income[base] and group it by the AgeGroup , Sex dimension attributes.

{
    "Name": "Person",
    "Calculation": [{
            "Calculate": "OM_AVG(Income[variant] - Income[base])",
            "CalcId": 2401,
            "Name": "Avg_Income"
        }
    ],
    "GroupBy": [
        "AgeGroup",
        "Sex"
    ],
    "Runs": [
        "Microdata other in database"
    ]
}

Example 2.

  • compare Person entity
  • between [base] model run and [variant] model run: Microdata other in database
  • calculate two values:
    • OM_AVG() average of Income[variant] - Income[base] value, adjusted by using parameter StartingSeed values
    • OM_AVG() average of Salary + Pension value, adjusted by using parameter StartingSeed values
  • and group it by AgeGroup , Sex dimension attributes
  • filter only rows where:
    • dimension AgeGroup IN ["20-30", "40+"]
    • and dimension Sex = "F"
    • and value of Avg_Income_adjusted > 65000
    • and value of Avg_Salary_Pension_adjusted < 75000
{
    "Name": "Person",
    "Calculation": [{
            "Calculate": "OM_AVG((Income[variant] - Income[base]) * (param.StartingSeed[variant] - param.StartingSeed[base]))",
            "CalcId": 2401,
            "Name": "Avg_Income_adjusted"
        }, {
            "Calculate": "param.StartingSeed + OM_AVG(Salary + Pension)",
            "CalcId": 2408,
            "Name": "Avg_Salary_Pension_adjusted"
        }
    ],
    "GroupBy": [
        "AgeGroup",
        "Sex"
    ],
    "Runs": [
        "Microdata other in database"
    ],
    "Offset": 0,
    "Size": 100,
    "IsFullPage": true,
    "Filter": [{
            "Name": "AgeGroup",
            "Op": "IN",
            "Values": ["20-30", "40+"]
        }, {
            "Name": "Sex",
            "Op": "=",
            "Values": ["F"]
        }, {
            "Name": "Avg_Income_adjusted",
            "Op": ">",
            "Values": ["65000"]
        }, {
            "Name": "Avg_Salary_Pension_adjusted",
            "Op": "<",
            "Values": ["75000"]
        }
    ]
}
Name       - (required) entity name
Offset     - (optional) zero-based start row to select aggregated microdata values
Size       - (optional) max row count to select rows, if size <= 0 then all rows selected
IsFullPage - (optional) if true then always return non-empty last page of data
Filter     - (optional) conditions to filter dimension attributes or calculated values
OrderBy    - (optional) list of column indexes (one-based) to order by

Filter conditions are joined by AND and can use the following operations:

=       - enum equal to:          AgeGroup = "20-30"
!=      - enum not equal to:      AgeGroup <> "20-30"
>       - enum greater than:      AgeGroup > "20-30"
>=      - enum greater or equal:  AgeGroup >= "20-30"
<       - enum less than:         AgeGroup < "20-30"
<=      - enum less or equal:     AgeGroup <= "20-30"
IN      - enum is in the list of: AgeGroup IN ("20-30", "30-40", "40+")
BETWEEN - between min and max:    AgeGroup BETWEEN "30-40" AND "all"
IN_AUTO - automatically choose most suitable: = or != or IN or BETWEEN

Keep in mind: dimension enums are always ordered by id's, not by code and result of filter Sex < "M" may not be Sex = "F".
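
A small Python sketch of why a comparison filter can be surprising. The enum ids below are invented for illustration (the real ids come from the model's type definition); the point is only that the comparison uses ids, not the alphabetical order of codes:

```python
# Hypothetical enum: ids follow declaration order, not alphabetical order of codes.
SEX_ENUM = {"M": 0, "F": 1}


def codes_less_than(enum_ids, code):
    """Return the codes whose enum id is smaller than the id of `code`."""
    limit = enum_ids[code]
    return [c for c, i in enum_ids.items() if i < limit]


# With these ids, Sex < "M" selects nothing, although "F" sorts before "M" as a string.
print(codes_less_than(SEX_ENUM, "M"))
print(codes_less_than(SEX_ENUM, "F"))
```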

Order by is specified by one-based column index(es) in the result. Columns always contain enum ids, not enum codes, and therefore the result is ordered by ids. The first two columns are run_id, calc_id:

  SELECT run_id, CalcId AS calc_id, AgeGroup, Sex, ..., calc_value FROM .... ORDER BY 1, 2,...

JSON response:

{
  Layout: {
    Offset:     actual first row number of the page data (zero-base),
    Size:       actual data page row count,
    IsLastPage: true if this is last page of data
  },
  Page: [....page of data...]
}
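
The Layout part of the response is enough to walk through a large result page by page. A Python sketch follows, assuming a caller-supplied fetch_page(offset, size) function (hypothetical) that POSTs the body with the given Offset and Size and returns the decoded response; an in-memory stand-in replaces the web-service so the example is self-contained:

```python
def read_all_pages(fetch_page, page_size):
    """Collect rows page by page until the response reports the last page."""
    rows, offset = [], 0
    while True:
        resp = fetch_page(offset, page_size)
        rows.extend(resp["Page"])
        layout = resp["Layout"]
        # Stop on the last page; also stop on an empty page to avoid looping forever.
        if layout["IsLastPage"] or layout["Size"] <= 0:
            break
        offset = layout["Offset"] + layout["Size"]
    return rows


# Stand-in for the web-service: serves slices of an in-memory list.
DATA = list(range(7))


def fake_fetch(offset, size):
    page = DATA[offset:offset + size]
    return {
        "Page": page,
        "Layout": {
            "Offset": offset,
            "Size": len(page),
            "IsLastPage": offset + len(page) >= len(DATA),
        },
    }


print(read_all_pages(fake_fetch, 3))
```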

Result:

{
  "Page": [{
      "Attr": [{
          "IsNull": false,
          "Value": "40+"
        }, {
          "IsNull": false,
          "Value": "F"
        }, {
          "IsNull": false,
          "Value": 69934.18659670698
        }
      ],
      "CalcName": "Avg_Salary_Pension_adjusted",
      "RunDigest": "703d8b78039d69b795ab2e601c32b789"
    }, {
      "Attr": [{
          "IsNull": false,
          "Value": "20-30"
        }, {
          "IsNull": false,
          "Value": "F"
        }, {
          "IsNull": false,
          "Value": 920051.7871559632
        }
      ],
      "CalcName": "Avg_Income_adjusted",
      "RunDigest": "c8382ce22a004bf86d83000aa022d45b"
    }, {
      "Attr": [{
          "IsNull": false,
          "Value": "40+"
        }, {
          "IsNull": false,
          "Value": "F"
        }, {
          "IsNull": false,
          "Value": 65848.00593810345
        }
      ],
      "CalcName": "Avg_Salary_Pension_adjusted",
      "RunDigest": "c8382ce22a004bf86d83000aa022d45b"
    }
  ],
  "Layout": {
    "Offset": 0,
    "Size": 3,
    "IsLastPage": true,
    "IsFullPage": true
  }
}
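
Each row of Page carries one calculated value for one run, so client code usually regroups rows by calculation and run. A Python sketch, assuming (as the result above suggests) that Attr holds the group by attributes in GroupBy order followed by the calculated value; values_by_calc is a hypothetical helper name and the page literal repeats the first two rows of the result:

```python
from collections import defaultdict


def values_by_calc(page, group_count):
    """Map (CalcName, RunDigest) to {tuple of group by codes: calculated value}."""
    out = defaultdict(dict)
    for row in page:
        attrs = row["Attr"]
        key = tuple(a["Value"] for a in attrs[:group_count])
        cell = attrs[group_count]  # assumed position of the calculated value
        value = None if cell["IsNull"] else cell["Value"]
        out[(row["CalcName"], row["RunDigest"])][key] = value
    return dict(out)


page = [
    {"Attr": [{"IsNull": False, "Value": "40+"},
              {"IsNull": False, "Value": "F"},
              {"IsNull": False, "Value": 69934.18659670698}],
     "CalcName": "Avg_Salary_Pension_adjusted",
     "RunDigest": "703d8b78039d69b795ab2e601c32b789"},
    {"Attr": [{"IsNull": False, "Value": "20-30"},
              {"IsNull": False, "Value": "F"},
              {"IsNull": False, "Value": 920051.7871559632}],
     "CalcName": "Avg_Income_adjusted",
     "RunDigest": "c8382ce22a004bf86d83000aa022d45b"},
]

table = values_by_calc(page, group_count=2)  # two GroupBy attributes: AgeGroup, Sex
for (calc_name, run_digest), cells in table.items():
    print(calc_name, run_digest, cells)
```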
