This repository contains the code examples for consuming Sisu's insights in BI tools. For more information, or to get started, contact your Sisu Customer Success team.
Below is an overview of the tables and columns, along with their semantic meaning:
NOTE: It is intended that SET1 refers to _Group A_ or the _Recent Period_ value, and SET2 refers to _Group B_ or the _Previous Period_ value. For General Performance analyses, SET2 will always be NULL.
ANALYSIS - Holds the metadata for this analysis
| Column Name | Type | Semantic Meaning |
| --- | --- | --- |
ANALYSIS_ID | BIGINT, PRIMARY KEY | The static ID for a given analysis |
ANALYSIS_NAME | TEXT | The name of the analysis in Sisu |
ANALYSIS_TYPE | TEXT | The type of the analysis |
APPLICATION_URL | TEXT | A direct link to the analysis in Sisu |
CREATED_AT | TIMESTAMP | Timestamp when this analysis was created |
METRIC_ID | BIGINT | The static ID of the metric analyzed |
METRIC_NAME | TEXT | The name of the metric analyzed |
METRIC_DESIRED_DIRECTION | TEXT | The desired direction of the metric analyzed |
PROJECT_ID | BIGINT | The static ID of the project where this analysis is located in Sisu |
PROJECT_NAME | TEXT | The name of the Sisu project |
LOAD_TS | TIMESTAMP | The timestamp when this record was loaded into the data warehouse |
ANALYSIS_RESULT_SUMMARY - Holds the summary ribbon information for this analysis
| Column Name | Type | Semantic Meaning |
| --- | --- | --- |
ANALYSIS_ID | BIGINT, PRIMARY KEY | The static ID for a given analysis |
ANALYSIS_RESULT_ID | BIGINT, PRIMARY KEY | The static ID for the results associated with a specific execution of a specific analysis |
REQUESTED_AT | TIMESTAMP | Timestamp when this analysis execution began |
COMPLETED_AT | TIMESTAMP | Timestamp when this analysis execution completed |
RUN_STATUS | TEXT | Status of the analysis execution |
RUN_TYPE | TEXT | Type of the analysis execution |
TIMEFRAME | TEXT | Currently empty |
PREVIOUS_PERIOD_START | TIMESTAMP | Start date for the Previous Period of a Time Comparison analysis |
PREVIOUS_PERIOD_END | TIMESTAMP | End date for the Previous Period of a Time Comparison analysis |
RECENT_PERIOD_START | TIMESTAMP | Start date for the Recent Period of a Time Comparison analysis |
RECENT_PERIOD_END | TIMESTAMP | End date for the Recent Period of a Time Comparison analysis |
GROUP_A_NAME | TEXT | The name of Group A for a Group Comparison analysis |
GROUP_B_NAME | TEXT | The name of Group B for a Group Comparison analysis |
METRIC_TYPE_LABEL | TEXT | The type of metric analyzed |
PERCENT_CHANGE | FLOAT | The Percentage Change in metric value between periods or groups |
DIRECTION | TEXT | The direction of change of the metric value between periods or groups |
SET1_CARD_LABEL | TEXT | Summary card label for set 1 |
SET1_CATEGORY_FILTER | TEXT | Any filters applied to set 1 data |
SET1_AVERAGE | FLOAT | The average value for set 1 |
SET1_MIN | FLOAT | The minimum value for set 1 |
SET1_MAX | FLOAT | The maximum value for set 1 |
SET1_MEDIAN | FLOAT | The median value for set 1 |
SET1_SUM | FLOAT | The sum value for set 1 |
SET1_SUMMARY_VALUE | FLOAT | The metric value for set 1 |
SET1_TOTAL_SIZE | FLOAT | The population size for set 1 |
SET1_TOTAL_NUMERATOR | FLOAT | For rate metrics, the total numerator value for set 1 |
SET1_TOTAL_DENOMINATOR | FLOAT | For rate metrics, the total denominator value for set 1 |
SET1_MATCH_SIZE | FLOAT | The match size for set 1 |
SET1_WEIGHT | FLOAT | The weight value for set 1 |
SET2_CARD_LABEL | TEXT | Summary card label for set 2 |
SET2_CATEGORY_FILTER | TEXT | Any filters applied to set 2 data |
SET2_AVERAGE | FLOAT | The average value for set 2 |
SET2_MIN | FLOAT | The minimum value for set 2 |
SET2_MAX | FLOAT | The maximum value for set 2 |
SET2_MEDIAN | FLOAT | The median value for set 2 |
SET2_SUM | FLOAT | The sum value for set 2 |
SET2_SUMMARY_VALUE | FLOAT | The metric value for set 2 |
SET2_TOTAL_SIZE | FLOAT | The population size for set 2 |
SET2_TOTAL_NUMERATOR | FLOAT | For rate metrics, the total numerator value for set 2 |
SET2_TOTAL_DENOMINATOR | FLOAT | For rate metrics, the total denominator value for set 2 |
SET2_MATCH_SIZE | FLOAT | The match size for set 2 |
SET2_WEIGHT | FLOAT | The weight value for set 2 |
LOAD_TS | TIMESTAMP | The timestamp when this record was loaded into the data warehouse |
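As an illustration of consuming these tables from a BI or SQL client, the sketch below joins ANALYSIS to ANALYSIS_RESULT_SUMMARY to fetch the latest run's percent change for an analysis. The table contents and the SQLite in-memory database are invented for the example; only the column names come from the schema above.

```python
import sqlite3

# Stand-in warehouse: an in-memory SQLite database with a subset of the columns
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ANALYSIS (ANALYSIS_ID INTEGER, ANALYSIS_NAME TEXT, METRIC_NAME TEXT);
CREATE TABLE ANALYSIS_RESULT_SUMMARY (
    ANALYSIS_ID INTEGER, ANALYSIS_RESULT_ID INTEGER,
    COMPLETED_AT TEXT, PERCENT_CHANGE REAL);
INSERT INTO ANALYSIS VALUES (1, 'Revenue WoW', 'revenue');
INSERT INTO ANALYSIS_RESULT_SUMMARY VALUES
    (1, 10, '2019-01-01', 0.05),
    (1, 11, '2019-01-08', 0.12);
""")

# Latest completed run's percent change for each analysis
row = conn.execute("""
    SELECT a.ANALYSIS_NAME, s.PERCENT_CHANGE
    FROM ANALYSIS a
    JOIN ANALYSIS_RESULT_SUMMARY s ON s.ANALYSIS_ID = a.ANALYSIS_ID
    ORDER BY s.COMPLETED_AT DESC
    LIMIT 1
""").fetchone()
```

The same join works unchanged against the real warehouse tables, since the column names match the schema documented here.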
ANALYSIS_RESULT_WATERFALL - Holds the waterfall steps for a given analysis, if applicable (NOTE: General Performance analyses do not have waterfall charts)
| Column Name | Type | Semantic Meaning |
| --- | --- | --- |
ANALYSIS_ID | BIGINT, PRIMARY KEY | The static ID for a given analysis |
ANALYSIS_RESULT_ID | BIGINT, PRIMARY KEY | The static ID for the results associated with a specific execution of a specific analysis |
STEP_ID | INTEGER, PRIMARY KEY | The step ID for this step of the waterfall chart |
STEP_TYPE | TEXT | The type of this step of the waterfall chart |
FACTOR_0_DIMENSION | TEXT | Dimension name for the first factor of this step |
FACTOR_0_VALUE | TEXT | Dimension value for the first factor of this step |
FACTOR_1_DIMENSION | TEXT | Dimension name for the second factor of this step |
FACTOR_1_VALUE | TEXT | Dimension value for the second factor of this step |
FACTOR_2_DIMENSION | TEXT | Dimension name for the third factor of this step |
FACTOR_2_VALUE | TEXT | Dimension value for the third factor of this step |
STEP_IMPACT | FLOAT | The net impact of this step of the waterfall |
CUMULATIVE_IMPACT_BEFORE_STEP | FLOAT | The cumulative impact before this step of the waterfall |
CUMULATIVE_IMPACT_AFTER_STEP | FLOAT | The cumulative impact after this step of the waterfall |
OVERLAPPING_IMPACT | FLOAT | The amount of any overlapping impact |
CHANGE_IN_SIZE_SET1 | FLOAT | The change in value for this metric for set 1; the type of value is given in CHANGE_IN_TYPE_SET1 below |
CHANGE_IN_SIZE_SET2 | FLOAT | The change in value for this metric for set 2; the type of value is given in CHANGE_IN_TYPE_SET2 below |
CHANGE_IN_TYPE_SET1 | FLOAT | The type of the changed value in CHANGE_IN_SIZE_SET1 above, for set 1 |
CHANGE_IN_TYPE_SET2 | FLOAT | The type of the changed value in CHANGE_IN_SIZE_SET2 above, for set 2 |
LOAD_TS | TIMESTAMP | The timestamp when this record was loaded into the data warehouse |
ANALYSIS_RESULT_DETAIL - Holds the segment table information for a given analysis
| Column Name | Type | Semantic Meaning |
| --- | --- | --- |
ANALYSIS_ID | BIGINT, PRIMARY KEY | The static ID for a given analysis |
ANALYSIS_RESULT_ID | BIGINT, PRIMARY KEY | The static ID for the results associated with a specific execution of a specific analysis |
SUBGROUP_ID | BIGINT, PRIMARY KEY | The static ID for this subgroup |
CONFIDENCE | TEXT | The Sisu calculated confidence bucket for this segment |
FACTOR_0_DIMENSION | TEXT | Dimension name for the first factor of this segment |
FACTOR_0_VALUE | TEXT | Dimension value for the first factor of this segment |
FACTOR_1_DIMENSION | TEXT | Dimension name for the second factor of this segment |
FACTOR_1_VALUE | TEXT | Dimension value for the second factor of this segment |
FACTOR_2_DIMENSION | TEXT | Dimension name for the third factor of this segment |
FACTOR_2_VALUE | TEXT | Dimension value for the third factor of this segment |
FACTOR_0_DIMENSION_FRIENDLY | TEXT | Dimension name for the first factor of this segment, formatted for easier readability |
FACTOR_1_DIMENSION_FRIENDLY | TEXT | Dimension name for the second factor of this segment, formatted for easier readability |
FACTOR_2_DIMENSION_FRIENDLY | TEXT | Dimension name for the third factor of this segment, formatted for easier readability |
FACTOR_0_VALUE_FRIENDLY | TEXT | Dimension value for the first factor of this segment, formatted for easier readability |
FACTOR_1_VALUE_FRIENDLY | TEXT | Dimension value for the second factor of this segment, formatted for easier readability |
FACTOR_2_VALUE_FRIENDLY | TEXT | Dimension value for the third factor of this segment, formatted for easier readability |
FACTOR_0_TEXT | TEXT | The combined, formatted, dimension name and value for the first factor of this segment |
FACTOR_1_TEXT | TEXT | The combined, formatted, dimension name and value for the second factor of this segment |
FACTOR_2_TEXT | TEXT | The combined, formatted, dimension name and value for the third factor of this segment |
SEGMENT_TEXT | TEXT | The combined, formatted, dimension names and values for this segment |
IMPACT | FLOAT | The Sisu calculated impact for this segment |
IMPACT_MAGNITUDE | FLOAT | The absolute value of the Sisu calculated impact for this segment |
IMPACT_RANK | TEXT | Currently empty |
SET1_SIZE | FLOAT | The Sisu calculated population size for this segment for set 1 |
SET2_SIZE | FLOAT | The Sisu calculated population size for this segment for set 2 |
SET1_VALUE | FLOAT | The Sisu calculated metric value for this segment for set 1 |
SET2_VALUE | FLOAT | The Sisu calculated metric value for this segment for set 2 |
PERCENT_CHANGE | FLOAT | The percentage change in metric value from set 1 to set 2 |
DIRECTION | TEXT | The formatted direction of change in metric value for this segment |
DIRECTION_TEXT | TEXT | The textual representation of the direction of change in metric value for this segment |
ORIENTATION_MATCHES_METRIC | TEXT | True if the direction of change for this segment matches the desired direction of change for this metric. Otherwise, False |
SEGMENT_ORDER | SMALLINT | Number of factors involved in this segment (1, 2, or 3) |
SEGMENT_ORDER_TEXT | TEXT | A textual representation of whether this segment is a first-, second-, or third-order segment |
INSIGHT_TEXT | TEXT | A textual representation of this segment, its impact, direction of change, etc. |
CHANGE_IN_SIZE | FLOAT | Currently empty |
MIX_EFFECT | FLOAT | Currently empty |
NET_EFFECT | FLOAT | Currently empty |
NET_RELATIVE_EFFECT | FLOAT | Currently empty |
PERCENT_CHANGE_IN_SIZE | FLOAT | Currently empty |
RATE_EFFECT | FLOAT | Currently empty |
RELATIVE_PERCENT_CHANGE | FLOAT | Currently empty |
RATE_CHANGE | FLOAT | Currently empty |
RELATIVE_MIX_EFFECT | FLOAT | Currently empty |
SEGMENT_NAME | TEXT | Currently empty |
SEGMENT_RANK | TEXT | Currently empty |
SEGMENT_HASH | TEXT | Currently empty |
UNWEIGHTED_CHANGE_IN_AVERAGE | FLOAT | Currently empty |
WEIGHT | FLOAT | Currently empty |
WEIGHTED_CHANGE_IN_SUM | FLOAT | Currently empty |
UNITS | TEXT | Currently empty |
LOAD_TS | TIMESTAMP | The timestamp when this record was loaded into the data warehouse |
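Several of the columns above are derived from others (IMPACT_MAGNITUDE from IMPACT, SEGMENT_ORDER and SEGMENT_TEXT from the factor columns). A sketch of those derivations in pandas, using a made-up one-row segment, might look like this; the ` AND ` separator used to build SEGMENT_TEXT is an assumption, not necessarily what the real scripts emit.

```python
import pandas as pd

# One invented segment; only the column names come from the schema above
df = pd.DataFrame({
    "FACTOR_0_TEXT": ["country = US"],
    "FACTOR_1_TEXT": ["plan = pro"],
    "FACTOR_2_TEXT": [None],
    "IMPACT": [-42.5],
})

factor_cols = ["FACTOR_0_TEXT", "FACTOR_1_TEXT", "FACTOR_2_TEXT"]

# IMPACT_MAGNITUDE is the absolute value of the Sisu calculated impact
df["IMPACT_MAGNITUDE"] = df["IMPACT"].abs()
# SEGMENT_ORDER is the number of factors involved (1, 2, or 3)
df["SEGMENT_ORDER"] = df[factor_cols].notna().sum(axis=1)
# SEGMENT_TEXT combines the formatted factor texts; the separator is an assumption
df["SEGMENT_TEXT"] = df[factor_cols].apply(
    lambda row: " AND ".join(v for v in row if v), axis=1)
```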
TREND_RESULT_SUMMARY - Holds summary card information for a trend detection
| Column Name | Type | Semantic Meaning |
| --- | --- | --- |
ANALYSIS_ID | BIGINT, PRIMARY KEY | The static ID for a given analysis |
ANALYSIS_RESULT_ID | BIGINT, PRIMARY KEY | The static ID for the results associated with a specific execution of a specific analysis |
REQUESTED_AT | TIMESTAMP | Timestamp when this analysis execution began |
COMPLETED_AT | TIMESTAMP | Timestamp when this analysis execution completed |
RUN_STATUS | TEXT | Status of the analysis execution |
RUN_TYPE | TEXT | Type of the analysis execution |
CURRENT_PERIOD_CARD_LABEL | TEXT | Summary card label for the current trend |
CURRENT_PERIOD_DENOMINATOR_LABEL | TEXT | Summary card denominator label for the current trend |
CURRENT_PERIOD_PERCENT_CHANGE | FLOAT | Percent change for the current trend |
CURRENT_PERIOD_SLOPE | FLOAT | Trend line slope for the current trend |
PREVIOUS_PERIOD_CARD_LABEL | TEXT | Summary card label for the previous trend |
PREVIOUS_PERIOD_DENOMINATOR_LABEL | TEXT | Summary card denominator label for the previous trend |
PREVIOUS_PERIOD_PERCENT_CHANGE | FLOAT | Percent change for the previous trend |
PREVIOUS_PERIOD_SLOPE | FLOAT | Trend line slope for the previous trend |
LOAD_TS | TIMESTAMP | The timestamp when this record was loaded into the data warehouse |
TREND_RESULT_DETAIL - Holds the segment details for a trend detection
| Column Name | Type | Semantic Meaning |
| --- | --- | --- |
ANALYSIS_ID | BIGINT, PRIMARY KEY | The static ID for a given analysis |
ANALYSIS_RESULT_ID | BIGINT, PRIMARY KEY | The static ID for the results associated with a specific execution of a specific analysis |
SUBGROUP_ID | BIGINT, PRIMARY KEY | The static ID for this subgroup |
FACTOR_0_DIMENSION | TEXT | Dimension name for the first factor of this segment |
FACTOR_0_VALUE | TEXT | Dimension value for the first factor of this segment |
FACTOR_1_DIMENSION | TEXT | Dimension name for the second factor of this segment |
FACTOR_1_VALUE | TEXT | Dimension value for the second factor of this segment |
FACTOR_2_DIMENSION | TEXT | Dimension name for the third factor of this segment |
FACTOR_2_VALUE | TEXT | Dimension value for the third factor of this segment |
IMPACT | FLOAT | The Sisu calculated impact for this segment |
START_DATE | TIMESTAMP | The Start Date of the trend for this segment |
END_DATE | TIMESTAMP | The End Date of the trend for this segment |
INTERCEPT | FLOAT | The intercept of the trend line for this segment |
SLOPE | FLOAT | The slope of the trend line for this segment |
SIZE | FLOAT | The population size of this segment |
LOAD_TS | TIMESTAMP | The timestamp when this record was loaded into the data warehouse |
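Given a segment's INTERCEPT and SLOPE, a trend line can be evaluated at a point in time. The helper below is a hypothetical illustration only: it assumes SLOPE is per day and INTERCEPT is the value at START_DATE, neither of which is stated in the schema above, so confirm the actual units with your Customer Success team before relying on this.

```python
from datetime import date

def trend_value(intercept, slope, start, when):
    """Evaluate a segment's trend line at a given date.

    Assumes SLOPE is per-day and INTERCEPT is the value at START_DATE
    (both assumptions; the real column semantics may differ).
    """
    return intercept + slope * (when - start).days
```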
The following scripts comprise the working solution:
Configuration of the project is completed in `config.py`. Each of the project scripts uses different sections of this configuration file:
- REQUIRED IMPORTS: In this section we import additional libraries or classes that are required for operation of the project. Note that we import a custom class, `SisuOutputDatabaseConnection`. Customers writing segments to Snowflake may use the existing class; customers writing segments to other databases will need to write a database handler (see Extensions). We also import `io` to allow input parameters such as passwords and API keys to be hosted in a profile file for security purposes
- RETURN A DATABASE CONNECTION: In this section, we define a method called `getDatabaseConnection()` that returns a new `SisuOutputDatabaseConnection` object implementing the required handler methods (see Extensions). Customers using Snowflake as their data warehouse require no changes
- DATABASE CONNECTION PARAMETERS: This section defines the connection parameters used by the `SisuOutputDatabaseConnection`. Customers will need to update the parameter values in this section for their data warehouse. Customers using Snowflake will require new values for each of the existing parameters; other data warehouse types will require different parameters (see Extensions)
- SEGMENT LOAD PARAMETERS: This section defines parameters used by `load_analysis_results.py` to control the record insert batch size, the string delimiter used for segment load, whether database tables should be truncated at the start of execution, and whether the script should operate in debug mode (verbose logging to the console). Customers should not need to modify this section unless directed by their Customer Success Manager or Solution Architect. NOTE: setting the truncate option to true will truncate all of the results tables at the start of execution. This cannot be undone; use with extreme caution
- USER PARAMETERS: Customers may put additional parameters for their organization in this section. In Sisu's default configuration, we show how one may retrieve Sisu API keys stored in the executing user's `bash` or `zsh` profile
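A sketch of what those `config.py` sections might contain follows. Only `getDatabaseConnection`, `SisuOutputDatabaseConnection`, and `AI_API_KEY` appear in this documentation; every other name and value is illustrative, and the environment-variable lookup stands in for the shell-profile approach described above.

```python
import os

# --- DATABASE CONNECTION PARAMETERS (placeholder names and values) ---
WAREHOUSE_ACCOUNT = "my_account"
WAREHOUSE_USER = "my_user"

# --- SEGMENT LOAD PARAMETERS (hypothetical names) ---
BATCH_SIZE = 1000        # record insert batch size
STRING_DELIMITER = "|"   # string delimiter used for segment load
TRUNCATE_TABLES = False  # True truncates every results table at startup; irreversible
DEBUG = False            # verbose logging to the console

# --- USER PARAMETERS ---
# Stand-in for reading the key from the executing user's shell profile
AI_API_KEY = os.environ.get("SISU_API_KEY", "")

# --- RETURN A DATABASE CONNECTION ---
def getDatabaseConnection():
    # Would return a SisuOutputDatabaseConnection built from the values above
    raise NotImplementedError
```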
`load_analysis_results.py` is the main project script. It performs the actions of:
- Updating analyses with new time period configurations
- Iterating over time periods for historical load of segments
- Executing an analysis
- Retrieving and formatting Sisu segments
- Passing segments to the database helper to be written to the warehouse
It is split into two main functions:
process_tc_action
This method takes inputs to allow historical loads of different types for a Time Comparison analysis. For example, we may wish to execute a Time Comparison analysis for each of the previous 12 months and write the segments to the database.
Inputs to this method are all required:
- API_KEY - String: A valid Sisu API Key for the customer organization where the analysis to be operated resides
- ANALYSIS_ID - Integer: A valid Time Comparison analysis ID
- RETURN_RESULTS - Boolean: True if the results are to be returned to the calling function as a JSON blob or object; False if the results are to be inserted into the database directly
- ACTION_TYPE - String: Must be one of:
- DOD: Day-over-Day execution
- WOW: Week-over-Week execution
- MOM: Month-over-Month execution
- START_DATE - Datetime: The starting date of the first Recent Period of the historical load
- END_DATE - Datetime: The starting date of the last Recent Period of the historical load
Executing historical loads is then automated. The script will perform one of the ACTION_TYPE loads as requested by the user. Dates are calculated, the analysis specified in ANALYSIS_ID is modified to set the Recent and Previous Period dates as calculated (see below), and `execute_load()` is called with the provided API_KEY, RETURN_RESULTS, and ANALYSIS_ID. EXECUTE_ANALYSIS is set to True.
Dates are calculated and the script operates in the following manner:
- We first calculate a timeDelta, which is a number of days according to the ACTION_TYPE specified. For DOD, timeDelta is one day. For WOW, timeDelta is seven days. For MOM, timeDelta is one month
- We then calculate the initial dates based on the START_DATE provided. The Recent Period is set as START_DATE to START_DATE + timeDelta. The Previous Period is set as START_DATE - timeDelta to START_DATE
- We then operate a loop of:
  - Update the Time Comparison to set the date ranges as calculated
  - Call `execute_load()` for the modified analysis
  - Update the date ranges, adding timeDelta to each of the Previous / Recent Period Start / End dates
- Stop operating when the Recent Period Start Date is AFTER the END_DATE provided as input to the function
NOTES:
- Our calculations are based on the Recent Period (that is, the later period) start and end dates. The Previous Period is set to the comparison period BEFORE the Recent Period. One must ensure that data exists for both the Previous Period and Recent Period, or execution will fail
- The START_DATE and END_DATE input parameters specify the range of time to operate this historical load, NOT as the start and end dates of any period of the analysis
- Sisu’s time period Start Date is inclusive, but End Date is exclusive. So, the Previous Period End Date may be the same as the Recent Period Start Date without any overlap in data
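The date-stepping loop described above can be sketched as follows. This is a hypothetical helper, not the actual script; `tc_windows` and `add_months` are illustrative names, and the month arithmetic naively assumes the day-of-month exists in the target month.

```python
from datetime import date, timedelta

def add_months(d: date, n: int) -> date:
    """Naive calendar-month arithmetic (assumes day-of-month exists in target month)."""
    y, m = divmod(d.month - 1 + n, 12)
    return d.replace(year=d.year + y, month=m + 1)

def tc_windows(action_type: str, start_date: date, end_date: date):
    """Yield (prev_start, prev_end, recent_start, recent_end) tuples, stepping by
    timeDelta, stopping once the Recent Period start date passes end_date."""
    days = {"DOD": 1, "WOW": 7}.get(action_type)  # MOM steps by one calendar month
    recent_start = start_date
    while recent_start <= end_date:
        if days is not None:
            recent_end = recent_start + timedelta(days=days)
            prev_start = recent_start - timedelta(days=days)
        else:
            recent_end = add_months(recent_start, 1)
            prev_start = add_months(recent_start, -1)
        # Start dates are inclusive and end dates exclusive, so prev_end == recent_start
        yield (prev_start, recent_start, recent_start, recent_end)
        recent_start = recent_end
```

Running this with the DOD and MOM inputs shown in the examples below reproduces the same period tables.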
Examples:
`{'ANALYSIS_ID': '174392', 'API_KEY': config.AI_API_KEY, 'ACTION_TYPE': 'DOD', 'START_DATE': '2018-12-04', 'END_DATE': '2018-12-06'}`
This configuration will execute the analysis to do a daily comparison as follows:
| | Run 1 | Run 2 | Run 3 |
| --- | --- | --- | --- |
| Recent Period Start | 2018-12-04 | 2018-12-05 | 2018-12-06 |
| Recent Period End | 2018-12-05 | 2018-12-06 | 2018-12-07 |
| Previous Period Start | 2018-12-03 | 2018-12-04 | 2018-12-05 |
| Previous Period End | 2018-12-04 | 2018-12-05 | 2018-12-06 |
`{'ANALYSIS_ID': '174392', 'API_KEY': config.AI_API_KEY, 'ACTION_TYPE': 'WOW', 'START_DATE': '2018-12-04', 'END_DATE': '2019-01-06'}`
This configuration will execute the analysis to do a weekly comparison as follows:
| | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 |
| --- | --- | --- | --- | --- | --- |
| Recent Period Start | 2018-12-04 | 2018-12-11 | 2018-12-18 | 2018-12-25 | 2019-01-01 |
| Recent Period End | 2018-12-11 | 2018-12-18 | 2018-12-25 | 2019-01-01 | 2019-01-08 |
| Previous Period Start | 2018-11-27 | 2018-12-04 | 2018-12-11 | 2018-12-18 | 2018-12-25 |
| Previous Period End | 2018-12-04 | 2018-12-11 | 2018-12-18 | 2018-12-25 | 2019-01-01 |
`{'ANALYSIS_ID': '174392', 'API_KEY': config.AI_API_KEY, 'ACTION_TYPE': 'MOM', 'START_DATE': '2018-12-04', 'END_DATE': '2019-01-06'}`
This configuration will execute the analysis to do a monthly comparison as follows:
| | Run 1 | Run 2 |
| --- | --- | --- |
| Recent Period Start | 2018-12-04 | 2019-01-04 |
| Recent Period End | 2019-01-04 | 2019-02-04 |
| Previous Period Start | 2018-11-04 | 2018-12-04 |
| Previous Period End | 2018-12-04 | 2019-01-04 |
execute_load
This function performs the work of executing an analysis, retrieving and formatting the results, and sending them to the database helper to be stored in the warehouse
Inputs to this method are all required:
- API_KEY - String: A valid Sisu API Key for the customer organization where the analysis to be operated resides
- ANALYSIS_ID - Integer: A valid Trend Detection, General Performance, Group Comparison, or Time Comparison, analysis ID
- EXECUTE_ANALYSIS - Boolean: True - execute the analysis before retrieving the analysis results; False - do not execute the analysis before retrieving the analysis results (i.e. use the results provided by Sisu’s previous execution of the analysis)
- RETURN_RESULTS - Boolean: True - return the results to the caller as a formatted JSON string or object; False - insert the data directly into the database
This function will then perform the following actions:
- Get a database connection by calling `config.getDatabaseConnection()`, which will return an object implementing the methods required of a `SisuOutputDatabaseConnection` (see above)
- Truncate the database tables if requested in `config.py`
- Connect to Sisu with the provided API_KEY, get the Analysis metadata (including some Metric information), and format it. If requested with RETURN_RESULTS = False, delete the existing metadata for this Analysis and write updated metadata for this Analysis to the data warehouse
- Execute the Analysis, if requested with EXECUTE_ANALYSIS = True, and retrieve the result summary and segment detail. Find any existing runs with the same group names or date ranges, and delete them from the summary and detail tables
- Format the summary information. Calculate some derived columns for the analysis summary table. Insert the summary information into the data warehouse, if requested with RETURN_RESULTS = False
- Retrieve the analysis waterfall, format it, and insert it into the data warehouse if requested with RETURN_RESULTS = False
- Format the segment detail information. Calculate some derived columns for the segment detail table, and insert the segment details into the data warehouse if requested with RETURN_RESULTS = False
- Return the results for formatting into a JSON or object, if requested with RETURN_RESULTS = True
This script defines a class called `SisuOutputDatabaseConnection` that performs database-related functions to write Sisu analysis and segment information into the data warehouse, or to return a formatted object / JSON string that the caller can understand. We provide example helpers for Snowflake and Fivetran. Customers using other data warehouses will need to create their own implementation of this class that implements all of the same functions available in this object.
Class variables are declared at the top of the class to define the SQL statements used to execute select, insert, delete, and truncate operations for the specific data warehouse, and the internal data structures used to store result data. These variables are a convenience for the developer and are not explicitly required, as they are not used by callers of the class.
Class functions are declared inside the class declaration to perform the various operations required by the `execute_load()` function detailed above. Unlike class variables, class functions are required and must be implemented for any new data warehouse type a customer desires to use:
- init(): This is a standard class initialization function. Using the configuration in `config.py`, it performs warehouse-specific tasks to establish a connection and create a reusable cursor for database operations that span the life of the execution. At the end of this function, a connection and cursor object, and any internal data structures required, should be fully initialized and stored in the class variables
- truncateTables(): This function should perform the database operations required to remove all data from all of the database tables defined for the solution. Think of this as a reset button for the data tables. At the end of this function, each of the tables will be empty
- deleteAnalysisMetadata(ANALYSIS_ID Integer): This function should delete the analysis with the ANALYSIS_ID provided as input from the ANALYSIS table
- writeAnalysisMetadata(vals List): This function should insert a record into the ANALYSIS table. The input list vals will contain, in order, the column values to be inserted into the table to form a complete record, even if a column value is NULL
- deleteAnalysisResultSummary(ANALYSIS_ID Integer, ANALYSIS_RESULT_ID Integer): This function should delete any records from the ANALYSIS_RESULT_SUMMARY table with the given ANALYSIS_ID and ANALYSIS_RESULT_ID combination. In practice, each call to this method should result in the deletion of one record from the table
- writeAnalysisResultSummary(vals List): This function should insert a record into the ANALYSIS_RESULT_SUMMARY table. The input list vals will contain, in order, the column values to be inserted into the table to form a complete record, even if a column value is NULL
- deleteAnalysisResultDetail(ANALYSIS_ID Integer, ANALYSIS_RESULT_ID Integer): This function should delete any records from the ANALYSIS_RESULT_DETAIL table with the given ANALYSIS_ID and ANALYSIS_RESULT_ID combination. In practice, each call to this method should result in the deletion of many records from the table, each corresponding to a segment from the given execution of a specific analysis
- writeAnalysisResultDetail(df Dataframe): This function should write the pandas dataframe containing the segments for a given analysis execution into the ANALYSIS_RESULT_DETAIL table. df will contain many rows of data that may be inserted in bulk, or as individual records, depending on the capabilities of the data warehouse. Each row of df will contain a complete record, even if some of the column values are NULL
- deleteAnalysisResultWaterfall(ANALYSIS_ID Integer, ANALYSIS_RESULT_ID Integer): This function should delete any records from the ANALYSIS_RESULT_WATERFALL table with the given ANALYSIS_ID and ANALYSIS_RESULT_ID combination. In practice, each call to this method should result in the deletion of many records from the table, each corresponding to a step in the waterfall chart from the given execution of a specific analysis
- writeAnalysisResultWaterfall(df Dataframe): This function should write the pandas dataframe containing the steps for a given analysis waterfall into the ANALYSIS_RESULT_WATERFALL table. df will contain many rows of data that may be inserted in bulk, or as individual records, depending on the capabilities of the data warehouse. Each row of df will contain a complete record, even if some of the column values are NULL
- writeTrendResultSummary(vals List): This function should delete any existing data with the same ANALYSIS_ID, and ANALYSIS_RESULT_ID, then insert a record into the TREND_RESULT_SUMMARY table. The input list vals will contain, in order, the column values to be inserted into the table to form a complete record, even if a column value is NULL. The first and second fields in vals correspond to the ANALYSIS_ID and ANALYSIS_RESULT_ID respectively, to facilitate deletion of existing records
- writeTrendResultDetail(df Dataframe, vals List): This function should delete any existing data with the ANALYSIS_ID and ANALYSIS_RESULT_ID found as the first and second entry in the input list vals, and then insert the trend details in the pandas dataframe df into the TREND_RESULT_DETAIL table. Each row of df will contain a complete record, even if some of the column values are NULL
- getTCAnalysisResultIDs(ANALYSIS_ID Integer, RECENT_PERIOD_START String, RECENT_PERIOD_END String, PREVIOUS_PERIOD_START String, PREVIOUS_PERIOD_END String): This function will return a list of ANALYSIS_RESULT_IDs that correspond to an analysis with the given ID and date ranges. This is to facilitate deletion of previous executions of a time comparison analysis with those date ranges to avoid having duplicate data in the database
- getGCAnalysisResultIDs(ANALYSIS_ID Integer, GROUP_A_NAME String, GROUP_B_NAME String): This function will return a list of ANALYSIS_RESULT_IDs that correspond to an analysis with the given ID and group names. This is to facilitate deletion of previous executions of a group comparison analysis with those group names to avoid having duplicate data in the database
- getGPAnalysisResultIDs(ANALYSIS_ID Integer): This function will return a list of ANALYSIS_RESULT_IDs that correspond to an analysis with the given ID. This is to facilitate deletion of previous executions of a general performance analysis to avoid having duplicate data in the database
- formatResults(event Dict): This function will return a formatted JSON string or other returnable object that delivers results back to the calling function / system instead of inserting it into the database. This method will only be called if RETURN_RESULTS = True is specified by the caller. This method may not be applicable to all database handlers
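A custom handler for another warehouse might start from a skeleton like the one below. The method names and arguments come from the interface described above; the bodies are deliberately unimplemented and left to the implementer.

```python
class SisuOutputDatabaseConnection:
    """Skeleton database handler for a hypothetical warehouse; implement every method."""

    def __init__(self):
        # Establish the warehouse connection and a reusable cursor from config.py values
        self.connection = None
        self.cursor = None

    def truncateTables(self): raise NotImplementedError
    def deleteAnalysisMetadata(self, analysis_id): raise NotImplementedError
    def writeAnalysisMetadata(self, vals): raise NotImplementedError
    def deleteAnalysisResultSummary(self, analysis_id, analysis_result_id): raise NotImplementedError
    def writeAnalysisResultSummary(self, vals): raise NotImplementedError
    def deleteAnalysisResultDetail(self, analysis_id, analysis_result_id): raise NotImplementedError
    def writeAnalysisResultDetail(self, df): raise NotImplementedError
    def deleteAnalysisResultWaterfall(self, analysis_id, analysis_result_id): raise NotImplementedError
    def writeAnalysisResultWaterfall(self, df): raise NotImplementedError
    def writeTrendResultSummary(self, vals): raise NotImplementedError
    def writeTrendResultDetail(self, df, vals): raise NotImplementedError
    def getTCAnalysisResultIDs(self, analysis_id, recent_period_start, recent_period_end,
                               previous_period_start, previous_period_end): raise NotImplementedError
    def getGCAnalysisResultIDs(self, analysis_id, group_a_name, group_b_name): raise NotImplementedError
    def getGPAnalysisResultIDs(self, analysis_id): raise NotImplementedError
    def formatResults(self, event): raise NotImplementedError
```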
We provide a script called `lambda_function.py` that can immediately be used with an AWS Lambda Function to support workflow orchestration as outlined above in Step 3. This script may also be useful for customers wishing to integrate scheduled execution / load of Sisu segments into other workflows and tools; it will work with AWS Lambda Functions directly, but also with any other caller that makes execution calls with the same input event format.
This script defines a single function, `lambda_handler(event Dictionary, context Dictionary)`. The input context is not used by the function. The input event is used to pass execution requests to the `orchestrator.py` script for either a one-off execution of an analysis or a historical execution.
We provide a script called `fivetran_gcf_main.py` that can immediately be used with a Google Cloud Function to support workflow orchestration as outlined above in Step 3. This script will work with Google Cloud Functions called through Fivetran Connectors directly, but also with any other caller that makes execution calls with the same input event format contained in a secrets section of a JSON GET message.
This script defines a single function, `main(req Request)`. The input req is used to pass a formatted JSON request message that is parsed into a dictionary. The secrets dictionary is extracted from the request and sent to the `orchestrator.py` script for either a one-off execution of an analysis or a historical execution.
This script defines a single function, `main(event Dictionary)`, whose event input contains the parameters needed to operate the workflow. The following required and optional parameters may be specified:
- API_KEY - String - Required: A valid Sisu API Key for the customer organization where the analysis to be operated resides
- ANALYSIS_ID - Integer - Required: A valid Trend Detection, General Performance, Group Comparison, or Time Comparison, analysis ID. NOTE: If ACTION_TYPE (below) is specified, the ANALYSIS_ID parameter MUST correspond to a Time Comparison analysis
- ACTION_TYPE - String - Optional: Specifying an ACTION_TYPE will kick off a historical load as outlined above. Must be one of:
- DOD: Day-over-Day execution
- WOW: Week-over-Week execution
- MOM: Month-over-Month execution
- START_DATE - String - Required if ACTION_TYPE is specified, ignored otherwise: The START_DATE must be either a String in the format 'YYYY-MM-DD', or one of the following:
- YESTERDAY: Sets the START_DATE to the current date - 1 day
- TODAY: Sets the START_DATE to the current date
- LAST_WEEK: Sets the START_DATE to the current date - 7 days
- THIS_WEEK: Sets the START_DATE to the current date
- LAST_MONTH: Sets the START_DATE to the current date - 1 month
- END_DATE - String - Optional if ACTION_TYPE is specified, ignored otherwise: The END_DATE, if specified, must be a String in the format 'YYYY-MM-DD'. If no END_DATE is specified, it is set to the same value as the decoded START_DATE
- RETURN_RESULTS - Boolean - Optional: Default is False. Pass True if you wish the scripts to return a formatted response object instead of directly inserting results into the database (e.g. to return a JSON response object to a web service call). A value of True will result in a call to formatResults on the database handler
- EXECUTE_ANALYSIS - Boolean - Optional: Default is True. Pass False if you wish the scripts to not execute the analysis before retrieving results (e.g. to use the results as they exist in Sisu at the time of script execution). True will result in the script executing the analysis through a Sisu API call before retrieving results
If an API_KEY and ANALYSIS_ID are provided, a single call is made to load_analysis_results.execute_load()
If an ACTION_TYPE is specified, the START_DATE parameter is decoded, the END_DATE parameter is decoded, and a single call is made to load_analysis_results.process_tc_action()
If the calls above complete successfully and RETURN_RESULTS = False, a response dictionary is returned with a statusCode of 200, and a JSON body indicating successful execution. If the calls above complete successfully and RETURN_RESULTS = True, formatted results are returned by the database handler, and it is up to the implementing class to return proper results that the caller will accept. Otherwise, exceptions raised by the script are thrown to the caller to indicate failure
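The decoding of the START_DATE tokens listed above might be sketched as follows. This is a hypothetical helper, not the real `orchestrator.py` code; in particular, the calendar-month arithmetic used for LAST_MONTH is an assumption.

```python
from datetime import date, timedelta

def decode_start_date(token, today=None):
    """Turn a START_DATE token (or 'YYYY-MM-DD' literal) into a date.

    Hypothetical sketch; the real orchestrator.py may differ, notably in
    how LAST_MONTH handles month boundaries.
    """
    today = today or date.today()
    if token == "YESTERDAY":
        return today - timedelta(days=1)
    if token in ("TODAY", "THIS_WEEK"):
        return today
    if token == "LAST_WEEK":
        return today - timedelta(days=7)
    if token == "LAST_MONTH":
        y, m = divmod(today.month - 2, 12)  # one calendar month back
        return today.replace(year=today.year + y, month=m + 1)
    return date.fromisoformat(token)  # otherwise assume a literal 'YYYY-MM-DD'
```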