This repository contains the code examples for consuming Sisu's insights in BI tools. For more information, or to get started, contact your Sisu Customer Success team.
Below is an overview of the tables and columns, along with their semantic meaning:
NOTE: It is intended that SET1 refers to _Group A_ or the _Recent Period_ value, and SET2 refers to _Group B_ or the _Previous Period_ value. For General Performance analyses, SET2 will always be NULL.
ANALYSIS - Holds the metadata for this analysis
| Column Name | Type | Semantic Meaning |
| --- | --- | --- |
ANALYSIS_ID | BIGINT, PRIMARY KEY | The static ID for a given analysis |
ANALYSIS_NAME | TEXT | The name of the analysis in Sisu |
ANALYSIS_TYPE | TEXT | The type of the analysis |
APPLICATION_URL | TEXT | A direct link to the analysis in Sisu |
CREATED_AT | TIMESTAMP | Timestamp when this analysis was created |
METRIC_ID | BIGINT | The static ID of the metric analyzed |
METRIC_NAME | TEXT | The name of the metric analyzed |
METRIC_DESIRED_DIRECTION | TEXT | The desired direction of the metric analyzed |
PROJECT_ID | BIGINT | The static ID of the project where this analysis is located in Sisu |
PROJECT_NAME | TEXT | The name of the Sisu project |
LOAD_TS | TIMESTAMP | The timestamp when this record was loaded into the data warehouse |
ANALYSIS_RESULT_SUMMARY - Holds the summary ribbon information for this analysis
| Column Name | Type | Semantic Meaning |
| --- | --- | --- |
ANALYSIS_ID | BIGINT, PRIMARY KEY | The static ID for a given analysis |
ANALYSIS_RESULT_ID | BIGINT, PRIMARY KEY | The static ID for the results associated with a specific execution of a specific analysis |
REQUESTED_AT | TIMESTAMP | Timestamp when this analysis execution began |
COMPLETED_AT | TIMESTAMP | Timestamp when this analysis execution completed |
RUN_STATUS | TEXT | Status of the analysis execution |
RUN_TYPE | TEXT | Type of the analysis execution |
TIMEFRAME | TEXT | Currently empty |
PREVIOUS_PERIOD_START | TIMESTAMP | Start date for the Previous Period of a Time Comparison analysis |
PREVIOUS_PERIOD_END | TIMESTAMP | End date for the Previous Period of a Time Comparison analysis |
RECENT_PERIOD_START | TIMESTAMP | Start date for the Recent Period of a Time Comparison analysis |
RECENT_PERIOD_END | TIMESTAMP | End date for the Recent Period of a Time Comparison analysis |
GROUP_A_NAME | TEXT | The name of Group A for a Group Comparison analysis |
GROUP_B_NAME | TEXT | The name of Group B for a Group Comparison analysis |
METRIC_TYPE_LABEL | TEXT | The type of metric analyzed |
PERCENT_CHANGE | FLOAT | The Percentage Change in metric value between periods or groups |
DIRECTION | TEXT | The direction of change of the metric value between periods or groups |
SET1_CARD_LABEL | TEXT | Summary card label for set 1 |
SET1_CATEGORY_FILTER | TEXT | Any filters applied to set 1 data |
SET1_AVERAGE | FLOAT | The average value for set 1 |
SET1_MIN | FLOAT | The minimum value for set 1 |
SET1_MAX | FLOAT | The maximum value for set 1 |
SET1_MEDIAN | FLOAT | The median value for set 1 |
SET1_SUM | FLOAT | The sum value for set 1 |
SET1_SUMMARY_VALUE | FLOAT | The metric value for set 1 |
SET1_TOTAL_SIZE | FLOAT | The population size for set 1 |
SET1_TOTAL_NUMERATOR | FLOAT | For rate metrics, the total numerator value for set 1 |
SET1_TOTAL_DENOMINATOR | FLOAT | For rate metrics, the total denominator value for set 1 |
SET1_MATCH_SIZE | FLOAT | The match size for set 1 |
SET1_WEIGHT | FLOAT | The weight value for set 1 |
SET2_CARD_LABEL | TEXT | Summary card label for set 2 |
SET2_CATEGORY_FILTER | TEXT | Any filters applied to set 2 data |
SET2_AVERAGE | FLOAT | The average value for set 2 |
SET2_MIN | FLOAT | The minimum value for set 2 |
SET2_MAX | FLOAT | The maximum value for set 2 |
SET2_MEDIAN | FLOAT | The median value for set 2 |
SET2_SUM | FLOAT | The sum value for set 2 |
SET2_SUMMARY_VALUE | FLOAT | The metric value for set 2 |
SET2_TOTAL_SIZE | FLOAT | The population size for set 2 |
SET2_TOTAL_NUMERATOR | FLOAT | For rate metrics, the total numerator value for set 2 |
SET2_TOTAL_DENOMINATOR | FLOAT | For rate metrics, the total denominator value for set 2 |
SET2_MATCH_SIZE | FLOAT | The match size for set 2 |
SET2_WEIGHT | FLOAT | The weight value for set 2 |
LOAD_TS | TIMESTAMP | The timestamp when this record was loaded into the data warehouse |
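As an illustration of consuming these tables from a BI or SQL client, the sketch below joins ANALYSIS to ANALYSIS_RESULT_SUMMARY to fetch the latest run's percent change for an analysis. The table contents and the SQLite in-memory database are invented for the example; only the column names come from the schema above.

```python
import sqlite3

# Stand-in warehouse: an in-memory SQLite database with a subset of the columns
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ANALYSIS (ANALYSIS_ID INTEGER, ANALYSIS_NAME TEXT, METRIC_NAME TEXT);
CREATE TABLE ANALYSIS_RESULT_SUMMARY (
    ANALYSIS_ID INTEGER, ANALYSIS_RESULT_ID INTEGER,
    COMPLETED_AT TEXT, PERCENT_CHANGE REAL);
INSERT INTO ANALYSIS VALUES (1, 'Revenue WoW', 'revenue');
INSERT INTO ANALYSIS_RESULT_SUMMARY VALUES
    (1, 10, '2019-01-01', 0.05),
    (1, 11, '2019-01-08', 0.12);
""")

# Latest completed run's percent change for each analysis
row = conn.execute("""
    SELECT a.ANALYSIS_NAME, s.PERCENT_CHANGE
    FROM ANALYSIS a
    JOIN ANALYSIS_RESULT_SUMMARY s ON s.ANALYSIS_ID = a.ANALYSIS_ID
    ORDER BY s.COMPLETED_AT DESC
    LIMIT 1
""").fetchone()
```

The same join works unchanged against the real warehouse tables, since the column names match the schema documented here.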
ANALYSIS_RESULT_WATERFALL - Holds the waterfall steps for a given analysis, if applicable (NOTE: General Performance analyses do not have waterfall charts)
| Column Name | Type | Semantic Meaning |
| --- | --- | --- |
ANALYSIS_ID | BIGINT, PRIMARY KEY | The static ID for a given analysis |
ANALYSIS_RESULT_ID | BIGINT, PRIMARY KEY | The static ID for the results associated with a specific execution of a specific analysis |
STEP_ID | INTEGER, PRIMARY KEY | The step ID for this step of the waterfall chart |
STEP_TYPE | TEXT | The type of this step of the waterfall chart |
FACTOR_0_DIMENSION | TEXT | Dimension name for the first factor of this step |
FACTOR_0_VALUE | TEXT | Dimension value for the first factor of this step |
FACTOR_1_DIMENSION | TEXT | Dimension name for the second factor of this step |
FACTOR_1_VALUE | TEXT | Dimension value for the second factor of this step |
FACTOR_2_DIMENSION | TEXT | Dimension name for the third factor of this step |
FACTOR_2_VALUE | TEXT | Dimension value for the third factor of this step |
STEP_IMPACT | FLOAT | The net impact of this step of the waterfall |
CUMULATIVE_IMPACT_BEFORE_STEP | FLOAT | The cumulative impact before this step of the waterfall |
CUMULATIVE_IMPACT_AFTER_STEP | FLOAT | The cumulative impact after this step of the waterfall |
OVERLAPPING_IMPACT | FLOAT | The amount of any overlapping impact |
CHANGE_IN_SIZE_SET1 | FLOAT | The change in value for this metric for set 1; the type of value is given in CHANGE_IN_TYPE_SET1 below |
CHANGE_IN_SIZE_SET2 | FLOAT | The change in value for this metric for set 2; the type of value is given in CHANGE_IN_TYPE_SET2 below |
CHANGE_IN_TYPE_SET1 | FLOAT | The type of the changed value in CHANGE_IN_SIZE_SET1 above, for set 1 |
CHANGE_IN_TYPE_SET2 | FLOAT | The type of the changed value in CHANGE_IN_SIZE_SET2 above, for set 2 |
LOAD_TS | TIMESTAMP | The timestamp when this record was loaded into the data warehouse |
ANALYSIS_RESULT_DETAIL - Holds the segment table information for a given analysis
| Column Name | Type | Semantic Meaning |
| --- | --- | --- |
ANALYSIS_ID | BIGINT, PRIMARY KEY | The static ID for a given analysis |
ANALYSIS_RESULT_ID | BIGINT, PRIMARY KEY | The static ID for the results associated with a specific execution of a specific analysis |
SUBGROUP_ID | BIGINT, PRIMARY KEY | The static ID for this subgroup |
CONFIDENCE | TEXT | The Sisu calculated confidence bucket for this segment |
FACTOR_0_DIMENSION | TEXT | Dimension name for the first factor of this segment |
FACTOR_0_VALUE | TEXT | Dimension value for the first factor of this segment |
FACTOR_1_DIMENSION | TEXT | Dimension name for the second factor of this segment |
FACTOR_1_VALUE | TEXT | Dimension value for the second factor of this segment |
FACTOR_2_DIMENSION | TEXT | Dimension name for the third factor of this segment |
FACTOR_2_VALUE | TEXT | Dimension value for the third factor of this segment |
FACTOR_0_DIMENSION_FRIENDLY | TEXT | Dimension name for the first factor of this segment, formatted for easier readability |
FACTOR_1_DIMENSION_FRIENDLY | TEXT | Dimension name for the second factor of this segment, formatted for easier readability |
FACTOR_2_DIMENSION_FRIENDLY | TEXT | Dimension name for the third factor of this segment, formatted for easier readability |
FACTOR_0_VALUE_FRIENDLY | TEXT | Dimension value for the first factor of this segment, formatted for easier readability |
FACTOR_1_VALUE_FRIENDLY | TEXT | Dimension value for the second factor of this segment, formatted for easier readability |
FACTOR_2_VALUE_FRIENDLY | TEXT | Dimension value for the third factor of this segment, formatted for easier readability |
FACTOR_0_TEXT | TEXT | The combined, formatted, dimension name and value for the first factor of this segment |
FACTOR_1_TEXT | TEXT | The combined, formatted, dimension name and value for the second factor of this segment |
FACTOR_2_TEXT | TEXT | The combined, formatted, dimension name and value for the third factor of this segment |
SEGMENT_TEXT | TEXT | The combined, formatted, dimension names and values for this segment |
IMPACT | FLOAT | The Sisu calculated impact for this segment |
IMPACT_MAGNITUDE | FLOAT | The absolute value of the Sisu calculated impact for this segment |
IMPACT_RANK | TEXT | Currently empty |
SET1_SIZE | FLOAT | The Sisu calculated population size for this segment for set 1 |
SET2_SIZE | FLOAT | The Sisu calculated population size for this segment for set 2 |
SET1_VALUE | FLOAT | The Sisu calculated metric value for this segment for set 1 |
SET2_VALUE | FLOAT | The Sisu calculated metric value for this segment for set 2 |
PERCENT_CHANGE | FLOAT | The percentage change in metric value from set 1 to set 2 |
DIRECTION | TEXT | The formatted direction of change in metric value for this segment |
DIRECTION_TEXT | TEXT | The textual representation of the direction of change in metric value for this segment |
ORIENTATION_MATCHES_METRIC | TEXT | True if the direction of change for this segment matches the desired direction of change for this metric. Otherwise, False |
SEGMENT_ORDER | SMALLINT | Number of factors involved in this segment (1, 2, or 3) |
SEGMENT_ORDER_TEXT | TEXT | A textual representation of whether this segment is a first-, second-, or third-order segment |
INSIGHT_TEXT | TEXT | A textual representation of this segment, its impact, direction of change, etc. |
CHANGE_IN_SIZE | FLOAT | Currently empty |
MIX_EFFECT | FLOAT | Currently empty |
NET_EFFECT | FLOAT | Currently empty |
NET_RELATIVE_EFFECT | FLOAT | Currently empty |
PERCENT_CHANGE_IN_SIZE | FLOAT | Currently empty |
RATE_EFFECT | FLOAT | Currently empty |
RELATIVE_PERCENT_CHANGE | FLOAT | Currently empty |
RATE_CHANGE | FLOAT | Currently empty |
RELATIVE_MIX_EFFECT | FLOAT | Currently empty |
SEGMENT_NAME | TEXT | Currently empty |
SEGMENT_RANK | TEXT | Currently empty |
SEGMENT_HASH | TEXT | Currently empty |
UNWEIGHTED_CHANGE_IN_AVERAGE | FLOAT | Currently empty |
WEIGHT | FLOAT | Currently empty |
WEIGHTED_CHANGE_IN_SUM | FLOAT | Currently empty |
UNITS | TEXT | Currently empty |
LOAD_TS | TIMESTAMP | The timestamp when this record was loaded into the data warehouse |
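Several of the columns above are derived from others (IMPACT_MAGNITUDE from IMPACT, SEGMENT_ORDER and SEGMENT_TEXT from the factor columns). A sketch of those derivations in pandas, using a made-up one-row segment, might look like this; the ` AND ` separator used to build SEGMENT_TEXT is an assumption, not necessarily what the real scripts emit.

```python
import pandas as pd

# One invented segment; only the column names come from the schema above
df = pd.DataFrame({
    "FACTOR_0_TEXT": ["country = US"],
    "FACTOR_1_TEXT": ["plan = pro"],
    "FACTOR_2_TEXT": [None],
    "IMPACT": [-42.5],
})

factor_cols = ["FACTOR_0_TEXT", "FACTOR_1_TEXT", "FACTOR_2_TEXT"]

# IMPACT_MAGNITUDE is the absolute value of the Sisu calculated impact
df["IMPACT_MAGNITUDE"] = df["IMPACT"].abs()
# SEGMENT_ORDER is the number of factors involved (1, 2, or 3)
df["SEGMENT_ORDER"] = df[factor_cols].notna().sum(axis=1)
# SEGMENT_TEXT combines the formatted factor texts; the separator is an assumption
df["SEGMENT_TEXT"] = df[factor_cols].apply(
    lambda row: " AND ".join(v for v in row if v), axis=1)
```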
TREND_RESULT_SUMMARY - Holds summary card information for a trend detection
| Column Name | Type | Semantic Meaning |
| --- | --- | --- |
ANALYSIS_ID | BIGINT, PRIMARY KEY | The static ID for a given analysis |
ANALYSIS_RESULT_ID | BIGINT, PRIMARY KEY | The static ID for the results associated with a specific execution of a specific analysis |
REQUESTED_AT | TIMESTAMP | Timestamp when this analysis execution began |
COMPLETED_AT | TIMESTAMP | Timestamp when this analysis execution completed |
RUN_STATUS | TEXT | Status of the analysis execution |
RUN_TYPE | TEXT | Type of the analysis execution |
CURRENT_PERIOD_CARD_LABEL | TEXT | Summary card label for the current trend |
CURRENT_PERIOD_DENOMINATOR_LABEL | TEXT | Summary card denominator label for the current trend |
CURRENT_PERIOD_PERCENT_CHANGE | FLOAT | Percent change for the current trend |
CURRENT_PERIOD_SLOPE | FLOAT | Trend line slope for the current trend |
PREVIOUS_PERIOD_CARD_LABEL | TEXT | Summary card label for the previous trend |
PREVIOUS_PERIOD_DENOMINATOR_LABEL | TEXT | Summary card denominator label for the previous trend |
PREVIOUS_PERIOD_PERCENT_CHANGE | FLOAT | Percent change for the previous trend |
PREVIOUS_PERIOD_SLOPE | FLOAT | Trend line slope for the previous trend |
LOAD_TS | TIMESTAMP | The timestamp when this record was loaded into the data warehouse |
TREND_RESULT_DETAIL - Holds the segment details for a trend detection
| Column Name | Type | Semantic Meaning |
| --- | --- | --- |
ANALYSIS_ID | BIGINT, PRIMARY KEY | The static ID for a given analysis |
ANALYSIS_RESULT_ID | BIGINT, PRIMARY KEY | The static ID for the results associated with a specific execution of a specific analysis |
SUBGROUP_ID | BIGINT, PRIMARY KEY | The static ID for this subgroup |
FACTOR_0_DIMENSION | TEXT | Dimension name for the first factor of this segment |
FACTOR_0_VALUE | TEXT | Dimension value for the first factor of this segment |
FACTOR_1_DIMENSION | TEXT | Dimension name for the second factor of this segment |
FACTOR_1_VALUE | TEXT | Dimension value for the second factor of this segment |
FACTOR_2_DIMENSION | TEXT | Dimension name for the third factor of this segment |
FACTOR_2_VALUE | TEXT | Dimension value for the third factor of this segment |
IMPACT | FLOAT | The Sisu calculated impact for this segment |
START_DATE | TIMESTAMP | The Start Date of the trend for this segment |
END_DATE | TIMESTAMP | The End Date of the trend for this segment |
INTERCEPT | FLOAT | The intercept of the trend line for this segment |
SLOPE | FLOAT | The slope of the trend line for this segment |
SIZE | FLOAT | The population size of this segment |
LOAD_TS | TIMESTAMP | The timestamp when this record was loaded into the data warehouse |
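Given a segment's INTERCEPT and SLOPE, a trend line can be evaluated at a point in time. The helper below is a hypothetical illustration only: it assumes SLOPE is per day and INTERCEPT is the value at START_DATE, neither of which is stated in the schema above, so confirm the actual units with your Customer Success team before relying on this.

```python
from datetime import date

def trend_value(intercept, slope, start, when):
    """Evaluate a segment's trend line at a given date.

    Assumes SLOPE is per-day and INTERCEPT is the value at START_DATE
    (both assumptions; the real column semantics may differ).
    """
    return intercept + slope * (when - start).days
```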
The following scripts comprise the working solution:
Configuration of the project is completed in `config.py`. Each of the project scripts uses different sections of this configuration file:
- REQUIRED IMPORTS: In this section we import additional libraries or classes that are required for operation of the project. Note that we import a custom class, `SisuOutputDatabaseConnection`. Customers writing segments to Snowflake may use the existing class; customers writing segments to other databases will need to write a database handler (see Extensions). We also import `io` to allow input parameters such as passwords and API keys to be hosted in a profile file for security purposes
- RETURN A DATABASE CONNECTION: In this section, we define a method called `getDatabaseConnection()` that returns a new `SisuOutputDatabaseConnection` object implementing the required handler methods (see Extensions). Customers using Snowflake as their data warehouse require no changes
- DATABASE CONNECTION PARAMETERS: This section defines the connection parameters used by the `SisuOutputDatabaseConnection`. Customers will need to update the parameter values in this section for their data warehouse. Customers using Snowflake will require new values for each of the existing parameters; other data warehouse types will require different parameters (see Extensions)
- SEGMENT LOAD PARAMETERS: This section defines parameters used by `load_analysis_results.py` to control the record insert batch size, the string delimiter used for segment load, whether database tables should be truncated at the start of execution, and whether the script should operate in debug mode (verbose logging to the console). Customers should not need to modify this section unless directed by their Customer Success Manager or Solution Architect. NOTE: setting the truncate option to true will truncate all of the results tables at the start of execution. This cannot be undone; use with extreme caution
- USER PARAMETERS: Customers may put additional parameters for their organization in this section. In Sisu's default configuration, we show how one may retrieve Sisu API keys stored in the executing user's `bash` or `zsh` profile
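A sketch of what those `config.py` sections might contain follows. Only `getDatabaseConnection`, `SisuOutputDatabaseConnection`, and `AI_API_KEY` appear in this documentation; every other name and value is illustrative, and the environment-variable lookup stands in for the shell-profile approach described above.

```python
import os

# --- DATABASE CONNECTION PARAMETERS (placeholder names and values) ---
WAREHOUSE_ACCOUNT = "my_account"
WAREHOUSE_USER = "my_user"

# --- SEGMENT LOAD PARAMETERS (hypothetical names) ---
BATCH_SIZE = 1000        # record insert batch size
STRING_DELIMITER = "|"   # string delimiter used for segment load
TRUNCATE_TABLES = False  # True truncates every results table at startup; irreversible
DEBUG = False            # verbose logging to the console

# --- USER PARAMETERS ---
# Stand-in for reading the key from the executing user's shell profile
AI_API_KEY = os.environ.get("SISU_API_KEY", "")

# --- RETURN A DATABASE CONNECTION ---
def getDatabaseConnection():
    # Would return a SisuOutputDatabaseConnection built from the values above
    raise NotImplementedError
```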
`load_analysis_results.py` is the main project script. It performs the actions of:
- Updating analyses with new time period configurations
- Iterating over time periods for historical load of segments
- Executing an analysis
- Retrieving and formatting Sisu segments
- Passing segments to the database helper to be written to the warehouse
It is split into two main functions:
process_tc_action
This method takes inputs to allow historical loads of different types for a Time Comparison analysis. For example, we may wish to execute a Time Comparison analysis for each of the previous 12 months and write the segments to the database.
Inputs to this method are all required:
- API_KEY - String: A valid Sisu API Key for the customer organization where the analysis to be operated resides
- ANALYSIS_ID - Integer: A valid Time Comparison analysis ID
- RETURN_RESULTS - Boolean: True if the results are to be returned to the calling function as a JSON blob or object; False if the results are to be inserted into the database directly
- ACTION_TYPE - String: Must be one of:
- DOD: Day-over-Day execution
- WOW: Week-over-Week execution
- MOM: Month-over-Month execution
- START_DATE - Datetime: The starting date of the first Recent Period of the historical load
- END_DATE - Datetime: The starting date of the last Recent Period of the historical load
Executing historical loads is then automated. The script will perform one of the ACTION_TYPE loads as requested by the user. Dates are calculated, the analysis specified in ANALYSIS_ID is modified to set the Recent and Previous Period dates as calculated (see below), and `execute_load()` is called with the provided API_KEY, RETURN_RESULTS, and ANALYSIS_ID. EXECUTE_ANALYSIS is set to True.
Dates are calculated and the script operates in the following manner:
- We first calculate a timeDelta, which is a number of days according to the ACTION_TYPE specified. For DOD, timeDelta is one day. For WOW, timeDelta is seven days. For MOM, timeDelta is one month
- We then calculate the initial dates based on the START_DATE provided. The Recent Period is set as START_DATE to START_DATE + timeDelta. The Previous Period is set as START_DATE - timeDelta to START_DATE
- We then operate a loop of:
  - Update the Time Comparison to set the date ranges as calculated
  - Call `execute_load()` for the modified analysis
  - Update the date ranges, adding timeDelta to each of the Previous / Recent Period Start / End dates
- Stop operating when the Recent Period Start Date is AFTER the END_DATE provided as input to the function
NOTES:
- Our calculations are based on the Recent Period (that is, the later period) start and end dates. The Previous Period is set to the comparison period BEFORE the Recent Period. One must ensure that data exists for both the Previous Period and Recent Period, or execution will fail
- The START_DATE and END_DATE input parameters specify the range of time to operate this historical load, NOT as the start and end dates of any period of the analysis
- Sisu’s time period Start Date is inclusive, but End Date is exclusive. So, the Previous Period End Date may be the same as the Recent Period Start Date without any overlap in data
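The date-stepping loop described above can be sketched as follows. This is a hypothetical helper, not the actual script; `tc_windows` and `add_months` are illustrative names, and the month arithmetic naively assumes the day-of-month exists in the target month.

```python
from datetime import date, timedelta

def add_months(d: date, n: int) -> date:
    """Naive calendar-month arithmetic (assumes day-of-month exists in target month)."""
    y, m = divmod(d.month - 1 + n, 12)
    return d.replace(year=d.year + y, month=m + 1)

def tc_windows(action_type: str, start_date: date, end_date: date):
    """Yield (prev_start, prev_end, recent_start, recent_end) tuples, stepping by
    timeDelta, stopping once the Recent Period start date passes end_date."""
    days = {"DOD": 1, "WOW": 7}.get(action_type)  # MOM steps by one calendar month
    recent_start = start_date
    while recent_start <= end_date:
        if days is not None:
            recent_end = recent_start + timedelta(days=days)
            prev_start = recent_start - timedelta(days=days)
        else:
            recent_end = add_months(recent_start, 1)
            prev_start = add_months(recent_start, -1)
        # Start dates are inclusive and end dates exclusive, so prev_end == recent_start
        yield (prev_start, recent_start, recent_start, recent_end)
        recent_start = recent_end
```

Running this with the DOD and MOM inputs shown in the examples below reproduces the same period tables.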
Examples:
`{'ANALYSIS_ID': '174392', 'API_KEY': config.AI_API_KEY, 'ACTION_TYPE': 'DOD', 'START_DATE': '2018-12-04', 'END_DATE': '2018-12-06'}`
This configuration will execute the analysis to do a daily comparison as follows:
| | Run 1 | Run 2 | Run 3 |
| --- | --- | --- | --- |
| Recent Period Start | 2018-12-04 | 2018-12-05 | 2018-12-06 |
| Recent Period End | 2018-12-05 | 2018-12-06 | 2018-12-07 |
| Previous Period Start | 2018-12-03 | 2018-12-04 | 2018-12-05 |
| Previous Period End | 2018-12-04 | 2018-12-05 | 2018-12-06 |
`{'ANALYSIS_ID': '174392', 'API_KEY': config.AI_API_KEY, 'ACTION_TYPE': 'WOW', 'START_DATE': '2018-12-04', 'END_DATE': '2019-01-06'}`
This configuration will execute the analysis to do a weekly comparison as follows:
| | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 |
| --- | --- | --- | --- | --- | --- |
| Recent Period Start | 2018-12-04 | 2018-12-11 | 2018-12-18 | 2018-12-25 | 2019-01-01 |
| Recent Period End | 2018-12-11 | 2018-12-18 | 2018-12-25 | 2019-01-01 | 2019-01-08 |
| Previous Period Start | 2018-11-27 | 2018-12-04 | 2018-12-11 | 2018-12-18 | 2018-12-25 |
| Previous Period End | 2018-12-04 | 2018-12-11 | 2018-12-18 | 2018-12-25 | 2019-01-01 |
`{'ANALYSIS_ID': '174392', 'API_KEY': config.AI_API_KEY, 'ACTION_TYPE': 'MOM', 'START_DATE': '2018-12-04', 'END_DATE': '2019-01-06'}`
This configuration will execute the analysis to do a monthly comparison as follows:
| | Run 1 | Run 2 |
| --- | --- | --- |
| Recent Period Start | 2018-12-04 | 2019-01-04 |
| Recent Period End | 2019-01-04 | 2019-02-04 |
| Previous Period Start | 2018-11-04 | 2018-12-04 |
| Previous Period End | 2018-12-04 | 2019-01-04 |
execute_load
This function performs the work of executing an analysis, retrieving and formatting the results, and sending them to the database helper to be stored in the warehouse
Inputs to this method are all required:
- API_KEY - String: A valid Sisu API Key for the customer organization where the analysis to be operated resides
- ANALYSIS_ID - Integer: A valid Trend Detection, General Performance, Group Comparison, or Time Comparison, analysis ID
- EXECUTE_ANALYSIS - Boolean: True - execute the analysis before retrieving the analysis results; False - do not execute the analysis before retrieving the analysis results (i.e. use the results provided by Sisu’s previous execution of the analysis)
- RETURN_RESULTS - Boolean: True - return the results to the caller as a formatted JSON string or object; False - insert the data directly into the database
This function will then perform the following actions:
- Get a database connection by calling `config.getDatabaseConnection()`, which will return an object implementing the methods required of a `SisuOutputDatabaseConnection` (see above)
- Truncate the database tables if requested in `config.py`
- Connect to Sisu with the provided API_KEY, get the Analysis metadata (including some Metric information), and format it. If requested with RETURN_RESULTS = False, delete the existing metadata for this Analysis and write updated metadata for this Analysis to the data warehouse
- Execute the Analysis, if requested with EXECUTE_ANALYSIS = True, and retrieve the result summary and segment detail. Find any existing runs with the same group names or date ranges, and delete them from the summary and detail tables
- Format the summary information. Calculate some derived columns for the analysis summary table. Insert the summary information into the data warehouse, if requested with RETURN_RESULTS = False
- Retrieve the analysis waterfall, format it, and insert it into the data warehouse if requested with RETURN_RESULTS = False
- Format the segment detail information. Calculate some derived columns for the segment detail table, and insert the segment details into the data warehouse if requested with RETURN_RESULTS = False
- Return the results for formatting into a JSON or object, if requested with RETURN_RESULTS = True
This script defines a class called `SisuOutputDatabaseConnection` that performs database-related functions to write Sisu analysis and segment information into the data warehouse, or to return a formatted object / JSON string that the caller can understand. We provide example helpers for Snowflake and Fivetran. Customers using other data warehouses will need to create their own implementation of this class that implements all of the same functions available in this object.
Class variables are declared at the top of the class to define the SQL statements used to execute select, insert, delete, and truncate operations for the specific data warehouse, and the internal data structures used to store result data. These variables are a convenience for the developer and are not explicitly required, as they are not used by callers of the class.
Class functions are declared inside the class declaration to perform the various operations required by the `execute_load()` function detailed above. Unlike class variables, class functions are required and must be implemented for any new data warehouse type a customer desires to use:
- init(): This is a standard class initialization function. Using the configuration in `config.py`, it performs warehouse-specific tasks to establish a connection and create a reusable cursor for database operations that span the life of the execution. At the end of this function, a connection and cursor object, and any internal data structures required, should be fully initialized and stored in the class variables
- truncateTables(): This function should perform the database operations required to remove all data from all of the database tables defined for the solution. Think of this as a reset button for the data tables. At the end of this function, each of the tables will be empty
- deleteAnalysisMetadata(ANALYSIS_ID Integer): This function should delete the analysis with the ANALYSIS_ID provided as input from the ANALYSIS table
- writeAnalysisMetadata(vals List): This function should insert a record into the ANALYSIS table. The input list vals will contain, in order, the column values to be inserted into the table to form a complete record, even if a column value is NULL
- deleteAnalysisResultSummary(ANALYSIS_ID Integer, ANALYSIS_RESULT_ID Integer): This function should delete any records from the ANALYSIS_RESULT_SUMMARY table with the given ANALYSIS_ID and ANALYSIS_RESULT_ID combination. In practice, each call to this method should result in the deletion of one record from the table
- writeAnalysisResultSummary(vals List): This function should insert a record into the ANALYSIS_RESULT_SUMMARY table. The input list vals will contain, in order, the column values to be inserted into the table to form a complete record, even if a column value is NULL
- deleteAnalysisResultDetail(ANALYSIS_ID Integer, ANALYSIS_RESULT_ID Integer): This function should delete any records from the ANALYSIS_RESULT_DETAIL table with the given ANALYSIS_ID and ANALYSIS_RESULT_ID combination. In practice, each call to this method should result in the deletion of many records from the table, each corresponding to a segment from the given execution of a specific analysis
- writeAnalysisResultDetail(df Dataframe): This function should write the pandas dataframe containing the segments for a given analysis execution into the ANALYSIS_RESULT_DETAIL table. df will contain many rows of data that may be inserted in bulk, or as individual records, depending on the capabilities of the data warehouse. Each row of df will contain a complete record, even if some of the column values are NULL
- deleteAnalysisResultWaterfall(ANALYSIS_ID Integer, ANALYSIS_RESULT_ID Integer): This function should delete any records from the ANALYSIS_RESULT_WATERFALL table with the given ANALYSIS_ID and ANALYSIS_RESULT_ID combination. In practice, each call to this method should result in the deletion of many records from the table, each corresponding to a step in the waterfall chart from the given execution of a specific analysis
- writeAnalysisResultWaterfall(df Dataframe): This function should write the pandas dataframe containing the steps for a given analysis waterfall into the ANALYSIS_RESULT_WATERFALL table. df will contain many rows of data that may be inserted in bulk, or as individual records, depending on the capabilities of the data warehouse. Each row of df will contain a complete record, even if some of the column values are NULL
- writeTrendResultSummary(vals List): This function should delete any existing data with the same ANALYSIS_ID, and ANALYSIS_RESULT_ID, then insert a record into the TREND_RESULT_SUMMARY table. The input list vals will contain, in order, the column values to be inserted into the table to form a complete record, even if a column value is NULL. The first and second fields in vals correspond to the ANALYSIS_ID and ANALYSIS_RESULT_ID respectively, to facilitate deletion of existing records
- writeTrendResultDetail(df Dataframe, vals List): This function should delete any existing data with the ANALYSIS_ID and ANALYSIS_RESULT_ID found as the first and second entry in the input list vals, and then insert the trend details in the pandas dataframe df into the TREND_RESULT_DETAIL table. Each row of df will contain a complete record, even if some of the column values are NULL
- getTCAnalysisResultIDs(ANALYSIS_ID Integer, RECENT_PERIOD_START String, RECENT_PERIOD_END String, PREVIOUS_PERIOD_START String, PREVIOUS_PERIOD_END String): This function will return a list of ANALYSIS_RESULT_IDs that correspond to an analysis with the given ID and date ranges. This is to facilitate deletion of previous executions of a time comparison analysis with those date ranges to avoid having duplicate data in the database
- getGCAnalysisResultIDs(ANALYSIS_ID Integer, GROUP_A_NAME String, GROUP_B_NAME String): This function will return a list of ANALYSIS_RESULT_IDs that correspond to an analysis with the given ID and group names. This is to facilitate deletion of previous executions of a group comparison analysis with those group names to avoid having duplicate data in the database
- getGPAnalysisResultIDs(ANALYSIS_ID Integer): This function will return a list of ANALYSIS_RESULT_IDs that correspond to an analysis with the given ID. This is to facilitate deletion of previous executions of a general performance analysis to avoid having duplicate data in the database
- formatResults(event Dict): This function will return a formatted JSON string or other returnable object that delivers results back to the calling function / system instead of inserting it into the database. This method will only be called if RETURN_RESULTS = True is specified by the caller. This method may not be applicable to all database handlers
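A custom handler for another warehouse might start from a skeleton like the one below. The method names and arguments come from the interface described above; the bodies are deliberately unimplemented and left to the implementer.

```python
class SisuOutputDatabaseConnection:
    """Skeleton database handler for a hypothetical warehouse; implement every method."""

    def __init__(self):
        # Establish the warehouse connection and a reusable cursor from config.py values
        self.connection = None
        self.cursor = None

    def truncateTables(self): raise NotImplementedError
    def deleteAnalysisMetadata(self, analysis_id): raise NotImplementedError
    def writeAnalysisMetadata(self, vals): raise NotImplementedError
    def deleteAnalysisResultSummary(self, analysis_id, analysis_result_id): raise NotImplementedError
    def writeAnalysisResultSummary(self, vals): raise NotImplementedError
    def deleteAnalysisResultDetail(self, analysis_id, analysis_result_id): raise NotImplementedError
    def writeAnalysisResultDetail(self, df): raise NotImplementedError
    def deleteAnalysisResultWaterfall(self, analysis_id, analysis_result_id): raise NotImplementedError
    def writeAnalysisResultWaterfall(self, df): raise NotImplementedError
    def writeTrendResultSummary(self, vals): raise NotImplementedError
    def writeTrendResultDetail(self, df, vals): raise NotImplementedError
    def getTCAnalysisResultIDs(self, analysis_id, recent_period_start, recent_period_end,
                               previous_period_start, previous_period_end): raise NotImplementedError
    def getGCAnalysisResultIDs(self, analysis_id, group_a_name, group_b_name): raise NotImplementedError
    def getGPAnalysisResultIDs(self, analysis_id): raise NotImplementedError
    def formatResults(self, event): raise NotImplementedError
```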
We provide a script called `lambda_function.py` that can immediately be used with an AWS Lambda Function to support workflow orchestration as outlined above in Step 3. This script may also be useful for customers wishing to integrate scheduled execution / load of Sisu segments into other workflows and tools; it will work with AWS Lambda Functions directly, but also with any other caller that makes execution calls with the same input event format.
This script defines a single function, `lambda_handler(event Dictionary, context Dictionary)`. The input context is not used by the function. The input event is used to pass execution requests to the `orchestrator.py` script for either a one-off execution of an analysis or a historical execution.
We provide a script called `fivetran_gcf_main.py` that can immediately be used with a Google Cloud Function to support workflow orchestration as outlined above in Step 3. This script will work with Google Cloud Functions called through Fivetran Connectors directly, but also with any other caller that makes execution calls with the same input event format contained in a secrets section of a JSON GET message.
This script defines a single function, `main(req Request)`. The input req is used to pass a formatted JSON request message that is parsed into a dictionary. The secrets dictionary is extracted from the request and sent to the `orchestrator.py` script for either a one-off execution of an analysis or a historical execution.
This script defines a single function, `main(event Dictionary)`, whose event input contains the parameters needed to operate the workflow. The following required and optional parameters may be specified:
- API_KEY - String - Required: A valid Sisu API Key for the customer organization where the analysis to be operated resides
- ANALYSIS_ID - Integer - Required: A valid Trend Detection, General Performance, Group Comparison, or Time Comparison, analysis ID. NOTE: If ACTION_TYPE (below) is specified, the ANALYSIS_ID parameter MUST correspond to a Time Comparison analysis
- ACTION_TYPE - String - Optional: Specifying an ACTION_TYPE will kick off a historical load as outlined above. Must be one of:
- DOD: Day-over-Day execution
- WOW: Week-over-Week execution
- MOM: Month-over-Month execution
- START_DATE - String - Required if ACTION_TYPE is specified, ignored otherwise: The START_DATE must be either a String in the format 'YYYY-MM-DD', or one of the following:
- YESTERDAY: Sets the START_DATE to the current date - 1 day
- TODAY: Sets the START_DATE to the current date
- LAST_WEEK: Sets the START_DATE to the current date - 7 days
- THIS_WEEK: Sets the START_DATE to the current date
- LAST_MONTH: Sets the START_DATE to the current date - 1 month
- END_DATE - String - Optional if ACTION_TYPE is specified, ignored otherwise: The END_DATE, if specified, must be a String in the format 'YYYY-MM-DD'. If no END_DATE is specified, it is set to the same value as the decoded START_DATE
- RETURN_RESULTS - Boolean - Optional: Default is False. Pass True if you wish the scripts to return a formatted response object instead of directly inserting results into the database (e.g. to return a JSON response object to a web service call). A value of True will result in a call to formatResults on the database handler
- EXECUTE_ANALYSIS - Boolean - Optional: Default is True. Pass False if you wish the scripts to not execute the analysis before retrieving results (e.g. to use the results as they exist in Sisu at the time of script execution). True will result in the script executing the analysis through a Sisu API call before retrieving results
If an API_KEY and ANALYSIS_ID are provided, a single call is made to load_analysis_results.execute_load()
If an ACTION_TYPE is specified, the START_DATE parameter is decoded, the END_DATE parameter is decoded, and a single call is made to load_analysis_results.process_tc_action()
If the calls above complete successfully and RETURN_RESULTS = False, a response dictionary is returned with a statusCode of 200, and a JSON body indicating successful execution. If the calls above complete successfully and RETURN_RESULTS = True, formatted results are returned by the database handler, and it is up to the implementing class to return proper results that the caller will accept. Otherwise, exceptions raised by the script are thrown to the caller to indicate failure
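The decoding of the START_DATE tokens listed above might be sketched as follows. This is a hypothetical helper, not the real `orchestrator.py` code; in particular, the calendar-month arithmetic used for LAST_MONTH is an assumption.

```python
from datetime import date, timedelta

def decode_start_date(token, today=None):
    """Turn a START_DATE token (or 'YYYY-MM-DD' literal) into a date.

    Hypothetical sketch; the real orchestrator.py may differ, notably in
    how LAST_MONTH handles month boundaries.
    """
    today = today or date.today()
    if token == "YESTERDAY":
        return today - timedelta(days=1)
    if token in ("TODAY", "THIS_WEEK"):
        return today
    if token == "LAST_WEEK":
        return today - timedelta(days=7)
    if token == "LAST_MONTH":
        y, m = divmod(today.month - 2, 12)  # one calendar month back
        return today.replace(year=today.year + y, month=m + 1)
    return date.fromisoformat(token)  # otherwise assume a literal 'YYYY-MM-DD'
```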