sisudata/api_examples

Welcome!

This repository contains the code examples for consuming Sisu's insights in BI tools. For more information or to get started, contact your Sisu Customer Success team

Code Details

Database Structure

Below is an overview of the tables and columns, along with their semantic meaning:

NOTE: SET1 refers to Group A or the Recent Period value, and SET2 refers to Group B or the Previous Period value. For General Performance analyses, SET2 will always be NULL

ANALYSIS - Holds the metadata for this analysis

| Column Name | Type | Semantic Meaning |
| --- | --- | --- |
| ANALYSIS_ID | BIGINT, PRIMARY KEY | The static ID for a given analysis |
| ANALYSIS_NAME | TEXT | The name of the analysis in Sisu |
| ANALYSIS_TYPE | TEXT | The type of the analysis |
| APPLICATION_URL | TEXT | A direct link to the analysis in Sisu |
| CREATED_AT | TIMESTAMP | Timestamp when this analysis was created |
| METRIC_ID | BIGINT | The static ID of the metric analyzed |
| METRIC_NAME | TEXT | The name of the metric analyzed |
| METRIC_DESIRED_DIRECTION | TEXT | The desired direction of the metric analyzed |
| PROJECT_ID | BIGINT | The static ID of the project where this analysis is located in Sisu |
| PROJECT_NAME | TEXT | The name of the Sisu project |
| LOAD_TS | TIMESTAMP | The timestamp when this record was loaded into the data warehouse |

ANALYSIS_RESULT_SUMMARY - Holds the summary ribbon information for this analysis

| Column Name | Type | Semantic Meaning |
| --- | --- | --- |
| ANALYSIS_ID | BIGINT, PRIMARY KEY | The static ID for a given analysis |
| ANALYSIS_RESULT_ID | BIGINT, PRIMARY KEY | The static ID for the results associated with a specific execution of a specific analysis |
| REQUESTED_AT | TIMESTAMP | Timestamp when this analysis execution began |
| COMPLETED_AT | TIMESTAMP | Timestamp when this analysis execution completed |
| RUN_STATUS | TEXT | Status of the analysis execution |
| RUN_TYPE | TEXT | Type of the analysis execution |
| TIMEFRAME | TEXT | Currently empty |
| PREVIOUS_PERIOD_START | TIMESTAMP | Start date for the Previous Period of a Time Comparison analysis |
| PREVIOUS_PERIOD_END | TIMESTAMP | End date for the Previous Period of a Time Comparison analysis |
| RECENT_PERIOD_START | TIMESTAMP | Start date for the Recent Period of a Time Comparison analysis |
| RECENT_PERIOD_END | TIMESTAMP | End date for the Recent Period of a Time Comparison analysis |
| GROUP_A_NAME | TEXT | The name of Group A for a Group Comparison analysis |
| GROUP_B_NAME | TEXT | The name of Group B for a Group Comparison analysis |
| METRIC_TYPE_LABEL | TEXT | The type of metric analyzed |
| PERCENT_CHANGE | FLOAT | The percentage change in metric value between periods or groups |
| DIRECTION | TEXT | The direction of change of the metric value between periods or groups |
| SET1_CARD_LABEL | TEXT | Summary card label for set 1 |
| SET1_CATEGORY_FILTER | TEXT | Any filters applied to set 1 data |
| SET1_AVERAGE | FLOAT | The average value for set 1 |
| SET1_MIN | FLOAT | The minimum value for set 1 |
| SET1_MAX | FLOAT | The maximum value for set 1 |
| SET1_MEDIAN | FLOAT | The median value for set 1 |
| SET1_SUM | FLOAT | The sum value for set 1 |
| SET1_SUMMARY_VALUE | FLOAT | The metric value for set 1 |
| SET1_TOTAL_SIZE | FLOAT | The population size for set 1 |
| SET1_TOTAL_NUMERATOR | FLOAT | For rate metrics, the total numerator value for set 1 |
| SET1_TOTAL_DENOMINATOR | FLOAT | For rate metrics, the total denominator value for set 1 |
| SET1_MATCH_SIZE | FLOAT | The match size for set 1 |
| SET1_WEIGHT | FLOAT | The weight value for set 1 |
| SET2_CARD_LABEL | TEXT | Summary card label for set 2 |
| SET2_CATEGORY_FILTER | TEXT | Any filters applied to set 2 data |
| SET2_AVERAGE | FLOAT | The average value for set 2 |
| SET2_MIN | FLOAT | The minimum value for set 2 |
| SET2_MAX | FLOAT | The maximum value for set 2 |
| SET2_MEDIAN | FLOAT | The median value for set 2 |
| SET2_SUM | FLOAT | The sum value for set 2 |
| SET2_SUMMARY_VALUE | FLOAT | The metric value for set 2 |
| SET2_TOTAL_SIZE | FLOAT | The population size for set 2 |
| SET2_TOTAL_NUMERATOR | FLOAT | For rate metrics, the total numerator value for set 2 |
| SET2_TOTAL_DENOMINATOR | FLOAT | For rate metrics, the total denominator value for set 2 |
| SET2_MATCH_SIZE | FLOAT | The match size for set 2 |
| SET2_WEIGHT | FLOAT | The weight value for set 2 |
| LOAD_TS | TIMESTAMP | The timestamp when this record was loaded into the data warehouse |

ANALYSIS_RESULT_WATERFALL - Holds the waterfall steps for a given analysis, if applicable (NOTE: General Performance analyses do not have waterfall charts)

| Column Name | Type | Semantic Meaning |
| --- | --- | --- |
| ANALYSIS_ID | BIGINT, PRIMARY KEY | The static ID for a given analysis |
| ANALYSIS_RESULT_ID | BIGINT, PRIMARY KEY | The static ID for the results associated with a specific execution of a specific analysis |
| STEP_ID | INTEGER, PRIMARY KEY | The step ID for this step of the waterfall chart |
| STEP_TYPE | TEXT | The type of this step of the waterfall chart |
| FACTOR_0_DIMENSION | TEXT | Dimension name for the first factor of this step |
| FACTOR_0_VALUE | TEXT | Dimension value for the first factor of this step |
| FACTOR_1_DIMENSION | TEXT | Dimension name for the second factor of this step |
| FACTOR_1_VALUE | TEXT | Dimension value for the second factor of this step |
| FACTOR_2_DIMENSION | TEXT | Dimension name for the third factor of this step |
| FACTOR_2_VALUE | TEXT | Dimension value for the third factor of this step |
| STEP_IMPACT | FLOAT | The net impact of this step of the waterfall |
| CUMULATIVE_IMPACT_BEFORE_STEP | FLOAT | The cumulative impact before this step of the waterfall |
| CUMULATIVE_IMPACT_AFTER_STEP | FLOAT | The cumulative impact after this step of the waterfall |
| OVERLAPPING_IMPACT | FLOAT | The amount of any overlapping impact |
| CHANGE_IN_SIZE_SET1 | FLOAT | The change in value for set 1, of the value type given in CHANGE_IN_TYPE_SET1 |
| CHANGE_IN_SIZE_SET2 | FLOAT | The change in value for set 2, of the value type given in CHANGE_IN_TYPE_SET2 |
| CHANGE_IN_TYPE_SET1 | FLOAT | The type of the value whose change is reported in CHANGE_IN_SIZE_SET1 |
| CHANGE_IN_TYPE_SET2 | FLOAT | The type of the value whose change is reported in CHANGE_IN_SIZE_SET2 |
| LOAD_TS | TIMESTAMP | The timestamp when this record was loaded into the data warehouse |

ANALYSIS_RESULT_DETAIL - Holds the segment table information for a given analysis

| Column Name | Type | Semantic Meaning |
| --- | --- | --- |
| ANALYSIS_ID | BIGINT, PRIMARY KEY | The static ID for a given analysis |
| ANALYSIS_RESULT_ID | BIGINT, PRIMARY KEY | The static ID for the results associated with a specific execution of a specific analysis |
| SUBGROUP_ID | BIGINT, PRIMARY KEY | The static ID for this subgroup |
| CONFIDENCE | TEXT | The Sisu calculated confidence bucket for this segment |
| FACTOR_0_DIMENSION | TEXT | Dimension name for the first factor of this segment |
| FACTOR_0_VALUE | TEXT | Dimension value for the first factor of this segment |
| FACTOR_1_DIMENSION | TEXT | Dimension name for the second factor of this segment |
| FACTOR_1_VALUE | TEXT | Dimension value for the second factor of this segment |
| FACTOR_2_DIMENSION | TEXT | Dimension name for the third factor of this segment |
| FACTOR_2_VALUE | TEXT | Dimension value for the third factor of this segment |
| FACTOR_0_DIMENSION_FRIENDLY | TEXT | Dimension name for the first factor of this segment, formatted for easier readability |
| FACTOR_1_DIMENSION_FRIENDLY | TEXT | Dimension name for the second factor of this segment, formatted for easier readability |
| FACTOR_2_DIMENSION_FRIENDLY | TEXT | Dimension name for the third factor of this segment, formatted for easier readability |
| FACTOR_0_VALUE_FRIENDLY | TEXT | Dimension value for the first factor of this segment, formatted for easier readability |
| FACTOR_1_VALUE_FRIENDLY | TEXT | Dimension value for the second factor of this segment, formatted for easier readability |
| FACTOR_2_VALUE_FRIENDLY | TEXT | Dimension value for the third factor of this segment, formatted for easier readability |
| FACTOR_0_TEXT | TEXT | The combined, formatted dimension name and value for the first factor of this segment |
| FACTOR_1_TEXT | TEXT | The combined, formatted dimension name and value for the second factor of this segment |
| FACTOR_2_TEXT | TEXT | The combined, formatted dimension name and value for the third factor of this segment |
| SEGMENT_TEXT | TEXT | The combined, formatted dimension names and values for this segment |
| IMPACT | FLOAT | The Sisu calculated impact for this segment |
| IMPACT_MAGNITUDE | FLOAT | The absolute value of the Sisu calculated impact for this segment |
| IMPACT_RANK | TEXT | Currently empty |
| SET1_SIZE | FLOAT | The Sisu calculated population size for this segment for set 1 |
| SET2_SIZE | FLOAT | The Sisu calculated population size for this segment for set 2 |
| SET1_VALUE | FLOAT | The Sisu calculated metric value for this segment for set 1 |
| SET2_VALUE | FLOAT | The Sisu calculated metric value for this segment for set 2 |
| PERCENT_CHANGE | FLOAT | The percentage change in metric value from set 1 to set 2 |
| DIRECTION | TEXT | The formatted direction of change in metric value for this segment |
| DIRECTION_TEXT | TEXT | The textual representation of the direction of change in metric value for this segment |
| ORIENTATION_MATCHES_METRIC | TEXT | True if the direction of change for this segment matches the desired direction of change for this metric; otherwise, False |
| SEGMENT_ORDER | SMALLINT | Number of factors involved in this segment (1, 2, or 3) |
| SEGMENT_ORDER_TEXT | TEXT | A textual representation of whether this is a first-, second-, or third-order segment |
| INSIGHT_TEXT | TEXT | A textual representation of this segment, its impact, direction of change, etc. |
| CHANGE_IN_SIZE | FLOAT | Currently empty |
| MIX_EFFECT | FLOAT | Currently empty |
| NET_EFFECT | FLOAT | Currently empty |
| NET_RELATIVE_EFFECT | FLOAT | Currently empty |
| PERCENT_CHANGE_IN_SIZE | FLOAT | Currently empty |
| RATE_EFFECT | FLOAT | Currently empty |
| RELATIVE_PERCENT_CHANGE | FLOAT | Currently empty |
| RATE_CHANGE | FLOAT | Currently empty |
| RELATIVE_MIX_EFFECT | FLOAT | Currently empty |
| SEGMENT_NAME | TEXT | Currently empty |
| SEGMENT_RANK | TEXT | Currently empty |
| SEGMENT_HASH | TEXT | Currently empty |
| UNWEIGHTED_CHANGE_IN_AVERAGE | FLOAT | Currently empty |
| WEIGHT | FLOAT | Currently empty |
| WEIGHTED_CHANGE_IN_SUM | FLOAT | Currently empty |
| UNITS | TEXT | Currently empty |
| LOAD_TS | TIMESTAMP | The timestamp when this record was loaded into the data warehouse |

TREND_RESULT_SUMMARY - Holds summary card information for a trend detection

| Column Name | Type | Semantic Meaning |
| --- | --- | --- |
| ANALYSIS_ID | BIGINT, PRIMARY KEY | The static ID for a given analysis |
| ANALYSIS_RESULT_ID | BIGINT, PRIMARY KEY | The static ID for the results associated with a specific execution of a specific analysis |
| REQUESTED_AT | TIMESTAMP | Timestamp when this analysis execution began |
| COMPLETED_AT | TIMESTAMP | Timestamp when this analysis execution completed |
| RUN_STATUS | TEXT | Status of the analysis execution |
| RUN_TYPE | TEXT | Type of the analysis execution |
| CURRENT_PERIOD_CARD_LABEL | TEXT | Summary card label for the current trend |
| CURRENT_PERIOD_DENOMINATOR_LABEL | TEXT | Summary card denominator label for the current trend |
| CURRENT_PERIOD_PERCENT_CHANGE | FLOAT | Percent change for the current trend |
| CURRENT_PERIOD_SLOPE | FLOAT | Trend line slope for the current trend |
| PREVIOUS_PERIOD_CARD_LABEL | TEXT | Summary card label for the previous trend |
| PREVIOUS_PERIOD_DENOMINATOR_LABEL | TEXT | Summary card denominator label for the previous trend |
| PREVIOUS_PERIOD_PERCENT_CHANGE | FLOAT | Percent change for the previous trend |
| PREVIOUS_PERIOD_SLOPE | FLOAT | Trend line slope for the previous trend |
| LOAD_TS | TIMESTAMP | The timestamp when this record was loaded into the data warehouse |

TREND_RESULT_DETAIL - Holds the segment details for a trend detection

| Column Name | Type | Semantic Meaning |
| --- | --- | --- |
| ANALYSIS_ID | BIGINT, PRIMARY KEY | The static ID for a given analysis |
| ANALYSIS_RESULT_ID | BIGINT, PRIMARY KEY | The static ID for the results associated with a specific execution of a specific analysis |
| SUBGROUP_ID | BIGINT, PRIMARY KEY | The static ID for this subgroup |
| FACTOR_0_DIMENSION | TEXT | Dimension name for the first factor of this segment |
| FACTOR_0_VALUE | TEXT | Dimension value for the first factor of this segment |
| FACTOR_1_DIMENSION | TEXT | Dimension name for the second factor of this segment |
| FACTOR_1_VALUE | TEXT | Dimension value for the second factor of this segment |
| FACTOR_2_DIMENSION | TEXT | Dimension name for the third factor of this segment |
| FACTOR_2_VALUE | TEXT | Dimension value for the third factor of this segment |
| IMPACT | FLOAT | The Sisu calculated impact for this segment |
| START_DATE | TIMESTAMP | The Start Date of the trend for this segment |
| END_DATE | TIMESTAMP | The End Date of the trend for this segment |
| INTERCEPT | FLOAT | The intercept of the trend line for this segment |
| SLOPE | FLOAT | The slope of the trend line for this segment |
| SIZE | FLOAT | The population size of this segment |
| LOAD_TS | TIMESTAMP | The timestamp when this record was loaded into the data warehouse |
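The tables above join on ANALYSIS_ID (plus ANALYSIS_RESULT_ID for the result tables). As a minimal illustration, the sketch below builds an in-memory SQLite copy of two of the tables and pulls the top segments by absolute impact; the analysis name, metric, and segment rows are invented purely for the demonstration.

```python
import sqlite3

# Tiny in-memory copy of two of the tables above, with invented sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ANALYSIS (
    ANALYSIS_ID INTEGER PRIMARY KEY, ANALYSIS_NAME TEXT, METRIC_NAME TEXT);
CREATE TABLE ANALYSIS_RESULT_DETAIL (
    ANALYSIS_ID INTEGER, ANALYSIS_RESULT_ID INTEGER, SUBGROUP_ID INTEGER,
    SEGMENT_TEXT TEXT, IMPACT REAL, IMPACT_MAGNITUDE REAL);
""")
conn.execute("INSERT INTO ANALYSIS VALUES (174392, 'Revenue TC', 'Revenue')")
conn.executemany(
    "INSERT INTO ANALYSIS_RESULT_DETAIL VALUES (?, ?, ?, ?, ?, ?)",
    [(174392, 1, 10, "region = EMEA", -120.0, 120.0),
     (174392, 1, 11, "plan = Pro", 300.0, 300.0)])

# Top segments by absolute impact for one analysis, joined to its metadata.
top = conn.execute("""
    SELECT a.ANALYSIS_NAME, d.SEGMENT_TEXT, d.IMPACT
    FROM ANALYSIS a
    JOIN ANALYSIS_RESULT_DETAIL d ON d.ANALYSIS_ID = a.ANALYSIS_ID
    WHERE a.ANALYSIS_ID = 174392
    ORDER BY d.IMPACT_MAGNITUDE DESC
    LIMIT 5
""").fetchall()
print(top)
```

The same SELECT, with your warehouse's client in place of sqlite3, is a typical starting point for a BI-tool data source on these tables.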

Description of provided scripts

The following scripts comprise the working solution:

config.py

Configuration of the project is completed in config.py. Each of the project scripts uses different sections of this configuration file:

  • REQUIRED IMPORTS: This section imports the additional libraries and classes required by the project, including the custom class SisuOutputDatabaseConnection. Customers writing segments to Snowflake may use the existing class; customers writing segments to other databases will need to write their own database handler (see Extensions). We also import io so that input parameters such as passwords and API keys can be hosted in a profile file for security purposes
  • RETURN A DATABASE CONNECTION: This section defines a method called getDatabaseConnection() that returns a new SisuOutputDatabaseConnection object implementing the required handler methods (see Extensions). Customers using Snowflake as their data warehouse require no changes
  • DATABASE CONNECTION PARAMETERS: This section defines the connection parameters used by the SisuOutputDatabaseConnection. Customers will need to update the parameter values in this section for their data warehouse. Customers using Snowflake will need new values for each of the existing parameters; other data warehouse types will require different parameters (see Extensions)
  • SEGMENT LOAD PARAMETERS: This section defines parameters used by load_analysis_results.py to control the record insert batch size, the string delimiter used for segment load, whether database tables should be truncated at the start of execution, and whether the script should operate in debug mode (verbose logging to the console). Customers should not need to modify this section unless directed by their Customer Success Manager or Solution Architect. NOTE: setting the truncate option to true will truncate all of the results tables at the start of execution. This cannot be undone; use with extreme caution
  • USER PARAMETERS: Customers may put additional parameters for their organization in this section. In Sisu’s default configuration, we show how one may retrieve Sisu API keys stored in the executing user’s bash or zsh profile
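A minimal sketch of that layout follows, assuming Snowflake-style parameter names; the actual names and values in the shipped config.py may differ, and the placeholders here are illustrative only.

```python
import os

def getDatabaseConnection():
    # Return a new handler implementing the SisuOutputDatabaseConnection
    # interface (see Extensions). Snowflake customers can use the provided class.
    from snowflake_database_helper import SisuOutputDatabaseConnection
    return SisuOutputDatabaseConnection()

# DATABASE CONNECTION PARAMETERS (Snowflake-flavored placeholders; other
# warehouse types need a different parameter set)
SNOWFLAKE_ACCOUNT = "my_account"
SNOWFLAKE_USER = "loader_user"
SNOWFLAKE_WAREHOUSE = "LOAD_WH"

# SEGMENT LOAD PARAMETERS
INSERT_BATCH_SIZE = 1000   # records per insert batch
STRING_DELIMITER = "|"     # delimiter used for segment load
TRUNCATE_TABLES = False    # True wipes ALL results tables -- cannot be undone
DEBUG = False              # verbose console logging

# USER PARAMETERS: e.g. read an API key exported in the user's shell profile
AI_API_KEY = os.environ.get("SISU_API_KEY", "")
```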

load_analysis_results.py

This is the main project script that performs the actions of:

  • Updating analyses with new time period configurations
  • Iterating over time periods for historical load of segments
  • Executing an analysis
  • Retrieving and formatting Sisu segments
  • Passing segments to the database helper to be written to the warehouse

It is split into two main functions:

process_tc_action

This method takes inputs to allow historical loads of different types for a Time Comparison analysis. For example, we may wish to execute a Time Comparison analysis for each of the previous 12 months, and write the segments to the database

Inputs to this method are all required:

  • API_KEY - String: A valid Sisu API Key for the customer organization where the analysis to be operated resides
  • ANALYSIS_ID - Integer: A valid Time Comparison analysis ID
  • RETURN_RESULTS - Boolean: True if the results are to be returned to the calling function as a JSON blob or object; False if the results are to be inserted into the database directly
  • ACTION_TYPE - String: Must be one of:
    • DOD: Day Over Day execution
    • WOW: Week Over Week
    • MOM: Month Over Month
  • START_DATE - Datetime: The starting date of the first Recent Period of the historical load
  • END_DATE - Datetime: The starting date of the last Recent Period of the historical load

Executing historical loads is then automated. The script performs the load requested by the ACTION_TYPE: dates are calculated, the analysis specified by ANALYSIS_ID is modified to set the Recent and Previous Period dates as calculated (see below), and execute_load() is called with the provided API_KEY, RETURN_RESULTS, and ANALYSIS_ID, with EXECUTE_ANALYSIS set to True

Dates are calculated and the script operates in the following manner:

  1. We first calculate a timeDelta according to the ACTION_TYPE specified: for DOD, timeDelta is one day; for WOW, seven days; for MOM, one calendar month
  2. We then calculate the initial dates based on the START_DATE provided. The Recent Period is set as START_DATE to START_DATE + timeDelta. The Previous Period is set as START_DATE - timeDelta to START_DATE
  3. We then operate a loop of
    1. Update the Time Comparison to set the date ranges as calculated
    2. Call execute_load() for the modified analysis
    3. Update the date ranges, adding timeDelta to each of the Previous / Recent Period Start / End dates
    4. Stop operating when the Recent Period Start Date is AFTER the END_DATE provided as input to the function

NOTES:

  • Our calculations are based on the Recent Period, that is the later period, start and end dates. The Previous Period is set to the comparison period BEFORE the Recent Period. One must ensure that data exists for both the Previous Period and Recent Period, or execution will fail
  • The START_DATE and END_DATE input parameters specify the range of time over which to operate this historical load, NOT the start and end dates of any single period of the analysis
  • Sisu’s time period Start Date is inclusive, but End Date is exclusive. So, the Previous Period End Date may be the same as the Recent Period Start Date without any overlap in data

Examples:

{'ANALYSIS_ID': '174392', 'API_KEY': config.AI_API_KEY, 'ACTION_TYPE': 'DOD', 'START_DATE': '2018-12-04', 'END_DATE': '2018-12-06'}

This configuration will execute the analysis to do a daily comparison as follows:

| | Run 1 | Run 2 | Run 3 |
| --- | --- | --- | --- |
| Recent Period Start | 2018-12-04 | 2018-12-05 | 2018-12-06 |
| Recent Period End | 2018-12-05 | 2018-12-06 | 2018-12-07 |
| Previous Period Start | 2018-12-03 | 2018-12-04 | 2018-12-05 |
| Previous Period End | 2018-12-04 | 2018-12-05 | 2018-12-06 |

{'ANALYSIS_ID': '174392', 'API_KEY': config.AI_API_KEY, 'ACTION_TYPE': 'WOW', 'START_DATE': '2018-12-04', 'END_DATE': '2019-01-06'}

This configuration will execute the analysis to do a weekly comparison as follows:

| | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 |
| --- | --- | --- | --- | --- | --- |
| Recent Period Start | 2018-12-04 | 2018-12-11 | 2018-12-18 | 2018-12-25 | 2019-01-01 |
| Recent Period End | 2018-12-11 | 2018-12-18 | 2018-12-25 | 2019-01-01 | 2019-01-08 |
| Previous Period Start | 2018-11-27 | 2018-12-04 | 2018-12-11 | 2018-12-18 | 2018-12-25 |
| Previous Period End | 2018-12-04 | 2018-12-11 | 2018-12-18 | 2018-12-25 | 2019-01-01 |

{'ANALYSIS_ID': '174392', 'API_KEY': config.AI_API_KEY, 'ACTION_TYPE': 'MOM', 'START_DATE': '2018-12-04', 'END_DATE': '2019-01-06'}

This configuration will execute the analysis to do a monthly comparison as follows:

| | Run 1 | Run 2 |
| --- | --- | --- |
| Recent Period Start | 2018-12-04 | 2019-01-04 |
| Recent Period End | 2019-01-04 | 2019-02-04 |
| Previous Period Start | 2018-11-04 | 2018-12-04 |
| Previous Period End | 2018-12-04 | 2019-01-04 |
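The rolling-window arithmetic behind these examples can be sketched as a standalone helper. The names tc_schedule and DELTAS are illustrative and do not appear in the shipped scripts, and MOM is omitted here because it steps by calendar month rather than a fixed number of days.

```python
from datetime import date, timedelta

# Illustrative re-implementation of the date arithmetic described above.
DELTAS = {"DOD": timedelta(days=1), "WOW": timedelta(days=7)}

def tc_schedule(action_type, start_date, end_date):
    delta = DELTAS[action_type]
    recent_start = start_date
    schedule = []
    # Stop once the Recent Period Start Date passes END_DATE (loop step 4)
    while recent_start <= end_date:
        schedule.append({
            "recent": (recent_start, recent_start + delta),
            "previous": (recent_start - delta, recent_start),
        })
        recent_start += delta
    return schedule

runs = tc_schedule("DOD", date(2018, 12, 4), date(2018, 12, 6))
print(len(runs))  # three daily iterations, matching the DOD table above
```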

execute_load

This function performs the work of executing an analysis, retrieving and formatting the results, and sending them to the database helper to be stored in the warehouse

Inputs to this method are all required:

  • API_KEY - String: A valid Sisu API Key for the customer organization where the analysis to be operated resides
  • ANALYSIS_ID - Integer: A valid Trend Detection, General Performance, Group Comparison, or Time Comparison analysis ID
  • EXECUTE_ANALYSIS - Boolean: True - execute the analysis before retrieving the analysis results; False - do not execute the analysis before retrieving the analysis results (i.e. use the results provided by Sisu’s previous execution of the analysis)
  • RETURN_RESULTS - Boolean: True - return the results to the caller as a formatted JSON string or object; False - insert the data directly into the database

This function will then perform the following actions:

  1. Get a database connection by calling config.getDatabaseConnection(), which will return an object implementing the methods required of a SisuOutputDatabaseConnection (see above)
    1. Truncate the database tables if requested in config.py
  2. Connect to Sisu with the provided API_KEY, get the Analysis metadata (including some Metric information), and format it. If requested with RETURN_RESULTS = False, delete the existing metadata for this Analysis and write the updated metadata to the data warehouse
  3. Execute the Analysis, if requested with EXECUTE_ANALYSIS = True, and retrieve the result summary and segment detail. Find any existing runs with the same group names or date ranges, and delete them from the summary and detail tables
  4. Format the summary information. Calculate some derived columns for the analysis summary table. Insert the summary information into the data warehouse, if requested with RETURN_RESULTS = False
  5. Retrieve the analysis waterfall, format it, and insert it into the data warehouse if requested with RETURN_RESULTS = False
  6. Format the segment detail information. Calculate some derived columns for the segment detail table, and insert the segment details into the data warehouse if requested with RETURN_RESULTS = False
  7. Return the results for formatting into a JSON or object, if requested with RETURN_RESULTS = True
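The steps above can be condensed into a control-flow sketch. StubSisu and StubDB (and the sisu method names) are invented stand-ins for the real Sisu API client and database handler; only the ordering of operations mirrors the real execute_load.

```python
# Stand-ins for the real Sisu API client and database handler.
class StubSisu:
    def get_analysis(self, analysis_id):
        return ["metadata", analysis_id]
    def run_analysis(self, analysis_id):
        pass
    def get_results(self, analysis_id):
        return (["summary"], ["detail"])

class StubDB:
    def __init__(self):
        self.written = []
    def deleteAnalysisMetadata(self, analysis_id):
        pass
    def writeAnalysisMetadata(self, vals):
        self.written.append("metadata")
    def writeAnalysisResultSummary(self, vals):
        self.written.append("summary")
    def writeAnalysisResultDetail(self, df):
        self.written.append("detail")

def execute_load(analysis_id, execute_analysis, return_results, db, sisu):
    meta = sisu.get_analysis(analysis_id)       # step 2: fetch + format metadata
    if not return_results:
        db.deleteAnalysisMetadata(analysis_id)  # step 2: replace old metadata
        db.writeAnalysisMetadata(meta)
    if execute_analysis:                        # step 3: optional re-execution
        sisu.run_analysis(analysis_id)
    summary, detail = sisu.get_results(analysis_id)
    if return_results:                          # step 7: hand results back
        return {"summary": summary, "detail": detail}
    db.writeAnalysisResultSummary(summary)      # steps 4-6: persist results
    db.writeAnalysisResultDetail(detail)

db = StubDB()
execute_load(174392, True, False, db, StubSisu())
print(db.written)
```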

snowflake_database_helper.py / fivetran_database_helper.py

This script defines a class called SisuOutputDatabaseConnection that performs database-related functions to write Sisu analysis and segment information into the data warehouse, or return a formatted object / JSON string the caller can understand. We provide example helpers for Snowflake and Fivetran. Customers using other data warehouses will need to create their own implementation of this class that implements all of the same functions available in this object

Class variables are declared at the top of the class to define the SQL statements used to execute select, insert, delete, and truncate operations for the specific data warehouse, along with internal data structures to store result data. These variables are a convenience for the developer and are not strictly required, as they are not used by callers of the class

Class functions are declared inside the class declaration to perform the various operations required to operate the execute_load() function detailed above. Unlike class variables, class functions are required and must be implemented for any new data warehouse type a customer desires to use:

  • init(): This is a standard class initialization function. Using configuration in config.py, it performs warehouse-specific tasks to establish a connection and create a reusable cursor for database operations that span the life of the execution. At the end of this function, a connection and cursor object, and any internal data structures required, should be fully initialized and stored in the class variables
  • truncateTables(): This function should perform the database operations required to remove all data from all of the database tables defined for the solution. Think of this as a reset button for the data tables. At the end of this function, each of the tables will be empty
  • deleteAnalysisMetadata(ANALYSIS_ID Integer): This function should delete the analysis with the ANALYSIS_ID provided as input from the ANALYSIS table
  • writeAnalysisMetadata(vals List): This function should insert a record into the ANALYSIS table. The input list vals will contain, in order, the column values to be inserted into the table to form a complete record, even if a column value is NULL
  • deleteAnalysisResultSummary(ANALYSIS_ID Integer, ANALYSIS_RESULT_ID Integer): This function should delete any records from the ANALYSIS_RESULT_SUMMARY table with the given ANALYSIS_ID and ANALYSIS_RESULT_ID combination. In practice, each call to this method should result in the deletion of one record from the table
  • writeAnalysisResultSummary(vals List): This function should insert a record into the ANALYSIS_RESULT_SUMMARY table. The input list vals will contain, in order, the column values to be inserted into the table to form a complete record, even if a column value is NULL
  • deleteAnalysisResultDetail(ANALYSIS_ID Integer, ANALYSIS_RESULT_ID Integer): This function should delete any records from the ANALYSIS_RESULT_DETAIL table with the given ANALYSIS_ID and ANALYSIS_RESULT_ID combination. In practice, each call to this method should result in the deletion of many records from the table, each corresponding to a segment from the given execution of a specific analysis
  • writeAnalysisResultDetail(df Dataframe): This function should write the pandas dataframe containing the segments for a given analysis execution into the ANALYSIS_RESULT_DETAIL table. df will contain many rows of data that may be inserted in bulk, or as individual records, depending on the capabilities of the data warehouse. Each row of df will contain a complete record, even if some of the column values are NULL
  • deleteAnalysisResultWaterfall(ANALYSIS_ID Integer, ANALYSIS_RESULT_ID Integer): This function should delete any records from the ANALYSIS_RESULT_WATERFALL table with the given ANALYSIS_ID and ANALYSIS_RESULT_ID combination. In practice, each call to this method should result in the deletion of many records from the table, each corresponding to a step in the waterfall chart from the given execution of a specific analysis
  • writeAnalysisResultWaterfall(df Dataframe): This function should write the pandas dataframe containing the steps for a given analysis waterfall into the ANALYSIS_RESULT_WATERFALL table. df will contain many rows of data that may be inserted in bulk, or as individual records, depending on the capabilities of the data warehouse. Each row of df will contain a complete record, even if some of the column values are NULL
  • writeTrendResultSummary(vals List): This function should delete any existing data with the same ANALYSIS_ID, and ANALYSIS_RESULT_ID, then insert a record into the TREND_RESULT_SUMMARY table. The input list vals will contain, in order, the column values to be inserted into the table to form a complete record, even if a column value is NULL. The first and second fields in vals correspond to the ANALYSIS_ID and ANALYSIS_RESULT_ID respectively, to facilitate deletion of existing records
  • writeTrendResultDetail(df Dataframe, vals List): This function should delete any existing data with the ANALYSIS_ID and ANALYSIS_RESULT_ID found as the first and second entry in the input list vals, and then insert the trend details in the pandas dataframe df into the TREND_RESULT_DETAIL table. Each row of df will contain a complete record, even if some of the column values are NULL
  • getTCAnalysisResultIDs(ANALYSIS_ID Integer, RECENT_PERIOD_START String, RECENT_PERIOD_END String, PREVIOUS_PERIOD_START String, PREVIOUS_PERIOD_END String): This function will return a list of ANALYSIS_RESULT_ID’s that correspond to an analysis with the given ID and date ranges. This is to facilitate deletion of previous executions of a time comparison analysis with those date ranges to avoid having duplicate data in the database
  • getGCAnalysisResultIDs(ANALYSIS_ID Integer, GROUP_A_NAME String, GROUP_B_NAME String): This function will return a list of ANALYSIS_RESULT_ID’s that correspond to an analysis with the given ID and group names. This is to facilitate deletion of previous executions of a group comparison analysis with those group names to avoid having duplicate data in the database
  • getGPAnalysisResultIDs(ANALYSIS_ID Integer): This function will return a list of ANALYSIS_RESULT_ID’s that correspond to an analysis with the given ID. This is to facilitate deletion of previous executions of a general performance analysis to avoid having duplicate data in the database
  • formatResults(event Dict): This function will return a formatted JSON string or other returnable object that delivers results back to the calling function / system instead of inserting it into the database. This method will only be called if RETURN_RESULTS = True is specified by the caller. This method may not be applicable to all database handlers
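For customers targeting a different warehouse, a skeleton implementation might look like the following. The method names mirror the interface listed above; the constructor details and SQL are left as stubs to fill in for your warehouse.

```python
class SisuOutputDatabaseConnection:
    """Skeleton database handler; every method body is a stub to implement."""

    def __init__(self):
        # Establish a warehouse connection and a reusable cursor using the
        # parameters in config.py; both live for the whole execution.
        self.connection = None
        self.cursor = None

    def truncateTables(self):
        raise NotImplementedError  # empties ALL solution tables; use with caution

    def deleteAnalysisMetadata(self, analysis_id):
        raise NotImplementedError

    def writeAnalysisMetadata(self, vals):
        raise NotImplementedError

    def deleteAnalysisResultSummary(self, analysis_id, analysis_result_id):
        raise NotImplementedError

    def writeAnalysisResultSummary(self, vals):
        raise NotImplementedError

    def deleteAnalysisResultDetail(self, analysis_id, analysis_result_id):
        raise NotImplementedError

    def writeAnalysisResultDetail(self, df):
        raise NotImplementedError

    def deleteAnalysisResultWaterfall(self, analysis_id, analysis_result_id):
        raise NotImplementedError

    def writeAnalysisResultWaterfall(self, df):
        raise NotImplementedError

    def writeTrendResultSummary(self, vals):
        raise NotImplementedError  # vals[0], vals[1] = ANALYSIS_ID, ANALYSIS_RESULT_ID

    def writeTrendResultDetail(self, df, vals):
        raise NotImplementedError

    def getTCAnalysisResultIDs(self, analysis_id, recent_period_start,
                               recent_period_end, previous_period_start,
                               previous_period_end):
        raise NotImplementedError

    def getGCAnalysisResultIDs(self, analysis_id, group_a_name, group_b_name):
        raise NotImplementedError

    def getGPAnalysisResultIDs(self, analysis_id):
        raise NotImplementedError

    def formatResults(self, event):
        raise NotImplementedError  # only needed when RETURN_RESULTS = True
```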

lambda_function.py

We provide a script called lambda_function.py that can immediately be used with an AWS Lambda Function to support workflow orchestration as outlined above in Step 3. This script may also be useful for customers wishing to integrate scheduled execution / load of Sisu segments into other workflows and tools; it will work with AWS Lambda Functions directly, but also with any other caller that makes execution calls with the same input Event format

This script defines a single function, lambda_handler(event Dictionary, context Dictionary): The input context is not used by the function. The input event is used to pass execution requests to the orchestrator.py script for either a one-off execution of an analysis, or a historical execution
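For illustration, here are two example event payloads such a caller might pass (using a placeholder API key): one for a one-off execution and one for a historical load. The parameter meanings are defined by orchestrator.py.

```python
# Example Lambda event payloads; the API key value is a placeholder.
one_off_event = {
    "API_KEY": "sisu-api-key-here",
    "ANALYSIS_ID": 174392,
}

historical_event = {
    "API_KEY": "sisu-api-key-here",
    "ANALYSIS_ID": 174392,       # must be a Time Comparison analysis
    "ACTION_TYPE": "WOW",        # week-over-week historical load
    "START_DATE": "2018-12-04",
    "END_DATE": "2019-01-06",
}
```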

fivetran_gcf_main.py

We provide a script called fivetran_gcf_main.py that can immediately be used with a Google Cloud Function to support workflow orchestration as outlined above in Step 3. This script will work with Google Cloud Functions called through Fivetran Connectors directly, but also with any other caller that makes execution calls with the same input event format contained in a secrets section of a JSON GET message

This script defines a single function, main(req Request): The input req is used to pass a formatted JSON request message that is parsed into a dictionary. The secrets dictionary is extracted from the request and sent to the orchestrator.py script for either a one-off execution of an analysis, or a historical execution

orchestrator.py

This script defines a single function, main(event Dictionary), that contains the input parameters needed to operate the workflow. The following required and optional parameters may be specified:

  • API_KEY - String - Required: A valid Sisu API Key for the customer organization where the analysis to be operated resides
  • ANALYSIS_ID - Integer - Required: A valid Trend Detection, General Performance, Group Comparison, or Time Comparison analysis ID. NOTE: If ACTION_TYPE (below) is specified, the ANALYSIS_ID parameter MUST correspond to a Time Comparison analysis
  • ACTION_TYPE - String - Optional: Specifying an ACTION_TYPE will kick off a historical load as outlined above. Must be one of:
    • DOD: Day Over Day execution
    • WOW: Week Over Week
    • MOM: Month Over Month
  • START_DATE - String - Required if ACTION_TYPE is specified, ignored otherwise: The START_DATE must be either a String in the format ‘YYYY-MM-DD’, or one of the following:
    • YESTERDAY: Sets the START_DATE to the current date - 1 day
    • TODAY: Sets the START_DATE to the current date
    • LAST_WEEK: Sets the START_DATE to the current date - 7 days
    • THIS_WEEK: Sets the START_DATE to the current date
    • LAST_MONTH: Sets the START_DATE to the current date - 1 month
  • END_DATE - String - Optional if ACTION_TYPE is specified, ignored otherwise: The END_DATE, if specified, must be a String in the format ‘YYYY-MM-DD’. If no END_DATE is specified, the END_DATE is set to the same value as the decoded START_DATE
  • RETURN_RESULTS - Boolean - Optional: Default is False. Pass True if you wish the scripts to return a formatted response object instead of directly inserting results into the database (ex: to return a JSON response object to a web service call). A value of True will result in a call to formatResults on the database handler
  • EXECUTE_ANALYSIS - Boolean - Optional: Default is True. Pass False if you wish the scripts to not execute the analysis before retrieving results (ex: to use the results as they exist in Sisu at the time of script execution). True will result in the script executing the analysis through a Sisu API call before retrieving results

If an API_KEY and ANALYSIS_ID are provided, a single call is made to load_analysis_results.execute_load()

If an ACTION_TYPE is specified, the START_DATE parameter is decoded, the END_DATE parameter is decoded, and a single call is made to load_analysis_results.process_tc_action()

If the calls above complete successfully and RETURN_RESULTS = False, a response dictionary is returned with a statusCode of 200, and a JSON body indicating successful execution. If the calls above complete successfully and RETURN_RESULTS = True, formatted results are returned by the database handler, and it is up to the implementing class to return proper results that the caller will accept. Otherwise, exceptions raised by the script are thrown to the caller to indicate failure
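That dispatch logic can be sketched as follows. load_analysis_results is stubbed out here so the flow runs standalone, and the START_DATE keyword decoding (YESTERDAY, TODAY, etc.) is omitted; the real orchestrator.py decodes those values before dispatching.

```python
import json

class load_analysis_results:  # stand-in for the real module
    @staticmethod
    def execute_load(**kwargs):
        return None
    @staticmethod
    def process_tc_action(**kwargs):
        return None

def main(event):
    api_key = event["API_KEY"]                          # required
    analysis_id = event["ANALYSIS_ID"]                  # required
    return_results = event.get("RETURN_RESULTS", False)
    execute_analysis = event.get("EXECUTE_ANALYSIS", True)
    action_type = event.get("ACTION_TYPE")

    if action_type:  # historical load path
        result = load_analysis_results.process_tc_action(
            API_KEY=api_key, ANALYSIS_ID=analysis_id,
            RETURN_RESULTS=return_results, ACTION_TYPE=action_type,
            START_DATE=event["START_DATE"],
            END_DATE=event.get("END_DATE", event["START_DATE"]))
    else:  # one-off execution path
        result = load_analysis_results.execute_load(
            API_KEY=api_key, ANALYSIS_ID=analysis_id,
            EXECUTE_ANALYSIS=execute_analysis, RETURN_RESULTS=return_results)

    if return_results:
        return result  # formatted by the database handler's formatResults
    return {"statusCode": 200, "body": json.dumps("Execution successful")}

print(main({"API_KEY": "key", "ANALYSIS_ID": 174392}))
```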
