This repository has been archived by the owner on Mar 2, 2023. It is now read-only.

Simplify architecture by using one DynamoDB table #69

Open
cawilson1 opened this issue Feb 10, 2022 · 1 comment


cawilson1 commented Feb 10, 2022

The current architecture provisions three DynamoDB tables, but all three have the same partition key. This allows for a fairly simple refactor to single-table design for DynamoDB, which has multiple benefits, few if any downsides, and aligns with AWS' stated best practice for DynamoDB.

A proposal for changing this architecture to use single-table design is at the bottom.

The proposal simplifies/improves the architecture by:

  • reducing number of resources to be provisioned, managed, and considered in AWS account, source code, and diagrams
  • using a more flexible schema, which allows all events for a call to be retrieved with a single DynamoDB Query (instead of 3 queries to 3 tables), while preserving the document sort order and the fine-grained query ability via startTime available in previous versions.
  • simplifying table throughput allocations because all W/RCUs come from the same pool instead of three separate pools
  • allowing an organization to integrate this as part of an organization-wide, single DynamoDB table, allowing for indexing of common fields and maintaining throughput and cost for only one table
  • retaining all of the features from previous versions

Proposal:

Simplify architecture by using single-table design for DynamoDB.

Consolidate the contactDetails, transcriptSegments, and transcriptSegmentsToCustomer tables into one DynamoDB table, contactDetails.
Update the key schema for contactDetails to use a string partition key named "pk" and a string sort key named "sk".
Increase the default table throughput allocation to 15 W/RCUs for this table, instead of 3 tables with 5 W/RCUs each.
Use "#!#" as a delimiter.
For a given Amazon Connect call, every summary or transcript segment document uses the same "pk", in the format "CONNECT_TRANSCRIPTION#!#contactId".
Use "sk" to differentiate the types of data inserted. There are 3 prefix categories for "sk", analogous to the three DynamoDB tables from the previous version.

  • "sk" starting with "FROM_CUSTOMER" (1) replaces the "transcriptSegments" table, and "sk" starting with "TO_CUSTOMER" (2) replaces the "transcriptSegmentsToCustomer" table.
    • In the multi-table version, the transcript segment tables both have a sort key named "StartTime". In the single-table version, this order is retained by appending the delimiter and startTime to either "TO_CUSTOMER" or "FROM_CUSTOMER". An example of this new sk value would be "FROM_CUSTOMER#!# 19.37". A "begins_with" query on "sk" against the single table returns the same set of documents as the "ContactId" partition key query did against the corresponding table in previous versions.
  • "sk" exactly matching "TRANSCRIPTION_SUMMARY" (3) replaces the previous version of the "contactDetails" table.
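
The key scheme above can be sketched roughly as follows. This is only an illustration of the proposal, not code from the repository: the helper names are hypothetical, the parameter dictionaries are written in the shape that boto3's low-level DynamoDB client accepts for create_table and query, and startTime is passed in as an already-formatted string because the proposal does not say how it should be padded or formatted (the example value above contains a space that this sketch does not reproduce).

```python
DELIMITER = "#!#"
PK_PREFIX = "CONNECT_TRANSCRIPTION"
SUMMARY_SORT_KEY = "TRANSCRIPTION_SUMMARY"
SEGMENT_PREFIXES = ("FROM_CUSTOMER", "TO_CUSTOMER")
TABLE_NAME = "contactDetails"

# Table definition with string "pk"/"sk" keys and 15 W/RCUs, as proposed.
TABLE_DEFINITION = {
    "TableName": TABLE_NAME,
    "AttributeDefinitions": [
        {"AttributeName": "pk", "AttributeType": "S"},
        {"AttributeName": "sk", "AttributeType": "S"},
    ],
    "KeySchema": [
        {"AttributeName": "pk", "KeyType": "HASH"},   # partition key
        {"AttributeName": "sk", "KeyType": "RANGE"},  # sort key
    ],
    "ProvisionedThroughput": {"ReadCapacityUnits": 15, "WriteCapacityUnits": 15},
}


def partition_key(contact_id: str) -> str:
    """pk shared by every document that belongs to one Amazon Connect call."""
    return f"{PK_PREFIX}{DELIMITER}{contact_id}"


def segment_sort_key(direction: str, start_time: str) -> str:
    """sk for a transcript segment; start_time formatting is the caller's choice."""
    if direction not in SEGMENT_PREFIXES:
        raise ValueError(f"direction must be one of {SEGMENT_PREFIXES}")
    return f"{direction}{DELIMITER}{start_time}"


def all_events_query(contact_id: str) -> dict:
    """Query parameters that return the summary and all segments for a call."""
    return {
        "TableName": TABLE_NAME,
        "KeyConditionExpression": "pk = :pk",
        "ExpressionAttributeValues": {":pk": {"S": partition_key(contact_id)}},
    }


def segments_query(contact_id: str, direction: str) -> dict:
    """Query parameters equivalent to querying one of the old segment tables."""
    if direction not in SEGMENT_PREFIXES:
        raise ValueError(f"direction must be one of {SEGMENT_PREFIXES}")
    return {
        "TableName": TABLE_NAME,
        "KeyConditionExpression": "pk = :pk AND begins_with(sk, :prefix)",
        "ExpressionAttributeValues": {
            ":pk": {"S": partition_key(contact_id)},
            ":prefix": {"S": f"{direction}{DELIMITER}"},
        },
    }
```

With boto3 installed and credentials configured, these dictionaries could be passed as keyword arguments, for example `boto3.client("dynamodb").query(**segments_query(contact_id, "FROM_CUSTOMER"))`. The summary document would be written with `sk` set to `SUMMARY_SORT_KEY`.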

angieyu (Contributor) commented Oct 3, 2022

Thank you for the suggestion! We probably won't be making architecture updates to this solution, as it is essentially legacy now with the release of the Contact Lens Streaming API (see https://docs.aws.amazon.com/connect/latest/adminguide/contact-analysis-segment-streams.html for reference). If you have thoughts, we would love to hear them.
