Jap/simple forms/s3 spike #18338

Closed
wants to merge 38 commits into from
38 commits
9514c97
move in generic version of runner
pennja Sep 5, 2024
72ec29a
code refinement
pennja Sep 5, 2024
c066786
fix long param list
pennja Sep 5, 2024
e6c56e2
rename class
pennja Sep 5, 2024
4ef32b1
add existing version of DumpSubmissionToPdf class
pennja Sep 5, 2024
3309a0e
refine DumpSubmissionToPdf class further
pennja Sep 5, 2024
83376ea
add existing version of UserSubmissionDumpBuilder class
pennja Sep 5, 2024
45a9517
refine UserSubmissionDumpBuilder class further
pennja Sep 5, 2024
6359235
further refinement of shared logic and naming
pennja Sep 6, 2024
56676ae
add Sidekiq job to handle running script
pennja Sep 6, 2024
cab5b38
minor tweaks
pennja Sep 6, 2024
04fc1f0
misc tweaks and comments
pennja Sep 6, 2024
298792d
more code consolidation
pennja Sep 6, 2024
57b7f2f
more changes to make VFF forms work
pennja Sep 6, 2024
621e210
remove verbose inheritance
pennja Sep 6, 2024
79f4015
updates in accordance with PR feedback
pennja Sep 10, 2024
fd3c324
more misc changes
pennja Sep 10, 2024
814f2e8
many changes, stripping unnecessary logic
pennja Sep 10, 2024
4593583
misc tweaks
pennja Sep 10, 2024
3788cdf
rip off debt management center sharepoint service
pennja Sep 10, 2024
146ca82
first pass at code refinement and exploration
pennja Sep 10, 2024
c2d5c24
one more comment
pennja Sep 10, 2024
b8fe57f
lots of renaming and note taking
pennja Sep 11, 2024
8206250
more s3 changes
pennja Sep 11, 2024
f006b3c
more renaming
pennja Sep 11, 2024
eca30a8
manifest notes
pennja Sep 11, 2024
fff0df3
archiver optimizations, make one s3 call instead of many
pennja Sep 11, 2024
1e137b0
more cleanup
pennja Sep 11, 2024
195110f
split up share point logic
pennja Sep 11, 2024
6b8c713
add job to handle moving S3 stuff to SharePoint
pennja Sep 11, 2024
df18c45
pull out submission specific logic from archiver
pennja Sep 11, 2024
96059f6
more sharepoint service changes
pennja Sep 11, 2024
cb3696c
updates in accordance with remediation documentation
pennja Sep 11, 2024
4ffc337
minor sharepoint tweaks, archive builder tests
pennja Sep 12, 2024
d00cbcf
add job to circumvent S3 if necessary
pennja Sep 12, 2024
95455f7
add missing library to utils
pennja Sep 12, 2024
c8fde24
update test coverage for submission archiver
pennja Sep 12, 2024
929380e
fix random thing
pennja Sep 12, 2024
@@ -164,9 +164,14 @@ def get_file_paths_and_metadata(parsed_form_data)
       end

       def upload_pdf(file_path, metadata, form)
-        location, uuid = prepare_for_upload(form, file_path)
+        location, uuid, submission_attempt = prepare_for_upload(form, file_path)
         log_upload_details(location, uuid)
         response = perform_pdf_upload(location, file_path, metadata, form)
+        # The job class is declared under SimpleFormsApi::S3::Jobs and requires benefits_intake_uuids.
+        SimpleFormsApi::S3::Jobs::SubmissionArchiveHandlerJob.perform_async(
+          benefits_intake_uuids: [submission_attempt.form_submission.benefits_intake_uuid],
+          metadata:,
+          file_path:
+        )

         [response.status, uuid]
       end
@@ -176,9 +181,9 @@ def prepare_for_upload(form, file_path)
           form_id: get_form_id)
         location, uuid = lighthouse_service.request_upload
         stamp_pdf_with_uuid(form, uuid, file_path)
-        create_form_submission_attempt(uuid)
+        submission_attempt = create_form_submission_attempt(uuid)

-        [location, uuid]
+        [location, uuid, submission_attempt]
       end

       def stamp_pdf_with_uuid(form, uuid, stamped_template_path)
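Note on the `perform_async` call in this hunk: Sidekiq serializes job arguments to JSON before a worker receives them, so a keyword-style call reaches `perform` as a single hash with string keys. The sketch below simulates only that JSON round trip with the standard library; it does not use Sidekiq, and `KeywordWorker`, `PositionalWorker`, and `round_trip` are hypothetical names introduced for illustration.

```ruby
require 'json'

# Hypothetical stand-in for a worker whose `perform` takes keyword arguments,
# mirroring the signature used by the jobs in this PR.
class KeywordWorker
  def perform(benefits_intake_uuids:, **options)
    [benefits_intake_uuids, options]
  end
end

# Hypothetical alternative that takes positional, JSON-friendly arguments.
class PositionalWorker
  def perform(benefits_intake_uuids, options = {})
    [benefits_intake_uuids, options.transform_keys(&:to_sym)]
  end
end

# Simulates the JSON serialization that job arguments pass through.
def round_trip(*args)
  JSON.parse(JSON.generate(args))
end

# Keyword-style arguments come back as one string-keyed hash.
keyword_args = round_trip(benefits_intake_uuids: ['abc-123'], file_path: 'tmp/form.pdf')

keyword_error =
  begin
    KeywordWorker.new.perform(*keyword_args)
    nil
  rescue ArgumentError => e
    e
  end

# Positional arguments survive the round trip and can be re-symbolized.
positional_args = round_trip(['abc-123'], { file_path: 'tmp/form.pdf' })
positional_result = PositionalWorker.new.perform(*positional_args)
```

Under these assumptions the keyword-based `perform` raises `ArgumentError` after the round trip, while the positional variant receives its data intact. Whether the real jobs should switch to positional arguments is a design decision for the PR author.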
@@ -0,0 +1,47 @@
# frozen_string_literal: true

require 'zip'

module SimpleFormsApi
  module S3
    module Jobs
      class ArchiveUploaderJob < SimpleFormsApi::S3::Utils
        include Sidekiq::Worker

        sidekiq_options retry: 3, queue: 'default'

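        def perform(benefits_intake_uuid:)
          @benefits_intake_uuid = benefits_intake_uuid

          temp_directory_path = fetch_s3_folder
          zip_file_path = zip_temp_folder(temp_directory_path)
          upload_s3_folder_to_sharepoint(zip_file_path)
        rescue => e
          handle_error('ArchiveUploaderJob failed.', e)
        ensure
          # Remove the fetched folder and the archive even when zipping or uploading fails.
          FileUtils.rm_rf([temp_directory_path, zip_file_path].compact)
        end

        private

        attr_reader :benefits_intake_uuid

        # Zips the fetched folder into a sibling "<folder>.zip" and returns the archive path.
        # The original opened the directory itself as the zip target and uploaded the directory path.
        def zip_temp_folder(temp_directory_path)
          base_path = temp_directory_path.to_s.chomp('/')
          zip_file_path = "#{base_path}.zip"

          Zip::File.open(zip_file_path, Zip::File::CREATE) do |zip_file|
            Dir[File.join(base_path, '**', '*')].each do |file|
              next unless File.file?(file)

              zip_file.add(file.delete_prefix("#{base_path}/"), file)
            end
          end

          zip_file_path
        end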

        def fetch_s3_folder
          SimpleFormsApi::S3::SubmissionArchiver.fetch_s3_submission(benefits_intake_uuid)
        end

        def upload_s3_folder_to_sharepoint(zip_file_path)
          SimpleFormsApi::SharePoint::ArchiveUploader.upload(benefits_intake_uuid:, zip_file_path:)
        end
      end
    end
  end
end
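Note on `zip_temp_folder`: the archive has to be written to a path outside the folder being zipped, and each entry needs a name relative to that folder. The sketch below shows only that path bookkeeping with the standard library, leaving out rubyzip. `archive_entries` and `zip_path_for` are hypothetical helpers and are not part of the PR.

```ruby
require 'tmpdir'
require 'fileutils'
require 'pathname'

# Hypothetical helper: maps each regular file under `dir` to the relative
# name it would get inside an archive. Directories are skipped.
def archive_entries(dir)
  base = Pathname.new(dir)
  Dir.glob(File.join(dir, '**', '*'))
     .select { |path| File.file?(path) }
     .sort
     .to_h { |path| [Pathname.new(path).relative_path_from(base).to_s, path] }
end

# Hypothetical helper: places the archive next to the directory, not inside it.
def zip_path_for(dir)
  "#{dir.to_s.chomp('/')}.zip"
end

entry_names = nil
Dir.mktmpdir('archive-demo') do |dir|
  FileUtils.mkdir_p(File.join(dir, 'attachments'))
  File.write(File.join(dir, 'metadata.json'), '{}')
  File.write(File.join(dir, 'attachments', 'attachment_1.pdf'), 'stub')
  entry_names = archive_entries(dir).keys
end
```

Each key/value pair could then be passed to `zip_file.add(entry_name, source_path)` inside a `Zip::File.open` block, as the job above does.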
@@ -0,0 +1,34 @@
# frozen_string_literal: true

module SimpleFormsApi
  module S3
    module Jobs
      class SubmissionArchiveHandlerJob < SimpleFormsApi::S3::Utils
        include Sidekiq::Worker

        sidekiq_options retry: 3, queue: 'default'

        def perform(benefits_intake_uuids:, **options)
          defaults = default_options.merge(options)

          runner = SubmissionArchiveHandler.new(benefits_intake_uuids:, **defaults)
          result_dir = runner.run
          log_info("Job completed successfully. Results saved in directory: #{result_dir}")
        rescue => e
          handle_error('SubmissionArchiveHandlerJob failed.', e)
        end

        private

        def default_options
          {
            attachments: [], # an array of attachment confirmation codes
            file_path: nil, # file path for the PDF file to be archived
            metadata: {}, # pertinent metadata for original file upload/submission
            parent_dir: 'vff-simple-forms' # S3 bucket base directory where files live
          }
        end
      end
    end
  end
end
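Note on `retry: 3` combined with `rescue => e`: Sidekiq retries a job only when an exception escapes `perform`. Whether these jobs are ever retried therefore depends on whether `handle_error` re-raises, and `handle_error` lives in `Utils`, which this diff does not show. The sketch below shows a log-then-re-raise pattern in plain Ruby. `LoggingJob` is a hypothetical class and does not depend on Sidekiq.

```ruby
require 'logger'
require 'stringio'

# Hypothetical job-like class used only to illustrate the pattern.
class LoggingJob
  def initialize(logger)
    @logger = logger
  end

  # Logs the failure, then re-raises so a job runner can see it and retry.
  def perform
    yield
  rescue StandardError => e
    @logger.error("LoggingJob failed: #{e.message}")
    raise
  end
end

log_output = StringIO.new
job = LoggingJob.new(Logger.new(log_output))

# The error is logged and still reaches the caller.
propagated =
  begin
    job.perform { raise 'S3 upload timed out' }
    false
  rescue RuntimeError
    true
  end
```

If `handle_error` already re-raises, the jobs above behave like this sketch and need no change.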
@@ -0,0 +1,163 @@
# frozen_string_literal: true

require 'csv'
require 'fileutils'

# built in accordance with the following documentation:
# https://github.com/department-of-veterans-affairs/va.gov-team-sensitive/blob/master/platform/practices/zero-silent-failures/remediation.md
module SimpleFormsApi
  module S3
    class SubmissionArchiveBuilder < Utils
      def initialize(benefits_intake_uuid: nil, submission: nil, **options) # rubocop:disable Lint/MissingSuper
        defaults = default_options.merge(options)

        @submission = submission || FormSubmission.find_by(benefits_intake_uuid:)
        raise 'Submission was not found' unless @submission

        @benefits_intake_uuid = @submission.benefits_intake_uuid

        assign_instance_variables(defaults)
      end

      def run
        FileUtils.mkdir_p(temp_directory_path)

        process_submission_files

        temp_directory_path
      rescue => e
        handle_error("Failed building submission: #{submission.id}", e, { benefits_intake_uuid: })
      end

      private

      attr_reader :attachments, :benefits_intake_uuid, :file_path, :include_json_archive, :include_manifest,
                  :include_text_archive, :metadata, :parent_dir, :submission

      def default_options
        {
          attachments: [], # an array of attachment confirmation codes
          file_path: nil, # file path for the PDF file to be archived
          include_json_archive: true, # include the form data as a JSON object
          include_manifest: true, # include a CSV file containing manifest data
          include_text_archive: true, # include the form data as a text file
          metadata: {}, # pertinent metadata for original file upload/submission
          parent_dir: 'vff-simple-forms' # S3 bucket base directory where files live
        }
      end

      def process_submission_files
        write_pdf
        write_as_json_archive if include_json_archive
        write_as_text_archive if include_text_archive
        write_attachments unless attachments.empty?
        write_manifest if include_manifest
        write_metadata
      end

      def write_pdf
        write_tempfile(submission_pdf_filename, File.read(generate_pdf_content))
      end

      # TODO: this will be pulled out to be more team agnostic
      def generate_pdf_content
        return file_path if file_path

        form_number = SimpleFormsApi::V1::UploadsController::FORM_NUMBER_MAP[submission.form_type]
        form = "SimpleFormsApi::#{form_number.titleize.gsub(' ', '')}".constantize.new(form_data_hash)
        filler = SimpleFormsApi::PdfFiller.new(form_number:, form:)

        @file_path = filler.generate(timestamp: submission.created_at)
        @metadata = SimpleFormsApiSubmission::MetadataValidator.validate(
          form.metadata,
          zip_code_is_us_based: form.zip_code_is_us_based
        )

        form.handle_attachments(file_path) if %w[vba_40_0247 vba_20_10207 vba_40_10007].include? form_number

        @attachments = form.get_attachments if form_number == 'vba_20_10207'
        @file_path
      end

      def form_data_hash
        @form_data_hash ||= JSON.parse(submission.form_data)
      end

      def submission_pdf_filename
        @submission_pdf_filename ||= "form_#{form_data_hash['form_number']}.pdf"
      end

      def error_details(error)
        "#{error.message}\n\n#{error.backtrace.join("\n")}"
      end

      def write_as_json_archive
        write_tempfile('form_json_archive.json', JSON.pretty_generate(form_data_hash))
      end

      def write_as_text_archive
        form_data_hash['claim_date'] ||= submission.created_at.iso8601
        write_tempfile('form_text_archive.txt', form_data_hash.to_s)
      end

      def write_metadata
        write_tempfile('metadata.json', metadata.to_json)
      end

      def write_attachments
        log_info("Processing #{attachments.count} attachments")
        attachments.each_with_index { |upload, i| process_attachment(i + 1, upload) }
        write_attachment_failure_report if attachment_failures.present?
      rescue => e
        handle_upload_error(e)
      end

      def process_attachment(attachment_number, guid)
        log_info("Processing attachment ##{attachment_number}: #{guid}")
        record = PersistentAttachment.find_by(guid:)
        # Check for a missing record before calling #to_pdf, which would raise NoMethodError on nil.
        raise 'Local record not found' unless record

        write_tempfile("attachment_#{attachment_number}.pdf", record.to_pdf)
      rescue => e
        # Record the failure and continue, so later attachments and the failure report are still written.
        attachment_failures << { guid:, error: e.message }
        handle_error('Attachment failure.', e)
      end

      def write_manifest
        file_name = "submission_#{benefits_intake_uuid}_#{submission.created_at}_manifest.csv"
        file_path = File.join(temp_directory_path, file_name)

        CSV.open(file_path, 'wb') do |csv|
          csv << ['Submission DateTime', 'Form Type', 'VA.gov ID', 'Veteran ID', 'First Name', 'Last Name']
          csv << [
            submission.created_at,
            form_data_hash['form_number'],
            benefits_intake_uuid,
            metadata['fileNumber'],
            metadata['veteranFirstName'],
            metadata['veteranLastName']
          ]
        end

        file_path
      end

      def write_attachment_failure_report
        write_tempfile('attachment_failures.txt', JSON.pretty_generate(attachment_failures))
      end

      def write_tempfile(file_name, payload)
        # File.join keeps the path valid whether or not temp_directory_path ends with a slash.
        File.write(File.join(temp_directory_path, file_name), payload)
      end

      def attachment_failures
        @attachment_failures ||= []
      end

      def temp_directory_path
        @temp_directory_path ||= Rails.root.join("tmp/#{benefits_intake_uuid}-#{SecureRandom.hex}/").to_s
      end
    end
  end
end
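Note on attachment handling: `write_attachments` is meant to keep a list of failures and write a report, which works only if one bad attachment does not stop the loop. The sketch below shows that collect-and-continue pattern in plain Ruby. `ATTACHMENT_STORE` is a hypothetical in-memory stand-in for `PersistentAttachment` lookups, and `collect_attachments` is not a method from the PR.

```ruby
require 'json'

# Hypothetical in-memory stand-in for PersistentAttachment records.
ATTACHMENT_STORE = { 'guid-1' => 'pdf-bytes-1', 'guid-3' => 'pdf-bytes-3' }.freeze

# Processes every guid, recording failures instead of aborting on the first one.
# Returns the payloads keyed by output file name, plus the failure list.
def collect_attachments(guids, store)
  written = {}
  failures = []

  guids.each_with_index do |guid, index|
    payload = store[guid]
    raise "Local record not found: #{guid}" unless payload

    written["attachment_#{index + 1}.pdf"] = payload
  rescue StandardError => e
    failures << { guid: guid, error: e.message }
  end

  [written, failures]
end

written, failures = collect_attachments(%w[guid-1 guid-2 guid-3], ATTACHMENT_STORE)

# Plain hashes serialize predictably, unlike raw exception objects.
failure_report = JSON.pretty_generate(failures)
```

Storing `{ guid:, error: }` hashes rather than exception objects keeps the report readable and ties each failure to the attachment that caused it.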
@@ -0,0 +1,57 @@
# frozen_string_literal: true

module SimpleFormsApi
  module S3
    class SubmissionArchiveHandler < Utils
      attr_reader :attachments, :benefits_intake_uuids, :parent_dir, :metadata, :file_path

      def initialize(benefits_intake_uuids: [], **options) # rubocop:disable Lint/MissingSuper
        defaults = default_options.merge(options)

        @benefits_intake_uuids = benefits_intake_uuids

        assign_instance_variables(defaults)
      end

      def run
        process_individual_submissions
        cleanup_tmp_files
        parent_dir
      end

      private

      def default_options
        {
          attachments: [], # an array of attachment confirmation codes
          file_path: nil, # file path for the PDF file to be archived
          metadata: {}, # pertinent metadata for original file upload/submission
          parent_dir: 'vff-simple-forms' # S3 bucket base directory where files live
        }
      end

      def submissions
        @submissions ||= FormSubmission.where(benefits_intake_uuid: benefits_intake_uuids)
      end

      def process_individual_submissions
        submissions.each_with_index do |sub, idx|
          message = "Processing submission: #{sub.benefits_intake_uuid} " \
                    "##{idx + 1} of #{submissions.count} total submissions"
          log_info(message, benefits_intake_uuid: sub.benefits_intake_uuid, submission_count: submissions.count)
          process_submission(sub.benefits_intake_uuid)
        end
      end

      def process_submission(benefits_intake_uuid)
        SubmissionArchiver.new(attachments:, file_path:, metadata:, parent_dir:, benefits_intake_uuid:).run
      rescue => e
        handle_error("Submission archiver failure: #{benefits_intake_uuid}", e, benefits_intake_uuid:)
      end

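      def cleanup_tmp_files
        # Same effect as `rm -f tmp/*` (top-level files only), but anchored to Rails.root and without a shell.
        # This still clears files that other in-flight jobs may have written to tmp.
        tmp_files = Dir.glob(Rails.root.join('tmp', '*').to_s).select { |path| File.file?(path) }
        FileUtils.rm_f(tmp_files)
      end
    end
  end
end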
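Note on `cleanup_tmp_files`: clearing everything directly under `tmp` can also remove files that belong to other jobs running at the same time. A narrower approach removes only the paths that carry a run's UUID prefix, sketched below with the standard library. The `<uuid>-<hex>` naming is borrowed from `SubmissionArchiveBuilder#temp_directory_path`. `cleanup_run_directories` is a hypothetical helper, and whether `SubmissionArchiver` uses the same layout is an assumption.

```ruby
require 'tmpdir'
require 'fileutils'
require 'securerandom'

# Hypothetical helper: removes only entries under `tmp_root` whose names start
# with one of the given UUIDs followed by a dash. Returns the removed paths.
def cleanup_run_directories(tmp_root, uuids)
  uuids.flat_map do |uuid|
    matches = Dir.glob(File.join(tmp_root, "#{uuid}-*"))
    FileUtils.rm_rf(matches)
    matches
  end
end

removed = nil
remaining = nil
Dir.mktmpdir('cleanup-demo') do |tmp_root|
  ours = File.join(tmp_root, "uuid-a-#{SecureRandom.hex}")
  theirs = File.join(tmp_root, 'unrelated-cache')
  FileUtils.mkdir_p([ours, theirs])
  File.write(File.join(ours, 'metadata.json'), '{}')

  removed = cleanup_run_directories(tmp_root, ['uuid-a'])
  remaining = Dir.children(tmp_root)
end
```

Only the directory created for `uuid-a` is removed; the unrelated entry is left in place.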