generated from NHSDigital/nhs-notify-repository-template
CCM-13295: Ingest reporting metadata into Glue table #199
Open: gareth-allan wants to merge 6 commits into main from feature/CCM-13295_reporting_data_ingestion
552dcf1 to 5a49fc8
tdroza-nhs requested changes, Feb 6, 2026
    period            = 60
    statistic         = "Sum"
    threshold         = 1
    alarm_description = "This metric monitors failed step function executions"
Suggested change
    - alarm_description = "This metric monitors failed step function executions"
    + alarm_description = "This metric monitors aborted step function executions"
infrastructure/terraform/components/dl/scheduler_schedule_sf_metadata_refresh_scheduler.tf
Outdated
simonlabarere previously approved these changes, Feb 6, 2026
simonlabarere previously approved these changes, Feb 9, 2026
aidenvaines-cgi previously approved these changes, Feb 9, 2026
tdroza-nhs previously approved these changes, Feb 9, 2026
b3d1be2
simonlabarere approved these changes, Feb 9, 2026
Description
This PR automates refreshing the metadata held in the AWS Glue table for the Digital Letters event record data stored in S3. The infrastructure to record events to S3 and to set up the Glue table was added in #190.

This is achieved by adding a Step Function that runs the `MSCK REPAIR TABLE event_record` command against the DL environment's reporting database. As the `event_record` table is already configured to point at the correct S3 location, this causes the metadata to be refreshed and any new or updated partitions to be detected.

Changes:
- […] `MSCK REPAIR TABLE` command against the `event_record` table. This is based on the housekeeping function in the reporting domain.
- […] `MSCK REPAIR TABLE` command in core
- […] `reasonCode` and `reasonText` columns to the `event_record` table and ensured the report-event-transformation lambda maps them to the flattened object it produces
- […] conditions to all SQS IAM policies that allowed EventBridge to send events to the queue, so they are restricted to only allow events from the expected rule(s)

Context
This is required as the solution implemented in #190 did not automatically import the events recorded in S3 into the Glue table, so Athena queries would not return any data unless the `MSCK REPAIR TABLE` command was run manually. Automating this refresh means that new events will become available to Athena queries on a regular basis.

Validation
Verifying that Events are (Eventually) Visible in Athena Without Manual Intervention
Sent a `uk.nhs.notify.digital.letters.print.pdf.analysed.v1` event to the event bus.
Queried Athena.
Waited until the next run of the step function and then queried Athena again.
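
The state machine that performs the scheduled refresh is not shown in this conversation. As a rough sketch only, assuming it uses Step Functions' optimised Athena integration (`athena:startQueryExecution.sync`), a definition of that shape could be built as below. The Python here just assembles the Amazon States Language JSON; the database name, workgroup and results location are placeholders, not values taken from this PR.

```python
import json

# Placeholder values for illustration only; none of these come from the PR.
DATABASE = "example_reporting_db"
WORKGROUP = "example_workgroup"
OUTPUT_LOCATION = "s3://example-athena-results/"


def build_refresh_definition(database: str, workgroup: str, output_location: str) -> dict:
    """Return an Amazon States Language definition with a single Athena task."""
    return {
        "Comment": "Refresh partition metadata for event_record",
        "StartAt": "RepairTable",
        "States": {
            "RepairTable": {
                "Type": "Task",
                # Optimised Athena integration; the .sync suffix makes the
                # state wait until the query has finished.
                "Resource": "arn:aws:states:::athena:startQueryExecution.sync",
                "Parameters": {
                    "QueryString": "MSCK REPAIR TABLE event_record",
                    "QueryExecutionContext": {"Database": database},
                    "WorkGroup": workgroup,
                    "ResultConfiguration": {"OutputLocation": output_location},
                },
                "End": True,
            }
        },
    }


definition = build_refresh_definition(DATABASE, WORKGROUP, OUTPUT_LOCATION)
print(json.dumps(definition, indent=2))
```

A schedule (for example an EventBridge Scheduler schedule) would then start this state machine at whatever interval suits the reporting needs; the interval used by the PR is not stated here.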

Verifying the New Columns are Recorded as Expected
Sent a `uk.nhs.notify.digital.letters.print.letter.transitioned.v1` event to the event bus.
Waited until the next run of the step function and then queried Athena.
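
To make the column mapping concrete, here is a hypothetical sketch of how a transformation step might carry `reasonCode` and `reasonText` into a flattened record. The event shape (a `data` object holding both fields), the other output keys and the function name are assumptions for illustration; the actual report-event-transformation lambda is not shown in this conversation and may be written in a different language.

```python
from typing import Any, Dict, Optional


def flatten_event(event: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """Flatten a nested event into a single-level record (hypothetical shape)."""
    data = event.get("data") or {}
    return {
        "id": event.get("id"),
        "type": event.get("type"),
        # Events that do not carry these fields yield None, so the
        # corresponding table columns are simply left empty.
        "reasonCode": data.get("reasonCode"),
        "reasonText": data.get("reasonText"),
    }


# Example input with made-up values.
example = {
    "id": "example-id",
    "type": "uk.nhs.notify.digital.letters.print.letter.transitioned.v1",
    "data": {"reasonCode": "EXAMPLE_CODE", "reasonText": "Example reason"},
}
record = flatten_event(example)
print(record)
```

Reading the fields with `.get` keeps older event types, which have no reason fields, flowing through unchanged.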

Type of changes
Checklist
Sensitive Information Declaration
To ensure the utmost confidentiality and protect your and others' privacy, we kindly ask you NOT to include PII (Personal Identifiable Information) / PID (Personal Identifiable Data) or any other sensitive data in this PR (Pull Request) and the codebase changes. We will remove any PR that contains sensitive information. We really appreciate your cooperation in this matter.