261 all chapters black box model comments #285

Merged
merged 12 commits into from
Jan 13, 2025
2 changes: 0 additions & 2 deletions analysis.qmd
@@ -123,8 +123,6 @@ While the scoping and design stages identified and described risks and uncertain

[Black box models](definitions_and_key_concepts.html/#black-box-models) such as Artificial Intelligence (AI) and machine learning models are not as transparent as traditionally coded models. This adds challenge to the assurance of these models as compared to other forms of analysis.

Assurance activities of these models during the analysis stage:

* should include the verification steps set out in the design stage
* should include validation and verification of automatic tests to ensure the models behave as expected
* may include performance testing in a live environment
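The automatic tests referred to above can be sketched as simple behavioural checks. This is a minimal illustration under invented assumptions: the `predict_risk_score` stand-in model, the expected ranges, the directional relationship and the benchmark value are all hypothetical.

```python
# Minimal sketch of automated behaviour checks for a model.
# The model, expected ranges and benchmark value are illustrative only.

def predict_risk_score(inputs):
    """Stand-in for a black box model: returns a score in [0, 1]."""
    return min(1.0, max(0.0, 0.3 + 0.01 * inputs["age"]))

def check_model_behaviour():
    """Return a list of failures; an empty list means the model behaves as expected."""
    failures = []

    # 1. Output range: scores should always lie within [0, 1].
    for age in (0, 40, 200):
        score = predict_risk_score({"age": age})
        if not 0.0 <= score <= 1.0:
            failures.append(f"score {score} outside [0, 1] for age {age}")

    # 2. Directional check: a relationship expected by the analyst should hold.
    if predict_risk_score({"age": 60}) < predict_risk_score({"age": 20}):
        failures.append("score unexpectedly decreases with age")

    # 3. Benchmark case: a known input should give a stable, signed-off result.
    baseline = predict_risk_score({"age": 40})
    if abs(baseline - 0.7) > 0.05:
        failures.append(f"benchmark case drifted: {baseline}")

    return failures

assert check_model_behaviour() == []
```

Checks like these can be run automatically whenever the model or its input data change, so unexpected behaviour is flagged before outputs are relied on.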
1 change: 1 addition & 0 deletions analytical_lifecycle.qmd
@@ -100,6 +100,7 @@ The analytical lifecycle is not a linear process. Where analysis is used on an o
* any software relied on continues to be supported and up to date
* the model continues to be calibrated appropriately (this is particularly important for [black box models](definitions_and_key_concepts.html#black-box-models))


Additionally, a robust version control process should be in place to ensure any changes to the analysis are appropriately assured.

## Urgent analysis
11 changes: 6 additions & 5 deletions definitions_and_key_concepts.qmd
@@ -39,13 +39,11 @@ This may include:

## Artificial Intelligence {.unnumbered}

Artificial Intelligence (AI) attempts to simulate human intelligence using techniques and methods such as [machine learning](#machine-learning), natural language processing, robotics, and generative AI. AI aims to perform tasks that typically require human intelligence, such as problem-solving, decision-making, and language understanding. AI models are usually considered [black box models](#black-box-models) and can be pre-trained or custom built.

## Black box models {.unnumbered}

The internal workings of black box models are not visible, easily understood, or succinctly explained. These models take input and produce output without providing clarity about the process used to arrive at the output. This also includes proprietary models with protected intellectual property. [Artificial Intelligence](#artificial-intelligence) models (including [machine learning](#machine-learning) models) can often be considered a type of black box model. Other forms of black box model may arise in future.

## Business critical analysis {.unnumbered}

@@ -88,7 +86,6 @@ The decisions log is a register of decisions, whether provided by the commission

A register of data provided by the commissioner or derived by the analysis that has been risk assessed and signed off by an appropriate governance group or stakeholder.


### User and technical documentation {.unnumbered}

All analysis shall have user documentation, even if the only user is the analyst leading the analysis. This documentation should include:
@@ -113,6 +110,10 @@ The Department for Energy Security and Net Zero (DESNZ) and Department for Busin

:::

## Machine Learning {.unnumbered}

Machine Learning uses algorithms to learn from patterns in data without needing to programme explicit business rules. Some models are white box models and others are considered [black box models](#black-box-models). Machine Learning is a subset of [Artificial Intelligence](#artificial-intelligence).
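To illustrate learning from patterns rather than from explicitly programmed rules, the sketch below fits a straight line by gradient descent. The data, the hidden rule (y = 2x + 1), the learning rate and the iteration count are all invented for the example.

```python
# Illustrative sketch: a model "learns" a rule from data rather than having
# the rule explicitly programmed. Data and values are invented for the example.

# Training data that follows the (unstated) rule y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

# Fit y = w*x + b by gradient descent -- no business rule is coded anywhere
w, b = 0.0, 0.0
for _ in range(2000):
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
    w -= 0.01 * grad_w
    b -= 0.01 * grad_b

# A simple linear model like this is usually considered white box: the
# learned parameters w and b can be inspected and explained directly.
assert abs(w - 2.0) < 0.1 and abs(b - 1.0) < 0.1
```

More complex models (deep neural networks, large ensembles) learn in a broadly similar way but expose millions of parameters that cannot be interpreted individually, which is why they tend to be treated as black box models.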

## Multi-use models {.unnumbered}

Multi-use models are used by more than one user or group of users for related but different purposes. These are often complex and large.
12 changes: 9 additions & 3 deletions delivery_and_communication.qmd
@@ -146,9 +146,15 @@ The approver is responsible for the sign-off that confirms that all risks and et

This may include:

## Black-box models and the delivery, communication and sign-off stage

The Approver is responsible for signing off that all risks and ethical considerations around the use of black-box models have been addressed.

This may include:

* formal consultation and approval by an ethics committee or similar, depending on internal governance structures
* provisions for regular review, including whether ongoing peer review is required to ensure the latest guidance and assurance methodology is taken into account
* communicating the "health" of the model at regular intervals to the commissioner, for example confirming that it continues to behave as expected and checking for data drift (a decrease in the model's performance over time). Data drift can occur when a model trained on historical data is then used on current data in production: the model may become less effective because it is no longer conditioned on the current state of the data.


You can read more in the [Introduction to AI assurance](https://www.gov.uk/government/publications/introduction-to-ai-assurance).
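A model "health" check of this kind can be sketched as a comparison of recent performance against the baseline agreed at sign-off. The metric, the data and the tolerance below are illustrative assumptions, not a prescribed method.

```python
# Minimal sketch of a model "health" check for data drift.
# The metric, data and tolerance are illustrative only.

def accuracy(predictions, actuals):
    """Fraction of predictions that match the observed outcomes."""
    correct = sum(p == a for p, a in zip(predictions, actuals))
    return correct / len(actuals)

def drift_alert(baseline_accuracy, recent_accuracy, tolerance=0.05):
    """Flag when recent performance falls materially below the baseline
    agreed at sign-off -- a common symptom of data drift."""
    return (baseline_accuracy - recent_accuracy) > tolerance

# Accuracy measured when the model was signed off
baseline = accuracy([1, 0, 1, 1, 0, 1, 1, 0, 1, 1],
                    [1, 0, 1, 1, 0, 1, 1, 0, 1, 0])  # 0.9

# Accuracy on the most recent window of production data
recent = accuracy([1, 1, 1, 0, 0, 1, 0, 0, 1, 1],
                  [0, 0, 1, 1, 0, 1, 1, 0, 1, 0])  # 0.5

# Performance has degraded beyond tolerance: report to the commissioner
assert drift_alert(baseline, recent)
```

Running a check like this at agreed intervals gives the commissioner an evidence-based statement of whether the model is still fit for use, rather than an assumption that it is.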

16 changes: 9 additions & 7 deletions design.qmd
@@ -105,14 +105,16 @@ Using [black box models](definitions_and_key_concepts.qmd/#black-box-models) pla
This [guidance on AI assurance](https://www.gov.uk/government/publications/introduction-to-ai-assurance/introduction-to-ai-assurance) outlines considerations for the design of AI models, including risk assessment, impact assessment, bias audits and compliance audits.

In the design of AI and machine learning models, the analyst should:

* define the situation they wish to model
* define the prediction they wish to make
* assess the quality of the data that could be used to make the prediction, including the data used in any pre-trained models
* carry out a literature review to identify appropriate modelling, validation and verification methods, and document the rationale for selecting their approach
* consider the appropriate training, validation and testing strategy for their models - it is usual to design a model with a fraction of the data, validate it with a separate portion, and then test the final model with the data that was not used in the design
* consider their strategy when testing a pre-trained model, including appropriate validation methods such as calculating similarity to labelled images or ground truths for generative AI
* consider developing automatic checks to identify if the model is behaving unexpectedly - this is important if the model is likely to be used frequently to make regular decisions or is deployed into a production environment
* consider the plan for maintenance and continuous review, including the thresholds or timeline to retrain the model and the resources required to support this - see the [maintenance and continuous review](analytical_lifecycle.qmd/#maintenance-and-continuous-review) section
* consider referring the model to their ethics committee, or a similar group depending on their internal governance structures - see the [Data Ethics Framework](https://www.gov.uk/government/publications/data-ethics-framework/data-ethics-framework-2020)
* consider setting up a peer or academic review process to test the methodologies and design decisions
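The training, validation and testing split described in this list can be sketched as follows. The proportions and the fixed seed are illustrative choices, not requirements; a real project would choose them to suit the data.

```python
# Sketch of a training / validation / testing split.
# Proportions and seed are illustrative choices only.
import random

def split_data(records, train_frac=0.6, valid_frac=0.2, seed=42):
    """Shuffle records, then partition them into disjoint
    training, validation and testing sets."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_valid = int(len(shuffled) * valid_frac)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # held back until final testing
    return train, valid, test

train, valid, test = split_data(list(range(100)))
assert len(train) == 60 and len(valid) == 20 and len(test) == 20
# The three sets must not overlap, so the test data genuinely measures
# performance on data the model has never seen.
assert not (set(train) & set(valid)) and not (set(valid) & set(test))
```

Holding the test set back until the design is fixed guards against the model being tuned, even inadvertently, to the data it will later be judged on.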

## Multi-use models

3 changes: 2 additions & 1 deletion engagement_and_scoping.qmd
@@ -63,11 +63,12 @@ The output of the engagement and scoping stage should be a [specification docume
## Treatment of uncertainty

The engagement and scoping stage will inform the treatment of uncertainty by:

* providing a clear definition of the analytical question
* identifying sources of high or intractable uncertainty
* establishing an understanding of how the analysis will inform decisions

The Analyst and Commissioner should also clarify risks and their potential effects on the outcomes to inform decisions around [proportionate assurance](proportionality.qmd). This includes a discussion of ethical considerations. Constraints around resources and timelines should also be clarified and agreed.

You can read more about uncertainty in engagement and scoping in the [Uncertainty Toolkit](https://analystsuncertaintytoolkit.github.io/UncertaintyWeb/chapter_2.html#Jointly_agreeing_how_uncertainty_should_be_used).

## Black box models
2 changes: 2 additions & 0 deletions proportionality.qmd
@@ -148,3 +148,5 @@ The difference will be in ensuring the assessment of risks and the applied mitig

Increasingly, analysis may be underpinned by Artificial Intelligence (AI) or other forms of black box models. With these models the need to understand business risk remains, and the same structured approach to assessing business risk should be taken. The challenges will be in ensuring the transparency of the analysis, securing a suitable mix of experts, and developing an understanding of what mitigations are possible.

Note that the [Generative AI Framework for HMG](https://www.gov.uk/government/publications/generative-ai-framework-for-hmg/generative-ai-framework-for-hmg-html) highlights assurance in Principle 10.