diff --git a/rmarkdown/Page_2_Question_Bank_Use.rmd b/rmarkdown/Page_2_Question_Bank_Use.rmd
new file mode 100644
index 0000000..aef3cd0
--- /dev/null
+++ b/rmarkdown/Page_2_Question_Bank_Use.rmd
@@ -0,0 +1,73 @@
+---
+title: "How to use the question bank"
+output:
+ html_document:
+ css: "question_bank.css"
+ toc: yes
+ toc_depth: 4
+ toc_float:
+ collapsed: yes
+ pdf_document:
+ toc: yes
+ toc_depth: '4'
+---
+
+```{r global-options, include=FALSE}
+# Set echo=FALSE for all chunks
+knitr::opts_chunk$set(echo=FALSE)
+```
+
+---
+
+## Question bank structure and how to use
+
+
+This question bank groups questions into sections, some of which have further sub-sections. These sections and sub-sections are as follows:
+
+
+
+1. Quality dimensions as defined by the Data Management Association UK (DAMA), paired into the following:
+
+ * accuracy and validity
+
+ * completeness and uniqueness
+
+ * consistency and timeliness
+
+
+
+2. Data linkage
+
+
+
+These sections and their sub-sections have been chosen to give you a wide selection of questions to gain further insights into administrative data quality. According to the [Code of Practice for Statistics](https://code.statisticsauthority.gov.uk/), “quality means that statistics fit their intended uses, are based on appropriate data and methods, and are not materially misleading.”
+
+In essence, quality centres on fitness for purpose, which includes asking:
+
++ Are the data good enough for what I want to use them for?
+
++ Did the statistic I produce meet the needs of the people who are using it?
+
+
+
+The questions in this question bank can be used to assess the different aspects of fitness for your use. Data are very rarely perfect for statistical and research purposes. Understanding which dimensions are important for your specific uses will help you decide whether the data are fit for purpose. To this end, the question bank has been designed to be flexible, and you can tailor your approach in proportion to your needs, for example, by making the questions relevant to the variables you are interested in. These questions provide a structure for assessing the data’s fitness for purpose and ensure that you cover the key issues that help you understand the data’s quality.
+
+
+The first key step is to identify which dataset you are interested in and wish to assess. You can then use the questions in the Administrative Data Quality Question Bank (ADQQB) and tailor them to find out more about that specific dataset.
+
+
+The Administrative Data Quality Question Bank (ADQQB) focuses on assessing the quality of data at input data level. Input data level refers to the point at which your organisation receives the data. Quality at this stage refers to how well the data fit the purpose(s) you want to use them for. For example, this could be the suitability of the data to produce statistics, or the suitability of the data to carry out analysis or research.
+
+
+
+We have included questions on the [DAMA quality dimensions](https://www.gov.uk/government/news/meet-the-data-quality-dimensions) because these dimensions are widely used across government to assess whether data are good enough to use, or whether improvements need to be made. We have also chosen to include questions on data linkage in data collection and production, as answers to these questions could supply further information and context around how the data are produced. They also provide a reminder that you should check in regularly with the data suppliers about any changes that may affect the resulting data.
+
+
+As we further develop the Administrative Data Quality Question Bank (ADQQB) beyond the current publication, we will add further sections. These will include output data quality: “how well your ‘final’ output meets your users’ needs”. We will do this by integrating relevant dimensions from the European Statistical System’s (ESS) dimensions of quality.
+
+
+
+
+
+
+
diff --git a/rmarkdown/Page_3_Quality_Dimensions.rmd b/rmarkdown/Page_3_Quality_Dimensions.rmd
new file mode 100644
index 0000000..0f83b5a
--- /dev/null
+++ b/rmarkdown/Page_3_Quality_Dimensions.rmd
@@ -0,0 +1,839 @@
+---
+title: "Quality dimensions"
+output:
+ html_document:
+ css: "question_bank.css"
+ toc: yes
+ toc_depth: 4
+ toc_float:
+ collapsed: yes
+ pdf_document:
+ toc: yes
+ toc_depth: '4'
+---
+
+```{r global-options, include=FALSE}
+# Set echo=FALSE for all chunks
+knitr::opts_chunk$set(echo=FALSE)
+```
+
+---
+
+This question bank includes data quality themes as defined by the Data Management Association UK (DAMA) dimensions outlined in [The Government Data Quality Framework](https://www.gov.uk/government/publications/the-government-data-quality-framework/the-government-data-quality-framework#Data-quality-dimensions). The dimensions and definitions used are the same as those outlined in our [Administrative Data Quality Framework (ADQF)](https://analysisfunction.civilservice.gov.uk/policy-store/quality-of-administrative-data-in-statistics/). The dimensions covered are:
+
++ accuracy
+
++ validity
+
++ completeness
+
++ uniqueness
+
++ consistency
+
++ timeliness
+
+We have used these because they were developed by experts in data quality to assess the fitness for purpose of data. Identifying which dimensions are important for you will help you decide how fit for purpose the data are for your needs.
+
+In future publications of this question bank, we intend to include questions based on relevant selected principles from the [European Statistics Code of Practice](https://ec.europa.eu/eurostat/web/quality/european-quality-standards/european-statistics-code-of-practice). The principles covered will be:
+
++ relevance: coverage, content, purpose and collection
+
++ accessibility and clarity: accessing the data, data format, availability of supporting information, quality and sufficiency of metadata, illustrations and accompanying advice
+
+Before going into the questions for each data quality dimension, we have provided some general questions, which you can use to ensure that you have a fundamental understanding of the data and their quality.
+
+## Questions to ask to gain insights into the data's quality in general
+
+
+
+
+ |
+ |
+
+
+
+
+ Q1 |
+ How are the data from this dataset collected? For example, through public contact with services over the phone, registration forms, etc. |
+
+
+
+
+
+ Q2 |
+ What organisation(s) collects the data? |
+
+
+
+ Q3 |
 Are there different organisations which collect different data or variables in the dataset? For example, where one organisation is responsible for collecting income-related data, another organisation is responsible for collecting demographic data, and these data are combined to create one composite dataset. |
+
+
+
+ Q4 |
 If so, which data are collected by which organisation? |



 Q5 |
 Are the data collected differently? |
+
+
+
+ Q6 |
+ Does this have an impact on the quality? |
+
+
+
+
+ Q7 |
+ How do suppliers quality assure the data? |
+
+
+
+
+ Q8 |
+ Are there any known quality issues? |
+
+
+
+
+ Q9 |
+ What thresholds have the suppliers put in place regarding the data's quality? For example, an acceptable number of duplicate records, or an acceptable amount of missing data. |
+
+
+
+
+ Q10 |
+ How is the quality for this dataset documented? |
+
+
+
+
+ Q11 |
+ Are there any supplementary documents related to the dataset that can be shared? For example, a data dictionary, a metadata list. |
+
+
+
+
+ Q12 |
+ Are there training manuals related to the work that can be shared? For example, for coding, updating or maintaining the dataset. |
+
+
+
+
+
+
+
+ |
+ |
+
+
+
+
+ Q13 |
+ How accurate are the supplied data? |
+
+
+
+ Q14 |
+ How well do the data meet the statistical use? |
+
+
+
+ Q15 |
 How accurate are the items, or variables, in the supplied data? |



 Q16 |
 How accurate are the units, or records, in the supplied data? |
+
+
+
+ Q17 |
+ What are the accuracy issues in the supplied data? |
+
+
+
+ Q18 |
+ If there are accuracy issues, how are they identified? For example, through a formal auditing process, or an automatic flagging system. |
+
+
+
+ Q19 |
+ What methods are implemented by the suppliers to prevent any accuracy issues? For example, checks built into the data collection instrument. |
+
+
+
+ Q20 |
 If there are accuracy issues, how do the suppliers resolve them, and for which variables and types of records? |
+
+
+
+
+ Q21 |
+ What data accuracy issues are not addressed? |
+
+
+
+ Q22 |
+ Why are the issues not addressed? |
+
+
+
+ Q23 |
+ What happens to data accuracy issues that are not addressed? For example, logged or reported to a specific team. |
+
+
+
+ Q24 |
+ How are users of the data informed about these data accuracy issues? |
+
+
+
+
+
+### Invalid entry questions
+
+
+
+ |
+ |
+
+
+
+ Q31 |
 What kinds of errors, typos or mistakes are there in the data? |
+
+
+
+
+ Q32 |
+ Which variables have typos, errors or mistakes? |
+
+
+
+
+ Q33 |
+ Which types of records have typos, errors or mistakes? |
+
+
+
+
+ Q34 |
+ What are the causes of these errors, typos or mistakes in the data? |
+
+
+
+
+ Q35 |
+ How are errors, typos or mistakes identified in the data? |
+
+
+
+
+ Q36 |
+ How are errors, typos or mistakes in the data resolved? |
+
+
+
+
+
+
+
+ |
+ |
+
+
+
+ Q37 |
+ How complete or incomplete are the data? |
+
+
+
+
+ Q38 |
+ How many records in the data are considered complete, or to have good coverage? |
+
+
+
+
+
+ Q39 |
+ What types of records need to be in the data to be considered complete? |
+
+
+
+
+ Q40 |
+ What types of records are missing from the data where they should be included? |
+
+
+
+
+ Q41 |
+ Why are they missing? |
+
+
+
+
+
+ Q42 |
+ What types of records are included in the data where they should not be? |
+
+
+
+ Q43 |
+ Why are these included? |
+
+
+
+
+ Q44 |
+ How are records missing from the data identified as missing? |
+
+
+
+ Q45 |
+ How are records missing in the data resolved? |
+
+
+
+
+ Q46 |
 How are records that are wrongly included in the data identified? |
+
+
+
+
+ Q47 |
 How are records that are wrongly included in the data resolved? |
+
+
+
+
+ Q48 |
 Unit imputation is when missing data are replaced with a whole record or unit. Are any of the records in the supplied data imputed? |
+
+
+
+
+ Q49 |
+ Why are these records imputed? |
+
+
+
+ Q50 |
+ How are they imputed? |
+
+
+
+
+ Q51 |
+ What changes have been made to exclusion and inclusion criteria in the data over time? For example, due to policy changes. |
+
+
+
+
+
+
+### Item completeness questions
+
+
+
+ |
+ |
+
+
+
+ Q52 |
+ Which variables or values have missing data? |
+
+
+
+ Q53 |
+ If there are missing data in variables or values, are there any particular types of records that have data within variables or values missing? |
+
+
+
+
+ Q54 |
+ How are missing data within variables or values identified? |
+
+
+
+
+ Q55 |
+ How are missing data within variables or values resolved? |
+
+
+
+
+ Q56 |
 How are data, variables or values that are wrongly included in the dataset identified? |
+
+
+
+
+ Q57 |
 How are data, variables or values that are wrongly included in the dataset resolved? |
+
+
+
+
+ Q58 |
 Item imputation is when missing data are replaced with a value or variable. Which variables or values in the data are imputed? |
+
+
+
+
+ Q59 |
 Why are these variables or values imputed? |
+
+
+
+
+ Q60 |
+ How are they imputed? |
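
As an illustration of how the item completeness questions above might be explored in practice, the following minimal Python sketch counts the share of records with a missing value for each variable. The sample records, the variable names and the set of values treated as missing are all assumptions made for this example; they are not taken from any real dataset or supplier specification.

```python
from collections import Counter

# Values treated as "missing" are an assumption for this sketch;
# suppliers may use other codes, so check the data dictionary.
MISSING_MARKERS = {None, "", "NA", "N/A"}


def missing_rate_by_variable(records):
    """Return the share of records with a missing value for each variable."""
    variables = sorted({name for record in records for name in record})
    missing = Counter()
    for record in records:
        for name in variables:
            # A variable that is absent from a record also counts as missing.
            if record.get(name) in MISSING_MARKERS:
                missing[name] += 1
    total = len(records)
    return {name: missing[name] / total for name in variables} if total else {}


# Hypothetical records, invented purely for illustration.
records = [
    {"id": "1", "postcode": "AB1 2CD", "income": "21000"},
    {"id": "2", "postcode": "", "income": "NA"},
    {"id": "3", "postcode": "EF3 4GH", "income": None},
    {"id": "4", "postcode": "IJ5 6KL"},
]
rates = missing_rate_by_variable(records)
```

In this made-up sample, `rates` shows that a quarter of records lack a postcode and three quarters lack an income value, which could prompt questions to the supplier about why particular variables are affected.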
+
+
+
+
+
+
+ |
+ |
+
+
+
+ Q61 |
 How often do identical records appear in the data more than once? |
+
+
+
+
+
+
+ Q62 |
 Should any records appear more than once in the data? |
+
+
+
+
+
+ Q63 |
 If records appear more than once, what is the reason? |
+
+
+
+
+
+ Q64 |
+ How unique are the records in the data? |
+
+
+
+
+
+ Q65 |
+ What type of records appear in the data more than once? |
+
+
+
+
+
+ Q66 |
+ What does each row in the dataset represent? |
+
+
+
+
+
+ Q67 |
+ How is each unique record identified? For example, a record ID number. |
+
+
+
+
+
+ Q68 |
+ What measures are carried out to prevent records appearing more than once in the data during data collection? |
+
+
+
+
+
+ Q69 |
+ What measures are carried out to prevent records appearing more than once in the data during data processing? |
+
+
+
+
+
+ Q70 |
+ How are records that appear more than once in the data identified? |
+
+
+
+
+
+ Q71 |
+ What do duplicate records look like in the data? |
+
+
+
+
+
+ Q72 |
+ How are records that appear more than once in the data resolved? |
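
The uniqueness questions above ask how duplicate records are identified and what they look like. A minimal sketch of one possible check is shown below, assuming hypothetical field names (`record_id`, `surname`, `dob`) and invented sample records. It counts how often each key occurs, first on the record identifier alone and then on a combination of fields after light normalisation. It is not a description of any supplier's actual method.

```python
from collections import Counter


def normalise(value):
    """Trim and lower-case text so trivial formatting differences do not hide duplicates."""
    return value.strip().lower() if isinstance(value, str) else value


def find_duplicates(records, key_fields):
    """Return each key that occurs more than once, with its count."""
    counts = Counter(
        tuple(normalise(record.get(field)) for field in key_fields)
        for record in records
    )
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical records, invented purely for illustration.
records = [
    {"record_id": "001", "surname": "Smith", "dob": "1980-01-31"},
    {"record_id": "002", "surname": "smith ", "dob": "1980-01-31"},
    {"record_id": "003", "surname": "Jones", "dob": "1975-06-02"},
    {"record_id": "003", "surname": "Jones", "dob": "1975-06-02"},
]

exact_id_duplicates = find_duplicates(records, ["record_id"])
likely_person_duplicates = find_duplicates(records, ["surname", "dob"])
```

Checking the identifier alone finds only the repeated `003` record, while the combined key also flags the two "Smith" records that carry different identifiers. Whether such records are true duplicates is a question for the data supplier.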
+
+
+
+
+
+
+
+ |
+ |
+
+
+
+ Q73 |
 How consistent are the data between variables? |
+
+
+
+
+ Q74 |
 Which variables have inconsistent information? What is the reason for this? |
+
+
+
+
+ Q75 |
+ Which types of records, if any, have inconsistent information? |
+
+
+
+
+ Q76 |
+ What is the reason for this? |
+
+
+
+
+ Q77 |
 If you have a composite dataset (a dataset compiled from different sources), how consistent are the data across the different sources? |
+
+
+
+
+ Q78 |
 How consistent are the data over time? |
+
+
+
+
+ Q79 |
+ Have there been any changes to the way the data are collected over time? |
+
+
+
+
+ Q80 |
+ What changes have there been to the variables over time? For example, changes to definition. |
+
+
+
+
+ Q81 |
+ Which variables, if any, were changed? |
+
+
+
+
+ Q82 |
+ What is used to measure consistency or identify inconsistencies in the supplied data? |
+
+
+
+
+ Q83 |
 What aspects of the data are checked for consistency? For example, all data items, certain variables, or certain time points. |
+
+
+
+
+ Q84 |
+ How are inconsistencies in the data resolved? |
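
To make the idea of consistency between variables more concrete, the sketch below checks whether a recorded age agrees with a recorded date of birth at a chosen reference date. The variable names (`dob`, `age`), the reference date and the sample records are assumptions for illustration only; a real check would depend on the variables and rules agreed with the data supplier.

```python
from datetime import date


def age_on(dob, reference):
    """Age in whole years on the reference date."""
    had_birthday = (reference.month, reference.day) >= (dob.month, dob.day)
    return reference.year - dob.year - (0 if had_birthday else 1)


def inconsistent_records(records, reference):
    """Return the ids of records whose recorded age disagrees with their date of birth."""
    flagged = []
    for record in records:
        expected = age_on(record["dob"], reference)
        if record["age"] != expected:
            flagged.append(record["id"])
    return flagged


# Hypothetical reference date and records, invented purely for illustration.
reference_date = date(2024, 3, 31)
records = [
    {"id": "A", "dob": date(1990, 3, 31), "age": 34},
    {"id": "B", "dob": date(1990, 4, 1), "age": 34},  # birthday not yet reached
    {"id": "C", "dob": date(2000, 1, 15), "age": 24},
]
flagged = inconsistent_records(records, reference_date)
```

Here only record `B` is flagged, because its recorded age is one year higher than its date of birth implies at the reference date.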
+
+
+
+
+## Timeliness definition
+
+Timeliness refers to how well the data reflect the period they are supposed to represent. It also describes how up to date the data are.
+
+
+The attributes represented in some data might stay the same over time – e.g., the day you were born does not change, no matter how much time passes. Other attributes, such as income, may change.
+
+
+Your data are also ‘timely’ if the lag between their collection and their availability for your use is appropriate for your needs. Are the data available when expected and needed? Do they reflect the time they are supposed to?
+
+
+## Questions to ask to gain insights into timeliness of data
+
+
+
+
+ |
+ |
+
+
+
+ Q85 |
 When are the data collected? For example, continuously or over a certain timeframe. |
+
+
+
+
+ Q86 |
 Up to date refers to whether the data supplied are the latest version. For example, if new data are being collected but are not reflected in the current data, then the data are not up to date. |
+
+
+
+
+ Q87 |
 How up to date are the data at the point of being supplied? |
+
+
+
+
+ Q88 |
+ What can impact how up to date the data are? |
+
+
+
+
+ Q89 |
+ Reference dates refer to timestamps which indicate when the data have been changed. Are there any reference dates for each record? |
+
+
+
+
+ Q90 |
+ At what point of the data collection phase are reference dates produced? For example, when the data are collected, or when the data were last updated. |
+
+
+
+
+ Q91 |
 How up to date are the variables at the point of being supplied? |
+
+
+
+
+ Q92 |
 Which types of records do not have up to date information in these variables? |
+
+
+
+ Q93 |
+ What methods are used to check that the data are up to date? |
+
+
+
+
+ Q94 |
 What methods are used to resolve data that are not up to date? |
+
+
+
+
+ Q95 |
+ How often are the data updated? |
+
+
+
+
+ Q96 |
+ What information is updated? |
+
+
+
+
+ Q97 |
 Are there any time lags between the reference dates in the data and the date on which the data are supplied? |
+
+
+
+
+ Q98 |
+ What are the different processes by which new records are added? |
+
+
+
+
+ Q99 |
 How often are existing records within the data updated with new information? |
+
+
+
+
+ Q100 |
+ What are the different processes by which existing records are updated with new information? |
+
+
+
+
+ Q101 |
+ What are the different processes by which variables or values are updated with new information? |
+
+
+
+
+ Q102 |
+ How often are the data updated to remove records from the data? |
+
+
+
+
+ Q103 |
+ Under what circumstances are records removed from the data? |
+
+
+
+
+ Q104 |
+ What are the different processes by which unwanted records are removed? |
+
+
+
+
+ Q105 |
 When records meet the criteria for removal, how long does it typically take for them to be deleted from the data supplied? |
+
+
+
+
+ Q106 |
 How often are existing records within the data updated to correct for any errors? |
+
+
+
+
+ Q107 |
 How often are variables within records updated to correct for any errors? |
+
+
+
+
+ Q108 |
+ What are the different processes by which existing records are updated to correct for any errors? |
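
The timeliness questions above refer to the lag between the reference dates in the data and the date on which the data are supplied. The following sketch shows one way such a lag could be measured, assuming each record carries a hypothetical `reference_date` field. The supply date, the sample records and the 90-day tolerance are illustrative assumptions only, not recommended values.

```python
from datetime import date
from statistics import median


def supply_lags(records, supplied_on):
    """Days between each record's reference date and the supply date."""
    return [(supplied_on - record["reference_date"]).days for record in records]


# Hypothetical supply date and records, invented purely for illustration.
supplied_on = date(2024, 4, 30)
records = [
    {"id": "1", "reference_date": date(2024, 4, 1)},
    {"id": "2", "reference_date": date(2024, 3, 31)},
    {"id": "3", "reference_date": date(2023, 4, 30)},
]

lags = supply_lags(records, supplied_on)
typical_lag_days = median(lags)

# Hypothetical tolerance: what counts as acceptable depends on your use.
max_acceptable_lag_days = 90
stale_ids = [
    record["id"]
    for record, lag in zip(records, lags)
    if lag > max_acceptable_lag_days
]
```

A summary such as the median lag, together with the list of records exceeding the tolerance, can help frame a conversation with the supplier about how often records are updated.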
+
+
+
+
+
+
+
+ |
+ |
+
+
+
+ Q109 |
+ How, if at all, are the data linked? |
+
+
+
+
+
+ Q110 |
+ Why is the data linkage conducted? |
+
+
+
+
+ Q111 |
+ Why are these methods used to link data? |
+
+
+
+
+ Q112 |
+ How often are the data linkage methods changed? |
+
+
+
+
+ Q113 |
+ What changes have been made? |
+
+
+
+
+ Q114 |
+ Why were these changes made? |
+
+
+
+
+ Q115 |
 What assessments, checks or preparations are carried out on the different datasets before data linkage occurs? |
+
+
+
+
+ Q116 |
+ What variables are used to link or match the datasets? |
+
+
+
+
+ Q117 |
+ Why are these variables used? |
+
+
+
+
+ Q118 |
 A match key is created by putting together pieces of information to create unique keys that are then hashed and used for automated matching. How does your organisation use match keys? |
+
+
+
+
+ Q119 |
+ How is the success of these match keys evaluated? |
+
+
+
+
+ Q120 |
+ How are decisions made about whether records should be declared a match or a non-match? |
+
+
+
+
+ Q121 |
+ How is data linkage quality assessed? |
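
The data linkage questions above describe a match key as pieces of information put together, hashed and used for automated matching. A minimal sketch of that idea follows, using SHA-256 from Python's standard `hashlib` module. The choice of fields, the normalisation steps, the `|` separator and the sample records are all assumptions for illustration. Organisations that link data will usually have their own agreed key definitions and may use salted or keyed hashing, which this sketch does not show.

```python
import hashlib


def match_key(record, fields):
    """Build a hashed match key from normalised pieces of information."""
    parts = [str(record.get(field, "")).strip().upper() for field in fields]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


def link(left, right, fields):
    """Return pairs of ids whose match keys agree exactly."""
    index = {}
    for record in right:
        index.setdefault(match_key(record, fields), []).append(record["id"])
    return [
        (record["id"], other_id)
        for record in left
        for other_id in index.get(match_key(record, fields), [])
    ]


# Hypothetical key definition and records, invented purely for illustration.
KEY_FIELDS = ["forename_initial", "surname", "dob"]
source_a = [
    {"id": "a1", "forename_initial": "J", "surname": "Smith", "dob": "1980-01-31"},
    {"id": "a2", "forename_initial": "R", "surname": "Jones", "dob": "1975-06-02"},
]
source_b = [
    {"id": "b1", "forename_initial": "j", "surname": " SMITH", "dob": "1980-01-31"},
    {"id": "b2", "forename_initial": "R", "surname": "Jonse", "dob": "1975-06-02"},
]
matches = link(source_a, source_b, KEY_FIELDS)
```

In this sample, `a1` and `b1` link despite differences in case and spacing, while the misspelt surname in `b2` prevents a match with `a2`. Missed matches of this kind are one reason to ask how the success of match keys is evaluated.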
+
+
+
+
+