PDF/UA-2 test corpus

Git User edited this page Sep 1, 2024 · 7 revisions

5 Version identification

5-t01-fail-a.pdf: PDF/UA Identifier is not present.

5-t02-fail-a.pdf: PDF/UA part number is set to 3.

5-t02-pass-a.pdf: PDF/UA-2 Identifier is present.

5-t03-fail-a.pdf: Property 'pdfuaid:part' of the PDF/UA Identification Schema doesn't have namespace prefix 'pdfuaid'.

5-t04-fail-a.pdf: Property 'pdfuaid:rev' of the PDF/UA Identification Schema doesn't have namespace prefix 'pdfuaid'.

5-t05-fail-a.pdf: Property 'pdfuaid:rev' of the PDF/UA Identification Schema is not a four-digit year.
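
The identifier cases above can be approximated by a small check. This is a minimal sketch, not the logic of any particular validator: the function name `pdfua2_identifier_problems` and its dict-of-strings input are hypothetical, and it assumes the XMP properties have already been extracted with their prefixes intact. A missing `pdfuaid:rev` is deliberately not flagged, because the list above only covers a malformed value.

```python
import re


def pdfua2_identifier_problems(props):
    """Return a list of problems found in already-extracted XMP properties.

    `props` maps qualified property names (e.g. "pdfuaid:part") to strings.
    A property stored without the "pdfuaid" prefix is simply not found here.
    """
    problems = []

    part = props.get("pdfuaid:part")
    if part is None:
        problems.append("pdfuaid:part is missing")
    elif part != "2":
        problems.append(f"pdfuaid:part is {part!r}, expected '2'")

    rev = props.get("pdfuaid:rev")
    # Assumption: only the shape of the value is checked (four ASCII digits).
    if rev is not None and not re.fullmatch(r"[0-9]{4}", rev):
        problems.append(f"pdfuaid:rev is {rev!r}, expected a four-digit year")

    return problems
```

An empty list means that none of the failure conditions listed in this section was detected.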

8.2 Logical structure

8.2.1 General

8.2.1-t01-fail-a.pdf: StructTreeRoot entry of the document catalog dictionary is not present.

8.2.2 Real content

8.2.2-t01-fail-a.pdf: Content is not marked as Artifact and is not tagged as real content.

8.2.2-t01-fail-b.pdf: Content stream doesn't contain an "Artifact" tag.

8.2.2-t01-fail-c.pdf: Image is not marked as Artifact or real content.

8.2.2-t01-pass-a.pdf: Content is marked as Artifact. Real content is tagged.

8.2.2-t01-pass-b.pdf: Content stream explicitly distinguished with the "Artifact" tag.

8.2.4 Structure types

8.2.4-t01-fail-a.pdf: RoleMap entry is present with value "Standard -> p". p is not a standard type; the standard type is P (case-sensitive).

8.2.4-t01-fail-b.pdf: RoleMap entry is present with the following values: • Text body -> p (case-sensitive) • Standard -> Text body

8.2.4-t01-fail-c.pdf: Standard structure element is remapped to an empty string. • Standard -> ""

8.2.4-t01-pass-a.pdf: RoleMap entry is present with value "Standard -> P".

8.2.4-t01-pass-b.pdf: RoleMap entry is present with the following values: • Text body -> P • Standard -> Text body

8.2.4-t02-fail-a.pdf: RoleMap entry is present. LI is role mapped to LI.

8.2.4-t02-fail-b.pdf: Non-standard structure elements "Text body" and "Standard" are rolemapped to each other.

8.2.4-t02-fail-c.pdf: Structure type http://iso.org/pdf2/ssn:Q is role mapped to itself in the RoleMapNs dictionary.

8.2.4-t02-pass-a.pdf: All structure elements in document have Standard type.
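
The t01 and t02 cases above amount to following a chain of role mappings until it either ends or loops. The sketch below illustrates that idea under stated assumptions: `resolve_role` is a hypothetical helper, the role map is a plain dict, `STANDARD_TYPES` is only an illustrative subset of the standard structure types, and namespaces (the RoleMapNS case in t02-fail-c) are ignored.

```python
# Illustrative subset only; the real list of standard structure types is longer.
STANDARD_TYPES = {"Document", "P", "H1", "L", "LI", "LBody",
                  "Table", "TR", "TH", "TD", "Figure"}


def resolve_role(role_map, name):
    """Follow role mappings starting at `name`.

    Returns the standard type the chain ends on, or None if it ends on a
    type that is not standard. Raises ValueError on a circular mapping.
    """
    seen = set()
    current = name
    while current in role_map:
        if current in seen:
            raise ValueError(f"circular role mapping involving {current!r}")
        seen.add(current)
        current = role_map[current]
    # The comparison is case-sensitive, so "p" does not match "P".
    return current if current in STANDARD_TYPES else None
```

With this sketch, `{"Standard": "Text body", "Text body": "P"}` resolves `"Standard"` to `"P"`, while `{"LI": "LI"}` and the mutual mapping of "Text body" and "Standard" both raise.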

8.2.4-t03-fail-a.pdf: Structure type Q is role mapped to type P within explicitly provided namespace http://www.w3.org/1999/xhtml.

8.2.4-t03-fail-b.pdf: Structure type http://iso.org/pdf2/ssn:Q is role mapped to http://www.w3.org/1999/xhtml:T. Structure type http://www.w3.org/1999/xhtml:T is role mapped to http://iso.org/pdf2/ssn:P.

8.2.4-t03-pass-a.pdf: Structure type Q is role mapped to type P within the namespace http://iso.org/pdf/ssn, which is not explicitly provided.

8.2.4-t04-fail-a.pdf: Structure type H1 is role mapped to type Standard structure type.

8.2.4-t04-pass-a.pdf: Standard structure type H1 is role mapped to non-standard structure type Standard.

8.2.5 Additional requirements for specific structure types

8.2.5.2 Document and DocumentFragment

8.2.5.2-t01-fail-a.pdf: The StructTreeRoot contains no element instead of a Document structure element.

8.2.5.2-t02-fail-a.pdf: The document structure element doesn't belong to the PDF 2.0 namespace.

8.2.5.12 Heading (Hn) and (H)

8.2.5.12-t01-fail-a.pdf: H structure type exists.

8.2.5.12-t01-pass-a.pdf: H1 structure type exists instead of H.

8.2.5.25 List (L, LI, LBody)

8.2.5.25-t01-fail-a.pdf: The ListNumbering attribute has value None on the List structure element.

8.2.5.26 Table (Table, TR, TH, TD, THead, TBody, TFoot)

8.2.5.26-t01-pass-a.pdf: Attribute RowSpan with value 2 is present in empty cell. Attribute ColSpan with value 3 is present in TH1.

8.2.5.26-t01-pass-b.pdf: Reading order of Table is good.

8.2.5.26-t03-fail-a.pdf: Attribute RowSpan with value 2 is present in empty cell. Attribute ColSpan with value 2 is present in TH1.

8.2.5.26-t03-fail-b.pdf: Attribute RowSpan with value 5 is present in Header 1.

8.2.5.26-t04-fail-a.pdf: Attribute RowSpan with value 1 is present in empty cell. Attribute ColSpan with value 3 is present in TH1.

8.2.5.26-t04-fail-b.pdf: THead contains all Table Header Cells. TBody contains all Table Data Cells.

8.2.5.26-t04-fail-c.pdf: All Table Header Cells are present in TR.

8.2.5.26-t05-fail-a.pdf: ID and Headers are not present for cell (row = 2, column = 2). Scope attribute is not defined for the TH cells "15-003" and "Failure Condition".

8.2.5.26-t05-pass-a.pdf: ID and Headers are not present in standard structure types of Table. Scope attribute is present for all TH cells.

8.2.5.26-t05-pass-b.pdf: Scope attribute is not present in TH cells. ID and Headers are present for all corresponding structure elements of Table.

8.2.5.26-t05-pass-c.pdf: Scope attribute is not present in TH cells. (Index) Table Heading has no associated subcells.

8.2.5.26-t05-pass-d.pdf: Scope attribute is not present in TH cells. ID and Headers are present for all corresponding structure elements of Table. Value of ID for Index is empty. Value of Headers for TH (Row) is empty.

8.2.5.26-t05-pass-e.pdf: Scope attribute is not present in TH cells. For Table Heading (Row), the Headers attribute references an ID that is not present.

8.2.5.26-t06-fail-a.pdf: ID is not present for cell (row = 2, column = 2). Scope attribute is not defined for the TH cells "15-003" and "Failure Condition". TD "a cell" references an undefined Header "12345".

8.2.5.28 Figure

8.2.5.28.2 Figure properties

8.2.5.28.2-t01-fail-a.pdf: Neither an ActualText nor an Alt entry is present in the Figure structure element.

8.2.5.28.2-t01-pass-a.pdf: Alt is present in Figure.

8.2.5.28.2-t01-pass-b.pdf: ActualText is present in Figure.

8.2.5.28.2-t01-pass-c.pdf: ActualText is present in Figure. Value of ActualText is empty.

8.2.5.29 Formula

8.2.5.29-t01-fail-a.pdf: Math structure element is not a child element of Formula structure element.

8.2.5.29-t01-pass-a.pdf: Math structure element is a child element of Formula structure element.

8.4 Text representation for content

8.4.4 Declaring natural language

8.4.4-t02-fail-a.pdf: Lang entry is present in Document Catalog. Primary-subtag of Lang is "portugues".

8.4.4-t02-fail-b.pdf: Lang entry is present in Document Catalog. Primary-subtag of Lang is a digit.

8.4.4-t02-fail-c.pdf: Lang entry is present in Document Catalog. Primary-subtag of Lang is empty.

8.4.4-t02-fail-d.pdf: Lang entry is present in Paragraph. Primary-subtag of Lang is "portugues".

8.4.4-t02-fail-e.pdf: Lang entry is present in Paragraph. Primary-subtag of Lang is a digit.

8.4.4-t02-fail-f.pdf: Lang entry is present in Paragraph. Primary-subtag of Lang is empty.

8.4.4-t02-fail-g.pdf: Lang entry is present in property list. Primary-subtag of Lang is empty.

8.4.4-t02-fail-h.pdf: Lang entry is present in property list. Primary-subtag of Lang is "portugues".

8.4.4-t02-fail-i.pdf: Lang entry is present in property list. Primary-subtag of Lang is a digit.

8.4.4-t02-fail-j.pdf: Lang entry is present in Document Catalog. Subtag of Lang is "1234abcde".

8.4.4-t02-fail-k.pdf: Lang entry is present in Document Catalog. Subtag of Lang is "пт".

8.4.4-t02-fail-l.pdf: Lang entry is present in Paragraph. Subtag of Lang is "1234abcdf".

8.4.4-t02-fail-m.pdf: Lang entry is present in Paragraph. Subtag of Lang is "ПТ".

8.4.4-t02-fail-n.pdf: Lang entry is present in Document Catalog. Value of Lang is empty.

8.4.4-t02-fail-o.pdf: Lang entry is present in Paragraph. Value of Lang is empty.

8.4.4-t02-fail-p.pdf: Lang entry is present in marked content sequence. Value of Lang is empty.

8.4.4-t02-pass-a.pdf: Lang entry is present in Document Catalog. Primary-subtag of Lang is "portugue".

8.4.4-t02-pass-b.pdf: Lang entry is present in Document Catalog. Primary-subtag of Lang is "p".

8.4.4-t02-pass-c.pdf: Lang entry is present in Paragraph structure element. Primary-subtag of Lang is "portugue".

8.4.4-t02-pass-d.pdf: Lang entry is present in Paragraph structure element. Primary-subtag of Lang is "p".

8.4.4-t02-pass-e.pdf: Lang entry is present in property list. Primary-subtag of Lang is "portugue".

8.4.4-t02-pass-f.pdf: Lang entry is present in property list. Primary-subtag of Lang is "p".

8.4.4-t02-pass-g.pdf: Lang entry is present in Document Catalog. Subtag of Lang is "PT".

8.4.4-t02-pass-h.pdf: Lang entry is present in Document Catalog. Subtag of Lang is "1234abcd".

8.4.4-t02-pass-i.pdf: Lang entry is present in Paragraph. Subtag of Lang is "1234abcd".

8.4.4-t02-pass-j.pdf: Lang entry is present in Paragraph. Subtag of Lang is "PT".
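
The pass and fail cases in this section are consistent with a simple shape rule: a primary subtag of 1 to 8 ASCII letters, followed by optional subtags of 1 to 8 ASCII letters or digits. The sketch below encodes only that inferred rule; it is not a full language-tag validator, `looks_like_valid_lang` is a hypothetical name, and the `pt-` prefix used in the usage note is an assumption, since the list gives the subtags but not the complete Lang values.

```python
import re

# Inferred from the cases above: letters only in the primary subtag,
# letters or digits in later subtags, each subtag 1 to 8 characters long.
_LANG_SHAPE = re.compile(r"[A-Za-z]{1,8}(?:-[A-Za-z0-9]{1,8})*")


def looks_like_valid_lang(value):
    """Return True if `value` has the subtag shape described above."""
    return _LANG_SHAPE.fullmatch(value) is not None
```

Under this rule, `"portugue"`, `"p"`, `"pt-PT"` and `"pt-1234abcd"` are accepted, while `"portugues"`, `"1"`, `""`, `"pt-1234abcde"` and `"pt-пт"` are rejected.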

8.4.5 Fonts

8.4.5.3 Composite fonts

8.4.5.3.1 General

8.4.5.3.1-t01-fail-a.pdf: Registry values in Encoding and DescendantFonts dictionaries are not the same.

8.4.5.3.1-t01-fail-b.pdf: Ordering values in Encoding and DescendantFonts dictionaries are not the same.

8.4.5.3.1-t01-fail-c.pdf: Supplement value in DescendantFonts is greater than the value in Encoding.

8.4.5.3.1-t01-pass-a.pdf: Registry values in Encoding and DescendantFonts dictionaries are the same.

8.4.5.3.1-t01-pass-b.pdf: Ordering values in Encoding and DescendantFonts dictionaries are the same.

8.4.5.3.1-t01-pass-c.pdf: Supplement value in Encoding and DescendantFonts dictionaries are the same.

8.4.5.3.1-t01-pass-d.pdf: Supplement value in DescendantFonts dictionary is less than the value in Encoding dictionary.

8.4.5.3.2 CIDFonts

8.4.5.3.2-t01-fail-a.pdf: CIDToGIDMap is name. Value of CIDToGIDMap is NoIdentity.

8.4.5.3.2-t01-fail-b.pdf: CIDToGIDMap entry is not present.

8.4.5.3.2-t01-fail-c.pdf: CIDToGIDMap entry is present. Value of CIDToGIDMap is empty.

8.4.5.3.2-t01-pass-a.pdf: CIDToGIDMap with value "Identity" is present.

8.4.5.4 CMaps

8.4.5.4-t01-fail-a.pdf: Encoding contains "Adobe-Korea1-2" that is not present in Table 118 of ISO 32000-1.

8.4.5.4-t01-pass-a.pdf: Encoding contains "Identity-H" that is present in Table 118 of ISO 32000-1.

8.4.5.4-t02-fail-a.pdf: WMode in Encoding is 1. WMode in CMap stream is 0.

8.4.5.4-t02-fail-b.pdf: WMode in Encoding is 0. WMode in CMap stream is 1.

8.4.5.4-t02-pass-a.pdf: WMode in Encoding is 0. WMode in CMap stream is 0.

8.4.5.4-t03-fail-a.pdf: UseCMap contains "Adobe-Korea1-2" that is not present in Table 118 of ISO 32000-1.

8.4.5.4-t03-pass-a.pdf: UseCMap contains "H" that is present in Table 118 of ISO 32000-1.

8.4.5.5 Embedding

8.4.5.5.1 General

8.4.5.5.1-t01-fail-a.pdf: Font is not embedded.

8.4.5.5.1-t01-pass-a.pdf: Font is embedded.

8.4.5.6 Font metrics

8.4.5.6-t01-fail-a.pdf: First value in Width array is 192.

8.4.5.6-t01-pass-a.pdf: First value in Width array is 250.

8.4.5.7 Character encodings

8.4.5.7-t02-fail-a.pdf: BaseEncoding with value "Identity" is present in Encoding Dictionary.

8.4.5.7-t02-fail-b.pdf: BaseEncoding entry is not present in Encoding Dictionary.

8.4.5.7-t02-fail-c.pdf: Encoding entry is not present in Font Dictionary.

8.4.5.7-t02-fail-d.pdf: Differences array contains "gravee" glyph that is not present in Adobe Glyph List.

8.4.5.7-t02-pass-a.pdf: BaseEncoding with value "WinAnsiEncoding" is present in Encoding Dictionary.

8.4.5.7-t02-pass-b.pdf: BaseEncoding with value WinAnsiEncoding is present in Encoding Dictionary.

8.4.5.7-t02-pass-c.pdf: Encoding with value WinAnsiEncoding is present in Font Dictionary.

8.4.5.7-t02-pass-d.pdf: Differences array contains "grave" glyph that is present in Adobe Glyph List.

8.4.5.7-t03-fail-a.pdf: Encoding entry is present in Font Dictionary for Symbolic TrueType font.

8.4.5.7-t03-pass-a.pdf: Encoding entry is not present in Font Dictionary for Symbolic TrueType font.

8.4.5.8 Unicode character maps

8.4.5.8-t01-fail-a.pdf: A font dictionary does not contain the ToUnicode entry and none of the following is true:

8.4.5.8-t01-pass-a.pdf: A Type0 font uses the Adobe-Japan1 character collection and does not contain the ToUnicode entry.

8.4.5.8-t01-pass-b.pdf: A Font uses the predefined encoding MacRomanEncoding and does not contain the ToUnicode entry.

8.4.5.8-t01-pass-c.pdf: Non-symbolic TrueType font is present, so ToUnicode entry is not required.

8.4.5.8-t02-fail-a.pdf: Unicode value in ToUnicode CMap is <0000>.

8.4.5.8-t02-fail-b.pdf: Unicode value in ToUnicode CMap is <fffe>.

8.4.5.8-t02-fail-c.pdf: Unicode value in ToUnicode CMap is <feff>.

8.4.5.8-t02-pass-a.pdf: Unicode value in ToUnicode CMap is <0075>. Unicode values: <0000>, <fffe>, <feff> are not present.
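
The t02 cases above reduce to checking each ToUnicode destination value for three code points. As a minimal sketch, assuming the value is given as a hex string in the `<....>` form used in the list and encodes UTF-16BE text (the helper name `forbidden_in_tounicode_value` is hypothetical):

```python
# The three values named in the t02 cases above.
FORBIDDEN_CODE_POINTS = {0x0000, 0xFEFF, 0xFFFE}


def forbidden_in_tounicode_value(hex_string):
    """Return the forbidden code points found in one ToUnicode value.

    `hex_string` is a hex string such as "<0075>", decoded as UTF-16BE.
    """
    digits = hex_string.strip("<>")
    # surrogatepass keeps the sketch from failing on lone surrogates.
    text = bytes.fromhex(digits).decode("utf-16-be", errors="surrogatepass")
    return sorted(ord(ch) for ch in text if ord(ch) in FORBIDDEN_CODE_POINTS)
```

An empty result corresponds to the pass case; any returned code point corresponds to one of the fail cases.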

8.4.5.9 Use of .notdef glyph

8.4.5.9-t01-fail-a.pdf: One or more characters used in text showing operators reference the .notdef glyph.

8.6 Text string objects

8.6-t01-fail-a.pdf: The Lang entry contains a Unicode Private Use Area (PUA) value.

8.7 Optional content

8.7-t02-fail-a.pdf: The Name is not present in the optional content configuration dictionary.

8.7-t02-pass-a.pdf: The AS key doesn't appear in an Optional Content Configuration Dictionary.

8.8 Intra-document destinations

8.8-t01-fail-a.pdf: Outline item does not contain a structure destination entry.

8.8-t02-fail-a.pdf: GoTo action does not contain a structure destination entry.

8.9 Annotations

8.9.2 Semantics and content

8.9.2.2 Annotations as artifacts

8.9.2.2-t01-fail-a.pdf: An annotation with invisible flag is included in logical structure and is not marked as Artifact.

8.9.2.4 Annotation types

8.9.2.4.10 File attachment

8.9.2.4.10-t01-fail-a.pdf: The file specification dictionary does not include AFRelationship entry.

8.10 Forms

8.10.1 General

8.10.1-t02-fail-a.pdf: A Form structure element contains more than one widget annotation.

8.10.2 Context

8.10.2.3 Contents entry

8.10.2.3-t01-fail-a.pdf: Contents entry is not provided to supply description and context for the widget.

8.11 Metadata

8.11.1 General

8.11.1-t01-fail-a.pdf: dc:title is not present in metadata stream.

8.11.1-t01-pass-a.pdf: dc:title "title" is present in metadata stream.

8.11.2 Interactive aspects

8.11.2-t01-fail-a.pdf: DisplayDocTitle entry is not present in ViewerPreferences dictionary.

8.11.2-t01-fail-b.pdf: "DisplayDocTitle: False" is present in ViewerPreferences dictionary.

8.11.2-t01-pass-a.pdf: "DisplayDocTitle: True" is present in ViewerPreferences dictionary.

8.14 Use of embedded files

8.14.1 Descriptions for embedded files

8.14.1-t01-fail-a.pdf: The Desc entry is missing for the file specification dictionary in the EmbeddedFiles name tree.
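
The file names on this page follow a visible pattern: clause number, test number, expected outcome, and a variant letter. The sketch below splits a name into those parts, which can help when driving a validator over the corpus. The pattern is inferred from the names listed here rather than from a stated convention, and `CorpusFile` and `parse_corpus_name` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class CorpusFile(NamedTuple):
    clause: str          # e.g. "8.2.5.26"
    test: int            # e.g. 5 for "t05"
    expected_pass: bool  # True for "pass", False for "fail"
    variant: str         # e.g. "e"


_NAME = re.compile(
    r"(?P<clause>\d+(?:\.\d+)*)-t(?P<test>\d+)"
    r"-(?P<outcome>pass|fail)-(?P<variant>[a-z])\.pdf"
)


def parse_corpus_name(name: str) -> Optional[CorpusFile]:
    """Split a corpus file name into its parts, or return None if it does not fit."""
    m = _NAME.fullmatch(name)
    if m is None:
        return None
    return CorpusFile(m["clause"], int(m["test"]),
                      m["outcome"] == "pass", m["variant"])
```

For example, `parse_corpus_name("8.2.5.26-t05-pass-e.pdf")` yields clause `"8.2.5.26"`, test `5`, an expected pass, and variant `"e"`.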