Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #23 from impresso/feature/sc_improve_language_identification
Feature/sc improve language identification
Showing 160 changed files with 1,092 additions and 2,009 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,3 @@ | ||
# Untitled boolean in Content Item Schema | ||
|
||
```txt | ||
https://impresso.github.io/impresso-schemas/json/newspaper/contentitem.schema.json#/properties/cc | ||
``` | ||
|
||
True if image box coordinates are known to be correct, False otherwise | ||
|
||
|
||
| Abstract | Extensible | Status | Identifiable | Custom Properties | Additional Properties | Access Restrictions | Defined In | | ||
| :------------------ | ---------- | -------------- | ----------------------- | :---------------- | --------------------- | ------------------- | ---------------------------------------------------------------------------------- | | ||
| Can be instantiated | No | Unknown status | Unknown identifiability | Forbidden | Allowed | none | [contentitem.schema.json\*](../out/contentitem.schema.json "open original schema") | | ||
|
||
## cc Type | ||
|
||
`boolean` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,3 @@ | ||
# Untitled string in Content Item Schema | ||
|
||
```txt | ||
https://impresso.github.io/impresso-schemas/json/newspaper/contentitem.schema.json#/properties/ft | ||
``` | ||
|
||
the rebuilt fulltext | ||
|
||
|
||
| Abstract | Extensible | Status | Identifiable | Custom Properties | Additional Properties | Access Restrictions | Defined In | | ||
| :------------------ | ---------- | -------------- | ----------------------- | :---------------- | --------------------- | ------------------- | ---------------------------------------------------------------------------------- | | ||
| Can be instantiated | No | Unknown status | Unknown identifiability | Forbidden | Allowed | none | [contentitem.schema.json\*](../out/contentitem.schema.json "open original schema") | | ||
|
||
## ft Type | ||
|
||
`string` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,3 @@ | ||
# Untitled string in Content Item Schema | ||
|
||
```txt | ||
https://impresso.github.io/impresso-schemas/json/newspaper/contentitem.schema.json#/properties/id | ||
``` | ||
|
||
The unique identifier for a content item (CI) | ||
|
||
|
||
| Abstract | Extensible | Status | Identifiable | Custom Properties | Additional Properties | Access Restrictions | Defined In | | ||
| :------------------ | ---------- | -------------- | ----------------------- | :---------------- | --------------------- | ------------------- | ---------------------------------------------------------------------------------- | | ||
| Can be instantiated | No | Unknown status | Unknown identifiability | Forbidden | Allowed | none | [contentitem.schema.json\*](../out/contentitem.schema.json "open original schema") | | ||
|
||
## id Type | ||
|
||
`string` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,3 @@ | ||
# Untitled number in Content Item Schema | ||
|
||
```txt | ||
https://impresso.github.io/impresso-schemas/json/newspaper/contentitem.schema.json#/properties/lb/items | ||
``` | ||
|
||
|
||
|
||
|
||
| Abstract | Extensible | Status | Identifiable | Custom Properties | Additional Properties | Access Restrictions | Defined In | | ||
| :------------------ | ---------- | -------------- | ----------------------- | :---------------- | --------------------- | ------------------- | ---------------------------------------------------------------------------------- | | ||
| Can be instantiated | No | Unknown status | Unknown identifiability | Forbidden | Allowed | none | [contentitem.schema.json\*](../out/contentitem.schema.json "open original schema") | | ||
|
||
## items Type | ||
|
||
`number` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,3 @@ | ||
# Untitled array in Content Item Schema | ||
|
||
```txt | ||
https://impresso.github.io/impresso-schemas/json/newspaper/contentitem.schema.json#/properties/lb | ||
``` | ||
|
||
text offsets of physical line breaks (relative to 'ft' field) | ||
|
||
|
||
| Abstract | Extensible | Status | Identifiable | Custom Properties | Additional Properties | Access Restrictions | Defined In | | ||
| :------------------ | ---------- | -------------- | ----------------------- | :---------------- | --------------------- | ------------------- | ---------------------------------------------------------------------------------- | | ||
| Can be instantiated | No | Unknown status | Unknown identifiability | Forbidden | Allowed | none | [contentitem.schema.json\*](../out/contentitem.schema.json "open original schema") | | ||
|
||
## lb Type | ||
|
||
`number[]` |
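The `lb` description above says the offsets are relative to the `ft` field. A minimal sketch of how a consumer might use them to recover physical lines follows. It assumes the offsets are 0-based character indices into `ft`, which the schema text does not state explicitly; the helper name `split_at_offsets` and the sample item are hypothetical.

```python
from typing import List


def split_at_offsets(text: str, offsets: List[int]) -> List[str]:
    """Cut `text` at each offset.

    Assumption: offsets are 0-based character indices into `text`.
    Offsets outside the open interval (0, len(text)) are ignored.
    """
    segments = []
    start = 0
    for off in sorted(set(offsets)):
        if 0 < off < len(text):
            segments.append(text[start:off])
            start = off
    segments.append(text[start:])
    return segments


# Hypothetical content item carrying only the two fields used here.
item = {"ft": "Le journal de Geneve", "lb": [3, 11]}
lines = split_at_offsets(item["ft"], item["lb"])
print(lines)
```

Because the segments are plain slices, joining them gives back the original `ft` string, which makes a convenient sanity check. The same helper would apply to `pb` if paragraph offsets follow the same convention.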
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,3 @@ | ||
# Untitled boolean in Content Item Schema | ||
|
||
```txt | ||
https://impresso.github.io/impresso-schemas/json/newspaper/contentitem.schema.json#/properties/olr | ||
``` | ||
|
||
True if optical layout recognition was applied to the issue this content item originates from. | ||
|
||
|
||
| Abstract | Extensible | Status | Identifiable | Custom Properties | Additional Properties | Access Restrictions | Defined In | | ||
| :------------------ | ---------- | -------------- | ----------------------- | :---------------- | --------------------- | ------------------- | ---------------------------------------------------------------------------------- | | ||
| Can be instantiated | No | Unknown status | Unknown identifiability | Forbidden | Allowed | none | [contentitem.schema.json\*](../out/contentitem.schema.json "open original schema") | | ||
|
||
## olr Type | ||
|
||
`boolean` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,3 @@ | ||
# Untitled number in Content Item Schema | ||
|
||
```txt | ||
https://impresso.github.io/impresso-schemas/json/newspaper/contentitem.schema.json#/properties/pb/items | ||
``` | ||
|
||
|
||
|
||
|
||
| Abstract | Extensible | Status | Identifiable | Custom Properties | Additional Properties | Access Restrictions | Defined In | | ||
| :------------------ | ---------- | -------------- | ----------------------- | :---------------- | --------------------- | ------------------- | ---------------------------------------------------------------------------------- | | ||
| Can be instantiated | No | Unknown status | Unknown identifiability | Forbidden | Allowed | none | [contentitem.schema.json\*](../out/contentitem.schema.json "open original schema") | | ||
|
||
## items Type | ||
|
||
`number` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,3 @@ | ||
# Untitled array in Content Item Schema | ||
|
||
```txt | ||
https://impresso.github.io/impresso-schemas/json/newspaper/contentitem.schema.json#/properties/pb | ||
``` | ||
|
||
text offsets of physical paragraph breaks (relative to 'ft' field) | ||
|
||
|
||
| Abstract | Extensible | Status | Identifiable | Custom Properties | Additional Properties | Access Restrictions | Defined In | | ||
| :------------------ | ---------- | -------------- | ----------------------- | :---------------- | --------------------- | ------------------- | ---------------------------------------------------------------------------------- | | ||
| Can be instantiated | No | Unknown status | Unknown identifiability | Forbidden | Allowed | none | [contentitem.schema.json\*](../out/contentitem.schema.json "open original schema") | | ||
|
||
## pb Type | ||
|
||
`number[]` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,3 @@ | ||
# Untitled number in Content Item Schema | ||
|
||
```txt | ||
https://impresso.github.io/impresso-schemas/json/newspaper/contentitem.schema.json#/properties/pp/items | ||
``` | ||
|
||
|
||
|
||
|
||
| Abstract | Extensible | Status | Identifiable | Custom Properties | Additional Properties | Access Restrictions | Defined In | | ||
| :------------------ | ---------- | -------------- | ----------------------- | :---------------- | --------------------- | ------------------- | ---------------------------------------------------------------------------------- | | ||
| Can be instantiated | No | Unknown status | Unknown identifiability | Forbidden | Allowed | none | [contentitem.schema.json\*](../out/contentitem.schema.json "open original schema") | | ||
|
||
## items Type | ||
|
||
`number` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,3 @@ | ||
# Untitled string in Content Item Schema | ||
|
||
```txt | ||
https://impresso.github.io/impresso-schemas/json/newspaper/contentitem.schema.json#/properties/ppreb/items/properties/id | ||
``` | ||
|
||
canonical ID | ||
|
||
|
||
| Abstract | Extensible | Status | Identifiable | Custom Properties | Additional Properties | Access Restrictions | Defined In | | ||
| :------------------ | ---------- | -------------- | ----------------------- | :---------------- | --------------------- | ------------------- | ---------------------------------------------------------------------------------- | | ||
| Can be instantiated | No | Unknown status | Unknown identifiability | Forbidden | Allowed | none | [contentitem.schema.json\*](../out/contentitem.schema.json "open original schema") | | ||
|
||
## id Type | ||
|
||
`string` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,3 @@ | ||
# Untitled number in Content Item Schema | ||
|
||
```txt | ||
https://impresso.github.io/impresso-schemas/json/newspaper/contentitem.schema.json#/properties/ppreb/items/properties/n | ||
``` | ||
|
||
page number | ||
|
||
|
||
| Abstract | Extensible | Status | Identifiable | Custom Properties | Additional Properties | Access Restrictions | Defined In | | ||
| :------------------ | ---------- | -------------- | ----------------------- | :---------------- | --------------------- | ------------------- | ---------------------------------------------------------------------------------- | | ||
| Can be instantiated | No | Unknown status | Unknown identifiability | Forbidden | Allowed | none | [contentitem.schema.json\*](../out/contentitem.schema.json "open original schema") | | ||
|
||
## n Type | ||
|
||
`number` |
13 changes: 0 additions & 13 deletions
docs/contentitem-properties-ppreb-items-properties-t-items-properties-c.md
13 changes: 0 additions & 13 deletions
docs/contentitem-properties-ppreb-items-properties-t-items-properties-l.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,3 @@ | ||
# Untitled number in Content Item Schema | ||
|
||
```txt | ||
https://impresso.github.io/impresso-schemas/json/newspaper/contentitem.schema.json#/properties/ppreb/items/properties/t/items/properties/l | ||
``` | ||
|
||
token length | ||
|
||
|
||
| Abstract | Extensible | Status | Identifiable | Custom Properties | Additional Properties | Access Restrictions | Defined In | | ||
| :------------------ | ---------- | -------------- | ----------------------- | :---------------- | --------------------- | ------------------- | ---------------------------------------------------------------------------------- | | ||
| Can be instantiated | No | Unknown status | Unknown identifiability | Forbidden | Allowed | none | [contentitem.schema.json\*](../out/contentitem.schema.json "open original schema") | | ||
|
||
## l Type | ||
|
||
`number` |
13 changes: 0 additions & 13 deletions
docs/contentitem-properties-ppreb-items-properties-t-items-properties-s.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,3 @@ | ||
# Untitled number in Content Item Schema | ||
|
||
```txt | ||
https://impresso.github.io/impresso-schemas/json/newspaper/contentitem.schema.json#/properties/ppreb/items/properties/t/items/properties/s | ||
``` | ||
|
||
offset start (relative to ft field) | ||
|
||
|
||
| Abstract | Extensible | Status | Identifiable | Custom Properties | Additional Properties | Access Restrictions | Defined In | | ||
| :------------------ | ---------- | -------------- | ----------------------- | :---------------- | --------------------- | ------------------- | ---------------------------------------------------------------------------------- | | ||
| Can be instantiated | No | Unknown status | Unknown identifiability | Forbidden | Allowed | none | [contentitem.schema.json\*](../out/contentitem.schema.json "open original schema") | | ||
|
||
## s Type | ||
|
||
`number` |
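Taken together, the token properties `s` (offset start, relative to `ft`) and `l` (token length) suggest that a token's surface form can be read back as a slice of the full text. The sketch below illustrates that reading. It assumes `s` is a 0-based character index and `l` is counted in characters, neither of which the schema text confirms; the helper name `token_texts` and the sample data are hypothetical.

```python
from typing import Dict, List


def token_texts(ft: str, tokens: List[Dict[str, int]]) -> List[str]:
    """Return the substring of `ft` covered by each token.

    Assumption: `s` is a 0-based character offset into `ft`
    and `l` is the token length in characters.
    """
    return [ft[t["s"]: t["s"] + t["l"]] for t in tokens]


# Hypothetical full text and token list for illustration only.
ft = "Le journal"
tokens = [{"s": 0, "l": 2}, {"s": 3, "l": 7}]
words = token_texts(ft, tokens)
print(words)
```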
13 changes: 0 additions & 13 deletions
docs/contentitem-properties-ppreb-items-properties-t-items-properties.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,3 @@ | ||
# Untitled undefined type in Content Item Schema | ||
|
||
```txt | ||
https://impresso.github.io/impresso-schemas/json/newspaper/contentitem.schema.json#/properties/ppreb/items/properties/t/items/properties | ||
``` | ||
|
||
|
||
|
||
|
||
| Abstract | Extensible | Status | Identifiable | Custom Properties | Additional Properties | Access Restrictions | Defined In | | ||
| :------------------ | ---------- | -------------- | ----------------------- | :---------------- | --------------------- | ------------------- | ---------------------------------------------------------------------------------- | | ||
| Can be instantiated | No | Unknown status | Unknown identifiability | Forbidden | Allowed | none | [contentitem.schema.json\*](../out/contentitem.schema.json "open original schema") | | ||
|
||
## properties Type | ||
|
||
unknown |
13 changes: 0 additions & 13 deletions
docs/contentitem-properties-ppreb-items-properties-t-items.md