Shortening tags is not important. #286

WGroleau · 2023-02-12T21:50:56Z

WGroleau
Feb 12, 2023

Might be worthwhile to unlearn the tendency to shrink the size of tags. There is no sacredness to the number four, and indeed, all along we've had three (SEX) and five (underscore plus four).

Now there is at least one more SCHMA. What is the benefit of removing the 'E'? Humans don't often read GEDCOM directly, but some of us do, and too much push to shorten tags is at best an inconvenience to those who aren't native speakers of English.

I'm not suggesting the other extreme of having really long ones, but four should not be the goal.

jl5000 · 2023-02-15T12:53:55Z

jl5000
Feb 15, 2023

There comes a point where tag length 'feels' too long, given what is required for expressiveness and differentiating between other tags. Everyone will have a different view on this. For me, 6 characters does feel longer than necessary. This is reinforced by the fact that I write raw GEDCOM as well as read it. There is also something to be said of having tags that are similar in length so that the components align across lines and the cognitive burden of reading it is minimised.

0 replies

tychonievich · 2023-02-15T23:12:36Z

tychonievich
Feb 15, 2023
Maintainer

The steering committee has generally tried to balance the desire to be brief, the desire to match the style of previous versions, and the desire to be expressive when read by English speakers. For example, when introducing PHRASE we spelled it out in full but when introducing SDATE we abbreviated "Sort date". However, more important than any of these has been avoiding ambiguity and name conflict, which brings us to the specific example:

> Now there is at least one more SCHMA. What is the benefit of removing the 'E'?

GEDCOM 5.3 (only; not the versions before or after it) had a SCHEMA with different semantics. We picked a different tag to avoid potential name collisions with old files.

0 replies

WGroleau · 2023-02-24T04:39:38Z

WGroleau
Feb 24, 2023
Author

"reduce cognitive burden of reading"—a more effective way of accomplishing this would be allowing leading white space on lines, and comments on lines that end with "@". Implementations (or human editors) could optionally add them, but they would be ignored on import.

0 @abc@ FAM
  1 HUSB @def@ {John Doe}
  1 WIFE @ghi@ {Jane Roe}
  1 MARR
    2 DATE 5 MAY 1909
0 @ghi@ INDI
  1 NAME Jane /Roe/
  1 FAMS @abc@ {John Doe & Jane Roe}
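
To make the "ignored on import" part of this proposal concrete, here is a minimal Python sketch of an import-side normalizer. It is only an illustration: the brace comment syntax comes from the example above and is not part of any GEDCOM release, and the names `normalize_line` and `normalize` are hypothetical.

```python
import re

# Assumption: a comment is a trailing "{...}" that follows a payload ending
# in "@", as in the example above. This syntax is a proposal, not GEDCOM.
_TRAILING_COMMENT = re.compile(r"(@)\s*\{[^{}]*\}\s*$")


def normalize_line(line: str) -> str:
    """Drop leading whitespace and any trailing {comment} after an '@'."""
    line = line.lstrip(" \t")
    return _TRAILING_COMMENT.sub(r"\1", line)


def normalize(text: str) -> str:
    """Normalize every line of an annotated file, keeping line order."""
    return "\n".join(normalize_line(line) for line in text.splitlines())


annotated = """0 @abc@ FAM
  1 HUSB @def@ {John Doe}
  1 MARR
    2 DATE 5 MAY 1909"""

print(normalize(annotated))
```

The pattern only fires when the brace group directly follows an "@", so ordinary text payloads that happen to contain braces are left alone. A payload that ends in a literal "@" followed by braces would still be misread, which a real implementation would need to rule out.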

0 replies

tychonievich · 2023-02-24T18:39:50Z

tychonievich
Feb 24, 2023
Maintainer

> a more effective way of accomplishing this would be allowing leading white space on lines

This was allowed in the 5.5.1 spec, but a survey of existing GEDCOM tools during the 7.0.0 drafting stage revealed that many did not support this feature. We removed it from 7.0 because we felt that interoperability of tools was more important than readability of files.

Incidentally, that finding surprised me; stripping leading whitespace seemed trivial to implement. My explanation for it (based on thought and not hard evidence) is that a sizable set of tools appear to add features to their parsers on demand, when a user complains that a file exported by tool X fails to import correctly. Most software does not add extraneous whitespace, so failure to handle it was never reported.

1 reply

dthaler Feb 24, 2023
Maintainer

Adding whitespace to maximal7x.ged in the future would help ensure correct imports, if we want to enable whitespace in a later release.

WGroleau · 2023-02-25T07:39:13Z

WGroleau
Feb 25, 2023
Author

Both adding and trimming the white space is indeed trivial. I once wrote a Perl script that split a GEDCOM into a separate file for each level-zero record, indented each line by twice as many spaces as its level number, and put each record into a Berkeley database keyed by the xref. And another one that put it all back together, removing the spaces. It's also not hard to add the comments in my example, and even easier to remove them.
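
The original scripts are not shown in the thread, so the following is only a rough Python sketch of the same idea, not the Perl code described above. It swaps the Berkeley database for a plain in-memory dict, and the function names (`level_of`, `indent`, `dedent`, `split_records`) are hypothetical. It assumes every line is non-empty and starts with a level number.

```python
def level_of(line: str) -> int:
    """Return the level number that starts a GEDCOM line."""
    return int(line.split(maxsplit=1)[0])


def indent(lines):
    """Prefix each line with two spaces per level."""
    return ["  " * level_of(line) + line for line in lines]


def dedent(lines):
    """Remove the leading spaces again."""
    return [line.lstrip(" ") for line in lines]


def split_records(lines):
    """Group lines into level-0 records, keyed by xref (or by tag if none)."""
    records = {}
    key = None
    for line in lines:
        if level_of(line) == 0:
            # "0 @abc@ FAM" yields "@abc@"; "0 HEAD" yields "HEAD".
            key = line.split()[1]
            records[key] = []
        if key is None:
            raise ValueError("file does not start with a level-0 line")
        records[key].append(line)
    return records


lines = [
    "0 @abc@ FAM",
    "1 MARR",
    "2 DATE 5 MAY 1909",
    "0 @ghi@ INDI",
    "1 NAME Jane /Roe/",
]
for line in indent(lines):
    print(line)
```

Keying records without an xref by their tag works for HEAD and TRLR, but two such records with the same tag would overwrite each other in this sketch.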

0 replies

WGroleau · 2023-02-25T07:47:55Z

WGroleau
Feb 25, 2023
Author

Another program (not by me) that added and removed indentation was LifeLines: “The indentation shown in the examples is not part of GEDCOM format. When LifeLines prepares records for you to edit, however, it always indents the records, making them easier to read and understand. You do not need to follow this indentation scheme when you edit the records. Indentation is removed from the data before it is stored in the database.”

0 replies
