Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make naming rules more strict #651

Closed
wants to merge 1 commit into from

Conversation

erikbosch
Copy link
Collaborator

This is a proposal for discussion.

Background

As of today we more or less say that anything that is valid Yaml is valid VSS, which easily gives problem in downstream tools trying to generate code from VSS. An example exists in COVESA/vss-tools#284, and there have been discussions in the past like in #280.

Idea

Already today we have some recommendations written in documentation, and vss-tools at least partially validate them and give a warning, and if you request "strict" checking you might get an error. But should we possibly be more strict and change many of them to rules. That would make life easier for vss-tools, but possibly also for APIs and ontology representation of VSS. Then the "core vss-tool functionality" can do the check, so that exporters does not need to be able to handle VSS-style specifications that does not follow the rules.

Then we have two alternatives, either free-text rules like here, or "inheriting" rules from some other definition (XML? RDF?). Or describing them in some other format. Suggestions are welcome. Most likely we cannot enforce rules that consider keywords in "all" potential output formats, but perhaps we can consider the most popular output formats.

Way forward

First agree on if we want to add additional rules. Then if so, define them and implement them in vss-tools. Preferably, any changed rule should not affect signals currently being part of standard catalog.

Feedback is welcome.

@erikbosch
Copy link
Collaborator Author

meeting notes:

  • Sebastian: Text in PR not totally consistent, if it is a rule it shall give an error, not optional to give an error. Some items in rules should better be guidelines
  • Erik: But if we for instance allow all unicode characters, what do we then expect from tools; should tools then handle them in a consistent way? Or ok to give error?

* The allowed character set for node names (branches and signals) is `A-Z`, `a-z` and `0-9`
* Node names must not start with a digit
* Names are case sensitive, but it is not allowed to have two nodes that only differ by case,
i.e. it is not allowed to have both `Vehicle.Aaa` and `Vehicle.AaA`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i.e. -> i.e.,

Different languages and environments have different rules for identifiers.
To simplify usage of VSS in as many languages and environments as possible the following rules apply.

* The allowed character set for node names (branches and signals) is `A-Z`, `a-z` and `0-9`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

branches and signals -> branch and data nodes

Perhaps? To align with terms used elsewhere?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will change it. We are a bit inconsistent in documentation, like in https://covesa.github.io/vehicle_signal_specification/rule_set/data_entry/. We sometimes talks about signals, sometimes about data entries.

To simplify usage of VSS in as many languages and environments as possible the following rules apply.

* The allowed character set for node names (branches and signals) is `A-Z`, `a-z` and `0-9`
* Node names must not start with a digit
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Period at end of sentence?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a period to all bullet points

* Node names must not start with a digit
* Names are case sensitive, but it is not allowed to have two nodes that only differ by case,
i.e. it is not allowed to have both `Vehicle.Aaa` and `Vehicle.AaA`
* It is not allowed to reuse the same node name as a parent node in a child node,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reads like if you mean the immediate parent, as opposed to all parents.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will try to rephrase it


### Naming Conventions

The recommended naming convention for node elements is to use camel case notation starting with a capital letter. It is recommended to use only
`A-Z`, `a-z` and `0-9` in node names. For boolean signals it is recommended to start the name with `Is`.
The recommended naming convention for node elements is to use camel case notation starting with a capital letter.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As well, it seems to be common practice that acronyms are in uppercase in node names (which, by the way, makes it a bit awkward to read when the acronym is in the middle of a node name).

Given that acronyms are indeed "words", my personal preference is for acronyms in code, etc. to follow the same case use as codes. For example, Vehicle.Adas.Abs.IsEnabled.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding that to the PR to get comments

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, in my eyes, reading "Adas" or "Abs" looks weird as opposed to "ADAS" or "ABS". In case it turns out I am in the minority here, I can live with it :)

@erikbosch erikbosch added Status:Review Please review/discuss contents Status:DoNotMerge PR shall not be merged now, maybe later, maybe after rework labels Sep 20, 2023
This is a proposal for discussion

Signed-off-by: Erik Jaegervall <[email protected]>
@SebastianSchildt
Copy link
Collaborator

I personally think in 2023 it is a bit to restrictive to say

The allowed character set for node names (branch and data nodes) is A-Z, a-z and 0-9.

Many (most?) modern languages do allow some form of unicode in identifiers, e.g. https://doc.rust-lang.org/reference/identifiers.html , so while for greatest compatibility with all outputs it might be desirable in standard catalogue, I would not limit us to "7 bit ASCII" here.

I would leave this as a recommendation, and only require no whitespaces.

@argerus
Copy link

argerus commented Sep 27, 2023

I personally think in 2023 it is a bit to restrictive to say

The allowed character set for node names (branch and data nodes) is A-Z, a-z and 0-9.

Many (most?) modern languages do allow some form of unicode in identifiers, e.g. https://doc.rust-lang.org/reference/identifiers.html
, so while for greatest compatibility with all outputs it might be desirable in standard catalogue, I would not limit us to "7 bit ASCII" here.

Even in a modern language such as Rust, you are only allowed to use the alpha-numeric characters of unicode in the identifiers (+ underscore).

I suppose that C++23 restricts them to alpha-numeric characters of ASCII + underscore.

Now, it's perfectly possible to use paths with arbitrary characters in both of these languages, just not as identifiers. But when such paths are used in other contexts, i.e. as identifiers in protocols, access tokens, config files etc, allowing arbitrary special characters (or even alpha-numeric unicode) will most likely lead to issues.

I would leave this as a recommendation, and only require no whitespaces.

Wouldn't it make more sense to require alpha-numeric characters in that case (unicode or otherwise)?

But honestly, if the point is to promote interoperability, aiming for a common denominator by restricting what characters are allowed in an identifier is probably not such a bad idea.

Allowing the content addressed by these identifiers to contain other things than 7bit ascii, now that's something that should be expected in 2023.

@ppb2020
Copy link
Collaborator

ppb2020 commented Sep 27, 2023

Perhaps one thing to think of: restrict the set of characters in the COVESA catalog to what works in all formats produced by the tooling, but allow other characters such that users of overlays that do not require this level of compatibility are not hampered by the limitation. Say, add a --strict option that, when used, will check and guarantee that the produced artifacts will be able to generate the catalog in any format (IDL, JSON, etc.). We validate any submission into the COVESA catalof using --strict, but anyone defining their own signals can opt not to use --strict and use whatever characters they want, as long as they can generate the catalog in whatever set of formats they use.

@erikbosch
Copy link
Collaborator Author

Perhaps one thing to think of: restrict the set of characters in the COVESA catalog to what works in all formats produced by the tooling, but allow other characters such that users of overlays that do not require this level of compatibility are not hampered by the limitation. Say, add a --strict option that, when used, will check and guarantee that the produced artifacts will be able to generate the catalog in any format (IDL, JSON, etc.). We validate any submission into the COVESA catalof using --strict, but anyone defining their own signals can opt not to use --strict and use whatever characters they want, as long as they can generate the catalog in whatever set of formats they use.

We already have a --strict option that for instance gives an error if your signal does not start with a capital letter. We could extend usage of it, or alternatively use some other mechanism like a config file to specify limitations more in detail, like should a signal called "Über" give an error, a warning or be silently accepted?

Today we do not use --strict in the Makefile used by continuous integration, but we could start using that and also require that all "official" tools in vss-tools must at least support a strict VSS model.

@erikbosch erikbosch closed this Sep 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status:DoNotMerge PR shall not be merged now, maybe later, maybe after rework Status:Review Please review/discuss contents
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants