telefonicaid · fgalan · Nov 29, 2016 · Nov 29, 2016 · Nov 30, 2016 · fgalan
diff --git a/docs/topics/modelling_considerations.md b/docs/topics/modelling_considerations.md
@@ -3,6 +3,112 @@
 There are some considerations to make when designing how to model a project in the platform. This document will give you
 some hints to avoid most common mistakes (use them as hints to guide your modelling, not as strict rules).
 
+## Modeling your IDs in the right way
+
+Entity ids and attribute names should behave like *real* IDs. In other words, using whitespace, accents or
+any other unusual character in ID strings is a really bad idea. In fact, although this is allowed in the NGSIv1
+API (for legacy reasons), it is forbidden in the NGSIv2 version of the API (check the "Field syntax restrictions"
+section in the [NGSIv2 specification document](http://telefonicaid.github.io/fiware-orion/api/v2/stable) for details).
+
+Why is this a bad idea? There are several reasons:
+
+* The IoT Platform uses these identifiers (or strings derived from them) in
+  places where such characters are not allowed. For example, some persistence backends are based on databases that
+  don't accept whitespace or non-ASCII characters in table or database names.
+* IDs may appear as part of URLs (e.g. the URL identifying an entity at [Context Broker](../context_broker.md)), and
+  using non-ASCII characters in those places makes these URLs more complex.
+
+Sometimes you may think you need ids with whitespace and non-ASCII characters to render that information
+correctly, e.g. in a graphical user interface. For instance, you have an entity that you want to show as
+"Row 12/Seat B" with an attribute "Occupation status" in your application, and you may think that
+modeling it in the following way is a good idea:
+
+
+      {
+         "type": "Seat",
+         "isPattern": "false",
+         "id": "Row 12/Seat B",
+         "attributes": [
+           {
+             "name": "Occupation status",
+             "type": "String",
+             "value": "occupied"
+           },
+           ...
+           }
+         ]
+      }
+
+However, it is not a good idea. If you need descriptive texts for entities or attributes, use specific
+attributes and metadata for them respectively, whose values are not ids and don't have any of the problems
+described above. Taking that into account, you could model the example above as follows:
+
+      {
+         "type": "Seat",
+         "isPattern": "false",
+         "id": "Row12SeatB",
+         "attributes": [
+           {
+             "name": "description",
+             "type": "String",
+             "value": "Row 12/Seat B"
+           },
+           {
+             "name": "status",
+             "type": "String",
+             "value": "occupied",
+             "metadata": [
+               {
+                 "name": "description",
+                 "type": "String",
+                 "value": "Occupation status"
+               }
+             ]
+           },
+           ...
+           }
+         ]
+      }
+
+As a general guideline, you should use identifiers with the following properties:
+
+* They **must** be unique: It's better to have globally unique IDs if that's possible but, in the cases
+  where it isn't, they should be unique at least at the service level. It's also important to design
+  the ID assignment process so that the probability of an ID collision is as low as
+  possible (e.g. it's better to have a 16-byte UUID in hexadecimal form than an 8-bit integer).
+
+* They **should** never change (or only under extraordinary circumstances): the ID uniquely identifies
+  your entities, and not only the Context Broker but potentially multiple other systems may use it
+  to identify objects associated with it (this especially affects the persistence backends). That
+  turns any change in the ID into a potential migration of data in multiple systems, with its associated
+  (usually very large) costs.
+
+* They **should not** be tied to the data, as that binding would make it easier to break either of the
+  first two rules. Even if you are completely sure that identifying your users by their driver's license number
+  is unique and immutable, the Government may still choose to change it; use a UUID instead.
+  That will ensure uniqueness and, since the UUID belongs only to the system, you will be the one
+  who decides when and how it may change (if it is allowed to change at all). However, note that we
+  are not using UUIDs in this documentation for didactic reasons, but in real use cases
+  you should consider this recommendation.
+
+* (*) They **should not** use the underscore (`_`) character: although accepted by the Context Broker, using it
+  as part of your IDs is a bad idea, since the persistence backends also use the underscore for special
+  purposes. On the one hand, it is used as the concatenator character. On the other hand, it is used as the
+  replacement character when a character within the ID is not accepted by the persistence backends.
+
+* (*) They **should** avoid uppercase letters when using PostgreSQL-based persistence backends, e.g.
+  CKAN (or Carto in the future): they usually convert uppercase letters into lowercase. This means IDs
+  such as `Car` and `car` will be different at the Context Broker level, but the same at the persistence backend
+  level. Of course, if you are not planning to use PostgreSQL-based persistence backends, ignore this
+  advice.
+
+The above considerations apply to entity ids and attribute names, but also to other pieces of context
+information that take the role of an ID, in particular entity types, attribute types, metadata names and
+metadata types.
+
+(*) This guideline won't apply once the new encoding is enabled in the IoT Platform. The new
+encoding uses a concatenator other than the underscore, and characters that are not accepted (including uppercase letters) are encoded following the Unicode format.
+
 ## The IoT Platform is centered in context
 
 The central piece of the IoT Platform is the Context Broker: a component that lets you store and query information