-
Notifications
You must be signed in to change notification settings - Fork 11
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
docs: document tagging architecture decisions (#55)
- Loading branch information
1 parent
025c9c4
commit f1b8579
Showing
22 changed files
with
399 additions
and
6 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -64,6 +64,7 @@ requirements/private.txt | |
dev.db* | ||
|
||
.vscode | ||
venv/ | ||
|
||
# Media files (for uploads) | ||
media/ |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,43 @@ | ||
7. Tagging App structure | ||
======================== | ||
|
||
Context | ||
------- | ||
|
||
We want the openedx_tagging app to be useful in different Django projects outside of just openedx-learning and edx-platform. | ||
|
||
|
||
Decisions | ||
--------- | ||
|
||
The openedx_tagging data structures and code will stand alone with no dependencies on other Open edX projects. | ||
|
||
Classes which require dependencies on other Open edX projects should be defined within a ``tagging`` module inside those projects. | ||
|
||
Taxonomy | ||
~~~~~~~~ | ||
|
||
The ``openedx_tagging`` module defines ``openedx_tagging.core.models.Taxonomy``, whose data and functionality are self-contained to the ``openedx_tagging`` app. However in Studio, we need to be able to limit access to some Taxonomy by organization, using the same "course creator" access which limits course creation for an organization to a defined set of users. | ||
|
||
So in edx-platform, we will create the ``openedx.features.tagging`` app, to contain ``models.OrgTaxonomy``. OrgTaxonomy subclasses ``openedx_tagging.core.models.Taxonomy``, employing Django's `multi-table inheritance`_ feature, which allows the base Tag class to keep foreign keys to the Taxonomy, while allowing OrgTaxonomy to store foreign keys into Studio's Organization table. | ||
|
||
ObjectTag | ||
~~~~~~~~~ | ||
|
||
Similarly, the ``openedx_tagging`` module defined ``openedx_tagging.core.models.ObjectTag``, also self-contained to the | ||
``openedx_tagging`` app. | ||
|
||
But to tag content in the LMS/Studio, we create ``openedx.features.tagging.models.ContentTag``, which subclasses ``ObjectTag``, and can then reference functionality available in the platform code. | ||
|
||
Rejected Alternatives | ||
--------------------- | ||
|
||
Embed in edx-platform | ||
~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
Embedding the logic in edx-platform would provide the content tagging logic specifically required for the MVP. | ||
|
||
However, we plan to extend tagging to other object types (e.g. People) and contexts (e.g. Marketing), and so a generic, standalone library is preferable in the log run. | ||
|
||
|
||
.. _multi-table inheritance: https://docs.djangoproject.com/en/3.2/topics/db/models/#multi-table-inheritance |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
8. Tag tree data structure | ||
========================== | ||
|
||
Context | ||
------- | ||
|
||
A taxonomy groups a set of related tags under a namespace and a set of usage rules. Some taxonomies require tags to be selected from a list or nested tree of valid options. | ||
|
||
Do we need a formal tree data structure to represent hierarchical tags? | ||
|
||
Decision | ||
-------- | ||
|
||
No, a simplistic tree structure is sufficient for the MVP and forseeable feature requests. | ||
|
||
Existing tree data structures are designed to support very dynamic and deeply nested trees (e.g. forum threads) which are traversed frequently, and this feature set is overkill for taxonomy trees. | ||
|
||
Taxonomy trees have a maximum depth of 3 levels, which limits the depth of node traversal, and simplifies the UI/UX required to tag or search filter content with nested tags. | ||
|
||
Taxonomy trees only require simple operations, and infrequent traversals. Frequent operations (like viewing content tags) will only display the leaf tag value, not its full lineage, to minimize tree traversal. Full trees can be fetched quickly enough during content tag editing. Taxonomy tree changes themselves will also be infrequent. | ||
|
||
Rejected Alternatives | ||
--------------------- | ||
|
||
All taxonomies are trees | ||
~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
We could use a tree structure for all taxonomies: flat taxonomies would have only 1 level of tags under the root, while nested taxonomies can be deeper. | ||
|
||
To implement this, we'd link each taxonomy to a root tag, with the user-visible tags underneath. | ||
|
||
It was simpler instead to link the tag to the taxonomy, which removes the need for the unseen root tag. | ||
|
||
Closure tables | ||
~~~~~~~~~~~~~~ | ||
|
||
https://coderwall.com/p/lixing/closure-tables-for-browsing-trees-in-sql | ||
|
||
Implementing the taxonomy tree using closure tables allows for tree traversals in O(N) time or less, where N is the total number of tags in the taxonomy. So the tree depth isn't as much of a performance concern as the total number of tags. | ||
|
||
Options include: | ||
|
||
* `django-tree-queries <https://github.com/matthiask/django-tree-queries>`_ | ||
|
||
Simple, performant, and well-maintained tree data structure. However it uses RECURSIVE CTE queries, which aren't supported until MySQL 8.0. | ||
|
||
* `django-mptt <https://github.com/django-mptt/django-mptt>`_ | ||
|
||
Already an edx-platform dependency, but no longer maintained. It can be added retroactively to an existing tree-like model. | ||
|
||
* `django-closuretree <https://github.com/ocadotechnology/django-closuretree>`_ | ||
|
||
Another a good reference implementation for closure tables which can be added retroactively to an existing tree-like model. It is not actively maintained. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,39 @@ | ||
9. Taxonomy administrators | ||
========================== | ||
|
||
Context | ||
------- | ||
|
||
Taxonomy Administrators have the right to create, edit, populate, and delete taxonomies available globably for a given instance, or for a specific organization. | ||
|
||
How should these users be identified and their access granted? | ||
|
||
Decision | ||
-------- | ||
|
||
In the Studio context, a modified version of "course creator" access will be used to identify Taxonomy Administrators (ref `get_organizations`_): | ||
|
||
#. Global staff and superusers can create/edit/populate/delete Taxonomies for the instance or for any org key. | ||
|
||
#. Users who can create courses for "any organization" access can create/edit/populate/delete Taxonomies for the instance or for any org key. | ||
|
||
#. Users who can create courses only for specific organizations can create/edit/populate/delete Taxonomies with only these org keys. | ||
|
||
|
||
Permission #1 requires no external access, so can be enforced by the ``openedx_tagging`` app. | ||
|
||
But because permissions #2 + #3 require access to the edx-platform CMS model `CourseCreator`_, this access can only be enforced in Studio, and so will live under `cms.djangoapps.tagging` along with the ``ContentTag`` class. Tagging MVP must work for libraries v1, v2 and courses created in Studio, and so tying these permissions to Studio is reasonable for the MVP. | ||
|
||
Per `OEP-9`_, ``openedx_tagging`` will allow applications to use the standard Django API to query permissions, for example: ``user.has_perm('openedx_tagging.edit_taxonomy', taxonomy)``, and the appropriate permissions will be applied in that application's context. | ||
|
||
Rejected Alternatives | ||
--------------------- | ||
|
||
Django users & groups | ||
~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
This is a standard way to grant access in Django apps, but it is not used in Open edX. | ||
|
||
.. _get_organizations: https://github.com/openedx/edx-platform/blob/4dc35c73ffa6d6a1dcb6e9ea1baa5bed40721125/cms/djangoapps/contentstore/views/course.py#L1958 | ||
.. _CourseCreator: https://github.com/openedx/edx-platform/blob/4dc35c73ffa6d6a1dcb6e9ea1baa5bed40721125/cms/djangoapps/course_creators/models.py#L27 | ||
.. _OEP-9: https://open-edx-proposals.readthedocs.io/en/latest/best-practices/oep-0009-bp-permissions.html |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,90 @@ | ||
10. Taxonomy enabled for context | ||
================================ | ||
|
||
Context | ||
------- | ||
|
||
The MVP specification says that taxonomies need to be able to be enabled/disabled for the following contexts: instance, organization, and course. | ||
|
||
Taxonomy Administrators must be able to enable a taxonomy globally for all organizations in an instance, or to set a list of organizations who can use the taxonomy. | ||
|
||
Content Authors must be able to turn taxonomies (instance and org-levels) on/off at the course level. | ||
|
||
Decision | ||
-------- | ||
|
||
When is a taxonomy field shown to course authors in a given course? | ||
|
||
+-------------+-----------------------------+--------------------------+-------------------------------+ | ||
| tax.enabled | tax.enabled_for(course.org) | course enables all tax's | Is taxonomy shown for course? | | ||
+=============+=============================+==========================+===============================+ | ||
| True | True | True | True | | ||
+-------------+-----------------------------+--------------------------+-------------------------------+ | ||
| False | True | True | False | | ||
+-------------+-----------------------------+--------------------------+-------------------------------+ | ||
| True | False | True | False | | ||
+-------------+-----------------------------+--------------------------+-------------------------------+ | ||
| True | True | False | False | | ||
+-------------+-----------------------------+--------------------------+-------------------------------+ | ||
| False | True | False | False | | ||
+-------------+-----------------------------+--------------------------+-------------------------------+ | ||
| False | False | True | False | | ||
+-------------+-----------------------------+--------------------------+-------------------------------+ | ||
|
||
.. _Course | ||
Course | ||
~~~~~~ | ||
|
||
We will add a Course `Advanced Settings`_ that allows course authors to enable/disable *all available taxonomies* for a given course. | ||
|
||
In order for a given taxonomy to be "available to a course", it must be enabled in the :ref:`Instance` context and the course's :ref:`Organization` context. | ||
|
||
Disabling taxonomies for a course will remove/hide the taxonomy fields from the course edit page and unit edit page(s), and tags will not be shown in Studio for that course. LMS use of tags is outside of this MVP. | ||
|
||
Future versions may add more granularity to these settings, to be determined by user needs. | ||
|
||
.. _Instance | ||
Instance | ||
~~~~~~~~ | ||
|
||
Taxonomy contains a boolean ``enabled`` field. | ||
|
||
A Taxonomy can be disabled for all contexts by setting ``enabled = False``. | ||
If ``enabled = True``, then the :ref:`Organization` and :ref:`Course` contexts determine whether a taxonomy will be shown to course authors. | ||
|
||
.. _Organization | ||
Organization | ||
~~~~~~~~~~~~ | ||
|
||
OrgTaxonomy has a many-to-many relationship with the Organization model, accessed via the ``org_owners`` field. OrgTaxonomy lives under `cms.djangoapps.tagging` and so has access to the Organization model and logic in Studio. | ||
|
||
An OrgTaxonomy is enabled for all organizations if ``org_owners == []``. | ||
If there are any ``org_owners`` set, then the OrgTaxonomy is only enabled for those orgas, i.e. only courses in these orgs will see the taxonomy field in Studio. | ||
|
||
Allowing multiple orgs to access a taxonomy reduces redundancy in data and maintenance. | ||
|
||
Rejected Alternatives | ||
--------------------- | ||
|
||
Single org per taxonomy | ||
~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
Having a single org on a taxonomy is simpler from an implementation perspective, but the UI/UX frames demonstrated that it is simpler for the user to maintain a single taxonomy for multiple orgs. | ||
|
||
Course Waffle Flags | ||
~~~~~~~~~~~~~~~~~~~ | ||
|
||
Use `Course Waffle Flags`_ to enable/disable all taxonomies for a given course. | ||
|
||
Waffle flags can only be changed by instance superusers, but the MVP specifically requires that content authors have control over this switch. | ||
|
||
|
||
Link courses to taxonomies | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
Link individual courses as enabled/disabled to specific taxonomies. | ||
This was deemed too granular for the MVP, and the data structures and UI can be simplified by using a broader on/off flag. | ||
|
||
|
||
.. _Advanced Settings: https://github.com/openedx/edx-platform/blob/4dc35c73ffa6d6a1dcb6e9ea1baa5bed40721125/cms/djangoapps/models/settings/course_metadata.py#L28 | ||
.. _Course Waffle Flags: https://github.com/openedx/edx-platform/blob/4dc35c73ffa6d6a1dcb6e9ea1baa5bed40721125/openedx/core/djangoapps/waffle_utils/models.py#L14 |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
11. Taxonomy and tag changes | ||
============================ | ||
|
||
Context | ||
------- | ||
|
||
Tagging content may be a labor-intensive, and the data produced is precious, both for human and automated users. Content tags should be structured and namespaced according to the needs of the instance's taxonomy administrators. But taxonomies themselves need to allow for changes: their tags can be overridden with a single import, they can be deleted, reworked, and their rules changed on the fly. | ||
|
||
What happens to the existing content tags if a Taxonomy or Tag is renamed, moved, or deleted? | ||
|
||
Decision | ||
-------- | ||
|
||
Preserve content tag name:value pairs even if the associated taxonomy or tag is removed. | ||
Reflect name:value changes from the linked taxonomy:tag immediately to the user. | ||
|
||
Content tags (through their base ObjectTag class) store a foreign key to their Taxonomy and Tag (if relevant), but they also store a copy of the Taxonomy.name and Tag.value, which can be used if there is no Taxonomy or Tag available. | ||
|
||
We consult the authoritative Taxonomy.name and Tag.value whenever displaying a content tag, so any changes are immediately reflected to the user. | ||
|
||
If a Taxonomy or Tag is deleted, the linked content tags will remain, and cached copies of the name:value pair will be displayed. | ||
|
||
This cached "value" field enables content tags (through their base ObjectTag class) to store free-text tag values, so that the free-text Taxonomy itself need not be modified when new free-text tags are added. | ||
|
||
This extra storage also allows tagged content to be imported independently of a taxonomy. The taxonomy (and appropriate tags) can be constructed later, and content tags validated and re-synced by editing the content tag or by running a maintenance command. | ||
|
||
Rejected Alternatives | ||
--------------------- | ||
|
||
Require foreign keys | ||
~~~~~~~~~~~~~~~~~~~~ | ||
|
||
Require a foreign key from a content tag to Taxonomy for the name, and Tag for the value. | ||
|
||
Only using foreign keys puts the labor-intensive content tag data at risk during taxonomy changes, and requires free-text tags to be made part of a taxonomy. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
"""Open edX Tagging app.""" |
Empty file.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
""" Tagging app admin """ | ||
from django.contrib import admin | ||
|
||
from .models import TagContent | ||
|
||
admin.site.register(TagContent) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
""" | ||
tagging Django application initialization. | ||
""" | ||
|
||
from django.apps import AppConfig | ||
|
||
|
||
class TaggingConfig(AppConfig): | ||
""" | ||
Configuration for the tagging Django application. | ||
""" | ||
|
||
name = "openedx_tagging.core.tagging" | ||
verbose_name = "Tagging" | ||
default_auto_field = 'django.db.models.BigAutoField' | ||
label = "oel_tagging" |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
# Generated by Django 3.2.19 on 2023-05-25 21:08 | ||
|
||
from django.db import migrations, models | ||
|
||
|
||
class Migration(migrations.Migration): | ||
|
||
initial = True | ||
|
||
dependencies = [ | ||
] | ||
|
||
operations = [ | ||
migrations.CreateModel( | ||
name='TagContent', | ||
fields=[ | ||
('id', models.BigAutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')), | ||
('content_id', models.CharField(max_length=255)), | ||
('name', models.CharField(max_length=255)), | ||
('value', models.CharField(max_length=765)), | ||
], | ||
), | ||
] |
Empty file.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,39 @@ | ||
""" Tagging app data models """ | ||
from django.db import models | ||
|
||
|
||
class TagContent(models.Model): | ||
""" | ||
TODO: This is a basic representation of TagContent, it may change | ||
depending on the discovery in: | ||
https://docs.google.com/document/d/13zfsGDfomSTCp_G-_ncevQHAb4Y4UW0d_6N8R2PdHlA/edit#heading=h.o6fm1hktwp7b | ||
This is the representation of a tag that is linked to a content | ||
Name-Value pair | ||
----------------- | ||
We use a Field-Value Pair representation, where `name` functions | ||
as the constant that defines the field data set, and `value` functions | ||
as the variable tag lineage that belong to the set. | ||
Value lineage | ||
--------------- | ||
`value` are stored as null character-delimited strings to preserve the tag's lineage. | ||
We use the null character to avoid choosing a character that may exist in a tag's name. | ||
We don't use the Taxonomy's user-facing delimiter so that this delimiter can be changed | ||
without disrupting stored tags. | ||
""" | ||
|
||
# Content id to which this tag is associated | ||
content_id = models.CharField(max_length=255) | ||
|
||
# It's not usually large | ||
name = models.CharField(max_length=255) | ||
|
||
# Tag lineage value. | ||
# | ||
# The lineage can be large. | ||
# TODO: The length is under discussion | ||
value = models.CharField(max_length=765) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
Tagging App | ||
============== | ||
|
||
The ``tagging`` app will enable content authors to tag pieces of content and quickly | ||
filter for ease of re-use. | ||
|
||
Motivation | ||
---------- | ||
|
||
Tagging content is central to enable content re-use, facilitate the implementation | ||
of flexible content structures different from the current implementation and | ||
allow adaptive learning in the Open edX platform. | ||
|
||
This service has been implemented as pluggable django app. Since it is necessary for | ||
it to work independently of the content to which it links to. | ||
|
||
Setup | ||
--------- | ||
|
||
**TODO:** We need to wait the discussion of the `Taxonomy discovery <https://docs.google.com/document/d/13zfsGDfomSTCp_G-_ncevQHAb4Y4UW0d_6N8R2PdHlA/edit#heading=h.o6fm1hktwp7b>`_. | ||
to build a proper setup for linking different pieces of content. | ||
|
||
The current approach is to save the id of the content through a generic string, | ||
so that the tag can be linked to any type of content, no matter what type of ID have. | ||
For example, with this approach we can link standalone blocks and library blocks, | ||
both on Denver (v3) and Blockstore Content Libraries (v2). |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.