-
Notifications
You must be signed in to change notification settings - Fork 3.9k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
1 parent
78f4926
commit d214f8f
Showing
3 changed files
with
375 additions
and
91 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,373 @@ | ||
4. Upstream and downstream content | ||
################################## | ||
|
||
Status | ||
****** | ||
|
||
Accepted. | ||
|
||
Implementation in progress as of 2024-09-03. | ||
|
||
Context | ||
******* | ||
|
||
We are replacing the existing Legacy ("V1") Content Libraries system, based on | ||
ModuleStore, with a Relaunched ("V2") Content Libraries system, based on | ||
Learning Core. V1 and V2 libraries will coexist for at least one release to | ||
allow for migration; eventually, V1 libraries will be removed entirely. | ||
|
||
Content from V1 libraries can only be included into courses using the | ||
LibraryContentBlock (called "Randomized Content Module" in Studio), which works | ||
like this: | ||
|
||
* Course authors add a LibraryContentBlock to a Unit and configure it with a | ||
library key and a count of N library blocks to select (or `-1` for "all | ||
blocks"). | ||
|
||
* For each block in the chosen library, its *content definition* is copied into | ||
the course as a child of the LibraryContentBlock, whereas its *settings* are | ||
copied into a special "default" settings dictionary in the course's structure | ||
document--this distinction will matter later. The usage key of each copied | ||
block is derived from a hash of the original library block's usage key plus | ||
the LibraryContentBlock's own usage key--this will also matter | ||
later. | ||
|
||
* The course author is free to override the content and settings of the | ||
course-local copies of each library block. | ||
|
||
* When any update is made to the library, the course author is prompted to | ||
update the LibraryContentBlock. This involves re-copying the library blocks' | ||
content definitions and default settings, which clobbers any overrides they | ||
have made to content, but preserves any overrides they have made to settings. | ||
Furthermore, any blocks that were added to the library are newly copied into | ||
the course, and any blocks that were removed from the library are deleted | ||
from the course. For all blocks, usage keys are recalculated using the same | ||
hash derivation described above; for existing blocks, it is important that | ||
this recalculation yields the same usage key so that student state is not | ||
lost. | ||
|
||
* Over in the LMS, when a learner loads LibraryContentBlock, they are shown a | ||
list of N randomly-picked blocks from the library. Subsequent visits show | ||
them the same list, *unless* children were added, children were removed, or N | ||
changed. In those cases, the LibraryContentBlock tries to make the smallest | ||
possible adjustment to their personal list of blocks while respecting N and | ||
the updated list of children. | ||
|
||
This system has several issues: | ||
|
||
#. **Missing defaults after import:** When a course with a LibraryContentBlock | ||
is imported into an Open edX instance *without* the referenced library, the | ||
blocks' *content* will remain intact as will course-local *settings | ||
overrides*. However, any *default settings* defined in the library will be | ||
missing. This can result in content that is completely broken, especially | ||
since critical fields like video URLs and LTI URLs are considered | ||
"settings". For a detailed scenario, see `LibraryContentBlock Curveball 1`_. | ||
|
||
#. **Strange behavior when duplicating content:** Typically, when a | ||
block is duplicated or copy-pasted, the new block's usage key and its | ||
children's usage keys are randomly generated. However, recall that when a | ||
LibraryContentBlock is updated, its children's usage keys are rederived | ||
using a hash function. That would cause the children's usage keys to change, | ||
thus destroying any student state. So, we must work around this with a hack: | ||
upon duplicating or pasting a LibraryContentBlock, we immediately update the | ||
LibraryContentBlock, thus discarding the problematic randomly-generated keys | ||
in favor of hash-derived keys. This works, but: | ||
|
||
* it involves weird code hacks, | ||
* it unexpectedly discards any content overrides the course author made to | ||
the copied LibraryContentBlock's children, | ||
* it unexpectedly uses the latest version of library content, regardless of | ||
which version the copied LibraryContentBlock was using, and | ||
* it fails if the library does not exist on the Open edX instance, which | ||
can happen if the course was imported from another instance. | ||
|
||
#. **Conflation of reference and randomization:** The LibraryContentBlock does | ||
two things: it connects courses to library content, and it shows users a | ||
random subset of content. There is no reason that those two features need to | ||
be coupled together. A course author may want to randomize course-defined | ||
content, or they may want to randomize content from multiple different | ||
libraries. Or, they may want to use content from libraries without | ||
randomizing it at all. While it is feasible to support all these things in a | ||
single XBlock, trying to do so led to a `very complicated XBlock concept`_ | ||
which difficult to explain to product managers and other engineers. | ||
|
||
#. **Unpredictable preservation of overrides:** Recall that *content | ||
definitions* and *settings* are handled differently. This distinction is | ||
defined in the code: every authorable XBlock field is either defined with | ||
`Scope.content` or `Scope.settings`. In theory, XBlock developers would use | ||
the content scope for fields that are core to the meaning of piece of | ||
content, and they would only use the settings scope for fields that would be | ||
reasonable to configure in a local copy of the piece of content. In | ||
practice, though, XBlock developers almost always use `Scope.settings`. The | ||
result of this is that customizations to blocks *almost always* survive | ||
through library updates, except when they don't. Course authors have no way | ||
to know (or even guess) when their customizations they will and won't | ||
survive updates. | ||
|
||
#. **General pain and suffering:** The relationship between courses and V1 | ||
libraries is confusing to content authors, site admins, and developers | ||
alike. The behaviors above toe the line between "quirks" and "known bugs", | ||
and they are not all documented. Past attempts to improve the system have | ||
`triggered series of bugs`_, some of which led to permanent loss of learner | ||
state. In other cases, past Content Libraries improvement efforts have | ||
slowed or completely stalled out in code review due to the overwhelming | ||
amount of context and edge cases that must be understood to safely make any | ||
changes. | ||
|
||
.. _LibraryContentBlock Curveball 1: https://openedx.atlassian.net/wiki/spaces/COMM/pages/3966795804/Fun+with+LibraryContentBlock+export+import+and+duplication#Curveball-1%3A-Import%2FExport | ||
.. _LibraryContentBlock Curveball 2: https://openedx.atlassian.net/wiki/spaces/COMM/pages/3966795804/Fun+with+LibraryContentBlock+export+import+and+duplication#Curveball-2:-Duplication | ||
.. _very complicated XBlock concept: https://github.com/openedx/edx-platform/blob/master/xmodule/docs/decisions/0003-library-content-block-schema.rst | ||
.. _triggered series of bugs: https://openedx.atlassian.net/wiki/spaces/COMM/pages/3858661405/Bugs+from+Content+Libraries+V1 | ||
|
||
We are keen to use the Library Relaunch project to address all of these | ||
problems. So, V2 libraries will interop with courses using a completely | ||
different data model. | ||
|
||
|
||
Decision | ||
******** | ||
|
||
We will create a framework where a *downstream* piece of content (e.g. a course | ||
block) can be *linked* to an *upstream* piece of content (e.g., a library | ||
block) with the following properties: | ||
|
||
* **Portable:** Links can refer to certain content on the current Open edX | ||
instance, and in the future they may be able to refer to content on other | ||
Open edX instances or sites. Links will never include information that is | ||
internal to a particular Open edX instance, such as foreign keys. | ||
|
||
* **Flat:** The *link* is a not a wrapper (like the LibraryContentBlock), | ||
but simply a piece of metadata directly on the downstream content which | ||
points to the upstream content. We will no longer rely on precarious | ||
hash-derived usage keys to establish connection to upstream blocks; | ||
like any other block, an upstream-link blocks can be granted whatever block | ||
ID that the authoring environment assigns it, whether random or | ||
human-readable. | ||
|
||
* **Forwards-compatible:** If downstream content is created in a course on | ||
an Open edX site that supports upstream and downstreams (e.g., a Teak | ||
instance), and then it is exported and imported into a site that doesn't | ||
(e.g., a Quince instance), the downstream content will simply act like | ||
regular course content. | ||
|
||
* **Independent:** Upstream content and downstream content exist separately | ||
from one another: | ||
|
||
* Modifying upstream content does not affect any downstream content (unless a | ||
sync happens, more on that later). | ||
* Deleting upstream content does not impact its downstream content. By | ||
corollary, pieces of downstream content can completely and correctly render | ||
on Open edX instances that are missing their linked upstream content. | ||
* (Preserving a positive feature of the V1 LibraryContentBlock) The link | ||
persists through export-import and copy-paste, regardless of whether the | ||
upstream content actually exists. A "broken" link to upstream content is | ||
seamlessly "repaired" if the upstream content becomes available again. | ||
|
||
* **Customizable:** On an OLX level, authors can still override the value | ||
of any field for a piece of downstream content. However, we will empower | ||
Studio to be more prescriptive about what authors *can* override versus what | ||
they *should* override: | ||
|
||
* We define a set of *customizable* fields, with platform-level defaults | ||
like display_name and a max_attempts, plus the ability for external | ||
XBlocks to opt their own fields into customizability. | ||
* Studio may use this list to provide an interface for customizing | ||
downstream blocks, separate from the usual "Edit" interface that would | ||
permit them to make unsafe overrides. | ||
* Furthermore, downstream content will record which fields the user has | ||
customized... | ||
|
||
* even if the customization is to simply clear the value of the fields... | ||
* and even if the customization is made redundant in a future version of | ||
the upstream content. For example, if max_attempts is customized from 3 | ||
to 5 in the downstream content, but the next version of the upstream | ||
content also changes max_attempts to 5, the downstream would still | ||
consider max_attempts to be customized. If the following version of the | ||
upstream content again changed max_attempts to 6, the downstream would | ||
retain max_attempts to be 5. | ||
|
||
* Finally, the downstream content will locally save the upstream value of | ||
customizable fields, allowing the author to *revert* back to them | ||
regardless of whether the upstream content is actually available. | ||
|
||
* **Synchronizable, without surprises:** Downstream content can be *synced* | ||
with updates that have been made to its linked upstream. This means that the | ||
latest available upstream content field values will entirely replace all of | ||
the downstream field values, *except* those which were customized, as | ||
described in the previous item. | ||
|
||
* **Concrete, but flexible:** The internal implementation of upstream-downstream | ||
syncing will assume that: | ||
|
||
* upstream content belongs to a V2 content library, | ||
* downstream content belongs to a course on the same instance, and | ||
* the link is the stringified usage key of the upstream library content. | ||
|
||
This will allow us to keep the implementation straightforward. However, we | ||
will *not* expose these assumptions in the Python APIs, the HTTP APIs, or in | ||
the persisted fields, allowing us in the future to generalize to other | ||
upstreams (such as externally-hosted libraries) and other downstreams (such | ||
as a standalone enrollable sequence without a course). | ||
|
||
* **Decoupled:** Upstream-downstream linking is not tied up with any other | ||
courseware feature; in particular, it is unrelated to content randomization. | ||
Randomized library content will be supported, but it will be a *synthesis* of | ||
two features: (1) a RandomizationBlock that randomly selects a subset of its | ||
children, where (2) some or all of those children are linked to upstream | ||
blocks. | ||
|
||
Consequences | ||
************ | ||
|
||
We will implement the core logic in a new module | ||
``cms/lib/xblock/upstream_sync.py``. Here is a truncated excerpt from that new | ||
module, focusing on the new HTTP and Python APIs which will be added: | ||
|
||
.. code-block:: python | ||
########################################################################### | ||
# cms/lib/xblock/upstream_sync.py | ||
########################################################################### | ||
class UpstreamSyncMixin(XBlockMixin): | ||
""" | ||
Allows an XBlock in the CMS to be associated & synced with an upstream. | ||
Mixed into CMS's XBLOCK_MIXINS, but not LMS's. | ||
""" | ||
# Metadata related to upstream synchronization | ||
upstream = String( | ||
help=(""" | ||
The usage key of a block (generally within a content library) | ||
which serves as a source of upstream updates for this block, | ||
or None if there is no such upstream. Please note: It is valid | ||
for this field to hold a usage key for an upstream block | ||
that does not exist (or does not *yet* exist) on this instance, | ||
particularly if this downstream block was imported from a | ||
different instance. | ||
"""), | ||
default=None, scope=Scope.settings, hidden=True, enforce_type=True | ||
) | ||
upstream_version = Integer( | ||
help=(""" | ||
Record of the upstream block's version number at the time this | ||
block was created from it. If upstream_version is smaller | ||
than the upstream block's latest version, then the user will be | ||
able to sync updates into this downstream block. | ||
"""), | ||
default=None, scope=Scope.settings, hidden=True, enforce_type=True, | ||
) | ||
downstream_customized = List( | ||
help=(""" | ||
Names of the fields which have values set on the upstream | ||
block yet have been explicitly overridden on this downstream | ||
block. Unless explicitly cleared by the user, these | ||
customizations will persist even when updates are synced from | ||
the upstream. | ||
"""), | ||
default=[], scope=Scope.settings, hidden=True, enforce_type=True, | ||
) | ||
# Store upstream defaults for customizable fields. | ||
upstream_display_name = String(...) | ||
upstream_max_attempts = List(...) | ||
... # We will probably want to pre-define several more of these. | ||
def get_upstream_field_names(cls) -> dict[str, str]: | ||
""" | ||
Mapping from each customizable field to field which stores its upstream default. | ||
XBlocks outside of edx-platform can override this in order to set | ||
up their own customizable fields. | ||
""" | ||
return { | ||
"display_name": "upstream_display_name", | ||
"max_attempts": "upstream_max_attempts", | ||
} | ||
def save(self, *args, **kwargs): | ||
""" | ||
Update `downstream_customized` when a customizable field is modified. | ||
Uses `get_upstream_field_names` keys as the list of fields that are | ||
customizable. | ||
""" | ||
... | ||
@XBlock.handler | ||
def sync_updates(self, request: Request, suffix=None) -> Response: | ||
""" | ||
POST handler which applies updates from the upstream block. | ||
No request/response data. | ||
Responses: | ||
200: Success | ||
400: Upstream is missing or invalid | ||
405: Method other than POST | ||
""" | ||
... | ||
def sync_from_upstream(self, *, user: User, apply_updates: bool) -> None: | ||
""" | ||
Python API for loading updates from upstream block. | ||
Can choose whether or not to actually apply those updates... | ||
apply_updates=False: Think "get fetch". | ||
Use case: course import. | ||
apply_updates=True: Think "git pull". | ||
Use case: sync_updates handler. | ||
Raises: InvalidKeyError, UnsupportedUpstreamKeyType, XBlockNotFoundError | ||
""" | ||
... | ||
def get_upstream_info(self) -> UpstreamInfo | None: | ||
""" | ||
Python API for upstream metadata, or None. | ||
Raises: InvalidKeyError, XBlockNotFoundError | ||
""" | ||
... | ||
@dataclass(frozen=True) | ||
class UpstreamInfo: | ||
""" | ||
Metadata about a block's relationship with an upstream. | ||
""" | ||
usage_key: UsageKey | ||
current_version: int | ||
latest_version: int | None | ||
sync_url: str | ||
error: str | None | ||
@property | ||
def sync_available(self) -> bool: | ||
""" | ||
Should the user be prompted to sync this block with upstream? | ||
""" | ||
return ( | ||
self.latest_version | ||
and self.current_version < self.latest_version | ||
and not self.error | ||
) | ||
def validate_upstream_key(usage_key: UsageKey | str) -> UsageKey: | ||
""" | ||
Raise an error if the provided key is not a valid upstream reference. | ||
Instead of explicitly checking whether a key is a LibraryLocatorV2, | ||
callers should validate using this function, and use an `except` clause | ||
to handle the case where the key is not a valid upstream. | ||
Raises: InvalidKeyError, UnsupportedUpstreamKeyType | ||
""" | ||
... | ||
Here is what the OLX for a library-sourced Problem XBlock in a course might | ||
look like: | ||
|
||
.. code-block:: xml | ||
<problem | ||
display_name="A title that has been customized in the course" | ||
max_attempts="2" | ||
upstream="lb:myorg:mylib:problem:p1" | ||
upstream_version="12" | ||
downstream_customized="["display_name","max_attempts"]" | ||
upstream_display_name="The title that was defined in the library block" | ||
upstream_max_attempts="3" | ||
> | ||
<!-- problem content would go here --> | ||
</problem> |
Oops, something went wrong.