Improve TLM documentation (#216)
Co-authored-by: Jonas Mueller <[email protected]>
Co-authored-by: Ulyana <[email protected]>
3 people authored Apr 16, 2024
1 parent 0d1876c commit 0a42e31
Showing 2 changed files with 134 additions and 85 deletions.
37 changes: 24 additions & 13 deletions cleanlab_studio/studio/studio.py
@@ -391,28 +391,39 @@ def TLM(
timeout: Optional[float] = None,
verbose: Optional[bool] = None,
) -> trustworthy_language_model.TLM:
- """Gets a configured instance of Trustworthy Language Model (TLM).
+ """Instantiates a configured Trustworthy Language Model (TLM) instance.
- The returned TLM object can then be used as a drop-in replacement for an LLM, for estimating trustworthiness scores for LLM prompt/response pairs, and more. See the documentation for the [TLM](../trustworthy_language_model#class-TLM) class for more on what you can do with TLM.
+ The TLM object can be used as a drop-in replacement for an LLM, or for estimating trustworthiness scores for arbitrary text prompt/response pairs, and more (see the [TLM documentation](../trustworthy_language_model#class-TLM)).
- For advanced use cases, TLM supports a number of configuration options. The documentation below summarizes the options, and the [TLM tutorial](/tutorials/tlm) explains the tradeoffs in more detail.
+ For advanced use, TLM offers configuration options. The documentation below summarizes these options, and more details are explained in the [TLM tutorial](/tutorials/tlm).
Args:
- quality_preset (TLMQualityPreset): quality preset to use for TLM queries, which will determine the quality of the output responses and trustworthiness scores.
- Supported presets include "best", "high", "medium", "low", "base".
- The "best" and "high" presets will improve the LLM responses themselves, with "best" also returning the most reliable trustworthiness scores.
- The "medium" and "low" presets will return standard LLM responses along with associated confidence scores,
- with "medium" producing more reliable trustworthiness scores than low.
- The "base" preset will not return any confidence score, just a standard LLM output response, this option is similar to using your favorite LLM API.
- Higher presets have increased runtime and cost.
+ quality_preset (TLMQualityPreset): An optional preset to control the quality of TLM responses and trustworthiness scores vs. runtimes/costs.
+ TLMQualityPreset is a string specifying one of the supported presets: "best", "high", "medium", "low", "base".
+ The "best" and "high" presets return improved LLM responses,
+ with "best" also returning more reliable trustworthiness scores than "high".
+ The "medium" and "low" presets return standard LLM responses along with associated trustworthiness scores,
+ with "medium" producing more reliable trustworthiness scores than "low".
+ The "base" preset returns no trustworthiness score, just a standard LLM response, and is similar to directly using your favorite LLM API.
+ Higher presets have increased runtime and cost (and may internally consume more tokens).
+ Reduce your preset if you see token-limit errors.
+ Details about each preset are in the documentation for [TLMOptions](../trustworthy_language_model#class-tlmoptions).
+ Avoid the "best" or "high" presets if you primarily want trustworthiness scores and are less concerned with improving LLM responses.
+ These presets have higher runtime/cost and are optimized to return more accurate LLM outputs, but not necessarily more reliable trustworthiness scores.
options (TLMOptions, optional): a typed dict of advanced configuration options.
- Options that can be passed in include "model", "max_tokens", "num_candidate_responses", "num_consistency_samples", "use_self_reflection".
+ Available options (keys in this dict) include "model", "max_tokens", "num_candidate_responses", "num_consistency_samples", "use_self_reflection".
For more details about the options, see the documentation for [TLMOptions](../trustworthy_language_model#class-tlmoptions).
If specified, these override any settings from the choice of `quality_preset`.
- timeout (float, optional): timeout (in seconds) to apply to each method call. If a result is not produced within the timeout, a TimeoutError will be raised. Defaults to None, which does not apply a timeout.
+ timeout (float, optional): timeout (in seconds) to apply to each TLM prompt.
+ If a batch of data is passed in, the timeout will be applied to each individual item in the batch.
+ If a result is not produced within the timeout, a TimeoutError will be raised. Defaults to None, which does not apply a timeout.
- verbose (bool, optional): whether to run in verbose mode, i.e., whether to show a tqdm progress bar when TLM is prompted with batches of data. If None, this will be determined automatically based on whether the code is running in an interactive environment such as a notebook.
+ verbose (bool, optional): whether to print outputs during execution, i.e., whether to show a progress bar when TLM is prompted with batches of data.
+ If None, this will be determined automatically based on whether the code is running in an interactive environment such as a Jupyter notebook.
Returns:
TLM: the [Trustworthy Language Model](../trustworthy_language_model#class-tlm) object
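
To make the documented argument contract concrete, here is a minimal sketch. The `validate_tlm_config` helper below is hypothetical (not part of cleanlab_studio); only the preset names and TLMOptions keys come from the docstring above, and the commented-out lines show how the same arguments would be passed to `Studio.TLM`, which requires a real Cleanlab Studio API key.

```python
# Hypothetical sanity-check helper for the TLM configuration documented above.
# The preset and option-key names come from the docstring; the validation
# logic itself is illustrative and not part of the cleanlab_studio library.
SUPPORTED_PRESETS = {"best", "high", "medium", "low", "base"}
SUPPORTED_OPTION_KEYS = {
    "model",
    "max_tokens",
    "num_candidate_responses",
    "num_consistency_samples",
    "use_self_reflection",
}

def validate_tlm_config(quality_preset, options=None, timeout=None):
    """Check TLM arguments against the documented contract before calling Studio.TLM."""
    if quality_preset not in SUPPORTED_PRESETS:
        raise ValueError(f"unsupported quality_preset: {quality_preset!r}")
    options = options or {}
    unknown = set(options) - SUPPORTED_OPTION_KEYS
    if unknown:
        raise ValueError(f"unknown TLMOptions keys: {sorted(unknown)}")
    if timeout is not None and timeout <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    return {"quality_preset": quality_preset, "options": options, "timeout": timeout}

config = validate_tlm_config(
    "medium",                                      # reliable scores at moderate cost
    options={"model": "gpt-4", "max_tokens": 512},
    timeout=60,                                    # seconds per individual prompt
)

# With a valid Cleanlab Studio API key, the same arguments would be passed as:
#   from cleanlab_studio import Studio
#   studio = Studio("<YOUR_API_KEY>")
#   tlm = studio.TLM(**config, verbose=True)
```

Catching an unsupported preset or a misspelled option key locally, before any network call, avoids paying for a request that the API would reject anyway.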