Syntax errors in README examples #219

Closed
arnavgarg1 opened this issue Aug 14, 2023 · 5 comments · Fixed by #240
@arnavgarg1

I think the README might be outdated with the new 0.0.8 release

@arnavgarg1
Author

>>> import outlines.text.generate as generate
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/homebrew/lib/python3.8/site-packages/outlines/__init__.py", line 4, in <module>
    from outlines.text import prompt
  File "/opt/homebrew/lib/python3.8/site-packages/outlines/text/__init__.py", line 2, in <module>
    from .generate import continuation
  File "/opt/homebrew/lib/python3.8/site-packages/outlines/text/generate/__init__.py", line 2, in <module>
    from .regex import choice, float, integer, json, regex
  File "/opt/homebrew/lib/python3.8/site-packages/outlines/text/generate/regex.py", line 11, in <module>
    from outlines.text.json_schema import build_regex_from_schema
  File "/opt/homebrew/lib/python3.8/site-packages/outlines/text/json_schema.py", line 204
    match step:
          ^
SyntaxError: invalid syntax

Not sure what's going on here, maybe it's a jsonschema package version mismatch? I'm on 4.6.2

@rlouf
Member

rlouf commented Aug 15, 2023

Thank you for opening an issue! It looks like a Python version issue. Make sure to install Python >= 3.10. Let me know if examples still fail.
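
To illustrate the diagnosis: `match` only became valid syntax in Python 3.10, so an older interpreter fails while parsing the module, before any of its code runs. A minimal sketch of a check, where the helper name `supports_match_statement` is hypothetical and not part of outlines:

```python
import sys

# A tiny program that uses structural pattern matching.
# On Python < 3.10 this text cannot even be parsed.
SNIPPET = """
match step:
    case str():
        kind = "string"
    case _:
        kind = "other"
"""


def supports_match_statement() -> bool:
    """Return True if this interpreter can parse a `match` statement."""
    try:
        # compile() only parses the source; nothing is executed.
        compile(SNIPPET, "<snippet>", "exec")
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    print(sys.version_info[:2], supports_match_statement())
```

Running this under the interpreter that produced the traceback would confirm whether the Python version, rather than the jsonschema package, is the cause.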

@rlouf rlouf self-assigned this Aug 15, 2023
@davidberenstein1957

davidberenstein1957 commented Aug 16, 2023

Hi, I noticed this too, but I would say rewriting the code might be better than forcing people to install 3.10. But I don't know how much work this would be. #234

outlines/text/json_schema.py

def match_step_to_regex(step):
    """Translate an element of a JSON schema to a regex that defines its content.

    Parameters
    ----------
    step:
        A string that represents the schema's structure, or a dictionary
        that represents a field in the schema.

    Returns
    -------
    A string that represents a regular expression that defines the value of the
    schedule's step.

    """
    if isinstance(step, str):
        return step

    if isinstance(step, dict) and "enum" in step:
        choices = step["enum"]
        if step.get("type") == "string":
            choices = [f'"{choice}"' for choice in choices]
        else:
            choices = [str(choice) for choice in choices]
        return f"({'|'.join(choices)})"

    if isinstance(step, dict) and "type" in step and step["type"] == "array" and "items" in step:
        item_regexes = [match_step_to_regex(item) for item in step["items"]]
        return rf"\[({','.join(item_regexes)})(,({','.join(item_regexes)}))*\]"

    if isinstance(step, dict) and "type" in step and step["type"] == "object":
        object_regexes = [match_step_to_regex(value) for value in step.values()]
        return ''.join(object_regexes)

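    if isinstance(step, dict) and "type" in step and step["type"] == "string":
        # Emit a bounded quantifier when either length constraint is present
        # (both bounds are honored if given together); an unconstrained
        # string falls through to the generic type lookup below.
        if "minLength" in step or "maxLength" in step:
            min_length = step.get("minLength", "")
            max_length = step.get("maxLength", "")
            return f'".{{{min_length},{max_length}}}"'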

    if isinstance(step, dict) and "type" in step:
        field_type = step["type"]
        return type_to_regex[field_type]

    if isinstance(step, dict) and "anyOf" in step:
        choices = step["anyOf"]
        regexes = [match_step_to_regex(choice) for choice in choices]
        return rf"({'|'.join(regexes)})"

    raise NotImplementedError(f"Unsupported step: {step}")

and a type hint in outlines/text/prompts.py:

def get_schema_pydantic(model: "type[BaseModel]"):
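
For context on the quoted hint: subscripting the builtin `type` (as in `type[BaseModel]`) at runtime requires Python 3.9 or later, whereas a quoted annotation is stored as a plain string and is not evaluated when the function is defined. A small sketch, assuming a stand-in `BaseModel` class instead of pydantic's and a hypothetical function name:

```python
class BaseModel:
    """Stand-in for pydantic.BaseModel (assumption: keeps the sketch dependency-free)."""


class User(BaseModel):
    pass


# The annotation is a string literal, so `type[BaseModel]` is never
# evaluated here and older interpreters do not choke on the subscript.
def get_schema_name(model: "type[BaseModel]") -> str:
    return model.__name__


if __name__ == "__main__":
    print(get_schema_name.__annotations__["model"])
    print(get_schema_name(User))
```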

@davidberenstein1957

I can create a PR for this if you allow me to work on it.

@rlouf
Member

rlouf commented Aug 16, 2023

Thank you for offering to help! Python 3.10 introduced structural pattern matching, which helps keep parts of the codebase sane, specifically the JSON schema code you pointed out. That code currently supports only a subset of JSON Schema, and as we add field constraints (#215) a version without pattern matching will quickly become unmanageable and hard to understand. I would prefer not to change this requirement at this time. Nevertheless, I'm open to arguments.

@brandonwillard brandonwillard changed the title from "None of the README examples work" to "Syntax errors in README examples" Aug 16, 2023
@rlouf rlouf linked a pull request Aug 16, 2023 that will close this issue