
Cfg beta #85

Open · wants to merge 3 commits into main

Conversation

@lapp0 (Owner) commented Jul 24, 2024

Fixes:

Rendered Docs: https://github.com/lapp0/outlines/blob/cfg-beta/docs/reference/creating_grammars.md

Changes

CFGGuide

  • Created a stateless CFGGuide based on Brandon Willard's implementation in examples/parsing.py
  • Update outlines.fsm.parsing to handle some edge cases
    • Mark EOS valid if $END is a legal next terminal
    • Bug fix: previously, tokens that exceeded the bounds of the terminal but had no matching subsequent terminal candidate were still marked as valid.
  • Delete CFGFSM
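"Stateless" here means the guide object holds no mutable parser state; each step takes and returns an explicit, immutable state. The sketch below illustrates that pattern with a toy guide over the language (ab)* — all names are illustrative, not the actual CFGGuide API — including the rule above that EOS is only allowed when the sequence so far is a complete sentence:

```python
from dataclasses import dataclass, replace

EOS = -1  # sentinel token id for end-of-sequence

@dataclass(frozen=True)
class GuideState:
    """Immutable state: the text accepted so far."""
    prefix: str

class ToyStatelessGuide:
    """Toy stand-in for a stateless CFG guide over the language (ab)*."""

    def __init__(self, vocab):
        self.vocab = vocab  # token id -> string

    def get_next_instruction(self, state):
        """Return the token ids whose strings keep the prefix inside (ab)*."""
        allowed = [
            tid for tid, s in self.vocab.items()
            if self._valid_prefix(state.prefix + s)
        ]
        # EOS is allowed only when the prefix is a complete sentence,
        # mirroring "mark EOS valid if $END is a legal next terminal".
        if self._complete(state.prefix):
            allowed.append(EOS)
        return allowed

    def get_next_state(self, state, token_id):
        """Pure transition: returns a new state, never mutates the old one."""
        if token_id == EOS:
            return state
        return replace(state, prefix=state.prefix + self.vocab[token_id])

    @staticmethod
    def _valid_prefix(s):
        # A valid prefix of (ab)* strictly alternates 'a', 'b', 'a', 'b', ...
        return all(c == "ab"[i % 2] for i, c in enumerate(s))

    @staticmethod
    def _complete(s):
        return len(s) % 2 == 0 and ToyStatelessGuide._valid_prefix(s)
```

Because states are values rather than internal parser state, the same guide instance can serve many concurrent generations, which is what the real implementation needs from the logits-processor integration below.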

Grammars

  • Fix ESCAPED_STRING in json.lark and common.lark
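For context, an escaped-string terminal must not let an escaped quote terminate the string early, while still requiring a real closing quote. The actual fix lives in the lark grammars; as an illustration only, the intended behavior expressed as a plain regex:

```python
import re

# A JSON-style escaped string: runs of non-quote, non-backslash characters
# or backslash-escaped pairs, delimited by unescaped double quotes.
ESCAPED_STRING = re.compile(r'"(?:[^"\\]|\\.)*"', re.DOTALL)

def is_escaped_string(s: str) -> bool:
    return ESCAPED_STRING.fullmatch(s) is not None
```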

Integrations

  • Implement outlines.generate.cfg(...) via SequenceGeneratorAdapter
  • Implement outlines.processors.CFGLogitsProcessor
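A logits processor of this kind masks every token the guide disallows before sampling, so invalid continuations get zero probability. A minimal pure-Python sketch of the masking step (the real CFGLogitsProcessor operates on tensors; the function name here is illustrative):

```python
import math

def mask_logits(logits, allowed_token_ids):
    """Set the logits of disallowed tokens to -inf so they can never be sampled."""
    allowed = set(allowed_token_ids)
    return [
        logit if token_id in allowed else -math.inf
        for token_id, logit in enumerate(logits)
    ]

# Only token 1 is allowed; all other logits are masked out.
print(mask_logits([0.1, 2.0, -1.0], [1]))  # [-inf, 2.0, -inf]
```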

Testing

tests/fsm/test_cfg_guide.py

  • test_cfg_next_token: given a sequence of previously generated tokens, assert that exactly the expected next tokens in the vocabulary are allowed.
  • test_cfg_grammar_sample: resurrects tests from an old PR. Each test encodes a sample that is valid under its grammar and asserts that CFGGuide can produce the encoded token sequence. A new test can be created simply by adding an example to tests/cfg_samples/.

test outlines.generate.cfg via tests/generate/test_generate.py
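The sample files follow a naming convention, `<name>.<grammar>.test`, which determines the grammar a sample is validated against (for example, lots_of_ops.arithmetic.test uses the arithmetic grammar). A small helper in that spirit — hypothetical, the real collection logic lives in tests/fsm/test_cfg_guide.py:

```python
from pathlib import Path

def grammar_for_sample(sample_path: str) -> str:
    """Derive the grammar name from a sample file named <name>.<grammar>.test."""
    suffixes = Path(sample_path).suffixes  # e.g. ['.arithmetic', '.test']
    if len(suffixes) < 2 or suffixes[-1] != ".test":
        raise ValueError(f"not a cfg sample file: {sample_path}")
    return suffixes[-2].lstrip(".")
```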

Benchmarks

benchmarks/bench_cfg_guide.py: measure CFGGuide construction time, token run time, and token run peak-memory
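The benchmark separates three measurements: guide construction time, per-token run time, and peak memory during a run. A stdlib-only sketch of how such measurements can be taken (the actual file follows the project's benchmark-framework conventions rather than this helper):

```python
import time
import tracemalloc

def measure(fn, *args):
    """Return (result, wall_seconds, peak_bytes) for a single call to fn."""
    tracemalloc.start()
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, peak
```

Measuring construction and the token loop separately matters here, since CFGGuide construction is a one-time cost while the token run time is paid on every generated token.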

Analysis

Regardless of sequence length (10, 40, or 100 tokens), it takes ~1.2 seconds to generate each token.

Unsurprisingly, get_next_instruction takes most of the time, accounting for over 99.99% of the runtime. This is intuitive: get_next_state applies the same operation, but for a single token rather than once for each of GPT-2's 50,257 tokens.
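The profile below bears this out; a quick back-of-envelope from its figures shows parse_from_state runs tens of thousands of times per generated token, roughly once per surviving vocabulary candidate. (The per-token time under cProfile is higher than the ~1.2 s quoted above, presumably because profiling instrumentation adds overhead.)

```python
# Figures taken from the cProfile output below.
total_runtime_s = 140.176
instruction_calls = 40       # get_next_instruction: one call per generated token
parse_calls = 1_758_994      # parse_from_state at parsing.py:140

seconds_per_token = total_runtime_s / instruction_calls
parses_per_token = parse_calls / instruction_calls

print(f"{seconds_per_token:.1f} s/token, {parses_per_token:,.0f} parses/token")
```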

cProfile:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000  140.176  140.176 {built-in method builtins.exec}
        1    0.000    0.000  140.176  140.176 <string>:1(<module>)
        1    0.003    0.003  140.176  140.176 /home/andrew/p/outlines/profile_cfg.py:15(profile_guide_run)
       40    3.736    0.093  140.159    3.504 /home/andrew/p/outlines/outlines/fsm/guide.py:324(get_next_instruction)
  1758994    0.785    0.000   92.318    0.000 /home/andrew/p/outlines/outlines/fsm/parsing.py:140(parse_from_state)
  1758994    2.055    0.000   91.533    0.000 /home/andrew/p/outlines/outlines/fsm/parsing.py:482(parse_from_state)
  2917115    2.304    0.000   81.840    0.000 /home/andrew/p/outlines/outlines/fsm/parsing.py:630(lex)
  2916020    8.630    0.000   79.111    0.000 /home/andrew/p/outlines/outlines/fsm/parsing.py:696(next_token)
11708177/2913032   11.354    0.000   39.208    0.000 /nix/store/4rf5qybw37b4lh1g0xczlv14sqdbmnpm-python3-3.11.9/lib/python3.11/copy.py:66(copy)
  1759029    2.429    0.000   36.344    0.000 /home/andrew/p/outlines/outlines/fsm/parsing.py:453(__copy__)
  2329598    1.455    0.000   33.410    0.000 /home/andrew/p/outlines/outlines/fsm/parsing.py:693(match)
  2329598    8.935    0.000   31.644    0.000 /home/andrew/p/outlines/outlines/fsm/parsing.py:562(match)
  1759029    1.509    0.000   24.687    0.000 /home/andrew/p/outlines/outlines/fsm/parsing.py:145(__copy__)
  1555742    2.393    0.000   15.974    0.000 /home/andrew/p/outlines/outlines/fsm/parsing.py:545(get_terminals_info)
  2312124    1.152    0.000   14.318    0.000 /home/andrew/p/outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:202(__new__)
  3111484    7.513    0.000   13.219    0.000 /home/andrew/p/outlines/outlines/fsm/regex.py:619(get_sub_fsms_from_seq)
  2312124    1.477    0.000   13.166    0.000 /home/andrew/p/outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:213(_future_new)
  8160509    1.637    0.000   12.493    0.000 {built-in method __new__ of type object at 0x7fee64db5340}
  1759029    1.347    0.000   12.287    0.000 /home/andrew/p/outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:427(__copy__)
2309943/1154085    2.419    0.000   10.857    0.000 /nix/store/4rf5qybw37b4lh1g0xczlv14sqdbmnpm-python3-3.11.9/lib/python3.11/dataclasses.py:233(wrapper)
  3518058    5.254    0.000    8.725    0.000 /nix/store/4rf5qybw37b4lh1g0xczlv14sqdbmnpm-python3-3.11.9/lib/python3.11/copy.py:259(_reconstruct)
  2329614    1.714    0.000    8.284    0.000 /nix/store/mrp9s742bpjwv7lb3rv3ikv8qx72nj0d-python3.11-numba-0.59.1/lib/python3.11/site-packages/numba/core/dispatcher.py:724(typeof_pyval)
  1158121    1.975    0.000    7.087    0.000 /home/andrew/p/outlines/outlines/fsm/parsing.py:362(feed_token)
  2329598    5.019    0.000    6.705    0.000 /home/andrew/p/outlines/outlines/fsm/regex.py:465(walk_fsm)
  2329882    1.531    0.000    6.388    0.000 /nix/store/mrp9s742bpjwv7lb3rv3ikv8qx72nj0d-python3.11-numba-0.59.1/lib/python3.11/site-packages/numba/core/typing/typeof.py:27(typeof)
  2329598    5.240    0.000    6.380    0.000 /home/andrew/p/outlines/outlines/fsm/regex.py:694(get_token_transition_keys)
  3111484    3.793    0.000    5.577    0.000 /home/andrew/p/outlines/outlines/fsm/regex.py:646(<genexpr>)
  1759029    2.521    0.000    5.514    0.000 /home/andrew/p/outlines/outlines/models/transformers.py:96(convert_token_to_string)
  1759029    2.159    0.000    4.904    0.000 /nix/store/4rf5qybw37b4lh1g0xczlv14sqdbmnpm-python3-3.11.9/lib/python3.11/copy.py:128(deepcopy)
14580583/11126799    1.989    0.000    4.371    0.000 {built-in method builtins.isinstance}
2330433/2329882    1.406    0.000    3.828    0.000 /nix/store/4rf5qybw37b4lh1g0xczlv14sqdbmnpm-python3-3.11.9/lib/python3.11/functools.py:904(wrapper)
  1759029    1.323    0.000    2.993    0.000 /nix/store/m7bq08w4hkvyby4s2w04pv6jjh4jk13l-python3.11-transformers-4.41.0/lib/python3.11/site-packages/transformers/tokenization_utils_fast.py:618(convert_tokens_to_string)
   602706    1.057    0.000    2.707    0.000 /home/andrew/p/outlines/.myenv/lib/python3.11/site-packages/lark/exceptions.py:179(__init__)
 27668151    2.677    0.000    2.678    0.000 {method 'get' of 'dict' objects}
 12318040    2.217    0.000    2.217    0.000 {built-in method builtins.getattr}
  1726892    0.754    0.000    2.142    0.000 /nix/store/4rf5qybw37b4lh1g0xczlv14sqdbmnpm-python3-3.11.9/lib/python3.11/typing.py:1327(__instancecheck__)
  3518058    2.115    0.000    2.115    0.000 {method '__reduce_ex__' of 'object' objects}
  2330433    1.183    0.000    2.066    0.000 /nix/store/4rf5qybw37b4lh1g0xczlv14sqdbmnpm-python3-3.11.9/lib/python3.11/functools.py:818(dispatch)
  1154003    0.682    0.000    2.058    0.000 /home/andrew/p/outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:252(new_borrow_pos)
  3518058    1.236    0.000    1.626    0.000 /nix/store/4rf5qybw37b4lh1g0xczlv14sqdbmnpm-python3-3.11.9/lib/python3.11/copyreg.py:104(__newobj__)
   602706    1.090    0.000    1.584    0.000 /home/andrew/p/outlines/.myenv/lib/python3.11/site-packages/lark/exceptions.py:55(get_context)
  1759029    0.974    0.000    1.471    0.000 /home/andrew/p/outlines/outlines/fsm/parsing.py:349(__init__)
  1759029    1.451    0.000    1.451    0.000 {method 'decode' of 'tokenizers.decoders.Decoder' objects}
  1759029    1.195    0.000    1.435    0.000 /nix/store/4rf5qybw37b4lh1g0xczlv14sqdbmnpm-python3-3.11.9/lib/python3.11/copy.py:243(_keep_alive)
  1726892    1.068    0.000    1.395    0.000 /home/andrew/p/outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:292(feed)
  1726892    0.946    0.000    1.388    0.000 /nix/store/4rf5qybw37b4lh1g0xczlv14sqdbmnpm-python3-3.11.9/lib/python3.11/typing.py:1602(__subclasscheck__)
  1154085    0.353    0.000    1.280    0.000 /home/andrew/p/outlines/outlines/fsm/parsing.py:867(get_contextual_lexer)
17163298/17163296    1.272    0.000    1.272    0.000 {built-in method builtins.len}
  4659212    1.141    0.000    1.141    0.000 /nix/store/mrp9s742bpjwv7lb3rv3ikv8qx72nj0d-python3.11-numba-0.59.1/lib/python3.11/site-packages/numba/core/serialize.py:30(_numba_unpickle)

Future Work

(TODO: Move these to issues)

Improvements

Context-sensitive features, such as Python's tree parser

Currently the tree parser isn't supported (dottxt-ai#592)

Allow CFG in outlines.serve

Remove Guide.is_final_state

is_final_state is ambiguous (dottxt-ai#885); it should be removed in a separate PR.

Clean Up Dead Code

Remove

  • StopAtEosFSM
  • RegexFSM
  • Consider whether StopAtEOSGuide is useful anywhere

Bug Fixes

Ensure parser allows ambiguous terminals

(e.g. ?start: /ab*/ /bc?/)
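To see why this case is hard for a maximal-munch lexer: with terminals /ab*/ and /bc?/, the input "abbc" only tokenizes if /ab*/ stops early at "ab", because greedily consuming "abb" leaves "c", which /bc?/ cannot match. A small stdlib demonstration of the failure mode (the lexer sketch is illustrative, not outlines code):

```python
import re

AB, BC = re.compile(r"ab*"), re.compile(r"bc?")

def greedy_lex(text):
    """Maximal-munch lexing: each terminal takes the longest match it can."""
    pos, tokens = 0, []
    while pos < len(text):
        for pat in (AB, BC):
            m = pat.match(text, pos)
            if m and m.end() > pos:
                tokens.append(m.group())
                pos = m.end()
                break
        else:
            return None  # stuck: no terminal matches at this position
    return tokens

print(greedy_lex("abbc"))  # None: greedy /ab*/ eats "abb", stranding "c"
# A backtracking match finds the valid split "ab" + "bc":
print(re.fullmatch(r"(ab*)(bc?)", "abbc") is not None)  # True
```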

Improve performance

  • Add benchmarks for the first 10 tokens generated, and for the last 10 of 100
  • Improve performance generally (see the profile in the "Analysis" section)

Incorrectly Over-Constrained

TODO

  • fix failing tests
  • Remove sql tests
  • outlines.generate.cfg
  • CFGLogitsProcessor
  • test_generate.py
  • CFGGuide doc-string

Notify these threads:

Separate PR

  • Introduce outlines.grammars.sql_select and tests
  • The following tested samples are disabled because they could not be generated, even though they are valid under the lark grammar:
  • sql_select_select_minimal_lalr1.sql.test

dottxt-ai#636

dottxt-ai#633

outlines/fsm/parsing.py
@@ -614,6 +617,8 @@ def __init__(self, conf: "LexerConf", states, always_accept=()):
lexer_conf.terminals = [
terminals_by_name[n] for n in accepts if n in terminals_by_name
]
if not lexer_conf.terminals:
Note: Enables returning EOS

token_history=lexer_state.last_token and [lexer_state.last_token],
state=parser_state,
terminals_by_name=self.root_lexer.terminals,
)

Note: Fixes the following tests

  • Multiple Valid Continuations
  • Token is Substring of Another Token
  • Recursive Patterns

w013nad commented Aug 2, 2024

Any updates on when this will be merged? Grammars via vLLM are completely broken atm.

lapp0 commented Aug 12, 2024

@w013nad Please track dottxt-ai#1067

I'll follow up this week and see if we can get this merged.

tens444 commented Aug 29, 2024

Is this PR usable yet? I'd like to see the behavior of CFG via PartialLark.
