jsonl broken, will only read as json #17
Something that can take normal JSONL, like GPT does, would be great. That way I could essentially transcribe a show and have the AI take on the personality of a character while keeping the full context of an episode, such as {"messages": [{"role": "user", "content": "text text text"}, {"role": "assistant", "content": "text text"}, {"role": "user", "content": "text text"},
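JSONL 'works', but the extension needs it to be formatted incorrectly. Wrap the whole thing like an array (i.e. in []) and add commas at the end of all but the last line and it'll work. To clarify, correct JSONL has one complete JSON object per line, with no enclosing brackets and no commas between lines.
Whereas right now Training_PRO expects a single JSON array: the same objects wrapped in [] and separated by commas.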
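As a rough illustration of the workaround described in this comment, the sketch below turns standard JSONL text into the bracketed, comma-separated form. The function name `jsonl_to_json_array` and the sample records are made up for the example; this is not code from Training_PRO.

```python
import json


def jsonl_to_json_array(jsonl_text: str) -> str:
    """Rewrite JSONL (one object per line) as a single JSON array.

    Hypothetical helper: it only mirrors the manual workaround of
    wrapping everything in [] and adding commas between lines.
    """
    # Parse each non-empty line on its own, which is how JSONL is meant to be read.
    records = [json.loads(line) for line in jsonl_text.split("\n") if line.strip()]
    # Emit one object per line, comma-separated, inside square brackets.
    return "[\n" + ",\n".join(json.dumps(record) for record in records) + "\n]"


# Made-up sample data in the chat-style layout mentioned in the thread.
sample_jsonl = (
    '{"messages": [{"role": "user", "content": "hi"}, '
    '{"role": "assistant", "content": "hello"}]}\n'
    '{"messages": [{"role": "user", "content": "bye"}, '
    '{"role": "assistant", "content": "see you"}]}\n'
)

array_text = jsonl_to_json_array(sample_jsonl)
parsed = json.loads(array_text)  # the array form parses with a single json.loads call
```

Whether the extension then accepts the `messages` layout itself is a separate question; the sketch only addresses the array-versus-lines difference.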
Ah, gotcha. Though I do notice it will now give the error, meaning it can't take something like a script and format it. Can we just modify this template? Edit:
Any time I try to use JSONL, I get this error:
03:29:19-716909 INFO Loading JSONL datasets...
Traceback (most recent call last):
  File "/media/cher/brains/text-generation-webui/installer_files/env/lib/python3.11/site-packages/gradio/queueing.py", line 407, in call_prediction
    output = await route_utils.call_process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/cher/brains/text-generation-webui/installer_files/env/lib/python3.11/site-packages/gradio/route_utils.py", line 226, in call_process_api
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/cher/brains/text-generation-webui/installer_files/env/lib/python3.11/site-packages/gradio/blocks.py", line 1550, in process_api
    result = await self.call_function(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/cher/brains/text-generation-webui/installer_files/env/lib/python3.11/site-packages/gradio/blocks.py", line 1199, in call_function
    prediction = await utils.async_iteration(iterator)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/cher/brains/text-generation-webui/installer_files/env/lib/python3.11/site-packages/gradio/utils.py", line 519, in async_iteration
    return await iterator.__anext__()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/cher/brains/text-generation-webui/installer_files/env/lib/python3.11/site-packages/gradio/utils.py", line 512, in __anext__
    return await anyio.to_thread.run_sync(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/cher/brains/text-generation-webui/installer_files/env/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/cher/brains/text-generation-webui/installer_files/env/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2134, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "/media/cher/brains/text-generation-webui/installer_files/env/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 851, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/cher/brains/text-generation-webui/installer_files/env/lib/python3.11/site-packages/gradio/utils.py", line 495, in run_sync_iterator_async
    return next(iterator)
           ^^^^^^^^^^^^^^
  File "/media/cher/brains/text-generation-webui/installer_files/env/lib/python3.11/site-packages/gradio/utils.py", line 649, in gen_wrapper
    yield from f(*args, **kwargs)
  File "/media/cher/brains/text-generation-webui/extensions/Training_PRO_wip/script.py", line 466, in check_dataset
    loaded_JSONLdata = json.load(dataFile)
                       ^^^^^^^^^^^^^^^^^^^
  File "/media/cher/brains/text-generation-webui/installer_files/env/lib/python3.11/json/__init__.py", line 293, in load
    return loads(fp.read(),
           ^^^^^^^^^^^^^^^^
  File "/media/cher/brains/text-generation-webui/installer_files/env/lib/python3.11/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/cher/brains/text-generation-webui/installer_files/env/lib/python3.11/json/decoder.py", line 340, in decode
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 4268)
Is it loading all JSONL as JSON? If so, every line after the first will always cause an error. This seems to happen with every model; so far I've tried GPT2, mistral and lmsys_vicuna.
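For anyone hitting the same thing, here is a minimal sketch of why a whole-file JSON parse fails on JSONL with "Extra data", next to a line-by-line read that accepts it. The `load_jsonl` helper and the sample records are assumptions made up for illustration; this is not the extension's code or a tested patch for script.py.

```python
import json


def load_jsonl(text: str) -> list:
    """Parse JSONL text one line at a time (hypothetical helper)."""
    records = []
    for line_number, line in enumerate(text.split("\n"), start=1):
        if not line.strip():
            continue  # tolerate blank lines, including the trailing one
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as err:
            # Report which line is malformed instead of failing on the whole file.
            raise ValueError(f"Invalid JSON on line {line_number}: {err.msg}") from err
    return records


# Two made-up records, one JSON object per line.
two_records = '{"text": "first"}\n{"text": "second"}\n'

# Parsing the whole text as one JSON document stops after the first object
# and raises the same kind of error as in the traceback above.
try:
    json.loads(two_records)
    whole_file_error = None
except json.JSONDecodeError as err:
    whole_file_error = err.msg

# Parsing line by line yields both records.
records = load_jsonl(two_records)
```

In this sketch the failure comes from the parser seeing a second object after the first one ends, which matches the "line 2 column 1" position in the report and would not depend on which model is loaded.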