Issues: IBM/data-prep-kit

Issues list

Simplification of how users interact with the rep_removal transform (labels: enhancement)
#1007 opened Jan 31, 2025 by shahrokhDaijavad
1 of 2 tasks
[Bug] Update lang_id readme file with list of languages that it supports (labels: bug)
#1005 opened Jan 31, 2025 by touma-I
1 of 2 tasks
[Feature] Update pdf-processing example (labels: enhancement)
#997 opened Jan 29, 2025 by sujee
2 tasks done
[Feature] how to find which DPK 'modules' are installed (labels: enhancement)
#996 opened Jan 29, 2025 by sujee
1 of 2 tasks
[Bug] Unable to access quay.io/dataprep1/data-prep-kit/doc_chunk-ray:latest (labels: bug)
#995 opened Jan 29, 2025 by touma-I
2 tasks done
[Bug] Web2parquet fails on Windows (labels: bug)
#990 opened Jan 28, 2025 by touma-I
1 of 2 tasks
[Bug] FDedup Fails on Windows (labels: bug)
#989 opened Jan 28, 2025 by touma-I
1 of 2 tasks
[Bug] Wrong Ray cluster name (labels: bug)
#988 opened Jan 28, 2025 by roytman
1 of 2 tasks
[Bug] The S3 secret name is hardcoded in the KFP library (labels: bug)
#985 opened Jan 28, 2025 by roytman
2 tasks done
[Bug] FDedup failing with latest release mmh3==5.1.0 (labels: bug)
#982 opened Jan 27, 2025 by touma-I
1 of 2 tasks
Bloom annotator implementation for GneissWeb data (labels: enhancement, sprint-feb-7)
#981 opened Jan 27, 2025 by shahrokhDaijavad
2 tasks done
[KFP v2] Create ray cluster run id
#977 opened Jan 27, 2025 by revit13
[Bug] Error running Run_your_first_transform_colab.ipynb in colab. (labels: bug)
#975 opened Jan 27, 2025 by echinmay
1 of 2 tasks
[Feature] data preprocessing code for finetuning (labels: enhancement, sprint-Jan31)
#972 opened Jan 26, 2025 by PoojaHolkar
2 tasks done
[Bug] pdf2paruet fails on windows due to fcntl (labels: bug)
#969 opened Jan 24, 2025 by touma-I
1 of 2 tasks
Supporting data access to hugging face data sets (labels: enhancement)
#964 opened Jan 23, 2025 by blublinsky
2 tasks done
[Bug] Fdedup (simpler API) transform does not return a success/error code (labels: bug)
#957 opened Jan 21, 2025 by sujee
1 of 2 tasks
[Feature] update RAG-PDF example to use newer API (labels: enhancement, sprint-Jan31)
#954 opened Jan 20, 2025 by sujee
2 tasks done
[Bug] spark images on mac m1 failing to start (labels: bug)
#952 opened Jan 17, 2025 by daw3rd
1 of 2 tasks
[Bug] html2parquet/README.md link to sample notebook broken (labels: bug)
#947 opened Jan 16, 2025 by sujee
2 tasks done
Error in running Ray version of pdf2parquet on Google Colab (labels: bug)
#940 opened Jan 14, 2025 by shahrokhDaijavad
1 of 2 tasks
[Bug] Publishing KFP docker image fails (labels: bug)
#936 opened Jan 14, 2025 by revit13
1 of 2 tasks