FileNotFound #49

Open
axaygaid opened this issue Jul 14, 2020 · 15 comments

@axaygaid

Hello guys, I think I have a simple problem: when I launch test_iam_dataset I get this error:

FileNotFoundError: [Errno 2] File /home/roo/sf_workspace/Image Médecine douce/handwritting model/handwritting notebook/ocr/utils/../../dataset/iamdataset/subject/trainset.txt does not exist: '/home/roo/sf_workspace/Image Médecine douce/handwritting model/handwritting notebook/ocr/utils/../../dataset/iamdataset/subject/trainset.txt'

I don't know what kind of file it is.

If someone has an idea, thanks a lot!

@jonomon
Contributor

jonomon commented Jul 14, 2020

Hi @axaygaid
Thank you for bringing this to my attention.

Could you try replacing the following cell:

ds = IAMDataset("word", output_data="text")
plot_image_with_text(ds)

with

ds = IAMDataset("word", output_data="text", root="../../dataset/iamdataset")
plot_image_with_text(ds)

and see if it solves your issue?

@axaygaid
Author

Hi jonomon,

Thanks for the answer. It didn't work, same issue.
I think the trainset.txt file can't be downloaded, I don't know why. I downloaded the different files manually (security restrictions),
then put them in the dataset/iamdataset folder, so I have all of the IAM dataset but not the other files such as trainset.txt, I think?

@jonomon
Contributor

jonomon commented Jul 14, 2020

You should place trainset.txt (as well as testset.txt etc.) in data/iamdataset/subject.

Please let me know if this works.

@axaygaid
Author

But what kind of data do I have to put in the two txt files? I tried it before and the error was:

EmptyDataError: No columns to parse from file

So I have to put some data in them.

Thanks!
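
An empty placeholder file gets past the FileNotFoundError but then fails in the parser: "No columns to parse from file" is the message pandas gives for a file with no content, so the loader is presumably reading these files with pandas. A minimal sketch for checking the split files before constructing the dataset is below. The helper name `check_split_files` is hypothetical, and the file names are only the two mentioned in this thread.

```python
import os


def check_split_files(subject_dir, names=("trainset.txt", "testset.txt")):
    """Return a dict mapping each file name to 'missing', 'empty', or 'ok'."""
    status = {}
    for name in names:
        path = os.path.join(subject_dir, name)
        if not os.path.isfile(path):
            status[name] = "missing"   # would lead to FileNotFoundError
        elif os.path.getsize(path) == 0:
            status[name] = "empty"     # would lead to EmptyDataError
        else:
            status[name] = "ok"
    return status


if __name__ == "__main__":
    # Relative path copied from the error message (relative to ocr/utils).
    print(check_split_files("../../dataset/iamdataset/subject"))
```

Anything reported as "missing" or "empty" needs to be replaced with the real split file rather than a hand-made placeholder.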

@jonomon
Contributor

jonomon commented Jul 14, 2020

You should download the files here http://www.fki.inf.unibe.ch/DBs/iamDB/tasks/largeWriterIndependentTextLineRecognitionTask.zip

@axaygaid
Author

Hey jonomon,

Thanks for the help, it was helpful. Now running test_ds = IAMDataset("form_original", train=False) "works", but when I try to plot an image I get nothing, as if nothing was read. When I run a simple len(ds) to check whether there is something in it, it returns just 0. I'm checking the source to see if something is missing in my setup, but if someone has any idea...

Thanks a lot!

@jonomon
Contributor

jonomon commented Jul 15, 2020

Hi @axaygaid,

It is hard for me to debug the issue without any information.
What are the contents of data/iamdataset?

@axaygaid
Author

Hi @jonomon

A simple example: when I run
ds = IAMDataset("word", output_data="text")
it prints this: <_io.TextIOWrapper name='/home/roo/sf_workspace/Image Médecine douce/handwritting model/handwritting notebook/ocr/utils/credentials.json' mode='r' encoding='UTF-8'>
so the path is right,
and when I run
len(ds)
the output is 0.
And the contents of the iamdataset folder, from os.listdir("/home/roo/sf_workspace/Image Médecine douce/handwritting model/handwritting notebook/dataset/IAMDataset/"):

['.ipynb_checkpoints',
'ascii.gz',
'forms.txt',
'formsA-D.tgz',
'formsE-H.tgz',
'formsI-Z.tgz',
'image_data-form_original-text0.plk',
'image_data-form_original-text1.plk',
'image_data-form_original-text2.plk',
'image_data-form_original-text3.plk',
'image_data-word-text0.plk',
'image_data-word-text1.plk',
'image_data-word-text2.plk',
'image_data-word-text3.plk',
'largeWriterIndependentTextLineRecognitionTask.zip',
'lines.tgz',
'lines.txt',
'sentences.tgz',
'sentences.txt',
'subject',
'untitled.txt',
'words.tgz',
'words.txt',
'xml',
'xml.tgz']

I don't know if it's clear now? ><
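
One thing that stands out when comparing this listing with the original error: the error path ends in dataset/iamdataset/..., while the folder listed here is dataset/IAMDataset/. On a case-sensitive filesystem (the /home/... path suggests Linux) those would be two different folders, which could explain why nothing is found. This is only a guess from the two paths quoted in this thread. A small sketch to check for such a mismatch follows; the helper name `find_case_variants` is hypothetical.

```python
import os


def find_case_variants(parent_dir, wanted_name):
    """List entries of parent_dir whose names match wanted_name ignoring case."""
    return [
        entry
        for entry in os.listdir(parent_dir)
        if entry.lower() == wanted_name.lower()
    ]
```

If `find_case_variants(".../dataset", "iamdataset")` returns `['IAMDataset']` but no exact `'iamdataset'`, either renaming the folder or passing the real folder through the `root` argument would be worth trying.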

@jonomon
Contributor

jonomon commented Jul 16, 2020

It seems like the folder is missing a bunch of subfolders. See the example here.
image

The IAMDataset class should automatically download the IAM dataset and process the files. Was there something wrong with that step?

@axaygaid
Author

I downloaded the different datasets (words, forms, etc.) manually because I'm behind a protection that blocks direct downloads of big files such as the IAM dataset. I think that's why the processing wasn't done. After the download, I extracted every file and put it in a folder (all the .png files of the forms in form, and so on). I thought that would be enough.

image

That's the iamdataset folder. Maybe I have to do the preprocessing myself if it doesn't work. I still have the same problem: the pipeline doesn't recognize the pictures :/

Thanks if you have an idea :)

@jonomon
Contributor

jonomon commented Jul 16, 2020

So you are not using the IAMDataset?
If that's the case, you would have to customise the Gluon Dataset to your dataset.

This documentation provides information for it https://mxnet.apache.org/api/python/docs/tutorials/packages/gluon/data/datasets.html
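
A minimal sketch of what such a custom dataset could look like is below. A Gluon dataset only needs `__getitem__` and `__len__`. Everything specific here is an assumption rather than code from this repo: the class name `WordImageDataset` is hypothetical, the images are assumed to sit flat in one folder as `<word id>.png`, and the label file is assumed to follow the IAM words.txt layout (word id first, transcription after eight space-separated fields, comment lines starting with `#`). It returns image paths rather than decoded images to keep the example short.

```python
import os

try:
    # Real base class when MXNet is installed.
    from mxnet.gluon.data import Dataset
except ImportError:
    # Fallback so the sketch still runs without MXNet.
    Dataset = object


class WordImageDataset(Dataset):
    """Hypothetical dataset pairing manually extracted word images with text."""

    def __init__(self, image_dir, labels_path):
        self._items = []
        with open(labels_path, encoding="utf-8") as f:
            for line in f:
                line = line.rstrip("\n")
                if not line or line.startswith("#"):
                    continue  # skip blank and comment lines
                parts = line.split(" ", 8)
                if len(parts) < 9:
                    continue  # not in the assumed 9-field layout
                word_id, text = parts[0], parts[8]
                image_path = os.path.join(image_dir, word_id + ".png")
                # Keep only entries whose image is actually on disk.
                if os.path.isfile(image_path):
                    self._items.append((image_path, text))

    def __getitem__(self, idx):
        return self._items[idx]

    def __len__(self):
        return len(self._items)
```

With this shape, `len(ds)` returning 0 means no label line matched an image on disk, which makes a wrong folder or an unexpected file layout easy to spot.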

@mahin003

If anybody has run it on Google Colab, please share the edited iam_dataset.py with me: [email protected]

@JPremnath06

Please share the iam_dataset.py file with me (to use in Colab): [email protected]

@sambbhavgarg

sambbhavgarg commented Feb 2, 2021

Hey @jonomon, first off, thanks a lot for this repo.

There seems to be an issue in accessing the largeWriterIndependentTextLineRecognitionTask.zip file at http://www.fki.inf.unibe.ch/DBs/iamDB/tasks/largeWriterIndependentTextLineRecognitionTask.zip (E404)

Could you update the link in the code or point us to the file so it can be downloaded manually?

Thanks,
Sambbhav

@jonomon
Contributor

jonomon commented Feb 2, 2021

Hi @sambbhavgarg
You can download it here: https://fki.tic.heia-fr.ch/static/zip/largeWriterIndependentTextLineRecognitionTask.zip

Regards,
Jonathan
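
For anyone setting this up by hand, here is a rough sketch of fetching that archive and unpacking it into the subject folder. The function names are hypothetical, and it assumes the split files (trainset.txt, testset.txt, etc.) sit at the top level of the zip, which is worth verifying after extraction. If direct downloads are blocked, `extract_split_files` can be used on its own with a manually downloaded copy.

```python
import os
import urllib.request
import zipfile

# URL given in the comment above.
ZIP_URL = (
    "https://fki.tic.heia-fr.ch/static/zip/"
    "largeWriterIndependentTextLineRecognitionTask.zip"
)


def extract_split_files(zip_path, subject_dir):
    """Unpack the task archive into subject_dir and return the member names."""
    os.makedirs(subject_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(subject_dir)
        return sorted(archive.namelist())


def download_split_files(
    subject_dir,
    zip_path="largeWriterIndependentTextLineRecognitionTask.zip",
):
    """Download the archive to zip_path, then unpack it into subject_dir."""
    urllib.request.urlretrieve(ZIP_URL, zip_path)
    return extract_split_files(zip_path, subject_dir)
```

For example, `extract_split_files("largeWriterIndependentTextLineRecognitionTask.zip", "../../dataset/iamdataset/subject")` would target the folder named in the original error message.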

5 participants