Hello @lucaskjaero,
I have a project similar to yours where I've implemented some Chinese character recognition models using the CASIA data sets. Like you, I've used the CASIA competition GNT files, but I believe it should be easier to build performant models on the HWDB1.X and OLHWDB1.X data sets because they are five times larger. Unfortunately, those data sets use a different file format, MPF. Do you have any idea how to process these files using Python?
Hi @brucegarro,
I see there's a file specification here. You can read these files in Python as raw bytes and unpack the binary fields using the struct library. In this project, I do this here, which hopefully is a decent example.
Let me know if that helps -- I can see about implementing it here if it doesn't.
Best,
Lucas
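For anyone landing on this thread, here is a minimal sketch of the `struct` approach suggested above. The header layout it assumes (a 4-byte header size, an 8-byte format code, a free-text description, a 20-byte code type, a 2-byte code length, a 20-byte data type, then 4-byte sample count and dimensionality, all little-endian) is recalled from the CASIA documentation and is not confirmed anywhere in this thread, so please check it against the file specification before relying on it. The names `read_mpf` and `DTYPE_CODES` are made up for this illustration.

```python
import io
import struct

# Assumed mapping from the header's "data type" text to struct format codes.
DTYPE_CODES = {"unsigned char": "B", "short": "h", "float": "f"}


def _text(raw):
    """Decode a fixed-width, NUL-padded ASCII field."""
    return raw.split(b"\x00", 1)[0].decode("ascii", errors="replace").strip()


def read_mpf(stream):
    """Yield (label_bytes, feature_tuple) pairs from an open binary stream.

    The field order and sizes below are an assumption; verify them
    against the official file specification.
    """
    (header_size,) = struct.unpack("<i", stream.read(4))
    stream.read(8)                 # format code, e.g. b"MPF" (skipped)
    stream.read(header_size - 62)  # free-text description (skipped)
    stream.read(20)                # code type, e.g. b"GB" (skipped)
    (code_length,) = struct.unpack("<h", stream.read(2))
    data_type = _text(stream.read(20))
    sample_count, dims = struct.unpack("<ii", stream.read(8))

    # One record = label bytes followed by `dims` feature values.
    record = struct.Struct(
        "<%ds%d%s" % (code_length, dims, DTYPE_CODES[data_type])
    )
    for _ in range(sample_count):
        fields = record.unpack(stream.read(record.size))
        yield fields[0], fields[1:]
```

To use it, open a file in binary mode (`open(path, "rb")`) and iterate over `read_mpf(f)`. If the labels are GB-encoded, as I assume they are for these data sets, `label.decode("gbk")` should give the character.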
Datasets:
http://www.nlpr.ia.ac.cn/databases/handwriting/Download.html
My Project:
https://github.com/brucegarro/chinese-character-recognition