exception when extracting text from pdf-file #123

Open
FroggieFrog opened this issue Jun 13, 2018 · 1 comment

Comments

@FroggieFrog

I tried to use TikaOnDotNet, but it fails even in a very simple test project (see attachment).
Is there anything I can do to make it work?

Some details:
OS: Windows 10 1803
Locale: de-de

The error message:

TikaOnDotNet.TextExtraction.TextExtractionException: Extraction of text from the file '...tikadotnet.pdf' failed. ---> TikaOnDotNet.TextExtraction.TextExtractionException: Extraction failed. ---> System.TypeInitializationException: The type initializer for "org.apache.tika.metadata.Metadata" threw an exception. ---> System.InvalidCastException: Unable to cast object of type "java.util.PropertyResourceBundle" to type "sun.util.resources.OpenListResourceBundle".
at sun.util.resources.LocaleData.getCurrencyNames(Locale locale)
at sun.util.locale.provider.LocaleResources.getCurrencyName(String key)
at sun.util.locale.provider.CurrencyNameProviderImpl.getString(String , Locale )
at sun.util.locale.provider.CurrencyNameProviderImpl.getSymbol(String currencyCode, Locale locale)
at java.util.Currency.CurrencyNameGetter.getObject(CurrencyNameProvider , Locale , String , Object[] )
at java.util.Currency.CurrencyNameGetter.getObject(LocaleServiceProvider , Locale , String , Object[] )
at sun.util.locale.provider.LocaleServiceProviderPool.getLocalizedObjectImpl(LocalizedObjectGetter , Locale , Boolean , String , Object[] )
at sun.util.locale.provider.LocaleServiceProviderPool.getLocalizedObject(LocalizedObjectGetter getter, Locale locale, String key, Object[] params)
at java.util.Currency.getSymbol(Locale locale)
at java.text.DecimalFormatSymbols.initialize(Locale )
at java.text.DecimalFormatSymbols..ctor(Locale locale)
at sun.util.locale.provider.DecimalFormatSymbolsProviderImpl.getInstance(Locale locale)
at java.text.DecimalFormatSymbols.getInstance(Locale locale)
at sun.util.locale.provider.NumberFormatProviderImpl.getInstance(Locale , Int32 )
at sun.util.locale.provider.NumberFormatProviderImpl.getIntegerInstance(Locale locale)
at java.text.NumberFormat.getInstance(LocaleProviderAdapter , Locale , Int32 )
at java.text.NumberFormat.getInstance(Locale , Int32 )
at java.text.NumberFormat.getIntegerInstance(Locale inLocale)
at java.text.SimpleDateFormat.initialize(Locale )
at java.text.SimpleDateFormat..ctor(String pattern, DateFormatSymbols formatSymbols)
at org.apache.tika.utils.DateUtils.createDateFormat(String , TimeZone )
at org.apache.tika.utils.DateUtils.loadDateFormats()
at org.apache.tika.utils.DateUtils..ctor()
at org.apache.tika.metadata.Metadata..cctor()
--- End of inner exception stack trace ---
at org.apache.tika.metadata.Metadata..ctor()
at TikaOnDotNet.TextExtraction.Stream.StreamTextExtractor.Extract(Func`2 streamFactory, Stream outputStream) in C:\projects\tikaondotnet\src\TikaOnDotnet.TextExtractor\Stream\StreamTextExtractor.cs:line 19
--- End of inner exception stack trace ---
at TikaOnDotNet.TextExtraction.Stream.StreamTextExtractor.Extract(Func`2 streamFactory, Stream outputStream) in C:\projects\tikaondotnet\src\TikaOnDotnet.TextExtractor\Stream\StreamTextExtractor.cs:line 42
at TikaOnDotNet.TextExtraction.TextExtractor.Extract[TExtractionResult](Func`2 streamFactory, Func`3 extractionResultAssembler) in C:\projects\tikaondotnet\src\TikaOnDotnet.TextExtractor\TextExtractor.cs:line 85
at TikaOnDotNet.TextExtraction.TextExtractor.Extract[TExtractionResult](String filePath, Func`3 extractionResultAssembler) in C:\projects\tikaondotnet\src\TikaOnDotnet.TextExtractor\TextExtractor.cs:line 27
--- End of inner exception stack trace ---
at TikaOnDotNet.TextExtraction.TextExtractor.Extract[TExtractionResult](String filePath, Func`3 extractionResultAssembler) in C:\projects\tikaondotnet\src\TikaOnDotnet.TextExtractor\TextExtractor.cs:line 31
at TikaOnDotNet.TextExtraction.TextExtractor.Extract(String filePath) in C:\projects\tikaondotnet\src\TikaOnDotnet.TextExtractor\TextExtractor.cs:line 17
at Test.Tika.Class1.Extract(String filePath) in Test.Tika\Test.Tika\Class1.cs:line 16
at WindowsFormsApp1.Form1.button1_Click(Object sender, EventArgs e) in Test.Tika\WindowsFormsApp1\Form1.cs:line 32
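
The Java frames in the trace run from `SimpleDateFormat` through `NumberFormat.getIntegerInstance` down to `Currency.getSymbol`, which is where the currency-name resource bundle gets cast. As a rough diagnostic sketch (not part of the original report), the same public JDK calls can be exercised on a stock JVM with a German default locale. If they succeed there, that would suggest the problem sits in the .NET-converted Java runtime rather than in Tika itself. The class name `LocaleProbe`, the method `probe`, and the date pattern are hypothetical and chosen only for illustration; the exact arguments Tika passes are not shown in the trace.

```java
import java.text.DateFormatSymbols;
import java.text.NumberFormat;
import java.text.SimpleDateFormat;
import java.util.Currency;
import java.util.Locale;

// Hypothetical helper: walks the public JDK entry points named in the trace.
final class LocaleProbe {

    // Temporarily makes `locale` the default, runs the calls, restores the default.
    static String probe(Locale locale) {
        Locale previous = Locale.getDefault();
        Locale.setDefault(locale);
        try {
            // Same constructor overload as in the trace: (String, DateFormatSymbols).
            // The pattern and the symbols locale are assumptions.
            new SimpleDateFormat("yyyy-MM-dd", DateFormatSymbols.getInstance(Locale.US));

            // SimpleDateFormat initialization requests an integer NumberFormat.
            NumberFormat.getIntegerInstance(locale);

            // Innermost public call in the trace: the currency symbol lookup.
            return Currency.getInstance(locale).getSymbol(locale);
        } finally {
            Locale.setDefault(previous);
        }
    }

    public static void main(String[] args) {
        System.out.println("Currency symbol for de-DE: " + probe(Locale.GERMANY));
    }
}
```

Running the equivalent calls inside the .NET process would be the more direct check, but that depends on how the converted Java classes are exposed there, which this sketch does not assume.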

Test.Tika.zip

@KevM
Owner

KevM commented Jun 13, 2018

I think this is a duplicate of #118. @chrisoverton91 Did you find a fix?
