Merge branch 'main' into fix-issue-11189-part-00-refactor-citation-re…
…lation-tab-logic
Showing 60 changed files with 743 additions and 604 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
# Journal Abbreviations | ||
|
||
## Preparation | ||
|
||
- Ensure that `buildres/abbrv.jabref.org` contains data. Otherwise, the submodules were not configured correctly. | ||
- Ensure that `build/resources/main/journals/journal-list.mv` exists. It is generated by gradle task `generateJournalListMV`, which uses `org.jabref.cli.JournalListMvGenerator`. | ||
|
||
## Where does the data come from? | ||
|
||
The generator uses all `.csv` files from <https://github.com/JabRef/abbrv.jabref.org/tree/main/journals>, except the following ones: | ||
|
||
```java | ||
Set<String> ignoredNames = Set.of( | ||
// remove all lists without dot in them: | ||
// we use abbreviation lists containing dots in them only (to be consistent) | ||
"journal_abbreviations_entrez.csv", | ||
"journal_abbreviations_medicus.csv", | ||
"journal_abbreviations_webofscience-dotless.csv", | ||
|
||
// we currently do not have good support for BibTeX strings | ||
"journal_abbreviations_ieee_strings.csv" | ||
); | ||
``` | ||
|
||
## Future work | ||
|
||
See <https://github.com/JabRef/jabref-issue-melting-pot/issues/41> |
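The file selection described in the Journal Abbreviations document above can be sketched as follows. This is a minimal illustration, not the code of `org.jabref.cli.JournalListMvGenerator`: the class name `JournalListFilterSketch`, the method `selectCsvFiles`, and the file name `journal_abbreviations_example.csv` are assumptions made for this example. Only the four ignored names are taken from the document.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch: shows only the "all .csv files except the ignored ones" rule.
final class JournalListFilterSketch {

    // Ignored names, copied from the list in the document.
    static final Set<String> IGNORED_NAMES = Set.of(
            "journal_abbreviations_entrez.csv",
            "journal_abbreviations_medicus.csv",
            "journal_abbreviations_webofscience-dotless.csv",
            "journal_abbreviations_ieee_strings.csv");

    // Keeps the .csv file names that are not on the ignore list, in sorted order.
    static List<String> selectCsvFiles(List<String> fileNames) {
        return fileNames.stream()
                .filter(name -> name.endsWith(".csv"))
                .filter(name -> !IGNORED_NAMES.contains(name))
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // "journal_abbreviations_example.csv" is a made-up name for illustration.
        List<String> candidates = List.of(
                "journal_abbreviations_example.csv",
                "journal_abbreviations_entrez.csv",
                "README.md");
        System.out.println(selectCsvFiles(candidates));
    }
}
```

Keeping the ignore list as a set of exact file names makes the exclusion explicit and easy to review, at the cost of having to update it whenever a new dotless list is added upstream.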
64 changes: 64 additions & 0 deletions
docs/decisions/0043-show-merge-dialog-when-importing-a-single-pdf.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,64 @@ | ||
--- | ||
nav_order: 43 | ||
parent: Decision Records | ||
--- | ||
# Show merge dialog when importing a single PDF | ||
|
||
## Context and Problem Statement | ||
|
||
PDF files are one of the main formats for transferring documents, especially scientific papers. However, by itself, | ||
a PDF is like a picture: it contains commands solely for displaying the human-readable text, but it might not contain | ||
computer-readable metadata. | ||
|
||
To overcome this problem, various heuristics and AI models are used to "convert" a PDF into a BibTeX entry. However, this | ||
introduces problems of its own, as heuristics are not ideal: sometimes they work perfectly, but at other times they generate | ||
random output. | ||
|
||
PDF importing in JabRef is done via the `PdfImporter` abstract class and its descendants, and via `PdfMergeMetadataImporter`. | ||
A `PdfImporter` is typically a single heuristic or method of extracting a `BibEntry` from a PDF. `PdfMergeMetadataImporter` | ||
collects `BibEntry` candidates from all `PdfImporter`s and merges them automatically into a single `BibEntry`. | ||
|
||
The specific problem JabRef has: should JabRef automate all heuristics (automatically merge all `BibEntry` candidates from | ||
several `PdfImporter`s) when importing PDF files, or should every file be analysed thoroughly by users? | ||
|
||
## Decision Drivers | ||
|
||
* The option should provide good-enough quality. | ||
* Power users should have fine-grained control over PDF importing. | ||
|
||
## Considered Options | ||
|
||
* Automatically merge all `BibEntry` candidates from `PdfImporters`. | ||
* Open a merge dialog with all candidates. | ||
* Open a merge dialog with all candidates if a single PDF is imported. | ||
|
||
## Decision Outcome | ||
|
||
Chosen option: "Open a merge dialog with all candidates if a single PDF is imported", because it comes out best (see below). | ||
|
||
## Pros and Cons of the Options | ||
|
||
### Automatically merge all `BibEntry` candidates from `PdfImporters` | ||
|
||
* Good, because it requires minimal user interaction and does not disrupt the flow. It also allows batch-processing. | ||
* Bad, because heuristics are not ideal, and it is even harder to develop a "smarter" merging algorithm. | ||
|
||
### Open a merge dialog with all candidates | ||
|
||
* Good, because it allows for fine-grained import. With automatic merging, a correct field may be overridden by a wrong field from another importer, | ||
which is undesirable for power users. | ||
* Bad, because it is a dialog. If lots of PDFs are imported, then there will be lots of dialogs, which might be | ||
too daunting to process manually. | ||
|
||
### Open a merge dialog with all candidates if a single PDF is imported | ||
|
||
Explanation: | ||
|
||
- If a single PDF is imported, then open a merge dialog. | ||
- If several PDFs are imported, merge candidates for each PDF automatically. | ||
|
||
Outcomes: | ||
|
||
* Good, because it combines the best of the other two options: it allows both PDF batch-processing and fine-grained control. | ||
|
||
<!-- markdownlint-disable-file MD004 --> |
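The chosen option in the decision record above boils down to a branch on the number of imported PDFs. The following sketch illustrates only that rule, under stated assumptions: `PdfImportStrategySketch`, `Strategy`, and `chooseStrategy` are hypothetical names for this example and are not part of JabRef.

```java
import java.nio.file.Path;
import java.util.List;

// Hypothetical sketch of the decision rule; not JabRef's actual import code.
final class PdfImportStrategySketch {

    enum Strategy {
        NOTHING_TO_IMPORT,
        SHOW_MERGE_DIALOG,   // single PDF: let the user pick field values
        MERGE_AUTOMATICALLY  // several PDFs: merge candidates without a dialog
    }

    // Chooses how to handle the import based on how many PDFs were given.
    static Strategy chooseStrategy(List<Path> pdfFiles) {
        if (pdfFiles.isEmpty()) {
            return Strategy.NOTHING_TO_IMPORT;
        }
        if (pdfFiles.size() == 1) {
            return Strategy.SHOW_MERGE_DIALOG;
        }
        return Strategy.MERGE_AUTOMATICALLY;
    }

    public static void main(String[] args) {
        System.out.println(chooseStrategy(List.of(Path.of("paper.pdf"))));
        System.out.println(chooseStrategy(List.of(Path.of("a.pdf"), Path.of("b.pdf"))));
    }
}
```

The empty-list case is an addition of this sketch; the decision record only distinguishes between one PDF and several.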
68 changes: 68 additions & 0 deletions
src/main/java/org/jabref/gui/externalfiles/PdfMergeDialog.java
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,68 @@ | ||
package org.jabref.gui.externalfiles; | ||
|
||
import java.io.IOException; | ||
import java.nio.file.Path; | ||
import java.util.function.Supplier; | ||
|
||
import org.jabref.gui.mergeentries.MultiMergeEntriesView; | ||
import org.jabref.gui.preferences.GuiPreferences; | ||
import org.jabref.logic.importer.Importer; | ||
import org.jabref.logic.importer.ParserResult; | ||
import org.jabref.logic.importer.fileformat.pdf.PdfContentImporter; | ||
import org.jabref.logic.importer.fileformat.pdf.PdfEmbeddedBibFileImporter; | ||
import org.jabref.logic.importer.fileformat.pdf.PdfGrobidImporter; | ||
import org.jabref.logic.importer.fileformat.pdf.PdfImporter; | ||
import org.jabref.logic.importer.fileformat.pdf.PdfVerbatimBibtexImporter; | ||
import org.jabref.logic.importer.fileformat.pdf.PdfXmpImporter; | ||
import org.jabref.logic.l10n.Localization; | ||
import org.jabref.logic.util.TaskExecutor; | ||
import org.jabref.model.entry.BibEntry; | ||
|
||
public class PdfMergeDialog { | ||
|
||
/** | ||
* Constructs a merge dialog for a PDF file. This dialog calls various {@link PdfImporter}s, collects the results, and lets the user choose between them. | ||
* <p> | ||
* {@link PdfImporter}s try to extract a {@link BibEntry} out of a PDF file, | ||
* but they do not do this perfectly: each one is only a heuristic that works in some cases and fails in others. | ||
* Thus, JabRef provides this merge dialog that collects the results of all {@link PdfImporter}s | ||
* and gives the user a choice between field values. | ||
* | ||
* @param entry the entry to merge with | ||
* @param filePath the path to the PDF file. This PDF is used as the source for the {@link PdfImporter}s. | ||
* @param preferences the preferences to use. Full preference object is required, because of current implementation of {@link MultiMergeEntriesView}. | ||
* @param taskExecutor the task executor to use when the multi merge dialog executes the importers. | ||
*/ | ||
public static MultiMergeEntriesView createMergeDialog(BibEntry entry, Path filePath, GuiPreferences preferences, TaskExecutor taskExecutor) { | ||
MultiMergeEntriesView dialog = new MultiMergeEntriesView(preferences, taskExecutor); | ||
|
||
dialog.setTitle(Localization.lang("Merge PDF metadata")); | ||
|
||
dialog.addSource(Localization.lang("Entry"), entry); | ||
dialog.addSource(Localization.lang("Verbatim"), wrapImporterToSupplier(new PdfVerbatimBibtexImporter(preferences.getImportFormatPreferences()), filePath)); | ||
dialog.addSource(Localization.lang("Embedded"), wrapImporterToSupplier(new PdfEmbeddedBibFileImporter(preferences.getImportFormatPreferences()), filePath)); | ||
|
||
if (preferences.getGrobidPreferences().isGrobidEnabled()) { | ||
dialog.addSource("Grobid", wrapImporterToSupplier(new PdfGrobidImporter(preferences.getImportFormatPreferences()), filePath)); | ||
} | ||
|
||
dialog.addSource(Localization.lang("XMP metadata"), wrapImporterToSupplier(new PdfXmpImporter(preferences.getXmpPreferences()), filePath)); | ||
dialog.addSource(Localization.lang("Content"), wrapImporterToSupplier(new PdfContentImporter(), filePath)); | ||
|
||
return dialog; | ||
} | ||
|
||
private static Supplier<BibEntry> wrapImporterToSupplier(Importer importer, Path filePath) { | ||
return () -> { | ||
try { | ||
ParserResult parserResult = importer.importDatabase(filePath); | ||
if (parserResult.isInvalid() || parserResult.isEmpty() || !parserResult.getDatabase().hasEntries()) { | ||
return null; | ||
} | ||
return parserResult.getDatabase().getEntries().getFirst(); | ||
} catch (IOException e) { | ||
// The importer failed to read the file: this source yields no entry. | ||
return null; | ||
}; | ||
} | ||
} |
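The `wrapImporterToSupplier` helper above turns a call that may throw `IOException` into a `Supplier` that reports "no result" instead. A variant of the same pattern that avoids `null` by returning `Optional` could look like the sketch below. It is an illustration only: `ImporterSupplierSketch`, `ThrowingImporter`, and `wrap` are hypothetical names, and whether `MultiMergeEntriesView` could accept such a supplier is not stated in the diff.

```java
import java.io.IOException;
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical sketch of the wrapping pattern; it does not use JabRef classes.
final class ImporterSupplierSketch {

    // Stand-in for an importer whose import call may fail with an IOException.
    @FunctionalInterface
    interface ThrowingImporter<T> {
        List<T> importAll() throws IOException;
    }

    // Defers the import until get() is called and maps "no entries" and
    // "import failed" to an empty Optional instead of null.
    static <T> Supplier<Optional<T>> wrap(ThrowingImporter<T> importer) {
        return () -> {
            try {
                List<T> results = importer.importAll();
                if (results.isEmpty()) {
                    return Optional.empty();
                }
                // Only the first result is used, as in the helper above.
                return Optional.ofNullable(results.get(0));
            } catch (IOException e) {
                return Optional.empty();
            }
        };
    }

    public static void main(String[] args) {
        Supplier<Optional<String>> ok = wrap(() -> List.of("first", "second"));
        Supplier<Optional<String>> failing = wrap(() -> {
            throw new IOException("cannot read file");
        });
        System.out.println(ok.get().isPresent());
        System.out.println(failing.get().isPresent());
    }
}
```

Because the import runs only when `get()` is called, the caller decides when, and on which thread, the potentially slow work happens.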