* Update README for concept-linking Change structure of requirements for entire project
* Fully integrated prompt-engineering solution in concept linking to new shared repo.
* Integrated UntrainedSpacy solution Changed gitignore Refactored some duplicate code for untrainedspacy and promptEng sol.
* Integrated StringComparison solution
* Update README for concept-linking Change structure of requirements for entire project
* Fully integrated prompt-engineering solution in concept linking to new shared repo.
* Integrated UntrainedSpacy solution Changed gitignore Refactored some duplicate code for untrainedspacy and promptEng sol.
* Integrated StringComparison solution
* Minor bugfixing
* Fixed missing output.json
* Test requirements fix
* Test requirements fix 2
* Fix empty folders not being committed
* Added support for outputting sentence when doing test_run
* ML Solution refactor
* ML Solution refactor
* Deleted obsolete test files
* Evaluation files
* Fixed small label error
* Fixed ont mapping mistakes
* Evaluation Script
* Generated output for Promptbased
* Generated output for Promptbased
* ML UnitTest
* ML UnitTest
* Minor fixes
* String comparison test (#27)
* String comparison test
* Fixed requirements
* Update test-and-build.yml
* Added error handling for no triples generated.
* Changed python 3.12 to 3.11
* Fixed test_server.py for concept_linking.

---------

Co-authored-by: denBruneBarone <[email protected]>
Co-authored-by: Vi Thien Le <[email protected]>
Co-authored-by: Gamma <[email protected]>
Co-authored-by: Mikkel Wissing <[email protected]>
1 parent 716049f, commit 140cff6
Showing 56 changed files with 24,737 additions and 23,113 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,61 +1,126 @@ | ||
# D: Preprocessing Layer - Concept Linking | ||
# Concept Linking | ||
|
||
Repository of group D in KNOX pipeline. | ||
--- | ||
## Background | ||
Group D has implemented four different solutions, which are described in a later section. | ||
By default, the solution that runs is PromptEngineering. | ||
|
||
## Description | ||
To change which solution runs in the pipeline, make the following changes: | ||
|
||
This repository is for type classification of already provided sentences with given entity mentions. Several different solutions were created in order to find the best one. | ||
First change directory to the 'server' folder in the root directory. | ||
|
||
### Dependencies | ||
Next, open the server.py file. | ||
On line 24, under the comment "#Begin ConceptLinking", | ||
change the code to run the desired solution. | ||
|
||
- Python | ||
- PIP | ||
- Git | ||
|
||
### Installing | ||
## Requirements | ||
For any of the four solutions, it is necessary to install the requirements listed | ||
in that solution's requirements file, /{Solution_Name}/requirements.txt. | ||
|
||
However, since this is a joint codebase for both relation-extraction and concept-linking, | ||
there is a global requirements.txt file in the root directory. | ||
It follows a nested structure, meaning that installing only the one in the root folder | ||
will also install all the rest. | ||
|
||
It will install the necessary requirements for both groups' solutions. | ||
However, since it is possible to change which of the four concept-linking solutions to run, | ||
the requirements must also be installed accordingly. | ||
This is done by navigating to | ||
|
||
``` | ||
git clone https://github.com/Knox-AAU/PreprocessingLayer_Concept-Linking | ||
./concept_linking/requirements.txt | ||
``` | ||
This file lists references to the four different requirements.txt files. | ||
Remove the # (comment marker) from the line referencing the solution you want to run. | ||
|
||
### Example | ||
Install the requirements for the PromptEngineering solution | ||
|
||
### Initial Setup | ||
Navigate to the following directory | ||
|
||
- Navigate to root folder | ||
- Run the following command for installing all requirements: | ||
``` | ||
../concept_linking/solutions/PromptEngineering/ | ||
``` | ||
|
||
And run the following command | ||
``` | ||
pip install -r requirements.txt | ||
``` | ||
|
||
### Adding modules | ||
## Solutions | ||
|
||
- Navigate to root folder. | ||
- Run the following command to add all installed modules: | ||
Below is a brief description of each of the four solutions and how to get started. | ||
|
||
--- | ||
|
||
``` | ||
pip freeze > requirements.txt | ||
``` | ||
|
||
### Executing program | ||
### Machine Learning | ||
description WIP | ||
|
||
- Navigate to main.py in Program directory | ||
- Run main.py with Python | ||
### Prompt Engineering | ||
Uses the LLM Llama2. A prompt is given to the model. | ||
|
||
``` | ||
python .\program\main.py | ||
prompt_template = { | ||
"system_message": ("The input sentence is all your knowledge. \n" | ||
"Do not answer if it can't be found in the sentence. \n" | ||
"Do not use bullet points. \n" | ||
"Do not identify entity mentions yourself, use the provided ones \n" | ||
"Given the input in the form of the content from a file: \n" | ||
"[Sentence]: {content_sentence} \n" | ||
"[EntityMention]: {content_entity} \n"), | ||
"user_message": ("Classify the [EntityMention] in regards to ontology classes: {ontology_classes} \n" | ||
"The output answer must be in JSON in the following format: \n" | ||
"{{ \n" | ||
"'Entity': 'Eiffel Tower', \n" | ||
"'Class': 'ArchitecturalStructure' \n" | ||
"}} \n"), | ||
"max_tokens": 4092 | ||
} | ||
``` | ||
|
||
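The template relies on Python's `str.format` placeholders, and the doubled braces (`{{`, `}}`) are how literal braces survive formatting. As a minimal sketch of how such a template could be filled, assuming a trimmed copy of the template and a hypothetical helper named `build_prompt` (neither the helper nor the example sentence and class list are taken from the repository):

```python
# Trimmed copy of the template; only the placeholders matter for this sketch.
prompt_template = {
    "system_message": ("[Sentence]: {content_sentence} \n"
                       "[EntityMention]: {content_entity} \n"),
    "user_message": ("Classify the [EntityMention] in regards to ontology classes: {ontology_classes} \n"
                     "The output answer must be in JSON in the following format: \n"
                     "{{ \n"
                     "'Entity': 'Eiffel Tower', \n"
                     "'Class': 'ArchitecturalStructure' \n"
                     "}} \n"),
    "max_tokens": 4092,
}


def build_prompt(template, sentence, entity, ontology_classes):
    """Return a copy of the template with every placeholder filled in.

    Hypothetical helper, not part of the repository.
    """
    return {
        "system_message": template["system_message"].format(
            content_sentence=sentence, content_entity=entity),
        # The class list is joined into one comma-separated string (an assumption).
        "user_message": template["user_message"].format(
            ontology_classes=", ".join(ontology_classes)),
        "max_tokens": template["max_tokens"],
    }


# Made-up example values, for illustration only.
prompt = build_prompt(
    prompt_template,
    sentence="The Eiffel Tower is located in Paris.",
    entity="Eiffel Tower",
    ontology_classes=["ArchitecturalStructure", "Person", "Place"],
)
print(prompt["user_message"])
```

After formatting, `{{` and `}}` collapse to single braces, so the example answer inside the prompt reads as a normal object.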
or: | ||
The variables {content_sentence} and {content_entity} are found in a previous part of the KNOX pipeline. | ||
The variable {ontology_classes} is fetched from the Ontology endpoint provided by group E (Database Layer). | ||
|
||
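The prompt asks the model for a JSON answer, but the example inside it uses single quotes, which strict JSON parsers reject. The sketch below shows one way a reply could be parsed defensively. It assumes the reply is plain text containing one flat `{...}` object; the function name `parse_classification` is hypothetical and the actual response format of the Llama server is not known here.

```python
import ast
import json
import re


def parse_classification(reply):
    """Extract {'Entity': ..., 'Class': ...} from a model reply.

    Hypothetical helper. Returns None when no usable object is found.
    """
    # Take the first brace-delimited span; nested objects are not handled.
    match = re.search(r"\{.*?\}", reply, re.DOTALL)
    if match is None:
        return None
    raw = match.group(0)
    try:
        parsed = json.loads(raw)  # strict JSON with double quotes
    except json.JSONDecodeError:
        try:
            # Fall back to Python literal syntax for single-quoted output.
            parsed = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            return None
    if not isinstance(parsed, dict):
        return None
    if "Entity" not in parsed or "Class" not in parsed:
        return None
    return {"Entity": parsed["Entity"], "Class": parsed["Class"]}


# A made-up reply in the single-quoted style the prompt itself demonstrates.
reply = "Answer: { \n'Entity': 'Eiffel Tower', \n'Class': 'ArchitecturalStructure' \n}"
print(parse_classification(reply))
```

`ast.literal_eval` only evaluates literals, so it is a safer fallback than `eval` for text produced by a model.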
|
||
#### Using LocalLlama API server | ||
It is possible to use a local LlamaServer. It can be found in ../concept_linking/tools/LlamaServer. | ||
A README for setting up an instance of this server can be found in the directory given above. | ||
|
||
#### Using the Llama API server hosted in the KNOX pipeline | ||
WIP | ||
Go to the directory /concept_linking/PromptEngineering/main | ||
and set the api_url accordingly: | ||
``` | ||
api_url={domain or ip+port of llama server hosted in the knox pipeline} | ||
``` | ||
cd .\program\ | ||
python .\main.py | ||
``` | ||
Refer to the <a href="https://docs.google.com/spreadsheets/d/1dvVQSEvw15ulNER8qvl1P8Ufq-p3vLU0PswUeahhThg/edit#gid=0" target="_blank">Server Distribution document</a> | ||
for specific DNS and IP+port information. | ||
|
||
## Report | ||
### String Comparison | ||
description WIP | ||
|
||
Description of the project can be found in the report on [Overleaf](https://www.overleaf.com/project/65000513b10b4521e8907099) (requires permission) | ||
|
||
## Authors | ||
### Untrained Spacy | ||
description WIP | ||
|
||
|
||
|
||
--- | ||
|
||
## Tools | ||
|
||
### LlamaServer | ||
|
||
Lucas, Gamma, Vi, Mikkel, Caspar & Rune | ||
### OntologyGraphBuilder | ||
|
||
--- | ||
|
||
## Report | ||
Description of the project can be found in the report on Overleaf (requires permission) | ||
|
||
## Authors | ||
Lucas, Gamma, Vi, Mikkel, Caspar & Rune |