Auto tagger #191

Merged
merged 12 commits into from
Feb 8, 2024

cpefatimaabdillahi
Collaborator

Hi,

  • Please add ninjs_formatter_2 to vocabularies in superdesk-core/apps/prepopulate/data_init/vocabularies.json
    Look for NINJSv3
    {
      "_id": "subscriber_types",
      "display_name": "Subscriber Types",
      "type": "unmanageable",
      "selection_type": "do not show",
      "items": [
        {"is_active": true, "name": "All", "qcode": "all", "formats": [
          {"name": "NINJS", "qcode": "ninjs"},
          {"name": "NINJSv2", "qcode": "ninjs2"},
          {"name": "NINJSv3", "qcode": "ninjs3"},
          {"name": "NINJS FTP", "qcode": "ftp ninjs"},
          {"name": "NewsML G2", "qcode": "newsmlg2"},
          {"name": "Email", "qcode": "email"},
          {"name": "Newsroom NINJS", "qcode": "newsroom ninjs"},
          {"name": "Adobe InDesign (IDML)", "qcode": "idml"},
          {"name": "iMatrics", "qcode": "imatrics"}
        ]},
        {"is_active": true, "name": "Digital/Internet", "qcode": "digital", "formats": [
          {"name": "NINJS", "qcode": "ninjs"},
          {"name": "NINJSv2", "qcode": "ninjs2"},
          {"name": "NINJSv3", "qcode": "ninjs3"},
          {"name": "NINJS FTP", "qcode": "ftp ninjs"},
          {"name": "NewsML G2", "qcode": "newsmlg2"},
          {"name": "Email", "qcode": "email"},
          {"name": "Newsroom NINJS", "qcode": "newsroom ninjs"},
          {"name": "Adobe InDesign (IDML)", "qcode": "idml"},
          {"name": "iMatrics", "qcode": "imatrics"}
        ]},
        {"is_active": true, "name": "Wire/Paper", "qcode": "wire", "formats": [
          {"name": "NITF", "qcode": "nitf"},
          {"name": "Email", "qcode": "email"}
        ]}
      ],
      "init_version": 3
    }

  • Added the jimi_2 file; please use this version.

  • In cp_ninjs_formatter, the class should use ninjs_formatter_2 (added to the same directory) rather than the current version from superdesk-core, as we made changes to the main formatter.

tcp-bhargav and others added 6 commits January 25, 2024 12:35
Added init function.
Updated type and name to JIMI XML 2
Added Code that writes back to KMM.
- Add subject tag group
- Filter by subject scheme
- Add error log and error display
- Fix service name in sendFeedback
Added NinjsFormatter2 Which we want to use.
Added New Jimi Formatter
server/cp/output/formatter/jimi.py (Outdated)
Member

why is this file needed in the repo?

Collaborator Author

This file is needed to handle the different read/write functions of the Semaphore.py file

Member

would be good to only override the methods you need to change to avoid duplication
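
A minimal sketch of what overriding only the changed methods could look like. `BaseFormatter`, `CustomFormatter`, and their method names are hypothetical stand-ins for illustration; they are not the actual superdesk-core formatter API, and the real method to override would depend on what ninjs_formatter_2 changed.

```python
class BaseFormatter:
    """Hypothetical stand-in for the upstream formatter class."""

    def format(self, article):
        # Shared pipeline: subclasses inherit this unchanged.
        output = self.transform(article)
        output["associations"] = self.format_associations(article)
        return output

    def transform(self, article):
        return {"guid": article["guid"], "headline": article.get("headline", "")}

    def format_associations(self, article):
        return {
            key: {"guid": item["guid"]}
            for key, item in article.get("associations", {}).items()
        }


class CustomFormatter(BaseFormatter):
    """Overrides only the association handling; everything else is inherited."""

    def format_associations(self, article):
        # Reuse the parent result, then add the one field that differs.
        associations = super().format_associations(article)
        for key, item in article.get("associations", {}).items():
            associations[key]["name"] = item.get("name", key)
        return associations
```

With this shape, a fix made upstream in `format` or `transform` is picked up automatically, which is the duplication concern raised in the review.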

server/cp/output/formatter/ninjs_formatter_2.py (Outdated)
added code for updated Associations and the suggested name reference.
Reverted Name back to JIMI
Added Code to write IPTC qcodes in Tagger output.
Added Exception Handling and Bug Fixes
@cpefatimaabdillahi
Collaborator Author

cpefatimaabdillahi commented Jan 31, 2024

Hello @petrjasek, please see the 6 new commits from Jan 30 and Jan 31.

We have added the whole ninjs_formatter_2; it would be great if you could help us override the methods.

Note for Semaphore.py:
Semaphore.py has a new variable called index_file_path, which refers to the path of a JSON file (Index.json) generated when we create an item in Metadata Management in Superdesk. The file should be located where Semaphore is running, i.e. in /superdesk/server or /superdesk/server/data.

Please determine the path where this metadata file will be and set it in the environment variable index_file_path in the env file. I will share both files with Gideon. Thanks.
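
For illustration, reading the path from the environment might look roughly like the sketch below. The function name `load_index` and the fallback file name are assumptions, not taken from Semaphore.py; only the variable name index_file_path and the file name Index.json come from the note above.

```python
import json
import os
from pathlib import Path

# Assumed fallback: a file named Index.json in the current working directory.
DEFAULT_INDEX_FILE = "Index.json"


def load_index(env_var="index_file_path", default=DEFAULT_INDEX_FILE):
    """Load the index JSON from the path given in the environment variable."""
    path = Path(os.environ.get(env_var, default))
    if not path.is_file():
        # Fail early with a message that names the setting to fix.
        raise FileNotFoundError(
            f"Index file not found at {path}; set {env_var} in the env file"
        )
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)
```

Failing early with the variable name in the message makes a misconfigured deployment easier to diagnose than a later lookup error.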

@petrjasek
Member

hi @cpefatimaabdillahi what's the structure of that file? would it make sense to store it as a vocabulary in db?

@petrjasek
Member

and which of all those PRs have latest code @cpefatimaabdillahi ? we should probably close the other 2

@cpefatimaabdillahi
Collaborator Author

@petrjasek 189 and 190 can be closed.

@cpefatimaabdillahi
Collaborator Author

> hi @cpefatimaabdillahi what's the structure of that file? would it make sense to store it as a vocabulary in db?

Hi @petrjasek,

Your suggestion makes sense. The JSON file we're discussing is nearly identical to the current vocabularies database, with the addition of semaphore codes necessary for mapping to IPTC codes. Would it be possible for you to adjust the script to read from the vocabularies database instead of a local file, which we've been using for testing? If so, could you please provide an estimate of the hours you anticipate this modification would take?

Thank you!

@petrjasek
Member

hi, I think you can just add those files here to begin with @cpefatimaabdillahi , in the data folder, and I can try to make a vocabulary from it

@pauljkelly5

pauljkelly5 commented Feb 7, 2024

FWIW, I uploaded the vocab file manually to UAT. It's an update to the Index CV.

I added a new property ("semaphore_id") to the schema, which is present in all the items. It functions as a mapping to another CV, in the manner of "ap_subject".
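
As a rough sketch of how such a property could be used, the snippet below builds a lookup from semaphore_id to qcode. The item shape (`name`, `qcode`, `is_active`) follows the vocabulary JSON earlier in this thread; the function name and the sample values are made up for illustration and do not come from the actual Index CV.

```python
def build_semaphore_map(items):
    """Map each item's semaphore_id to its qcode, skipping inactive or unmapped items."""
    mapping = {}
    for item in items:
        semaphore_id = item.get("semaphore_id")
        if semaphore_id and item.get("is_active", True):
            mapping[semaphore_id] = item["qcode"]
    return mapping


# Made-up sample items, for illustration only.
sample_items = [
    {"name": "Example topic A", "qcode": "example-a", "semaphore_id": "sem-001", "is_active": True},
    {"name": "Example topic B", "qcode": "example-b", "semaphore_id": "sem-002", "is_active": True},
    {"name": "No mapping", "qcode": "example-c", "is_active": True},
]
semaphore_to_qcode = build_semaphore_map(sample_items)
```

Building the lookup once keeps each tag resolution to a dictionary access rather than a scan of the CV items.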

@petrjasek petrjasek merged commit 51d18a1 into superdesk:auto-tagger Feb 8, 2024
3 of 5 checks passed