CP Auto tagger #2503

Closed
wants to merge 77 commits into from

cpefatimaabdillahi

This pull request involves several changes across eight files:

  1. superdesk/publish/formatters/imatrics.py
  2. superdesk/publish/formatters/ninjs_formatter.py: Enhanced formatting of controlled vocabulary items and merged various entities into subjects for a NINJS response.
  3. superdesk/publish/formatters/semaphore.py: Created a new formatter for Semaphore integration.
  4. superdesk/publish/transmitters/imatrics.py
  5. superdesk/publish/transmitters/semaphore.py: Introduced a new transmitter for Semaphore integration.
  6. superdesk/text_checkers/ai/__init__.py: Revised article validation and processing for AI services.
  7. superdesk/text_checkers/ai/imatrics.py
  8. superdesk/text_checkers/ai/semaphore.py: Expanded image search capabilities and updated concept formatting.

These changes focus on integration enhancements and more refined article transformation logic across different services and formatters in Superdesk.

Variables are stored in our environment file. Please let us know if we should move these variables to Superdesk's config file.
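
For reference, a minimal sketch of how both options could coexist: the config file supplies defaults and the environment overrides them. The setting names `SEMAPHORE_BASE_URL` and `SEMAPHORE_API_KEY` and the helper `load_settings` are hypothetical, since the PR does not list the actual variables.

```python
import os

# Hypothetical setting names; the PR does not list the actual variables.
SETTING_NAMES = ("SEMAPHORE_BASE_URL", "SEMAPHORE_API_KEY")


def load_settings(config, environ=None):
    """Return settings, letting environment variables override config values."""
    environ = os.environ if environ is None else environ
    settings = {}
    for name in SETTING_NAMES:
        # The environment wins; the config mapping acts as the default.
        value = environ.get(name, config.get(name))
        if value is None:
            raise KeyError("missing setting: {}".format(name))
        settings[name] = value
    return settings
```

With this shape, a deployment can keep secrets such as the API key in the environment while non-secret defaults live in the config file.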

tcp-bhargav and others added 30 commits August 11, 2023 15:44
V1: Added Code for Imatric to use Semaphore API.
V1: Syntax error addressed.
Added More Logging to TroubleShoot.
V2: Added Logging
V2: Added Logging
inconsistency in the use of tabs and spaces
Added More Logging and updated code to transform input before processing.
Added some prints to check
Added Semaphore.py to use for Semaphore API
Updated environment variables
Returning errors to front end
Created Semaphore.py in transmitter
Created Semaphore.py Formatter file
added env variables to code
Updated api_key
Changed much part of the code.
exception bug fix
Trying to run the build
analyzed_data = service.analyze(item, doc.get("tags"))

removed tags
More Bug Fix
Made some changes for methods of transforming xml and jsons.
bug fixes and more loggings added
Trying to run the working Semaphore Code.
Updated the semaphore api key
Added Headline Tag generation to Semaphore.py
Added Slugline Functionality
Added Code for Slugline and Showing tags with ID coming from Semaphore.
Added code to show Heirarchy in Tags and added backend code for Search Tag Responses.
Updated init to take in searchString.
Added Code to get entities in Ninjs Response.
Added Code to fetch Parent Tags.
Member

@petrjasek petrjasek left a comment

can you pls reset the changes done to imatrics?

logger.error(data)


self.output = self.analyze(data)
Member

not sure if this should be called in init, seems like it will be called with the application object and not an article anyhow




self.session = requests.Session()
Member

seems like this is not used later, so you can avoid that or use it later instead of the session defined outside of the class

query = qcode
parent_url = self.get_parent_url+query+frank

response = requests.get(parent_url, headers=headers)
Member

it's better to use session everywhere
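
A minimal sketch of what using one session everywhere could look like. The class name `ParentLookup`, its constructor arguments, and the URL layout are hypothetical and not taken from the PR; the only `requests` calls relied on are `requests.Session()`, `Session.get()`, `Response.raise_for_status()` and `Response.json()`.

```python
from urllib.parse import quote


class ParentLookup:
    """Hypothetical client that reuses a single HTTP session for all calls."""

    def __init__(self, base_url, headers, session=None):
        if session is None:
            # Imported lazily so a test double can be injected instead.
            import requests

            session = requests.Session()
        # The constructor only stores configuration; no article is analyzed here.
        self.base_url = base_url
        self.headers = headers
        self.session = session

    def get_parent(self, qcode):
        # Every request goes through self.session, so connections are pooled.
        url = self.base_url + quote(qcode)
        response = self.session.get(url, headers=self.headers, timeout=10)
        response.raise_for_status()
        return response.json()
```

Passing the session in also makes the class easier to test, because a fake session can stand in for the network.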


# Embed the 'body_html' into the XML template
xml_output = xml_template.format(headline,headline_extended,body_html,slugline)
xml_output = clean_html_content(xml_output)
Member

you can use get_text helper to convert html to text
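
To illustrate the idea only, here is a standard-library sketch of converting HTML to text before embedding it in an XML template. It is not Superdesk's `get_text` helper; `html_to_text`, `build_xml`, the tag list, and the template are all hypothetical stand-ins.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape

# Tags that should start a new line in the text output (illustrative list).
BLOCK_TAGS = {"p", "div", "br", "li", "h1", "h2", "h3", "h4", "h5", "h6"}


class _TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(markup):
    """Drop tags, decode entities, and collapse whitespace per line."""
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()
    lines = (" ".join(line.split()) for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)


def build_xml(headline, body_html):
    # Escape the plain text so characters such as & stay valid inside XML.
    return "<doc><title>{}</title><body>{}</body></doc>".format(
        escape(html_to_text(headline)), escape(html_to_text(body_html))
    )
```

Converting to text first and escaping afterwards keeps stray markup out of the request body, which may help when an API rejects embedded HTML.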

The only change not made is using the get_text helper to convert HTML to text, as it did not work with the API request.
Member

@petrjasek petrjasek left a comment

in general we plan to move the code to the superdesk-cp repo, but we still need to get rid of the changes in the non-Semaphore files I commented on. Some things, like changing the text_checkers schema, we can keep

@@ -88,7 +90,8 @@ def create(self, docs, **kwargs):
 except KeyError:
     raise SuperdeskApiError.notFoundError("{service} service can't be found".format(service=service))
 
-analyzed_data = service.analyze(item, doc.get("tags"))
+# analyzed_data = service.analyze(item, doc.get("tags"))
+analyzed_data = service.analyze(item)
Member

would need the previous version here
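
One way to keep the previous call working, sketched under assumptions: a new service accepts `tags` as an optional argument, so the shared caller can keep passing it. `ExampleTagger` and `analyze_doc` are hypothetical stand-ins, not the Superdesk classes.

```python
class ExampleTagger:
    """Hypothetical stand-in for an AI service; not the Superdesk class."""

    def analyze(self, item, tags=None):
        # `tags` is optional, so callers that still pass it keep working
        # and a service that has no use for tags can simply ignore it.
        return {"guid": item.get("guid"), "tags_received": tags}


def analyze_doc(service, doc):
    # Same call shape as the previous version of create().
    return service.analyze(doc["item"], doc.get("tags"))
```

This leaves the shared call site unchanged for services that depend on the `tags` argument.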

Member

would be good to do those changes in a custom formatter, either that Semaphore one or some CP specific NINJS.
I think those changes would break some existing integrations like with newshub

@petrjasek
Member

hi @cpefatimaabdillahi I've pushed the changes from the PR to superdesk/superdesk-cp#189, can you pls continue there? btw I was trying to run the code locally but was getting a 500 error from the Semaphore API

Added Changed Name and Type to the Formatter
@petrjasek petrjasek closed this Feb 12, 2024

4 participants