Code highlights

Juan Gabriel Griggio edited this page Nov 20, 2024 · 1 revision

Several parts of this project's code can be edited to customize specific functionality.

Gemini / LLM configuration

The file /utils/gemini_helper.py contains all the code that uses Gemini and accesses the LLM (Large Language Model) API. It provides functions such as generate_dict, which returns the LLM's response parsed as a dictionary, generate_text_list, which returns the LLM's response parsed as a list of strings, and others.
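As an illustration of what "parsed as a dictionary" can involve, here is a minimal sketch. It assumes the model answers with a JSON object, possibly wrapped in a Markdown code fence. The helper name parse_response_as_dict is hypothetical and is not taken from gemini_helper.py, whose actual parsing logic may differ.

```python
import json
import re

FENCE = "`" * 3  # Markdown code fence that models often wrap JSON in


def parse_response_as_dict(response_text):
    """Hypothetical helper: turn raw LLM text into a dict.

    Assumes the response is a JSON object, optionally fenced.
    """
    text = response_text.strip()
    pattern = re.escape(FENCE) + r"(?:json)?\s*(.*?)\s*" + re.escape(FENCE)
    match = re.fullmatch(pattern, text, flags=re.DOTALL)
    if match:
        # Keep only the content between the opening and closing fence.
        text = match.group(1)
    parsed = json.loads(text)
    if not isinstance(parsed, dict):
        raise ValueError("expected a JSON object in the LLM response")
    return parsed
```

Raising on anything other than a JSON object lets the caller decide whether to retry the request.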

The Gemini model used can be changed in the following line:

self.model = genai.GenerativeModel('gemini-1.5-flash')

Parameters sent to the Gemini API that control generation can be changed by editing the following variables: RETRIES, GENERATION_CONFIG, SAFETY_SETTINGS, TIME_INTERVAL_BETWEEN_REQUESTS, TIME_INTERVAL_IF_QUOTA_ERROR, TIME_INTERVAL_IF_GEMINI_ERROR.

Be aware that changing the Gemini model may change the usage quota (you might be able to perform more or fewer requests per minute), so you might need to add a pause between requests by changing the variable TIME_INTERVAL_BETWEEN_REQUESTS.
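The following sketch shows one way the retry and pause variables could interact. The variable names come from this page, but the values, the QuotaError class and the call_with_retries function are assumptions made for illustration. The real logic in gemini_helper.py may be organized differently.

```python
import time

# Variable names from this page; the values are hypothetical.
RETRIES = 3
TIME_INTERVAL_BETWEEN_REQUESTS = 1    # seconds to wait after a successful request
TIME_INTERVAL_IF_QUOTA_ERROR = 60     # seconds to wait after a quota error
TIME_INTERVAL_IF_GEMINI_ERROR = 10    # seconds to wait after any other error


class QuotaError(Exception):
    """Hypothetical stand-in for the API's quota (rate limit) error."""


def call_with_retries(request_fn, sleep=time.sleep):
    """Call request_fn up to RETRIES times, pausing between attempts."""
    last_error = None
    for _ in range(RETRIES):
        try:
            result = request_fn()
            # Pause so the next request stays within the per-minute quota.
            sleep(TIME_INTERVAL_BETWEEN_REQUESTS)
            return result
        except QuotaError as err:
            last_error = err
            sleep(TIME_INTERVAL_IF_QUOTA_ERROR)
        except Exception as err:
            last_error = err
            sleep(TIME_INTERVAL_IF_GEMINI_ERROR)
    raise RuntimeError(f"request failed after {RETRIES} attempts") from last_error
```

In this sketch, a model with a lower requests-per-minute quota would call for a larger TIME_INTERVAL_BETWEEN_REQUESTS, while the two error intervals only matter once a request has already failed.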

Prompts configuration

Prompts can be adjusted to your particular needs. For example, you can instruct Gemini to generate content for product names, flight destinations, university degrees or anything else, and adjusting the prompts to that use case will yield better results.

Inside the folder /prompts, there are 3 files: prompts_en.py, prompts_es.py and prompts_pt.py. These are the files to modify when changing the prompts (only the file for the desired language needs to change).

When modifying the prompts, make sure that they comply with the following structure. Refer to one of the prompt scripts to see how it is implemented.

  • prompts_XX: Replace XX with 'en', 'es' or 'pt' accordingly
    • ASSOCIATION: Prompts that will be used to find relationships between the term and the trend.
      • WITHOUT_DESCRIPTIONS: If neither the term nor the trend has a description. String parameters required: term, associative_term
      • WITH_BOTH_DESCRIPTIONS: If both the term and the trend have descriptions. String parameters required: term, term_description, associative_term, associative_term_description
      • WITH_TERM_DESCRIPTION: If only the term has a description. String parameters required: term, term_description, associative_term
      • WITH_ASSOCIATIVE_TERM_DESCRIPTION: If only the trend has a description. String parameters required: term, associative_term, associative_term_description
      • COMMON_PART: Final part of the prompt used to find the association. It is shared among the previous 4 cases and must make grammatical sense when appended to each of them. String parameters required: term, associative_term
    • GENERATION: Prompts that will be used to generate content.
      • WITH_ASSOCIATIVE_TERM: If two terms (for example, product and trend) are used
        • WITHOUT_RELATIONSHIP_AND_DESCRIPTIONS: If neither term has a description and they don't have a relationship (used when the HTTP request forces content generation). String parameters required: n, length, term, associative_term
        • WITHOUT_DESCRIPTIONS: If neither term has a description. String parameters required: n, length, term, associative_term, association_reason, company
        • WITH_TERM_DESCRIPTION: If the main term has a description. String parameters required: n, length, term, term_description, associative_term, association_reason, company
        • WITH_ASSOCIATIVE_TERM_DESCRIPTION: If the associative term (trend) has a description. String parameters required: n, length, term, associative_term, associative_term_description, association_reason, company
        • WITH_BOTH_DESCRIPTIONS: If both terms have descriptions. String parameters required: n, length, term, term_description, associative_term, associative_term_description, association_reason, company
      • WITHOUT_ASSOCIATIVE_TERM: If only one main term is used (there is no second term to associate with, so no relationship with a trend is sought)
        • WITH_DESCRIPTION: If the term has a description. String parameters required: n, length, term, term_description, company
        • WITHOUT_DESCRIPTION: If the term does not have a description. String parameters required: n, length, term, company
      • PATHS_WITHOUT_TERM_DESCRIPTION: Prompt to generate paths if the term has no description. String parameters required: n, term
      • PATHS_WITH_TERM_DESCRIPTION: Prompt to generate paths if the term has a description. String parameters required: n, term, term_description
    • SIZE_ENFORCEMENT: Prompt to shorten generated text if longer than the allowed limit. String parameters required: max_length, copy
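Before deploying edited prompts, it can help to check that every template still contains its required string parameters. The sketch below assumes the prompts are nested dictionaries of str.format-style templates. That layout, the prompt wording and the function names are assumptions for illustration, so check one of the actual prompt scripts for the real layout.

```python
from string import Formatter

# Hypothetical excerpt of a prompts structure; the wording is illustrative only.
PROMPTS = {
    "ASSOCIATION": {
        "WITHOUT_DESCRIPTIONS": "Consider the term '{term}' and the trend '{associative_term}'. ",
        "COMMON_PART": "Is there a relationship between '{term}' and '{associative_term}'?",
    },
    "SIZE_ENFORCEMENT": "Shorten this text to at most {max_length} characters: {copy}",
}

# Required string parameters, as listed in the structure above.
REQUIRED = {
    ("ASSOCIATION", "WITHOUT_DESCRIPTIONS"): {"term", "associative_term"},
    ("ASSOCIATION", "COMMON_PART"): {"term", "associative_term"},
    ("SIZE_ENFORCEMENT",): {"max_length", "copy"},
}


def placeholders(template):
    """Return the set of named {placeholders} found in a format string."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def missing_parameters(prompts, required):
    """Map each prompt path to the required parameters its template lacks."""
    problems = {}
    for path, params in required.items():
        node = prompts
        for key in path:
            node = node[key]  # walk down the nested dictionaries
        missing = params - placeholders(node)
        if missing:
            problems[path] = missing
    return problems
```

An empty result from missing_parameters(PROMPTS, REQUIRED) means every checked template still accepts the parameters it will be given. Under the same assumption, a full association prompt would be built by concatenating a case-specific part with COMMON_PART and calling .format(term=..., associative_term=...).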

Output format configuration

Even after telling Topic Mine in the request which output format you want (Google Ads, SA360 or DV360), you might still need to tweak the output. A few scripts can be changed for that purpose.

SA360

The file /output_writers/sa360_feed_destination.py contains the script that formats the output.

The variable FEED_HEADERS contains the headers that will be present in the output, and the function _append_columns_to_feed_row generates a list of rows with the data. Changing either one changes how the data is written to the output.
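To show how a header list and a row builder work together, here is a minimal sketch. Only the name FEED_HEADERS comes from this page. The column names and the build_feed_row and write_feed functions are hypothetical and do not reproduce the signature or behavior of _append_columns_to_feed_row.

```python
import csv
import io

# Hypothetical column names; the real list is in sa360_feed_destination.py.
FEED_HEADERS = ["Term", "Headline 1", "Headline 2", "Description 1"]


def build_feed_row(term, headlines, descriptions):
    """Hypothetical row builder: one value per header, in header order."""
    values = {"Term": term}
    for i, headline in enumerate(headlines, start=1):
        values[f"Headline {i}"] = headline
    for i, description in enumerate(descriptions, start=1):
        values[f"Description {i}"] = description
    # Columns with no generated value are left empty.
    return [values.get(header, "") for header in FEED_HEADERS]


def write_feed(rows):
    """Serialize the header line and the rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(FEED_HEADERS)
    writer.writerows(rows)
    return buffer.getvalue()
```

Because each row is derived from the header list, adding or reordering a column in this sketch only requires editing FEED_HEADERS. The actual script may instead require the headers and the row function to be changed together.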

DV360

The file /output_writers/dv360_feed_destination.py contains the script that formats the output.

The function __generate_feed contains the code that generates both the headers and the output rows. Changing this function changes how the data is written to the output.