Add Performance and Accuracy Testing Script for FertiScan Pipeline #32

Merged
merged 20 commits into from
Oct 16, 2024

@Endlessflow Endlessflow commented Sep 12, 2024

This PR introduces a basic performance and accuracy testing framework for the FertiScan pipeline as outlined in Issue #18. The key features include the ability to measure the end-to-end execution time of the pipeline, evaluate accuracy using Levenshtein similarity, and generate structured reports in CSV format.

My apologies @k-allagbe for the long PR. I will try to summarize the most important information below.

Key Changes:

  1. TestCase Class:

    • A class responsible for individual performance and accuracy tests.
    • Measures the time taken by the pipeline and compares the actual output against expected output using Levenshtein similarity.
    • Saves actual JSON output for debugging or future comparison. (I am unsure whether this is really useful; looking for feedback.)
  2. TestRunner Class:

    • A class responsible for executing a suite of test cases in a single run.
    • Generates a structured CSV report after running the test suite with the following fields: Test Case, Field Name, Accuracy Score, Expected Value, Actual Value, Pass/Fail, and Pipeline Speed (seconds).
  3. Accuracy Calculation:

    • Implemented basic accuracy assessment using Levenshtein similarity between the expected and actual output fields.
    • Configurable global accuracy threshold (set to 80%).
  4. Performance Reporting:

    • Measures the total execution time for the pipeline from end-to-end.
  5. Output Handling: (I used it mostly for debugging and am not sure it is worth keeping; looking for feedback.)

    • Actual output is saved in test_outputs, allowing for a side-by-side comparison with expected output JSON.
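The accuracy check described above can be sketched as follows. This is a minimal illustration, not the code from this PR: the function names are hypothetical, and normalising the edit distance by the length of the longer string is an assumption (it is consistent with the 96.67 score shown for the phone number row in the example report, but the actual script may differ, for instance in how missing values are handled).

```python
ACCURACY_THRESHOLD = 80.0  # percent; the PR describes a global 80% threshold


def levenshtein_distance(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def similarity_percent(expected: str, actual: str) -> float:
    """Similarity in percent; normalising by the longer string is an assumption."""
    longest = max(len(expected), len(actual))
    if longest == 0:
        return 100.0  # two empty fields are treated as a perfect match
    return 100.0 * (1 - levenshtein_distance(expected, actual) / longest)


def passes(expected: str, actual: str, threshold: float = ACCURACY_THRESHOLD) -> bool:
    """A field passes when its similarity reaches the threshold."""
    return similarity_percent(expected, actual) >= threshold
```

With this normalisation, a single-character difference in a 30-character field still passes comfortably, while a completely missing value fails.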

How to Test:

  1. Prepare test data under the test_data/labels folder with images and expected_output.json files (follow the structure described in the README in the test_data/labels folder).
  2. Run the script using python performance_test.
  3. After execution, check the reports folder for a CSV report that details test case performance and accuracy.
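For reference, a report with the columns listed in this PR could be produced with the standard `csv` module roughly as below. This is a sketch under assumptions: `render_report` and `REPORT_FIELDS` are hypothetical names, and the real script writes to a file in the reports folder rather than returning a string.

```python
import csv
import io

# Column names as listed in the PR description.
REPORT_FIELDS = [
    "Test Case",
    "Field Name",
    "Accuracy Score",
    "Expected Value",
    "Actual Value",
    "Pass/Fail",
    "Pipeline Speed (seconds)",
]


def render_report(rows: list[dict]) -> str:
    """Serialise result rows to CSV text (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=REPORT_FIELDS)
    writer.writeheader()
    writer.writerows(rows)  # values containing commas are quoted automatically
    return buffer.getvalue()


# One row taken from the example report in this PR.
example_row = {
    "Test Case": 1,
    "Field Name": "manufacturer_name",
    "Accuracy Score": "96.00",
    "Expected Value": "Stoller Enterprises Inc.",
    "Actual Value": "Stoller Enterprises, Inc.",
    "Pass/Fail": "Pass",
    "Pipeline Speed (seconds)": "7.1083",
}
report_text = render_report([example_row])
```

Letting `csv.DictWriter` handle quoting avoids broken rows when a field such as an address contains commas.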

Example CSV Report:

Test Case ,Field Name                      ,Accuracy Score ,Expected Value                                        ,Actual Value                                         ,Pass/Fail ,Pipeline Speed (seconds)
        1 ,company_name                    ,         12.50 ,Stoller Enterprises Inc.                              ,                                                     ,Fail      ,                  7.1083
        1 ,company_address                 ,          3.92 ,"9090 Katy Freeway, suite 400, Houston, TX 77024 USA" ,                                                     ,Fail      ,                  7.1083
        1 ,company_website                 ,        100.00 ,                                                      ,                                                     ,Pass      ,                  7.1083
        1 ,company_phone_number            ,          0.00 ,1-800-539-5283 ou 713-461-1493                        ,                                                     ,Fail      ,                  7.1083
        1 ,manufacturer_name               ,         96.00 ,Stoller Enterprises Inc.                              ,"Stoller Enterprises, Inc."                          ,Pass      ,                  7.1083
        1 ,manufacturer_address            ,         96.08 ,"9090 Katy Freeway, suite 400, Houston, TX 77024 USA" ,"9090 Katy Freeway, Suite 400 Houston, TX 77024 USA" ,Pass      ,                  7.1083
        1 ,manufacturer_website            ,        100.00 ,                                                      ,                                                     ,Pass      ,                  7.1083
        1 ,manufacturer_phone_number       ,         96.67 ,1-800-539-5283 ou 713-461-1493                        ,1-800-539-5283 or 713-461-1493                       ,Pass      ,                  7.1083
        1 ,fertiliser_name                 ,        100.00 ,Balancer                                              ,Balancer                                             ,Pass      ,                  7.1083
        1 ,registration_number             ,        100.00 ,2012063B                                              ,2012063B                                             ,Pass      ,                  7.1083
        1 ,lot_number                      ,        100.00 ,                                                      ,                                                     ,Pass      ,                  7.1083
...        

Suggested Next Steps

  • Automated CI Integration: The framework can be integrated into a CI/CD pipeline to run tests automatically after new updates.
  • Semantic Similarity: The framework should use a test to check the semantic similarity of fields.
  • Data aggregation and visualisation: The framework should eventually be expanded to aggregate results across dozens of test cases and visualise them.
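As a rough idea of what the aggregation step could look like, the sketch below averages the accuracy score per field across all rows of a report. It is not part of this PR; the function name is hypothetical and it assumes rows shaped like the CSV report described above (for example, as read by `csv.DictReader`).

```python
from collections import defaultdict
from statistics import mean


def mean_accuracy_by_field(rows: list[dict]) -> dict[str, float]:
    """Average the 'Accuracy Score' column per 'Field Name' (hypothetical helper)."""
    scores: dict[str, list[float]] = defaultdict(list)
    for row in rows:
        # Scores arrive as strings when read from CSV, so convert explicitly.
        scores[row["Field Name"]].append(float(row["Accuracy Score"]))
    return {field: mean(values) for field, values in scores.items()}
```

A per-field average like this would make it easier to spot fields that fail consistently across many labels rather than in a single test case.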

closes #18

Endlessflow and others added 6 commits August 27, 2024 04:18
This commit adds a basic devcontainer configuration and implements a simple naive framework for performance testing.
@Endlessflow Endlessflow linked an issue Sep 12, 2024 that may be closed by this pull request
6 tasks
Endlessflow and others added 5 commits September 12, 2024 03:14
* feat: add a new input field for the form signature for the json_schema

* feat: Update max_token value for gpt-4o model

The `max_token` value for the `gpt-4o` model in the `gpt.py` file was changed to `None`. This change allows for unlimited token length when making API calls with the `gpt-4o` model.

* feat: Update company information in expected.json

The company information in the `expected.json` file was updated to reflect the new details of GreenGrow Inc. This change includes the company name, address, website, and phone number.

* fix: remove newline at end of file in test_inspection.py

* chore: Update test_gpt.py with translated warranty information and nutrient values

* feat: Add translated nutrient values for ingredients in expected.json

The code changes include adding nutrient values for ingredients in the expected.json file. This enhancement improves the accuracy and completeness of the data. The commit message follows the established convention of using a "feat" prefix to indicate a new feature or enhancement.

* feat: Update nutrient values in expected.json

The code changes involve updating the nutrient values in the expected.json file. This improves the accuracy and completeness of the data. The commit message follows the established convention of using a "feat" prefix for new features or enhancements.

* feat: Update nutrient values in expected.json

* refactor: Refactor field validation in inspection.py

Refactor the field validation in the `GuaranteedAnalysis` and `FertilizerInspection` classes in `inspection.py`. The `replace_none_with_empty_list` methods have been updated to use the `field_validator` decorator instead of the `model_validator` decorator. This change improves the readability and maintainability of the code.
This commit adds a basic devcontainer configuration and implements a simple naive framework for performance testing.
Review thread on performance_assessment.py (outdated, resolved)
Review thread on pipeline/inspection.py (outdated, resolved)
…classes

Simplified the script by replacing classes with functions to reduce complexity and improve readability. The script now:

1. Loads environment variables.
2. Loads test cases (images and expected outputs) from the `test_data` folder.
3. Iterates through the test cases to run the pipeline and assess performance.
4. Compiles the results into a CSV file.

Consolidated trivial functions into larger ones with single responsibilities to make the code more maintainable. Updated type hints to use the latest built-in types.
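The function-based flow described in this commit could look roughly like the sketch below. It is an illustration only: the function names, the one-subfolder-per-test-case layout, and the image suffixes are assumptions (the real layout is defined in the README in test_data/labels), and the pipeline is passed in as a plain callable so the sketch stays self-contained.

```python
import json
import time
from pathlib import Path
from typing import Callable

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}  # assumed image formats


def load_test_cases(labels_dir: Path) -> list[dict]:
    """Collect images and expected output from each subfolder (assumed layout)."""
    cases = []
    for case_dir in sorted(p for p in labels_dir.iterdir() if p.is_dir()):
        expected_path = case_dir / "expected_output.json"
        if not expected_path.exists():
            continue  # skip folders without an expected output
        images = sorted(
            p for p in case_dir.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
        )
        cases.append({
            "name": case_dir.name,
            "images": images,
            "expected": json.loads(expected_path.read_text(encoding="utf-8")),
        })
    return cases


def run_test_case(case: dict, pipeline: Callable[[list[Path]], dict]) -> dict:
    """Run the pipeline on one test case and time it end to end."""
    start = time.perf_counter()
    actual = pipeline(case["images"])
    elapsed = time.perf_counter() - start
    return {
        "name": case["name"],
        "expected": case["expected"],
        "actual": actual,
        "seconds": elapsed,
    }
```

`time.perf_counter()` is used here because it is a monotonic, high-resolution clock suited to measuring elapsed time; the results returned by `run_test_case` can then be scored field by field and compiled into the CSV report.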
… case handling for missing fields in `calculate_accuracy()`
Review thread on performance_assessment.py (outdated, resolved)
Review thread on performance_assessment.py (resolved)
@Endlessflow Endlessflow merged commit f4fd942 into main Oct 16, 2024
3 checks passed
@Endlessflow Endlessflow deleted the 18-implement-basic-performance-testing-framework branch October 16, 2024 20:23
Development

Successfully merging this pull request may close these issues.

Implement Basic Performance Testing Framework
3 participants