Is there a way to use this framework for identifying image manipulation, without the treatment and control setting? #2
Yes - set up the database, run the perturbation generator, and then replace the generated folder with images from your manipulated dataset. At this point you can follow the steps outlined in the README to get the statistics and the figures. Hope that helps!
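As a rough sketch of the folder-replacement step (the paths, and the assumption that the perturbation generator writes its output to a single folder, are placeholders rather than the framework's actual layout):

```python
import shutil

# Assumed paths -- adjust to your own layout.
generated_dir = 'path/to/generated/perturbations/'    # folder written by the perturbation generator
manipulated_dir = 'path/to/your/manipulated/images/'  # your own manipulated dataset

# Swap the automatically generated perturbations for your own manipulated
# images, then follow the README steps to compute the statistics and figures.
shutil.rmtree(generated_dir)
shutil.copytree(manipulated_dir, generated_dir)
```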
Sure, in your case you would first need to make sure that the images in your pool of manipulated versions of the source images are named with the following convention:

Then you would do:

```python
import numpy as np
import hashing

path_database = 'path/to/source/images/'
path_dataset = 'path/to/pool/of/manipulated/images/'

dataset = hashing.create_dataset(path_dataset, existing_attacks=True)

algos = [
    # Declare all the algorithms you want to benchmark
    hashing.ClassicalAlgorithm('Phash', hash_size=8),
    hashing.FeatureAlgorithm('ORB', n_features=30),
    hashing.NeuralAlgorithm('SimCLR v1 ResNet50 2x', device='cuda', distance='Jensen-Shannon'),
]

thresholds = [
    # Declare one threshold range per algorithm
    np.linspace(0, 0.4, 20),
    np.linspace(0, 0.3, 20),
    np.linspace(0.3, 0.8, 20),
]

# Create the database for each algorithm and record the time it took
databases, time_database = hashing.create_databases(algos, path_database)

general_output, image_wise_output, running_time = hashing.hashing(algos, thresholds, databases,
                                                                  dataset, artificial_attacks=False)
```

However, this version will only provide you with true positives/false negatives (as I think this is what you are interested in). If you are also interested in true negatives/false positives, you will still need to split your images into experimental and control groups, as we did in our benchmarks.
Thank you for the quick response @Cyrilvallez. The setup works in the case of the above snippet. I am also interested in identifying true negatives and false positives. In that case, I still find it difficult to understand the split into experimental/control groups (I even read the paper) used in the benchmark.
Well, then the experimental group would be the manipulations of the source images (images that are supposed to be detected), and the control group would be the noisy images (images not supposed to be detected). However, if both groups do not contain exactly the same number of images, be aware that the statistics computed (accuracy, precision, etc.) are NOT normalized against the number of images in each group (as we always used the same number of images in each group). Thus they can be misleading in the case of a (large) imbalance between the two groups. Using this setup, you can simply follow all the steps of the README, with the experimental and control groups defined above as:

```python
positive_dataset = hashing.create_dataset('path/to/experimental', existing_attacks=True)
negative_dataset = hashing.create_dataset('path/to/control', existing_attacks=True)
```
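To make the imbalance caveat concrete, here is a minimal sketch (not part of the framework; the confusion counts are made-up placeholders for an imbalanced setup) of how raw accuracy can look good while the per-group rates tell a different story:

```python
# Placeholder confusion counts for an imbalanced setup:
# 100 experimental (positive) images, 1000 control (negative) images.
tp, fn = 50, 50      # experimental group: only half the manipulations detected
tn, fp = 990, 10     # control group: almost all noisy images correctly rejected

recall = tp / (tp + fn)                        # normalized within the positive group -> 0.50
specificity = tn / (tn + fp)                   # normalized within the negative group -> 0.99
balanced_accuracy = (recall + specificity) / 2 # -> 0.745

# Raw accuracy mixes both groups and is dominated by the larger one -> ~0.945
accuracy = (tp + tn) / (tp + fn + tn + fp)

print(f'recall={recall:.2f}, specificity={specificity:.2f}')
print(f'balanced accuracy={balanced_accuracy:.3f} vs raw accuracy={accuracy:.3f}')
```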
Great work on the overall framework and on setting up the benchmark.
The control and treatment setting confused me a lot.
Let's say I have a bunch of source images and a pool of images that are manipulated versions of those images, with the corresponding ground truth for each source image.
For every source image, I want to benchmark the different algorithms on identifying the correct set of manipulated images. Is there a proper way to go about this?
Thanks