Create Twitter demo #34

Open
tokee opened this issue Feb 10, 2017 · 1 comment
tokee commented Feb 10, 2017

Given a list of tweet IDs from Twitter, tools such as https://github.com/docnow/twarc make it easy to extract the tweets. Using this to extract image URLs seems straightforward, so it should be possible to automate the generation of an image collage given just the list of tweet IDs. By keeping the map of IDs<->images, a link back to the originating tweet on Twitter can be used as metadata.
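A minimal sketch of the ID<->image map, assuming the tweets are available as line-delimited JSON in the Twitter v1.1 layout (one tweet object per line, with `id_str` and `entities`/`extended_entities` fields). The function names and the `https://twitter.com/i/web/status/<id>` link form are illustrative assumptions, not taken from demo_twitter.sh.

```python
import json
from collections import defaultdict


def image_urls(tweet):
    """Return the photo URLs attached to one tweet dict (v1.1 JSON layout assumed)."""
    # Prefer extended_entities when present, otherwise fall back to entities.
    entities = tweet.get("extended_entities") or tweet.get("entities") or {}
    return [
        media["media_url_https"]
        for media in entities.get("media", [])
        if media.get("type") == "photo" and "media_url_https" in media
    ]


def build_image_map(lines):
    """Map each image URL to the list of tweet IDs that reference it."""
    url_to_ids = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the input
        tweet = json.loads(line)
        for url in image_urls(tweet):
            url_to_ids[url].append(tweet["id_str"])
    return dict(url_to_ids)


def tweet_link(tweet_id):
    """Build a link back to the originating tweet (assumed URL form)."""
    return f"https://twitter.com/i/web/status/{tweet_id}"
```

Passing an open file of hydrated tweets to `build_image_map` yields both the list of images to collage (the keys) and the tweets to link back to (the values).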

Things to consider:

  1. If multiple tweets point to the same image, should the image be shown once for each tweet or just once in total? The former can lead to hundreds of thousands of repetitions of the same image (see https://medium.com/on-archivy/exploring-womensmarch-dcc30221101c), while the latter "hides" the image among less-shared images and makes it problematic to provide links back to the originating tweets.
  2. Should the images be downloaded before collage creation or fetched on the fly by juxta? This is tied to point 1, as repeated images would be fetched over the net once for each repetition. Also, the current version of juxta is not geared towards fetching from the web and would be effectively blocked by an adversarial image server that trickle-serves images one byte at a time.

Given this, the best solution seems to be (1) to repeat images in the collage and (2) to download them before generating the collage.
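The trickle-serving concern applies to a pre-download step as well, since a per-read socket timeout never fires as long as one byte arrives in time. A rough sketch of one way to guard against it is to enforce an overall time and size budget per image. The names `read_with_budget`, `fetch_image` and `DownloadAborted`, and the default limits, are hypothetical and not part of juxta.

```python
import time
import urllib.request


class DownloadAborted(Exception):
    """Raised when a download exceeds its time or size budget."""


def read_with_budget(stream, max_seconds=30.0, max_bytes=20 * 1024 * 1024,
                     chunk_size=64 * 1024, clock=time.monotonic):
    """Read a binary stream fully, aborting if it takes too long or grows too big."""
    # read1 performs at most one underlying read, so a slow server cannot keep
    # a single call blocked until chunk_size bytes have trickled in.
    reader = getattr(stream, "read1", None) or stream.read
    deadline = clock() + max_seconds
    chunks = []
    total = 0
    while True:
        chunk = reader(chunk_size)
        if not chunk:
            break  # end of stream
        total += len(chunk)
        if total > max_bytes:
            raise DownloadAborted("size budget exceeded")
        if clock() > deadline:
            raise DownloadAborted("time budget exceeded")
        chunks.append(chunk)
    return b"".join(chunks)


def fetch_image(url, socket_timeout=10.0, **budget):
    """Download one image; socket_timeout bounds each individual read."""
    with urllib.request.urlopen(url, timeout=socket_timeout) as response:
        return read_with_budget(response, **budget)
```

The deadline is only checked between reads, so the worst case is roughly `max_seconds` plus one `socket_timeout`.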

@tokee tokee self-assigned this Feb 10, 2017

tokee commented Feb 15, 2017

This has been implemented in demo_twitter.sh, with the trade-offs described above. I am not really sure that repeating the duplicate images is the best option, though. Maybe some sort of visual prioritization mechanism instead? Sorting by popularity? Changing border color? Permanent boxes around certain images?
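As an illustration of the "sorting by popularity" idea only, here is a small sketch that shows each image once and orders the collage by how many tweets reference it. It assumes a dict mapping image URL to a list of tweet IDs; the function name is hypothetical.

```python
def by_popularity(url_to_tweet_ids):
    """Return (url, tweet_ids) pairs, most-shared image first, ties broken by URL."""
    return sorted(
        url_to_tweet_ids.items(),
        key=lambda item: (-len(item[1]), item[0]),
    )
```

Each image then appears once, while the attached ID list still allows links back to every originating tweet.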

Related to this, de-duplication of images would only catch those with the same URL. This could be improved with checksums, but since social media sites resize and re-compress images, even that does not guarantee that duplicates will be eliminated.
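A sketch of the checksum variant, assuming the images have already been downloaded and are available as `(url, content)` pairs. SHA-256 is an arbitrary choice here and the function name is hypothetical.

```python
import hashlib
from collections import defaultdict


def group_by_checksum(images):
    """Group image URLs by the SHA-256 digest of their downloaded bytes.

    images is an iterable of (url, content) pairs, where content is bytes.
    """
    groups = defaultdict(list)
    for url, content in images:
        digest = hashlib.sha256(content).hexdigest()
        groups[digest].append(url)
    return dict(groups)
```

Only byte-identical files share a group, so a resized or re-compressed copy of the same picture still ends up in a group of its own, which is the limitation noted above.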
