Coce

A cache of cover image URLs, exposing its content as a REST web service.

MIT License

In many applications (an ILS, for example), a cover image is displayed alongside a book or other kind of resource. These images are fetched automatically from providers such as Google or Amazon, which offer web services to retrieve information about a book from an ID (an ISBN, for example).

With Coce, the cover image URLs from various providers are cached in a Redis server. A client sends a REST request to Coce, which replies with the cached URLs or, if they are not in its cache, retrieves them from the providers. In its request, the client specifies a provider order (for example aws,gb,ol for AWS, Google, and then Open Library): Coce sends the first available URL.
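
The lookup described above can be sketched as follows. This is only an illustration under stated assumptions, not Coce's actual code: an in-memory Map stands in for Redis, and resolveCover, cache, fetchers and the key layout are hypothetical. Each stub fetcher is assumed to return a URL or null.

```javascript
// Minimal sketch of "cache first, then providers, in the client's order".
// A Map stands in for Redis; every name here is hypothetical.
function resolveCover(id, providerOrder, cache, fetchers) {
  for (const provider of providerOrder) {
    const key = provider + '.' + id; // assumed key layout, for illustration only
    if (!cache.has(key)) {
      // Cache miss: ask the provider once and remember the answer.
      // (In this sketch a null "no cover" answer is cached as well.)
      const fetchUrl = fetchers[provider];
      cache.set(key, fetchUrl ? fetchUrl(id) : null);
    }
    const url = cache.get(key);
    if (url) return url; // first available URL wins
  }
  return null; // no provider has a cover for this ID
}

// Usage with stub fetchers (the URLs are placeholders).
const cache = new Map();
let olCalls = 0;
const fetchers = {
  ol: (id) => {
    olCalls += 1;
    return id === '9780563533191' ? 'https://example.org/ol.jpg' : null;
  },
  gb: (id) => 'https://example.org/gb-' + id + '.jpg',
};
const first = resolveCover('9780563533191', ['ol', 'gb'], cache, fetchers);
const second = resolveCover('9780563533191', ['ol', 'gb'], cache, fetchers); // answered from the cache
const fallback = resolveCover('2847342257', ['ol', 'gb'], cache, fetchers); // ol has nothing, gb answers
console.log(first, second, fallback, olCalls);
```

The second call does not reach the stub provider again, which is the point of the cache: providers are only queried for URLs that are not already stored.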

Installation

  • Install and start a Redis server

  • Install Node.js

  • Install the Node.js libraries. In the Coce home directory, enter:

      npm install
    
  • Configure Coce by editing config.json. Start from the provided config.json.sample file.

    • port - port on which the server responds
    • providers - array of available providers: gb,aws,ol
    • timeout - timeout in milliseconds for the service. Above this value, Coce stops waiting for responses from providers
    • redis - Redis server parameters:
      • host
      • port
      • timeout
    • gb - Google Books parameters:
      • timeout - timeout of the cached URL from Google Books
    • ol - Open Library parameters:
      • timeout - timeout of the cached URL from Open Library. After this delay, a URL is automatically removed from the cache and has to be fetched again if requested
      • imageSize - size of images: small, medium, large
    • aws - Amazon parameters:
      • imageSize - size of images: SmallImage, MediumImage, LargeImage
      • timeout - timeout when probing image URLs via direct HTTP requests
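
As a rough illustration of the parameters listed above, the sketch below builds a configuration object with the same keys and runs a few sanity checks on it. The checkConfig function is hypothetical (it is not part of Coce), and all values are placeholders rather than recommended settings; in particular, the units of the per-provider timeouts are not stated here, so refer to config.json.sample for real values.

```javascript
// Sketch of a sanity check for config.json, based only on the keys listed above.
// checkConfig and the example values are hypothetical.
const KNOWN_PROVIDERS = ['gb', 'aws', 'ol'];

function checkConfig(config) {
  const problems = [];
  if (!Number.isInteger(config.port)) {
    problems.push('port must be an integer');
  }
  if (!Array.isArray(config.providers) || config.providers.length === 0) {
    problems.push('providers must be a non-empty array');
  } else {
    for (const p of config.providers) {
      if (!KNOWN_PROVIDERS.includes(p)) problems.push('unknown provider: ' + p);
    }
  }
  if (typeof config.timeout !== 'number') {
    problems.push('timeout must be a number (milliseconds)');
  }
  if (!config.redis || !config.redis.host || !config.redis.port) {
    problems.push('redis.host and redis.port are required');
  }
  return problems;
}

// Placeholder values, for illustration only.
const example = {
  port: 8080,
  providers: ['gb', 'aws', 'ol'],
  timeout: 8000,
  redis: { host: 'localhost', port: 6379, timeout: 5000 },
  gb: { timeout: 86400 },
  ol: { timeout: 86400, imageSize: 'medium' },
  aws: { imageSize: 'MediumImage', timeout: 2000 },
};
console.log(checkConfig(example)); // prints []
```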

Start

cd _COCE_HOME_
node app.js

Deployment on a production server

By default, when Coce is run directly, there is no supervision mechanism, and Coce runs as a single process executing JavaScript on a single thread (like any Node.js application). In production, Coce should be turned into a Linux service with automatic start/stop and supervision. A traditional Unix process supervision architecture can be used: Unix System V Init, runit, or [daemon](http://man7.org/linux/man-pages/man3/daemon.3.html).

A more Node.js-oriented approach is to use the pm2 daemon process manager.

pm2 global installation (Debian/Ubuntu): sudo npm i -g pm2

To have pm2 use all available cores for Coce: pm2 start app.js --name coce -i max

Monitoring your daemons: pm2 monit

Program auto-startup:

cd _COCE_HOME_
pm2 start app.js --name coce -i max
pm2 save
pm2 startup
sudo env PATH=$PATH:/usr/bin /usr/local/lib/node_modules/pm2/bin/pm2 startup systemd -u your_user_name --hp _HOME_

Redis persistence

Coce book cover image URLs are stored in Redis. By default, there is no data persistence. This means that if you restart your server, all the URLs patiently collected will be lost.

By default, on a Debian/Ubuntu box, Redis saves its state in a file: /var/lib/redis/dump.rdb. You can back up this file automatically with a cron job. To restore a backup, stop the Redis server, copy the file, and restart Redis. For example:

systemctl stop redis-server
cp path_name_to_backup/dump.rdb /var/lib/redis
systemctl start redis-server

Service usage

To get all cover images from Open Library (ol), Google Books (gb), and Amazon (aws) for several ISBNs:

http://coce.server/cover?id=9780415480635,9780821417492,2847342257,9780563533191&provider=ol,gb,aws&all

This request returns:

{
  "2847342257": {
    "aws": "https://images-na.ssl-images-amazon.com/images/I/51LYLJRtthL._SL160_.jpg"
  },
  "9780563533191": {
    "ol": "https://covers.openlibrary.org/b/id/2520432-M.jpg",
    "gb": "https://books.google.com/books/content?id=OphMAAAACAAJ&printsec=frontcover&img=1&zoom=1",
    "aws": "https://images-na.ssl-images-amazon.com/images/I/412CFNG0QEL._SL160_.jpg"
  },
  "9780415480635": {
    "gb": "https://books.google.com/books/content?id=Yc30cofv4_MC&printsec=frontcover&img=1&zoom=1",
    "aws": "https://images-na.ssl-images-amazon.com/images/I/41HOtyaxTlL._SL160_.jpg"
  },
  "9780821417492": {
    "gb": "https://books.google.com/books/content?id=D5yimAEACAAJ&printsec=frontcover&img=1&zoom=1",
    "aws": "https://images-na.ssl-images-amazon.com/images/I/417jg7TjvYL._SL160_.jpg"
  }
}

Without the &all parameter, the same request returns only the first available URL per ISBN, following the provider order:

http://coce.server/cover?id=9780415480635,9780821417492,2847342257,9780563533191&provider=ol,gb,aws

returns:

{
  "2847342257": "https://images-na.ssl-images-amazon.com/images/I/51LYLJRtthL._SL160_.jpg",
  "9780563533191": "https://covers.openlibrary.org/b/id/2520432-M.jpg",
  "9780415480635": "https://books.google.com/books/content?id=Yc30cofv4_MC&printsec=frontcover&img=1&zoom=1",
  "9780821417492": "https://books.google.com/books/content?id=D5yimAEACAAJ&printsec=frontcover&img=1&zoom=1"
}
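
The relationship between the two response shapes can be sketched on the client side: given an &all response, picking the first provider in the requested order that has a URL yields the shorter form. The firstByOrder helper below is hypothetical and only illustrates the ordering rule; the sample data is a subset of the response shown above.

```javascript
// Sketch: reduce an "&all" response to one URL per ID, by provider order.
// firstByOrder is a hypothetical helper, not part of Coce.
function firstByOrder(allResponse, providerOrder) {
  const result = {};
  for (const [id, urls] of Object.entries(allResponse)) {
    // Find the first provider, in the requested order, that returned a URL.
    const provider = providerOrder.find((p) => urls[p]);
    if (provider) result[id] = urls[provider];
  }
  return result;
}

// Two entries copied from the "&all" response above.
const allResponse = {
  '2847342257': {
    aws: 'https://images-na.ssl-images-amazon.com/images/I/51LYLJRtthL._SL160_.jpg',
  },
  '9780563533191': {
    ol: 'https://covers.openlibrary.org/b/id/2520432-M.jpg',
    gb: 'https://books.google.com/books/content?id=OphMAAAACAAJ&printsec=frontcover&img=1&zoom=1',
    aws: 'https://images-na.ssl-images-amazon.com/images/I/412CFNG0QEL._SL160_.jpg',
  },
};
const reduced = firstByOrder(allResponse, ['ol', 'gb', 'aws']);
console.log(reduced);
```

With the order ol,gb,aws, the first ISBN falls through to its Amazon URL and the second keeps its Open Library URL, as in the response above.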

When a JavaScript callback function name is added to the request, Coce returns its result as JSONP:

http://coce.server/cover?id=9780415480635,9780821417492,2847342257,9780563533191&provider=ol,gb,aws&callback=populateImg

returns:

populateImg({"2847342257":"https://images-na.ssl-images-amazon.com/images/I/51LYLJRtthL._SL160_.jpg","9780563533191":"https://covers.openlibrary.org/b/id/2520432-M.jpg","9780415480635":"https://books.google.com/books/content?id=Yc30cofv4_MC&printsec=frontcover&img=1&zoom=1","9780821417492":"https://books.google.com/books/content?id=D5yimAEACAAJ&printsec=frontcover&img=1&zoom=1"})
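
The three request forms shown in this section differ only in their query string, so a client could build them with a small helper. The sketch below assumes nothing beyond the id, provider, all and callback parameters used in the examples above; buildCoverUrl and its options object are hypothetical names.

```javascript
// Sketch of a request-URL builder for the /cover endpoint shown above.
// buildCoverUrl and its options object are hypothetical helpers.
function buildCoverUrl(base, ids, providers, options = {}) {
  let url = base.replace(/\/+$/, '') + '/cover'
    + '?id=' + ids.map((id) => encodeURIComponent(id)).join(',')
    + '&provider=' + providers.map((p) => encodeURIComponent(p)).join(',');
  if (options.all) url += '&all'; // one URL per provider instead of the first one
  if (options.callback) {
    url += '&callback=' + encodeURIComponent(options.callback); // JSONP
  }
  return url;
}

const isbns = ['9780415480635', '9780821417492', '2847342257', '9780563533191'];
const order = ['ol', 'gb', 'aws'];
const allUrl = buildCoverUrl('http://coce.server', isbns, order, { all: true });
const jsonpUrl = buildCoverUrl('http://coce.server/', isbns, order, { callback: 'populateImg' });
console.log(allUrl);
console.log(jsonpUrl);
```

Commas are left unencoded so that the generated URLs match the examples above.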

Client-side usage

See sample-client.html for a sample usage of Coce from JavaScript. It uses the coceclient.js module, which is used like this:

// isbns is an array of ISBNs
var coceClient = new CoceClient('http://coceserver.com:8080', 'ol,aws,gb');
coceClient.fetch(isbns, function(isbn, url) {
  $('#isbn_' + isbn).html('<img src="' + url + '">');
});

Performance

Coce serves cached URLs quickly. With all requested URLs in cache, an ab (ApacheBench) test of 10000 requests, with 50 concurrent requests:

ab -n 10000 -c 50 http://localhost:8080/cover?id=9780415480635,9780821417492,2847342257,9780563533191&provider=gb,aws

gives this result:

Document Path:          /cover?id=9780415480635,9780821417492,2847342257,9780563533191
Document Length:        431 bytes

Concurrency Level:      50
Time taken for tests:   5.333 seconds
Complete requests:      10000
Failed requests:        0
Total transferred:      6350000 bytes
HTML transferred:       4310000 bytes
Requests per second:    1874.97 [#/sec] (mean)
Time per request:       26.667 [ms] (mean)
Time per request:       0.533 [ms] (mean, across all concurrent requests)
Transfer rate:          1162.70 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.5      0      10
Processing:     6   25   5.8     24     348
Waiting:        5   24   5.6     23     338
Total:          6   25   5.8     24     348

Percentage of the requests served within a certain time (ms)
  50%     24
  66%     25
  75%     27
  80%     28
  90%     31
  95%     34
  98%     37
  99%     40
 100%    348 (longest request)
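
For readers unfamiliar with ab output, the derived figures in the report follow from its totals. The sketch below recomputes them from the numbers printed above; small differences from the report are expected because ab prints a rounded test duration.

```javascript
// Recompute ab's derived figures from the totals reported above.
const requests = 10000;    // "Complete requests"
const concurrency = 50;    // "Concurrency Level"
const seconds = 5.333;     // "Time taken for tests" (rounded by ab)
const bytes = 6350000;     // "Total transferred"

// Requests per second (mean)
const requestsPerSecond = requests / seconds;
// Time per request (mean): duration seen by one of the 50 concurrent clients
const msPerRequest = (concurrency * seconds * 1000) / requests;
// Time per request (mean, across all concurrent requests)
const msPerRequestOverall = (seconds * 1000) / requests;
// Transfer rate in Kbytes/sec (1 Kbyte = 1024 bytes)
const kbytesPerSecond = bytes / 1024 / seconds;

console.log(
  requestsPerSecond.toFixed(0),
  msPerRequest.toFixed(1),
  msPerRequestOverall.toFixed(2),
  kbytesPerSecond.toFixed(0)
);
```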