Ch 5. Document Operations

Madhusudhan Konda edited this page Jul 26, 2021 · 7 revisions

The movies dataset is available in the datasets folder under code.

Indexing Documents

Indexing a movie document with an identifier:

// Request
PUT movies/_doc/1
{
  "title":"The Godfather",
  "synopsis":"The aging patriarch of an organized crime dynasty transfers control of his clandestine empire to his reluctant son"
}

// Response when the document is fetched back with GET movies/_doc/1:
{
  "_index" : "movies",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 0,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "title" : "The Godfather",
    "synopsis" : "The aging patriarch of an organized crime dynasty transfers control of his clandestine empire to his reluctant son"
  }
}

Indexing documents without an ID

Indexing a movie review document without a user-provided ID (note the POST method):

// Request
POST movies_reviews/_doc
{
  "movie":"The Godfather",
  "user":"Peter Piper",
  "rating":4.5,
  "remarks":"The movie started with a .."
}

// Response:
{
  "_index" : "movies_reviews",
  "_type" : "_doc",
  "_id" : "6HPnfXoBW8A1B2am0B5U",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 4,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}

The user did not provide an ID, so the system generated one (6HPnfXoBW8A1B2am0B5U) and assigned it to the document.
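The difference between the two indexing styles can be modelled in a few lines of Python. This is only a toy in-memory sketch: the `index_document` helper and the `store` dictionaries are hypothetical, and the random ID it produces merely stands in for whatever ID the server would generate.

```python
import base64
import uuid


def index_document(store, doc, doc_id=None):
    """Toy model of PUT <index>/_doc/<id> and POST <index>/_doc.

    `store` maps document IDs to {"_version": int, "_source": dict}.
    """
    if doc_id is None:
        # POST without an ID: the "server" picks a random, URL-safe ID.
        # (A stand-in only; Elasticsearch uses its own ID scheme.)
        doc_id = base64.urlsafe_b64encode(uuid.uuid4().bytes).rstrip(b"=").decode()

    existing = store.get(doc_id)
    version = existing["_version"] + 1 if existing else 1
    store[doc_id] = {"_version": version, "_source": dict(doc)}
    return {
        "_id": doc_id,
        "_version": version,
        "result": "updated" if existing else "created",
    }


# No ID supplied: one is generated and the document is created.
reviews = {}
created = index_document(reviews, {"movie": "The Godfather", "rating": 4.5})
print(created["result"], created["_version"])  # created 1

# Same explicit ID twice: the second call overwrites and bumps the version.
movies = {}
index_document(movies, {"title": "The Godfather"}, doc_id="1")
again = index_document(movies, {"title": "The Godfather"}, doc_id="1")
print(again["result"], again["_version"])  # updated 2
```

The second half of the example shows why an explicit ID matters: indexing to an ID that already exists replaces the stored document instead of adding a new one.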

Source Includes

Let's add another document to the movies index:

PUT movies/_doc/3
{
  "title":"The Shawshank Redemption",
  "synopsis":"Two imprisoned men bond over a number of years, finding solace and eventual redemption through acts of common decency",
  "rating":"9.3",
  "certificate":"15",
  "genre":"drama",
  "actors":["Morgan Freeman","Tim Robbins"]
}

Include fields

To include only the title, rating, and genre fields, run the following query:

GET movies/_source/3?_source_includes=title,rating,genre

The output would be:

{
  "rating" : "9.3",
  "genre" : "drama",
  "title" : "The Shawshank Redemption"
}

Exclude fields

To exclude fields, run the following query:

GET movies/_source/3?_source_excludes=actors,synopsis

This will return:

{
  "rating" : "9.3",
  "certificate" : "15",
  "genre" : "drama",
  "title" : "The Shawshank Redemption"
}

Include and exclude fields

Let's first update our movie document:

PUT movies/_doc/3
{
  "title":"The Shawshank Redemption",
  "synopsis":"Two imprisoned men bond over a number of years, finding solace and eventual redemption through acts of common decency",
  "rating":"9.3",
  "certificate":"15",
  "genre":"drama",
  "actors":["Morgan Freeman","Tim Robbins"],
  "rating_amazon":4.5,
  "rating_rotten_tomatoes":80,
  "rating_metacritic":90
}
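With the extra rating fields in place, both parameters can be combined. A request such as `GET movies/_source/3?_source_includes=rating*&_source_excludes=rating_amazon` is an assumed example (the page does not show the query), intended to keep every field starting with rating except rating_amazon. The Python sketch below imitates that selection locally. The `filter_source` helper is hypothetical and only handles top-level fields, and `fnmatchcase` is a stand-in for the server's wildcard matching.

```python
from fnmatch import fnmatchcase


def filter_source(source, includes=(), excludes=()):
    """Keep top-level fields matching `includes`, then drop those matching `excludes`.

    Simplified: real source filtering also handles nested (dotted) field paths.
    """
    kept = {}
    for field, value in source.items():
        # An empty include list means "everything is included".
        included = not includes or any(fnmatchcase(field, p) for p in includes)
        excluded = any(fnmatchcase(field, p) for p in excludes)
        if included and not excluded:
            kept[field] = value
    return kept


movie = {
    "title": "The Shawshank Redemption",
    "rating": "9.3",
    "certificate": "15",
    "genre": "drama",
    "actors": ["Morgan Freeman", "Tim Robbins"],
    "rating_amazon": 4.5,
    "rating_rotten_tomatoes": 80,
    "rating_metacritic": 90,
}

ratings = filter_source(movie, includes=["rating*"], excludes=["rating_amazon"])
print(ratings)
# {'rating': '9.3', 'rating_rotten_tomatoes': 80, 'rating_metacritic': 90}
```

An exclude pattern wins over an include pattern in this sketch, which is why rating_amazon is dropped even though it matches rating*.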

Deleting Documents

Deleting a document by its ID

DELETE movies/_doc/1

This will result in:

{
  "_index" : "movies",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 5,
  "result" : "deleted",
  "_shards" : {
    "total" : 4,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 6,
  "_primary_term" : 1
}

Updating Documents

Let's add a couple of fields to our existing movie document. First, index the document again (it was deleted in the previous section):

PUT movies/_doc/1
{
  "title":"The Godfather",
  "synopsis":"The aging patriarch of an organized crime dynasty transfers control of his clandestine empire to his reluctant son"
}

Add additional fields

Update the document with additional fields actors and director:

POST movies/_update/1
{
  "doc": {
    "actors":["Marlon Brando","Al Pacino","James Caan"],
    "director":"Francis Ford Coppola"
  }
}

If the operation is successful, the response would be:

{
  "_index" : "movies",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 2,
  "result" : "updated",
  "_shards" : {
    "total" : 4,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 1,
  "_primary_term" : 1
}
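A partial update merges the fields under doc into the stored document and does not replace it wholesale. The Python sketch below models that merge locally. The `apply_partial_update` helper is hypothetical and runs against a plain dictionary, not a cluster.

```python
def apply_partial_update(source, partial):
    """Merge `partial` into `source`: objects merge recursively, other values are replaced."""
    merged = dict(source)
    for field, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(field), dict):
            # Nested objects are merged key by key.
            merged[field] = apply_partial_update(merged[field], value)
        else:
            # Scalars and arrays simply overwrite (or add) the field.
            merged[field] = value
    return merged


godfather = {
    "title": "The Godfather",
    "synopsis": "The aging patriarch of an organized crime dynasty transfers "
                "control of his clandestine empire to his reluctant son",
}
updated = apply_partial_update(
    godfather,
    {
        "actors": ["Marlon Brando", "Al Pacino", "James Caan"],
        "director": "Francis Ford Coppola",
    },
)
print(sorted(updated))  # ['actors', 'director', 'synopsis', 'title']
```

The existing title and synopsis survive the update, which is the practical difference from re-indexing the whole document with PUT.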

Update by query

Replacing the actor 'Al Pacino' with 'Oscar Winner Al Pacino' in every movie whose actors field matches 'Al Pacino' (here, only 'The Godfather'):

POST movies/_update_by_query
{
  "query": {
    "match": {
      "actors": "Al Pacino"
    }
  },
  
  "script": {
    "source": """
    ctx._source.actors.add('Oscar Winner Al Pacino');
    ctx._source.actors.remove(ctx._source.actors.indexOf('Al Pacino'));
    """,
    "lang": "painless"
  }
}

The response would be:

{
  "took" : 43,
  "timed_out" : false,
  "total" : 1,
  "updated" : 1,
  "deleted" : 0,
  "batches" : 1,
  "version_conflicts" : 0,
  "noops" : 0,
  "retries" : {
    "bulk" : 0,
    "search" : 0
  },
  "throttled_millis" : 0,
  "requests_per_second" : -1.0,
  "throttled_until_millis" : 0,
  "failures" : [ ]
}
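The list manipulation in the script above can be traced in plain Python. This is a sketch only: the `rename_actor` helper is hypothetical and operates on a local list, not on `ctx._source`. It also adds a guard the script does not have. A match query on a text field works on analyzed terms, so it may select documents that do not contain the exact string 'Al Pacino'; for such a document indexOf would return -1.

```python
def rename_actor(actors, old_name, new_name):
    """Mirror the script: append the new name, then remove the old one by index.

    Unlike the script, it leaves the list alone when `old_name` is absent.
    """
    if old_name not in actors:
        return list(actors)
    updated = list(actors)
    updated.append(new_name)                # actors.add(new_name)
    updated.pop(updated.index(old_name))    # actors.remove(actors.indexOf(old_name))
    return updated


cast = ["Marlon Brando", "Al Pacino", "James Caan"]
print(rename_actor(cast, "Al Pacino", "Oscar Winner Al Pacino"))
# ['Marlon Brando', 'James Caan', 'Oscar Winner Al Pacino']
```

The new name ends up at the end of the list and not in the old position, because the script appends before it removes.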

Upsert

POST movies/_update/5
{
  "script": {
    "source": "ctx._source.gross_earnings = '357.1m'"
  },
  "upsert": {
    "title":"Top Gun",
    "gross_earnings":"357.1m"
  }
}
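In the request above, the script runs only if document 5 already exists. If it does not, the body under upsert is indexed as a new document. The Python sketch below models that branching with a dictionary. The `update_with_upsert` and `set_gross_earnings` helpers are hypothetical, and the returned labels are illustrative, not actual response fields.

```python
def update_with_upsert(store, doc_id, script, upsert):
    """Toy model of _update with a script and an upsert section."""
    if doc_id in store:
        script(store[doc_id])      # document exists: run the script on its source
        return "ran script"
    store[doc_id] = dict(upsert)   # document missing: index the upsert body as-is
    return "inserted upsert"


def set_gross_earnings(source):
    # Stand-in for: ctx._source.gross_earnings = '357.1m'
    source["gross_earnings"] = "357.1m"


top_gun = {"title": "Top Gun", "gross_earnings": "357.1m"}

# Case 1: document 5 does not exist yet, so the upsert body is stored.
empty_index = {}
first = update_with_upsert(empty_index, "5", set_gross_earnings, top_gun)
print(first, empty_index["5"])
# inserted upsert {'title': 'Top Gun', 'gross_earnings': '357.1m'}

# Case 2: document 5 exists without gross_earnings, so the script adds it.
existing_index = {"5": {"title": "Top Gun"}}
second = update_with_upsert(existing_index, "5", set_gross_earnings, top_gun)
print(second, existing_index["5"])
# ran script {'title': 'Top Gun', 'gross_earnings': '357.1m'}
```

Either way the document ends up with the gross_earnings field, which is the point of an upsert: the caller does not need to check for existence first.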