Geospatial Query is a feature in MongoDB allowing users to query geospatial data.
Geospatial data can be stored in GeoJSON
or coordinate pairs
format.
GeoJSON is used to calculate geometry on an Earth-like sphere, while coordinate pairs is used to calculate distance between 2 points on a Euclidean plane.
### B. How to do Geospatial Query in MongoDB?
First we need some data to work with, I'm using School List GeoJSON from Calderdale Council. I inserted some of the data to MongoDB Atlas to try some queries on them.
Here is an example of the school data
{
"type": "Feature",
"properties": {
"DfE number": "2005",
"Establishment": "Abbey Park Academy",
"Phase": "Primary",
...
},
"geometry": {
"type": "Point",
"coordinates": [
-1.89601,
53.75541
]
}
}
Some geospatial operations requires a geospatial index. There are 2 types of geospatial index:
- 2dsphere
2dsphere
index is used if we are working with Earth-like geometry - 2d
2d
index is used for working on geometry on 2D plane.
Here I will be using 2dsphere
index on geometry
field:
There are 4 operators on geospatial query:
-
$geoIntersects
Selects geometries that intersect with a GeoJSON geometry
-
$geoWithin
Selects geometries within a bounding GeoJSON geometry
-
$near
Returns geospatial objects in proximity to a point. Requires a geospatial index.
-
$nearSphere
Returns geospatial objects in proximity to a point on a sphere. Requires a geospatial index.
{
geometry: {
$near: {
$geometry: {
type: "Point",
coordinates: [-1.89601, 53.75541]
},
$maxDistance: 4000
}
}
}
result:
This query result in 4 records being retrieved, now I'll try to increase $maxDistance
to get more data.
{
geometry: {
$near: {
$geometry: {
type: "Point",
coordinates: [-1.89601, 53.75541]
},
$maxDistance: 8000
}
}
}
result:
Now we get 7 records, more than before as expected.
{
geometry: {
$geoIntersects: {
$geometry: {
type: "MultiPoint",
coordinates: [
[-1.89601, 53.75541],
[-1.85983, 53.7007],
[-1.90734, 53.7493],
[-1.77728, 53.7241]
]
}
}
}
}
result:
Here are the steps I took to deploy MongoDB and ElasticSearch on Kubernetes using Kind:
I am using Ubuntu 20.04 on WSL2.
curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.17.0/kind-linux-amd64
chmod +x ./kind
sudo mv ./kind /usr/local/bin/kind
I also installed kubectl
to be able to interact with kubernetes clusters from command line.
kind create cluster --config kubernetes_files/kindconfig.yaml
This creates a cluster with pre-built kindest-node image.
I'm using the following config when creating the cluster.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
The idea is to have 1 control-plane node and 1 worker node, the worker node is going to host a pod which MongoDB and ElasticSearch will be deployed on. I plan to deploy both of them on the same pod so they have shared storage and networking.
Here is the list of currently available nodes.
I use the following YAML file to create a pod named db-pod
and deploy MongoDB and ElasticSearch on it.
apiVersion: v1
kind: Pod
metadata:
name: db-pod
spec:
containers:
- name: mongo
image: mongo:latest
ports:
- containerPort: 27017
- name: elasticsearch
image: docker.elastic.co/elasticsearch/elasticsearch:8.6.2
ports:
- containerPort: 9200
- containerPort: 9300
Then I ran
kubectl apply -f kubernetes_files/deploy.yaml
Here is explanation about the file:
a. apiVersion
shows the version of Kubernetes APi used.
b. kind
shows the type of Kubernetes Object that this file is working on
c. metadata.name
shows the name of the pod
d. spec
describes the desired state of the pod
e. containers
lists the containers to be deployed on the pod.
f. The first container is MongoDB container. For the image I use mongo:latest. I expose port 27017
, the default port for MongoDB
g. The second container is ElasticSearch container. I exposed ports 9200
as the REST API port and 9300
as transport port.
It seems that for elasticsearch image, the tag must be specified and we cannot use latest as the tag. So I tried to set the image to elasticsearch:8.6.2
, but the pod creation failed. After exporting Kind
log I saw that I got the following error:
Mar 19 07:44:59 kind-worker containerd[236]: time="2023-03-19T07:44:59.054587600Z" level=error msg="PullImage \"elasticsearch:8.2.6\" failed" error="rpc error: code = NotFound desc = failed to pull and unpack image \"docker.io/library/elasticsearch:8.2.6\": failed to resolve reference \"docker.io/library/elasticsearch:8.2.6\": docker.io/library/elasticsearch:8.2.6: not found"
so I used docker.elastic.co/elasticsearch/elasticsearch:8.6.2
instead as showed in here.
Now I got another error: the elasticsearch container failed to start because of CrashLoopBackOff
error.
After researching about it I found that the cause could be an outdated docker version. The docker I had was Docker version 20.10.21, build 20.10.21-0ubuntu1~20.04.1
.
So I ran
sudo apt-get upgrade docker
kubectl delete pod db-pod
kubectl apply -f kubernetes_files/deploy.yaml
And now all containers in the pod are ready, but not for long. The ElasticSearch container keeps crashing. So I tried to run the container using Docker without Kubernetes. The container crashed too when being run using Docker, and in the logs I found that the problem was that vm.max_map_count
was too low.
So I updated /etc/sysctl.conf
and set vm.max_map_count=262144
as recommended in the logs. Now when I run the container using Docker and everything went well.
I redeployed the pod after updating vm.max_map_count
and now both containers are running.
The pod have been running for 14 minutes and have no problem.
To check the readiness of the containers, first I forward the ports on my local machine to the ports of the containers.
For MongoDB:
kubectl port-forward db-pod 27018:27017
For ElasticSearch:
kubectl port-forward db-pod 9201:9200
Here are the check results