- pymongo
dbc.transform2(read_database_name=None, read_collection_name=None,
write_database_name="transformed_beta", write_collection_name=None)
By default, this transforms the current collection into a time-indexed collection of the same name in the "transformed_beta" database. The timestamp-indexed documents have the schema:
{
{
"_id":
"timestamp": t1,
"eb": {
"traj1_id": [centerx, centery, l, w, dir, v],
"traj2_id": [...],
...
},
"wb": {
"traj3_id": [centerx, centery, l, w, dir, v],
"traj4_id": [...],
...
},
},
{
"_id":
"timestamp": t2,
...,
...
},
}
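The regrouping described above can be sketched in plain Python, without a database. This is only an illustration of the idea, not the package's implementation: `to_time_indexed` is a hypothetical helper, the per-point `velocity` field and the scalar `length` and `width` are assumptions made for the example, and x/y positions are copied as-is rather than converted to a center point.

```python
from collections import defaultdict

def to_time_indexed(trajectories):
    """Regroup per-trajectory documents into one document per timestamp."""
    by_time = defaultdict(lambda: {"eb": {}, "wb": {}})
    for traj in trajectories:
        # direction: 1 if eastbound, -1 if westbound
        side = "eb" if traj["direction"] == 1 else "wb"
        points = zip(traj["timestamp"], traj["x_position"],
                     traj["y_position"], traj["velocity"])
        for t, x, y, v in points:
            # one entry per trajectory id: [x, y, l, w, dir, v]
            by_time[t][side][traj["_id"]] = [
                x, y, traj["length"], traj["width"], traj["direction"], v
            ]
    # emit documents in increasing timestamp order
    return [{"timestamp": t, **by_time[t]} for t in sorted(by_time)]

# Made-up sample trajectories (all values are illustrative only)
trajs = [
    {"_id": "traj1", "direction": 1, "length": 15.0, "width": 6.0,
     "timestamp": [1.0, 2.0], "x_position": [10.0, 40.0],
     "y_position": [5.0, 5.0], "velocity": [30.0, 30.0]},
    {"_id": "traj3", "direction": -1, "length": 18.0, "width": 6.5,
     "timestamp": [2.0], "x_position": [900.0],
     "y_position": [7.0], "velocity": [-28.0]},
]
docs = to_time_indexed(trajs)
for doc in docs:
    print(doc)
```

Each output document holds every vehicle present at that timestamp, split by direction of travel, which is what makes per-frame queries cheap on the transformed collection.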
With the desired Python venv or conda env activated, run the following command in a shell:
pip install git+https://github.com/yanb514/i24_database_api@<tag>
where <tag> is either a branch name (e.g. main), a tag name (e.g. v0.3), or the latest version (latest).
default_param is a dictionary read from a config file (for a template, see test_param_template.config).
default_param = {
"host": "<mongodb-host>",
"port": 27017,
"username": "<mongodb-username>",
"password": "<mongodb-password>"
}
dbc = DBClient(**default_param)
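One way to build such a dictionary from a config file is sketched below. The INI layout and the use of the `[DEFAULT]` section are assumptions made for this example; the actual layout of test_param_template.config may differ, and `load_default_param` is a hypothetical helper, not part of the package.

```python
import configparser

# Stand-in for the contents of a config file. The INI layout shown here
# is an assumption, not the package's documented format.
SAMPLE_CONFIG = """
[DEFAULT]
host = <mongodb-host>
port = 27017
username = <mongodb-username>
password = <mongodb-password>
"""

def load_default_param(text):
    """Parse connection settings into a dict suitable for DBClient(**params)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["DEFAULT"]
    return {
        "host": section["host"],
        "port": section.getint("port"),  # port must be an int, not a string
        "username": section["username"],
        "password": section["password"],
    }

default_param = load_default_param(SAMPLE_CONFIG)
print(default_param)
```

Keeping credentials in a file outside version control, and converting the port to an integer when loading, avoids the two most common connection mistakes.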
Pass optional database_name and collection_name to connect to a specific database and/or collection:
dbc = DBClient(**default_param, database_name = <database_name>, collection_name = <collection_name>)
Either way, dbc.client is essentially a wrapper around a pymongo.MongoClient object and inherits all of its properties and functions.
dbc.list_collection_names(), or equivalently
dbc.db.list_collection_names()
newdb = dbc.client[<new_database_name>]
newdb.list_collection_names()
dbc = DBClient(**default_param, database_name = <database_name>, latest_collection=True) # dbc.collection is now the latest collection
print(dbc.collection_name)
dbc.collection.drop(), or
dbc.db[<some_collection_name>].drop(), or access another db
dbc.client[<some_database>][<some_collection_name>].drop()
dbc.reset_collection()
Resetting empties the current collection but keeps the reference dbc.collection pointing to the emptied collection.
dbc.delete_collection([list_of_cols_to_be_deleted])
dbc.mark_safe([safe_collection_list])
Authors: Zi Nean Teoh and Lisa Liu. For details, see https://github.com/yanb514/i24_database_api/blob/main/src/i24_database_api/README.md
dbc.transform(read_database_name=None, read_collection_name=None,
write_database_name="transformed", write_collection_name=None)
By default, this transforms the current collection into a time-indexed collection of the same name in the "transformed" database.
- continuous range query
- async insert
- schema enforcement (pass schema rule as .json file)
dbc.find_one(index_name, value)
This API follows the pymongo implementation; it is a more abstracted version of pymongo's collection.find().
query_filter = {"_id": {"$in": fragment_ids}}
query_sort = [("last_timestamp", "ASC")]
dbc.read_query(query_filter, query_sort)
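To make the meaning of the filter and sort arguments concrete, the sketch below applies the same two arguments to an in-memory list of dictionaries. `emulate_read_query` is a hypothetical helper written for this illustration; it supports only the `$in` operator, and treating any order other than "ASC" as descending is an assumption.

```python
def emulate_read_query(docs, query_filter, query_sort):
    """Apply a {"field": {"$in": [...]}} filter and a [(field, order)] sort in memory."""
    result = list(docs)
    for field, condition in query_filter.items():
        allowed = set(condition["$in"])
        # keep only documents whose field value is one of the allowed values
        result = [d for d in result if d.get(field) in allowed]
    # sort by the last key first; Python's sort is stable, so the first key wins
    for field, order in reversed(query_sort):
        result.sort(key=lambda d: d[field], reverse=(order != "ASC"))
    return result

# Made-up documents for illustration
docs = [
    {"_id": "a", "last_timestamp": 9.0},
    {"_id": "b", "last_timestamp": 3.0},
    {"_id": "c", "last_timestamp": 5.0},
]
fragment_ids = ["a", "b"]
query_filter = {"_id": {"$in": fragment_ids}}
query_sort = [("last_timestamp", "ASC")]

matched = emulate_read_query(docs, query_filter, query_sort)
print([d["_id"] for d in matched])
```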
The following code demonstrates the use of the iterative query based on a query parameter.
rri = dbc.read_query_range(range_parameter='last_timestamp', range_greater_equal=300, range_less_than=330, range_increment=None)
while True:
    try:
        print(next(rri)["ID"])  # access documents in rri one by one
    except StopIteration:
        print("END OF ITERATION")
        break
print("Using for-loop to read range")
for result in dbc.read_query_range(range_parameter='last_timestamp', range_greater_equal=300, range_less_than=330, range_increment=None):
    print(result["ID"])
print("END OF ITERATION")
produces
last timestamp: 304.17, starting_x: 32806.20, ID: 3600083.0
last timestamp: 306.00, starting_x: 32771.59, ID: 3600084.0
last timestamp: 310.90, starting_x: 32533.66, ID: 3600086.0
last timestamp: 312.73, starting_x: 32805.35, ID: 400088.0
last timestamp: 313.23, starting_x: 31897.72, ID: 3600087.0
last timestamp: 316.53, starting_x: 31594.89, ID: 3600088.0
last timestamp: 324.50, starting_x: 31166.60, ID: 3600089.0
last timestamp: 325.07, starting_x: 32076.31, ID: 400089.0
last timestamp: 328.93, starting_x: 30132.66, ID: 3600090.0
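The half-open range semantics (greater than or equal to the lower bound, strictly less than the upper bound) can be illustrated with a small generator over an in-memory list. `emulate_read_query_range` is a hypothetical stand-in, not the package's function; yielding results in ascending order of the range parameter is assumed from the output above, and `range_increment` is left out because its behavior is not described here.

```python
def emulate_read_query_range(docs, range_parameter, range_greater_equal, range_less_than):
    """Yield documents with range_greater_equal <= doc[range_parameter] < range_less_than."""
    in_range = [
        d for d in docs
        if range_greater_equal <= d[range_parameter] < range_less_than
    ]
    # yield one document at a time, ordered by the range parameter
    for doc in sorted(in_range, key=lambda d: d[range_parameter]):
        yield doc

docs = [
    {"ID": 3600090.0, "last_timestamp": 328.93},
    {"ID": 3600083.0, "last_timestamp": 304.17},
    {"ID": 9999999.0, "last_timestamp": 330.00},  # made-up document on the excluded upper bound
    {"ID": 400088.0, "last_timestamp": 312.73},
]

ids = [d["ID"] for d in emulate_read_query_range(docs, "last_timestamp", 300, 330)]
print(ids)
```

Because the result is a generator, it works with both `next()` and a for-loop, as in the two usage patterns above.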
A collection with the specified collection_name is automatically created upon instantiating the DBWriter object. If a schema file (in JSON) is given, the writer object adds a validation rule to the collection based on that file.
Otherwise, it gives a warning "no schema provided" and proceeds without a validation rule.
A collection can also be created after the DBClient object is instantiated by calling
dbc.db.create_collection(collection_name = collection_name, schema = schema_file) # schema is optional
For bulk writes to the database, this package offers the option of non-blocking (concurrent) inserts:
col = dbc.collection
# insert a document of python dictionary format -> pass it as kwargs
doc1 = {
"timestamp": [1.1,2.0,3.0],
"first_timestamp": 1.0,
"last_timestamp": 3.0,
"x_position": [1.2]}
dbc.write_one_trajectory(**doc1)
# insert a document using keyword args directly (if collection_name is None, use the current collection dbc.collection)
dbc.write_one_trajectory(collection_name = "test_collection" , timestamp = [1.1,2.0,3.0],
first_timestamp = 1.0,
last_timestamp = 3.0,
x_position = [1.2])
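The general idea of a non-blocking insert can be sketched with a thread pool: each insert is submitted to a worker and the caller continues immediately. This is a generic illustration using only the standard library, not the mechanism the package uses internally; `InMemoryCollection` is a hypothetical stand-in for a database collection.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class InMemoryCollection:
    """Hypothetical stand-in for a database collection; stores documents in a list."""

    def __init__(self):
        self._docs = []
        self._lock = threading.Lock()

    def insert_one(self, doc):
        with self._lock:  # protect the shared list from concurrent appends
            self._docs.append(doc)

    def count(self):
        with self._lock:
            return len(self._docs)

col = InMemoryCollection()
docs = [{"first_timestamp": float(i), "last_timestamp": float(i) + 2.0} for i in range(20)]

with ThreadPoolExecutor(max_workers=4) as pool:
    # submit() returns immediately, so the caller is not blocked by each insert
    futures = [pool.submit(col.insert_one, doc) for doc in docs]
    for future in futures:
        future.result()  # re-raises any exception that occurred in a worker

inserted = col.count()
print(inserted)
```

Collecting the futures and calling `result()` at the end is what surfaces failed inserts; without it, errors in worker threads would pass silently.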
As of v0.2, if a document violates the schema, it bypasses the validation check and throws a warning in the console.
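The warn-and-proceed behavior can be sketched as follows. This is a standalone illustration, not the package's code: `missing_required` and `write_with_warning` are hypothetical helpers, a plain list stands in for the collection, and only the `required` part of a schema is checked.

```python
import warnings

# Subset of a schema: only the "required" list is used in this sketch
schema = {"$jsonSchema": {"required": ["timestamp", "last_timestamp", "x_position"]}}

def missing_required(doc, schema):
    """Return the required field names that are absent from doc."""
    required = schema["$jsonSchema"].get("required", [])
    return [name for name in required if name not in doc]

def write_with_warning(store, doc, schema):
    """Warn if doc violates the schema, then insert it anyway."""
    missing = missing_required(doc, schema)
    if missing:
        warnings.warn(f"document violates schema, missing fields: {missing}")
    store.append(doc)  # the document is stored regardless of the check
    return missing

store = []
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    write_with_warning(store, {"timestamp": [1.1, 2.0, 3.0], "last_timestamp": 3.0}, schema)

print(len(store), [str(w.message) for w in caught])
```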
"Reconciled trajectories" collection
{
"$jsonSchema": {
"bsonType": "object",
"required": ["timestamp", "last_timestamp", "x_position"],
"properties": {
"configuration_id": {
"bsonType": "int",
"description": "A unique ID that identifies what configuration was run. It links to a metadata document that defines all the settings that were used system-wide to generate this trajectory fragment"
},
"coarse_vehicle_class": {
"bsonType": "int",
"description": "Vehicle class number"
},
"timestamp": {
"bsonType": "array",
"items": {
"bsonType": "double"
},
"description": "Corrected timestamp. This timestamp may be corrected to reduce timestamp errors."
},
"road_segment_ids": {
"bsonType": "array",
"items": {
"bsonType": "int"
},
"description": "Unique road segment ID. This differentiates the mainline from entrance ramps and exit ramps, which get distinct road segment IDs."
},
"x_position": {
"bsonType": "array",
"items": {
"bsonType": "double"
},
"description": "Array of back-center x position along the road segment in feet. The position x=0 occurs at the start of the road segment."
},
"y_position": {
"bsonType": "array",
"items": {
"bsonType": "double"
},
"description": "array of back-center y position across the road segment in feet. y=0 is located at the left yellow line, i.e., the left-most edge of the left-most lane of travel in each direction."
},
"length": {
"bsonType": "double",
"description": "vehicle length in feet."
},
"width": {
"bsonType": "array",
"items": {
"bsonType": "double"
},
"description": "vehicle width in feet"
},
"height": {
"bsonType": "array",
"items": {
"bsonType": "double"
},
"description": "vehicle height in feet"
},
"direction": {
"bsonType": "int",
"description": "-1 if westbound, 1 if eastbound"
}
}
}
}
https://github.com/yanb514/i24_database_api/blob/main/test/config/reconciled_schema.json
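Before writing, it can help to check a document's Python types against the `bsonType` entries of such a schema. The sketch below does this for a small subset of the schema above; `type_errors` and the `BSON_TO_PY` mapping are hypothetical helpers for illustration, and the check is deliberately stricter and simpler than MongoDB's own validation. Note that pymongo encodes a Python `int` as a BSON integer and a Python `float` as a BSON double, so `1` in a "double" array is a type mismatch.

```python
import json

# Subset of the schema above, parsed the same way a .json schema file would be
SCHEMA_TEXT = """
{
  "$jsonSchema": {
    "bsonType": "object",
    "properties": {
      "timestamp": {"bsonType": "array", "items": {"bsonType": "double"}},
      "length": {"bsonType": "double"},
      "direction": {"bsonType": "int"}
    }
  }
}
"""
schema = json.loads(SCHEMA_TEXT)

# Assumed mapping from bsonType names to Python types (illustrative subset)
BSON_TO_PY = {"double": float, "int": int, "array": list, "object": dict}

def type_errors(doc, schema):
    """Return the names of fields whose Python type does not match the schema."""
    errors = []
    properties = schema["$jsonSchema"]["properties"]
    for field, value in doc.items():
        rule = properties.get(field)
        if rule is None:
            continue  # fields without a rule are not checked
        if not isinstance(value, BSON_TO_PY[rule["bsonType"]]):
            errors.append(field)
        elif rule["bsonType"] == "array" and "items" in rule:
            item_type = BSON_TO_PY[rule["items"]["bsonType"]]
            if not all(isinstance(item, item_type) for item in value):
                errors.append(field)
    return errors

good = {"timestamp": [1.1, 2.0, 3.0], "length": 15.0, "direction": -1}
bad = {"timestamp": [1, 2.0], "length": 15.0, "direction": 1.0}
print(type_errors(good, schema), type_errors(bad, schema))
```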
Additional future enhancements include:
- Use logger in db_writer
- Allow more customization in DBWriter, such as max time out etc.
- Add built-in user privilege checking (this step requires authentication). After temporarily disabling authentication in mongod.conf, one can run
dbr.client.admin.command({"usersInfo": "readonly" })['users'][0]['roles']
to get the roles of the specified user, then check whether that user has only read-only privileges. The same applies to DBWriter.
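A check of this kind could look like the sketch below, which inspects a reply shaped like the output of MongoDB's `usersInfo` command. The reply is hard-coded sample data so the example runs without a server; `has_only_read_roles` is a hypothetical helper, and treating only `read` and `readAnyDatabase` as read-only roles is an assumption for this illustration.

```python
# Built-in roles treated as read-only in this sketch (an assumption)
READ_ONLY_ROLES = {"read", "readAnyDatabase"}

def has_only_read_roles(users_info_reply, username):
    """Return True if the named user exists and holds only read-only roles."""
    for user in users_info_reply.get("users", []):
        if user.get("user") == username:
            roles = {entry["role"] for entry in user.get("roles", [])}
            # a user with no roles at all is not considered a valid reader
            return bool(roles) and roles <= READ_ONLY_ROLES
    return False  # user not found

# Sample data shaped like a usersInfo reply (values are made up)
reply = {
    "users": [
        {"user": "readonly", "db": "admin",
         "roles": [{"role": "readAnyDatabase", "db": "admin"}]},
        {"user": "writer", "db": "admin",
         "roles": [{"role": "readWriteAnyDatabase", "db": "admin"}]},
    ],
    "ok": 1.0,
}

print(has_only_read_roles(reply, "readonly"), has_only_read_roles(reply, "writer"))
```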
More details on user roles:
- https://stackoverflow.com/questions/23943651/mongodb-admin-user-not-authorized
- https://www.codexpedia.com/devops/mongodb-authentication-setting/
- https://www.mongodb.com/docs/manual/tutorial/manage-users-and-roles/