Database Triggers #1082
```
CREATE TRIGGER { NAME } [ON { RESOURCE }] [WITH { OPTION }]
  FOR | AFTER | INSTEAD OF
  INSERT | UPDATE | DELETE
```

Where RESOURCE would be a single document or, if omitted, every document (maybe with support for glob syntax). Example:
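For illustration, a hypothetical instance of this grammar might look as follows (the trigger name and resource are made up, not taken from the proposal):

```
CREATE TRIGGER log-deletes ON doc.xml AFTER DELETE
```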
|
👍 for database triggers. Regarding the trigger code, I would be interested in how you plan to store the XQuery trigger functions. Also, I would prefer to use function items instead of single xq-files. This would allow - at least in my opinion - more flexibility and shorter code, and with function items being first-class citizens, it feels like a natural choice to me. So the example would look like:
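A sketch of what the function-item style could look like, assuming a hypothetical `trigger:install` function as used later in this thread (the log message is illustrative):

```xquery
(: hypothetical API: install a function item as a delete trigger on database 'db' :)
let $code := function($db, $resources, $operation) {
  admin:write-log('Deleted in ' || $db || ': ' || string-join($resources, ', '))
}
return trigger:install('db', $code, 'delete')
```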
Using database commands we could allow XQuery expressions, e.g. both would be valid:
This would allow storing all or some trigger functions in one module; having to write 100 xq files for 100 triggers seems cumbersome to me. It would entail that we also have to create a name; quite likely, a structure as in SQL would be sufficient. As we would not implement INSTEAD OF or FOR (at least in the beginning, and AFTER could simply be the default), I think the following database command structure should work:
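One possible shape for such a command, modeled on the SQL-style grammar quoted at the top of the issue (all names are illustrative, and AFTER is taken as the implicit default):

```
CREATE TRIGGER name [ON resource]
  INSERT | UPDATE | DELETE
  AS { xquery-expression }
```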
Regarding errors in the trigger I would opt for simply logging them. Everything else seems quite complicated to me and I don't think trigger functions should be written in a way that they can fail (instead, error handling should be done within the trigger functions). Regarding the current event architecture I am all for removing this feature. Having a trigger should enable one to write a connection to some message broker like ZeroMQ, which could notify a client application. |
Thanks for the inspiring feedback… I'm trying to summarize our first ideas:

**Smallest Entities** We'll probably settle on having database resources (XML, binary) as the smallest observable entity. Once we choose this path, it may also have effects on other features like backups, replication, distribution, etc. On the other hand, we'll need to gain more experience with large collections and get to know new bottlenecks.

**SQL Syntax** The alignment with existing SQL statements makes sense to me. Fortunately, there are many similarities with our first ideas.

**Storing Triggers** I like the idea of simply storing trigger query files in the database (i.e., as binary resources):
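For instance, with the existing `db:store` and `file:read-binary` functions (database name and paths are illustrative), a trigger query could be stored as a binary resource like this:

```xquery
(: store a trigger query file as a binary resource inside the database :)
db:store('db', 'triggers/on-delete.xq',
  file:read-binary('/local/path/to/on-delete.xq'))
```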
**Trigger Expressions** Using function items for triggers was my first idea as well, but one problem (among others) is that function items can reference other expressions from the original query. See the following example:

```xquery
declare variable $global := db:open('haha');

declare function local:evil() {
  "I'm belonging to the original query"
};

let $code := function($db, $resources, $operation) {
  local:evil(), $global
}
return trigger:install('db', $code, 'delete')
```

This would mean that the original query cannot be finalized as long as there may still be another function to be executed (we have similar problems elsewhere). But I agree it's ugly to work with things like external variable declarations. A way out could be to represent operations via function names or annotations:

```xquery
(: 1st approach :)
declare function trigger:delete($db, $res) { ... };

(: 2nd approach :)
declare %trigger:delete function f($db, $res) { ... };
```

**Error Handling** Good point again: as we are using XQuery, it's indeed easy to use try/catch. If things still fail, we can restrict ourselves to logging them.

**Trigger Execution** Do you have a preference towards synchronous or asynchronous execution?
|
I see the problem with using function items now, thanks for the explanation. I prefer the second approach using annotations, simply because I think it is more flexible and also a nicer syntax. If I understood you correctly, you want to move all functionality into the functions themselves, so that they are self-contained. So wouldn't it make sense to move all functionality there? E.g. our file could contain:

```xquery
declare
  %trigger:delete
  %trigger:on("resource.xml")
function f($db, $res) { ... };
```

This way, we could keep things simple. Regarding the trigger execution, I would definitely go for synchronous execution. Three reasons:
|
Right, that's what I would propose. Your example makes sense.
Exactly. It would primarily be used to validate the trigger queries. Apart from that, it would do nothing more than copy the files to the database directory.
In that case, it could make sense not to attach trigger queries to specific databases, but instead provide a single, global
|
+1 for the current outcome of the discussion. While re-reading some SQL trigger specs, I think we need to clarify what the "ON { RESOURCE }" part really means. In SQL terms, a trigger reacts to row-level operations (insert, delete), while other kinds of triggers exist for table-level operations (drop, alter). So "ON { RESOURCE }" would currently mean: fire the trigger if something happens (e.g. nodes are inserted) to these resources. Question: would the trigger "ON foo*.xml" also fire if a document named foo123.xml was added to the database? This could be solved by just adding another kind, ADD (for resources): [ADD | INSERT | UPDATE | DELETE]. Still, DELETE remains ambiguous (delete some node, or delete the resource?). |
First of all, if we decide not to restrict triggers to databases, we could add two database-specific operations:
Next, I propose that all update operations should be summarized on resource level:
Otherwise, multiple updates on a document would usually lead to numerous trigger calls:

```xquery
db:open('db', 'doc.xml') update (
  delete node .//text(),                           (: → one DELETE call per text node? :)
  //a ! (replace value of node . with 'new text')  (: → one UPDATE call per element? :)
)
(: → I think a single UPDATE call should suffice :)
```

Here are some examples of how the resulting annotations could look (I decided to merge the discussed annotations into one):

```xquery
(: Called when any new database is created :)
declare %trigger:create function create-any($db) { ... };

(: Called when a db is dropped whose name starts with "db" :)
declare %trigger:drop("db*") function drop($db) { ... };

(: Called when an XML resource is updated in the database "books" :)
declare %trigger:update("books", "*.xml") function update($db, $res) { ... };

(: Called when an image is deleted from any database :)
declare %trigger:delete("*", "*.jpg,*.gif,*.bmp") function delete($db, $res) { ... };
```

Those would be the remaining trigger commands and functions:

Commands:

Functions:

```xquery
trigger:install('/local/path/to/trigger.xq')
trigger:delete('trigger.xq')
trigger:list()
trigger:list-details()
```
|
One more thought: we may need to sort out whether the fired events relate to the original query or to the optimized update operations. Some examples:

```xquery
(: 1 :) (db:delete('db', 'doc.xml'), db:add('db', 'doc.xml'))
(: 2 :) db:replace('db', 'doc.xml', <new-doc/>)
```

This leads to questions like: do we want to fire events if the database contents have not really changed? |
So this is triggerXQ: annotation-based, but unlike RESTXQ and like the REPO, …

On 20 February 2015 at 21:32, Christian Gruen wrote:
|
Nice name ;) I think we can automatically scan the
Good question. I just looked it up in the PostgreSQL documentation, which states: "SQL specifies that multiple triggers should be fired in time-of-creation order. PostgreSQL uses name order, which was judged to be more convenient." We could order all triggers by:
|
I would advise against using the line position and instead using the function name. I would find it highly surprising if the order of functions in a file (or blank lines, or indentation) in XQuery were relevant, i.e. if reordering my code would change my program. Using the function names as a default order is less surprising to me. Additionally, one could think of adding an optional |
Nice to hear about triggers.

```xquery
trigger:event($name as xs:string, $data as item()*)
```
|
One of the reasons why I didn't particularly like the old event architecture is that such a loosely coupled integration is, in my opinion, not a job for a database system or query processor. What you can already do is use some 3rd-party tool like a message queue (ZeroMQ, ...), which could trigger such events and notify interested parties. |
Ideally, clients should be informed of changes. Personally, I kind of like the WebSocket approach, where the client would register with a BaseX WebSocket server and listen for database events. But anything that is handy and easy to set up in the end is welcome. |
I think it's fair to say that a database such as BaseX is both a database engine and a platform for programming general applications (due to the nature of XQuery and web facilities such as RESTXQ). DBAs probably tend toward the database-engine side and would find most value in the lower-level database triggers as tools to strictly guard the data. Web application developers would probably see most value in a more loosely coupled, flexible event system. The question in my mind is whether both concerns should be reconciled into one set of eventing/trigger facilities, or whether they are so different that they require different facilities. When it comes to database triggers (the limited scope), I would want
Most of these points were already raised above, but there were also points made in the direction of a more general event system. Apart from the similarity with SQL triggers, I also see these triggers as similar to Git hooks. General advice is to keep them very short and simple (because they run synchronously and inside a transaction). In an XML database context, I could imagine they could be used to handle some XML validation. Maybe there should also be a way of notifying others of these events (but only after the operation, and asynchronously). But I think this would already lead to the second and wider scope of a general event system, which has the added complication of running code asynchronously. |
Ah, one thing about "smallest entities". I believe the current consensus is on not addressing individual nodes within a database resource, and that's probably sufficient. However, there's one level besides database resources and individual XML nodes that hasn't been addressed yet, I think. Suppose I have order XML files stored in a database under the path "acme/orders/". Although the path is "just" a string, it does express virtual collections: with db:open or fn:collection() I can get all order document nodes by specifying the path to their directory or virtual collection. Would I be able to put a database trigger only on such a directory, so that each new order or update of an existing order causes an event? I ask this because currently the path seems to be just a string, and the fact that db:open returns all document nodes under a specific path is just a simple convenience that otherwise has no explicit representation in the database. |
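If the annotations discussed earlier supported glob paths, a directory-scoped trigger could be sketched as below. The `%trigger:update` annotation is hypothetical, and the database and path names are illustrative; by contrast, `db:open('db', 'acme/orders/')` returning all documents under that path prefix is existing behavior:

```xquery
(: hypothetical: fire only for XML resources below acme/orders/ :)
declare %trigger:update("db", "acme/orders/*.xml")
  function local:order-changed($db, $res) { ... };
```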
Yes, in the case of web applications, there need to be ways not only to trigger events but also to react to database events on the client side. That's why I think a WebSocket implementation for triggers inside BaseX would be a nice solution.
To my understanding of the draft implementation, you could filter the events through a module and therefore trigger actions on specific document updates, like on a specific path or directory. It would need to support the glob syntax, I guess. |
Discarded (the discussion stopped a long time ago). |
Allow definition of database triggers to react on changes in a database (creations, updates, deletions).
Open questions:
Database Commands:
XQuery:
Trigger Query (trigger.xq):
Feedback is welcome!