Replies: 2 comments
-
Discussion Update
This update is based on the discussion Vassil and I had a couple of days ago. The attribute approach to specifying pushforward/pullback functions has various drawbacks that cannot be resolved trivially. The drawbacks are as follows:
In comparison, the custom derivatives approach that we currently employ does not suffer from most of these drawbacks. Thus, for now, we have decided to provisionally go with the custom derivatives approach. An example of the custom derivatives approach in action:

template <typename T>
T cube(T x) {
  return x * x * x;
}

namespace clad {
namespace custom_derivatives {
// Pushforward: maps the input tangent d_x to the output tangent.
template <typename T>
T cube_pushforward(T x, T d_x) {
  return 3 * x * x * d_x;
}

// Pullback: accumulates the output adjoint d_y into the input adjoint d_x.
template <typename T>
void cube_pullback(T x, clad::array_ref<T> d_y, clad::array_ref<T> d_x) {
  T t0 = *d_y;
  *d_x += *d_y * 3 * x * x;
  *d_y -= t0;
}
} // namespace custom_derivatives
} // namespace clad

One place where we cannot directly use the custom derivatives approach is in specifying the custom derivative of a member function. There are several strategies that we can employ to solve this problem, such as using a fixed naming convention for pushforward/pullback functions (without the enclosing custom_derivatives namespace) or using inheritance to extend the original types with pushforwards/pullbacks. We need to decide on the best way forward.
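As a rough illustration of the fixed-naming-convention strategy mentioned above, the sketch below places a pushforward next to the member function it differentiates. Everything here is hypothetical: the `Polynomial` type, the `<name>_pushforward` naming rule, and the simplification that the object itself is treated as a constant (no tangent for `this`) are assumptions made for the example, not something Clad is known to implement.

```cpp
// Hypothetical user type with a member function we want a custom rule for.
struct Polynomial {
  double a;
  double b;

  // Primal member function: a*x^2 + b*x.
  double eval(double x) const { return a * x * x + b * x; }

  // Assumed naming convention: "<member>_pushforward" declared in the same
  // class. The object is treated as a constant, so only the tangent of x
  // is propagated: d(eval) = (2*a*x + b) * d_x.
  double eval_pushforward(double x, double d_x) const {
    return (2 * a * x + b) * d_x;
  }
};
```

With such a convention, a tool could look up `eval_pushforward` by name in the class of the call's object whenever it needs the derivative of `eval`. The inheritance-based alternative would instead put `eval_pushforward` in a derived type.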
-
Cool, I think pushforward/pullback is the way to go in general (independently of how custom derivatives are implemented).
-
This discussion introduces a rule-based design to differentiate a function using Automatic Differentiation (AD) in clad. It will supersede the current custom derivatives approach. This write-up aims to introduce the concepts used and the associated API design, and also to provide a common place for discussion and for gathering ideas to further improve this design. This rule-based design is inspired by the ChainRules.jl library and the Swift language.
The rule-based design will allow users to specify custom differentiation rules for any function and overloaded operator. The rule-based approach will have at least the following benefits:
It will facilitate differentiation of functions whose definitions are not available.
It will allow users to specify a more numerically stable or more efficient derivative for a function than the one that the ordinary automatic differentiation transformation would produce.
The ChainRules.jl documentation does a very good job of explaining the pushforward and pullback concepts. Please refer to the ChainRules.jl documentation here and here for an introduction to pushforward/pullback concepts.
Proposed Solution
To add support for allowing user-defined differentiation rules, we introduce 2 new C++ attributes that will allow registering a function as pushforward or pullback of some other function.
clad::pushforwardOf(FnName) attribute
The clad::pushforwardOf(FnName) attribute registers a function as the pushforward of the function specified by FnName. For example:
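A minimal sketch of what such a registration might look like is given here. The attribute is only proposed in this discussion, so a no-op macro named `CLAD_PUSHFORWARD_OF` (a hypothetical stand-in, not part of Clad) marks the place where `[[clad::pushforwardOf(cube)]]` would presumably go. The function names `cube` and `cube_pushforward` are chosen for illustration.

```cpp
// Hypothetical stand-in for the proposed attribute so that the sketch
// compiles with any standard compiler. Under the proposal this position
// would presumably hold [[clad::pushforwardOf(cube)]].
#define CLAD_PUSHFORWARD_OF(fn)

double cube(double x) { return x * x * x; }

// Pushforward of cube: maps the input tangent d_x to the output tangent,
// using d(x^3) = 3 * x^2 * d_x.
CLAD_PUSHFORWARD_OF(cube)
double cube_pushforward(double x, double d_x) { return 3 * x * x * d_x; }
```

The pushforward takes the original argument followed by its tangent, which is the same shape as the custom-derivatives `cube_pushforward` shown elsewhere in this thread.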
Pushforward functions will be utilised by Clad wherever there is a need to obtain a derivative of the corresponding function.
A concrete example of pushforward function usage by Clad:
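The following sketch approximates this usage by hand. The function `fn`, its body, and the temporaries `_t0`, `_d_t0`, `_d_i`, `_d_j` are assumptions made for illustration; the code Clad actually synthesizes may differ in naming and structure. Only the name `fn_darg0` for the derivative with respect to the first argument follows the convention mentioned later in this post.

```cpp
double cube(double x) { return x * x * x; }

// User-provided pushforward of cube (see the rule d(x^3) = 3 * x^2 * d_x).
double cube_pushforward(double x, double d_x) { return 3 * x * x * d_x; }

// Primal function whose body calls cube.
double fn(double i, double j) { return cube(i) * j; }

// Hand-written approximation of a forward mode derivative of fn with
// respect to i. Instead of differentiating the body of cube, the call is
// replaced by a call to its registered pushforward.
double fn_darg0(double i, double j) {
  double _d_i = 1; // seed: we differentiate with respect to i
  double _d_j = 0;
  double _t0 = cube(i);                     // primal value of the call
  double _d_t0 = cube_pushforward(i, _d_i); // tangent of the call
  return _d_t0 * j + _t0 * _d_j;            // product rule for _t0 * j
}
```

For example, `fn_darg0(2, 5)` evaluates the derivative of `i*i*i*j` with respect to `i`, which is `3*i*i*j`, at `i = 2`, `j = 5`.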
Clad will utilise the pushforward function in the synthesized forward mode derived function of fn as follows:
clad::pullbackOf(FnName) attribute
The clad::pullbackOf(FnName) attribute registers a function as the pullback of the function specified by FnName. For example:
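A minimal sketch of a pullback registration, under the same caveats as any example for a proposed feature: `CLAD_PULLBACK_OF` is a hypothetical no-op macro standing in for `[[clad::pullbackOf(cube)]]`, and the signature (a raw pointer for the input adjoint instead of `clad::array_ref`) is an assumption chosen to keep the example self-contained.

```cpp
// Hypothetical stand-in for the proposed attribute; expands to nothing so
// that the sketch compiles without Clad.
#define CLAD_PULLBACK_OF(fn)

double cube(double x) { return x * x * x; }

// Pullback of cube: given the adjoint d_y of the result, accumulate the
// corresponding adjoint of the input into *d_x.
CLAD_PULLBACK_OF(cube)
void cube_pullback(double x, double d_y, double* d_x) {
  *d_x += d_y * 3 * x * x;
}
```

Where a pushforward maps input tangents to an output tangent, a pullback goes the other way and maps an output adjoint to input adjoints, which is why it is the building block for reverse mode.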
Pullback functions can be designed in 2 distinct ways that have slightly different behaviour. We need to decide which design we should proceed with.
In the first way, the pullback function will only provide the pullback values; it will not modify the actual derived variables involved.
For example, consider this code snippet:
A pullback function implemented this way will be used as described below:
This statement will be transformed as follows:
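A sketch of the first design, assuming a two-argument function `mul` and the statement `z = mul(x, y);` as the code being differentiated. The names `mul`, `mul_pullback`, and `reverse_of_call`, as well as the use of `std::pair` to return the pullback values, are illustrative assumptions and not taken from Clad.

```cpp
#include <utility>

double mul(double x, double y) { return x * y; }

// Design 1: the pullback only reports the contributions to the adjoints of
// x and y for a given output adjoint d_z. It does not touch any derived
// variable itself.
std::pair<double, double> mul_pullback(double x, double y, double d_z) {
  return {d_z * y, d_z * x};
}

// Hand-written sketch of the reverse pass generated for `z = mul(x, y);`.
// The returned values live in a temporary and the caller adds them to the
// derived variables d_x and d_y.
void reverse_of_call(double x, double y, double d_z, double& d_x, double& d_y) {
  std::pair<double, double> r = mul_pullback(x, y, d_z);
  d_x += r.first;
  d_y += r.second;
}
```

The temporary `r` is the extra storage this design needs for every differentiated call.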
In the second way, the pullback function will update the actual derived variables involved instead of just providing the pullback values.
For example, consider this code snippet:
A pullback function implemented this way will be used as described below:
This statement will get transformed as follows:
This design is computationally less expensive since fewer additional variables are involved.
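A sketch of the second design for a statement such as `z = mul(x, y);`. As before, `mul`, `mul_pullback`, and `reverse_of_call` are hypothetical names, and passing the derived variables by pointer is an assumption made for the example.

```cpp
double mul(double x, double y) { return x * y; }

// Design 2: the pullback receives the derived variables and updates them
// in place, so the caller needs no temporaries for the pullback values.
void mul_pullback(double x, double y, double d_z, double* d_x, double* d_y) {
  *d_x += d_z * y;
  *d_y += d_z * x;
}

// Hand-written sketch of the reverse pass generated for `z = mul(x, y);`.
// The call to the pullback is the whole transformation.
void reverse_of_call(double x, double y, double d_z, double& d_x, double& d_y) {
  mul_pullback(x, y, d_z, &d_x, &d_y);
}
```

The trade-off is that the pullback author becomes responsible for accumulating with `+=` rather than overwriting, since the derived variables may already hold contributions from other uses of `x` and `y`.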
pushforward and pullback functions defined by clad
Clad will automatically define pushforward and pullback functions internally to obtain derivatives of a function when no user-defined pushforward/pullback function is available.
The cube_pushforward and cube_pullback functions discussed above will be synthesised automatically by Clad if they are required and no user-defined rule is available.
pushforward and pullback as the basic differentiation building blocks
We can go one step further and develop pushforward and pullback functions as the basic differentiation building blocks. One direct consequence of this will be that the forward and reverse mode derived functions will be defined directly using the corresponding pushforward and pullback functions.
For example,
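A hand-written sketch of this idea, assuming a function `fn(i, j) = i * i * j`. The signatures of `fn_pushforward` and `fn_pullback` and the name `fn_grad` are assumptions for illustration; what Clad generates may be shaped differently.

```cpp
double fn(double i, double j) { return i * i * j; }

// Pushforward of fn: output tangent for given input tangents d_i and d_j.
double fn_pushforward(double i, double j, double d_i, double d_j) {
  return 2 * i * j * d_i + i * i * d_j;
}

// Pullback of fn: accumulates input adjoints for a given output adjoint.
void fn_pullback(double i, double j, double d_out, double* d_i, double* d_j) {
  *d_i += d_out * 2 * i * j;
  *d_j += d_out * i * i;
}

// Forward mode derivative with respect to i: forward to the pushforward
// with the seed (d_i, d_j) = (1, 0).
double fn_darg0(double i, double j) { return fn_pushforward(i, j, 1, 0); }

// Reverse mode gradient: forward to the pullback with output adjoint 1.
void fn_grad(double i, double j, double* d_i, double* d_j) {
  fn_pullback(i, j, 1, d_i, d_j);
}
```

In this arrangement the derived functions carry no differentiation logic of their own. They only choose the seed, so the same `fn_pushforward` can serve both a direct request for the derivative of `fn` and any caller whose body invokes `fn`.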
A few major advantages of defining derived functions by directly forwarding differentiation to pushforward and pullback functions are as follows:
If fn is used inside some other function fnB, then clad will need to generate its pushforward function. But if fn is also directly differentiated (using auto d_fn = clad::differentiate(fn, "i"); for example), then currently the same function fn will be derived 2 times, once for the pushforward and once for fn_darg0. This can be avoided if fn_darg0 is defined using the pushforward only.
pushforward/pullback design with regard to differentiating with respect to aggregate types
The pushforward/pullback design works well when differentiating scalar types with respect to aggregate types. However, it will be challenging to use this design when differentiating aggregate types with respect to aggregate types.
Any suggestions or comments regarding this discussion are welcome. Please feel free to ask any questions.