Change to CUDA 11 toolchain
I am trying to switch elsa to the regular nvcc toolchain, but I am running into multiple issues with metaprogramming, as nvcc does not fully support it. Specifically, I am currently looking at the following function:
/// overloaded addition operator returns the corresponding expression
template <typename LHS, typename RHS, typename = std::enable_if_t<isBinaryOpOk<LHS, RHS>>>
auto operator+(LHS const& lhs, RHS const& rhs)
{
    auto add = [] __device__(auto const& l, auto const& r) { return l + r; };
    auto expr = Expression<decltype(add), LHS, RHS>(add, lhs, rhs);
    return expr;
}
which creates an Expression object templated on the input types. I get the following compiler error:
The enclosing parent function ("operator+") for an extended __device__ lambda must not have deduced return type
detected during instantiation of "quickvec::Vector<data_t> &quickvec::Vector<data_t>::operator+=(const quickvec::Vector<data_t> &) [with data_t=quickvec::index_t]"
I tried using a trailing return type instead, but the problem is the type of the lambda add, which is unnamed and which I did not manage to express in the return type. Does anyone have any ideas? Pinging @lasser, @david.frank. Maybe you also know someone else who is very familiar with templates and could take a look.
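One possible direction, sketched under assumptions: replace the extended lambda with a named function object, so that the return type of operator+ can be written out explicitly and no extended __device__ lambda remains in a function with a deduced return type. Everything below apart from the shape of operator+ is hypothetical: Plus, Scalar, IsOperand, the HOST_DEVICE macro, and the simplified Expression and isBinaryOpOk are stand-ins for the real elsa/quickvec code, and this has not been checked against the actual code base.

```cpp
#include <type_traits>

// Expands to the CUDA qualifiers under nvcc and to nothing on a host-only compiler.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical named replacement for the lambda `add`; unlike a closure type,
// its type can be spelled in a return type.
struct Plus {
    template <typename L, typename R>
    HOST_DEVICE constexpr auto operator()(L const& l, R const& r) const -> decltype(l + r)
    {
        return l + r;
    }
};

// Hypothetical leaf operand, standing in for the real vector/container types.
struct Scalar {
    double v;
    HOST_DEVICE constexpr double get() const { return v; }
};

// Simplified stand-in for the real Expression class template.
template <typename Callable, typename LHS, typename RHS>
struct Expression {
    Callable op;
    LHS lhs;
    RHS rhs;

    HOST_DEVICE constexpr Expression(Callable c, LHS const& l, RHS const& r)
        : op(c), lhs(l), rhs(r) {}

    // Evaluates both operands and combines them with the stored callable.
    HOST_DEVICE constexpr double get() const { return op(lhs.get(), rhs.get()); }
};

// Hypothetical trait: only Scalar and Expression count as valid operands.
template <typename T>
struct IsOperand : std::false_type {};
template <>
struct IsOperand<Scalar> : std::true_type {};
template <typename C, typename L, typename R>
struct IsOperand<Expression<C, L, R>> : std::true_type {};

// Simplified stand-in for the real isBinaryOpOk.
template <typename L, typename R>
constexpr bool isBinaryOpOk = IsOperand<L>::value && IsOperand<R>::value;

/// overloaded addition operator returns the corresponding expression;
/// the return type is explicit because Plus has a name
template <typename LHS, typename RHS, typename = std::enable_if_t<isBinaryOpOk<LHS, RHS>>>
constexpr auto operator+(LHS const& lhs, RHS const& rhs) -> Expression<Plus, LHS, RHS>
{
    return Expression<Plus, LHS, RHS>(Plus{}, lhs, rhs);
}
```

With these stand-ins, Scalar{1.0} + Scalar{2.0} has the type Expression<Plus, Scalar, Scalar>, and nested sums compose in the same way. The trade-off is one small struct per operation instead of an inline lambda, which may or may not be acceptable here.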