We were selected again by the NumFOCUS SDG grant🎉! With the new selection rules for the Small Development Grant program by NumFOCUS, the projects are now selected using a random script, it was a quite competitive round, but we are so lucky to be funded by NumFOCUS organization again for the project GPU-accelerated BVP Solvers. Firstly I should thank the NumFOCUS community for their generous support for the development of softwares for scientific computing. And I want to thank the PI Chris and the SciML community, for the cutting-edge scientific computing software development and warm support. The project will last 8 months, the main focus will be developing heterogenous and vendor-agnostic GPU accelerated boundary value problem solvers, for both collocation and shooting methods. The project will focus on using KernelAbstraction.jl to parallelize the collocation routines where the current implementation is a big for loop, so the generalization would make the computation much more faster with GPU parallelism. Actually the basic idea would come from the paper: Automated translation and accelerated solving of differential equations on multiple GPU platforms, which generalize the ODE solvers to hetergeneous GPU backend such as CUDA, Metal, OpenAPI and AMDGPU etc. Finally, I am really honored to get this funding to help me continue the BVP works. To make sure the project going on the right track with appropriate speed, I will record biweekly progress in this blog during the whole SDG project period.

Review of the past two weeks

In the first two weeks, I continued the previous work on extension to dynamic optimization of BoundaryValueDiffEq.jl, including support inequality constraints and further application in optimal control problems, but the main issue remains is that the inequality constraints inn boundary value problems for example:

tspan = (0.0, 1.0)
function lotka_volterra!(du, u, p, t)
    du[1] = 1.5 * u[1] - 1.0 * u[1] * u[2]
    du[2] = -3.0 * u[2] + 1.0 * u[1] * u[2]
end
function bc!(residual, u, p, t)
    residual[1] = u[1] - 1.0
    residual[2] = u[2] - 1.0
end
prob = BVProblem(
    lotka_volterra!, bc!, [4.0, 2.0], tspan, lcons = [0.0, 0.0], ucons = [Inf, Inf])
sol = solve(prob, MIRK4(; optimizer = Ipopt.Optimizer()), dt = 0.05)

the inequality constraints over the whole time interval specifications would involve make every discrete elements on the whole time points to satisfy the constraints.

TODOs in next two weeks

In the following two weeks, there are a few TODOs:

Continue the privious PR and get that PR, Optimization based solvers merged ASAP.
The generalization to heterogeneous GPU backends for Shooting methods is relatively easy, the remain issue is the evaluation of boundary conditions routine, for example, if we have a boundary condition of:

function bc!(residual, u, p, t)
    residual[1] = u(pi / 4)[1] - 1.0
    residual[2] = u(pi / 2)[1] - 1.0
end

Which would always involves scalar indexing which is always disallowed by default. We need to figure out a better way to handling this.