Ever since the control bug bit me in my EE undergrad years I am happy to see how useful the knowledge remains. Of course the underlying math of optimization remains general but the direct applications of control theory made it much more appetizing for me to struggle through.
I find myself completely outclassed by mathematicians in my own field.
I tried to learn a little math on the side after my regular software engineer gig but I'm completely outclassed by phd's.
I am unsure of the next course of action or if software will survive another 5 years and how my career will look like in the future. Seems like I am engaged in the ice trade and they are about to invent the refrigerator.
Don't despair. The key to becoming proficient in advanced subjects like this one is to first try to understand the fundamentals in plain language and pictures in your mind. Ignore the equations. Ask AI to explain the topic at hand at the most fundamental level.
Once the fundamental concepts are understood, what problem is being solved and where the key difficulties are, only then the equations will start to make sense. If you start out with the math, you're making your life unnecessarily hard.
Also, not universally true but directionally true as a rule of thumb, the more equations a text contains the less likely it is that the author itself has truly grasped the subject. People who really grasp a subject can usually explain it well in plain language.
I guess I have the opposite experience. I have a post-graduate level of mathematical education and I am dismayed at how little there is to be gained from it, when it comes to AI/ML. Diffusion Models and Geometric Deep Learning are the only two fields where there's any math at all. Many math grads are struggling to find a job at all. They aren't outclassing programmers with their leet math skillz.
It's not clear or obvious why continuous semantics should be applicable on a digital computer. This might seem like nitpicking but it's not, there is a fundamental issue that is always swept under the rug in these kinds of analysis which is about reconciling finitary arithmetic over bit strings & the analytical equations which only work w/ infinite precision over the real or complex numbers as they are usually defined (equivalence classes of cauchy sequences or dedekind cuts).
There are no dedekind cuts or cauchy sequences on digital computers so the fact that the analytical equations map to algorithms at all is very non-obvious.
It is definitely not obvious, but I wouldn't say it is completely unclear.
For instance we know that algorithms like the leapfrog integrator not only approximate a physical system quite well but even conserve the energy, or rather a quantity that approximates the true energy.
There are plenty of theorems about the accuracy and other properties of numerical algorithms.
This is what the field of numerical analysis exists for. These details definitely have been treated, but this was done mainly early in the field's history; for example, by people like Wilkinson and Kahan...
I just took some basic numerical courses at uni, but every time we discretized a problem with the aim to implement it on a computer, we had to show what the discretization error would lead to, eg numerical dispersion[1] etc, and do stability analysis and such, eg ensure CFL[2] condition held.
So I guess one might want to do a similar exercise to deriving numerical dispersion for example in order to see just how discretizing the diffusion process affects it and the relation to optimal control theory.
Continuous formulations are used with digital computers all the time. Limited precision of floats sometimes causes numerical instability for some algorithms, but usually these are fixable with different (sometimes less efficient) implementations.
Discretizing e.g. time or space is perhaps a bigger issue, but the issues are usually well understood and mitigated by e.g. advanced numerical integration schemes, discrete-continuous formulations or just cranking up the discretization resolution.
Analytical tools for discrete formulations are usually a lot less developed and don't as easily admit closed-form solutions.
Doesn't continuous time basically mean "this is what we expect for sufficiently small time steps"? Very similar to how one would for example take the first order Taylor dynamics and use them for "sufficiently small" perturbations from equilibrium. Is there any other magic to continuous time systems that one would not expect to be solved by sufficiently small time steps?
You should look into condition numbers & how that applies to numerical stability of discretized optimization. If you take a continuous formulation & naively discretize you might get lucky & get a convergent & stable implementation but more often than not you will end up w/ subtle bugs & instabilities for ill-conditioned initial conditions.
I understand that much, but it seems like "your naive timestep may need to be smaller than you think or you need to do some extra work" rather than the more fundamental objection from OP?
The translation from continuous to discrete is not automatic. There is a missing verification in the linked analysis. The mapping must be verified for stability for the proper class of initial/boundary conditions. Increasing the resolution from 64 bit floats to 128 bit floats doesn't automatically give you a stable discretized optimizer from a continuous formulation.
Real numbers mostly appear in calculus (e.g. the chain rule in gradient descent/backpropagation), but "discrete calculus" is then used as an approximation of infinitesimal calculus. It uses "finite differences" rather than derivatives, which doesn't require real numbers:
If your definition of "algorithm" is "list of instructions", then there is nothing surprising. It's very obvious. The "algorithm" isn't perfect, but a mapping with an error exists.
If your definition of "algorithm" is "error free equivalent of the equations", then the analytical equations do not map to "algorithms". "Algorithms" do not exist.
I mean, your objection is kind of like questioning how a construction material could hold up a building when it is inevitably bound to decay and therefore result in structural collapse. Is it actually holding the entire time or is it slowly collapsing the entire time?
I am unsure of the next course of action or if software will survive another 5 years and how my career will look like in the future. Seems like I am engaged in the ice trade and they are about to invent the refrigerator.
Once the fundamental concepts are understood, what problem is being solved and where the key difficulties are, only then the equations will start to make sense. If you start out with the math, you're making your life unnecessarily hard.
Also, not universally true but directionally true as a rule of thumb, the more equations a text contains the less likely it is that the author itself has truly grasped the subject. People who really grasp a subject can usually explain it well in plain language.
There are no dedekind cuts or cauchy sequences on digital computers so the fact that the analytical equations map to algorithms at all is very non-obvious.
For instance we know that algorithms like the leapfrog integrator not only approximate a physical system quite well but even conserve the energy, or rather a quantity that approximates the true energy.
There are plenty of theorems about the accuracy and other properties of numerical algorithms.
So I guess one might want to do a similar exercise to deriving numerical dispersion for example in order to see just how discretizing the diffusion process affects it and the relation to optimal control theory.
[1]: https://en.wikipedia.org/wiki/Numerical_dispersion
[2]: https://en.wikipedia.org/wiki/Courant%E2%80%93Friedrichs%E2%...
Discretizing e.g. time or space is perhaps a bigger issue, but the issues are usually well understood and mitigated by e.g. advanced numerical integration schemes, discrete-continuous formulations or just cranking up the discretization resolution.
Analytical tools for discrete formulations are usually a lot less developed and don't as easily admit closed-form solutions.
https://en.wikipedia.org/wiki/Finite_difference
I'm not sure about applications of real numbers outside of calculus, and how to replace them there.
If your definition of "algorithm" is "list of instructions", then there is nothing surprising. It's very obvious. The "algorithm" isn't perfect, but a mapping with an error exists.
If your definition of "algorithm" is "error free equivalent of the equations", then the analytical equations do not map to "algorithms". "Algorithms" do not exist.
I mean, your objection is kind of like questioning how a construction material could hold up a building when it is inevitably bound to decay and therefore result in structural collapse. Is it actually holding the entire time or is it slowly collapsing the entire time?