Data Driven vs. Physics Aware Modeling
Broadly, there are two kinds of modeling: “data driven” and “physics aware”. In its most basic form, data driven modeling means performing many experiments and finding a mathematical rule that best explains all the data that comes out of them. It is a dependable approach: given enough data, it “just works”.
On the other hand, the data driven approach has many limitations, and some would say grave ones:
- The model is only valid within the extents of the data collected. It cannot be reliably extrapolated to other situations.
- A huge number of experiments is needed to build an accurate model. Multi-parameter models suffer from a problem popularly called the “curse of dimensionality”: the amount of data needed for accurate characterization grows exponentially with the number of parameters to be co-characterized.
- The data driven model has no explanatory power. It does not elucidate any underlying physics.
- A data driven model may be less accurate, since it inherits the inaccuracies of the underlying data more directly.
- A data driven model may be hard to compute, since it may depend upon looking up “correlations” from large lookup tables.
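The exponential growth behind the “curse of dimensionality” can be illustrated with a small sketch. It assumes a full grid sweep in which every parameter is sampled at the same number of levels. The function name grid_experiments and the choice of 10 levels per parameter are illustrative assumptions, not taken from any particular experiment.

```python
def grid_experiments(levels_per_param: int, n_params: int) -> int:
    """Experiments needed for a full grid sweep over n_params parameters,
    with each parameter sampled at levels_per_param values (assumed setup)."""
    return levels_per_param ** n_params


# Hypothetical sweep: 10 levels per parameter.
for n_params in (1, 2, 4, 8):
    print(n_params, grid_experiments(10, n_params))
```

Under these assumptions, one parameter needs 10 experiments, while eight co-characterized parameters already need 100 million.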
The more physics aware we can make our model, the more it can overcome the above limitations. A physics aware model is more likely to apply to situations in which it has never been tested, because a successful test says something beyond the particular extents and situations covered: it says that the underlying physics was tested. Fewer experiments are needed to build an accurate model, since a physics aware model usually requires measuring only a few coefficients rather than large, diverse correlation tables. The physics aware model can also be easier to compute, since it depends more on equations and less on data.
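As a rough illustration of the contrast, the sketch below compares a lookup table with a one-coefficient physics model. The free-fall scenario, the synthetic noisy measurements, and all names (G_TRUE, table_model, physics_model) are assumptions chosen for the example and are not part of the original discussion. Both models see the same data from the first second of a fall and are then asked about a time well outside that range.

```python
import numpy as np

G_TRUE = 9.81  # m/s^2; used only to synthesize the hypothetical measurements

# Synthetic "experiments": fall distance measured at 11 times within [0, 1] s,
# with a small assumed measurement noise (0.05 m standard deviation).
rng = np.random.default_rng(0)
t_meas = np.linspace(0.0, 1.0, 11)
d_meas = 0.5 * G_TRUE * t_meas**2 + rng.normal(0.0, 0.05, t_meas.size)


def table_model(t):
    """Data driven: interpolate in the measured table.
    np.interp holds the last table value outside the measured range."""
    return np.interp(t, t_meas, d_meas)


# Physics aware: assume d = 0.5 * g * t^2 and measure one coefficient, g,
# by least squares against the same data.
basis = 0.5 * t_meas**2
g_fit = float(basis @ d_meas / (basis @ basis))


def physics_model(t):
    """Physics aware: evaluate the fitted free-fall law."""
    return 0.5 * g_fit * t**2


# Ask both models about a situation outside the tested extents.
t_new = 3.0
d_true = 0.5 * G_TRUE * t_new**2
table_err = abs(table_model(t_new) - d_true)
physics_err = abs(physics_model(t_new) - d_true)
print(f"fitted g = {g_fit:.2f} m/s^2")
print(f"lookup table error at t = {t_new} s: {table_err:.2f} m")
print(f"physics model error at t = {t_new} s: {physics_err:.2f} m")
```

In this toy setup the table can only repeat what it saw at the edge of its data, while the physics model carries a single measured coefficient into the new situation. The comparison is only as good as the assumed law: if the real system included, say, significant air drag, the one-coefficient model would need more physics to stay accurate.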
Lastly, and very importantly, a physics aware model elucidates the “inner workings” (the noumenon!) of the phenomenon in more detail than a data driven model. This is important because insight into the phenomenon can lead to better technological and engineering solutions.
In practice, data driven and physics aware modeling are not a strict dichotomy but the two ends of a spectrum. The more physics aware we make our models, the more benefits will accrue to us in the long run!