The shape of things to come
How to pick a PhD project? That was the question.
As I was thinking this through, my undergraduate mentor gave me some advice. I don’t remember it exactly (it was a long time ago), but the gist was this: “Do something hard, and even if you don’t completely solve the problem, you’ll learn a lot along the way.”
I ended up following that advice, for the most part. I wrote code, did some math, and implemented common approaches in the field of electronic structure theory. I got my hands dirty, and got to think deeply about important problems in theoretical chemistry.
I’ve been thinking about that experience because, on Friday, Science published some recent work by DeepMind. As you may know, DeepMind (now part of Alphabet) is a pioneer in applying machine learning to a wide variety of fields, including but not limited to Go, retro video games, modern video games, and protein structure prediction.
This most recent work from DeepMind is in the field of Density Functional Theory (DFT). Very coarsely, DFT provides a low-cost computational approach to determining the ground-state properties of physical systems. It’s estimated that roughly 30% of all supercomputer time is devoted to solving the equations that DFT calculations require. In short, it’s a practical way to explore systems using quantum mechanics. For more detail, you can take a look here.
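To give a sense of the machinery (this is a textbook sketch of the standard Kohn-Sham formulation in atomic units, my notation rather than DeepMind’s), the ground-state energy is written as a functional of the electron density n(r):

```latex
E[n] = T_s[n]
     + \int v_{\text{ext}}(\mathbf{r})\, n(\mathbf{r})\, d\mathbf{r}
     + \frac{1}{2} \iint \frac{n(\mathbf{r})\, n(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\, d\mathbf{r}\, d\mathbf{r}'
     + E_{\text{xc}}[n]
```

The first three pieces (the non-interacting kinetic energy, the external nuclear potential, and the classical Coulomb repulsion of the density with itself) are known exactly. Everything else gets swept into the last term, which is where the trouble starts.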
A critical element of DFT is the Exchange-Correlation (XC) functional (a functional is a function-of-a-function) and, while we know some of its asymptotic properties and have insight from model systems, the exact form of the XC functional is unknown. Whole careers have been built on the search for functional forms that capture the required physics of electronic exchange and correlation, and there are hundreds of functionals available (there’s even an annual poll wherein folks can vote on their favourites).
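To make “function-of-a-function” concrete, here is the simplest textbook example: the local density approximation (LDA) for exchange. It’s decades old and has nothing to do with DM21; it just shows the shape of the thing:

```latex
E_x^{\text{LDA}}[n] = -\frac{3}{4} \left( \frac{3}{\pi} \right)^{1/3} \int n(\mathbf{r})^{4/3}\, d\mathbf{r}
```

It takes an entire function, the density n(r), and returns a single number, an exchange energy. The hundreds of functionals in circulation are, at heart, ever more elaborate versions of this recipe.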
In their recent publication, DeepMind introduced DM21, their own functional, built through the application of their machine learning strategies, and it’s reportedly quite good.
The publication appeared as I was getting ready to head out to a conference on antibody modeling. One of the first talks I heard at that meeting was from an academic who described some of his group’s recent work on protein structure prediction (a field that has been shaken by DeepMind’s AlphaFold and, more recently, AlphaFold2). I don’t know this person at all, but his remarks made me think he’d built a career on the physics of protein folding: the derivation and use of tractable mathematical expressions that capture what we believe is happening within these systems. And what was he talking about? … Machine learning. I got the distinct impression, subsequently confirmed when I met him in person, that he was wrestling with this wholesale change, forced on his research direction by the seemingly unstoppable success of machine learning approaches.
Which brings me back to the question I started with. What’s a student in the sciences to do now? Imagine a budding theoretician, interested in understanding nature fundamentally through physics and chemistry. Should she plan on learning those fundamentals and the models that have come before, and then work to improve them through better descriptions of the underlying problem? Or should she instead focus her efforts on learning the tools of machine learning? Should she do both? There’s no easy answer, and maybe there doesn’t need to be. I just wonder what this means for the choices junior scientists might have to make, the trajectories their careers might then take, and what we might not be learning by having the machines learn for us.