Last month, Google announced that AlphaFold, the AI program built by its DeepMind unit, can ‘predict’ the structures of all 200 million known proteins from about 1 million species. This is, indeed, an impressive achievement. In a sense, I consider it an example of what AI really could be useful for: not so much creating humanoids or sci-fi conscious computers, but rather fathoming the complexity of Nature that the human mind can’t. The announcement made headlines everywhere and generated much excitement. But, in most cases, the headlines didn’t tell the whole story.
Because ‘predicting’ means what it means: we have predictions, not real protein structures. In the context of IT, ‘predicting’ means that a computer program ‘forecasts’ and ‘estimates’ something by making gazillions of (more or less approximate) calculations and simulations on the basis of what we already know. It is a bit like a weather forecast, which may be pretty accurate over short periods, but we must always keep in mind that there can be considerable deviations between calculation and reality. These are speculative calculations based on the information that has been fed into a machine. In fact, almost as a footnote, we learn that only about 35% of AlphaFold’s predictions are deemed highly accurate, another 45% are deemed accurate enough for applications, and the remaining 20% might turn out to be wrong. In other words, it is a tool that may get it wrong about one time in five. It can, nevertheless, be a quite useful (re-)search engine, giving indications and suggesting directions of research, but it can’t replace the good old empirical approach: every single protein structure will still have to be verified by sophisticated X-ray or electron-microscopy observations.
Moreover, AI does not tell us how proteins fold. We are far from understanding why proteins fold the way they do, because we have no idea how the folding mechanism works or why a specific architecture leads to a specific function. In fact, even once we know all the protein structures, that will not automatically tell us how to design new drugs from them. Protein structure by itself will be no more informative for designing new drugs than the mapping of the genome was for designing drugs against genetic diseases. As usual, again and again, it turns out that the map is not the territory.