By now, we’ve all heard plenty about the “reproducibility crisis” in science. Many of the stories center on unscrupulous or desperate scientists who publish data that they have, to put it charitably, tweaked. But despite the challenges inherent to manual lab work, the vast majority of scientists are in it for the right reasons, and the reproducibility crisis is not fundamentally an issue of ethics. It is about something far more systemic in the way information is recorded and shared: the lack of reproducibility that no one knows is there, not even the authors of the papers where the offending data appears. The problem is not fraudulent or manipulated data, but the way data is produced, which renders it unreproducible all the same.
Honest yet unreproducible data is a major problem for science. Research suggests that as many as 68% of studies cannot be reproduced in an industrial setting. [1] The time and money that companies spend discovering that published work is not reproducible mean that fewer treatments make it to market, and the ones that do are more expensive. Like a ripple spreading outward in water, unreproducible data travels far beyond its source, affecting every scientist who reads it, every paper that cites it, and every project that builds on it.
Many potential solutions have been proposed, from changing the “publish or perish” incentives in academia to crowdsourced troubleshooting, but none of them addresses a major part of the problem: methods sections. Today’s methods sections are full of vague and incomplete information, and scientists are left to make assumptions when attempting to reproduce them. Most are little better than grandma’s recipes, riddled with dozens of assumptions made by both the author and the reader: what temperature counts as “room temperature,” whether the oven is accurately calibrated, what exactly a “medium” potato means. No wonder grandma’s cooking always comes out better: she knows what she meant by all of these parameters, and we don’t.
The same thing happens in a paper’s methods section. We are all using different makes and models of equipment, our labs have different humidity and temperature levels, and we don’t usually know what kind of plastic our consumables are made of, to list just a few differences that inevitably exist between physical spaces.
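To see what pinning down those assumptions might look like, here is a minimal sketch in Python. It is purely illustrative: the class, field names, and values are hypothetical, not any lab’s real schema. The idea is simply that a single protocol step can carry every parameter explicitly instead of leaving it to the reader’s imagination.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IncubationStep:
    """One step of a protocol with no implicit parameters (illustrative)."""
    temperature_c: float    # exact setpoint, not "room temperature"
    duration_min: float
    humidity_pct: float     # ambient humidity during the step
    vessel_material: str    # e.g. polypropylene vs. polystyrene
    instrument_model: str   # make and model, since calibration varies

# "Room temperature" pinned to a number; nothing left for the reader to guess.
step = IncubationStep(
    temperature_c=22.5,
    duration_min=30.0,
    humidity_pct=45.0,
    vessel_material="polypropylene",
    instrument_model="ThermoMixer C",  # hypothetical example instrument
)
```

Every detail that a traditional methods section leaves implicit becomes a value that travels with the protocol itself.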

How do we fix this?
In a traditional lab, it’s almost impossible to rectify this because the incentives are misaligned: a journal is not going to be more likely to accept your paper because your methods section is three pages long. But in a cloud lab, it’s a different story. When you set up a method in a cloud lab, your protocol becomes a snippet of code that contains every parameter defining your experiment. In ECL’s case, Command Center already knows every setting on every instrument in our lab, from flow rates to temperature traces to materials, so everything you could ever want to know about a protocol is readily available. When you build your experiment, you select the options you need; when you want to run the exact same protocol again, you take the snippet of code and click “go.” And if someone else wants to run it, all they need is that same code snippet and they, too, can click “go.” This is true reproducibility.
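As a rough sketch of that workflow in Python (the class, fields, and serialization here are hypothetical stand-ins, not Command Center’s actual API), a protocol-as-code snippet might look like this:

```python
import json
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class HPLCProtocol:
    """A fully parameterized experiment definition (illustrative only)."""
    column: str
    flow_rate_ml_min: float
    column_temperature_c: float
    injection_volume_ul: float
    gradient: tuple  # (time_min, percent_b) pairs

protocol = HPLCProtocol(
    column="C18, 4.6 x 150 mm",
    flow_rate_ml_min=1.0,
    column_temperature_c=30.0,
    injection_volume_ul=10.0,
    gradient=((0.0, 5.0), (20.0, 95.0)),
)

# "Publishing" the method is just sharing the serialized snippet...
snippet = json.dumps(asdict(protocol))

# ...and reproducing it is loading that snippet and clicking "go" on an
# identical experiment definition.
loaded = json.loads(snippet)
loaded["gradient"] = tuple(map(tuple, loaded["gradient"]))  # JSON lists back to tuples
assert HPLCProtocol(**loaded) == protocol
```

Because the snippet itself is the experiment definition, two people who run it are, by construction, running the same experiment.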
In the future, there will be no need to email the author of a paper because you can’t get their method to work; you’ll just run the code they published. Graduate students won’t spend hours deciphering a former student’s notes; they’ll find the code they need in the lab’s cloud account. And companies won’t waste time rebuilding the know-how that a former employee took with them; they’ll simply run the code from their private database. The result of all of this reproducibility will be more efficient science: more discoveries and more cures in less time.
1. Prinz, F., Schlange, T. & Asadullah, K. Believe it or not: how much can we rely on published data on potential drug targets? Nat Rev Drug Discov 10, 712 (2011). https://doi.org/10.1038/nrd3439-c1