Cleaning up the mess in AI innovation

A lot has already been said about the gigantic potential of AI: an explosion in algorithmic capabilities, computing capacity, and data is enabling beyond-human machine competencies. A blue ocean of innovation is expected to enable new use cases and drive massive productivity gains across virtually all industries. The impact on the global economy will certainly be counted in trillions of dollars.

The mess underneath the surface of AI innovation is less known.

A single experiment to test a novel computer vision algorithm involves a long list of tedious tasks. Fewer than 10% of AI research papers share their source code, and only 30% share the dataset used to produce the claimed results in terms of accuracy, latency or energy consumption. If you are lucky enough to collect both, you still need to install a specific compiler, the appropriate drivers and all the other software packages with their own dependencies. The whole setup process can take up to a month before you can finally measure the performance of the tested solution. If any of these steps is not meticulously recorded, your teammates will have a hard time reproducing your results, as each of them can affect performance.
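
As a minimal sketch of what "meticulously recording" a setup could look like, the snippet below captures the Python, OS, package and GPU driver versions next to an experiment's results. It assumes a Python-based experiment on a machine where nvidia-smi may or may not be present; the script, file name and fields are purely illustrative and not tied to any particular tool.

```python
"""Illustrative sketch: snapshot the software environment alongside benchmark results.
All file names and fields are hypothetical, not part of any specific tool."""
import json
import platform
import subprocess
import sys
import time


def snapshot_environment() -> dict:
    """Record the pieces of the setup that typically change benchmark results."""
    env = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "python": platform.python_version(),
        "os": platform.platform(),
        "machine": platform.machine(),
        # Every installed package version, since any of them may affect results.
        "packages": subprocess.run(
            [sys.executable, "-m", "pip", "freeze"],
            capture_output=True, text=True, check=False,
        ).stdout.splitlines(),
    }
    # GPU name and driver version, if an NVIDIA GPU and nvidia-smi are available.
    try:
        smi = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
            capture_output=True, text=True, check=False,
        )
        env["gpu"] = smi.stdout.strip() if smi.returncode == 0 else "unavailable"
    except FileNotFoundError:
        env["gpu"] = "unavailable"
    return env


if __name__ == "__main__":
    # Store the snapshot next to the measured results so teammates can reproduce them.
    with open("experiment_environment.json", "w") as f:
        json.dump(snapshot_environment(), f, indent=2)
```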

This considerably slows down innovation: how can AI teams make progress if they cannot even reproduce earlier successes?

And at this point you are still only testing a single solution, i.e. a given algorithm deployed on a given hardware platform and run on a given dataset. The problem becomes intractable with the exponential growth in the number of potential combinations. 30,000 research papers on AI are released each year. Software packages are constantly updated, dragging you into dependency hell. AI is deployed on a wide range of platforms, from TPUs in the cloud to legacy mainframe computers, mobile phones and IoT devices, with new models released every day. Finally, the public datasets used in research may differ significantly from the proprietary data jealously guarded by corporations (and sometimes other institutions), since data is the fuel of AI. Performance may differ accordingly.
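
To make the combinatorial explosion concrete, here is a back-of-the-envelope count. Except for the yearly paper count cited above, every number below is an illustrative assumption; only the multiplication is the point.

```python
# Rough count of candidate (algorithm, software version, hardware, dataset) combinations.
# The individual counts are illustrative assumptions, not measured figures.
algorithms = 30_000        # roughly one per AI research paper published in a year
software_versions = 20     # hypothetical: releases of the surrounding software stack
hardware_targets = 50      # hypothetical: cloud TPUs, GPUs, CPUs, mobile SoCs, IoT boards...
datasets = 10              # hypothetical: public benchmarks plus proprietary data

combinations = algorithms * software_versions * hardware_targets * datasets
print(f"{combinations:,} candidate solutions to evaluate")  # 300,000,000
```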

AI players have to make their technological choices out of millions of potential solutions. Up to now, they have not been in a position to do so rigorously. This leads to delays in AI adoption, guesswork, suboptimal imitation or costly wrong choices.

Some may say that virtual machines or containers (e.g. Docker images) already solve this. Our users know that such snapshots just hide the mess: they are not portable across platforms and quickly become outdated. To accelerate and democratize the diffusion of AI innovation, the world needs a common playground to build, benchmark and share functional AI solutions, publicly or internally depending on who you are, but in any case 100x faster than what is possible today.

That's why we are building CodeReef.