[Nov 2022] This has been such an excellent year for software system design in ML.
Not too long ago, I ran a poll on Twitter to see why people were interested in MLOps.
I 💯agree with what Adam said here. I will be doing a detailed post on what MLOps is: its history and background, the open problems and challenges in the field, and the startups working to solve them, to put things into perspective for everyone. But first, I wanted to start with a post that would help you get inspired.
There are two ways of exploring MLOps. The first is the industry way: getting your hands dirty building something with the tools, libraries, and frameworks available. This is a bad way to start out, because it's easy to get overwhelmed and lose perspective when inundated with so many choices, each time-consuming and with a steep learning curve. So here's the second way: the academic way. Let's get a bird's-eye view of what's going on in the field. MLOps as an academic research field is still new, with only a handful of directly relevant papers.
So, I compiled a list of some of my favorite papers 📜in MLOps.
The top one would be Machine Learning: The High-Interest Credit Card of Technical Debt by D. Sculley et al.
We invited him as a guest on our MLOps Community podcast (Spotify/iTunes), Episode #32 - definitely worth listening to! research.google/pubs/pub43146/
If there's one I would definitely read, it would be Machine Learning Operations (MLOps): Overview, Definition, and Architecture by Dominik Kreuzberger, Niklas Kühl, and Sebastian Hirschl. It highlights the necessary principles, components, and associated architecture and workflows in MLOps. arxiv.org/abs/2205.02302
A recent one is Operationalizing Machine Learning: An Interview Study by Shreya Shankar et al., which interviews 18 MLOps practitioners and discusses common practices across the stages of an ML project, from experimentation to deployment to monitoring. arxiv.org/abs/2209.09125
Though written as a guide for academia, How to avoid machine learning pitfalls: a guide for academic researchers by Michael A. Lones collects best practices that apply just as well to all data scientists and ML engineers. arxiv.org/abs/2108.02497
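To make one of those pitfalls concrete: a classic mistake the paper warns against is letting test data leak into training, e.g., by fitting a scaler on the full dataset before splitting. Here's a minimal sketch of the safe pattern in scikit-learn (my own illustration, not code from the paper):

```python
# Avoiding the "data leakage" pitfall: fit preprocessing on
# training data only, never on the full dataset.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Wrong: scaler.fit(X) before splitting leaks test-set statistics into training.
# Right: a Pipeline fits the scaler on X_train only (and per-fold inside CV).
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
print(f"held-out accuracy: {model.score(X_test, y_test):.3f}")
```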
This one by Côté et al. describes the researchers' approach to designing a study that they hope will guide how quality assurance tools for ML software systems get built. The study itself is yet to come out, but the paper brings attention to an open challenge. arxiv.org/abs/2208.08982
Next is a paper on how to address the engineering challenges of distributed training when you don't have the infrastructure to match the big corps with their infinite compute and a million hyperparameters: Training Transformers Together. arxiv.org/abs/2207.03481
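The paper's own answer is collaborative training over the internet, pooling compute from many volunteers. As a baseline for what distributed training looks like mechanically, here's a minimal data-parallel sketch using PyTorch's DistributedDataParallel (a generic illustration of mine, not the paper's setup):

```python
# Minimal data-parallel training sketch with torch.distributed.
# A conventional-cluster baseline, NOT the paper's over-the-internet setup.
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets RANK / LOCAL_RANK / WORLD_SIZE for each worker.
    dist.init_process_group(backend="gloo")  # use "nccl" on GPU nodes
    model = torch.nn.Linear(32, 2)
    ddp_model = DDP(model)  # averages gradients across workers each step
    opt = torch.optim.SGD(ddp_model.parameters(), lr=0.1)

    for step in range(10):
        x = torch.randn(16, 32)       # each rank sees a different shard
        loss = ddp_model(x).pow(2).mean()
        opt.zero_grad()
        loss.backward()               # gradient all-reduce happens here
        opt.step()
    dist.destroy_process_group()

if __name__ == "__main__":
    main()  # launch with: torchrun --nproc_per_node=2 train.py
```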
Of course, the list wouldn't be complete without a discussion of Jupyter notebooks. But what are the performance differences between notebooks and scripts, and what are the pros and cons of each? 📜A Large-Scale Comparison of Python Code in Jupyter Notebooks and Scripts arxiv.org/abs/2203.16718
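If you want to run such a comparison on your own code, converting notebooks to scripts is straightforward with nbconvert, which ships with Jupyter. A small sketch (the filename analysis.ipynb is just a placeholder):

```python
# Convert a notebook to a plain Python script with nbconvert.
# Shell equivalent: jupyter nbconvert --to script analysis.ipynb
import nbformat
from nbconvert import PythonExporter

nb = nbformat.read("analysis.ipynb", as_version=4)
source, _resources = PythonExporter().from_notebook_node(nb)
with open("analysis.py", "w") as f:
    f.write(source)
```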
But what about production infrastructure? How do you cater to data stalls in the pre-processing pipeline? Understanding Data Storage and Ingestion for Large-Scale Deep Recommendation Model Training arxiv.org/abs/2108.09373
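A "data stall" is the accelerator sitting idle while the input pipeline reads and transforms data. One common mitigation, sketched below with PyTorch's DataLoader, is to overlap preprocessing with training via parallel workers and prefetching; this is a generic illustration, not the production ingestion system the paper describes:

```python
# Overlapping input preprocessing with training to hide data stalls.
# A generic PyTorch sketch, not the paper's production ingestion system.
import torch
from torch.utils.data import DataLoader, Dataset

class ToyDataset(Dataset):
    """Stand-in for an expensive decode/transform pipeline on the CPU."""
    def __len__(self):
        return 10_000

    def __getitem__(self, idx):
        return torch.randn(64), idx % 2  # fake features and label

if __name__ == "__main__":
    loader = DataLoader(
        ToyDataset(),
        batch_size=256,
        num_workers=4,      # preprocess in parallel worker processes
        prefetch_factor=2,  # each worker keeps 2 batches ready in advance
        pin_memory=True,    # speeds up host-to-GPU copies
    )
    model = torch.nn.Linear(64, 2)
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    for x, y in loader:  # next batches are prepared while the model trains
        loss = torch.nn.functional.cross_entropy(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
```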
Last, how can all the progress in machine learning guide the future of chip design? This paper by Jeff Dean provides an interesting outlook on hardware for software folks: The Deep Learning Revolution and Its Implications for Computer Architecture and Chip Design arxiv.org/abs/1911.05289
Found it useful? Others might too. This post is public, so feel free to share it.