Lessons from using SigOpt to weigh tradeoffs for BERT size and accuracy

When: Wednesday, August 19 at 10am PT / 1pm ET

BERT is a powerful, generalizable architecture that can be transferred to a variety of NLP tasks. But it is very, very large, which can make it very, very slow. In a recent analysis, SigOpt Machine Learning Engineer Meghana Ravikumar explored this tradeoff between size and performance for BERT on SQuAD 2.0.

In her first talk, Meghana explained how she set up a Multimetric Bayesian Optimization experiment to explore this tradeoff. In this talk, she builds on that discussion, explaining how she used insights from training runs and automated hyperparameter tuning to explore the tradeoff in greater depth and draw specific conclusions about the impact of reducing model size on performance for this particular task.

More specifically, Meghana will explain how SigOpt integrates easily into her modeling process and helps keep it organized. She'll walk through critical points of her modeling workflow and describe how she leveraged SigOpt to make informed decisions. In particular, she'll explain how to:

  • Track and Organize Modeling Attributes: Track and organize your training and tuning cycles, including metrics, parameters, architectures, training or tuning runs, and more.
  • Visualize and Compare Runs: Build intuition on your models with customizable visualizations that automatically populate your dashboard as you train and tune.
  • Seamlessly Train and Tune: Transition between training and tuning with fully integrated automated hyperparameter tuning and training run tracking.

Join us for a live demo as Meghana walks through the functionality of Experiment Management. If you’re interested in learning more, join our private beta to get free access to the product.

Meghana Ravikumar

Meghana is a Machine Learning Engineer at SigOpt, focused on novel applications of deep learning across academia and industry. She explores the impact of hyperparameter optimization and other techniques on model performance and evangelizes these practical lessons for the broader machine learning community. Prior to SigOpt, she worked in biotech, using natural language processing to mine and classify biomedical literature. She holds a BS in Bioengineering from UC Berkeley. When she's not reading papers, developing models and tools, or trying to explain complicated topics, she enjoys yoga, traveling, and hunting for the perfect chai latte.