Auto-Suggest: Learning-to-Recommend Data Preparation Steps Using Data Science Notebooks
Citation: Cong Yan, Yeye He (2020) Auto-Suggest: Learning-to-Recommend Data Preparation Steps Using Data Science Notebooks.
Internet Archive Scholar (search for fulltext): Auto-Suggest: Learning-to-Recommend Data Preparation Steps Using Data Science Notebooks
Wikidata (metadata): Q95693645
Download: https://homes.cs.washington.edu/~congy/JupyterNotebooks.pdf
Summary
The paper shows the promise of learning from existing notebooks to predict and recommend data preparation steps. The authors harvested public notebooks from GitHub, replayed them using the Pandas API, and instrumented the replay to build data-flow graphs with fine-grained information about the input/output tables and the operation taken at each step. Using this data, they build machine-learning-based and optimization-based models to predict the parameterization of Pivot and Join operations, and develop a deep-learning architecture to predict the next operation.
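To make "predicting the parameterization of a Join" concrete, the sketch below ranks candidate join-column pairs between two small tables by how much their distinct values overlap. This is only a naive hand-written heuristic used for illustration, not the learned model described in the paper; the function names (`jaccard`, `rank_join_candidates`) and the toy tables are hypothetical.

```python
from itertools import product


def jaccard(a, b):
    """Jaccard similarity between the distinct values of two columns."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def rank_join_candidates(left, right):
    """Rank (left_col, right_col) pairs by value overlap, highest first.

    `left` and `right` are tables given as {column_name: list_of_values}.
    Assumption: value overlap is a stand-in signal, not the paper's model.
    """
    scored = [
        (jaccard(lvals, rvals), lcol, rcol)
        for (lcol, lvals), (rcol, rvals) in product(left.items(), right.items())
    ]
    # Sort by descending score; break ties by column names for determinism.
    return sorted(scored, key=lambda t: (-t[0], t[1], t[2]))


# Toy input tables (hypothetical data).
orders = {"order_id": [1, 2, 3, 4], "cust": ["a", "b", "a", "c"]}
customers = {"cust_id": ["a", "b", "c", "d"], "region": ["EU", "US", "EU", "US"]}

ranking = rank_join_candidates(orders, customers)
best_score, best_left, best_right = ranking[0]
print(best_left, best_right, best_score)
```

The top-ranked pair could then be suggested as the `left_on` and `right_on` arguments of a `pandas.merge` call. A learned recommender would presumably replace the single overlap score with a model trained on joins observed in real notebooks.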