Auto-Suggest: Learning-to-Recommend Data Preparation Steps Using Data Science Notebooks

From AcaWiki


Citation: Cong Yan, Yeye He (2020) Auto-Suggest: Learning-to-Recommend Data Preparation Steps Using Data Science Notebooks.


Wikidata: Q95693645

Download: https://homes.cs.washington.edu/~congy/JupyterNotebooks.pdf



Summary:

The paper shows the promise of learning from existing notebooks to predict and recommend data preparation steps. The authors harvested public notebooks from GitHub, replayed them against the Pandas API, and instrumented the execution to build data-flow graphs with fine-grained information about the input/output tables and the operation taken at each step. Using this data, they build machine-learning-based and optimization-based models to predict the parameterization of Pivot and Join operations, and develop a deep-learning architecture to predict the next operation.
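To make the idea of instrumented replay more concrete, here is a minimal sketch in plain Python. It is not the authors' pipeline: tables are modelled as lists of row dicts rather than Pandas DataFrames, and every name in it (`FlowLog`, `Step`, `join`, `rank_join_candidates`) is hypothetical. It records one Join step with its input and output columns, then ranks candidate join-column pairs by a simple value-overlap (Jaccard) score, which stands in for the learned models described in the paper.

```python
from dataclasses import dataclass, field
from itertools import product


def columns(table):
    """Column names of a table modelled as a list of row dicts."""
    return sorted(table[0].keys()) if table else []


@dataclass
class Step:
    """One node of the data-flow log: an operation and the tables around it."""
    op: str
    params: dict
    input_columns: list   # one column list per input table
    output_columns: list


@dataclass
class FlowLog:
    """Hypothetical recorder; an assumption, not the paper's instrumentation."""
    steps: list = field(default_factory=list)

    def record(self, op, params, inputs, output):
        self.steps.append(
            Step(op, dict(params), [columns(t) for t in inputs], columns(output))
        )
        return output


def join(left, right, on):
    """Inner equi-join of two row-dict tables on a shared column name."""
    index = {}
    for row in right:
        index.setdefault(row[on], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[on], [])]


def rank_join_candidates(left, right):
    """Score every (left column, right column) pair by Jaccard value overlap.

    A naive heuristic used only for illustration; the paper learns this
    ranking from harvested notebooks instead.
    """
    scored = []
    for lcol, rcol in product(columns(left), columns(right)):
        lvals = {row[lcol] for row in left}
        rvals = {row[rcol] for row in right}
        union = lvals | rvals
        score = len(lvals & rvals) / len(union) if union else 0.0
        scored.append((score, lcol, rcol))
    return sorted(scored, key=lambda t: -t[0])


# Made-up example tables.
orders = [
    {"order_id": 1, "cust_id": 10, "amount": 5.0},
    {"order_id": 2, "cust_id": 11, "amount": 7.5},
    {"order_id": 3, "cust_id": 10, "amount": 1.0},
]
customers = [
    {"cust_id": 10, "name": "Ann"},
    {"cust_id": 11, "name": "Bob"},
]

log = FlowLog()
joined = log.record(
    "join", {"on": "cust_id"}, [orders, customers],
    join(orders, customers, "cust_id"),
)
best_score, best_left, best_right = rank_join_candidates(orders, customers)[0]
```

In this toy setup the logged step carries the kind of per-operation information the summary mentions (operation, parameters, input and output schemas), and the top-ranked candidate pair is the one a user would most plausibly join on. A learned model would replace the overlap score with features and labels drawn from many such logged steps.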