DataHub: Collaborative Data Science & Dataset Version Management at Scale