A friend of mine shared with me an abstract from an upcoming talk by a Computer Science professor: Data analysis is an emerging research topic that focuses on understanding patterns of data to discover knowledge. For understanding the data, various machine learning (ML) techniques are commonly utilized to build learning models. For maintaining high performance of the models, it is important to extract good features and utilize them to build a reliable learning model.
Disclaimer: This post is at least tongue-half-way-in-cheek. I actually like the article I’m lampooning. A recent publication by academics and AI researchers titled “Datasheets for Datasets” calls for the Machine Learning community to ensure that all of their datasets are accompanied by a “datasheet.” These datasheets would contain information about the dataset’s “motivation, composition, collection process, recommended uses, and so on.” In other words, the authors, Gebru et al., would like you to include more data about your dataset.