Test/Train Split Importance in Speech Datasets