End-to-End Speech Recognition Explained