That's a tough question to answer, since audio is so multidisciplinary: it involves physics (acoustics), electronics (analog and digital), psychoacoustics, perception, music, etc.
I taught a course called Critical Listening: Perception and the Audio Environment for first-year college students interested in working in the recording/post-production industry, and I tried to focus on the variables within the audio recording/playback chain that generally affect sound quality. I used Alton Everest's Master Handbook of Acoustics, with a lot of supplemental reading to cover microphones, digital audio, loudspeakers, room acoustics, etc.
In my opinion, the biggest variables in the entire recording-playback chain are the loudspeakers and their acoustical interaction with the listening room. Why? Because everything within the recording chain is manipulated based on listening through loudspeakers in rooms, yet these are the weakest and least understood components in the entire recording and playback chain. And there are no meaningful standards to define their performance.
There were not many good books on the subject until Floyd Toole wrote one this year called
Sound Reproduction: The Acoustics and Psychoacoustics of Loudspeakers and Rooms. It's not too technical and has only graphs and text (no math).
Good luck!
Cheers