The current major Linux sound servers are the Enlightened Sound Daemon (ESD or EsounD) for Gnome, the analog Real time synthesizer (aRts) for KDE 2 and 3, and the Advanced Linux Sound Architecture (ALSA), which works everywhere. The Network Audio System (NAS) is a client/server networked sound system for thin clients. ALSA, ESD, and aRts each include an emulator for applications that require the older Open Sound System (OSS). JACK is a popular professional-level low-latency audio server. One thing these sound servers all have in common is that they rely on ALSA to provide the audio hardware drivers.
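As a rough illustration of the OSS emulators mentioned above, each one is normally run as a small wrapper command in front of an OSS-only program: `aoss` for ALSA, `esddsp` for ESD, and `artsdsp` for aRts. The Python sketch below only checks which of these wrappers are on your PATH. The function name `available_oss_wrappers` is made up for this example, and whether a wrapper is installed depends on your distribution's packaging.

```python
import shutil

# Wrapper commands that run an OSS-only program through each sound system.
# Which ones are installed depends on the distribution and its packages.
OSS_WRAPPERS = {"ALSA": "aoss", "ESD": "esddsp", "aRts": "artsdsp"}


def available_oss_wrappers(which=shutil.which):
    """Return {sound system: wrapper command} for wrappers found on PATH.

    `which` is injectable so the lookup can be exercised without the
    real commands being installed (hypothetical helper for this sketch).
    """
    return {name: cmd for name, cmd in OSS_WRAPPERS.items() if which(cmd)}


if __name__ == "__main__":
    found = available_oss_wrappers()
    for name, cmd in found.items():
        print(f"{name}: prefix an OSS application with '{cmd}'")
    if not found:
        print("No OSS wrapper commands found on PATH")
```

If `aoss` turns up, for example, an OSS-only program can usually be started as `aoss program-name`, so that its audio is routed through ALSA.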
ESD and aRts both support networked sound, and manage sound streams from multiple sources. ESD is hard-coded into Gnome, but thanks to the PulseAudio developers it should soon be divorced from Gnome, as a proper modular Linux application should be. aRts was designed from the beginning as an independent, portable audio framework. ALSA provides device drivers, multi-device management, basic mixing and recording, and works in any Linux environment, including the console.
ESD and aRts each sit in a stack that covers both low-level and higher-level functions: interfacing between sound hardware and applications, and encoding and decoding your various file and streaming audio formats. aRts does everything itself; on the Gnome side, ESD handles sound server duties and GStreamer handles the encoding and decoding. Both eventually pass everything down the pipeline to ALSA.
aRts has been officially deprecated by the KDE team for KDE4, and will be replaced by Phonon. Phonon promises a simpler API (application programming interface) by functioning more as a universal interface between existing audio engines such as ALSA, Xine, MPlayer, and VLC. The Phonon developers also have the worthy goal of designing a friendlier mixer interface that doesn't require knowledge of sound engineering terminology, and instead uses sensible labels like Notifications, Music, and Communications.