Voice Activity Detection (VAD) finds the segments of an audio stream that contain human speech. Passing only those segments to the ASR engine avoids spending computation on silence and background noise, which improves real-time performance.
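To make the idea concrete, here is a minimal sketch of an energy-based detector. This is a simple stand-in technique chosen for illustration, not necessarily the method used here. The function names, the 20 ms frame size, the 0.02 RMS threshold, and the synthetic tone used in place of real speech are all assumptions.

```python
import math


def frame_energies(samples, frame_len):
    """Return the RMS energy of each full, non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return energies


def detect_speech_segments(samples, sample_rate=16000, frame_ms=20, threshold=0.02):
    """Return (start_sec, end_sec) pairs for runs of frames above the threshold.

    Hypothetical helper: samples are floats in [-1.0, 1.0]; the threshold is an
    illustrative value that would need tuning on real recordings.
    """
    frame_len = sample_rate * frame_ms // 1000
    energies = frame_energies(samples, frame_len)
    segments = []
    seg_start = None  # index of the first frame in the current active run
    for i, energy in enumerate(energies):
        if energy > threshold and seg_start is None:
            seg_start = i
        elif energy <= threshold and seg_start is not None:
            segments.append((seg_start * frame_len / sample_rate,
                             i * frame_len / sample_rate))
            seg_start = None
    if seg_start is not None:  # the stream ended while still active
        segments.append((seg_start * frame_len / sample_rate,
                         len(energies) * frame_len / sample_rate))
    return segments


def make_demo_signal(sample_rate=16000):
    """Build 0.5 s silence, 0.5 s of a 220 Hz tone, 0.5 s silence (assumed test input)."""
    half = sample_rate // 2
    silence = [0.0] * half
    tone = [0.5 * math.sin(2 * math.pi * 220 * n / sample_rate) for n in range(half)]
    return silence + tone + silence


if __name__ == "__main__":
    # Only the middle half-second would be forwarded to ASR.
    print(detect_speech_segments(make_demo_signal()))  # [(0.5, 1.0)]
```

In this sketch only the span from 0.5 s to 1.0 s would be sent to the recognizer, so two thirds of the input is skipped. A fixed energy threshold is fragile under real background noise, so practical systems usually rely on more robust features or a trained model.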