"Audio Signal Quality Enhancement Based On Quantitative Snr Analysis And Adaptive Wiener Filtering" in Patent Application Approval Process (USPTO 20180277135)

Electronics Newsweekly

By a News Reporter-Staff News Editor at Electronics Newsweekly -- A patent application by the inventors Ali, Mahdi (Detroit, MI); Zhou, Xuan (Shanghai, CN); Chabaan, Rakan (Farmington Hills, MI); Jaber, Nabih (Southfield, MI); Lu, Meihui (Shanghai, CN); Hua, Kun (Southfield, MI), filed on , was made available online on , according to news reporting originating from Washington, D.C., by VerticalNews correspondents.

This patent application is assigned to Kia Motors Corporation (Seoul, South Korea).

The following quote was obtained by the news editors from the background information supplied by the inventors: "Voice recognition-enabled applications have become increasingly common in modern vehicles. Such technology allows for the driver of a vehicle to perform in-vehicle functions typically requiring the use of hands, such as making a telephone call or selecting music to play, by simply uttering a series of voice commands. This way, the driver's hands can remain on the steering wheel and the driver's gaze can remain directed on the road ahead, thereby reducing the risk of accidents.

"'Hands-free' communication in vehicles is commonly implemented using Bluetooth, which is a short range wireless communication that operates in the Industrial Scientific and Medical (ISM) band at 2.4 to 2.485 GHz. Bluetooth is designed for low-power consumption and replaces standard wire-based communications using low-cost transceiver microchips equipped in each compatible device. Among other things, Bluetooth allows drivers to pair their mobile phones with the vehicles' audio system and establish hands-free calls utilizing the vehicles' audio system.

"Voice recognition, or speech recognition, applications can utilize Bluetooth to acquire a speech signal, recognize spoken language within the signal, and translate the spoken language into text or some other form which allows a computer to act on recognized commands. Various models and techniques for performing voice recognition exist, such as the Autoregressive (AR) model, hidden Markov models, dynamic time warping, and neural networks, among others. There are various advantages to each voice recognition model, including greater computational efficiency, increased accuracy, improved speed, and so forth.

"Common to all voice recognition approaches is the process of acquiring speech signals from a user. When voice recognition is attempted in a noisy environment, however, performance often suffers due to environmental noises muddying the speech signals from the user. Such problems arise when performing voice recognition in a vehicle, as several sources of noise exist due to vehicular dynamics inside of the vehicle (e.g., the engine, radio, turn signal indicator, window/sunroof adjustments, heating, ventilation, and air conditioning (HVAC) fan, etc.) as well as outside of the vehicle (e.g., wind, rain, passing vehicles, road features such as pot holes, speed bumps, etc.). As a result, the cabin of the vehicle often contains a mixture of different noises, each with different characteristics (e.g., position, direction, pitch, volume, duration, etc.). The result is degraded audio quality in hands-free Bluetooth-based conversations and poor voice recognition accuracy."

In addition to the background information obtained for this patent application, VerticalNews journalists also obtained the inventors' summary information for this patent application: "The present disclosure provides techniques for enhancing audio signal quality and, more particularly, noise reduction for voice communication over Bluetooth. Two different noise estimation techniques, log-energy voice activity detection (VAD) and first-in, first-out (FIFO), are employed in conjunction with Wiener filtering. Both noise estimation techniques have advantages under different noisy conditions. Particularly, it has been observed, based on the performance of these techniques, that log-energy VAD is more effective at higher signal-to-noise ratios (SNR) than FIFO, whereas FIFO is more effective at lower SNR than log-energy VAD. Thus, the present disclosure describes an optimized, adaptive approach to noise reduction that combines log-energy VAD and FIFO techniques with Wiener filtering. The result is a new signal filtering algorithm which improves upon conventional Wiener filtering.

"According to embodiments of the present disclosure, an audio signal enhancement method includes: acquiring an audio signal; estimating a signal-to-noise ratio (SNR) of an audio frame of the audio signal; determining a SNR threshold for the audio frame; selecting an audio signal processing technique according to a comparison of the SNR threshold to the estimated SNR of the audio frame; filtering the audio frame using a Wiener filter applying the selected signal processing technique; and outputting the audio frame filtered using the Wiener filter applying the selected signal processing technique. A first-in, first-out (FIFO) signal processing technique is selected when the estimated SNR of the audio frame is less than the SNR threshold, and a log-energy voice activity detection (VAD) signal processing technique is selected when the estimated SNR of the audio frame is greater than the SNR threshold.

"Correlation coefficients of the FIFO signal processing technique and the log-energy VAD signal processing technique can measure a correlation between a clean signal and respective output signals of the FIFO signal processing technique and the log-energy VAD signal processing technique. In this regard, the audio signal enhancement method may further include calculating the correlation coefficients of the FIFO signal processing technique and the log-energy VAD signal processing technique, respectively. Also, the SNR threshold is a SNR value at which the correlation coefficients of the FIFO signal processing technique and the log-energy VAD signal processing technique, respectively, are the same.

"The determining of the SNR threshold may include: estimating a noise level in an environment in which the audio signal was acquired; and determining the SNR threshold based on the estimated noise level. The estimating of the noise level may include estimating the noise level using the FIFO signal processing technique. The estimating of the noise level may also include: determining one or more environmental conditions present when the audio signal is acquired; and estimating the noise level based on the one or more environmental conditions. The one or more environmental conditions may include one or more of a vehicle speed, a fan speed, a weather condition, whether a vehicle window is open, revolutions per minute (RPM) of an engine, and a volume of media being played.

"The audio signal enhancement method may further include referencing a look-up table to determine the SNR threshold based on the estimated noise level. The audio signal enhancement method may even further include: measuring SNR thresholds across a plurality of noise levels; and generating the look-up table using the measured SNR thresholds across the plurality of noise levels.

"The SNR threshold may vary according to a noise level in an environment in which the audio signal was acquired.

"The estimating of the SNR may include estimating the SNR of the audio frame using the FIFO signal processing technique.

"Additionally, the audio signal enhancement method may further include dividing the acquired audio signal into a plurality of audio frames, where the audio frame is one of the plurality of audio frames. In this regard, the steps of estimating the SNR, determining the SNR threshold, selecting the audio signal processing technique, filtering the audio frame using the Wiener filter applying the selected signal processing technique, and outputting the audio frame using the Wiener filter applying the selected signal processing technique can be performed for each of the plurality of audio frames.

"Also, the acquired audio signal may include a combination of noise and speech. The outputted filtered audio frame may include speech from which noise present in the audio frame is removed.

"The audio signal enhancement method may further include converting the audio frame into a frequency domain using a fast Fourier transform (FFT) before the filtering of the audio frame.

"In addition, the audio signal may be acquired via Bluetooth, and the audio signal may be acquired in a vehicle.

"Furthermore, according to embodiments of the present disclosure, an audio signal enhancement apparatus includes: an audio acquisition unit acquiring an audio signal in a vehicle; and a control unit equipped in the vehicle configured to: estimate a signal-to-noise ratio (SNR) of an audio frame of the audio signal; determine a SNR threshold for the audio frame; select an audio signal processing technique according to a comparison of the SNR threshold to the estimated SNR of the audio frame; filter the audio frame using a Wiener filter applying the selected signal processing technique; and output the audio frame filtered using the Wiener filter applying the selected signal processing technique. A first-in, first-out (FIFO) signal processing technique is selected when the estimated SNR of the audio frame is less than the SNR threshold, and a log-energy voice activity detection (VAD) signal processing technique is selected when the estimated SNR of the audio frame is greater than the SNR threshold.

"Furthermore, according to embodiments of the present disclosure, a non-transitory computer readable medium containing program instructions for performing an audio signal enhancement method includes: program instructions that estimate a signal-to-noise ratio (SNR) of an audio frame of an acquired audio signal; program instructions that determine a SNR threshold for the audio frame; program instructions that select an audio signal processing technique according to a comparison of the SNR threshold to the estimated SNR of the audio frame; program instructions that filter the audio frame using a Wiener filter applying the selected signal processing technique; and program instructions that output the audio frame filtered using the Wiener filter applying the selected signal processing technique. A first-in, first-out (FIFO) signal processing technique is selected when the estimated SNR of the audio frame is less than the SNR threshold, and a log-energy voice activity detection (VAD) signal processing technique is selected when the estimated SNR of the audio frame is greater than the SNR threshold."

The claims supplied by the inventors are:

"1. An audio signal enhancement method comprising: acquiring an audio signal; estimating a signal-to-noise ratio (SNR) of an audio frame of the audio signal; determining a SNR threshold for the audio frame; selecting an audio signal processing technique according to a comparison of the SNR threshold to the estimated SNR of the audio frame, wherein a first-in, first-out (FIFO) signal processing technique is selected when the estimated SNR of the audio frame is less than the SNR threshold, and a log-energy voice activity detection (VAD) signal processing technique is selected when the estimated SNR of the audio frame is greater than the SNR threshold; filtering the audio frame using a Wiener filter applying the selected signal processing technique; and outputting the audio frame filtered using the Wiener filter applying the selected signal processing technique.

"2. The audio signal enhancement method of claim 1, wherein correlation coefficients of the FIFO signal processing technique and the log-energy VAD signal processing technique measure a correlation between a clean signal and respective output signals of the FIFO signal processing technique and the log-energy VAD signal processing technique.

"3. The audio signal enhancement method of claim 2, further comprising: calculating the correlation coefficients of the FIFO signal processing technique and the log-energy VAD signal processing technique, respectively.

"4. The audio signal enhancement method of claim 2, wherein the SNR threshold is a SNR value at which the correlation coefficients of the FIFO signal processing technique and the log-energy VAD signal processing technique, respectively, are the same.

"5. The audio signal enhancement method of claim 1, wherein the determining of the SNR threshold comprises: estimating a noise level in an environment in which the audio signal was acquired; and determining the SNR threshold based on the estimated noise level.

"6. The audio signal enhancement method of claim 5, wherein estimating of the noise level comprises: determining one or more environmental conditions present when the audio signal is acquired; and estimating the noise level based on the one or more environmental conditions.

"7. The audio signal enhancement method of claim 6, wherein the one or more environmental conditions include one or more of a vehicle speed, a fan speed, a weather condition, whether a vehicle window is open, revolutions per minute (RPM) of an engine, and a volume of media being played.

"8. The audio signal enhancement method of claim 5, wherein estimating of the noise level comprises: estimating the noise level using the FIFO signal processing technique.

"9. The audio signal enhancement method of claim 5, further comprising: referencing a look-up table to determine the SNR threshold based on the estimated noise level.

"10. The audio signal enhancement method of claim 9, further comprising: measuring SNR thresholds across a plurality of noise conditions; and generating the look-up table using the measured SNR thresholds across the plurality of noise levels.

"11. The audio signal enhancement method of claim 1, wherein the SNR threshold varies according to noise conditions in an environment in which the audio signal was acquired.

"12. The audio signal enhancement method of claim 1, wherein the estimating of the SNR comprises: estimating the SNR of the audio frame using the FIFO signal processing technique.

"13. The audio signal enhancement method of claim 1, further comprising: dividing the acquired audio signal into a plurality of audio frames, wherein the audio frame is one of the plurality of audio frames.

"14. The audio signal enhancement method of claim 13, wherein the steps of estimating the SNR, determining the SNR threshold, selecting the audio signal processing technique, filtering the audio frame using the Wiener filter applying the selected signal processing technique, and outputting the audio frame using the Wiener filter applying the selected signal processing technique are performed for each of the plurality of audio frames.

"15. The audio signal enhancement method of claim 1, wherein the acquired audio signal includes a combination of noise and speech.

"16. The audio signal enhancement method of claim 1, wherein the outputted filtered audio frame includes speech from which noise present in the audio frame is removed.

"17. The audio signal enhancement method of claim 1, further comprising: converting the audio frame into a frequency domain using a fast Fourier transform (FFT) before the filtering of the audio frame.

"18. The audio signal enhancement method of claim 1, wherein the audio signal is acquired via Bluetooth.

"19. The audio signal enhancement method of claim 1, wherein the audio signal is acquired in a vehicle.

"20. An audio signal enhancement apparatus comprising: an audio acquisition device acquiring an audio signal in a vehicle; and a control unit equipped in the vehicle configured to: estimate a signal-to-noise ratio (SNR) of an audio frame of the audio signal; determine a SNR threshold for the audio frame; select an audio signal processing technique according to a comparison of the SNR threshold to the estimated SNR of the audio frame, wherein a first-in, first-out (FIFO) signal processing technique is selected when the estimated SNR of the audio frame is less than the SNR threshold, and a log-energy voice activity detection (VAD) signal processing technique is selected when the estimated SNR of the audio frame is greater than the SNR threshold; filter the audio frame using a Wiener filter applying the selected signal processing technique; and output the audio frame filtered using the Wiener filter applying the selected signal processing technique.

"21. A non-transitory computer readable medium containing program instructions for performing an audio signal enhancement method, the computer readable medium comprising: program instructions that estimate a signal-to-noise ratio (SNR) of an audio frame of an acquired audio signal; program instructions that determine a SNR threshold for the audio frame; program instructions that select an audio signal processing technique according to a comparison of the SNR threshold to the estimated SNR of the audio frame, wherein a first-in, first-out (FIFO) signal processing technique is selected when the estimated SNR of the audio frame is less than the SNR threshold, and a log-energy voice activity detection (VAD) signal processing technique is selected when the estimated SNR of the audio frame is greater than the SNR threshold; program instructions that filter the audio frame using a Wiener filter applying the selected signal processing technique; and program instructions that output the audio frame filtered using the Wiener filter applying the selected signal processing technique."

For the URL and more information on this patent application, see: Ali, Mahdi; Zhou, Xuan; Chabaan, Rakan; Jaber, Nabih; Lu, Meihui; Hua, Kun. Audio Signal Quality Enhancement Based On Quantitative SNR Analysis And Adaptive Wiener Filtering. Filed and posted . Patent URL: http://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PG01&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.html&r=1&f=G&l=50&s1=%2220180277135%22.PGNR.&OS=DN/20180277135&RS=DN/20180277135

(Our reports deliver fact-based news of research and discoveries from around the world.)
