Microsoft speech recognition achieves the lowest-ever error rate in recent study

Published on September 13, 2016

Judging by the keynotes delivered at several developer conferences over the past year, companies hoping to lead the technology industry are focusing on three key areas: machine learning, artificial intelligence, and speech recognition.

Ideally, the avenues of machine learning, speech recognition, and artificial intelligence will intersect and create a seamless experience for users who opt to communicate through burgeoning digital assistants or applications that rely heavily on cloud-connected data.

Fortunately for Microsoft, its stake in a digital assistant that uses speech recognition is starting to pay off as the company achieves a new milestone in human and machine interactions.

According to a recent benchmark evaluation reported by Microsoft’s chief speech scientist Xuedong Huang, the company has recorded its lowest word error rate (WER) to date. On the industry-standard Switchboard speech recognition task, Microsoft researchers achieved a WER of 6.3 percent, which currently stands as the lowest in the industry.
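For readers unfamiliar with the metric, WER is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the recognizer's output into the reference transcript, divided by the number of words in the reference. The Python sketch below illustrates that standard definition with a word-level edit distance. It is not Microsoft's evaluation code, and the function name `word_error_rate` and the sample sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Illustrative WER: word-level edit distance / reference length.

    Hypothetical helper for explanation only; real benchmark scoring
    typically also normalizes text before comparing.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # matching i reference words to nothing costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one of six reference words is dropped by the recognizer.
print(round(word_error_rate("the cat sat on the mat", "the cat sat on mat"), 3))  # 0.167
```

By this measure, a WER of 6.3 percent means roughly six word errors for every hundred words in the reference transcripts.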

The Microsoft researchers behind the new speech recognition feat attribute their success to foundations built on neural networks. Earlier this year, Microsoft researchers also won the ImageNet computer vision challenge with work that drew on neural networks. By using Microsoft’s cross-layer network connections, researchers were able to use each layer to optimize the recognition and association of speech patterns, definitions, etc.

Another key contributor to the new low was Microsoft’s Computational Network Toolkit (CNTK). CNTK allowed researchers to apply sophisticated optimizations that sped up their learning algorithms.

Huang adds that the speech recognition milestone is a significant marker on Microsoft’s journey to deliver the best AI solutions for its customers. One component of that AI strategy is conversation as a platform (CaaP); Microsoft outlined its CaaP strategy at the company’s annual developer conference earlier this year.

Although it may have been said seven months ago and subsequently forgotten by most, Microsoft is betting on a future where voice will become the new ‘swipe’ and user interactions through vocal input should be as seamless as they are with touchscreens and apps.

Microsoft researchers seem well on their way to making a movie such as Her a reality rather than a bullet point in a PowerPoint presentation during a developer conference keynote.

To read more about the men and women helping to bring this project to light, or to find out more about how the low error rate was achieved, visit the Official Microsoft Blog for details.

Kareem Anderson

Networking & Security Specialist

Kareem is a journalist from the Bay Area, now living in Florida. His passion for technology and content creation is unmatched, driving him to create well-researched articles and incredible YouTube videos. He is always on the lookout for everything new about Microsoft, focusing on making easy-to-understand content and breaking down complex topics related to networking, Azure, cloud computing, and security.
