Powerful AI models are healthy for voice tech

Microsoft’s acquisition of Nuance for almost $20 billion signposted that speech-based services have strong prospects. Healthcare is a good indicator of what innovations voice tech will deliver next.
7 September 2022

Noteworthy alternative: adding information to medical records is one of many areas where speedy speech-powered services offer advantages over point-and-click and keyboard entry. Secure smartphone apps add to the convenience for users. Image credit: Shutterstock.

One of the biggest developments in speech recognition and voice-activated technology has been the massive improvement in language models. The ability to feed algorithms with huge amounts of training data, and apply techniques such as deep learning, has given developers access to a much richer set of statistical models. And because these models do a much better job of capturing the complexity of language, the applications built on top of them are like night and day compared with voice software that you may have used in the past.

“It’s made a fundamental difference to the user experience,” Simon Wallace, Chief Clinical Information Officer at Nuance, told TechHQ. “Speech has become a massive springboard to take technology into a new era.” In healthcare, trends include exploring where advanced tools such as conversational artificial intelligence (AI) can play a bigger role in supporting clinicians. And this vision of the future could be closer than we imagine.

Building on their successes in the medical sector, engineers at Nuance are looking at how technology – deployed with security features such as voice biometrics – can give even more time back to healthcare workers. Solutions being evaluated include hardware and software for diarizing health conversations between doctors and patients, which can quickly and accurately turn a variety of information into structured medical notes. If a patient points to their knee, for example, and mentions that the area is sore, the sensor array capturing the conversation can interpret whether it is the left or right knee. The result is a more precise description of the location of the pain when it is documented in the digital records.

Voice tech today

Over the past decade, speech recognition and voice-activated technology has proven to be a significant time-saver for clinicians. Features such as being able to dictate at the point of cursor represent the first stage of the transition. “We typically speak three times faster than we type,” said Wallace. “Systems also have templates for larger blocks of text, which further speeds up the process.” Nuance has been busy training its language models with a large body of scientific information, including medical dictionaries, so that products are well-equipped to cope with the complex terminology used in the medical profession.

Data packs – English has four versions, one each for the US, UK, Australia and Canada – manage localization scenarios such as differences in the pronunciation of a pharmaceutical drug across countries that share the same language. Thanks to the use of much more detailed base models – which have been created to serve a number of languages, not just English – clinicians can use Nuance’s system straight out of the box. And, as users have their own profiles, the language learning routines can continue to feed back and further tailor their operation. Today, voice characteristics learned from more than half a million clinical users (Wallace makes clear that the system doesn’t save any conversations directly) mean that AI tools are extremely well placed to serve the healthcare sector.
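To make the idea of a shared base model plus regional data packs more concrete, here is a minimal sketch of how a system might pick which resources to load for a given user locale. It is purely illustrative: the names `BASE_MODEL`, `DATA_PACKS`, and `resolve_models` are hypothetical and do not describe Nuance’s actual software; only the four English regions come from the article.

```python
from typing import Dict, List

# Hypothetical identifiers, invented for illustration only.
BASE_MODEL = "base-en"
DATA_PACKS: Dict[str, str] = {
    "en-US": "pack-en-us",
    "en-GB": "pack-en-gb",
    "en-AU": "pack-en-au",
    "en-CA": "pack-en-ca",
}


def resolve_models(locale: str) -> List[str]:
    """Return the resources to load: the base model first, then a
    regional data pack if one exists for the requested locale."""
    # Accept both "en_gb" and "en-GB" style locale strings.
    language, _, region = locale.replace("_", "-").partition("-")
    key = f"{language.lower()}-{region.upper()}"
    pack = DATA_PACKS.get(key)
    return [BASE_MODEL] + ([pack] if pack else [])
```

In this sketch, a locale with no regional pack simply falls back to the base model, which mirrors the article’s point that the system is usable out of the box before any further tailoring.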

Besides dictation, voice tools can be programmed to navigate the various clicks and tabs on hospital systems to order a hip x-ray, request a renal ultrasound, or schedule a postoperative care appointment – to give a few examples of common admin tasks that accompany surgical procedures. Site visits are good opportunities to identify where software is able to make the most impact. “We’ll sit down with healthcare professionals and look at their workflows to find the pain points,” said Wallace. He enjoys seeing customers smile when they realize that voice-activated step-by-step commands can replace minutes of clicking buttons and navigating tabs, and easily save users big chunks of time over their shifts. Practically, those savings can allow more patient contact time as well as giving clinicians some opportunity to rest and take a breath – important for a sector that has high rates of staff burnout.
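For readers curious about what a step-by-step voice command might look like under the hood, the sketch below maps a spoken phrase to a fixed sequence of interface actions. Everything in it is an assumption made for illustration: the `UiStep` type, the `MACROS` table, and the tab and button names are hypothetical and are not taken from any real hospital system or Nuance product.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class UiStep:
    action: str  # e.g. "select_tab" or "click"
    target: str  # hypothetical name of a tab or button


# Hypothetical macros: one spoken phrase stands in for several manual steps.
MACROS: Dict[str, List[UiStep]] = {
    "order hip x-ray": [
        UiStep("select_tab", "Orders"),
        UiStep("click", "Imaging"),
        UiStep("click", "Hip X-ray"),
        UiStep("click", "Submit"),
    ],
    "request renal ultrasound": [
        UiStep("select_tab", "Orders"),
        UiStep("click", "Imaging"),
        UiStep("click", "Renal Ultrasound"),
        UiStep("click", "Submit"),
    ],
}


def normalize(utterance: str) -> str:
    """Lower-case the recognized text and collapse extra whitespace."""
    return " ".join(utterance.lower().split())


def steps_for(utterance: str) -> List[UiStep]:
    """Return the UI steps for a recognized phrase, or an empty list
    if the phrase does not match any known macro."""
    return MACROS.get(normalize(utterance), [])
```

A single utterance such as "Order hip x-ray" resolves to four interface actions here, which is the kind of substitution that saves clinicians time over a shift. A real deployment would also need confirmation prompts and error handling, which this sketch omits.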

Cloud certified

There are other useful features too, such as cloud deployment, which means that authorized users can access voice technology from anywhere. During the pandemic, when some healthcare workers had to work from home, it meant that records could still be updated and managed using voice commands. Healthcare has proven to be a rich training ground for AI-powered speech recognition and voice-activated technology. Being able to get to grips with complex, highly technical language is definitely a win for application providers. As is building systems that meet tough security criteria, which is necessary to safeguard patient data. Wallace, who has worked as a GP and hospital doctor himself, notes the various standards that voice tech providers must comply with. The list includes DCB0129 – prepared by the UK’s NHS Digital Clinical Safety team – which is an information standard designed to help manufacturers of health IT software evidence the clinical safety of their products.

Unlocking the productivity gains of moving from keyboard to voice – which (to recap) include features such as dictation at the point of cursor, templates for inserting frequently used text, and step-by-step commands for combining mouse clicks and tabbed navigation – benefits a wide range of roles in healthcare. And that’s not forgetting emerging upgrades powered by developments in conversational AI supported by smart sensors. It’s a compelling case study, and the medical sector is by no means the only area where speech recognition and voice-activated technology is delivering operational gains.

Big players have noted the role that voice tech can play in accelerating digital transformation. Today, providers such as Nuance, which is now part of Microsoft, serve clients operating in telecoms, financial services, retail, and government – to touch on just a few of the industries where speech-enabled services are adding value.