Microsoft’s vision for Cortana continues to evolve, and according to a new patent, the next step for the digital assistant could include new capabilities to better read and summarize emails, text messages and other types of communications for users on the go.
Drawings for the patent, first spotted by Windows United, show a jogger getting a rundown of important messages from Cortana through headphones — presumably the Surface Headphones or the upcoming Surface Earbuds — including a message from the boss, a reminder to check into a flight and a text message asking for a resume.
Microsoft has already made progress in Cortana’s ability to read and relay messages with its new Play My Emails feature for Outlook. However, the information described in the patent takes that ability to the next level by giving Cortana new capabilities to pull important points out of long messages and summarize them.
This is necessary, Microsoft says in the patent, because the human brain struggles to digest long, complex messages read aloud. Hence the need for short summaries.
Today, Play My Emails is available only on iOS, though Microsoft said when it announced the feature that it would soon come to Android as well. For now it works only in Outlook, but the patent shows that Microsoft eventually wants its digital assistant to be able to read and summarize content from other sources.
According to the patent, Cortana would use machine learning and artificial intelligence to analyze message data from different sources — Microsoft Teams, Skype, WhatsApp, Twitter, emails, phone calls and text messages are all illustrated in the drawings — to learn the meaning of each message. Cortana would score the messages and generate a text summary that would be turned into speech and sent to a listening device. That could be a phone, car, headphones or smart speaker.
We’ve reached out to Microsoft for comment and will update this post if we hear back.
GeekWire Editor Todd Bishop tried out the Play My Emails feature last month and called it a “surprisingly useful and well-executed tool for keeping on top of your email.” It does have some glaring limitations, such as the inability to forward or label/categorize messages using voice commands. Here are the basics of how it works:
When reading messages, Cortana follows this general pattern, “[In this time frame], [name of person] [sent/replied/forwarded] an email about [subject] to you and [number or name of other recipients.]”
Cortana then provides a sense of how long the message is before reading it. “It’s a long one,” is the phrase used before reading a message that will take more than 30 seconds. When it will take more than a minute, she warns, “It’s a really long one.” (Messages from my colleague and GeekWire co-founder John Cook tend to come with this warning.)
When Cortana is reading the text of the email, a simplified overlay on the screen resembles a music app, with play/pause, archive and flag buttons underneath the subject line and a photo or initials of the sender, providing an alternative to voice commands. You can swipe back and forth on the screen to navigate between messages if you’re holding your phone, or if not, you can give voice commands to go back or forward in the queue of messages.
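For readers who think in code, the announcement pattern and length warnings described above can be sketched in a few lines of Python. This is only an illustration of the behavior as observed in the review, not Microsoft's implementation: the function names and the 150-words-per-minute speaking rate are assumptions, and only the 30-second and one-minute thresholds and the quoted phrases come from the description above.

```python
from typing import Optional

# Assumed speaking rate (hypothetical); the review gives only the
# 30-second and one-minute thresholds, not how duration is estimated.
WORDS_PER_MINUTE = 150


def estimated_seconds(body: str, words_per_minute: int = WORDS_PER_MINUTE) -> float:
    """Estimate how long a message takes to read aloud from its word count."""
    return len(body.split()) / words_per_minute * 60


def length_warning(seconds: float) -> Optional[str]:
    """Return the spoken warning for a long message, or None for a short one."""
    if seconds > 60:
        return "It's a really long one."
    if seconds > 30:
        return "It's a long one."
    return None


def announcement(time_frame: str, sender: str, action: str,
                 subject: str, others: str) -> str:
    """Fill in the general announcement pattern quoted in the review."""
    return f"{time_frame}, {sender} {action} an email about {subject} to you and {others}."
```

Under these assumptions, a message estimated at 45 seconds would be introduced with "It's a long one," while anything estimated at 30 seconds or less would be read without a warning.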
Microsoft says “Play My Emails” is the first Cortana feature to be released as “an integral part of the Office 365 core experience.” It is part of a broader effort by Microsoft to focus Cortana on productivity across a variety of devices, after giving up on the notion of competing head-to-head with the likes of Siri and Alexa as an all-purpose voice assistant.
Under CEO Satya Nadella, Microsoft hasn’t been afraid to de-emphasize areas that aren’t paying off, or make a major pivot to re-invigorate a stagnant product, and the company is in the midst of that with Cortana. In a Windows update earlier this year, Microsoft separated Cortana from search and added the option to mute it during setup.
The new features shown in the patent fit with a vision for Cortana the company first teased earlier this year at its Build developer conference. Microsoft displayed a more free-flowing conversation between a human and virtual assistant that went beyond the typical single command and response and included shuffling around a schedule and creating new meetings on the go.
Microsoft’s acquisition of “conversational AI” startup Semantic Machines last year is playing a significant role in the evolution of Cortana. Semantic Machines aims to advance the state of voice-based AI from understanding and responding to single commands to having complete conversations. The company was built by accomplished startup entrepreneurs, a former chief speech scientist for Apple’s Siri and leading AI researchers and professors from Stanford and the University of California at Berkeley.
Today’s digital assistants are limited in their capabilities and “aren’t focused on learning how to do new things, or mixing and matching the things they already know in order to support new contexts,” Semantic Machines Co-founder and Microsoft Technical Fellow Dan Klein wrote in a blog post in May. He added that Semantic Machines teaches virtual assistants about context so they can grow beyond simple questions and answers.