Friday, May 18, 2018

Answering the Phone, with AI



I have often said that one of my goals in court automation was to enable courts to “answer the phone.”  Many years ago, I was called by an elected clerk of court from a large metropolitan county.  Unfortunately, they did not leave a direct telephone number, only their general office number.  I tried to return the call, and their automated answering system put me on hold.  I thought it was an important call to return and waited for over a half-hour before their system disconnected me.  Frustrating for me, and it would have been worse for anyone with actual business before that office.  So, when I saw the Google Duplex AI system demo, I became very interested.




---

At the recent Google I/O Conference for developers, one of the biggest news items was the demonstration of their (still in development) AI-based Duplex voice technology.  A four-minute YouTube video shows the demonstration.

In short, the demonstration presented an automated assistant calling a hair salon and interacting with the human scheduler in a completely natural way to make an appointment.  For me, this was similar to the first time I saw a spreadsheet (’79), the first time I saw a graphical user interface (’82), the first time I saw an Internet browser (’93) kind of moment.  Significant.

The Google AI blog provided additional explanation of how they researched and developed both natural language understanding and natural-sounding voices.

The blog post explains:
“Google Duplex’s conversations sound natural thanks to advances in understanding, interacting, timing, and speaking. 
At the core of Duplex is a recurrent neural network (RNN) designed to cope with these challenges, built using TensorFlow Extended (TFX). To obtain its high precision, we trained Duplex’s RNN on a corpus of anonymized phone conversation data. The network uses the output of Google’s automatic speech recognition (ASR) technology, as well as features from the audio, the history of the conversation, the parameters of the conversation (e.g. the desired service for an appointment, or the current time of day) and more. We trained our understanding model separately for each task, but leveraged the shared corpus across tasks. Finally, we used hyperparameter optimization from TFX to further improve the model.”
There are also pictures in the blog post to help to explain this (as I am not sure what all this means either).
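To make the quoted description a little more concrete: the essential idea of a recurrent neural network is that each conversational turn updates a hidden state that carries the history of the conversation forward. The toy sketch below is purely illustrative and is not Google’s implementation; the fixed weights, feature vectors, and function names are all assumptions for demonstration (a real model learns its weights from training data and uses far richer inputs, such as ASR output and audio features).

```python
import math

def rnn_step(x, h, W_xh, W_hh):
    """One recurrent step: compute a new hidden state from the
    current turn's input features and the previous hidden state."""
    return [math.tanh(sum(wx * xi for wx, xi in zip(W_xh[j], x)) +
                      sum(wh * hi for wh, hi in zip(W_hh[j], h)))
            for j in range(len(h))]

def run_rnn(inputs, hidden_size=3):
    # Fixed toy weights for illustration only; a trained model
    # would learn these from a corpus of conversations.
    W_xh = [[0.5] * len(inputs[0]) for _ in range(hidden_size)]
    W_hh = [[0.1] * hidden_size for _ in range(hidden_size)]
    h = [0.0] * hidden_size
    for x in inputs:  # each x = features for one conversational turn
        h = rnn_step(x, h, W_xh, W_hh)
    return h

# Hypothetical per-turn feature vectors; in the blog post's terms, these
# would combine ASR output with conversation parameters (e.g. the desired
# service, the time of day).
turns = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
state = run_rnn(turns)
```

The final hidden state summarizes the whole exchange, which is what lets such a model respond to a scheduler’s question in light of everything said earlier in the call.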

Now for courts.  There was immediate commentary and objection to the idea of talking with a computer without knowing that fact.  OK, I suppose that having a computer call you and sound like an intelligent person could be concerning.  It doesn’t bother me, but I can sympathize with that viewpoint.

But, on the other hand, I just want the system to answer my clerk’s office phone and help me find information and schedule my court time.  Calling Google Duplex would be a joy compared to sitting on the phone for a half-hour waiting for someone to respond.

In my opinion, this cannot come soon enough.

1 comment:

  1. James, great post! I agree with you wholeheartedly; this can't come soon enough. I also think this is part of an omni-channel approach to interacting with the courts. This might include text, IVR, email, mail, and phone, and it could be carried out by an AI, a bot, or a human. Not every question should be handled by a "computer," but one could triage, direct, and handle incoming requests to a certain level. Machines or computers could even operate after court hours in certain scenarios. This is similar to several contact center models in the commercial sector. In fact, I am sure I was getting an answer from a bot via chat about a PC issue yesterday, and it was helpful and easy.

    Obviously, processes and standards need to be put in place to ensure the technology does no harm. However, these types of tools are in use today, and Google Duplex holds a lot of promise.
