Chatbots & Cloud CRM Are Replacing the Call Center Using AI Algorithms


Telemarketing, telecommunications, and service call centers traditionally relied on human operators interacting with customers over the phone. For many years that was the only channel, until the Web and e-mail took off. Over time, more and more service centers outsourced customer service to countries where labor was cheaper (there are also companies specialising in such services).

For many years, customer service meant calling a service center and asking for help with a system problem or an error in the software being used. A call center representative would try to help the customer over the phone and, if that did not work, route them to a technician or specialist who could potentially access their system remotely. Customers could spend hours on the phone, always at risk of losing the connection; widespread cell phone usage alleviated this somewhat.

Synchronous chat and other online channels then took customer service to a higher, more individualised level. Today the practice is expanding further: artificial intelligence and machine learning, together with the cloud and related technologies, are replacing many of the humans sitting at these call centers.

Programs called chatbots are now deployed across websites and messaging services; even Facebook uses them in Messenger to help customers rather than relying on human operators waiting to take calls at all hours of the day and night, which is often not feasible.

A chatbot is available on demand and can serve as the first line of help for a customer accessing the system from a different geographical region, even when it is the middle of the night where the company is based. It can also aid customers through text, voice, and visual cues.

Chatbots are becoming more robust and effective because they apply AI and machine learning techniques to solve customers' needs and to better understand human language, including the questions people type. Some of you may recall entering a question or search query on Google and getting a completely different answer than you intended. Chatbots are getting better at weeding through questions that end users phrase in a variety of ways.
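
To make that concrete, here is a toy sketch, entirely illustrative and not tied to any product mentioned in this article, of how a bot can map differently worded questions onto the same known intent using TF-IDF similarity in Python:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical FAQ intents a support chatbot might handle.
faq = {
    "reset_password": "How do I reset my password?",
    "billing_issue": "Why was I charged twice on my invoice?",
    "cancel_account": "How can I cancel my subscription?",
}

vectorizer = TfidfVectorizer()
faq_matrix = vectorizer.fit_transform(list(faq.values()))

def match_intent(question):
    # Score the question against every known FAQ phrasing and
    # return the intent whose wording is closest.
    scores = cosine_similarity(vectorizer.transform([question]), faq_matrix)[0]
    return list(faq.keys())[scores.argmax()]

# Differently worded questions land on the same intent.
print(match_intent("I need to reset the password on my account"))  # reset_password
print(match_intent("you charged my invoice twice this month"))     # billing_issue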

Besides chatbots, there is also contact center software available that allows companies to manage all their communications from a single source (or cloud-based CRM) and respond effectively to customers on demand. An example of this is Vocalcom. Communication is becoming much more digital, and Vocalcom also has an enterprise contact center service that is integrated with Salesforce.

Service center solutions such as these and CRM systems like Salesforce's allow companies to stay in touch with customers and, through smart business intelligence and analytics, stay ahead of their customers' needs. Another good example is Ujet. Both of these service center solutions rely on cloud integrations.

Machine learning and chatbots are driving much of today's global help desk and customer-to-business communication. It isn't always easy to hire interpreters or communication experts in a wide range of languages when a company expands into other geographical regions; a chatbot can do this on demand. It will also start to pick up on customer habits, what they ask for and how to help them effectively, as the languages and audiences change.

A smart solution for business today is to have integrated systems in place that include both chatbots and cloud service centers on demand. Ujet has an interesting solution that allows for real-time monitoring of agent activity, intelligent call routing logic, and the ability for supervisors to problem-solve before issues arise, which is smart business intelligence. Personalization is also key, as companies become more tightly focused and specialized on particular market segments and customer needs.

“Customers can contact your call center, schedule an agent callback or launch a chat session directly from your website or mobile app,” according to Ujet’s website. “They can use their smartphones to share visuals that help agents better understand their issues. And service is speedier because agents already have key details about them, and their potential needs, right at the start of a call or chat.”

Ujet is referring to traditional chat communications between clients and companies; however, a chatbot can also be integrated into such a solution, both for times when company reps are not at work and as a first-step communication mechanism for companies experiencing heavy customer traffic.

Companies can implement multiple solutions today with the aid of SaaS and leverage various tools available from the cloud or CRM systems. They can then start integrating machine learning, such as chatbots, and gain various ways to serve clients through effective communication. Ujet has more information about the potential of such integrations in customer service on its About Us page. It summarizes this form of customer support well:

Customers can easily reach contact centers by phone, web and mobile app and use the power of their smartphones to help explain their problems. Agents have the contextual information they need to assist customers quickly, as well as more time to focus on resolving their issues. And supervisors have intuitive tools and real-time intelligence to get the best performance from their teams, increase customer satisfaction scores and help grow the businesses they serve.

In terms of the first layer of communication, I argue this is where the chatbot should come in, at least for international companies with customers worldwide and those experiencing heavy traffic spikes (where call centers may be slow to respond and even cloud contact centers can struggle to address customer complaints on an individual level, for example when there are too few employees and too many emails and chat sessions).

Salesforce has an interesting video embedded on its site, titled Chatbots: Using AI to Deliver the Future of Customer Service, about chatbots and their use of machine learning advancements. You can watch it there and learn why chatbots are hot to implement right now and how they will only improve: as machine learning advances, and the longer a chatbot is active, the more effective it becomes as a tool for customer communication and service.

Companies can even take their own ideas for chatbots and deploy a chatbot customized for their needs from scratch on Botpress. This is an "on-prem, open-source bot building platform for businesses," according to its website.

Some of its key highlights include natural language understanding (NLU); analytics; flow editing (the ability to edit the conversational flow); multi-channel support across various messaging channels; an authoring UI (it ships with its own GUI for non-developers); and an SDK along with various extensible, customizable APIs.

This is just one example of the many solutions out there for creating a comprehensive chatbot with natural language processing capabilities. Oracle has one as well that relies on JavaScript, where the chatbot can be deployed on localhost or from Oracle's cloud servers. It is called Oracle Intelligent Bots.

I have experience creating chatbot variants on three different platforms, which I covered in depth in technical guides for Pluralsight: Microsoft Azure, Motion AI (now HubSpot), and IBM Watson.

Natural language is not easy to interpret or understand, as the same question can be asked a thousand different ways. This is where machine learning is being pushed hardest: Google Search, for instance, relies on machine learning to interpret search queries, as do chatbots. Combining chatbots with effective CRM tools that leverage the cloud and with cloud customer communication solutions is a great way for a company to serve customers on a personalized level and identify problems before they arise at massive scale. Customer service is not just becoming more personalized, but data-driven and intelligent.

Source: https://chatbotslife.com/chatbots-cloud-crm-are-replacing-the-call-center-37ee2f10555c?source=rss----a49517e4c30b---4

Trump’s tweets blur the boundary between reality and fiction.

Guido Meijer

We live in an age of misinformation in which it is becoming increasingly difficult to discern truth from falsehood. Part of the blame rests with technological advances conceived in the age of internet bubbles: neural networks which can fabricate deep-fake videos, and bots which flood the internet with fake news. In other cases the blame lies with individual people who intentionally spread misleading or outright deceitful messages, because they personally benefit from confusion about what is real and what is not. One of these people is the current President of the United States. Mr. Trump routinely discredits news sources which are widely regarded as trustworthy. Some examples of news outlets he labelled as “fake news media” are The New York Times, CNN, and The Washington Post. It’s hardly a coincidence that there is a large overlap between sources which are deemed fake and those which disagree with Mr. Trump’s political views.

To amplify his (often misleading) statements, Mr. Trump regularly takes to Twitter as a platform. For example, he has falsely claimed that mail-in ballots lead to a fraudulent election (they do not), and that he fired Marine Corps general James Mattis (he resigned). In Twitter’s defense, the company has labelled several of Mr. Trump’s tweets as containing false information. Beyond the accuracy of Mr. Trump’s tweets, the style of writing is often exceptionally un-presidential, with myriad exclamation marks and random capitalization. Misleading content written in an outlandish style has become his distinctive signature, and has incentivized many to start parody Twitter accounts which jokingly mimic Mr. Trump. This has created yet another blurred boundary between truth and fiction on top of an already opaque situation. At times, I have been hard pressed to determine whether a tweet was real or a parody. That’s when I decided to see if a machine learning algorithm would be able to make this distinction for me.

I scraped 1000 tweets from Mr. Trump himself and 500 tweets from two parody accounts (@realDonaldTrFan and @RealDonalDrumpf) to train a machine learning algorithm (see the Github repository for details) to tell the difference between the real and the parody tweets. The trained classifier performed remarkably well and was able to correctly classify 88% of new tweets as either real or fake. Unfortunately it doesn’t quite reach human-level performance yet. My girlfriend was kind and bored enough to sit behind a laptop and classify 100 tweets for me, and got 96% correct. Only twice did I hear “He actually **expletives** said that?!”.
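
The linked GitHub repository contains the actual pipeline; as a rough sketch of what such a tweet classifier can look like, a generic scikit-learn pipeline works along these lines (the load_scraped_tweets loader and the feature choices are my placeholders, not the author's code):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical loader: returns the scraped tweet texts and labels
# (1 = real account, 0 = parody account).
tweets, labels = load_scraped_tweets()

X_train, X_test, y_train, y_test = train_test_split(
    tweets, labels, test_size=0.2, stratify=labels, random_state=42
)

# Character n-grams preserve stylistic quirks such as RANDOM capitalization
# and exclamation marks that word-level tokens would wash out.
model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4), lowercase=False),
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))  # the article reports 88%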

I took the trained classifier and put it into a Twitter bot; @real_fakeTrump was born. The bot shows you a tweet from either Mr. Trump himself or one of the parody accounts, you then have to guess if it’s real or fake. In the thread the bot then tells you its own prediction, the correct answer, and a link to the original tweet. Let’s dive in and look at a couple of examples, first a tweet which the classifier correctly identified then two where it was mistaken. The latter case is the most interesting because they are either parody tweets that could be real or tweets from Mr. Trump that are so absurd they cannot be distinguished from parody.

Let’s give it a go shall we? What do you think of this one:

If you thought this tweet was fake, you were sadly mistaken. The classifier correctly identified this tweet as real (but only with 55% certainty); it is indeed from Mr. Trump himself. Let’s try another one:

This is an interesting example of where the classifier made a mistake; it identified this tweet as fake but it was actually from Mr. Trump. This means that this tweet is very close to parody in its choice of words and sentence construction. However, because the algorithm builds a high-dimensional model to make its predictions, it is unfortunately not possible to pin down exactly how it reached its conclusion. OK, last one:

You might have guessed that this one is fake; it was indeed from the parody account @RealDonalDrumpf. The classifier, however, thought this one was so close to Mr. Trump’s style of tweeting that it predicted it was real.

At the moment, deep-fake videos and fake news articles written by bots are relatively easy to spot, although in some cases it is already impossible. Take for example the college student who created a productivity blog with content that was generated by the language model GPT-3. Nobody realized the posts weren’t written by a human, and one reached the top of Hacker News. As technology advances, it will become increasingly difficult to discern truth from fiction. Ironically, we might need to depend on AI to make this distinction for us.

Source: https://chatbotslife.com/trumps-tweets-blur-the-boundary-between-reality-and-fiction-6dab612570b9?source=rss----a49517e4c30b---4

Active learning workflow for Amazon Comprehend custom classification models – Part 1

Amazon Comprehend Custom Classification API enables you to easily build custom text classification models using your business-specific labels without learning ML. For example, your customer support organization can use Custom Classification to automatically categorize inbound requests by problem type based on how the customer has described the issue. You can use custom classifiers to automatically label support emails with appropriate issue types, route customer phone calls to the right agents, and categorize social media posts into user segments.

For custom classification, you start by creating a training job with a ground truth dataset comprising a collection of text and corresponding category labels. Upon completing the job, you have a classifier that can classify any new text into one or more named categories. When the custom classification model classifies a new unlabeled text document, it makes a prediction based on what it has learned from the training data. Sometimes you may not have a training dataset covering the full variety of language patterns, or, once you deploy the model, you start seeing completely new data patterns. In these cases, the model may not be able to classify these new data patterns accurately. How can we ensure continuous model training to keep it up to date with new data and patterns?

In this two-part blog series, we discuss an architecture pattern that allows you to build an active learning workflow for Amazon Comprehend custom classification models. The first post describes a workflow comprising real-time classification, feedback pipelines, and human review workflows using Amazon Augmented AI (Amazon A2I). The second post covers automated model building using the human-reviewed data, selecting the best model, and automated deployment of an endpoint for the chosen model.

Feedback loops play a pivotal role in keeping the models up to date. This feedback helps the models learn from their misclassifications and pick up the correct labels. This process of continuously teaching models through feedback and redeploying them is called active learning.

For every prediction Amazon Comprehend Custom Classification makes, it also gives a confidence score associated with its prediction. This architecture proposes that you set an acceptable threshold and only accept the predictions with a confidence score that exceeds the threshold. All the predictions that have a confidence score less than the desired threshold are flagged for human review. The human decides whether to accept the model’s prediction or correct it.

In some instances, the model may be confident about its predictions, but the classification might be wrong. In these scenarios, the end-user applications that receive the model predictions can request explicit feedback from its users on the prediction quality. A human moderator reviews this explicit feedback and reclassifies instances where the feedback was negative. This process of generating human-verified data and using it for model retraining helps keep the models up to date, reduce data drift, and achieve higher model accuracy.

Feedback workflow architecture

In this section, we discuss an architectural pattern for implementing an end-to-end active learning workflow for custom classification models in Amazon Comprehend using Amazon A2I. The active learning workflow comprises the following components:

  1. Real-time classification
  2. Feedback loops
  3. Human classification
  4. Model building
  5. Model selection
  6. Model deployment

The following diagram illustrates this architecture covering the first three components. In the following sections, we walk you through each step in the workflow.

Architecture Diagram for Feedback Loops

Real-time classification

To use custom classification in Amazon Comprehend, you need to create a custom classification job that reads a ground truth dataset from an Amazon Simple Storage Service (Amazon S3) bucket and builds a classification model. After the model builds successfully, you can create an endpoint that allows you to make real-time classifications of unlabeled text. This stage is represented by steps 1–3 in the preceding architecture:

  1. The end-user application calls an API Gateway endpoint with a text that needs to be classified.
  2. The API Gateway endpoint then calls an AWS Lambda function configured to call an Amazon Comprehend endpoint.
  3. The Lambda function calls the Amazon Comprehend endpoint, which returns the classification for the unlabeled text and a confidence score (a minimal sketch of this handler follows this list).
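
Here is a minimal sketch of what the Lambda handler behind steps 2 and 3 might look like, using the boto3 classify_document API (the event shape and the environment variable name are illustrative assumptions, not taken from the post's CloudFormation template):

import json
import os

import boto3

comprehend = boto3.client("comprehend")

def lambda_handler(event, context):
    # API Gateway proxies the client's JSON body through to the function.
    body = json.loads(event["body"])
    response = comprehend.classify_document(
        Text=body["sentence"],
        EndpointArn=os.environ["ENDPOINT_ARN"],  # illustrative variable name
    )
    # The endpoint returns candidate classes with confidence scores;
    # report the highest-scoring one.
    top = max(response["Classes"], key=lambda c: c["Score"])
    return {
        "statusCode": 200,
        "body": json.dumps({"label": top["Name"], "score": top["Score"]}),
    }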

Feedback collection

When the endpoint returns the classification and the confidence score during the real-time classification, you can send instances with low-confidence scores to human review. This type of feedback is called implicit feedback.

  4. The Lambda function sends the implicit feedback to an Amazon Kinesis Data Firehose delivery stream.

The other type of feedback is called explicit feedback and comes from the application’s end-users that use the custom classification feature. This type of feedback comprises the instances of text where the user wasn’t happy with the prediction. Explicit feedback can be sent either in real time through an API or via a batch process.

  5. End-users of the application submit explicit real-time feedback through an API Gateway endpoint.
  6. The Lambda function backing the API endpoint transforms the data into a standard feedback format and writes it to the Kinesis Data Firehose delivery stream.
  7. End-users of the application can also submit explicit feedback as a batch file by uploading it to an S3 bucket.
  8. A trigger configured on the S3 bucket invokes a Lambda function.
  9. The Lambda function transforms the data into the standard feedback format and writes it to the delivery stream (a sketch of this transformation step follows this list).
  10. Both the implicit and explicit feedback data gets sent to the delivery stream in a standard format, where it is buffered and written to an S3 bucket.
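
As a rough illustration of the transformation in steps 6 and 9, a Lambda function can normalize each record and buffer it into the delivery stream like this (the "standard feedback format" fields and the environment variable name are my assumptions, not the post's exact schema):

import json
import os

import boto3

firehose = boto3.client("firehose")

def write_feedback(classifier, sentence, label, source):
    # Normalize one piece of feedback into a common shape, tagging
    # whether it arrived implicitly or explicitly.
    record = {
        "classifier": classifier,
        "sentence": sentence,
        "label": label,
        "source": source,  # "implicit" or "explicit"
    }
    firehose.put_record(
        DeliveryStreamName=os.environ["FEEDBACK_STREAM_NAME"],
        Record={"Data": (json.dumps(record) + "\n").encode("utf-8")},
    )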

Human classification

The human classification stage includes the following steps:

  11. A trigger configured on the feedback bucket in Step 10 invokes a Lambda function.
  12. The Lambda function creates Amazon A2I human review tasks for all the feedback data received (see the sketch after this list).
  13. Workers assigned to the classification jobs log in to the human review portal and either approve the model’s classification or relabel the text with the right category.
  14. After the human review, all these instances are stored in an S3 bucket and used for retraining the models. Part 2 of this series covers the retraining workflow.
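
A minimal sketch of step 12 using the boto3 Amazon A2I runtime client follows; the InputContent keys are illustrative and must match whatever fields your worker task template renders:

import json
import uuid

import boto3

a2i = boto3.client("sagemaker-a2i-runtime")

def start_review(sentence, predicted_label, flow_definition_arn):
    # Each human loop presents one low-confidence or disputed
    # classification to a worker for review.
    a2i.start_human_loop(
        HumanLoopName="classify-feedback-" + str(uuid.uuid4()),
        FlowDefinitionArn=flow_definition_arn,  # the human review workflow ARN
        HumanLoopInput={
            "InputContent": json.dumps(
                {"taskObject": sentence, "predictedLabel": predicted_label}
            )
        },
    )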

Solution overview

The next few sections of the post go over how to set up this architecture in your AWS account. We classify news into four categories: World, Sports, Business, and Sci/Tech, using the AG News dataset for custom classification, and set up the implicit and explicit feedback loop. You need to complete two manual steps:

  1. Create an Amazon Comprehend custom classifier and an endpoint.
  2. Create an Amazon SageMaker private workforce, worker task template, and human review workflow.

After this, you run the provided AWS CloudFormation template to set up the rest of the architecture.

Prerequisites

Before you get started, download the dataset and upload it to Amazon S3. This dataset comprises a collection of news articles and their corresponding category labels. We have created a training dataset called train.csv from the original dataset and made it available for download.

The following screenshot shows a sample of the train.csv file.

CSV file representing the Training data set

After you download the train.csv file, upload it to an S3 bucket in your account for reference during training. For more information about uploading files, see How do I upload files and folders to an S3 bucket?

Creating a custom classifier and an endpoint

To create your classifier for classifying news, complete the following steps:

  1. On the Amazon Comprehend console, choose Custom Classification.
  2. Choose Train classifier.
  3. For Name, enter news-classifier-demo.
  4. Select Using Multi-class mode.
  5. For Training data S3 location, enter the path for train.csv in your S3 bucket, for example, s3://<your-bucketname>/train.csv.
  6. For Output data S3 location, enter the S3 bucket path where you want the output, such as s3://<your-bucketname>/.
  7. For IAM role, select Create an IAM role.
  8. For Permissions to access, choose Input and output (if specified) S3 bucket.
  9. For Name suffix, enter ComprehendCustom.

Comprehend Custom Classification Model Creation

  10. Scroll down and choose Train classifier to start the training process.

The training takes some time to complete. You can either wait to create an endpoint or come back to this step later after finishing the steps in the section Creating a private workforce, worker task template, and human review workflow.

Creating a custom classifier real-time endpoint

To create your endpoint, complete the following steps:

  1. On the Amazon Comprehend console, choose Custom Classification.
  2. From the Classifiers list, choose your custom model, news-classifier-demo.
  3. From the Actions drop-down menu, choose Create endpoint.
  4. For Endpoint name, enter classify-news-endpoint and give it one inference unit.
  5. Choose Create endpoint.
  6. Copy the endpoint ARN as shown in the following screenshot. You use it when running the CloudFormation template in a future step.

Custom Classification Model Endpoint Page

Creating a private workforce, worker task template, and human review workflow

This section walks you through creating a private workforce in Amazon SageMaker, a worker task template, and your human review workflow.

Creating a labeling workforce

  1. For this post, you create a private work team and add only one user (you) to it. For instructions, see Create a Private Workforce (Amazon SageMaker Console).
  2. Once the user accepts the invitation, add them to the workforce. For instructions, see the Add a Worker to a Work Team section of Manage a Workforce (Amazon SageMaker Console).

Creating a worker task template

To create a worker task template, complete the following steps:

  1. On the Amazon A2I console, choose Worker task templates.
  2. Choose Create template.
  3. For Template name, enter custom-classification-template.
  4. For Template type, choose Custom.
  5. In the Template editor, enter the following GitHub UI template code.
  6. Choose Create.

Worker Task Template

Creating a human review workflow

To create your human review workflow, complete the following steps:

  1. On the Amazon A2I console, choose Human review workflows.
  2. Choose Create human review workflow.
  3. For Name, enter classify-workflow.
  4. Specify an S3 bucket to store output: s3://<your bucketname>/.

Use the same bucket where you downloaded your train.csv in the prerequisite step.

  5. For IAM role, select Create a new role.
  6. For Task type, choose Custom.
  7. Under Worker task template creation, select the custom classification template you created.
  8. For Task description, enter Read the instructions and review the document.
  9. Under Workers, select Private.
  10. Use the drop-down list to choose the private team that you created.
  11. Choose Create.
  12. Copy the workflow ARN (see the following screenshot) to use when initializing the CloudFormation parameters.

Human Review Workflow Page

Deploying the CloudFormation template to set up active learning feedback

Now that you have completed the manual steps, you can run the CloudFormation template to set up this architecture’s building blocks, including the real-time classification, feedback collection, and the human classification.

Before deploying the CloudFormation template, make sure you have the following to pass as parameters:

  • Custom classifier endpoint ARN
  • Amazon A2I workflow ARN
  1. Choose Launch Stack:

  2. Enter the following parameters:
    1. ComprehendEndpointARN – The endpoint ARN you copied.
    2. HumanReviewWorkflowARN – The workflow ARN you copied.
    3. ComrehendClassificationScoreThreshold – Enter 0.5, which sets a 50% threshold; predictions scoring below it are treated as low confidence.

CloudFormation Required Parameters

  3. Choose Next until you reach the Capabilities section.
  4. Select the check box to acknowledge that AWS CloudFormation may create AWS Identity and Access Management (IAM) resources and expand the template.

For more information about these resources, see AWS IAM resources.

  5. Choose Create stack.

Acknowledgement section of the CloudFormation Page

Wait until the status of the stack changes from CREATE_IN_PROGRESS to CREATE_COMPLETE.

CloudFormation Outputs

  6. On the Outputs tab of the stack (see the following screenshot), copy the values for BatchUploadS3Bucket, FeedbackAPIGatewayID, and TextClassificationAPIGatewayID to interact with the feedback loop.
  7. Both the TextClassificationAPI and FeedbackAPI require an API key. The CloudFormation output ApiGWKey refers to the name of the API key. Currently, this API key is associated with a usage plan that allows 2,000 requests per month.
  8. On the API Gateway console, choose either the TextClassificationAPI or the FeedbackAPI. Choose API Keys from the left navigation, choose your API key from step 7, then expand the API key section in the right pane and copy the value.

API Key page

  9. You can manage the usage plan by following the instructions in Create, configure, and test usage plans with the API Gateway console.
  10. You can also add fine-grained authentication and authorization to your APIs. For more information on securing your APIs, see Controlling and managing access to a REST API in API Gateway.

Testing the feedback loop

In this section, we walk you through testing your feedback loop, including real-time classification, implicit and explicit feedback, and human review tasks.

Real-time classification

To interact with and test these APIs, you need to download Postman.

The API Gateway endpoint receives an unlabeled text document from a client application and internally calls the custom classification endpoint, which returns the predicted label and a confidence score.

  1. Open Postman and enter the TextClassificationAPIGateway URL with the POST method selected.
  2. In the Headers section, configure the API key: x-api-key : <<Your API key>>
  3. In the text field, enter the following JSON code (make sure you have JSON selected and raw enabled):
{"classifier":"<your custom classifier name>", "sentence":"MS Dhoni retires and a billion people had mixed feelings."}

  4. Choose Send.

You get a response back with a confidence score and class, as seen in the following screenshot.
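
If you prefer scripting to Postman, the same request can be made with Python's requests library (the URL pattern and placeholders below are illustrative; use the invoke URL for your deployed API):

import requests

url = "https://<TextClassificationAPIGatewayID>.execute-api.<region>.amazonaws.com/<stage>"
payload = {
    "classifier": "news-classifier-demo",
    "sentence": "MS Dhoni retires and a billion people had mixed feelings.",
}
response = requests.post(url, json=payload, headers={"x-api-key": "<Your API key>"})
print(response.json())  # the predicted class and its confidence score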

Sample JSON request to the Classify Text API endpoint.

Implicit feedback

When the endpoint returns the classification and the confidence score during the real-time classification, you can route all the instances where the confidence score doesn’t meet the threshold to human review. This type of feedback is called implicit feedback. For this post, we set the threshold as 0.5 as an input to the CloudFormation stack parameter.

You can change this threshold when deploying the CloudFormation template based on your needs.

Explicit feedback

The explicit feedback comes from the end-users of the application that uses the custom classification feature. This type of feedback comprises the instances of text where the user wasn’t happy with the prediction. You can send explicit feedback on the model’s predicted label through the following methods:

  • Real time through an API, which is usually triggered through a like/dislike button on a UI.
  • Batch process, where a file with a collection of misclassified utterances is put together based on a user survey conducted by the customer outreach team.

Invoking the explicit real-time feedback loop

To test the Feedback API, complete the following steps:

  1. Open Postman and enter the FeedbackAPIGatewayID value from your CloudFormation stack output with the POST method selected.
  2. In the Headers section, configure the API key: x-api-key : <<Your API key>>
  3. In the text field, enter the following JSON code (for classifier, enter the classifier you created, such as news-classifier-demo, and make sure you have JSON selected and raw enabled):
{"classifier":"<your custom classifier name>","sentence":"Sachin is Indian Cricketer."}

  4. Choose Send.

Sample JSON request to the Feedback API endpoint.

Submitting explicit feedback as a batch file

Download the following test feedback JSON file, populate it with your data, and upload it into the BatchUploadS3Bucket created when you deployed your CloudFormation template. The following code shows some sample data in the file:

{ "classifier":"news-classifier-demo", "sentences":[ "US music firms take legal action against 754 computer users alleged to illegally swap music online.", "A gamer spends $26,500 on a virtual island that exists only in a PC role-playing game." ]
}

Uploading the file triggers the Lambda function that starts your human review loop.
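
Equivalently, you can script the upload with boto3 (the local file name here is illustrative; the bucket name is the BatchUploadS3Bucket value from your stack outputs):

import boto3

s3 = boto3.client("s3")
s3.upload_file(
    Filename="feedback.json",              # the populated test feedback file
    Bucket="<BatchUploadS3Bucket-value>",  # from the CloudFormation outputs
    Key="feedback.json",
)
# The S3 event trigger then invokes the Lambda function that starts
# the human review loop.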

Human review tasks

All the feedback collected through the implicit and explicit methods is sent for human classification. The labeling workforce can include Amazon Mechanical Turk, private teams, or AWS Marketplace vendors. For this post, we create a private workforce. The URL to the labeling portal is located on the Amazon SageMaker console, on the Labeling workforces page, on the Private tab.

Private Workforce section of the SageMaker console.

After you log in, you can see the human review tasks assigned to you. Select the task to complete and choose Start working.

Human Review Task Page

You see the tasks displayed based on the worker template used when creating the human workflow.

Human Review Task

After you complete the human classification and submit the tasks, the human-reviewed data is stored in the S3 bucket you configured when creating the human review workflow. To find it, go to Amazon SageMaker -> Human review workflows -> output location:

Human Review Task Output Location

This human-reviewed data is used to retrain the custom classification model to learn newer patterns and improve its overall accuracy. Below is a screenshot of the human-annotated output file output.json in the S3 bucket:

Human Review Task Output payload

The process of retraining the models with human-reviewed data, selecting the best model, and automatically deploying the new endpoints completes the active learning workflow. We cover these remaining steps in Part 2 of this series.

Cleaning up

To remove all resources created throughout this process and prevent additional costs, complete the following steps:

  1. On the Amazon S3 console, delete the S3 bucket that contains the training dataset.
  2. On the Amazon Comprehend console, delete the endpoint and the classifier.
  3. On the Amazon A2I console, delete the human review workflow, worker template, and the private workforce.
  4. On the AWS CloudFormation console, delete the stack you created. (This removes the resources the CloudFormation template created.)

Conclusion

Amazon Comprehend helps you build scalable and accurate natural language processing capabilities without any machine learning experience. This post provides a reusable pattern and infrastructure for active learning workflows for custom classification models. The feedback pipelines and human review workflow help the custom classifier learn new data patterns continuously. The second part of this series covers the automatic model building, selection, and deployment of custom classification models.

For more information, see Custom Classification. You can discover other Amazon Comprehend features and get inspiration from other AWS blog posts about how to use Amazon Comprehend beyond classification.


About the Authors

Shanthan Kesharaju is a Senior Architect in the AWS ProServe team. He helps our customers with AI/ML strategy and architecture, and with developing products with a purpose. Shanthan has an MBA in Marketing from Duke University and an MS in Management Information Systems from Oklahoma State University.

Mona Mona is an AI/ML Specialist Solutions Architect based out of Arlington, VA. She works with the World Wide Public Sector team and helps customers adopt machine learning on a large scale. She is passionate about NLP and ML explainability.

Joyson Neville Lewis obtained his master’s in Information Technology from Rutgers University in 2018. He worked as a software/data engineer before diving into the conversational AI domain in 2019, where he works with companies to connect the dots between business and AI using voice and chatbot solutions. Joyson joined Amazon Web Services in February of 2018 as a Big Data Consultant on the AWS Professional Services team in NYC.

Source: https://aws.amazon.com/blogs/machine-learning/active-learning-workflow-for-amazon-comprehend-custom-classification-models-part-1/
