Tag incoming emails with AI Builder

Paul Nogues, Monday, December 20, 2021

AI Builder is a Power Automate capability that provides AI models designed to optimize your business processes by gleaning insights from your data. There are many different ways that you can use AI Builder – this is part one of a three-part weekly series on some common ways to get started.

Background and challenge

Companies receive emails containing support requests, product quotes, and other feedback that can be used for upselling.

They would like to be able to categorize email topics in order to properly triage them (for example, identify which team should process them), or to perform statistical analysis.

Sample: Support request automation

In some companies this is done manually, which is tedious, error-prone, and not scalable.

What AI Builder can do

This kind of scenario can typically be automated by leveraging Microsoft Power Automate for the process orchestration and AI Builder for the email analysis.

Power Automate listens for new emails, converts the email body to plain text, sends that text to AI Builder for processing, and finally saves the identified categories in the target system.

We recommend using a custom AI Builder category classification model which will learn from past data that have been manually categorized.

Implement the solution

Training the model

Define categories

This work is usually performed by the business team based on the requirements of the project.

Build the training dataset

This can be achieved by browsing emails from a past period and manually tagging each one with one or more of the categories previously defined.

You can start with building an instant Power Automate cloud flow that will get all emails from the service mailbox based on criteria:

You’ll perform an HTML to text conversion to get the raw text:
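If you want to preview what the plain text of an email might look like outside of Power Automate, the conversion can be approximated with a few lines of Python. This is only a rough sketch using the standard library: the `html_to_text` function and `_TextExtractor` class are names made up for this example, and the output will not necessarily match what the Power Automate conversion produces.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and skips the content of <script> and <style>."""

    SKIPPED = {"script", "style"}
    # Tags that start a new line in the output (an illustrative choice).
    BLOCK = {"p", "div", "br", "li", "tr", "h1", "h2", "h3"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML email body, one line per block."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    raw = "".join(parser.parts)
    # Collapse runs of whitespace and drop empty lines.
    lines = (" ".join(line.split()) for line in raw.splitlines())
    return "\n".join(line for line in lines if line)


body = (
    "<html><style>p{color:red}</style><body>"
    "<p>Hello <b>team</b>,</p>"
    "<p>My order &amp; invoice are missing.<br>Thanks</p>"
    "</body></html>"
)
print(html_to_text(body))
```

Whatever conversion you use, apply the same one to the training emails and to the emails you later send for prediction.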

You can store the result to a Microsoft Dataverse table:

Now you need to tag this data with appropriate categories.

You can achieve this either by creating a Microsoft Power App with a data entry form on top of the training data table or by using the built-in Edit data in Excel feature:

When several categories match a text, separate them with commas, semicolons, or tabs.
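To sanity-check the tags before training, you may want to split the “Categories” values yourself and look at the distinct category names. The sketch below shows one way to do that in Python. `split_categories` is a hypothetical helper written for this example and is not part of AI Builder.

```python
import re
from typing import List

# The three separators mentioned above: comma, semicolon, and tab.
_SEPARATORS = re.compile(r"[,;\t]")


def split_categories(raw: str) -> List[str]:
    """Split a tag cell into category names, trimmed and de-duplicated."""
    names: List[str] = []
    for part in _SEPARATORS.split(raw):
        name = part.strip()
        if name and name not in names:
            names.append(name)
    return names


print(split_categories("Billing; Support,\tQuote"))
```

Running this over the whole column makes typos such as “Suport” next to “Support” easy to spot, since each one shows up as a separate category.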

Important considerations about the training set

The process used to source the training data should be exactly the same as the one that will source the future data to be predicted. For instance, avoid manually cleansing or converting the training data.
Have a balanced number of samples for each category. For instance, avoid having 100 samples for one category but only 10 for another.
Provide at least 20 samples for each category.
Avoid short texts (fewer than 100 characters). On the other hand, if you’re dealing with large amounts of text (threads, for instance), consider truncating at a fixed number of characters, such as 1,000.
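The considerations above can be turned into a quick check on an export of the training table. The following Python sketch assumes the data is available as a list of `(text, categories)` pairs. The function names and the imbalance threshold of 5 are assumptions made for this example; the 20-sample, 100-character, and 1,000-character values come from the list above.

```python
from collections import Counter
from typing import List, Sequence, Tuple

MIN_SAMPLES_PER_CATEGORY = 20
MIN_TEXT_LENGTH = 100
MAX_TEXT_LENGTH = 1000
MAX_IMBALANCE_RATIO = 5.0  # assumption: tune this to your own data


def truncate(text: str, limit: int = MAX_TEXT_LENGTH) -> str:
    """Cut long texts (threads, for instance) at a fixed number of characters."""
    return text[:limit]


def check_training_set(rows: Sequence[Tuple[str, Sequence[str]]]) -> List[str]:
    """Return human-readable warnings for a list of (text, categories) rows."""
    warnings: List[str] = []
    counts = Counter(cat for _, cats in rows for cat in cats)

    for cat, n in sorted(counts.items()):
        if n < MIN_SAMPLES_PER_CATEGORY:
            warnings.append(f"category '{cat}' has only {n} samples")

    if counts:
        most, least = max(counts.values()), min(counts.values())
        if most / least > MAX_IMBALANCE_RATIO:
            warnings.append(f"unbalanced categories: {most} vs {least} samples")

    short = sum(1 for text, _ in rows if len(text) < MIN_TEXT_LENGTH)
    if short:
        warnings.append(f"{short} texts are shorter than {MIN_TEXT_LENGTH} characters")

    return warnings


rows = [("x" * 150, ["Billing"])] * 100 + [("y" * 50, ["Quote"])] * 10
for warning in check_training_set(rows):
    print(warning)
```

If you truncate, do it in the flow that feeds the model as well, so that training data and future data go through the same process.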

Training the model

Finally, you can create and train the category classification model using the above training table as input, the “Email body” column for the texts, and “Categories” for the tags. (See here for more details.)

Data model

As presented above, you’ll have a table holding the training data. You can add a column with a date or version number to keep track of when this data was added.

You will also likely have a categories reference table that will be used in the target application.

Finally, you have the table that contains new data to be predicted. This table will contain a “Predicted” column that references the categories reference table, based on what the model has predicted. You’ll also likely have a column for manually edited tags. (See the “Feedback loop” section below.)

You end up with the following data model:

Before moving to production, you can always set up the flow as follows:

The promotion of models from the Development environment to Production is performed using the Power Platform application lifecycle capabilities.

Feedback loop

Organizations may want to include user feedback on predicted values in their lifecycle. When business users access the emails, they can update the categories if these have been incorrectly predicted.

A Power Automate flow will check on a regular basis (with a frequency to be defined based on the volume of new emails and user edits) for emails where the categories have been updated. These records will be sent back to the development environment and retraining will be triggered.
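The selection step of this loop is simple to express in code. The Python sketch below illustrates the logic only; it is not how the flow itself is built. `EmailRecord` and its fields are hypothetical names that mirror the “Predicted” column and the manually edited tags column from the data model.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class EmailRecord:
    email_id: str
    body: str
    predicted: FrozenSet[str]                 # categories set by the model
    edited: Optional[FrozenSet[str]] = None   # None means no user edit


def corrections_for_retraining(
    records: List[EmailRecord],
) -> List[Tuple[str, List[str]]]:
    """Keep only emails whose categories a user changed, as (body, categories)."""
    return [
        (r.body, sorted(r.edited))
        for r in records
        if r.edited is not None and r.edited != r.predicted
    ]


records = [
    EmailRecord("1", "Where is my invoice?", frozenset({"Support"}),
                frozenset({"Billing"})),
    EmailRecord("2", "Please send a quote.", frozenset({"Quote"})),
    EmailRecord("3", "App crashes on login.", frozenset({"Support"}),
                frozenset({"Support"})),
]
print(corrections_for_retraining(records))
```

Only the first record is returned here: the second was never edited, and the third was confirmed without change.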

After testing, the new version of the model can be published and imported into Production:

Expected ROI

Fully automated triaging of emails can vastly reduce “human routing” errors and has the benefit of working 24/7.
It’s easy to build KPIs on the most popular categories or to track trending categories.
Retraining makes it easy to manage new categories, deprecate old ones, and maintain high quality tagging.

Want to know more?

The AI Builder public documentation is the place to start to learn more about text classification in AI Builder.

Details on how to use text classification in Power Automate can be found here.