Using AI effectively for automating helpdesk ticket dispatch

Atri Mandal
7 min read · Sep 26, 2021

The landscape of modern IT service delivery is changing, with an increased focus on automation and optimization. Most IT vendors today have service platforms aimed at end-to-end automation of mundane, repetitive, labor-intensive tasks, and even of tasks requiring human cognizance. One such task is ticket assignment/dispatch, where service requests submitted by end users to the vendor in the form of tickets are reviewed by a centralized dispatch team and assigned to the appropriate service team. Because inefficiencies in dispatch have serious business consequences, there has been a lot of interest in automating the dispatch process. Moreover, for companies with a large customer base and several business functions, it is impossible to scale the ticket dispatch process without some amount of automation.

To automate ticket dispatch effectively, the process must run end to end, from the time a ticket enters the system to the time it is appropriately dispatched, with almost no manual intervention. This involves predicting multiple fields (e.g. incident type, severity), but predicting the resolver group is one of the most challenging tasks.

Challenges: There are two main challenges in predicting resolver groups. First, large companies have hundreds of resolver groups dealing with overlapping problems. The subtle differences between the problems (e.g. emails related to sales versus those related to the warehouse) are understood well by experienced technicians but are often confusing for learning algorithms. Second, the resolver groups themselves continually change (they get renamed, split or merged) because of business decisions, which makes prediction difficult. Simple use of traditional machine learning and deep learning is not enough to achieve human-level accuracy and reliability, which is mandatory for industry deployment.

This blog describes an effective way to automate ticket assignment and dispatch using AI while addressing these challenges. The system described here (with a few business-specific modifications) has already been adopted by some major helpdesk service providers in India and is currently used to dispatch more than 100,000 emails per month.

Although we have specifically dealt with email tickets (tickets that come in the form of user emails), the same methods can be applied generically to all classes of helpdesk tickets.

The combination of techniques employed in the automated email ticket dispatch system is shown in Figure 1 below:

Fig 1. Help desk Ticket Automation System Overview

1. Pre-processing training data:

Each training sample consists of email data (X) and the resolver group (Y) of the resolved ticket. Historical email data is pulled from the ticketing system (ServiceNow/Remedy etc.) using scripts that use either REST APIs (if supported by the ticketing system) or JDBC connectors. The email subject and body are concatenated (with a space in between) to create the training sample.
We apply the following pre-processing techniques before training:

Text preprocessing: From practical experience we found that heavy text processing does not improve prediction accuracy; in fact, most text processing exercises ended up reducing the resolver-group classification accuracy. So we limited ourselves to basic text processing: converting text to lowercase and removing HTML tags and special characters. We did, however, have to do some extra preprocessing for some clients (like masking sensitive data), but this was done solely for specific business concerns and not for better classification.
Also, we do not apply any word embedding layer on top of the ticket text. Instead we use a simple bag-of-words (BoW) model with tf-idf weights. We did experiment with some word embedding techniques (like word2vec and GloVe), but there was no appreciable difference in accuracy.
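The cleaning step described above can be sketched as follows. This is a minimal illustration, not the production code; the function name and exact regular expressions are our own:

```python
import re

def preprocess(subject: str, body: str) -> str:
    """Concatenate subject and body (with a space in between), then apply
    basic cleaning: lowercase, strip HTML tags, drop special characters."""
    text = f"{subject} {body}".lower()
    text = re.sub(r"<[^>]+>", " ", text)      # remove HTML tags
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # remove special characters
    return re.sub(r"\s+", " ", text).strip()  # collapse whitespace

# A cleaned sample, ready for the BoW/tf-idf vectorizer
print(preprocess("Password RESET", "<p>Cannot log-in!</p>"))
```

The cleaned strings are then fed to a standard tf-idf vectorizer to produce the BoW feature matrix.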

Resolver group level processing: One of the key things we noted is that for most organizations the set of resolver groups follows the 20–90 rule; that is, 20% of the classes/groups (the short head) account for more than 90% of the email data. The remaining classes (the long tail) have very few samples each. We therefore focus our attention mostly on the short head and design ML models for predicting those groups. The sparse classes are predicted/classified using a rule engine.

Fig 2. Long tail classification of tickets

To divide the resolver groups into short head and long tail, we use a Pareto chart and a cumulative frequency cutoff, as shown in Figure 2. This simple strategy reduces the number of training classes by almost 80%, which results in a smaller model and faster training and classification times. It also significantly reduces the class imbalance and the resulting confusion in the training data.
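The cumulative-frequency split can be sketched with a few lines of Python. The 90% cutoff mirrors the 20–90 rule above; this is an illustrative sketch, not the production implementation:

```python
from collections import Counter

def split_short_head(labels, cutoff=0.90):
    """Split resolver groups into (short_head, long_tail) sets.
    Groups are taken in decreasing frequency order until the chosen
    cumulative-frequency cutoff is covered; the rest form the long tail."""
    counts = Counter(labels)
    total = sum(counts.values())
    short_head, covered = set(), 0
    for group, n in counts.most_common():
        if covered / total >= cutoff:
            break
        short_head.add(group)
        covered += n
    return short_head, set(counts) - short_head

# Toy label distribution: two dominant groups, two sparse ones
labels = ["access"] * 70 + ["network"] * 25 + ["hr"] * 3 + ["legal"] * 2
head, tail = split_short_head(labels)
print(head, tail)
```

The short-head groups go to the ML models; tickets for the long-tail groups are handled by the rule engine described later.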

Data Enrichment: To further enrich the training data, we merge samples from some resolver groups. Specifically, resolver groups belonging to different escalation levels (e.g. Tier-1, Tier-2) are merged to increase the number of samples. It is usually enough to assign a ticket to the lowest tier level; the human agents escalate emails based on the specific requirements of the group. We also merge region-specific resolver groups (if any), as they deal with the same set of issues and use the same resolution strategy; they are simply handled by local teams under different names.
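As a sketch, the tier and region merging can be expressed as a simple canonicalization map applied to the labels before training. The group names below are hypothetical, purely for illustration:

```python
# Hypothetical resolver-group names; each entry collapses a tier or
# regional variant onto its canonical (lowest-tier / global) group.
MERGE_MAP = {
    "Payroll-Tier2": "Payroll-Tier1",
    "Payroll-Tier3": "Payroll-Tier1",
    "Network-EMEA": "Network",
    "Network-APAC": "Network",
}

def canonical_group(group: str) -> str:
    """Map a resolver group to its merged training label."""
    return MERGE_MAP.get(group, group)

print(canonical_group("Payroll-Tier3"))  # merged into the Tier-1 group
```

Training then proceeds on the merged labels, giving each canonical group more samples.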

2. Classification models

An ensemble classifier consisting of a linear SVM (one-vs-rest scheme) and an MLP (feed-forward neural net) is used for classification. Our experiments showed this combination to be the most effective. More complicated deep-learning-based models (e.g. CNN/LSTM) perform worse than the SVM on most datasets (those with fewer than 1 million tickets). This result is both surprising and significant, as it means we can achieve maximum accuracy with minimal computational resources.

However, for very large datasets (1 million training samples and more), an LSTM with pre-trained GloVe embeddings gives the best accuracy. This indicates that if hardware constraints (GPU/memory etc.) are not an issue, an LSTM is probably the best choice. Keeping practical deployment considerations in mind, however, the SVM + MLP combination was chosen for the ensemble model. Using an ensemble of contrasting algorithms helps us increase ticket coverage and also slightly improves accuracy.

The final ensemble achieves more than 90% accuracy (above observed human accuracy) along with more than 90% coverage on all our deployed datasets.
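A minimal sketch of such an ensemble using scikit-learn and toy data is shown below. The feature extraction, model settings, and the agreement-based deferral are illustrative assumptions, not the exact production configuration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.neural_network import MLPClassifier

# Toy training data: cleaned email text and resolver-group labels
texts = ["reset my password", "password reset link", "vpn not connecting",
         "cannot connect to vpn", "printer out of toner", "replace printer toner"]
labels = ["access", "access", "network", "network", "hardware", "hardware"]

vec = TfidfVectorizer()
X = vec.fit_transform(texts)

svm = LinearSVC().fit(X, labels)                       # one-vs-rest linear SVM
mlp = MLPClassifier(hidden_layer_sizes=(16,),          # small feed-forward net
                    max_iter=2000, random_state=0).fit(X, labels)

def dispatch(email_text):
    """Auto-assign only when the two contrasting models agree;
    otherwise return None to defer to the rule engine / human dispatcher."""
    x = vec.transform([email_text])
    s, m = svm.predict(x)[0], mlp.predict(x)[0]
    return s if s == m else None
```

Agreement between the two models defines the covered portion of the traffic; disagreements are routed onward, which is one way an ensemble of contrasting algorithms trades a little coverage for higher accuracy on what it does auto-assign.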

3. Retraining

Model retraining is a key aspect of the design of the classification system: it ensures that classification accuracy does not deteriorate over time. We need to keep two things in mind here: (1) recent data must be accounted for, since email utterances change slowly over time, and ignoring this gradual drift risks reducing accuracy; and (2) some historical emails are very informative, and we should not lose them during retraining even while taking recent data into account. To take care of (1), we use a sliding-window-based retraining strategy. To take care of (2), we use margin sampling to retain the most informative samples from the time intervals we slide over.
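Margin sampling can be sketched as follows: a sample whose top-2 predicted class probabilities are close to each other is one the model is least sure about, and hence most informative to retain. The function names are ours, and the probability scores would come from the deployed classifier:

```python
def margin(probs):
    """Margin score: difference between the top-2 class probabilities.
    A smaller margin means the model is less certain, i.e. the sample
    is more informative to keep for retraining."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1]

def select_informative(samples, margins, keep):
    """Retain the `keep` samples with the smallest margins from a
    time window that the sliding retraining window is moving past."""
    ranked = sorted(zip(margins, samples), key=lambda pair: pair[0])
    return [sample for _, sample in ranked[:keep]]

# Keep the 2 most ambiguous of 3 old samples
old = ["email-a", "email-b", "email-c"]
print(select_informative(old, [0.5, 0.1, 0.3], keep=2))
```

Retraining then runs on the current window's data plus these retained samples, so the model tracks drift without forgetting its hardest historical cases.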

4. Handling business specific cases and confusion/long tail classes

Although our classification models are good enough for most purposes, there are some specific business needs for which we need to look beyond machine learning. Most service providers have a very high accuracy requirement (more than 90%) to justify replacing a manual helpdesk with AI. It is often not possible to meet this requirement with machine learning alone (although it comes very close), for the following reasons:

1. Resolver groups themselves undergo changes like renames, splits and merges for business reasons. ML cannot keep pace with these abrupt changes, as it requires a lot of training data to achieve high accuracy.

2. Resolver groups often have overlapping problems, leading to confusing email utterances. These confusion classes often cannot be classified accurately using ML and need business-specific rules.

3. The long-tail classes will never be predicted by our ML models, by design. We need to design rules for these classes.

To handle the above cases and to go the extra mile in accuracy and business continuity/availability, we use a rule engine with manually configured rules. However, we design the rule engine differently from traditional rule-based systems: it also takes the output of the classification model into account. We first apply the classifier to reduce the number of potential candidates by identifying the confused set; rules are then applied to decide the final class (resolver group) from the reduced set. Thus the output of the ensemble classifier is an input to our rule engine.

It is also very important to use precedence when applying rules: the more restrictive rules (those using the output of the ensemble classifier) are given higher precedence in the rule engine. If the rule engine is not designed this way, the results may be inaccurate and can even decrease overall classification accuracy.
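A hedged sketch of such a precedence-ordered rule engine is shown below. Each rule sees both the email text and the classifier's candidate set; rules that consult the candidate set sit at higher precedence, and a catch-all sits last. The rule conditions and group names here are illustrative assumptions:

```python
# (precedence, condition, resolver_group): lower number = higher precedence.
# Conditions take the email text and the classifier's candidate set.
RULES = [
    (0, lambda text, cands: "invoice" in text and "Finance-L1" in cands, "Finance-L1"),
    (1, lambda text, cands: "warehouse" in text, "Warehouse-L1"),
    (2, lambda text, cands: True, "General-Helpdesk"),  # catch-all fallback
]

def apply_rules(text, candidates):
    """Pick the final resolver group from the classifier's confused set,
    evaluating rules strictly in precedence order."""
    for _, condition, group in sorted(RULES, key=lambda rule: rule[0]):
        if condition(text, candidates):
            return group

print(apply_rules("invoice overdue for march", {"Finance-L1", "Sales-L1"}))
```

Note how the highest-precedence rule fires only when the classifier already placed the target group in the candidate set; a text-only rule and a fallback fire later, reflecting the precedence ordering described above.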

This work was a joint collaboration between IBM Research and IBM Global Technology Services (GTS). The ticket dispatch system described here has been deployed at several major helpdesk service providers in India. Currently, the assignment engine serves about 100,000 user emails every month across the deployed accounts, and the total number of tickets served to date has crossed the 3 million mark. The system has consistently surpassed human-level accuracy in all accounts. The estimated net savings from the use of our assignment engine is more than 50,000 man-hours per annum.

For more details on our work please refer to our IAAI-19 and AI Magazine papers.

Originally published at https://www.linkedin.com.


Atri Mandal

Director, Machine Learning at HEAL (AIOps startup); ML Researcher; IEEE Senior Member; ex-IBM Research, ex-Yahoo, ex-Amazon