Deploying TensorFlow OpenPose on AWS Inferentia-based Inf1 instances for significant price performance improvements

In this post you will compile an open-source TensorFlow version of OpenPose using AWS Neuron and fine-tune its inference performance for AWS Inferentia-based instances. You will set up a benchmarking environment, measure the image processing pipeline throughput, and quantify the price-performance improvements as compared to a GPU-based instance.

About OpenPose

Human pose estimation is a machine learning (ML) and computer vision (CV) technology supporting many applications, from pedestrian intent estimation to motion tracking for AR and gaming. At its core, pose estimation identifies coordinates on an image (joints and keypoints) that, when connected, form a representation of an individual skeleton. The representation of body orientation enables tasks such as teaching a robot to interact with humans or quantifying how good yoga asanas really are.

Among the many methods that can be used for human pose estimation, the deep learning (DL) bottom-up approach taken by OpenPose, released by the Perceptual Computing Lab of Carnegie Mellon University in 2018, has gained a lot of users. OpenPose is a multi-person 2D pose estimation model that employs a technique called Part Affinity Fields (PAF) to associate body parts and form multiple individual skeletons on the image. In the bottom-up approach, the model first identifies the key points and then pieces together the skeletons.

To achieve that, OpenPose uses a two-step process. First, it extracts image features using a VGG-19 model and passes those features through a pair of convolutional neural networks (CNN) running in parallel.

One of the CNNs in the pair computes confidence maps to detect body parts. The other computes the PAF and combines the parts to form the individual’s skeleton. You can repeat these parallel branches many times to refine the predictions of the confidence maps and PAF.
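To make the role of the PAF more concrete, the following minimal sketch scores a candidate limb between two detected keypoints by averaging the dot product of a vector field with the limb's unit vector at points sampled along the segment. This follows the association measure described in the Part Affinity Fields paper, but the function name paf_score, the callable field, and the sample count are illustrative assumptions, not code from OpenPose or tf-pose-estimation.

```python
import math


def paf_score(field, p1, p2, num_samples=10):
    """Average alignment between a 2D vector field and the segment p1 -> p2.

    field: hypothetical callable (x, y) -> (vx, vy) standing in for the PAF
    channels of one limb type. p1, p2: (x, y) keypoint candidates.
    """
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    length = math.hypot(dx, dy)
    if length == 0:
        return 0.0
    ux, uy = dx / length, dy / length  # unit vector along the candidate limb

    total = 0.0
    for i in range(num_samples):
        # Sample evenly spaced points between the two keypoints (inclusive).
        t = i / (num_samples - 1) if num_samples > 1 else 0.0
        x, y = p1[0] + t * dx, p1[1] + t * dy
        vx, vy = field(x, y)
        total += vx * ux + vy * uy
    return total / num_samples


def horizontal_field(x, y):
    """Toy field: every location points along +x, like a horizontal limb."""
    return (1.0, 0.0)


aligned = paf_score(horizontal_field, (0, 0), (10, 0))
perpendicular = paf_score(horizontal_field, (0, 0), (0, 10))
print(aligned, perpendicular)
```

A score close to 1 suggests the two keypoints lie along the same limb, while a score near 0 suggests they do not. The model uses scores of this kind to decide which detected parts to connect into each skeleton.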

The following diagram shows features F from a VGG feeding the PAF and confidence map branches of the OpenPose model. (Source: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields)

The original OpenPose code relies on a Caffe model and pre-compiled C++ libraries. For ease of use and portability of our walkthrough, we work with a reimplementation of the neural networks of OpenPose using TensorFlow 1.15 from the tf-pose-estimation GitHub repo. This repo also provides ML pipeline scripts to pre- and post-process images and videos using OpenPose.

Prerequisites

For this walkthrough, you need an AWS account with access to the AWS Management Console and the ability to create Amazon Elastic Compute Cloud (Amazon EC2) instances with public-facing IP and Amazon Simple Storage Service (Amazon S3) buckets.

Working knowledge of AWS Deep Learning AMIs and Jupyter notebooks with Conda environments is beneficial, but not required.

About AWS Inferentia and Neuron SDK

AWS Inferentia chips are custom built by AWS to provide high-performance inference, with the lowest cost of inference in the cloud, and make it easy for you to integrate ML as part of your standard application features and capabilities.

AWS Neuron is a software development kit (SDK) consisting of a compiler, runtime, and profiling tools that optimize the ML inference performance for the Inferentia chips. Neuron is integrated with popular ML frameworks such as TensorFlow, PyTorch, and MXNet and comes pre-installed in AWS Deep Learning AMIs. Deploying deep learning models on AWS Inferentia is done in the same familiar environment used in other platforms, and you can enjoy the boost in performance and lowest cost.

The latest Neuron release, available on the AWS Neuron GitHub, adds support for more models like OpenPose, which we focus on in this post. It also upgrades Neuron PyTorch to the latest stable version (1.5.1), which allows for a wider range of models to compile and run on AWS Inferentia.

Compiling a TensorFlow OpenPose model with the Neuron SDK

You can start the compilation process by setting up an EC2 instance in AWS for compiling the model. We recommend a z1d.xlarge, due to its good single-core performance and memory size. Use the AWS Deep Learning AMI (Ubuntu 18.04) Version 29.0—ami-043f9aeaf108ebc37—in the US East (N. Virginia) Region. This AMI comes pre-packaged with the Neuron SDK and the required Neuron runtime for AWS Inferentia.

For more information about running AWS Deep Learning AMIs on EC2 instances, see Launching and Configuring a DLAMI.

When you can connect to the instance through SSH, you activate the aws_neuron_tensorflow_p36 Conda environment and update the Neuron Compiler to the latest release. The compilation script depends on requirements listed in the file requirements-compile.txt. For compilation scripts and requirements files, see the GitHub repo. Download and install them in the environment with the following code:

source activate aws_neuron_tensorflow_p36
pip install neuron-cc --upgrade --extra-index-url=https://pip.repos.neuron.amazonaws.com
git clone https://github.com/aws/aws-neuron-sdk.git /tmp/aws-neuron-sdk && cp /tmp/aws-neuron-sdk/src/examples/tensorflow/<name_of_the_new_folder>/* . && rm -rf /tmp/aws-neuron-sdk/
pip install -r requirements-compile.txt

You can then start working on the compilation process. You compile the tf-pose-estimation network frozen graph, available on the GitHub repo. You can adapt the original download script to a single-line wget command:

wget -c --tries=2 $( wget -q -O - http://www.mediafire.com/file/qlzzr20mpocnpa3/graph_opt.pb | grep -o 'http*://download[^"]*' | tail -n 1 ) -O graph_opt.pb

When the download is complete, run the convert_graph_opt.py script to compile it for the AWS Inferentia chip. Because Neuron is an ahead-of-time (AOT) compiler, you need to define a specific image size prior to compilation. You can adjust the network input image resolution with the argument --net_resolution (for example, --net_resolution=656x368).
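Because the input size is fixed at compile time, it can help to validate the resolution string before starting a long compilation. The following is a small sketch of that check; parse_resolution and compiled_graph_name are hypothetical helpers written for illustration and are not part of convert_graph_opt.py or the Neuron SDK.

```python
def parse_resolution(text):
    """Parse a 'WIDTHxHEIGHT' string such as '656x368' into (width, height)."""
    width_str, sep, height_str = text.lower().partition("x")
    if not sep:
        raise ValueError("expected WIDTHxHEIGHT, got %r" % text)
    width, height = int(width_str), int(height_str)
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive, got %r" % text)
    return width, height


def compiled_graph_name(text):
    """Build an output file name in the style used in this post."""
    width, height = parse_resolution(text)
    return "graph_opt_neuron_%dx%d.pb" % (width, height)


print(parse_resolution("656x368"))
print(compiled_graph_name("656x368"))
```

Encoding the resolution in the file name keeps graphs compiled for different input sizes from being confused with one another.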

The compiled model can accept arbitrary batch size inputs at inference runtime. This property enables benchmarking large-scale deployments of the model; however, the pipeline available for image and video processing in the tf-pose-estimation repo uses batch size 1.

To start the compilation process, enter the following code:

python convert_graph_opt.py graph_opt.pb graph_opt_neuron_656x368.pb

The compilation process can take up to 20 minutes to complete. During this time, the compiler optimizes the TensorFlow graph operations and provides the AWS Inferentia version of the saved model. During the process you can expect detailed logs such as the following:

2020-07-15 21:44:43.008627: I bazel-out/k8-opt/bin/tensorflow/neuron/convert/segment.cc:460] There are 11 ops of 7 different types in the graph that are not compiled by neuron-cc: Const, NoOp, Placeholder, RealDiv, Sub, Cast, Transpose, (For more information see https://github.com/aws/aws-neuron-sdk/blob/master/release-notes/neuron-cc-ops/neuron-cc-ops-tensorflow.md).
INFO:tensorflow:fusing subgraph neuron_op_ed41d2deb8c54255 with neuron-cc
INFO:tensorflow:Number of operations in TensorFlow session: 474
INFO:tensorflow:Number of operations after tf.neuron optimizations: 474
INFO:tensorflow:Number of operations placed on Neuron runtime: 465
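A quick way to read these logs is to check what share of the graph ended up on the Neuron runtime. The following sketch parses the three summary lines shown above; the regular expressions assume the log wording stays exactly as printed here, which may not hold across Neuron releases.

```python
import re

log_text = """\
INFO:tensorflow:Number of operations in TensorFlow session: 474
INFO:tensorflow:Number of operations after tf.neuron optimizations: 474
INFO:tensorflow:Number of operations placed on Neuron runtime: 465
"""


def neuron_placement_ratio(log):
    """Return (ops_on_neuron, total_ops, ratio) parsed from the compile log."""
    total = re.search(r"operations after tf\.neuron optimizations: (\d+)", log)
    placed = re.search(r"operations placed on Neuron runtime: (\d+)", log)
    if total is None or placed is None:
        raise ValueError("expected operation counts not found in log")
    total_ops, neuron_ops = int(total.group(1)), int(placed.group(1))
    return neuron_ops, total_ops, neuron_ops / total_ops


neuron_ops, total_ops, ratio = neuron_placement_ratio(log_text)
print("%d of %d operations (%.1f%%) run on Neuron" % (neuron_ops, total_ops, 100 * ratio))
```

For the log above, 465 of 474 operations (about 98%) are placed on the Neuron runtime, with the remainder left to TensorFlow on the CPU.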

Before you can measure the performance of the compiled model, you need to switch to an EC2 Inf1 instance, powered by the AWS Inferentia chip. To share the compiled model between the two instances, create an S3 bucket with the following code:

aws s3 mb s3://<MY_BUCKET_NAME>
aws s3 cp graph_opt_neuron_656x368.pb s3://<MY_BUCKET_NAME>/graph_model.pb

Benchmarking the inference time with a Jupyter notebook on AWS EC2 Inf1 instances

After you have the compiled graph_model.pb in your S3 bucket, you modify the ML pipeline scripts on the GitHub repo to estimate human poses from images and videos.

To set up the benchmarking Inf1 instance, you can repeat the steps you took to provision the compilation z1d instance. You use the same AMI but change the instance type to inf1.xlarge. A similar setup on a g4dn.xlarge instance might be useful to compare the performance of the base tf-pose-estimation model on GPUs against the compiled model for AWS Inferentia.

Throughout this post, you interact with this instance and the model using a Jupyter Lab server. For more information about provisioning a Jupyter Lab on Amazon EC2, see Set Up a Jupyter Notebook Server.

Setting up the Conda Environment for tf-pose

When you can log in to the Jupyter Lab server, you can clone the GitHub repo containing the TensorFlow version of OpenPose.

On the Jupyter Launcher page, under Other, choose Terminal.

In the terminal, activate the aws_neuron_tensorflow_p36 environment, which contains the Neuron SDK. Activating the environment and cloning are done with the following code:

conda activate aws_neuron_tensorflow_p36
git clone https://github.com/ildoonet/tf-pose-estimation.git
cd tf-pose-estimation

When the cloning is complete, we recommend following the Package Install instructions to install the repo. From the same terminal screen, you customize the environment by installing opencv-python and dependencies listed on the requirements.txt of the GitHub repo.

You run two pip commands: the first takes care of opencv-python and the second completes the installation of the requirements.txt:

pip install opencv-python
pip install -r requirements.txt

You’re now ready to build the notebooks.

On the repo’s root directory, create a new Jupyter notebook by choosing Notebook, Environment (conda_aws_neuron_tensorflow_p36). On the first cell of the notebook, import the library as defined in the run.py script, which is the reference pipeline for image processing. In the following cell, create a logger to record the benchmarking. See the following code:

import argparse
import logging
import sys
import time

from tf_pose import common
import cv2
import numpy as np
from tf_pose.estimator import TfPoseEstimator
from tf_pose.networks import get_graph_path, model_wh

logger = logging.getLogger('TfPoseEstimatorRun')
logger.handlers.clear()
logger.setLevel(logging.DEBUG)
ch = logging.StreamHandler()
ch.setLevel(logging.DEBUG)
formatter = logging.Formatter('[%(asctime)s] [%(name)s] [%(levelname)s] %(message)s')
ch.setFormatter(formatter)
logger.addHandler(ch)

Define the main inferencing function main() and a helper plotter function plotter(). These functions directly replicate the OpenPose inference pipeline from run.py. One simple modification is the addition of a repeats argument, which lets you run many inference steps in sequence and obtain a more reliable average processing time (measured in seconds per image):

def main(argString='--image ./images/contortion1.jpg --model cmu', repeats=10):
    parser = argparse.ArgumentParser(description='tf-pose-estimation run')
    parser.add_argument('--image', type=str, default='./images/apink2.jpg')
    parser.add_argument('--model', type=str, default='cmu',
                        help='cmu / mobilenet_thin / mobilenet_v2_large / mobilenet_v2_small')
    parser.add_argument('--resize', type=str, default='0x0',
                        help='if provided, resize images before they are processed. '
                             'default=0x0, Recommends : 432x368 or 656x368 or 1312x736 ')
    parser.add_argument('--resize-out-ratio', type=float, default=2.0,
                        help='if provided, resize heatmaps before they are post-processed. default=1.0')
    args = parser.parse_args(argString.split())

    w, h = model_wh(args.resize)
    if w == 0 or h == 0:
        e = TfPoseEstimator(get_graph_path(args.model), target_size=(432, 368))
    else:
        e = TfPoseEstimator(get_graph_path(args.model), target_size=(w, h))

    # estimate human poses from a single image !
    image = common.read_imgfile(args.image, None, None)
    if image is None:
        logger.error('Image can not be read, path=%s' % args.image)
        sys.exit(-1)

    t = time.time()
    for _ in range(repeats):
        humans = e.inference(image, resize_to_default=(w > 0 and h > 0), upsample_size=args.resize_out_ratio)
    elapsed = time.time() - t
    logger.info('%d times inference on image: %s at %.4f seconds/image.' % (repeats, args.image, elapsed/repeats))

    image = TfPoseEstimator.draw_humans(image, humans, imgcopy=False)
    return image, e

def plotter(image):
    try:
        import matplotlib.pyplot as plt

        fig = plt.figure(figsize=(12,12))
        a = fig.add_subplot(1, 1, 1)
        a.set_title('Result')
        plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    except Exception as e:
        logger.warning('matplotlib error, %s' % e)
        cv2.imshow('result', image)
        cv2.waitKey()
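The timing pattern inside main() can also be isolated into a small reusable helper. The sketch below is one possible variation, not the code used for the measurements in this post: it adds untimed warm-up calls, on the assumption that the first call may include one-time setup work, and it uses a placeholder function in place of e.inference().

```python
import time


def seconds_per_call(fn, repeats=10, warmup=1):
    """Return the mean wall-clock seconds per call of fn over `repeats` runs."""
    for _ in range(warmup):
        fn()  # untimed warm-up calls
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


def stand_in_inference():
    """Hypothetical placeholder for e.inference(...); sleeps for about 10 ms."""
    time.sleep(0.01)


mean_seconds = seconds_per_call(stand_in_inference, repeats=5)
print("%.4f seconds/call" % mean_seconds)
```

Averaging over several calls smooths out run-to-run variation, which is the same reason main() takes a repeats argument.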

Additionally, you can modify the same code structure for inferencing on videos or batches of images, based on the run_video.py or run_directory.py, if you’re feeling adventurous!

The main() function takes as input the same string of arguments as described in the Test Inference section of the GitHub repo. To test the notebook implementation, you use a reference set of arguments (make sure to download the cmu model using the original download script):

img, e = main('--model cmu --resize 656x368 --image=./images/ski.jpg --resize-out-ratio 2.0')
plotter(img)

The logs show your first multi-person pose analyzed:

[TfPoseEstimatorRun] [INFO] 10 times inference on image: ./images/ski.jpg at 1.5624 seconds/image.

At roughly 0.64 frames per second (FPS), this throughput is poor. In this use case, you’re running a TensorFlow graph, --model cmu, without a GPU, and the performance of such a model isn’t optimal on CPU. If you repeat the setup and run the environment on a g4dn.xlarge instance, with one NVIDIA T4 GPU, the result is quite different:

[TfPoseEstimatorRun] [INFO] 10 times inference on image: ./images/ski.jpg at 0.1708 seconds/image.

The result is 5.85 FPS, which is much better.
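The FPS figures are the reciprocal of the seconds-per-image values in the logs. The following short sketch shows the conversion for the two runs above; frames_per_second is a helper name chosen for this illustration.

```python
def frames_per_second(seconds_per_image):
    """Convert an average processing time per image into throughput in FPS."""
    if seconds_per_image <= 0:
        raise ValueError("seconds_per_image must be positive")
    return 1.0 / seconds_per_image


cpu_fps = frames_per_second(1.5624)  # run without a GPU
gpu_fps = frames_per_second(0.1708)  # g4dn.xlarge run
print("%.2f FPS vs %.2f FPS" % (cpu_fps, gpu_fps))
```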

Using the Neuron compiled CMU model

So far, you’ve used model artifacts that came with the repo. Instead of using the original download script to retrieve the CMU model, copy the Neuron compiled model into ./models/graph/cmu/graph_model.pb and rerun the test:

aws s3 cp s3://<MY_BUCKET_NAME>/graph_model.pb ./models/graph/cmu/graph_model.pb

Make sure to restart the Python kernel on the notebook if you previously ran a test of the non-Neuron compiled model. Restarting the kernel helps make sure all TensorFlow sessions are closed and get a fresh start for the benchmark. Running the same notebook again results in the following log entry:

[TfPoseEstimatorRun] [INFO] 10 times inference on image: ./images/ski.jpg at 0.1709 seconds/image.

The results show the same frame rate as the g4dn.xlarge instance, in an environment that costs approximately 30% less On-Demand. Despite the cost benefit of moving the workload to an AWS Inferentia-based instance, this throughput falls short of the large performance gains reported elsewhere. For example, the Amazon Alexa text-to-speech team cut their inference cost by 50% when migrating to AWS Inferentia.

We decided to profile our version of the compiled graph and look for opportunities to fine-tune the end-to-end inference performance of the OpenPose pipeline. The integration of Neuron with TensorFlow gives access to native profiling libraries. To profile the Neuron compiled graph, we instrumented the TensorFlow session run command on the estimator method using the TensorFlow Python profiler:

from tensorflow.core.protobuf import config_pb2
from tensorflow.python.profiler import model_analyzer, option_builder

run_options = config_pb2.RunOptions(trace_level=config_pb2.RunOptions.FULL_TRACE)
run_metadata = config_pb2.RunMetadata()

peaks, heatMat_up, pafMat_up = self.persistent_sess.run(
    [self.tensor_peaks, self.tensor_heatMat_up, self.tensor_pafMat_up],
    feed_dict={
        self.tensor_image: [img], self.upsample_size: upsample_size
    },
    options=run_options, run_metadata=run_metadata
)

options = option_builder.ProfileOptionBuilder.time_and_memory()
model_analyzer.profile(self.persistent_sess.graph, run_metadata, op_log=None, cmd='scope', options=options)

The model_analyzer.profile method prints on StdErr the time and memory consumption of each operation on the TensorFlow graph. With the original code, the Neuron operation and a smoothing operation dominated the total graph runtime. The following output from the StdErr log shows that the total graph runtime took 108.02 milliseconds, of which the smoothing operation took 43.07 milliseconds:

node name | requested bytes | total execution time | accelerator execution time | cpu execution time
_TFProfRoot (--/16.86MB, --/108.02ms, --/0us, --/108.02ms)
…
TfPoseEstimator/conv5_2_CPM_L1/weights/neuron_op_ed41d2deb8c54255 (430.01KB/430.01KB, 58.42ms/58.42ms, 0us/0us, 58.42ms/58.42ms)
…
smoothing (0B/2.89MB, 0us/43.07ms, 0us/0us, 0us/43.07ms)
smoothing/depthwise (2.85MB/2.85MB, 43.05ms/43.05ms, 0us/0us, 43.05ms/43.05ms)
smoothing/gauss_weight (47.50KB/47.50KB, 18us/18us, 0us/0us, 18us/18us)
…
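To quantify how much of the graph runtime a given node accounts for, you can parse the profiler lines directly. The sketch below is a hedged illustration: it assumes each time field is written as self/accumulated, as the smoothing line above suggests, and it only handles the us and ms suffixes that appear in this output.

```python
import re

# Two lines copied from the profiler output above.
profile_lines = [
    "_TFProfRoot (--/16.86MB, --/108.02ms, --/0us, --/108.02ms)",
    "smoothing (0B/2.89MB, 0us/43.07ms, 0us/0us, 0us/43.07ms)",
]

# Assumption: only microsecond and millisecond suffixes need handling here.
UNIT_TO_MS = {"us": 0.001, "ms": 1.0}


def accumulated_time_ms(line):
    """Return a node name and its accumulated total execution time in ms."""
    name, _, rest = line.partition(" (")
    fields = rest.rstrip(")").split(", ")
    # fields[1] is "total execution time", written as "<self>/<accumulated>".
    accumulated = fields[1].split("/")[1]
    match = re.fullmatch(r"([0-9.]+)(us|ms)", accumulated)
    if match is None:
        raise ValueError("unrecognized time value: %r" % accumulated)
    return name, float(match.group(1)) * UNIT_TO_MS[match.group(2)]


times = dict(accumulated_time_ms(line) for line in profile_lines)
share = times["smoothing"] / times["_TFProfRoot"]
print("smoothing takes %.1f%% of the graph runtime" % (100 * share))
```

With the numbers above, smoothing accounts for roughly 40% of the 108.02 milliseconds, which is why it is a natural target for optimization.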

The smoothing method applies a Gaussian blur to the confidence maps calculated by OpenPose. By optimizing this operation, we can extract even more performance out of our end-to-end pose estimation. We modified the filter argument of the smoother in the estimator.py script from 25 to 5. This new configuration brought the total runtime down to 67.44 milliseconds, a 37% reduction, of which the smoother now takes only 2.37 milliseconds. On a g4dn instance, the same optimization had little effect on the runtime. You can also optimize your version of the end-to-end pipeline by changing the same parameter and reinstalling the tf-pose-estimation repo from your local copy.
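One way to see why the filter size matters so much: a 2D convolution with a k by k kernel needs k times k multiply-accumulates per output value, so shrinking the filter from 25 to 5 cuts that work by a factor of 25. The sketch below builds a normalized Gaussian kernel in pure Python and compares the per-value work. The sigma of 3.0 is an assumption made for illustration, and the helper names are not taken from estimator.py.

```python
import math


def gaussian_kernel(size, sigma):
    """Return a normalized size x size Gaussian kernel as nested lists."""
    center = (size - 1) / 2.0
    kernel = [
        [math.exp(-((x - center) ** 2 + (y - center) ** 2) / (2.0 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]
    total = sum(sum(row) for row in kernel)
    return [[value / total for value in row] for row in kernel]


def macs_per_output(size):
    """Multiply-accumulates needed per output value for a size x size kernel."""
    return size * size


small = gaussian_kernel(5, sigma=3.0)
work_ratio = macs_per_output(25) / macs_per_output(5)
runtime_drop = (108.02 - 67.44) / 108.02  # fractional drop in total graph runtime
print(work_ratio, round(runtime_drop, 4))
```

A smaller kernel also blurs less, so it is worth checking the detected keypoints visually after making this change.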

We ran the same benchmark across seven different instance types and sizes to evaluate the performance and cost of inference of our optimized end-to-end image processing pipeline. For comparison, we also show the On-Demand instance pricing from Amazon EC2 Pricing.

The throughput of the smallest Inf1 instance size, inf1.xlarge, is 2 times higher than that of the largest g4dn instance evaluated, g4dn.8xlarge, at one-twelfth the cost per 1000 images. Comparing the two best options, inf1.xlarge and g4dn.xlarge, Inf1 has a 72% lower cost per 1000 images, or 3.57 times better price-performance than the lowest-cost GPU option. The following table summarizes these findings.

Metric | inf1.xlarge | inf1.2xlarge | inf1.6xlarge | g4dn.xlarge | g4dn.2xlarge | g4dn.4xlarge | g4dn.8xlarge
Image process time [seconds/image] | 0.0703 | 0.0677 | 0.0656 | 0.1708 | 0.1526 | 0.1477 | 0.1427
Throughput [FPS] | 14.22 | 14.77 | 15.24 | 5.85 | 6.55 | 6.77 | 7.01
1000 images processing time [seconds] | 70.3 | 67.7 | 65.6 | 170.8 | 152.6 | 147.7 | 142.7
On-Demand cost [$/hr] | $0.368 | $0.584 | $1.904 | $0.526 | $0.752 | $1.204 | $2.176
Cost per 1000 images [$] | $0.007 | $0.011 | $0.035 | $0.025 | $0.032 | $0.049 | $0.086
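The cost row follows from the processing time and the hourly price: 1000 images take 1000 times the seconds-per-image value, billed at the hourly rate divided by 3600. The following sketch reproduces that arithmetic from the table values; the helper name cost_per_1000_images is chosen for this illustration.

```python
# Measured seconds/image and On-Demand price in $/hr, copied from the table.
instances = {
    "inf1.xlarge": (0.0703, 0.368),
    "inf1.2xlarge": (0.0677, 0.584),
    "inf1.6xlarge": (0.0656, 1.904),
    "g4dn.xlarge": (0.1708, 0.526),
    "g4dn.2xlarge": (0.1526, 0.752),
    "g4dn.4xlarge": (0.1477, 1.204),
    "g4dn.8xlarge": (0.1427, 2.176),
}


def cost_per_1000_images(seconds_per_image, dollars_per_hour):
    """Cost of processing 1000 images back to back at the hourly rate."""
    return 1000 * seconds_per_image * dollars_per_hour / 3600.0


costs = {name: cost_per_1000_images(*values) for name, values in instances.items()}
for name, cost in costs.items():
    print("%-13s $%.3f per 1000 images" % (name, cost))

# Compare the two xlarge sizes using the rounded costs reported in the table.
inf1, g4dn = round(costs["inf1.xlarge"], 3), round(costs["g4dn.xlarge"], 3)
savings = 1 - inf1 / g4dn
price_performance = g4dn / inf1
print("%.0f%% lower cost, %.2fx price-performance" % (100 * savings, price_performance))
```

Using the rounded costs from the table, the comparison of the two xlarge sizes gives the 72% and 3.57 times figures quoted in the text.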

The chart below summarizes the throughput and cost per 1000 images results for the xlarge and 2xlarge instance sizes.

We further reduced the image-processing cost and increased the throughput of tf-pose-estimation on an Inf1 instance by taking a data-parallel approach to the end-to-end pipeline. The values shown in the preceding table relate to the use of a single AWS Inferentia processing core, called a Neuron core. The benchmarked instance has four, so it’s wasteful to use only one. Our test with an embarrassingly parallel implementation of the main() function call using the Python joblib library showed linear scaling up to four threads. This pattern increased the throughput to 56.88 FPS and decreased the cost per 1000 images to below $0.002. This is a good indication that a better batching strategy can further improve the price-performance ratio of OpenPose on AWS Inferentia.
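The data-parallel pattern can be sketched as follows. This is not the joblib code used for the benchmark: it swaps in the standard library's concurrent.futures so that it runs without extra packages, and process_image is a hypothetical stand-in for one end-to-end call such as main(). How each worker is bound to its own Neuron core depends on the Neuron runtime configuration, which this sketch does not cover.

```python
from concurrent.futures import ThreadPoolExecutor


def process_image(path):
    """Hypothetical stand-in for one end-to-end call such as main() on an image."""
    return "processed:" + path


def run_data_parallel(paths, workers=4):
    """Process images concurrently, one worker per Neuron core (assumed four)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order in the returned results.
        return list(pool.map(process_image, paths))


results = run_data_parallel(["a.jpg", "b.jpg", "c.jpg", "d.jpg"])
print(results)

# Projected figures under the linear scaling reported above.
single_core_fps = 14.22
fps = 4 * single_core_fps
cost = 1000 / fps * 0.368 / 3600.0
print("%.2f FPS, $%.4f per 1000 images" % (fps, cost))
```

Because each image is processed independently, no coordination between workers is needed beyond collecting the results, which is what makes the workload embarrassingly parallel.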

The larger CMU model also provides good pose estimation performance. For example, see the following image of the multi-pose detection using the Neuron SDK compiled model, on a scene with subjects at multiple depths.

Safely shutting down and cleaning up

On the Amazon EC2 console, choose the compilation and inference instances, and choose Terminate from the Actions drop-down menu. You persisted the compiled model in your s3://<MY_BUCKET_NAME> so it can be reused later. If you’ve made changes to the code inside the instances, remember to persist those as well. The instance termination discards data stored only in the instance’s home volume.

Conclusion

In this post, you walked through the steps of compiling an open-source OpenPose TensorFlow model, updating a custom end-to-end image processing pipeline, and identifying tools to profile and further optimize your ML inference time on an EC2 Inf1 instance. When tuned, the Neuron compiled TensorFlow model was 72% less expensive than the cheapest GPU instance, with consistently better performance. The steps described in this post also apply to other ML model types and frameworks. For more information, see the AWS Neuron SDK GitHub repo.

Learn more about the AWS Inferentia chip and the Amazon EC2 Inf1 instances to get started with running your own custom ML pipelines on AWS Inferentia using the Neuron SDK.


About the Authors

Fabio Nonato de Paula is a Principal Solutions Architect for Autonomous Computing in AWS. He works with large-scale deployments of machine learning and AI for autonomous and intelligent systems. Fabio is passionate about democratizing access to accelerated computing and distributed ML. Outside of work, you can find Fabio riding his motorcycle on the hills of Livermore valley or reading ComiXology.

Haichen Li is a software development engineer in the AWS Neuron SDK team. He works on integrating machine learning frameworks with the AWS Neuron compiler and runtime systems, as well as developing deep learning models that benefit particularly from the Inferentia hardware.

Source: https://aws.amazon.com/blogs/machine-learning/deploying-tensorflow-openpose-on-aws-inferentia-based-inf1-instances-for-significant-price-performance-improvements/

Executive Interview: Brian Gattoni, CTO, Cybersecurity & Infrastructure Security Agency 

As CTO of the Cybersecurity & Infrastructure Security Agency of the DHS, Brian Gattoni is charged with understanding and advising on cyber and physical risks to the nation’s critical infrastructure. 

Understanding and Advising on Cyber and Physical Risks to the Nation’s Critical Infrastructure 

Brian Gattoni, CTO, Cybersecurity & Infrastructure Security Agency

Brian R. Gattoni is the Chief Technology Officer for the Cybersecurity and Infrastructure Security Agency (CISA) of the Department of Homeland Security. CISA is the nation’s risk advisor, working with partners to defend against today’s threats and collaborating to build a secure and resilient infrastructure for the future. Gattoni sets the technical vision and strategic alignment of CISA data and mission services. Previously, he was the Chief of Mission Engineering & Technology, developing analytic techniques and new approaches to increase the value of DHS cyber mission capabilities. Prior to joining DHS in 2010, Gattoni served in various positions at the Defense Information Systems Agency and the United States Army Test & Evaluation Command. He holds a Master of Science Degree in Cyber Systems & Operations from the Naval Postgraduate School in Monterey, California, and is a Certified Information Systems Security Professional (CISSP).  

AI Trends: What is the technical vision for CISA to manage risk to federal networks and critical infrastructure? 

Brian Gattoni: Our technology vision is built in support of our overall strategy. We are the nation’s risk advisor. It’s our job to stay abreast of incoming threats and opportunities for general risk to the nation. Our efforts are to understand and advise on cyber and physical risks to the nation’s critical infrastructure.  

It’s all about bringing in the data, understanding what decisions need to be made and can be made from the data, and what insights are useful to our stakeholders. The potential of AI and machine learning is to expand on operational insights with additional data sets to make better use of the information we have.  

What are the most prominent threats? 

The Cybersecurity and Infrastructure Security Agency (CISA) of the Department of Homeland Security is the Nation’s risk advisor.

The sources of threats we frequently discuss are the adversarial actions of nation-state actors and those aligned with nation-state actors and their interests, in disrupting national critical functions here in the U.S. Just in the past month, we’ve seen increased activity from elements supporting what we refer to in the government as Hidden Cobra [malicious cyber activity by the North Korean government]. We’ve issued joint alerts with our partners overseas and the FBI and the DoD, highlighting activity associated with Chinese actors. On CISA.gov people can find CISA Insights, which are documents that provide background information on particular cyber threats and the vulnerabilities they exploit, as well as a ready-made set of mitigation activities that non-federal partners can implement.   

What role does AI play in the plan? 

Artificial intelligence has a great role to play in the support of the decisions we make as an agency. Fundamentally, AI is going to allow us to apply our decision processes to a scale of data that humans just cannot keep up with. And that’s especially prevalent in the cyber mission. We remain cognizant of how we make decisions in the first place and target artificial intelligence and machine learning algorithms that augment and support that decision-making process. We’ll be able to use AI to provide operational insights at a greater scale or across a greater breadth of our mission space.  

How far along are you in the implementation of AI at the CISA? 

Implementing AI is not as simple as putting in a new business intelligence tool or putting in a new email capability. Really augmenting your current operations with artificial intelligence is a mix of the culture change, for humans to understand how the AI is supposed to augment their operations. It is a technology change, to make sure you have the scalable compute and the right tools in place to do the math you’re talking about implementing. And it’s a process change. We want to deliver artificial intelligence algorithms that augment our operators’ decisions as a support mechanism.  

Where we are in the implementation is closer to understanding those three things. We’re working with partners in federally funded research and development centers, national labs and the department’s own Science and Technology Data Analytics Tech Center to develop capability in this area. We’ve developed an analytics meta-process which helps us systemize the way we take in data and puts us in a position to apply artificial intelligence to expand our use of that data.

Do you have any interesting examples of how AI is being applied in CISA and the federal government today? Or what you are working toward, if that’s more appropriate. 

I have a recent use case. We’ve been working with some partners over the past couple of months to apply AI to a humanitarian assistance and disaster relief type of mission. So, within CISA, we also have responsibilities for critical infrastructure. During hurricane season, we always have a role to play in helping advise what the potential impacts are to critical infrastructure sites in the affected path of a hurricane.  

We prepared to conduct an experiment leveraging AI algorithms and overhead imagery to figure out if we could analyze the data from a National Oceanic and Atmospheric Administration flight over the affected area. We compared that imagery with the base imagery from Google Earth or ArcGIS and used AI to identify any affected critical infrastructure. We could see the extent to which certain assets, such as oil refineries, were physically flooded. We could make an assessment as to whether they hit a threshold of damage that would warrant additional scrutiny, or we didn’t have to apply resources because their resilience was intact, and their functions could continue.   

That is a nice use case, a simple example of letting a computer do the comparisons and make a recommendation to our human operators. We found that it was very good at telling us which critical infrastructure sites did not need any additional intervention. To use a needle in a haystack analogy, one of the useful things AI can help us do is blow hay off the stack in pursuit of the needle. And that’s a win also. The experiment was very promising in that sense.  

How does CISA work with private industry, and do you have any examples of that?  

We have an entire division dedicated to stakeholder engagement. Private industry owns over 80% of the critical infrastructure in the nation. So CISA sits at the intersection of the private sector and the government to share information, to ensure we have resilience in place for both the government entities and the private entities, in the pursuit of resilience for those national critical functions. Over the past year we’ve defined a set of 55 functions that are critical for the nation.  

When we work with private industry in those areas we try to share the best insights and make decisions to ensure those function areas will continue unabated in the face of a physical or cyber threat. 

Cloud computing is growing rapidly. We see different strategies, including using multiple vendors of the public cloud, and a mix of private and public cloud in a hybrid strategy. What do you see is the best approach for the federal government? 

In my experience the best approach is to provide guidance to the CIOs and CISOs across the federal government and allow them the flexibility to make risk-based determinations on their own computing infrastructure as opposed to a one-size-fits-all approach.

We issue a series of use cases that describe, at a very high level, a reference architecture about a type of cloud implementation and where security controls should be implemented, and where telemetry and instrumentation should be applied. You have departments and agencies that have a very forward-facing public citizen services portfolio, which means access to information is one of their primary responsibilities. Public clouds and ease of access are most appropriate for those. And then there are agencies with more sensitive missions. Those have critical high value data assets that need to be protected in a specific way. Giving each the guidance they need to handle all of their use cases is what we’re focused on here.

I wanted to talk a little bit about job roles. How are you defining the job roles around AI in CISA, as in data scientists, data engineers, and other important job titles and new job titles?  

I could spend the remainder of our time on this concept of job roles for artificial intelligence; it’s a favorite topic for me. I am a big proponent of the discipline of data science being a team sport. We currently have our engineers and our analysts and our operators. And the roles and disciplines around data science and data engineers have been morphing out of an additional duty on analysts and engineers into its own sub sector, its own discipline. We’re looking at a cadre of data professionals that serve almost as a logistics function to our operators who are doing the mission-level analysis. If you treat data as an asset that has to be moved and prepared and cleaned and readied, all terms in the data science and data engineering world now, you start to realize that it requires logistics functions similar to any other asset that has to be moved. 

If you get professionals dedicated to that end, you will be able to scale to the data problems you have without overburdening your current engineers who are building the compute platforms, or your current mission analysts who are trying to interpret the data and apply the insights to your stakeholders. You will have more team members moving data to the right places, making data-driven decisions. 

Are you able to hire the help you need to do the job? Are you able to find qualified people? Where are the gaps? 

As the domain continues to mature, as we understand more about the different roles, we begin to see gaps: education programs and training programs that need to be developed. I think maybe three to five years ago, you would see certificates from higher education in data science. Now we’re starting to see full-fledged degrees as concentrations out of computer science or mathematics. Those graduates are the pipeline to help us fill the gaps we currently have. So as far as our current problems, there are never enough people. It’s always hard to get the good ones and then keep them because the competition is so high.

Here at CISA, we continue to invest not only in our own folks that are re-training, but in the development of a cyber education and training group, which is looking at the partnerships with academia to help shore up that pipeline. It continually improves. 

Do you have a message for high school or college students interested in pursuing a career in AI, either in the government or in business, as to what they should study? 

Yes, and it’s similar to the message I give to the high schoolers that live in my house. That is, don’t give up on math so easily. Math and science, the STEM subjects, have foundational skills that may be applicable to your future career. That is not to discount the diversity and variety of thought processes that come from other disciplines. I tell my kids they need the mathematical foundation to be able to apply the thought processes you learn from studying music or studying art or studying literature, and the different ways that those disciplines help you make connections. But have the mathematical foundation to represent those connections to a computer.

One of the fallacies around machine learning is that it will just learn [by itself]. That’s not true. You have to be able to teach it, and you can only talk to computers with math, at the base level.  

So if you have the mathematical skills to relay your complicated human thought processes to the computer, and now it can replicate those patterns and identify what you’re asking it to do, you will have success in this field. But if you give up on the math part too early (it’s a progressive discipline), if you give up on algebra two and then come back years later and jump straight into calculus, success is going to be difficult, but not impossible.

You sound like a math teacher.  

A simpler way to say it is: if you say no to math now, it’s harder to say yes later. But if you say yes now, you can always say no later, if data science ends up not being your thing.  

Are there any incentives for young people, let’s say a student just out of college, to go to work for the government? Is there any kind of loan forgiveness for instance?  

We have a variety of programs. The one that I really like, that I have had a lot of success with as a hiring manager in the federal government, especially here at DHS over the past 10 years, is a program called Scholarship for Service. It’s a CyberCorps program where interested students who pass the acceptance process can get a degree in exchange for some service time. It used to be two years; it might be more now, but they owe some time and service to the federal government after the completion of their degree.

I have seen many successful candidates come out of that program and go on to fantastic careers, contributing in cyberspace all over. I have interns that I hired nine years ago that are now senior leaders in this organization or have departed for private industry and are making their difference out there. It’s a fantastic program for young folks to know about.  

What advice do you have for other government agencies just getting started in pursuing AI to help them meet their goals? 

My advice for my peers and partners and anybody who’s willing to listen to it is, when you’re pursuing AI, be very specific about what it can do for you.   

I go back to the decisions you make, what people are counting on you to do. You bear some responsibility to know how you make those decisions if you’re really going to leverage AI and machine learning to make decisions faster or better or some other quality of goodness. The speed at which you make decisions will go both ways. You have to identify your benefit of that decision being made if it’s positive and define your regret if that decision is made and it’s negative. And then do yourself a simple HIGH-LOW matrix; the quadrant of high-benefit, low-regret decisions is the target. Those are ones that I would like to automate as much as possible. And if artificial intelligence and machine learning can help, that would be great. If not, that’s a decision you have to make.

I have two examples I use in our cyber mission to illustrate the extremes here. One is for incident triage. If a cyber incident is detected, we have a triage process to make sure that it’s real. That presents information to an analyst. If that’s done correctly, it has a high benefit because it can take a lot of work off our analysts. It has low-to-medium regret if it’s done incorrectly, because the decision is to present information to an analyst who can then provide that additional filter. So that’s a high benefit, low regret. That’s a no-brainer for automating as much as possible.

On the other side of the spectrum is protecting next-generation 911 call centers from a potential telephony denial-of-service attack. One of the potential automated responses could be to cut off the incoming traffic to the 911 call center to stunt the attack. Benefit: you may have prevented the attack. Regret: potentially you’re cutting off legitimate traffic to a 911 call center, and that has life and safety implications. And that is unacceptable. That’s an area where automation is probably not the right approach. Those are two extreme examples, which are easy for people to understand, and they help illustrate how the benefit-regret matrix can work. How you make decisions is really the key to understanding whether to implement AI and machine learning to help automate those decisions using the full breadth of data.
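The benefit-regret matrix described above can be sketched in a few lines of Python. This is only an illustration: the `Decision` class, the three-level rating scale, the `max_regret` threshold, and the specific ratings given to the two examples (in particular the "high" benefit assigned to the 911 cutoff) are assumptions made for this sketch, not anything CISA has published.

```python
from dataclasses import dataclass

# Hypothetical three-level rating scale; the interview only names HIGH and LOW.
LEVELS = {"low": 0, "medium": 1, "high": 2}


@dataclass(frozen=True)
class Decision:
    name: str
    benefit: str  # rating if the automated decision turns out right
    regret: str   # rating if the automated decision turns out wrong


def should_automate(decision: Decision, max_regret: str = "low") -> bool:
    """Return True only for the high-benefit, low-regret quadrant.

    max_regret is an assumed tolerance; raise it to "medium" to accept
    decisions whose downside is still filtered by a human analyst.
    """
    high_benefit = LEVELS[decision.benefit] == LEVELS["high"]
    tolerable_regret = LEVELS[decision.regret] <= LEVELS[max_regret]
    return high_benefit and tolerable_regret


# Ratings below are one reading of the two examples in the interview.
triage = Decision("incident triage", benefit="high", regret="low")
cutoff = Decision("cut off 911 call center traffic", benefit="high", regret="high")

for d in (triage, cutoff):
    verdict = "automate" if should_automate(d) else "keep a human in the loop"
    print(f"{d.name}: {verdict}")
```

Keeping the threshold as a parameter makes the point from the interview explicit: the ratings come from the people accountable for the decision, and the code only records where each decision lands.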

Learn more about the Cybersecurity & Infrastructure Security Agency.  

Source: https://www.aitrends.com/executive-interview/executive-interview-brian-gattoni-cto-cybersecurity-infrastructure-security-agency/

Making Use Of AI Ethics Tuning Knobs In AI Autonomous Cars 

Ethical tuning knobs would be a handy addition to self-driving car controls, the author suggests, if for example the operator was late for work and needed to exceed the speed limit. (Credit: Getty Images) 

By Lance Eliot, the AI Trends Insider  

There is increasing awareness about the importance of AI Ethics, consisting of being mindful of the ethical ramifications of AI systems.   

AI developers are being asked to carefully design and build their AI mechanizations by ensuring that ethical considerations are at the forefront of the AI systems development process. When fielding AI, those responsible for the operational use of the AI also need to be considering crucial ethical facets of the in-production AI systems. Meanwhile, the public and those using or reliant upon AI systems are starting to clamor for heightened attention to the ethical and unethical practices and capacities of AI.   

Consider a simple example. Suppose an AI application is developed to assess car loan applicants. Using Machine Learning (ML) and Deep Learning (DL), the AI system is trained on a trove of data and arrives at some means of choosing among those that it deems are loan worthy and those that are not. 

The underlying Artificial Neural Network (ANN) is so computationally complex that there are no apparent means to interpret how it arrives at the decisions being rendered. Also, there is no built-in explainability capability and thus the AI is unable to articulate why it is making the choices that it is undertaking (note: there is a movement toward including XAI, explainable AI components to try and overcome this inscrutability hurdle).   

Soon after the AI-based loan assessment application is fielded, protests arise from some who assert they were turned down for their car loan due to an improper inclusion of race or gender as a key factor in rendering the negative decision.

At first, the maker of the AI application insists that they did not utilize such factors and professes complete innocence in the matter. Turns out though that a third-party audit of the AI application reveals that the ML/DL is indeed using race and gender as core characteristics in the car loan assessment process. Deep within the mathematically arcane elements of the neural network, data related to race and gender were intricately woven into the calculations, having been dug out of the initial training dataset provided when the ANN was crafted. 

That is an example of how biases can be hidden within an AI system. And it also showcases that such biases can go otherwise undetected, including that the developers of the AI did not realize that the biases existed and were seemingly confident that they had not done anything to warrant such biases being included. 

People affected by the AI application might not realize they are being subjected to such biases. In this example, those being adversely impacted perchance noticed and voiced their concerns, but we are apt to witness a lot of AI in which no one realizes they are being subjected to biases and is therefore unable to ring the bell of dismay.

Various AI Ethics principles are being proffered by a wide range of groups and associations, hoping that those crafting AI will take seriously the need to consider embracing AI ethical considerations throughout the life cycle of designing, building, testing, and fielding AI.   

AI Ethics typically consists of these key principles: 

1)      Inclusive growth, sustainable development, and well-being 

2)      Human-centered values and fairness 

3)      Transparency and explainability 

4)      Robustness, security, and safety 

5)      Accountability   

We certainly expect humans to exhibit ethical behavior, and thus it seems fitting that we would expect ethical behavior from AI too.   

Since the aspirational goal of AI is to provide machines that are the equivalent of human intelligence, being able to presumably embody the same range of cognitive capabilities that humans do, this perhaps suggests that we will only be able to achieve the vaunted goal of AI by including some form of ethics-related component or capacity. 

What this means is that if humans encapsulate ethics, which they seem to do, and if AI is trying to achieve what humans are and do, the AI ought to have an infused ethics capability else it would be something less than the desired goal of achieving human intelligence.   

You could claim that anyone crafting AI that does not include an ethics facility is undercutting what should be a crucial and integral aspect of any AI system worth its salt. 

Of course, trying to achieve the goals of AI is one matter. Meanwhile, since we are going to be mired in a world with AI, for our safety and well-being as humans we would rightfully argue that AI had better darned well abide by ethical behavior, however that might be achieved.

Now that we’ve covered that aspect, let’s take a moment to ponder the nature of ethics and ethical behavior.  

Considering Whether Humans Always Behave Ethically   

Do humans always behave ethically? I think we can all readily agree that humans do not necessarily always behave in a strictly ethical manner.   

Is ethical behavior by humans able to be characterized solely by whether someone is in an ethically binary state of being, namely either purely ethical versus being wholly unethical? I would dare say that we cannot always pin down human behavior into two binary-based and mutually exclusive buckets of being ethical or being unethical. The real-world is often much grayer than that, and we at times are more likely to assess that someone is doing something ethically questionable, but it is not purely unethical, nor fully ethical. 

In a sense, you could assert that human behavior ranges on a spectrum of ethics, at times being fully ethical and ranging toward the bottom of the scale as being wholly and inarguably unethical. In-between there is a lot of room for how someone ethically behaves. 

If you agree that the world is not a binary ethical choice of behaviors that fit only into truly ethical versus solely unethical, you would therefore also presumably be amenable to the notion that there is a potential scale upon which we might be able to rate ethical behavior. 

This scale might be from the scores of 1 to 10, or maybe 1 to 100, or whatever numbering we might wish to try and assign, maybe even including negative numbers too. 

Let’s assume for the moment that we will use the positive numbers of a 1 to 10 scale for increasingly being ethical (the topmost is 10), and the scores of -1 to -10 for being unethical (the -10 is the least ethical or in other words most unethical potential rating), and zero will be the midpoint of the scale. 

Please do not get hung up on the scale numbering, which can be anything else that you might like. We could even use letters of the alphabet or any kind of sliding scale. The point being made is that there is a scale, and we could devise some means to establish a suitable scale for use in these matters.   

The twist is about to come, so hold onto your hat.   

We could observe a human and rate their ethical behavior on particular aspects of what they do. Maybe at work, a person gets an 8 for being ethically observant, while perhaps at home they are a more devious person, and they get a -5 score. 

Okay, so we can rate human behavior. Could we drive or guide human behavior by the use of the scale? 

Suppose we tell someone that at work they are being observed and their target goal is to hit an ethics score of 9 for their first year with the company. Presumably, they will undertake their work activities in such a way that it helps them to achieve that score.   

In that sense, yes, we can potentially guide or prod human behavior by providing targets related to ethical expectations. I told you a twist was going to arise, and now here it is. For AI, we could use an ethical rating or score to try and assess how ethically proficient the AI is.   

In that manner, we might be more comfortable using that particular AI if we knew that it had a reputable ethical score. And we could also presumably seek to guide or drive the AI toward an ethical score too, similar to how this can be done with humans, and perhaps indicate that the AI should be striving towards some upper bound on the ethics scale. 

Some pundits immediately recoil at this notion. They argue that AI should always be a +10 (using the scale that I’ve laid out herein). Anything less than a top ten is an abomination and the AI ought to not exist. Well, this takes us back into the earlier discussion about whether ethical behavior is in a binary state.   

Are we going to hold AI to a “higher bar” than humans by insisting that AI always be “perfectly” ethical and nothing less so?   

This is somewhat of a quandary due to the point that AI overall is presumably aiming to be the equivalent of human intelligence, and yet we do not hold humans to that same standard. 

For some, they fervently believe that AI must be held to a higher standard than humans. We must not accept or allow any AI that cannot do so. 

Others indicate that this seems to fly in the face of what is known about human behavior and begs the question of whether AI can be attained if it must do something that humans cannot attain.   

Furthermore, they might argue that forcing AI to do something that humans do not undertake is now veering away from the assumed goal of arriving at the equivalent of human intelligence, which might bump us away from being able to do so as a result of this insistence about ethics.   

Round and round these debates continue to go. 

Those on the must-be-topnotch side of the ethical AI debate are often quick to point out that by allowing AI to be anything less than a top ten, you are opening Pandora’s box. For example, it could be that AI dips down into the negative numbers and sits at a -4, or worse, it descends to become miserably and fully unethical at a dismal -10.

Anyway, this is a debate that is going to continue and not be readily resolved, so let’s move on. 

If you are still of the notion that ethics exists on a scale and that AI might also be measured by such a scale, and if you also are willing to accept that behavior can be driven or guided by offering where to reside on the scale, the time is ripe to bring up tuning knobs. Ethics tuning knobs. 

Here’s how that works. You come in contact with an AI system and are interacting with it. The AI presents you with an ethics tuning knob, showcasing a scale akin to our ethics scale earlier proposed. Suppose the knob is currently at a 6, but you want the AI to be acting more aligned with an 8, so you turn the knob upward to the 8. At that juncture, the AI adjusts its behavior so that ethically it is exhibiting an 8-score level of ethical compliance rather than the earlier setting of a 6. 

What do you think of that? 

Some would bellow out balderdash, hogwash, and just unadulterated nonsense. Is it a preposterous idea, or is it genius? You’ll find that there are experts on both sides of that coin. Perhaps it might be helpful to provide the ethics tuning knob within a contextual exemplar to highlight how it might come into play.

Here’s a handy contextual indication for you: Will AI-based true self-driving cars potentially contain an ethics tuning knob for use by riders or passengers that use self-driving vehicles?   

Let’s unpack the matter and see.   

For my framework about AI autonomous cars, see the link here: https://aitrends.com/ai-insider/framework-ai-self-driving-driverless-cars-big-picture/ 

Why this is a moonshot effort, see my explanation here: https://aitrends.com/ai-insider/self-driving-car-mother-ai-projects-moonshot/ 

For more about the levels as a type of Richter scale, see my discussion here: https://aitrends.com/ai-insider/richter-scale-levels-self-driving-cars/ 

For the argument about bifurcating the levels, see my explanation here: https://aitrends.com/ai-insider/reframing-ai-levels-for-self-driving-cars-bifurcation-of-autonomy/   

Understanding The Levels Of Self-Driving Cars   

As a clarification, true self-driving cars are ones that the AI drives the car entirely on its own and there isn’t any human assistance during the driving task.   

These driverless vehicles are considered Level 4 and Level 5, while a car that requires a human driver to co-share the driving effort is usually considered Level 2 or Level 3. The cars that co-share the driving task are described as being semi-autonomous, and typically contain a variety of automated add-ons that are referred to as ADAS (Advanced Driver-Assistance Systems).

There is not yet a true self-driving car at Level 5, and we don’t yet even know whether this will be possible to achieve, nor how long it will take to get there.

Meanwhile, the Level 4 efforts are gradually trying to get some traction by undergoing very narrow and selective public roadway trials, though there is controversy over whether this testing should be allowed per se (we are all life-or-death guinea pigs in an experiment taking place on our highways and byways, some contend). 

Since semi-autonomous cars require a human driver, the adoption of those types of cars won’t be markedly different than driving conventional vehicles, so there’s not much new per se to cover about them on this topic (though, as you’ll see in a moment, the points next made are generally applicable).   

For semi-autonomous cars, it is important that the public be forewarned about a disturbing aspect that’s been arising lately, namely that despite those human drivers that keep posting videos of themselves falling asleep at the wheel of a Level 2 or Level 3 car, we all need to avoid being misled into believing that the driver can take away their attention from the driving task while driving a semi-autonomous car.

You are the responsible party for the driving actions of the vehicle, regardless of how much automation might be tossed into a Level 2 or Level 3. 

For why remote piloting or operating of self-driving cars is generally eschewed, see my explanation here: https://aitrends.com/ai-insider/remote-piloting-is-a-self-driving-car-crutch/ 

To be wary of fake news about self-driving cars, see my tips here: https://aitrends.com/ai-insider/ai-fake-news-about-self-driving-cars/ 

The ethical implications of AI driving systems are significant, see my indication here: https://aitrends.com/selfdrivingcars/ethically-ambiguous-self-driving-cars/   

Be aware of the pitfalls of normalization of deviance when it comes to self-driving cars, here’s my call to arms: https://aitrends.com/ai-insider/normalization-of-deviance-endangers-ai-self-driving-cars/   

Self-Driving Cars And Ethics Tuning Knobs 

For Level 4 and Level 5 true self-driving vehicles, there won’t be a human driver involved in the driving task. All occupants will be passengers. The AI is doing the driving.   

This seems rather straightforward. You might be wondering where any semblance of ethics behavior enters the picture. Here’s how. Some believe that a self-driving car should always strictly obey the speed limit. 

Imagine that you have just gotten into a self-driving car in the morning and it turns out that you are possibly going to be late getting to work. Your boss is a stickler and has told you that coming in late is a surefire way to get fired.   

You tell the AI via its Natural Language Processing (NLP) that the destination is your work address. 

And, you ask the AI to hit the gas, push the pedal to the metal, screech those tires, and get you to work on-time.

But it is clear-cut that if the AI obeys the speed limit, there is absolutely no chance of arriving at work on-time, and since the AI is only and always going to go at or below the speed limit, your goose is cooked.

Better luck at your next job.   

Whoa, suppose the AI driving system had an ethics tuning knob. 

Abiding strictly by the speed limit occurs when the knob is cranked up to the top numbers like say 9 and 10. 

You turn the knob down to a 5 and tell the AI that you need to rush to work, even if it means going over the speed limit. At a score of 5, the AI driving system will mildly exceed the speed limit, though not in places like school zones, and only when the traffic situation seems to allow for safely going faster than the speed limit by a smidgen.

The AI self-driving car gets you to work on-time!   

Later that night, when heading home, you are not in as much of a rush, so you put the knob back to the 9 or 10 that it earlier was set at. 

Also, you have a child-lock on the knob, such that when your kids use the self-driving car, which they can do on their own since there isn’t a human driver needed, the knob is always set at the topmost of the scale and the children cannot alter it.   
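To make the scenario concrete, here is a minimal Python sketch of such a knob. Everything in it is hypothetical: the `EthicsKnob` class, the `target_speed` function, and especially the mapping from knob setting to miles per hour over the limit are invented for illustration and do not describe any real driving system.

```python
from dataclasses import dataclass


@dataclass
class EthicsKnob:
    setting: int = 10     # positive half of the scale: 1 (lax) to 10 (strict)
    locked: bool = False  # child lock: when True, the setting cannot change

    def turn_to(self, value: int) -> None:
        if self.locked:
            raise PermissionError("ethics knob is locked")
        if not 1 <= value <= 10:
            raise ValueError("setting must be between 1 and 10")
        self.setting = value


def target_speed(limit_mph: float, knob: EthicsKnob,
                 school_zone: bool, traffic_clear: bool) -> float:
    """Pick a cruising speed for the posted limit and the knob setting."""
    # Settings 9 and 10 abide strictly by the limit; school zones and
    # unsafe traffic conditions do so at every setting.
    if knob.setting >= 9 or school_zone or not traffic_clear:
        return limit_mph
    # Assumed mapping: 1 mph over per step below 9, capped at 5 mph,
    # to model exceeding the limit "by a smidgen".
    return limit_mph + min(9 - knob.setting, 5)


commute = EthicsKnob()   # starts at the strict end of the scale
commute.turn_to(5)       # late for work
print(target_speed(40, commute, school_zone=False, traffic_clear=True))
print(target_speed(25, commute, school_zone=True, traffic_clear=True))

kids = EthicsKnob(setting=10, locked=True)  # children cannot alter this one
```

Even this toy version shows where the debate bites: someone has to choose the cap, the exempt zones, and who is allowed to hold the unlocked knob.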

How does that seem to you? 

Some self-driving car pundits find the concept of such a tuning knob to be repugnant. 

They point out that everyone will “cheat” and put the knob on the lower scores that will allow the AI to do the same kind of shoddy and dangerous driving that humans do today. Whatever we might have otherwise gained by having self-driving cars, such as the hoped-for reduction in car crashes, along with the reduction in associated injuries and fatalities, will be lost due to the tuning knob capability.   

Others though point out that it is ridiculous to think that people will put up with self-driving cars that are restricted drivers that never bend or break the law. 

You’ll end up with people opting to rarely use self-driving cars and instead driving their human-driven cars. This is because they know that they can drive more fluidly and won’t be stuck inside a self-driving car that drives like some scaredy-cat.

As you might imagine, the ethical ramifications of an ethics tuning knob are immense. 

In this use case, there is a kind of obviousness about the impacts of what an ethics tuning knob foretells.   

Other kinds of AI systems will have their semblance of what an ethics tuning knob might portend, and though it might not be as readily apparent as the case of self-driving cars, there is potentially as much at stake in some of those other AI systems too (which, like a self-driving car, might entail life-or-death repercussions).   

Conclusion   

If you really want to get someone going about the ethics tuning knob topic, bring up the allied matter of the Trolley Problem.   

The Trolley Problem is a famous thought experiment involving having to make choices about saving lives and which path you might choose. This has been repeatedly brought up in the context of self-driving cars and garnered acrimonious attention along with rather diametrically opposing views on whether it is relevant or not. 

In any case, the big overarching questions are whether we will expect AI to have an ethics tuning knob and, if so, what it will do and how it will be used.

Those that insist there is no cause to have any such device are apt to equally insist that we must have AI that is only and always practicing the utmost of ethical behavior. 

Is that a Utopian perspective or can it be achieved in the real world as we know it?   

Only my crystal ball can say for sure.  

Copyright 2020 Dr. Lance Eliot  

This content is originally posted on AI Trends.  

[Ed. Note: For readers interested in Dr. Eliot’s ongoing business analyses about the advent of self-driving cars, see his online Forbes column: https://forbes.com/sites/lanceeliot/] 

http://ai-selfdriving-cars.libsyn.com/website 

Source: https://www.aitrends.com/ai-insider/making-use-of-ai-ethics-tuning-knobs-in-ai-autonomous-cars/

Application of AI to IT Service Ops by IBM and ServiceNow Exemplifies a Trend 

AI combined with IT service operations is seen as having the potential to automate many tasks while improving response times and decreasing costs (Credit: Getty Images) 

By John P. Desmond, AI Trends Editor 

The application of AI to IT service operations has the potential to automate many tasks and drive down the cost of operations. 

The trend is exemplified by the recent agreement between IBM and ServiceNow to leverage IBM’s AI-powered cloud infrastructure with ServiceNow’s intelligent workflow systems, as reported in Forbes. 

The goal is to reduce resolution times and lower the cost of outages, which according to a recent report from Aberdeen, can cost a company $260,000 per hour.  

“Digital transformation is no longer optional for anyone, and AI and digital workflows are the way forward,” stated David Parsons, Senior Vice President of Global Alliances and Partner Ecosystem at ServiceNow. “The four keys to success with AI are the ability 1) to automate IT, 2) gain deeper insights, 3) reduce risks, and 4) lower costs across your business,” Parsons said.   

The two companies plan to combine their tools in customer engagement to address each of these factors. “The first phase will bring together IBM’s AIOps software and professional services with ServiceNow’s intelligent workflow capabilities to help companies meet the digital demands of this moment,” Parsons stated. 

Arvind Krishna, Chief Executive Officer of IBM stated in a press release on the announcement, “AI is one of the biggest forces driving change in the IT industry to the extent that every company is swiftly becoming an AI company.” ServiceNow’s cloud computing platform helps companies manage digital workflows for enterprise IT operations.  

By partnering with ServiceNow and their market leading Now Platform, clients will be able to use AI to quickly mitigate unforeseen IT incident costs. “Watson AIOps with ServiceNow’s Now Platform is a powerful new way for clients to use automation to transform their IT operations and mitigate unforeseen IT incident costs,” Krishna stated. 

The IT service offering positions IBM squarely as aiming at AI for business. “When we talk about AI, we mean AI for business, which is much different than consumer AI,” stated Michael Gilfix of IBM in the Forbes account. He is the Vice President of Cloud Integration and Chief Product Officer of Cloud Paks at IBM. “AI for business is all about enabling organizations to predict outcomes, optimize resources, and automate processes so humans can focus their time on things that really matter,” he stated.

IBM Watson has handled more than 30,000 client engagements since inception in 2011, the company reports. Among the benefits of this experience is a vast natural language processing vocabulary, which can parse and understand huge amounts of unstructured data. 

Ericsson Scientists Develop AI System to Automatically Resolve Trouble Tickets 

Another experience involving AI in operations comes from two AI scientists with Ericsson, who have developed a machine learning algorithm to help application service providers manage and automatically resolve trouble tickets. 

Wenting Sun, senior data science manager at Ericsson in San Francisco, and Alka Isac, data scientist in Ericsson’s Global AI Accelerator outside Boston, devised the system to help quickly resolve issues with the complex infrastructure of an application service provider, according to an account on the Ericsson Blog. These could be network connection response problems, infrastructure resource limitations, or software malfunctioning issues.

The two sought to use advanced NLP algorithms to analyze text information, interpret human language and derive predictions. They also took advantage of features/weights discovered from a group of trained models. Their system uses a hybrid of an unsupervised clustering approach and supervised deep learning embedding. “Multiple optimized models are then ensembled to build the recommendation engine,” the authors state.  

The two describe current trouble ticket handling approaches as time-consuming, tedious, labor-intensive, repetitive, slow, and prone to error. Incorrect triaging often results, which can lead to a ticket being reopened and taking more time to resolve, making for unhappy customers. When personnel turn over, the human knowledge gained from years of experience can be lost.

“We can replace the tedious and time-consuming triaging process with intelligent recommendations and an AI-assisted approach,” the authors stated. They expect time to resolution to be reduced by up to 75% and multiple ticket reopenings to be avoided.

Sun leads a team of data scientists and data engineers developing AI/ML applications in the telecommunication domain. She holds a bachelor’s degree in electrical and electronics engineering and a PhD in intelligent control. She also drives Ericsson’s contributions to the AI open source platform Acumos (under the Linux Foundation’s Deep Learning Foundation).

As a Data Scientist in Ericsson’s Global AI Accelerator, Isac is part of a team of Data Scientists focusing on reducing the resolution time of tickets for Ericsson’s Customer Support Team. She holds a master’s degree in Information Systems Management majoring in Data Science. 

Survey Finds AI Is Helpful to IT 

In a survey of 154 IT and business professionals at companies with at least one AI-related project in general production, AI was found to deliver impressive results to IT departments, enhancing the performance of systems and making help desks more helpful, according to a recent account in ZDNet.  

The survey was conducted by ITPro Today working with InformationWeek and Interop. 

Beyond the benefits of AI for the overall business, many respondents foresee the greatest benefits going right to the IT organization itself: 63% responded that they hope to achieve greater efficiencies within IT operations. Another 45% aimed for improved product support and customer experience, and another 29% sought improved cybersecurity systems.

The top IT use case was security analytics and predictive intelligence, cited by 71% of AI leaders. Another 56% stated AI is helping with the help desk, while 54% have seen a positive impact on the productivity of their departments. “While critics say that the hype around AI-driven cybersecurity is overblown, clearly, IT departments are desperate to solve their cybersecurity problems, and, judging by this question in our survey, many of them are hoping AI will fill that need,” stated Sue Troy, author of the survey report.   

AI expertise is in short supply. More than two in three successful AI implementers, 67%, report shortages of candidates with needed machine learning and data modeling skills, while 51% seek greater data engineering expertise. Another 42% reported compute infrastructure skills to be in short supply.

Read the source articles and information in Forbes, the IBM press release on the alliance with ServiceNow, on the Ericsson Blog, in ZDNet and from ITPro Today.

Source: https://www.aitrends.com/aiops/application-of-ai-to-it-service-ops-by-ibm-and-servicenow-exemplifies-a-trend/
