
MantraTalks Podcast with Richard Roy Mendonce: Covid-19 & the Disruption in Healthcare


The post MantraTalks Podcast with Richard Roy Mendonce: Covid-19 & the Disruption in Healthcare appeared first on Mantra Labs.


The outbreak of COVID-19 has put immense pressure on the healthcare sector. The supply chain of medical supplies was hit. The sudden surge of patients made it difficult to manage the hospital operations. Since priority had to be given to COVID patients, regular consults and elective surgeries were delayed. 

To go one step further and understand the disruption in healthcare amidst these adverse conditions, we interviewed Mr. Richard Roy Mendonce, Head Digital Strategy at Yashoda Hospitals to shed light on the role of technology in combating the current challenges faced by healthcare and possible mitigation strategies.

Mr. Richard Roy Mendonce has strong domain expertise in the healthcare industry and has successfully led digital transformations at organizations such as Columbia Asia Group of Hospitals, Sakra World Hospital, and Manipal Hospitals Group, ensuring better customer experience and increased business.

A Digital Strategist, he currently leads the digital efforts at Yashoda Hospitals, which is among the oldest and biggest healthcare groups in the region. He has nearly a decade of experience in digital marketing, digital strategy and digital transformation, with a distinctive ability to develop highly effective and measurable strategies that drive revenue growth, new customers, brand awareness and reputation. 

Constantly inspired and fascinated by the dynamics of the digital landscape, he has developed a skill set built on leveraging digital technologies to deliver positive user experiences and achieve business objectives. In 2019, he was named one of the 50 Most Influential Strategy Leaders by COM Global at the World Marketing Congress.


Excerpts from the interview:

Covid-19 & the Disruption in Healthcare

Many hospitals are reassessing their digital marketing strategy and budgets in light of the uncertain economic situation. Most healthcare organizations can benefit from taking this time to strategize and plan for the future, rather than putting the brakes on. Could you share some key insights into changing patient behavior and the steps you are taking to reach patients? Also, how will the healthcare marketing landscape change beyond COVID-19?

Mr. Richard: In terms of healthcare, especially telemedicine, COVID-19 has completely cut off the channel of visiting doctors in person for a consult. The lack of options has increased the acceptance of telemedicine. A couple of months back, we would compare the benefits and comfort of a direct consultation with an online one and were reluctant to try the online experience, but now acceptance has increased.

Another thing I feel is —  we do not need high-end technology or equipment. When we hear of telemedicine, what comes first to our mind is jazzy computers, high-tech connections, software, etc.; but that is not the case. Even a simple SMS/call/WhatsApp call is enough to connect with a doctor. We don’t really need any high-end equipment to start a telemedicine service. 

Today, most of the spending is being diverted to digital channels rather than traditional offline ones and it will continue to happen. Digital channels are more trackable, more efficient, and more controllable. Even digital connect to engage with offline channels is gaining momentum. Healthcare set-ups will have offline referral networks, business partners. Traditionally, there would be a sales team who go meet and connect with them. Now with the social distancing and lockdowns, even that connection is replaced with digital connections such as webinars, video calls, etc. 

Communication in marketing has also changed. Before COVID-19, the communication was “Don’t ignore your health, come to us”. During the COVID-19 situation, the communication was “Come to us only if it is an emergency, it’s better to stay at home”. Post COVID-19, the communication might be- “Wherever you are, we are accessible, come to us or use our online services.” 

Telemedicine in a Post-Pandemic India

In the short time since the pandemic began, social distancing norms have changed our daily lives and routines. As a result, services like live remote consultations and telemedicine are getting more attention. Telemedicine is likely a permanent beneficiary of the pandemic. Do you think it will remain a key mode of healthcare delivery even after restrictions are lifted? Are there other digitally enabled services that can potentially find greater adoption in a post-pandemic India?

Mr. Richard: Telemedicine will continue to be one of the modes of care delivery, but it will not replace the existing care delivery system. Rather, it will be a mix of both. People will opt for telemedicine for the initial consultation (a non-serious one), for post-treatment follow-ups or review visits, or to share updates on reports. People might get accustomed to telemedicine services, but I think they will never replace in-person care for serious conditions or surgical specialties, where doctors need to examine the patient personally to deliver proper care.

In terms of acceptance level of technology, there has been wider acceptance for non-clinical support systems. For example, chatbots in place to address customer service and AI-driven platforms to check symptoms and guide the patient to respective specialists. This is not for prescriptions, but to enable patients to help themselves in availing services. 

Related: Healthcare Chatbots: Innovative, Efficient, and Low-cost Care

Medical supplies: Another area where digital platforms should have a wider scale of adoption is traveling for non-essential medical supplies. Pharma delivery is one sector that can go entirely digital. We can also have a format where physical stores are eliminated. Delivery can be from warehouse to customer. 

Diagnostics: Apart from radiology, diagnostics can go completely digital. Home care such as remote ICUs, remote monitoring could have potentially greater adoption in the current scenario. 

Disruption in healthcare will also include technologies to strengthen medical education and training.

Operational Challenges in Healthcare

From the operations point of view, digital transformation alone cannot help in preparing for an outbreak of this scale. The reality is we also have to be prepared for the possibility of a next Pandemic wave. The pandemic itself is testing the digital readiness and operational resilience of hospitals, in digitizing services and bringing innovation into healthcare. What are the operational challenges, as far as digital capabilities go, that hospitals are facing currently? And, what steps must they take to bridge these gaps?

Mr. Richard: We all know that the entire system was not geared up for a pandemic of this scale. Hospitals are facing both operational and clinical challenges. However, I’ll address this one particular issue from a digital angle. 

The biggest challenge for any hospital is the lack of a digital care platform; most are still heavily dependent on paper-based systems. We know that almost anything can be sanitized, but how do we sanitize paper documents? Patients have to carry these documents and touch them, and the documents pass through multiple hands, which makes them potential carriers of the virus. It is now more important than ever to keep all medical records digitized.

Another aspect is the nature of this virus, which is highly communicable and is leading to community spread of the disease. Therefore, hospitals have a responsibility to maintain data at the patient level so that contact tracing becomes much easier and automated. For example, a symptom could be added as a trigger in the system, which would then automatically run contact tracing and produce a list of people to reach out to.

Yet another aspect in healthcare which is prone to change is remote working. Most of the industries such as IT have already geared up for remote working but healthcare has not. Many of the processes still need people coming to the office and working on a computer that is in the network. So, the disruption in healthcare relies on digital platforms to ensure that staff is efficiently deployed.

Changes in the Patient Experience

Both outpatient and in-patient treatment for all major non-communicable diseases including emergencies have declined. Going forward, as the country tries to resume life in the New Normal, industries like retail are experimenting with touchless interfaces to boost the customer’s confidence in shopping in-store. What changes, if any, do you foresee to the physical patient experience?

Mr. Richard: Wherever possible, hospitals are currently trying to minimize contact. Just as airports let passengers print their own boarding passes, hospitals can ask patients not to wait in line at the reception but to book appointments and make payments online. Once the appointment is booked, patients can just come and wait for the doctor’s call. We have seen multiple robotic-assisted surgeries where contact with the patient is avoided. Similarly, technologies may emerge that take a patient’s vitals in a no-contact manner. There are hospitals in the country that have introduced innovative robots that screen patients coming into the hospital. There are lots of innovations possible in this area.

Role of AR, VR and AI in Digital Healthcare 

Huge volumes of data are flowing into the cloud, not just from doctors’ offices and imaging centers, but also from remote devices and sensors worn or operated by patients. By harnessing the vast amounts of data and putting it to work in applications, it helps care providers to improve effectiveness and efficiencies. Do you see technologies like AR/VR/AI playing a role in the future of digital healthcare in India? Can you share some examples of areas that Yashoda Hospitals has begun experimentation or implementation with these technologies?

Mr. Richard: Artificial intelligence, Machine Learning, Augmented Reality, Virtual Reality, Cloud systems, etc. are the buzzwords these days. I do believe that these technologies will pick pace in the healthcare industry as well. But I see a challenge there. Though all the data is on the cloud, the data is held by individual stakeholders and corporations. And standardization of data is the biggest challenge right now. 

So, any company which is working towards utilizing these technologies should first look at technologies that can bring data on one platform which is usable, accessible, and standardized without compromising confidential information of the patient. In terms of innovation at Yashoda hospitals, we are working on a couple of them such as AI-based radiology systems, optimizing customer journeys in hospitals, manpower planning, etc. 

Related: Medical Image Management: DICOM Images Sharing Process

Let’s take the patient discharge process for instance. Transitioning a customer from ‘in-patient’ to ‘out-patient’ is a significant challenge for any hospital, since it involves multiple departments. You’ve even stated before that it takes the integrated view of HIS (hospital information systems), EMR (electronic medical records), inventory, billing, and real-time updates of treatment progress to facilitate discharge at the click of a button. What is your experience in the transformation process and the ground realities of addressing this critical pain point? 

Mr. Richard: Theoretically speaking, the discharge process takes a lot of time but the reason it takes so much time is because it involves multiple stakeholders at a time- internal as well as external. It further gets complicated when the insurance is involved. I think all healthcare providers are looking to simplify the discharge process. The only way it is possible is having technology cut across stakeholders and in real-time. So wherever possible, we can avoid these internal communication delays. 

Return to Normal: The way forward

As hospitals plan for the complicated return stage (once restrictions are lifted), the volume of footfalls, testing, etc. will gradually increase. What advice can you share with other healthcare leaders to prepare their organization on the frontline to manage specific risks regarding employee safety, patient outcomes, etc? What investments (in remote patient monitoring, medical equipment, CRM systems, etc.) should healthcare organizations be making to respond to ‘the return to normal’?

Mr. Richard: I think that the precautionary steps taken by most of the healthcare providers are commendable. It is much better than in other countries across the world. We are in touch with a few of the major chains and the precautions that are being taken are phenomenal. Starting from thermal screens and fever clinics at the entrance, social distancing blogs; we have implemented Cluster Systems within our hospitals. It is a system where the employees are clustered in certain areas to minimize cross-contamination between employees. 

In terms of investment in technology, clinical data can be good to start working on. A good EMR system that seamlessly integrates and exchanges data between all relevant information systems is the need of the time. This investment will not just be in terms of technology but also behavioral change. 

So the system has to be friendly to seamlessly capture the data and make it available across systems. Using data efficiently is important to guide clinical decision support, developing user experience protocols and creating empowerment for the patient. 

Summing up

COVID-19 has changed a lot in us. The lockdown has unlocked a lot of things. It is a good time to innovate. Essential services would be a keyword used for a very long time now in every aspect. Be it shopping, be it food, be it health. And social distancing will be a new lifestyle. 

In this session, Mr. Richard shared insights on the disruption in healthcare and the importance of technological innovations in the new normal for hospitals.



Source: https://www.mantralabsglobal.com/blog/interview-richard-roy-mendonce-disruption-in-healthcare/


Optimizing costs for machine learning with Amazon SageMaker


Applications based on machine learning (ML) can provide tremendous business value. Using ML, we can solve some of the most complex engineering problems that previously were infeasible. One of the advantages of running ML on the AWS Cloud is that you can continually optimize your workloads and reduce your costs. In this post, we discuss how to apply such optimization to ML workloads. We consider available options such as elasticity, different pricing models in cloud, automation, advantage of scale, and more.

Developing, training, maintaining, and performance tuning ML models is an iterative process that requires continuous improvement. Determining the optimum state in the model while going through the permutations and combinations of model parameters and data dependencies to adjust is just one leg of the journey. There is more to optimizing the cost of ML than just algorithm performance and model tuning. There is also some effort required to integrate developed models into applications and realize their benefits. Throughout this process, you can keep the cost down in numerous ways. Amazon SageMaker has made most of this journey smooth so developers and data scientists can spend most of their time focusing on what matters the most—delivering business value.

Amazon SageMaker notebook instances

An Amazon SageMaker notebook instance is an ML compute instance running the Jupyter Notebook app. This notebook instance comes with sample notebooks, several optimized algorithms, and complete code walkthroughs. Amazon SageMaker manages the creation of this instance and related resources. Consider using Amazon SageMaker Studio notebooks for collaborative workloads and when you don’t need to set up compute instances and file storage beforehand.

You can follow these best practices to help reduce the cost of notebook instances.

GPU or CPU?

CPUs are best at handling single, more complex calculations sequentially, whereas GPUs are better at handling multiple but simple calculations in parallel. For many use cases, a standard current generation instance type from an instance family such as ml.m* provides enough computing power, memory, and network performance for many Jupyter notebooks to perform well. GPUs provide a great price/performance ratio if you take advantage of them effectively. However, GPUs also cost more, and you should choose GPU-based notebooks only when you really need them.

Ask yourself: Is my neural network relatively small scale? Is my network performing tons of calculations involving hundreds of thousands of parameters? Can my model take advantage of hardware parallelism such as P3 and P3dn instance families?

Depending on the model, the GPU communication overhead might even degrade performance. So, take a step back and start with what you think is the minimum requirement in terms of ml instance specification and work your way up to identifying the best instance type and family for your model.

If you’re using your notebook instance to train multiple jobs, decide when you need a GPU-enabled instance and when you don’t. If you need accelerated computing in your notebook environment, you can stop your m* family notebook instance, switch to a GPU-enabled P* family instance, and start it again. Don’t forget to switch it back when you no longer need that extra boost in your development environment.

If you’re using massive datasets for training and don’t want to wait for days or weeks to finish your training job, you can speed up the process by distributing training on multiple machines or processes in a cluster.

It’s recommended to use a small subset of your data for development in your notebook instance. You can use the full dataset for a training job that is distributed across optimized instances such as P2 or P3 GPU instances or an instance with powerful CPU, such as c5.

Maximize instance utilization

You can optimize your Amazon SageMaker notebook utilization in many different ways. One simple way is to stop your notebook instance when you’re not using it and start it again when you need it. Consider auto-detecting idle notebook instances and managing their lifecycle using a lifecycle configuration script. For detailed implementation, see Right-sizing resources and avoiding unnecessary costs in Amazon SageMaker. Remember that the instance is only useful when you’re using the Jupyter notebook. If you’re not working on a notebook overnight or over the weekend, it’s a good idea to schedule a stop and start. Another way to save instance cost is by scheduling an AWS Lambda function. For example, you can stop all instances at 7:00 PM and start them at 7:00 AM.
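The scheduled stop described above can be sketched as a small Lambda handler. This is a minimal sketch, not a reference implementation: the function names are hypothetical, the helper assumes it is handed an object that behaves like `boto3.client("sagemaker")`, and the 7:00 PM trigger is assumed to come from a separate schedule rule (for example `cron(0 19 * * ? *)`, which is evaluated in UTC).

```python
def stop_running_notebooks(sagemaker_client):
    """Request a stop for every notebook instance that is currently InService.

    `sagemaker_client` is assumed to behave like boto3.client("sagemaker").
    Returns the names of the instances for which a stop was requested.
    """
    stopped = []
    next_token = None
    while True:
        kwargs = {"StatusEquals": "InService"}
        if next_token:
            kwargs["NextToken"] = next_token
        page = sagemaker_client.list_notebook_instances(**kwargs)
        for instance in page.get("NotebookInstances", []):
            name = instance["NotebookInstanceName"]
            sagemaker_client.stop_notebook_instance(NotebookInstanceName=name)
            stopped.append(name)
        # Keep paging until the service stops returning a continuation token.
        next_token = page.get("NextToken")
        if not next_token:
            return stopped


def lambda_handler(event, context):
    # boto3 is imported lazily so the helper above can be tested without it.
    import boto3

    return {"stopped": stop_running_notebooks(boto3.client("sagemaker"))}
```

A matching morning function would call `start_notebook_instance` for the instances you want back. Keeping the client as a parameter makes the logic easy to exercise with a stub before it touches real resources.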

You can also use Amazon CloudWatch Events to start and stop the instance based on an event. If you’re feeling geeky, connect it to your Amazon Rekognition based system to start a data scientist’s notebook instance when they step into the office or have Amazon Alexa do it as you grab a coffee.

Training jobs

The following are some best practices for saving costs on training jobs.

Use pre-trained models or even APIs

Pre-trained models eliminate the time spent gathering data and training models with that data. Consider using higher-level APIs such as provided by Amazon Rekognition or Amazon Comprehend to help you avoid spending on tasks that are already done for you. As an example, Amazon Comprehend simplifies topic modeling on a large corpus of documents. You can also use the Neural topic modeling (NTM) algorithm in Amazon SageMaker to get similar results with more effort. Although you have more control over hyperparameters when training your own model, your use case may not need it. A lot of engineering work and experience goes into creating ready-to-consume and highly optimized models, therefore an upfront ROI analysis is highly recommended if you’re embarking on a journey to develop similar models.

Use Pipe mode (where applicable) to reduce training time

Certain algorithms in Amazon SageMaker, such as BlazingText, work on a large corpus of data. When these jobs are launched, significant time goes into downloading the data from Amazon Simple Storage Service (Amazon S3) into the local Amazon Elastic Block Store (Amazon EBS) volume. Your training jobs don’t start until this download finishes. These algorithms can take advantage of Pipe mode, in which training data is streamed from Amazon S3 directly to the training algorithm, so your training jobs start immediately. For example, training BlazingText on Common Crawl (3 TB) can take a few days, of which a significant number of hours are lost to the download alone. Pipe mode can eliminate most of that waiting time.

Managed spot training in Amazon SageMaker

Managed spot training can reduce the cost of training models by up to 90% compared with On-Demand Instances. Amazon SageMaker manages the Spot interruptions on your behalf. If your training job can be interrupted, use managed spot training. You can specify which training jobs use Spot Instances and a stopping condition that specifies how long Amazon SageMaker waits for a job to run using EC2 Spot Instances.

You may also consider using EC2 Spot Instances if you’re willing to do some extra work and if your algorithm is resilient enough to interruptions. For more information, see Managed Spot Training: Save Up to 90% On Your Amazon SageMaker Training Jobs.
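As a rough illustration of the stopping condition mentioned above, the sketch below builds the spot-related fragment of a `CreateTrainingJob` request and computes the savings figure from training time versus billable time. The helper names are hypothetical; the field names (`EnableManagedSpotTraining`, `MaxRuntimeInSeconds`, `MaxWaitTimeInSeconds`) are the ones used by the low-level API, and the rest of the request (algorithm, role, data channels) is deliberately left out.

```python
def spot_training_settings(max_runtime_seconds, max_wait_seconds):
    """Return the spot-related part of a CreateTrainingJob request.

    The wait time covers both waiting for Spot capacity and the run itself,
    so it must be at least as long as the maximum runtime.
    """
    if max_wait_seconds < max_runtime_seconds:
        raise ValueError("max_wait_seconds must be >= max_runtime_seconds")
    return {
        "EnableManagedSpotTraining": True,
        "StoppingCondition": {
            "MaxRuntimeInSeconds": max_runtime_seconds,
            "MaxWaitTimeInSeconds": max_wait_seconds,
        },
    }


def spot_savings_percent(training_seconds, billable_seconds):
    """Savings relative to paying for the full training time."""
    if training_seconds <= 0:
        raise ValueError("training_seconds must be positive")
    return (1 - billable_seconds / training_seconds) * 100


settings = spot_training_settings(max_runtime_seconds=3600, max_wait_seconds=7200)
print(settings)
print(f"savings: {spot_savings_percent(3600, 900):.0f}%")
```

The returned dictionary would be merged into the full request with `**settings`. Jobs that checkpoint regularly benefit most, because an interrupted run can resume instead of starting over.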

Test your code locally

Resolve issues with code and data so you don’t need to pay to run training clusters for failed training jobs. This also saves you time spent initializing the training cluster. Before you submit a training job, try to run the fit function in local mode to fetch some early feedback:

from sagemaker.mxnet import MXNet  # SageMaker Python SDK; v1 parameter names shown
mxnet_estimator = MXNet('train.py', train_instance_type='local', train_instance_count=1)

Monitor the performance of your training jobs to identify waste

Amazon SageMaker is integrated with CloudWatch out of the box and publishes instance metrics of the training cluster in CloudWatch. You can use these metrics to see if you should make adjustments to your cluster, such as CPUs, memory, number of instances, and more. To view the CloudWatch metric for your training jobs, navigate to the Jobs page on the Amazon SageMaker console and choose View Instance metrics in the Monitor section.

Also, use Amazon SageMaker Debugger, which provides full visibility into model training by monitoring, recording, analyzing, and visualizing training process tensors. Debugger can dramatically reduce the time, resources, and cost needed to train models.

Find the right balance: Performance vs. accuracy

Compare the throughput of 16-bit floating point and 32-bit floating point calculations and determine what is right for your model. 32-bit (single precision or FP32) and even 64-bit (double precision or FP64) floating point variables are popular for many applications that require high precision. These are workloads like engineering simulations that simulate real-world behavior and need the mathematical model to be as exact as possible. In many cases, however, the reduced memory usage and increased speed gained by moving to half or mixed precision (16-bit or FP16) are worth the minor tradeoffs in accuracy. For more information, see Accelerating GPU computation through mixed-precision methods.
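To make the precision tradeoff concrete, here is a small standard-library sketch that stores the same value in half, single, and double precision and reports the storage size and rounding error. It only illustrates the number formats; it says nothing about the throughput of any particular GPU, and the helper name is hypothetical.

```python
import struct


def round_trip(value, fmt):
    """Store `value` in the given struct float format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


# "<e" is IEEE 754 half precision, "<f" single precision, "<d" double precision.
for name, fmt in (("FP16", "<e"), ("FP32", "<f"), ("FP64", "<d")):
    stored = round_trip(0.1, fmt)
    size = struct.calcsize(fmt)
    print(f"{name}: {size} bytes, stored={stored!r}, abs error={abs(stored - 0.1):.3e}")
```

Halving the bytes per value halves the memory needed for weights and activations, at the price of a visibly larger rounding error. Whether that error matters depends on the model.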

A similar trade-off also applies when deciding on the number of layers in your neural network for your classification algorithms, such as image classification.

Tuning (hyperparameter optimization) jobs

Use hyperparameter optimization (HPO) when needed, and choose wisely which hyperparameters to tune and over what ranges.

Some API calls can result in a bill of hundreds or even thousands of dollars, and tuning jobs are one of those. A good tuning job can save you many working days of expensive data scientists’ time and provide a significant lift in model performance, which is highly beneficial. HPO in Amazon SageMaker finds good hyperparameters quicker if the search space is narrow (for example, a learning rate of 0.01–0.05 rather than 0.001–0.9). If you have some relevant prior knowledge about the hyperparameter range, start with that. For wide hyperparameter ranges, you may want to consider logarithmic transformations.
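The case for logarithmic transformations on wide ranges can be seen with a quick simulation. The sketch below is plain random sampling and not the search strategy Amazon SageMaker uses; it only shows how the two scales spread samples over the wide 0.001–0.9 learning-rate range from the example above. All function names are hypothetical.

```python
import math
import random


def sample_linear(low, high, rng):
    """Draw uniformly on the raw scale."""
    return rng.uniform(low, high)


def sample_log(low, high, rng):
    """Draw uniformly on the log scale, then map back."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def share_below(samples, threshold):
    """Fraction of samples that fall below `threshold`."""
    return sum(s < threshold for s in samples) / len(samples)


rng = random.Random(0)
low, high, n = 0.001, 0.9, 10_000
linear = [sample_linear(low, high, rng) for _ in range(n)]
logarithmic = [sample_log(low, high, rng) for _ in range(n)]

print(f"linear scale: {share_below(linear, 0.01):.1%} of samples below 0.01")
print(f"log scale:    {share_below(logarithmic, 0.01):.1%} of samples below 0.01")
```

On the linear scale almost no trials explore the small learning rates, while the log scale gives each order of magnitude a comparable share of the budget.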

Amazon SageMaker also reduces the amount of time spent tuning models using built-in HPO. This technology automatically adjusts hundreds of different combinations of parameters to quickly arrive at the best solution for your ML problem. With high-performance algorithms, distributed computing, managed infrastructure, and HPO, Amazon SageMaker drastically decreases the training time and overall cost of building production grade systems. You can see examples of HPO in some of the Amazon SageMaker built-in algorithms.

As the training time for each training job gets longer, you may also want to consider early stopping of training jobs.

Hosting endpoints

The following section discusses how to save cost when hosting endpoints using Amazon SageMaker hosting services.

Delete endpoints that aren’t in use

Amazon SageMaker is great for testing new models because you can easily deploy them into an A/B testing environment. When you’re done with your tests and not using the endpoint extensively anymore, you should delete it. You can always recreate it when you need it again because the model is stored in Amazon S3.

Use Automatic Scaling

Automatic scaling for your Amazon SageMaker endpoint doesn’t just provide high availability, better throughput, and better performance; it also optimizes the cost of your endpoint. Make sure that you configure automatic scaling for your endpoint, monitor your model endpoint, and adjust the scaling policy based on the CloudWatch metrics. For more information, see Load test and optimize an Amazon SageMaker endpoint using automatic scaling.
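One possible way to configure this is a target-tracking policy on invocations per instance, registered through Application Auto Scaling. The sketch below assumes it is given an object that behaves like `boto3.client("application-autoscaling")`. The function name, the policy name, the capacity limits, the target value, and the cooldowns are all illustrative assumptions that should come from your own load tests.

```python
def configure_endpoint_autoscaling(
    autoscaling_client,
    endpoint_name,
    variant_name="AllTraffic",
    min_instances=1,
    max_instances=4,
    invocations_per_instance=100.0,
):
    """Register an endpoint variant for scaling and attach a target-tracking policy."""
    resource_id = f"endpoint/{endpoint_name}/variant/{variant_name}"
    dimension = "sagemaker:variant:DesiredInstanceCount"

    # Tell Application Auto Scaling which resource to manage and within what bounds.
    autoscaling_client.register_scalable_target(
        ServiceNamespace="sagemaker",
        ResourceId=resource_id,
        ScalableDimension=dimension,
        MinCapacity=min_instances,
        MaxCapacity=max_instances,
    )
    # Scale so that each instance handles roughly the target number of invocations.
    autoscaling_client.put_scaling_policy(
        PolicyName=f"{endpoint_name}-invocations-target",
        ServiceNamespace="sagemaker",
        ResourceId=resource_id,
        ScalableDimension=dimension,
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration={
            "TargetValue": invocations_per_instance,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
            "ScaleInCooldown": 300,
            "ScaleOutCooldown": 60,
        },
    )
    return resource_id
```

A shorter scale-out cooldown than scale-in cooldown is a common starting point: it reacts quickly to load while avoiding rapid removal of capacity.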

Amazon Elastic Inference for deep learning

Selecting a GPU instance type that is big enough to satisfy the requirements of the most demanding resource for inference may not be a smart move. Even at peak load, a deep learning application may not fully utilize the capacity offered by a GPU. Consider using Amazon Elastic Inference, which allows you to attach low-cost GPU-powered acceleration to Amazon EC2 and Amazon SageMaker instances to reduce the cost of running deep learning inference by up to 75%.

Host multiple models with multi-model endpoints

You can create an endpoint that can host multiple models. Multi-model endpoints reduce hosting costs by improving endpoint utilization and provide a scalable and cost-effective solution to deploying a large number of models. Multi-model endpoints enable time-sharing of memory resources across models. It also reduces deployment overhead because Amazon SageMaker manages loading models in memory and scaling them based on traffic patterns to models.

Reducing labeling time with Amazon SageMaker Ground Truth

Data labeling is the key process of identifying raw data (such as images, text files, and videos) and adding one or more meaningful and informative labels to provide context so that an ML model can learn from it. This process is essential because the accuracy of a trained model depends on the accuracy of the labeled dataset, or ground truth.

Amazon SageMaker Ground Truth uses a combination of ML and a human workforce (vetted by AWS) to label images and text. Many ML projects are delayed because of insufficient labeled data. You can use Ground Truth to accelerate the ML cycle and reduce overall costs.

Tagging your resources

Consider tagging your Amazon SageMaker notebook instances and hosting endpoints. Tags such as the name of the project, the business unit, and the environment (such as development, testing, or production) are useful for cost optimization and can provide clear visibility into where the money is spent. Cost allocation tags can help you track and categorize your ML costs. They can answer questions such as “Can I delete this resource to save cost?”
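A tagging convention is easier to keep when it is checked in code. The following sketch builds tags in the `Key`/`Value` list shape that AWS tagging APIs accept and flags resources that are missing a required key. The tag keys and helper names are hypothetical and would need to match your own convention.

```python
# Hypothetical tag keys; adapt them to your organization's convention.
REQUIRED_TAG_KEYS = ("project", "business-unit", "environment")


def build_tags(project, business_unit, environment):
    """Return tags as a list of Key/Value dictionaries."""
    values = (project, business_unit, environment)
    return [{"Key": key, "Value": value} for key, value in zip(REQUIRED_TAG_KEYS, values)]


def missing_tag_keys(tags, required=REQUIRED_TAG_KEYS):
    """List the required keys that are absent from a resource's tags."""
    present = {tag["Key"] for tag in tags}
    return [key for key in required if key not in present]


tags = build_tags("churn-model", "marketing", "development")
print(tags)
print(missing_tag_keys([{"Key": "project", "Value": "churn-model"}]))
```

The list returned by `build_tags` can be passed as the `Tags` argument when creating a notebook instance or endpoint. Tags must also be activated as cost allocation tags in the billing console before they show up in cost reports.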

Keeping track of cost

If you need visibility of your ML cost on AWS, use AWS Budgets. This helps you track your Amazon SageMaker cost, including development, training, and hosting. You can also set alerts and get a notification when your cost or usage exceeds (or is forecasted to exceed) your budgeted amount. After you create your budget, you can track the progress on the AWS Budgets console.
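The budget and alerts described above can also be created programmatically. The sketch below only builds the request for the AWS Budgets `CreateBudget` operation; the function name, the budget name, the 80% threshold, and the `"Amazon SageMaker"` service filter value are assumptions to verify against your own billing data before use.

```python
def sagemaker_budget_request(account_id, monthly_limit_usd, alert_email, threshold_percent=80.0):
    """Build a CreateBudget request with actual and forecasted cost alerts."""
    return {
        "AccountId": account_id,
        "Budget": {
            "BudgetName": "sagemaker-monthly",  # hypothetical name
            "BudgetLimit": {"Amount": f"{monthly_limit_usd:.2f}", "Unit": "USD"},
            # Assumed service name; check how SageMaker appears in your cost reports.
            "CostFilters": {"Service": ["Amazon SageMaker"]},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
        },
        "NotificationsWithSubscribers": [
            {
                "Notification": {
                    "NotificationType": kind,
                    "ComparisonOperator": "GREATER_THAN",
                    "Threshold": threshold_percent,
                    "ThresholdType": "PERCENTAGE",
                },
                "Subscribers": [{"SubscriptionType": "EMAIL", "Address": alert_email}],
            }
            # One alert on actual spend and one on forecasted spend.
            for kind in ("ACTUAL", "FORECASTED")
        ],
    }


request = sagemaker_budget_request("123456789012", 500, "ml-team@example.com")
print(request["Budget"]["BudgetLimit"])
```

With credentials in place, the request could be submitted with `boto3.client("budgets").create_budget(**request)`. The account ID and email address shown are placeholders.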

Conclusion

In this post, I highlighted a few approaches and techniques to optimize cost without compromising on the implementation flexibility so you can deliver best-in-class ML-based business applications.



About the Author

BK Chaurasiya is a Principal Product Manager on the Amazon Web Services R&D and Innovation team. He provides technical guidance, design advice, and thought leadership to some of the largest and most successful AWS customers and partners. A technologist at heart, BK specializes in driving DevOps, continuous delivery, and large-scale cloud transformation initiatives to success.

Source: https://aws.amazon.com/blogs/machine-learning/optimizing-costs-for-machine-learning-with-amazon-sagemaker/

Continue Reading

AI

Optimizing costs for machine learning with Amazon SageMaker

Applications based on machine learning (ML) can provide tremendous business value. Using ML, we can solve some of the most complex engineering problems that previously were infeasible. One of the advantages of running ML on the AWS Cloud is that you can continually optimize your workloads and reduce your costs. In this post, we discuss […]

Published

on

Applications based on machine learning (ML) can provide tremendous business value. Using ML, we can solve some of the most complex engineering problems that previously were infeasible. One of the advantages of running ML on the AWS Cloud is that you can continually optimize your workloads and reduce your costs. In this post, we discuss how to apply such optimization to ML workloads. We consider available options such as elasticity, different pricing models in cloud, automation, advantage of scale, and more.

Developing, training, maintaining, and performance tuning ML models is an iterative process that requires continuous improvement. Determining the optimum state in the model while going through the permutations and combinations of model parameters and data dependencies to adjust is just one leg of the journey. There is more to optimizing the cost of ML than just algorithm performance and model tuning. There is also some effort required to integrate developed models into applications and realize their benefits. Throughout this process, you can keep the cost down in numerous ways. Amazon SageMaker has made most of this journey smooth so developers and data scientists can spend most of their time focusing on what matters the most—delivering business value.

Amazon SageMaker notebook instances

An Amazon SageMaker notebook instance is an ML compute instance running the Jupyter Notebook app. This notebook instance comes with sample notebooks, several optimized algorithms, and complete code walkthroughs. Amazon SageMaker manages the creation of this instance and related resources. Consider using Amazon SageMaker Studio notebooks for collaborative workloads and when you don’t need to set up compute instances and file storage beforehand.

You can follow these best practices to help reduce the cost of notebook instances.

GPU or CPU?

CPUs are best at handling single, more complex calculations sequentially, whereas GPUs are better at handling multiple but simple calculations in parallel. For many use cases, a standard current-generation instance type from a family such as ml.m* provides enough computing power, memory, and network performance for Jupyter notebooks to perform well. GPUs provide a great price/performance ratio if you take advantage of them effectively. However, GPUs also cost more, so choose GPU-based notebooks only when you really need them.

Ask yourself: Is my neural network relatively small scale? Is my network performing tons of calculations involving hundreds of thousands of parameters? Can my model take advantage of the hardware parallelism offered by instance families such as P3 and P3dn?

Depending on the model, the GPU communication overhead might even degrade performance. So, take a step back and start with what you think is the minimum requirement in terms of ml instance specification and work your way up to identifying the best instance type and family for your model.

If you’re using your notebook instance to train multiple jobs, decide when you need a GPU-enabled instance and when you don’t. If you need accelerated computing in your notebook environment, you can stop your m* family notebook instance, switch to a GPU-enabled P* family instance, and start it again. Don’t forget to switch it back when you no longer need that extra boost in your development environment.
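The stop, switch, and restart sequence can be scripted. The following is a minimal sketch, assuming `client` is a `boto3.client("sagemaker")` object; the function names, the instance name, and the polling interval are illustrative and not part of the original post. The instance type can only be changed while the notebook instance is stopped, so the sketch waits for the `Stopped` status before and after the update.

```python
import time


def wait_until_stopped(client, name, poll_seconds=15):
    """Poll the notebook instance until it reports the Stopped status."""
    while True:
        info = client.describe_notebook_instance(NotebookInstanceName=name)
        if info["NotebookInstanceStatus"] == "Stopped":
            return
        time.sleep(poll_seconds)


def switch_instance_type(client, name, new_type, poll_seconds=15):
    """Stop a notebook instance, change its type, and start it again.

    `client` is assumed to be boto3.client("sagemaker"); `new_type` is an
    instance type string such as "ml.p3.2xlarge" chosen by the caller.
    """
    client.stop_notebook_instance(NotebookInstanceName=name)
    wait_until_stopped(client, name, poll_seconds)
    client.update_notebook_instance(NotebookInstanceName=name, InstanceType=new_type)
    # The update is applied asynchronously, so wait again before starting.
    wait_until_stopped(client, name, poll_seconds)
    client.start_notebook_instance(NotebookInstanceName=name)
```

Calling the same function with the original ml.m* type switches the instance back when the extra capacity is no longer needed.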

If you’re using massive datasets for training and don’t want to wait for days or weeks to finish your training job, you can speed up the process by distributing training on multiple machines or processes in a cluster.

Use a small subset of your data for development in your notebook instance. You can use the full dataset for a training job that is distributed across optimized instances such as P2 or P3 GPU instances, or instances with a powerful CPU, such as c5.

Maximize instance utilization

You can optimize your Amazon SageMaker notebook utilization in many different ways. One simple way is to stop your notebook instance when you’re not using it and start it when you need it. Consider auto-detecting idle notebook instances and managing their lifecycle using a lifecycle configuration script. For detailed implementation, see Right-sizing resources and avoiding unnecessary costs in Amazon SageMaker. Remember that the instance is only useful when you’re using the Jupyter notebook. If you’re not working on a notebook overnight or over the weekend, it’s a good idea to schedule a stop and start. One way to do this is with a scheduled AWS Lambda function. For example, you can stop all instances at 7:00 PM and start them at 7:00 AM.

You can also use Amazon CloudWatch Events to start and stop the instance based on an event. If you’re feeling geeky, connect it to your Amazon Rekognition based system to start a data scientist’s notebook instance when they step into the office or have Amazon Alexa do it as you grab a coffee.
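The scheduled stop and start described above might look like the following sketch. It assumes `client` is a `boto3.client("sagemaker")` object and that two scheduled rules invoke the function, one passing `"stop"` in the evening and one passing `"start"` in the morning; the function name and that convention are illustrative assumptions.

```python
def apply_schedule(client, action):
    """Stop every running notebook instance, or start every stopped one.

    `client` is assumed to be boto3.client("sagemaker"). Returns the names
    of the instances that were acted on.
    """
    if action not in ("stop", "start"):
        raise ValueError("action must be 'stop' or 'start'")
    # Only instances in the matching state can be stopped or started.
    wanted_status = "InService" if action == "stop" else "Stopped"
    kwargs = {"StatusEquals": wanted_status}
    changed = []
    while True:
        page = client.list_notebook_instances(**kwargs)
        for item in page["NotebookInstances"]:
            name = item["NotebookInstanceName"]
            if action == "stop":
                client.stop_notebook_instance(NotebookInstanceName=name)
            else:
                client.start_notebook_instance(NotebookInstanceName=name)
            changed.append(name)
        if "NextToken" not in page:
            return changed
        kwargs["NextToken"] = page["NextToken"]  # fetch the next page
```

Note that the `"start"` branch starts every stopped instance in the account and Region, including ones that were stopped on purpose. A real deployment would probably filter by tag or name first.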

Training jobs

The following are some best practices for saving costs on training jobs.

Use pre-trained models or even APIs

Pre-trained models eliminate the time spent gathering data and training models with that data. Consider using higher-level APIs such as those provided by Amazon Rekognition or Amazon Comprehend to avoid spending on tasks that are already done for you. As an example, Amazon Comprehend simplifies topic modeling on a large corpus of documents. You can also use the Neural Topic Model (NTM) algorithm in Amazon SageMaker to get similar results with more effort. Although you have more control over hyperparameters when training your own model, your use case may not need it. A lot of engineering work and experience goes into creating ready-to-consume and highly optimized models, so an upfront ROI analysis is highly recommended if you’re embarking on a journey to develop similar models.

Use Pipe mode (where applicable) to reduce training time

Certain algorithms in Amazon SageMaker, such as BlazingText, work on a large corpus of data. When these jobs are launched, significant time goes into downloading the data from Amazon Simple Storage Service (Amazon S3) into the local Amazon Elastic Block Store (Amazon EBS) volume. Your training jobs don’t start until this download finishes. These algorithms can take advantage of Pipe mode, in which training data is streamed from Amazon S3 directly to the training algorithm, so your training jobs can start without waiting for the full download. For example, training BlazingText on Common Crawl (3 TB) can take a few days, of which a significant number of hours are lost to the download alone. Using Pipe mode for this job can significantly reduce the overall training time.
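To get a feel for how much time the up-front download can cost, here is a back-of-the-envelope estimate. The 200 MB/s sustained throughput is a hypothetical figure chosen for illustration, not a measured or documented number; substitute the rate you observe in your own jobs.

```python
def download_hours(dataset_bytes, throughput_bytes_per_s):
    """Hours spent copying a dataset before training can begin in File mode."""
    return dataset_bytes / throughput_bytes_per_s / 3600


TB = 10**12
MB = 10**6

# Assumption: 3 TB corpus, hypothetical 200 MB/s sustained download rate.
hours = download_hours(3 * TB, 200 * MB)
print(f"Estimated download time: {hours:.1f} hours")
```

In the `CreateTrainingJob` API, the input mode is selected with the `TrainingInputMode` field of `AlgorithmSpecification`. Whether a given algorithm and data format supports `Pipe` is something to check in that algorithm's documentation.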

Managed spot training in Amazon SageMaker

Managed spot training can reduce the cost of training models by up to 90% compared with On-Demand Instances. Amazon SageMaker manages the Spot interruptions on your behalf. If your training job can be interrupted, use managed spot training. You can specify which training jobs use Spot Instances and a stopping condition that specifies how long Amazon SageMaker waits for a job to run using EC2 Spot Instances.

You may also consider using EC2 Spot Instances directly if you’re willing to do some extra work and your algorithm is resilient to interruptions. For more information, see Managed Spot Training: Save Up to 90% On Your Amazon SageMaker Training Jobs.
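As a sketch of what the spot settings and stopping condition look like in practice, the helper below builds the relevant fragment of a `CreateTrainingJob` request and computes the savings from the training and billable seconds that `DescribeTrainingJob` reports. The helper names and the example values are illustrative assumptions; the rest of the request (image, role, data channels, resources) is omitted.

```python
def spot_training_settings(max_runtime_s, max_wait_s, checkpoint_s3_uri):
    """Build the spot-related fields of a CreateTrainingJob request.

    MaxWaitTimeInSeconds covers waiting for Spot capacity plus running, so it
    must be at least as large as MaxRuntimeInSeconds.
    """
    if max_wait_s < max_runtime_s:
        raise ValueError("max_wait_s must be >= max_runtime_s")
    return {
        "EnableManagedSpotTraining": True,
        "StoppingCondition": {
            "MaxRuntimeInSeconds": max_runtime_s,
            "MaxWaitTimeInSeconds": max_wait_s,
        },
        # Checkpoints let an interrupted job resume instead of starting over.
        "CheckpointConfig": {"S3Uri": checkpoint_s3_uri},
    }


def spot_savings_percent(training_seconds, billable_seconds):
    """Savings relative to being billed for every second of training."""
    return (1 - billable_seconds / training_seconds) * 100


settings = spot_training_settings(3600, 7200, "s3://example-bucket/checkpoints/")
print(settings["StoppingCondition"])
print(spot_savings_percent(training_seconds=3600, billable_seconds=900))
```

The bucket name is a placeholder. The returned dictionary is meant to be merged into the full request, for example `client.create_training_job(**base_request, **settings)`.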

Test your code locally

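Resolve issues with code and data so you don’t need to pay to run training clusters for failed training jobs. This also saves you the time spent initializing the training cluster. Before you submit a training job, try to run the fit function in local mode to get some early feedback. The argument names in the following line are those of version 1 of the SageMaker Python SDK; version 2 renames them to instance_type and instance_count: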

mxnet_estimator = MXNet('train.py', train_instance_type='local', train_instance_count=1)

Monitor the performance of your training jobs to identify waste

Amazon SageMaker is integrated with CloudWatch out of the box and publishes instance metrics of the training cluster in CloudWatch. You can use these metrics to see if you should make adjustments to your cluster, such as CPUs, memory, number of instances, and more. To view the CloudWatch metrics for your training jobs, navigate to the Jobs page on the Amazon SageMaker console and choose View Instance metrics in the Monitor section.

Also, use Amazon SageMaker Debugger, which provides full visibility into model training by monitoring, recording, analyzing, and visualizing training process tensors. Debugger can dramatically reduce the time, resources, and cost needed to train models.

Find the right balance: Performance vs. accuracy

Compare the throughput of 16-bit floating point and 32-bit floating point calculations and determine what is right for your model. 32-bit (single precision or FP32) and even 64-bit (double precision or FP64) floating point variables are popular for many applications that require high precision. These are workloads like engineering simulations that simulate real-world behavior and need the mathematical model to be as exact as possible. In many cases, however, the lower memory usage and higher speed gained by moving to half or mixed precision (16-bit or FP16) are worth the minor tradeoffs in accuracy. For more information, see Accelerating GPU computation through mixed-precision methods.

A similar trade-off also applies when deciding on the number of layers in your neural network for your classification algorithms, such as image classification.
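The storage and precision side of the floating point trade-off can be seen with nothing more than the Python standard library. This small illustration uses the `struct` module, whose `e`, `f`, and `d` format codes correspond to IEEE 754 half, single, and double precision. It shows bytes per value and rounding error only, and says nothing about GPU throughput.

```python
import struct


def round_trip(value, fmt):
    """Store `value` in the given struct format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


for label, fmt in (("FP16", "<e"), ("FP32", "<f"), ("FP64", "<d")):
    stored = round_trip(0.1, fmt)
    size = struct.calcsize(fmt)
    print(f"{label}: {size} bytes per value, error storing 0.1 = {abs(stored - 0.1):.3e}")

# FP16 has an 11-bit significand, so not every integer above 2048 is exact.
print(round_trip(2049.0, "<e"), round_trip(2049.0, "<f"))
```

Each step down in precision halves the memory per value and increases the rounding error. Whether that error matters depends on the model, which is why the comparison should be made per workload.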

Tuning (hyperparameter optimization) jobs

Use hyperparameter optimization (HPO) when needed, and choose wisely which hyperparameters to tune and over what ranges.

Some API calls can result in a bill of hundreds or even thousands of dollars, and tuning jobs are one of those. A good tuning job can save you many working days of expensive data scientists’ time and provide a significant lift in model performance, which is highly beneficial. HPO in Amazon SageMaker finds good hyperparameters quicker if the search space is narrow (for example, a learning rate of 0.01–0.05 rather than 0.001–0.9). If you have some relevant prior knowledge about the hyperparameter range, start with that. For wide hyperparameter ranges, you may want to consider logarithmic transformations.
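To see why a logarithmic transformation helps on a wide range, the following sketch draws learning rates from the wide 0.001 to 0.9 range in two ways and measures how many fall in the lowest decade (below 0.01). It uses plain random sampling for illustration only; it is not how the Amazon SageMaker tuner searches, and the sample size and seed are arbitrary.

```python
import math
import random


def fraction_below(samples, threshold):
    """Share of samples that are smaller than `threshold`."""
    return sum(s < threshold for s in samples) / len(samples)


rng = random.Random(0)  # fixed seed so the run is repeatable
low, high, n = 0.001, 0.9, 100_000

# Uniform on the raw scale: almost every draw lands in the top decade.
linear = [rng.uniform(low, high) for _ in range(n)]
# Uniform on the log scale: each decade gets a similar share of draws.
log_scaled = [math.exp(rng.uniform(math.log(low), math.log(high))) for _ in range(n)]

print(f"linear scale, share below 0.01: {fraction_below(linear, 0.01):.3f}")
print(f"log scale,    share below 0.01: {fraction_below(log_scaled, 0.01):.3f}")
```

On the raw scale, the expected share is (0.01 - 0.001) / (0.9 - 0.001), about 1%. On the log scale it is ln(10) / ln(900), about 34%. In the Amazon SageMaker tuning API, the equivalent choice is the `ScalingType` field of a continuous parameter range, which accepts `Logarithmic`.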

Amazon SageMaker also reduces the amount of time spent tuning models using built-in HPO. This technology automatically adjusts hundreds of different combinations of parameters to quickly arrive at the best solution for your ML problem. With high-performance algorithms, distributed computing, managed infrastructure, and HPO, Amazon SageMaker drastically decreases the training time and overall cost of building production grade systems. You can see examples of HPO in some of the Amazon SageMaker built-in algorithms.

As individual training jobs get longer, you may also want to consider early stopping of training jobs.
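The idea behind early stopping can be illustrated with a simplified median rule: stop a job whose latest objective value is worse than the median of what earlier jobs had reached at the same step. This is a sketch of the general technique under that assumption, not the exact rule Amazon SageMaker applies; the function name and the example curves are made up.

```python
from statistics import median


def should_stop_early(current_curve, finished_curves, minimize=True):
    """Return True if the current job looks worse than the median earlier job.

    Each curve is a list of objective values, one per evaluation step.
    """
    if not current_curve:
        return False
    step = len(current_curve) - 1
    # Compare only against jobs that reported a value at this step.
    peers = [curve[step] for curve in finished_curves if len(curve) > step]
    if not peers:
        return False
    latest = current_curve[-1]
    return latest > median(peers) if minimize else latest < median(peers)


history = [[0.9, 0.5], [0.8, 0.4], [0.7, 0.6]]  # validation loss of earlier jobs
print(should_stop_early([0.9, 0.8], history))   # lagging behind at step 1
print(should_stop_early([0.9, 0.3], history))   # ahead of the median
```

In Amazon SageMaker, early stopping for a tuning job is enabled by setting `TrainingJobEarlyStoppingType` to `Auto` in the tuning job configuration.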

Hosting endpoints

The following section discusses how to save cost when hosting endpoints using Amazon SageMaker hosting services.

Delete endpoints that aren’t in use

Amazon SageMaker is great for testing new models because you can easily deploy them into an A/B testing environment. When you’re done with your tests and not using the endpoint extensively anymore, you should delete it. You can always recreate it when you need it again because the model is stored in Amazon S3.
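A minimal sketch of deleting an idle endpoint and recreating it later, assuming `client` is a `boto3.client("sagemaker")` object. The function names are illustrative. The sketch deliberately keeps the endpoint configuration and model definition, so recreating the endpoint is a single call.

```python
def delete_idle_endpoint(client, endpoint_name):
    """Delete an endpoint and return the name of its configuration.

    Only the endpoint (the running instances) is removed; the endpoint
    configuration and the model artifacts in Amazon S3 are left in place.
    """
    description = client.describe_endpoint(EndpointName=endpoint_name)
    config_name = description["EndpointConfigName"]
    client.delete_endpoint(EndpointName=endpoint_name)
    return config_name


def recreate_endpoint(client, endpoint_name, config_name):
    """Bring the endpoint back from the configuration that was kept."""
    client.create_endpoint(EndpointName=endpoint_name, EndpointConfigName=config_name)
```

Store the returned configuration name (for example, in a tag or a small table) so the endpoint can be recreated when traffic returns.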

Use Automatic Scaling

Auto Scaling your Amazon SageMaker endpoint doesn’t just provide high availability, better throughput, and better performance; it also optimizes the cost of your endpoint. Make sure that you configure Auto Scaling for your endpoint, monitor your model endpoint, and adjust the scaling policy based on the CloudWatch metrics. For more information, see Load test and optimize an Amazon SageMaker endpoint using automatic scaling.
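Configuring automatic scaling for an endpoint variant comes down to two Application Auto Scaling requests: registering the variant as a scalable target, and attaching a target-tracking policy. The sketch below only builds the request arguments; the function name, policy name, and the example numbers are assumptions, and the right target value has to come from load testing your own model.

```python
def scaling_requests(endpoint_name, variant_name, min_instances, max_instances,
                     invocations_per_instance):
    """Build arguments for register_scalable_target and put_scaling_policy."""
    common = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    }
    target = dict(common, MinCapacity=min_instances, MaxCapacity=max_instances)
    policy = dict(
        common,
        PolicyName=f"{endpoint_name}-{variant_name}-invocations",  # illustrative name
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration={
            # Scale so each instance handles roughly this many invocations per minute.
            "TargetValue": float(invocations_per_instance),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
        },
    )
    return target, policy


target, policy = scaling_requests("my-endpoint", "AllTraffic", 1, 4, 1000)
print(target["ResourceId"])
```

With a `boto3.client("application-autoscaling")` object named `aas`, the requests would be sent as `aas.register_scalable_target(**target)` followed by `aas.put_scaling_policy(**policy)`.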

Amazon Elastic Inference for deep learning

Selecting a GPU instance type that is big enough to satisfy the requirements of the most demanding resource for inference may not be a smart move. Even at peak load, a deep learning application may not fully utilize the capacity offered by a GPU. Consider using Amazon Elastic Inference, which allows you to attach low-cost GPU-powered acceleration to Amazon EC2 and Amazon SageMaker instances to reduce the cost of running deep learning inference by up to 75%.

Host multiple models with multi-model endpoints

You can create an endpoint that can host multiple models. Multi-model endpoints reduce hosting costs by improving endpoint utilization and provide a scalable and cost-effective solution for deploying a large number of models. Multi-model endpoints enable time-sharing of memory resources across models. They also reduce deployment overhead because Amazon SageMaker manages loading models into memory and scaling them based on the traffic patterns to each model.
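From the caller's side, the main difference with a multi-model endpoint is that each request names the model artifact it wants. The following sketch assumes `runtime_client` is a `boto3.client("sagemaker-runtime")` object; the function name, endpoint name, artifact name, and payload format are placeholders.

```python
def invoke_model(runtime_client, endpoint_name, model_artifact, payload,
                 content_type="text/csv"):
    """Send one request to a specific model behind a multi-model endpoint.

    `model_artifact` is the path of the model archive relative to the S3
    prefix the endpoint was created with, for example "customer-42.tar.gz".
    """
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        TargetModel=model_artifact,  # selects which model serves this request
        ContentType=content_type,
        Body=payload,
    )
    return response["Body"].read()
```

The first request for a model that is not yet in memory is typically slower, because the model has to be loaded first. That is the price of sharing instances across many models.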

Reducing labeling time with Amazon SageMaker Ground Truth

Data labeling is the process of identifying raw data (such as images, text files, and videos) and adding one or more meaningful and informative labels to provide context so that an ML model can learn from it. This process is essential because the accuracy of a trained model depends on the accuracy of the labeled dataset, or ground truth.

Amazon SageMaker Ground Truth uses a combination of ML and a human workforce (vetted by AWS) to label images and text. Many ML projects are delayed because of insufficient labeled data. You can use Ground Truth to accelerate the ML cycle and reduce overall costs.

Tagging your resources

Consider tagging your Amazon SageMaker notebook instances and hosting endpoints. Tags such as the name of the project, the business unit, and the environment (such as development, testing, or production) are useful for cost optimization and provide clear visibility into where the money is spent. Cost allocation tags can help you track and categorize your cost of ML. They can help answer questions such as “Can I delete this resource to save cost?”
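A consistent tag set is easier to enforce when it is generated in one place. The sketch below builds the three tags mentioned above and applies them, assuming `client` is a `boto3.client("sagemaker")` object. The tag keys, the allowed environment names, and the function names are illustrative conventions, not AWS requirements.

```python
ENVIRONMENTS = ("development", "testing", "production")


def cost_tags(project, business_unit, environment):
    """Build the tag list in the Key/Value format the SageMaker API expects."""
    if environment not in ENVIRONMENTS:
        raise ValueError(f"environment must be one of {ENVIRONMENTS}")
    return [
        {"Key": "project", "Value": project},
        {"Key": "business-unit", "Value": business_unit},
        {"Key": "environment", "Value": environment},
    ]


def tag_resource(client, resource_arn, tags):
    """Attach tags to a notebook instance, endpoint, or other resource."""
    client.add_tags(ResourceArn=resource_arn, Tags=tags)


print(cost_tags("churn-model", "marketing", "development"))
```

For the tags to show up in billing reports, they also need to be activated as cost allocation tags in the Billing console.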

Keeping track of cost

If you need visibility of your ML cost on AWS, use AWS Budgets. This helps you track your Amazon SageMaker cost, including development, training, and hosting. You can also set alerts and get a notification when your cost or usage exceeds (or is forecasted to exceed) your budgeted amount. After you create your budget, you can track the progress on the AWS Budgets console.
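Budgets can also be created programmatically. The following sketch builds the arguments for the AWS Budgets `create_budget` call, with one alert on actual spend and one on forecasted spend. The budget name, the 80% threshold, the account ID, and the email address are placeholders. The `"Amazon SageMaker"` service filter value is an assumption that should be checked against the service names shown in your own billing data.

```python
def sagemaker_budget_request(account_id, monthly_limit_usd, email,
                             threshold_percent=80.0):
    """Build create_budget arguments for a monthly SageMaker cost budget."""
    def alert(notification_type):
        return {
            "Notification": {
                "NotificationType": notification_type,  # ACTUAL or FORECASTED
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": threshold_percent,
                "ThresholdType": "PERCENTAGE",
            },
            "Subscribers": [{"SubscriptionType": "EMAIL", "Address": email}],
        }

    return {
        "AccountId": account_id,
        "Budget": {
            "BudgetName": "sagemaker-monthly",  # placeholder name
            "BudgetLimit": {"Amount": str(monthly_limit_usd), "Unit": "USD"},
            "CostFilters": {"Service": ["Amazon SageMaker"]},  # assumed filter value
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
        },
        "NotificationsWithSubscribers": [alert("ACTUAL"), alert("FORECASTED")],
    }


request = sagemaker_budget_request("111122223333", 500, "ml-team@example.com")
print(request["Budget"]["BudgetLimit"])
```

With a `boto3.client("budgets")` object named `budgets`, the request would be sent as `budgets.create_budget(**request)`.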

Conclusion

In this post, I highlighted a few approaches and techniques to optimize cost without compromising on the implementation flexibility so you can deliver best-in-class ML-based business applications.

For more information about optimizing costs, consider the following:


About the Author

BK Chaurasiya is a Principal Product Manager at Amazon Web Services R&D and Innovation team. He provides technical guidance, design advice, and thought leadership to some of the largest and successful AWS customers and partners. A technologist by heart, BK specializes in driving DevOps, continuous delivery, and large-scale cloud transformation initiatives to success.

Source: https://aws.amazon.com/blogs/machine-learning/optimizing-costs-for-machine-learning-with-amazon-sagemaker/
