AI-Powered Corrosion Detection for Industrial Equipment: A Scalable Approach with AWS
A Complete AWS ML Solution with SageMaker, Lambda, and API Gateway
Introduction
Industries like manufacturing, energy, and telecommunications require extensive quality control to ensure that their equipment remains operational. One persistent issue that most components are subject to is corrosion: the gradual degradation of metals caused by environmental factors. If left unchecked, corrosion can lead to health hazards, machinery downtime, and infrastructure failure.
This project demonstrates an approach for fully automating the corrosion detection process with cloud computing. Specifically, it uses Amazon SageMaker, Lambda, and API Gateway to build a scalable, efficient, and fault-tolerant quality control solution.
Data
The data for this project comes from the Synthetic Corrosion Dataset (CC BY 4.0), which contains hundreds of synthetic images. Each image is classified as either Corrosion or Not Corrosion.
The data source provides the images in separate folders for training, testing, and validation datasets, so splitting is unnecessary. The training, validation, and testing sets have 270, 8, and 14 images, respectively.
All images are stored in an S3 bucket with the following directory structure:
/train
    /Corrosion
    /Not Corrosion
/test
    /Corrosion
    /Not Corrosion
/valid
    /Corrosion
    /Not Corrosion
The Workflow
In the cloud solution, a user submits an image classification request to an API that is integrated with a Lambda function. The Lambda function fetches the image from the S3 bucket and then classifies it using the SageMaker endpoint. The result of the classification is returned to the user as an API response.
Preprocessing the Data
The ImageDataGenerator in the Keras library loads, preprocesses, and transforms images. All images are normalized, while only the training data is augmented with operations such as rotations and flipping.
Image augmentation is an essential step, given the small number of images available.
Keras automatically assigns labels to the images based on the folder they are in.
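A minimal sketch of this step is shown below. It assumes the dataset has been copied from S3 to a local directory that mirrors the folder structure above. The image size, batch size, and augmentation values are illustrative assumptions, not the settings used in the project.

```python
# Sketch only: image size, batch size, and augmentation values are assumptions.
IMG_SIZE = (224, 224)
BATCH_SIZE = 32

# Training images are normalized and augmented.
TRAIN_AUGMENTATION = {
    "rescale": 1.0 / 255,     # normalize pixel values to [0, 1]
    "rotation_range": 20,     # random rotations of up to 20 degrees
    "horizontal_flip": True,
    "vertical_flip": True,
}
# Validation and test images are only normalized.
EVAL_AUGMENTATION = {"rescale": 1.0 / 255}


def build_generators(data_dir):
    """Return (train, valid, test) iterators over data_dir/{train,valid,test}."""
    # Imported lazily so the settings above can be inspected without TensorFlow.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    def flow(subdir, settings, shuffle):
        return ImageDataGenerator(**settings).flow_from_directory(
            f"{data_dir}/{subdir}",
            target_size=IMG_SIZE,
            batch_size=BATCH_SIZE,
            class_mode="binary",  # one 0/1 label per image
            shuffle=shuffle,
        )

    return (
        flow("train", TRAIN_AUGMENTATION, shuffle=True),
        flow("valid", EVAL_AUGMENTATION, shuffle=False),
        flow("test", EVAL_AUGMENTATION, shuffle=False),
    )
```

Keras sorts the class folder names alphanumerically, so with this layout "Corrosion" maps to label 0 and "Not Corrosion" maps to label 1.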
Creating the Model
The next step is to define the neural network architecture of the model to be trained. Given the small amount of data available, it makes sense to start from a pre-trained model, whose weights can already discern features in images.
The project leverages MobileNetV2, a high-performance model that is relatively memory-efficient.
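One common way to reuse MobileNetV2 is to freeze its pre-trained layers and train a small classification head on top. The sketch below follows that pattern. The head layers, dropout rate, optimizer, and input size are assumptions for illustration and may differ from the project's actual architecture.

```python
INPUT_SHAPE = (224, 224, 3)  # assumed input size


def build_model(input_shape=INPUT_SHAPE):
    """Return a binary classifier built on a frozen MobileNetV2 base."""
    import tensorflow as tf  # lazy import: TensorFlow is only needed here

    base = tf.keras.applications.MobileNetV2(
        input_shape=input_shape, include_top=False, weights="imagenet"
    )
    base.trainable = False  # keep the pre-trained weights fixed

    inputs = tf.keras.Input(shape=input_shape)
    # Inputs are assumed to be in [0, 1]; the ImageNet weights expect [-1, 1].
    x = tf.keras.layers.Rescaling(scale=2.0, offset=-1.0)(inputs)
    x = base(x, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dropout(0.2)(x)
    # Single sigmoid unit: the output is the probability of the class labeled 1.
    outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"]
    )
    return model
```

Freezing the base keeps the number of trainable parameters small, which suits a training set of a few hundred images.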
Training the Model
The model is trained for up to 20 epochs, with early stopping included to reduce run time.
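A sketch of such a training call is shown below, using the Keras EarlyStopping callback. The monitored metric and the patience value are assumptions. The helper stopping_epoch is a simplified, hypothetical illustration of the patience rule, not Keras's own implementation.

```python
MAX_EPOCHS = 20
PATIENCE = 3  # assumed value


def train(model, train_data, valid_data):
    """Fit the model, halting early when validation loss stops improving."""
    import tensorflow as tf  # lazy import

    early_stop = tf.keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=PATIENCE, restore_best_weights=True
    )
    return model.fit(
        train_data,
        validation_data=valid_data,
        epochs=MAX_EPOCHS,
        callbacks=[early_stop],
    )


def stopping_epoch(val_losses, patience=PATIENCE):
    """Return the 1-based epoch at which training would halt, or None."""
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None
```

For example, stopping_epoch([0.9, 0.7, 0.6, 0.65, 0.62, 0.61]) returns 6, because the loss has failed to beat 0.6 for three consecutive epochs by then.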
Deploying the Model
This model must now be deployed to a SageMaker endpoint.
To do so, it is first saved as a tar.gz file and uploaded to S3.
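The packaging step might look like the sketch below. It assumes the model has already been exported in TensorFlow SavedModel format into a directory that contains a numbered version folder (for example export/1/), which is the layout SageMaker's TensorFlow Serving container expects inside the archive. The function names, paths, bucket, and key are hypothetical.

```python
import tarfile
from pathlib import Path


def package_model(saved_model_dir, archive_path="model.tar.gz", model_name="model"):
    """Archive a versioned SavedModel directory as <model_name>/<version>/..."""
    # saved_model_dir is expected to hold numbered version folders, e.g. 1/
    with tarfile.open(str(archive_path), "w:gz") as tar:
        tar.add(str(saved_model_dir), arcname=model_name)
    return str(archive_path)


def upload_model(archive_path, bucket, key):
    """Upload the archive to S3 and return its s3:// URI."""
    import boto3  # lazy import: only needed when actually uploading

    boto3.client("s3").upload_file(str(archive_path), bucket, key)
    return f"s3://{bucket}/{key}"
```

Calling package_model("export") followed by upload_model("model.tar.gz", "<bucket>", "models/model.tar.gz") would yield the S3 URI needed for deployment.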
Because the current model is custom-made, it must be wrapped in a TensorFlowModel object that is compatible with SageMaker's containers before deployment.
With the TensorFlowModel object created, the model can be deployed with a single call.
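A sketch of both steps is shown below, assuming version 2 of the SageMaker Python SDK. The framework version, instance type, endpoint name, and the helper names are assumptions; the IAM role ARN must belong to a role that SageMaker is allowed to assume.

```python
def model_data_uri(bucket, key):
    """Build the S3 URI of the uploaded model archive."""
    return f"s3://{bucket}/{key}"


def deploy_model(model_data, role_arn, endpoint_name="corrosion-detector"):
    """Wrap the archive in a TensorFlowModel and deploy it to an endpoint."""
    from sagemaker.tensorflow import TensorFlowModel  # lazy import

    model = TensorFlowModel(
        model_data=model_data,
        role=role_arn,
        framework_version="2.12",  # assumed; match the training TensorFlow version
    )
    # The deployment itself is this single call.
    return model.deploy(
        initial_instance_count=1,
        instance_type="ml.m5.large",  # assumed instance type
        endpoint_name=endpoint_name,  # hypothetical endpoint name
    )
```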
For clarity on the syntax used for deploying the model, please check out the SageMaker documentation.
Creating the Lambda Function
By calling the endpoint with a Lambda function, applications outside of SageMaker will be able to use the model to classify images.
The Lambda function will do the following:
- Access the image in the given S3 directory
- Preprocess the image to be compatible with the model
- Generate and output the model’s prediction
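The three steps above can be sketched as a handler like the one below. It is an illustration under several assumptions: the endpoint name is read from a hypothetical ENDPOINT_NAME environment variable, the model takes 224x224 RGB input scaled to [0, 1] and returns a single sigmoid score, and Pillow and NumPy are made available to the function (for example through a Lambda layer).

```python
import json
import os

ENDPOINT_NAME = os.environ.get("ENDPOINT_NAME", "corrosion-detector")  # hypothetical
IMG_SIZE = (224, 224)  # assumed model input size
# Keras assigns class indices alphabetically: Corrosion -> 0, Not Corrosion -> 1.
LABELS = ("Corrosion", "Not Corrosion")


def label_from_score(score, threshold=0.5):
    """Map a sigmoid output to a class name."""
    return LABELS[1] if score >= threshold else LABELS[0]


def load_image(bucket, key):
    """Fetch an image from S3 and return it as a normalized nested list."""
    import io

    import boto3
    import numpy as np
    from PIL import Image

    body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"].read()
    image = Image.open(io.BytesIO(body)).convert("RGB").resize(IMG_SIZE)
    return (np.asarray(image, dtype="float32") / 255.0).tolist()


def lambda_handler(event, context):
    bucket, key = event.get("s3_bucket"), event.get("s3_key")
    if not bucket or not key:
        error = {"error": "s3_bucket and s3_key are required"}
        return {"statusCode": 400, "body": json.dumps(error)}

    import boto3  # lazy import: available by default in the Lambda runtime

    # TensorFlow Serving expects a JSON body with an "instances" list.
    payload = json.dumps({"instances": [load_image(bucket, key)]})
    response = boto3.client("sagemaker-runtime").invoke_endpoint(
        EndpointName=ENDPOINT_NAME, ContentType="application/json", Body=payload
    )
    score = json.loads(response["Body"].read())["predictions"][0][0]
    result = {"prediction": label_from_score(score), "score": score}
    return {"statusCode": 200, "body": json.dumps(result)}
```

Validating the event before touching any AWS service lets malformed requests fail fast with a 400 response.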
A quick test event that uses an image in S3 as input confirms that the function is operational. The test image is named “pipe.jpg”.
The image is classified with the following test event:
{
"s3_bucket": "corrosion-detection-data",
"s3_key": "images-to-classify/pipe.jpg"
}
The function classifies the image as “Corrosion”.
Building the API
Creating an API that integrates the Lambda function increases both the usability and security of the SageMaker model.
In AWS, this can be accomplished by creating a REST API in the API Gateway console.
Image classification is a natural fit for a POST request, since users need to send information to the server in the request body. Thus, a POST method that integrates the Lambda function is created in the REST API.
Once the method is integrated with the Lambda function, the API can be deployed for use, thereby allowing other applications access to the SageMaker model.
For instance, a curl command run from a terminal can use the API to classify images. The following is the syntax:
curl -X POST "<API Gateway Invoke URL>" \
  -H "Content-Type: application/json" \
  -d '{
    "s3_bucket": "<S3 Bucket Name>",
    "s3_key": "<S3 Key Name>"
  }'
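The same request can be issued from application code. The sketch below uses only the Python standard library; the function names are hypothetical, and the invoke URL is whatever API Gateway reports for the deployed stage.

```python
import json
import urllib.request


def build_request(invoke_url, bucket, key):
    """Build the POST request that carries the S3 location of the image."""
    body = json.dumps({"s3_bucket": bucket, "s3_key": key}).encode("utf-8")
    return urllib.request.Request(
        invoke_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(invoke_url, bucket, key, timeout=30):
    """Send the request and return the decoded JSON response."""
    request = build_request(invoke_url, bucket, key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```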
The API is now fully operational!
Benefits of the Solution
Utilizing cloud computing services to handle everything from model training to API deployment brings many benefits.
1. Efficiency
SageMaker enables models to be trained and deployed quickly. Furthermore, API Gateway and Lambda allow users to classify images from a single interface in near real time.
2. Scalability
AWS Lambda and SageMaker both offer the scalability needed to adjust to changes in workload. This helps the solution remain operational as traffic grows.
3. Security
AWS allows users to create mechanisms such as API keys and rate limits to protect the API (and the underlying model) from malicious actors. These controls help restrict access to authorized users.
4. Cost Efficiency
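Both Amazon SageMaker and Lambda use pay-as-you-go pricing, which reduces the risk of paying for overprovisioned capacity. Lambda charges only for the compute used while processing a request. SageMaker charges depend on the endpoint type: a serverless endpoint is billed per use, whereas a real-time endpoint is billed for as long as its instances are running.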
Limitations (and Potential Fixes)
Despite these advantages, the solution falls short in a few areas that could be addressed with some changes to the workflow.
1. Minimal Training Data
The training data is lacking in both quantity and variety. Most pictures are of pipes and corrosion, so it is unclear how the model would classify other objects, such as boilers and turbine blades. To improve the model’s general performance across different use cases, a more extensive data collection effort is required.
2. No Support for Batching
The current approach allows users to classify images only one at a time. This becomes tedious as the number of images needing classification rises. Batching would be an appropriate remedy for this issue, offering a simple way to classify multiple images at once.
3. No Real-Time Alerts
Corrosion found in equipment needs to be dealt with as soon as possible. However, the current cloud architecture does not trigger any notifications when corrosion is detected in an image. An SNS topic that pushes messages whenever the model identifies corrosion would help end users address these cases in real time.
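The last two fixes could be combined in one helper, sketched below. It is not part of the project: the function names are hypothetical, classify_one stands in for whatever function classifies a single image, and the SNS topic ARN is supplied by the caller.

```python
def build_alert(topic_arn, bucket, key):
    """Return keyword arguments for an SNS publish call."""
    return {
        "TopicArn": topic_arn,
        "Subject": "Corrosion detected",
        "Message": f"Corrosion detected in s3://{bucket}/{key}",
    }


def classify_batch(bucket, keys, classify_one, topic_arn=None, sns_client=None):
    """Classify each key and publish an alert for every corroded image."""
    if topic_arn and sns_client is None:
        import boto3  # lazy import: only needed when alerts are enabled

        sns_client = boto3.client("sns")

    results = {}
    for key in keys:
        label = classify_one(bucket, key)  # returns a class name for one image
        results[key] = label
        if topic_arn and label == "Corrosion":
            sns_client.publish(**build_alert(topic_arn, bucket, key))
    return results
```

Accepting the SNS client as a parameter keeps the helper easy to exercise with a stand-in client before it is wired to a real topic.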
Conclusion
The combination of SageMaker, Lambda, and API Gateway allows for an efficient, automated, and scalable quality control solution. While the project focused on classifying corroded objects, the architecture can be applied to other computer vision use cases.
For access to the code, please check out the GitHub repository:
GitHub - anair123/Corrosion-Detection-With-AWS
Thank you for reading!
AI-Powered Corrosion Detection for Industrial Equipment: A Scalable Approach with AWS was originally published in Towards Data Science on Medium.