Identifying individual tablets in a photograph of multiple pills typically requires training photographs that themselves contain many pills. In addition, the location and type of every pill must be specified in each photograph. Collecting the training data and labeling the pill class in each image, however, become more difficult as the number of pill types, and with it the number of possible combinations, increases. We therefore propose a technique for accurately identifying individual tablets in a multi-pill image. The model is trained on images that each contain a single pill from one class and is then used to recognize the individual tablets.
[Diagram: four stages (1. Pre-training, 2. Labeling, 3. Multi-class Training, 4. Detection); other legible labels: Localization, JSON file, Extraction, Data augmentation, Rotation, Shuffle]
Figure 3-13: Process of the proposed method
The proposed pill learning and detection method is depicted in Figure 3.13. The procedure has four steps. The first is pre-training: single-class pill area learning, which determines the area of each pill. The second is data labeling. The third is multi-class pill detection learning, which identifies the types of the pills whose areas were found in the first step. The fourth is pill detection. Learning pill detection therefore has two stages: the training data for pill area detection in the first learning stage are images of several pills, whereas the multi-class training data in the second learning stage are images of a single pill.
3.7.1.1 Single Class Based Pill Area Detection Learning
Single-class pill area detection learning is used to locate pills precisely in an image. The resulting model separates the pills from the background, both during label automation and during the detection of multiple pill types. For training, we used images containing several tablets, each paired with a binary mask image, so that the positions of the pills could be identified precisely. Regardless of the kind of pill, every pill was assigned to a single class ("Pill").
[Figure panels (a), (b), (c). Legible text in panel (c): "Detected Pills #6"; pos [x, y, width, height]; pill list entries of the form "2. Pill, score: 1.000, pos (424, 621, 130, 132)"]
Figure 3-14: Result of the pill area detection: (a) Detection result image; (b) Cropped image of instance segmentation, in which the outer rectangle is the bounding box and the inner solid line outlines the detected pill area; (c) Cropped image of the detection information, consisting of the number of pills, detection scores, and bounding box positions.
Figure 3.14 shows the detection result. Regardless of the color and shape of the pill, its area is expressed in pixels. Training for pill area detection allows the location and area of each individual pill to be identified at detection time. This detection is carried out once, as a form of pre-training for pill detection.
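As an illustration of what "area in pixels" and a bounding box position mean for one detected pill, the following is a minimal sketch in plain Python. It assumes the detected pill area is available as a binary mask (a list of rows in which nonzero values mark pill pixels); the function name and the example mask are hypothetical and are not taken from the method described above.

```python
def mask_area_and_bbox(mask):
    """Return (area_in_pixels, (x, y, width, height)) for a binary mask.

    `mask` is a list of rows; nonzero entries mark pill pixels.
    Returns (0, None) when the mask contains no pill pixels.
    """
    xs, ys = [], []
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                xs.append(x)
                ys.append(y)
    if not xs:
        return 0, None
    x0, y0 = min(xs), min(ys)
    width = max(xs) - x0 + 1
    height = max(ys) - y0 + 1
    return len(xs), (x0, y0, width, height)


# Hypothetical 4x5 mask containing one small "pill".
example_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
]
area, bbox = mask_area_and_bbox(example_mask)
```

The bounding box uses the same (x, y, width, height) ordering as the position values listed in Figure 3.14(c).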
3.7.1.2 Data Labeling and Automatic Generation of JSON Files
Data labeling is the process of editing and classifying data with data processing tools in order to train a deep learning model. Image-based object detection requires the training images and, for each image, the position coordinates of the objects it contains. Mask R-CNN requires polygonal coordinates that describe both the position and the shape of each object. These polygonal coordinates are normally produced with an annotation tool, in which the polygon and class name of every object in the image are marked by hand. Using such tools, however, takes a lot of time and effort, so a method for automating data labeling is needed. Using the single-class pill area learning model presented in Section 3.7.1.1, we provide a method for automatically detecting a pill's area. The detected area is then converted into polygonal coordinates. We also provide a technique for automatically creating a JSON file from the coordinates and image data.
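The binarization, dilation, and contour steps named in Figure 3.15(a) can be sketched roughly as follows. This is a plain-Python illustration under stated assumptions, not the implementation used here: the threshold of 128, the 3x3 dilation neighbourhood, and all function names are hypothetical. The last function only collects the boundary pixels of the region; ordering them into a closed polygon would normally be left to a contour-tracing routine such as OpenCV's findContours, which is not shown.

```python
def binarize(gray, threshold=128):
    """Map a grayscale image (list of rows) to 0/1 using an assumed threshold."""
    return [[1 if v >= threshold else 0 for v in row] for row in gray]


def dilate(mask):
    """3x3 dilation: a pixel becomes 1 if it or any of its 8 neighbours is 1."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx]:
                        out[y][x] = 1
    return out


def boundary_pixels(mask):
    """Return (x, y) of foreground pixels that touch background or the image
    edge through a 4-neighbour. The points are in scan order, not polygon order."""
    h, w = len(mask), len(mask[0])
    points = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if not (0 <= ny < h and 0 <= nx < w) or not mask[ny][nx]:
                    points.append((x, y))
                    break
    return points


# Hypothetical 5x5 image with a single bright pixel in the centre.
gray = [[0] * 5 for _ in range(5)]
gray[2][2] = 200
dilated = dilate(binarize(gray))
contour = boundary_pixels(dilated)
```

Dilation grows the detected region slightly before the contour is taken, so the resulting outline tends to enclose the pill rather than cut into it.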
[Diagram: [Step 1] Pill Segmentation, Region Detection; then Binarization, Dilation, Contour, JSON File]
(a)
{ "filename": filename,
  "regions":
  [
    {
      "shape_attributes":
        {"name": "polyline",
         "all_points_x": all_points_x,
         "all_points_y": all_points_y
        }
    }
  ]
}
(b)
Figure 3-15: Process of data labeling and JavaScript Object Notation file creation: (a) Process of data labeling; (b) Structure of JavaScript Object Notation.
The proposed method for data labeling and JSON file creation is depicted in Figure 3.15. The JSON file stores, among other information, the file name of each image and the polygonal coordinates of the pill region.
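A minimal sketch of how such a JSON record could be generated is given below. It follows the structure shown in Figure 3.15(b), with the coordinate keys read here as "all_points_x" and "all_points_y"; the function name, the file name "pill_001.jpg", and the example polygon are hypothetical, and any further fields the actual files may contain are not reproduced.

```python
import json


def make_annotation(filename, polygon):
    """Build one annotation record from an ordered list of (x, y) polygon points."""
    return {
        "filename": filename,
        "regions": [
            {
                "shape_attributes": {
                    "name": "polyline",
                    "all_points_x": [x for x, _ in polygon],
                    "all_points_y": [y for _, y in polygon],
                }
            }
        ],
    }


# Hypothetical image name and rectangular pill outline.
record = make_annotation("pill_001.jpg", [(10, 12), (40, 12), (40, 30), (10, 30)])
text = json.dumps(record, indent=2)
```

Storing the x and y coordinates in two parallel lists keeps the polygon order implicit in the list index, so both lists must always have the same length.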
3.7.1.3 Multi-Class Based Pill Label Detection Learning
Pill label detection requires a model that performs both classification and detection. The best-performing model for pill detection is Mask R-CNN, a successor to Faster R-CNN [30]. The instance segmentation function of Mask R-CNN can express the area of a detected object in pixels. Each training image for the proposed learning model contains just one pill, and the input data is a JSON file with the polygonal coordinates of the pill area. The pill area data are obtained with the pill region detection model described in Section 3.7.1.1 and are converted into a JSON file by the data labeling and automatic JSON file generation algorithm described in Section 3.7.1.2.
In addition, exposure and rotation augmentation were applied to compensate for the lack of training data. A Python module called "imgaug" was used for data augmentation [31]; during training, each image was rotated by an arbitrary angle between -180° and +180°. Finally, multi-class learning was carried out using the images of distinct pills. The multi-class pill identification training procedure is depicted in Figure 3.16.
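When an image is rotated for augmentation, the polygonal coordinates of the pill area have to be rotated by the same angle so that the label still matches the image. The sketch below illustrates this with plain Python; it does not call imgaug, and the function names, the rotation centre, and the example points are hypothetical. Only the angle range of -180° to +180° is taken from the text.

```python
import math
import random


def random_rotation_angle(rng=random):
    """Draw a rotation angle in degrees from the range [-180, 180]."""
    return rng.uniform(-180.0, 180.0)


def rotate_polygon(points, angle_deg, center):
    """Rotate (x, y) points by angle_deg about `center`.

    In image coordinates (y axis pointing down) a positive angle
    appears as a clockwise rotation on screen.
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    cx, cy = center
    rotated = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        rotated.append((cx + dx * cos_t - dy * sin_t,
                        cy + dx * sin_t + dy * cos_t))
    return rotated


# Hypothetical two-point example rotated by 90 degrees about (50, 50).
polygon = [(60, 50), (50, 60)]
rotated = rotate_polygon(polygon, 90, center=(50, 50))
angle = random_rotation_angle()
```

An augmentation library normally applies the same transform to the image and its annotations in one call; the hand-written version here only makes the geometry explicit.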
[Diagram: Mask R-CNN training; legible labels: Data, Classification, Pill Contour, Profiling, Training data, Input set, Validation data, Augmentation]
Figure 3-16: Training process of pill detection using mask region-based
convolutional neural network