VIETNAM GENERAL CONFEDERATION OF LABOR
TON DUC THANG UNIVERSITY
FACULTY OF INFORMATION TECHNOLOGY
FINAL PROJECT
INTRODUCTION TO DIGITAL IMAGE PROCESSING
Instructor: Dr. PHẠM VĂN HUY
Table of Contents
1 Introduction
1.1 Overview
1.2 Problem formulation
2 Method
2.1 Chessboard segmentation
2.1.1 Preprocess
2.1.2 Detect outerbox
2.1.3 Find the edges
2.1.4 Crop the chessboard
2.2 Transform image board to text board
3 Conclusion
4 Reference
1 Introduction
1.1 Overview
Computer vision has revolutionized the way we approach problems in many different fields. From autonomous vehicles to medical imaging, computer vision has provided new and innovative solutions that have improved our lives in countless ways. One such application of computer vision is the recognition and solution of sudoku puzzles.
Sudoku is a popular puzzle game that requires the player to fill a 9x9 grid with numbers such that each column, row, and 3x3 sub-grid contains all the digits from 1 to 9. Solving sudoku puzzles requires a combination of logic and patience, and it can be a time-consuming and challenging task for even the most experienced players.
The idea of using computer vision to solve sudoku puzzles is not new, but with the advancements in computer vision techniques and the availability of powerful computing resources, it is now possible to develop systems that can recognize and solve sudoku puzzles with a high degree of accuracy. In this thesis, we will explore the use of computer vision techniques to recognize and solve sudoku puzzles.
The aim of this research is to develop a robust and efficient computer vision system that can accurately recognize sudoku puzzles.
In conclusion, this thesis will contribute to the field of computer vision by demonstrating the feasibility of using computer vision techniques to recognize and solve sudoku puzzles. The results of this research will provide valuable insights into the challenges of recognizing and solving sudoku puzzles through computer vision and the effectiveness of different approaches in overcoming these challenges. The end goal is to develop a system that can accurately recognize and solve sudoku puzzles, making the process of solving sudoku puzzles faster, more efficient, and more accessible to everyone.
1.2 Problem formulation
The input to the problem is an image of a 9x9 sudoku chessboard.
We need to segment the chessboard image to obtain a binary image representing the contours and numbers belonging to the chessboard. This result will be saved to the file “output.txt”.
Next, we need to determine whether each cell contains a number or is empty, then convert the chessboard to text. Cells with numbers are marked with an “X” and other cells are marked with spaces. The results are saved to the file “output.txt”. Examples are as follows:
X XX
X X X
X X
X X XX
X X
XX X X
X X
X X X
XX X
2 Method
2.1 Chessboard segmentation
The goal of this section is to crop the binary image of the chessboard. At the same time, it is necessary to bring the chessboard to a frontal view, so that each square of the chessboard can be cropped and its number predicted later.
2.1.1 Preprocess
To remove noise and increase accuracy, we blur the input image with a Gaussian blur [1].
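As a rough illustration of the blurring step, the sketch below convolves a small grayscale image (a list of rows of integers) with the common 3x3 Gaussian kernel. The function name `gaussian_blur_3x3`, the kernel size, and the border handling (replicating edge pixels) are assumptions made for illustration; the report does not state which kernel was used, and in practice a library routine such as OpenCV's `cv2.GaussianBlur` would normally be called instead.

```python
# 3x3 Gaussian kernel; the weights sum to 16.
KERNEL = [[1, 2, 1],
          [2, 4, 2],
          [1, 2, 1]]

def gaussian_blur_3x3(img):
    """Blur a grayscale image given as a list of rows of ints (0-255)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    # Clamp coordinates so that border pixels are replicated.
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += KERNEL[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = acc // 16  # normalise by the kernel sum
    return out
```

A single bright pixel is spread over its neighbours while a uniform region is left unchanged, which is why isolated noise is weakened but the thick grid lines survive.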
Figure 3: Blurred image
Next, we threshold the image to convert it to binary. The algorithm used is adaptive thresholding (Gaussian or mean).
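The mean variant of adaptive thresholding can be sketched as follows: each pixel is compared with the mean of its own neighbourhood minus a small constant, so uneven lighting across the page matters less than with a single global threshold. The function name, the neighbourhood `radius`, and the constant `c` are assumed values for illustration, not the settings used in this project; OpenCV offers the same operation as `cv2.adaptiveThreshold`.

```python
def adaptive_mean_threshold(img, radius=1, c=2):
    """Return a binary image: 255 where a pixel exceeds (local mean - c), else 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Neighbourhood of the pixel, clipped at the image border.
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            total = sum(img[yy][xx] for yy in ys for xx in xs)
            mean = total / (len(ys) * len(xs))
            out[y][x] = 255 if img[y][x] > mean - c else 0
    return out
```

With this convention dark ink on bright paper becomes black (0) and the paper becomes white (255).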
What we are interested in are the lines and numbers on the chessboard, so we convert them to white pixels by inverting the image.
Figure 5: Inverted image
To make sure the lines are not broken, we apply the dilation morphological transformation [3].
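A minimal sketch of dilation on a binary image is given here, assuming a square structuring element; the function name `dilate` and the `radius` parameter are illustrative choices and the element actually used in the project is not stated. A pixel becomes white if any pixel in its neighbourhood is white, which closes small gaps in the grid lines.

```python
def dilate(binary, radius=1):
    """Dilate a binary image (values 0 or 255) with a square structuring element."""
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # The output pixel is the maximum over the (border-clipped) neighbourhood.
            out[y][x] = max(
                binary[yy][xx]
                for yy in range(max(0, y - radius), min(h, y + radius + 1))
                for xx in range(max(0, x - radius), min(w, x + radius + 1))
            )
    return out
```

Erosion is the dual operation: the same loop with `min` in place of `max` thins white regions instead of thickening them.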
The main lines that we are interested in (the bold outline around the board) are now clear, so we can go on to the next part.
2.1.2 Detect outerbox
We now find the main contour of the chessboard. The idea is to use the Flood Fill algorithm to find the largest connected component. Specifically, we visit each pixel of the image, fill the component connected to that pixel, and record the pixel position whose connected component is largest. That component is the main contour of the chessboard. We will call it the "outerbox".
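The flood-fill search described above can be sketched as a breadth-first fill started from every unvisited white pixel, keeping the largest component found. The function name `largest_component` and the use of 8-connectivity are assumptions for illustration; the report does not say which connectivity was used.

```python
from collections import deque

def largest_component(binary):
    """Return the set of (row, col) pixels of the largest 8-connected white component."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    best = set()
    for sy in range(h):
        for sx in range(w):
            if binary[sy][sx] == 0 or seen[sy][sx]:
                continue
            # Flood fill (breadth-first) from this unvisited white seed pixel.
            component = {(sy, sx)}
            seen[sy][sx] = True
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] != 0 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            component.add((ny, nx))
                            queue.append((ny, nx))
            if len(component) > len(best):
                best = component
    return best
```

Because every pixel is marked as seen once, the whole image is processed in time proportional to its number of pixels, rather than refilling the same component from each of its pixels.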
Apply the erosion transformation to return the outerbox to its original state (before dilation).
Figure 8: Eroded outerbox
2.1.3 Find the edges
Apply the Hough Transform to detect straight lines in the image.
However, in practice this transformation produces many candidate lines for each actual straight line, so we need to cluster them down to 8 lines.
The result of the Hough Transform is a set of straight lines, each consisting of 2 components (r, θ), where r is the distance from the origin pixel (0, 0) to the line and θ is the angle between the horizontal axis and the perpendicular from the origin to the line.
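For reference, the standard (r, θ) parameterization of a line used by the Hough Transform can be written as the equation below, where x and y denote pixel coordinates (symbols introduced here for illustration). In this form θ is the direction of the perpendicular from the origin to the line, so a vertical line x = c has θ = 0 and r = c, while a horizontal line y = c has θ = π/2 and r = c.

```latex
r = x\cos\theta + y\sin\theta
```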
Before clustering, we need to normalize r and θ to the interval [0, 1] using the Min-Max Normalization technique to get more accurate results. (The reason is that r is measured in pixels, so its values are much larger than those of θ, which is measured in radians.)
After clustering is complete, we take the centers of the 8 clusters as the 8 lines we are looking for.
Figure 11: 8 merged lines
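The normalize-then-cluster step can be sketched with a plain k-means on the normalized (r, θ) pairs. This is only an illustration under stated assumptions: the helper names `min_max`, `kmeans`, and `merge_lines`, the deterministic farthest-point initialization, and the fixed iteration count are not taken from the report, and the sketch ignores the wrap-around of θ near 0 and π.

```python
import math

def min_max(values):
    """Scale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    return [(v - lo) / span for v in values]

def kmeans(points, k, iters=20):
    """Plain k-means on 2-D points; returns the k centers and one label per point."""
    # Deterministic farthest-point initialisation (an assumption, not from the report).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels

def merge_lines(lines, k):
    """Cluster (r, theta) lines in normalised space; return one averaged line per cluster."""
    rs, thetas = zip(*lines)
    points = list(zip(min_max(rs), min_max(thetas)))
    _, labels = kmeans(points, k)
    merged = []
    for j in range(k):
        members = [ln for ln, lab in zip(lines, labels) if lab == j]
        if members:
            merged.append(tuple(sum(c) / len(members) for c in zip(*members)))
    return sorted(merged)
```

Averaging the original (r, θ) values of each cluster gives the same line as mapping the normalized cluster center back to pixel and radian units, because Min-Max Normalization is a linear rescaling.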
2.1.4 Crop the chessboard
Next, we take the 2 outermost horizontal borders and the 2 outermost vertical borders of the chessboard in order to find its 4 corners. Extracting these boundaries uses basic computational logic.
Then we find their 4 intersection points using basic geometric calculations.
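The intersection of two lines in (r, θ) form can be computed by solving the 2x2 linear system formed by their equations x·cos θ + y·sin θ = r. The sketch below uses Cramer's rule; the function names `intersect` and `corners`, the parallel-line tolerance `eps`, and the corner ordering are assumptions for illustration.

```python
import math

def intersect(line_a, line_b, eps=1e-9):
    """Intersect two lines given as (r, theta) with x*cos(theta) + y*sin(theta) = r."""
    (r1, t1), (r2, t2) = line_a, line_b
    # Determinant of the 2x2 system; it equals sin(t2 - t1).
    det = math.cos(t1) * math.sin(t2) - math.sin(t1) * math.cos(t2)
    if abs(det) < eps:
        return None  # (nearly) parallel lines have no single intersection point
    x = (r1 * math.sin(t2) - r2 * math.sin(t1)) / det
    y = (r2 * math.cos(t1) - r1 * math.cos(t2)) / det
    return x, y

def corners(top, bottom, left, right):
    """Board corners in the order top-left, top-right, bottom-right, bottom-left."""
    return [intersect(top, left), intersect(top, right),
            intersect(bottom, right), intersect(bottom, left)]
```

For example, a vertical border at x = 10 is the line (10, 0) and a horizontal border at y = 20 is the line (20, π/2), and their intersection is the point (10, 20) up to floating-point rounding.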
Finally, threshold the cropped image to obtain a binary image again, then save the result.
Figure 15: Binary image of segmented sudoku chessboard
2.2 Transform image board to text board
Take the result from the previous section, resize it to 252 x 252, and cut out each 28 x 28 cell.
Figure 16: Cropped cells
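Since 252 = 9 x 28, cutting out the cells amounts to slicing the resized image on a regular grid. A minimal sketch follows, assuming the board is already a 252 x 252 image stored as a list of rows; the function name `split_cells` and its parameters are illustrative, and the resize itself would typically be done beforehand with a library call such as `cv2.resize`.

```python
def split_cells(board, cell=28, n=9):
    """Cut an (n*cell) x (n*cell) image into an n x n grid of cell x cell images."""
    size = n * cell
    assert len(board) == size and all(len(row) == size for row in board)
    return [
        [
            # Rows r*cell..(r+1)*cell and columns c*cell..(c+1)*cell form cell (r, c).
            [row[c * cell:(c + 1) * cell] for row in board[r * cell:(r + 1) * cell]]
            for c in range(n)
        ]
        for r in range(n)
    ]
```

`split_cells(board)[r][c]` is then the 28 x 28 image of the cell in row r and column c, counted from the top-left corner.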
We can determine whether a cell contains a number by checking whether its number of white pixels is greater than some threshold. However, the edges of these cells contain white pixels from the grid lines, so we crop only the inside of each cell.
Figure 18: Inner cropped cells
Then we find a suitable threshold (statistically) to determine whether the cell has a number or not. For example, in this figure I choose a threshold of 30. That is, if the image has more than 30 white pixels, it contains a number; otherwise it does not.
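The inner crop and the white-pixel test can be combined in one small function, sketched here. The threshold of 30 is the value quoted above, while the function name `has_digit` and the `margin` of 4 pixels trimmed from each side are assumptions for illustration, since the report does not state how much of the border was removed.

```python
def has_digit(cell, margin=4, threshold=30):
    """Return True if the inner part of a binary cell has more than `threshold` white pixels."""
    # Drop `margin` pixels on every side to discard the white grid lines.
    inner = [row[margin:len(row) - margin] for row in cell[margin:len(cell) - margin]]
    white = sum(1 for row in inner for px in row if px > 0)
    return white > threshold
```

With these values a 28 x 28 cell is reduced to its central 20 x 20 region before counting, so a cell that only shows grid lines along its edges is reported as empty.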
Finally, represent the result as text and then write it to a file.
output.txt
X XX
X X X
X X
X X XX
X X
XX X X
X X
X X X
XX X
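Writing the text board can be sketched as follows, assuming the per-cell decisions are available as a 9 x 9 grid of booleans. The function names `board_to_text` and `save_board` are illustrative; the "X" and space markers and the file name "output.txt" are the ones described in this report.

```python
def board_to_text(occupied):
    """Turn a grid of booleans into text lines: 'X' for a digit, ' ' for an empty cell."""
    return "\n".join("".join("X" if cell else " " for cell in row) for row in occupied)

def save_board(occupied, path="output.txt"):
    """Write the text board to `path`, one board row per line."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(board_to_text(occupied) + "\n")
```

Each output line has exactly 9 characters, so the column of a digit can be read directly from its position in the line.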
3 Conclusion
In conclusion, this thesis has presented a comprehensive study on the use of computer vision techniques to recognize sudoku puzzles. The proposed method used a combination of image processing and machine learning algorithms to accurately detect and recognize the digits in a sudoku puzzle.
This study opens up a number of potential avenues for further research, such as improving the recognition accuracy, increasing the robustness of the method to different types of distortions, and incorporating more advanced computer vision techniques.
4 Reference
[1] K. Kaur and S. Kaur, "Gaussian Blur for Image Smoothing and Noise Reduction," Journal of Advanced Research in Dynamical and Control Systems, vol. 9, no. 2, pp. 782-787, 2017.
[2] Z.-K. Huang and K.-W. Chau, "A new image thresholding method based on Gaussian mixture model," Applied Mathematics and Computation, vol. 205, no. 2, pp. 899-907, 2008.
[3] https://homepages.inf.ed.ac.uk/rbf/HIPR2/morops.htm
[4] https://docs.opencv.org/3.4/d9/db0/tutorial_hough_lines.html
[5] M. Ahmed, R. Seraj, and S. M. S. Islam, "The k-means algorithm: A comprehensive survey and performance evaluation," Electronics, vol. 9, no. 8, 1295, 2020.