final project introduction to digital image processing


VIETNAM GENERAL CONFEDERATION OF LABOR

TON DUC THANG UNIVERSITY

FACULTY OF INFORMATION TECHNOLOGY

FINAL PROJECT

INTRODUCTION TO DIGITAL IMAGE PROCESSING

Instructor: Dr. PHẠM VĂN HUY


Table of Contents

1 Introduction
1.1 Overview
1.2 Problem formulation
2 Method
2.1 Chessboard segmentation
2.1.1 Preprocess
2.1.2 Detect outerbox
2.1.3 Find the edges
2.1.4 Crop the chessboard
2.2 Transform image board to text board
3 Conclusion
4 Reference


1 Introduction

1.1 Overview

Computer vision has revolutionized the way we approach problems in many different fields. From autonomous vehicles to medical imaging, computer vision has provided new and innovative solutions that have improved our lives in countless ways. One such application of computer vision is the recognition and solution of sudoku puzzles.

Sudoku is a popular puzzle game that requires the player to fill a 9x9 grid with numbers such that each column, row, and 3x3 sub-grid contains all the digits from 1 to 9. Solving sudoku puzzles requires a combination of logic and patience, and it can be a time-consuming and challenging task for even the most experienced players.

The idea of using computer vision to solve sudoku puzzles is not new, but with the advancements in computer vision techniques and the availability of powerful computing resources, it is now possible to develop systems that can recognize and solve sudoku puzzles with a high degree of accuracy. In this thesis, we will explore the use of computer vision techniques to recognize and solve sudoku puzzles.


The goal of this research is to develop a robust and efficient computer vision system that can accurately recognize sudoku puzzles.

In conclusion, this thesis will contribute to the field of computer vision by demonstrating the feasibility of using computer vision techniques to recognize and solve sudoku puzzles. The results of this research will provide valuable insights into the challenges of recognizing and solving sudoku puzzles through computer vision and the effectiveness of different approaches in overcoming these challenges. The end goal is to develop a system that can accurately recognize and solve sudoku puzzles, making the process faster, more efficient, and more accessible to everyone.

1.2 Problem formulation

The input to the problem is an image of a 9x9 sudoku board (referred to as the "chessboard" in this report).


We need to segment the chessboard image to obtain a binary image representing the contours and numbers belonging to the chessboard. This result will be saved to the file “output.txt”.


Next, we need to determine which cells contain a number and which are empty, then convert the chessboard to text. Cells with numbers are marked with an “X” and other cells are marked with spaces. The results are saved to the file “output.txt”. An example follows:

X XX

X X X

X X

X X XX

X X

XX X X

X X

X X X

XX X

2 Method

2.1 Chessboard segmentation

The goal of this section is to obtain a cropped binary image of the chessboard. At the same time, the board must be brought to a front view, so that each square of the chessboard can be cropped and its number predicted later.


2.1.1 Preprocess

To remove noise and increase accuracy, we blur the input image with a Gaussian blur [1].
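
To make the smoothing step concrete, here is a minimal sketch of a Gaussian blur with a fixed 3x3 kernel, written in plain Python. The function name `gaussian_blur_3x3`, the kernel size and the border handling are illustrative assumptions; the project does not state its blur parameters, and a library routine would normally be used instead.

```python
def gaussian_blur_3x3(img):
    """Blur a grayscale image (list of rows of ints 0..255).

    Uses the 3x3 kernel [[1, 2, 1], [2, 4, 2], [1, 2, 1]] / 16 (an assumed
    kernel, not the project's). Pixels outside the image are replaced by
    the nearest edge pixel.
    """
    kernel = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp to the border
                    xx = min(max(x + dx, 0), w - 1)
                    acc += kernel[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = acc // 16
    return out


# A single bright noise pixel on a dark background is spread out and dimmed.
noisy = [[0, 0, 0], [0, 160, 0], [0, 0, 0]]
blurred = gaussian_blur_3x3(noisy)
```

The isolated pixel loses most of its intensity, which is why blurring before thresholding suppresses speckle noise.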

Figure 3: Blurred image

Next, we threshold the image to convert it to binary. The algorithm used is adaptive thresholding (Gaussian or mean).
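
As a rough illustration of the mean variant, the sketch below compares each pixel with the mean of its neighbourhood minus a constant. The function name `adaptive_mean_threshold`, the window size `block` and the offset `c` are assumptions chosen for this toy example, not the values used in the project; OpenCV offers the same idea as `cv2.adaptiveThreshold`.

```python
def adaptive_mean_threshold(img, block=3, c=10):
    """Binarize a grayscale image (list of rows of ints 0..255).

    A pixel becomes 255 if it is brighter than the mean of its
    block x block neighbourhood minus c, otherwise 0. `block` and `c`
    are assumed values. Pixels outside the image are replaced by the
    nearest edge pixel.
    """
    h, w = len(img), len(img[0])
    half = block // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0
            for dy in range(-half, half + 1):
                for dx in range(-half, half + 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp to the border
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            mean = total / (block * block)
            out[y][x] = 255 if img[y][x] > mean - c else 0
    return out


# Toy example: a dark vertical stroke on a bright, unevenly lit background.
image = [
    [200, 200, 40, 210, 220],
    [190, 195, 35, 205, 215],
    [180, 190, 30, 200, 210],
]
binary = adaptive_mean_threshold(image)
```

Because the threshold follows the local brightness, the stroke is separated from the background even though the lighting varies across the image.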


What we are interested in are the lines and numbers on the chessboard, so we convert them to white pixels by inverting the image.

Figure 5: Inverted image

To make sure the lines are not broken, we apply the dilation morphological transformation [3].
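
The inversion and dilation steps can be sketched in a few lines of plain Python. The 3x3 square structuring element and the helper names `invert` and `dilate` are assumptions made for illustration; the project does not state which kernel it uses.

```python
def invert(img):
    """Swap black and white in a binary image (values 0 or 255)."""
    return [[255 - v for v in row] for row in img]


def dilate(img):
    """Dilate with an assumed 3x3 square: a pixel turns white if any neighbour is white."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w and img[yy][xx] == 255:
                        out[y][x] = 255
    return out


# A black line (0) on white (255) with a one-pixel gap in the middle.
thresholded = [
    [255, 255, 255, 255, 255, 255, 255],
    [255, 255, 255, 255, 255, 255, 255],
    [255,   0,   0, 255,   0,   0, 255],
    [255, 255, 255, 255, 255, 255, 255],
    [255, 255, 255, 255, 255, 255, 255],
]
inverted = invert(thresholded)  # the line is now white, the gap is black
dilated = dilate(inverted)      # the gap at column 3 is closed
```

Dilation also thickens the line by one pixel on each side, which is why an erosion is applied later to undo that growth.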


We are interested in the main lines (the bold outline around the board). Here they are clear, so we can move on to the next step.

2.1.2 Detect outerbox

We will find the main contour of the chessboard. The idea is to use the Flood Fill algorithm to find the connected component with the largest size. Specifically, we visit each pixel of the image, fill the component connected to that pixel, and record the pixel position whose connected component is largest. That component is the main contour of the chessboard. We'll call it the "outerbox".
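
The search for the largest connected component can be sketched as follows. This is a minimal illustration using a breadth-first flood fill and 4-connectivity; the connectivity, the function name `largest_component` and the toy image are assumptions, not details taken from the project.

```python
from collections import deque


def largest_component(img):
    """Return (size, seed) of the largest 4-connected white region.

    img is a binary image (list of rows with values 0 or 255). seed is the
    (row, col) of the first pixel from which that region was reached.
    4-connectivity is an assumption made for this sketch.
    """
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    best_size, best_seed = 0, None
    for sy in range(h):
        for sx in range(w):
            if img[sy][sx] != 255 or seen[sy][sx]:
                continue
            # Flood fill from (sy, sx) with a breadth-first search.
            size = 0
            queue = deque([(sy, sx)])
            seen[sy][sx] = True
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and img[ny][nx] == 255 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if size > best_size:
                best_size, best_seed = size, (sy, sx)
    return best_size, best_seed


W = 255
board = [
    [W, W, W, W, W, 0, 0],
    [W, 0, 0, 0, W, 0, W],
    [W, 0, W, 0, W, 0, 0],
    [W, 0, 0, 0, W, 0, 0],
    [W, W, W, W, W, 0, 0],
]
size, seed = largest_component(board)  # the square frame wins over the two specks
```

Marking pixels as visited means each pixel is filled only once, so the whole scan is linear in the number of pixels. Re-filling from `seed` then isolates the outerbox.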


Apply the erosion transformation to return the outerbox to its original thickness (before dilation).

Figure 8: Eroded outerbox
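
Erosion is the counterpart of dilation, and a minimal sketch shows how it thins a line back down. The 3x3 square structuring element and the name `erode` are assumptions for illustration only.

```python
def erode(img):
    """Erode with an assumed 3x3 square: a pixel stays white only if all
    of its neighbours inside the image are white."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            keep = True
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    # Neighbours outside the image are ignored.
                    if 0 <= yy < h and 0 <= xx < w and img[yy][xx] != 255:
                        keep = False
            out[y][x] = 255 if keep else 0
    return out


# A white line three pixels thick (as left by dilation) shrinks to one pixel.
thick = [
    [0, 0, 0, 0, 0],
    [255, 255, 255, 255, 255],
    [255, 255, 255, 255, 255],
    [255, 255, 255, 255, 255],
    [0, 0, 0, 0, 0],
]
thin = erode(thick)
```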

2.1.3 Find the edges

Apply the Hough Transform to detect straight lines in the image.
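
To show what the transform computes, here is a small accumulator-based sketch in plain Python. The resolution (1 pixel in r, 1 degree in θ), the vote threshold and the name `hough_lines` are assumptions for this toy example; a real pipeline would typically call a library routine such as OpenCV's `HoughLines` [4].

```python
import math
from collections import Counter


def hough_lines(img, threshold, n_theta=180):
    """Return [(r, theta)] for accumulator bins with at least `threshold` votes.

    Each white pixel (x, y) votes for every line x*cos(theta) + y*sin(theta) = r
    passing through it, with theta sampled in n_theta steps over [0, pi) and
    r rounded to the nearest pixel (assumed resolutions).
    """
    votes = Counter()
    for y, row in enumerate(img):
        for x, v in enumerate(row):
            if v != 255:
                continue
            for i in range(n_theta):
                theta = i * math.pi / n_theta
                r = round(x * math.cos(theta) + y * math.sin(theta))
                votes[(r, i)] += 1
    return [(r, i * math.pi / n_theta)
            for (r, i), n in votes.items() if n >= threshold]


# An 8x8 image containing one vertical white line at x = 3.
img = [[255 if x == 3 else 0 for x in range(8)] for _ in range(8)]
lines = hough_lines(img, threshold=8)
```

Even for this single clean line, several neighbouring (r, θ) bins reach the threshold, including (3, 0.0) and slightly tilted variants of it.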


However, in practice this transform returns many candidate lines for each actual straight line, so we need to cluster them to settle on 8 lines.

The result of the Hough Transform is a set of straight lines, each described by 2 components (r, θ), where r is the perpendicular distance from the origin pixel (0, 0) to the line, and θ is the angle between the horizontal axis and that perpendicular.
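
As a point of reference, the standard Hough parameterization describes a line by the perpendicular distance r from the origin and the angle θ of that perpendicular, measured from the horizontal axis. Under that convention (assumed here; it is also the one used by OpenCV's `HoughLines`), a point (x, y) lies on the line exactly when:

```latex
x\cos\theta + y\sin\theta = r
```

For example, θ = 0 gives the vertical line x = r, and θ = π/2 gives the horizontal line y = r.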


Before clustering, we need to normalize r and θ to the interval [0, 1] using min-max normalization to get more accurate results. (The reason is that r is measured in pixels, so its values are much larger than those of θ, which is measured in radians.)

After clustering is complete, we take the centers of the 8 clusters as the 8 lines we are looking for.

Figure 11: 8 merged lines
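
The normalization and clustering steps can be sketched as below. This sketch assumes k-means as the clustering algorithm (suggested by reference [5] but not stated explicitly in the text), uses a deterministic farthest-first initialization, and works on hypothetical (r, θ) values with 4 clusters instead of 8 to keep the example short.

```python
import math


def min_max(values):
    """Scale a list of numbers linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def kmeans(points, k, iters=20):
    """Plain k-means on 2-D points; returns (centers, labels).

    Initial centers are picked farthest-first (an assumed choice) so that
    the result is deterministic.
    """
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = (sum(p[0] for p in members) / len(members),
                              sum(p[1] for p in members) / len(members))
    return centers, labels


# Hypothetical Hough output: near-duplicate detections of 4 lines.
lines = [(50, 0.00), (52, 0.01), (300, 0.02), (303, 0.00),
         (48, 1.57), (51, 1.58), (298, 1.56), (301, 1.57)]
points = list(zip(min_max([r for r, _ in lines]),
                  min_max([t for _, t in lines])))
centers, labels = kmeans(points, k=4)

# Average the original (r, theta) of each cluster to get one merged line.
# This is the cluster center mapped back to pixels and radians.
merged = []
for j in range(4):
    group = [ln for ln, lab in zip(lines, labels) if lab == j]
    merged.append((sum(r for r, _ in group) / len(group),
                   sum(t for _, t in group) / len(group)))
```

Without normalization, differences in r (hundreds of pixels) would dominate the distance and two lines with similar r but different orientation could end up in the same cluster.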

2.1.4 Crop the chessboard

Next, we pick out the 2 outermost horizontal borders and the 2 outermost vertical borders of the chessboard in order to find its 4 corners. Extracting these boundaries uses basic computational logic.


Then find their 4 intersection points, using basic geometric calculations:
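
One way to compute these intersections is to solve the two line equations as a 2x2 linear system. The sketch below assumes the lines are given as (r, θ) pairs satisfying x·cos θ + y·sin θ = r; the function name `intersection` and the sample border values are hypothetical.

```python
import math


def intersection(line_a, line_b):
    """Intersect two lines given as (r, theta), where each line satisfies
    x*cos(theta) + y*sin(theta) = r (assumed parameterization).

    Returns (x, y), or None if the lines are (nearly) parallel.
    """
    r1, t1 = line_a
    r2, t2 = line_b
    # Solve the 2x2 linear system with Cramer's rule.
    det = math.cos(t1) * math.sin(t2) - math.sin(t1) * math.cos(t2)
    if abs(det) < 1e-9:
        return None
    x = (r1 * math.sin(t2) - r2 * math.sin(t1)) / det
    y = (r2 * math.cos(t1) - r1 * math.cos(t2)) / det
    return x, y


# Hypothetical outer borders: two vertical and two horizontal lines.
left, right = (50, 0.0), (300, 0.0)
top, bottom = (30, math.pi / 2), (280, math.pi / 2)
corners = [intersection(v, h) for h in (top, bottom) for v in (left, right)]
# corners are ordered: top-left, top-right, bottom-left, bottom-right
```

The parallel check matters because the two horizontal (or two vertical) borders never meet, so only horizontal-vertical pairs are intersected.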


Finally, threshold the cropped image to return it to binary form, then save the result.

Figure 15: Binary image of segmented sudoku chessboard


2.2 Transform image board to text board

Take the result from the previous section, resize it to 252 x 252 and cut it into 28 x 28 cells (252 / 9 = 28).

Figure 16: Cropped cells
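
Cutting the resized board into cells is plain slicing. The sketch below assumes the image has already been resized to 252 x 252 and is stored as a list of rows; the function name `split_cells` is hypothetical.

```python
def split_cells(board, cell=28, n=9):
    """Cut an (n*cell) x (n*cell) image into an n x n grid of cell x cell tiles.

    cells[i][j] is the tile in grid row i and grid column j.
    """
    cells = []
    for row in range(n):
        cells.append([
            [line[col * cell:(col + 1) * cell]
             for line in board[row * cell:(row + 1) * cell]]
            for col in range(n)
        ])
    return cells


# A blank 252 x 252 board with one white pixel at image row 30, column 60.
board = [[0] * 252 for _ in range(252)]
board[30][60] = 255
cells = split_cells(board)
# The pixel lands in grid cell (1, 2) at local position (2, 4).
```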

We can decide whether a cell contains a number by checking whether its count of white pixels exceeds some threshold. However, the edges of these cells contain white pixels left over from the grid lines:


So, we will only crop the inside of the cells.

Figure 18: Inner cropped cells

Then find a suitable threshold (empirically) to decide whether a cell contains a number. For example, for this figure the threshold is set to 30. That is, if the cell image has more than 30 white pixels, it contains a number; otherwise it does not.

Finally, represent the result as text and write it to a file:

output.txt

X XX

X X X

X X

X X XX

X X

XX X X

X X

X X X

XX X
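
The cell test and the text conversion described in this section can be sketched as follows. The threshold of 30 white pixels comes from the text above; the crop margin of 4 pixels, the helper names and the synthetic cells are assumptions made for illustration.

```python
def has_digit(cell, margin=4, threshold=30):
    """True if the inner part of a binary cell has more than `threshold`
    white pixels. `margin` (assumed value) is the border width cropped away."""
    inner = [row[margin:len(row) - margin]
             for row in cell[margin:len(cell) - margin]]
    white = sum(1 for row in inner for v in row if v == 255)
    return white > threshold


def board_to_text(cells):
    """Render a grid of cells as lines of 'X' (digit present) and ' ' (empty)."""
    return "\n".join("".join("X" if has_digit(c) else " " for c in row)
                     for row in cells)


def make_cell(filled):
    """Build a synthetic 28 x 28 cell for the demo (hypothetical test data)."""
    cell = [[0] * 28 for _ in range(28)]
    for i in range(28):  # white grid-line residue on the cell edge
        cell[0][i] = cell[27][i] = cell[i][0] = cell[i][27] = 255
    if filled:  # an 8 x 8 white blob standing in for a digit
        for y in range(10, 18):
            for x in range(10, 18):
                cell[y][x] = 255
    return cell


cells = [[make_cell(True), make_cell(False)],
         [make_cell(False), make_cell(True)]]
text = board_to_text(cells)
```

With `margin=0` the empty cells would also pass the test, because their white border alone exceeds 30 pixels; this is the reason for cropping the inside of each cell first. The resulting string can then be written to `output.txt` with a normal file write.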

3 Conclusion

In conclusion, this thesis has presented a study on the use of computer vision techniques to recognize sudoku puzzles. The proposed method combines image processing and clustering algorithms to segment the sudoku board and detect which cells contain digits.

This study opens up a number of potential avenues for further research, such as improving the recognition accuracy, increasing the robustness of the method to different types of distortions, and incorporating more advanced computer vision techniques.


4 Reference

[1] K. Kaur and S. Kaur, "Gaussian Blur for Image Smoothing and Noise Reduction", Journal of Advanced Research in Dynamical and Control Systems, vol. 9, no. 2, pp. 782-787, 2017.

[2] Z.-K. Huang and K.-W. Chau, "A new image thresholding method based on Gaussian mixture model", Applied Mathematics and Computation, vol. 205, no. 2, pp. 899-907, 2008.

[3] https://homepages.inf.ed.ac.uk/rbf/HIPR2/morops.htm

[4] https://docs.opencv.org/3.4/d9/db0/tutorial_hough_lines.html

[5] M. Ahmed, R. Seraj, and S. M. S. Islam, "The k-means algorithm: A comprehensive survey and performance evaluation", Electronics, vol. 9, no. 8, 1295, 2020.


Title	Introduction To Digital Image Processing
Author	Tô Ký Tuấn
Supervisor	Dr. Phạm Văn Huy
Institution	Ton Duc Thang University
Major	Information Technology
Type	Final Project
Year of publication	2023
City	Ho Chi Minh City