3.2.1 System architecture
[Figure 3-1: system diagram. Labels recovered from the diagram: User input; Web interface; API application; HTTP request/response; BERT-NER model (word separation, entity recognition and labeling); Entity text and label; Triple generation; Database (information of entity query, information of entity); Fact-checking model (word segmentation, word embedding and similarity); Score result.]
Figure 3-1 System architecture
As Figure 3-1 shows, our system is composed of numerous parts and tools, which makes it simple to update and maintain. The machine learning model is trained on a personal computer, then exported and used from Python. The website is built on Reactjs and Nodejs, which are widely used to create websites whose code is easy to deploy and manage, and which serve as a conduit for communication between the models and the applications. The client side, Reactjs, creates the user interface (UI) and processes the BERT-NER results before sending them to the server side, Nodejs. Reactjs also serves as a bridge to the graph database and the underlying machine learning models written in Python, allowing JSON to be passed back and forth between the two sides. In this way, the web application also acts as a bridge between users and the system.
More specifically, the server component (Node.js) is the most crucial component of the system. It serves as the hub of the system, connecting all models, algorithms, and other services and exchanging data between them through HTTP requests and responses. It also implements the CRUD (Create, Read, Update, Delete) methods. Input and output data are passed as JavaScript Object Notation (JSON). The knowledge graph is stored in a Neo4j graph database deployed on Neo4j Aura, a cloud-based, fully automated, scalable, always-on graph platform. Therefore, when switching servers, we do not need to download the Neo4j application again.
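The JSON exchange described above could look roughly like the following sketch. It is written in Python for illustration only and is not the system's actual server code: the field names "text" and "entities", the function names, and the stub standing in for the BERT-NER model are all assumptions.

```python
import json


def recognize_entities(text):
    """Hypothetical stand-in for the BERT-NER model; it finds no entities."""
    return []


def handle_ner_request(raw_body):
    """Parse a JSON request body and return (status code, JSON response body)."""
    try:
        payload = json.loads(raw_body)
        text = payload["text"]  # assumed field name
    except (json.JSONDecodeError, KeyError, TypeError):
        # Malformed JSON, a missing field, or a non-object payload.
        error = {"error": "expected a JSON object with a 'text' field"}
        return 400, json.dumps(error)
    response = {"text": text, "entities": recognize_entities(text)}
    # ensure_ascii=False keeps Vietnamese characters readable in the output.
    return 200, json.dumps(response, ensure_ascii=False)
```

Keeping the request handling separate from the model call means the same JSON contract can be reused for each service the server talks to.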
3.2.2 Knowledge graph construction
We use a knowledge graph for data storage because of the following benefits:
• A knowledge graph organizes information in a structured way, which makes relevant information easy to track and understand.
• It represents the relationships between objects and clearly defines the connections between them. This gives an overview of the data and a rich, detailed picture of the information.
• It provides a flexible way to perform complex queries, so we can search for information and get results more quickly and efficiently.
• It can support artificial intelligence applications, such as machine learning and natural language processing, by providing a rich and structured source of data.
• It is usually easy to maintain and extend. As new information becomes available, we can add it to the graph without affecting the overall structure.
We must first construct the knowledge graph before we can query it. More sophisticated construction techniques use models to split sentences into triples for inclusion in the knowledge graph. To save time for the other machine learning models, we decided to construct the knowledge graph using a less complicated technique. Because the topic of Vietnamese cuisine is not as large as other topics, we were able to find information online fairly quickly. Our information comes from Wikipedia and from other websites that list the dishes of all 63 of Vietnam's provinces. Once our system detects relevant information extracted from a natural language query, it generates an answer based on the results returned from the knowledge graph.
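As a rough illustration of answering from the graph, the sketch below stores a few (subject, relation, object) triples in memory and checks a claim against them. It is an assumption-laden toy, not the system's query logic: the relation name FOUND_IN, the function names, and the two sample triples are illustrative only, and the real system queries Neo4j rather than a Python set.

```python
# Illustrative sample triples (subject, relation, object); not project data.
TRIPLES = {
    ("mi quang", "FOUND_IN", "quang nam"),
    ("bun bo hue", "FOUND_IN", "thua thien hue"),
}


def check_claim(food, location, triples=TRIPLES):
    """Return True if the graph links the food to the location."""
    return (food.lower(), "FOUND_IN", location.lower()) in triples


def locations_of(food, triples=TRIPLES):
    """List every location the graph associates with the food."""
    key = food.lower()
    return sorted(o for s, r, o in triples if s == key and r == "FOUND_IN")
```

For example, `check_claim("Mi Quang", "Quang Nam")` finds a matching triple, while a claim with no matching triple is reported as unsupported by the graph.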
This section outlines how we organized the database for the system. We decided to gather the data manually and under supervision. The subject is the foods and cuisines found in Vietnam's 63 provinces and cities. We used the following techniques:
• Web scraping: We gathered data on Vietnamese food by pulling text and photos from relevant blogs, forums, and websites. The gathered data was then preprocessed to guarantee consistency.
• Wikipedia: This well-known online encyclopedia was a great source of information. We extracted structured data and textual content from Wikipedia articles about Vietnamese cuisine, provinces, cities, and culinary traditions.
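The consistency preprocessing mentioned in the list above might include steps like the ones sketched here. This is a guess at a plausible cleaning step, not the procedure actually used: the function names are hypothetical, and deriving lower-cased and space-free variants (in the spirit of fields such as lowerLocationName and noSpace) in exactly this way is an assumption.

```python
import re
import unicodedata


def clean_text(raw):
    """Normalize Unicode to NFC, collapse whitespace runs, and trim the ends."""
    # NFC matters for Vietnamese: the same letter can be encoded as one
    # precomposed code point or as a base letter plus combining marks.
    composed = unicodedata.normalize("NFC", raw)
    return re.sub(r"\s+", " ", composed).strip()


def name_variants(name):
    """Build the cleaned name plus lower-cased and space-free variants."""
    cleaned = clean_text(name)
    lower = cleaned.lower()
    # Assumption: the space-free variant is also lower-cased.
    return {"name": cleaned, "lower": lower, "noSpace": lower.replace(" ", "")}
```

Storing normalized variants next to the display name lets later lookups match user input that differs only in case, spacing, or Unicode encoding.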
After extraction, we saved all of the data in CSV files, divided into one CSV file per entity. The Food entity, for instance, has attributes such as locationId, sourceId, and image. Figure 3-2, our data organization chart, is shown below.
[Figure 3-2: entity diagram. Food (Id, FoodName, engName, vieName, LocationId, TypeId, Description, Image, Temporal, sourceId); Location (Id, LocationName, lowerLocationName, RegionId, noSpace, Country); Region (Id, RegionName, lowerRegionName, RegionDetail, lowerRegionDetail, EngName); Type (Id, TypeName, lowerName).]
Figure 3-2 Data organization chart
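The one-file-per-entity layout could be written out along the following lines. This is a minimal sketch under stated assumptions: the column lists follow the attributes given for each entity in this section, but the exact header spellings in the real files, the function name, and the sample row in the usage note are hypothetical.

```python
import csv
import io

# Column order per entity; header spellings are an assumption.
ENTITY_COLUMNS = {
    "Food": ["Id", "foodName", "engName", "vieName", "LocationId", "TypeId",
             "Description", "Image", "Temporal", "sourceId"],
    "Location": ["Id", "locationName", "lowerLocationName", "regionId",
                 "noSpace", "Country"],
    "Region": ["Id", "regionName", "lowerRegionName", "regionDetail",
               "lowerRegionDetail", "engName"],
    "Source": ["Id", "links"],
    "Type": ["Id", "typeName", "lowerName"],
}


def rows_to_csv(entity, rows):
    """Serialize a list of dict rows as CSV text with the entity's header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=ENTITY_COLUMNS[entity],
        restval="",           # missing attributes become empty cells
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling `rows_to_csv("Type", [{"Id": "1", "typeName": "Noodle soup", "lowerName": "noodle soup"}])` yields a header line followed by one data line; a row containing a column not listed for the entity raises a ValueError, which helps catch typos in attribute names early.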
The knowledge graph built for the topic "Vietnamese Cuisine" plays an important role in organizing information and creating a logical structure among the entities related to cuisine. The combination of entities and the relationships between them provides a comprehensive view of Vietnamese culinary culture.
Table 3-1 Data organization analysis
No. | Entity | Attributes | Key | Analysis
1 | Food | Id, foodName, engName, vieName, LocationId, TypeId, Description, Image, Temporal, sourceId | Id | The Food entity plays a central role in the knowledge graph. Relationships between Food and other entities such as Location, Region, Source and Type help link information in an organized and logical way.
2 | Location | Id, locationName, lowerLocationName, regionId, noSpace, Country | Id | Location and Region entities: the combination of Location and Region creates an organized geographic system. Each dish can be linked to a specific place, and each place belongs to a certain region. This makes it easy to track and understand the origin and region of each dish.
3 | Region | Id, regionName, lowerRegionName, regionDetail, lowerRegionDetail, engName | Id | (shared with Location, row 2)
4 | Source | Id, links | Id | Source and Type entities: the Source entity contains information about the source of the data, while the Type entity relates to the category of each dish. Although the Type entity is not very important for validating information, it contributes to enriching the data. This relationship helps identify and classify information effectively.
5 | Type | Id, typeName, lowerName | Id | (shared with Source, row 4)
We gather information about Vietnamese food, including dishes, locations, and regions, and store it all in CSV files. We then import the data into the Neo4j graph database to enable a more comprehensive and user-friendly view. We estimate that our database contains roughly 1,400 items and 15,000 relationships.
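One way the CSV-to-Neo4j import could be scripted is sketched below. It only builds parameterized Cypher statements from CSV text and does not connect to a database. The node labels, the property names, the relationship type LOCATED_IN, and the function name are assumptions for illustration, not the schema actually used.

```python
import csv
import io

# Create or update a Food node keyed by its id.
FOOD_NODE = "MERGE (f:Food {id: $id}) SET f.foodName = $foodName"

# Link a Food node to its Location; LOCATED_IN is a hypothetical name.
FOOD_LOCATION = (
    "MATCH (f:Food {id: $foodId}) "
    "MERGE (l:Location {id: $locationId}) "
    "MERGE (f)-[:LOCATED_IN]->(l)"
)


def food_statements(csv_text):
    """Yield (cypher, parameters) pairs for every row of a Food CSV."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        yield FOOD_NODE, {"id": row["Id"], "foodName": row["foodName"]}
        yield FOOD_LOCATION, {
            "foodId": row["Id"],
            "locationId": row["LocationId"],
        }
```

Each pair could then be passed to a database session, for example with the official Neo4j Python driver's `session.run(query, parameters)`. Using MERGE rather than CREATE keeps the import repeatable, since running it twice does not duplicate nodes or relationships.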