BANKING ACADEMY FACULTY OF FINANCE – ADVANCED PROGRAMME DISSERTATION THESIS Title: Financial Statements Fraud Detection in Vietnamese Listed Companies: A Machine Learning Approach wit
THEORETICAL FRAMEWORK AND LITERATURE REVIEW
Overview of Financial Statements
R. Buveneswari & S. Lakshmi (2015) described a financial statement as a formal record of the financial activities of a business, person or other entity. Relevant financial information is presented in a structured manner and in a form that is easy to understand. Financial statements typically include the basic statements, accompanied by a management discussion and analysis. A balance sheet, also referred to as a statement of financial position, reports on a company's assets, liabilities and ownership equity at a given point in time. An income statement, also referred to as a statement of comprehensive income, statement of revenue and expenses, or profit and loss report, reports on a company's income, expenses, and profit over a period of time.
According to IAS 1, issued by the IASB (International Accounting Standards Board), the objective of financial statements is to provide information about the financial position, financial performance, and cash flows of an entity that is useful to a wide range of users in making economic decisions. To meet that objective, financial statements provide information about an entity's assets; liabilities; equity; income and expenses, including gains and losses; contributions by and distributions to owners (in their capacity as owners); and cash flows. That information, along with other information in the notes, assists users of financial statements in predicting the entity's future cash flows and, in particular, their timing and certainty.
Sudip Das (2010) said that financial statements are records that provide an indication of the organization's financial status. They quantitatively describe the financial health of the company and help in the evaluation of the company's prospects and risks for the purpose of making business decisions. The objective of financial statements is to provide information about the financial position, performance, and changes in financial position of an enterprise that is useful to a wide range of users in making economic decisions. Financial statements should be understandable, relevant, reliable and comparable. They give an accurate picture of a company's condition and operating results in a condensed form. Reported assets, liabilities and equity are directly related to an organization's financial position, whereas reported income and expenses are directly related to an organization's financial performance. Analysis and interpretation of financial statements helps in determining the liquidity position, long-term solvency, financial viability, profitability and soundness of a firm. There are four basic types of financial statements: balance sheets, income statements, cash flow statements, and statements of retained earnings.
Donald Lee Höglund (1981) stated that financial statements are the instrument panel of a business enterprise. They constitute a report on managerial performance, attesting to managerial success or failure and flashing warning signals of impending difficulties. The three basic financial statements (the balance sheet, the income statement, and the statement of changes in financial position) are of great assistance in helping to determine a firm's overall financial position with respect to its past performance and to its competitors. These statements also assist the experienced analyst in making projections of the enterprise's future activities.
In conclusion, financial statements are essential documents that provide a comprehensive summary of a company's financial performance, position, and cash flows. They consist of the balance sheet, income statement, and cash flow statement, which collectively offer valuable insights into a company's profitability, liquidity, solvency, and overall financial health. Financial statements are crucial tools for investors, creditors, managers, and other stakeholders in assessing financial performance and making informed decisions about an organization. By accurately and transparently presenting financial information, financial statements promote trust, accountability, and transparency in the business world.
1.1.2 Main types of financial statements
According to Ronald V. Bucci (2014), the balance sheet is a financial statement that represents a record of the organization's assets, liabilities, and net worth. The balance sheet can also be called the “statement of financial position,” but the term “balance sheet” is used here. This record always refers to a specific point in time. A balance sheet can be viewed at any point in time, but a business normally views it on a monthly and fiscal-year basis. Assets are items that are worth money to the company and are in the company's possession or are owed to the company. Liabilities are debts that the company owes for the assets. Net worth is the value of the company. The basic equation for the balance sheet is “assets equal liabilities plus net worth or shareholder's equity.” It is called a balance sheet because both sides of the equation must be equal.
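The accounting identity described above can be illustrated with a minimal sketch. The class name, field names, figures and rounding tolerance below are illustrative assumptions and are not taken from Bucci (2014).

```python
from dataclasses import dataclass


@dataclass
class BalanceSheet:
    """Hypothetical container for the three balance sheet totals."""
    assets: float
    liabilities: float
    equity: float  # net worth / shareholders' equity

    def is_balanced(self, tolerance: float = 0.01) -> bool:
        # Assets must equal liabilities plus equity, up to a rounding tolerance.
        return abs(self.assets - (self.liabilities + self.equity)) <= tolerance


# Illustrative figures only (for example, in billions of VND).
balanced = BalanceSheet(assets=500.0, liabilities=320.0, equity=180.0)
unbalanced = BalanceSheet(assets=500.0, liabilities=320.0, equity=170.0)
print(balanced.is_balanced(), unbalanced.is_balanced())
```

A balance sheet that fails this check contains an error somewhere in its records; passing the check, however, says nothing about whether the individual figures are truthful.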
Donald Lee Höglund (1981) stated that the balance sheet is constructed for reporting the financial position of a business at a particular time. Financial position is regarded as the amount of assets (resources) and liabilities (debts) of a business entity on a particular date. Thus, the balance sheet is often called the statement of financial position. In the balance sheet, the financial position is shown by a listing of the firm's assets, its liabilities, and the equity of the owner or owners. An even more descriptive title for the balance sheet would be “the statement of assets, liabilities, and owner's equity for a specific point in time.”
The balance sheet is a crucial financial statement that captures the financial position of a company at a specific point in time. It presents a clear picture of the company's assets, liabilities, and equity. Assets represent what a company owns, including cash, investments, property, and equipment, while liabilities refer to the amounts that the company owes, such as loans, accounts payable, and taxes. The equity section shows how much of the company is owned by its shareholders.
Stakeholders, including investors, creditors, and employees, rely on the balance sheet to assess a company's financial health and stability. By understanding a company's financial position, stakeholders can make informed decisions about whether to invest in, lend to, or work with the company. Additionally, the balance sheet provides valuable insights into a company's ability to meet its financial obligations, such as paying its debts and bills on time. Thus, a high-quality balance sheet is crucial for ensuring transparency, accountability, and trust between the company and its stakeholders.
According to Hadri Kusuma (2014), a cash flow statement is the statement that classifies cash receipts and cash disbursements according to whether they result from operating, investing or financing activities.
Nguyen Duc Dung & Nguyen Huu Anh (2020) stated that the cash flow statement provides information as important as that on the financial condition of a business, because it provides a cash flow plan for the current year and reviews the effects that determine the cash flow strategy in the next phase. Nguyen & Vu (2014) made a similar point: the cash flow statement provides information as important as that on the financial health of a business, in that it provides a cash flow plan for the year and considers the effects that determine the cash flow strategy. Therefore, any conclusions drawn from the cash flow statement must be combined with an in-depth understanding of the business.
The cash flow statement is an essential financial statement that provides insights into a company's cash inflows and outflows over a specific period. It provides stakeholders, such as investors, creditors, and management, with valuable information about a company's liquidity and financial health. By analyzing the cash flow statement, stakeholders can determine a company's ability to generate cash, meet its financial obligations, and fund its operations and investments.
A positive cash flow from operating activities indicates that a company has generated sufficient cash to cover its day-to-day expenses, such as salaries, rent, and inventory purchases. On the other hand, a negative cash flow from operating activities may indicate that the company is facing financial difficulties and that its operations are not generating enough cash to cover its expenses.
In addition, the cash flow statement can help stakeholders to understand a company's investing and financing activities. Investing activities include the purchase or sale of long-term assets, such as property, plant, and equipment, while financing activities include the issuance or repayment of debt or equity. By analyzing these activities, stakeholders can gain insights into a company's future growth prospects, funding sources, and debt levels.
According to Sudip Das (2010), the income statement, also called the profit and loss statement (P&L) or statement of operations, is a financial statement that summarizes the revenues, costs and expenses incurred during a specific period, usually a fiscal quarter or year. These records provide information that shows the ability of a company to generate profit by increasing revenue and reducing costs.
The purpose of the income statement is to show managers and investors whether the company made or lost money during the period being reported. The important thing to remember about an income statement is that it represents a period. This contrasts with the balance sheet, which represents a single moment in time. The income statement is a critical financial statement that provides valuable insights into a company's profitability and financial performance. It summarizes a company's revenue, expenses, gains, and losses over a specific period, typically a quarter or a year. The statement's revenue section shows how much money a company has generated from its operations, including sales and other revenue streams. The expenses section details the costs incurred in running the business, including salaries, rent, utilities, and other operating expenses. The gains and losses section highlights any non-operating income or expenses, such as investment gains or losses or one-time charges.
Investors: According to Helfert (2001), investors use financial statements to evaluate a company's financial performance, profitability, and potential for growth. They use this information to make decisions about buying or selling stocks, bonds, or other securities.
Management: Company management uses financial statements to monitor the company's financial performance and make decisions about future investments and strategies. Catherine Cote (2020) stated that managers should be able to read financial statements in order to become effective leaders. A finance-driven mindset is of great assistance at a managerial level: managers can use it to track performance and other ratios and so make data-driven decisions.
Regulators: Regulatory bodies use financial statements to ensure that companies are complying with financial reporting requirements and accounting standards, so that fraudulent activities do not disrupt the fairness of the markets.
Analysts: Financial analysts and other financial professionals use financial statements to evaluate companies and make recommendations to clients regarding investment opportunities.
1.2 Overview of Financial Statement Fraud
1.2.1 Definition of Financial Statement Fraud
The Association of Certified Fraud Examiners (ACFE) defines financial statement fraud as schemes that involve the intentional misstatement or omission of material information in the organization's financial reports. In simplified terms, financial statement fraud happens when companies intentionally alter the figures in their financial statements.
Financial statement fraud (FSF) refers to the intentional omission or distortion of accounting data or important facts in order to change the perception of those who have access to this information and, consequently, their final decision regarding the company (Certified Fraud Examiners, 2012). Similarly, Rezaee (2005) regards FSF as a deliberate attempt to deceive users through material distortions in the financial statements.
According to Kamal, Salleh & Ahmad (2016), FSF is the type of fraud with the highest costs, as it leads to a loss of investors' confidence and negatively affects the capital market as well as the reputation of the organization. It leads to significant declines in stock price, which in turn cause losses for shareholders and possible delisting from the stock exchange.
Rezaee (2005) stated that FSF should be strongly condemned for several reasons: 1) it is a serious threat to investor confidence in financial markets; 2) it involves high costs for organizations, in terms of fines or lost investment, when fraud cases are discovered; and 3) it is an illegitimate and unacceptable attitude on the part of organizations.
Financial statement fraud is typically conducted by management or with its consent and knowledge. Elliott and Willingham (1980) view financial statement fraud as management fraud: “The deliberate fraud committed by management that injures investors and creditors through materially misleading financial statements.”
Overall, financial statement fraud is the deliberate exclusion or distortion of accounting data or significant information with the aim of altering the perception of those who can access this information and, as a result, influencing their decisions concerning the company. Financial statement fraud has greatly affected the business world, economies, societies, and all stakeholders involved. In extreme cases, a single fraud case can damage a national economy, for example when it leads to the relocation of a company that has a great impact on the economy of a country. Therefore, there is a growing focus on the regulation and supervision of released accounting information, in order to mitigate the material risks associated with financial statement fraud.
1.2.2 Types of Financial Statement Fraud
There are different types of financial statement fraud taking place in organizations. The schemes of financial statement fraud can be divided into six categories: net worth/income overstatements and understatements; timing differences; understated and overstated revenues; understated and overstated liabilities/expenses; improper asset valuations; and improper disclosure. Each of these categories will be covered following the structure of Zack (2013), dividing fraud types into revenue-based schemes, asset-based schemes, and liability- and expense-based schemes.
As summarized by Tobias Christian Gleichmann (2020), Zack (2013) classifies revenue manipulation schemes into four categories: timing schemes, fictitious or inflated revenue, misclassification schemes and gross-up schemes. This classification is followed in the discussion below, as the categories answer the when, why, where, and how of revenue recognition, thereby providing a comprehensive overview. As previously mentioned, most schemes in practice are not exclusive to one category and instead extend across multiple categories through the interrelated nature of financial statements (Wells ,
According to Wells (2017), timing, the “when” of fraud schemes, deals with shifting revenue between periods beyond what accounting regulations permit. Most commonly, revenue is recognized too early, boosting the current period's performance and leaving a problematic lack of revenue in the period to which it actually belongs. The practice often results in a downward spiral, as additional manipulations become necessary in later periods to cover up for the revenue that was recognized too early. This short-sighted manipulation, often termed “management myopia”, is induced by the expectations and goals that one is expected to meet (Merchant, 1990). Timing schemes can be established in different ways, the most straightforward being the alteration of records: transaction documents are backdated so that the revenue can be recognized in the required period. This alteration can be done with or without the knowledge of the transaction party. Construction contracts offer another possibility to shift revenues. When revenue recognition is based on the percentage of completion method, as commonly used under US-GAAP and IFRS, the amount of total revenue recognized in each period depends on the amount of costs accrued in the period in relation to the total estimated costs. The accrued costs in any period are especially open to manipulation, because overstating them results in higher revenue recognition for the respective period. Sidorsky (2006) believes that the double-booking or misclassification of costs related to other projects is a frequent manipulation in the percentage of completion context. Zack (2013) offers an in-depth review of common financial statement fraud cases; some examples based on his review are brought up in the upcoming sections to clarify the execution of different schemes.
Boosting sales by channel stuffing is another scheme based on timing irregularities, used to create the artificial appearance of a successful business. Jackson (2015) describes channel stuffing as a scheme in which sales are generated by pushing excess inventory along the distribution line, for example to retailers, at the end of the period or quarter. According to Singleton & Singleton (2010), even though a substantial fraction will most likely be returned in the following period, the scheme empties one's inventory while generating fictitious sales.
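The percentage of completion mechanism discussed above can be made concrete with a short sketch. It assumes the common cost-to-cost variant, in which the completion ratio is the cost incurred to date divided by the total estimated cost; the function name and all figures are hypothetical.

```python
def poc_revenue(contract_revenue: float,
                costs_to_date: float,
                total_estimated_costs: float,
                revenue_already_recognized: float = 0.0) -> float:
    """Revenue to recognize this period under a cost-to-cost assumption."""
    if total_estimated_costs <= 0:
        raise ValueError("total_estimated_costs must be positive")
    # Completion ratio is capped at 100% of the contract.
    completion = min(costs_to_date / total_estimated_costs, 1.0)
    return completion * contract_revenue - revenue_already_recognized


# Hypothetical contract: revenue 1,000,000 and estimated total cost 800,000.
honest = poc_revenue(1_000_000, costs_to_date=200_000, total_estimated_costs=800_000)
# Double-booking 200,000 of costs from another project inflates the ratio.
inflated = poc_revenue(1_000_000, costs_to_date=400_000, total_estimated_costs=800_000)
print(honest, inflated)
```

In this illustration, overstating accrued costs moves the completion ratio from 25% to 50% and doubles the revenue recognized in the period, which is the effect the manipulation aims for.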
Zack (2013) believes that fictitious and inflated revenues provide two possibilities to boost firm performance. According to Singleton & Singleton (2010), the former refers to fabricated transactions that have not happened, the latter to actual transactions that have been artificially inflated in scale. Compared to timing schemes, where the date of recognition rather than the amount of revenue from transactions is altered, fictitious and inflated revenue schemes affect the underlying value of the transaction. In practice, transaction partners may exist while false sales are recorded, or both the transaction partners and their respective sales are made up. Completely fabricated transactions, especially when fictitious customers are involved, are usually more easily detectable through irregularities in customer master files. Such transactions may be more difficult to recognize when regular customers who are covering for the perpetrator are involved. According to Jackson (2015), another possibility to fabricate fictitious revenue is the top-side adjustment, in which entries are recorded in the financial statements but cannot be found in formal accounting records such as the general ledger.
Misclassification schemes deal with intentionally wrongly classified transactions. Misclassification can have a material impact on financial statements when incorrectly classified transactions misstate individual line items. This does not affect the bottom-line outcome; however, key performance indicators that refer to the manipulated lines may present misleading information and influence the economic decisions of the financial statements' recipients. According to Zack (2013), one-time income transactions or non-recurring costs that occur outside of regular business operations and are unlikely to recur are shifted to line items representing core business activities.
Theoretical Basis of Machine Learning Algorithms
1.3.1 Definition of Machine Learning Algorithm
Batta Mahesh (2019) stated that machine learning (ML) is the scientific study of algorithms and statistical models that computer systems use to perform a specific task without being explicitly programmed.
Arthur Samuel (1959) was the first person to coin the term machine learning. He defined it as the field of study that gives computers the ability to learn without being explicitly programmed.
Thomas W. Edgar and David O. Manz (2017) stated that machine learning is a field of study that looks at using computational algorithms to turn empirical data into usable models. The machine learning field grew out of the traditional statistics and artificial intelligence communities.
James A. Nichols, Hsien W. Herbert Chan and Matthew A. B. Baker (2019) stated that machine learning (ML) is an umbrella term that refers to a broad range of algorithms that perform intelligent predictions based on a data set. These data sets are often large, perhaps consisting of millions of unique data points. Recent progress in machine learning has attained what appears to be a human level of semantic understanding and information extraction, and sometimes the ability to detect abstract patterns with greater accuracy than human experts.
In conclusion, machine learning is a subfield of artificial intelligence, which is broadly defined as the capability of a machine to imitate intelligent human behavior. Artificial intelligence systems are used to perform complex tasks in a way that is similar to how humans solve problems. Machine learning algorithms work by training on a dataset, which consists of input data, called features, and corresponding output data, called labels. During the training process, the algorithm analyzes the data using statistical methods and learns the underlying patterns or relationships between the input and output data. Once the algorithm has learned these patterns, it can be used to make predictions or take actions on new data.
There are different types of machine learning algorithms, each with its own strengths and weaknesses.
Decision trees: According to Michael Luckert and Moritz Schaefer-Kehnert (2015), a Decision Tree is a classification technique that focuses on an easily understandable representation form and is one of the most common learning methods. Decision Trees use data sets that consist of attribute vectors, which in turn contain a set of classification attributes describing the vector and a class attribute assigning the data entry to a certain class. A Decision Tree is built by iteratively splitting the data set on the attribute that separates the data as well as possible into the different existing classes until a certain stop criterion is reached. The representation form enables users to get a quick overview of the data, since Decision Trees can easily be visualized in a tree-structured format, which is easy for humans to understand.
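The splitting step described above can be sketched for a single numeric attribute. The sketch assumes Gini impurity as the measure of how well a split separates the classes; this is one common criterion (information gain is another) and is not necessarily the one used by the cited authors. The attribute values and labels are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best split 'value <= threshold'."""
    best = (None, gini(labels))  # no split is the baseline
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between neighbouring distinct values.
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical ratio values for six firm-years and their known classes.
ratio = [0.9, 1.0, 1.1, 1.6, 1.8, 2.0]
status = ["clean", "clean", "clean", "fraud", "fraud", "fraud"]
threshold, impurity = best_split(ratio, status)
print(threshold, impurity)
```

A full tree would apply the same search to every attribute, choose the best one, and repeat on each resulting subset until the stop criterion is reached.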
Neural networks: Singh & Chauhan (2009) define Artificial Neural Networks as “a mathematical model that is based on biological neural networks and therefore is an emulation of a biological neural system”. Compared to conventional algorithms, neural networks can solve rather complex problems at a substantially lower level of algorithmic complexity. Therefore, the main reason to use Artificial Neural Networks is their simple structure and self-organizing nature, which allows them to address a wide range of problems without any further interference by the programmer. For example, a neural network could be trained on customer behavior data in an online shop and predict whether a person will make a purchase or not.
Support vector machines: Support Vector Machines belong to the area of supervised learning methods and therefore need labeled, known data to classify new, unseen data. The basic approach to classifying the data starts by trying to create a function that splits the data points into the corresponding labels with (a) the least possible number of errors or (b) the largest possible margin. A large margin is preferred because larger empty areas next to the splitting function result in fewer errors, as the labels are better distinguished from one another.
Bayesian Networks: According to Michael Luckert and Moritz Schaefer-Kehnert (2015), Bayesian networks consist of nodes and directed connections between these nodes that symbolize dependencies between them. They are probabilistic directed acyclic graphical models. Each node represents an attribute of interest for the given task, such as pollution values in cities for estimating the likelihood of developing lung cancer. The most basic Bayesian network is called Naive Bayes; it is called naive because the network assumes that there are no dependencies between attributes. This is almost never the case in practical data mining tasks, and therefore this method tends to achieve worse results than more detailed algorithms. Normal Bayesian networks use known data to estimate the dependencies between attributes and the class label, and use this information to calculate the probabilities of different possible outcomes of future events. A Bayesian network automatically applies Bayes' theorem to complex problems and is therefore able to gain knowledge about the state of attributes and their dependencies.
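The naive independence assumption described above can be illustrated with a small worked example of Bayes' theorem. The function name, the two "red flag" attributes and all probabilities are hypothetical and chosen only to show the arithmetic.

```python
def naive_bayes_posterior(prior_fraud, likelihoods_fraud, likelihoods_clean):
    """P(fraud | evidence), treating each piece of evidence as independent.

    likelihoods_fraud[i] is P(evidence_i | fraud);
    likelihoods_clean[i] is P(evidence_i | not fraud).
    """
    joint_fraud = prior_fraud
    joint_clean = 1.0 - prior_fraud
    # Independence lets the joint likelihood factor into a simple product.
    for p_given_fraud, p_given_clean in zip(likelihoods_fraud, likelihoods_clean):
        joint_fraud *= p_given_fraud
        joint_clean *= p_given_clean
    # Bayes' theorem: normalize by the total probability of the evidence.
    return joint_fraud / (joint_fraud + joint_clean)


# Hypothetical inputs: 10% of firms are fraudulent, and two red flags are observed.
posterior = naive_bayes_posterior(
    prior_fraud=0.10,
    likelihoods_fraud=[0.8, 0.6],   # assumed P(flag | fraud)
    likelihoods_clean=[0.2, 0.3],   # assumed P(flag | clean)
)
print(round(posterior, 3))
```

With these assumed figures the two flags raise the probability of fraud from 10% to roughly 47%. If the flags were in fact correlated, the product would overstate the evidence, which is the weakness of the naive assumption noted above.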
1.3.3 Practical Applications of Machine Learning Algorithms
Machine learning has a wide range of practical applications in finance, from algorithmic trading and credit card fraud detection to investment management and the detection of fraudulent activities in financial statements.
As for algorithmic trading, Stuart Colianni and his co-authors (2015) examined whether Twitter data relating to cryptocurrencies can be utilized to develop advantageous crypto coin trading strategies. Using supervised machine learning techniques, their team outlines several machine learning pipelines with the objective of identifying cryptocurrency market movement.
For credit card fraud detection, John O. Awoyemi and his co-authors (2017) investigated the performance of naïve Bayes, k-nearest neighbor and logistic regression on highly skewed credit card fraud data. The dataset of credit card transactions is sourced from European cardholders and contains 284,807 transactions. A hybrid technique of under-sampling and oversampling is carried out on the skewed data. The three techniques are applied to the raw and preprocessed data.
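The idea of combining under-sampling and oversampling on skewed data can be sketched as follows. The cited study does not specify its procedure here, so the sketch assumes the simplest variant: random under-sampling of the majority class and random duplication of minority records. The function name, target size and toy data are hypothetical.

```python
import random


def hybrid_resample(majority, minority, target_size, seed=0):
    """Bring both classes towards target_size records each.

    Assumption: the majority class is randomly under-sampled without
    replacement, and the minority class is over-sampled by duplicating
    randomly chosen records (all original minority records are kept).
    """
    if not minority:
        raise ValueError("minority class must contain at least one record")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    under = rng.sample(majority, min(target_size, len(majority)))
    over = list(minority)
    while len(over) < target_size:
        over.append(rng.choice(minority))
    return under, over


# Toy data: 1,000 legitimate transaction ids and 10 fraudulent ones.
legit = list(range(1000))
fraud = list(range(1000, 1010))
legit_sample, fraud_sample = hybrid_resample(legit, fraud, target_size=50)
print(len(legit_sample), len(fraud_sample))
```

Resampling of this kind should be applied to the training data only, so that the evaluation still reflects the true class imbalance.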
Machine learning is also applicable to investment and portfolio decisions. Yilin Ma, Ruizhu Han and Weizhong Wang (2021) combine return prediction in portfolio formation with two machine learning models, i.e., random forest (RF) and support vector regression (SVR), and three deep learning models, i.e., LSTM neural network, deep multilayer perceptron (DMLP) and convolutional neural network. Specifically, their paper first applies these prediction models for stock preselection before portfolio formation. The authors then incorporate the predictive results into mean–variance (MV) and omega portfolio optimization models. To demonstrate the superiority of these models, portfolio models with return predictions from an autoregressive integrated moving average model are used as benchmarks.
Finally, machine learning algorithms can be of great benefit in detecting signs of fraud in financial statements. Conventional approaches to detecting fraudulent statements, such as manual audits and inspections, are expensive, inaccurate, and time-intensive. The use of intelligent techniques can greatly assist auditors in analyzing vast quantities of financial statements. Matin N. Ashtiani & Bijan Raahemi (2021) presented a comprehensive review and synthesis of previous research on the application of intelligent methods for fraud detection in corporate financial statements. Specifically, the review concentrated on investigating machine learning and data mining methods, as well as the diverse datasets examined for identifying financial fraud. The Kitchenham methodology was employed as a clear protocol for extracting, synthesizing, and reporting the findings.
Literature Review
Detecting fraud in financial statements is of paramount importance in Vietnam, as it helps ensure transparency, integrity, and investor confidence in the country's financial markets. Because financial fraud can cause significant economic losses and undermine the stability of the financial system, detecting fraudulent activities is crucial. In Vietnam, notable research has been conducted on the use of the M-Score model for fraud detection. The M-Score model, developed by Professor Messod D. Beneish, is a widely recognized tool that uses various financial ratios to assess the probability of financial statement manipulation. Research in Vietnam has explored the effectiveness of the M-Score model in identifying fraudulent financial reporting, providing valuable insights for regulators, investors, and financial institutions in their efforts to combat fraud and maintain the integrity of the financial sector.
Nguyen Huu Anh and his co-author (2016) stated that there are interrelations between the balance sheet, the income statement and the statement of cash flows, so that fraud tends to show up in certain numbers. The M-score, which is based on ratio analysis, has been widely regarded as an effective tool for detecting accounting fraud. Their study focuses on detecting earnings management among 229 non-financial Vietnamese companies listed on the Ho Chi Minh Stock Exchange (HOSE) during 2013-2014 using the Beneish M-score model. The results indicate that 48.4% of the companies were found to be involved in earnings management and that the sample observations were consistent with the Beneish M-score model. Therefore, the authors suggest that the M-score model is a useful technique for detecting earnings manipulation and could lead to an improvement in financial reporting quality, ultimately providing better protection for investors. Nguyen Anh Phong and his co-authors (2022) analyzed data from listed non-financial companies in 2018 and 2019 using the M-Score and Z-Score models. They also applied machine learning techniques such as Artificial Neural Networks (ANN) and Support Vector Machines (SVM) to forecast evidence of fraud in financial statements. Their results showed that combining the SVM technique with the M-Score index achieved a high accuracy rate of approximately 95% in predicting fraudulent activities, making it the first study to apply machine learning algorithms such as ANN and SVM to identifying fraudulent financial statements in Vietnam.
It is worth noting that Vietnamese corporations have distinctive financial characteristics and accounting standards owing to the regulations of the Vietnamese market, which makes them differ from businesses in other markets. The authors hope that computational methods based on machine learning algorithms like ANN and SVM will provide efficiency, accuracy, and timeliness in identifying fraudulent financial statements in the Vietnamese context.
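For orientation, the Beneish M-score discussed above can be computed as a weighted sum of eight financial indices. The sketch below uses the coefficients of the eight-variable model published by Beneish (1999); whether the cited Vietnamese studies used exactly this variant is an assumption, and the cutoff of -2.22 is a commonly quoted value rather than one stated in this chapter. The input indices are hypothetical.

```python
# Coefficients of the eight-variable Beneish (1999) model.
BENEISH_COEFFICIENTS = {
    "DSRI": 0.920,   # days' sales in receivables index
    "GMI": 0.528,    # gross margin index
    "AQI": 0.404,    # asset quality index
    "SGI": 0.892,    # sales growth index
    "DEPI": 0.115,   # depreciation index
    "SGAI": -0.172,  # SG&A expenses index
    "TATA": 4.679,   # total accruals to total assets
    "LVGI": -0.327,  # leverage index
}
INTERCEPT = -4.84
ASSUMED_CUTOFF = -2.22  # assumption: scores above this are flagged


def beneish_m_score(indices):
    """Return the M-score for a dict holding the eight index values."""
    missing = set(BENEISH_COEFFICIENTS) - set(indices)
    if missing:
        raise KeyError(f"missing indices: {sorted(missing)}")
    return INTERCEPT + sum(
        coef * indices[name] for name, coef in BENEISH_COEFFICIENTS.items()
    )


# Hypothetical firm with no year-on-year change: every index is 1, accruals are 0.
neutral = {name: 1.0 for name in BENEISH_COEFFICIENTS}
neutral["TATA"] = 0.0
score = beneish_m_score(neutral)
print(round(score, 2), score > ASSUMED_CUTOFF)
```

A firm whose ratios are unchanged from the previous year scores about -2.48 and is not flagged under the assumed cutoff; a rise in receivables, sales growth or accruals pushes the score upwards.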
Johan Perols (2011) compares the performance of six popular statistical and machine learning models in detecting financial statement fraud under different assumptions of misclassification costs and ratios of fraud firms to non-fraud firms. The results show, somewhat surprisingly, that logistic regression and support vector machines perform well relative to an artificial neural network, bagging, C4.5, and stacking. The results also reveal some diversity in the predictors used across the classification algorithms. Out of 42 predictors examined, only six are consistently selected and used by different classification algorithms: auditor turnover, total discretionary accruals, Big 4 auditor, accounts receivable, meeting or beating analyst forecasts, and unexpected employee productivity. These findings extend financial statement fraud research and can be used by practitioners and regulators to improve fraud risk models.
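Why misclassification costs matter when comparing classifiers can be shown with a small sketch. It assumes a simple average-cost measure in which a missed fraud (false negative) is more expensive than a false alarm (false positive); the function name, cost values and predictions are hypothetical and do not reproduce the evaluation design of Perols (2011).

```python
def average_misclassification_cost(y_true, y_pred, cost_fn, cost_fp):
    """Average cost per firm; 1 denotes fraud and 0 denotes non-fraud."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    false_negatives = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    false_positives = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return (false_negatives * cost_fn + false_positives * cost_fp) / len(y_true)


# Hypothetical labels and the predictions of two classifiers.
actual = [1, 0, 0, 1, 0]
cautious = [1, 1, 1, 1, 0]   # no missed fraud, two false alarms
lenient = [0, 0, 0, 1, 0]    # one missed fraud, no false alarms

cost_cautious = average_misclassification_cost(actual, cautious, cost_fn=30, cost_fp=1)
cost_lenient = average_misclassification_cost(actual, lenient, cost_fn=30, cost_fp=1)
print(cost_cautious, cost_lenient)
```

Although the lenient classifier makes fewer errors in total, it is far more costly once a missed fraud is weighted 30 times as heavily as a false alarm, so the ranking of models depends on the assumed cost ratio.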
Xin-Ping Song and his co-authors (2014) present a method of assessing financial statement fraud risk. The proposed approach comprises a system of financial and non-financial risk factors and a hybrid assessment method that combines machine learning methods with a rule-based system. Experiments are performed on data from Chinese companies using four classifiers (logistic regression, back-propagation neural network, C5.0 decision tree and support vector machine) and an ensemble of those classifiers. The proposed ensemble of classifiers outperforms each of the four classifiers individually in accuracy and composite error rate. The experimental results indicate that non-financial risk factors and a rule-based system help decrease the error rates. The proposed hybrid approach outperforms the machine learning methods alone in assessing the risk of financial statement fraud.
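How an ensemble can combine the outputs of several classifiers is sketched below using simple majority voting. The combination rule actually used in the cited study is not described here, so majority voting is an assumption, and the three sets of predictions are hypothetical.

```python
from collections import Counter


def majority_vote(predictions_per_classifier):
    """Combine per-classifier label lists into one label per observation.

    With an even number of classifiers a tie is possible; in that case the
    label that appears first among the tied votes is returned.
    """
    combined = []
    # zip(*...) regroups the predictions so each tuple holds one firm's votes.
    for votes in zip(*predictions_per_classifier):
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Hypothetical predictions of three classifiers for four firms.
logistic = ["fraud", "clean", "clean", "fraud"]
tree = ["fraud", "fraud", "clean", "clean"]
svm = ["clean", "clean", "clean", "fraud"]
ensemble = majority_vote([logistic, tree, svm])
print(ensemble)
```

Each individual classifier disagrees with the others on at least one firm, while the vote settles on the label supported by two of the three, which is the intuition behind an ensemble reducing individual errors.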
Dimas Lagusto (2018) stated that the detection of fraudulent financial statements is important for investors, regulators and auditing firms. Ever since the disclosure of a series of financial statement frauds in the late 1990s and early 2000s, all relevant stakeholders have felt the adverse impact of fraud, both socially and financially. Early detection may mitigate this impact. However, because the resources available for manual detection of financial statement fraud are limited, a more resource-effective automated method is required. Such a method can be obtained by combining textual analysis with machine learning algorithms.
Tobias Christian Glechman (2020) conducted research that falls mainly within the behavioural accounting literature, as it deals with the potential to find clues for fraudulent manipulations in annual reports. Through a mixed-model approach, qualitative and quantitative data are used both separately and in conjunction to create sound and comprehensive detection models.
Hawariah Dalnial and his co-authors (2014) investigated the link between financial statement analysis and fraudulent financial reporting. Numerous researchers have discovered clues in financial ratios that can identify fraudulent financial reporting, while others have reached different conclusions. The majority of these studies were conducted in countries other than Malaysia. The study's sample consists of Malaysian public listed firms, and the data span from 2000 to 2011. Their findings indicate that certain financial ratios, such as total debt to total assets and receivables to revenue, are significant predictors of fraudulent financial reporting. This suggests that financial ratios can be valuable in detecting instances of fraudulent financial reporting.
Chyan-long Jan (2018) established a rigorous and effective model to detect enterprises' financial statement fraud for the sustainable development of enterprises and financial markets. The research period is 2004–2014 and the sample consists of companies listed on either the Taiwan Stock Exchange or the Taipei Exchange, with a total of 160 companies (including 40 companies reporting financial statement fraud). The study adopts multiple data mining techniques. In the first stage, an artificial neural network (ANN) and a support vector machine (SVM) are deployed to screen out important variables. In the second stage, four types of decision trees (classification and regression tree (CART), chi-square automatic interaction detector (CHAID), C5.0, and quick unbiased efficient statistical tree (QUEST)) are constructed for classification. Both financial and non-financial variables are selected to build a highly accurate model for detecting fraudulent financial reporting. The empirical findings show that the variables screened with ANN and processed by CART (the ANN + CART model) yield the best classification results, with an accuracy of 90.83% in the detection of financial statement fraud.
SUMMARY
In this chapter, we have established the theoretical framework for understanding financial statement fraud, machine learning, and their applications in the field of finance. We began by exploring the concept of financial statement fraud, discussing its various forms and the negative implications it has for organizations, investors, and the market.
Next, we delved into the field of machine learning, providing an overview of its principles, methodologies, and potential for detecting fraud in financial statements. We discussed the applications of machine learning in finance, such as its ability to analyze large volumes of data to detect credit card fraud, even trading with algorithms and managing. Furthermore, we reviewed existing research studies that have focused on detecting fraud in financial statements, both domestically and
DATA AND RESEARCH METHOD
Data
Regarding the methodology, the author uses secondary data for this dissertation. Because manually extracting data for more than 600 companies is impractical, the author wrote a Python script that loops through the financial indicators on a source website, using the API of the 24hmoney website together with Python libraries such as pandas, numpy and vnstock.
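The crawling loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `fetch_indicators` is a hypothetical callable standing in for whatever API wrapper the author actually used (the 24hmoney and vnstock calls are not shown in the text, so none are reproduced here), and the demo fetcher returns made-up numbers.

```python
import pandas as pd


def collect_indicators(tickers, fetch_indicators):
    """Loop over tickers and stack each company's yearly indicators into one table.

    `fetch_indicators` is a hypothetical callable (ticker -> DataFrame with one
    row per year); in practice it would wrap the data-source API.
    """
    frames, failed = [], []
    for ticker in tickers:
        try:
            df = fetch_indicators(ticker)
        except Exception:
            # Remember tickers whose data could not be retrieved and move on.
            failed.append(ticker)
            continue
        df = df.copy()
        df.insert(0, "ticker", ticker)  # tag every row with its company
        frames.append(df)
    combined = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
    return combined, failed


def _demo_fetch(ticker):
    # Stand-in for a real API call; the figures are invented for illustration.
    if ticker == "BAD":
        raise ValueError("no data")
    return pd.DataFrame({"year": [2021, 2022], "sales": [100.0, 120.0]})


panel, failed = collect_indicators(["AAA", "BAD", "BBB"], _demo_fetch)
```

Passing the fetcher in as a parameter keeps the loop independent of any particular data source, so the same function can be exercised with a stub, as here.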
2.1.2 Data collection and processing feature
To provide further context, the research excludes banks because their financial statement structures differ from those of non-financial firms. Therefore, the sample includes only firms from a wide range of industries, except stock companies and banks, listed on the Ho Chi Minh Exchange (HOSE), and covers a 6-year period from 2017 to 2022, resulting in a total of 685 firms in the dataset. However, because some data are missing from the source, the author filters out null values to safeguard the quality of the model and limit bias, which reduces the sample to 439 companies.
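One plausible reading of the null-filtering step is that a company is dropped entirely if any of its yearly indicators is missing. A minimal sketch under that assumption is given below; the column names and figures are hypothetical, and the author's actual rule may differ.

```python
import numpy as np
import pandas as pd


def keep_complete_companies(panel, id_col="ticker"):
    """Keep only companies whose rows contain no missing values at all."""
    # True for each row that has no nulls in its indicator columns.
    row_complete = panel.drop(columns=[id_col]).notna().all(axis=1)
    # A company is complete only if every one of its rows is complete.
    company_complete = row_complete.groupby(panel[id_col]).all()
    keep = company_complete[company_complete].index
    return panel[panel[id_col].isin(keep)].reset_index(drop=True)


# Made-up example: BBB is missing one value and is therefore removed.
raw = pd.DataFrame({
    "ticker": ["AAA", "AAA", "BBB", "BBB"],
    "year": [2021, 2022, 2021, 2022],
    "sales": [100.0, 120.0, 80.0, np.nan],
})
clean = keep_complete_companies(raw)
```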
Figure 2.1: Process of crawling and handling data (secondary data collection and pre-processing)
Methodology
Using quantitative methods in this dissertation is essential because of its focus on the M-Score model and machine learning algorithms. These approaches rely on numerical data analysis to identify patterns, correlations, and anomalies in financial statements. Quantitative methods allow for the calculation of financial ratios and M-Scores, and for the processing of the large datasets required to train machine learning models.
The M-Score model uses quantitative financial ratios as indicators of potential fraud, while machine learning algorithms such as SVM and random forest require numerical features for classification and prediction. By employing quantitative methods, this dissertation can effectively detect fraudulent behavior by analyzing statistically significant patterns and trends.
Figure 2.1 step labels (as recovered): Step 2: Crawl data using loops; Step 5: Categorize companies into 2 groups; Step 6: Label as Fraudulent/Non-Fraudulent
In summary, the use of quantitative methods aligns with the nature of the M-Score model and machine learning algorithms, allowing for accurate and reliable fraud detection in financial statements.
In this dissertation, the Beneish M-Score model is used to calculate financial indices of the listed companies and to categorize them into 2 groups: fraudulent and non-fraudulent. After that, Support Vector Machine, a widely used machine learning algorithm for classification, is applied to the labeled data to perform a binary classification that predicts whether a company is engaging in earnings manipulation. The M-Score results are calculated in Python through a simplified sequence of steps.
According to Beneish (2012), the Beneish M-Score is a method for detecting companies that are prone to fraud in their financial statements. Empirically, companies with higher M-Scores are more likely to commit fraud. Because the Beneish M-Score is a probabilistic model, it has an inherent limitation in fraud detection: 100% accuracy is not attainable in practice. The formula is built from the following indices:
• Days Sales in Receivables Index (DSRI):
The DSRI indicator compares the length of time a company needs to collect payments from its customers in the current year with that of the previous year. If the DSRI value is high, it could suggest that the company is experiencing delays in collecting payments, which could be a potential indication of either aggressive revenue recognition practices or manipulation of accounts receivable.
• Gross Margin Index (GMI):
The GMI indicator tracks how a company's gross profit margin changes over time. Because it divides the previous year's margin by the current year's margin, a GMI value above 1 implies that the company's gross margin is declining, which could indicate the use of aggressive revenue recognition practices, inventory manipulation, or cost manipulation.
• Asset Quality Index (AQI):
The AQI indicator evaluates the evolution of the quality of assets over time. If the AQI value is high, it may suggest a reduction in the quality of assets, which could be an indication of overvalued assets or aggressive accounting practices.
• Sales Growth Index (SGI):
The SGI indicator tracks a company's sales growth trend over time. A high SGI value suggests that the company's sales are growing rapidly, which could potentially be an indication of aggressive revenue recognition practices or manipulation of sales.
• Depreciation Index (DEPI):
The DEPI indicator compares a company's depreciation rate in the previous year with its depreciation rate in the current year. If the DEPI value is high, it implies that the company's assets are being depreciated at a slower rate than before. This could be a potential indication of overvalued assets or aggressive accounting practices.
• Sales, General, and Administrative Expenses Index (SGAI):
The SGAI indicator monitors the pattern of a company's sales, general, and administrative expenses relative to sales over time. If the SGAI value is low, it suggests that the company's expenses are declining relative to sales, which could be an indication of aggressive accounting practices or manipulation of costs.
• Total Accruals to Total Assets (TATA):
The TATA indicator measures the level of accruals in proportion to total assets. If the TATA value is high, it suggests that the company is using accruals to artificially increase its earnings, potentially indicating aggressive accounting practices or manipulation of earnings.
• Leverage Index (LVGI):
The LVGI indicator can sometimes be added as an eighth component to the M-Score formula. It gauges the change in a company's leverage or debt levels over time. If the LVGI value is high, it suggests that the company is increasing its debt levels, which could indicate potential financial distress or aggressive accounting practices. Including LVGI can help investors identify companies that may be at greater risk of financial statement manipulation or fraud because of their high leverage.
According to the Beneish model, an M-Score higher than a certain threshold, which is -2.22, means that the company has a high probability of manipulating its financial statements.
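The scoring and thresholding step can be sketched as below. The weights are those of the widely cited eight-variable form of the Beneish model; the text shown here does not state which variant or coefficients the author applied, so treating them as the author's is an assumption. The -2.22 threshold is the one given above.

```python
# Coefficients of the commonly cited eight-variable Beneish M-Score
# (assumed here; the dissertation's exact variant is not shown in the text).
INTERCEPT = -4.84
COEFFICIENTS = {
    "DSRI": 0.920,
    "GMI": 0.528,
    "AQI": 0.404,
    "SGI": 0.892,
    "DEPI": 0.115,
    "SGAI": -0.172,
    "TATA": 4.679,
    "LVGI": -0.327,
}
THRESHOLD = -2.22  # cutoff stated in the text


def m_score(indices):
    """Weighted sum of the eight indices plus the intercept."""
    return INTERCEPT + sum(coef * indices[name] for name, coef in COEFFICIENTS.items())


def label(score, threshold=THRESHOLD):
    """Scores above the threshold are labeled Fraudulent."""
    return "Fraudulent" if score > threshold else "Non-Fraudulent"


# A firm with no year-on-year change: every ratio index is 1 and accruals are 0.
neutral = {"DSRI": 1.0, "GMI": 1.0, "AQI": 1.0, "SGI": 1.0,
           "DEPI": 1.0, "SGAI": 1.0, "TATA": 0.0, "LVGI": 1.0}
# The same firm with receivables growing 50% faster than sales.
stretched = dict(neutral, DSRI=1.5)

neutral_score = m_score(neutral)
stretched_score = m_score(stretched)
```

Under these assumed weights the unchanged firm scores about -2.48 and stays below the cutoff, while raising DSRI to 1.5 lifts the score to about -2.02 and crosses it.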
Table 2.2: M-Score Variables and Formulas
Variable   Formula
DSRI   (Accounts Receivables_t / Sales_t * Number of Days) / (Accounts Receivables_(t-1) / Sales_(t-1) * Number of Days)
GMI   [(Sales_(t-1) - COGS_(t-1)) / Sales_(t-1)] / [(Sales_t - COGS_t) / Sales_t]
AQI   [1 - (Current Assets_t + PP&E_t) / Total Assets_t] / [1 - (Current Assets_(t-1) + PP&E_(t-1)) / Total Assets_(t-1)]
SGI   Sales_t / Sales_(t-1)
DEPI   [Depreciation_(t-1) / (PP&E_(t-1) + Depreciation_(t-1))] / [Depreciation_t / (PP&E_t + Depreciation_t)]
SGAI   (SG&A Expense_t / Sales_t) / (SG&A Expense_(t-1) / Sales_(t-1))
TATA   (Income from Continuing Operations_t - Cash Flow from Operations_t) / Total Assets_t
LVGI   [(Current Liabilities_t + Total Long-term Debt_t) / Total Assets_t] / [(Current Liabilities_(t-1) + Total Long-term Debt_(t-1)) / Total Assets_(t-1)]
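The index definitions in Table 2.2 can be turned into a small helper that takes two consecutive years of figures. This is a sketch using the standard Beneish definitions; the dictionary keys are hypothetical field names, and the figures are invented for illustration (the "Number of Days" factor cancels in DSRI and is therefore omitted).

```python
def beneish_indices(cur, prev):
    """Compute the eight Beneish indices from current- and prior-year figures."""
    def margin(y):
        return (y["sales"] - y["cogs"]) / y["sales"]

    def non_core_asset_share(y):
        return 1.0 - (y["current_assets"] + y["ppe"]) / y["total_assets"]

    def depreciation_rate(y):
        return y["depreciation"] / (y["ppe"] + y["depreciation"])

    def leverage(y):
        return (y["current_liabilities"] + y["long_term_debt"]) / y["total_assets"]

    return {
        "DSRI": (cur["receivables"] / cur["sales"]) / (prev["receivables"] / prev["sales"]),
        "GMI": margin(prev) / margin(cur),
        "AQI": non_core_asset_share(cur) / non_core_asset_share(prev),
        "SGI": cur["sales"] / prev["sales"],
        "DEPI": depreciation_rate(prev) / depreciation_rate(cur),
        "SGAI": (cur["sga"] / cur["sales"]) / (prev["sga"] / prev["sales"]),
        "TATA": (cur["income_cont_ops"] - cur["cash_flow_ops"]) / cur["total_assets"],
        "LVGI": leverage(cur) / leverage(prev),
    }


# Invented figures: sales grow 20% while receivables grow 80%.
prior_year = {
    "receivables": 100.0, "sales": 1000.0, "cogs": 600.0,
    "current_assets": 400.0, "ppe": 300.0, "total_assets": 1000.0,
    "depreciation": 30.0, "sga": 100.0,
    "current_liabilities": 200.0, "long_term_debt": 100.0,
    "income_cont_ops": 80.0, "cash_flow_ops": 60.0,
}
current_year = dict(prior_year, sales=1200.0, receivables=180.0)
indices = beneish_indices(current_year, prior_year)
```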
According to Vapnik (1995), SVM is a set of artificial intelligence learning methods. It is a machine learning method based on statistical learning theory and structural risk minimization (SRM). It uses the input training data to learn an optimal separating hyperplane that can distinguish two or more types (classes) of data. The underlying concept of SVM is straightforward: the algorithm forms a line or hyperplane that separates the data into different classes. It is a supervised learning method for data mining, applicable to both prediction and classification. In simplified terms, it is a binary classifier between two classes.
Figure 2.2: Support Vector Machine Functionality
Companies listed on any exchange, whether HNX, UPCoM or the Ho Chi Minh Exchange, are required to publish their financial statements yearly. In the process of detecting fraud in financial statements and identifying companies that may be involved in fraudulent activities, a financial institution can employ various techniques. One such method involves using a training dataset of past financial statements to find a decision boundary that maximizes the geometric margin, resulting in an SVM classifier that effectively distinguishes between "good" and "bad" financial statements. With this classifier, the institution can evaluate the financial statements of current companies and identify those that are likely to be committing fraud by calculating their M-Scores. Companies that pass a certain M-Score threshold are considered to have a high probability of committing fraud, and the institution can investigate these companies further and take the necessary actions to prevent fraudulent activities. By employing this approach, the institution can proactively monitor financial statements and identify potentially fraudulent activities before they become significant issues.
According to Breiman (2001), random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. A random forest is an ensemble classifier that constructs a group of independent and non-identical decision trees based on the idea of randomization.
SUMMARY
In this chapter, the research methodology and theoretical framework for detecting financial statement fraud in Vietnamese listed companies are presented. The study uses the Beneish M-Score model and incorporates two machine learning algorithms, namely Support Vector Machine (SVM) and Random Forest. The sample comprises 439 qualified companies listed on the HOSE and HNX exchanges.
The author recognizes that the most suitable methodology is quantitative, as quantifying financial data is crucial for analyzing and detecting fraud patterns effectively. By employing quantitative methods, such as calculating the M-Score and applying machine learning algorithms, the study aims to use objective measurements to uncover potentially fraudulent activities. Furthermore, the theoretical foundation behind the integration of the SVM and Random Forest algorithms with the M-Score model is discussed.
By employing these research methodologies, the study aims to enhance the accuracy of fraud detection in Vietnamese listed companies. The chapter serves as a pivotal point for the subsequent analysis and provides valuable insights into effectively identifying financial statement fraud in the Vietnamese context.
RESEARCH RESULTS
Overview of Financial Statements Fraud in Vietnam
The issue of financial statement fraud is still prevalent in Vietnam. Many companies are suspected of manipulating their financial statements to hide their true financial position, thereby deceiving investors and creditors. The State Securities Commission of Vietnam has identified financial statement fraud as one of the most common violations in the stock market and has taken measures to improve the accuracy and transparency of financial reporting. Notable cases of financial statement fraud in Vietnam include those involving ROS, BDC, LIC, PXS, SNZ, VSG, …
Faros Construction Joint Stock Company (HSX: ROS), a contractor of FLC Group, used a virtual capital increase model to boost its capital from VND 1.5 billion to VND 4,300 billion in just 2 years. Faros entrusted VND 1,417 billion to individuals and VND 2,149 billion to organizations for investment. The increase in charter capital in Q1 2016 was contributed by 3 shareholders with a total of VND 462.5 billion. All individuals and organizations entrusted by Faros are connected to FLC's business ecosystem, with Trinh Van Quyet, the Chairman and largest shareholder of Faros, owning 42% of its capital. This is a common practice for listed companies on
According to AASC's external audit opinion on Bach Dang Construction Corporation's (UPCoM: BDC) 2018 financial report, the corporation set up provisions for financial investments in its subsidiaries worth VND 9.6 billion without sufficient basis to evaluate the net value of those investments. This may affect the accuracy of expenses between financial years. The corporation recognized revenue from real estate business activities based on the amount already paid according to the payment schedule and recognized all profits of the project in the 2018 business results. This recognition may affect the accuracy of revenue and expenses between financial years.
According to Deloitte's audit report on Petro Vietnam Technical Services Corporation's (HOSE: PXS) 2018 financial statements, the company recognized revenue in advance for a construction contract in 2018, amounting to VND 18,850 billion, which corresponded to the receivable portion of the handover to the customer in 2019. This led to a decrease of VND 2,734 billion in the reported loss before tax and was mentioned in the audit report as a basis for a qualified opinion.
For the 2017 financial year of the Industrial Parks Development Corporation (UPCoM: SNZ), RSM's audit highlighted that certain completed portions of construction projects worth 104.247 billion VND, which had been accepted as per the respective stage's minutes, were not recognized in the company's consolidated financial statements as of December 31, 2017. This led to a decrease in the reported accounting profit before tax compared to actual figures. It is a common practice among construction companies to manipulate financial reports by recording revenue and expenses in the wrong accounting period or transferring them at the discretion of management.
The audit report on Southern Container Joint Stock Company's (VSG) financial statements drew attention to an unrealized exchange rate loss arising from the reassessment of the year-end balance of a long-term loan denominated in foreign currency. This was recorded under "exchange rate difference" and helped reduce the loss by 33.16 billion VND, following the application of TT 201/2009. Applying VAS 10 would result in a post-tax loss of 73.82 billion VND instead of the reported loss of 40.66 billion VND in 2010, indicating that changes in accounting treatment, rather than the company's production and business activities, contributed to part of the profit generated or loss reduction.
In addition to the major frauds mentioned by auditors in audit reports, auditors may discover minor frauds during the audit process that are not significant enough to be mentioned in the audit report. Examples of minor frauds include manipulating completion rates and cost estimates for construction projects to recognize revenue early, and not setting aside provisions for doubtful or difficult-to-collect accounts receivable.
Models and Variables
The Beneish M-Score results indicate that the likelihood of fraud in financial statements is extremely high, which shows that the manipulation of financial indicators in Vietnamese financial reports is seriously underestimated.
According to Beneish, companies with an M-Score higher than -2.22 are prone to manipulating earnings in their financial statements, so the author of this dissertation sets that value as the threshold for labeling companies as Fraudulent or Non-Fraudulent.
The author computed the M-Score over a period of 6 years and applied the cutoff point. Upon examination, 417 of the 439 companies sampled had an M-Score exceeding -2.22 at least once, which the model interprets as a sign that their financial records had been deliberately manipulated. To avoid chance occurrences and enhance the precision of the M-Score model, the author decided to include only companies whose M-Scores had surpassed -2.22 on two separate occasions within the 6-year period.
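The two-occasion rule can be expressed as a short grouping operation. The sketch below reads "two separate occasions" as "at least two years above the threshold", which is an interpretation rather than something the text states; the column names and scores are hypothetical.

```python
import pandas as pd


def label_companies(scores, threshold=-2.22, min_years=2):
    """Label a company Fraudulent if its M-Score exceeds the threshold
    in at least `min_years` of the observed years."""
    years_flagged = (scores["m_score"] > threshold).groupby(scores["ticker"]).sum()
    return years_flagged.ge(min_years).map({True: "Fraudulent", False: "Non-Fraudulent"})


# Invented yearly M-Scores for two companies.
yearly_scores = pd.DataFrame({
    "ticker": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB"],
    "year": [2020, 2021, 2022, 2020, 2021, 2022],
    "m_score": [-1.0, -2.0, -3.0, -1.5, -2.5, -3.0],
})
labels = label_companies(yearly_scores)
```

Here AAA exceeds -2.22 in two years and is labeled Fraudulent, while BBB exceeds it only once and is not.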
Table 3.3: Number of manipulating companies in the analyzed period
1 year 2 years 3 years 4 years 5 years 6 years
This table shows how many times companies manipulated their financial statements within the 6 years starting from 2017. Within a sample of 439 companies, the total number of manipulating companies over the 6-year period is 417, which is very high. Nearly every company's figures appear to have been deliberately influenced by the business itself.
Firstly, we need to display variables through a descriptive statistic table
Table 3.4: Descriptive statistics of variables
Correlation analysis shows the relationship between variables in the model
The Durbin-Watson value (2.131), which lies in the range from 1 to 3, indicates that there is no autocorrelation in the residuals. After checking for autocorrelation, the author moves on to check the fit of the model.
From that result, we can reject the first hypothesis, which indicates the model would fit
As can be seen, all VIF values are below 2, so the model does not exhibit multicollinearity. All variables are statistically significant. In addition, AQI has the highest regression coefficient among all 7 variables, at 0.955, and therefore has the greatest impact on the M-Score. DSRI has the second-highest regression coefficient, at 0.277. In summary, the M-Score value is most affected by AQI (Asset Quality Index) and DSRI (Days Sales in Receivables Index).
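For reference, the variance inflation factor of a variable is 1 / (1 - R²), where R² comes from regressing that variable on all the others. The sketch below computes it with plain numpy on synthetic data; it illustrates the check itself and is not the statistical package or the dataset used in the study.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return the VIF of each column of X (rows are observations)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = []
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])  # add an intercept term
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        residuals = y - A @ coef
        r_squared = 1.0 - (residuals @ residuals) / ((y - y.mean()) ** 2).sum()
        vifs.append(1.0 / (1.0 - r_squared))
    return np.array(vifs)


rng = np.random.default_rng(0)
independent = rng.normal(size=(200, 3))  # unrelated synthetic variables
near_copy = independent[:, 0] + 0.01 * rng.normal(size=200)
collinear = np.column_stack([independent, near_copy])

vif_independent = variance_inflation_factors(independent)
vif_collinear = variance_inflation_factors(collinear)
```

Unrelated variables give VIFs close to 1, while a near-duplicate column drives the VIF of both copies far above the usual warning levels.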
Once the data are labeled, SVM is used as a classifier, with cross-validation, to predict whether each company is fraudulent, using the financial indicators in the M-Score formula and the M-Scores of previous years as features. Here is a diagram of the feature selection for the model.
As for the output, Y is the Fraudulent/Non-Fraudulent label, and SVM performs a binary classification on it. The author chooses a train/test ratio of 70%/30% because this is a small dataset.
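A minimal sketch of this classification step with scikit-learn is shown below. It assumes a 70% training and 30% test split, an RBF kernel, and feature standardization, none of which is specified in the text, and it runs on synthetic data standing in for the eight indices rather than on the study's dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(42)
n_per_class = 100

# Synthetic stand-in for eight M-Score indices (not real company data).
X = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(n_per_class, 8)),  # class 0: Non-Fraudulent
    rng.normal(loc=2.0, scale=1.0, size=(n_per_class, 8)),  # class 1: Fraudulent
])
y = np.array([0] * n_per_class + [1] * n_per_class)

# 70% of the rows for training, 30% held out for testing, class balance preserved.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42
)

# Standardize the features, then fit a binary SVM classifier.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
accuracy = float((y_pred == y_test).mean())
```

Stratifying the split keeps the share of Fraudulent and Non-Fraudulent cases the same in both parts, which matters for a small sample.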
For accuracy assessment, several machine learning metrics are used: Precision, Recall and F1-Score. However, the character (CHAR) labels, Fraudulent and Non-Fraudulent, must first be converted into binary values, because metrics such as Precision and F1-Score are designed to evaluate binary classification models and therefore require a target variable with only two classes.
The metric results are reported below, and each metric is explained in turn. According to Will Koehrsen (2023), a classification model for fraud detection, such as Random Forest, should be evaluated with a few key metrics.
Precision is the number of true positives (TP) divided by the total number of predicted positives. It measures the proportion of positive predictions that are correct. In this context, precision reflects the number of actual fraudulent cases identified by the model among all cases it predicted as fraudulent.
Recall is the number of true positives divided by the total number of actual positives. In this context, it measures the sensitivity of the model in detecting fraudulent cases.
F1-score is the harmonic mean of precision and recall, and it provides an overall measure of the model's accuracy.
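The three definitions above, together with the label-to-binary conversion and the macro average reported in the result tables, can be written out directly. The sketch below uses invented labels and predictions, not the study's results.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for the class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Convert the character labels into binary values before scoring.
to_binary = {"Non-Fraudulent": 0, "Fraudulent": 1}
true_labels = ["Fraudulent", "Fraudulent", "Fraudulent", "Non-Fraudulent",
               "Non-Fraudulent", "Non-Fraudulent", "Non-Fraudulent", "Fraudulent"]
y_true = [to_binary[name] for name in true_labels]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]  # invented model output

fraud_metrics = precision_recall_f1(y_true, y_pred, positive=1)
non_fraud_metrics = precision_recall_f1(y_true, y_pred, positive=0)

# Macro average: the unweighted mean of the per-class precision values.
macro_precision = (fraud_metrics[0] + non_fraud_metrics[0]) / 2
```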
Table 3.7: Model Results of SVM from 2017 to 2022
For 2017, SVM predicts Non-Fraudulent companies correctly 81% of the time and Fraudulent companies 86% of the time. The macro average is 0.83, which means that the average precision across the Fraudulent and Non-Fraudulent classes is 83%, which can be considered good accuracy.
For 2018, SVM predicts Non-Fraudulent companies correctly 90% of the time and Fraudulent companies 88% of the time. The macro average is 0.89, which means that the average precision across both classes is 89%, which can be considered good accuracy.
For 2019, SVM predicts Non-Fraudulent companies correctly 90% of the time and Fraudulent companies 85% of the time. The macro average is 0.88, which means that the average precision across both classes is 88%, which can be considered good accuracy.
For 2020, SVM predicts Non-Fraudulent companies correctly 84% of the time and Fraudulent companies 88% of the time. The macro average is 0.86, which means that the average precision across both classes is 86%, which can be considered good accuracy.
Results Discussion
The results of this study demonstrate that combining the M-Score model with the SVM and Random Forest algorithms yields a substantial level of accuracy in detecting fraudulent activities in financial statements. The combination of these techniques enables the identification of patterns and anomalies that indicate potential fraud across various industries. The analysis reveals that fraudulent activities are not limited to specific industries but are prevalent across a wide range of sectors. Several notable industries emerge with a high count of fraudulent activities in their financial statements. These industries include:
Figure 3.5: Industry-wise fraudulent company count
Nearly half of the companies involved in fraudulent activities (195 out of 439) can be attributed to three key industries: construction and materials, industrial goods and services, and real estate
Firstly, it can be observed that the construction and materials industry is the most likely to engage in financial statement manipulation. There are various reasons why companies in this industry may be more likely to manipulate their financial statements:
Long-term projects: One possible reason is that this industry involves long-term projects with significant capital investments, which can make it challenging to forecast revenue and expenses accurately.
Corruption: The construction and materials industry is highly susceptible to corruption and collusion. Given the complex nature of construction projects and the involvement of multiple stakeholders, including government agencies, contractors, suppliers, and consultants, there is a greater risk of bribery, kickbacks, and other forms of corruption. These illicit practices can distort financial reporting and lead to fraudulent activities.
Highly regulated: Companies may feel pressure to meet certain financial targets to avoid penalties or maintain their competitive position.
Multiple revenue streams and cost factors: The business models of this industry can be complex, with different sources of revenue and cost factors, which can create opportunities for accounting manipulation.
Secondly, industrial goods and services, right behind construction and materials, has the second-highest count of fraudulent activities in financial statements. This can be explained by the following reasons:
Subcontracting Practices: The industrial goods and services industry often involves subcontracting arrangements, where companies outsource certain tasks or processes to subcontractors. This complexity in subcontracting relationships can make it easier for fraudulent activities, such as inflated billing or fictitious subcontractor transactions, to occur.
Inventory Manipulation: The industrial goods and services sector typically deals with a significant amount of inventory. Companies may manipulate inventory records by overstating quantities or values to create a false impression of higher assets or sales. This can be done through fictitious purchases, stockpiling excessive inventory, or concealing obsolete or damaged goods.
Revenue Recognition Challenges: Recognizing revenue in the industrial goods and services industry can be complex, especially when projects are long-term or involve multiple stages. Companies may employ fraudulent practices, such as prematurely recognizing revenue or inflating revenue figures, to meet financial targets or project a positive financial image.
Bid-Rigging and Collusion: The competitive nature of the industrial goods and services industry can lead to bid-rigging or collusion among companies. This unethical behavior involves conspiring to manipulate the bidding process for contracts, artificially inflating prices, or allocating contracts among specific companies. These practices can result in fraudulent financial reporting.
Government Contracts and Regulations: The industrial goods and services industry often involves contracts with government entities. Companies may resort to fraudulent activities to secure government contracts, such as providing false information, bribing officials, or misrepresenting qualifications or capabilities.
Lack of Industry Regulations: The industrial goods and services sector in Vietnam may have limited industry-specific regulations or weak enforcement. This regulatory gap can create an environment where fraudulent activities go undetected or unpunished, allowing unscrupulous companies to exploit loopholes or engage in fraudulent practices.
Financial Pressure and Cash Flow Issues: Companies in the industrial goods and services industry may face financial pressure, particularly if they experience cash flow difficulties or have high levels of debt. These financial constraints can lead to fraudulent activities, such as inflating revenue or concealing expenses, to present a healthier financial position to stakeholders.
Finally, the real estate industry has a significant number of companies that have been involved in fraudulent activities in their financial statements, ranking third in terms of count. There are a few possible explanations for this:
Complex Financial Transactions: The real estate industry often involves complex financial transactions, such as property development, investments, and partnerships. These transactions provide opportunities for fraudulent activities, including misrepresentation of property values, overstating revenues, or understating expenses.
Lack of Transparency: The real estate sector in Vietnam has faced challenges in terms of transparency and disclosure. Limited access to reliable market information and opaque reporting practices can create an environment conducive to fraudulent activities, as it becomes easier to manipulate financial statements without detection.
High Financial Stakes: The real estate industry typically deals with substantial financial investments, both from investors and lenders. The pressure to meet financial targets and attract funding can lead some companies to engage in fraudulent practices to present a more favorable financial position than the real one.
SUMMARY
In this chapter, the author presents the analysis and findings of the study, focusing on feature selection and prediction results using the M-Score model. The chapter also explores the accuracy metrics for the Support Vector Machine (SVM) and Random Forest algorithms and discusses the varying counts of financial statement fraud across different industries.
Firstly, the author examines the selected features, namely DSRI, TATA, GMI, AQI, SGI, DEPI, SGAI and LVGI, which were used in the analysis. These features play a crucial role in capturing relevant indicators of fraudulent activities within financial statements. The inclusion of these variables ensures a comprehensive evaluation of the dataset and enhances the accuracy of the predictive models.
Next, the chapter presents the finalized results obtained from the M-Score model, indicating the effectiveness of this approach in detecting financial statement fraud. The model demonstrates its ability to identify potential instances of fraudulent activities within the dataset.
Furthermore, the author evaluates the performance of the SVM and Random Forest algorithms in predicting financial statement fraud. Both algorithms provide valuable insights into fraudulent patterns, each with its own advantages. The chapter examines the accuracy metrics for each algorithm, highlighting their respective strengths in detecting fraudulent activities within the studied dataset.
Lastly, the chapter explores the reasons behind the varying counts of financial statement fraud across different industries. The author identifies the construction and materials industry and the real estate industry as having a higher incidence of fraud, while other industries such as automotive and parts, financial services, and oil and gas show lower counts.
LIMITATIONS AND RECOMMENDATIONS
Limitations
This dissertation managed to use only a little over 400 samples covering only 5 years. To increase accuracy and ensure that the machine learning algorithms work well, a much larger dataset and a longer timeframe are required.
Even though this dissertation produced noticeable results and insights by combining a traditional fraud detection model, the Beneish M-Score, with machine learning algorithms, it does not address specific questions such as the relationship between financial statement manipulation and the stock prices of the companies concerned, or other economic damage. Additionally, the authorities still lack a legal structure for publicly announcing the names of companies that have committed fraud in their financial statements, so the results presented in this dissertation are highly dependent on available data and statistics. Aside from that, this dissertation does not identify which type of financial statement fraud each company committed.
Recommendations from research results
The study presented in this dissertation focused on using the Beneish M-Score model together with machine learning techniques such as SVM and Random Forest to detect financial statement fraud in Vietnamese listed companies. Given the high rate of fraud in Vietnam, it is important to implement stringent measures to mitigate the risks of fraudulent activities. The recommendations are structured as follows.
The results indicate that the Asset Quality Index (AQI) has the most significant influence on the M-Score. The AQI measures the proportion of total assets made up of non-current assets other than property, plant and equipment, including long-term investments and other long-term assets, in the current year relative to the previous year.
Companies with an AQI exceeding 1 are more prone to engaging in activities that artificially inflate their assets and defer costs, thereby manipulating their tax liabilities.
To address this issue, auditors should prioritize examining investments in subsidiaries and joint ventures, particularly transactions involving these entities, to prevent the creation of fictitious transactions. Additionally, strict control over provisions for the impairment of long-term investments and over long-term prepaid expenses is crucial. Furthermore, both listed companies in general and those in the construction industry are recommended to promptly disclose equity ownership reports, specify profits or losses from joint ventures and associates, and provide detailed information about subsidiaries and related transactions in the financial statements.
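As a minimal illustration of the AQI screen discussed above, the sketch below follows Beneish's standard definition of the index. The function name asset_quality_index and all figures are hypothetical; they are not taken from the dataset used in this dissertation.

```python
def asset_quality_index(current_assets, ppe, total_assets,
                        prev_current_assets, prev_ppe, prev_total_assets):
    """Beneish Asset Quality Index.

    AQI = [1 - (CA_t + PPE_t) / TA_t] / [1 - (CA_{t-1} + PPE_{t-1}) / TA_{t-1}]
    where CA = current assets, PPE = property, plant and equipment,
    TA = total assets.
    """
    share_now = 1 - (current_assets + ppe) / total_assets
    share_prev = 1 - (prev_current_assets + prev_ppe) / prev_total_assets
    if share_prev == 0:
        raise ValueError("prior-year share of other non-current assets is zero")
    return share_now / share_prev


# Hypothetical company (illustrative figures only):
# other non-current assets rise from 20% to 30% of total assets.
aqi = asset_quality_index(400, 300, 1000, 450, 350, 1000)
needs_review = aqi > 1  # simple screen: AQI above 1 warrants closer examination
```

A value above 1 only indicates that a larger share of assets now sits in categories where cost deferral is easier; it is a signal for further audit work, not proof of manipulation.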
The Days Sales in Receivables Index (DSRI) has the second-highest impact on the M-Score model, for several possible reasons. One possible reason is that companies may manipulate their accounts receivable to inflate their reported revenues, leading to a higher M-Score. This can be addressed in Vietnam by implementing stricter regulation and monitoring of accounts receivable management practices. The government can enforce timely and accurate reporting of accounts receivable, ensuring that companies provide detailed information on their customers, payment terms, and collection processes.
Another reason for the impact of DSRI is the possibility that companies delay or accelerate revenue recognition to manipulate their financial statements. To address this, the government can introduce standardized revenue recognition guidelines and provide comprehensive guidance to companies on recognizing revenue appropriately. This can help ensure consistency and transparency in financial reporting practices, reducing the likelihood of manipulation.
Furthermore, DSRI can be influenced by the prevalence of cash-based transactions, especially in certain industries. Companies may engage in fraudulent activities by underreporting cash sales or inflating accounts receivable balances. To mitigate this risk, the government can promote the use of electronic payment systems, which provide a more accurate and traceable record of transactions. Implementing robust cash handling and monitoring procedures can also help prevent fraudulent activities related to cash-based transactions.
Additionally, companies may engage in channel stuffing, where they push excessive inventory onto distributors or customers near the end of reporting periods to inflate sales figures. To address this, the government can enforce stricter inventory management practices, conduct regular inventory audits, and encourage companies to provide detailed disclosures on their inventory levels, turnover ratios, and sales patterns.
Overall, addressing the impact of DSRI on the M-Score model in Vietnam requires a combination of regulatory measures, standardization of reporting practices, and enhanced monitoring and enforcement. By implementing these measures, the government can create a more transparent and trustworthy financial reporting environment, reducing the risks of financial statement fraud.
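For reference, the DSRI discussed above compares receivables relative to sales in the current year with the same ratio in the prior year. The sketch below follows Beneish's standard definition. The function name days_sales_in_receivables_index and the figures are hypothetical and are not drawn from the companies studied in this dissertation.

```python
def days_sales_in_receivables_index(receivables, sales,
                                    prev_receivables, prev_sales):
    """Beneish Days Sales in Receivables Index.

    DSRI = (Receivables_t / Sales_t) / (Receivables_{t-1} / Sales_{t-1})
    """
    if sales == 0 or prev_sales == 0 or prev_receivables == 0:
        raise ValueError("sales and prior-year receivables must be non-zero")
    ratio_now = receivables / sales
    ratio_prev = prev_receivables / prev_sales
    return ratio_now / ratio_prev


# Hypothetical company (illustrative figures only):
# sales are flat, but receivables grow from 100 to 150.
dsri = days_sales_in_receivables_index(150, 1000, 100, 1000)
```

A DSRI well above 1 means receivables are growing faster than sales, which is consistent with the premature revenue recognition and channel stuffing scenarios described above, although a change in credit policy can produce the same pattern.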
It is recommended that the regulatory authorities in Vietnam take further action to enhance their oversight and monitoring of listed companies' financial reporting practices. To achieve this, the authorities could establish a system to improve audit standards. Such a system could provide better insight into companies' financial positions and aid in the identification of potentially fraudulent activities. Additionally, the regulatory authorities should conduct regular audits of companies with high fraud risk and impose stricter penalties on those found to have engaged in fraudulent activities.
Furthermore, it is suggested that the authorities set up a fraud risk assessment framework to identify and address potentially fraudulent activities at an early stage. The framework could include collaboration with companies' audit committees and external auditors, as well as an enhanced role for internal auditors in detecting and preventing fraud.
Moreover, the regulatory authorities could encourage companies to adopt a code of ethical conduct and implement whistleblower policies. They could also provide regular training to employees on the significance of compliance with laws and regulations. Implementing all these measures will foster a culture of integrity and transparency in financial reporting, ultimately leading to a healthier business environment in Vietnam.
Finally, the authorities ought to establish a legal framework, through new laws and regulations, for publicly naming companies that have committed fraud in their financial statements. This can be done by establishing a cooperative relationship with the Audit Association.
To prevent fraudulent activities and maintain the integrity of the financial market in Vietnam, it is crucial for companies to adopt more robust internal control mechanisms. These can include implementing whistleblower policies that encourage employees to report suspicious activities, conducting regular internal audits to identify potential vulnerabilities in the control system, and providing regular training for employees on the importance of ethical behavior and compliance with laws and regulations.
Whistleblower policies can help companies detect fraudulent activities early on, allowing for timely intervention to prevent further damage. Regular internal audits can also help identify control weaknesses or deficiencies in the system, which can then be addressed promptly to prevent fraudulent activities from occurring. By providing regular training, companies can ensure that their staff are well informed and equipped with the knowledge needed to detect and prevent fraudulent activities.
Overall, implementing these internal control mechanisms can go a long way toward preventing fraudulent activities and promoting ethical behavior within companies.
It is important for companies to take a proactive approach to fraud prevention rather than simply reacting to incidents after they occur. By creating a culture of integrity and accountability, companies can help mitigate the risks of fraudulent activities and maintain the trust of their stakeholders.
4.2.4 For investors and financial analysts
SUMMARY
This chapter presents a set of recommendations for the various stakeholders involved in the detection and prevention of financial statement fraud in Vietnamese listed companies. The regulatory authorities in Vietnam are advised to strengthen their oversight and monitoring by improving audit standards, conducting regular audits of high-risk companies, and imposing stricter penalties on those engaged in fraudulent activities. It is also suggested that a fraud risk assessment framework be established, in collaboration with audit committees and external auditors, to identify and address potentially fraudulent activities at an early stage.
Companies themselves are encouraged to adopt robust internal control mechanisms, including whistleblower policies, regular internal audits, and employee training on ethical behavior and regulatory compliance. By implementing these measures, companies can proactively detect and prevent fraudulent activities, ensuring the integrity of their financial reporting.
Investors and financial analysts are urged to conduct thorough due diligence by analyzing financial statements, conducting company background checks, and monitoring company performance over time. This comprehensive approach will help them make informed investment decisions and detect red flags that may indicate fraudulent activities.
Overall, the implementation of these recommendations will contribute to a more transparent and trustworthy financial reporting environment in Vietnam, mitigating the risks of financial statement fraud and protecting the interests of investors and stakeholders.