
Summary: Exploiting big data in calculating the Consumer Price Index in Vietnam (the case of Ho Chi Minh City)


DOCUMENT INFORMATION

Basic information

Title Exploiting Big Data In Calculating The Consumer Price Index In Vietnam (Case Of Ho Chi Minh City)
Author Nguyen Thanh Binh
Supervisors Ha Văn Sơn, PhD, Le Thi Thanh Loan, PhD
University University of Economics Ho Chi Minh City
Major Statistics
Type Ph.D. dissertation
Year of publication 2023
City Ho Chi Minh City
Pages 29
File size 345.86 KB

Content


MINISTRY OF EDUCATION AND TRAINING
UNIVERSITY OF ECONOMICS HO CHI MINH CITY

Nguyen Thanh Binh

EXPLOITING BIG DATA IN CALCULATING THE CONSUMER PRICE INDEX IN VIETNAM (CASE OF HO CHI MINH CITY)

Major: Statistics
Code: 9460201

SUMMARY OF PH.D. DISSERTATION

INSTRUCTORS: Ha Văn Sơn, PhD; Le Thi Thanh Loan, PhD

Ho Chi Minh City, 2023

The dissertation was completed at the University of Economics Ho Chi Minh City.
Scientific instructors:
Reviewer 1: ………
Reviewer 2: ………
Reviewer 3: ………
The dissertation will be defended before the university-level Dissertation Evaluation Committee.
At ………… On …………
The dissertation is available at the library:

CHAPTER 1: INTRODUCTION TO RESEARCH

1.1 Reasons to choose the research topic

The Consumer Price Index is considered one of the most important economic indicators published by national statistical offices (Berry et al., 2019). These agencies typically select a representative sample of the goods and services most frequently consumed by the population to calculate the Consumer Price Index. The traditional method of collecting price information through surveys, as currently practiced, has several limitations: the cost of conducting surveys and the increasing difficulty of carrying them out, the growing number of retail chains leading to longer data collection, decreasing response rates (Crystal et al., 2019), and sampling and non-sampling errors, since the quality of the collected information depends on the skills and honesty of the interviewers.

The Consumer Price Index in Vietnam is collected, calculated, and published by the General Statistics Office. Although the statistical information system on
consumer prices is implemented according to international practices and is being improved, calculating the Consumer Price Index in Vietnam still faces issues similar to those seen globally.

As the digital economy develops worldwide, online transactions are becoming more common, creating an immense and diverse source of price data. This large data source can help gather price information in a more timely manner, with a wider variety of items and a higher collection frequency (Crystal et al., 2019). The Billion Prices Project at the Massachusetts Institute of Technology is one of the pioneering projects in exploiting this data source. Its results have shown that detailed data on retail prices can be collected remotely at a significantly lower cost than with traditional methods (Cavallo and Rigobon, 2016). Recognizing these advantages, countries have begun experimental research and have applied big data to calculating the Consumer Price Index for official statistics, with typical studies in Norway (Manik and Albarda, 2015), the UK (Naynor et al., 2015), Belgium (Van Loon and Roels, 2018), France, Sweden, and the Netherlands (Jens, 2019), and the USA (Crystal et al., 2019).

Recognizing the importance of information and communication technology, and especially big data, the Prime Minister issued Decision No 501/QĐ-TTg on May 10, 2018, approving the proposal on the application of information and communication technology in the National Statistical System for the period 2017-2025, with a vision to 2030. The stated goal is to: "Apply big data technology to modernize, reduce costs, improve quality, and enhance forecasting capabilities for some statistical indicators in the field of price statistics" (Prime Minister, 2018). Realizing the importance and immense potential of big data, the General Statistics Office has outlined several strategic orientations, such as: establishing a working group on big data, adding
"Researching the application of big data" to the Information Technology Application Development Program of the General Statistics Office, and building a proposal on the application of big data in National Statistics (Nguyễn Bích Lâm, 2016).

Researching solutions that use big data to calculate the Consumer Price Index in Vietnam is necessary and in line with global trends. The author has therefore chosen the topic "Exploiting big data in calculating the Consumer Price Index in Vietnam (the case of Ho Chi Minh City)" as the research subject for this thesis.

1.2 Research objectives

Objective 1: To build a process for extracting price information from big data;
Objective 2: To establish procedures and techniques for calculating the Consumer Price Index from big data;
Objective 3: To research the application of the Hedonic regression model for adjusting for quality changes in goods, or for cases where goods are no longer available in the market, when calculating the Consumer Price Index;
Objective 4: To analyze the appropriateness of applying the Hedonic regression model for these adjustments;
Objective 5: To provide policy implications for deploying the calculation of the Consumer Price Index from big data.

1.3 Research questions

Research question 1: What is the process for collecting prices from online websites?
Research question 2: What are the process and technique for calculating the Consumer Price Index from big data?
Research question 3: Can the Hedonic regression model be applied to big data to adjust for changes in the quality of goods, or for cases where goods are no longer present in the market?
Research question 4: Are the Consumer Price Index from big data, and the application of the Hedonic regression model based on this data, appropriate?
Research question 5: What are the implications for successfully implementing online price collection for calculating the Consumer Price Index in Vietnam?
1.4 Research subjects and scopes

Research subjects: Big data, the Consumer Price Index, and the Hedonic regression model.

Research scope: The research calculates the Consumer Price Index for Ho Chi Minh City and applies the Hedonic regression model to big data on laptops. Price data is collected and aggregated for the years 2017 and 2018.

1.5 Research methodology

To achieve the research objectives, the thesis employs a mixed-methods approach, using both qualitative and quantitative methods.

The qualitative method is carried out through group discussions, one-on-one discussions, and scientific seminars aimed at determining the necessity of, and solutions for, deploying big data collection to serve the calculation of the Consumer Price Index. It also identifies factors that influence laptop prices.

The quantitative method: based on the qualitative research, a model of the factors affecting laptop prices is constructed. Data is collected from websites selling laptops online, covering 974 types of laptops with detailed information such as price, configuration, weight, and brand. Drawing on previous studies and expert input, the author chooses a log-linear model (LogLin), estimated by the Ordinary Least Squares (OLS) method. Tests performed on the model include the t-test and the F-test, the R-squared coefficient to evaluate model fit, the variance inflation factor to test for multicollinearity, and the White test for heteroscedasticity.

1.6 Data source

The thesis uses price information for all items collected from 29 official websites verified by the Ministry of Information and Communications, which are major and reputable online shopping sites in Vietnam.

1.7 Contribution of the thesis

Theoretical contribution: The thesis develops a new approach to collecting statistical information, which is one of the most important steps in the 7-step process of statistical
information production.

Practical contribution: The data collection method based on big data will help to improve the quality of input data; establish a process for extracting price information from big data, specifically from online shopping websites in Vietnam; develop a process and technique for calculating the Consumer Price Index from big data; and construct a Hedonic regression model to adjust for changes in the quality of goods based on big data.

CHAPTER 2: LITERATURE REVIEW AND OVERVIEW OF RELATED PREVIOUS STUDIES

2.1 Literature review on price and price index

"The Consumer Price Index is a relative indicator (measured in %) that reflects the trend and the extent of price fluctuations over time of the goods in the representative basket of goods and consumer services" (General Statistics Office, 2018). "The weight used to calculate the Consumer Price Index is the expenditure structure of the groups of items in the total household expenditure, which is compiled from the results of the living standards survey and is fixed for about 5 years" (General Statistics Office, 2018).

To calculate the Consumer Price Index, national statistical agencies must collect data on prices and quantities for various goods and services. In addition, to estimate price changes relative to the base period (using weights, also known as expenditure shares), these agencies need data on household expenditure structures. This data is typically obtained through living standards surveys, which most national statistical agencies conduct at irregular intervals (Beegle et al., 2016).

Currently, the Consumer Price Index is collected, compiled, and published in 196 economies: 37 developed economies (19%) and 159 emerging and developing economies (81%) (Berry et al., 2019).

In Vietnam, in order to calculate and publish the monthly, quarterly, and annual Consumer Price Index as currently done, the statistical agency carries out the consumer
price survey (General Statistics Office, 2015). Criteria for assessing the reasonableness of the Consumer Price Index were proposed on the basis of the General Statistics Office's survey design, the practice of collecting price information locally, and the research by Berry and colleagues (2019). By these criteria, the method of calculating the Consumer Price Index in Vietnam conforms well to international practice: the weights of the Consumer Price Index are updated every five years; regarding the timeliness of data publication, Vietnam's Consumer Price Index is published on the 29th of each month; and Vietnam uses the COICOP classification in line with the recommendations of international organizations.

2.2 Literature review on Big Data

Daas et al (2023) define Big Data as "datasets (extremely large) that may contain both structured and unstructured data and when analyzed computationally, can reveal patterns, trends, and associations related to behaviors and interactions of the units included." Struijs et al (2014) argue that big data is a very diverse and broad research topic. The emergence of big data across various fields has led to a series of groundbreaking innovations that national statistical agencies cannot be left out of, given their role as the primary data provider and the authoritative agency on official statistics. Letouzé and Jütting (2014) argue that "engaging with big data is not a technical issue but a political obligation."
Daas et al (2023) have synthesized a total of 44 statistical fields based on big data, of which six have been officially published, while the rest are experimental and have been produced once or more frequently.

CHAPTER 3: RESEARCH METHODOLOGY AND RESEARCH PROCESS

3.1 Research process

To achieve the research objectives, the dissertation employs a mixed-methods approach, using qualitative and quantitative research together. The research process is built by synthesizing relevant theories and empirical studies, conducting expert surveys to develop the procedure for collecting information from big data, and identifying the factors that influence laptop prices through the stages of Consumer Price Index calculation. The dissertation also discusses the results and proposes implementation solutions.

3.2 Qualitative research

The dissertation conducts qualitative research through bilateral discussions and workshops to achieve the following objectives: gather expert opinions on the necessity of using big data in official statistics, especially in price statistics; collect additional information on the factors influencing laptop prices in practice, after reviewing previous studies; adjust and refine the process of collecting information from big data; and analyze challenges and limitations and propose solutions for applying big data analytics in Vietnam.

3.3 Big data mining method

The dissertation developed a 12-step process for collecting information from big data sources.

Figure 3.1 The process of collecting information from big data
Source: The author

3.4 Method for calculating consumer price index from big data

Figure 3.2 The process of using big data in calculating CPI
Source: The author

The Consumer Price Index collected from big data is calculated in four steps. In addition, the research computes the Consumer Price Index by combining it with price information obtained through traditional methods (Combination 1 and Combination 2).
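As a rough illustration of what combining a big-data index with a traditional index can look like, here is a minimal Python sketch. It assumes the overall index is a weighted arithmetic mean of the two sub-indices; the summary does not state the exact aggregation formula. The function name `combine_indices` and every number below are hypothetical and are not taken from the dissertation.

```python
def combine_indices(index_a, weight_a, index_b, weight_b):
    """Weighted arithmetic mean of two price indices (base period = 100).

    Assumption: the two weights (for example urban and rural expenditure
    shares, or enterprise and individual revenue shares) are non-negative
    and are normalized here so that they sum to 1.
    """
    total = weight_a + weight_b
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return (index_a * weight_a + index_b * weight_b) / total


# Hypothetical inputs, for illustration only.
big_data_cpi = 101.2      # index computed from online prices
traditional_cpi = 100.6   # index computed from survey prices
combined = combine_indices(big_data_cpi, 0.7, traditional_cpi, 0.3)
print(f"{combined:.2f}")
```

Normalizing inside the function means the weights can be passed either as shares or as raw totals (for example monthly retail revenue per sector), which matches the idea of a weight that is recomputed periodically.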
Specifically, the two combined calculation methods are as follows:

Combination 1: The Consumer Price Index collected from big data is treated as representing the urban area. The overall Consumer Price Index is synthesized from the big data Consumer Price Index, with the expenditure weight for the urban area, and the traditional Consumer Price Index (rural area), with the expenditure weight for the rural area (this weight is fixed and changes once every five years).

Combination 2: The Consumer Price Index collected from big data is treated as representing the enterprise sector. The traditional Consumer Price Index represents the individual sector, since most of its data is collected from traditional markets. The overall Consumer Price Index is synthesized from the big data Consumer Price Index, with the revenue weight for the enterprise sector, and the traditional Consumer Price Index, with the revenue weight for the individual sector. This weight can be calculated monthly from total retail revenue broken down by product category.

The Consumer Price Index is calculated on the product categories that correspond between the two sectors: online collection and traditional collection. The horizontal weight is used to calculate the city's Consumer Price Index across the product category levels and for the overall index. The horizontal weight is the proportion of total retail sales in Ho Chi Minh City for the enterprise and individual sectors, categorized by product group.

3.5 Constructing Hedonic Model: The case of laptop prices

Based on previous research models for desktop and laptop computers by authors such as Ernst R. Berndt et al (1995), Paul Chwelos (2003), and Zafar and Himpens (2019), and on the characteristics of laptop computers suggested by experts to align with current technologies, the proposed model for this study is a log-linear (ln-lin) model, estimated by Ordinary Least Squares (OLS). The software supporting
regression analysis and descriptive statistics is STATA. The research model examines the factors influencing laptop prices with the following variables. Dependent variable: laptop selling price. Independent variables: CPU type; CPU processing speed; RAM type; RAM capacity; hard drive type; SSD or M.2 SATA drive capacity; HDD capacity; touchscreen monitor; flip-rotate monitor; 4K monitor; Ultrasharp monitor; screen size; battery capacity; smart battery function; laptop weight; DVD drive; and laptop brand.

CHAPTER 4: RESULT AND DISCUSSION

4.1 An overview of Ho Chi Minh City

According to the 2022 statistical yearbook of the General Statistics Office, Ho Chi Minh City has an area of 2,095.4 km², or 0.6% of the whole country, and a population of 9,389.7 thousand people, or 9.4% (General Statistics Office, 2022). Ho Chi Minh City is one of the provinces and cities in the southern key economic region, serving as the economic center and leading economic locomotive of the country, and it consistently records a higher economic growth rate than the national average.

Ho Chi Minh City is a dynamic and innovative economic hub, always at the forefront of economic development nationwide. Its high economic growth rate contributes significantly to overall GDP, with the Gross Regional Domestic Product (GRDP) accounting for nearly 23% of the country's GDP. The city's budgetary contributions make up over 30% of the national budget, and approximately 38% of all businesses in the country operate within the city's boundaries.

4.2 The situation of e-commerce activities in Ho Chi Minh City

According to the results of the e-commerce survey in 2019, Ho Chi Minh City had 65.5% of households using payment methods, which was 22.9 percentage
points higher than in 2018 (when the rate of households using payment methods was 42.6%). The proportion of businesses with websites was 35.7%, and 86.9% of businesses placed or received orders through electronic means. The acceptance rate of payment cards in supermarkets and shopping centers has been 100% from 2017 to the present.

4.3 Number of websites and quantity of items collected

A total of 246,069 items were collected from 29 websites and encoded for calculating the Consumer Price Index. The website with the largest number of items is yes24.vn, with 56,334 products. This website offers a wide range of product categories, including electronics, computers, mobile phones, cosmetics, fashion for both men and women, footwear, belts, toys for children, maternity and baby products, household items, kitchen utensils, sports equipment, jewelry, and interior decorations. The website with the fewest items is foodandy.vn, which offers 36 products and specializes in beef, lamb, and salmon.

To construct the Hedonic regression model for laptops, the thesis collected data from the online laptop store website (https://laptopxachtayshop.com), comprising 974 different laptop models with varied prices, configurations, weights, and brands.

4.4 Results of the Hedonic regression model for laptop products

The results of the Hedonic regression model for laptops show that the laptop price is influenced by various factors: (1) Central Processing Unit (CPU) type, (2) Random Access Memory (RAM) type, (3) Memory capacity, (4) Hard drive type, (5) Screen type, (6) Screen size, (7) Battery type, (8) Laptop weight, and (9) Brand. Compared with previous studies, the factors affecting laptop prices in this research align with studies by Baker (1997), Paul Chwelos
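The log-linear hedonic model described in Sections 1.5 and 3.5 regresses the logarithm of the laptop price on product characteristics by OLS. The Python sketch below is only an illustration of that technique under stated assumptions: the dissertation used STATA, whereas this example solves the OLS normal equations in plain Python; the three characteristics (RAM in GB, an SSD dummy, weight in kg) are a small hypothetical subset of the variables listed above; and all prices are invented. It also shows how a fitted model could impute a price for a model that has left the market, which is one of the uses named in Objective 3.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is (near) singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_loglin(features, prices):
    """OLS fit of ln(price) = b0 + b1*x1 + ... + bk*xk (normal equations)."""
    rows = [[1.0] + list(f) for f in features]   # prepend the intercept
    y = [math.log(p) for p in prices]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear(xtx, xty)


def predict_price(coef, feature):
    """Price implied by the fitted model for one set of characteristics."""
    return math.exp(coef[0] + sum(b * x for b, x in zip(coef[1:], feature)))


# Hypothetical toy sample: (RAM in GB, SSD dummy, weight in kg).
features = [
    (4, 0, 2.2), (8, 0, 2.0), (8, 1, 1.8),
    (16, 1, 1.5), (16, 1, 1.3), (32, 1, 1.4),
]
prices = [9.5, 12.0, 15.5, 22.0, 25.0, 33.0]   # invented, in million VND

coef = fit_loglin(features, prices)
# Impute a price for a model that is no longer on sale (hypothetical specs).
imputed = predict_price(coef, (16, 1, 1.6))
print([round(b, 4) for b in coef], round(imputed, 2))
```

In a log-linear model each slope coefficient is read as the approximate proportional change in price for a one-unit change in the characteristic, which is why the prediction is exponentiated. A fuller analysis would add the t-test, F-test, variance inflation factor, and White test mentioned in Section 1.5, which this sketch leaves out.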

Date posted: 04/03/2024, 13:31
