INTRODUCTION
Coffee
Vietnam is the world’s second-largest producer of coffee. Today, coffee culture is huge in Vietnam, where you can buy a cup for under a dollar at the thousands of street stalls in every city, as well as in coffee shops and restaurants. The vast majority of coffee in Vietnam comes from the Robusta species, grown mostly in the highlands. Robusta coffee is generally stronger, nuttier, and darker than coffee made from Arabica.
Coffee bean quality is affected by species and by the methods of cultivation and harvest. Cultivation factors in particular include planting soil, topography, and climate. After harvesting, coffee beans pass through physical and chemical processing steps such as washing, drying, sorting, heating, and grinding.
Covid-19 accelerated this trend, forcing drinkers to make coffee at home rather than relying on coffee shops. Over the next two years, the trend is expected to develop further as coffee-making technology improves and machine prices fall.
Figure 1.1 Global: coffee, new product launches, by format, 2018-2020 [17]
Source: Mintel GNPD
Coffee brands are far more sustainable than a decade ago, but future sales rely on them getting more serious still. Post-pandemic, consumer anxiety will focus on environmental harm and seek scapegoats. A more activist younger generation will show less tolerance for waste, especially pods that are recyclable but rarely recycled.
Figure 1.1 formats shown: whole bean, ground, pods/capsules, soluble/instant, coffee mixes, RTD (iced) coffee.
COVID-19 has heightened societal awareness of inequalities, particularly the disparity between the lucrative coffee industry and the meager earnings of farmers. Despite prevalent fair trade claims, many farmers remain underpaid. Climate change poses a significant threat to the coffee supply chain and the livelihoods of farmers. To mitigate this risk, coffee brands must assume a proactive role in supporting farmers' adaptation efforts.
Figure 1.2 Global: new coffee launches, by key macro-trends, 2010-2020 [18]
* Health includes the following claim categories: Functional, Plus, Natural and Minus; among food and drink categories
Consumer trend on coffee label
A growing body of sensory consumer research has confirmed that extrinsic product cues, such as packaging and branding, influence how consumers evaluate food products (Deliza & MacFie, 1996). It is important for practitioners and researchers to understand the interplay of sensory and non-sensory attributes, as both dimensions have to be optimised for a product to be successful in the marketplace. While it is agreed that extrinsic characteristics can both increase and decrease consumer acceptance of a product that is well liked in blind conditions, little is known about the relative effect of extrinsic cues on informed product evaluation when multiple cues such as branding, labelling, packaging and price are present. Knowledge of their relative importance would guide practitioners to focus on the most important drivers [39].
For example, according to market research, clean-label coffee is now a consumer expectation in developed markets and a trend led by the US. More US coffee drinkers would buy coffee making clean/natural claims than coffee making single-origin or small-batch claims. US brands such as Caribou Coffee are accentuating how they “have nothing to hide” when it comes to coffee ingredients.
Organic certification provides “proof” of clean label and is especially important to Millennials (b. 1978-94). In 2019, 11% of all global coffee launches made the organic claim, for the second consecutive year.
Figure 1.3 US: selected factors which would encourage coffee purchase, 2019 [18]
*Defined in this case as North America + Europe + Australia
Base: 1,600 US internet users aged 18+ who drink any coffee beverage
Source: Lightspeed/Mintel; Mintel GNPD
Coffee has been found to be a product for which the evaluation of intrinsic sensory characteristics is strongly affected by extrinsic attributes [39]. It was therefore chosen as an especially suitable product category for this study.
Figure 1.4 Environmental and ethical claims dominate new coffee launches [17]
Global: coffee, new product launches, top five claims, January to mid-December 2019. Source: Mintel GNPD
National situation- Our problem
Coffee is an important commodity, contributing 3% of the country's GDP, with an average export turnover of 3 billion USD per year. To develop the export market for this commodity in the future, branding must be a focus. Statistically, Vietnam's coffee has been exported to more than 80 countries and territories, accounting for 14.2% of the global green coffee export market (ranked 2nd, after Brazil). In particular, exports of roasted and ground coffee account for 9.1% of the market (ranked 5th, after Brazil, Indonesia, Malaysia and India). In addition, the active support of ministries and branches in processing capacity, market expansion and export reorganisation, along with the initiative and efforts of Vietnamese enterprises in promotion, marketing and brand positioning, will help Vietnamese coffee products increasingly assert their position in the international market. Currently, the State and ministries are orienting Vietnam's coffee industry towards modern, synchronous, sustainable and highly competitive development, with diversified, high-quality, high added-value products that increase income for farmers and businesses [36].
To achieve the goal of an export turnover of 6 billion USD by 2030 and to increase the added value of Vietnamese coffee products, the coffee industry needs synchronous solutions:
- Regarding production and processing, the restructuring of the coffee industry must be promoted effectively
- Brand building must receive more attention
- Enterprises need to survey market demand in terms of market share, taste, quality and price, and thereby determine the appropriate proportion of processed products
- Regarding trade promotion, production and business activities must be adjusted in accordance with market signals [36]
Thus, to protect coffee that meets quality standards and to highlight the uniqueness of each coffee type and the reputation of its growing region, more attention must be paid to the geographical origin of products through Geographical Indication (GI) claims.
Geographical Indication (GI)
Geographical indications (GIs) are signs that identify goods originating from a specific place and possessing a given quality, reputation or other characteristic that is essentially attributable to that geographical origin. GIs enable consumers to differentiate products, as they pay increasing attention to the geographical origin of products. One of the key benefits of GIs for consumers is therefore the guarantee of product quality [15].
Geographical Indications (GIs) safeguard the distinctiveness and reputation of products originating from specific geographical locations. By protecting producers from unfair competition and misappropriation, and enabling them to fetch premium prices, GIs foster the diversification and competitiveness of industrial, agricultural, and handicraft sectors. GIs empower consumers by providing accurate information on product origin and characteristics, boosting trust and confidence in the marketplace. Moreover, GIs promote international trade, facilitating the exchange of authentic regional products and supporting rural development through job creation and increased incomes. GIs also benefit producers and other stakeholders in the value chain, and can promote the region as a whole through the development of tourism.
GIs are a means to preserve traditional knowledge and local biodiversity, since products identified by a GI are often the result of traditional processes and knowledge carried forward by a community in a particular region. Since the implementation of the World Trade Organization (WTO) Agreement on Trade-Related Aspects of Intellectual Property Rights (TRIPS Agreement) (1994), GI protection systems, which started in southern Europe, have expanded remarkably worldwide, in particular in Asia. Indeed, all countries of the Association of Southeast Asian Nations (ASEAN) have opportunities to develop high-quality products with a strong geographical identity and have strongly engaged in the identification and registration of GIs as a tool to expand their presence on domestic and international markets.
As of January 2019, 346 GIs had been registered in ASEAN countries, including 37 foreign GIs, which shows the strong interest of ASEAN countries in GI protection. Including both local and foreign GIs, Cambodia has 3 registered GIs; Indonesia 74; Lao PDR 1; Malaysia 84; Thailand 115; and Viet Nam 69. To date, eight GIs from the ASEAN region have been registered in the EU market: Kampot Pepper (pepper, Cambodia, registered in 2016), Skor Thnot Kampong Speu (sugar, Cambodia, registered in 2019), Kopi Arabika Gayo (coffee, Indonesia, registered in 2017), Nuoc Mam Phu Quoc (fish sauce, Viet Nam, registered in 2012) and four GIs from Thailand: Khao Hom Mali Thung Kula Rong-Hai (rice, 2013), Kafae Doi Chaang (coffee, 2015), Kafae Doi Tung (coffee, 2015), and Khao Sangyod Muang Phatthalung (rice, 2016).
One of the key benefits of GIs for producers is an increase in the price of the product. In the EU, the price of a GI product has been estimated at 2.23 times the price of a comparable non-GI product (on average, 1.5 times more for agro-food products). Another worldwide study estimates that GI premiums lead to prices 20% to 50% higher than those of comparable non-GI products. In the ASEAN region, according to the data provided by the concerned IP offices in the booklet, GIs show a positive impact in terms of volumes, prices, and local development. For example, pepper prices have surged for consumers despite stable international prices: Kampot white pepper (Cambodia) prices rose 2.6-fold between 2009 and 2018, Muntok white pepper (Indonesia) prices rose sixfold between 2009 and 2015, and the bulk price of Sarawak pepper (Malaysia) increased by a factor of 4.32 from 2003 (before GI registration) to 2016 (after GI registration). Other successful GIs are in the area of coffee: the farm gate price of Flores Bajawa Arabica Coffee red berries (Indonesia) increased by a factor of 2.2 between 2005 and 2015, although this increase remains unstable. For Doi Chaang Coffee (Thailand), the price of coffee berries rose by a factor of 2. Buon Ma Thuot coffee from Viet Nam benefits from an added value of 2–3% compared with standard comparable coffee. Fruits also benefit greatly from GI protection: the farm gate price of Koh Trung pomelo (Cambodia) increased by a factor of 1.33, and the price of Pakpanang Tabtimsiam Pomelo (Thailand) by a factor of 1.75.
As of January 2019, Viet Nam had registered 69 GIs, including Buon Me Thuot coffee (2005), Moc Chau Shan Tuyet tea (2010), Phu Quoc fish sauce (2012), Son La coffee (2017), and Binh Phuoc cashew (2018).
Other key benefits of GIs are the structuring of the GI product value chain and the creation of a collective organisation of producers and processors to manage the GI, for example the Community for the Protection of Geographical Indication of Amed Bali Salt (Indonesia). Agro-tourism, another key benefit, was developed in the Sarawak Pepper (Malaysia) area largely thanks to the GI. Coffee festivals have been organised in Buon Ma Thuot (Viet Nam) since the GI registration. Finally, the GI Khao Kai Noi (Laos) is expected to help preserve traditional rice varieties.
1.4.2 GI for Buon Me Thuot Coffee in Vietnam
Coffee beans from the specified geographical area of Buon Me Thuot were registered as a GI on 14 October 2005. The product has specific characteristics and production and processing methods that give it unique sensory properties [49].
Figure 1.5 Logo of the Geographical Indication of Buon Me Thuot coffee
• The main characteristics of coffee bean:
- Bean colour: greyish-green, green or light
- Bean size: 10-11 mm long, 6-7 mm wide and 3-4 mm thick
- Flavour: typical of coffee when roasted to a suitable level
- Aroma: attractive, typical, with medium to high intensity (typical trait)
- Body: average to high (typical trait)
- Selected varieties belong to the Robusta genetic group. Seeds or buds for grafting must be provided by licensed seed production units
- Shade trees: must block at least 20% of direct sunlight
- Irrigation: sufficient water supplied during the dry season
- Organic fertilization: 10-20 tonnes of manure/ha/year
- Chemical fertilization, plant protection measures and pruning: based on soil analysis and the guidelines of technical extension workers
- Harvesting: hand-picked, ensuring at least 90% ripe fruits
- The Buon Me Thuot coffee beans are processed from the fresh fruits of the Robusta coffee tree by the wet (fully washed) or dry (natural) method
- Consisting of districts: Cu M’gar, Ea H’leo, Krong Ana, Cu Kuin, Krong Buk, Krong Nang, Krong Pak, Buon Ho town; and Buon Me Thuot City of Dak Lak province (Cu Kuin separated from Krong Ana; Buon Ho separated from Krong Buk)
- Soil for coffee planting: red-brown basaltic soil. Soil depth and slope: the basaltic soil is at least 0.7 m deep, with a maximum slope of 15º
- Altitude of coffee planting region (above sea level): the coffee trees are planted within an altitude range of 400-800 m. This range ensures a high diurnal temperature difference in the ripening season, which contributes to the high coffee quality
- Temperature and diurnal temperature difference of coffee planting region: yearly average temperature 24–26 ºC; diurnal temperature difference in the fruit ripening season above 11.3 ºC
• Geographical Indication management body/association
- Buon Ma Thuot Coffee Association
- Department of Science and Technology of Dak Lak Province
• Number of producers using GI
- 12 collective producers (including 15,000 coffee farmers)
• Volume of coffee bean sold with GI
- Value added from GI: 2–3% compared with commercial coffee
• Other advantages from the GI
- Maintains the sustainability of coffee production
- Preserves the pride of coffee producers with GI reputation/image
- Contributes to local cultural events (coffee festivals, competitions)
- Contributes to improving the livelihood of coffee farmers
- Buon Ma Thuot Coffee was registered as a trademark in China in April 2011 by a Chinese trading company. Following action by the Buon Ma Thuot Coffee Association, the trademark registration was cancelled in May
Currently, GI coffee is not popular in Viet Nam's domestic market and is produced mainly for export. It commands a high price (approximately 260,000-280,000 VND/kg, compared with 220,000-250,000 VND/kg for ordinary coffee) because its sensory quality is assured and it brings reputation to Viet Nam. However, it faces fraud concerning quality and coffee type, as well as falsified geographical information. We therefore need to authenticate Buon Me Thuot GI coffee against ordinary coffee in Viet Nam, especially coffee from the Central Highlands.
OVERVIEW
Methods for authentication
To protect coffee quality and prevent fraud concerning coffee type and falsified geographical information, we need to distinguish high-quality coffee from low-quality coffee. In this case, we need methods that distinguish coffee that meets the GI standard from coffee that does not.
Nowadays, numerous methods are available for the authentication of coffee, ranging from biological (DNA-based) methods and chemical-biological (enzymatic and immunological) methods to chemical methods based on spectroscopy and chromatography. DNA fingerprinting, an extremely reliable method, uses genetic characteristics to discriminate between strains and allows the quantitative determination of foreign species at low levels. Chromatographic methods such as HPLC and GC-MS, which are also popular for authentication, offer various advantages including high capacity, reproducibility, sensitivity, and versatility. Despite their different approaches and benefits, these and many other current methods share one characteristic: they are laboratory based. Laboratory-based methods have significant drawbacks: they are cumbersome and time consuming, require highly skilled technicians, and may involve destruction of samples along with substantial use of chemicals [12].
In a competitive world in which global trade plays an important role in every country's economy, the limitations of laboratory-based methods can cause losses, which has urged scientists to develop more appropriate methods.
In recent studies, innovative spectroscopic fingerprinting techniques have proved to be an affordable, rapid, chemical-free, comprehensive and, above all, non-destructive tool for a variety of products. Furthermore, when the acquired spectral data are combined with chemometrics, each sample yields a unique chemical profile, which allows subtle differences to be detected rapidly. Examples of applications to agricultural products include the authentication of Basmati rice, coffee bean cultivars [14], green asparagus cultivars [4], fresh versus frozen-then-thawed beef [37], and the measurement of adulteration of olive oils [33].
These studies have demonstrated the effectiveness of spectroscopy: if an applicable database is available, comparing the fingerprints of authentic samples with those of mixed or mislabelled samples can reveal adulteration within a few minutes, and even non-specialist users can carry out the analysis.
Infrared spectroscopy
The general concept of spectroscopy is to obtain information on the structure and properties of matter. The basic principle shared by all spectroscopic techniques is to shine a beam of electromagnetic radiation onto a sample and observe how it responds. The most common types of spectroscopy used nowadays are atomic, UV-Vis, nuclear magnetic resonance, Raman and infrared.
Figure 2.1 IR spectroscopy regions (Mani, Mani & Pro, 2020)
Infrared (IR) spectroscopy, divided into near, mid, and far regions based on wavelength, generates absorption spectra (expressed as absorbance or transmittance). Two conditions must be met for a molecule to absorb IR radiation. First, the vibration must change the molecule's dipole moment: when charge separation exists, the molecule couples with the sinusoidally oscillating electromagnetic field and vibrates with a greater amplitude. Second, the IR radiation must have the energy required for a transition to a higher vibrational state. Furthermore, substances with different characteristics absorb at different wavelengths. Because of these conditional and specific interactions of molecules with IR radiation, an IR spectrum is regarded as a “fingerprint”: each spectrum is unique and provides specific qualitative information about chemical nature and molecular structure. It follows from the first condition that homonuclear diatomic molecules such as O2 and N2 do not absorb IR radiation, and therefore do not contribute unwanted spectral features, because their rotational and vibrational motions do not change the dipole moment.
On the other hand, carbon dioxide and water (atmospheric and adsorbed) can interfere with the analysis of weak sample peaks because of their strong additional absorption peaks. To minimise this, the analysis should be conducted under stable conditions so that the CO2 and H2O contributions can be determined and removed from the spectra during data handling [12].
Within IR spectroscopy, NIR is sensitive to absorption by food components, responds quickly, requires a simple procedure, and has a low instrumentation cost; these advantages have established it as a promising technique for food authentication.
In the NIR region, spectra are formed mostly by absorption of simple molecular groups with strong interatomic bonds, such as O-H, N-H and C-H, which give rise to overtones and combination tones of molecular vibrations. These bonds are representative of food characteristics because of their abundance in food components [44].
The main features of NIR absorption are as follows. The 1st and 2nd overtones of the overlapping fundamental stretching vibrations of O–H and N–H correspond to the NIR bands at 6825 and 1000 nm. The 1st, 2nd, and 3rd overtones of the fundamental stretching vibration of C–H appear in the NIR at 1780, 1200, and 920 nm. The fingerprint regions are represented in NIR spectroscopy as combination bands, such as that of amide at 2100 nm and that of C–H stretching from 2280 to 2330 nm [34]. The overtone bands dominate the NIR spectrum from 1900 nm and include those of O–H and N–H at 1934 nm [26]. This occurs partly because the anharmonic constant of an X-H bond is large and partly because the fundamentals of X-H stretching vibrations are of high frequency (short wavelength) [50].
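Band positions in the NIR literature are quoted either as wavelengths (nm) or as wavenumbers (cm^-1), so a small conversion helper can be useful when comparing sources. The sketch below is illustrative only; the function names are hypothetical and the listed bands are simply those quoted in the paragraph above.

```python
def nm_to_wavenumber(wavelength_nm):
    """Convert a wavelength in nanometres to a wavenumber in cm^-1."""
    # 1 cm = 1e7 nm, so wavenumber (cm^-1) = 1e7 / wavelength (nm).
    return 1e7 / wavelength_nm


def wavenumber_to_nm(wavenumber_cm1):
    """Convert a wavenumber in cm^-1 back to a wavelength in nanometres."""
    return 1e7 / wavenumber_cm1


# C-H overtone and amide band positions quoted in the text (nm).
bands_nm = [920, 1200, 1780, 2100]
for nm in bands_nm:
    print(f"{nm} nm -> {nm_to_wavenumber(nm):.0f} cm^-1")
```

Shorter wavelengths map to larger wavenumbers, which is consistent with the statement that high-frequency X-H vibrations appear at short wavelengths.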
When NIR analysis is conducted on granular samples such as coffee beans, the sample presentation mode is diffuse reflectance, since the incident light is significantly scattered, and light scatter is arguably the most important numerical contribution to the NIR measurements collected. This phenomenon is caused by interaction with a variety of angular surfaces from which the light is reflected specularly. Specularly reflected light contains no information about the composition of a sample and may be redirected back along the path of incidence to the detector; scattering increases the intensity of light returning to the detector but also increases the variability of the baseline because of the variable path length of individual photons [14]. The light detected by diffuse reflectance is therefore a combination of light that has interacted with the sample and been partly absorbed, and scattered light that has not interacted with the sample [14]. This can have a large influence on the spectrum generated, since the ratio between reflected light (absorbed and scattered) and incident light determines the absorption profile. Sample presentation is therefore extremely important in order to minimize light scatter and, as far as possible, keep the level of scattering constant for each sample [14], [19]. Attempts have been made to develop a mathematical basis to describe light scatter and to accommodate its effects on NIR spectra, but no completely successful strategy has been forthcoming. This has led to the study of spectral pre-treatment to address the problem, which is discussed further in the next part [45].
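As a point of reference, diffuse reflectance measurements are conventionally converted to an apparent absorbance from the ratio of reflected to incident light. The symbols below ($R$, $I_r$, $I_0$, $A$) are notation assumed here for illustration and are not taken from the cited sources.

```latex
% R: reflectance, I_r: intensity reaching the detector, I_0: incident intensity
R = \frac{I_r}{I_0}, \qquad A = \log_{10}\!\left(\frac{1}{R}\right)
```

Because scattered light contributes to $I_r$ regardless of composition, a change in scattering shifts $A$ even when the sample composition is unchanged, which is why scatter must be kept constant or corrected.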
Advantages of NIR spectroscopy:
- Rapid and simultaneous analysis of multiple samples
- Non-toxic and environmentally friendly, since it reduces the amount of chemicals used
- Reduces analytical labour and saves costs
- Simple, quick procedure that simplifies sample preparation and avoids destroying samples during analysis
- Capable of quantifying one substance in the presence of other substances
- Applicable to both inorganic and organic analyses
- Available in portable size, so that measurements can be carried out on site
Disadvantages of NIR spectroscopy:
- Low signal sensitivity, which can limit the determination of low-concentration components with a content of less than 2.5μm (trace)
- Prediction models must be built and updated regularly for each sample background
- Devices must be continuously calibrated to ensure accuracy
SCiO handheld NIR device from Consumer Physics
Near-infrared spectroscopy has been widely used in the horticultural industry as a non-destructive tool to provide quality prediction of fresh and stored products. In the study reported in [30], a low-cost portable NIR sensor, the SCiO™ molecular sensor (Consumer Physics Inc., Tel-Aviv, Israel), was assessed for its ability to provide this information. Fruit samples of kiwifruit, apple, feijoa, and avocado were collected, and their spectral and quality measurements were obtained in order to develop NIR predictive models [30]. The performance of the SCiO™ sensor for quality prediction was assessed by developing estimation or classification models using the SCiO™ Lab online application and then comparing them with those of existing commercial NIR spectrometers. A rapid and economical sensor like the SCiO™ would enable wider industrial applicability of the NIR technique and potentially provide fast sorting and screening capability to assist with quality predictions and decision-making throughout the supply chain.
The principle of the device can be described as follows. Light from the light source passes through a filter that isolates the near-infrared wavelengths. The near-infrared light is then directed onto the product being measured at the lamp end, as shown in Figure 2.2. The light reflected by the product is captured by the SCiO molecular sensor, collected by an integrated spherical mirror and focused onto a detector. The spectral signal is then processed using the device's calibration model and displayed on the screen in digital or spectral form.
Figure 2.2 SCiO handheld NIR spectrometer (Consumer physics)[31]
(1) Molecular sensor (2) Light emitter (3) Functional button (On/Off/Calibration) (4) Battery indicator light (5) USB charging port (6) LED light (7) Calibrator/Cover
GI coffee authentication model
Pre-processing aims to extract more information from the data without being misled by unwanted information, using various methods with different ranges of application. As in other statistical problems, and particularly with spectral data, the nature of the measurement introduces noise in the form of baseline variations, peak shift or drift, or, most commonly, overlapped peaks. These are usually caused by NIR light scattering, FTIR sensitivity to CO2 and H2O, the influence of particle size, and measuring conditions such as temperature and humidity. Depending on the type of noise and the main purpose of the treatment, pre-processing methods can be categorized into two main groups: Smoothing, and Correction and Enhancement [5]. Smoothing methods, such as the Savitzky-Golay moving window or the K-Neighbor method, aim to smooth out noise without eliminating valuable information. Correction and Enhancement methods mainly focus on enhancing peaks, removing baselines or correcting peak shift and drift (e.g., Derivative, Detrend), or on removing light scatter (e.g., Multiplicative Scatter Correction, MSC). However, smoothing can be overdone and remove valuable information, while correction can amplify unwanted noise [11], [28]. As each method has its own advantages and disadvantages, the selection of pre-processing methods for a given data set is still an open problem. In practice, methods tend to be combined rather than used alone, so that the strength of one covers the weaknesses of another; for instance, the over-enhancement caused by second derivatives can be reduced by applying a Savitzky-Golay moving window for some smoothing effect. One solution so far is a trial-and-error procedure, in which different methods are tested sequentially to see which is the most suitable [38]. In summary, the purpose of all these methods is to increase the signal-to-noise ratio (SNR), which makes the data easier to visualize, helps integrate the signal and, above all, helps the main treatment method work at its highest efficiency.
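To make the combination of a smoothing step and a scatter-correction step concrete, the following is a minimal pure-Python sketch, not the pipeline used in this study. It assumes a fixed 5-point quadratic Savitzky-Golay window (using the standard tabulated weights) and MSC against the mean spectrum; the function names and the order of the two steps are illustrative choices.

```python
from statistics import mean

# Standard 5-point quadratic Savitzky-Golay smoothing weights.
SG5_WEIGHTS = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def savgol5(spectrum):
    """Smooth one spectrum with a 5-point quadratic Savitzky-Golay window.

    Simplification: the two points at each end are left unchanged.
    """
    out = list(spectrum)
    for i in range(2, len(spectrum) - 2):
        window = spectrum[i - 2:i + 3]
        out[i] = sum(w * x for w, x in zip(SG5_WEIGHTS, window))
    return out


def msc(spectra):
    """Multiplicative Scatter Correction against the mean spectrum.

    Each spectrum x is regressed on the mean spectrum m (x ~ a + b * m)
    and corrected as (x - a) / b.
    """
    n_points = len(spectra[0])
    reference = [mean(s[j] for s in spectra) for j in range(n_points)]
    ref_mean = mean(reference)
    ref_ss = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for s in spectra:
        s_mean = mean(s)
        slope = sum((r - ref_mean) * (x - s_mean)
                    for r, x in zip(reference, s)) / ref_ss
        intercept = s_mean - slope * ref_mean
        corrected.append([(x - intercept) / slope for x in s])
    return corrected


def preprocess(spectra):
    """Example pipeline: smooth each spectrum, then correct for scatter."""
    return msc([savgol5(s) for s in spectra])
```

In a trial-and-error setting, alternative combinations (for example a derivative followed by smoothing) would be swapped in at `preprocess` and compared on the downstream model's performance.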
Before the model-building procedure is explained in detail, some foundations of data analysis are reviewed. Statistical models fall into three main categories: exploratory analysis (e.g., PCA, HCA), regression analysis (e.g., PCR, PLS regression) and classification analysis (e.g., PLS-DA, SIMCA). In this study, we focus on regression and classification for building coffee authenticity models.
In a regression model, the independent X variable(s) explain a quantitative Y response. In simple linear regression, one X predictor explains one quantitative Y response. When more than one X predictor explains one quantitative Y response, multiple regression is used; when multiple X predictors explain more than one quantitative Y response, multivariate regression is applied.
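The three regression settings can be written compactly as follows. The symbols ($b_0$, $b_j$, $e$, $p$) are notation assumed here for illustration and are not taken from the cited sources.

```latex
% Simple linear regression: one predictor x, one response y
y = b_0 + b_1 x + e

% Multiple regression: p predictors, one response
y = b_0 + \sum_{j=1}^{p} b_j x_j + e

% Multivariate regression: several predictors (columns of X), several responses (columns of Y)
Y = XB + E
```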
When the main purpose is to predict a qualitative response, however, the approach is referred to as a classification model [2]. The principle behind a classification model is to assign individuals to groups, or categorize them, according to their characteristics. From a geometric point of view, if each individual is a point in a multidimensional space, a classification model shows which group the individual most likely belongs to. According to how each approach finds the class boundaries, we can divide them into Discriminant methods, which find the discriminant line between classes, and Class modeling methods, which find the region occupied by each class [3].
2.4.2.1 SIMCA classification model- Class modeling method
The Class modeling method focuses on finding the similarity among individuals: individuals sharing the same characteristics are gathered into a group. The region of each group is then determined by a dedicated algorithm, and the response indicates whether an individual belongs to the group under consideration. As a result, some individuals may be assigned to several groups at once (when they share characteristics with more than one group) or to no group at all (when they share characteristics with none of the listed groups), since Class modeling only groups individuals with the same characteristics. In the Discriminant approach, by contrast, every individual belongs to exactly one group, the sample space is divided completely into classes, and no overlap occurs (Figure 2.4) [3]. Soft-Independent Modeling of Class Analogies (SIMCA), Unequal (UNEQ) and Artificial Neural Network (ANN) are examples of Class modeling methods [37].
Figure 2.3 Graphical representation of the general distinction between discriminants (Marini F, 2007) [2]
Soft-Independent Modeling of Class Analogies (SIMCA) is based on the assumption that the natural variability of objects belonging to the same category can be captured by a principal component (PC) model, as in equation (1) [50]:
$$X_g = T_g P_g^{T} + E_g \tag{1}$$
where $X_g$ is the matrix of empirical measurements collected on samples belonging to class $g$, $T_g$ and $P_g$ are the PC scores and loadings extracted from the PC model, respectively, and $E_g$ is the residual matrix.
Class assessment consists of building the model space and calculating the distance $d$ of each object to the class space as the sum of two contributions: the distance from the model space (orthogonal distance) and the distance within the scores space (score distance). The orthogonal and score distances correspond to the $Q$ and $T^2$ statistics, calculated as the sum of squares of the model residuals and as the Mahalanobis distance from the centre of the scores space, respectively, so the distance of the $i$-th sample can be calculated as [11]:
$$d_{i,g}^{2} = \left(T_{i,g}^{2}\right)^{2} + \left(Q_{i,g}\right)^{2} \tag{2}$$
Once the number of components of the PC model has been determined, new data are projected onto the model to obtain $T^2_{new}$ and $Q_{new}$. The individual distance $d_{i,g}$ is then calculated from $T^2$ and $Q$ normalized by their 95th percentiles under the null hypothesis, so that equation (2) becomes:
$$d_{i,g}^{2} = \left(\frac{T_{i,g}^{2}}{T_{0.95,g}^{2}}\right)^{2} + \left(\frac{Q_{i,g}}{Q_{0.95,g}}\right)^{2}$$
Finally, the $i$-th sample is accepted into class $g$ if $d_{i,g}^{2} \le 2$; otherwise it is rejected.
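The acceptance rule can be sketched in a few lines, assuming the $T^2$ and $Q$ statistics of a sample and their 95th-percentile critical values have already been obtained from the class PC model. The function and parameter names are hypothetical, and the sketch covers only the decision step, not the PCA itself.

```python
def simca_distance_sq(t2, q, t2_crit, q_crit):
    """Squared combined distance from the normalized T^2 and Q statistics.

    t2, q           : T^2 and Q of the sample for the class model
    t2_crit, q_crit : their 95th-percentile critical values (assumed given)
    """
    return (t2 / t2_crit) ** 2 + (q / q_crit) ** 2


def simca_accept(t2, q, t2_crit, q_crit, threshold=2.0):
    """Accept the sample into the class if the squared distance is <= threshold."""
    return simca_distance_sq(t2, q, t2_crit, q_crit) <= threshold


# A sample well inside both limits is accepted.
print(simca_accept(t2=1.0, q=0.5, t2_crit=2.0, q_crit=1.0))
# A sample with T^2 at 1.5 times its limit is rejected.
print(simca_accept(t2=3.0, q=0.5, t2_crit=2.0, q_crit=1.0))
```

Because each class is tested separately, the same sample can be accepted by several class models or by none, which is the behaviour described for Class modeling methods.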
2.4.2.2 PLS-DA classification model- Discriminant method
The Discriminant method may be more familiar and easier to visualize, since it is quite similar to the basic regression model. These methods focus on how the assigned groups differ from each other. A mathematical algorithm is applied to find the discriminant lines (straight or curved) that split the sample space into classes. The response shows on which side of the discriminant lines an individual falls, which means that every individual can belong to only one group. Examples of these methods are Fisher's linear discriminant analysis [32], Partial Least Squares Discriminant Analysis (PLS-DA), Orthogonal PLS-DA [26], and nonparametric methods such as K-Nearest Neighbor [20] and Support Vector Machine [24].
Partial Least Squares Discriminant Analysis (PLS-DA) is a linear classification tool based on the Partial Least Squares (PLS) regression algorithm. The main idea is to find the optimal number of latent variables (LVs), which are linear combinations of the observable variables, known as manifest variables (MVs), that have maximal covariance with the response Y. This yields a graphical representation for visualizing and understanding the pattern of the data through the latent variable T scores and P loadings. The P loadings are the coefficients of the linear combinations that define the LVs, and the T scores are the coordinates of the samples in the LV projection hyperspace [3]. The classification problem can then be reduced to the regression equation (3), from which the classification boundaries are found:
$$Y = XB + E \quad (3)$$
where X is the matrix of empirical measurements collected on the training samples (n samples x p features), B is the regression coefficient matrix, Y is the dummy matrix (n samples x g groups), and E is the residual matrix [3].
The dummy matrix is built in advance. This binary matrix has dimension n samples x g groups: each row represents an individual sample and each column represents class membership. Each entry $y_{ig}$ of Y indicates whether the i-th sample belongs to the g-th group, encoded in binary (0 = no, 1 = yes). For instance, in a three-class problem with Groups 1, 2 and 3, the three vectors $y_1$, $y_2$ and $y_3$ are coded as:
$$y_1 = [1\ 0\ 0]; \quad y_2 = [0\ 1\ 0]; \quad y_3 = [0\ 0\ 1] \quad (4)$$
PLS is then calibrated on the dummy matrix Y, and the coefficient matrix B is used to predict the response $Y_{new}$ from $X_{new}$, so that equation (3) becomes:
$$Y_{new} = X_{new} B$$
$Y_{new}$ keeps the samples x groups layout of the dummy matrix but not its binary structure: PLS-DA returns continuous values $y^{new}_{i,g}$, representing the probability that the i-th sample belongs to the g-th group (from 0, does not belong, to 1, most likely belongs). Returning to the three-class example, if the 3rd sample has the coordinates [0.02 0.83 0.01], it is classified into group 2, since the second column has the highest value [3].
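The dummy-matrix encoding and the maximum-response assignment rule described above can be illustrated with a short sketch. It deliberately leaves out the PLS regression step itself; the function names are hypothetical, and the predicted row is the three-class example quoted in the text.

```python
def dummy_matrix(labels, groups):
    """Build the binary n-sample x g-group matrix Y (1 = belongs, 0 = does not)."""
    return [[1 if label == g else 0 for g in groups] for label in labels]

def assign_class(y_new_row, groups):
    """Assign a sample to the group with the largest predicted response."""
    best = max(range(len(groups)), key=lambda j: y_new_row[j])
    return groups[best]

groups = [1, 2, 3]
Y = dummy_matrix([1, 2, 3], groups)            # one sample from each group
predicted = assign_class([0.02, 0.83, 0.01], groups)
print(Y)          # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(predicted)  # 2
```

Because the rule always picks the largest response, every sample is assigned to exactly one group, which is the defining property of the discriminant approach.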
In building a model, it is necessary to validate the calibration model to measure its predictive ability. As usual, the data set should be divided into two sets: a training set for building the calibration model (described in the previous part) and a testing set for testing the predictive ability of the built model. Depending on whether the test samples are different from those in the calibration model or not, validation is classified into external validation and internal validation (or cross-validation) [3].
External validation involves utilizing separate training and testing sets. This ensures the testing set is independent, resulting in more accurate estimates of predictive ability. However, this approach requires a sufficiently large sample space and representative sets. To address this, sample-selection algorithms such as the Duplex algorithm and the Kennard-Stone approach are employed. Despite its simplicity, external validation has drawbacks, as the test error estimate can vary with set composition. Additionally, statistical methods may perform poorly with limited observations, leading to potential overestimation of the test error when the testing set comprises a small portion of the entire data set.
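As an illustration of how a representative training set can be selected, the following is a minimal sketch of the Kennard-Stone rule on toy data. It assumes Euclidean distance between samples; the function name and the data are hypothetical and do not reproduce the split used in this study.

```python
import math

def kennard_stone(X, n_train):
    """Return the indices of n_train samples chosen by the Kennard-Stone rule."""
    n = len(X)
    dist = [[math.dist(X[i], X[j]) for j in range(n)] for i in range(n)]
    # Start with the two samples that are farthest apart.
    i0, j0 = max(((i, j) for i in range(n) for j in range(i + 1, n)),
                 key=lambda pair: dist[pair[0]][pair[1]])
    selected = [i0, j0]
    remaining = [k for k in range(n) if k not in selected]
    while len(selected) < n_train:
        # Add the candidate whose nearest selected sample is farthest away.
        nxt = max(remaining, key=lambda k: min(dist[k][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected

# Toy one-dimensional "spectra" (not real data).
X = [[0.0], [1.0], [2.0], [10.0], [5.0]]
train_idx = kennard_stone(X, n_train=3)
test_idx = [k for k in range(len(X)) if k not in train_idx]
print(train_idx, test_idx)  # [0, 3, 4] [1, 2]
```

The selected samples span the extremes and the middle of the data range, so the training set covers the sample space while the remaining samples form the test set.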
Conjoint analysis
Traditional sensory analysis, which focuses on intrinsic product attributes alone, is not sufficient to meet the requirements of today's fast-moving markets. An optimized product formulation is necessary for a successful innovation; however, consumers are also influenced by extrinsic product information such as brand, price or labelling.
Psychologists in particular have long been interested in the effects of the combination of sensory stimuli, both intrinsic and extrinsic, in product evaluation. Conjoint analysis (CA) is a generic expression for stated preference (SP) experimental approaches whereby consumers respond to product profiles characterized by specific attributes varying at specific levels, according to a statistical design of experiments [25]. In essence, this experimental methodology measures the impact of product attributes on consumer preferences.
The attributes studied are usually varied according to a factorial design plan, and each consumer gives scores, either liking or purchase intent, for a few combinations of the attributes. In most cases, different consumer groups respond differently to the attribute combinations. In such cases, it is of great importance for generating marketing strategies to identify the segments and then to interpret them in terms of demographic or other external information (here called consumer variables). Conjoint analysis has proved to be a great commercial success [25].
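To make the idea of a factorial design plan concrete, the sketch below enumerates every product profile of a full factorial design. The attributes and levels are hypothetical, chosen only for illustration; they are not the design used in this study.

```python
from itertools import product

# Hypothetical attributes and levels, for illustration only.
attributes = {
    "gi_label": ["with GI logo", "without GI logo"],
    "price": ["low", "high"],
    "region": ["region A", "region B"],
}

def full_factorial(attributes):
    """List every combination of attribute levels as one product profile."""
    names = list(attributes)
    return [dict(zip(names, levels)) for levels in product(*attributes.values())]

profiles = full_factorial(attributes)
print(len(profiles))  # 8, i.e. 2 x 2 x 2 profiles
print(profiles[0])
```

In practice each consumer usually scores only a subset of these profiles, which is why fractional designs are often derived from the full list.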
Impact of extrinsic characteristics on product evaluation
Prior studies within the expectation disconfirmation framework have employed two approaches to assess the influence of extrinsic cues on sensory perceptions. One strategy examines the impact of a single extrinsic cue on product evaluation, such as the effect of champagne brands or wine price.
Studies have demonstrated the significant impact of specific extrinsic cues, such as wine closure type, critic ratings, and liking statements, on consumer product evaluations. However, in real-world scenarios, consumers encounter a combination of these cues, which collectively influence their informed preferences.
On the other hand, a number of studies measured the combined effect of several extrinsic cues on product evaluation, without aiming to disentangle their relative impact. Examples of this approach include analyzing the impact of packaging, juice type, concentration, origin, and vitamin content [11]; measuring the combined effect of price, package and brand for beer [8]; studying wine origins and labels [13]; and examining the joint impact of price, brand, packaging and regions for wine [41]. In a survey of 1,653 internet users aged 18+ who drink any coffee beverage at home, the top 10 most important purchase drivers included both extrinsic and intrinsic attributes of coffee, such as roast type, price, flavour, brand, environment, and aspects related to economic protection [18].
Figure 2.4 Top 10 most important coffee purchase drivers, April 2020 [18]
Base: 1,653 internet users aged 18+ who drink any coffee beverage, at home
A few sensory consumer studies have used a conjoint analysis approach to separate the relative effects of extrinsic and intrinsic cues on consumer choice [17], [24]. However, all of them were limited to a single intrinsic attribute, such as sweetness or aroma, to avoid the interaction of multiple sensory cues.
Effect of GI label on liking and purchase intention
Geographical indications (GIs) are crucial for differentiating agri-food products, providing a distinctive attribute that is difficult to replicate Consumers' perception of GIs varies based on the product and origin, with GIs being more influential for high-priced products and in specific countries where factors like culture and national identity shape their relevance.
Geographical Indication (GI) certifications enable producers to set production standards and create competitive advantage based on a product's origin. An analysis of delta and theta waves indicated that men preferred coffee with GI, while women preferred coffee without GI, even though most of them indicated the opposite when asked verbally at the end of the tasting session [21].
Thus, we need to evaluate the relative impact of one extrinsic attribute, the Geographical Indication logo on the label, on informed hedonic liking and purchase intent for roasted and ground coffee. This impact is measured by combining a blind hedonic test with an informed tasting of the same coffee packaged in different product concepts in the Viet Nam market. This study separates the relative effect of various attributes and also considers differences between consumers in their responsiveness to various product cues.
The methodological approach applied here is based on the expectation disconfirmation framework, utilising a three-stage product evaluation procedure [33]:
(1) blind evaluation of sensory stimulus by respondents,
(2) measurement of respondents’ sensory expectation from extrinsic cues by evaluating acceptance of extrinsic attributes, and
(3) combined evaluation of sensory stimulus with extrinsic cues.
A number of studies have questioned the stability of preferences over time and demonstrated that preferences change within and between testing sessions. Product novelty, perceived stimulus complexity and specific stimulus boredom were found to particularly impact food choice over several tasting sessions, but hedonic liking was found to be relatively stable in the medium and long term over a large number of exposures and tasting sessions [47], [42].
The approach taken here particularly depends on consumers’ ability to consistently evaluate an identical stimulus that is presented repeatedly in the same sensory test. We have to assume that the initial blind sensory evaluation is representative of consumers’ evaluation of the sensory component in the later informed product evaluations.
As no different sensory stimuli will be evaluated in our experiment, replication consistency is likely to be higher than in the experiments by Cordelle [19].
If respondents cannot be expected to completely replicate their sensory rating for the same stimulus, then there could be variance between the evaluations of the identical sensory characteristics between blind and informed liking and within informed likings. As our methodological approach assumes the evaluation of the identical sensory characteristics to be constant over time, any occurring variance will be attributed to a contribution of the non-sensory attributes to the informed product evaluation. Our approach therefore might confound evaluation inconsistency with the effect of extrinsic attributes, which could lead to an overestimation of their effect on informed liking. This potential upward bias should be considered when interpreting the results of our experiment.
The purpose of study
In order to develop the reputation of coffee in Viet Nam through GI claims, we need to protect the quality of coffee and prevent fraud about coffee type and fake geographical information. This requires authenticating whether coffee qualifies for the GI standard using a technique that is quick, environmentally clean, non-destructive to samples, and easy to use for consumers and for regulators registering GI products. The first purpose of the study is to propose a classification model built from NIR spectra with sequential pre-processing methods such as SNV, S-G, DT, derivatives and MSC, and to compare the discriminant classification model PLS-DA with the class-modelling method SIMCA. This work focuses on comparing all built models across different pre-processing and main processing methods. As a consequence, a reliable model can be achieved for rapid and non-destructive analysis using handheld NIR devices.
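Of the pre-processing methods listed above, SNV followed by mean centering is the simplest to illustrate. The sketch below uses toy spectra and hypothetical function names, and it assumes the sample standard deviation for SNV; it is not the pre-processing pipeline applied in this study.

```python
from statistics import mean, stdev

def snv(spectrum):
    """Standard Normal Variate: centre and scale one spectrum by its own mean and std."""
    m, s = mean(spectrum), stdev(spectrum)   # assumption: sample std (n - 1)
    return [(x - m) / s for x in spectrum]

def mean_center(spectra):
    """Subtract the mean of each wavelength (column) from every spectrum."""
    col_means = [mean(col) for col in zip(*spectra)]
    return [[x - m for x, m in zip(row, col_means)] for row in spectra]

# Two toy spectra that differ only by a multiplicative scatter factor of 2.
raw = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
snv_spectra = [snv(row) for row in raw]
centered = mean_center(snv_spectra)
print(snv_spectra)  # both rows become [-1.0, 0.0, 1.0]
print(centered)     # all zeros, since the two rows are now identical
```

The example shows why SNV is used on NIR spectra: a purely multiplicative difference between two spectra disappears after the transformation, leaving only differences in spectral shape for the classification model.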
Extrinsic attributes, such as the presence of a Geographical Indication logo on a product label, can influence consumer perceptions. This study aims to evaluate the relative impact of such attributes on consumer preferences and purchase intent. A blind hedonic test will be conducted, followed by an informed hedonic liking and purchase intent assessment. By comparing evaluations of the same coffee packaged under various product concepts, the impact of extrinsic attributes will be determined.
MATERIAL AND METHODS
Material
One hundred and fifty-two (152) samples of dried Robusta coffee beans, representative of different geographical origins, were considered in the present study. Out of these, 49 samples came from Dak Lak, 13 samples came from Dak Nong, 44 samples came from Gia Lai and 47 samples came from Lam Dong; the details about the origin are provided in Table 3.1.
Sampling points in each area were distributed among the villages of the commune; the distance between collecting points had to be at least 2 km, and three replicate samples were collected per point. The processing conditions concerned the drying of the material: the coffee berries (for the dry processing method) or the coffee beans (for the wet processing method). Coffee berries were sun-dried in sunny or cloudy weather without rain, at an air temperature below 35 °C, or machine-dried at no more than 75 °C, to a moisture content of about 12-12.5%. After sun-drying or machine-drying, equipment was used to remove the fruit shell and collect the coffee beans. To measure the moisture of the coffee beans, they were ground and placed in a moisture meter. The coffee samples from Dak Lak passed the GI standard. The other coffee samples underwent the same processing as the coffee from Dak Lak but differed in cultivation conditions: soil, water, climate, etc.
These samples were Robusta species, harvested from October 2021 to January 2022, and passed the external inspection standard: size 7 mm, sort 18, classification by defect (fault grain): secondary defect 0.05) were found. Furthermore, a validation was performed by predicting the spectral test set of one laboratory using the model developed by the other one [4].
This result is also similar to that of Haroon and co-workers on the authentication of the geographical origin of Roselle (Hibiscus sabdariffa L) using various spectroscopies: NIR, low-field NMR and fluorescence. Principal components analysis (PCA), hierarchical cluster analysis (HCA) and PCA combined with linear discriminant analysis (PLS-DA) were performed on NIR data to assess a possible classification of samples based on origin. Correct discrimination was achieved by HCA. The classification of the samples into calibration and prediction sets yielded 100% discrimination rates for both sets. This study proved that the three spectroscopies could be viable tools for classifying roselle samples by their geographical origins [40].
In those studies, the continents and countries sampled were geographically far apart, so the input spectral data differed significantly and the classification models reached high accuracy. All of the samples in this research came from the Central Highlands, where climate and environment do not differ greatly. Thus, a model with an accuracy of 87.96% is a useful tool for quickly authenticating the quality of coffee beans.
4.1.4.1 Discussion about pre-processing performance
As the results in Tables 4.1 and 4.3 show, treating the original data before building discriminant models is clearly effective. However, a pre-processing approach that suits one type of discrimination task does not necessarily fit another. Specifically, in the experiment distinguishing coffee beans from Buon Me Thuot from the others, SNV with data scaling (mean centering), denoted Z4, works very well on the raw NIR spectra before the PLS-DA model is built. For the SIMCA method, by contrast, the best approach combines the 2nd Der and SNV after mean
Evaluation of the effectiveness of the GI label on consumer purchase intent
4.2.1 Description of dependent and independent variables
Table 4.7 gives an overview of the dependent and independent variables for all three product concepts. We evaluated five factors on 5-point scales: hedonic liking (do not like at all to like very much), purchase intent (very unlikely to very likely), price evaluation (too expensive to too cheap), region evaluation and label claim (very unnecessary to very necessary). As expected from its first position in the serving order, blind liking was rated significantly higher than the three informed conditions.
The acceptance ratings for extrinsic characteristics revealed a significant preference for coffee 625 in the categories of packaging and label claim. Coffee 625 outperformed both coffees 802 and 325 in these aspects. Additionally, the region evaluation favored coffee 625, though the difference was less pronounced. Notably, the Geographical Indication claim displayed on the label of coffee 625 proved to be highly effective, driving a significantly higher acceptance rate than coffees 802 and 325.
Regression results and parameter estimates are listed in Table 4.8, and the relative importance of the independent variables is given in Table 4.9. Blind liking and the extrinsic product characteristic label claim were significant positive drivers of informed liking. Label claim evaluation had the strongest influence, with 60.22% attribute importance, and blind liking was second, with 21.51%.
Informed liking is able to capture the effect of all sensory and non-sensory product cues, while purchase intent primarily integrates the economic constraint into the product evaluation.
The region where the coffee is grown (55.33% attribute importance), blind liking (22.72%) and price (12.52%) could be strong drivers with a direct effect on purchase intent, because consumers consider, top of mind, that the growing region affects the sensory quality of coffee more than other factors. Consumer awareness of the GI label claim is not yet high enough; thus, the price of 260,000 VND per kg of GI coffee in the Vietnamese coffee market is also a barrier to purchase intent. This can be explained by the price evaluation: the results show that the product is too expensive for what consumers are willing to pay, with mean price evaluations of only 3.68 to 3.85 (lower than 4.0, cheap), meaning that it is not an affordable price for coffee in Viet Nam.
This result is similar to the research of Fabio and co-workers on the role of geographical labels in consumers’ decision-making process: the attitude of consumers towards geographical labels tends to be product- and origin-specific. Geographical labelling is the main differentiation tool for expensive products (e.g., wine), but is of low relevance in several countries, depending on country-specific factors (e.g., nationality, culture, image and reputation) [23].
Extrinsic attributes were found to impact purchase intent in a mediated process through informed liking but had no strong direct effect. Unlike previous studies, we did not test the effect of price on informed product evaluation [42], [44]. Respondents were only exposed to the price after the tasting and then evaluated their purchase intent. Previous research in the area of extrinsic product cues has mainly compared hedonic liking with willingness to pay measures elicited in auctions and concluded that willingness to pay measures lead to a more complete product evaluation. Similarly, our findings indicate that the purchase intent construct captures both perceived product quality and taste preferences on one side and economic constraints on the other side. As auctions require intensive respondent briefing and training, measuring purchase intent seems to be a second-best option to integrate economic constraints into consumer evaluations and better predict real market behaviour [39].
Table 4.7 Descriptive overview of dependent and independent variables
Hedonic Liking Purchase intent Region evaluation Label claim Price evaluation
Mean Stdev Mean Stdev Mean Stdev Mean Stdev Mean Stdev
ANOVA: Tukey post hoc tests, classes with different superscript are different at P = 0.05
Table 4.8 Seemingly unrelated regression (SUR) of hedonic liking and purchase intent: aggregated results (n = 400 observations)
Label claim 0.174 *** 3.701