INTRODUCTION
Coffee
Vietnam is the second-largest coffee producer globally, with a vibrant coffee culture: a cup costs less than a dollar at numerous street stalls, cafés, and restaurants. The country primarily cultivates Robusta coffee, grown predominantly in the highlands, which has a stronger, nuttier, and darker flavor than Arabica coffee.
The quality of coffee beans is influenced by various factors, including the species of the beans, cultivation methods, and harvesting techniques. Key aspects of cultivation that affect quality include soil composition, topography, and climate conditions. Following the harvest, coffee beans undergo several physical and chemical processing steps, such as washing, drying, sorting, heating, and grinding, which further impact their overall quality.
The COVID-19 pandemic has significantly accelerated the shift towards home coffee brewing, as consumers increasingly make their own coffee instead of relying on coffee shops. Over the next two years, this trend is expected to evolve further, driven by advances in coffee-making technology and falling machine prices.
Figure 1.1 Global: coffee, new product launches, by format, 2018-2020 [17]
Coffee brands have significantly improved their sustainability practices over the past decade; however, future sales depend on their commitment to further advancements. In the post-pandemic landscape, consumers are increasingly concerned about environmental issues and may look for brands to blame for the harm caused. A more activist younger generation is particularly intolerant of waste, especially recyclable coffee pods that are often not recycled.
[Figure 1.1 legend: whole bean, ground, pods/capsules, soluble/instant, coffee mixes, RTD (iced) coffee]
The COVID-19 pandemic has heightened awareness of social inequalities, particularly in the coffee industry, where many farmers receive low wages despite the significant profits generated by coffee sales and fair trade claims. To ensure the sustainability of coffee supply and the livelihoods of these farmers, brands must assist them in adapting to the challenges posed by global warming.
Figure 1.2 Global: new coffee launches, by key macro-trends, 2010-2020 [18]
* Health includes the following claim categories: Functional, Plus, Natural and Minus; among food and drink categories
Consumer trend on coffee label
Recent sensory consumer research highlights that extrinsic product cues, such as packaging and branding, significantly impact consumer evaluations of food products. Understanding the interaction between sensory and non-sensory attributes is crucial for both practitioners and researchers, as optimizing these dimensions is essential for a product's market success. While it is acknowledged that extrinsic characteristics can enhance or diminish consumer acceptance of a product favored in blind tests, the specific influence of these cues on informed product evaluations remains underexplored.
When several extrinsic cues such as branding, labelling, packaging and price are present, knowledge about their relative importance would guide practitioners to focus on the most important drivers [39].
Market research indicates that clean-label coffee has become a consumer expectation in developed markets, particularly in the US. A significant number of US coffee drinkers prefer brands that emphasize clean or natural claims over those that highlight single-origin or small-batch attributes. For instance, Caribou Coffee is promoting its transparency by stating it has "nothing to hide" regarding its coffee ingredients.
Organic provides "proof" of clean label and is especially important to Millennials (b. 1978-94). In 2019, 11% of all global coffee launches made the organic claim for the second consecutive year.
Figure 1.3 US: selected factors which would encourage coffee purchase, 2019 [18]
*Defined in this case as North America + Europe + Australia
Base: 1,600 US internet users aged 18+ who drink any coffee beverage
Research indicates that coffee's intrinsic sensory qualities are significantly influenced by external factors, making it an ideal product category for this study.
Figure 1.4 Environmental and ethical claims dominate new coffee launches [17]
Global: coffee, new product launches, top five claims, January to mid-December 2019. Source: Mintel GNPD
National situation- Our problem
Coffee plays a crucial role in Vietnam's economy, accounting for 3% of the country's GDP and generating an average export turnover of $3 billion annually. To enhance the future export market for this vital commodity, a strong focus on branding is essential. Currently, Vietnam's coffee is exported to over 80 countries and holds a 14.2% share of the global green coffee export market, ranking second only to Brazil. Roasted and ground coffee exports hold a 9.1% market share, ranking fifth after Brazil, Indonesia, Malaysia, and India.
The active support of government ministries, along with expanding markets and reorganized export strategies, is crucial for enhancing the international standing of Vietnamese coffee. The State and its ministries aim to modernize and sustain the coffee industry, and to make it highly competitive through diversified, high-quality products that provide greater value and higher income for farmers and businesses.
To reach the export turnover target of 6 billion USD by 2030 and increase the added value of Vietnamese coffee products, the coffee industry needs a coordinated set of solutions:
- In production and processing, the restructuring of the coffee industry must be promoted effectively.
- Brand building must receive more attention.
- Enterprises need to survey market demand in terms of market share, taste, quality, and price, and use the results to decide the proportion of each processed product.
- In trade promotion, production and business activities should be adjusted in line with market signals [36].
To maintain coffee quality that meets established standards and to highlight the uniqueness of different coffee varieties, it is essential to focus on the geographical origin of these products. Emphasizing geographical indications can enhance the reputation of coffee-growing regions and protect the integrity of the coffee's unique characteristics.
Geographical Indication (GI)
Geographical indications (GIs) are distinctive signs that identify products originating from specific locations, highlighting qualities, reputation, or characteristics linked to that origin. As consumers increasingly focus on the geographical source of products, GIs help them differentiate between items and ensure product quality.
Geographical Indications (GIs) play a crucial role in enhancing the value of industrial, agricultural, and handicraft products, promoting diversity within these sectors. By safeguarding producers from unfair competition and misappropriation, GIs enable them to command premium prices for their goods. GIs also protect consumers by ensuring accurate descriptions of product origins and characteristics, thereby fostering trade at national, regional, and international levels. Furthermore, GIs contribute to rural development by creating jobs and increasing incomes for producers and other stakeholders in the value chain, and can also promote the region as a whole through the development of tourism.
Geographical Indications (GIs) play a crucial role in preserving traditional knowledge and local biodiversity, as they represent products rooted in the unique processes and cultural heritage of specific communities. Since the introduction of the WTO's TRIPS Agreement in 1994, the global landscape for GI protection has expanded significantly, particularly in Asia. Countries within the Association of Southeast Asian Nations (ASEAN) are leveraging their geographical identities to develop high-quality products and are actively pursuing the identification and registration of GIs to enhance their visibility in both domestic and international markets.
As of January 2019, 346 GIs had been registered in ASEAN countries, including 37 foreign GIs, which shows the remarkable interest of ASEAN countries in GI protection. Among the ASEAN nations, Cambodia has 3 registered GIs, Indonesia 74, Malaysia 84, Thailand 115, and Viet Nam 69. To date, eight GIs from the ASEAN region have been registered in the EU market, including Kampot Pepper and Skor Thnot Kampong Speu from Cambodia, Kopi Arabika Gayo from Indonesia, and several GIs from Thailand, such as Khao Hom Mali Thung Kula Rong-Hai and Kafae Doi Chaang.
Geographical Indications (GIs) significantly raise the prices producers can obtain: EU GI products are priced on average 2.23 times higher than non-GI counterparts, and agro-food items on average 1.5 times higher. A global study indicates that GI premiums can result in prices 20% to 50% higher than those of similar non-GI products. In the ASEAN region, data from IP offices show that GIs positively influence volumes, prices, and local development.
Between 2009 and 2018, the prices of GI pepper varieties increased significantly even though the international pepper market remained relatively stable. The price of Kampot white pepper from Cambodia rose by a factor of 2.6, while that of Muntok white pepper from Indonesia rose by a factor of 6 over the same period.
From 2003 to 2016, the price of Sarawak pepper in Malaysia increased by a factor of 4.32 following its GI registration. Similarly, the farm gate price of red berries for Flores Bajawa Arabica coffee in Indonesia increased by a factor of 2.2 during the same period.
Between 2005 and 2015, coffee prices fluctuated significantly. Doi Chaang Coffee from Thailand saw its coffee berry prices double, while Buon Ma Thuot coffee from Vietnam gained 23% in value compared with standard coffee. Fruits also benefited from GI protection: the farm gate price of Koh Trung pomelo in Cambodia rose by 33%, and the price of Pakpanang Tabtimsiam Pomelo in Thailand rose by 75%.
As of January 2019, Vietnam had registered 69 of the 346 GIs in ASEAN, including Buon Me Thuot coffee (2005), Moc Chau Shan Tuyet tea (2010), Phu Quoc fish sauce (2012), Son La coffee (2017), and Binh Phuoc cashew (2018), among others.
Geographical Indications (GIs) offer further advantages. They strengthen the GI product value chain and encourage the establishment of collective organizations for producers and processors, such as the Community for the Protection of Geographical Indication of Amed Bali Salt in Indonesia. GIs have also fostered agro-tourism, as with Sarawak Pepper in Malaysia, and have led to the organization of coffee festivals in Buon Ma Thuot, Vietnam. Finally, the GI Khao Kai Noi (Laos) is expected to help preserve traditional rice varieties.
1.4.2 GI for Buon Me Thuot Coffee in Vietnam
The coffee bean from a defined geographical area around Buon Me Thuot was registered as a GI on 14 October 2005. It has specific characteristics and specific production and processing requirements that give it a unique sensory profile [49].
Figure 1.5 Logo of the Geographical Indication of Buon Me Thuot coffee
• Main characteristics of the coffee bean:
- Bean colour: greyish-green, green or light
- Bean size: 10-11 mm long, 6-7 mm wide and 3-4 mm thick
- Flavour: typical of coffee when roasted to a suitable level
- Aroma: attractive, typical, with medium to high intensity (typical trait)
- Body: average to high (typical trait)
- Varieties: selected varieties belonging to the Robusta genetic group. Seeds or buds for grafting must be provided by licensed seed production units
- Shade trees: must block at least 20% of direct sunlight
- Irrigation: enough water must be supplied during the dry season
- Organic fertilization: 10-20 tonnes of manure/ha/year
- Chemical fertilization, plant protection measures, and pruning: based on soil analysis and the guidelines of technical extension workers
- Harvesting: hand-picked, with at least 90% ripe fruits
- Processing: Buon Me Thuot coffee beans are processed from the fresh fruits of the Robusta coffee tree by the wet (fully washed) or dry (natural) method
- Geographical area: the districts of Cu M'gar, Ea H'leo, Krong Ana, Cu Kuin, Krong Buk, Krong Nang, and Krong Pak, Buon Ho town, and Buon Me Thuot City in Dak Lak province (Cu Kuin was separated from Krong Ana; Buon Ho was separated from Krong Buk)
- Soil for coffee planting: red-brown basaltic soil, with a depth of at least 0.7 m and a maximum slope of 15º
Coffee is cultivated at altitudes between 400 and 800 meters above sea level, a range that promotes significant diurnal temperature variations during the ripening season. This temperature fluctuation is crucial for enhancing the quality of the coffee produced.
- Temperature of the coffee planting region: yearly average temperature of 24-26 ºC; diurnal temperature difference in the fruit ripening season above 11.3 ºC
• Geographical Indication management body/association
- Buon Ma Thuot Coffee Association
- Department of Science and Technology of Dak Lak Province
• Number of producers using GI
- 12 collective producers (including 15,000 coffee farmers)
• Volume of coffee bean sold with GI
- Value added from GI: 2-3% compared with commercial coffee
• Other advantages from the GI
- Maintains the sustainability of coffee production
- Preserves the pride of coffee producers with GI reputation/image
- Contributes to local cultural events (coffee festivals, competitions)
- Contributes to improving the livelihood of coffee farmers
- Buon Ma Thuot Coffee was registered as a trademark in China in April 2011 by a Chinese trading company. Following action by the Buon Ma Thuot Coffee Association, the trademark registration was cancelled in May
Currently, Geographical Indication (GI) coffee in Vietnam is primarily intended for export and commands a higher price of approximately 260,000 to 280,000 VND per kilogram, compared with 220,000 to 250,000 VND per kilogram for regular coffee. This premium is attributed to its superior sensory quality and the reputation it brings to Vietnam. However, the GI coffee market faces challenges, including fraud related to quality, coffee types, and misleading geographical information. It is therefore essential to authenticate GI coffee from Buon Me Thuot against regular coffee in Vietnam, particularly in the Central Highlands region.
OVERVIEW
Methods for authentication
To protect the integrity of coffee quality and prevent fraud regarding coffee types and geographical information, it is essential to authenticate high-quality coffee and differentiate it from lower-quality varieties. Effective methods for distinguishing coffee that meets Geographical Indication (GI) standards from coffee that does not are crucial for maintaining industry standards and consumer trust.
Various authentication methods for coffee exist today, including DNA-based techniques, enzymatic and immunological methods, and chemical approaches such as spectroscopy and chromatography. The DNA-fingerprint method stands out for its reliability, using genetic traits to differentiate between coffee strains and detect foreign species at low levels. Chromatography methods, such as HPLC and GC-MS, are also favored for their high capacity, reproducibility, sensitivity, and versatility. Despite these advantages, the methods share a common limitation: they are laboratory-based. They can be cumbersome and time-consuming, require highly skilled technicians, and may destroy the sample and consume significant amounts of chemicals.
In today's competitive landscape, where global trade significantly impacts national economies, recognizing the limitations of laboratory-based methods is crucial. Failure to do so may lead to substantial losses, which has prompted scientists to develop more effective and suitable approaches.
Recent studies highlight the effectiveness of spectroscopic fingerprinting techniques as an affordable, rapid, and non-destructive way of analyzing various products. By integrating spectral data with chemometrics, these techniques create unique chemical profiles that allow subtle differences to be identified rapidly. The approach has valuable applications in agriculture and food, such as the authentication of Basmati rice, coffee bean cultivars [14], green asparagus cultivars [4], and fresh versus frozen-then-thawed beef [37], and the measurement of adulteration in olive oils [33].
Recent research has demonstrated that spectroscopy is highly effective for detecting adulterations in samples. By utilizing a suitable database, this technique enables the examination of authentic samples against mixed or mislabeled ones, revealing any discrepancies within minutes. Remarkably, even non-specialist users can perform this analysis with ease.
Infrared spectroscopy
Spectroscopy is a technique used to gather information about the structure and properties of matter by shining a beam of electromagnetic radiation onto a sample and analyzing its response. The most widely utilized types of spectroscopy today include atomic, UV-Vis, nuclear magnetic resonance, Raman, and infrared spectroscopy.
Figure 2.1 IR spectroscopy regions (Mani, Mani & Pro, 2020)
Infrared (IR) spectroscopy is divided into three sub-regions by wavelength: near infrared (0.75 to 2.5 µm), mid infrared (2.5 to 20 µm), and far infrared (20 to 100 µm). The analysis yields absorption spectra, recorded as absorbance or transmittance, that arise from molecular vibrations induced by IR light. For absorption to occur, two conditions must be met: the radiation must carry the energy needed to excite a vibrational transition, and the vibration must change the dipole moment of the molecule, which requires charge separation so that the molecule can couple with the oscillating electromagnetic field.
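The sub-region boundaries quoted above can be written down as a small lookup, which is convenient when band positions are reported in different units. The sketch below is illustrative only: the function names are hypothetical, and assigning a shared boundary (2.5 µm, 20 µm) to the longer-wavelength region is an assumption, since the text does not say which side the boundary belongs to. The wavelength-to-wavenumber conversion (1e7 divided by the wavelength in nm) is the standard one.

```python
def ir_region(wavelength_um: float) -> str:
    """Return the IR sub-region for a wavelength in micrometres.

    Boundaries follow the ranges quoted in the text; a shared boundary is
    assigned to the longer-wavelength region (an assumption).
    """
    if 0.75 <= wavelength_um < 2.5:
        return "NIR"
    if 2.5 <= wavelength_um < 20:
        return "MIR"
    if 20 <= wavelength_um <= 100:
        return "FIR"
    return "outside IR range"


def wavenumber_cm1(wavelength_nm: float) -> float:
    """Convert a wavelength in nm to a wavenumber in cm^-1."""
    return 1e7 / wavelength_nm
```

For example, a band at 1934 nm (1.934 µm) falls in the NIR sub-region under these boundaries.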
Infrared (IR) radiation causes molecules to vibrate with varying amplitudes, provided the radiation has sufficient energy to transition molecules to higher vibrational states. Different substances absorb IR radiation at specific wavelengths, leading to unique IR spectra that act as a "fingerprint" for identifying chemical nature and molecular structure. Notably, homonuclear diatomic molecules, such as O2 and N2, do not absorb IR radiation because their dipole moment does not change during rotational and vibrational motion, so they make no significant spectral contribution.
Carbon dioxide and water, both in the atmosphere and adsorbed, can adversely affect the analysis of weak sample peaks by introducing strong additional absorption peaks. To mitigate these effects, it is crucial to control the concentrations of CO2 and H2O during analysis by maintaining stable conditions, which helps in identifying and eliminating their spectral interference when processing data.
Near-Infrared (NIR) spectroscopy stands out in the realm of infrared spectroscopy due to its sensitivity to food component absorption, rapid response times, straightforward procedures, and low equipment costs, establishing it as a promising tool for food authentication in the future.
In the NIR region, spectral data primarily arise from the absorption of simple molecular groupings with strong interatomic bonds, including O-H, N-H, and C-H. These bonds produce overtones and combination tones of molecular vibrations, which are indicative of the food's characteristics based on the composition percentages of these components.
The key characteristics of Near-Infrared (NIR) absorption are the first and second overtones of the fundamental stretching vibrations of O-H and N-H, which correspond to NIR bands at 6825 nm and 1000 nm, respectively. The first, second, and third overtones of the fundamental stretching vibration of C-H are observed in the NIR spectrum at 1780, 1200, and 920 nm. The fingerprint regions are represented in NIR spectroscopy as combination bands, such as for amide at 2100 nm and for C-H stretching from 2280 to 2330 nm [34]. The overtone bands dominate the shorter-wavelength part of the NIR spectrum.
The absorption characteristics at 1900 nm, including those of O-H and N-H bonds at 1934 nm, can be attributed to the large anharmonic constant of X-H bonds and the high frequency of X-H stretching vibrations, which results in shorter wavelengths.
When conducting NIR analysis on rice granules, the sample presentation mode is primarily diffuse reflectance, because the incident light is strongly scattered by the many angular surfaces it meets. Part of the light is reflected specularly; this light may return to the detector but carries no compositional information. Scattering increases the intensity of the light detected but also introduces variability in the baseline, because photons travel different path lengths. The diffuse reflectance signal therefore mixes absorption and scattering effects, and the ratio of reflected to incident light shapes the recorded absorption profile. Proper sample presentation is thus crucial to minimize light scatter and keep scattering levels consistent across samples. Attempts have been made to describe light scatter and its effects on NIR spectra mathematically, but no fully effective solution has emerged, which motivates the use of spectral pre-treatment methods to address these effects.
Advantages of NIR spectroscopy:
- Rapid and simultaneous analysis of multiple samples
- Non-toxic and environmentally friendly, because it reduces the amount of chemicals used
- Reduces analytical labour and saves costs
- Simple, quick procedure that simplifies sample preparation and avoids destroying samples during analysis
- Capable of quantifying one substance in the presence of other substances
- Applicable to both inorganic and organic analyses
- Available in portable form, so that measurements can be carried out on site
Limitations of NIR spectroscopy:
- Low sensitivity of the signal, which can limit the determination of low-concentration components with a content of less than 2.5 m (trace)
- Prediction models must be built and updated regularly for each sample background
- Devices must be continuously calibrated to ensure accuracy
SCiO handheld NIR device from Consumer Physics
Near infrared spectroscopy is increasingly used in the horticultural sector as a non-destructive method for predicting the quality of fresh and stored produce. The SCiO™ molecular sensor (Consumer Physics Inc., Tel-Aviv, Israel), a low-cost portable NIR sensor, has been evaluated for this purpose: kiwifruit, apple, feijoa, and avocado samples were analyzed to develop predictive models from spectral and quality measurements. The performance of the SCiO™ sensor was benchmarked against existing commercial NIR spectrometers by building estimation and classification models in the SCiO™Lab online application. The affordability and speed of the SCiO™ sensor could widen the industrial application of NIR technology, enabling quick sorting and screening to improve quality predictions and decision-making in the supply chain.
The device transmits light from a source through a filter that separates it into near-infrared wavelengths. This near-infrared light is directed onto the product being measured, as illustrated in Figure 2.2. The light reflected from the product is captured by the SCiO molecular sensor, which uses an integrated spherical mirror to focus it onto a detector. The spectrum signal obtained from the probe is processed through the device's calibration model and then displayed on the screen in either digital or spectral format.
Figure 2.2 SCiO handheld NIR spectrometer (Consumer Physics) [31]
(1) Molecular sensor (2) Light emitter (3) Functional button (On/Off/Calibration)
(4) Battery indicator light (5) USB charging port (6) LED light (7) Calibrator/Cover
GI coffee authentication model
Pre-processing is essential for extracting valuable insights from data while minimizing the impact of irrelevant information, using methods tailored to the specific application. In spectral data analysis, common issues such as baseline noise, peak shifts, drift, and overlapping peaks often arise from the measurement technique. These issues are frequently influenced by factors such as NIR light scattering, FTIR sensitivity to CO2 and H2O, particle size, and environmental conditions such as temperature and humidity. Depending on the type of noise encountered or the primary objective of the analysis, pre-processing methods can be categorized into two main groups.
The first group, smoothing, includes methods such as the Savitzky-Golay moving window and K-Neighbor, which aim to reduce noise while preserving important information. The second group, correction and enhancement, focuses on improving peak visibility, removing baselines, and addressing peak shifts or drifts through methods such as Derivative, Detrend, and Multiplicative Scatter Correction (MSC). However, excessive smoothing can obscure valuable data, and over-correction may introduce unwanted noise. Each method has its pros and cons, which makes selecting appropriate pre-processing techniques a challenging task. Often, a combination of methods is used to balance their strengths and weaknesses; for example, applying Savitzky-Golay smoothing can mitigate the over-enhancement caused by second derivatives. A trial-and-error approach is commonly used to determine the most effective method. Ultimately, these techniques aim to enhance the Signal to Noise Ratio (SNR), facilitate data visualization, integrate signals, and optimize the efficacy of the primary analytical methods.
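To make the two groups of pre-processing concrete, the following is a minimal pure-Python sketch of one smoothing step and one correction step. It is an illustration under stated assumptions, not the pipeline used in this work: the window is fixed at 5 points with a quadratic fit (using the standard coefficients -3, 12, 17, 12, -3 divided by 35), edge points are left untouched, and MSC uses the mean spectrum of the set as its reference. The function names are hypothetical.

```python
from statistics import mean

# Standard 5-point quadratic Savitzky-Golay smoothing coefficients (sum to 1).
SG5 = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def savgol5(spectrum):
    """Smooth a spectrum with a 5-point quadratic Savitzky-Golay window.

    The two points at each edge are left unchanged (a simplification).
    """
    out = list(spectrum)
    for i in range(2, len(spectrum) - 2):
        window = spectrum[i - 2:i + 3]
        out[i] = sum(c * v for c, v in zip(SG5, window))
    return out


def msc(spectra):
    """Multiplicative Scatter Correction against the mean spectrum.

    Each spectrum x is fitted by least squares as x = a + b * ref and
    replaced by (x - a) / b, removing additive and multiplicative offsets.
    """
    n_points = len(spectra[0])
    ref = [mean(s[j] for s in spectra) for j in range(n_points)]
    ref_mean = mean(ref)
    ref_var = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = mean(s)
        b = sum((r - ref_mean) * (v - s_mean) for r, v in zip(ref, s)) / ref_var
        a = s_mean - b * ref_mean
        corrected.append([(v - a) / b for v in s])
    return corrected
```

A typical combination would apply `msc` to the whole set of spectra first and then `savgol5` to each corrected spectrum; the order and window size are choices to be settled by trial and error, as noted above.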
Before turning to the model-building procedures, it is useful to review some foundational concepts in data analysis. Statistical models fall into three main types: exploratory analysis (such as PCA and HCA), regression analysis (including PCR and PLS Regression), and classification analysis (such as PLS-DA and SIMCA). This study concentrates on regression and classification methods for developing authenticity models for coffee.
In a regression model, the independent variable(s) X are used to predict a quantitative response Y. In simple linear regression, a single predictor X explains one quantitative response Y. When multiple X predictors explain one quantitative Y response, Multiple Regression is adopted; when multiple X predictors explain more than one quantitative Y response, Multivariate Regression is applied.
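The three cases can be written side by side as follows. The notation is an assumption introduced for illustration: b_0 is an intercept, b_j are coefficients, e (or E) is the residual, and p is the number of predictors.

```latex
\begin{align*}
  \text{Simple linear regression:} \quad & y = b_0 + b_1 x + e \\
  \text{Multiple regression:}      \quad & y = b_0 + \sum_{j=1}^{p} b_j x_j + e \\
  \text{Multivariate regression:}  \quad & Y = X B + E
\end{align*}
```

In the multivariate case, Y has one column per response and B one column of coefficients per response.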
Qualitative responses are predicted with classification models, which assign individuals to categories based on their characteristics. Geometrically, each individual is represented as a point in a multidimensional space, and the classification model identifies the group to which an individual is most likely to belong. Classification approaches can be divided into two categories: the Discriminant method, which determines the line that separates the classes, and the Class modeling method, which identifies the region occupied by each class.
2.4.2.1 SIMCA classification model- Class modeling method
The Class modeling method emphasizes identifying similarities among individuals by grouping those with shared characteristics. An algorithm is employed to determine the region formula for each group, which indicates whether an individual belongs to a specific group. This approach allows for the possibility of individuals being part of multiple groups simultaneously or not belonging to any group at all. In contrast, the Discriminant approach categorizes every individual into a group, resulting in a complete division of the sample space without overlaps. Examples of Class modeling methods include Soft-Independent Modeling of Class Analogies (SIMCA), Unequal (UNEQ), and Artificial Neural Networks (ANN).
Figure 2.3 Graphical representation of the general distinction between discriminant and class-modeling approaches (Marini F, 2007) [2]
Soft-Independent Modeling of Class Analogies (SIMCA) is based on the assumption that the natural variability of objects belonging to the same category can be described by a principal component (PC) model, as in equation (1) [50]:
$$ X_g = T_g P_g^{T} + E_g \qquad (1) $$
where X_g is the matrix of empirical measurements on the samples belonging to class g, T_g and P_g are the scores and loadings of the PC model, respectively, and E_g is the residual matrix.
Class assessment involves constructing the model space and then determining the distance d of each object to it. This distance has two components: the orthogonal distance from the model space, Q, calculated as the sum of squares of the model residuals, and the score distance within the model space, T², calculated as the Mahalanobis distance from the centre of the scores space.
Once the number of components of the PC model has been chosen, each sample is projected onto the model to obtain its T² and Q values. The distance d_{i,g} of the i-th sample to class g is then computed from T² and Q, each normalized by its 95th-percentile value under the null hypothesis, as in equation (2):
$$ d_{i,g}^{2} = \left(\frac{T^{2}_{i,g}}{T^{2}_{0.95,g}}\right)^{2} + \left(\frac{Q_{i,g}}{Q_{0.95,g}}\right)^{2} \qquad (2) $$
Finally, the i-th sample is accepted into class g if $d_{i,g}^{2} \le 2$; otherwise it is rejected.
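The acceptance rule can be sketched in a few lines once T² and Q are available for a sample. This is a minimal illustration, not the implementation used in this work: it assumes that "normalized" means dividing each statistic by its 95th-percentile cut-off for the class, and the function and parameter names (`t2_limit`, `q_limit`) are hypothetical. Computing T², Q, and their cut-offs from a PC model is outside the scope of the sketch.

```python
def simca_distance_sq(t2, q, t2_limit, q_limit):
    """Squared reduced distance of a sample to a class model.

    t2, q: the sample's T-squared and Q statistics for the class.
    t2_limit, q_limit: 95th-percentile cut-offs of T-squared and Q for
    the class (assumed to be supplied by the caller).
    """
    return (t2 / t2_limit) ** 2 + (q / q_limit) ** 2


def accept_in_class(t2, q, t2_limit, q_limit, threshold=2.0):
    """Accept the sample into the class when the squared distance <= threshold."""
    return simca_distance_sq(t2, q, t2_limit, q_limit) <= threshold
```

Because each class is tested separately, a sample can be accepted by several class models or by none, which matches the behaviour of class modeling described earlier.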
2.4.2.2 PLS-DA classification model- Discriminant method
The Discriminant method is a familiar approach that resembles basic Regression models, focusing on the distinctions between assigned groups. It uses a mathematical algorithm to determine discriminant lines (either straight or curved) that divide the sample space into distinct classes. Each individual is classified according to its position relative to these lines, so that each individual belongs to exactly one group. Notable examples include Fisher's Linear Discriminant Analysis, Partial Least Squares Discriminant Analysis (PLS-DA), and Orthogonal PLS-DA, as well as nonparametric techniques such as K-Nearest Neighbors and Support Vector Machines.
Partial Least Squares Discriminant Analysis (PLS-DA) is a linear classification method that uses the Partial Least Squares (PLS) Regression algorithm to identify the optimal number of Latent Variables (LVs). These LVs are linear combinations of the observable variables, or manifest variables (MVs), that maximize the covariance with the response variable Y. This approach enables a graphical representation of data patterns through the latent variable T scores and P loadings, where the P loadings are the coefficients that define the LVs, and the T scores give the coordinates of the samples in the LV projection hyperspace. Ultimately, PLS-DA reduces the classification problem to a regression equation from which the classification lines are derived, as in equation (3):
$$ Y = XB + E \qquad (3) $$
where X is the matrix of empirical measurements collected on the training samples (n samples x p features), B is the regression coefficient matrix, Y is the Dummy Matrix (n samples x g groups), and E is the residual [3].
The Dummy Matrix is a pre-constructed binary matrix with dimensions of n samples by g groups, where each row represents an individual sample and each column indicates group membership. Each entry, \( y_{i,g} \), signifies whether the i-th sample belongs to the g-th group, encoded in binary form (0 for no and 1 for yes). For example, in a three-class problem with groups 1, 2, and 3, the vectors are represented as y1 = [1, 0, 0], y2 = [0, 1, 0], and y3 = [0, 0, 1].
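As a minimal illustration of this encoding, the Python sketch below builds a dummy matrix from a list of group labels. The function name `dummy_matrix` and the example labels are hypothetical and are not taken from the software used in this study.

```python
def dummy_matrix(labels, n_groups):
    """Build an n x g binary matrix: entry [i][g-1] is 1 if sample i is in group g."""
    matrix = []
    for label in labels:
        if not 1 <= label <= n_groups:
            raise ValueError(f"label {label} is outside 1..{n_groups}")
        row = [0] * n_groups
        row[label - 1] = 1  # groups are numbered from 1, columns from 0
        matrix.append(row)
    return matrix


# Three samples, one from each of the three groups (illustrative labels only).
Y = dummy_matrix([1, 2, 3], n_groups=3)
```

Each row of `Y` contains exactly one 1, so every calibration sample is assigned to a single group.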
Consequently, PLS is calibrated on the Dummy Matrix Y, and the coefficient matrix B is then used to predict the response Ynew from Xnew, so that equation (4) becomes:
\( Y_{new} = X_{new} B \)
The Ynew will maintain the n × g dimensions of the Dummy Matrix but will not follow a binary structure. In PLS-DA, the output \( y^{new}_{i,g} \) will provide continuous values that indicate the probability of the i-th sample belonging to the g-th group, ranging from 0 (not belonging) to 1 (most likely belonging). For instance, in a three-class scenario, if the third sample has coordinates of [0.02, 0.83, 0.01], it will be classified as belonging to group 2, as the value in the second column is the highest.
Validating the calibration model is essential for assessing its predictability. This process involves dividing the dataset into two parts: a training set used to build the calibration model and a testing set to evaluate its predictive capabilities. The validation can be categorized into External validation and Internal validation (or Cross Validation), depending on whether the tested variables differ from those in the calibration model.
External validation involves using separate training and testing sets, ensuring that the testing set remains independent for more accurate predictability assessments. This approach necessitates a sufficiently large sample space, with both sets being representative of expected samples during routine model use. Techniques like the Duplex algorithm and the Kennard-Stone approach are commonly employed for effective set separation. Despite its simplicity, external validation has drawbacks: the estimation of test error can vary based on the composition of each set, and statistical methods may perform poorly with limited observations, potentially leading to an overestimation of test error, especially when the testing set comprises about 20% of the total dataset.
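The assignment rule in the example above can be sketched in a few lines of Python. This is only an illustration of the "largest predicted value wins" rule; the function name `assign_group` is hypothetical, and any additional threshold for leaving a sample unassigned is not covered here.

```python
def assign_group(y_pred_row):
    """Return the 1-based index of the group with the largest predicted value."""
    best_column = max(range(len(y_pred_row)), key=lambda g: y_pred_row[g])
    return best_column + 1  # groups are numbered from 1


# Predicted row from the three-class example in the text.
group = assign_group([0.02, 0.83, 0.01])  # group 2, as stated in the text
```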
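To make the set-separation step more concrete, the following is a minimal sketch of the Kennard-Stone selection idea: start from the two most distant samples, then repeatedly add the sample whose nearest already-selected neighbour is farthest away. The function name, the use of Euclidean distance on raw feature vectors, and the toy data are assumptions for illustration; this is not the implementation used in this study.

```python
import math


def kennard_stone(samples, n_select):
    """Return (selected, remaining) index lists using Kennard-Stone selection."""
    n = len(samples)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")
    # Pairwise Euclidean distances between all samples.
    dist = [[math.dist(a, b) for b in samples] for a in samples]
    # Start with the two samples that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = [k for k in range(n) if k not in selected]
    while len(selected) < n_select:
        # Pick the candidate whose closest selected sample is farthest away.
        nxt = max(remaining, key=lambda k: min(dist[k][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected, remaining


# Toy one-dimensional "spectra"; real spectra would have one value per wavelength.
toy = [(0.0,), (1.0,), (2.0,), (10.0,), (5.0,)]
calibration_idx, validation_idx = kennard_stone(toy, n_select=3)
```

The selected indices would form the calibration set and the remaining indices the external validation set, so that the calibration samples span the measured space as evenly as possible.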
Conjoint analysis
Traditional sensory analysis, which emphasizes only the intrinsic attributes of a product, is no longer adequate for today's rapidly evolving markets. To achieve successful innovation, it is essential to optimize product formulation while also considering extrinsic factors like brand, price, and labeling that significantly influence consumer choices.
Psychologists have extensively studied how the combination of intrinsic and extrinsic sensory stimuli affects product evaluation. Conjoint analysis (CA) refers to a range of stated preference (SP) experimental methods where consumers evaluate product profiles defined by specific attributes at varying levels, based on a statistical design of experiments. This methodology effectively measures the impact of product attributes on consumer preferences.
In a factorial design plan, various attributes are analyzed as consumers provide scores based on their liking or purchase intent for different attribute combinations. Typically, consumer groups exhibit diverse responses to these combinations, making it crucial to identify and interpret these segments using demographic or other external consumer variables. Conjoint analysis has demonstrated significant commercial success in this context.
Impact of extrinsic characteristics on product evaluation
Previous research utilizing the expectation disconfirmation framework has primarily focused on two strategies to assess how extrinsic attributes influence sensory evaluation. One strategy involves analyzing the effect of a single extrinsic cue on informed product evaluation, with notable examples including the significant influence of champagne brands and the direct correlation between wine price and perceived quality.
Numerous studies have demonstrated that specific attributes, such as wine closures, significantly impact product evaluation and consumer preferences. However, in real-world scenarios, consumers encounter various extrinsic cues simultaneously, which can also influence their perceptions and liking of products.
Numerous studies have explored the combined effects of various extrinsic cues on product evaluation, often without isolating their individual impacts. This includes research on factors such as packaging, juice type, concentration, origin, and vitamin content, as well as the influence of price, packaging, and brand in beer evaluation, and the significance of wine origins and labels. A survey of 1,653 internet users aged 18 and older who consume coffee at home identified the top ten purchase drivers, highlighting the importance of both extrinsic and intrinsic attributes, including roast type, price, flavor, brand, environmental considerations, and economic protection.
Figure 2.4 Top 10 most important coffee purchase drivers, April 2020 [18]
Base: 1,653 internet users aged 18+ who drink any coffee beverage, at home
There are a few sensory consumer studies utilising a conjoint analysis approach to separate the relative effect of extrinsic and intrinsic cues on consumer choice [17], [24]. However, all of them were limited to one single intrinsic attribute, such as sweetness or aroma, to avoid the interaction of multiple sensory cues.
Effect of GI label on liking and purchase intention
Geographical Indications (GIs) have emerged as a strategic tool for differentiating agri-food products, serving as a unique attribute that enhances product quality and makes replication challenging. Consumer research reveals mixed findings on the significance of geographical labeling compared to other product characteristics. Notably, consumer attitudes toward geographical labels vary based on the specific product and its origin: for high-end items like wine, geographical labeling is a key differentiator, while its importance may diminish in certain countries, influenced by factors such as nationality, culture, and overall image.
Geographical Indication (GI) certifications allow producers to establish production standards and gain a competitive edge linked to the product's origin. Research on delta and theta waves revealed that men favored coffee with GI, whereas women showed a preference for coffee without GI, despite most women verbally expressing the opposite preference after the tasting session.
This study examines the influence of extrinsic attributes, specifically the Geographical Indication logo on labels, on informed hedonic liking and purchase intent for roasted and ground coffee in the Vietnamese market. By combining blind hedonic tests with informed tastings of the same coffee presented in different product concepts, the research isolates the effects of various attributes. Additionally, it explores how consumer responsiveness varies based on different product cues.
The methodological approach applied here is based on the expectation disconfirmation framework utilising a three-stage product evaluation procedure [33]:
(1) blind evaluation of sensory stimulus by respondents,
(2) measurement of respondents' sensory expectation from extrinsic cues by evaluating acceptance of extrinsic attributes, and
(3) combined evaluation of sensory stimulus with extrinsic cues.
Research indicates that preferences for food can fluctuate over time, with factors such as product novelty, perceived complexity, and boredom influencing choices across multiple tasting sessions. Despite these changes, hedonic liking (how much individuals enjoy a product) remains relatively stable in the medium to long term, even after numerous exposures and tasting experiences.
The approach outlined relies on consumers' capacity to consistently assess the same stimulus during repeated sensory tests. It is essential to assume that the initial blind sensory evaluation accurately reflects consumers' perceptions of the sensory aspects in subsequent informed product assessments.
As no different sensory stimuli will be evaluated in our experiment, replication consistency is likely to be higher than in the experiments by Cordelle [19].
Variability in sensory ratings among respondents may arise, indicating that evaluations of identical sensory characteristics can differ between blind and informed liking, as well as within informed likings. Our methodology presumes that these sensory evaluations remain constant over time; thus, any observed variance may be attributed to non-sensory attributes influencing informed product evaluations. This could result in confounding evaluation inconsistencies linked to extrinsic attributes, potentially leading to an overestimation of their impact on informed liking. It is essential to consider this upward bias when interpreting our experimental results.
Purpose of the study
To enhance the reputation of Vietnamese coffee through Geographical Indication (GI) claims, it is essential to safeguard coffee quality and combat fraud related to coffee types and geographic information. This can be achieved by implementing efficient and environmentally friendly techniques for authenticating coffee quality, distinguishing between GI-compliant and non-compliant products. These methods should be user-friendly for both consumers and regulatory bodies involved in GI registration.
This study aims to develop a classification model using NIR spectra, employing sequential pre-processing techniques like SNV, S-G, DT, Derivatives, and MSC. It will compare the effectiveness of two classification models, PLS-DA and SIMCA, focusing on various pre-processing and main processing methods. The goal is to establish a reliable model for rapid and non-destructive analysis with handheld NIR devices.
This study also aims to assess the influence of the Geographical Indication logo on consumer preferences and purchase intentions for roasted and ground coffee. By integrating a blind hedonic test with an informed tasting of the same coffee presented in various packaging concepts, we can evaluate the impact of this extrinsic attribute on informed hedonic liking.
MATERIAL AND METHODS
Material
In this study, a total of 152 samples of dried Robusta coffee beans were analyzed, representing various geographical origins. Specifically, 49 samples were sourced from Dak Lak, 13 from Dak Nong, 44 from Gia Lai, and 47 from Lam Dong, with detailed origin information available in Table 3.1.
Within each commune, coffee samples were collected from various villages, with collection points spaced at least 2 km apart and three replicates taken per location. The processing methods vary: for dry processing, coffee berries are sun-dried under suitable conditions (sunny or cloudy weather, with temperatures below 35°C, or drying temperatures not exceeding 75°C) to a moisture content of 12-12.5%. After drying, equipment is used to remove the fruit shell and collect the coffee beans. Moisture content is then measured by grinding the beans and using specialized moisture equipment. Coffee samples from Dak Lak met GI standards, while other samples underwent similar processing but differed in cultivation conditions, including soil, water, and climate factors.
The Robusta coffee, harvested between October 2021 and January 2022, met external inspection standards with a size of 7 mm, sort 18, and a classification of secondary defects below 10%. After harvesting, the beans underwent processing using either the wet (full-washed) or dry (natural) methods.
Three replicates of 300 grams of coffee beans were packaged in light barrier materials and sealed with silica gel for preservation. The samples were prepared over two months and delivered within seven days to the Department of Food Technology at Ho Chi Minh City University of Technology, where they were stored in containers.
Spectra were acquired at a temperature of 25 to 30°C. Of the 152 samples from the four provinces, about 75% (115 samples) were used for calibration, while the remaining 25% (37 samples) served as an external validation set.
Table 3.1 General origin information of collected samples
No. Province District Calibration samples
1 Dak Lak Buon Me Thuot
4 Lam Dong Duc Trong
Hand-held spectrometer (SCiO™) with spectral range between 740 nm and 1070 nm in 1-nm resolution.
In April 2022, a market research company recruited respondents (N0) from central locations in Ho Chi Minh City to study coffee consumption behaviors. Table 7.1 provides a comparison of sociodemographic characteristics and coffee habits between the sample and the coffee consumer population in Vietnam's Central Highlands. The research targets both male and female participants aged 18 to 50 years, with a household income ranging from 10 million to over 30 million VND, who consume coffee at least once a month, with a preference for black coffee.
Segment of Economic Hard Quota (±5%) TOTAL
To test the sensory quality of coffee, this study used roasted ground coffee from Buon Me Thuot as representative of GI quality.
The product tested is black coffee with sugar. The protocol for preparing the product sample is:
- Roast the coffee to 180 °C over 14 minutes
- Combine 18 g of ground coffee with 100 ml of hot water (95-98 °C), wait 4-5 minutes, and collect 50 to 60 ml of coffee extract at 50-60 °C
- Mix the coffee extract with 12 g of sugar at 40-50 °C for evaluating flavor, aroma and taste.
Method
NIR analysis of coffee samples was conducted within 48 hours of delivery, utilizing a spectral range of 740 nm to 1070 nm at a resolution of 1 nm and room temperature (~26 °C) in diffuse reflectance mode, without any sample pre-treatment. Each of the three replicates consisted of 80 grams of carefully mixed coffee beans, with measurements taken from five different spots on the petri dish (left edge, right edge, middle edge, top, and bottom) while the dish was kept in rotation to ensure representative spectra. Each sample was scanned at least five times from a distance of approximately 1.5 cm, with background scans performed using an internal reference standard, resulting in a total of 331 wavelengths recorded for each sample scan.
3.2.2 Research flowchart for building model by NIR spectra
Our research can be summed up in the following charts.
(115 samples and 7-fold cross validation)
3.2.3 Experiment protocol for consumer test
This experiment holds the intrinsic factor constant (the sensory quality of coffee from Buon Me Thuot) in order to minimize bias from intrinsic attributes and to isolate the effect of extrinsic attributes on hedonic liking.
Participants were selected for a sensory evaluation of black coffee, with five consumers testing the coffee simultaneously in individual booths. To ensure unbiased feedback, respondents were instructed to avoid interaction, although the booths lacked physical partitions.
The room maintained a comfortable temperature of 21°C, featuring natural daylight and effective air conditioning and ventilation to minimize the accumulation of aromas and alcohol in the air. Data collection was conducted using paper questionnaires.
In a blind tasting, participants evaluated a cup containing 50 ml of coffee extract and 12 grams of sugar served at a temperature of 40-50°C. Each respondent was required to drink at least one-third of the cup to assess the coffee's flavor, aroma, and taste using a structured 5-point liking scale, ranging from "dislike not at all" to "like very much."
3.2.3.2 Sensory expectation from extrinsic cues by evaluating acceptance of extrinsic attributes
Participants evaluated one of three product concepts based on their acceptance of two extrinsic cues, region and label claim, using structured 5-point scales ranging from "dislike not at all" to "like very much." This assessment aims to capture the sensory expectations associated with each specific extrinsic cue. The study investigates two primary factors: region, with three levels (None, BMT, Pleiku), and label claim, with two levels (None, GI). However, the combinations of Pleiku region with GI claim and no region with GI claim are excluded due to restrictions imposed by GI policy.
Table 3.3 Sensory expectation test for coffee
3.2.3.3 Combined evaluation of sensory stimulus with extrinsic cues
Consumers participated in a sensory evaluation by tasting coffee extracted from ground coffee packaged in different formats. Each respondent sampled four variations of coffee, labeled 267, 324, 625, and 802, with the order of tasting controlled by a Williams design to ensure unbiased results.
After sampling each coffee, participants rated their hedonic liking (dislike to like very much), their evaluation of a price of 260,000 VND per kilogram, which reflects the market price in Vietnam as of April 2022 (too expensive to too cheap), their purchase intent (very unlikely to very likely), and their evaluation of the region and label claim (very unnecessary to very necessary), all using structured 5-point scales.
A five-minute break was held between successive coffees, in which respondents were encouraged to clean the palate with water.
At the conclusion of the test, participants filled out a sociodemographic and coffee behavior survey, detailing their preferred sweetness levels, types of coffee consumed, and purchase frequency across various store types. Additionally, they rated their self-perceived coffee knowledge on a 5-point scale, with options ranging from 'not knowledgeable' to 'very knowledgeable about coffee.'
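For illustration, a Williams design for an even number of products can be generated as a Latin square in which every product follows every other product exactly once. The sketch below uses the standard cyclic construction; the function name and the mapping of rows to the four sample codes are assumptions, since the actual serving orders used in the test are not reported here.

```python
def williams_design(n):
    """Return n serving orders (rows) for n products, n even, as 0-based indices."""
    if n < 2 or n % 2:
        raise ValueError("this sketch only covers an even number of products")
    # First row alternates low and high indices: 0, 1, n-1, 2, n-2, ...
    first = [0]
    low, high = 1, n - 1
    while len(first) < n:
        first.append(low)
        low += 1
        if len(first) < n:
            first.append(high)
            high -= 1
    # Every further row shifts the first row cyclically by one product.
    return [[(item + shift) % n for item in first] for shift in range(n)]


codes = ["267", "324", "625", "802"]  # sample codes from the text
orders = [[codes[k] for k in row] for row in williams_design(4)]
```

Respondents would then be allocated to the four serving orders in rotation, so that position and first-order carry-over effects are balanced across products.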
Table 3.4 Informed test for coffee evaluation
Buon Me Thuot Region None Buon Me
3.2.4.1 Pre-processing data (for NIR data)
The NIR spectra of each coffee sample were analyzed using the SIMCA software package version 16.0 (Umetrics® Suite MVDA EduPack), with the three replicates for each sample averaged to ensure accuracy in the results.
Various pre-processing techniques were explored to remove potential artifacts from NIR spectra and address their nonlinear behavior. These methods included Standard Normal Variate (SNV), Multiplicative Scatter Correction (MSC), and first or second derivative transformations utilizing the Savitzky-Golay algorithm, applied either individually or in conjunction with SNV or MSC.
Regardless of the pre-treatment applied, all the data were subjected to mean centering prior to any multivariate data analysis.
The purpose of scale correction treatment is to eliminate signals from uninteresting factors such as light scattering and particle size variations. This is achieved through transformation or normalization algorithms, which include mean centering, standard normal variate (SNV), and multiplicative scatter correction (MSC). Enhancement treatment utilizes derivatives to address overlapping peaks by removing constant background effects with the first derivative or adjusting baseline slopes with the second derivative. While these methods can reveal less apparent features, they may also introduce disadvantages.
Combining multiple methods rather than relying on a single spectral filter remains the most effective approach. It is recommended to utilize the S-G W-O-D algorithm combination for pretreatment, which smooths the data using a specified window width (W), applies Oth order polynomials for correction, and enhances with Dth derivatives. This method can be effectively paired with other preprocessing techniques such as SNV and MSC to improve results.
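The combination described above can be sketched in plain Python as SNV, followed by a Savitzky-Golay second derivative, followed by mean centering. The window width of 5 points and polynomial order 2 (an "S-G 5-2-2" setting) are assumptions chosen because their convolution coefficients are well known; they are not the settings used in this study, and all function names are hypothetical.

```python
from statistics import mean, stdev

# Savitzky-Golay second-derivative coefficients for a 5-point window and a
# quadratic polynomial (unit spacing); the sum is divided by 7.
SG_D2_COEFFS = (2, -1, -2, -1, 2)
SG_D2_NORM = 7


def snv(spectrum):
    """Standard Normal Variate: centre and scale each spectrum individually."""
    m, s = mean(spectrum), stdev(spectrum)
    return [(x - m) / s for x in spectrum]


def sg_second_derivative(spectrum):
    """Second derivative; the two points at each edge are dropped."""
    half = len(SG_D2_COEFFS) // 2
    out = []
    for i in range(half, len(spectrum) - half):
        window = spectrum[i - half:i + half + 1]
        out.append(sum(c * x for c, x in zip(SG_D2_COEFFS, window)) / SG_D2_NORM)
    return out


def mean_center(spectra):
    """Subtract the column (per-wavelength) mean across all spectra."""
    col_means = [mean(col) for col in zip(*spectra)]
    return [[x - m for x, m in zip(row, col_means)] for row in spectra]


def preprocess(spectra):
    """SNV, then S-G second derivative, then mean centering."""
    return mean_center([sg_second_derivative(snv(s)) for s in spectra])
```

In practice a wider window is often preferred for noisy handheld spectra, since derivatives amplify high-frequency noise; the window width is therefore a tuning choice rather than a fixed value.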
3.2.4.2 Aggregated analysis (for consumer data)
The evaluation of respondents is structured within a two-stage system, where the informed preference for coffee j (j = 1 to 4) by respondent i (i = 1 to 100) is influenced by their initial blind preference and their acceptance of extrinsic attributes such as region and label claim. This relationship is represented through a linear equation, with regression coefficients a1 to a3 indicating the varying impact of these independent variables on the informed liking of the coffee.
\( \text{Informed liking}_{i,j} = \text{const} + a_1\,\text{blind liking}_{i,j} + a_2\,\text{region}_{i,j} + a_3\,\text{label claim}_{i,j} + \text{error}_{i,j} \)  (5)
A respondent's purchase intent for coffee is influenced by informed liking and price evaluation. If blind liking and extrinsic attribute assessments independently affect purchase intent without mediation from informed liking, the coefficients b1 to b4 will show significant differences from zero. This establishes a proposed linear relationship between these factors.
\( \text{Purchase intent}_{i,j} = \text{const} + b_1\,\text{informed liking}_{i,j} + b_2\,\text{blind liking}_{i,j} + b_3\,\text{region}_{i,j} + b_4\,\text{label claim}_{i,j} + \text{error}_{i,j} \)  (6)
Informed liking serves as both the dependent variable in Equation (5) and the independent variable in Equation (6), making it impossible to estimate the two equations separately. To address this, seemingly unrelated regression (SUR) was employed to simultaneously estimate the regression coefficients for both equations. The significance of each independent variable was assessed by analyzing its contribution to the explained variance (partial R²), achieved by sequentially removing each attribute from the equations. The analysis and attribute weight estimations were conducted using STATA 11.0 software.
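The drop-one logic behind the attribute weights can be sketched as follows. This is a simplified illustration only: it fits a single equation by ordinary least squares rather than SUR, the data are synthetic, and expressing each R² loss as a percentage of the summed losses is an assumption about how relative importance is normalized. All function names are hypothetical.

```python
def solve(a, b):
    """Solve a linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= factor * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def r_squared(columns, y):
    """R² of an OLS fit of y on the given predictor columns plus an intercept."""
    X = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(X[0])
    xtx = [[sum(row[a] * row[b] for row in X) for b in range(p)] for a in range(p)]
    xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(bk * v for bk, v in zip(beta, row)) for row in X]
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def drop_one_importance(predictors, y):
    """Percentage share of the R² lost when each predictor is removed in turn."""
    full = r_squared(list(predictors.values()), y)
    loss = {
        name: full - r_squared([c for n, c in predictors.items() if n != name], y)
        for name in predictors
    }
    total = sum(loss.values())
    return {name: 100.0 * value / total for name, value in loss.items()}


# Synthetic 5-point ratings, for illustration only (not study data).
blind = [1, 2, 3, 4, 5, 3]
label = [2, 1, 4, 3, 5, 5]
informed = [0.2 * b + 0.8 * l for b, l in zip(blind, label)]
weights = drop_one_importance({"blind liking": blind, "label claim": label}, informed)
```

In this synthetic example the informed score is constructed mostly from the label rating, so removing the label claim costs far more explained variance than removing blind liking.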
RESULTS AND DISCUSSION
Building a model to authenticate GI coffee and non-GI coffee
Our research developed two classification models, SIMCA and PLS-DA, utilizing a dataset of 152 individual samples. The calibration set of 115 samples comprises 37 samples that qualify as Geographical Indication (GI) coffee and 78 samples that do not qualify (non-GI).
4.1.1 Visualize raw and pre-processed spectral data
Using the SCiO™ hand-held spectrometer to scan green coffee beans generates spectral bands and peaks that reflect the unique chemical composition of different coffee samples. As illustrated in Figure 4.1, the observed spectra range from 740 to 1070 nm. Analyzing these absorption plots can be challenging with raw data; therefore, pre-processing is essential before applying multivariate data analysis. The resulting spectra are influenced by high-frequency waves and the vibrations of key chemical components, including caffeine, chlorogenic acid, and water.
Figure 4.1 Raw NIR spectra (X0) obtained from handheld Scio NIR device
Initially, different pre-processing methods were investigated, aiming at eliminating potential artefacts from NIR spectra and correcting their nonlinear behavior.
In this study, various data pre-treatment methods were employed, including Standard Normal Variate (SNV), Multiplicative Scatter Correction (MSC), and first or second derivative transformations utilizing the Savitzky-Golay algorithm, either independently or in combination with SNV or MSC. Regardless of the pre-treatment method used, all datasets underwent mean centering before any multivariate data analysis. The specific treatments applied include: a) Mean centering (MC) denoted as Z1, b) First derivative with mean centering (Z2), c) Second derivative with mean centering (Z3), d) SNV combined with mean centering (Z4), e) SNV with first derivative and mean centering (Z5), and f) SNV with second derivative and mean centering (Z6).
Figure 4.2 Pre-processed NIR spectra with a) MC, b) 1st Der + MC, c) 2nd Der + MC, d) SNV + MC, e) SNV+ 1st Der+ MC, f) SNV + 2nd Der + MC
4.1.2.1 Building model and Internal validation
The optimal pre-processing treatment is selected based on the classification error and the explained variance obtained through 7-fold cross-validation (7 CV groups). Table 4.1 presents a comparison of all pre-processing methods applied to GI and non-GI coffee samples, detailing the number of latent variables (LVs) selected, the classification error percentage, and the explained variance percentage for each class, along with their arithmetic average.
Table 4.1 SIMCA model result by internal validation
Pre-processing LVs Error rate CV (%) Explained var (%)
The SIMCA model, utilizing 7 latent variables (LVs) and the preprocessing method Z6: SNV + 2nd Der + MC, demonstrates the highest efficacy for input data, achieving the lowest average classification error of 10.34% while accounting for 91.47% of the total variance.
Z3 (2nd Der + MC) is also a good choice, with an average classification error of 14.54% using 21 LVs, which explain 98.92% of the variance.
4.1.2.2 Testing SIMCA model by External validation
To validate the SIMCA models built with 2nd Derivative + MC (Z3) and SNV + 2nd Derivative + MC (Z6), we utilized validation samples consisting of 14 GI samples and 23 non-GI samples. This evaluation focused on measuring the error rate, sensitivity, and specificity of the model. Sensitivity indicates the model's ability to accurately identify samples within the target class, while specificity reflects its effectiveness in correctly rejecting samples from all other classes.
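These figures of merit follow directly from the counts in a two-class confusion matrix, as the short sketch below shows. The counts are hypothetical and chosen only to match the sizes of the validation set (14 GI and 23 non-GI samples); they are not results from this study. The error rate is taken here as one minus accuracy, which is one common convention and may differ from the definition used by the modelling software.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, accuracy and error rate as fractions.

    tp: target-class (GI) samples accepted by the model
    fn: target-class samples rejected by the model
    tn: non-target samples correctly rejected
    fp: non-target samples wrongly accepted
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
    }


# Hypothetical counts for 14 GI and 23 non-GI validation samples.
metrics = classification_metrics(tp=12, fn=2, tn=20, fp=3)
```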
Table 4.2 SIMCA model result by external validation
Error rate (EV) Sensitivity (%) Specificity (%)
Model Z6 achieved an accuracy of 85.76%, a sensitivity of 82.56%, a specificity of 89.08% and an error rate of 16.78%. Model Z3 achieved a lower accuracy of 80.09%, with a sensitivity of 78.01%, a specificity of 82.23% and an error rate of 23.22%.
SIMCA utilizes distinct models for each group, preventing the creation of a universal plot for all groups. To evaluate SIMCA results, two specific plots can be employed: the Coomans' Plot, which compares distances to model results in pairwise comparisons, and the Membership Plot, which assesses the distance to the model against the distance from the model center for unknown test samples. Both plots can include calculated limits to assist in determining group membership, although these limits are based on often questionable distributional assumptions to exclude a certain percentage of actual group members. A higher exclusion percentage reduces the likelihood of misclassifying non-members as group members.
The analysis utilizes the Z6 model to assess the distance to the GI model (Dcrit 1.276) and the distance to the non-GI model center (Dcrit 1.21) for each coffee group. The results are visually represented with vertical and horizontal lines marking the limits. For a sample to be confidently classified as part of this group, it must be located in the lower left quadrant of the plot.
Figure 4.3 Coomans’ Plot for SIMCA model Z6
This study mirrors the research conducted by Taynna Kevla Lopes de Araujo and colleagues on the non-destructive authentication of gourmet ground roasted coffees through NIR spectroscopy and digital imaging. As the demand for high-quality coffees rises, the risk of adulteration increases, leading to significant economic losses. The research introduces the effectiveness of NIR spectroscopy and digital images, specifically from CACHAS, combined with one-class classification methods for authenticating gourmet coffees without sample preparation. A total of 44 gourmet coffee samples were successfully distinguished from traditional and superior varieties by analyzing the coffee powder directly. The DD-SIMCA models, utilizing offset correction for NIR and RGB histograms, achieved accurate recognition of all 90 samples in both training and testing phases. This innovative approach not only benefits consumers but also aids regulatory agencies in upholding the high standards of Brazilian specialty coffees, thereby combating fraudulent labeling practices.
4.1.3.1 Building model and Internal validation
The PLS-DA model built with 8 LVs and Z4 (SNV + MC) pre-processing is the most suitable for the data, with the lowest average classification error (11.44%) while explaining 94.92% of the total variance.
Table 4.3 PLS-DA model result by internal validation
Pre-processing LVs Error rate CV (%) Explained var (%)
4.1.3.2 Testing PLS-DA model by External validation
The optimal PLS-DA model was applied to the validation samples, pre-processed using SNV + MC, comprising 14 GI samples and 23 non-GI samples. Of the 14 GI samples, the model correctly identified 13, while one sample was not assigned. Among the non-GI samples, it correctly recognized 20 samples, misclassified 1 sample and left 1 sample unassigned.
Table 4.4 PLS-DA model recognized test sets
Real/predicted GI Non-GI Not assigned
The result of testing the model by external validation is shown in Figure 4.4, where red represents the validation samples, divided between the two groups GI (green) and non-GI (blue).
Figure 4.4 Predicted scores of test sets belonging to 2 groups
The Z4 model, with SNV + MC pre-processing, achieved a sensitivity of 81.25%, a specificity of 95.23%, an error rate of 12.03% and an accuracy of 87.96%.
Table 4.5 PLS-DA model result by external validation
Error rate (EV) Sensitivity (%) Specificity (%)
A study by A. Giraudo and colleagues explored the geographical origin of green coffee beans through near-infrared (NIR) spectroscopy and multivariate data analysis, highlighting a rapid and non-invasive classification method. The research involved analyzing FT-NIR spectra from 191 coffee samples sourced from two continents and nine countries, conducted across two laboratories. Utilizing a hierarchical approach, the study developed Partial Least Squares-Discriminant Analysis (PLS-DA) models, first distinguishing by continent and then by country. The continent-based classification achieved over 98% accuracy in predictions, while the country-based model reached 100% accuracy.
The reliability of the proposed method was validated using the McNemar test, which showed no significant differences (P > 0.05). Additionally, the model was further validated by predicting the spectral test set from a laboratory, based on a previously developed model.
The findings align with Haroon and colleagues' research on authenticating the geographical origin of Roselle (Hibiscus sabdariffa L) through various spectroscopic techniques, including NIR, low-field NMR, and fluorescence. Utilizing principal components analysis (PCA), hierarchical cluster analysis (HCA), and PCA combined with linear discriminant analysis (PLS-DA) on NIR data, the study successfully classified samples based on their origin. HCA achieved correct discrimination, resulting in 100% accuracy for both calibration and prediction sets. This research demonstrates that these three spectroscopic methods are effective tools for classifying roselle samples according to their geographical origins.
The geographical distance between continents and countries leads to significant variations in spectral data, resulting in high accuracy for classification models. In this research, by contrast, all samples were sourced from the Central Highlands, where climate conditions are relatively uniform. Even so, the model achieved an accuracy of 87.96%, which suggests it can serve as an effective tool for quickly assessing the quality of coffee beans.
4.1.4.1 Discussion about pre-processing performance
The results from Tables 4.1 and 4.3 highlight the importance of data pre-processing in enhancing the efficiency of discriminant models. While certain pre-processing techniques may be effective for specific discrimination tasks, they may not be universally applicable. For instance, in the experiment distinguishing coffee beans from Buon Me Thuot, the application of Standard Normal Variate (SNV) with mean centering (Z4) proved effective for raw NIR spectra data prior to implementing the Partial Least Squares Discriminant Analysis (PLS-DA) method. Conversely, for the SIMCA method, the optimal approach combined SNV with the second derivative and mean centering (Z6).
Evaluating the effect of the GI label on consumer purchase intent
4.2.1 Description of dependent and independent variables
Table 4.7 presents a summary of the dependent and independent variables for the three product concepts, assessing five factors on a five-point scale: hedonic liking (ranging from dislike not at all to like very much), purchase intent (from very unlikely to very likely), price evaluation (from too expensive to too cheap), region evaluation, and label claim (from very unnecessary to very necessary) Notably, blind liking received significantly higher evaluations compared to the three informed conditions, as anticipated from the initial serving order.
The study revealed that coffee 625 had higher informed liking and purchase intent than coffees 324 and 802, although the differences were not statistically significant, suggesting that respondents did not strongly discriminate between these options. Additionally, coffee 625 received significantly better ratings for its packaging and label claims than coffees 802 and 324. The geographical indicator on the label of coffee 625 was particularly appealing and effective, contributing to its higher evaluation compared to the other coffees.
The regression results, detailed in Table 4.8, indicate that blind liking and extrinsic product characteristics, specifically label claims, have a significant positive influence on informed liking. Label claim evaluation emerged as the most influential factor, accounting for 60.22% of the overall attribute importance, while blind liking contributed 21.51%.
Informed liking captures the effect of all sensory and non-sensory product cues, while purchase intent primarily integrates the economic constraint into the product evaluation.
The region where the coffee is grown significantly influences purchase intent, accounting for 55.33% of the attribute importance. Blind liking (22.72%) and price (12.52%) also play important roles in shaping purchase intent. Ultimately, consumers prioritize the origin of the coffee when making their buying decisions.
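The text does not state how the attribute-importance percentages were derived. One common convention, shown here purely as an assumption, expresses each predictor's absolute standardized coefficient as a share of the sum over all predictors. The function name and the coefficient values below are hypothetical and are not taken from Table 4.8.

```python
def relative_importance(coefficients):
    """Return each predictor's share (in %) of the summed absolute coefficients.

    Assumes the coefficients are standardized so that they are comparable.
    """
    total = sum(abs(b) for b in coefficients.values())
    return {name: 100 * abs(b) / total for name, b in coefficients.items()}


# Hypothetical standardized coefficients for a purchase-intent equation.
example = {
    "region evaluation": 0.50,
    "blind liking": 0.30,
    "price evaluation": 0.20,
}

shares = relative_importance(example)
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

Under this convention the shares always sum to 100%, so a percentage reported for one attribute is only meaningful relative to the other predictors included in the same equation.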
Growth factors are notably effective in enhancing the sensory attributes of coffee, yet consumer awareness of the GI label claim remains insufficient. At 260,000 VND per kilogram, the price of GI coffee in the Vietnamese market is a significant barrier to purchase intent. The price evaluation results indicate that consumers find this product too expensive: average ratings range from 3.68 to 3.85, below the threshold of 4.0 that is considered affordable. This suggests that GI coffee is not within the financial reach of many consumers in Vietnam.
Research by Fabio and colleagues indicates that consumers' attitudes towards geographical labels are often influenced by the specific product and its origin. Geographical labeling serves as a key differentiator for premium products such as wine, while its significance varies across countries due to factors such as nationality, culture, and the product's image and reputation.
Extrinsic attributes influence purchase intent through informed liking, although they do not exert a strong direct effect. Unlike prior studies, this research did not examine the impact of price on informed product evaluation, as respondents were made aware of the price only after tasting the product. Previous investigations into extrinsic product cues primarily compared hedonic liking with willingness to pay in auction settings and concluded that willingness to pay provides a more comprehensive product evaluation. Our findings suggest that the purchase intent construct encompasses perceived product quality and taste preferences as well as economic constraints. Given that auctions require extensive respondent training, measuring purchase intent is a practical alternative for incorporating economic factors into consumer evaluations, thereby improving predictions of actual market behavior.
Table 4.7 Descriptive overview of dependent and independent variables
            Hedonic liking    Purchase intent   Region evaluation   Label claim       Price evaluation
            Mean     Stdev    Mean     Stdev    Mean     Stdev      Mean     Stdev    Mean     Stdev
Coffee 802  3.88 a   0.67     3.79 b   0.74     3.80 c   0.72       3.99 b   0.79     3.79 a   0.77
ANOVA: Tukey post hoc tests; classes with different superscripts are different at P = 0.05.
Table 4.8 Seemingly unrelated regression (SUR) of hedonic liking and purchase intent: aggregated results (n = 400 observations)
Label claim 0.174 *** 3.701