Advantages and Disadvantages of Primary and Secondary Data


NATIONAL ECONOMICS UNIVERSITY

STATISTICS

Class: Auditing CFAB K64

Group 4: Lê Nguyễn Ngọc Hảo (11222182)

Nguyễn Diệu Anh (11220318)

Nguyễn Phương Thảo (11225911)

Pham Minh Anh (11220538)

Luc Thi VG Anh (11220288)

Nguyễn Thị Thu Hà (11221942)

Lê Thị Nhàn (11224867)

Lê Hà Phương Anh (11220234)

Lê Minh Thu (11226042)

Nguyễn Huyền Trang


TABLE OF CONTENTS

A ADVANTAGES AND DISADVANTAGES OF PRIMARY AND SECONDARY DATA

II. Reasons to choose the topic


A ADVANTAGES AND DISADVANTAGES OF PRIMARY AND SECONDARY DATA

Primary data: advantages

Primary data is collected explicitly for a specific goal, including the design, method, and data analysis techniques to be used. Primary data collectors have complete control over the process. For a survey, for example, this means that the questions are created specifically to achieve that goal. To obtain a better outcome, it is also feasible to modify or update the questionnaire or the data collection process: if any problems appear or something in the questionnaire must be clarified, changes can be made.

- Updated information:

Primary data is first-hand data and is likely to be the most recent at the moment it is collected.

- Higher reliability and accuracy:

This is a result of the greater control over the data and the fact that primary data is newly collected. Data collected by others might contain mistakes or might deliberately try to mislead people, but newly collected data is more likely to be accurate, and the collectors are going to give a higher level of attention to details. As collectors are collecting data for themselves, they will generally be truthful and reliable. However, this is under the assumption that the collectors have enough expertise.

- The target problem is dealt with:

The people collecting the data prepare the questionnaire and sometimes interview members of the targeted group to obtain data. The problem is also addressed directly, so that after proper feedback it can be brought to light and resolved. In this way, the program can be made productive, and problems can be handled easily.

- A better understanding of data:

Data collected by someone else might be wrongly interpreted. With primary data, however, the collector is the person who best understands the data and the method used to collect it, so there is little room for misunderstanding.

- Ownership:

With the approval of the people who were surveyed, the researchers have complete control over whether the core data is made public, patented, or sold to other parties. To maintain their competitive advantage (being the first to gather and evaluate the data), the collectors can keep the information secret and inaccessible to anybody else (such as their competitors).

Secondary data: advantages

- Lower cost or free:

Most secondary sources can be accessed for free or at minimal cost. This saves time and effort in addition to money. Secondary research enables you to obtain data without having to put any money on the table, in contrast to primary research, which requires you to plan and carry out a whole primary study procedure from the outset.

- Time-saving:

As the above advantage suggests, secondary research can be performed quickly. Sometimes it is a matter of a few Google searches to find a source of data.

- Generate new insights from previous analysis:

Re-analyzing old data can bring unexpected new understandings and points of view, or even new relevant conclusions.

- Large sample size:

Secondary data often covers large populations, providing a broader perspective and increased statistical power.

- Longitudinal analysis:

Secondary data allows you to perform a longitudinal analysis, meaning a study that spans a long period of time. This can help you identify trends. In addition, you can find secondary data from many years back up to a couple of hours ago, which allows you to compare data over time.

- Availability:

In general, it is simple to access secondary data sources. Anyone can find the information gathered by other researchers, particularly when using the Internet. There are many sources of information available for reference, and those who are unfamiliar with other methods of data generation can benefit from them.

Primary data: disadvantages

It takes a lot of time to collect data from raw sources. Secondary data, by contrast, is gathered from already processed sources, which makes the process much easier.

- Higher cost and labor:

Experts will need to use specific tools and programs, as well as employ workers, to collect data. These are costly tasks, and primary data collection may not be feasible on one's own. Additionally, there are situations where respondents to surveys or questionnaires must be compensated. As a result, it is more difficult and requires more resources to find suitable candidates. Sometimes this can be impossible.

- The questionnaire must be easy and understandable:

The questionnaire must be easy to understand; only then can the researchers get correct and valid feedback. The researchers have to design the questionnaire, or use a method or technique, in a way that helps people interpret it easily. If not, the feedback produced will be wrong or inaccurate.

- High difficulty and expertise required:

It may not always be possible for non-experts or inexperienced persons to apply the best technique or to create the ideal survey that will meet their goals. There is also a risk that the feedback was gathered incorrectly because inexperienced collectors used the wrong technique. After gathering the raw data, they might also want the assistance of a specialist to perform data analysis.

Secondary data: disadvantages

- Outdated or incomplete information:

The data provided through different sources may be outdated, as it has been stored and managed for many years. It may therefore not be relevant to today's scenario.

- Lack of control over the collection process:

Secondary data might lack quality. It is a limitation of secondary data that data collected over past years may be inaccurate. The source of the information may be questionable, especially when you gather the data via the Internet. As you rely on secondary data for your data-driven decision-making, you must evaluate the reliability of the information by finding out how it was collected and analyzed.

- Anyone can access data:

The data is not kept private by the person who owns it; anyone who wants to study the subject can access it. There is no data secrecy, but the person who accesses the data cannot contest its possession or ownership.

... to the needs of the researcher. Because of this, the secondary data may not be dependable for your present requirements. You can get a ton of information from secondary data sources, but quantity is not always a good indicator of relevance.

- Bias:

As secondary data is collected by someone other than you, it is typically biased in favor of the person who gathered it. This might not meet your requirements as a researcher or marketer.

B TYPES OF SAMPLING

I Probability sampling

Probability sampling is a sampling technique in which researchers choose samples from a larger population using a method based on the theory of probability. This sampling method considers every member of the population and forms samples based on a fixed process.

There are 5 types of probability sampling methods: simple random sampling, systematic sampling, stratified random sampling, cluster sampling, and multi-stage random sampling.

Definition

- Simple random sampling: Simple random sampling is a ...

- Systematic sampling: Systematic sampling is a ... of a population at regular intervals. It requires the selection of a starting point for the sample and a sample size that can be repeated at regular intervals. This type of sampling method has a predefined range, and hence this sampling technique is the least time-consuming.

- Stratified random sampling: Stratified random sampling is a method in which the researcher divides the population into smaller groups that don't overlap but represent the entire population. While sampling, these groups can be organized and a sample then drawn from each group separately.

- Cluster sampling: Cluster sampling is a method where the elements in the population are first divided into separate groups called clusters. Each element of the population belongs to one and only one cluster. A simple random sample of the clusters is then taken. This method works best when each cluster is a representative small-scale version of the entire population.
+ Types of cluster sampling: one-stage cluster sampling and two-stage cluster sampling

Steps

- Simple random sampling:
+ ... sample size from ... numbers ...
+ Step 5: Select ... correspond to the randomly chosen numbers

- Systematic sampling:
+ ...
+ Step 4: Divide the population of ... into groups of k individuals, k = N/n
+ Step 5: Randomly select one individual from the 1st group
+ Step 6: Select every kth individual thereafter

- Stratified random sampling:
+ Step 1: ... the stratifying variable
+ Step 2: Divide the sampling frame into strata or categories
+ Step 3: Draw a systematic or random sample of each stratum

- Cluster sampling:
+ Step 1: Define the population
+ Step 2: Divide your sample into clusters
+ Step 3: Randomly select clusters
+ ... sample the ...

Example

- Simple random sampling: The instructor of ... a total of 54 students, in this ...

- Systematic sampling: Let's say a ... entrance and ... survey every 30th ... is essentially random.

- Stratified random sampling: Researchers are ... 20-30, 30-40 ...

- Cluster sampling:
+ One-stage cluster sampling: ... products. She splits the neighborhood into several areas and randomly selects customers to form cluster samples. Then she surveys every member chosen from the neighborhood for her research.
+ Two-stage cluster sampling: Let's say the management of a toy company wants to examine how all of its outlets are performing in the market. The management divides the outlets based on location and randomly selects samples to form clusters. They then use the cluster sample to study the performance of all the outlets.

- Multi-stage Random Sampling
+ A complex form of cluster and stratified sampling
+ Carried out in stages
+ Using smaller and smaller sampling units at each stage
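The difference between one-stage and two-stage cluster sampling can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the outlet names, the number of areas, and both function names are hypothetical, and the standard textbook definitions are used (one-stage keeps every member of each selected cluster, two-stage draws a further random sample inside each selected cluster).

```python
import random

def one_stage_cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep every member of each chosen cluster."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, rng):
    """Pick clusters at random, then draw a simple random sample inside each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        members = clusters[name]
        size = min(n_per_cluster, len(members))  # a cluster may be smaller than requested
        sample.extend(rng.sample(members, size))
    return sample

# Hypothetical population: 5 areas (clusters) with 6 outlets each.
outlets = {
    f"area_{a}": [f"area_{a}_outlet_{i}" for i in range(6)]
    for a in range(5)
}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
one_stage = one_stage_cluster_sample(outlets, 2, rng)
two_stage = two_stage_cluster_sample(outlets, 2, 3, rng)

print(len(one_stage))  # 2 clusters x 6 outlets = 12
print(len(two_stage))  # 2 clusters x 3 outlets = 6
```

Multi-stage sampling repeats the second step with smaller and smaller units, for example areas, then outlets, then customers of each outlet.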

Compare the probability sample methods

Simple random

Advantages:

This type of research involves basic observation and recording skills. It requires no basic skills out of the population base or the items being researched. It also removes any classification errors that may be involved if other forms of data collection were being used.

- No additional information required except the contact information:

Researchers only need the contact information of the respondents to choose random people for the survey.

- Reduce researcher bias:

Two common approaches are used in random sampling to limit potential bias in the data. The first is the lottery method, in which a draw across the population group decides who will be included and who will not. Researchers can also assign random numbers to specific individuals and then select a random collection of those numbers to be part of the project.

- It offers an equal chance of selection for everyone within the population:

It allows everyone or everything within a defined region to have an equal chance of being selected. This helps to create more accuracy within the data collected because everyone and everything has the same probability of selection. It is a process that builds an inherent "fairness" into the research being conducted because no previous information about the individuals or items involved is included in the data.

Disadvantages:

- Identification of all members of the population can be difficult:

Simple random sampling can yield an accurate statistical measure of a large population only when a complete list of the entire population to be researched is available. Consider a list of university students or of workers at a particular business. The issue is the availability of these lists, and accessing the entire list can be difficult. Some colleges or universities may not want to give researchers an exhaustive list of their faculty or students. Similarly, some businesses might be unable or unwilling to provide information about particular employee groups due to privacy policies.

- Time-consuming for a large population:

When a full list of a larger population is not available, individuals attempting to conduct simple random sampling must gather information from other sources. If publicly available, smaller subset lists can be used to recreate a full list of a larger population, but this strategy takes time to complete.

... or smaller subgroup lists from a third-party data source.
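The random-number approach described above can be sketched with Python's standard library. This is a minimal illustration, assuming a complete list of the population is available; the roster of 54 students and the function name `simple_random_sample` are hypothetical. `random.Random.sample` draws without replacement, so each member of a population of size N has the same chance, n/N, of ending up in a sample of size n.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members, each with the same chance of being selected."""
    rng = random.Random(seed)  # a seed makes the draw repeatable for auditing
    return rng.sample(population, n)

# Hypothetical sampling frame: a numbered roster of 54 students.
students = [f"student_{i:02d}" for i in range(1, 55)]

chosen = simple_random_sample(students, 10, seed=42)
print(len(chosen))                    # 10
print(10 / len(students))             # inclusion probability n / N for each student
```

Recording the seed is one way to let others verify that the draw was not hand-picked.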

Systematic

Advantages:

- Easy to construct, execute, compare, and understand:

Because the formula for choosing sample subsets is predetermined, the only random aspect of the study is choosing the initial subject. From there, the selection process follows a fixed pattern until the desired sample group is complete. Additionally, since systematic sampling builds representative data for the overall group, researchers don't need to number each subject. This means sample selection and data analysis are quick and easy.

- Samples are evenly distributed:

Systematic sampling is highly structured, resulting in a more authentic representation of the overall population. No matter how diverse the group is, this selection process produces an evenly distributed collection of subjects. This makes the results easier to compare, execute, and analyze.

- Quick and cost-effective:

The way systematic sampling is structured makes surveys easy to create and the data easy to analyze. This type of sampling is also effective when the budget is tight because the sample selection process is relatively straightforward, with no further research needed at the outset.

Disadvantages:

- High sampling bias if periodicity exists:

If study participants deduce the sampling interval, this can bias the sample, as non-participants will be different from study participants. Bias also arises when the list itself has a repeating pattern that coincides with the sampling interval.

- Greater risk of data manipulation:

There is a greater risk of data manipulation with systematic sampling because researchers might be able to construct their systems to increase the likelihood of achieving a targeted outcome rather than letting the random data produce a representative answer. Any resulting statistics could not be trusted.

- Success relies on population count:

The effectiveness of systematic sampling depends on the initial count of the population. After all, that is the number that is divided by the desired sample size to determine the fixed interval for sample selection. When the population isn't measurable or available, researchers have to be able to make a close approximation. If the population is estimated to be smaller or larger than its actual number, this can affect the samples and produce inaccurate results.
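The fixed interval mentioned above is the population count N divided by the desired sample size n. A minimal Python sketch of the procedure follows; the function name, the frame of 300 numbered visitors, and the choice to round the interval down are assumptions made for illustration.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Random start inside the first group, then every k-th member after it."""
    N = len(frame)
    k = N // n  # sampling interval k = N / n, rounded down (an assumption)
    if k < 1:
        raise ValueError("sample size cannot exceed the population size")
    rng = random.Random(seed)
    start = rng.randrange(k)  # the only random choice: a position in the first group
    return [frame[start + i * k] for i in range(n)]

# Hypothetical frame: 300 visitors numbered in order of arrival.
visitors = list(range(1, 301))

picked = systematic_sample(visitors, 10, seed=7)
gaps = [b - a for a, b in zip(picked, picked[1:])]
print(gaps)  # nine gaps, all equal to the interval k = 30
```

Because k is computed from N, a wrong estimate of the population count changes the interval and therefore who is selected, which is the dependence on population count described above.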

Stratified

Advantages:

- More accurate sample:

It can be more accurate than other sampling techniques because it divides the population into smaller groups, or strata, based on important characteristics. This allows researchers to gather more precise data and make more accurate predictions about the larger population.

- Effective representation of all subgroups:

Stratified random sampling also ensures that each stratum is represented in the sample, which helps to reduce bias and increase the accuracy of the data. This makes it an ideal technique for studying populations that are diverse and have distinct subgroups.

- Comparisons:

Stratified sampling facilitates valid comparisons between different subgroups within the population. Researchers can analyze and compare results across strata, gaining insights into similarities and variations among different segments.

- Statistical inference:

Stratified sampling assists in statistical inference by improving representativeness and reducing bias. Estimates derived from stratified samples often yield smaller sampling errors, thus enhancing the reliability and robustness of the findings.

Disadvantages:

- Complex to apply at practical levels:

Stratified random sampling is a more complex and time-consuming technique than other sampling methods. This is because it requires researchers to divide the population into smaller strata and then sample from each stratum in proportion to its size. This can be a challenging task, especially for large or diverse populations.

- Increased cost and time:

Stratified sampling may involve increased costs, as researchers need to collect data from multiple strata. The process of identifying, selecting, and sampling from each stratum can also consume more time and resources compared to simpler sampling methods.
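Sampling from each stratum in proportion to its size can be sketched as follows. This is an illustration under assumptions, not the authors' procedure: the age bands, their sizes, and the function name are hypothetical, and rounding each stratum's share to the nearest whole number means the total can differ slightly from the requested size when the shares are not whole numbers.

```python
import random

def proportional_stratified_sample(strata, n, seed=None):
    """Draw a simple random sample from every stratum, sized by its population share."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Share of the overall sample for this stratum, rounded to a whole number.
        size = round(n * len(members) / total)
        sample[name] = rng.sample(members, size)
    return sample

# Hypothetical strata: respondents grouped by age band.
strata = {
    "20-30": [f"a{i}" for i in range(50)],
    "30-40": [f"b{i}" for i in range(30)],
    "40-50": [f"c{i}" for i in range(20)],
}

result = proportional_stratified_sample(strata, 20, seed=1)
sizes = {name: len(members) for name, members in result.items()}
print(sizes)  # {'20-30': 10, '30-40': 6, '40-50': 4}
```

Every stratum appears in the sample, which is the representation guarantee described above; the extra bookkeeping of building and maintaining the strata is the source of the added cost.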

Cluster

Advantages:

- Convenience:

Cluster sampling simplifies the logistics of data collection. Researchers can concentrate their efforts on selected clusters, making it more convenient to reach and interview participants within those designated areas.

- Cost efficient:

Cluster sampling is generally a cost-efficient sampling process. It allows you to gather responses from a certain niche audience without having to pay for the whole sample to come from that audience (which can be expensive, depending on their criteria).

- Applicable where no complete lists of units are available:

Cluster sampling should only be considered when there are economic justifications to use this approach. If reduced costs can be used to overcome precision losses, then it can be a useful tool. This advantage occurs most often when the construction of a complete list ...

Disadvantages:

- May not be representative of the whole population:

Cluster sampling can provide a wonderful dataset that applies to a large population group. It is also essential to remember that the findings can only apply to that specific demographic. That is why generalized findings that apply to everyone cannot be obtained when using this method. One neighborhood is not reflective of an entire city, just as a single state or province isn't reflective of an entire country.

- Biased samples:

If the clusters in each sample are formed with a biased opinion from the researchers, the data obtained can easily be manipulated to convey the desired message. This builds a biased inference about the entire population or demographic into the information.

- Higher sampling error: ...

Title	Advantages and Disadvantages of Primary and Secondary Data
Authors	Lê Nguyễn Ngọc Hảo, Nguyễn Diệu Anh, Nguyễn Phương Thảo, Pham Minh Anh, Luc Thi VG Anh, Nguyễn Thị Thu Hà, Lê Thị Nhàn, Lê Hà Phương Anh, Lê Minh Thu, Nguyễn Huyền Trang
School	NATIONAL ECONOMICS UNIVERSITY
Major	STATISTICS
Type	Report

Format
Pages	27
File size	1.55 MB