Ministry of Education and Training
National Economics University
Nguyễn Thị Thu Quỳnh - 11219551
Class: Financial economics 63
HÀ NỘI, 05/2022
Question 1:
1.
$12500 TO 14999
$15000 TO 17499
$17500 TO 19999
$20000 TO 22499
$22500 TO 24999
$25000 TO 29999
$30000 TO 34999
$35000 TO 39999
$40000 TO 49999
$50000 TO 59999
$60000 TO 74999
$75000 TO $89999
$90000 TO $109999
$110000 OR OVER
-> It does not make sense to use a histogram for this variable. A histogram is the most frequently used graph for showing frequency distributions. Its x-axis represents intervals on the scale of values that the measurements fall into, while its y-axis shows how often values occur within each interval. In this case, however, the x-axis does not represent the desired data (the amount of family income last year) but the code assigned to each income group. In addition, the missing-values column is not clearly identified because it lies among the non-missing values, which invites misunderstanding.
-> A bar chart makes sense in this situation because it shows the distribution of data points while also allowing comparison across discrete income groups. From the bar chart, we can see the most common group, the direction of the trend, and how the groups compare with one another.
- The median is often considered the best representation of the data's middle location in skewed distributions.
same width, so while it makes it easier to read and comprehend the graph, it is not required.
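The point about the median in skewed distributions can be illustrated with a small sketch. The incomes below are hypothetical values chosen only to produce a right-skewed sample; they are not taken from the survey data.

```python
from statistics import mean, median

# Hypothetical family incomes in dollars (NOT from the survey):
# four moderate values and one very large one, so the sample is right-skewed.
incomes = [20_000, 25_000, 30_000, 35_000, 400_000]

avg = mean(incomes)    # pulled upward by the single large income
mid = median(incomes)  # middle value of the sorted list

print(f"mean = {avg}, median = {mid}")
```

In this toy sample the mean lies above four of the five incomes, while the median stays at the middle value, which is why the median is usually preferred for describing the centre of skewed income data.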
● Other ways to record income
1. Compiling a database of the occupations of the family members
-> Divide the members of a household into different categories.
For example:
- Adults (>18 years old)
- Teenagers (<18 years old)…
-> Then divide each category into smaller groups and focus on the careers within it.
2. Collecting the data by age
-> Divide family members into different age groups and run a survey to gather income information.
For example:
16 – 18 (<18), 18 – 24, 24 – 30, 30 – 36,…
● The problems associated with them
- Because the techniques are complex, synthesising the data will take a long time.
- They may also be expensive to carry out.
- The mean values will be invalid because the variance between variables is so large.
Question 2:
Frequencies
Hours per day watching TV
Frequency   Percent   Valid Percent   Cumulative Percent
other interval, smaller than 3. Only 12 has a high frequency of 13, which is unusual.
b) Based on the frequency table, the valid percentages of people who:
- Don't watch any TV: 6%
- Watch 2 hours or less: 53.1%
- Watch five hours or more: 100% - 83.3% = 16.7%
- Watch 1 hour: 20.9%
- Watch 4 hours or less: 83.3%
c)
Statistics
Hours per day watching TV
The value for the 25th percentile: 1.00
The value for the 50th percentile: 2.00
The value for the 75th percentile: 4.00
The value for the 90th percentile: 6.00
The value for the median: 2
The value for the mode: 2
d) Problem in the bar chart
In general, the dataset is not evenly distributed. As can be seen from the bar graph below, most of the respondents watch TV for 1 to 4 hours per day, whereas only a minority watch TV for more than 10 hours.
Moreover, the values 9, 13, 16, 17, 18, 19, 21, 22 and 23 are not included in the bar chart because they do not appear in the survey answers (this might have occurred because the number of respondents is not large enough). The problem with the bar chart is therefore that it shows no gap for these uncollected values (so-called missing values), which can lead to misunderstandings. In addition, it is hard to identify the trend from the bar chart because there are many bars with an uneven distribution.
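The absent values can be recovered mechanically, which also suggests one way to make the gaps visible. The sketch below assumes that the possible answers run from 0 to 24 hours and that the observed values are exactly those not listed as absent in the text; `fill_gaps` is a hypothetical helper name, and the counts passed to it in the example are placeholders, not survey figures.

```python
# Assumption: answers can range from 0 to 24 hours per day.
ALL_HOURS = range(0, 25)

# Values inferred to appear in the bar chart (everything the text
# does not list as absent).
observed_hours = {0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 14, 15, 20, 24}

# Values with no bar at all in the chart.
absent_hours = [h for h in ALL_HOURS if h not in observed_hours]
print(absent_hours)


def fill_gaps(counts, low, high):
    """Return a count for every value from low to high, using 0 where none was recorded."""
    return {h: counts.get(h, 0) for h in range(low, high + 1)}


# Placeholder counts (hypothetical): value 9 now appears with a count of 0,
# so a chart drawn from this table would leave a visible gap.
print(fill_gaps({8: 5, 10: 2}, 8, 10))
```

Plotting the filled table instead of only the observed values keeps the horizontal axis evenly spaced, so a reader can see where no answers were recorded.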
All the values in the histogram are clumped together because a histogram represents a continuous data set: it is a graphical representation that shows the frequency of numerical data. This differs from a bar chart, which is a pictorial representation that uses bars to compare different categories of data. The bars in a bar chart do not touch each other, so there are spaces between them.
Also, the histogram is positively skewed with a long tail to the right, which means that most of the values lie on the left. Observing the histogram, we conclude that most people taking the survey watch TV for 1 to 4 hours, while very few of them watch more than 10 hours.
Bar charts and histograms both display data, but for different purposes. Bar charts allow us to compare specific variables or categories. Histograms allow us to understand the distribution of a variable or the frequency of specific occurrences. In this case, a histogram is a better choice than a bar chart because there are many categories and it is more important to understand the distribution of the variable (the number of hours of TV people most commonly watch).
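The arithmetic behind parts b) and c) of this question can be checked with a short sketch. It uses only the cumulative valid percentages quoted in part b), assumes that the reported 6% is exactly 6.0, and applies a simple "first value whose cumulative percent reaches p" rule for percentiles; SPSS may use a different interpolation rule, so this is an illustration rather than a reproduction of its output. The function names are hypothetical.

```python
# Cumulative valid percentages quoted in part b): value -> percent of
# respondents watching that many hours or fewer.
# The entry for 1 hour is 6.0 (no TV) + 20.9 (exactly 1 hour).
cumulative = {0: 6.0, 1: 26.9, 2: 53.1, 4: 83.3}


def share_above(hours):
    """Percent watching more than `hours`: complement of the cumulative percent."""
    return round(100.0 - cumulative[hours], 1)


def share_exactly(hours, previous):
    """Percent watching exactly `hours`: difference of two cumulative percents."""
    return round(cumulative[hours] - cumulative[previous], 1)


def percentile(p):
    """Smallest listed value whose cumulative percent reaches p."""
    for hours in sorted(cumulative):
        if cumulative[hours] >= p:
            return hours
    raise ValueError("p lies beyond the listed cumulative percentages")


print(share_above(4))       # five hours or more
print(share_exactly(2, 1))  # exactly two hours
print(percentile(25), percentile(50))
```

Under these assumptions the share watching exactly 2 hours is larger than the 20.9% watching 1 hour, which is consistent with the reported mode of 2. Only the 25th and 50th percentiles are checked here because the cumulative percent for 3 hours is not quoted in the text.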
Question 3:
To distinguish people who are very happy with their marriage from those who are less content, our group has decided to use mean family income (ranges recoded to midpoints), that is, the variable incomdol.
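Recoding a range to its midpoint can be sketched as follows. The report does not state how the midpoints in incomdol were actually computed, so the helper below (a hypothetical `range_midpoint`) only illustrates the idea for labels written in the form "$12500 TO 14999"; open-ended ranges have no upper bound and are left undefined.

```python
def range_midpoint(label):
    """Midpoint of a label such as '$12500 TO 14999'; None for open-ended ranges."""
    parts = label.replace("$", "").split(" TO ")
    if len(parts) != 2:
        return None  # e.g. '$110000 OR OVER' has no upper bound
    low, high = (int(p) for p in parts)
    return (low + high) / 2


for label in ["$12500 TO 14999", "$75000 TO $89999", "$110000 OR OVER"]:
    print(label, "->", range_midpoint(label))
```

Once each range is replaced by a single dollar value, a mean income can be computed for each level of marital happiness, which is what makes the comparison in this question possible.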
After researching, we believe that income has a direct correlation with people's happiness. The chart shows a clear trend: a difference in income corresponds to a significant difference in people's level of satisfaction with their marriage. As can be seen from the figure, people who are happier with their marriage have higher incomes than those who are less happy. The reason might be that people with higher incomes have fewer financial worries and can focus more on building marital happiness with their partners.
To conclude, a high level of happiness in marriage depends greatly on people's income.
Besides mean family income, marital happiness also depends on the mean of the husband's and wife's education (yrs). The two bar charts above show a clear trend: husbands and wives with more years of education are more likely to have a happy marriage. However, the difference is not as significant as that for mean family income.