
Humanities Data Analysis

Chapter 5 (page 188)

    Group 1 entropy: 0.6
    Group 2 entropy: 1.4
    Group 3 entropy: 1.6

As we can see, group1 is the least diverse and group3 is the most diverse. The diversity of group2 lies between the diversity of group1 and group3. This is what we anticipated.

Now that we have a strategy for measuring the variability of observed types, all that remains is to apply it to the data of interest. The following block illustrates the use of entropy to compare the variability of responses to the degree question for respondents in different regions of the United States:

    import scipy.stats
    # Entropy of the distribution of 'degree' responses within each region
    df.groupby('reg16')['degree'].apply(
        lambda x: scipy.stats.entropy(x.value_counts()))

    reg16
    foreign            1.51
    new england        1.35
    middle atlantic    1.32
    e. nor. central    1.25
    w. nor. central    1.21
    south atlantic     1.26
    e. sou. central    1.20
    w. sou. central    1.29
    mountain           1.21
    pacific            1.28

Looking at the entropy values, we can see that respondents from the New England states report having a greater diversity of educational backgrounds than respondents in other states. Entropy here gives us information similar to the proportion of distinct values, but the measure is both better aligned with our intuitions about diversity and usable in a greater variety of situations.

5.6 Measuring Association

5.6.1 Measuring association between numbers

When analyzing data, we often want to characterize the association between two variables. To return to the question we began this chapter with (whether respondents who report having certain characteristics are more likely to read novels), we might suspect that knowing that a region has an above average percentage of people with an advanced degree would “tell us something” about the answer to the question of whether or not an above average percentage has read a work of fiction recently. Informally, we would say that we suspect higher levels of education are associated with higher rates of fiction reading. In this section we will look at two formalizations of the idea of association: the correlation coefficient and the rank correlation coefficient.
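As a rough illustration of the two measures just named, the sketch below computes both for a small made-up table. The column names `pct_advanced_degree` and `pct_read_fiction` and all of the values are hypothetical, invented only for this example; they are not taken from the survey data. It assumes pandas is available and uses `Series.corr` with `method="pearson"` for the correlation coefficient and `method="spearman"` for the rank correlation coefficient.

```python
import pandas as pd

# Hypothetical toy data (NOT survey results): one row per imaginary region.
toy = pd.DataFrame({
    "pct_advanced_degree": [5.0, 8.0, 10.0, 12.0, 20.0],
    "pct_read_fiction": [30.0, 38.0, 41.0, 43.0, 45.0],
})

# Correlation coefficient (Pearson): strength of the linear association.
pearson = toy["pct_advanced_degree"].corr(
    toy["pct_read_fiction"], method="pearson")

# Rank correlation coefficient (Spearman): the Pearson correlation
# computed on the ranks of the values instead of the values themselves.
spearman = toy["pct_advanced_degree"].corr(
    toy["pct_read_fiction"], method="spearman")

print(f"correlation coefficient:      {pearson:.2f}")
print(f"rank correlation coefficient: {spearman:.2f}")
```

In this toy table the second column always rises when the first does, so the rank correlation reaches its maximum even though the relationship is not a straight line; the ordinary correlation coefficient is positive but lower.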

Date posted: 20/11/2022, 11:30
