

8 --- Data analysis

Sections:

8.1   Testing the Barnett curves
8.2   Notation for the experimental variables
8.3   Manipulation steps
8.4   The Barnett hypotheses
8.5   Other hypotheses

8.1 Testing the Barnett curves

The Barnett model can quantify, test, and elaborate on the common opinion that:

Humanities scholarship continues to utilize older materials while the social sciences emulate the scientific model emphasizing the latest materials.[118]

8.2 Notation for the experimental variables

For the various distributions, let

t = citation age, in years, i.e., (1996 - publication date)

u(t) = the experimentally derived frequency distribution, for 0 ≤ t ≤ 50 years, i.e., the percentage of citations as a function of citation age

Umax = the maximum value of u(t) for a given experimental distribution

t* = the modal citation age; i.e., when Umax occurs

µ = the mean citation age, i.e., [sum of all u(t).t] / [sum of all u(t)], for all 0 ≤ t ≤ 50

8.3 Manipulation steps

For each thesis

1) Tabulate u(t) for 0 ≤ t ≤ 50

2) Use step-1 to calculate, for each subject distribution,

· its µ

· its standard deviation, s = square root of ([sum of all u(t).[t-µ]²] / [sum of all u(t)])

· its skew, [sum of all u(t).[t-µ]³] / ([sum of all u(t)].s³)

· its kurtosis, [sum of all u(t).[t-µ]^4] / ([sum of all u(t)].s^4)

3) For all the Humanities theses (as a single distribution), use Step-1 to

· tabulate u(t)

· determine Umax

· determine t*

4) Repeat step-3 above for all the Social Sciences theses.
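The calculations in Steps 1 to 3 can be sketched in Python as follows. This is an illustrative sketch only, not the procedure actually used for the project: the function name describe and the toy data are hypothetical, and u is assumed to be a list in which the index is the citation age t and the value is u(t).

```python
from math import sqrt

def describe(u):
    """Summarize one distribution; u[t] is u(t) for citation age t = 0, 1, 2, ..."""
    total = sum(u)
    mean = sum(w * t for t, w in enumerate(u)) / total

    def central(k):
        # k-th central moment of t, weighted by u(t)
        return sum(w * (t - mean) ** k for t, w in enumerate(u)) / total

    s = sqrt(central(2))
    u_max = max(u)
    return {
        "mean": mean,                    # µ
        "sd": s,                         # s
        "skew": central(3) / s ** 3,
        "kurtosis": central(4) / s ** 4, # not "excess" kurtosis: no 3 is subtracted
        "u_max": u_max,                  # Umax
        "t_star": u.index(u_max),        # t*, the first age at which Umax occurs
    }

# Hypothetical toy distribution over ages 0..2, for illustration only.
summary = describe([1, 2, 1])
```

Because every statistic is divided by the sum of all u(t), the same function works whether u(t) holds percentages or raw citation counts.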

8.4 The Barnett hypotheses

To summarize: [119]

     Fields differ in citation patterns; ... science is least likely to cite work in the distant past with social sciences next, ... humanities are expected to be most likely to quote from the older literature.
     ... If [social] scientists cite a greater percentage of more recent articles, their discipline's curve should reach a higher peak in a shorter period of time. Disadoption should begin earlier as the articles in use are replaced by new, up-to-date ones. In the arts and humanities, we expect that it takes a long period of time for ideas to be adopted and longer for the disadoption of an old idea. Therefore the arts and humanities curve should be flatter overall. Because a smaller percentage of citations should come from any given year, it should have a lower peak. The arts and humanities curve should decline more slowly, over a longer period of time.

Truncated versions of Barnett's hypotheses are immediately applicable to this project.[120]

Hypothesis-1A: A Barnett curve may be generated for the Humanities citations that will fit the distribution data better than any straight line.

Test. With the data from §8.3 Step-3:

1. Using regression, calculate R(t) = the line of best fit for u(t)

2. Generate y(t), a Barnett curve, which has the same approximate area and mean citation age as u(t); i.e.,

0% << [sum of all y(t)]  which approximately equals  [sum of all u(t)] < 100%

and t_µ = [sum of all y(t).t] / [sum of all y(t)]  which approximately equals  [sum of all u(t).t] / [sum of all u(t)] = µ

3. Compare the variance of the line & the curve,

i.e., compare  sum of all [u(t) - y(t)]²  with  sum of all [R(t) - u(t)]²

Since u(t), y(t), and R(t) will be expressed as percentages (numbers out of 100), the variances or error factors will be expressed as numbers out of 10,000 in order to be more easily comprehended and verified.

4. If necessary, re-generate a better y(t), such that

sum of all [y(t) - u(t)]² < sum of all [R(t) - u(t)]²
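The comparison in steps 1, 3, and 4 can be sketched in Python as follows. The sketch assumes that a candidate Barnett curve y(t) has already been generated by some other means and is supplied as a list of the same length as u(t); the form of the curve itself is not reproduced here. All function names are hypothetical.

```python
def best_fit_line(u):
    """Ordinary least-squares line R(t) through the points (t, u[t])."""
    n = len(u)
    t_mean = (n - 1) / 2
    u_mean = sum(u) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (w - u_mean) for t, w in enumerate(u))
    slope = sxy / sxx
    intercept = u_mean - slope * t_mean
    return [intercept + slope * t for t in range(n)]

def sum_sq_error(model, u):
    """Sum of all [model(t) - u(t)]^2."""
    return sum((m - w) ** 2 for m, w in zip(model, u))

def curve_beats_line(y, u):
    """True when the candidate curve y fits u more closely than the line R."""
    return sum_sq_error(y, u) < sum_sq_error(best_fit_line(u), u)
```

Since the least-squares line has the smallest sum of squared errors of all straight lines, a curve that beats it fits the data better than any straight line, which is what the hypothesis requires.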

Hypothesis-1B: A Barnett curve may be generated for the Social Sciences citations that will fit the distribution data better than any straight line.

Test. As for Hypothesis-1A, but using data from §8.3 Step-4.

Hypothesis-2: t*_SS < t*_H

Test. Arithmetical result. Compare results from §8.3 Steps 3 and 4. Note that this hypothesis is slightly different from Barnett's Hypothesis-2; we are comparing the modal citation ages, whereas Barnett et al. were comparing the modal years.[121]

Hypothesis-3: Umax,SS > Umax,H

Test. Arithmetical result. Compare results from §8.3 Steps 3 and 4.[122]
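The pooling in §8.3 Steps 3 and 4, and the two values compared in Hypotheses 2 and 3, might be sketched in Python as follows. The sketch assumes that each thesis has been tabulated as a list of raw citation counts indexed by citation age; the function names, the optional total_citations argument, and the toy counts are all hypothetical.

```python
def pool(theses, total_citations=None):
    """Combine per-thesis citation counts into one percentage distribution u(t).

    total_citations may include citations older than the tabulated range,
    in which case the percentages sum to less than 100.
    """
    ages = len(theses[0])
    counts = [sum(thesis[t] for thesis in theses) for t in range(ages)]
    total = total_citations or sum(counts)
    return [100.0 * c / total for c in counts]

def peak(u):
    """Return (t*, Umax): the modal citation age and the peak percentage."""
    u_max = max(u)
    return u.index(u_max), u_max

# Hypothetical toy counts for ages 0..3, for illustration only.
humanities = pool([[0, 1, 2, 1], [0, 1, 1, 2]])
social_sciences = pool([[1, 3, 0, 0]])
t_star_h, u_max_h = peak(humanities)
t_star_ss, u_max_ss = peak(social_sciences)
```

The two hypotheses are then tested by comparing t_star_ss with t_star_h, and u_max_ss with u_max_h.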

8.5 Other hypotheses

There are other testable premises which are appropriate when examining the continuum of recent-through-archaic Barnett curves, i.e., the theoretical spectrum of citation patterns from the ultra-scientific to the extremely humanistic. As mentioned above:[123]

     [Social] science is least likely to cite work in the distant past ... humanities are expected to be most likely to quote from the older literature.
     ... In the arts and humanities, we expect that it takes a long period of time for ideas to be adopted and longer for the disadoption of an old idea. Therefore the arts and humanities curve should be flatter overall. Because a smaller percentage of citations should come from any given year, it should have a lower peak.

It therefore follows that, in general, the citation ages of recent distributions are less than the citation ages of archaic distributions --- i.e.,

a) the mean citation age increases in the progression from recent to archaic.

Furthermore, recent distributions, when graphed, are sharper [or leptokurtic]; archaic distributions are flatter [or platykurtic] --- i.e.,

b) the kurtosis lessens in the progression from recent to archaic.

Additionally, in accord with the pattern suggested by Barnett et al.,[124] intensely recent [science-like] distributions have a significant cluster of citations, including the mode, with a citation age less than the arithmetic mean age. Conversely, archaic distributions are likely to have fewer citations with a citation age less than the mean; hence,

c) the skew of the curve shifts from the left (a positive skew) towards the right, i.e., the skew decreases, in the progression from recent to archaic.

These three correlated trends (a, b, and c) should be reflected in the various experimentally derived frequency distributions for all six examined subjects.

Hypothesis-4: As µ increases, the skew decreases.
Hypothesis-5: As µ increases, the kurtosis decreases.
Hypothesis-6: As the skew decreases, the kurtosis also decreases.

Test. With the data from §8.3 Step-2, perform rank-order comparisons between the subjects ranked by µ, skew, and kurtosis. Spearman's rho coefficient will suffice to test for any statistically significant correlations because identical values (and thus ties in the ranking orders) are most unlikely. Spearman's rho is used (rather than, say, Pearson's r) because it is the trend that we wish to test; the mathematical model does not suggest that any pair of these three quantities has a linear relationship.
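The rank-order comparison can be sketched in Python as follows. This is a minimal sketch that assumes there are no ties, as expected above, and so uses the simplified formula rho = 1 - 6.[sum of all d²] / (n.[n² - 1]), where d is the difference between the two ranks of a subject and n is the number of subjects. The function names and the six values in each list are hypothetical and are not results.

```python
def ranks(values):
    """Rank 1 for the smallest value; assumes that no two values are equal."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def spearman_rho(x, y):
    """Spearman's rho for paired observations without ties."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical values for six subjects, for illustration only.
mean_age = [8.1, 9.4, 11.0, 12.7, 15.2, 18.9]
skew = [2.9, 2.6, 2.2, 1.9, 1.5, 1.1]
rho = spearman_rho(mean_age, skew)   # perfectly opposed rankings give -1.0
```

Hypotheses 4 and 5 predict a negative rho and Hypothesis 6 a positive one; whether a given rho is statistically significant for six subjects must still be checked against a table of critical values.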


Footnotes to chapter 8

118.  Buchanan and Herubel, "Comparing Materials," 69. This generalization rests on scant evidence (5 philosophy dissertations vs. 5 political science dissertations), but it is a belief held by most authorities.

119.  See Barnett, Fink, and Debus, "Mathematical Model," 515-6.

120.  These hypotheses are a paraphrase of ibid., 516-8, but obviously excluding references to Science citations.

121.  Unfortunately Barnett et al. initially use "t" to represent citation age, yet also use it to represent the parameter of publication year; i.e., t_age = ([Year]_0 - t_year) in effect. I have avoided this unnecessary confusion. Since Barnett's year parameter is of opposite sign to the age, Barnett's Hypothesis 2 may be translated as: "The modal date of the Social Science distribution is greater [i.e., chronologically later] than the modal date of the Humanities distribution."

122.  Again, Hypothesis-3 in Barnett et al. has a superficially different form, because they use y(t) to represent both the theoretical and experimental distributions. I hope that I have avoided any confusion by employing

u(t) for the actual distribution of the counted data,  and
y(t) for the theoretical distribution it may represent.

123.  See Barnett, Fink, and Debus, "Mathematical Model," 515-6.

124.  Barnett, Fink, and Debus, "Mathematical Model," figure 1, p.517.

