Computer Assignment

Sociology 166-461B

Todd Ferguson (9833181)

Dr. Kara Joyner

 

Choice of Sample

I selected only respondents between the ages of 18 and 65, to control for the effects of both retirement among older respondents and after-school jobs among younger respondents. Since only one respondent in the sample was 18 and none were younger, this amounted to including only those under 65 in the regression analysis.

The Dependent Variable

 

I have selected Respondent's Income as the dependent variable to be examined. I am interested in exploring the relationship between income and various other social factors. In addition, Respondent's Income is a variable whose distribution is suitable for the purposes of OLS regression analysis.

The Independent Variables

The independent variables I examined were age of respondent; highest degree completed (replacing highest year of school completed); sex of respondent; average number of hours spent watching TV each day; number of sex partners respondent had in the previous year; frequency of sex during the previous year; whether the respondent was married or not; and the number of children reported by the respondent. The relationships between some of these variables and income may not be readily apparent. Income tends to peak during an individual's prime working years, so age should produce a bell-shaped curve in the distribution of income. Education is believed to be positively correlated with income. I assumed that those watching more hours of television would spend less time, for example, improving their human capital, and that their income would consequently be lower. If a respondent had sex often and with many different partners, I assumed that this would affect the amount of time and energy left to earn an income (unless having sex with different partners was done to earn income!). I hoped to establish that people choose to marry after they assess their income to be high enough to support a family; the same rationale applies to the number of children reported.

Coding of Independent Variables

Several of the independent variables I selected needed to be collapsed into broader categories. The variable age was collapsed into ten five-year aggregates, from 18 to 64; the variable hours spent watching television had the top categories collapsed into a "7 hours or more" grouping; the variable number of children had 5, 6, 7, or 8 or more children collapsed into one category; and the variable number of sex partners had the upper limits collapsed into a category for 4 or more partners. I replaced years of schooling with the variable highest degree completed, as this variable corresponded to the groupings that the former would have needed to be collapsed into. The variables sex of respondent, frequency of sex and married did not require re-coding, as the number of cases in each category was sufficiently large.
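The collapsing described above can be sketched in a few lines of Python. This is only an illustration, not the SPSS recode actually used: the function names, the sample responses, and the exact boundaries of the five-year age groups (18-22, 23-27, and so on) are assumptions made for the example.

```python
def top_code(value, cap):
    """Collapse every value at or above `cap` into the single category `cap`."""
    return min(value, cap)


def age_group(age, start=18, width=5, groups=10):
    """Map an age in years to a 1-based five-year group.

    The boundaries (group 1 = 18-22, group 2 = 23-27, ...) are assumed;
    the paper only states that there are ten five-year aggregates.
    """
    if age < start:
        raise ValueError("age below the sampled range")
    return min((age - start) // width + 1, groups)


# Hypothetical raw responses, for illustration only.
tv_hours = [0, 2, 7, 12]
children = [1, 5, 8]
partners = [0, 3, 4, 9]
ages = [18, 22, 23, 64]

new_tv = [top_code(h, 7) for h in tv_hours]     # "7 hours or more"
new_kids = [top_code(c, 5) for c in children]   # 5 or more children
new_part = [top_code(p, 4) for p in partners]   # 4 or more partners
new_age = [age_group(a) for a in ages]          # groups 1 to 10

print(new_tv, new_kids, new_part, new_age)
```

Top-coding in this way keeps the sparse upper categories from being estimated on only a handful of cases.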

Mean Values

The mean values for all variables I examined were as follows:

 

Variable                            N      Minimum   Maximum   Mean     Std. Deviation
Respondent's Income                 994    1         22        12.80    5.62
Respondent's Sex                    1500   1         2         1.57     .49
RS Highest Degree                   1496   0         4         1.41     1.18
NEWPART                             1367   .00       4.00      .9993    .8396
NEWTV                               1489   .00       7.00      2.7676   1.7242
NEWKIDS                             1495   .00       5.00      1.7900   1.5126
NEWAGE                              1210   1.00      10.00     4.9479   2.3702
Married ?                           1499   1.00      2.00      1.4696   .4992
Frequency of Sex During Last Year   1330   0         6         2.88     1.98
Valid N (listwise)                  849

The mean values for the best fit, that between income and highest degree completed (which correlated with a Pearson's r of .353), were:

 

Respondent's Income

RS Highest Degree   Mean    N     Std. Deviation
Less than HS        8.97    106   5.64
High school         12.11   528   5.44
Junior college      12.81   73    5.14
Bachelor            14.75   191   4.85
Graduate            16.92   95    4.43
Total               12.80   993   5.62
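As a quick consistency check on the table above, the overall mean should equal the average of the group means weighted by group size. The sketch below copies the means and Ns from the table; the variable names are illustrative. Because the group means are themselves rounded to two decimals, the result can only be expected to match the reported total of 12.80 to within rounding error.

```python
# (mean income category, N) for each degree group, copied from the table.
groups = {
    "Less than HS": (8.97, 106),
    "High school": (12.11, 528),
    "Junior college": (12.81, 73),
    "Bachelor": (14.75, 191),
    "Graduate": (16.92, 95),
}

total_n = sum(n for _, n in groups.values())
weighted_mean = sum(mean * n for mean, n in groups.values()) / total_n

print(total_n, round(weighted_mean, 2))
```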

Effects of Individual Variables

Variable            Effect on Income   Significance
Age                 .596               0
Degree              1.178              0
Sex                 -2.889             0
Hours of TV         -.656              0
# of Sex Partners   .232               .287
Frequency of Sex    .130               .242
Married?            -.904              .018
# of Children       -.128              .373

 

Here we can immediately see that the variables number of children, number of sex partners and frequency of sex are not statistically significant predictors of income: their p-values (.373, .287 and .242) are well above the .10 cut-off that corresponds to even a 90% level of confidence.

Of the remaining variables, the strongest effects appear to be due to the sex of the respondent, whether or not the respondent was married, the highest education degree they had received, their age, and how much TV they watched. Therefore, these five variables will be combined in my final model.
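The screening step can be written out explicitly. The sketch below takes the significance values from the table of individual effects and keeps the variables whose p-value falls below a cut-off. The .05 threshold is an assumption chosen for illustration (the text names only the 90% level), and the values reported as 0 are treated as 0.0, although they are presumably rounded.

```python
ALPHA = 0.05  # assumed cut-off for illustration

# Significance of each variable's individual effect on income.
p_values = {
    "Age": 0.0,
    "Degree": 0.0,
    "Sex": 0.0,
    "Hours of TV": 0.0,
    "# of Sex Partners": 0.287,
    "Frequency of Sex": 0.242,
    "Married?": 0.018,
    "# of Children": 0.373,
}

retained = [name for name, p in p_values.items() if p < ALPHA]
dropped = [name for name, p in p_values.items() if p >= ALPHA]

print("retained:", retained)
print("dropped:", dropped)
```

Raising the cut-off to .10 would not change the outcome here, since the smallest p-value among the dropped variables is .242.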

FINAL MODEL

Descriptives

Variable              N      Minimum   Maximum   Mean     Std. Deviation
Respondent's Income   994    1         22        12.80    5.62
Respondent's Sex      1500   1         2         1.57     .49
RS Highest Degree     1496   0         4         1.41     1.18
NEWTV                 1489   .00       7.00      2.7676   1.7242
NEWAGE                1210   1.00      10.00     4.9479   2.3702
Married ?             1499   1.00      2.00      1.4696   .4992
Valid N (listwise)    940

Regression

 

Model Summary

 

Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .521   .271       .267                4.73

a Predictors: (Constant), Respondent's Sex, NEWAGE, RS Highest Degree, Married ?, NEWTV

 

 

ANOVA

 

Model            Sum of Squares   df    Mean Square   F        Sig.
1   Regression   7780.780         5     1556.156      69.483   .000
    Residual     20917.955        934   22.396
    Total        28698.735        939

a Predictors: (Constant), Respondent's Sex, NEWAGE, RS Highest Degree, Married ?, NEWTV

b Dependent Variable: Respondent's Income
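The Model Summary figures follow directly from the ANOVA sums of squares, and the arithmetic can be reproduced as a check. The sketch below uses the standard definitions (R Square as regression over total sum of squares, the usual adjustment for the number of predictors, F as the ratio of mean squares, and the standard error of the estimate as the root of the residual mean square). The variable names are illustrative.

```python
import math

# Values copied from the ANOVA table.
ss_regression = 7780.780
ss_residual = 20917.955
ss_total = 28698.735
df_regression = 5
df_residual = 934
df_total = df_regression + df_residual  # 939, i.e. N - 1 for 940 cases

r_square = ss_regression / ss_total
adjusted_r_square = 1 - (1 - r_square) * df_total / df_residual

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual
std_error_estimate = math.sqrt(ms_residual)

print(round(r_square, 3), round(adjusted_r_square, 3))
print(round(f_stat, 3), round(std_error_estimate, 2))
```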

 

Coefficients

 

                           Unstandardized          Standardized
                           Coefficients            Coefficients
Model                      B         Std. Error    Beta            t         Sig.
1   (Constant)             14.949    .882                          16.946    .000
    NEWTV                  -.680     .104          -.189           -6.524    .000
    NEWAGE                 .542      .072          .215            7.525     .000
    Married ?              -.725     .319          -.065           -2.270    .023
    RS Highest Degree      1.263     .136          .270            9.312     .000
    Respondent's Sex       -2.783    .311          -.252           -8.950    .000

a Dependent Variable: Respondent's Income

 

 

 

From the above, we can see that these five variables explain 27.1% of the variation in respondents' incomes. Using the formula 1 - R Square, we can also see that 72.9% of the variation is left unexplained and must be due to other variables not accounted for in this model. Those variables could be anything, from type of education to regional employment rates. For example, the combined R Squares of the other independent variables account for 20.5% of the remaining differences. Still, explaining 27% of the variation with these five variables is somewhat convincing, although R Square measures association rather than causality. Where every x is zero, the predicted value of y would be the constant, 14.949, which translates to an annual income of approximately $25,000.
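The unstandardized coefficients define a prediction equation, which the following sketch evaluates. The coefficients are copied from the Coefficients table; the function name and the example respondent are hypothetical, and the category codes used for the example are placeholders rather than a statement of what each code means. Note that sex and marital status are coded 1 and 2 in the descriptives, so the all-zero case that yields the constant lies outside the observed coding.

```python
CONSTANT = 14.949

# Unstandardized B coefficients from the final model.
COEFFICIENTS = {
    "NEWTV": -0.680,
    "NEWAGE": 0.542,
    "Married": -0.725,
    "Degree": 1.263,
    "Sex": -2.783,
}


def predict_income(values):
    """Return the predicted income category for a dict of predictor values."""
    return CONSTANT + sum(COEFFICIENTS[name] * values[name] for name in COEFFICIENTS)


# All predictors at zero reproduce the constant.
baseline = predict_income({name: 0 for name in COEFFICIENTS})

# Hypothetical respondent: TV category 2, age group 5, married code 1,
# degree code 3, sex code 1.
example = predict_income(
    {"NEWTV": 2, "NEWAGE": 5, "Married": 1, "Degree": 3, "Sex": 1}
)

print(round(baseline, 3), round(example, 3))
```

The result is on the scale of the income variable, which runs from 1 to 22 in the descriptives, so it is an income category rather than a dollar amount.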