Computer Assignment

Sociology 166-461B

Todd Ferguson (9833181)

Dr. Kara Joyner

Choice of Sample

I selected only respondents between the ages of 18 and 65, to control for the effects of both retirement among older respondents and after-school jobs among younger respondents. Since only one respondent in the sample was 18 and none were younger, this amounted in practice to including only those under 65 in the regression analysis.

The Dependent Variable

I have selected Respondent's Income as the dependent variable to be examined. I am interested in exploring the relationship between income and various other social factors. In addition, Respondent's Income is a variable whose distribution is suitable for the purposes of OLS regression analysis.

The Independent Variables

The independent variables I examined were age of respondent; highest degree completed (replacing highest year of school completed); sex of respondent; average number of hours spent watching TV each day; number of sex partners respondent had in the previous year; frequency of sex during the previous year; whether the respondent was married or not; and the number of children reported by the respondent. The relationships between some of these variables and income may not be readily apparent. Age tends to correspond with an individual's prime working years, and should therefore produce a bell-shaped curve in the distribution of income across age groups. Education is believed to be positively correlated with income. I assumed that those watching more hours of television would spend less time, for example, improving upon their human capital, and that their income would be reduced as a result. If a respondent had sex often and with many different partners, I assumed that this would affect the amount of time and energy left to earn an income (unless having sex with different partners was done to earn income!). I hoped to establish that people choose to marry after they assess their income to be high enough to support a family; the same rationale applies to the number of children reported.

Coding of Independent Variables

Several of the independent variables I selected needed to be collapsed into broader categories. The variable age was collapsed into ten five-year aggregates, from 18 to 64; the variable hours spent watching television had the top categories collapsed into a "7 hours or more" grouping; the variable number of children had 5, 6, 7, or 8 or more children collapsed into one category; and the variable number of sex partners had the upper categories collapsed into a single category for 4 or more partners. I replaced years of schooling with the variable highest degree completed, as this variable corresponded to the groupings that the former would have needed to be collapsed into. The variables sex of respondent, frequency of sex and married did not require re-coding, as the number of cases in each category was sufficiently large.
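The collapsing described above can be sketched in Python as follows. This is only an illustration, not the syntax actually used for the assignment: the function names are hypothetical, the respondent is invented, and the age bands (15-19, 20-24, ..., 60-64) are an assumption, chosen because they give exactly ten five-year groups covering ages 18 to 64. The text does not state the actual cut points.

```python
def top_code(value, cap):
    """Collapse every value at or above `cap` into the single category `cap`."""
    return min(value, cap)


def recode_age(age):
    """Map an age in years to a five-year band numbered 1..10.

    ASSUMPTION: bands are 15-19, 20-24, ..., 60-64; the source does not
    state its cut points.
    """
    if not 18 <= age <= 64:
        raise ValueError("age outside the 18-64 sample range")
    return (age - 15) // 5 + 1


# Caps taken from the text: TV hours at 7, children at 5, sex partners at 4.
CAPS = {"tv_hours": 7, "children": 5, "sex_partners": 4}

# Hypothetical respondent, for illustration only.
respondent = {"age": 37, "tv_hours": 9, "children": 6, "sex_partners": 1}

recoded = {name: top_code(respondent[name], cap) for name, cap in CAPS.items()}
recoded["age_band"] = recode_age(respondent["age"])
print(recoded)
```

Top-coding with a simple cap keeps the lower categories untouched and only merges the sparse upper tail, which matches the minimum and maximum values reported for NEWTV, NEWKIDS and NEWPART.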

Mean Values

The mean values for all variables I examined were as follows:

	N	Minimum	Maximum	Mean	Std. Deviation
Respondent's Income	994	1	22	12.80	5.62
Respondent's Sex	1500	1	2	1.57	.49
RS Highest Degree	1496	0	4	1.41	1.18
NEWPART	1367	.00	4.00	.9993	.8396
NEWTV	1489	.00	7.00	2.7676	1.7242
NEWKIDS	1495	.00	5.00	1.7900	1.5126
NEWAGE	1210	1.00	10.00	4.9479	2.3702
Married ?	1499	1.00	2.00	1.4696	.4992
Frequency of Sex During Last Year	1330	0	6	2.88	1.98
Valid N (listwise)	849

The mean values for the best fit, that between income and highest degree completed (which correlated with a Pearson's r of .353), were:

Respondent's Income

RS Highest Degree	Mean	N	Std. Deviation
Less than HS	8.97	106	5.64
High school	12.11	528	5.44
Junior college	12.81	73	5.14
Bachelor	14.75	191	4.85
Graduate	16.92	95	4.43
Total	12.80	993	5.62
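As a quick consistency check on the table above, the overall mean can be rebuilt from the group means and group sizes. The sketch below uses only the figures printed in the table; because the group means are rounded to two decimals, the rebuilt figure should be expected to agree with the reported total of 12.80 only approximately.

```python
# (group mean, group N) copied from the table above.
groups = {
    "Less than HS": (8.97, 106),
    "High school": (12.11, 528),
    "Junior college": (12.81, 73),
    "Bachelor": (14.75, 191),
    "Graduate": (16.92, 95),
}

# Total cases and the N-weighted average of the group means.
total_n = sum(n for _, n in groups.values())
weighted_mean = sum(mean * n for mean, n in groups.values()) / total_n

print("Total N:", total_n)
print("Weighted mean income:", round(weighted_mean, 2))
```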

Effects of Individual Variables

Variable	Effect on Income	Significance
Age	.596	0
Degree	1.178	0
Sex	-2.889	0
Hours of TV	-.656	0
# of Sex Partners	.232	.287
Frequency of Sex	.130	.242
Married?	-.904	.018
# of Children	-.128	.373

Here we can immediately see that the variables number of children, number of sex partners and frequency of sex are not statistically significant predictors of income, as their p-values (.373, .287 and .242) are well above the .10 cutoff that corresponds to even a 90% level of confidence.

Of the remaining variables, the strongest effects appear to be due to the sex of the respondent, whether or not the respondent was married, the highest degree they had received, their age, and how much TV they watched. Therefore, these five variables will be combined in my final model.
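The screening step can be written out as a short Python sketch. The effects and p-values are copied from the table of individual effects; the variable names `effects`, `ALPHA`, `kept` and `dropped` are my own, and a significance printed as "0" is entered as 0.0.

```python
# (effect on income, p-value) copied from the table of individual effects.
effects = {
    "Age": (0.596, 0.0),
    "Degree": (1.178, 0.0),
    "Sex": (-2.889, 0.0),
    "Hours of TV": (-0.656, 0.0),
    "# of Sex Partners": (0.232, 0.287),
    "Frequency of Sex": (0.130, 0.242),
    "Married?": (-0.904, 0.018),
    "# of Children": (-0.128, 0.373),
}

ALPHA = 0.10  # 90% confidence level, the most lenient cutoff mentioned in the text

# Keep a variable only if its p-value falls below the cutoff.
kept = [name for name, (_, p) in effects.items() if p < ALPHA]
dropped = [name for name, (_, p) in effects.items() if p >= ALPHA]

print("kept:", kept)
print("dropped:", dropped)
```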

Final Model

Descriptives

	N	Minimum	Maximum	Mean	Std. Deviation
Respondent's Income	994	1	22	12.80	5.62
Respondent's Sex	1500	1	2	1.57	.49
RS Highest Degree	1496	0	4	1.41	1.18
NEWTV	1489	.00	7.00	2.7676	1.7242
NEWAGE	1210	1.00	10.00	4.9479	2.3702
Married ?	1499	1.00	2.00	1.4696	.4992
Valid N (listwise)	940

Regression

Model Summary

Model	R	R Square	Adjusted R Square	Std. Error of the Estimate
1	.521	.271	.267	4.73

a Predictors: (Constant), Respondent's Sex, NEWAGE, RS Highest Degree, Married ?, NEWTV

ANOVA

Model		Sum of Squares	df	Mean Square	F	Sig.
1	Regression	7780.780	5	1556.156	69.483	.000
	Residual	20917.955	934	22.396
	Total	28698.735	939

a Predictors: (Constant), Respondent's Sex, NEWAGE, RS Highest Degree, Married ?, NEWTV

b Dependent Variable: Respondent's Income
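The figures in the Model Summary follow directly from the ANOVA table, and the relationship can be sketched in Python. Only the sums of squares and degrees of freedom printed above are used; the variable names are my own. The results should match the reported R Square, Adjusted R Square, F and standard error of the estimate up to rounding.

```python
import math

# Values copied from the ANOVA table above.
ss_regression = 7780.780
ss_residual = 20917.955
ss_total = 28698.735
df_regression = 5
df_residual = 934

# R Square is the share of the total sum of squares captured by the regression.
r_square = ss_regression / ss_total

# Adjusted R Square penalises for the number of predictors.
df_total = df_regression + df_residual  # equals n - 1
adj_r_square = 1 - (1 - r_square) * df_total / df_residual

# F is the ratio of the two mean squares.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

# The standard error of the estimate is the root of the residual mean square.
std_error = math.sqrt(ms_residual)

print(f"R Square:          {r_square:.3f}")
print(f"Adjusted R Square: {adj_r_square:.3f}")
print(f"F:                 {f_stat:.3f}")
print(f"Std. Error:        {std_error:.2f}")
```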

Coefficients

		Unstandardized Coefficients		Standardized Coefficients	t	Sig.
Model		B	Std. Error	Beta
1	(Constant)	14.949	.882		16.946	.000
	NEWTV	-.680	.104	-.189	-6.524	.000
	NEWAGE	.542	.072	.215	7.525	.000
	Married ?	-.725	.319	-.065	-2.270	.023
	RS Highest Degree	1.263	.136	.270	9.312	.000
	Respondent's Sex	-2.783	.311	-.252	-8.950	.000

a Dependent Variable: Respondent's Income
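Each t value in the Coefficients table is the unstandardized coefficient divided by its standard error. The sketch below recomputes the ratios from the printed figures; the dictionary names are my own. Small differences from the reported t values are to be expected, since B and the standard errors are printed to only three decimals.

```python
# (B, Std. Error, reported t) copied from the Coefficients table above.
coefficients = {
    "(Constant)": (14.949, 0.882, 16.946),
    "NEWTV": (-0.680, 0.104, -6.524),
    "NEWAGE": (0.542, 0.072, 7.525),
    "Married ?": (-0.725, 0.319, -2.270),
    "RS Highest Degree": (1.263, 0.136, 9.312),
    "Respondent's Sex": (-2.783, 0.311, -8.950),
}

# t = B / Std. Error for every row of the table.
recomputed_t = {name: b / se for name, (b, se, _) in coefficients.items()}

for name, (_, _, t_reported) in coefficients.items():
    print(f"{name:20s} t = {recomputed_t[name]:8.3f}   (reported {t_reported:8.3f})")
```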

From the above, we can see that these five variables together explain 27.1% of the variance in respondents' incomes. Using the formula 1 - R Square, we can also see that 72.9% of the variance is left unexplained by this model and must be attributed to other variables not accounted for here, or to random variation. Those variables could be anything, from type of education to regional employment rates. For example, the combined R Squares of the other independent variables account for 20.5% of the remaining differences. Still, explaining 27% of the variance with these five variables is somewhat convincing, although R Square measures association and does not by itself establish causality. Where all five independent variables are zero, the predicted value of income is the constant, 14.949, which translates to an annual income of approximately $25,000.
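The fitted equation implied by the Coefficients table can be sketched in Python. The coefficients are copied from the table; the function name `predict_income` and the example respondent are hypothetical. Inputs are the numeric codes as they appear in the descriptives, and the result is on the coded income scale (1 to 22), not in dollars. Note that Respondent's Sex, Married ? and NEWAGE all have a minimum code of 1 in the descriptives, so the all-zero case lies outside the observed range of the data.

```python
# Unstandardized coefficients (B) copied from the Coefficients table.
CONSTANT = 14.949
B = {
    "newtv": -0.680,
    "newage": 0.542,
    "married": -0.725,
    "degree": 1.263,
    "sex": -2.783,
}


def predict_income(newtv, newage, married, degree, sex):
    """Predicted income code for one respondent, using the coded values as given."""
    return (
        CONSTANT
        + B["newtv"] * newtv
        + B["newage"] * newage
        + B["married"] * married
        + B["degree"] * degree
        + B["sex"] * sex
    )


# With every predictor at zero, the prediction is just the constant.
baseline = predict_income(0, 0, 0, 0, 0)

# Hypothetical respondent: 2 hours of TV, age band 5, married code 1,
# degree code 1, sex code 1.
example = predict_income(newtv=2, newage=5, married=1, degree=1, sex=1)

print("All predictors zero:", baseline)
print("Example respondent:", round(example, 3))
```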