Project 2

Regression Analysis

The manager of boiler drums wants to use regression analysis to predict the number of worker hours needed to erect the drums in future projects. Consequently, data for 36 randomly selected boilers were collected. In addition to worker hours (Y), the variables measured include boiler capacity, boiler design pressure, boiler type, and drum type.

 

Number of Worker-Hours Needed to Erect Boiler Drums

Worker_Hours  Boiler_Capacity  Design_Pressure  Boiler_Type  Drum_Type
6,928         610,000          1,500            2            1
3,211         610,000          1,500            2            2
4,065         90,000           1,140            1            1
1,515         150,000          250              1            2
3,748         88,200           399              1            1
5,651         441,000          410              1            1
2,000         150,000          500              1            2
4,206         441,000          410              1            2
1,200         30,000           325              1            2
1,965         65,000           750              1            2
2,566         150,000          500              1            2
6,454         627,000          1,525            2            1
2,048         30,000           325              1            1
2,635         90,000           1,140            1            2
4,268         150,000          500              1            1
2,974         120,000          375              1            2
3,163         88,200           399              1            1
4,006         441,000          410              1            2
3,775         441,000          410              1            2
2,680         125,000          750              1            1
4,526         150,000          500              1            1
3,120         441,000          410              1            2
7,606         610,000          1,500            2            1
2,972         88,200           399              1            1
1,206         30,000           325              1            2
6,387         441,000          410              1            1
14,791        1,089,490        2,170            2            1
3,590         65,000           750              1            1
3,698         610,000          1,500            2            2
4,023         150,000          325              1            1
6,565         441,000          410              1            1
10,825        1,073,877        2,170            2            1
2,735         150,000          325              1            2
3,137         120,000          375              1            1
6,500         441,000          410              1            1
3,728         627,000          1,525            2            2

 

 

 

 

 

Scatter plots of the dependent variable vs. each of the independent variables

[Figure: scatter plot of Worker Hours vs. Boiler Capacity]

The above graph shows that the independent variable Boiler Capacity explains about 68% of the variation in the dependent variable, Worker Hours, so it should be considered for the regression analysis.

[Figure: scatter plot of Worker Hours vs. Boiler Type]

The above graph shows that the independent variable Boiler Type explains about 33% of the variation in the dependent variable, Worker Hours.

[Figure: scatter plot of Worker Hours vs. Drum Type]

The above graph shows that the independent variable Drum Type explains about 25% of the variation in the dependent variable, Worker Hours.

[Figure: scatter plot of Worker Hours vs. Design Pressure]

The above graph shows that the independent variable Design Pressure explains about 43% of the variation in the dependent variable, Worker Hours.

 

 

 

The correlation matrix helps us see whether any of the independent variables are highly correlated with one another and are therefore redundant.

 

 

 

                  Worker Hours  Boiler Capacity  Design Pressure   Boiler Type  Drum Type
Worker Hours                 1
Boiler Capacity    0.827356289                1
Design Pressure     0.65893468      0.762166006                1
Boiler Type        0.574528783      0.796895361      0.902270223             1
Drum Type         -0.505794595     -0.110707049     -0.138483068  -0.074701788          1
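
The percentages quoted under the scatter plots are consistent with the first column of the correlation matrix: in a simple linear regression with a single predictor, the coefficient of determination R^2 equals the squared correlation between the predictor and the dependent variable. The short Python sketch below only redoes that arithmetic; the dictionary names are illustrative and not part of the original analysis.

```python
# Correlations of each independent variable with Worker Hours,
# copied from the first column of the correlation matrix.
corr_with_worker_hours = {
    "Boiler Capacity": 0.827356289,
    "Design Pressure": 0.65893468,
    "Boiler Type": 0.574528783,
    "Drum Type": -0.505794595,
}

# For a one-predictor linear regression, R^2 is the squared correlation.
r_squared = {name: r ** 2 for name, r in corr_with_worker_hours.items()}

for name, r2 in r_squared.items():
    print(f"{name}: R^2 = {r2:.3f}")
```

The squared values come out at roughly 68%, 43%, 33%, and 26%, which agrees with the figures read off the scatter plots (the 25% reported for Drum Type appears to be truncated rather than rounded).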

 

 

 

 

 

Multicollinearity exists when two or more of the independent variables are highly correlated with each other (redundancy).

 

 

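The correlation matrix above shows that Boiler Capacity is strongly correlated with Worker Hours (0.83), while Design Pressure (0.66), Boiler Type (0.57), and Drum Type (-0.51) are moderately correlated with it. Among the independent variables, Design Pressure and Boiler Type (0.90), Boiler Capacity and Boiler Type (0.80), and Boiler Capacity and Design Pressure (0.76) are highly correlated with each other, which points to possible multicollinearity.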
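
One simple way to screen for multicollinearity is to list the pairs of independent variables whose absolute correlation exceeds a cutoff. The sketch below does this in Python using the values from the correlation matrix. The cutoff of 0.7 is an assumed rule of thumb (the report does not state one), and `flag_collinear_pairs` is a hypothetical helper name.

```python
# Pairwise correlations among the independent variables,
# copied from the correlation matrix.
predictor_corr = {
    ("Boiler Capacity", "Design Pressure"): 0.762166006,
    ("Boiler Capacity", "Boiler Type"): 0.796895361,
    ("Boiler Capacity", "Drum Type"): -0.110707049,
    ("Design Pressure", "Boiler Type"): 0.902270223,
    ("Design Pressure", "Drum Type"): -0.138483068,
    ("Boiler Type", "Drum Type"): -0.074701788,
}

# Assumed rule-of-thumb cutoff; not taken from the report.
THRESHOLD = 0.7


def flag_collinear_pairs(corr, threshold=THRESHOLD):
    """Return predictor pairs with |r| above the threshold, strongest first."""
    flagged_pairs = [pair for pair, r in corr.items() if abs(r) > threshold]
    return sorted(flagged_pairs, key=lambda pair: -abs(corr[pair]))


flagged = flag_collinear_pairs(predictor_corr)
for first, second in flagged:
    print(f"{first} vs {second}: r = {predictor_corr[(first, second)]:.3f}")
```

With this cutoff, the three pairs involving Boiler Capacity, Design Pressure, and Boiler Type are flagged, while none of the Drum Type pairs is.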

 

 

 

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.950241938
R Square            0.90295974
Adjusted R Square   0.890438416
Standard Error      894.6031853
Observations        36

ANOVA
            df  SS           MS           F            Significance F
Regression  4   230854854.1  57713713.53  72.11375982  2.97665E-15
Residual    31  24809760.63  800314.8591
Total       35  255664614.8

                 Coefficients  Standard Error  t Stat        P-value      Lower 95%     Upper 95%     Lower 95.0%   Upper 95.0%
Intercept        7291.783466   781.4787781     9.330750457   1.63182E-10  5697.946101   8885.62083    5697.946101   8885.62083
Boiler_Capacity  0.008749011   0.000903468     9.683809218   6.8642E-11   0.006906375   0.010591647   0.006906375   0.010591647
Design_Pressure  1.926477177   0.648906909     2.968803613   0.005722954  0.603022072   3.249932281   0.603022072   3.249932281
Boiler_Type      -3444.254644  911.7282884     -3.777720498  0.000674819  -5303.737785  -1584.771503  -5303.737785  -1584.771503
Drum_Type        -2093.353564  305.6336847     -6.849223985  1.1242E-07   -2716.697921  -1470.009206  -2716.697921  -1470.009206
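
The regression statistics follow directly from the ANOVA sums of squares. As a check, the Python sketch below recomputes R Square, Adjusted R Square, the F statistic, and the Standard Error from the reported SS values using the standard formulas; the variable names are illustrative.

```python
import math

# Values copied from the ANOVA table.
ss_regression = 230854854.1
ss_residual = 24809760.63
ss_total = 255664614.8
n_obs = 36  # number of observations
k = 4       # number of independent variables

df_regression = k
df_residual = n_obs - k - 1  # 31, as in the ANOVA table

# R^2 = SSR / SST
r_squared = ss_regression / ss_total
# Adjusted R^2 penalizes for the number of predictors.
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_residual
# F = MSR / MSE
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual
# Standard error of the estimate = sqrt(MSE)
std_error = math.sqrt(ms_residual)

print(f"R Square          = {r_squared:.6f}")
print(f"Adjusted R Square = {adj_r_squared:.6f}")
print(f"F                 = {f_stat:.4f}")
print(f"Standard Error    = {std_error:.4f}")
```

The recomputed values agree with the summary output: the model explains about 90% of the variation in Worker Hours, and the very small Significance F indicates that the regression as a whole is statistically significant.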

 

Using all four independent variables (Boiler Capacity, Design Pressure, Boiler Type, and Drum Type), we get the following regression equation for the model:

 

 

Worker Hours = 7291.78 + 0.008749 × Boiler Capacity + 1.9265 × Design Pressure - 3444.25 × Boiler Type - 2093.35 × Drum Type
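
The fitted equation can be wrapped in a small helper to predict worker hours for a new boiler. The Python sketch below uses the unrounded coefficients from the summary output; the function name `predict_worker_hours` and the example inputs are illustrative, and Boiler Type and Drum Type are assumed to use the same 1/2 coding as in the data table.

```python
# Coefficients copied from the regression summary output.
COEFFICIENTS = {
    "intercept": 7291.783466,
    "boiler_capacity": 0.008749011,
    "design_pressure": 1.926477177,
    "boiler_type": -3444.254644,
    "drum_type": -2093.353564,
}


def predict_worker_hours(boiler_capacity, design_pressure, boiler_type, drum_type):
    """Evaluate the fitted regression equation for one boiler.

    boiler_type and drum_type are the coded values (1 or 2) used in the data.
    """
    c = COEFFICIENTS
    return (
        c["intercept"]
        + c["boiler_capacity"] * boiler_capacity
        + c["design_pressure"] * design_pressure
        + c["boiler_type"] * boiler_type
        + c["drum_type"] * drum_type
    )


# Example: capacity 150,000, design pressure 500, boiler type 1, drum type 1.
predicted = predict_worker_hours(150_000, 500, 1, 1)
print(f"Predicted worker hours: {predicted:.2f}")
```

For this example the equation gives roughly 4,030 worker hours, compared with the 4,268 and 4,526 hours observed for the two boilers in the data with these characteristics. Predictions for boilers far outside the range of the sample should be treated with caution.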