Using Complexity to Estimate Effort

This note documents a modification to the PROBE estimation methodology from the PSP. It grew out of my experience with the PSP for Engineers course. During the course I was disappointed with the accuracy of my time estimates despite the accuracy of my size estimates. The graph shows the estimated duration and the actual duration for the ten assignments. The time estimates never became accurate. In fact, it seems that a fixed estimate of 400 minutes would have been as useful.
To better understand the reasons for the inaccuracy, I measured the size of my programs in LOCC for each of the ten assignments and correlated this measure with the actual time in minutes to complete each assignment. The result, shown below, explains why time estimates based on LOCC never got close: the R-squared value is only 21%. The resulting equation was time = 210 + 0.79 × LOCC, which at the average LOCC produces an average estimate of 401 minutes. This agrees with the observation that an estimate of 400 minutes would have been nearly as good as the estimate generated by my process.
Regression Statistics

Statistic            Value
R                    0.46
R Square             0.21
Adjusted R Square    0.11
Standard Error       170.04
Observations         10
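The fit above can be reproduced with a few lines of ordinary least squares. The following is a minimal sketch, assuming the LOCC values are the "Actual (N&C)" column of the data table later in this note and the times are the "Actual Min." column (rounded here to two decimals). The function name simple_regression is illustrative and not part of PSP or PROBE.

```python
from statistics import mean

# Assumption: LOCC is the "Actual (N&C)" column and time is the
# "Actual Min." column of the data table in this note.
locc = [178, 231, 446, 167, 310, 310, 106, 146, 192, 334]
minutes = [195.00, 266.47, 411.48, 210.83, 389.05,
           783.17, 268.60, 474.22, 459.60, 556.28]

def simple_regression(x, y):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1, r_squared)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx               # slope: minutes per line of code
    b0 = y_bar - b1 * x_bar      # intercept: fixed minutes per assignment
    return b0, b1, sxy ** 2 / (sxx * syy)

b0, b1, r2 = simple_regression(locc, minutes)
print(f"time = {b0:.0f} + {b1:.2f} * LOCC, R^2 = {r2:.2f}")
print(f"estimate at mean LOCC: {b0 + b1 * mean(locc):.0f} minutes")
```

Because a least-squares line always passes through the point of means, the estimate at the average LOCC is simply the average actual time, which is why a flat guess of about 400 minutes does nearly as well when the slope explains so little.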

When looking at my actual time by phase, I observed that much of the error variance was due to variations in design time. From this observation, I hypothesized that the difference in time was partly due to the complexity of the solution I designed, and that complexity was not adequately captured by a simple LOCC metric. To investigate this hypothesis I constructed a new database enriched with a complexity metric calculated using McCabe's Cyclomatic Complexity. The database is shown below. The first column is the average complexity per method across all the classes in each assignment. The second column is the total of all the average complexities for the classes. The idea was to isolate complexity from simple LOCC. As the results will show, however, there was no apparent independent contribution from LOCC. Also recorded in the data set were total N&C, total time, and design time.

Avg Complexity   Total Complexity   Actual (N&C)   Actual Min.    Design Min.
1.435592186      10.0491453         178            195            78
2.152777778      12.91666667        231            266.4666667    38.96666667
1.7              27.2               446            411.4833333    89.61666667
1.449905033      13.0491453         167            210.8333333    17.18333333
1.307081807      18.2991453         310            389.05         34.86666667
1.390537241      34.76343101        310            783.1666667    220.9833333
1.358773253      23.0991453         106            268.6          80.21666667
1.766559829      14.13247863        146            474.2166667    186.4166667
1.435005427      25.83009768        192            459.6          108.8166667
1.645807896      29.62454212        334            556.2833333    154.1666667
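The two complexity columns can be sketched as simple aggregates over per-class values. This is a minimal illustration, assuming each class already has an average cyclomatic complexity per method; the tabulated values are consistent with the first column being the second divided by the number of classes (for example, 27.2 / 1.7 = 16). The function name and the sample values are hypothetical and not taken from the data set.

```python
from statistics import mean

def complexity_summary(class_avg_complexities):
    """Return (average, total) of the per-class average cyclomatic complexities."""
    total = sum(class_avg_complexities)       # "Total Complexity" column
    average = mean(class_avg_complexities)    # "Avg Complexity" column
    return average, total

# Hypothetical per-class averages for a single assignment with seven classes.
per_class = [1.0, 1.5, 1.5, 2.0, 1.0, 3.0, 2.0]
avg, total = complexity_summary(per_class)
print(f"Avg Complexity = {avg:.2f}, Total Complexity = {total:.2f}")
```

The total grows with both the number of classes and how intricate each one is, which is what lets it act as a size-and-difficulty measure at once.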

New results

Regression Statistics

Statistic            Value
Multiple R           0.81
R Square             0.66
Adjusted R Square    0.56
Standard Error       119.64
Observations         10

Coefficients

Term                Coefficient   Standard Error   t Stat   P-value
Intercept           32.24         115.42           0.28     0.79
Total Complexity    17.18         5.68             3.03     0.019
Actual (N&C)        0.042         0.45             0.093    0.93
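The two-predictor fit can be checked without a statistics package by solving the normal equations directly. The sketch below is one way to do it, assuming the data are the Total Complexity, Actual (N&C), and Actual Min. columns of the data table (rounded slightly here); the helper names solve and fit are illustrative, and the planned program at the end is hypothetical.

```python
# Data copied (rounded) from the data table in this note.
total_complexity = [10.0491, 12.9167, 27.2, 13.0491, 18.2991,
                    34.7634, 23.0991, 14.1325, 25.8301, 29.6245]
locc = [178, 231, 446, 167, 310, 310, 106, 146, 192, 334]
minutes = [195.00, 266.47, 411.48, 210.83, 389.05,
           783.17, 268.60, 474.22, 459.60, 556.28]

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit(columns, y):
    """Least-squares fit of y on the predictor columns plus an intercept."""
    rows = [[1.0, *vals] for vals in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return beta, 1 - ss_res / ss_tot

(intercept, b_complexity, b_locc), r2 = fit([total_complexity, locc], minutes)
print(f"time = {intercept:.2f} + {b_complexity:.2f} * complexity "
      f"+ {b_locc:.3f} * LOCC, R^2 = {r2:.2f}")

# Hypothetical planned program: total complexity 20, 250 new and changed LOC.
estimate = intercept + b_complexity * 20 + b_locc * 250
print(f"estimated time: {estimate:.0f} minutes")
```

With the LOCC coefficient this small and its P-value at 0.93, almost all of the estimate comes from the complexity term, which matches the remark that LOCC makes no apparent independent contribution.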

Place some stuff in here.

Nr   Class              NCCS/Method   # of Methods   Avg Complexity   Category   SubCategory
1    AssignmentOneA     12            3              1                Control    Batch
2    AssignmentTwoA     8             2              1.5              Control    Batch
3    AssignmentThreeA   6             2              1.5              Control    Batch
4    AssignmentFourA    5.5           2              2                Control    Batch
5    AssignmentFiveA    17            2              1                Control    Batch
6    AssignmentSixA     12.5          2              3                Control    Batch
7    AssignmentSevenA   5.3           3              2                Control    Batch

Explain table here.

Size               VS     S      M       L       VL
Control-Batch      0      5.01   10.41   15.81   21.21
IO File            0      2.04   4.88    7.71    10.55
Model-Comp         1.1    2.17   3.22    4.27    5.32
Model-Logic        0      2.72   6.34    9.96    13.57
Model-Structure    1.2    1.69   2.17    2.66    3.14

Complexity         VL     Low    M       High    VH
Control-Batch      1      1.10   1.97    2.83    3.69
IO File            1      1.16   1.92    2.67    3.43
Model-Comp         1      1      1       1       1
Model-Logic        1      1      1.85    2.94    4.03
Model-Structure    1      1      1.08    1.37    1.66