2.4 Data Analysis
Because the dependent variable of the study consisted of two groups (high and low problem solving success), it is
categorical, and logistic regression analysis was therefore used for the first sub-problem. The main objective of
logistic regression analysis is to form a regression equation that can be used to estimate which group an individual
belongs to (Çokluk, Şekercioğlu and Büyüköztürk, 2010). In other words, it classifies individuals into groups. In this
study, binary logistic regression analysis was used to determine the set of independent variables that best explains
membership of the two categories of the dependent variable. Since the research was exploratory, the forward likelihood
ratio method was used.
In the second sub-problem, the aim was to determine the relative importance of the strategies in classifying successful
and unsuccessful students; because the dependent variable is again categorical, discriminant analysis was used. A
discriminant function is used to classify individuals or units, to test theories about whether individuals or units can
be classified as predicted, to investigate the differences between groups, to assess the relative importance of the
independent variables in classification on the dependent variable, and to eliminate variables of little importance in
classification (Çokluk, 2012).
3. Findings
In this research, students with high and low problem solving success were compared in terms of reading comprehension
(speed, reading accuracy percentage, prosody, literal comprehension and inferential comprehension) and strategies they
used while solving a problem.
First, the reading comprehension skills that predict high and low problem solving success were analysed. Since logistic
regression analysis was conducted for the first problem, the predictor variables were first checked for
multicollinearity. Multicollinearity among the predictor variables was examined in terms of eigenvalues, condition
indices and variance proportions. The findings are presented in Table 3.
Table 3. Analysis of multicollinearity among the predictor variables in terms of eigenvalues, condition indices and
variance proportions
| Dimension | Eigenvalue | Condition Index | Speed | Reading Accuracy Percentage (Wrr) | Prosody | Literal Comprehension | Inferential Comprehension |
|---|---|---|---|---|---|---|---|
| 1 | 5.726 | 1.00 | .00 | .00 | .00 | .00 | .00 |
| 2 | .109 | 7.26 | .00 | .00 | .53 | .23 | .07 |
| 3 | .073 | 8.86 | .02 | .00 | .31 | .58 | .07 |
| 4 | .065 | 9.40 | .01 | .00 | .07 | .08 | .81 |
| 5 | .027 | 14.53 | .72 | .00 | .02 | .00 | .04 |
| 6 | .009 | 21.17 | .25 | .92 | .16 | .01 | .00 |
Note. The five rightmost columns report variance proportions.
According to Table 3, the eigenvalues of the predictor variables are similar to one another and none of the condition
indices is markedly larger than the others. In terms of variance proportions, the highest variance proportion of each
predictor loads on a different eigenvalue. This indicates that each predictor variable explains a different dimension of
the variable. Table 4 shows the findings on multicollinearity in terms of standard error, tolerance and VIF values.
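As an illustration of how the condition indices in Table 3 relate to the eigenvalues, the sketch below assumes the usual definition, in which each index is the square root of the largest eigenvalue divided by the eigenvalue of that dimension. The function name is hypothetical, and because the eigenvalues are taken as printed (rounded to three decimals), the recomputed indices only approximate the reported ones, most visibly for the smallest eigenvalue.

```python
import math

# Eigenvalues as printed in Table 3 (rounded to three decimals).
eigenvalues = [5.726, 0.109, 0.073, 0.065, 0.027, 0.009]


def condition_indices(values):
    """Return sqrt(largest eigenvalue / eigenvalue) for every dimension."""
    largest = max(values)
    return [math.sqrt(largest / value) for value in values]


indices = condition_indices(eigenvalues)
for dimension, (eigenvalue, index) in enumerate(zip(eigenvalues, indices), start=1):
    print(f"dimension {dimension}: eigenvalue={eigenvalue:.3f}, condition index={index:.2f}")
```

The first dimension always has an index of 1, and the indices grow as the eigenvalues shrink, which is why the last dimension carries the largest index.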
Journal of Education and Training Studies Vol. 5, No. 6; June 2017
Table 4. Analysis of multicollinearity among the predictor variables in terms of standard error, tolerance and VIF
values
According to Table 4, the tolerance values of all variables are above .1. According to Mertler and Vannatta (2005),
tolerance values above .1 indicate that there is no multicollinearity problem. In this respect, no multicollinearity was
found among the variables in terms of tolerance values. The VIF value of each variable is below 10, and the average VIF
value is 1.477. According to Çokluk (2010), VIF values should be below 10 for multicollinearity not to be a problem. In
this respect, no multicollinearity was found in terms of VIF values either. The correlations among the variables are of
medium size, with no high correlations, which also indicates the absence of multicollinearity. Since no
multicollinearity was found among the variables, the -2LL value of the null model (the model with no predictor
variables) and the model’s iteration history were checked first; the findings can be seen in Table 5.
Table 5. Initial Model Iteration History
| Step | Iteration | -2Log Likelihood (-2LL) | Coefficients: Constant |
|---|---|---|---|
| 0 | 1 | 385.194 | -.151 |
| 0 | 2 | 385.194 | -.151 |
| 0 | 3 | 385.194 | -.151 |
According to Table 5, -2LL is 385.194 before the predictor variables are added to the model. Çokluk (2010) states that a
-2LL value of 0 indicates perfect fit and that any drop in this value improves the model. This value is expected to fall
at later stages as predictor variables enter the model. After this stage, the initial classification obtained from the
logistic regression was checked; the findings can be seen in Table 6.
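For a model that contains only a constant, both the constant and the -2LL value follow directly from the group sizes. The sketch below shows this arithmetic for the 150 low success and 129 high success students; it assumes that high success is the category coded 1, which is not stated explicitly in the text, and the variable names are illustrative.

```python
import math

n_low, n_high = 150, 129          # group sizes reported in the study
n_total = n_low + n_high

# Constant of the intercept-only model: log odds of the category coded 1
# (assumption: high success is coded 1).
constant = math.log(n_high / n_low)

# Log likelihood of the intercept-only model, then -2LL.
p_high = n_high / n_total
log_likelihood = n_high * math.log(p_high) + n_low * math.log(1 - p_high)
neg2ll = -2 * log_likelihood

print(f"constant = {constant:.3f}")
print(f"-2LL     = {neg2ll:.3f}")
```

Under that coding assumption, the two quantities agree with the constant and -2LL reported for the initial model in Table 5.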
Table 6. First classification case obtained after logistic regression
| Real/observed case | Estimated: Low Success | Estimated: High Success | Accurate Classification Percentage |
|---|---|---|---|
| Low Success | 150 | 0 | 100.0 |
| High Success | 129 | 0 | .0 |
| Total Accurate Classification Percentage | | | 53.76 |
According to Table 6, all the students were classified in the low success category, so the accurate classification
percentage is 53.76%. This classification was produced for the initial model, which contains only the constant term, and
only the low success students were classified accurately. The rate appears high because the number of low success
students in the model was higher; if all the students had been classified in the high problem solving success category,
the rate would have been 46.24% (129/279). As predictor variables are added to the model, the accurate classification
percentage is expected to rise. At the next step, the standard error of the constant term in the initial model, the Wald
statistic that tests the significance of the variable, the degrees of freedom of the Wald statistic, its significance
level, and the exponentiated logistic regression coefficient Exp(ß), which represents the odds ratio, were checked.
Table 7 shows the findings.
Table 7. Variables in the initial model/equation
| Step 0 | ß | Standard Error | Wald | df | p | Exp(ß) |
|---|---|---|---|---|---|---|
| Constant | -.151 | .120 | 1.578 | 1 | .209 | .860 |
According to Table 7, the Wald value for the initial model is not significant (Wald=1.578, p>.05) and the exponentiated
logistic regression coefficient Exp(ß), which represents the odds ratio, is .860. When predictor variables are added to
the initial model, the changes in the Wald values and odds ratios will show the effect of the predictor variables. At
the next step, information is provided about the score statistics of the predictor variables not included in the initial
model, their degrees of freedom and the residual (error) chi-square value.
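The quantities reported for the constant in Table 7 are linked by simple formulas: the Wald statistic is the squared ratio of the coefficient to its standard error, its p value is the upper tail of a chi-square distribution with one degree of freedom, and Exp(ß) is the exponential of the coefficient. The sketch below recomputes them from the printed ß and standard error; since those inputs are rounded, small differences from the table in the last digits are to be expected.

```python
import math

beta, std_error = -0.151, 0.120   # constant term as printed in Table 7

wald = (beta / std_error) ** 2             # Wald chi-square with 1 df
p_value = math.erfc(math.sqrt(wald / 2))   # upper tail of chi-square, 1 df
exp_beta = math.exp(beta)                  # Exp(ß), the odds

print(f"Wald = {wald:.3f}, p = {p_value:.3f}, Exp(ß) = {exp_beta:.3f}")
```

The `math.erfc` expression is an exact identity for one degree of freedom only; other degrees of freedom need a different formula or a statistics library.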
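Table 4 (continued)
| Predictive Variables | ß | Tolerance | VIF | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|---|---|
| 1. Speed | .008 | .564 | 1.774 | - | .636** | .346** | .354** | .363** |
| 2. Wrr | .093 | .479 | 1.589 | .636** | - | .528** | .373** | .301** |
| 3. Prosody | .037 | .720 | 1.389 | .346** | .528** | - | .448** | .549** |
| 4. Literal Comprehension | .536 | .831 | 1.203 | .354** | .373** | .448** | - | .519** |
| 5. Inferential Comprehension | .892 | .784 | 1.276 | .363** | .301** | .549** | .519** | - |
Note. Columns 1 to 5 report the correlations with the correspondingly numbered variables.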
Table 8. Variables not included in the initial model/equation
| Step 0: Variables | Score | df | p |
|---|---|---|---|
| Speed | 23.327 | 1 | .000 |
| Wrr | 18.715 | 1 | .000 |
| Prosody | 9.372 | 1 | .002 |
| Inferential comprehension | 72.181 | 1 | .000 |
| Literal comprehension | 39.191 | 1 | .000 |
| Residual (error) chi-square statistic (χ²ß0) | 86.110 | 5 | .000 |
According to Table 8, the residual (error) chi-square value is significant (χ²ß0=86.110, p<.01). This shows that adding
one or more of the variables not included in the initial model will increase the predictive power of the model. The
score values are Rao’s efficient score statistics, and the fact that these values are significant means that the
variables will contribute to the model (Field, 2005). Since the score statistics of all the variables are significant,
all of them can contribute to the model. In the stepwise method, the inferential comprehension variable, which has the
highest score statistic (72.181), is the first variable to enter the model, followed by the literal comprehension
variable (39.191). After this stage, the omnibus test was conducted to see the difference between the chi-square values
of the initial and target models. The findings are shown in Table 9.
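Table 9. Omnibus test of model coefficients
| Step | | Chi-square | df | p |
|---|---|---|---|---|
| 1 | Step | 82.536 | 1 | .000 |
| 1 | Block | 82.536 | 1 | .000 |
| 1 | Model | 82.536 | 1 | .000 |
| 2 | Step | 17.430 | 1 | .000 |
| 2 | Block | 99.966 | 2 | .000 |
| 2 | Model | 99.966 | 2 | .000 |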
According to Table 9, two variables entered the initial model and improved it. The increase in the chi-square statistic
at each step confirms this finding. At the next step, the fit statistics of the target model were checked. Table 10
shows the findings.
Table 10. Summary of the target model
| Step | -2LL | Cox & Snell R² | Nagelkerke R² |
|---|---|---|---|
| 1 | 302.658 | .256 | .342 |
| 2 | 285.228 | .301 | .402 |
Recalling that the -2LL value of the initial model was 385.194, Table 10 shows that the -2LL value decreased by 82.536
(385.194-302.658) when the first variable entered the initial model. When the second variable entered the model, the
-2LL value decreased by a further 17.430 (302.658-285.228). The fact that both variables added to the initial model
caused significant drops in the -2LL value shows that these variables improve the fit of the model. According to Field
(2005), the Cox & Snell R² and Nagelkerke R² values both indicate the amount of variance in the dependent variable that
the model explains, and Nagelkerke R² values are higher than Cox & Snell R² values. According to the Cox & Snell R²
values, the inferential comprehension variable, which entered the model first, explained 26% of the total variance, and
when literal comprehension entered the analysis in the second step, the two variables together explained 30% of the
variance. According to the Nagelkerke R² values, the two variables together explained 40% of the variance (first
step=.34, second step=.40). In the following stage, the Hosmer and Lemeshow test was conducted to assess the fit of the
model as a whole. Table 11 shows the findings.
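The two pseudo R² values in Table 10 can be recovered from the -2LL values and the sample size. The sketch below assumes the standard definitions: Cox & Snell R² is 1 - exp(-(drop in -2LL)/n), and Nagelkerke R² rescales it by its maximum attainable value, 1 - exp(-(-2LL of the initial model)/n). The function names are illustrative.

```python
import math

n = 279                    # 150 low success + 129 high success students
neg2ll_null = 385.194      # initial model (Table 5)
neg2ll_steps = {1: 302.658, 2: 285.228}   # target model steps (Table 10)


def cox_snell_r2(neg2ll_model):
    """Cox & Snell R² from the drop in -2LL relative to the initial model."""
    return 1 - math.exp(-(neg2ll_null - neg2ll_model) / n)


def nagelkerke_r2(neg2ll_model):
    """Cox & Snell R² divided by its maximum attainable value."""
    max_r2 = 1 - math.exp(-neg2ll_null / n)
    return cox_snell_r2(neg2ll_model) / max_r2


for step, neg2ll in neg2ll_steps.items():
    drop = neg2ll_null - neg2ll
    print(f"step {step}: drop in -2LL = {drop:.3f}, "
          f"Cox & Snell = {cox_snell_r2(neg2ll):.3f}, "
          f"Nagelkerke = {nagelkerke_r2(neg2ll):.3f}")
```

Because the rescaling factor is below 1, the Nagelkerke value is always the larger of the two, which matches the pattern in the table.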
Table 11. Results of Hosmer and Lemeshow tests
| Step | Chi-square | df | p |
|---|---|---|---|
| 1 | 13.131 | 6 | .041 |
| 2 | 10.196 | 8 | .252 |
According to the Hosmer and Lemeshow test results in Table 11, the result is significant for the first step (p<.05) but
not for the second step (p>.05). According to Çokluk (), a significant Hosmer and Lemeshow test result shows that
model-data fit is not acceptable, and a non-significant result shows that model-data fit is acceptable. In this respect,
the model-data fit of the first model is not acceptable, but that of the second model is acceptable. In the next stage,
another indicator of model fit was checked: the classification table, which allows the real group of each subject to be
compared with the group in which the subject appears in the model. Table 12 shows the findings.
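The p values in Table 11 are upper-tail probabilities of the chi-square distribution. Since both tests have an even number of degrees of freedom (6 and 8), they can be checked with a closed-form expression that needs only the standard library. The helper below is a sketch with a hypothetical name; it is valid for even degrees of freedom only, and a statistics package would normally be used instead.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable (even df only)."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2
    # exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms


p_step1 = chi2_sf_even_df(13.131, 6)
p_step2 = chi2_sf_even_df(10.196, 8)
print(f"step 1: p = {p_step1:.3f}")
print(f"step 2: p = {p_step2:.3f}")
```

A p value below .05 for the first step and above .05 for the second is what drives the conclusion about model-data fit.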
Table 12. Classification table
| Step | Real/observed case | Expected: Low Success | Expected: High Success | Accurate Classifying Percentage |
|---|---|---|---|---|
| 1 | Low Success | 120 | 30 | 80.00 |
| 1 | High Success | 44 | 85 | 65.89 |
| 1 | Total Accurate Classifying Percentage | | | 73.47 |
| 2 | Low Success | 122 | 28 | 81.33 |
| 2 | High Success | 36 | 93 | 72.09 |
| 2 | Total Accurate Classifying Percentage | | | 77.06 |
Recalling the first classification results, the number of students with low problem solving success was 150 and the
number with high problem solving success was 129, and since the group with low problem solving success was taken as the
reference, the accurate classifying rate was 53.76% (150/279). When the inferential comprehension variable was added to
the model, 120 of the 150 low success students were classified accurately and 30 inaccurately, which made the accurate
classifying percentage 80%. Of the 129 students with high problem solving success, 85 were classified accurately and 44
inaccurately, which made the accurate classifying rate 65.89%. The total accurate classifying percentage for the first
step is 73.47%. In the second step, when the literal comprehension variable was added to the target model together with
the inferential comprehension variable, 122 of the 150 low success students were classified accurately and 28
inaccurately, which made the accurate classifying percentage 81.33%. Of the 129 students with high problem solving
success, 93 were classified accurately and 36 inaccurately, which made the accurate classifying rate 72.09%. The total
accurate classifying percentage for the second step is 77.06%. In the next stage, the coefficient estimates for the
variables in the target model were analysed; Table 13 shows the findings.
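The percentages in the classification table are plain ratios of counts. As a small worked example, the sketch below recomputes the second-step figures of Table 12 from the four cell counts; the dictionary layout and variable names are illustrative choices, not part of the original analysis.

```python
# Rows: observed group; values: (predicted low, predicted high), step 2 of Table 12.
table = {"low": (122, 28), "high": (36, 93)}

low_correct = table["low"][0]     # low success students predicted as low
high_correct = table["high"][1]   # high success students predicted as high
n_low = sum(table["low"])
n_high = sum(table["high"])

pct_low = 100 * low_correct / n_low
pct_high = 100 * high_correct / n_high
pct_total = 100 * (low_correct + high_correct) / (n_low + n_high)

print(f"low success:  {pct_low:.2f}%")
print(f"high success: {pct_high:.2f}%")
print(f"total:        {pct_total:.2f}%")
```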
Table 13. Coefficient Estimations for the Variables in the Target Model
| Step | Variable | ß | Standard error | Wald | df | p | Exp(ß) |
|---|---|---|---|---|---|---|---|
| 1 | Inferential Comprehension | .500 | .068 | 54.292 | 1 | .000 | 1.649 |
| 1 | Constant | -4.209 | .566 | 55.300 | 1 | .000 | .015 |
| 2 | Inferential Comprehension | .450 | .069 | 42.316 | 1 | .000 | 1.568 |
| 2 | Literal Comprehension | .310 | .078 | 15.890 | 1 | .000 | 1.363 |
| 2 | Constant | -5.717 | .741 | 59.549 | 1 | .000 | .003 |
According to Table 13, when the first predictor variable, inferential comprehension, entered the model, the Wald value,
which had not been significant in the initial model, became significant (Wald=54.292, p<.01), and when the second
variable, literal comprehension, entered the model, the Wald value remained significant (Wald=59.549, p<.01). These
findings show that both inferential comprehension and literal comprehension contribute to the model. According to Çokluk
(2010), the formula (1-Exp(ß))×100 is used to determine by what percentage a variable raises or lowers the odds. Because
Exp(ß) is greater than 1 for both predictors, the odds rise: a one-unit increase in inferential comprehension raises the
odds of high success by 56.8% [|1-1.568|×100], while a one-unit increase in literal comprehension raises them by 36.3%
[|1-1.363|×100]. High success is taken here as the modelled category because the constant of the initial model (-.151)
equals ln(129/150).
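To make the link between the coefficients in Table 13 and the percentages quoted in the text concrete, the sketch below converts each second-step coefficient into a percentage change in the odds of the modelled outcome category and also evaluates the fitted equation. The function names are illustrative, and the scores passed to `predicted_probability` are hypothetical, since the scale of the comprehension scores is not given in this section.

```python
import math

# Second-step estimates as printed in Table 13.
coefficients = {"inferential": 0.450, "literal": 0.310}
constant = -5.717


def percent_change_in_odds(beta):
    """Percentage change in the odds for a one-unit increase in the predictor."""
    return (math.exp(beta) - 1) * 100


def predicted_probability(inferential, literal):
    """Probability of the category coded 1 under the fitted logistic equation."""
    logit = (constant
             + coefficients["inferential"] * inferential
             + coefficients["literal"] * literal)
    return 1 / (1 + math.exp(-logit))


for name, beta in coefficients.items():
    print(f"{name}: Exp(ß) = {math.exp(beta):.3f}, "
          f"change in odds = {percent_change_in_odds(beta):.1f}%")

# Hypothetical scores, used only to show how the equation is evaluated.
print(f"probability for scores (8, 6): {predicted_probability(8, 6):.3f}")
```

A case is assigned to the category coded 1 when this probability exceeds the cut-off, which is commonly .50.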
Table 14. The model when predictive variables are omitted from the model
| Step | Variable | Model LL | Change in -2LL | df | p |
|---|---|---|---|---|---|
| 1 | Inferential Comprehension | -192.597 | 82.536 | 1 | .000 |
| 2 | Inferential Comprehension | -171.192 | 57.156 | 1 | .000 |
| 2 | Literal Comprehension | -151.329 | 17.430 | 1 | .000 |
According to Table 14, when the inferential comprehension variable enters the base model, which contains only the
constant term, the change in the -2LL value is 82.536, and when the second variable, literal comprehension, enters, the
change is 17.430. In both cases, the change in model fit is significant (p<.01). According to Field (2005), a
significant change in the -2LL value indicates that the variable contributes to the model, so omitting such variables is
not advisable. It was therefore decided to keep both the inferential comprehension and literal comprehension variables
in the model. Of the five variables offered to the model, inferential comprehension and literal comprehension were
retained, whereas speed, reading accuracy percentage and prosody did not enter the model.
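The "Change in -2LL" column of Table 14 can be reproduced from the "Model LL" column, assuming that Model LL is the log likelihood of the model with the listed variable removed. The change is then -2 times that log likelihood minus the -2LL of the model that contains the variable (Table 10). The function name is illustrative.

```python
neg2ll_step1 = 302.658   # -2LL of the step 1 model (Table 10)
neg2ll_step2 = 285.228   # -2LL of the step 2 model (Table 10)


def change_if_removed(model_ll_without_term, neg2ll_with_term):
    """Rise in -2LL when a term is dropped from the fitted model."""
    return -2 * model_ll_without_term - neg2ll_with_term


rows = [
    ("step 1, inferential comprehension", -192.597, neg2ll_step1),
    ("step 2, inferential comprehension", -171.192, neg2ll_step2),
    ("step 2, literal comprehension", -151.329, neg2ll_step2),
]
for label, model_ll, neg2ll in rows:
    print(f"{label}: change in -2LL = {change_if_removed(model_ll, neg2ll):.3f}")
```

A large change means the model fits noticeably worse without the variable, which is the basis for retaining it.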
This section addresses the second problem of the research: “What are the differences between problem solving strategies
used by students showing high and low problem solving success?” Discriminant analysis was used for this purpose. First,
the descriptive statistics of the strategies used by students showing high and low problem solving success were
analysed; Table 15 shows the findings.
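Table 15. Group statistics
| Strategies | Group | N | X | S |
|---|---|---|---|---|
| Writing mathematical sentence | Low success | 150 | 4.90 | 2.32 |
| Writing mathematical sentence | High success | 129 | 1.72 | 1.54 |
| Looking for a pattern | Low success | 150 | .84 | .74 |
| Looking for a pattern | High success | 129 | 1.45 | .59 |
| Systematic listing | Low success | 150 | .06 | .28 |
| Systematic listing | High success | 129 | .79 | .88 |
| Estimation and control | Low success | 150 | .37 | .72 |
| Estimation and control | High success | 129 | 1.87 | 1.31 |
| Backward Studying | Low success | 150 | 1.14 | .78 |
| Backward Studying | High success | 129 | 1.40 | .70 |
| Drawing figures and diagrams | Low success | 150 | .84 | 1.10 |
| Drawing figures and diagrams | High success | 129 | 1.80 | 1.19 |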
According to Table 15, the “writing mathematical sentence” strategy is used more in the group with low success (X=4.90)
than in the group with high success (X=1.72). The “looking for a pattern” strategy is used more in the group with high
success (X=1.45) than in the group with low success (X=.84). The same holds for the “systematic listing” strategy
(X=.79 versus X=.06), the “estimation and control” strategy (X=1.87 versus X=.37), the “backward studying” strategy
(X=1.40 versus X=1.14) and the “drawing figures and diagrams” strategy (X=1.80 versus X=.84). These findings show that,
except for the writing mathematical sentence strategy, all the strategies are used more in the group with high success.
In the next stage, the eigenvalue and canonical correlation of the discriminant function were analysed; Table 16 shows
the findings.
Table 16. Eigenvalues
| Function | Eigenvalue | Variance | Canonical correlation |
|---|---|---|---|
| 1 | 1.612 | 100.0 | .786 |
According to Table 16, the eigenvalue of the discriminant function is 1.612. According to Kalaycı (2005), this value has
no upper limit, but its lower limit is .40. Accordingly, the eigenvalue of the function can be said to be rather high.
Çokluk (2012) states that the canonical correlation gives information about the separation efficiency of the
discriminant function. In this study, the canonical correlation is .79, which shows that the function strongly separates
the strategy differences of the students with high and low success. In the next stage, Wilks’ lambda and the chi-square
value, which are also produced to evaluate the separation efficiency of the function, were analysed; Table 17 shows the
findings.
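With two groups there is a single discriminant function, and in that case the eigenvalue, the canonical correlation and Wilks’ lambda are tied together by standard identities: the canonical correlation is the square root of eigenvalue/(1 + eigenvalue), and Wilks’ lambda is 1/(1 + eigenvalue). The sketch below applies these identities to the eigenvalue in Table 16; the variable names are illustrative.

```python
import math

eigenvalue = 1.612   # discriminant function eigenvalue from Table 16

# Identities that hold when there is exactly one discriminant function.
canonical_correlation = math.sqrt(eigenvalue / (1 + eigenvalue))
wilks_lambda = 1 / (1 + eigenvalue)

print(f"canonical correlation = {canonical_correlation:.3f}")
print(f"Wilks' lambda         = {wilks_lambda:.3f}")
```

The results agree with the canonical correlation in Table 16 and the Wilks’ lambda in Table 17 to three decimals, and they show why a large eigenvalue goes together with a high canonical correlation and a low lambda.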
Table 17. Wilks’ Lambda statistics
| Function | Wilks’ Lambda | Chi-square | df | p |
|---|---|---|---|---|
| 1 | .383 | 263.038 | 6 | .000 |
According to Table 17, the Wilks’ Lambda value is .383. According to Çokluk (2012), the closer this value is to 1, the
less adequate the separation efficiency of the function, and the lower the value, the higher the separation efficiency.
The low Wilks’ Lambda value therefore shows that the separation efficiency of the model is high.
Another value for separation efficiency is chi-square value and this value is seen to be significant for our function [ X