The Effect of Reading Comprehension and Problem So (1)

“Tiyatro”, used by Keskin (2012) to evaluate the fluent reading skills of 4th-grade students, was used. To determine
the prosodic reading levels of the students, the 15-item prosodic reading scale developed by Keskin and Baştuğ (2011)
was used. The scale is one-dimensional, with a minimum score of 0, a maximum score of 60, and a Cronbach's alpha
coefficient of .98. Each student read the text aloud and the reading was video-recorded so that the student's
prosodic reading level could be determined. The video recordings were scored with the prosodic reading scale by three
experts who had completed PhDs on reading comprehension in elementary school teaching. The reliability of this
scoring was checked with the Weighted Kappa coefficient. Kappa values are interpreted as “Poor agreement=< 0.20;
Acceptable agreement=0.20-0.40; Medium agreement=0.40-.60; Good agreement=0.60-0.80; Absolute=0.80-1.00”
(Şencan, 2005, p. 485). Accordingly, the concordance among the scorers was .68, which indicates good agreement. The
prosodic reading score of each student was obtained by taking the mean of the scores given by the three scorers. To
determine the validity of the scale on the basis of these mean scores, confirmatory factor analysis (CFA) was
conducted, and the fit indices of the one-factor model (χ²/df=0.698, RMSEA=0.038, TLI=0.93, IFI=0.95, GFI=0.97)
were found to be sufficient.
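The agreement check described above can be illustrated with a minimal Python sketch. It computes Cohen's weighted kappa for each pair of raters and averages the pairwise values. The text does not say how the three raters were combined, which weighting scheme was used, or how band boundaries were handled, so the pairwise averaging, the linear weights, the boundary handling in `agreement_band`, and all function names are assumptions made for illustration only.

```python
from itertools import combinations


def weighted_kappa(a, b, n_categories, weights="linear"):
    """Cohen's weighted kappa for two raters who scored the same items.

    Scores must be integers in 0..n_categories-1.
    """
    if len(a) != len(b) or not a:
        raise ValueError("both raters must score the same non-empty set of items")
    n, k = len(a), n_categories
    # Observed joint proportions and each rater's marginal proportions.
    observed = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        observed[x][y] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    power = 1 if weights == "linear" else 2  # anything else means quadratic
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            w = (abs(i - j) / (k - 1)) ** power  # disagreement weight
            num += w * observed[i][j]            # observed disagreement
            den += w * row[i] * col[j]           # disagreement expected by chance
    if den == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1.0 - num / den


def mean_pairwise_kappa(raters, n_categories):
    """Average weighted kappa over all rater pairs (an assumed way to pool raters)."""
    pairs = list(combinations(raters, 2))
    return sum(weighted_kappa(a, b, n_categories) for a, b in pairs) / len(pairs)


def agreement_band(kappa):
    """Map a kappa value to the bands quoted from Şencan (2005)."""
    if kappa < 0.20:
        return "poor"
    if kappa < 0.40:
        return "acceptable"
    if kappa < 0.60:
        return "medium"
    if kappa < 0.80:
        return "good"
    return "absolute"


# Hypothetical ratings of six recordings on a 0-4 item scale (not study data).
rater_1 = [4, 3, 2, 0, 1, 3]
rater_2 = [4, 2, 2, 1, 1, 3]
rater_3 = [3, 3, 2, 0, 2, 4]
kappa = mean_pairwise_kappa([rater_1, rater_2, rater_3], n_categories=5)
print(round(kappa, 2), agreement_band(kappa))
```

With the value reported in the text, `agreement_band(0.68)` returns `"good"`.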
Evaluating comprehension skill: Comprehension skill was evaluated in two dimensions: literal and inferential
comprehension. To evaluate literal comprehension skill (Wh-questions), a 336-word text called “Kasabanın Kahramanı”,
developed by Başaran (2013), was used together with 5 short-answer questions. The questions were prepared with
reference to this text, directly assessed recall of information, and had their validity and reliability checked by
expert opinion. Because the pilot study showed that 8 minutes was enough to read the text and 7 minutes was enough to
answer the questions about it, the implementation was structured around these durations. During implementation,
students were given two sheets of paper: one containing the reading text and the other containing the 5 simple
recall-level comprehension questions about the text. When the reading time (8 min.) was over, the first sheet was
collected, and when the answering time (7 min.) was over, the second sheet was collected. Students' answers were
scored as 2, 1, or 0, from precise to inaccurate answers. When the data were first entered for statistical analysis,
the replies to each question were coded and no scoring was done at this stage. Later, the codes were analyzed by
experts who had completed PhDs on reading comprehension in elementary school teaching; these domain experts scored
each code by reading the text. The entered codes were then converted into scores on the basis of the expert
opinions, and so the literal comprehension score given by each expert to each student was

Journal of Education and Training Studies Vol. 5, No. 6; June 2017 
obtained. To determine the reliability of this scoring, the agreement among the scores given to students by the three
experts was analyzed with the Weighted Kappa test, and the value obtained (r=.81) indicated a high level of agreement
among the scorers. The literal comprehension score of each student was then obtained by averaging the scores given by
the experts. To determine the validity of the scale on the basis of these mean scores, confirmatory factor analysis
(CFA) was conducted, and the fit indices of the one-factor model (χ²/df=1.638, RMSEA=0.040, TLI=0.91, IFI=0.95,
GFI=0.99) were found to be sufficient.
To evaluate inferential comprehension skill, a 226-word text called “Mantarlar”, developed by Başaran (2013), was
used together with a 5-question scale that was prepared with reference to the text and whose reliability and validity
were established through expert opinion. The scale consisted of implicit questions such as finding the main idea,
finding a title, drawing a lesson, developing empathy, and forming cause-and-effect relations. Because the pilot
study showed that 15 minutes was enough to answer the test, the implementation was structured around this duration.
Students' answers were scored as 3, 2, 1, or 0, from precise to inaccurate answers. When the data were first entered
for statistical analysis, the replies to each question were coded and no scoring was done at this stage. Later, the
codes were analyzed by experts who had completed PhDs on reading comprehension in elementary school teaching; these
domain experts scored each code by reading the text. The entered codes were then converted into scores on the basis
of the expert opinions, and so the inferential comprehension score given by each expert to each student was obtained.
To determine the reliability of this scoring, the agreement among the scores given to students by the three experts
was analyzed with the Weighted Kappa test, and the value obtained (r=.64) showed good agreement among the scorers.
The inferential comprehension score of each student was then obtained by averaging the scores given by the experts.
To determine the validity of the scale on the basis of these mean scores, confirmatory factor analysis (CFA) was
conducted, and the fit indices of the one-factor model (χ²/df=1.467, RMSEA=0.052, TLI=0.95, IFI=0.96, GFI=0.98)
were found to be sufficient.
Problem solving scale: First, a problem-solving achievement test was developed to classify the errors made by the
students in the research. The test is composed of 10 word problems used in the studies by Ulu (2011), Altun (2007),
Yazgan and Bintaş (2005), and Griffin and Jitendra (2008). While developing the test, the opinions of three experts
who had completed PhDs in mathematics education in elementary teaching were sought. The experts decided that the
test should consist of questions suited to the problem-solving strategies suggested by MEB (2005). Table 2 shows the
strategies that could be used in solving the questions in this test.
Table 2. Strategies that could be used in solving the questions in the problem-solving test
[Table body missing; only the fragment “Guess and” survives.]

The validity and reliability study of the scale was carried out with 124 fourth-grade students at the school whose
score was closest to the Kütahya average on the 2014/2015 YEP (Placement Scores). First, the item difficulty and item
discrimination (distinctiveness) indices of each question were calculated, and then the reliability coefficient
(KR-20) of the scale was calculated. According to Tekin (1997), the item difficulty index ranges between 0 and 1, and
items with difficulty indices between 0.30 and 0.70 are of moderate difficulty. The item difficulty indices of the
items in the scale vary between 0.32 and 0.48, which indicates that all of the problems in the test are of moderate
difficulty. The discrimination index ranges between -1 and +1, and a value of 0.40 or higher shows that an item
discriminates well (Tekin, 1997). The discrimination indices of the items in the test vary between 0.43 and 0.64,
which indicates that all of the items are discriminating. The KR-20 value for the internal consistency of the scale
was calculated as 0.84. A KR-20 value of 0.70 or higher shows that the test has a high level of internal consistency
and, therefore, reliability (Büyüköztürk, 2006).
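The three item statistics named above can be sketched in Python for a 0-1 scored test. The text does not state how the item discrimination (distinctiveness) index was computed; the upper and lower 27% group method used here is a common choice and is an assumption, as are the use of the population variance in KR-20, the function names, and the toy response matrix.

```python
def item_difficulty(responses):
    """Proportion of students answering each item correctly (0 to 1)."""
    n = len(responses)
    k = len(responses[0])
    return [sum(r[j] for r in responses) / n for j in range(k)]


def item_discrimination(responses, fraction=0.27):
    """Upper-lower group discrimination index per item (-1 to +1).

    Assumed method: compare the top and bottom `fraction` of students
    ranked by total score.
    """
    ranked = sorted(responses, key=sum, reverse=True)
    g = max(1, round(len(ranked) * fraction))  # students per group
    upper, lower = ranked[:g], ranked[-g:]
    k = len(responses[0])
    return [
        (sum(r[j] for r in upper) - sum(r[j] for r in lower)) / g
        for j in range(k)
    ]


def kr20(responses):
    """Kuder-Richardson 20 reliability for dichotomous (0-1) items."""
    n = len(responses)
    k = len(responses[0])
    p = item_difficulty(responses)
    totals = [sum(r) for r in responses]
    mean = sum(totals) / n
    variance = sum((t - mean) ** 2 for t in totals) / n  # population variance
    if variance == 0:
        raise ValueError("KR-20 is undefined when all total scores are equal")
    return k / (k - 1) * (1 - sum(pj * (1 - pj) for pj in p) / variance)


# Toy data: 4 students x 3 items, 1 = correct, 0 = incorrect (not study data).
toy = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(item_difficulty(toy))
print(item_discrimination(toy))
print(kr20(toy))
```

Under the criteria cited in the text, an item would count as moderately difficult when its difficulty index lies between 0.30 and 0.70, and as discriminating when its index is 0.40 or higher.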
According to Şekercioğlu, Bayat, and Bakır (2014), factor analysis of scales scored as 0-1 should be conducted on the
tetrachoric correlation matrix. Because the problem solving scale is scored as 0-1, its construct validity (factor
analysis) was examined on the tetrachoric correlation matrix. According to the analysis result, the KMO value of .898
shows that the sample is adequate for factor analysis, and the Bartlett test results (χ²(45)=881.338; p<.01) show
that the correlations among the items are suitable for factor analysis (Büyüköztürk, 2006). The factor loadings of
the scale items ranged from .496 to .898; since the loadings were sufficient, it was decided to keep all the items in
the scale (Büyüköztürk, 2006). With its one-dimensional structure, the scale explains 66.32% of the variance in
problem solving. Confirmatory factor analysis (CFA) was also performed to determine the validity of the scale based
on the average scores, and the fit indices of the one-factor model (χ²/df=1.144, RMSEA=0.023, TLI=0.99, IFI=0.99,
GFI=0.97) were found to be sufficient.
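As a rough illustration of the two factorability checks reported above, the following Python sketch computes Bartlett's test statistic and the overall KMO measure from a correlation matrix. It uses the standard textbook formulas, not the authors' software or settings, and the function names and the small example matrix are hypothetical. For a 10-item test the degrees of freedom are 10 × 9 / 2 = 45, which matches the χ²(45) reported in the text.

```python
import math


def invert_with_det(matrix):
    """Gauss-Jordan inverse and determinant of a small square matrix."""
    n = len(matrix)
    # Augment the matrix with the identity matrix.
    a = [
        [float(v) for v in row] + [float(i == j) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    det = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) < 1e-12:
            raise ValueError("matrix is singular")
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det  # a row swap flips the sign of the determinant
        piv = a[c][c]
        det *= piv
        a[c] = [v / piv for v in a[c]]
        for r in range(n):
            if r != c:
                f = a[r][c]
                a[r] = [v - f * w for v, w in zip(a[r], a[c])]
    return [row[n:] for row in a], det


def bartlett_sphericity(corr, n_obs):
    """Chi-square statistic and degrees of freedom for Bartlett's test."""
    p = len(corr)
    _, det = invert_with_det(corr)
    chi2 = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(det)
    df = p * (p - 1) // 2
    return chi2, df


def kmo(corr):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    p = len(corr)
    inv, _ = invert_with_det(corr)
    r2 = a2 = 0.0
    for i in range(p):
        for j in range(p):
            if i != j:
                # Partial correlation from the inverse correlation matrix.
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                r2 += corr[i][j] ** 2
                a2 += partial ** 2
    return r2 / (r2 + a2)


# Hypothetical 3-item correlation matrix and sample size (not study data).
example = [
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
]
chi2, df = bartlett_sphericity(example, n_obs=124)
print(round(chi2, 2), df, round(kmo(example), 3))
```

The sketch accepts any correlation matrix; estimating the tetrachoric correlations themselves is a separate step that is not shown here.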

