TRANS-SEASON RELIABILITY OF PHYSICAL FITNESS TESTING IN STUDENTS OF “TOURISM” SPECIALITY

The study purpose was to prepare a model of the trans-season reliability of physical fitness testing on the example of “Tourism” speciality students. Material and Methods. A total of 50 first-year university bachelor’s students studying “Tourism” as a business service took part in the physical fitness testing: 20 males (body mass 67.3±9.5 kg, M±SD; body length 174.6±5.6 cm) and 30 females (body mass 59.6±7.3 kg, body length 163.9±5.2 cm). Monthly testing was conducted seven times from September to March using the composite test KONTREKS-2. Trans-season mean score reliability was evaluated within the intraclass correlation model. Results. Approximately 86% of the students were at the average or higher levels and only about 5% were at the low level. Males showed significantly better physical fitness than females (by 16.6%, p < 0.002), with 96.3% similarity of trends in scores between males and females during the trans-season study. Variation within these samples was also great: it accounted for 93.6% of the total variation. A strong and highly significant correlation (r > 0.80, p < 0.001) was found between all seven monthly test-retest trans-season trials. Significant trans-season reliability at the excellent level was found for each of the two gender samples (ICC > 0.95, p < 0.001). Deviations from mean values for the seven monthly tests undertaken during the study period were not significant (chi-squared = 13.939, p = 0.834). Conclusions. A model of the trans-season reliability of physical fitness testing, created on the example of first-year bachelor’s students of the “Tourism” speciality, showed its effectiveness and could be recommended for the physical education of students in higher education.


Introduction
Recently, the health and physical readiness of children and youth have sharply deteriorated (Sahoo K., Sahoo B., Choudhury, Sofi, Kumar & Bhadoria, 2015). In particular, this is due to the crisis in the national physical education system, which does not meet modern requirements. Physical training sessions do not provide the volume of motor activity necessary for a young person, and they take insufficient account of students' individual interests and needs (Al-Khudairy, Loveman, Colquitt, Mead, Johnson & Fraser, 2017; Liposek, Planinsec, Leskosek & Pajtler, 2019).
Problems of the physical education of student youth have been studied by many scientists. It has been noted that a serious obstacle to the physical improvement of student youth is declining interest in the traditional forms of physical education. Therefore, the organization and content of physical education in higher education institutions need to be updated (Mcdavid, McDonough, Wong, Snyder, Ruiz & Blankenship, 2019).
The market of educational services is expanding, new specializations are emerging, and the number of students who differ from the majority by a small volume of motor activity is increasing. This has raised the question of finding an optimal system of educational activities that takes into account the features of different groups of students (Kharchenko, Khomenko, Krasilov & Rybalko, 2019, p. 194).
The use of sport and health tourism as a component of the physical education of student youth has been addressed in research works. However, not enough attention has been paid to forming a multi-year system of sport and health tourism that covers the whole period of study at higher education institutions. Only the integrated use of sport and health tourism as a component of physical education can help to strengthen health, optimize the volume of motor activity, form positive motivation for tourism activities after graduation, etc. (Kukhtiy & Labartkava, 2011). Skaliy A., Skaliy T., Muszkieta, Kalosza and Żukow (2009) showed that hiking tourism has a positive influence on the physical fitness of young people. They studied the physical condition of students in the course of increased physical activity in mountain hiking tourism. The dynamics of the students' physical state during multi-day hikes made it possible to accept the research hypothesis of a significant improvement in physical readiness.
For the comprehensive assessment of the functional capacity of the cardiovascular system and the physical qualities of those examined in tourism practice, the KONTREKS test complexes proved useful as scoring systems (Dushanin, Pirogova & Ivashchenko, 1985). KONTREKS-2 is a composite diagnostic system recommended for routine medical and pedagogical monitoring. It helps to determine not only the level but also the structure of physical fitness. It is simple and reliable, and it can be used for self-monitoring and mutual monitoring during independent physical exercise.
ТМФВ, 2020, Vol. 20, No. 2. ISSN 1993-7989 (print). ISSN 1993-7997 (online). Теорія та методика фізичного виховання (Theory and Methods of Physical Education), Vol. 20, No. 2.
A quantitative measure of test-retest reliability is the intraclass correlation coefficient (ICC), which is used effectively in different fields of science. Fife, Mendoza and Terry (2012, p. 862) noted that, although much research has been directed at assessing the correlation coefficient under range restriction, the assessment of reliability under range restriction has been largely ignored; they used item response theory to simulate dichotomous item-level data and assess the robustness of test-retest estimates under varying selection ratios. Almehrizi (2013, p. 438) presented a general form of the ICC and extended its use to estimating internal consistency reliability for nonlinear scale scores (used for relative decisions). He also examined this estimator of reliability using different score scales with real data sets of both dichotomously and polytomously scored items. Different score scales gave different estimates of reliability, and the effects of transformation functions on the reliability of different score scales were also explored.
Fitness testing is used frequently in many areas of physical activity, but the reliability of these measurements under real-world, practical conditions is unknown. Therefore, it is necessary to evaluate the reliability of specific fitness tests using the methods and time periods that occur in real-world sport and occupational management (Burnstein, Steele & Shrier, 2011, p. 505; Ishii, Shibata, Adachi, Nanoue & Oka, 2015). Test reliability over long periods is of special interest in physical education, because teaching is organised in trans-season terms and academic years.
Research hypothesis. The intraclass test score reliability model could be valid in the trans-season testing of the physical fitness of first-year bachelor's students of the "Tourism" speciality.
Purpose. The aim of this research was to create a model of the trans-season reliability of physical fitness testing on the example of first-year bachelor's students of the "Tourism" speciality.

Study participants
A total of 50 first-year university bachelor's students studying "Tourism" as a business service took part in the physical fitness testing: 20 males (body mass 67.3±9.5 kg, M±SD; body length 174.6±5.6 cm) and 30 females (body mass 59.6±7.3 kg, body length 163.9±5.2 cm). All the students were in good physical shape, and they attended the university physical fitness lessons according to the program of Lviv State University of Physical Culture (Academic program, 2016). This study was approved in advance by the Ethical Committee of Lviv State University of Physical Culture. All the participants voluntarily provided written informed consent before participating. The procedures followed were in accordance with the ethical standards of the Ethical Committee on human experimentation.

Procedure
Monthly testing was conducted seven times from September to March using the composite test KONTREKS-2.
The scoring system consists of eleven indicators. Five of them are biomedical: age, body weight, blood pressure, heart rate, and pulse recovery. Six are motor indicators: flexibility, speed, dynamic power, speed endurance, speed-power endurance, and overall endurance. Negative scores of indicators were replaced with zeros. Levels of physical fitness were evaluated from the sum score of all the indicators using the scale given in Table 1.
Table 1. Evaluation scale of the physical fitness (Viktorov, 1990). Columns: Physical fitness level; Definition; Score.
The testing was conducted in the afternoon during the first week of each month at the sport venue of Lviv State University of Physical Culture.
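The scoring rule described above (eleven indicator scores, negative values replaced with zeros, then summed) can be sketched as follows. This is only an illustration: the function name and the example values are hypothetical, and the mapping from the sum to a fitness level is left out because the score ranges of Table 1 are not reproduced in the text.

```python
def total_score(indicator_scores):
    """Sum eleven indicator scores, counting every negative score as zero."""
    if len(indicator_scores) != 11:
        raise ValueError("the scoring system described in the text has eleven indicators")
    return sum(max(0, score) for score in indicator_scores)


# Hypothetical scores for one student (not data from the study).
example_scores = [10, 8, -3, 12, 5, 7, -1, 9, 6, 4, 11]
example_total = total_score(example_scores)
print(example_total)
```

The two negative entries contribute nothing to the total, which is the only transformation the text specifies before the scale of Table 1 is applied.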

Statistical analysis
The distribution of scores in the samples was tested for normality with the Shapiro-Wilk method (Shapiro & Wilk, 1965). The dynamics of the test-retest results during the study period was evaluated using relative parameters ξ of the total score, defined from $x$, the total test score of a sample, and $\bar{x}$, the average total score during the study period (Zanevskyy & Zanevska, 2016). The difference in score dynamics between males and females was calculated with a formula (Eq. 3) that combines the relative parameters ξ of the total scores for males and for females over N = 7 test-retest trials. A quantitative measure of the difference between male and female physical fitness levels was determined using the coefficient of difference (Eq. 4) (Zanevskyy & Zanevska, 2019).
Two-way ANOVA was used to determine differences between the male and female samples. One-way ANOVA with repeated measures was used to evaluate differences between the means of the monthly tests, as well as the relative parts of variation between subjects and of the interaction between test-retest and interpersonal variation. This one-way ANOVA design was applied in three analyses: for the male sample, for the female sample, and for the united sample. Pearson correlation between test-retest scores was applied to substantiate the use of ANOVA with repeated measures (Zanevskyy & Zanevska, 2017). Trans-season correlation was studied using the Pearson paired linear coefficient (r), and the significance of this correlation was determined using Student's t statistic (Eq. 5).
Variation of scores in the samples was evaluated using the coefficient of variation (Suni, Oja, Laukkanen, Miilunpalo, Pasanen, Vuori et al., 1996), $V = \frac{SD}{M} \cdot 100\%$ (Eq. 6), where SD is the standard deviation and M is the arithmetic mean. Variation is small if V < 10%, moderate if V is 10-20%, and great if V > 20%.
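A minimal sketch of two of the quantities named in this section is given here. It assumes that the coefficient of variation is SD divided by the mean, in percent, and that the significance of Pearson's r is tested with the standard statistic t = r·sqrt(n − 2)/sqrt(1 − r²) with n − 2 degrees of freedom. The function names are illustrative, and the paired scores in the second example are hypothetical; only the body mass figures come from the text.

```python
import math


def coefficient_of_variation(sd, mean):
    """Coefficient of variation V in percent."""
    return 100.0 * sd / mean


def variation_level(v):
    """Classify V with the thresholds given in the text (10% and 20%)."""
    if v < 10:
        return "small"
    if v <= 20:
        return "moderate"
    return "great"


def pearson_r(x, y):
    """Pearson linear correlation coefficient of two equally long samples."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """Student's t for testing r = 0 (assumed standard form, n - 2 df)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Male body mass reported in the text: 67.3 +/- 9.5 kg (M +/- SD).
v_mass_males = coefficient_of_variation(9.5, 67.3)
print(round(v_mass_males, 1), variation_level(v_mass_males))

# Hypothetical scores of five subjects in two monthly tests.
test_a = [1, 2, 3, 4, 5]
test_b = [2, 4, 5, 4, 5]
r_ab = pearson_r(test_a, test_b)
t_ab = t_statistic(r_ab, len(test_a))
print(round(r_ab, 3), round(t_ab, 3))
```

On the reported body mass of the male sample this gives V of about 14%, which falls in the moderate band of the scale above.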
Trans-season mean score reliability was evaluated with the intraclass correlation coefficient using the formula by Shrout and Fleiss (1979) (Eq. 7), where k is the number of trials (seven months), $MS_B$ is the mean square of scores between persons (students' samples), and $MS_W$ is the mean square within persons. The following evaluation scale was used: ICC > 0.95 (excellent reliability), 0.91-0.95 (good), 0.81-0.90 (moderate), 0.71-0.80 (acceptable), 0.61-0.70 (questionable), and 0.60 or smaller (unacceptable). Calculations were done using the Statistica software package (Dell Inc., 2017).
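The sketch below shows how an intraclass correlation can be obtained from the between-person and within-person mean squares of a one-way layout. Which Shrout and Fleiss form the study used cannot be recovered from the text, so both one-way forms are shown as an assumption: the single-measure form, which contains k, and the average-measure form, which matches the "mean score" wording. The function names and the toy data are hypothetical, and the cut-offs between the bands of the evaluation scale (for example between 0.90 and 0.91) are also an assumption.

```python
def one_way_mean_squares(scores):
    """Return (MS_B, MS_W, k) for scores[subject][trial] in a one-way layout."""
    n = len(scores)
    k = len(scores[0])
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need at least 2 subjects with the same number (>= 2) of trials")
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    ss_between = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_within = sum((x - m) ** 2 for row, m in zip(scores, row_means) for x in row)
    return ss_between / (n - 1), ss_within / (n * (k - 1)), k


def icc_single(ms_b, ms_w, k):
    """One-way single-measure ICC (Shrout and Fleiss Case 1, one trial)."""
    return (ms_b - ms_w) / (ms_b + (k - 1) * ms_w)


def icc_average(ms_b, ms_w):
    """One-way average-measure ICC (Shrout and Fleiss Case 1, mean of k trials)."""
    return (ms_b - ms_w) / ms_b


def reliability_level(icc):
    """Map an ICC to the scale in the text; band edges are assumed cut-offs."""
    for cutoff, label in [(0.95, "excellent"), (0.90, "good"), (0.80, "moderate"),
                          (0.70, "acceptable"), (0.60, "questionable")]:
        if icc > cutoff:
            return label
    return "unacceptable"


# Toy data: three subjects, two trials each (not data from the study).
toy = [[1, 2], [3, 4], [5, 6]]
ms_b, ms_w, k = one_way_mean_squares(toy)
toy_single = icc_single(ms_b, ms_w, k)
toy_average = icc_average(ms_b, ms_w)
print(round(toy_single, 4), round(toy_average, 4), reliability_level(toy_average))
```

The average-measure value is never smaller than the single-measure value when the between-person mean square exceeds the within-person one, so it matters which form a reported ICC refers to.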

Results
Because the hypothesis regarding normality of the test results was accepted (SW−W = 0.947-0.981, p = 0.169-0.593), parametric statistics were used for processing the scores (Table 2).
The pattern of change in the test scores during the seven-month period was rather similar for males and females (Figure 1). During the first four months (September-December) the results of the male and female groups remained approximately constant. In January the test results fell by about a quarter, and during February-March they returned almost to the autumn level. Throughout the period, males scored about 20% higher than females. Variation of the results inside the male and female samples was rather high (V = 26-47%). A significant difference between the male and female samples was found: 16.6% (Eq. 4), p = 0.0017. Variation within these samples was also great: it accounted for 93.6% of the total variation (Table 3). The similarity of trends in scores between males and females during the study, determined by the coefficient of changes in dynamics, was 96.3% (Eq. 3).
A strong and highly significant correlation (r = 0.806-0.965, p < 0.001) was found between all seven monthly test-retest trans-season trials (Table 4).
Because strong correlation was found between the monthly tests, one-way ANOVA with repeated measures was applied to study the trans-season reliability of the physical fitness testing (Table 5). Taking into account the significant differences regarding gender, trans-season reliability was investigated separately in the male and female samples.
Significant trans-season reliability at the excellent level was found for each of the two gender samples (ICC > 0.95, p < 0.001). Thus, both males and females showed very high trans-season reliability during the seven months of testing.
The results of testing were evaluated on five levels of physical fitness (see Table 1). The relative numbers of subjects were plotted against the corresponding levels (Figure 2). Approximately 86% of the subjects were at the average or higher levels and only about 5% were at the low level.
Trans-season reliability was also evaluated regarding trends in the numbers of subjects at the physical fitness levels. To meet the mathematical conditions of the chi-squared method regarding the minimum number of subjects in the cells of the research table, the number of levels was reduced from five to three: the low and lower than average levels were united into one, and the higher than average and high levels were united as well.
Deviations from mean values in the three new levels were calculated for the seven tests taken during the study period with (3×7−1)(2−1) = 20 degrees of freedom: chi-squared = 13.939, p = 0.834.
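The reported p-value can be checked from the chi-squared statistic and the degrees of freedom alone. The sketch below uses the closed-form upper tail of the chi-squared distribution, which is valid for an even number of degrees of freedom; the function name is illustrative. The row-by-column count (3−1)(7−1) = 12 is not taken from the study and is included only as a comparison under the usual contingency-table convention.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(X >= x) for a chi-squared variable with a positive even df.

    For df = 2m the tail equals exp(-x/2) * sum_{i=0}^{m-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term = 1.0   # i = 0 term of the sum
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


chi2_value = 13.939

# Degrees of freedom as written in the text.
df_reported = (3 * 7 - 1) * (2 - 1)
p_reported = chi2_upper_tail_even_df(chi2_value, df_reported)

# Comparison only: conventional (rows - 1)(columns - 1) for a 3 x 7 table.
df_conventional = (3 - 1) * (7 - 1)
p_conventional = chi2_upper_tail_even_df(chi2_value, df_conventional)

print(df_reported, round(p_reported, 3))
print(df_conventional, round(p_conventional, 3))
```

With 20 degrees of freedom the computed tail probability agrees with the reported p = 0.834, and under the comparison count it remains above 0.05, so the conclusion of non-significant deviations does not depend on this choice.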

Discussion
The reliability of physical fitness testing over long periods is a relevant problem in the modern theory and methods of physical education. While the effects of short-term high intensity functional training (HIFT) have been established, fitness improvements from program participation exceeding 16 weeks are unknown. Cosgrove, Crawford, and Heinrich (2019) examined the effectiveness of participation in HIFT through CrossFit. During 2013-2014, fitness performance testing was incorporated into an ongoing university CrossFit program whose participants had 0-27 months of HIFT experience (grouped into 0-6 months and 7+ months). Participants completed three separate days of assessments across 10 fitness domains before and after participating in the program for six months, which is close to the seven-month period of the present research.
The purpose of the research (to create a model of the trans-season reliability of physical fitness testing) was pursued on the example of first-year bachelor's students of the "Tourism" speciality. This approach agrees with the results of the analysis of students' physical condition in the course of increased physical activity during mountain hiking tourism undertaken by Skaliy A., Skaliy T., Muszkieta, Kalosza, and Żukow (2009). The key method for achieving the aim of this research was the composite test KONTREKS-2, which was used to evaluate the physical fitness of the tourism students. The results are in good accordance with those of the previous study by Tymoshenko and Labartkava (2011). Sport and health tourism is a promising and generally accessible means of physical education for youth. It is considered an important factor in the comprehensive development of student youth, in moral and physical rehabilitation, in the education of national consciousness, and in involvement in systematic physical exercise.
The advantages of the composite test KONTREKS-2 in evaluating the physical fitness of students are subject to an age restriction: it is not intended for adolescents younger than nineteen years. For them, another test specialised for teenagers should be used, e.g. "Field-based fitness assessment in young people: the ALPHA health-related fitness test battery for children and adolescents". The battery includes selected fitness tests: the 20 m shuttle run test to assess cardiorespiratory fitness; the handgrip strength and standing broad jump tests to assess musculoskeletal fitness; and body mass index, skinfold thickness and waist circumference to assess body composition (Ortega, Cadenas-Sanchez, Sanchez-Delgado, Mora-Gonzalez, Martinez-Tellez, Artero et al., 2015; Ruiz, Castro-Pinero, Espana-Romero, Artero, Ortega, Cuenca, et al., 2011; Ortega, Artero, Ruiz, Vicente-Rodríguez, Bergman, Hagstromer et al., 2005).
Researchers and practitioners require accurate measures of youth fitness. Evidence of validity and reliability is essential before the results of youth fitness tests can be used to make sound decisions. Mahar and Rowe (2008) proposed practical guidelines for valid and reliable youth fitness testing. They described a three-stage paradigm for validation research and provided guidance for conducting and understanding norm-referenced and criterion-referenced validity and reliability research. They also gave advice on how to administer fitness tests and how to use fitness test results in ways that promote reliability and validity in practice. Users of fitness tests are cautioned that the interpretation and use of fitness tests involve important educational, pedagogical, and psychological consequences. Confidence in youth fitness test results and in the decisions based on these scores depends upon careful test design and administration that incorporate a sound understanding of the principles of validity and reliability (Trost, Pate, Sallis, Freedson, Taylor, Dowda et al., 2002; Haddock, Poston, Heinrich, Jahnke & Jitnarin, 2016).
In addition, most reliability tests are conducted over a short period of time, but the reliability properties may be very different over the months to years in which the tests are routinely used in the field. Therefore, some typical fitness tests have been evaluated for reliability within the environment, and using the methods and timeframe, in which they are used in the field. In that evaluation, reliability was acceptable (ICC > 0.6) over an 18-month time period for all pairwise comparisons and all time points together for the push-up, vertical jump, and pull-up assessments. The Harvard step test and 60-second jump test had poor reliability (ICC < 0.6) between baseline and other time points. When the baseline data were excluded and the ICC was calculated for the 6-month, 12-month, and 18-month time points, both the Harvard step test and the 60-second jump test demonstrated acceptable reliability. Dynamic balance was unreliable in all contexts. Limit-of-agreement analysis demonstrated considerable intraindividual variability for some tests and a learning effect by administrators on others (Falk & Kennedy, 2019). In the present research, the trans-season test KONTREKS-2 showed excellent reliability (ICC > 0.95) relative to other tests used for the evaluation of physical fitness (Petersen, Thieschafer, Ploutz-Snyder, Damann & Mester, 2015).

Conclusions
A model of the trans-season reliability of physical fitness testing, created on the example of first-year bachelor's students of the "Tourism" speciality, showed its effectiveness and could be accepted for the practice of physical education in higher education.
Overall, approximately 86% of the students were at the average or higher levels and only about 5% were at the low level.
Males showed significantly better physical fitness than females (by 16.6%, p < 0.002), with 96.3% similarity of trends in scores between males and females during the trans-season study. Variation within these samples was also great: it accounted for 93.6% of the total variation. A strong and highly significant correlation (r > 0.80, p < 0.001) was found between all seven monthly test-retest trans-season trials. Significant trans-season reliability at the excellent level was found for each of the two gender samples (ICC > 0.95, p < 0.001). Deviations from mean values for the seven monthly tests undertaken during the study period were not significant (chi-squared = 13.939, p = 0.834).