A PROSPECTIVE COHORT STUDY TO PREDICT RUNNING-RELATED LOWER LIMB SPORTS INJURIES USING GAIT KINEMATIC PARAMETERS

The purpose of the study was to use gait kinematic parameters, within a prospective cohort design, to identify risk factors and to develop a statistical model for predicting running-related lower limb injuries of sportspersons. Materials and methods. The BTS G-WALK® gait analysis system was used to collect gait kinematic data from 87 subjects at an institute of physical education and sports science. The subjects were followed for a full academic season, after which the researcher enquired about their injury occurrences. Binary logistic regression was used to develop a model to predict lower limb injuries of sportspersons. Results. Increasing Range of Obliquity, Range of Tilt and Range of Rotation were associated with an increased likelihood of future running-related lower limb injury, whereas a lower Symmetry Index was associated with an increased likelihood of such injury. Conclusions. The study confirmed that it is possible to predict injury, but further research with a larger sample size is essential before practical application.


Introduction
"Whether one is a professional athlete or a weekend warrior, almost every participant in sports incurs physical injury at one time or another" (Conrad, 2006). Although there is no widely recognised definition of sports injury, injuries sustained while exercising are generally considered "sports injuries" (Bahr & Krosshaug, 2005; McCrory, 2006). For some, the analytical debate over the concept of an injury is an over-complication of a basic problem. For an in-depth clinical report or a study of a particular condition, however, deciding what an injury entails may not be so straightforward (Verhagen & Mechelen, 2010). Different scholars have defined injury types in different ways. The definition by the International Olympic Committee (IOC) is the most suitable; it describes sports injury as damage to body tissues resulting from sports, exercises, or other physical activities (Roald Bahr et al., 2012).
Regardless of the diagnosis, an injured athlete will face consequences of varying degree depending on his or her level of participation in the sport (Kraemer et al., 2009). Professional athletes risk losing income, contracts, and even their lives. Losing a key player during competition costs a team financial resources and hampers its progress. Young, developing athletes may have to end their sports careers before they have had the opportunity to truly begin. The consequences may extend to significant problems with family members and social interactions, and to a loss of the ability to perform daily activities.
Researchers have cited several factors that may contribute to sports-related injuries (Saragiotto et al., 2014). A single specific and identifiable event can cause acute injury, while chronic and overuse injuries can occur without a single identifiable event, owing to repeated micro-trauma (Knight, 2008; Verhagen & Mechelen, 2010). Numerous studies have also been conducted to predict injuries, and appropriate precautions have been suggested to prevent them. Slobounov (2008) observed a correlation between postural instability and sports injury. Apart from various internal factors, external factors such as the biomechanical pattern of movement, the training schedule, or the training surface are indicators of sports injuries (Harris-Hayes et al., 2016). Studies of running injury have identified several factors that help explain its root causes. Zhang et al. (2017) reported that abnormal gait kinematics are associated with overpronated feet, resulting in overuse injuries, and suggested that the foot kinematics of persons with overpronated feet be examined to better understand the mechanism of overuse injuries. Powell and Barber-Foss (1999) and Sorenson (2009) explored the possible utility of pre-participation screening methods for identifying athletes at higher risk of sports injury. Physical trainers or coaches can identify the biomechanical deficits of their trainees, build preventative strategies, and thereby minimise the number of injuries (Kruse & Lemmen, 2009; Sorenson, 2009). Mokha et al. (2016) found an asymmetrical functional movement screen (FMS) score to be a better predictor of musculoskeletal injuries. Springer et al. (2016) found that gait analysis can be used in the diagnosis of shoulder overuse injury. Azzam et al. (2015) and Whittaker et al. (2017) suggested focusing on high-quality cohort studies to obtain the most relevant evidence on movement quality in the prediction of injury risk.
Injuries impose health care costs on young athletes. Moreover, if a player gets injured after investing so much time, resources, and effort, all that hard work goes in vain. If injury could be predicted, proper action could be taken to mitigate it. Prediction is an integral aspect of statistical analysis, and there are several approaches to statistical prediction. Generally, statistics provides knowledge of a population based on a sample, but for predictive statistical analysis this is not necessarily the case (Johnson et al., 2012; Verma, 2013). The process of prediction is called forecasting, and it requires time series data (Verma, 2013). Various regression analysis methods and their subcategories are used for predictive studies. Prediction is mainly an extrapolation of a problem (Steyerberg, 2009). For example, "which team is going to win the next soccer tournament?" or "how long will it take Usain Bolt to complete the upcoming 100-meter race?" Steyerberg (2009) holds that the researcher should also be concerned with hypothesis testing. For example, is the height of an athlete a predictor of high jump performance? Or, in general, what are the factors responsible for high jump performance?
A prospective cohort study is a longitudinal cohort study that observes, over time, a group of similar individuals who may vary with respect to certain factors, to determine how these factors influence a certain outcome (Mann, 2012; Rango, 2016). Prospective cohort studies are essential for understanding the causes of diseases and disorders (MacGill, 2018). The distinguishing feature of a prospective cohort study is that it starts before the participants have developed any of the outcomes of interest (Setia, 2016). Subjects in a prospective cohort study are tracked over a long period to see whether an outcome occurs and to establish any association between the exposures and the outcome (Song & Chung, 2010). In this way, researchers can eventually use the exposure data to address several questions about the associations between "risk factors" and the outcomes (Verma, 2016). The principal benefit of a prospective cohort study is that it allows longitudinal observation of the risk factors, with results compiled at regularly scheduled intervals, so that recall error can be minimised.
The lower limb is the area most vulnerable to overuse injuries, most of which occur at or below the knee, with the most frequent injuries being patellofemoral pain, medial tibial stress syndrome, Achilles tendinopathy, and plantar fasciitis/plantar heel pain (Callahan, 2020). Running is a main component of any sports training, and faulty running mechanics are a root cause of lower limb musculoskeletal overuse injuries. Studies show that examination of activity patterns can not only reveal the anomalies arising from complicated walking activities but can also be used to anticipate musculoskeletal injuries (Aicale et al., 2018; Hreljac, 2004). Thijs et al. (2007) observed that anterior knee pain can be predicted by the pressure distribution during initial contact and loading response. Another study concluded that gait-related risk factors were the root cause of exercise-related lower leg pain among 400 students (Willems et al., 2006). Research has shown that participants who suffered a sports-related lower limb injury had a running pattern distinct from that of participants without injury (Hamill et al., 2012). Additionally, an increasing number of studies demonstrate that variability or fluctuation of gait favourably modulates the risk of overuse injuries among runners (Ferber et al., 2009; Hamill et al., 2012; Messier et al., 2018).
Having gone through the related research, and recognising the potential use of gait parameters as predictors of lower limb musculoskeletal injuries, it can be speculated that analysing the gait pattern during running activities may help to categorise the sportspersons at risk of lower limb injuries and to find risk factors for injuries of the sportsperson. Therefore, the purpose of the current study was to follow a prospective cohort study design to use gait kinematic parameters to identify the risk factors and to predict running-related lower limb injuries of sportspersons.

Study participants
All participants were purposively selected from the students of an institute of physical education and sports science. Initially, a total of 87 participants were selected, all of them 1st-semester students of the academic session 2017 & 2018. The average age of the participants was 18.39 ± 1.02 years, weight 62.30 ± 7.44 kg and height 171.19 ± 5.87 cm. It was ensured that no participant was injured at the commencement of data collection. The participants went through a common conditioning programme at the institute for an entire academic semester. All participants were specialised in their own games or sports. Apart from their regular, common conditioning programme, the participants continued to take part in practice for their specialised games or sports.

Ethics and consent of data collection
All procedures for data collection followed the decisions and instructions of the review board of the institute. The researcher's supervisor guided and supervised the procedures throughout the data collection period. Consent to participate in the study was obtained from the subjects before data collection. A consent form containing information about the study and the rights of the subjects was given to them before the start of data collection. The subjects were also given the right to withdraw from the study. Data were collected with minimum risk of injury to the subjects.

Selection of variables
After a careful review of the related literature, and based on the authors' own understanding, the following variables were selected for the current study (Baker, 2013; Kirtley, 2006; Levine et al., 2012; Richards et al., 2013; Whittle, 1991) (Table 1).

Selection of equipment
The gait kinematic parameters were obtained using the BTS G-WALK® gait analysis system. The G-WALK system is composed of an inertial sensor, the G-Sensor, and dedicated software named G-Studio. The G-Sensor is made up of three different sensors, an accelerometer (±1.5 g, ±6 g), a magnetometer and a gyroscope (±300 gps, ±1200 gps), coupled through sensor fusion technology. The triaxial accelerometer measures acceleration along three axes of movement, the triaxial gyroscope measures rotation about three axes, and the magnetic sensor provides a positional sense. The G-Studio software is required to control the G-Sensor and to analyse its data. The entire BTS G-WALK® system provides objective data and allows comparisons between the left and right sides of the body as well as regular data. The system also provides kinematic data on the pelvis and lower limbs of the human body (BTS SpA, Milan, Italy).

Test protocol
The Run Protocol of the BTS G-WALK® system was implemented to acquire gait kinematic data. The sensor was positioned below the S1-S2 vertebra, between the two dimples of Venus. The device was centred on the vertebral line, pointing upwards. The belt was tightened as much as required to prevent displacement during the test. The test was administered in a laboratory equipped with a treadmill. The test started with the subject standing still on the treadmill. The position was maintained for a few seconds until the end of the stabilization phase. Following the command from the operator, the subject started to run on the treadmill. The operator steadily increased the treadmill speed until it reached 8 km/h. The subject maintained this speed for at least 5 minutes. After completion of 5 minutes, the operator stopped the treadmill and manually entered the total distance travelled by the subject into the software. After the distance was entered, the BTS G-WALK® gait analysis system automatically generated values of the spatio-temporal parameters, pelvic kinematics, and symmetry index, which were further used for the study.

Table 1. Variables selected for the study
2. Stance phase duration (STPD): the average duration of the right and left foot support phase. Unit: % of the running cycle.
3. Swing phase duration (SWPD): the average duration of the right and left foot swing phase. Unit: % of the running cycle.
4. Float phase duration (FPD): the average duration of the phase in which neither foot is on the ground. Unit: % of the running cycle.
5. Propulsion speed (PS): the average pushing speed while the limb is in contact with the ground. Unit: m/s.
6. Range of pelvic tilt (RT): pelvic tilt is the rotation of the pelvis in the sagittal plane; the range of tilt is the difference between the highest and lowest angles of pelvic rotation in the sagittal plane. Unit: degree.
7. Range of pelvic obliquity (RO): pelvic obliquity is the rotation of the pelvis in the frontal plane; the range of obliquity is the difference between the highest and lowest angles of pelvic rotation in the frontal plane. Unit: degree.
8. Range of pelvic rotation (RR): pelvic rotation is the rotation of the pelvis in the transverse plane; the range of rotation is the difference between the highest and lowest angles of pelvic rotation in the transverse plane. Unit: degree.
9. Symmetry index (SI): derived from the comparison between the right and left running cycles; it is the index of similarity between the right and left running cycles.

Experimental Protocol
A prospective cohort study design was used for the present study. A group of people with similar characteristics was observed over time. The differences among the individuals of the group with respect to certain factors were identified and recorded initially, and it was later analysed how these differences affected certain outcomes. The subjects followed their general conditioning programme for a full academic season, after which the researcher enquired about their injury occurrences. Afterwards, the researcher analysed how differences in gait kinematics affected the occurrence of injury among the subjects. The whole experimental protocol is illustrated in Figure 1.

Statistical analysis
Binary logistic regression is generally used to predict the odds of occurrence of an event whose outcome is binary (Verma, 2013). With it, a model can be developed to predict a dichotomous dependent variable from categorical or numerical independent variables (predictors). In the current study, the researcher intended to develop a model to predict lower limb injuries of sportspersons from selected spatio-temporal gait kinematic parameters. Simple descriptive statistics were first used to summarise the data. To find the factors affecting the occurrence of running-related lower limb sports injuries and to develop a predictive model, binary logistic regression was carried out in IBM SPSS Version 25.0 (Armonk, 2017). All statistical tests were evaluated at a significance level of 0.05.
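For orientation, the general form of a binary logistic model with k predictors can be written as below. The symbols \beta_0, \beta_i and x_i are generic textbook notation introduced here for illustration and are not taken from the study's output.

```latex
\ln\!\left(\frac{p}{1-p}\right) = \beta_0 + \sum_{i=1}^{k} \beta_i x_i ,
\qquad
p = \frac{1}{1 + e^{-\left(\beta_0 + \sum_{i=1}^{k} \beta_i x_i\right)}} ,
\qquad
\mathrm{OR}_i = e^{\beta_i}
```

Here p is the probability of the event (injury), and OR_i is the factor by which the odds p/(1-p) are multiplied for a one-unit increase in x_i with the other predictors held constant.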

Results
The initial investigation showed that 2 of the 87 subjects had sustained accidental injuries; these subjects were therefore excluded from the study. The final, effective number of subjects was 85 (N = 85). The average age of the subjects was 18.40 ± 1.01 years, weight 62.35 ± 7.52 kg and height 171.22 ± 5.93 cm, with an average BMI of 21.24 ± 2.05 (Table 2). Of the 85 subjects, 20 (23.5%) suffered lower limb injuries (Table 3).
Before binary logistic regression was applied, simple descriptive statistics were calculated to summarise the values of the independent variables (Table 4). The assumptions of the analysis were also checked, and the data set satisfied all of them.
The omnibus tests of model coefficients (Table 5) show that model 4 is statistically significant (p < 0.05) with four degrees of freedom (df = 4). The overall model 4 is therefore statistically significant in predicting future injury.
The results in Table 7 indicate that the overall accuracy of the final model is 91.8%. The model correctly predicted 80% of the injured subjects as sustaining running-related lower limb injuries. It also correctly predicted 95.4% of the uninjured subjects as not sustaining any running-related lower limb injury.
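The percentages above can be reproduced from a 2×2 classification table. The sketch below assumes the cell counts implied by the reported figures (16 and 4 of the 20 injured subjects, 62 and 3 of the 65 uninjured subjects); the function name `classification_summary` is hypothetical and is not part of the SPSS output.

```python
def classification_summary(tp, fn, tn, fp):
    """Return accuracy, sensitivity and specificity as percentages."""
    total = tp + fn + tn + fp
    return {
        "accuracy": 100 * (tp + tn) / total,   # all correct predictions
        "sensitivity": 100 * tp / (tp + fn),   # injured subjects predicted injured
        "specificity": 100 * tn / (tn + fp),   # uninjured subjects predicted uninjured
    }


# Counts assumed from the reported percentages (20 injured, 65 uninjured).
summary = classification_summary(tp=16, fn=4, tn=62, fp=3)
for name, value in summary.items():
    print(f"{name}: {value:.1f}%")
```

With these counts the three values round to 91.8%, 80.0% and 95.4%, matching the figures reported for Table 7.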
The Wald test results in Table 8 revealed that the variables Range of Obliquity (RO), Range of Tilt (RT), Range of Rotation (RR) and Symmetry Index (SI) contributed significantly to the final model (p < 0.05).
Based on the results in Table 8, the following logistic regression model can be developed:
$\log\left(\frac{p}{1-p}\right) = 67.027 + 1.078\,\mathrm{RO} + 0.614\,\mathrm{RT} + 0.597\,\mathrm{RR} - 0.927\,\mathrm{SI}$
where p is the probability of getting injured. The logistic model was statistically significant, $\chi^2(4) = 63.42$, p < 0.05. The model explains 79.2% of the variability in running-related lower limb injury and correctly classified 91.8% of cases. Increasing Range of Obliquity, Range of Tilt and Range of Rotation were associated with an increased likelihood of future running-related lower limb injury, whereas a lower Symmetry Index was associated with an increased likelihood of such injury.
[Table 5. Omnibus tests of model coefficients: Chi-square, df and Sig. for Step 4]
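As a minimal sketch of how the fitted equation could be applied to a new runner, the function below converts the four predictors into a predicted probability of injury. The coefficients are those reported above; the function name `injury_probability` and the input values are hypothetical and chosen only for illustration, not taken from the study data.

```python
import math

# Intercept and coefficients as reported for the final model.
INTERCEPT = 67.027
COEFFS = {"RO": 1.078, "RT": 0.614, "RR": 0.597, "SI": -0.927}


def injury_probability(ro, rt, rr, si):
    """Predicted probability of injury from the reported logistic model."""
    logit = (INTERCEPT + COEFFS["RO"] * ro + COEFFS["RT"] * rt
             + COEFFS["RR"] * rr + COEFFS["SI"] * si)
    # Numerically stable inverse logit for large positive or negative values.
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)


# Hypothetical runner (illustrative values only).
p = injury_probability(ro=8.0, rt=8.0, rr=8.0, si=98.0)
print(f"Predicted probability of injury: {p:.3f}")
```

Because the model is linear on the log-odds scale, raising RO, RT or RR raises the predicted probability, while raising SI lowers it.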

Discussion
Injury is a major cause of morbidity among young athletes. Previous studies have indicated that further research is necessary to determine risk factors for injuries (Gogoi et al., 2020). Identifying the factors that contribute to sports injury may have tremendous significance for athletes' health care. The researchers therefore intended to analyse gait kinematics to determine the risk factors and to develop a predictive model for lower limb injuries among sportspersons.
In the current study, a prospective cohort design was employed, with an inertial sensor-based device used to assess the gait kinematic factors. A similar previous study suggested that inertial and GPS sensors, combined with biometric, nutrition and sleep data, can be used to track movement and physical activity, to describe injuries, to anticipate future injuries, and then to assess proactively which modifiable risk factors can be altered for optimum performance (Bourdon et al., 2017).
The results of the current study indicate that gait kinematics may be used to identify an elevated risk of lower limb injury among sportspersons. In Table 6, the Cox & Snell R Square indicates that 52.6% of the variation in the dependent variable can be explained by the final logistic model, and the Nagelkerke R Square indicates that 79.2% of the variability in the dependent variable can be explained by the independent variables in the final logistic model. Further, Table 7 reveals that the model correctly classified 91.8% of cases. The positive predictive value was 84.21% (100·16/(16+3)), which is the percentage of subjects predicted to be injured who were actually injured. The negative predictive value was 93.94% (100·62/(62+4)), which is the percentage of subjects predicted to remain uninjured who actually remained uninjured (Table 7). The value of Exp(B) (Table 8) for Range of Obliquity (RO) indicates that, keeping all other independent variables constant, a one-unit increase in RO multiplies the odds of running-related lower limb injury by 2.938. In the same way, a one-unit increase in Range of Tilt (RT) multiplies the odds by 1.848, and a one-unit increase in Range of Rotation (RR) multiplies the odds by 1.816. For a one-unit increase in Symmetry Index (SI), however, the odds are multiplied by 0.396, that is, they decrease by about 60% (Table 8).
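The figures in this paragraph can be cross-checked from the reported coefficients and counts. The sketch below is illustrative only: it assumes that the reported chi-square (63.42) is the likelihood-ratio statistic against the intercept-only model, and that the classification counts are those used in the predictive value formulas above. Small differences in the third decimal of Exp(B) arise because the coefficients are rounded.

```python
import math

# Odds ratios, Exp(B), from the reported (rounded) coefficients.
coefficients = {"RO": 1.078, "RT": 0.614, "RR": 0.597, "SI": -0.927}
odds_ratios = {name: math.exp(b) for name, b in coefficients.items()}

# Predictive values from the classification counts.
tp, fn, tn, fp = 16, 4, 62, 3
ppv = 100 * tp / (tp + fp)   # share of predicted-injured who were injured
npv = 100 * tn / (tn + fn)   # share of predicted-uninjured who were uninjured

# Pseudo R-squared values (assumes 63.42 is the likelihood-ratio chi-square).
n, injured, model_chi2 = 85, 20, 63.42
cox_snell = 1 - math.exp(-model_chi2 / n)
p_null = injured / n
null_loglik = injured * math.log(p_null) + (n - injured) * math.log(1 - p_null)
max_cox_snell = 1 - math.exp(2 * null_loglik / n)
nagelkerke = cox_snell / max_cox_snell

for name, value in odds_ratios.items():
    print(f"Exp(B) for {name}: {value:.3f}")
print(f"PPV: {ppv:.2f}%  NPV: {npv:.2f}%")
print(f"Cox & Snell: {cox_snell:.3f}  Nagelkerke: {nagelkerke:.3f}")
```

Under these assumptions the computed pseudo R-squared values agree with the 52.6% and 79.2% reported in Table 6.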
The comparative descriptive statistics (Table 4) of the independent variables reveal little difference in cadence between subjects with injury (168.85 ± 5.82) and without injury (169.31 ± 9.69). Burns et al. (2019) reported a similar result: cadence did not exhibit any significant impact on running efficiency. In contrast, several other studies have indicated that a faster running cadence helps runners to decrease their chances of injury over time (Kessler, 2020; Schubert et al., 2014; Wellenkotter et al., 2014). The overall swing phase duration (76.95 ± 3.38) of the subjects appeared higher, and the stance phase duration (23.05 ± 3.38) lower, than in normal running, as the subjects were running at a higher speed (8 km/h) (Chumanov et al., 2011; Simonsen, 2013). For both variables, there was no noticeable difference between injured and uninjured subjects. A similar trend was observed for float phase duration and propulsion speed. The range of obliquity exhibited a noticeable difference between injured (11.00 ± 1.26) and uninjured subjects (7.85 ± 1.80). Benca et al. (2020) also observed increased pelvic obliquity in a population with malalignment. In the current study, injured subjects showed a higher range of tilt. Alizadeh & Mattes (2019) suggested that pelvic tilt in the kinetic chain might predispose soccer players to injury. In addition, a recent finding indicated that distance runners may have an increased risk of running-related lower limb injury with increased hip adduction (Mokha & Gatens, 2018).
The current study also reveals that injured subjects had a higher range of rotation (10.70 ± 1.17) and a lower symmetry index (97.14 ± 1.49).
Despite the limitations of the current study, the authors have developed a model to predict running-related lower limb sports injury. Since runners can modify their running mechanics, the authors suggest that sportspersons analyse their gait kinematics and adopt preventive gait modifications to avert future injury.

Conclusions
One of the major objectives of scientific sports training is to protect players from injury, but it may not always be possible for coaches to identify injury-prone players on the field. Clinical prediction modelling may therefore provide an objective estimate for identifying injury risk. For example, a coach can use a predictive model to identify which athletes might be at risk of musculoskeletal injuries and follow it with a rehabilitation programme to help address their deficiencies (Hughes et al., 2018; Wilkerson et al., 2012). Periodic health examination has also been advised for the identification, prevention and rehabilitation of athletes who may be at risk of future injury. A similar pattern of analysis may be followed to analyse gait kinematics periodically and find faulty movement patterns, so as to prevent future injury of a sportsperson.
The main limitation of any prediction model is that there are variations in the assessment of the factors. In addition, the reliability of a model derived from such cohort studies depends strongly on the number of cases in the study. If the sample is not large enough, the data cannot be considered representative of a larger group of athletes. For future research, it is therefore suggested to use a much larger data set to establish a reliable injury prediction model.
On completion of the current study, we can ask ourselves two questions. First, can we identify the risk factors of injury? Second, can we predict injuries with certainty? To the first question, we can answer yes: within the limitations of the current study, we can identify the risk factors of lower limb injury. To the second question, based on the results, we can agree that prediction of injury is theoretically possible, but further research with a larger sample size is essential before practical application. In addition, the research was conducted on male sportspersons, so the results may not be applicable to female sportspersons. It is therefore suggested that future research also be conducted on female sportspersons.