A TIME SERIES ANALYSIS OF TUBERCULOSIS INCIDENCES IN

Tuberculosis (TB) is a serious infectious disease caused by Mycobacterium tuberculosis that mainly affects the lungs but can also attack other organs of the body. Globally, the annual number of people provided with TB treatment grew from 6 million in 2015 to 7 million in 2018 and 7.1 million in 2019 (WHO, 2020). In the Philippines alone, there were an estimated 500,000 incident cases of TB in 2019. The study’s objectives are to develop a model that indicates the occurrence of TB in Pasig City, determine the incidence rate in terms of gender and age of TB patients


INTRODUCTION
Infectious diseases are disorders caused by organisms such as bacteria, viruses, and fungi (Kotra, 2007). Examples include lower respiratory tract infections such as pneumonia, as well as tuberculosis (TB), malaria, and human immunodeficiency virus (HIV) infection. Such diseases have caused some of the pandemics that humanity has experienced, such as the bubonic plague and the coronavirus disease 2019 (COVID-19) pandemic (Parry & Peterson, 2020).
Globally, the annual number of people provided with TB treatment grew from 6 million in 2015 to 7 million in 2018 and 7.1 million in 2019. In the Philippines alone, there were an estimated 500,000 incident cases of TB in 2019. Likewise, rifampicin- and multidrug-resistant TB disease grew annually from a global total of 122,726 in 2016 to 156,205 in 2018 and 177,099 in 2019 (WHO, 2020). TB is one of the leading infectious causes of death worldwide. Despite being preventable and having an 85% treatment success rate, TB killed 1.4 million people in 2019 (WHO, 2019). In the same year, 87% of new cases came from the 30 high TB burden countries, with India at the top of the list and the Philippines among them. In the Philippines, 30% are bacteriologically confirmed. Furthermore, from 2018 to 2019, the increase in notifications in high TB burden countries correlated with the decrease in bacteriologically confirmed cases.
Extrapulmonary TB is defined in the same way as pulmonary TB, except that the disease occurs in organ systems other than the lungs. In 2019, extrapulmonary TB represented 16% of the total 7.1 million incident cases, whereas for the Philippines it ranged from 0 to 9.9%.
TB is also one of the leading causes of death in the Philippines. The country's life expectancy at birth was 69.1 years as of 2016, up from just 62.2 years in 1980, a gain achieved while combating diseases that threaten its population (WHO, 2020).
The Philippines is one of the seven highest TB burden countries in the world, and these seven countries account for 57% of the global TB incidence. In the high burden countries, 75% of individuals seek initial care in the private sector. Private healthcare providers were trained to manage TB but are not systematically supported in recording, reporting, and monitoring treatment. Furthermore, private expenditure represents 61%-74% of total health expenditure. The National Tuberculosis Control Program (NTP) in the Philippines has implemented the Public-Private Mix (PPM) strategy since 2000, yet the country has risen to the top four high-TB-incidence countries. In 2017, 317,266 (55%) of an estimated 581,000 patients with TB were notified by the NTP and reported to WHO, and 17% were from the private sector. An estimated 217,925 first-line anti-TB treatments are sold annually, comprising 43% of the country's total market; this provides evidence of a lack of systematic support for privately managed TB cases (Wells & Stallworthy, 2019).
People aged 15-49 years are considered the most productive age group affected by TB in South-East Asia (SEA) (WHO, 2020). This study follows the transmission model used by Fu et al. (2020), where TB patients were divided into three age groups: children (< 15 years old), adults (15-64 years old), and elders (≥ 65 years old). The Global Tuberculosis Report 2020 of WHO shows that the TB burden is highest in adult men aged 15 years and above, who accounted for 56% of all TB cases in 2019. This was the highest share among all age groups and sexes: adult women accounted for only 32% of the cases in the same year, and children for 12%.
In a study by Zhang et al. (2011), which considered age, sex, and race as factors in the presentation of TB, patients were divided into four age groups (15-24, 25-44, 45-64, and ≥ 65 years). The study concluded that the second and third age groups (25-44 and 45-64 years) were more likely than the other age groups to develop extrapulmonary tuberculosis (EPTB) together with pulmonary TB (PTB). Within the same age groups, those of Somali and Asian descent were at a higher risk of contracting EPTB. In contrast, those from Greenland in the same age groups were less likely to have EPTB than all other age groups.
According to Cheng et al. (2020), elderly people are at risk of developing TB due to weakened immune responses. Being male, being aged 70 years and above, living in rural areas, and having diabetes, congestive heart failure, chronic obstructive pulmonary disease, chronic kidney disease, or cancer were independent risk factors for TB in Taiwan Province (Cheng et al., 2020). As in global studies, smoking, a BMI < 18.5, and low annual household income were risk factors for developing TB among male patients. Moreover, elderly people have a higher chance of developing TB when they are current or former smokers with a low BMI (Cheng et al., 2020). Cheng et al. (2020) also cite a study conducted in 22 high TB burden countries, mostly in Asia and Africa, that reported the top contributing risk factors: malnutrition (27.0%), smoking (21.0%), and HIV infection (16.0%).

METHODOLOGY
This section presents the research design and briefly describes the study population from Pasig City. The data gathering, statistical treatment, and applicable mathematical procedures are briefly presented below.

Figure 1
Methodology Flowchart

Data Gathering
Recorded files of TB patients in Pasig City from 2015-2020 were obtained with permission from data sources under the PCHO. Using retrospective random sampling, PTB patients were categorized into three age groups: less than 15, 15-64, and 65 years or above. Subsequently, the collected data were filtered and organized in Microsoft Excel using the following format:

Cubic Spline Interpolation
Since the PTB cases were reported annually by the PCHO, a cubic spline interpolation was used to approximate the PTB cases in finer time intervals, resulting in monthly data points within the covered time range. Using the MATLAB cubic spline tool, a series of cubic polynomials were fitted in between each of the available data points (nodes), each resulting in continuous and smooth curves. The resulting cubic splines determine the cumulative change or rate of change on the given interval.
The equation is given as follows: (1)
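As an illustration of the interpolation step, the following is a minimal Python sketch of a cubic spline, not the MATLAB tool used in the study. It assumes the standard piecewise form S_i(x) = a_i + b_i(x - x_i) + c_i(x - x_i)^2 + d_i(x - x_i)^3 with natural end conditions (zero second derivative at both ends); the end conditions applied by the authors are not stated, so the interpolated values may differ from theirs. The function name is hypothetical, and the sample counts are the yearly Age Group 3 - Female values reported in the results of this paper.

```python
from bisect import bisect_right


def natural_cubic_spline(xs, ys):
    """Return a function S(x) that interpolates (xs, ys) with a natural cubic spline.

    xs must be strictly increasing. Natural end conditions are an assumption.
    """
    n = len(xs) - 1  # number of intervals between nodes
    h = [xs[i + 1] - xs[i] for i in range(n)]

    # Right-hand side of the tridiagonal system for the quadratic coefficients c_i.
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 3.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])

    # Forward sweep of the tridiagonal solve; c_0 = c_n = 0 for a natural spline.
    diag = [1.0] * (n + 1)
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        diag[i] = 2.0 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / diag[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / diag[i]

    # Back substitution gives the coefficients of each cubic piece.
    b = [0.0] * n
    c = [0.0] * (n + 1)
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2.0 * c[j]) / 3.0
        d[j] = (c[j + 1] - c[j]) / (3.0 * h[j])

    def evaluate(x):
        # Locate the interval containing x, clamping to the first/last piece.
        i = min(max(bisect_right(xs, x) - 1, 0), n - 1)
        dx = x - xs[i]
        return ys[i] + b[i] * dx + c[i] * dx ** 2 + d[i] * dx ** 3

    return evaluate


# Yearly counts (2015-2020) for Age Group 3 - Female, as reported in the results.
years = [2015, 2016, 2017, 2018, 2019, 2020]
counts = [52, 68, 70, 96, 177, 96]

spline = natural_cubic_spline(years, counts)
# One value per month from January 2015 to January 2020 + 12 months (61 points).
monthly = [spline(2015 + m / 12) for m in range(61)]
```

Every twelfth entry of `monthly` coincides with a reported yearly count, while the entries in between are estimates produced by the spline rather than observed data.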

ARIMA Model
The cubic spline-generated time series of PTB cases was then used to fit an Autoregressive Integrated Moving Average (ARIMA) model. The ARIMA model was then used to forecast the number of TB patients for the year 2021 for comparison with the actual data reported by the PCHO.
ARIMA is a statistical model that uses time series data to predict the incidence rate and forecast future trends. Its shorthand notation is ARIMA (p,d,q), where
p = non-seasonal autoregressive order
d = non-seasonal differencing
q = non-seasonal moving average order
The equation is given as follows: (3)
where
μ = mean of the data
Φ = slope parameter
Θ = moving average parameter
The choice of the ARIMA model for this research was based on the study of Yan et al. (2019), where an ARIMA statistical model was used to predict the incidence rate for 2018-2019 based on the monthly incidence of PTB in China from January 2005 to December 2017. The predicted and actual data were compared to determine the effectiveness of the ARIMA model. Furthermore, Yan et al. (2019) recommend the ARIMA model as the best to use in the short-term prediction of TB cases.
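For reference, a general ARIMA (p,d,q) forecasting equation using the symbols μ, Φ, and Θ from this paper can be sketched as follows. This is a common textbook form under the Box-Jenkins sign convention for the moving average terms, and it is not necessarily the exact expression used in the study. The symbols y_t (the series after d rounds of differencing) and e_t (the forecast error at time t) are notational assumptions.

```latex
\hat{y}_t = \mu
  + \Phi_1 y_{t-1} + \Phi_2 y_{t-2} + \dots + \Phi_p y_{t-p}
  - \Theta_1 e_{t-1} - \Theta_2 e_{t-2} - \dots - \Theta_q e_{t-q}
```

Some software, including R, writes the moving average terms with a plus sign instead, so the sign of each estimated Θ depends on the convention in use.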

Root Mean Square Error
To measure the accuracy of the prediction for the 2021 PTB cases using the ARIMA model, the Root Mean Square Error (RMSE) is computed. RMSE is the standard deviation of the residuals or prediction errors. In other words, it measures how far the data lie from the regression line, that is, how closely they follow the line of best fit. It is the square root of the average of the squared differences between the forecasted value and the observed value of the ith observation. The equation is given as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2} \tag{4}$$
where
N = the number of observations available for analysis
$\hat{y}_i$ = forecasted value for the ith observation of data
$y_i$ = observed value for the ith observation of data
The RMSE is a good measure of the accuracy of the data and helps in comparing prediction errors of the computed model. A low RMSE value (≤ 0.2) indicates that the model is highly accurate in predicting the data. Meanwhile, RMSE values between 0.2 and 0.5 show that the model can predict the data with reasonable accuracy (Kenney & Keeping, 1962).
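The RMSE described above can be computed in a few lines. The following is a minimal Python sketch (the study itself used R); the function name `rmse` and the sample values are hypothetical and serve only to illustrate the calculation.

```python
import math


def rmse(forecast, observed):
    """Root mean square error between paired forecasted and observed values."""
    if len(forecast) != len(observed) or not forecast:
        raise ValueError("inputs must be non-empty and of equal length")
    # Average the squared differences, then take the square root.
    squared_errors = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical values: every forecast is off by exactly 2 units.
example = rmse([1.0, 1.0, 1.0, 1.0], [3.0, 3.0, 3.0, 3.0])
print(example)
```

Because RMSE is expressed in the same units as the data, thresholds such as 0.2 or 0.5 are only meaningful once the scale of the data is taken into account.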

RESULTS AND DISCUSSION

Data Gathered
The data acquired from the PCHO for 2015-2020 were anonymized. Since the study aims to present an age-gender dependent analysis of PTB occurrences, the demographic profile of patients obtained was filtered to these two variables. The data were organized per year and grouped according to age and gender. The original eight age groups were narrowed down to three based on the study of Fu et al. (2020). The first group included children under 15 who were recorded to have TB and are classified as pediatric TB (CDC, 2021). The second age group, which ranges from 15 to 64 years of age, was considered the active age group among all three and is where most of the working class was found. The last group, all 65 years old and above, were grouped together because they are more susceptible to acquiring different diseases, especially TB, due to weakened immunity brought about by older age. The mechanisms driving age disparities in the prevention and treatment of TB are not clear, and age is a crucial factor in shaping TB epidemiology, progression risks, and TB control (Fu et al., 2020). The three groups were further categorized by gender, resulting in six study groups. The data for the year 2021 were also obtained to compare the forecasted occurrence with the empirical records.

Study Population
After the acquisition and sorting of the data, they were processed in RStudio according to the research needs. TB patients diagnosed from 2015-2020 were the ones considered. The data were divided into groups based on the data provided by the PCHO. The age categories of patients are presented in Table 2 and are further categorized according to gender in Table 3.

Frequency Distribution of the Sample
This part presents the frequency distribution of the TB data per year regarding their respective age group and gender. The summary distribution is presented in Table 4.

Table 4
Distribution
Based on the data, AG1 has more cases than AG3, and both remain consistently low throughout 2015-2019; in 2020 and 2021, however, there is a rise in the number of cases for AG3. This is due to the restriction on movement for ages 0-14 during the COVID-19 pandemic; as restrictions ease, TB detection improves (WHO, 2020). AG2, which has the largest proportion of the three groups, remains consistently high. According to a study by Snow et al. (2018), over a quarter of TB patients in the Philippines are children, adolescents, and young adults, reflecting the country's young population. The Philippines' large number of patients under 25 has implications both in the short term, when the disease burden among children and young people is significant, and in the long term, as the current generation ages with a high incidence of latent TB infection.

Incidence Rate in Terms of Gender and Age
The rate of occurrence of PTB among patients of Pasig City is examined according to age and gender from 2015 to 2020. A visualization of the occurrence of PTB among patients in Pasig City according to gender and age is presented in Figures 2 and 3, respectively.

Figure 3
Age Group Percentage Occurrence of Tuberculosis (2015-2020)
Case notification rates are higher for males than for females. Based on the pie chart, TB cases are more frequent among men than women, at a ratio of approximately 60:40, which suggests that men are more likely to get TB than women. According to Horton et al. (2016), TB prevalence is significantly higher among men than women in low- and middle-income countries, presenting strong evidence that men are disadvantaged in accessing TB care in many settings. Globally, the notification rate for males is 1.8 times that for females (WHO, 2020). As Pasig City recovers from the economic slump brought about by the pandemic, the city is enticing more investors. An ordinance was passed to help the city recover from the economic collapse, creating an economic development and investment office. Investor-friendly ordinances make investors more likely to set up their offices in the city, bringing jobs that include physical labor (Tucay Quezon & Ibanez, 2021) and exposing these men to diseases, including TB. National TB programs and global strategies should recognize the male group as a high-risk group and improve access to diagnostic and screening services to effectively monitor and manage TB and ensure gender equity in TB care.
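As an arithmetic illustration of how a male-to-female split of this kind is derived from yearly counts, the Python sketch below uses only the Age Group 3 counts for 2015-2020 that this paper reports in its cubic spline results. The function name `sex_share` is hypothetical, and the other age groups are left out because their male counts are not listed in the text.

```python
def sex_share(male_counts, female_counts):
    """Return (male %, female %) of all cases across the given yearly counts."""
    male_total = sum(male_counts)
    female_total = sum(female_counts)
    total = male_total + female_total
    if total == 0:
        raise ValueError("no cases to summarise")
    return 100.0 * male_total / total, 100.0 * female_total / total


# Yearly counts for 2015-2020, Age Group 3 (65 years and above).
ag3_male = [89, 76, 105, 124, 228, 125]
ag3_female = [52, 68, 70, 96, 177, 96]

male_pct, female_pct = sex_share(ag3_male, ag3_female)
print(f"AG3 male: {male_pct:.1f}%, female: {female_pct:.1f}%")
```

The same function can be applied to the other age groups once their yearly male and female counts are available.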

On the other hand, for the age group percentage of tuberculosis throughout 2015-2020, as observed in Figure 3, AG2 consistently makes up the largest part of the pie chart. This aligns with the statistics released by WHO in 2020, in which the TB burden was highest in adult men aged 15 years and above, who accounted for 56% of all TB cases globally in 2019. The prevalence of TB is high in the ages 15-24 and 25-34 since they are more likely to develop TB. AG2 has the largest proportion of the three age groups and remains consistently high. Based on the national TB registry, the Department of Health (DOH) reported 311,000 TB cases in 2021 compared to 263,000 in 2020. The Philippines is one of the 16 countries where essential TB-related services were affected by the COVID-19 pandemic. In 2020, TB services decreased relative to the level recorded globally in previous years, with the Philippines among the most affected countries (WHO, 2020). This explains the large difference between the 2019 and 2020 TB data of the PCHO.

Cubic Spline Interpolation
This part demonstrates the cubic spline interpolation used to estimate values between the given data points. Because the data from the PCHO are available only per year, the yearly data were interpolated to obtain finer data points: 12 interpolated points were generated between consecutive yearly data points to simulate monthly occurrences of TB. The interpolation was done monthly to provide more data points for identifying the best ARIMA model in the next section. The graph of the actual data was smoothed through cubic spline interpolation, and more points are now available for the time series analysis and ARIMA modeling.
The following shows the interpolated distribution in comparison to the actual data:
Figure 6 portrays the graph of the interpolated points versus the actual data points for Age Group 3 - Male. The highlighted points were the confirmed number of TB patients in the group, with values of 89, 76, 105, 124, 228, and 125, respectively. Figure 7 shows the graph of the interpolated points versus the actual data points for Age Group 1 - Female. The highlighted points were the confirmed number of TB patients in the group, with values of 200, 224, 178, 206, 141, and 50, respectively. Figure 8 shows the graph of the interpolated points versus the actual data points for Age Group 2 - Female. The highlighted points were the confirmed number of TB patients in the group, with values of 517, 548, 608, 775, 1154, and 1137, respectively. Figure 9 shows the graph of the interpolated points versus the actual data points for Age Group 3 - Female. The highlighted points were the confirmed number of TB patients in the group, with values of 52, 68, 70, 96, 177, and 96, respectively.

Figure 9
Interpolated vs. Actual Data of Age Group 3-Female

ARIMA Model
After performing the cubic spline interpolation, the ARIMA parameters were estimated for each group to aid in forecasting the number of TB patients for 2021. Only the values for December are considered, since the data are reported on a yearly basis; the values from January to November are interpolated points that aid in the prediction of the graph. In Table 5, the ARIMA model that best fits AG1M is ARIMA (2,2,1), since it produced the lowest AIC, with the value -373.19, using the auto.arima() function, which determines the best model in the R programming language. Since p = 2 and d = 2, the predicted data are expected to increase. Figure 10 illustrates a graphical representation of the predicted values.

Figure 10
AG1M Prediction for 2021
In Table 7, the ARIMA model that best fits AG2M is ARIMA (0,2,0), since it produced the lowest AIC value, 397.8. Since d = 2, the AG2M predictions will either linearly increase or decrease. That is why there is a sharp decrease and then an increase in the curve for the predicted data in the graph. A factor that caused the downward trend in the prediction is the large difference in the actual data between 2019 and 2020, which are 1,859 and 1,137, respectively.
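The linear behaviour of an ARIMA (0,2,0) forecast can be made concrete with a short Python sketch. It assumes a model without a constant term, in which the second differences of the series are treated as white noise, so each point forecast simply continues the straight line through the last two observations. The function names and the toy series are hypothetical; this is not the auto.arima() fit reported in Table 7.

```python
def difference(series, d=1):
    """Apply d rounds of first differencing to a sequence of numbers."""
    out = list(series)
    for _ in range(d):
        out = [later - earlier for earlier, later in zip(out, out[1:])]
    return out


def forecast_arima_020(series, steps):
    """Point forecasts of ARIMA(0,2,0) without a constant (assumption).

    Setting the expected second difference to zero gives
    y[t+h] = y[t] + h * (y[t] - y[t-1]), a straight line.
    """
    if len(series) < 2:
        raise ValueError("at least two observations are required")
    last, previous = series[-1], series[-2]
    slope = last - previous
    return [last + h * slope for h in range(1, steps + 1)]


# Toy series of squares: differencing twice leaves a constant.
second_differences = difference([1, 4, 9, 16, 25], d=2)

# Toy series whose last step is +3, so forecasts rise by 3 per period.
toy_forecast = forecast_arima_020([10, 12, 15], steps=3)
print(second_differences, toy_forecast)
```

Under this assumption, the direction of the forecast depends entirely on the last two points of the interpolated series, which is consistent with the sensitivity to the 2019-2020 drop described above.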

Figure 11
AG2M Predictions for 2021
Table 8 shows the predicted monthly values for AG2M for 2021 obtained using the ARIMA model. The predicted number for 2021 is 656.82307, with a standard error of 98.3654, while the actual number is 1,410 TB patients. As mentioned before, there is a large difference in the total number of persons diagnosed with PTB between 2019 and 2020, which made the prediction for 2021 sharply decrease and then increase. Another factor that explains the sharpness of the curve is that the prediction was obtained only by interpolating the actual data from 2015-2020. The data given by the PCHO are only yearly; hence, the accuracy of the prediction is affected. Figure 11 illustrates the graphical representation of the AG2M predictions for 2021.
Table 9 shows that the best-fit model for AG3M is ARIMA (2,2,2), since it produced the lowest AIC value, -359.56. Since p = 2 and d = 2, the AG3M data are predicted to linearly increase or decrease. That is why the graph shows a sharp decrease and then an increase in the curve for the predicted data, similar to the graph of AG2M.

Figure 12
AG3M Predictions for 2021
Table 10 presents the monthly predicted values for AG3M for 2021 obtained from the ARIMA model. The predicted number of PTB patients for 2021 in Pasig City is 196.45342, with a standard error of 25.04793, while the actual data for the count of patients in 2021 is 206. Although it can be observed that some of the monthly predictions were far from the actual points, they were only compared to the actual splined points. The only observation that should be the basis for the model's accuracy is the predicted value for December since the data is cumulatively collected at the end of the year. As observed in the graph, there is a huge dip in the 75th to 76th month. A factor that made the prediction rise and decrease sharply is the difference between the 2019 and 2020 actual data, which are 125 and 206, respectively. Similar to AG2M, the accuracy of the prediction is affected by the interpolated data of AG3M. Figure 12 illustrates the graphical representation of the predicted values for AG3M for 2021.
In Table 11, the best-fit model for AG1F is ARIMA (2,2,2) since it produced the lowest value of AIC, which is -362.57. Since p = 2 and d = 2, the AG1F data will either increase or decrease. Table 12 shows the monthly predicted values for AG1F for 2021 obtained from the ARIMA model. The predicted number of TB patients in Pasig City for AG1F is 50.19175, with a standard error of 25.8370. The actual data of TB patients in 2021 is 65. Figure 13 depicts the graphical representation of the predicted values for AG1F for 2021.

Figure 13
AG1F Predictions for 2021
In Table 13, the best-fit model for AG2F is ARIMA (0,2,1) since it produced the lowest value of AIC, which is 549.46. Like in AG1F, d = 2 means that the data for AG2F is not stationary, and the prediction may still increase or decrease over time.

Interpolated vs Actual Data of Age Group 2 - Female
In Table 14, the projected number of TB patients in Pasig City for AG2F is 582.9836, with a standard error of 99.38276, while the actual data for the count of patients in 2021 is 871. Note that the difference between the actual data and the predicted data is large. As observed in Figure 14, there is a slight increase from 2020 up to 2021 but a sharp decrease from 2019 to 2020. Similar to AG2M, the accuracy of the prediction is affected by the large proportion of the data for AG2F.
Table 15 presents the ARIMA model for AG3F. The best-fitted model for the age group is ARIMA (1,1,1), with AIC equal to 155.28. Since p = 1, the data will increase or decrease linearly; since d = 1, the data is not stationary and will still tend to go up or down as time progresses.
In Table 16, the predicted number of TB patients in Pasig City for AG3F is 107.4912 with a standard error of 18.43877, while the actual data in 2021 is 115. The graph shows a sharp decrease from 2019 to 2020 because of the difference in the actual data from 2019 and 2020.

Root Mean Square Error
After estimating the parameters of the ARIMA model per group, the predicted values will be compared to the actual values, and its accuracy will be measured through RMSE.
The researchers used ARIMA modeling to forecast the number of TB cases in 2021 using the data collected from 2015-2020. Due to the lack of data points, as the given data were cumulatively in years, the researchers performed cubic spline interpolation to facilitate the data smoothing.
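As a rough illustration of this interpolation step, the sketch below fits a natural cubic spline through yearly totals and evaluates it at monthly points. Several things here are assumptions rather than details from the study: the yearly values are placeholders, each yearly total is placed at month 12 of its year, and natural boundary conditions are used. The software and settings used by the researchers are not stated.

```python
def natural_cubic_spline(x, y):
    """Return a function evaluating the natural cubic spline through (x, y)."""
    n = len(x) - 1                       # number of intervals
    h = [x[i + 1] - x[i] for i in range(n)]

    # Right-hand side of the tridiagonal system for the quadratic coefficients c.
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 3 * (y[i + 1] - y[i]) / h[i] - 3 * (y[i] - y[i - 1]) / h[i - 1]

    # Forward sweep of the tridiagonal solve.
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        pivot = 2 * (x[i + 1] - x[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / pivot
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / pivot

    # Back substitution; c[0] = c[n] = 0 is the natural boundary condition.
    b = [0.0] * n
    c = [0.0] * (n + 1)
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (y[j + 1] - y[j]) / h[j] - h[j] * (c[j + 1] + 2 * c[j]) / 3
        d[j] = (c[j + 1] - c[j]) / (3 * h[j])

    def evaluate(t):
        # Locate the interval containing t (end intervals are reused outside the knots).
        i = n - 1
        for k in range(n):
            if t <= x[k + 1]:
                i = k
                break
        dx = t - x[i]
        return y[i] + b[i] * dx + c[i] * dx ** 2 + d[i] * dx ** 3

    return evaluate


# Placeholder yearly totals for 2015-2020 (NOT the PCHO figures).
yearly = [100.0, 120.0, 150.0, 170.0, 180.0, 110.0]
knots = [12 * (k + 1) for k in range(len(yearly))]   # months 12, 24, ..., 72

spline = natural_cubic_spline(knots, yearly)
monthly = [spline(m) for m in range(12, 73)]         # one value per month
print(len(monthly), round(monthly[0], 2), round(monthly[-1], 2))
```

The spline reproduces each yearly total exactly at its own month, while every value in between is generated by the curve rather than observed, which is why the interpolated points can shape the forecast.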
As observed in Table 17, actual values of TB patients per age group of 2021 were given and compared to the generated predicted values.
For AG1M, in Table 6, it can be observed that the predicted data for 2021 has a value of 96.7013 with a standard error of 11.37 compared to the actual data, which is 106. Here, the RMSE value for AG1M is 2.684303474, which deems the predicted model accurate.
For AG2M, in Table 8, it can be observed that the predicted data for 2021 has a value of 656.82307 with a standard error of 98.3654 compared to the actual data, which is 1,410. The RMSE value for AG2M is 217.4234516, which deems the model inaccurate. This is due to the large difference between the years 2019 and 2020.
For AG3M, in Table 10, it can be observed that the predicted data for 2021 has a value of 196.45342 with a standard error of 25.04793 compared to the actual data, which is 206. The RMSE value for AG3M is 2.755860266, which deems the predicted model accurate.
For AG1F, in Table 12, it can be observed that the predicted data for 2021 has a value of 50.19175 with a standard error of 25.83708 compared to the actual data, which is 65. The RMSE value for AG1F is 4.274773562, which deems the predicted model accurate.
For AG2F, in Table 14, it can be observed that the predicted data for 2021 has a value of 582.9836 with a standard error of 99.38276 compared to the actual data, which is 871. The RMSE value for AG2F is 83.14317304, which deems the model inaccurate. This is due to the large difference between the years 2019 and 2020. AG2F results are similar to that of AG2M.
For AG3F, in Table 16, it can be observed that the predicted data for 2021 has a value of 107.4912 with a standard error of 18.43877 compared to the actual data, which is 115. The RMSE value for AG3F is 2.167603851, which deems the predicted model accurate.
Table 17 summarizes the predicted values per group compared to the actual value and their computed root mean square error.
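The RMSE figures quoted for each group can be checked against the reported predicted and actual values. The text does not state the formula, but the numbers are consistent with taking the squared difference between the 2021 actual and the predicted value, dividing by 12 (the number of forecast months), and taking the square root. The sketch below reproduces them under that assumption; the function name and the dictionary layout are illustrative.

```python
import math

# (predicted 2021 value, actual 2021 value) as reported in the text
results = {
    "AG1M": (96.7013, 106),
    "AG2M": (656.82307, 1410),
    "AG3M": (196.45342, 206),
    "AG1F": (50.19175, 65),
    "AG2F": (582.9836, 871),
    "AG3F": (107.4912, 115),
}


def rmse_single_point(predicted, actual, n_months=12):
    """RMSE when one squared error is averaged over n_months (assumed formula)."""
    return math.sqrt((actual - predicted) ** 2 / n_months)


rmse = {group: rmse_single_point(p, a) for group, (p, a) in results.items()}
for group, value in rmse.items():
    print(f"{group}: {value:.4f}")
```

Under this reading, the RMSE is the absolute December error scaled by a constant, so ranking the groups by RMSE is the same as ranking them by absolute error, and the larger groups (AG2M, AG2F) will show larger values partly because their counts are larger.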

Conclusions and Recommendations
Infectious diseases have been one of the causes of ailments all over the globe and can even cause death. One of these is TB, which is the focus of this study. This study aims to forecast TB cases in Pasig City for 2021 using the data collected from 2015-2020.
The researchers acquired TB data from the PCHO to obtain the occurrence of TB per year. Based on the PCHO data, AG2 has the highest number of cases, mainly in the 15-34 age range. Adulthood and early adolescence are key risk periods for TB infection, disease, and adverse effects. AG1 had more cases than AG3 throughout 2015-2019, but in 2020 and 2021 there was a rise in the number of cases for AG3. This is attributed to the restriction in movement for AG1 during the COVID-19 pandemic, which made it hard to travel to healthcare facilities (WHO, 2020).
Across all the cases in the last six years, more male patients were recorded than female. Also, men are more likely to be exposed to TB since they are more likely to obtain employment involving physical labor. In terms of age, AG2 comprises most of the recorded cases since it is also the largest age group, which indicates that the most productive age group was also highly exposed to TB among all the age groups.
The best-fit ARIMA models for each group are as follows: AG1M with (2,2,1), AG2M with (0,2,0), AG3M with (2,2,2), AG1F with (2,2,2), AG2F with (0,2,1), and AG3F with (1,1,1).
The computed RMSEs suggest how accurate the prediction was. AG1M, AG3M, and AG3F have the lowest values of RMSE with values of 2.684303474, 2.755860266, and 2.167603851, respectively. Therefore, AG1M, AG3M, and AG3F have the highest prediction accuracy among all groups.
It is recommended that future researchers 1) expand the data set to at least 10 years and on a seasonal basis, either quarterly or monthly, for a more accurate prediction of TB cases; 2) use other statistical treatments like Poisson regression to forecast data; and 3) use other analyses of TB cases, such as getting the survival rate of patients of different age groups.