Authors: T. Jai Sankar, R. Prabakaran, K. Senthamarai Kannan and S. Suresh
Abstract: This study proposes a technique that uses the Autoregressive Integrated Moving Average (ARIMA) model to forecast cattle production. Stochastic modeling and forecasting play a vital role in many fields such as agricultural production, animal husbandry economics and stock price prediction. The ARIMA model was introduced by Box and Jenkins. Hosking introduced a family of models called fractionally differenced autoregressive integrated moving average models by allowing the differencing parameter d in ARIMA (p, d, q) models to take fractional values. Mandal used the ARIMA model to analyze sugarcane production. This study analyses the design of the ARIMA process in order to select the appropriate model for cattle production in Tamilnadu. The results are verified by various diagnostic checks and error analysis, and the selected model is used to forecast future values. The results are also shown graphically and numerically.
T. Jai Sankar, R. Prabakaran, K. Senthamarai Kannan and S. Suresh, 2010. Stochastic Modeling for Cattle Production Forecasting. Journal of Modern Mathematics and Statistics, 4: 53-57.
INTRODUCTION
India is an agricultural country with about 70% of its population dependent on income from agriculture. Cattle and buffaloes are maintained for milk production, motive power for various farm operations, village transport, irrigation and production of manure. The animals are generally maintained on agricultural byproducts and crop residues. Small-income farmers and dairy developers depend heavily on cattle production, yet cattle production has been very low over the past 25 years.
In fact, livestock and humans depend on each other. Cattle were raised mainly for male calves, which were used in agricultural fields, and for dung to enrich the soil. The more cattle maintained, the greater the availability of bullock/draught power and farmyard manure, and hence the higher the productivity and production.
MATERIALS AND METHODS
In this study, the data on cattle production in Tamilnadu were collected from the Department of Animal Husbandry and Veterinary Services, Government of Tamilnadu, for the period 1970-2008. The ARIMA model was introduced by Box and Jenkins (1970) and is used to discover the pattern in time series data and to predict its future values. Akaike (1970) discussed stationary time series represented by an AR (p) model, where p is finite and bounded by an integer. Moving Average (MA) models were first used by Slutzky (1973). Hannan and Quinn (1979), for pure AR models, and Hannan (1980), for ARMA models, suggest obtaining the order of a time series model by minimizing the errors. Prajnesh and Venugopalan (1996) discussed various statistical modeling techniques, viz., polynomial, ARIMA time series methodology and the nonlinear mechanistic growth modeling approach, for describing marine, inland and total fish production of the country during the period 1950-51 to 1994-95. Model parameters were estimated using the Statistical Package for Social Sciences (SPSS) to fit the ARIMA models. The ARIMA process for any variable involves four steps: identification, estimation, diagnostic checking and forecasting. Each of these four steps is explained for cattle production.
ARIMA process: A time series that follows an AR and MA model after differencing is known as an autoregressive integrated moving average (ARIMA) model. Autoregressive process of order (p) is:
Moving average process of order (q) is:
The general form of ARIMA model of order (p, d, q) is:
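Written out in the usual Box-Jenkins notation, the three processes take the following forms. The symbols μ (constant), φ_i (autoregressive coefficients), θ_j (moving average coefficients) and Δ^d (d-th order differencing) are assumed here for illustration, as is the sign convention on the moving average terms:

```latex
% AR(p)
Y_t = \mu + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \dots + \phi_p Y_{t-p} + \varepsilon_t

% MA(q)
Y_t = \mu - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \dots - \theta_q \varepsilon_{t-q} + \varepsilon_t

% ARIMA(p, d, q)
\Delta^d Y_t = \mu + \phi_1 \Delta^d Y_{t-1} + \dots + \phi_p \Delta^d Y_{t-p}
             - \theta_1 \varepsilon_{t-1} - \dots - \theta_q \varepsilon_{t-q} + \varepsilon_t
```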
Where the ε_{t}’s are independently and normally distributed with zero mean and constant variance σ^{2} for t = 1, 2, ..., n. Different models can be obtained from various combinations of autoregressive and moving average terms. The best model is the one with a low Akaike’s Information Criterion (AIC), which is defined by:
Where m = p+q+P+Q and L is the likelihood function. Since -2 log L is approximately equal to n (1+log 2π) + n log σ^{2}, where σ^{2} is the mean square error, AIC can also be written as:
and the Schwarz Bayesian Criterion (SBC) is defined by:
The adequacy of the model is checked by applying the Q statistic to the residuals. A modified Q statistic, the Box-Ljung Q statistic, is defined by:
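The criteria can be collected in standard notation as follows. This is a sketch: the two AIC forms follow directly from the definitions in the text, while the SBC form shown is the usual textbook definition and is an assumption about the form intended here.

```latex
\mathrm{AIC} = -2 \log L + 2m

\mathrm{AIC} \approx n\,(1 + \log 2\pi) + n \log \sigma^{2} + 2m

\mathrm{SBC} = -2 \log L + m \log n
             \approx n\,(1 + \log 2\pi) + n \log \sigma^{2} + m \log n
```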
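In its standard form the statistic reads as below, where K, the number of lags tested, is a symbol assumed here for illustration:

```latex
Q = n\,(n + 2) \sum_{k=1}^{K} \frac{r_k^{2}}{n - k}
```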
Where:
r_{k} = The residual autocorrelation at lag k
n = The number of residuals
The Q statistic is compared to the critical value from the Chi-square distribution. If the p-value associated with the Q statistic is small (p<α), the model is considered inadequate. Once a tentative model has been selected, its parameters are used to forecast future periods.
Analysis and trend fitting techniques: To evaluate the adequacy of the AR, MA and ARIMA processes, various reliability statistics have been used: R^{2}, Stationary R^{2}, Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE), Maximum Absolute Percentage Error (MaxAPE), Mean Absolute Error (MAE), Maximum Absolute Error (MaxAE) and Normalized BIC. The lower the error statistics and Normalized BIC (and the higher the R^{2} values), the better the efficiency of the model in predicting future cattle production. The Box-Ljung Q statistic has also been calculated.
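The residual check can be illustrated with a minimal Python sketch that computes the Box-Ljung Q statistic from a list of residuals. The function names and the `max_lag` parameter are hypothetical, and the comparison against the Chi-square critical value is left to a statistical table or package.

```python
def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at the given lag."""
    n = len(values)
    mean = sum(values) / n
    denominator = sum((v - mean) ** 2 for v in values)
    numerator = sum(
        (values[t] - mean) * (values[t - lag] - mean) for t in range(lag, n)
    )
    return numerator / denominator


def box_ljung_q(residuals, max_lag):
    """Box-Ljung Q = n (n + 2) * sum over k = 1..max_lag of r_k^2 / (n - k)."""
    n = len(residuals)
    total = sum(
        autocorrelation(residuals, k) ** 2 / (n - k)
        for k in range(1, max_lag + 1)
    )
    return n * (n + 2) * total
```

A large Q relative to the Chi-square critical value suggests that the residuals are still autocorrelated, that is, that the tentative model is inadequate.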
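As an illustration of the error-based statistics listed above, the following Python sketch computes RMSE, MAE, MaxAE, MAPE and MaxAPE from actual and fitted values. The function name is hypothetical, and the RMSE here divides by the number of observations, which is an assumption; statistical packages may instead adjust the divisor for the number of estimated parameters.

```python
import math


def fit_statistics(actual, fitted):
    """Return common error statistics for a fitted series."""
    errors = [a - f for a, f in zip(actual, fitted)]
    abs_errors = [abs(e) for e in errors]
    # Absolute percentage errors; assumes no actual value is zero.
    pct_errors = [100.0 * abs(e) / abs(a) for e, a in zip(errors, actual)]
    n = len(errors)
    return {
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "MAE": sum(abs_errors) / n,
        "MaxAE": max(abs_errors),
        "MAPE": sum(pct_errors) / n,
        "MaxAPE": max(pct_errors),
    }
```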
RESULTS AND DISCUSSION
Model identification: An ARIMA model is designed after assessing whether the variable to be forecast is a stationary series. A stationary series is one whose values vary over time around a constant mean with constant variance. Stationarity is checked graphically.
Figure 1 shows that the data are non-stationary. Non-stationarity in the mean is corrected by first differencing of the data. For this purpose, the autocorrelations up to 12 lags were computed; these, along with their significance as tested by the Box-Ljung test, are shown in Table 1.
Fig. 1: Time plot of cattle production in Tamilnadu
Table 1: ACF and PACF of cattle production
^{a}The underlying process assumed is independence (white noise), ^{b}Based on the asymptotic chi-square approximation
Fig. 2: ACF of differenced data
Fig. 3: PACF of differenced data
The graphs of the ACF and PACF are shown in Fig. 2 and 3. The tentative ARIMA models are fitted to the once-differenced series and the model with the minimum normalized BIC (Bayesian Information Criterion) is chosen.
The models and corresponding normalized BIC values are shown in Table 2. The value of normalized AIC is 959.78 and the R^{2} value is 99%. The most suitable model for cattle production is therefore ARIMA (1, 1, 0), as this model has the lowest AIC value.
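The differencing and autocorrelation steps described here can be sketched in a few lines of Python. The function names are hypothetical and the code is a minimal illustration, not the SPSS procedure used in the study.

```python
def difference(series, order=1):
    """Apply first differencing `order` times: y_t - y_{t-1}."""
    result = list(series)
    for _ in range(order):
        result = [result[t] - result[t - 1] for t in range(1, len(result))]
    return result


def sample_acf(series, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    denominator = sum((v - mean) ** 2 for v in series)
    acf = []
    for lag in range(1, max_lag + 1):
        numerator = sum(
            (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
        )
        acf.append(numerator / denominator)
    return acf
```

For a series that is non-stationary in the mean, the autocorrelations of the raw data typically decay slowly, while those of `difference(series)` die out quickly.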
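Selecting a model by an information criterion can be sketched as follows. The code uses the approximation -2 log L ≈ n (1 + log 2π) + n log σ^{2} from the Materials and Methods section; the SBC penalty m log n is the standard textbook form and is assumed here. The function names and the structure of `candidates` are hypothetical, and the sketch does not reproduce SPSS's normalized BIC.

```python
import math


def information_criteria(n, sigma2, m):
    """AIC and SBC from n observations, mean square error sigma2, m parameters."""
    minus_two_log_l = n * (1 + math.log(2 * math.pi)) + n * math.log(sigma2)
    return {
        "AIC": minus_two_log_l + 2 * m,
        "SBC": minus_two_log_l + m * math.log(n),
    }


def select_model(candidates, n, criterion="SBC"):
    """Pick the candidate with the lowest criterion value.

    `candidates` maps a model label to a (sigma2, m) pair.
    """
    scores = {
        label: information_criteria(n, sigma2, m)[criterion]
        for label, (sigma2, m) in candidates.items()
    }
    best = min(scores, key=scores.get)
    return best, scores
```

With equal residual variance, the model with fewer parameters is preferred; a model with more parameters wins only if it reduces the mean square error enough to offset the penalty.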
Model estimation: Model parameters were estimated using the SPSS package. The results of the estimation are shown in Tables 3 and 4.
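For readers without SPSS, the estimation step for an ARIMA (1, 1, 0) model can be approximated by ordinary least squares on the differenced series. This is a swapped-in technique (conditional least squares), not the estimator SPSS uses, so its estimates will generally differ somewhat from those in Table 3. The function name is hypothetical.

```python
def fit_ar1_on_differences(series):
    """Least-squares fit of d_t = const + phi * d_{t-1}, where d_t = y_t - y_{t-1}."""
    diffs = [series[t] - series[t - 1] for t in range(1, len(series))]
    lagged = diffs[:-1]   # d_{t-1}
    current = diffs[1:]   # d_t
    n = len(lagged)
    mean_lagged = sum(lagged) / n
    mean_current = sum(current) / n
    sxx = sum((x - mean_lagged) ** 2 for x in lagged)
    sxy = sum(
        (x - mean_lagged) * (y - mean_current) for x, y in zip(lagged, current)
    )
    phi = sxy / sxx
    const = mean_current - phi * mean_lagged
    return phi, const
```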
Diagnostic checking: Based on the estimation, the autocorrelations and partial autocorrelations of the residuals of various orders are analysed.
Table 2: BIC values of ARIMA (p, d, q)
Table 3: Estimated ARIMA model of cattle production
Table 4: Estimated ARIMA model fit statistics
Table 5: Residual of ACF and PACF of cattle production
Fig. 4: Residuals of ACF and PACF
For this purpose, the various autocorrelations up to 12 lags were computed and the same along with their significance are shown in Table 5. As the results show, none of these autocorrelations is significantly different from zero at a reasonable level.
Table 6: Forecast of cattle production
This indicates that the selected ARIMA model is appropriate. The ACF and PACF of the residuals, shown in Fig. 4, also indicate a good fit of the model. The fitted ARIMA model for the cattle production data is therefore:
Forecasting: Forecast values of cattle production (quantity in numbers) for the years 2009 through 2015 are shown in Table 6. To assess the forecasting ability of the fitted ARIMA model, important measures of the accuracy of the sample-period forecasts were computed.
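The way an ARIMA (1, 1, 0) model produces multi-step forecasts can be sketched as below. The model is assumed to be written as d_t = const + phi * d_{t-1} + ε_t for the first differences d_t; the function name and any coefficient values passed to it are hypothetical and are not the estimates reported in Table 3.

```python
def forecast_arima_110(series, phi, const, steps):
    """Point forecasts from an ARIMA(1, 1, 0) model.

    Each forecast difference is const + phi * (previous difference);
    forecast levels are obtained by accumulating the differences.
    """
    last_level = series[-1]
    last_diff = series[-1] - series[-2]
    forecasts = []
    for _ in range(steps):
        last_diff = const + phi * last_diff   # future errors are set to zero
        last_level = last_level + last_diff
        forecasts.append(last_level)
    return forecasts
```

For example, `forecast_arima_110([10, 12], phi=0.5, const=0.0, steps=3)` returns `[13.0, 13.5, 13.75]`: the last observed change of 2 is halved at each step and added to the running level.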
Fig. 5: Actual and estimate of cattle production
These measures indicate that the forecasting inaccuracy is low. Figure 5 shows the actual and forecasted values of the cattle production data with 95% confidence limits.
The model constructed for cattle production is found to be ARIMA (1, 1, 0). Based on the numerical calculations and graphical representations, the forecasted production for the year 2009 is less than that for 2010, but in subsequent years the production increases. The validity of the forecasted values can be checked against the cattle production data for the period 1970-2008. This study provides evidence based on the complete cattle production data.
CONCLUSION
The estimated results indicate an increase in cattle production, which will improve the economy of the state. This provides evidence in favour of the Box-Jenkins methodology as applied to cattle production and future efficiency.