Bootstrap Residual Ensemble Methods for Estimating the Standard Error of Logistic Regression Parameters for Hypercholesterolemia Patient Data in the Health Laboratory of Yogyakarta

Logistic regression is a type of regression analysis used to determine the relationship between a response variable with two possible values and one or more predictor variables. Its parameters are estimated by the maximum likelihood estimation (MLE) method, which gives good parameter estimates when the estimates have small standard errors. In research, good data must be representative of the population. A small sample leads to large standard errors. The bootstrap is a resampling method that can be used to obtain good estimates from small samples. The small sample is resampled so that it better represents the population, with the aim of obtaining a smaller standard error. Previous studies have discussed bootstrap resampling of residuals b times. In this study, bootstrap-resampled residuals are added to the dependent variable, and the parameter estimates of the resulting logistic regression models are averaged to form an ensemble model. The bootstrap standard errors of the logistic regression parameters are then calculated. The method is applied to hypercholesterolemia patient status data from the Health Laboratory of Yogyakarta; after bootstrapping, the standard errors are smaller than those obtained before bootstrap resampling.


Introduction
Logistic regression is a nonlinear regression in which the curve relating the response variable to the predictor variables is not a straight line. It is used to analyze the relationship between a binary response variable (0 or 1) and predictor variables. However, a problem arises when the sample is small.
Good data should be representative, meaning that the sample is objective and describes the population well enough to stand in for it.
If the sample is much smaller than the population, it is less representative, and the conclusions drawn from it carry fairly large standard errors.
A method is therefore needed to address this problem. Efron and Tibshirani [3] introduced a resampling method, known as the bootstrap, that resamples a small sample with the help of a computer. The method treats the empirical distribution of the sample as the population, from which resamples can then be drawn. The number of bootstrap resamples should be large enough to represent the population so that the resulting standard error is small.
Previous studies (Sahinler and Topuz [9], Hossain and Khan [1]) discussed bootstrap resampling algorithms for estimating linear regression and logistic regression parameters by resampling the residuals generated from the model. Furthermore, Pardoe and Weisberg [6] discuss the conditional probability bootstrap, a bootstrap method for the case in which the value of the response variable is influenced by the predictor variables.
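In this study, the model parameters are estimated by the residual ensemble bootstrap method, as applied by Handajani et al. [10] to the spatial regression model. The binary response is first written in the form of the linear probability model (LPM) as follows:
$$\pi(x) = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p \tag{1}$$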

Logistic Regression
The right-hand side of the LPM in (1) is unbounded, since x is continuous, whereas the left-hand side, Y or π(x), is restricted to values from 0 to 1. The left-hand side of model (1) must therefore be transformed so that, like the right-hand side, it can take any value between -∞ and ∞. This transformation gives the logistic regression model (Hosmer and Lemeshow [4]).
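For reference, the transformation described above is commonly written as the logit. The form below is the standard textbook one (see, for example, Hosmer and Lemeshow [4]); the coefficient symbols $\beta_0, \dots, \beta_p$ and the number of predictors $p$ are notation assumed here, not taken from the text.

```latex
g(x) = \ln\!\left(\frac{\pi(x)}{1-\pi(x)}\right)
     = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p,
\qquad
\pi(x) = \frac{\exp(\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p)}
              {1 + \exp(\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p)}
```

As $\pi(x)$ moves from 0 to 1, $g(x)$ ranges from $-\infty$ to $\infty$, which matches the range of the linear predictor.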
The parameters of a nonlinear regression model, and of logistic regression in particular, are estimated by the maximum likelihood method. This method estimates β by maximizing the likelihood function (Hosmer and Lemeshow [4]).
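As an illustration of maximizing the likelihood, the sketch below fits a logistic model with an intercept and a single predictor by Newton-Raphson and reads the standard errors from the inverse of the information matrix. It is a minimal example under stated assumptions, not the R routine used in the study: the function name `fit_logistic_1d` and the toy data `x_demo` and `y_demo` are hypothetical.

```python
import math

def fit_logistic_1d(x, y, iters=25):
    """Newton-Raphson MLE for logit(pi) = b0 + b1 * x.

    Returns (b0, b1, se0, se1). The standard errors come from the inverse
    information matrix of the final iteration; at convergence the last
    Newton step is negligible, so this is the information at the MLE.
    """
    b0 = b1 = 0.0
    for _ in range(iters):
        pi = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        w = [p * (1.0 - p) for p in pi]
        # Score vector: gradient of the log-likelihood.
        s0 = sum(yi - p for yi, p in zip(y, pi))
        s1 = sum(xi * (yi - p) for xi, yi, p in zip(x, y, pi))
        # Information matrix [[a, b], [b, c]] and its determinant.
        a = sum(w)
        b = sum(wi * xi for wi, xi in zip(w, x))
        c = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = a * c - b * b
        # Newton step: beta <- beta + information^{-1} * score.
        b0 += (c * s0 - b * s1) / det
        b1 += (a * s1 - b * s0) / det
    return b0, b1, math.sqrt(c / det), math.sqrt(a / det)

# Hypothetical toy data (not the patient data); the classes overlap,
# so the maximum likelihood estimate exists.
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y_demo = [0, 0, 1, 0, 1, 0, 1, 1]

b0, b1, se0, se1 = fit_logistic_1d(x_demo, y_demo)
print(f"b0 = {b0:.4f} (se {se0:.4f}), b1 = {b1:.4f} (se {se1:.4f})")
```

With only eight observations the model-based standard errors are large relative to the coefficients, which is the small-sample situation the bootstrap is meant to address.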
Mathematically, the probability distribution of each observation $y_i$ can be expressed as
$$f(y_i) = \pi(x_i)^{y_i}\,[1 - \pi(x_i)]^{1 - y_i}, \qquad y_i \in \{0, 1\}.$$
Because the observations are mutually independent, the likelihood function is the product of these probability distributions, that is,
$$L(\beta) = \prod_{i=1}^{n} \pi(x_i)^{y_i}\,[1 - \pi(x_i)]^{1 - y_i}.$$

The bootstrap approach uses sampling with replacement (Sungkono [7]). The basic idea of the bootstrap method is to build artificial samples using information from the original data. According to Teknomo [8], the bootstrap method depends on its own source, that is, on the sample, which is the only source of information available to the researcher.
The bootstrap estimate of $se_F(\hat{\theta})$ is the standard error of $\hat{\theta}$ for random samples of size $n$ drawn from $\hat{F}$, the empirical distribution of the sample (Efron and Tibshirani [3]).
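The bootstrap standard error can be approximated by Monte Carlo: draw B resamples of size n with replacement, recompute the statistic on each, and take the standard deviation of the B replicates. The sketch below shows this for the sample mean. The function name `bootstrap_se`, the number of replications, and the numbers in `sample` are hypothetical and chosen only for illustration.

```python
import math
import random
import statistics

def bootstrap_se(sample, statistic, n_boot=2000, seed=0):
    """Monte Carlo bootstrap estimate of the standard error of `statistic`."""
    rng = random.Random(seed)
    n = len(sample)
    replicates = []
    for _ in range(n_boot):
        # Draw n values with replacement from the empirical distribution.
        resample = rng.choices(sample, k=n)
        replicates.append(statistic(resample))
    # Standard deviation of the replicates (divisor n_boot - 1).
    return statistics.stdev(replicates)

# Hypothetical measurements, used only to exercise the function.
sample = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4, 5.5, 4.9]

se_mean = bootstrap_se(sample, statistics.mean)
# For the mean, the ideal bootstrap standard error has a closed form.
plug_in = statistics.pstdev(sample) / math.sqrt(len(sample))
print(f"bootstrap se = {se_mean:.4f}, plug-in value = {plug_in:.4f}")
```

For the mean, the Monte Carlo value should be close to the closed-form plug-in value. For statistics such as logistic regression coefficients no such closed form is available, which is where resampling is useful.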

Data and Method
The study uses secondary data from the Health Laboratory of Yogyakarta on the cholesterol status of 20 patients. The response variable is the patient's cholesterol status, and the predictor variables are the LDL, HDL and triglyceride levels taken from the hypercholesterolemia patient data. The analysis proceeds as follows:
Step 1. Estimate the standard errors of the logistic regression parameters from the original sample.
Step 2. Determine the logistic regression model so that the fitted value $\hat{Y}$ is obtained.
Step 3. Calculate the residuals of the model as the difference between $Y$ and $\hat{Y}$.
Step 4. Resample the residuals generated from the model with the bootstrap.
Step 5. Estimate the regression model parameters for each set of response values, formed by adding the bootstrap-resampled residuals, regressed on the predictor variables.
Step 6. Calculate the standard errors of the logistic regression coefficients by the bootstrap method.
Step 7. Determine the ensemble logistic regression equation by averaging the logistic regression coefficients of the k regression equations obtained.
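Steps 2 to 7 can be sketched as follows. This is a hedged illustration, not the authors' implementation. It assumes the bootstrap response is the fitted value plus a resampled residual, clipped to [0, 1], and that each bootstrap model is fitted with the Bernoulli quasi-likelihood so that fractional responses are allowed. The text does not state how non-binary responses were handled, so both choices are assumptions. All function names and the synthetic data are hypothetical.

```python
import math
import random
import statistics

def sigmoid(eta):
    # Clamp the linear predictor to avoid overflow in exp().
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, eta))))

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def predict(beta, rows):
    return [sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], r))) for r in rows]

def fit_logit(rows, y, iters=30):
    """Newton-Raphson fit with intercept; y may be fractional (assumption)."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        pi = predict(beta, rows)
        score = [sum(r[j] * (yi - q) for r, yi, q in zip(X, y, pi)) for j in range(p)]
        info = [[sum(r[j] * r[k] * q * (1.0 - q) for r, q in zip(X, pi))
                 for k in range(p)] for j in range(p)]
        beta = [b + s for b, s in zip(beta, solve(info, score))]
    return beta

def residual_ensemble_bootstrap(rows, y, n_boot=100, seed=1):
    rng = random.Random(seed)
    beta_hat = fit_logit(rows, y)                        # Step 2: fitted model
    y_hat = predict(beta_hat, rows)
    resid = [yi - fi for yi, fi in zip(y, y_hat)]        # Step 3: residuals
    fits = []
    for _ in range(n_boot):
        e_star = rng.choices(resid, k=len(resid))        # Step 4: resample residuals
        # Assumption: response = fitted value + residual, clipped to [0, 1].
        y_star = [min(1.0, max(0.0, f + e)) for f, e in zip(y_hat, e_star)]
        fits.append(fit_logit(rows, y_star))             # Step 5: refit
    columns = list(zip(*fits))
    se_boot = [statistics.stdev(c) for c in columns]     # Step 6: bootstrap se
    ensemble = [statistics.mean(c) for c in columns]     # Step 7: ensemble average
    return beta_hat, ensemble, se_boot

# Synthetic data: 20 observations, one standardized predictor (hypothetical).
x_rows = [[-1.9 + 0.2 * i] for i in range(20)]
y_obs = [0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1]

beta_hat, ensemble, se_boot = residual_ensemble_bootstrap(x_rows, y_obs)
print("original fit:", beta_hat)
print("ensemble coefficients:", ensemble)
print("bootstrap standard errors:", se_boot)
```

Step 1, the model-based standard errors of the original fit, is left out to keep the sketch short. The default of 100 replications mirrors the number used in the study. The single synthetic predictor stands in for the three predictors, and `fit_logit` accepts any number of columns.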

Results and Discussion
The data are secondary data taken from the Job Training Report by Prabandari [5], which draws on internal data from the pathology unit of the Health Laboratory of Yogyakarta.
The cholesterol status data for the 20 patients are presented in the following table.

Estimation of Logistic Regression Parameters
The logistic regression parameters for the three independent variables were estimated by MLE using R; the output is shown in the following table.

Bootstrap Replication
Bootstrap resampling is applied to the residuals generated from the logistic regression model. The total number of resampled values equals the number of observations multiplied by the number of bootstrap replications. In this study, 100 bootstrap replications were used, each containing 20 resampled values, giving 100 × 20 = 2,000 resampled values in total.
Next, each set of bootstrap responses, that is, the response values with the bootstrap-resampled residuals added, is regressed on the predictor variables. After the residuals have been bootstrapped 100 times and the logistic regression parameters estimated from each set of bootstrap responses, the standard errors of the logistic regression parameters can be calculated; they are presented in Table 3.
Table 3. The standard error value of logistic regression parameters
The bootstrap standard errors of the logistic regression parameters are then compared with the standard errors from the original sample. The comparison of the standard errors before and after bootstrapping is shown in Table 4.

The analysis, based on 100 bootstrap replications, yields the ensemble logistic regression model for the probability that a patient suffers from hypercholesterolemia.

Conclusion
For a small sample, the bootstrap method gave better estimates than the original sample alone: with 100 bootstrap replications, the standard errors of the parameters were smaller than those from the original sample. The 100 bootstrap replications also yield the ensemble logistic regression model for the probability that a patient has hypercholesterolemia.