Potency of Various MicroRNA as Sputum Biomarker in Lung Cancer

Introduction: Lung cancer is one of the deadliest cancers known. The current screening method, low-dose CT scanning, has been proven to reduce mortality but lacks accuracy, leading to overdiagnosis. Current research mostly seeks a noninvasive and cost-effective method, and biomarkers such as microRNA have the potential to screen for and diagnose lung cancer. This literature review aims to discuss the potential of microRNA in sputum as a biomarker for lung cancer screening. Methods: The study was conducted by searching for related journal articles in search engines such as PubMed, Science Direct, and Google Scholar using keywords. Thirty-five relevant articles published within the last 10 years were included. Results: MicroRNA is a short non-coding RNA that regulates gene expression. It acts in the regulation of oncogenes or tumor suppressor genes. DNA mutations or defects that occur in cancer, particularly lung cancer, increase or decrease microRNA expression. Alterations of microRNA expression in sputum, detected by RT-PCR, may reflect the progression of lung cancer from cell cycle dysregulation and provide better sensitivity and specificity than other biomarkers. Combinations of miRNA species increase sensitivity and specificity. Conclusion: MicroRNA shows promising potential and strong accuracy, through a noninvasive and cost-effective procedure, in detecting early signs of lung cancer, and can be further applied as a lung cancer biomarker.


INTRODUCTION
Lung cancer is one of the predominant causes of cancer-related death. Based on the GLOBOCAN 2018 database, lung cancer accounted for an estimated 2.09 million newly diagnosed cases and 1.76 million deaths worldwide. It is the leading cause of cancer-related death in men and ranks second, behind breast cancer, as a cause of cancer death in women 1. Cigarette smoking is known as the most important risk factor for lung cancer, contributing to 75% of lung cancer deaths in men and 50% in women. Other risk factors are secondhand smoke, air pollution, occupational exposure, radon gas exposure from soil and building materials, and asbestos 2.
Histologically, lung cancer can be divided into two primary groups: non-small-cell lung cancer (NSCLC) and small-cell lung cancer (SCLC). NSCLC can be further divided into large cell carcinoma, squamous cell carcinoma, and adenocarcinoma 3. NSCLC accounts for up to 80% of all lung cancers, with adenocarcinoma the most common subtype. Prior to the 1990s, squamous cell carcinoma was the most common, but the trend has now shifted toward adenocarcinoma. The different types of cigarettes used nowadays, which have filters and a low tar content, may play a role in this shift 4.
Screening and early diagnosis in high-risk patients have been shown to decrease lung cancer mortality. The currently established screening method uses low-dose CT scanning (LDCT), which has reduced lung cancer deaths by 20%. LDCT works by measuring the diameter of the nodules detected on imaging, and the recorded specificity varies with the threshold applied 5.
Although LDCT has succeeded in reducing the lung cancer death rate, the National Lung Screening Trial (NLST) recorded a 26.6% false-positive rate. High false-positive rates were also recorded in several other trials, such as that of the American College of Radiology (ACR), at 12.8%. Nodules were identified in 24% of LDCT results, yet only 4% of these findings were diagnosed as malignant, although all went on to invasive diagnostic procedures, including biopsy 5,6.
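As a rough illustration of what these proportions imply, the sketch below applies the 24% and 4% figures to a hypothetical cohort of 1,000 screened people. The cohort size, the function name, and the reading of 4% as the share of nodule findings that prove malignant are assumptions made for this example only.

```python
def screening_breakdown(n_screened, nodule_rate, malignancy_rate_among_nodules):
    """Split a screened cohort into nodule-positive, malignant and benign counts."""
    nodule_positive = n_screened * nodule_rate
    malignant = nodule_positive * malignancy_rate_among_nodules
    benign = nodule_positive - malignant
    return {
        "nodule_positive": nodule_positive,
        "malignant": malignant,
        "benign": benign,
    }


if __name__ == "__main__":
    # Hypothetical cohort of 1,000 people; rates taken from the text above.
    result = screening_breakdown(1000, 0.24, 0.04)
    for name, count in result.items():
        print(f"{name}: {count:.1f}")
```

Under these assumptions, about 240 of the 1,000 people would show nodules, roughly 10 of them would have a malignancy, and about 230 would undergo work-up for findings that turn out to be benign.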
Cost-effectiveness remains a problem for LDCT, as shown by systematic reviews and meta-analyses. LDCT is often unaffordable and has largely been abandoned in low-income countries.
Other radiographic screening methods, as well as positron emission tomography (PET), are available with better cost-effectiveness, but they lack the sensitivity, specificity, and ability to determine malignancy of LDCT. This leaves a need for another screening test for lung cancer 5,7.
To facilitate early diagnosis, many advances are being made toward diagnosing the disease conveniently from sputum. Sputum examination comprises sputum cytology and sputum molecular analysis. The yield of the former in screening trials is considered low compared with the latter 8; hence, molecular analysis of sputum is the more dependable approach. Promising candidates for molecular diagnosis include autoantibodies, complement fragments, microRNAs (miRNAs), DNA, proteins, and others 9. Many studies focus on miRNAs, since these biomarkers have shown promising reliability in the diagnosis, monitoring, and even management of neoplastic and non-neoplastic diseases 10. miRNAs modulate mRNA degradation by binding to a specific site as part of post-transcriptional regulation in cells 11. Studies conducted in the 1990s identified the lin-4 gene in Caenorhabditis elegans as a miRNA, yet until the early 2000s no prominent studies had reported major findings on this class of molecule. Many miRNAs have now been discovered and recognized as distinct molecules that contribute to the pathogenesis of diseases, including allergic diseases such as asthma 12.
miRNAs also contribute to many biological processes, such as proliferation, differentiation, and apoptosis. They are also crucial for regulating the expression of oncogenes and tumor suppressor genes. Consequently, the number of studies on miRNAs has surged over the past decade 11. This literature review aims to reveal the potential of miRNA as a noninvasive sputum biomarker for screening the early progression of lung cancer. It also discusses the ability of miRNA to classify types of lung cancer and compares it with other possible biomarkers for lung cancer screening and diagnosis.

METHOD
This literature review was prepared by collecting, reviewing, and citing related journal articles from search engines such as PubMed, Science Direct, and Google Scholar. The keywords used were "Biomarkers", "Lung Cancer", "miRNA", "Screening", and "Sputum". Only articles published within the last 10 years were used. A total of 35 articles were gathered for this literature review.

Lung Cancer
Lung cancer is a malignancy of lung cells. The cells involved range from epithelial cells to parenchymal cells. Lung cancer is strongly associated with heavy smoking and genetic factors. Patients with high risk factors tend to undergo genetic mutations precipitated by carcinogens, which commonly come from tobacco smoke or other chemical exposure 13.
The genetic changes in lung cancer generally arise from mutations that alter genes regulating the cell cycle. Mutations modify the expression of oncogenes and tumor suppressor genes, causing uncontrolled cell growth. Many genes have been found to be related to carcinogenesis in lung cancer, including TP53, KRAS, CDK23, EGFR, ALK, and many others. Those mutations activate RAS/RAF signaling, leading to malignancy. Histone modification has also been noted to be related to DNA damage, resulting in unopposed cell growth. Noncoding RNAs (miRNA and lncRNA) are also known to regulate the expression of genes involved in both proliferation and differentiation 13,14.

MicroRNA
MicroRNA is a group of short noncoding RNAs, about 21-25 nucleotides long, that regulate gene expression in cells by either suppressing translation of mRNA or degrading it. Its biogenesis, as depicted in Figure 1, starts with transcription from the genome by RNA polymerase II, producing pri-miRNA, a stem-loop structured RNA. This product is then cleaved into pre-miRNA by two intranuclear proteins: Drosha (an RNase III enzyme) and Pasha/DGCR8 (a dsRNA-binding protein) 10. The pre-miRNA is subsequently exported from the nucleus into the cytoplasm by exportin-5 15. In the cytoplasm, the stem-loop structure of the pre-miRNA is cut by Dicer (another RNase III enzyme), producing a miRNA:miRNA duplex. This duplex unwinds, leaving one strand of mature miRNA bound to the RNA-induced silencing complex (RISC) while the other strand is degraded. The miRNA-loaded RISC carries a base sequence that shares some complementarity with the target mRNA and can therefore bind to it. This binding may result in one of two processes: translation suppression or mRNA degradation 10.
mRNA degradation involves intranuclear RNA hydrolysis and results in strong repression of gene expression. This process, rather than translation suppression, is reported to be the main effect of miRNA 16. In the human body, miRNA is expressed in nearly all tissues; however, the tissue location of expression differs between miRNA families. Some miRNAs, for instance miR-21 family members, are widely expressed in all tissues, while others are more tissue-specific, creating an opportunity to use this property for diagnosing certain diseases 10.
The expression of miRNA families is affected by multiple factors. These include defects in biogenesis, co-expression with host genes, DNA methylation, hypoxia, endogenous factors, and exogenous factors such as xenobiotics. Within biogenesis, at the transcriptional level, miRNA expression can be altered by mutations or regulated through the promoter. At the post-transcriptional level, biogenesis enzymes such as Dicer and Drosha might downregulate expression. These enzymes, too, are likely to be affected by mutation and epigenetic modification, and altered production of these enzymes will in turn affect miRNA biogenesis. Endogenous and exogenous substances may also change expression through certain mechanisms 17.
In lung cancer, miRNA plays a role in the development and progression of the tumor mass: miRNA function is deregulated, so that its physiological mechanism shifts away from normal. miRNA may function as an oncogene or, conversely, as a tumor suppressor gene, depending on the circumstances. Impairments of miRNA, caused by changes in miRNA production, affect each of the six biological hallmarks of cancer proposed by Hanahan et al. about two decades ago 18.
This, in turn, modifies both the initiation and the progression of the tumor mass. Among the six hallmarks, the most prominent are sustained proliferative signaling and evasion of growth suppressors. To date, epidermal growth factor receptor (EGFR) signaling is the best-known pathway involved in these hallmarks on which miRNA acts directly. The E2F proteins are significant regulators that maintain the stability of cell proliferation. E2F1, a member of the E2F family, induces target gene transcription during the G1-to-S transition and is regarded as a tumor suppressor, since a study in E2F1-deficient mice showed progression of tumor growth 19.
One miRNA family, miR-17-92, has been shown to reduce E2F1 expression by inhibiting translation after being activated by c-Myc. Since c-Myc also induces E2F1, the members of this miRNA family counteract the loop to prevent overproduction of E2F1. However, massive production of miR-17-92 may interfere with the cycle and thus promote positive feedback for proliferation. Furthermore, miRNAs are needed for cells to pass the normal G1/S checkpoint. This is supported by the finding that proliferation of Dicer-deficient germline stem cells stopped at the G1/S transition. In the same experiment, the Dicer-deficient stem cells showed increased expression of Dacapo, a member of the p21/p27 family of Cdk inhibitors 19.
Another miRNA family, miR-221/222, is also thought to play a part in this process, as it targets the Cdk inhibitor and thus promotes proliferation. miR-663 also acts on the same Cdk inhibitors; it has been found to be upregulated in nasopharyngeal carcinoma, where it takes the role of an oncogene. miRNAs also change the expression of Cdks and cyclins. For example, miR-545 represses cyclin D1 and CDK4, resulting in cell cycle arrest, so that it acts as a suppressor rather than an oncogene. Other miRNAs, such as miR-760, work in a similar manner: miR-760 reduces cell proliferation by suppressing the expression of ROS1 (ROS proto-oncogene 1, receptor tyrosine kinase) in NSCLC cell lines 15,19.
Cancer cells, including lung cancer cells, are said to be immortal because they replicate continuously throughout their lifespan. At the core of this behavior are telomeres and telomerase. The telomerase enzyme extends telomeric DNA by adding repeated segments to its ends. miRNA acts directly on the catalytic subunit of telomerase, human telomerase reverse transcriptase (hTERT), to control telomere length. To date, however, no miRNA has been reported to modulate hTERT in lung cancer. Other miRNAs, for instance miR-512-5p and miR-498, work through this mechanism in neck squamous carcinoma and ovarian cancer, respectively 15.
miRNA is also involved in invasion and metastasis. One example is the miR-200 family, which targets zinc finger E-box-binding homeobox (ZEB) 1 and ZEB2. These proteins inhibit production of E-cadherin, an important molecule in the process called epithelial-to-mesenchymal transition (EMT), which supports metastasis in cancers, including lung cancer. miRNA is also involved in angiogenesis in the disease. Some members of the family are known to inhibit angiogenesis by targeting vascular endothelial growth factor (VEGF). Other miRNAs from a different family, such as miR-497, have the same effect through a different target, hepatoma-derived growth factor (HDGF). Studies have also shown that miRNA, miR-16-1 in particular, induces apoptosis in lung cancer by modifying Bcl-2. In contrast, a member of a different family, miR-130b, suppresses regulated cell death in lung cancer through the same general mechanism 15.

Testing of microRNA from sputum sample
The first step in testing for microRNA in sputum is collecting the sputum samples. Samples are collected before the patient receives any treatment or chemotherapy for the cancer, as these interventions may alter the true expression of miRNA 20. Before collection, patients are asked to clear their nose of secretions and rinse their mouth with water. This reduces contamination by squamous epithelial cells from the nose and saliva, so that the samples consist mostly of respiratory epithelium from the trachea, bronchi, and more distal airways 21.
Samples can be collected as either spontaneous or induced sputum. Spontaneous sputum is collected by asking the patient to breathe deeply and cough the sputum into a sterile container without any aid 22. For patients who are unable to cough up sputum spontaneously, sputum can be induced by inhalation of a hypertonic or isotonic NaCl solution through a nebulizer. A more advanced technique for sputum induction uses a special device that vibrates the airway by generating sound waves 21.
Cytological examination is then performed, first by preparing slides with a cytospin machine. The slides are stained with Papanicolaou stain and examined under the microscope. This examination determines the cytological classification of the samples and their quality, judged by the amount of contamination from squamous epithelial cells 21,23. To specifically collect respiratory epithelial cells, the samples are processed by adding 0.1% dithiothreitol and phosphate-buffered saline solution, each addition followed by vortexing. They are then filtered to remove debris and mucus and resuspended in phosphate-buffered saline solution 21,24.
Afterwards, total RNA is isolated with an isolation kit from the cell pellets obtained in the previous filtration step. RNA purity is then assessed with a dual-beam UV spectrophotometer, and RNA integrity is determined by capillary electrophoresis 21. An integrity value greater than six is used as the threshold to decide which samples can be quantified in the next step 25.
To determine the expression of a specific miRNA, quantitative real-time reverse transcription polymerase chain reaction (RT-qPCR) analysis is performed. The PCR kit includes a specific primer for the target miRNA, buffer solution, deoxynucleoside triphosphates (dNTPs), and the reverse transcriptase enzyme solution. Primed by the target miRNA-specific primer, the reverse transcriptase synthesizes complementary DNA (cDNA) from the RNA. The cDNA is then mixed with the other PCR reagents to start the reaction, which is run in a thermal cycler. The reaction data are analyzed by computer software, and the threshold cycle (Ct) value is determined. The Ct value is normalized to an internal control, usually U6B RNA, so that the expression of the target miRNA can be determined 23,24,26.
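The normalization step can be sketched as follows. The reviewed sources do not state which formula is applied, so this example assumes the widely used comparative Ct (2^-ΔΔCt) approach with roughly equal amplification efficiency for target and control. The function names and Ct values are hypothetical and chosen only for illustration.

```python
def delta_ct(ct_target: float, ct_reference: float) -> float:
    """Normalize a target miRNA Ct to the internal control (e.g. U6)."""
    return ct_target - ct_reference


def fold_change(ct_target_case: float, ct_ref_case: float,
                ct_target_ctrl: float, ct_ref_ctrl: float) -> float:
    """Relative expression of a case sample versus a control sample.

    Uses the 2^-ddCt formula, which assumes about 100% amplification
    efficiency for both the target and the reference assay.
    """
    ddct = (delta_ct(ct_target_case, ct_ref_case)
            - delta_ct(ct_target_ctrl, ct_ref_ctrl))
    return 2.0 ** (-ddct)


if __name__ == "__main__":
    # Hypothetical Ct values, not taken from any cited study.
    fc = fold_change(ct_target_case=24.0, ct_ref_case=18.0,
                     ct_target_ctrl=27.0, ct_ref_ctrl=18.0)
    print(f"fold change: {fc:.1f}")
```

A lower normalized Ct means higher expression, so in this made-up example a normalized Ct three cycles lower in the case sample corresponds to an eightfold increase.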

Other Sputum Biomarkers
Sputum is easy to obtain and serves as a noninvasive sample for respiratory testing 27. It is considered a reliable sample for lung cancer diagnosis because it contains respiratory epithelial cells, which specifically reflect the cell injury and molecular anomalies that occur in the airway epithelium. These molecular abnormalities, commonly caused by cigarette smoking, can be assessed through several types of sputum examination 28.
Sputum cytology has long been used extensively in lung cancer diagnosis. The idea behind this method is that changes in respiratory epithelial cell morphology are related to the pathophysiology of the cancer. Nonetheless, this examination has low sensitivity, only up to 60% 8. It also lacks standardization and relies heavily on the pathologist's experience, which may result in bias and subjectivity 28. These disadvantages led to the proposal of molecular analysis of sputum. Several biomarkers, such as allelic alteration, DNA methylation, DNA mutation (including KRAS and p53 mutations), and miRNAs, have been evaluated for use in lung cancer diagnosis 9,28.
Allelic alteration is a chromosomal abnormality that takes the form of microsatellite instability (MSI) or loss of heterozygosity (LOH). MSI occurs when a microsatellite, a stretch of DNA composed of repeated short motifs, undergoes mutation that changes its length. LOH is defined as the loss of one gene copy in an allelic pair when the other allele has already been inactivated. The chromosomal regions that undergo MSI or LOH in cancer usually contain tumor suppressor genes. The two chromosome arms most commonly assessed are 3p and 9p, which hold tumor suppressor genes such as RBSP3, NPRL2, CDKN2A, and CDKN2B. Even so, these alterations are not always present; MSI or LOH, whether individually or in combination, occur in fewer than 70% of lung cancers 28.
DNA methylation is an important gene regulation process. A methyl group is added to the cytosine base in a CpG dinucleotide (a cytosine followed by a guanine). CpG islands, which contain many CpG sites, lie in gene promoter regions. Once a CpG island is hypermethylated, the shape of the chromatin is altered, preventing transcription factors from binding to the strand and thus silencing the gene. Several genes with roles in cancer pathophysiology, including P16, MGMT, RASSF1A, and DAPK, have been shown to be silenced by this mechanism. The prevalence of hypermethylated P16 in sputum samples is the highest among these genes, ranging from 25% to 74% 28. A study of this methylation process using a three-gene combination yielded a sensitivity of 98% and a specificity of 71% 9.
Gene mutation has been widely studied as one of the underlying processes in cancer. KRAS and p53 are the most frequently mutated genes. KRAS acts as an oncogene, and its mutation results in hyperactivation of the gene 28. The mutation, when measured in sputum samples, had a sensitivity of 79% compared with that found in tumor biopsy. However, in a cohort study assessing the mutation in heavy smokers who had not yet developed lung cancer, no KRAS mutation was observed, suggesting that KRAS mutation is less relevant for screening purposes 8. In contrast, p53 is a tumor suppressor gene. Its mutations mostly occur in the DNA-binding domain and result in loss of transcriptional activity, causing the gene to lose its suppressive function. It is the most common mutation across all types of cancer, including lung cancer. About 65% of adenocarcinomas show mutations in the p53 gene 28.
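Because the biomarkers in this section are compared by sensitivity and specificity, the sketch below shows how both are computed from a two-by-two table of test results against confirmed diagnosis. The counts are hypothetical and are not data from any of the cited studies.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of patients with cancer whom the test correctly flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of cancer-free individuals whom the test correctly clears."""
    return true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Hypothetical cohort: 50 confirmed cancers and 50 cancer-free controls.
    tp, fn = 45, 5    # cancers detected / missed by the sputum test
    tn, fp = 40, 10   # controls correctly negative / falsely positive
    print(f"sensitivity: {sensitivity(tp, fn):.0%}")
    print(f"specificity: {specificity(tn, fp):.0%}")
```

A screening test with high sensitivity but modest specificity, like the methylation panel described above, misses few cancers but sends more cancer-free individuals on to further work-up.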

microRNA in Sputum
As stated previously, miRNA can function as an oncogene or as a tumor suppressor gene. Lung cancer involves changes in gene regulation, and miRNA changes accordingly. The specific roles of miRNA on different genes account for its variety of types. Up to 818 miRNA species have been identified in NSCLC patients. Thirteen species reflect gene dysregulation, 10 of which appear in sputum 29.
Various combinations provide high sensitivity and specificity for NSCLC screening, yet some of these combinations show less than perfect sensitivity or specificity and are not widely available. The combination of miR-21, miR-31, and miR-210 is the best known and most widely applied. Each species shows a significant increase in NSCLC patients compared with cancer-free smokers, as depicted in Figure 2. This combination is stable and reported to be unaffected by other factors, such as age, pulmonary nodules, COPD status, and gender. Its sensitivity and specificity also show only slight disparity in range 29,31.
miRNA also provides better accuracy than other biomarkers that can be examined in sputum. Sputum cytology still has better specificity, as it can differentiate the cell types that appear in sputum, but its low sensitivity leaves a high number of lung cancers unidentified. DNA methylation has better sensitivity than cytology but does not match the accuracy of miRNA. The DNA hypermethylation combination of RASSF1A, 3OST2, and PRDM14 showed sensitivity similar to that of miRNA, although its specificity was lower in comparison. Table 1 emphasizes that molecular diagnosis offers better accuracy than cytology, and the miRNA combination of miR-21, miR-31, and miR-210 remains better than the others 29. Although miRNA is an effective biomarker with better accuracy than the others, it is still not perfect. miR-21, the most common species, shows only 70% sensitivity on its own but 100% specificity. It is known to be expressed only in lung cancer and shows a minimal false-positive rate, but as a single test it still misses some patients at screening 30. Combinations of several miRNA species may increase screening ability, as shown in Table 2.

microRNA in Different Samples
miRNA can be detected as a marker in many body fluids. Samples can be obtained directly from tissue suspected of malignancy; in lung cancer, however, this means direct aspiration from the lung. Other specimens are blood (whole blood, plasma, serum) and, for this disease in particular, sputum. The quantitative RT-PCR results, expressed as area under the curve (AUC), are provided in Table 3 33. Plasma has the greatest diversity of detected miRNA species. Unfortunately, it carries a risk of bias from malignancies of other organs, which may release miRNA into the blood. Tissue specimens have the highest AUC, but the invasive procedure makes them unsuitable for screening. Sputum has a slightly lower AUC than tissue specimens but offers satisfactory accuracy with a less invasive method 33.
Sputum samples also showed more stable expression than blood samples, and they have more well-characterized miRNA species. As a noninvasive procedure, the sputum miRNA test can be combined with LDCT. The combination was reported to have a sensitivity of 92.4% and a specificity of 91.2%, improving accuracy and addressing the well-known high false-positive rate of LDCT 32.
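For readers less familiar with the AUC values used to compare specimen types, the sketch below computes an AUC as the probability that a randomly chosen cancer sample has a higher marker level than a randomly chosen cancer-free sample (the rank-based, Mann-Whitney formulation). The expression values and the function name are invented for illustration and do not come from the cited studies.

```python
def auc(cases, controls):
    """Rank-based AUC over all case/control pairs.

    Each pair where the case scores higher counts as one win;
    ties count as half a win.
    """
    wins = 0.0
    for case_value in cases:
        for control_value in controls:
            if case_value > control_value:
                wins += 1.0
            elif case_value == control_value:
                wins += 0.5
    return wins / (len(cases) * len(controls))


if __name__ == "__main__":
    # Hypothetical relative miRNA expression levels.
    cancer = [3.1, 2.4, 4.0, 1.8]
    cancer_free = [1.2, 2.0, 0.9, 2.6]
    print(f"AUC = {auc(cancer, cancer_free):.3f}")
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.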

microRNA in Specific Type of Lung Cancer
miRNA is also known for its ability to identify the type of lung cancer developing in a screened patient. This is probably due to the great variety of miRNA species available and detectable. Each miRNA species reflects specific gene alterations, which indicate the type of lung cancer developing at the time. Some miRNAs are found only in certain types of lung cancer. miR-21 is the most frequent indicator of adenocarcinoma, while miR-205 and miR-210 each have 96% sensitivity and 90% specificity for squamous cell carcinoma (SCC) 30,32. In general, miRNA can distinguish SCLC from NSCLC. A study of miRNA species revealed that some miRNAs are specific to lung cancer and are not found in healthy individuals. Some groups of miRNA are unique to NSCLC, while others are more characteristic of SCLC, which means miRNA has a good ability to determine whether a patient will be diagnosed with SCLC or NSCLC. Some groups also provide information on previous chemotherapy 36.
Six miRNA species have the potential to screen for lung cancer progression in high-risk individuals: miR-17, miR-19a, miR-19b, miR-26b, miR-190b, and miR-375. miR-375 was noted to discriminate best between SCLC and NSCLC, alongside its screening potential, with an AUC of 0.845 and an 18-fold increase in positive results. miR-17 acts through the same mechanism but is less potent than miR-375, while miR-190b shows a large decrease in lung cancer patients 34.
Some specific miRNA species offer high accuracy in assigning the type of lung cancer. Eleven miRNA species were reported to be unique to NSCLC, and seven were reported to be unique to SCLC. This may guide a prompt diagnosis of lung cancer, which goes beyond mere screening. The expression of several miRNAs is used to decide whether the patient has SCLC or NSCLC. The potent miRNA species used to distinguish among SCLC, NSCLC, and control groups are detailed in Table 4 35.
Among those species, miR-331-5p, miR-451a, and miR-363-3p were recorded with 100% sensitivity and specificity in distinguishing NSCLC from SCLC. Other miRNAs, such as miR-221, miR-137, miR-372, and miR-182, are used to determine prognosis and survival rate. Additionally, the expression of some miRNAs, such as miR-21 and miR-24, has been found to be related to recurrence. Reduced expression is also expected after chemotherapy, giving a positive result 35.

CONCLUSION
miRNA is a short regulatory noncoding RNA related to cancer, acting either as an oncogene or as a tumor suppressor gene. miRNA has great potential as a biomarker, with good specificity and sensitivity compared with other available biomarkers. The presence of specific miRNAs in sputum and the simple RT-PCR procedure are advantages for screening early lung cancer progression noninvasively, in place of current invasive or expensive methods. Several miRNA species are specific enough to distinguish the types of developing lung cancer. Overall, further research on specific types of miRNA and on combinations of tests is required to establish the potential of miRNA to screen for and diagnose lung cancer.