Transcription factors (TFs) play an important role in gene expression and in the regulation of 3D genome conformation. TFs can bind to specific DNA fragments called enhancers and promoters. Some TFs bind to promoter DNA fragments, which lie near the transcription initiation site, and form complexes that allow polymerase enzymes to bind and initiate transcription. Previous studies showed that methylated DNA can inhibit or prevent TFs from binding to DNA fragments. However, recent studies have found that some TFs can bind to methylated DNA fragments. The identification of these TFs is an important stepping stone to a better understanding of cellular gene expression mechanisms. However, as experimental methods are often time-consuming and labor-intensive, developing computational methods is essential. In this study, we propose two machine learning methods for two problems: (1) identifying TFs and (2) identifying TFs that prefer binding to methylated DNA targets (TFPMs). For the TF identification problem, the proposed method uses the position-specific scoring matrix for data representation and a deep convolutional neural network for modeling. This method achieved 90.56% sensitivity, 83.96% specificity, and an area under the receiver operating characteristic curve (AUC) of 0.9596 on an independent test set. For the TFPM identification problem, we propose to use the reduced g-gap dipeptide composition for data representation and the support vector machine algorithm for modeling. This method achieved 82.61% sensitivity, 64.86% specificity, and an AUC of 0.8486 on another independent test set. These results are higher than those reported in other studies on the same problems.

Transcription factors (TFs) are proteins that play an important role in gene expression. TFs directly control gene expression because they can bind to specific DNA sequences. (1) Some TFs bind to DNA promoters to form transcription initiation complexes, from which the polymerase can bind to the DNA fragment and initiate transcription. Other TFs bind to enhancer DNA fragments that can either stimulate or inhibit gene transcription. In addition to their role in gene expression, TFs can act as protein anchors, thereby helping regulate 3D genome conformation. (2) Therefore, determining the binding sites of TFs plays an important role in understanding the gene expression mechanism and the regulation of 3D genome conformation.

Previous studies showed that methylated DNA fragments inhibit the binding of TFs; (3−5) for example, methylation at cytosine–guanine dinucleotides (CpGs) can prevent TFs from binding to DNA fragments. (3) However, recent experimental studies have shown that some TFs can still bind to methylated DNA fragments. (6,7) The identification of TFs is an important step in understanding the activities of TFs in detail and thereby better understanding the role of methylated DNA in gene expression and in the regulation of 3D genome structures.

There are several experimental methods to identify TFs and TFs that prefer binding to methylated DNA targets (TFPMs). (7) However, these methods are often costly and time-consuming. In addition, the database of TFs is expanding steadily. Automated methods are therefore needed to identify TFs and TFPMs quickly and accurately. At present, few tools can do this.

The authors of (7) introduced a dataset of TFs and TFPMs and proposed a machine learning method to recognize TFPMs. First, the protein sequences are encoded into a vector of 201 features selected from the composition/transition/distribution, split amino acid composition (AAC), and dipeptide composition (DC) features. Then, the support vector machine (SVM) algorithm is used to determine whether the sequence is a TF. If the protein is a TF, DC is used to encode the protein, and the XGBoost algorithm is used to determine whether that TF is a TFPM. After training on the training set using fivefold cross-validation, their SVM-based TF classification model was tested on an independent test set and achieved 80.19% sensitivity and 85.85% specificity.
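The dipeptide composition (DC) and g-gap dipeptide features mentioned above can be illustrated with a minimal sketch. It is not the implementation used in this study or in (7). The function name `gap_dipeptide_composition`, the normalization by the number of valid pairs, and the handling of non-standard residues are assumptions made here for illustration. The "reduced" variant would first map residues onto a smaller alphabet. That grouping is not given in the text, so it is left out.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def gap_dipeptide_composition(sequence, gap=0):
    """Return a 400-dimensional vector of g-gap dipeptide frequencies.

    gap=0 counts adjacent residue pairs (ordinary dipeptide composition);
    gap=g counts pairs separated by g intervening residues.
    Assumption: frequencies are normalized by the number of valid pairs.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    step = gap + 1
    total = 0
    for i in range(len(sequence) - step):
        pair = sequence[i] + sequence[i + step]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        # Sequence too short (or no valid pairs): return an all-zero vector
        return [0.0] * len(pairs)
    return [counts[p] / total for p in pairs]


# Hypothetical toy sequence, used only to show the call
features = gap_dipeptide_composition("ACAC", gap=0)
print(len(features))
```

With `gap=0` the function gives the ordinary DC encoding. Larger values of `gap` capture correlations between residues that are further apart along the sequence. The resulting fixed-length vector can be passed to a classifier such as an SVM.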