O6-carboxymethyl guanine (O6-CMG) is a highly mutagenic DNA alkylation product that causes gastrointestinal cancer. Previous studies localized it using a mutant Mycobacterium smegmatis porin A (MspA) nanopore assisted by Phi29 DNA polymerase. Machine learning has recently been widely applied to the analysis of nanopore sequencing data, but it typically requires a large amount of labeled data, and the labeling burden this places on researchers greatly limits its practicality. This paper therefore proposes nano-Unsupervised-Deep-Learning (nano-UDL), a method based on an unsupervised clustering algorithm that automatically identifies methylation events in nanopore data. Specifically, nano-UDL first uses a deep autoencoder to extract features from the nanopore dataset and then applies the MeanShift clustering algorithm to cluster the data. In addition, nano-UDL learns features well suited to clustering by jointly optimizing the clustering loss and the reconstruction loss. Experimental results show that nano-UDL achieves relatively high recognition accuracy on the O6-CMG dataset and accurately identifies all sequence segments containing O6-CMG. To further verify the robustness of nano-UDL, a hyperparameter sensitivity analysis and ablation experiments were carried out. Using machine learning to analyze nanopore data can effectively reduce the cost of manual data analysis, which is significant for many biological studies, including genome sequencing.
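The two-stage idea described above (autoencoder features followed by MeanShift clustering) can be illustrated with a minimal sketch. This is not the nano-UDL implementation: it assumes NumPy and scikit-learn are available, uses `MLPRegressor` trained to reproduce its input as a small stand-in autoencoder, runs on synthetic data, and omits the joint optimization of clustering and reconstruction losses. The names `encode` and `cluster_events`, the layer sizes, and the bandwidth quantile are all hypothetical choices.

```python
import numpy as np
from sklearn.cluster import MeanShift, estimate_bandwidth
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler


def encode(autoencoder, X, n_encoder_layers):
    """Forward X through the first n_encoder_layers layers (tanh) to get the bottleneck code."""
    h = X
    weights = autoencoder.coefs_[:n_encoder_layers]
    biases = autoencoder.intercepts_[:n_encoder_layers]
    for W, b in zip(weights, biases):
        h = np.tanh(h @ W + b)
    return h


def cluster_events(X, seed=0):
    """Stage 1: learn a 2-D code by reconstruction. Stage 2: cluster the codes with MeanShift."""
    X = StandardScaler().fit_transform(X)
    # Hidden layers 16 -> 2 -> 16; the 2-unit layer is the bottleneck (assumed sizes).
    ae = MLPRegressor(hidden_layer_sizes=(16, 2, 16), activation="tanh",
                      max_iter=2000, random_state=seed)
    ae.fit(X, X)  # target equals input, so the network is trained as an autoencoder
    codes = encode(ae, X, n_encoder_layers=2)
    # MeanShift needs no preset number of clusters, only a kernel bandwidth.
    bandwidth = estimate_bandwidth(codes, quantile=0.3)
    labels = MeanShift(bandwidth=bandwidth).fit_predict(codes)
    return codes, labels


# Synthetic stand-in for nanopore events: two groups with different mean signal levels.
rng = np.random.default_rng(0)
events = np.vstack([rng.normal(0.0, 0.1, (100, 20)),
                    rng.normal(1.0, 0.1, (100, 20))])
codes, labels = cluster_events(events)
print("clusters found:", len(np.unique(labels)))
```

Because the clustering stage only sees the learned codes, the quality of the result depends on how well the bottleneck separates the event types, which is the motivation the abstract gives for optimizing both losses together.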
Research has shown that personality assessment can be achieved with regression models based on the electroencephalogram (EEG). Most existing studies use event-related potentials or power spectral density for personality assessment, which can only represent brain information from a single region. However, some research shows that human cognition depends more on the interaction between brain regions. In addition, because the distribution of EEG features differs among subjects, a trained regression model cannot produce accurate results for cross-subject personality assessment. To solve these problems, this research proposes a personality assessment method based on EEG functional connectivity and domain adaptation. EEG data were collected from 45 healthy participants while they viewed emotional pictures (positive, negative and neutral). First, the coherence of 59 channels in 5 frequency bands was taken as the original feature set. Then, feature-based domain adaptation was used to map the features to a new feature space in which the distribution difference between the training and test sets, and hence between subjects, is reduced. Finally, a support vector regression model was trained and tested on the transformed feature set using leave-one-out cross-validation. This paper also compared the proposed method with the methods used in previous studies. The results showed that the proposed method improved the performance of the regression model and obtained better personality assessment results. This research provides a new method for personality assessment.
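The pipeline described above (coherence features, feature-based domain adaptation, support vector regression with leave-one-out evaluation) can be sketched as follows. This is an illustrative sketch under several assumptions, not the authors' method: it assumes NumPy, SciPy and scikit-learn are available, uses CORAL (correlation alignment) as a stand-in for the unspecified domain adaptation step, assumes conventional band edges, interprets leave-one-out as leaving out one subject at a time, and runs on random synthetic signals with made-up trait scores. The names `coherence_features`, `coral` and `loso_predict` are hypothetical.

```python
import numpy as np
from scipy.signal import coherence
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.svm import SVR

# Assumed band edges in Hz; the abstract only states that 5 bands were used.
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 45)}


def coherence_features(eeg, fs):
    """Band-averaged magnitude-squared coherence for every channel pair."""
    n_ch = eeg.shape[0]
    feats = []
    for i in range(n_ch):
        for j in range(i + 1, n_ch):
            f, cxy = coherence(eeg[i], eeg[j], fs=fs, nperseg=int(fs))
            feats.extend(cxy[(f >= lo) & (f < hi)].mean() for lo, hi in BANDS.values())
    return np.array(feats)


def _sym_power(C, p):
    """Raise a symmetric positive-definite matrix to the power p via eigendecomposition."""
    w, V = np.linalg.eigh(C)
    return (V * np.clip(w, 1e-12, None) ** p) @ V.T


def coral(Xs, Xt, reg=1.0):
    """Whiten source features, then re-color them with the target covariance and mean."""
    d = Xs.shape[1]
    Cs = np.cov(Xs, rowvar=False) + reg * np.eye(d)
    Ct = np.cov(Xt, rowvar=False) + reg * np.eye(d)
    return (Xs - Xs.mean(0)) @ _sym_power(Cs, -0.5) @ _sym_power(Ct, 0.5) + Xt.mean(0)


def loso_predict(X, y, groups):
    """Leave one subject out, align training features to that subject, fit SVR, predict."""
    preds = np.empty_like(y, dtype=float)
    for train, test in LeaveOneGroupOut().split(X, y, groups):
        X_train = coral(X[train], X[test])  # uses only unlabeled test features
        model = SVR(kernel="rbf", C=1.0).fit(X_train, y[train])
        preds[test] = model.predict(X[test])
    return preds


# Synthetic stand-in data: 6 subjects, 4 trials each, 4 channels, 4 s at 128 Hz.
rng = np.random.default_rng(0)
n_subj, n_trials, n_ch, fs = 6, 4, 4, 128
X, y, groups = [], [], []
for s in range(n_subj):
    score = rng.uniform(1, 5)  # made-up trait score for this subject
    for _ in range(n_trials):
        X.append(coherence_features(rng.standard_normal((n_ch, 4 * fs)), fs))
        y.append(score)
        groups.append(s)
X, y, groups = np.array(X), np.array(y), np.array(groups)
preds = loso_predict(X, y, groups)
```

Aligning the training features to the held-out subject uses that subject's unlabeled features only, so no test labels leak into training. With 59 channels the feature vector grows to 1711 channel pairs times 5 bands, so a real application would likely need stronger regularization or dimensionality reduction than this sketch uses.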