Ying Jin

Professor, Computer Science Department

Data Analysis

Data integration

Data visualization

Data security

Fuzzy databases

Conference proceeding

Analysis of Student Emotional States in AP Courses through Social Media Based on Deep Learning

by Emily J. Yang and Ying Jin

Published 01/01/2022

The Institute of Electrical and Electronics Engineers, Inc. (IEEE) Conference Proceedings

Conference Title: 2022 IEEE 23rd International Conference on Information Reuse and Integration for Data Science (IRI)
Conference Start Date: 2022, Aug. 9
Conference End Date: 2022, Aug. 11
Conference Location: San Diego, CA, USA

With the recent changes in college admissions, Advanced Placement (AP) courses have become ever more critical in demonstrating student academic performance in college applications. Teens feel pressure to take more AP courses and to perform well on AP exams. Stress is one important factor that contributes to psychological disorders. It is valuable to understand and analyze teens' emotions toward AP courses using real-world data; however, there is a lack of research in the literature. Considering the feasibility and limitations of the traditional questionnaire approach, this research collects real-world data from social media. Students' emotional states, such as enjoyment and stress, are analyzed from tweets on Twitter using various deep learning models with different text augmentation techniques. We summarize the analysis results as word clouds and emotion charts for each month of an academic year. Students can use the research results to prepare for the self-adjustment of emotions over time when taking AP courses. Parents, school counselors, and psychologists can use this empirical study to better understand students' sentiments and trends during different time periods, and to help teens thrive.

Journal article

Decoupling Object Detection from Human-Object Interaction Recognition

by Ying Jin, Yinpeng Chen, Lijuan Wang, Jianfeng Wang, Pei Yu, Lin Liang, Jenq-Neng Hwang and Zicheng Liu

Published 12/12/2021

We propose DEFR, a DEtection-FRee method to recognize Human-Object Interactions (HOI) at image level without using object location or human pose. This is challenging as the detector is an integral part of existing methods. In this paper, we present two findings that boost the performance of the detection-free approach, which significantly outperforms the detection-assisted state of the art. First, we find it crucial to effectively leverage the semantic correlations among HOI classes. A remarkable gain can be achieved by using language embeddings of HOI labels to initialize the linear classifier, which encodes the structure of HOIs to guide training. Second, we propose the Log-Sum-Exp Sign (LSE-Sign) loss to facilitate multi-label learning on a long-tailed dataset by balancing gradients over all classes in a softmax format. Our detection-free approach achieves 65.6 mAP in HOI classification on HICO, outperforming the detection-assisted state of the art (SOTA) by 18.5 mAP, and 52.7 mAP in one-shot classes, surpassing the SOTA by 27.3 mAP. Unlike previous work, our classification model (DEFR) can be directly used in HOI detection without any additional training, by connecting it to an off-the-shelf object detector whose bounding box output is converted to binary masks for DEFR. Surprisingly, such a simple connection of two decoupled models achieves SOTA performance (32.35 mAP).
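To make the idea of "balancing gradients over all classes in a softmax format" concrete, here is a minimal Python sketch. It assumes the loss takes the form log(1 + sum_i exp(-y_i * gamma * s_i)), with y_i = +1 for a present label and -1 for an absent one; this form, the scale `gamma`, and the function names are assumptions made for illustration, since the abstract does not spell out the formula.

```python
import math

def _signed_terms(scores, labels, gamma):
    # v_i = -y_i * gamma * s_i, where y_i is +1 for positives and -1 for negatives.
    signs = [1.0 if y else -1.0 for y in labels]
    return signs, [-y * gamma * s for s, y in zip(scores, signs)]

def lse_sign_loss(scores, labels, gamma=1.0):
    """Assumed form: log(1 + sum_i exp(v_i)), computed stably."""
    _, v = _signed_terms(scores, labels, gamma)
    m = max(0.0, max(v))  # shift so the largest exponent is 0
    return m + math.log(math.exp(-m) + sum(math.exp(x - m) for x in v))

def lse_sign_grad(scores, labels, gamma=1.0):
    """Gradient w.r.t. each score: a softmax-style weight times -y_i * gamma."""
    signs, v = _signed_terms(scores, labels, gamma)
    m = max(0.0, max(v))
    denom = math.exp(-m) + sum(math.exp(x - m) for x in v)
    return [-y * gamma * math.exp(x - m) / denom for y, x in zip(signs, v)]

# Two classes, both scored 0: one positive label, one negative label.
loss = lse_sign_loss([0.0, 0.0], [True, False])
grad = lse_sign_grad([0.0, 0.0], [True, False])
```

Under this assumed form, the gradient magnitudes across all classes share one denominator, so they sum to less than 1 for gamma = 1, and the classes with the largest violations receive the largest share.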

Journal article

Improving Vision Transformers for Incremental Learning

by Pei Yu, Yinpeng Chen, Ying Jin and Zicheng Liu

Published 12/11/2021

This paper studies the use of Vision Transformers (ViT) in class incremental learning. Surprisingly, naive application of ViT to replace convolutional neural networks (CNNs) results in performance degradation. Our analysis reveals three issues with naively using ViT: (a) ViT converges very slowly when the number of classes is small, (b) ViT shows more bias towards new classes than CNN-based models, and (c) the proper learning rate of ViT is too low to learn a good classifier. Based on this analysis, we show that these issues can be addressed simply by using existing techniques: a convolutional stem, balanced finetuning to correct bias, and a higher learning rate for the classifier. Our simple solution, named ViTIL (ViT for Incremental Learning), achieves the new state of the art for all three class incremental learning setups by a clear margin, providing a strong baseline for the research community. For instance, on ImageNet-1000, our ViTIL achieves 69.20% top-1 accuracy for the protocol of 500 initial classes with 5 incremental steps (100 new classes for each), outperforming LUCIR+DDE by 1.69%. For the more challenging protocol of 10 incremental steps (100 new classes), our method outperforms PODNet by 7.27% (65.13% vs. 57.86%).

Journal article Open access Peer reviewed

Historical and projected datasets of the United States electricity-water-climate nexus

by Julian Fulton and Ying Jin

Published 10/2021

Data in brief, 38

This article describes datasets that were produced in connection with the research article “Visualizing the United States electricity-water-climate nexus,” published in Environmental Modelling & Software (https://doi.org/10.1016/j.envsoft.2021.105128). Data cover 9,961 individual power plants across the United States, including monthly values for electricity generation, greenhouse gas emissions, water withdrawal, and water consumption between 2003 and 2020, as well as projections out to 2050. Data were retrieved from publicly available sources and processed to provide plant-level information that can be aggregated according to various user needs. Power plant information was retrieved from the US EPA Facility Registry Service (FRS) web service through the filter of “EIA860.” For these plants, we retrieved the electricity generation, greenhouse gas emissions, water consumption, and water withdrawal of each plant from heterogeneous data sources, including web services and files, cleaned and processed them, and saved them in our database tables. We filled remaining data gaps using a coefficient-based approach. This data article describes metadata and methods for producing the historical and projected datasets in the format of CSV files. The datasets allow researchers to view electricity generation in the context of emissions and water usage at the granularity of power plants, for example for data analysis and machine learning. These data can also be aggregated to different spatial scales, such as watershed, county, state, and national level, according to different analytical needs. In addition, decision makers can use these data for future energy and resource allocations with awareness of emission and water constraints.
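As a rough illustration of how plant-level records might be gap-filled and then rolled up to a larger spatial scale, consider the Python sketch below. The field names, the sample values, and the coefficient are all hypothetical, and the gap-filling rule (missing water consumption estimated as generation times a per-MWh coefficient) is an assumption about what a coefficient-based approach could look like, not the published method.

```python
from collections import defaultdict

# Hypothetical plant-level monthly records; every value here is made up.
records = [
    {"plant_id": "P1", "state": "CA", "month": "2020-01",
     "generation_mwh": 1000.0, "water_consumption_gal": None},
    {"plant_id": "P2", "state": "CA", "month": "2020-01",
     "generation_mwh": 500.0, "water_consumption_gal": 200.0},
    {"plant_id": "P3", "state": "NV", "month": "2020-01",
     "generation_mwh": 300.0, "water_consumption_gal": 90.0},
]

def fill_with_coefficient(rows, gallons_per_mwh):
    """Return copies of the rows with missing water values estimated (assumed rule)."""
    filled = []
    for row in rows:
        row = dict(row)  # copy so the input rows are left untouched
        if row["water_consumption_gal"] is None:
            row["water_consumption_gal"] = row["generation_mwh"] * gallons_per_mwh
        filled.append(row)
    return filled

def aggregate(rows, key, fields):
    """Sum the given numeric fields per (key value, month)."""
    totals = defaultdict(lambda: dict.fromkeys(fields, 0.0))
    for row in rows:
        for field in fields:
            totals[(row[key], row["month"])][field] += row[field]
    return dict(totals)

filled = fill_with_coefficient(records, gallons_per_mwh=0.5)
by_state = aggregate(filled, "state", ["generation_mwh", "water_consumption_gal"])
```

The same `aggregate` call would work for a county or watershed column, provided each record carries that key.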

Journal article Peer reviewed

Visualizing the United States electricity-water-climate nexus

by Julian Fulton and Ying Jin

Published 09/01/2021

Environmental modelling & software : with environment data news, 143, 1 - 13

The United States power sector uses immense quantities of water and is vulnerable to drought, water mismanagement, and climate change. Integrated management of energy and water systems is therefore critical, yet is hindered by disparate and delayed information. We present the Energy-Water-Emissions Dashboard (EWED) as an integrative modeling approach and exemplary web-based user interface displaying monthly generation, water withdrawal and consumption, and greenhouse gas emissions at nearly 10,000 U.S. power plants from 2003 to the most recent data available. EWED users may view aggregated results at larger spatial scales including watersheds, where they are compared to modeled water availability data. EWED also models projections out to 2050 based on a range of energy and climate scenarios, finding that water resources will continue to play a critical role in energy sector transitions. As such, EWED supports efforts to integrate decision making on immediate and long-term management challenges at the energy-water-climate nexus.
• Existing electricity-water assessments can be complemented by our modeling approach.
• Our tool shows potential water availability-based constraints on the power sector.
• Power sector water use has declined but is projected to increase in coming decades.
• Future power sector water use varies with energy sources and technologies.
• Our tool supports integrated energy-water management over short and long terms.

Conference proceeding

A Two-Phase Approach for the Prediction of United States Power Plant Water Consumption

by Emily J Yang and Ying Jin

Published 08/2021

2021 IEEE 22nd International Conference on Information Reuse and Integration for Data Science (IRI), 290 - 293

It is important to consider water constraints when making decisions for future energy allocations, as some promising renewable energy sources have a high demand for water and are restricted by water availability. The Energy-Water-Emissions Dashboard (EWED) project is an information exchange system for the United States Energy-Water Nexus. Our previous publication describes our approach of using machine learning models to predict future electricity generation, water consumption, and water withdrawal of different types of power plants across the United States. The performance of water consumption prediction was less satisfactory than that of electricity generation and water withdrawal. This paper describes a novel two-phase approach to improve the prediction of water consumption. The first phase uses a Recurrent Neural Network (RNN) to predict future water consumption based on time series. The predicted result is then fed into the second phase as a new feature to produce the final water consumption prediction using an Artificial Neural Network (ANN). Compared to our previous ANN prediction, Root Mean Square Error (RMSE) decreased by 6.9% and Mean Absolute Error (MAE) decreased by 21%. Compared with the conventional coefficient method used by EWED, RMSE decreased by 53%. The performance evaluation uses five statistical measures and k-fold cross-validation.
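The two error measures quoted above, and the notion of a percentage decrease between two models, can be sketched in a few lines of Python. The function names and the sample numbers are illustrative assumptions and are unrelated to the paper's data.

```python
import math

def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared residual."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean Absolute Error: mean of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def relative_decrease(baseline_error, new_error):
    """Fractional reduction of an error measure relative to a baseline."""
    return (baseline_error - new_error) / baseline_error

# Made-up observations and two made-up sets of predictions.
observed = [1.0, 2.0, 3.0]
baseline_pred = [1.0, 2.0, 5.0]
improved_pred = [1.0, 2.0, 4.0]

rmse_drop = relative_decrease(rmse(observed, baseline_pred),
                              rmse(observed, improved_pred))
mae_drop = relative_decrease(mae(observed, baseline_pred),
                             mae(observed, improved_pred))
```

Because RMSE penalizes large residuals more heavily than MAE, the two measures can improve by quite different percentages for the same change in model, as the figures in the abstract suggest.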

Journal article

Contemporary Symbolic Regression Methods and their Relative Performance

by William La Cava, Patryk Orzechowski, Bogdan Burlacu, Fabrício Olivetti de França, Marco Virgolin, Ying Jin, Michael Kommenda and Jason H Moore

Published 07/29/2021

Many promising approaches to symbolic regression have been presented in recent years, yet progress in the field continues to suffer from a lack of uniform, robust, and transparent benchmarking standards. In this paper, we address this shortcoming by introducing an open-source, reproducible benchmarking platform for symbolic regression. We assess 14 symbolic regression methods and 7 machine learning methods on a set of 252 diverse regression problems. Our assessment includes both real-world datasets with no known model form as well as ground-truth benchmark problems, including physics equations and systems of ordinary differential equations. For the real-world datasets, we benchmark the ability of each method to learn models with low error and low complexity relative to state-of-the-art machine learning methods. For the synthetic problems, we assess each method's ability to find exact solutions in the presence of varying levels of noise. Under these controlled experiments, we conclude that the best performing methods for real-world regression combine genetic algorithms with parameter estimation and/or semantic search drivers. When tasked with recovering exact equations in the presence of noise, we find that deep learning and genetic algorithm-based approaches perform similarly. We provide a detailed guide to reproducing this experiment and contributing new methods, and encourage other researchers to collaborate with us on a common and living symbolic regression benchmark.

Journal article

Is Object Detection Necessary for Human-Object Interaction Recognition?

by Ying Jin, Yinpeng Chen, Lijuan Wang, Jianfeng Wang, Pei Yu, Zicheng Liu and Jenq-Neng Hwang

Published 07/27/2021

This paper revisits human-object interaction (HOI) recognition at image level without using supervision of object location and human pose. We name it detection-free HOI recognition, in contrast to the existing detection-supervised approaches, which rely on object and keypoint detections to achieve the state of the art. With our method, not only is detection supervision avoidable, but superior performance can be achieved by properly using image-text pre-training (such as CLIP) and the proposed Log-Sum-Exp Sign (LSE-Sign) loss function. Specifically, using text embeddings of class labels to initialize the linear classifier is essential for leveraging the CLIP pre-trained image encoder. In addition, LSE-Sign loss facilitates learning from multiple labels on an imbalanced dataset by normalizing gradients over all classes in a softmax format. Surprisingly, our detection-free solution achieves 60.5 mAP on the HICO dataset, outperforming the detection-supervised state of the art by 13.4 mAP.

Journal article Peer reviewed

Influences of formation potential on oxide film of TC4 in 0.5 M sulfuric acid

by Qingrui Wang, Feifei Huang, Yi-Tao Cui, Hiroaki Yoshida, Lei Wen and Ying Jin

Published 04/01/2021

Applied surface science, 544, 148888

The correlation between film formation potential and the thickness, valence state, and corrosion resistance of the oxide film on TC4 (Ti-6Al-4V) in 0.5 M sulfuric acid is investigated. Potentiostatic polarization and electrochemical impedance spectroscopy measurements reveal the passivation process and electrochemical properties of TC4 alloy. Relative quantitative X-ray photoelectron spectroscopy and Auger electron spectroscopy analyses show that increasing the applied potential promotes the formation of a thicker film with a higher valence state. The passive films formed at potentials in the passive region provide good protection, and the corrosion resistance of the passive film increases with increasing passivation potential. The TiO2 in oxide films generated under the above conditions transforms from anatase to rutile as the applied potential increases; this transformation is identified by comparison with the X-ray absorption spectra of reference oxides.

Journal article Peer reviewed

Coulostatic Perturbation Measurements and the Corresponding Time-to-Frequency Transform Data Analysis for Micro-Electrochemical Study

by Shuangyu Cai, Lei Wen, Xiuquan Yao, Feifei Huang, Zhigang Yu and Ying Jin

Published 02/12/2021

Journal of the Electrochemical Society, 168, 2, 021508

Traditional micro-electrochemical impedance spectroscopy (EIS) measurement using a capillary cell presents problems such as high ohmic resistance, long test duration, and the possible subsequent blocking of the tip by corrosion products. In comparison, coulostatic perturbation measurements can avoid these issues owing to their unique test principle and much shorter test duration. In this work, coulostatic perturbation tests were performed on microregions of duplex stainless steel (DSS) 2205 immersed in 3.5 wt.% NaCl solution. The micro-electrochemical parameters were estimated by linearly fitting the time-domain curve (LFTC), and subsequently by fitting the frequency-domain curve (FFC) obtained through Fast Fourier Transform (FFC-FFT) for comparison. It is shown that the FFC-FFT method minimizes the problem of manual error in slope and intercept evaluation during LFTC. In comparison to traditional EIS tests, the FFC-FFT method causes less perturbation to the tested system, suffers less interference from ohmic resistance, and has a shorter test duration. It can thus obtain valid low-frequency data more efficiently, which is particularly favorable for studying systems with high polarization resistance or unstable systems. The micro-electrochemical tests of DSS 2205 show that the polarization resistance of the microregion gradually increases with the increase of austenite phase, while the double-layer capacitance shows a decreasing trend.
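The slope-and-intercept evaluation mentioned for the time-domain fit can be sketched as follows in Python. This assumes the textbook coulostatic model, in which a charge `charge` injected into a parallel polarization resistance and double-layer capacitance gives an overpotential that decays as eta(t) = (charge / Cd) * exp(-t / (Rp * Cd)); the model choice, the function names, and the synthetic values are assumptions, and the sketch does not reproduce the FFC-FFT procedure of the paper.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def coulostatic_fit(times, overpotentials, charge):
    """Estimate (Rp, Cd) from a decay transient under the assumed model.

    ln(eta) = ln(charge / Cd) - t / (Rp * Cd), so the intercept gives Cd
    and the slope then gives Rp.
    """
    slope, intercept = fit_line(times, [math.log(e) for e in overpotentials])
    eta0 = math.exp(intercept)   # overpotential extrapolated to t = 0
    cd = charge / eta0           # double-layer capacitance
    rp = -1.0 / (slope * cd)     # polarization resistance
    return rp, cd

# Synthetic, noise-free transient generated from made-up parameters.
true_rp, true_cd, charge = 1.0e4, 2.0e-5, 1.0e-7   # ohm, farad, coulomb
times = [0.01 * i for i in range(1, 50)]           # seconds
etas = [charge / true_cd * math.exp(-t / (true_rp * true_cd)) for t in times]

rp_est, cd_est = coulostatic_fit(times, etas, charge)
```

With real, noisy transients the fitted slope and intercept depend on which portion of the curve is selected, which is the kind of manual error the abstract says the frequency-domain fit reduces.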
