ABOUT ME
Hi, I'm Tao Ding (Chinese: 丁涛).

I'm a PhD student in the Department of Information Systems at the University of Maryland, Baltimore County, advised by Dr. Shimei Pan in the Text Mining and Social Media Analytics Lab. We mainly focus on social media-based analysis of user behavior and traits. I was also advised by Dr. Malcom Gethers on feature location, impact analysis, and software measurement tasks. My research interests include natural language processing, applied machine learning, and software engineering.

I obtained my master’s degree in computer and information science from Gannon University in 2013, advised by Dr. Frank Xu. I received my bachelor’s degree in software engineering from Nanchang Hangkong University, China. Before starting my master’s degree, I worked as a software engineer for 3 years at SNDA and Renren Inc.

My CV is available [here], or scroll down for an abbreviated web version.
CONTACT
Email: tding1027@gmail.com
Mail: University of Maryland, Baltimore County

The Department of Information Systems
1000 Hilltop Circle
Baltimore, MD 21250

USEFUL LINKS
LinkedIn Profile
GitHub
Wonderful Talk

SELECTED PROJECTS
Personalized Emphasis Framing
Persuasion is an integral part of our personal and professional lives. It is generally believed that persuasion is more effective when it is custom-tailored to reflect the interests and concerns of the intended audience. We analyzed the relationship between an individual's traits and his/her aspect framing decisions. Our analysis has uncovered interesting patterns that can be used to automatically customize a message's content to enhance its appeal to its receivers.
A Multi-domain Empirical Investigation on Sentiment Analysis
Sentiment analysis (also known as opinion mining) is frequently used to monitor public opinion on the internet. However, the technology isn't fully mature yet. As a result, if not used carefully, the results from sentiment analysis can be misleading. We present an empirical investigation of the effectiveness of using current sentiment analysis tools to assess people's opinions in five different domains.
Mass Sentiment Analysis of FCC Comments: Net Neutrality Policy
The Federal Communications Commission has published the bulk of the comments it received on the Open Internet/Network Neutrality rulemaking proceeding so that the public can help analyze them. We visualized the mass sentiment around this topic on a map of the United States specifically and worldwide generally. We find that the mood of a comment does not always reflect the commenter's attitude, for two reasons: 1) it is difficult to identify which aspect of the object, such as the policy or the people involved, a comment addresses, since commenters can talk about anything they want; 2) the relation between an aspect and the object can be ambiguous. With privacy and net neutrality, for example, it is hard to say whether people support net neutrality because they do or do not pay attention to their privacy.
Flow-based Anomaly Intrusion Detection
While popular Intrusion Detection Systems like Snort are used in many network locations, comprehensive deployment is costly due to the need for high-speed monitors at many network ingress points. Flow statistics are used to discover anomalies by aggregating network traces and then using machine-learning classifiers to discover suspicious activities. However, the efficiency and effectiveness of the flow classification models depend on the level and granularity of aggregation. We describe the design and implementation of a novel multi-level approach that aggregates packets into network flows and correlates them with security events generated by payload-based IDSs.
A Study of the Effects of Expert Knowledge on Bug Reports
Bug reports are crucial software artifacts for both software maintenance researchers and practitioners. A typical use of bug reports by researchers is to evaluate automated software maintenance tools: a large repository of reports is used as input for a tool, and metrics are calculated from the tool’s output. But this process is quite different from how practitioners work: they distinguish between reports written by experts, such as programmers, and reports written by non-experts, such as users. Practitioners recognize that the content of a bug report depends on its author’s expert knowledge. In this paper, we present an empirical study of the textual difference between bug reports written by experts and non-experts. We find that a significant difference exists, and that this difference has a significant impact on the results from a state-of-the-art feature location tool.
Rule-Based Test Input Generation From Bytecode
Search-based test generators, such as those using genetic algorithms and alternative variable methods, can automatically generate test inputs. They typically rely on fitness functions to calculate fitness scores for guiding the search process. This paper presents a novel rule-based testing (RBT) approach to automated generation of test inputs from Java bytecode without using fitness functions. It extracts tagged paths from the control flow graph of given bytecode, analyzes and monitors the predicates in the tagged paths at runtime, and generates test inputs using predefined rules. Our case studies show that RBT has outperformed the test input generators using genetic algorithms and alternative variable methods.

RESUME/CV
Education
University of Maryland, Baltimore County, 2013 - Present
Ph.D. in Information Systems

Gannon University, 2011 - 2013
M.S. in Computer and Information Science

Nanchang Hangkong University, 2004 - 2008
B.S. in Software Engineering


Teaching Experience
University of Maryland, Baltimore County
IS420 Database Application Development
Nanchang Hangkong University
Java Networking Programming
Discrete Mathematics
Professional Experience
TCL America Research, Summer 2016
Data Scientist Intern

General Electric Transportation, Oct. 2011 - Aug. 2013
Research Programmer

Renren Inc., 2010 - 2011
Software Engineer

Shanda Ltd., 2008 - 2010
Software Engineer

Publications
Tao Ding, Fatema Hasan, Shimei Pan, Interpreting Social Media-based Substance Use Prediction Models with Knowledge Distillation, IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2018)
Tao Ding, Cheng Zhang, Maarten Bos, Causal Feature Selection for Individual Characteristics Prediction, IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2018)
Tao Ding, Warren K. Bickel, Shimei Pan, Predicting Delay Discounting from Social Media Likes with Unsupervised Feature Learning, The IEEE/ACM International Conference on Social Networks Analysis and Mining (ASONAM 2018)
Tao Ding, Warren K. Bickel, Shimei Pan, Multi-View Unsupervised User Feature Embedding for Social Media-based Substance Use Prediction, Conference on Empirical Methods in Natural Language Processing (EMNLP 2017)
Tao Ding, Arpita Roy, Zhiyuan Chen, Qian Zhu, Shimei Pan, Analyzing and Retrieving Illicit Drug-Related Posts from Social Media, Workshop on Data Mining in Translational Biomedical Informatics (TBI), in conjunction with The IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2016)
Tao Ding, Shimei Pan, Personalized Emphasis Framing for Persuasive Message Generation, Conference on Empirical Methods in Natural Language Processing (EMNLP 2016)
Tao Ding, Shimei Pan, How Reliable is Sentiment Analysis? A Multi-domain Empirical Investigation, Lecture Notes in Business Information Processing (Invited paper)
Tao Ding, Shimei Pan, An Empirical Study of the Effectiveness of Using Sentiment Analysis Tools for Opinion Mining, The 12th International Conference on Web Information Systems and Technologies (WEBIST 2016)
Tao Ding, Ahmed AlEroud, George Karabatis, Multi-level Aggregation of Network Flows For Security Analysis, IEEE International Conference on Intelligence and Security Informatics (IEEE ISI 2015)
Paige Rodeghero, Da Huo, Tao Ding, Collin McMillan, Malcom Gethers, An Empirical Study on How Expert Knowledge Affects Bug Reports, Journal of Software: Evolution and Process, 2016
Da Huo, Tao Ding, Collin McMillan, Malcom Gethers, An Empirical Study of the Effects of Expert Knowledge on Bug Reports, 30th International Conference on Software Maintenance and Evolution (ICSME 2014)
Weifeng Xu, Tao Ding, Dianxiang Xu, Rule-based Test Input Generation From Bytecode, The 8th IEEE International Conference on Software Security and Reliability (SERE 2014)
Weifeng Xu, Tao Ding, Hanlin Wang, Utilizing Java Bytecode to Mining Auto-Generated Test Inputs for Test Oracle, The 37th Annual International Computer Software and Applications Conference (COMPSAC 2013)
Weifeng Xu, Hanlin Wang, Tao Ding, Mining Auto-Generated Test Inputs for Test Oracle, The 10th International Conference on Information Technology: New Generations (ITNG 2013)
Weifeng Xu, Lin Deng, Tao Ding, Detecting Web Security Risks With UML Design Models, The 7th IASTED International Conference on Communication, Internet, and Information Technology (CIIT 2012)
Yunkai Liu, Mary Vagula, Weifeng Xu, Tao Ding, Gene Expression Games: A Case Study of the Integration between Game Programming and Bioinformatics Education, Great Lakes Bioinformatics Conference 2012 (GLBIO 2012)


PDF of my Full CV
