Richard J. Bocchinfuso

"Be yourself; everyone else is already taken." – Oscar Wilde

Ph.D. Journey Begins

One of the more difficult aspects of the Ph.D. application process was selecting an area of research. Below is an excerpt from my application essay outlining my proposed area of research.

I have given thought to my potential area of research, and one area of computing that interests me significantly is sentiment analysis. Big data, machine learning, deep learning, and artificial intelligence are changing the way we make decisions. Ray Dalio says that radical transparency and algorithmic decision-making are going to change our lives, and I agree. Social media, big data, exponential increases in computational power, the advent of cloud computing, and the democratization of access to computing resources have made it easy to convert subjective and loosely coupled data from various sources into meaningful information. Organizations can leverage sentiment analysis to understand how their brand, products, or services are perceived. Sentiment can feed artificial intelligence, which can then autonomously engage with consumers and proactively protect and promote the organization's brand. The question is, as we democratize access to these technologies, lower the barriers to entry, and make them easier to consume, how accurate and efficient are sentiment analysis algorithms?

I want to conduct research that explores classification techniques, existing NLP (Natural Language Processing) libraries, and algorithms, measuring precision, recall, and accuracy while accounting for aspects of the modern human lexicon such as colloquial expressions, subjectivity, tone, irony, sarcasm, analogies, and emojis.
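To make the three measures concrete, here is a minimal sketch of how precision, recall, and accuracy could be computed for a binary sentiment classifier. The function name `binary_metrics`, the `"pos"`/`"neg"` labels, and the sample data are hypothetical and chosen only for illustration; they are not drawn from any particular library or dataset.

```python
def binary_metrics(y_true, y_pred, positive="pos"):
    """Compute precision, recall, and accuracy for one positive class.

    y_true and y_pred are equal-length sequences of labels
    (hypothetical "pos"/"neg" strings in this sketch).
    """
    pairs = list(zip(y_true, y_pred))
    # Count the confusion-matrix cells with respect to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard against division by zero when a denominator is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "accuracy": accuracy}


# Made-up gold labels and classifier predictions for six short texts.
gold = ["pos", "neg", "pos", "neg", "pos", "neg"]
predicted = ["pos", "pos", "neg", "neg", "pos", "neg"]

metrics = binary_metrics(gold, predicted)
print(metrics)
```

Precision asks how many of the texts labeled positive really were positive, recall asks how many of the truly positive texts were found, and accuracy is the share of all texts labeled correctly. Sarcasm and irony tend to show up as errors in exactly these counts, which is why the measures are useful for comparing libraries.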

In addition to the hypotheses and theories I develop as a result of quantitative and qualitative research and the analysis of empirical data captured from conducted experiments, I would like to explore relevant topics such as the impact of corpus source and size on accuracy and efficacy and how the corpus impacts bias. For example, as application developers increasingly leverage pre-trained models provided by industry goliaths, what unknown biases are introduced and reproduced? Where does the ethical and moral responsibility reside in these situations?