Picture of me when I was studying abroad in Thailand

Bio
I’m Amy Hilla, and I’m from Arlington, VA. I’m a senior majoring in Economics and minoring in Data Science. I’m on the club sailing team and I’m in Tri Delt.

Things I’m interested in:

Sample script:

sentiment analysis (.py)

This script reads a .csv file containing text data and analyzes the text in each row. It assigns each row two sentiment scores, polarity and subjectivity, adds the scores to the table as a new column, and exports the result as a new .csv file.
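TextBlob's `.sentiment` attribute returns both scores together, with polarity in the range -1.0 to 1.0 and subjectivity in the range 0.0 to 1.0. The sketch below shows one possible way to keep the two scores in separate columns instead of one. It is an illustration only, not part of the script: the names `Scores`, `textblob_scorer`, and `add_score_columns` are hypothetical, rows are assumed to be plain dictionaries (for example from `csv.DictReader`), and empty strings are treated as invalid, which is an assumption of this sketch.

```python
from typing import NamedTuple


class Scores(NamedTuple):
    polarity: float      # TextBlob reports this in the range [-1.0, 1.0]
    subjectivity: float  # TextBlob reports this in the range [0.0, 1.0]


def textblob_scorer(text):
    # imported lazily so the rest of the sketch runs without textblob installed
    from textblob import TextBlob
    sentiment = TextBlob(text).sentiment
    return Scores(sentiment.polarity, sentiment.subjectivity)


def add_score_columns(rows, column_name, scorer=textblob_scorer):
    """Return copies of the rows with 'polarity' and 'subjectivity' keys added."""
    scored = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left unchanged
        text = row.get(column_name)
        if isinstance(text, str) and text.strip():
            scores = scorer(text)
            new_row['polarity'] = scores.polarity
            new_row['subjectivity'] = scores.subjectivity
        else:
            # same idea as the script: mark rows without usable text
            new_row['polarity'] = 'invalid'
            new_row['subjectivity'] = 'invalid'
        scored.append(new_row)
    return scored
```

Passing a different `scorer` makes it possible to try the row handling without installing TextBlob.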

# sentiment_analysis.py
# By: Amy Hilla
# Version 1.0
# Last Edit: 2020-09-03

# This script takes a .csv file with text data
# and analyzes the text in each row.
# It assigns two sentiment scores, assessing subjectivity and polarity.
# The sentiment scores are added to the table as a new column.
# The table with sentiment scores is then exported as a new .csv file.


####################
# Required Modules #
####################
import argparse
import pandas as pd
from textblob import TextBlob


####################
#    Function      #
####################
def sentiment_scores():

    # User inputs information about what they want to analyze:
    path_to_file = input('Enter file path to .csv file: ')

    encodingtype = input('Enter encoding type for .csv file: ')

    column_name = input('Enter the name of the column containing text you want to analyze: ')

    new_file_name = input('Enter the name of the new file you want to output: ')

    # read in the file as a pandas dataframe
    alldisc = pd.read_csv(path_to_file, encoding=encodingtype)

    # create an empty list which will store the sentiment scores
    senti = []

    # analyze the text data row by row
    for post in alldisc[column_name]:

        # for each row, first check that the text data is a string
        if isinstance(post, str):

            # convert to TextBlob object
            blob = TextBlob(post)

            # use TextBlob object attribute .sentiment to calculate sentiment scores
            sent = blob.sentiment

            # add the calculated sentiment scores to the list senti containing all the scores
            senti.append(sent)

        else:
            # if the row being analyzed does not contain string data, add 'invalid' for that row
            # this allows the function to run even if some rows are empty or contain non-string data
            # in the final .csv, the column containing scores will contain 'invalid' for these rows
            senti.append('invalid')

    # add the new list of sentiment scores to the original table
    alldisc['sentiment'] = senti

    # export the table with the sentiment scores to a new .csv
    # index=False keeps pandas from writing its row index as an extra unnamed column
    alldisc.to_csv(new_file_name, encoding=encodingtype, index=False)

    # tell the user they have successfully created a new file with the scores
    print('Your new .csv with the sentiment scores, ' + str(new_file_name) + ', has been created.')

###################
#      Main       #
###################
if __name__ == "__main__":

    # --help command line description
    parser = argparse.ArgumentParser(
        description=(
            'This script will ask you for a .csv file with at least one column '
            'containing text data that you want to perform sentiment analysis on. '
            'It will then calculate subjectivity and polarity scores and create '
            'a new .csv file containing the scores.'
        )
    )
    # parsing is only needed so that --help prints the description above
    parser.parse_args()

    sentiment_scores()
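
The script uses argparse only for its --help text and collects everything else through input() prompts. As a possible alternative, the same four values could be passed on the command line. This is a minimal sketch under assumptions, not the script's actual interface: the function name `build_parser`, the argument names, the utf-8 default, and the example file names are all hypothetical.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        description='Add sentiment scores to a .csv file with a text column.'
    )
    # the three values the script currently asks for with input()
    parser.add_argument('path_to_file', help='path to the input .csv file')
    parser.add_argument('column_name', help='name of the column containing the text')
    parser.add_argument('new_file_name', help='name of the .csv file to write')
    # optional, with an assumed default so most users can leave it out
    parser.add_argument('--encoding', default='utf-8',
                        help='encoding of the .csv file (default: utf-8)')
    return parser


# parse an explicit argument list here so the sketch runs anywhere;
# a real script would call parse_args() with no arguments to read sys.argv
args = build_parser().parse_args(
    ['posts.csv', 'text', 'scored.csv', '--encoding', 'latin-1']
)
print(args.path_to_file, args.column_name, args.new_file_name, args.encoding)
# prints: posts.csv text scored.csv latin-1
```

With arguments like these, the script could be run unattended, for example from a shell loop over several files.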