Along with a few other members of ML@B, I am currently working on a consulting project for Intuit whose goal is to help them effectively summarize internal documents using various ML algorithms. The idea is that summarization makes it efficient to access, organize, and understand large documents.
We began the project in early September by conducting a literature review to gain a better understanding of how we could approach our task. Although we were unable to access the company's internal dataset, our points of contact at Intuit suggested that we use the NYT Annotated Corpus and the CNN/Daily Mail datasets. After reading numerous papers, I spent a few weeks parsing and analyzing the data from the two sources. Additionally, I used Word2Vec to generate word embeddings for the words in the data, as sketched below.
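A minimal sketch of that embedding step, assuming gensim's Word2Vec implementation (4.x API); the toy sentences here are hypothetical stand-ins for the parsed news data:

```python
# Minimal sketch: training word embeddings with gensim's Word2Vec
# (assumes gensim >= 4.0; "articles" is a hypothetical list of tokenized sentences).
from gensim.models import Word2Vec

articles = [
    ["the", "senate", "passed", "the", "bill"],
    ["stocks", "fell", "sharply", "on", "monday"],
]

model = Word2Vec(
    sentences=articles,   # iterable of token lists
    vector_size=100,      # embedding dimensionality
    window=5,             # context window size
    min_count=1,          # keep every token in this toy example
    workers=4,
)

vector = model.wv["senate"]   # 100-dimensional embedding for "senate"
```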
My next task was to implement SummaRuNNer, a state-of-the-art model described here. Using TensorFlow, I was able to fully implement the model along with both its extractive and abstractive training methods. The project is on schedule to be completed by the end of November.
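To illustrate the core idea, here is a hedged NumPy sketch of the extractive scoring layer from the SummaRuNNer paper, with its content, salience, and novelty terms. The weights here are random stand-ins rather than my trained model, and the paper's position features are omitted for brevity:

```python
# Sketch of SummaRuNNer's sentence scoring: each sentence gets a probability
# of inclusion based on content, salience to the document, and novelty
# relative to the summary built so far. Shapes are illustrative.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
dim, n_sents = 64, 10

H = rng.normal(size=(n_sents, dim))   # sentence representations from a BiRNN
d = np.tanh(H.mean(axis=0))           # document representation

W_content = rng.normal(size=dim)
W_salience = rng.normal(size=(dim, dim))
W_novelty = rng.normal(size=(dim, dim))
bias = 0.0

s = np.zeros(dim)                     # running summary representation
probs = []
for j in range(n_sents):
    h = H[j]
    content = W_content @ h                 # how informative the sentence is
    salience = h @ W_salience @ d           # relevance to the whole document
    novelty = h @ W_novelty @ np.tanh(s)    # redundancy w.r.t. summary so far
    p = sigmoid(content + salience - novelty + bias)
    probs.append(p)
    s = s + p * h                           # update summary, weighted by p
```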
I am a member of the Web Development Committee of the Regents' and Chancellor's Scholar Association at Berkeley. As a new member of the committee, I learned essential HTML/CSS skills and applied them in creating this personal website.
Inspired by the neural network implementation I created in the spring, I decided to explore the area further by implementing other, more complex ML models entirely from scratch. My goal was to fully understand the theory behind these models, and I felt nothing was better for that than implementing them on my own. Along the way I created corresponding library implementations using either TensorFlow or scikit-learn to compare the efficiency and performance of the models.
One of these projects was to generate English text by implementing and training various neural models. I began by implementing a Recurrent Neural Network from scratch which accepted characters as input and, at each time step, output a probability distribution over the next character. I trained this model on a large dataset of words and was excited to see the model learn the syntactic regularities of English words, generating strings that made sense phonetically.
Link to RNN from scratch
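For reference, here is a minimal NumPy sketch of the per-timestep computation described above. The weight shapes and the 27-character vocabulary are illustrative, not the linked code:

```python
# One step of a character-level RNN: one-hot input character in,
# new hidden state and a distribution over the next character out.
import numpy as np

vocab_size, hidden_size = 27, 128
rng = np.random.default_rng(0)
Wxh = rng.normal(scale=0.01, size=(hidden_size, vocab_size))
Whh = rng.normal(scale=0.01, size=(hidden_size, hidden_size))
Why = rng.normal(scale=0.01, size=(vocab_size, hidden_size))
bh = np.zeros(hidden_size)
by = np.zeros(vocab_size)

def step(char_index, h):
    x = np.zeros(vocab_size)
    x[char_index] = 1.0
    h = np.tanh(Wxh @ x + Whh @ h + bh)   # recurrent state update
    logits = Why @ h + by
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                  # softmax over next characters
    return h, probs

h = np.zeros(hidden_size)
h, probs = step(0, h)   # feed the first character; sample from probs to generate
```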
I took the project one step further by using TensorFlow to implement a more powerful LSTM model. With its greater ability to retain long-term dependencies, this model could generate entire passages of text rather than single words. I trained it on a full-length novel and found that it could generate passages that almost seemed as though they could have come from the book.
Link to TensorFlow LSTM
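A rough sketch of this kind of character-level LSTM generator, written against the modern tf.keras API (which postdates the code I originally wrote); all sizes are illustrative:

```python
# Sketch: an embedding layer feeds an LSTM, which emits logits over
# the next character at every time step.
import tensorflow as tf

vocab_size, embed_dim, lstm_units = 256, 64, 512

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embed_dim),
    tf.keras.layers.LSTM(lstm_units, return_sequences=True),
    tf.keras.layers.Dense(vocab_size),  # logits over the next character
])
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
# model.fit(...) on (input_chars, next_chars) pairs built from the novel's text
```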
In this project I explored another very popular architecture, the Convolutional Neural Network, in the context of image classification. Again, I began by implementing the model from scratch. Learning the theory behind the architecture helped me understand how shared convolutional filters give the model a sense of spatial structure and robustness to small translations. I applied my implementation to the commonly used MNIST benchmark dataset and found that it could classify digits with high accuracy.
Link to CNN from scratch
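As a reference point, here is a minimal sketch of the 2-D convolution at the heart of a from-scratch CNN (single channel, stride 1, valid padding); like most deep learning code, it is technically cross-correlation:

```python
# Slide a kernel over the image and take a dot product at each position,
# producing one feature map.
import numpy as np

def conv2d(image, kernel):
    ih, iw = image.shape
    kh, kw = kernel.shape
    oh, ow = ih - kh + 1, iw - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.random.rand(28, 28)        # an MNIST-sized input
kernel = np.random.rand(3, 3)         # one learned filter
feature_map = conv2d(image, kernel)   # shape (26, 26)
```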
Next, I used TensorFlow to implement a deeper model and applied it to the same image classification task, this time on the more complex CIFAR-10 dataset. This implementation was of course more efficient, but I found that my earlier from-scratch version was what truly taught me what was happening behind the scenes.
Link to TensorFlow CNN
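A hedged tf.keras sketch of a deeper CNN for CIFAR-10; the layer sizes are illustrative, not my original architecture:

```python
# Stacked convolution/pooling blocks followed by dense layers,
# a standard recipe for CIFAR-10's 32x32 RGB images.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10),  # logits for the 10 CIFAR-10 classes
])
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
```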
This project was based on a dataset I found on Kaggle that provided various numerical features for different cases of cancer, along with whether each cancer was benign or malignant. I aimed to predict benign or malignant from the features supplied in the dataset. In this project I experimented with a number of different ML model implementations: I created both a decision tree and a random forest algorithm from scratch and compared them with my own implementation of a vanilla neural network.
Link to Code
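For the library side of such a comparison, here is a sketch using scikit-learn's built-in copy of the Wisconsin breast cancer data (the Kaggle dataset is the same data in CSV form); the hyperparameters are illustrative:

```python
# Compare a decision tree, a random forest, and a small neural network
# on benign-vs-malignant classification.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "decision tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "neural net": MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000,
                                random_state=0),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, model.score(X_test, y_test))  # test accuracy
```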
This is another project based on a dataset from Kaggle. The dataset provided numerous characteristics of passengers aboard the Titanic, along with whether or not each passenger survived the disaster. In this project I worked with the data in a number of different ways: I visualized correlations, engineered new features, cleaned out missing data, and more.
Link to Code
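A sketch of the kind of wrangling involved, assuming the standard Kaggle Titanic column names (Age, SibSp, Parch, Embarked, Survived):

```python
# Typical first steps on the Titanic data: inspect correlations,
# engineer a feature, and fill missing values.
import pandas as pd

df = pd.read_csv("train.csv")

# Correlations of numeric columns with survival (pandas >= 1.5)
print(df.corr(numeric_only=True)["Survived"].sort_values())

# Engineer a new feature: total family members aboard
df["FamilySize"] = df["SibSp"] + df["Parch"] + 1

# Fill missing values rather than dropping rows
df["Age"] = df["Age"].fillna(df["Age"].median())
df["Embarked"] = df["Embarked"].fillna(df["Embarked"].mode()[0])
```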
During Summer 2017, I learned the theory and mathematics behind basic neural networks, from the network architecture to the backpropagation algorithm that allows a model to learn optimal weights.
This spring I completed a Coursera course on Neural Networks for Machine Learning; the final project was to implement an elementary feed-forward neural network from scratch, with an input layer, a hidden layer, and a softmax output layer. The framework of the program was supplied, but I had to program the actual logic from scratch.
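A minimal NumPy sketch of that architecture, with one backpropagation step under a cross-entropy loss; the layer sizes are illustrative, not the course's actual skeleton:

```python
# Forward pass through input -> tanh hidden layer -> softmax output,
# then one gradient step via backpropagation.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 784, 128, 10
W1 = rng.normal(scale=0.01, size=(n_hidden, n_in)); b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.01, size=(n_out, n_hidden)); b2 = np.zeros(n_out)

def forward(x):
    h = np.tanh(W1 @ x + b1)              # hidden layer
    logits = W2 @ h + b2
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                  # softmax output
    return h, probs

x = rng.random(n_in)
target = 3
h, probs = forward(x)

# For softmax + cross-entropy, the gradient w.r.t. logits is probs - onehot
d_logits = probs.copy(); d_logits[target] -= 1.0
dW2 = np.outer(d_logits, h)
d_h = (W2.T @ d_logits) * (1 - h**2)      # back through tanh
dW1 = np.outer(d_h, x)

lr = 0.1
W2 -= lr * dW2; b2 -= lr * d_logits
W1 -= lr * dW1; b1 -= lr * d_h
```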
I participated in FIRST Robotics for seven years, and every year I learned something new and applied my knowledge in different ways. This is a project I worked on during my final year, which aimed to predict the scores of future matches and thus streamline the process of selecting alliance partners during the playoff stage of the competition.
The model itself was a neural network implemented in TensorFlow that aimed to predict the match score given the participating teams and various statistics. We used a variety of techniques to improve training and performance, from regularization methods such as dropout and weight penalties to training techniques such as momentum and mini-batch learning. Ultimately we were able to provide significant insight into the result of a match before it actually took place.
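A hedged tf.keras sketch of such a model, combining the techniques mentioned above (dropout, L2 weight penalties, SGD with momentum, mini-batch training); the feature layout and layer sizes are assumptions, not the original competition code:

```python
# Regression network: per-team statistics in, predicted match score out.
import tensorflow as tf

n_features = 40  # e.g. aggregated statistics for both alliances (assumed)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu",
                          kernel_regularizer=tf.keras.regularizers.l2(1e-4),
                          input_shape=(n_features,)),
    tf.keras.layers.Dropout(0.3),                 # regularization
    tf.keras.layers.Dense(64, activation="relu",
                          kernel_regularizer=tf.keras.regularizers.l2(1e-4)),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(1),                     # predicted match score
])
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
    loss="mse",
)
# model.fit(X_train, y_train, batch_size=32, epochs=...)  # mini-batch learning
```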