NLP Discord bot

2017-present

The idea

Back in high school, I thought it would be funny to make a Discord bot that could pretend to be my friends using a Markov chain. The idea was that it would log all of a user’s messages into a corpus, which you could then use to generate new messages in that user’s style. I ended up using the Markovify library for Python; it worked, and it was a ton of fun. My friends and I named it Robot Kirby after my Super Smash Bros. main at the time.
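To give a rough sense of how a Markov chain can imitate someone's messages, here is a minimal standard-library sketch. It is not Robot Kirby's code and it does not use Markovify; the names `build_chain`, `generate`, `BEGIN`, and `END` are made up for this illustration, and it assumes simple whitespace tokenization.

```python
import random
from collections import defaultdict

# Hypothetical sentinel tokens marking the start and end of a message.
BEGIN = "__BEGIN__"
END = "__END__"


def build_chain(messages, state_size=1):
    """Map each state (a tuple of words) to the words seen right after it."""
    chain = defaultdict(list)
    for message in messages:
        words = message.split()
        if not words:
            continue  # skip empty messages
        padded = [BEGIN] * state_size + words + [END]
        for i in range(len(padded) - state_size):
            state = tuple(padded[i:i + state_size])
            chain[state].append(padded[i + state_size])
    return chain


def generate(chain, state_size=1, max_words=30, rng=None):
    """Walk the chain from the BEGIN state until END or max_words is reached."""
    rng = rng or random.Random()
    state = (BEGIN,) * state_size
    output = []
    while len(output) < max_words:
        candidates = chain.get(state)
        if not candidates:
            break  # empty corpus or unseen state
        next_word = rng.choice(candidates)
        if next_word == END:
            break
        output.append(next_word)
        state = state[1:] + (next_word,)
    return " ".join(output)


if __name__ == "__main__":
    corpus = ["i love kirby", "i love smash", "kirby is the best"]
    print(generate(build_chain(corpus)))
```

Because repeated followers are stored as duplicates in each list, `rng.choice` picks the next word in proportion to how often it followed the current state in the corpus. A larger `state_size` tends to produce more coherent output that stays closer to the original messages.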

Iterative improvements

Being young and naive, I didn’t know anything about databases, and my code sucked for a number of reasons. It stored all the messages in text files, which caused significant scalability problems. So, throughout college and to this day, I have been maintaining it and upgrading it through several complete rewrites.

It now has some more fun features, including sentiment analysis using VADER and visualizations of a user’s corpus.

Over time, my friends have also contributed feature ideas and code to the project. I am particularly thankful to Amelia and Jenna for their code contributions, and to all of my friends who have provided ideas and encouragement.

Nowadays, it uses a much more sophisticated database, MongoDB, and it is containerized using Docker and Docker Compose. I run it on my home server.

Selected functionality

Command	Arguments	Description
`sentient`	`[prompt]` `[topic]` `[channel]` `[member]`	Imitate a given `[member]`, `[channel]`, or current server (if no filters are specified). `[prompt]` provides an optional starting state, and `[topic]` filters the corpus to a specific search term.
`opinion`	`topic` `[channel]` `[member]`	Calculate a sentiment score for a given `topic`, optionally filtering by `[channel]` or `[member]`
`wordcloud`	`[channel]` `[member]`	Create a wordcloud using messages from a `[member]`, `[channel]`, or server (if no filters are specified)
`timedensity`	`[timezone]` `[channel]` `[member]` `[topic]`	Plot when a `[member]`, `[channel]`, or server (if no filters are specified) is active. Optionally, show results in `[timezone]` (defaults to EST).

Square brackets ([]) indicate an optional argument.
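As an illustration of the kind of bucketing a command like `timedensity` might do before plotting, here is a small sketch that counts messages per hour of the day in a chosen timezone. It is an assumption about the approach, not the bot's actual code: `hourly_activity` and `EST` are hypothetical names, and the default is a fixed UTC-5 offset that ignores daylight saving time.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Assumed default: EST as a fixed UTC-5 offset (no daylight saving handling).
EST = timezone(timedelta(hours=-5), "EST")


def hourly_activity(timestamps, tz=EST):
    """Return 24 counts: how many timestamps fall in each local hour of the day.

    `timestamps` must be timezone-aware datetimes.
    """
    counts = Counter(ts.astimezone(tz).hour for ts in timestamps)
    return [counts.get(hour, 0) for hour in range(24)]


if __name__ == "__main__":
    sample = [
        datetime(2024, 1, 1, 17, 30, tzinfo=timezone.utc),
        datetime(2024, 1, 2, 17, 5, tzinfo=timezone.utc),
        datetime(2024, 1, 1, 3, 0, tzinfo=timezone.utc),
    ]
    for hour, count in enumerate(hourly_activity(sample)):
        if count:
            print(f"{hour:02d}:00 {count}")
```

The resulting list can be handed to any plotting library as the heights of a 24-bar histogram. For named zones with daylight saving rules, passing a `zoneinfo.ZoneInfo` object as `tz` would work the same way.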

At the end of each year I also send out Robot Kirby Wrapped, which gives you statistics about your corpus, ranks your top words and channels, analyzes your positivity/negativity, and analyzes your use of offensive language.

All logging is strictly opt-in through the opt-in/opt-out commands. You can delete your own data using the delete command.

Using Robot Kirby

Because I want to keep tight control over who uses my Robot Kirby instance, for scalability and privacy reasons, I won’t post a link to the instance I run. However, the code is publicly available on GitHub at sashaiw/robotkirby, along with instructions on how to run your own instance. Also, if you have any cool features you would like to see added, feel free to contact me! I will also consider pull requests on GitHub.