Proofpoint, headquartered in Sunnyvale, California, is a cybersecurity company that provides solutions to top research universities, banks and over half of the Fortune 100 corporations. Proofpoint protects sensitive data across every domain including email, the web, the cloud, social media and mobile messaging.
Proofpoint shields its clients from millions of spam emails per day. By analyzing terabytes of email data, Proofpoint can predict cybersecurity attacks and prevent them before they happen, increasing the security and reliability of its systems and the web.
Our Leveraging SPAM to Make Bold Societal Predictions project uses the large volume of spam Proofpoint collects to make predictions about real-world events.
Our system applies machine learning techniques to spam email data, grouping the emails by topic and looking for underlying patterns that might indicate the outcome of a related event.
The trends in spam data, as well as any predictions our system makes, can be viewed on our web dashboard. The dashboard highlights interesting data trends and presents predictions on topics including the 2020 presidential election, stock prices and consumer sentiment.
Proofpoint analysts use our dashboard to help anticipate cybersecurity risks before they materialize, allowing them to provide superior security to Proofpoint's clients.
Our back end runs locally on Proofpoint’s secure data server, where it collects the information stored in spam .EML files and anonymizes it so that customer privacy is protected without sacrificing the information we are able to use from those files.
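As a rough illustration of what anonymizing an .EML file can look like, here is a minimal sketch that uses only Python's standard library. It is not the project's actual pipeline: the function names (pseudonymize, anonymize_eml), the salt value, the choice of which fields to keep, and the address-matching pattern are all assumptions made for this example.

```python
import hashlib
import re
from email import message_from_string
from email.policy import default

# Simple pattern for email addresses (an assumption; real data may need more care).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def pseudonymize(address: str, salt: str = "example-salt") -> str:
    """Replace an address with a stable token derived from a salted hash."""
    digest = hashlib.sha256((salt + address.lower()).encode("utf-8")).hexdigest()
    return f"user_{digest[:12]}@redacted.invalid"


def scrub(text: str) -> str:
    """Swap every email address found in the text for its pseudonym."""
    return EMAIL_RE.sub(lambda m: pseudonymize(m.group()), text)


def anonymize_eml(raw: str) -> dict:
    """Parse raw .EML text and keep only scrubbed, non-identifying fields."""
    msg = message_from_string(raw, policy=default)
    part = msg.get_body(preferencelist=("plain",))
    body = part.get_content() if part is not None else ""
    # The From and To headers are dropped entirely (hypothetical design choice).
    return {
        "subject": scrub(str(msg.get("Subject", ""))),
        "date": str(msg.get("Date", "")),
        "body": scrub(body),
    }


sample = (
    "From: Alice <alice@example.com>\n"
    "To: bob@example.org\n"
    "Subject: Win big, bob@example.org!\n"
    "Date: Mon, 01 Jun 2020 10:00:00 +0000\n"
    "\n"
    "Contact alice@example.com now.\n"
)
record = anonymize_eml(sample)
```

Because the same address always maps to the same token, a sketch like this would still allow counting how often a given sender or recipient appears across messages without storing the address itself.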
Our web dashboard consists of a React front end and a Django and SQLite back end, hosted on an Apache web server. Our machine learning model is implemented in Python with Flair.
