PeerJ
Company
PeerJ is an award-winning, open-access, peer-reviewed journal with an innovative lifetime subscription model.
Team
I was hired as a Senior Software Engineer and later transitioned into the role of DevOps Lead. The technical team was relatively small, consisting of the CEO, one senior developer, and myself. During my time at PeerJ, I also hired a DevOps engineer to support the team.
Challenge
I joined PeerJ while their initial MVP was still in development and contributed to its growth into the platform it is today. Some of the notable projects I worked on during my time there include:
Scaling, Load Testing, & High Availability: I set up and configured Varnish caching, including automating cache invalidation and IP blacklisting. In conjunction with this, I created a JMeter load profile and conducted load tests in a staging environment to optimize the configuration of Apache, PHP, and Varnish. Additionally, I set up a Jenkins build server and developed the deployment process.
Accurate Page Level Analytics: Initially, we used Google Analytics (GA) to publicly report individual page-level metrics such as page views and PDF downloads. However, we quickly noticed issues with GA, including decreasing metrics or large, one-off jumps that would later return to original levels. I implemented logic to mitigate these anomalies, but it became clear that GA was not a viable long-term solution. To address this, I created an EMR cluster and used Hadoop and Hive to query log files for the same metrics. As part of this process, I implemented IP and bot exclusions to ensure the metrics remained largely consistent with previous data. Additionally, I architected the system to bring the cluster online for only one to two hours each night, retrieving recent metrics while minimizing costs.
Document Collaboration: PeerJ explored a new product to help academics collaborate on documents. The initial system was based on file sharing, and after thorough investigation, I integrated Seafile into the PeerJ platform. Additionally, I made various modifications to the API and adapted the Qt application to align with PeerJ’s branding.
Migration away from Scalr: When I joined PeerJ, Scalr was heavily used for various aspects of the MVP, including job orchestration, server scaling based on load, deployments, and database replication. Due to time constraints, I initially continued using Scalr. However, to reduce costs and address occasional issues with its job orchestration engine, I initiated a long-term project to migrate away from it. As part of this migration, we moved the database from MySQL to Aurora, which removed the dependency on Scalr for database replication. Job orchestration was rebuilt to run across several servers using a combination of Cron, Cronlock, Graylog, and MEMon. The deployment process was rebuilt using Ansible and Ansistrano. Additionally, we migrated the website to native AWS Load Balancing.
Document Generation: One of the core components of PeerJ is generating a review PDF that combines the main document with all supporting figures and images for peer reviewers. The system must handle documents of various formats and sizes while ensuring accuracy. I was responsible for ensuring the robustness of this system, utilizing a combination of open-source tools and paid APIs to perform resizing and conversions as required.
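To illustrate the IP and bot exclusions described under Accurate Page Level Analytics, here is a minimal Python sketch of the filtering idea. It is not the production code, which ran as Hive queries on EMR. It assumes access logs in the Apache Combined Log Format, and the excluded IP, the bot markers, the function name, and the sample paths are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical exclusion lists; the real lists are not given in this write-up.
EXCLUDED_IPS = {"203.0.113.7"}
BOT_MARKERS = ("bot", "crawler", "spider")

# Assumed Apache Combined Log Format:
# ip ident user [time] "METHOD path protocol" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"$'
)


def count_page_views(lines):
    """Count successful GET requests per path, skipping excluded IPs and bots."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that do not parse
        if match["method"] != "GET" or match["status"] != "200":
            continue  # only successful page or file fetches count
        if match["ip"] in EXCLUDED_IPS:
            continue  # IP exclusion
        agent = match["agent"].lower()
        if any(marker in agent for marker in BOT_MARKERS):
            continue  # bot exclusion based on the user-agent string
        counts[match["path"]] += 1
    return counts


# Hypothetical sample: one real view, one bot hit, one excluded IP, one PDF download.
SAMPLE_LOG = [
    '198.51.100.1 - - [01/Jan/2015:00:00:01 +0000] "GET /articles/1/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '198.51.100.2 - - [01/Jan/2015:00:00:02 +0000] "GET /articles/1/ HTTP/1.1" 200 5120 "-" "Googlebot/2.1"',
    '203.0.113.7 - - [01/Jan/2015:00:00:03 +0000] "GET /articles/1/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '198.51.100.3 - - [01/Jan/2015:00:00:04 +0000] "GET /articles/1.pdf HTTP/1.1" 200 90210 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for path, views in sorted(count_page_views(SAMPLE_LOG).items()):
        print(path, views)
```

In this sample, only the first and last requests are counted: the second is dropped by the user-agent check and the third by the IP exclusion. Applying the same filters to every nightly run is what keeps the reported numbers consistent from one day to the next.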
Results
PeerJ was highly regarded within the academic community and established itself as a leader in the open-access space. I am proud to have been a part of its journey. In 2024, PeerJ was acquired by Taylor & Francis.
The High Availability work achieved a one-year uptime of 99.99%, equating to less than one hour of downtime over the year.
The accurate page-level analytics project was essential in maintaining trust within the PeerJ community. Upon completion, we were able to deliver accurate and consistent metrics to our users, resulting in no further complaints.
The document collaboration product successfully reached an alpha release, but ultimately, due to insufficient customer demand, the project was discontinued.
The Scalr migration project was 90% complete when I left PeerJ, with all systems functioning well post-migration, including the resolution of issues with certain jobs failing to run. The project was estimated to reduce costs by 20%, and I am confident that this target was achieved upon its completion.
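As a quick check on the uptime figure quoted above, the downtime allowed by a given uptime percentage can be worked out directly. This is a small sketch; the function name is hypothetical and a 365-day year is assumed.

```python
def allowed_downtime_minutes(uptime_percent, days=365):
    """Minutes of downtime permitted over `days` at the given uptime percentage."""
    total_minutes = days * 24 * 60
    return total_minutes * (100 - uptime_percent) / 100


if __name__ == "__main__":
    minutes = allowed_downtime_minutes(99.99)
    print(f"99.99% uptime allows about {minutes:.1f} minutes of downtime per year")
```

For 99.99% this comes to roughly 52.6 minutes, which is consistent with the "less than one hour" stated in the results.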