Olivier Tharan Site Reliability Engineer / DevOps email@example.com +33 6 25 13 31 08
I enjoy working on infrastructure and making sure it is always on, and designing computer systems that allow automation, on the principle of “Don’t Repeat Yourself”, in order to make people more productive. Over the course of my career, I have evolved from a pure operations perspective, with oncall duties on the firefighting side, to a more development-oriented one. I would like to share this knowledge with other teams and grow with them.
Site Reliability Engineer for the Corporate Engineering team, based in Dublin, Ireland, then Paris, France. My main job for 12 years has been to ensure that internal services for the company keep running, and that the 60,000+ Google employees can work productively.
Redefine the concept of enterprise architecture beyond the traditional network perimeter, by providing finer-grained access controls for all internal services and no longer relying on internal IP addresses as indicators of trust. A presentation on this project was given at LISA’13.
I initiated the project, which was mostly program management and directing the work of dozens of different teams across the IT and Security departments.
I designed a client-server project to send logs from all corporate machines to our centralized log systems; I wrote the server in C++.
Moving from a permit-all network to permissions based on machine certificates and LDAP group membership requires self-service tools for Google employees. I am currently designing and implementing a tool to troubleshoot network access issues for corporate users, leading a small team around it and building the Go-based server software.
Administer our fleet of internal virtual machines running the Ganeti software (open source, built in-house). These range from 2-machine clusters to several racks' worth of virtual machines located in our offices and datacenters. My job was to respond to alerts, perform upgrades, and build automation tools to manage large clusters of virtual machines.
Be oncall for critical Google services with a tight SLA (response in under 5 minutes).
Change management: we have a corporate change management process, which consists of reviewing upcoming changes and making sure they are planned with appropriate thought given to user communication, rollback, escalation paths, etc. I chaired the weekly meetings and made many improvements to the process itself.
Manage the fleet of NetApp fileservers across the company. Google has one of the largest distributed fleets of filers for its corporate needs.
Manage a fleet of proxy servers (Squid and in-house).
Write and deploy application servers (HTTP + RPC) in C++, Go and Python on the standard Google production environment (Borg), with all the relevant SRE practices. Gained internal readability in C++ and Python, and currently working toward it in Go. Project management skills include tracking projects, making sure various services are up to SRE standards by performing a Production Readiness Review (and updating the review process itself), helping development teams reach these standards, leading 2-3 person teams on a project, and communicating with teams all across the company.
Systems design: I wrote several designs that became full-fledged projects. The major one is “Beyond Corp”, described above. I also reviewed dozens of projects and provided relevant feedback.
8-year proven track record of working remotely from the rest of my project teams: from Paris, France with teams in Switzerland, Ireland, and the US East and West Coasts.
A taste for security-oriented architecture and large-scale networking.
Systems and network administrator for a 2000-person biology research campus in the center of Paris. My main tasks were maintaining servers, providing user support, and maintaining network connectivity for the campus. Significant accomplishments: redid the email infrastructure using Postfix and tied it to the internal LDAP servers for user lookup; set up replication between 3 LDAP servers; redid the caching proxy infrastructure; upgraded some relatively high-traffic (for the time) web servers.
IDEALX was one of the first entirely open-source-oriented companies in France, with a dedication to releasing most of its client projects as open source. I worked with clients on various projects; the main ones consisted of performing systems administration for Tiscali (massive web hosting provider, part of the ISP), and doing Perl-SOAP development for Cegetel (telecommunications provider and ISP).
Very good knowledge of Unix-like systems administration: Linux, FreeBSD; shell scripting, Python, Go; fair, but limited knowledge of C++.
System automation, change management, version control systems (Perforce), and build management tools (mostly internal to Google). Make heavy use of automation and unit testing to ensure code and data perform as intended.