Site Reliability Engineer
Job Details
About the Company
ALLSTARSIT is headquartered in San Francisco, US, with operational hubs across Europe, Asia, and LATAM, and employs more than 1,000 professionals. Operating in more than 20 countries, the company provides skilled specialists across a range of verticals, including AI, cybersecurity, healthcare, fintech, telecom, and media.
About the Project
Univeris is the leader in enterprise-class retail wealth management solutions for Canada's financial services industry. Univeris focuses on building the best technology to help its clients grow their businesses and gain a superior competitive advantage.
The Univeris platform, which supports mutual funds, segregated funds, GICs, cash and individual life and health insurance, is the leading retail wealth management solution for wealth firms across Canada. Univeris is the definitive, enterprise-level and single system for wealth management through its integrated management of back office operations, retail product distribution and compliance requirements, and a front office practice management system for advisors.
Required skills:
- University degree in computer science or engineering or equivalent practical experience
- Advanced English level
- Track record of configuring, monitoring and maintaining complex, business-critical applications through SaaS using the public cloud (preferably GCP) and the private cloud
- Experience with installing, monitoring and troubleshooting APIs
- Process automation using scripting (PowerShell)
- Ability to read Java code for the purpose of using application logs as part of application troubleshooting
- Proficiency in working with monitoring tools, preferably New Relic for monitoring and troubleshooting infrastructure and applications
- Experience with workload automation and job scheduling solutions, preferably Flux
- Advanced technical skills with Windows, MS SQL Server and PostgreSQL
- Database management, optimization and troubleshooting skills with emphasis on MS SQL Server
- Good understanding of cloud infrastructure
- Demonstrated ability to implement new technologies for solving business and technical problems
- Excellent interpersonal and communication skills (oral and written) and comfortable presenting to and interacting with senior managers and staff
- Excellent critical thinking, planning skills and ability to work independently and effectively in a dynamic environment
- Ability to understand, follow and update technical documentation
- Web Services knowledge is an asset.
Scope of work:
As a Site Reliability Engineer (SRE), you will work in conjunction with the Infrastructure and DevOps teams to ensure that the Univeris Enterprise Wealth Management System (EWMS) SaaS platform delivers services that are secure, available and performing well for customers 24x7. The client is very focused on service quality and the role of the SRE is to own that quality and to ensure that Univeris customers have a seamless experience. They seek to eliminate manual and repetitive operations tasks at every opportunity by assessing and implementing third-party tools for environment monitoring, business process management, etc. They value technical aptitude, innovative thinking and great learning ability.
- SaaS Automation and Operations
  - SaaS services setup activities such as software configurations, monitoring tools and jobs setup and production readiness reviews
  - Evaluate and improve the SaaS operation by implementing automation through scripting or by leveraging third-party tools
  - Develop and manage service health dashboards
  - Troubleshoot production incidents and drive workaround deployment and issue resolution with our Product Development, DevOps and Infrastructure teams
  - Receive, analyze and act on alerts related to application and API issues by following documentation and internal processes and by employing your own skills and critical thinking
  - Receive, review and provide an initial response to alerts received from the Security Operations Center following internal processes
- Database Management and Maintenance
  - Maintain operability of existing database jobs using the MS SQL Server Agent
  - Troubleshoot database issues (errors, performance); recommend, develop and implement solutions
  - Install, monitor and maintain database jobs
  - Install and execute existing database maintenance packages (indexes, fragmentation, etc.)
  - Install and execute packages for database size management (archive, delete data)