Player FM - Internet Radio Done Right
Checked 1y ago
Added three years ago
Content provided by Tharun Shiv. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Tharun Shiv or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://ppacc.player.fm/legal.

Reason why you lag behind the team | Become a better SRE Site Reliability Engineer | Engineering | Backend | Tharun Shiv | Developer Tharun

3:47
 

Link to the article: https://dev.to/developertharun/8-ways-to-become-a-better-sre-right-now-8-non-technical-characteristics-to-have-3n4p

Link to the YouTube video: https://youtu.be/2drsyhJzcao

Subscribe to the podcast if you like it!

Thanks for listening.


50 episodes


All episodes

Link to the article: https://dev.to/developertharun/8-ways-to-become-a-better-sre-right-now-8-non-technical-characteristics-to-have-3n4p
Link to the YouTube video: https://youtu.be/2drsyhJzcao
Subscribe to the podcast if you like it! Thanks for listening.
Thank you for listening to my podcast. Follow my podcast if you find it helpful. Check out my other episodes. I talk about programming & software engineering.
YouTube: https://youtube.com/c/developerTharun
Blog articles: https://tharunshiv.com
Instagram: @developerTharun
Dev.to: https://dev.to/developertharun
Udemy: https://www.udemy.com/user/tharun-shiv/
LinkedIn: https://linkedin.com/in/tharunshiv…
 
 
Subscribe to the podcast to get the latest episodes.

1. SRE is all about the right mindset
a. No blame game
b. Thirst to solve

As SREs, we deal with multiple components and are a bridge between the users and the application. Even when the application is well written, a bigger responsibility falls on the SRE to keep the application and the services it uses up and running. In this process, there may be situations where an SRE makes a mistake that causes a disruption or even an outage. When this happens, the first response shouldn't be to blame anyone for the outage. Instead, do the following:

i. Fix the issue
ii. Write an RCA (Root Cause Analysis) that explains why the issue occurred in the first place; names can be kept anonymous
iii. Mention the first aid and the fix for the issue
iv. Discuss how the issue can be prevented next time
v. Set an ETA for the fix

Another aspect is having the right mindset to solve problems. As an SRE, you are responsible for optimizing the infrastructure, fixing issues, building automation tools and monitoring tools, and more, all of which requires a lot of problem-solving skill. Unless you have the thirst to solve problems, you will only feel more stressed or, even worse, cause issues.

2. Communication
a. Overcommunication is not a problem
b. Be kind and show empathy

Are you performing a production activity, or even a stage change, that could affect other teams? Have you made progress on the project you are working on? Make sure to always keep the necessary stakeholders in sync. Write emails and send Slack messages well in advance of the production activity, just before it, and after it. It might sound like over-communication, but trust me: as the company scales, you need to keep everyone relevant to the component you are working on in sync. This way, if they have to take any action on their side, they will, and if they face any issues after the activity, they'll know who the right person to contact is.

Another important characteristic to have as a human being is to be kind and show empathy. This applies to all levels of engineering, on either side of the conversation, period. Whether someone asks a silly question, makes a mistake, or behaves rudely with you, you should never mirror that behavior.

3. Stay synced with the team
a. Do not miss team meetings
b. Prevent duplication of work
c. Do not compete, but contribute

In this work from home (WFH) period, the only time you have an opportunity to speak to your teammates is during a team meeting. This is special because you get to stay synced with your team on what everyone is working on, whether they are blocked on any tasks, and how you can contribute to their tasks. You also use this opportunity to convey what you are working on and get help if necessary. This also prevents duplication of work.

4. Shadow teammates on tasks and issues

The best way to learn is by doing it hands-on, and the best way to begin is by watching how it is done. I also believe that the best way to retain what you have learned is to perform it repeatedly. This includes watching your teammates perform the activities. Having several pairs of eyes on an activity also helps ensure that it is done without mistakes.

5. No spoon-feeding, do homework

Do not expect every detail to be taught by your teammates and seniors. Read the documentation, watch tutorials, read engineering blogs, practice on your own, and suggest improvements. Even a well-built system will have much more efficient solutions that you can propose…
 
 
Link to the complete episode: https://anchor.fm/dashboard/episode/e1cjm7b
Hey there! Follow the podcast if you like the episode. This is Tharun. In the Developer Tharun Podcast, I speak about software engineering. Thank you for listening.
In this episode:
Ways in which you can secure your Vault server
HashiCorp Vault is a secrets management engine
And more...
Thank you for listening to my podcast. Follow my podcast if you find it helpful. Check out my other episodes. I talk about programming & software engineering.
YouTube: https://youtube.com/c/developerTharun
Blog articles: https://tharunshiv.com
Instagram: @developerTharun
Dev.to: https://dev.to/developertharun
Udemy: https://www.udemy.com/user/tharun-shiv/
LinkedIn: https://linkedin.com/in/tharunshiv…
 
Site reliability engineering

Site Reliability Engineering, popularly referred to as SRE, is a role in Computer Science Engineering whose main purpose is to provision, maintain, monitor, and manage the infrastructure in order to provide maximum application uptime and reliability. SRE is an emerging role, but the tasks an SRE performs have existed ever since the first application was developed. The scope of software developers ends once they have written the code for the application. Everything else is the duty of an SRE: setting up the infrastructure, the various services that run on it, and the required network connectivity, providing a platform for the application to run on, and making sure every part of the application is up and running reliably 24x7. In fact, we can consider Site Reliability Engineers the strong bridge between the users and a reliable application.

To explain the different responsibilities of an SRE, I have divided them into 4 categories. I have always seen SRE this way, and definitely not as some ad-hoc process. The four categories in which I would classify the tasks of a Site Reliability Engineer are:

Create
Monitor
Manage
Destroy

Let's dive deep into each one of them.

Create

1. Provision virtual machines / PXE bare-metal servers

SREs are responsible for provisioning virtual machines with the requested resources in terms of CPU, memory, disks, network configuration, and operating system. They are also responsible for being rack aware during provisioning. Example operating systems include Linux (Ubuntu, CentOS) and Windows.

2. Set up services

Example technologies include NGINX, Apache, RabbitMQ, Kafka, Hadoop, Traefik, MySQL, PostgreSQL, Aerospike, MongoDB, Redis, MinIO, Kubernetes, Apache Mesos, Marathon, MariaDB, Galera.

3. Optimize the infrastructure

Since several components and services are used in the infrastructure, there is scope for improvement in terms of performance, efficiency, and security. The SRE optimizes the components by keeping them up to date, choosing the right service for the right job, and patching the servers.

4. Write monitoring scripts

When SREs maintain an infrastructure of any size, they never underestimate any component of it, and they write monitoring scripts to track each component and its metrics. This makes it possible to get real-time alerts when any component malfunctions and also gives a better view of the infrastructure. The SRE uses programming languages like Bash, Python, Golang, and Perl, and tools like daemon processes, Riemann, InfluxDB, OpenTSDB, Kafka, Grafana, Prometheus, and APIs to monitor the infrastructure.

5. Write automation scripts

If a task has more than 10 steps and is likely to be performed more than once, the SRE never hesitates to automate it. This saves time and also prevents human error. The SRE uses programming languages like Bash, Python, Golang, and Perl, and tools like Ansible, to automate the tasks.

6. Manage users on the machines…
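To make the monitoring-script idea above (item 4) a bit more concrete, here is a minimal sketch in Python, one of the languages mentioned there. It is only an illustration and not taken from the episode: the names `CheckResult`, `check_disk`, and `format_alert` and the 90% threshold are hypothetical, and it prints a status line instead of pushing to a real tool such as Prometheus or Riemann. It uses only `shutil.disk_usage` from the standard library.

```python
import shutil
from dataclasses import dataclass


@dataclass
class CheckResult:
    """Outcome of one monitoring check (hypothetical structure for this sketch)."""
    name: str
    value: float      # measured value, in percent
    threshold: float  # the check alerts when value exceeds this

    @property
    def alerting(self) -> bool:
        return self.value > self.threshold


def disk_usage_percent(path: str = "/") -> float:
    """Return the percentage of space used on the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def check_disk(path: str = "/", threshold: float = 90.0) -> CheckResult:
    """Measure disk usage for `path` and compare it with an assumed threshold."""
    return CheckResult(
        name=f"disk:{path}",
        value=disk_usage_percent(path),
        threshold=threshold,
    )


def format_alert(result: CheckResult) -> str:
    """Render a one-line status message; a real script might send this to a chat or pager."""
    status = "ALERT" if result.alerting else "OK"
    return f"[{status}] {result.name} at {result.value:.1f}% (threshold {result.threshold:.1f}%)"


if __name__ == "__main__":
    print(format_alert(check_disk("/")))
```

A script like this could be run on a schedule (for example from cron) on each machine, with further checks for memory, services, or ports added in the same shape.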
 
 
 