aum Posted March 22, 2021

What

This is an introductory course in Distributed Systems. Distributed systems is the study of how to build a computer system where the state of the program is divided over more than one machine (or "node").

This course is in active development. At the moment, it consists of a series of short videos. The intention is to create a complete set of video lectures and then add additional content (such as more projects). Sadly, progress is slow due to my other commitments getting in the way...

Why?

Because I love teaching and I know a lot about distributed systems. So why not? Also, I want to learn more about the art of teaching online. Designing and building a short course seemed like a reasonable way of learning this.

How should I use this?

Watch the videos and enjoy. You will learn more effectively if you are actively working on designing/building/maintaining a distributed system while you study -- so start making something! (Examples of what you could work on: build a multi-user chat system, build a data analysis job using Hadoop, or attempt to understand Paxos and build your own implementation. Note that Paxos is known for being hard to understand...)

If you are already taking a college-level class on distributed systems, then watch these videos before or after your lectures to review the material.

Check out the class project chat servers, and try them out. If folks start using them, they may become a great way to get questions answered. (Or, they will become spam honeypots. We'll see.)

If you are an instructor and want to use these videos as a part of your class -- feel free to link to this site and send your students here to watch. Please do not make your own copies of the videos or slides, or change them; I like knowing how many people are using and enjoying the videos, and being able to fix and improve them at will.
If you want to do something that involves copying this content, send me an email -- I'm happy to listen to your ideas.

Topics

This course covers the following topics:

Introduction
What is a distributed system? [video, slides]
Why build a distributed system? [video, slides]
How to learn distributed systems. [video, slides]
How systems fail
What could go wrong? [video, slides]
Types of failures [video, slides]
Byzantine Fault Tolerance [video, slides]
How to express your goals: SLIs, SLOs, and SLAs [video, slides]
Class Project: building a multiuser chat server [video, slides]
How to get agreement -- consensus
Paxos Simplified [video, slides]
How Counterstrike Works (a.k.a. Time in Distributed Systems) [video, slides]
Blockchain Consensus
Introduction to Blockchain Consensus [video]
What is a blockchain? [video, slides]
Bitcoin blockchain consensus [video, slides]
Should you use Bitcoin blockchain consensus? [video, slides]
Distributed System Design Example (Unique ID) [video, no slides -- I've been playing with After Effects]
The CAP Theorem [video, no slides -- I've been playing with After Effects]

Potential future topics include:

Distributed storage systems
How to combine unreliable components to make a more reliable system
How nodes communicate -- RPCs
How nodes find each other -- naming
How to persist data -- distributed storage
How to secure your system
How to operate your distributed system -- the art of SRE

Want to watch them all? As I create videos, I'm adding them to this playlist.

Class Project

It's hard to learn any systems topic without building something. For this class I've created a bare-bones multiuser chat server which you can use as a foundation to build a more interesting distributed system yourself. The source code can be found on GitHub here. You can also try it out (and use it to ask questions of your fellow classmates!).
In a misguided attempt to avoid webcrawlers and spam, I'm not going to link to the demo servers here. Instead, you can figure it out yourself: distributedchat dot appspot dot com; and www dot distributedsystemscourse dot com slash dschat.

Learning More

The most common question I get is "where can I learn more?" Some resources you can explore include:

Tanenbaum and van Steen have written a textbook on the topic. I've not read it, but it may be a good resource. Read the reviews to see if it will work for your needs.

Lindsey Kuper from UCSC is currently (as of spring 2020, during the COVID-19 outbreak) streaming her distributed systems class on YouTube. It covers many of the theoretical aspects of distributed systems. Check it out!

Robert Morris from MIT has also posted lectures from his distributed systems class on YouTube. Check it out too!

If you want to learn about the most cutting-edge research in distributed systems, the papers published at the OSDI and SOSP conferences (amongst others) are a great place to start.

If you are interested in the practical realities of building and maintaining distributed systems, Google has published some super valuable books on site reliability engineering and building secure systems.

Questions/Feedback

This class is very much a work in progress (can't you tell?). I welcome any and all questions or constructive feedback, as I want to make it better! Either leave comments on the videos, or email me at [email protected].

About Me

Hi! I'm Chris Colohan. I went to grad school and got a PhD at Carnegie Mellon, then I spent 10 years working at Google building distributed systems (and managing teams that build distributed systems). Systems which I've contributed to include SUIF, MapReduce, TCMalloc, Percolator, Caffeine, Borg, Omega, and Piper. You can find random other information about me here.