Syllabus
Course description
In recent years, large distributed systems have taken a prominent role not just in scientific inquiry, but also in our daily lives. When we perform a search on Google, stream content from Netflix, place an order on Amazon, or catch up on the latest comings-and-goings on Facebook, our seemingly minute requests are processed by complex systems that sometimes include hundreds of thousands of computers, connected by both local and wide area networks.
Recent papers in the field of Distributed Systems have described several solutions (such as BigTable, MapReduce, Spanner, Raft, Dynamo, Cassandra) for managing large-scale data and computation. However, building and using these systems poses a number of more fundamental challenges: How do we keep the system operating correctly even when individual machines fail? How do we ensure that all the machines have a consistent view of the system’s state? (and how do we ensure this in the presence of failures?) How can we determine the order of events in a system where we can’t assume a single global clock?
Many of these fundamental problems were identified and solved over the course of several decades, starting in the 1970s. In this course, we will engage in reading and discussing seminal work in Distributed Systems from the last 35 years to (1) identify the fundamental issues raised in this earlier work, (2) relate those issues to current research problems, and (3) evaluate and compare the solutions proposed in both early and recent work. During this course, students will also implement a distributed system that requires them to manage distributed resources and evaluate whether the resulting system has certain properties, such as reliability and scalability.
At the end of the quarter, students will be able to:
Identify the research questions posed in a scholarly paper and the solutions proposed in that paper.
Identify the main contributions and conclusions in a scholarly paper, and determine whether they are well supported by evaluating and criticizing the arguments, proofs, or experimental results in that paper.
Compare and contrast different distributed systems for managing large-scale data and computation.
Evaluate whether an implementation of a distributed system is reliable, fault-tolerant, scalable, and/or highly available.
A B+ or higher in CMSC 23300 (Networks and Distributed Systems) is a prerequisite for this course. Students can petition to have this requirement waived, as long as they have taken at least one other 200-level CS systems course.
Course organization
This course is divided into three components:
Reading and Discussion of Primary Sources: Several papers will be assigned each week, to be discussed on both Tuesday and Thursday.
Programming Assignments: Two short programming assignments will be given throughout the quarter.
Project and Paper: Students will have to implement a distributed system drawing upon the seminal work covered in the class discussions. Based on their projects, students will have to write a final paper evaluating the features and performance of their project.
The discussion component is described in more detail below.
Paper discussion
Please read these additional details on our Spring 2020 class.
Every week, we will discuss several papers in class (see Calendar and Reading List for the list of papers). The papers for a given week will have a common theme, but the papers will be split between the Tuesday and Thursday classes.
At the beginning of the quarter, students will be divided into three groups: A, B, and C. Although the composition of the groups will remain fixed throughout the quarter, the role that each group will take during a discussion section will rotate every week. There are three roles:
The Questioners: This group is responsible for preparing a list of 4–5 discussion questions about the papers to be discussed in class. For a given week, The Questioners must prepare their questions during the preceding week, and send them to the rest of the class by 5pm Saturday. This means that The Questioners must read all the papers for their assigned week several days in advance of the actual discussion sessions. We suggest aiming to read the papers by the end of the day on Thursday, to allow two days to discuss possible questions.
The Answerers: The 2020 notes explain how we will handle discussion via a mix of asynchronous and synchronous methods. During class discussion, this group takes the lead in answering the questions posed by The Questioners. In practice, this means that, whenever there is silence in the discussion, everyone looks at The Answerers to keep the discussion moving.
The Observers: During a discussion, this group will take notes on a shared document. These notes are not meant to be a transcription of what is being said in the discussion; they should capture the major take-away points of the discussion, as well as any issues The Observers feel should be discussed in more depth. The Observers can also search for additional resources, or answers to unresolved questions, on the Internet during the discussion itself.
These roles do not preclude anyone in the class from participating in the discussion. A member of The Observers can jump in when a question is posed, and a member of The Answerers can pose a new question on the fly.
Course Schedule
Grading
The final grade will be divided as follows:
25% programming assignments (broken down into 10%, 15%)
35% participation in discussions, broken down into:
15%: In-class and Piazza participation
10%: Participation in The Observers
See below for a detailed rubric for each of these components.
20% project
20% final paper
There will be no midterms or final exam.
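As a rough illustration of how the weights above combine, here is a minimal sketch in Python. It is not an official grade calculator: the function and dictionary names are hypothetical, it assumes every component is scored on a 0-100 scale, it assumes the first programming assignment carries the 10% weight and the second the 15% weight, and it treats participation as a single already-combined score.

```python
# Illustrative only: weights (in percent) taken from the breakdown above.
# Assumption: assignment 1 is worth 10% and assignment 2 is worth 15%.
# Assumption: "participation" is one combined 0-100 score.
WEIGHTS = {
    "assignment_1": 10,
    "assignment_2": 15,
    "participation": 35,
    "project": 20,
    "final_paper": 20,
}


def final_score(scores):
    """Return the weighted final score (0-100) from component scores (0-100)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    # Integer weights sum to 100, so divide once at the end.
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


example = {
    "assignment_1": 90,
    "assignment_2": 80,
    "participation": 100,
    "project": 85,
    "final_paper": 95,
}
print(final_score(example))  # 92.0
```

In this example, the student's weighted score is 9 + 12 + 35 + 17 + 19 = 92 out of 100.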
The “in-class and Piazza participation” grade is an individual grade, scored out of 10, considering both class and Piazza discussions, but with allowance made for those who cannot easily participate in synchronous class discussions:
10: Student participates consistently in all or most class discussions, even when part of The Observers. Student is consistently active on Piazza, not just contributing good questions when the student’s group is The Questioners but also writing/answering posts outside his/her group.
9: Student participates consistently in all or most class discussions. Student consistently contributes good questions, but is only active on Piazza when his/her group is The Questioners.
8: Student has actively participated in class and Piazza discussions, but participation has not been consistent (e.g., very active in one discussion, completely silent in another).
7: Student has participated in class discussions and contributed questions or written/answered posts, but falls below expectations.
0: Student has not participated in any class discussions or written any meaningful posts or comments on Piazza.
The “Piazza participation” grade is an individual grade, mostly based on participation when the student is part of The Questioners. It is scored out of 10:
The “Participation in The Observers” grade is a group grade. A grade is assigned to the entire group whenever their role is The Observers, and the final grade is the average of those grades. Each student in the group gets the same grade. It is scored out of 10:
10: Discussion log is detailed and well-written, and the group has supplemented it with external references (not limited to the suggested reading for that week) and/or provided answers to questions that were left unanswered during the discussion.
9: Discussion log is detailed, divided into concrete sections, and well-written. The group has gone beyond just presenting their raw notes from the discussion, and has put some effort into polishing up the notes. Someone who has not attended the discussion or even read the paper would get the gist of what was discussed that week.
8: Discussion log accurately reflects the structure and content of the discussion, but it is closer to a collection of notes than a polished account of the discussion. Someone who attended the discussion would find it useful to recall what was discussed, but someone who did not could find it hard to parse.
7: The discussion log reflects some, but not all, of the discussion. It lacks structure and is composed of a collection of unpolished notes.
Any discussion logs worse than a 7 will receive a 0.
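The averaging described for the Observers grade can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, and it assumes each discussion log receives exactly one of the rubric values listed above (10, 9, 8, 7, or 0).

```python
# Illustrative only: rubric values for a single discussion log.
# Assumption: no intermediate scores are given; anything worse than a 7 is a 0.
VALID_LOG_GRADES = {10, 9, 8, 7, 0}


def observers_grade(log_grades):
    """Average the per-log grades a group earned while acting as The Observers."""
    if not log_grades:
        raise ValueError("the group has no graded discussion logs")
    for grade in log_grades:
        if grade not in VALID_LOG_GRADES:
            raise ValueError(f"not a rubric value: {grade}")
    return sum(log_grades) / len(log_grades)


print(observers_grade([10, 9, 8]))  # 9.0
print(observers_grade([9, 0, 9]))   # 6.0
```

The second call shows how a single log that falls below a 7 pulls down the shared grade for everyone in the group.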
Types of grades
Students may take this course for a quality grade (a “letter” grade) or a pass/fail grade. By default, we assume students are taking the class for a quality grade. We will honor all requests to withdraw or take the class pass/fail before the final paper is due.
Note: There is a special grading policy in place for Spring Quarter 2020, in which students majoring or minoring in Computer Science may petition the CS major or minor advisor to allow P/F grading in up to two classes taken during that quarter. If you would like P/F grading for this course, please ask me by the end of week 9, when the petition is due with the major or minor advisor. The CS degrees require that classes in the degree program be completed with a grade of C- or higher. That will be my standard: students taking this course P/F will receive a P if their work would have earned them a quality grade of C- or higher.
Policy on academic honesty
The University of Chicago has a formal policy on academic honesty that you are expected to adhere to.
In brief, academic dishonesty (handing in someone else’s work as your own, taking existing code and not citing its origin, etc.) will not be tolerated in this course. Depending on the severity of the offense, you risk getting a hefty point penalty or being dismissed altogether from the course. All occurrences of academic dishonesty will furthermore be referred to the Dean of Students office, which may impose further penalties, including suspension and expulsion.
Even so, discussing the concepts necessary to complete the programming assignments and the project is certainly allowed (and encouraged). Under no circumstances should you show (or email) another student your code or post your solution to a web-page or social media site. If you have discussed parts of an assignment with someone else, then make sure to say so in your submission (e.g., in a README file or as a comment at the top of your source code file). If you consulted other sources, please make sure you cite these sources.
Finally, if you have any questions regarding what would or would not be considered academic dishonesty in this course, please don’t hesitate to ask the instructor.
Asking questions
We will be using Piazza for class discussion. The system is designed to get you help quickly and efficiently from classmates, the TA, and me. Rather than emailing questions to the teaching staff, I encourage you to post your questions on Piazza.
Find our class page at: https://piazza.com/uchicago/spring2020/cmsc23310/home
All questions regarding assignments or material covered in class must be sent to Piazza, and not directly to the instructors or TAs, as this allows your classmates to join in the discussion and benefit from the replies to your question. If you send a message directly to the instructor or the TAs, you will get a gentle reminder that your question should be asked on Piazza.
Piazza has a mechanism that allows you to ask a private question, which will be seen only by the instructors and teaching assistants. This mechanism should be used only for questions that require revealing part of your solution to an assignment.
Piazza also allows students to post anonymously. Anonymous posts will be ignored (you will also get a gentle reminder asking you to not post anonymously). This is a majors-level course: you are expected to feel comfortable sharing your questions and thoughts with your classmates without hiding behind a veil of anonymity.
Additionally, all course announcements will be made through Piazza. It is your responsibility to check Piazza often to see if there are any announcements. Please note that you can configure your Piazza account to send you e-mail notifications every time there is a new post on Piazza. Just go to your Account Settings, then to Class Settings, click on “Edit Notifications” under CMSC 23310. We encourage you to select either the “Real Time” option (get a notification as soon as there are new posts) or the “Smart Digest” option (get a summary of all the posts sent over the last 1-6 hours – you can select the frequency).
Video recording policy
As the University temporarily transitions to a remote teaching and learning environment, instructors and students have asked for guidance on the recording of course sessions. Instructors have the discretion to record course sessions, except when recording is required to meet the needs of students granted an accommodation by the Office of Student Disability Services. Recordings and transcripts will be made available to students in the relevant course, the instructor, and other necessary University officials. Recordings in which students are personally identifiable will be managed in accordance with the Family Educational Rights and Privacy Act (FERPA).
This time-limited policy has been implemented to effectively deliver a remote education while safeguarding privacy and protecting rights in courses and instructional materials. Below is an acknowledgment for students designed to govern the use of any recordings and provide instructors and students with guidance on the use of instructional materials.
By attending course sessions, students acknowledge that:
They will not: (i) record, share, or disseminate University of Chicago course sessions, videos, transcripts, audio, or chats; (ii) retain such materials after the end of the course; or (iii) use such materials for any purpose other than in connection with participation in the course.
They will not share links to University of Chicago course sessions with any persons not authorized to be in the course session. Sharing course materials with persons authorized to be in the relevant course is permitted. Syllabi, handouts, slides, and other documents may be shared at the discretion of the instructor.
Course recordings, content, and materials may be covered by copyrights held by the University, the instructor, or third parties. Any unauthorized use of such recordings or course materials may violate such copyrights.
Any violation of this policy will be referred to the Area Dean of Students.
Acknowledgements
We gratefully acknowledge the suggestions and feedback provided by Anthony Nicholson, Jacob Matthews, Will Robinson, and Matthew Steffen (all at Google) and Lars Bergstrom (Mozilla) in preparing the reading list for this course.