<html style="--wm-toolbar-height: 67px;"><head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">

<title>Lab 4: Preemptive Multitasking</title>
<link rel="stylesheet" href="Lab%204_%20Preemptive%20Multitasking_files/labs.css" type="text/css">
<script type="text/javascript" src="Lab%204_%20Preemptive%20Multitasking_files/labs.js"></script>

<!-- MIT 6.828 实验文档现代化增强 -->
<link rel="stylesheet" href="labs-modern.css" type="text/css">
<script src="labs-enhance.js"></script>
</head>
<body data-new-gr-c-s-check-loaded="8.933.0" data-gr-ext-installed=""><div class="jump-hdr"><div class="jump-section">Sections &#9663;<div class="jump-drop"><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#Introduction" style="padding-left: 1em; background: rgb(192, 192, 255);">Introduction</a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#Getting-Started" style="padding-left: 2em;">Getting Started</a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#Lab-Requirements" style="padding-left: 2em;">Lab Requirements</a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#Part-A--Multiprocessor-Support-and-Cooperative-Multitasking" style="padding-left: 1em;">Part A: Multiprocessor Support and Cooperative Multitasking</a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#-Multiprocessor-Support-" style="padding-left: 2em;"> Multiprocessor Support </a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#Round-Robin-Scheduling" style="padding-left: 2em;">Round-Robin Scheduling</a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#System-Calls-for-Environment-Creation" style="padding-left: 2em;">System Calls for Environment Creation</a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#Part-B--Copy-on-Write-Fork" style="padding-left: 1em;">Part B: Copy-on-Write Fork</a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#User-level-page-fault-handling" style="padding-left: 2em;">User-level page fault handling</a><a 
href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#Implementing-Copy-on-Write-Fork" style="padding-left: 2em;">Implementing Copy-on-Write Fork</a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#Part-C--Preemptive-Multitasking-and-Inter-Process-communication--IPC-" style="padding-left: 1em;">Part C: Preemptive Multitasking and Inter-Process communication (IPC)</a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#Clock-Interrupts-and-Preemption" style="padding-left: 2em;">Clock Interrupts and Preemption</a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#Inter-Process-communication--IPC-" style="padding-left: 2em;">Inter-Process communication (IPC)</a></div></div><div class="jump-section">Exercises &#9663;<div class="jump-drop"><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#Exercise-1">Exercise 1</a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#Exercise-2">Exercise 2</a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#Exercise-3">Exercise 3</a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#Exercise-4">Exercise 4</a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#Exercise-5">Exercise 5</a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#Exercise-6">Exercise 6</a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#Exercise-7">Exercise 7</a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#Exercise-8">Exercise 8</a><a 
href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#Exercise-9">Exercise 9</a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#Exercise-10">Exercise 10</a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#Exercise-11">Exercise 11</a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#Exercise-12">Exercise 12</a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#Exercise-13">Exercise 13</a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#Exercise-14">Exercise 14</a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labs/lab4/#Exercise-15">Exercise 15</a></div></div><div class="jump-section">References &#9663;<div class="jump-drop"><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/labguide.html">Lab tools guide</a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/readings/i386/toc.htm">80386 manual</a><div>IA32</div><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/readings/ia32/IA32-1.pdf" style="padding-left: 1em;">Basic architecture</a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/readings/ia32/IA32-2A.pdf" style="padding-left: 1em;">Instruction set A-M</a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/readings/ia32/IA32-2B.pdf" style="padding-left: 1em;">Instruction set N-Z</a><a href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/readings/ia32/IA32-3A.pdf" style="padding-left: 1em;">System programming 1</a><a 
href="https://web.archive.org/web/20250420202217/https://pdos.csail.mit.edu/6.828/2018/readings/ia32/IA32-3B.pdf" style="padding-left: 1em;">System programming 2</a></div></div></div><!-- BEGIN WAYBACK TOOLBAR INSERT -->


<h1>Lab 4: Preemptive Multitasking</h1>

<p><b>
Part A due Thursday, October 18, 2018 <br>
Part B due Thursday, October 25, 2018 <br>
Part C due Thursday, November 1, 2018
</b>
</p>

<h2 id="Introduction">Introduction</h2>

<p>
In this lab you will implement preemptive multitasking among multiple
simultaneously active user-mode environments.
</p>

<p>
In part A you will
add multiprocessor support to JOS,
implement round-robin scheduling, and add basic environment
management system calls (calls that create and destroy environments,
and allocate/map memory).
</p>

<p>
In part B, you will implement a Unix-like <code>fork()</code>,
which allows a user-mode environment to create copies of
itself.
</p>

<p>
Finally, in part C you will add support for inter-process
communication (IPC), allowing different user-mode environments to
communicate and synchronize with each other explicitly.  You will also
add support for hardware clock interrupts and preemption.
</p>

<h3 id="Getting-Started">Getting Started</h3>

<p>
Use Git to commit your Lab 3 source, fetch the latest version of the course
repository, and then create a local branch called <tt>lab4</tt> based on our
lab4 branch, <tt>origin/lab4</tt>:
</p>
<pre>athena% <kbd>cd ~/6.828/lab</kbd>
athena% <kbd>add git</kbd>
athena% <kbd>git pull</kbd>
Already up-to-date.
athena% <kbd>git checkout -b lab4 origin/lab4</kbd>
Branch lab4 set up to track remote branch refs/remotes/origin/lab4.
Switched to a new branch "lab4"
athena% <kbd>git merge lab3</kbd>
Merge made by recursive.
...
athena%
</pre>

Lab 4 contains a number of new source files, some of which you should browse
before you start:
<table align="center">
<tbody><tr><td><tt>kern/cpu.h</tt></td>
    <td>Kernel-private definitions for multiprocessor support</td></tr>
<tr><td><tt>kern/mpconfig.c</tt></td>
    <td>Code to read the multiprocessor configuration</td></tr>
<tr><td><tt>kern/lapic.c</tt></td>
    <td>Kernel code driving the local APIC unit in each processor</td></tr>
<tr><td><tt>kern/mpentry.S</tt></td>
    <td>Assembly-language entry code for non-boot CPUs</td></tr>
<tr><td><tt>kern/spinlock.h</tt></td>
    <td>Kernel-private definitions for spin locks, including
	the big kernel lock</td></tr>
<tr><td><tt>kern/spinlock.c</tt></td>
    <td>Kernel code implementing spin locks</td></tr>
<tr><td><tt>kern/sched.c</tt></td>
    <td>Code skeleton of the scheduler that you are about to implement</td></tr>
</tbody></table>

<h3 id="Lab-Requirements">Lab Requirements</h3>

<p>
This lab is divided into three parts, A, B, and C.
We have allocated one week in the schedule for each part.
</p>

<p>
As before,
you will need to do all of the regular exercises described in the lab
and <i>at least one</i> challenge problem.
(You do not need to do one challenge problem per part,
just one for the whole lab.)
Additionally, you will need to write up a brief
description of the challenge problem that you implemented.
If you implement more than one challenge problem,
you only need to describe one of them in the write-up,
though of course you are welcome to do more.
Place the write-up in a file called <tt>answers-lab4.txt</tt>
in the top level of your <tt>lab</tt> directory
before handing in your work.
</p>

<h2 id="Part-A--Multiprocessor-Support-and-Cooperative-Multitasking">Part A: Multiprocessor Support and Cooperative Multitasking</h2>

<p>
In the first part of this lab,
you will extend JOS to run on a multiprocessor system,
and then implement some new JOS kernel system calls
to allow user-level environments to create
additional new environments.
You will also implement <i>cooperative</i> round-robin scheduling,
allowing the kernel to switch from one environment to another
when the current environment voluntarily relinquishes the CPU (or exits).
Later in part C you will implement <i>preemptive</i> scheduling,
which allows the kernel to re-take control of the CPU from an environment
after a certain time has passed even if the environment does not cooperate.
</p>

<h3 id="-Multiprocessor-Support-"> Multiprocessor Support </h3>

<p>
We are going to make JOS support "symmetric multiprocessing" (SMP), a
multiprocessor model in which all CPUs have equivalent access to
system resources such as memory and I/O buses.  While all CPUs
are functionally identical in SMP, during the boot process they
can be classified into two types: the bootstrap processor (BSP) is
responsible for initializing the system and for booting the operating
system; and the application processors (APs) are activated by the BSP
only after the operating system is up and running. Which processor is
the BSP is determined by the hardware and the BIOS. Up to this point,
all your existing JOS code has been running on the BSP.
</p>

<p>
In an SMP system, each CPU has an accompanying local APIC (LAPIC) unit.
The LAPIC units are responsible for delivering interrupts throughout
the system. The LAPIC also provides its connected CPU with a unique
identifier. In this lab, we make use of the following basic
functionality of the LAPIC unit (in <tt>kern/lapic.c</tt>):
</p>
<ul>
<li>Reading the LAPIC identifier (APIC ID) to tell which CPU our code is
currently running on (see <code>cpunum()</code>). </li>

<li>Sending the <code>STARTUP</code> interprocessor interrupt (IPI) from
the BSP to the APs to bring up other CPUs (see
<code>lapic_startap()</code>).</li>

<li>In part C, we program the LAPIC's built-in timer to trigger clock
interrupts to support preemptive multitasking (see
<code>lapic_init()</code>).</li>
</ul>

<p>
A processor accesses its LAPIC using memory-mapped I/O (MMIO).
In MMIO, a portion of <i>physical</i> memory is hardwired to the
registers of some I/O devices, so the same load/store instructions
typically used to access memory can be used to access device
registers.  You've already seen one I/O hole at physical address
<tt>0xA0000</tt> (we use this to write to the VGA display buffer).
The LAPIC lives in a hole starting at physical address
<tt>0xFE000000</tt> (32MB short of 4GB), so it's too high for us to
access using our usual direct map at KERNBASE.  The JOS virtual memory
map leaves a 4MB gap at <tt>MMIOBASE</tt> so we have a place to map
devices like this.  Since later labs introduce more MMIO regions,
you'll write a simple function to allocate space from this region and
map device memory to it.
</p>
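<p>
The allocation half of that function can be pictured as a bump allocator over the
MMIO window. The following is a minimal user-space sketch, not the kernel code
itself: the <code>MMIOBASE</code> and <code>MMIOLIM</code> values are assumptions
(check <tt>inc/memlayout.h</tt>), <code>mmio_reserve</code> is a hypothetical
helper name, and a real kernel would also have to install page-table mappings
for the reserved range.
</p>

```c
#include <stdint.h>

#define PGSIZE   4096u          /* x86 page size */
#define MMIOBASE 0xEF800000u    /* assumed value; check inc/memlayout.h */
#define MMIOLIM  0xEFC00000u    /* assumed: MMIOBASE plus the 4MB gap */

/* Round n up to the next multiple of PGSIZE (n is assumed to be small
 * enough that the addition does not wrap). */
static uint32_t round_up_page(uint32_t n)
{
    return (n + PGSIZE - 1) & ~(PGSIZE - 1);
}

/* Next unreserved virtual address inside the MMIO window. */
static uint32_t mmio_next = MMIOBASE;

/*
 * Hypothetical helper: reserve `size` bytes of the window and return the
 * base of the reservation, or 0 if the request does not fit.  Only the
 * bookkeeping is modeled here; no page tables are touched.
 */
static uint32_t mmio_reserve(uint32_t size)
{
    uint32_t len = round_up_page(size);

    if (len > MMIOLIM - mmio_next)
        return 0;               /* would run past the end of the window */

    uint32_t base = mmio_next;
    mmio_next += len;           /* bump the cursor for the next caller */
    return base;
}
```

<p>
Because reservations are rounded up to whole pages, two devices never share a
page, and a request that would overflow the window is rejected rather than
silently overlapping whatever lies above it.
</p>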

<div class="required"><div id="Exercise-1" style="position: relative; top: -5em;"></div>
<p><span class="header">Exercise 1.</span>
  Implement <code>mmio_map_region</code> in <tt>kern/pmap.c</tt>.  To
  see how this is used, look at the beginning of
  <code>lapic_init</code> in <tt>kern/lapic.c</tt>.  You'll have to do
  the next exercise, too, before the tests for
  <code>mmio_map_region</code> will run.
</p></div>

<h4>Application Processor Bootstrap</h4>

<p>
Before booting up APs, the BSP should first collect information
about the multiprocessor system, such as the total number of
CPUs, their APIC IDs and the MMIO address of the LAPIC unit.
The <code>mp_init()</code> function in <tt>kern/mpconfig.c</tt>
retrieves this information by reading the MP configuration
table that resides in the BIOS's region of memory.
</p>

<p>
The <code>boot_aps()</code> function (in <tt>kern/init.c</tt>) drives
the AP bootstrap process.  APs start in real mode, much like how the
bootloader started in <tt>boot/boot.S</tt>, so <code>boot_aps()</code>
copies the AP entry code (<tt>kern/mpentry.S</tt>) to a memory
location that is addressable in real mode.  Unlike with the
bootloader, we have some control over where the AP will start
executing code; we copy the entry code to <tt>0x7000</tt>
(<code>MPENTRY_PADDR</code>), but any unused, page-aligned
physical address below 640KB would work.
</p>

<p>
After that, <code>boot_aps()</code> activates APs one after another, by
sending <code>STARTUP</code> IPIs to the LAPIC unit of the corresponding
AP, along with an initial <code>CS:IP</code> address at which the AP
should start running its entry code (<code>MPENTRY_PADDR</code> in our
case). The entry code in <tt>kern/mpentry.S</tt> is quite similar to
that of <tt>boot/boot.S</tt>. After some brief setup, it puts the AP
into protected mode with paging enabled, and then calls the C setup
routine <code>mp_main()</code> (also in <tt>kern/init.c</tt>).
<code>boot_aps()</code> waits for the AP to signal a
<code>CPU_STARTED</code> flag in the <code>cpu_status</code> field of
its <code>struct CpuInfo</code> before going on to wake up the next one.
</p>
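<p>
To see why the entry address must be page-aligned, it may help to work through
how a <code>STARTUP</code> IPI encodes it. The sketch below assumes the usual
x86 behavior, in which the IPI carries an 8-bit vector and the AP starts in real
mode at physical address <code>vector &lt;&lt; 12</code>. The helper names are
hypothetical and exist only for this illustration.
</p>

```c
#include <stdint.h>

#define MPENTRY_PADDR 0x7000u   /* entry address used in the text above */

/*
 * Return the STARTUP IPI vector for `paddr`, or -1 if the address cannot
 * be encoded (not page-aligned, or at or above 1MB).
 */
static int sipi_vector(uint32_t paddr)
{
    if (paddr & 0xFFFu)
        return -1;              /* low 12 bits cannot be expressed */
    if (paddr >= 0x100000u)
        return -1;              /* vector is only 8 bits wide */
    return (int)(paddr >> 12);
}

/* Real-mode CS the AP starts with for a given vector (IP starts at 0). */
static uint16_t sipi_cs(int vector)
{
    return (uint16_t)(vector << 8);
}

/* Real-mode address translation: linear = CS * 16 + IP. */
static uint32_t real_mode_linear(uint16_t cs, uint16_t ip)
{
    return ((uint32_t)cs << 4) + ip;
}
```

<p>
For <code>MPENTRY_PADDR</code> the vector is 7, so the AP begins at
<code>CS:IP = 0700:0000</code>, which translates back to <tt>0x7000</tt>.
The hardware encoding only requires an address below 1MB; the stricter
640KB bound in the text keeps the code in ordinary low memory.
</p>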

<div class="required"><div id="Exercise-2" style="position: relative; top: -5em;"></div>
<p><span class="header">Exercise 2.</span>
  Read <code>boot_aps()</code> and <code>mp_main()</code> in
  <tt>kern/init.c</tt>, and the assembly code in
  <tt>kern/mpentry.S</tt>.  Make sure you understand the control flow
  transfer during the bootstrap of APs. Then modify your implementation
  of <code>page_init()</code> in <tt>kern/pmap.c</tt> to avoid adding
  the page at <code>MPENTRY_PADDR</code> to the free list, so that we
  can safely copy and run AP bootstrap code at that physical address.
  Your code should pass the updated <code>check_page_free_list()</code>
  test (but might fail the updated <code>check_kern_pgdir()</code>
  test, which we will fix soon).
</p></div>

<div class="question">
<p><span class="header">Question</span></p>
<ol>

  <li>
  Compare <tt>kern/mpentry.S</tt> side by side with
  <tt>boot/boot.S</tt>.  Bearing in mind that <tt>kern/mpentry.S</tt>
  is compiled and linked to run above <code>KERNBASE</code> just like
  everything else in the kernel, what is the purpose of macro
  <code>MPBOOTPHYS</code>? Why is it
  necessary in <tt>kern/mpentry.S</tt> but not in
  <tt>boot/boot.S</tt>? In other words, what could go wrong if it
  were omitted in <tt>kern/mpentry.S</tt>?
  <br>
  Hint: recall the differences between the link address and the
  load address that we have discussed in Lab 1.
  </li>

</ol>
</div>

<h4>Per-CPU State and Initialization</h4>

<p>
When writing a multiprocessor OS, it is important to distinguish
between per-CPU state that is private to each processor, and global
state that the whole system shares.  <tt>kern/cpu.h</tt> defines most
of the per-CPU state, including <code>struct CpuInfo</code>, which stores
per-CPU variables.  <code>cpunum()</code> always returns the ID of the
CPU that calls it, which can be used as an index into arrays like
<code>cpus</code>.  Alternatively, the macro <code>thiscpu</code> is
shorthand for the current CPU's <code>struct CpuInfo</code>.
</p>

<p>
Here is the per-CPU state you should be aware of:
</p>

<ul>
<li>
<p>
<b>Per-CPU kernel stack</b>.
<br>
Because multiple CPUs can trap into the kernel simultaneously,
we need a separate kernel stack for each processor to prevent them from
interfering with each other's execution. The array
<code>percpu_kstacks[NCPU][KSTKSIZE]</code> reserves space for NCPU's
worth of kernel stacks.
</p>

<p>
In Lab 2, you mapped the physical memory that <code>bootstack</code>
refers to as the BSP's kernel stack just below
<code>KSTACKTOP</code>.
Similarly, in this lab, you will map each CPU's kernel stack into this
region with guard pages acting as a buffer between them.  CPU 0's
stack will still grow down from <code>KSTACKTOP</code>; CPU 1's stack
will start <code>KSTKGAP</code> bytes below the bottom of CPU 0's
stack, and so on. <tt>inc/memlayout.h</tt> shows the mapping layout.
</p>
</li>

<li>
<p>
<b>Per-CPU TSS and TSS descriptor</b>.
<br>
A per-CPU task state segment (TSS) is also needed in order to specify
where each CPU's kernel stack lives. The TSS for CPU <i>i</i> is stored
in <code>cpus[i].cpu_ts</code>, and the corresponding TSS descriptor is
defined in the GDT entry <code>gdt[(GD_TSS0 &gt;&gt; 3) + i]</code>. The
global <code>ts</code> variable defined in <tt>kern/trap.c</tt> will
no longer be useful.
</p>
</li>

<li>
<p>
<b>Per-CPU current environment pointer</b>.
<br>
Since each CPU can run a different user process simultaneously, we
redefined the symbol <code>curenv</code> to refer to
<code>cpus[cpunum()].cpu_env</code> (or <code>thiscpu-&gt;cpu_env</code>), which
points to the environment <i>currently</i> executing on the
<i>current</i> CPU (the CPU on which the code is running).
</p>
</li>

<li>
<p>
<b>Per-CPU system registers</b>.
<br>
All registers, including system registers, are private to a
CPU. Therefore, instructions that
initialize these registers, such as <code>lcr3()</code>,
<code>ltr()</code>, <code>lgdt()</code>, <code>lidt()</code>, etc., must
be executed once on each CPU. Functions <code>env_init_percpu()</code>
and <code>trap_init_percpu()</code> are defined for this purpose.
</p>
</li>

</ul>

<p>
In addition to this, if you have added any extra per-CPU state or performed
any additional CPU-specific initialization (say, by setting new bits in
the CPU registers) in your solutions to challenge problems in earlier labs,
be sure to replicate them on each CPU here!
</p>

<!-- XXX: describe zoombie env and env_cpunum -->
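<p>
The stack layout and TSS descriptor indexing described above come down to a
little address arithmetic. The sketch below works it through in user space.
The constant values (<code>KSTACKTOP</code>, <code>KSTKSIZE</code>,
<code>KSTKGAP</code>, <code>GD_TSS0</code>) are assumptions chosen for
illustration, so check <tt>inc/memlayout.h</tt> for the real ones, and the
helper functions are hypothetical.
</p>

```c
#include <stdint.h>

#define PGSIZE    4096u
#define KSTACKTOP 0xF0000000u     /* assumed value */
#define KSTKSIZE  (8 * PGSIZE)    /* assumed: size of one kernel stack */
#define KSTKGAP   (8 * PGSIZE)    /* assumed: size of one guard region */
#define GD_TSS0   0x28u           /* assumed: selector of CPU 0's TSS */

/* Highest address (exclusive) of CPU i's kernel stack. */
static uint32_t kstack_top(unsigned i)
{
    return KSTACKTOP - i * (KSTKSIZE + KSTKGAP);
}

/* Lowest mapped address of CPU i's kernel stack. */
static uint32_t kstack_bottom(unsigned i)
{
    return kstack_top(i) - KSTKSIZE;
}

/* Index into the GDT of CPU i's TSS descriptor. */
static unsigned tss_gdt_index(unsigned i)
{
    return (GD_TSS0 >> 3) + i;
}

/* Segment selector that refers to that GDT entry. */
static uint16_t tss_selector(unsigned i)
{
    return (uint16_t)(GD_TSS0 + (i << 3));
}
```

<p>
With these values, the gap between the bottom of CPU 0's stack and the top of
CPU 1's stack is exactly <code>KSTKGAP</code> bytes. That range is left
unmapped, so a stack overflow faults instead of corrupting a neighboring stack.
</p>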

<div class="required"><div id="Exercise-3" style="position: relative; top: -5em;"></div>
<p><span class="header">Exercise 3.</span>
  Modify <code>mem_init_mp()</code> (in <tt>kern/pmap.c</tt>) to map
  per-CPU stacks starting
  at <code>KSTACKTOP</code>, as shown in
  <tt>inc/memlayout.h</tt>.  The size of each stack is
  <code>KSTKSIZE</code> bytes plus <code>KSTKGAP</code> bytes of
  unmapped guard pages. Your code should pass the new check in
  <code>check_kern_pgdir()</code>.
</p></div>

<div class="required"><div id="Exercise-4" style="position: relative; top: -5em;"></div>
<p><span class="header">Exercise 4.</span>
  The code in <code>trap_init_percpu()</code> (<tt>kern/trap.c</tt>)
  initializes the TSS and
  TSS descriptor for the BSP. It worked in Lab 3, but is incorrect
  when running on other CPUs. Change the code so that it can work
  on all CPUs. (Note: your new code should not use the global
  <code>ts</code> variable any more.)
</p></div>

<p>
When you finish the above exercises, run JOS in QEMU with 4 CPUs using
<kbd>make qemu CPUS=4</kbd> (or <kbd>make qemu-nox CPUS=4</kbd>). You
should see output like this:
</p>

<pre>...
Physical memory: 66556K available, base = 640K, extended = 65532K
check_page_alloc() succeeded!
check_page() succeeded!
check_kern_pgdir() succeeded!
check_page_installed_pgdir() succeeded!
SMP: CPU 0 found 4 CPU(s)
enabled interrupts: 1 2
SMP: CPU 1 starting
SMP: CPU 2 starting
SMP: CPU 3 starting
</pre>


<h4>Locking</h4>

<p>
Our current code spins after initializing the AP in
<code>mp_main()</code>. Before letting the AP get any further, we need
to first address race conditions when multiple CPUs run kernel code
simultaneously.  The simplest way to achieve this is to use a <i>big
kernel lock</i>.
The big kernel lock is a single global lock that is held whenever an
environment enters kernel mode, and is released when the environment
returns to user mode. In this model, environments in user mode can run
concurrently on any available CPUs, but no more than one environment can
run in kernel mode; any other environments that try to enter kernel mode
are forced to wait.
</p>
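<p>
As a rough illustration of what such a lock does, here is a minimal
test-and-set spin lock written with C11 atomics. It is a sketch for user-space
experimentation, not the JOS implementation (which lives in
<tt>kern/spinlock.c</tt>). The <code>biglock</code> type and function names are
hypothetical.
</p>

```c
#include <stdatomic.h>
#include <stdbool.h>

/* A single global lock: the flag is set while some CPU holds it. */
struct biglock {
    atomic_flag held;
};

static struct biglock kernel_lock_sketch = { ATOMIC_FLAG_INIT };

/* Try once to take the lock; returns true if this caller now holds it. */
static bool biglock_try(struct biglock *l)
{
    /* test_and_set returns the previous value: false means it was free. */
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Spin until the lock is acquired. */
static void biglock_acquire(struct biglock *l)
{
    while (!biglock_try(l))
        ;                       /* busy-wait; another CPU holds the lock */
}

/* Release the lock so that a waiting CPU can proceed. */
static void biglock_release(struct biglock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

<p>
The acquire and release orderings ensure that writes made while holding the
lock are visible to the next holder, which is the property a big kernel lock
relies on.
</p>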

<p>
<tt>kern/spinlock.h</tt> declares the big kernel lock, namely
<code>kernel_lock</code>. It also provides <code>lock_kernel()</code>
and <code>unlock_kernel()</code>, shortcuts to acquire and
release the lock. You should apply the big kernel lock at four locations:
</p>

<ul>
<li>
In <code>i386_init()</code>, acquire the lock before the BSP wakes up the
other CPUs.
</li>
<li>
In <code>mp_main()</code>, acquire the lock after initializing the AP,
and then call <code>sched_yield()</code> to start running environments
on this AP.
</li>
<li>
In <code>trap()</code>, acquire the lock when trapped from user mode.
To determine whether a trap happened in user mode or in kernel mode,
check the low bits of <code>tf_cs</code>.
</li>
<li>
In <code>env_run()</code>, release the lock <i>right before</i>
switching to user mode. Do not release it too early or too late; otherwise
you will experience races or deadlocks.
</li>
</ul>
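<p>
The user-mode check mentioned for <code>trap()</code> can be sketched as
follows. On x86 the low two bits of a code segment selector hold the privilege
level, with 3 meaning user mode. The selector values below are assumptions for
illustration, and <code>trapped_from_user</code> is a hypothetical helper name.
</p>

```c
#include <stdbool.h>
#include <stdint.h>

#define GD_KT 0x08u   /* kernel text selector (assumed value) */
#define GD_UT 0x18u   /* user text selector (assumed value) */

/* True if the saved code segment selector has privilege level 3. */
static bool trapped_from_user(uint16_t tf_cs)
{
    return (tf_cs & 3) == 3;
}
```

<p>
A trap taken while already in the kernel leaves the low bits at 0, so the lock
is not acquired a second time by a CPU that already holds it.
</p>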

<div class="required"><div id="Exercise-5" style="position: relative; top: -5em;"></div>
<p><span class="header">Exercise 5.</span>
  Apply the big kernel lock as described above, by calling
  <code>lock_kernel()</code> and <code>unlock_kernel()</code> at
  the proper locations.
</p></div>

<p>
How can you test whether your locking is correct? You can't at this moment!
But you will be able to after you implement the scheduler in the
next exercise.
</p>

<div class="question">
<p><span class="header">Question</span></p>
<ol start="2">
  <li>
  It seems that using the big kernel lock guarantees that only one CPU
  can run the kernel code at a time.
  Why do we still need separate kernel stacks for each CPU?
  Describe a scenario in which using a shared kernel stack will go
  wrong, even with the protection of the big kernel lock.
  </li>
</ol>
</div>

<div class="challenge">
<p><span class="header">Challenge!</span>

  The big kernel lock is simple and easy to use. Nevertheless, it
  eliminates all concurrency in kernel mode. Most
  modern operating systems use different locks to protect different
  parts of their shared state, an
  approach called <i>fine-grained locking</i>.
  Fine-grained locking can increase performance significantly, but is
  more difficult to implement and more error-prone. If you are brave
  enough, drop the big kernel lock and embrace concurrency in JOS!
  </p>
  <p>
  It is up to you to decide the locking granularity (the amount of
  data that a lock protects). As a hint, you may consider using
  spin locks to ensure exclusive access to these shared components
  in the JOS kernel:
  </p>
  <ul>
  <li>The page allocator.</li>
  <li>The console driver.</li>
  <li>The scheduler.</li>
  <li>The inter-process communication (IPC) state that you will
  implement in part C.</li>
  </ul>
</div>


<h3 id="Round-Robin-Scheduling">Round-Robin Scheduling</h3>

<p>
Your next task in this lab is to change the JOS kernel
so that it can alternate between multiple environments
in "round-robin" fashion.
Round-robin scheduling in JOS works as follows:
</p>

<ul>
<li>	The function <code>sched_yield()</code> in the new <tt>kern/sched.c</tt>
	is responsible for selecting a new environment to run.
	It searches sequentially through the <code>envs[]</code> array
	in circular fashion,
	starting just after the previously running environment
	(or at the beginning of the array
	if there was no previously running environment),
	picks the first environment it finds
	with a status of <code>ENV_RUNNABLE</code>
	(see <tt>inc/env.h</tt>),
	and calls <code>env_run()</code> to jump into that environment. </li>

<li>	<code>sched_yield()</code> must never run the same environment
	on two CPUs at the same time.  It can tell that an environment
	is currently running on some CPU (possibly the current CPU)
	because that environment's status will be <code>ENV_RUNNING</code>.</li>

<li>	We have implemented a new system call for you,
	<code>sys_yield()</code>,
	which user environments can call
	to invoke the kernel's <code>sched_yield()</code> function
	and thereby voluntarily give up the CPU to a different environment.  </li>

</ul>
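<p>
The circular search described above can be modeled in a few lines of ordinary
C. This is a simplified user-space sketch under stated assumptions: the table
size, the <code>struct env</code> layout, and the <code>pick_next</code> helper
are hypothetical, and the fallback of continuing with the current environment
is an extra assumption that the list above does not state.
</p>

```c
#define NENV 8   /* tiny table for illustration only */

enum env_status { ENV_FREE, ENV_RUNNABLE, ENV_RUNNING, ENV_NOT_RUNNABLE };

struct env {
    enum env_status status;
};

/*
 * Return the index of the environment to run next, or -1 if there is none.
 * `cur` is the index of the environment that was running on this CPU,
 * or -1 if there was no previously running environment.
 */
static int pick_next(const struct env envs[NENV], int cur)
{
    /* Start just after the current environment, wrapping around. */
    int start = (cur < 0) ? 0 : (cur + 1) % NENV;

    for (int k = 0; k < NENV; k++) {
        int i = (start + k) % NENV;
        /* ENV_RUNNING entries are skipped: they are on some CPU already. */
        if (envs[i].status == ENV_RUNNABLE)
            return i;
    }

    /* Assumption: with nothing else runnable, the environment still
     * marked running on this CPU may simply continue. */
    if (cur >= 0 && envs[cur].status == ENV_RUNNING)
        return cur;

    return -1;   /* nothing to run; a kernel would idle or drop to a monitor */
}
```

<p>
Skipping every entry whose status is <code>ENV_RUNNING</code> is what keeps one
environment from being scheduled on two CPUs at once.
</p>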


<div class="required"><div id="Exercise-6" style="position: relative; top: -5em;"></div>
<p><span class="header">Exercise 6.</span>
	Implement round-robin scheduling in <code>sched_yield()</code>
	as described above.  Don't forget to modify
	<code>syscall()</code> to dispatch <code>sys_yield()</code>.
	</p>

        <p>Make sure to invoke <code>sched_yield()</code> in <code>mp_main</code>.

	</p><p> Modify <tt>kern/init.c</tt> to create three (or more!) environments
	 that all run the program <tt>user/yield.c</tt>.
	</p>

	<p>Run <kbd>make qemu</kbd>.
	You should see the environments
	switch back and forth between each other
	five times before terminating, like below.
	</p>

	<p>Also test with several CPUs: <kbd>make qemu CPUS=2</kbd>.
</p><pre>...
Hello, I am environment 00001000.
Hello, I am environment 00001001.
Hello, I am environment 00001002.
Back in environment 00001000, iteration 0.
Back in environment 00001001, iteration 0.
Back in environment 00001002, iteration 0.
Back in environment 00001000, iteration 1.
Back in environment 00001001, iteration 1.
Back in environment 00001002, iteration 1.
...
</pre>
	<p>
	After the <tt>yield</tt> programs exit, there will be no runnable
	environments left in the system, and the scheduler should
	invoke the JOS kernel monitor.
	If any of this does not happen,
	then fix your code before proceeding.
	</p>

	<!-- No longer true
	<p>
	If you use <kbd>CPUS=1</kbd> at this point, all environments should
	successfully run. Setting CPUS larger than 1 at this time may result in
	a general protection fault, kernel page fault, or other unexpected
	interrupt once there are no more runnable environments due to unhandled
	timer interrupts (which we will fix below!).
	</p>-->
</div>

<div class="question">
<p><span class="header">Question</span></p>
<ol start="3">
  <li>
In your implementation of <code>env_run()</code> you should have
called <code>lcr3()</code>.  Before and after the call to
<code>lcr3()</code>, your code makes references (at least it should)
to the variable <code>e</code>, the argument to <code>env_run</code>.
Upon loading the <code>%cr3</code> register, the addressing context
used by the MMU is instantly changed.  But a virtual
address (namely <code>e</code>) has meaning relative to a given
address context--the address context specifies the physical address to
which the virtual address maps.  Why can the pointer <code>e</code> be
dereferenced both before and after the addressing switch?
  </li>
  <li>
  Whenever the kernel switches from one environment to another,
it must ensure the old environment's registers are saved
so they can be restored properly later.
Why?  Where does this happen?</li>
</ol>
</div>

<div class="challenge">
<p><span class="header">Challenge!</span>
	Add a less trivial scheduling policy to the kernel,
	such as a fixed-priority scheduler that allows each environment
	to be assigned a priority
	and ensures that higher-priority environments
	are always chosen in preference to lower-priority environments.
	If you're feeling really adventurous,
	try implementing a Unix-style adjustable-priority scheduler
	or even a lottery or stride scheduler.
	(Look up "lottery scheduling" and "stride scheduling" in Google.)
	</p>

	<p>
	Write a test program or two
	that verifies that your scheduling algorithm is working correctly
	(i.e., the right environments get run in the right order).
	It may be easier to write these test programs
	once you have implemented <code>fork()</code> and IPC
	in parts B and C of this lab.
</p></div>

<div class="challenge">
<p><span class="header">Challenge!</span>
	The JOS kernel currently does not allow applications
	to use the x86 processor's x87 floating-point unit (FPU),
	MMX instructions, or Streaming SIMD Extensions (SSE).
	Extend the <code>Env</code> structure
	to provide a save area for the processor's floating point state,
	and extend the context switching code
	to save and restore this state properly
	when switching from one environment to another.
	The <code>FXSAVE</code> and <code>FXRSTOR</code> instructions may be useful,
	but note that these are not in the old i386 user's manual
	because they were introduced in more recent processors.
	Write a user-level test program
	that does something cool with floating-point.
</p></div>


<h3 id="System-Calls-for-Environment-Creation">System Calls for Environment Creation</h3>

<p>
Although your kernel is now capable of running and switching between
multiple user-level environments,
it is still limited to running environments
that the <i>kernel</i> initially set up.
You will now implement the necessary JOS system calls
to allow <i>user</i> environments to create and start
other new user environments.
</p>

<p>
Unix provides the <code>fork()</code> system call
as its process creation primitive.
Unix <code>fork()</code> copies
the entire address space of the calling process (the parent)
to create a new process (the child).
The only differences between the two observable from user space
are their process IDs and parent process IDs
(as returned by <code>getpid</code> and <code>getppid</code>).
In the parent,
<code>fork()</code> returns the child's process ID,
while in the child, <code>fork()</code> returns 0.
By default, each process gets its own private address space, and
neither process's modifications to memory are visible to the other.
</p>
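<p>
The following minimal sketch illustrates the Unix behavior just described,
using ordinary POSIX calls on a Unix host (it is not JOS code, and the
function name is hypothetical). The child overwrites a local variable in
its own address space; the parent then checks that its own copy is unchanged.
</p>

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that overwrites a local variable, then return the value
 * the parent still sees. Returns -1 if fork() or waitpid() fails.
 */
int parent_value_after_child_write(void)
{
    int value = 1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;                    /* fork failed */

    if (pid == 0) {
        /* Child: fork() returned 0. Only the child's copy changes. */
        value = 2;
        _exit(value == 2 ? 0 : 1);
    }

    /* Parent: fork() returned the child's process ID. */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    assert(value == 1);               /* the child's write is not visible here */
    return value;
}
```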

<p>
You will provide a different, more primitive
set of JOS system calls
for creating new user-mode environments.
With these system calls you will be able to implement
a Unix-like <code>fork()</code> entirely in user space,
in addition to other styles of environment creation.
The new system calls you will write for JOS are as follows:
</p>

<dl>
<dt>	<code>sys_exofork</code>:</dt>
<dd>	This system call creates a new environment with an almost blank slate:
	nothing is mapped in the user portion of its address space,
	and it is not runnable.
	The new environment will have the same register state as the
	parent environment at the time of the <code>sys_exofork</code> call.
	In the parent, <code>sys_exofork</code>
	will return the <code>envid_t</code> of the newly created
	environment
	(or a negative error code if the environment allocation failed).
	In the child, however, it will return 0.
	(Since the child starts out marked as not runnable,
	<code>sys_exofork</code> will not actually return in the child
	until the parent has explicitly allowed this
	by marking the child runnable using....)</dd>

<dt>	<code>sys_env_set_status</code>:</dt>
<dd>	Sets the status of a specified environment
	to <code>ENV_RUNNABLE</code> or <code>ENV_NOT_RUNNABLE</code>.
	This system call is typically used
	to mark a new environment ready to run,
	once its address space and register state
	have been fully initialized.</dd>

<dt>	<code>sys_page_alloc</code>:</dt>
<dd>	Allocates a page of physical memory
	and maps it at a given virtual address
	in a given environment's address space.</dd>

<dt>	<code>sys_page_map</code>:</dt>
<dd>	Copies a page mapping (<i>not</i> the contents of a page!)
	from one environment's address space to another,
	leaving a memory sharing arrangement in place
	so that the new and the old mappings both refer to
	the same page of physical memory.</dd>

<dt>	<code>sys_page_unmap</code>:</dt>
<dd>	Unmaps a page mapped at a given virtual address
	in a given environment.</dd>
</dl>


<p>
For all of the system calls above that accept environment IDs,
the JOS kernel supports the convention
that a value of 0 means "the current environment."
This convention is implemented by <code>envid2env()</code>
in <tt>kern/env.c</tt>.
</p>
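<p>
As a rough illustration of this convention, the stand-alone sketch below
models an environment lookup. It is not the code in <tt>kern/env.c</tt>:
the table size, the simplified <code>struct Env</code>, the error value,
and the name <code>lookup_env()</code> are assumptions. The
<code>checkperm</code> rule shown (the target must be the current
environment or one of its immediate children) is likewise a sketch of the
permission check rather than a definitive statement of it.
</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t envid_t;

#define NENV       4                     /* hypothetical; a power of two in this sketch */
#define ENVX(id)   ((id) & (NENV - 1))   /* low bits of the ID index the table */
#define E_BAD_ENV  2                     /* stand-in error code */

enum env_status { ENV_FREE, ENV_RUNNABLE, ENV_RUNNING, ENV_NOT_RUNNABLE };

struct Env {
    envid_t env_id;
    envid_t env_parent_id;
    enum env_status env_status;
};

static struct Env envs[NENV];
static struct Env *curenv;               /* the currently running environment */

/* Translate an environment ID into a struct Env pointer.
   Returns 0 on success, -E_BAD_ENV on failure (with *env_store set to NULL). */
int lookup_env(envid_t envid, struct Env **env_store, bool checkperm)
{
    assert(env_store != NULL);
    *env_store = NULL;

    if (envid == 0) {                    /* 0 means "the current environment" */
        *env_store = curenv;
        return 0;
    }

    struct Env *e = &envs[ENVX(envid)];
    if (e->env_status == ENV_FREE || e->env_id != envid)
        return -E_BAD_ENV;               /* unused slot or stale ID */

    /* With checkperm set, only the caller itself or its children qualify. */
    if (checkperm && e != curenv && e->env_parent_id != curenv->env_id)
        return -E_BAD_ENV;

    *env_store = e;
    return 0;
}
```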

<p>
We have provided a very primitive implementation
of a Unix-like <code>fork()</code>
in the test program <tt>user/dumbfork.c</tt>.
This test program uses the above system calls
to create and run a child environment
with a copy of its own address space.
The two environments
then switch back and forth using <code>sys_yield</code>
as in the previous exercise.
The parent exits after 10 iterations,
whereas the child exits after 20.
</p>

<div class="required"><div id="Exercise-7" style="position: relative; top: -5em;"></div>
<p><span class="header">Exercise 7.</span>
	Implement the system calls described above
	in <tt>kern/syscall.c</tt> and make sure <tt>syscall()</tt> calls
	them.
	You will need to use various functions
	in <tt>kern/pmap.c</tt> and <tt>kern/env.c</tt>,
	particularly <code>envid2env()</code>.
	For now, whenever you call <code>envid2env()</code>,
	pass 1 in the <code>checkperm</code> parameter.
	Be sure you check for any invalid system call arguments,
	returning <code>-E_INVAL</code> in that case.
	Test your JOS kernel with <tt>user/dumbfork</tt>
	and make sure it works before proceeding.
</p></div>

<div class="challenge">
<p><span class="header">Challenge!</span>
	Add the additional system calls necessary
	to <i>read</i> all of the vital state of an existing environment
	as well as set it up.
	Then implement a user mode program that forks off a child environment,
	runs it for a while (e.g., a few iterations of <code>sys_yield()</code>),
	then takes a complete snapshot or <i>checkpoint</i>
	of the child environment,
	runs the child for a while longer,
	and finally restores the child environment to the state it was in
	at the checkpoint
	and continues it from there.
	Thus, you are effectively "replaying"
	the execution of the child environment from an intermediate state.
	Make the child environment perform some interaction with the user
	using <code>sys_cgetc()</code> or <code>readline()</code>
	so that the user can view and mutate its internal state,
	and verify that with your checkpoint/restart
	you can give the child environment a case of selective amnesia,
	making it "forget" everything that happened beyond a certain point.
</p></div>


<p>
This completes Part A of the lab;
make sure it passes all of the Part A tests when you run
<kbd>make grade</kbd>, and hand it in using <kbd>make
handin</kbd> as usual.  If you are trying to figure out why a particular
test case is failing, run <kbd>./grade-lab4 -v</kbd>, which will
show you the output of the kernel builds and QEMU runs for each
test, until a test fails.  When a test fails, the script will stop,
and then you can inspect <tt>jos.out</tt> to see what the
kernel actually printed.
</p>

<h2 id="Part-B--Copy-on-Write-Fork">Part B: Copy-on-Write Fork</h2>

<p>
As mentioned earlier,
Unix provides the <code>fork()</code> system call
as its primary process creation primitive.
The <code>fork()</code> system call
copies the address space of the calling process (the parent)
to create a new process (the child).
</p>

<p>
xv6 Unix implements <code>fork()</code> by copying all data from the
parent's pages into new pages allocated for the child.
This is essentially the same approach
that <code>dumbfork()</code> takes.
The copying of the parent's address space into the child is
the most expensive part of the <code>fork()</code> operation.
</p>

<p>
However, a call to <code>fork()</code>
is frequently followed almost immediately
by a call to <code>exec()</code> in the child process,
which replaces the child's memory with a new program.
This is what the shell typically does, for example.
In this case,
the time spent copying the parent's address space is largely wasted,
because the child process will use
very little of its memory before calling <code>exec()</code>.
</p>

<p>
For this reason,
later versions of Unix took advantage
of virtual memory hardware
to allow the parent and child to <i>share</i>
the memory mapped into their respective address spaces
until one of the processes actually modifies it.
This technique is known as <i>copy-on-write</i>.
To do this,
on <code>fork()</code> the kernel would
copy the address space <i>mappings</i>
from the parent to the child
instead of the contents of the mapped pages,
and at the same time mark the now-shared pages read-only.
When one of the two processes tries to write to one of these shared pages,
the process takes a page fault.
At this point, the Unix kernel realizes that the page
was really a "virtual" or "copy-on-write" copy,
and so it makes a new, private, writable copy of the page for the
faulting process.
In this way, the contents of individual pages aren't copied
until they are actually written to.
This optimization makes a <code>fork()</code> followed by
an <code>exec()</code> in the child much cheaper:
the child will probably only need to copy one page
(the current page of its stack)
before it calls <code>exec()</code>.
</p>
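<p>
To make the mechanism concrete, here is a small user-space model of
copy-on-write. It is only a sketch under stated assumptions: pages are
heap objects with a reference count, an "address space" is a short array
of mappings, and a write to a read-only copy-on-write mapping plays the
role of the page fault. All names (<code>cow_fork()</code>,
<code>as_write()</code>, the structs) are hypothetical and do not
correspond to JOS or Unix kernel interfaces.
</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define PGSIZE 4096
#define NPAGES 4                    /* hypothetical tiny address space */

struct Page {
    int refs;                       /* number of mappings pointing at this page */
    unsigned char data[PGSIZE];
};

struct Mapping {
    struct Page *page;              /* NULL if nothing is mapped */
    bool writable;
    bool cow;                       /* read-only because it is shared copy-on-write */
};

struct AddrSpace {
    struct Mapping map[NPAGES];
};

/* Copy the mappings (not the page contents) and mark writable pages
   read-only and copy-on-write in both parent and child. */
void cow_fork(struct AddrSpace *parent, struct AddrSpace *child)
{
    for (int i = 0; i < NPAGES; i++) {
        struct Mapping *m = &parent->map[i];
        if (m->page != NULL && (m->writable || m->cow)) {
            m->writable = false;
            m->cow = true;
        }
        child->map[i] = *m;
        if (m->page != NULL)
            m->page->refs++;
    }
}

/* Write one byte. A write to a copy-on-write page first gives the writer
   a private, writable copy. Returns 0 on success, -1 on failure. */
int as_write(struct AddrSpace *as, int pageno, size_t off, unsigned char byte)
{
    assert(pageno >= 0 && pageno < NPAGES && off < PGSIZE);
    struct Mapping *m = &as->map[pageno];

    if (m->page == NULL)
        return -1;                  /* unmapped */
    if (!m->writable) {
        if (!m->cow)
            return -1;              /* genuinely read-only */
        if (m->page->refs > 1) {    /* still shared: copy before writing */
            struct Page *copy = malloc(sizeof *copy);
            if (copy == NULL)
                return -1;
            memcpy(copy->data, m->page->data, PGSIZE);
            copy->refs = 1;
            m->page->refs--;
            m->page = copy;
        }
        m->writable = true;         /* now private to this address space */
        m->cow = false;
    }
    m->page->data[off] = byte;
    return 0;
}
```

<p>
Note that the page is copied only on the first write after the fork, and
only for the page that was written; pages that are never written stay shared.
</p>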

<p>
In the next piece of this lab, you will implement a "proper"
Unix-like <code>fork()</code> with copy-on-write,
as a user space library routine.
Implementing <code>fork()</code> and copy-on-write support in user space
has the benefit that the kernel remains much simpler
and thus more likely to be correct.
It also lets individual user-mode programs
define their own semantics for <code>fork()</code>.
A program that wants a slightly different implementation
(for example, the expensive always-copy version like <code>dumbfork()</code>,
or one in which the parent and child actually share memory afterward)
can easily provide its own.
</p>

<h3 id="User-level-page-fault-handling">User-level page fault handling</h3>

<p>
A user-level copy-on-write <code>fork()</code> needs to know about
page faults on write-protected pages, so that's what you'll
implement first.
Copy-on-write is only one of many possible uses
for user-level page fault handling.
</p>

<p>
It's common to set up an address space so that page faults
indicate when some action needs to take place.
For example,
most Unix kernels initially map only a single page
in a new process's stack region,
and allocate and map additional stack pages later "on demand"
as the process's stack consumption increases
and causes page faults on stack addresses that are not yet mapped.
A typical Unix kernel must keep track of what action to take
when a page fault occurs in each region of a process's space.
For example,
a fault in the stack region will typically
allocate and map a new page of physical memory.
A fault in the program's BSS region will typically
allocate a new page, fill it with zeroes, and map it.
In systems with demand-paged executables,
a fault in the text region will read the corresponding page
of the binary off of disk and then map it.
</p>
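<p>
The per-region bookkeeping described above might look roughly like the
sketch below. The region table, the enum names, and
<code>fault_action()</code> are hypothetical and only model the decision;
treating a fault outside every known region as fatal is an assumption of
this sketch, not something stated above.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum region_kind { REGION_TEXT, REGION_BSS, REGION_STACK };

enum action {
    ACT_READ_FROM_DISK,   /* text: read the page of the binary, then map it */
    ACT_ZERO_FILL,        /* BSS: allocate a page, fill with zeroes, map it */
    ACT_ALLOC,            /* stack: allocate and map a new page */
    ACT_KILL              /* assumption: a fault outside any region is fatal */
};

struct vm_region {
    uintptr_t start;      /* inclusive */
    uintptr_t end;        /* exclusive */
    enum region_kind kind;
};

/* Decide what to do about a page fault at address va. */
enum action fault_action(const struct vm_region *regions, size_t n, uintptr_t va)
{
    assert(regions != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (va >= regions[i].start && va < regions[i].end) {
            switch (regions[i].kind) {
            case REGION_TEXT:  return ACT_READ_FROM_DISK;
            case REGION_BSS:   return ACT_ZERO_FILL;
            case REGION_STACK: return ACT_ALLOC;
            }
        }
    }
    return ACT_KILL;
}
```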

<p>
This is a lot of information for the kernel to keep track of.
Instead of taking the traditional Unix approach,
you will decide what to do about each page fault in user space,
where bugs are less damaging.
This design has the added benefit of allowing
programs great flexibility in defining their memory regions;
you'll use user-level page fault handling later
for mapping and accessing files on a disk-based file system.
</p>

<h4>Setting the Page Fault Handler</h4>

<p>
In order to handle its own page faults,
a user environment will need to register
a <i>page fault handler entrypoint</i> with the JOS kernel.
The user environment registers its page fault entrypoint
via the new <code>sys_env_set_pgfault_upcall</code> system call.
We have added a new member to the <code>Env</code> structure,
<code>env_pgfault_upcall</code>,
to record this information.
</p>

<div class="required"><div id="Exercise-8" style="position: relative; top: -5em;"></div>
<p><span class="header">Exercise 8.</span>
	Implement the <code>sys_env_set_pgfault_upcall</code> system call.
	Be sure to enable permission checking
	when looking up the environment ID of the target environment,
	since this is a "dangerous" system call.
</p></div>

<h4>Normal and Exception Stacks in User Environments</h4>

<p>
During normal execution,
a user environment in JOS
will run on the <i>normal</i> user stack:
its <tt>ESP</tt> register starts out pointing at <code>USTACKTOP</code>,
and the stack data it pushes resides on the page
between <code>USTACKTOP-PGSIZE</code> and <code>USTACKTOP-1</code> inclusive.
When a page fault occurs in user mode,
however,
the kernel will restart the user environment
running a designated user-level page fault handler
on a different stack,
namely the <i>user exception</i> stack.
In essence, we will make the JOS kernel
implement automatic "stack switching"
on behalf of the user environment,
in much the same way that the x86 <i>processor</i>
already implements stack switching on behalf of JOS
when transferring from user mode to kernel mode!
</p>

<p>
The JOS user exception stack is also one page in size,
and its top is defined to be at virtual address <code>UXSTACKTOP</code>,
so the valid bytes of the user exception stack
are from <code>UXSTACKTOP-PGSIZE</code> through <code>UXSTACKTOP-1</code> inclusive.
While running on this exception stack,
the user-level page fault handler
can use JOS's regular system calls to map new pages or adjust mappings
so as to fix whatever problem originally caused the page fault.
Then the user-level page fault handler returns,
via an assembly language stub,
to the faulting code on the original stack.
</p>
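<p>
The two stack ranges above can be expressed as simple address checks.
In this sketch the numeric value of <code>UXSTACKTOP</code> and the
placement of <code>USTACKTOP</code> two pages below it are assumptions
(check <tt>inc/memlayout.h</tt> in your tree for the actual values), and the
helper names are hypothetical.
</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout constants; verify against inc/memlayout.h. */
#define PGSIZE      4096u
#define UXSTACKTOP  0xeec00000u
#define USTACKTOP   (UXSTACKTOP - 2 * PGSIZE)

/* True if va lies in the one-page stack whose top is `top`,
   i.e. in [top - PGSIZE, top - 1] inclusive. */
static bool in_page_below(uint32_t top, uint32_t va)
{
    assert(top >= PGSIZE);
    return va >= top - PGSIZE && va <= top - 1;
}

/* Is va on the normal user stack? */
bool on_normal_stack(uint32_t va)
{
    return in_page_below(USTACKTOP, va);
}

/* Is va on the user exception stack? */
bool on_exception_stack(uint32_t va)
{
    return in_page_below(UXSTACKTOP, va);
}
```

<p>
A check like <code>on_exception_stack()</code> is one way to tell whether
a given stack pointer already lies within the user exception stack.
</p>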

<p>
Each user environment that wants to support user-level page fault handling
will need to allocate memory for its own exception stack,
using the <code>sys_page_alloc()</code> system call introduced in part A.
</p>

<h4>Invoking the User Page Fault Handler</h4>