Commit bb2cb22

Add what to verify section (#85)
* Add what to verify section
* Fix syntax
* Minor updates to verify paragraph in elf part
* Update updates
* Fix broken link
* Move useful info section from appendices
* Changes requested
* Minor fix on Nasm appendice
* Minor typo fixes
* Minor fix on cross compiling chapter
* Minor fix on cross compile chapter
* changes requested
1 parent 6566d6e commit bb2cb22

10 files changed

Lines changed: 63 additions & 19 deletions

00_Introduction/01_README.md

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -25,6 +25,6 @@ Below the list of parts that compose the book:
2525
* *Inter-Process Communication (IPC)* - This part looks at how we might implement IPC for our kernel, allowing isolated programs to communicate with each other in a controlled way.
2626
* *Virtual File System (VFS)* - This part will cover how a kernel presents different file systems to the rest of the system. We'll also take a look at implementing a 'tempfs' that is loaded from a tape archive (tar), similar to initrd.
2727
* *The ELF format* - Once we have a file system we can load files from it, why not load a program? This part looks at writing a simple program loader for ELF64 binaries, and why you would want to use this format.
28-
* *Going Beyond* - The final part (for now). We have implemented all the core components of a kernel, and we are free to go from here. This final chapter contains some ideas for new components that you might want to add, or at least begin thinking about.
28+
* *Going Beyond* - The final part (for now). We have implemented all the core components of a kernel, and we are free to go from here. This final chapter contains some ideas for new components that we might want to add, or at least begin thinking about.
2929

3030
In the appendices we cover various topics, including debugging tips, language-specific information, and troubleshooting.

05_Scheduling/03_Processes_And_Threads.md

Lines changed: 1 addition & 1 deletion
@@ -249,7 +249,7 @@ thread_t* add_thread(process_t* proc, char* name, void(*function)(void*), void*
249249
}
250250
```
251251
252-
You'll notice this function looks almost identical to the `create_process` function from before. That's because a lot of it is the same! The first part of the function is just inserting the new thread at the end of the list of threads.
252+
We can see that this function looks almost identical to the `create_process` outlined earlier in this chapter. That's because a lot of it is the same! The first part of the function is just inserting the new thread at the end of the list of threads.
253253
254254
Let's look at how our `create_process` function would look now:
255255

09_Loading_Elf/01_Elf_Theory.md

Lines changed: 3 additions & 3 deletions
@@ -78,12 +78,12 @@ Finding the program headers within an ELF binary is also quite straightforward.
7878
Like section headers, each program header is tightly packed against the next one. This means that program headers can be treated as an array. As an example, it is possible to loop through the _phdrs_ as follows:
7979

8080
```c
81-
void loop_phdrs(Elf64_Hdr* ehdr) {
82-
Elf64_Phdr* phdrs = *(Elf64_Phdr*)((uintptr_t)ehdr + ehdr->e_phoff);
81+
void loop_phdrs(Elf64_Ehdr* ehdr) {
82+
Elf64_Phdr* phdrs = (Elf64_Phdr*)((uintptr_t)ehdr + ehdr->e_phoff);
8383

8484
for (size_t i = 0; i < ehdr->e_phnum; i++)
8585
{
86-
Elf64_Phdr* program_header = phdrs[i];
86+
Elf64_Phdr program_header = phdrs[i];
8787
//do something with program_header here.
8888
}
8989
}

09_Loading_Elf/02_Loading_And_Running.md

Lines changed: 43 additions & 1 deletion
@@ -14,14 +14,56 @@ In the previous chapter we looked at the details of loading program headers, but
1414

1515
- First a copy of the ELF file to be loaded is needed. The recommended way is to load a file via the VFS, but it could be a bootloader module or even embedded into the kernel.
1616
- Then once the ELF is loaded, we need to verify that its header is correct. Also check that the architecture (machine type) matches the current machine, and that the bit-ness is correct (don't try to run a 32-bit program if you don't support it!).
17-
- Find all the loadable program headers for the ELF, we'll need those in a moment.
17+
- Find all the program headers with the type `PT_LOAD`, we'll need those in a moment.
1818
- Create a new address space for the program to live in. This usually involves creating a new process with a new VMM instance, but the specifics will vary depending on your design. Don't forget to keep the kernel mappings in the higher half!
1919
- Copy the loadable program headers into this new address space. Take care when writing this code, as the program headers may not be page-aligned. Don't forget to zero the extra bytes between `memsz` and `filesz`.
2020
- Once loaded, set the appropriate permission on the memory each program header lives in: the write, execute (or no-execute) and user flags.
2121
- Now we'll need to create a new thread to act as the main thread for this program, and set its entry point to the `e_entry` field in the ELF header. This field is the start function of the program. You'll also need to create a stack in the memory space of this program for the thread to use, if this wasn't already done as part of our thread creation.
2222

2323
If all of the above are done, then the program is ready to run! We now should be able to enqueue the main thread in the scheduler and let it run.
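
The copy step above can be sketched as follows. This is a minimal illustration, not a definitive implementation: `load_segment` is a hypothetical helper name, the `file_size` bounds check is our own addition, and `dest` is assumed to point at memory that has already been mapped for the segment (at least `p_memsz` bytes) in the new address space. The `Elf64_Phdr` layout follows the ELF specification and is redefined only to keep the example self-contained.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PT_LOAD 1 /* program header type for loadable segments */

/* ELF64 program header layout, as given in the ELF specification. */
typedef struct {
    uint32_t p_type;
    uint32_t p_flags;
    uint64_t p_offset; /* where the segment data starts in the file */
    uint64_t p_vaddr;  /* where the segment wants to live in memory */
    uint64_t p_paddr;
    uint64_t p_filesz; /* bytes present in the file */
    uint64_t p_memsz;  /* bytes occupied in memory (>= p_filesz) */
    uint64_t p_align;
} Elf64_Phdr;

/* Hypothetical helper: copies one PT_LOAD segment from the file image into
 * dest, then zeroes the bytes between filesz and memsz.
 * Returns 0 on success, -1 if the header is not loadable or is malformed. */
int load_segment(void* dest, const void* file, size_t file_size,
                 const Elf64_Phdr* phdr) {
    if (phdr->p_type != PT_LOAD)
        return -1;
    if (phdr->p_filesz > phdr->p_memsz)
        return -1;
    /* Make sure the segment data lies entirely inside the file image. */
    if (phdr->p_offset > file_size ||
        phdr->p_filesz > file_size - phdr->p_offset)
        return -1;

    const uint8_t* src = (const uint8_t*)file + phdr->p_offset;
    memcpy(dest, src, (size_t)phdr->p_filesz);

    /* The extra bytes (typically .bss) must read as zero. */
    memset((uint8_t*)dest + phdr->p_filesz, 0,
           (size_t)(phdr->p_memsz - phdr->p_filesz));
    return 0;
}
```

In a real kernel `dest` would be derived from `p_vaddr` after mapping the pages that cover the segment, which is where the page-alignment care mentioned above comes in.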
2424

25+
### Verifying an ELF file
26+
27+
When verifying an ELF file there are a few things we need to check in order to decide whether an executable is valid. The fields to validate are at different points in the ELF header. Some can be found in the `e_ident` field, like the following:
28+
29+
* The first thing we want to check is the magic number, this is the `ELFMAG` part. It is expected to be the following values:
30+
31+
| Value | Byte|
32+
|-------|-----|
33+
| `0x7f`| 0 |
34+
| `E` | 1 |
35+
| `L` | 2 |
36+
| `F` | 3 |
37+
38+
* We need to check that the file class matches the one we support. There are two possible classes: 32-bit (`ELFCLASS32`, value 1) and 64-bit (`ELFCLASS64`, value 2). This is byte 4.
39+
* The data field indicates the _endianness_, which again depends on the architecture used. It can take three values: None (0), LSB (1) and MSB (2). For example, the `x86_64` architecture is little-endian (LSB), so the value is expected to be 1. This field is byte 5.
40+
* The version field, byte 6: for a valid ELF it has to be set to 1 (`EV_CURRENT`).
41+
* The OS ABI and ABI version fields identify the operating system and ABI the object is targeted at, and the version of that ABI. For now we can ignore them; they should be 0.
42+
43+
The other fields that need validation (those that are not in the `e_ident` field) are:
44+
45+
* `e_type`: this identifies the type of ELF file. For our purposes the only valid value is 2, which indicates an executable file (`ET_EXEC`). There are other values that we could support in the future, for example the `ET_DYN` type that is used for position-independent code or shared objects, but they require more work.
46+
* `e_machine`: this indicates the required architecture for the executable. The value depends on the architectures we support. For example, the value for the AMD64 architecture is `62` (`EM_X86_64`).
47+
48+
Be aware that most of the variables and their values have a specific naming convention, for more information refer to the ELF specs.
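
Putting the checks above together, a validation routine might look like the sketch below. It assumes a kernel that only accepts 64-bit, little-endian, `x86_64` executables of type `ET_EXEC`. `elf_is_valid` and its `file_size` parameter are hypothetical names chosen for this example, and the constants and header layout are redefined from the ELF specification only so the block is self-contained.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Indices into e_ident, and the values discussed above. */
enum {
    EI_MAG0 = 0, EI_MAG1 = 1, EI_MAG2 = 2, EI_MAG3 = 3,
    EI_CLASS = 4, EI_DATA = 5, EI_VERSION = 6, EI_NIDENT = 16
};
enum { ELFCLASS64 = 2, ELFDATA2LSB = 1, EV_CURRENT = 1 };
enum { ET_EXEC = 2, EM_X86_64 = 62 };

/* ELF64 header layout, as given in the ELF specification. */
typedef struct {
    unsigned char e_ident[EI_NIDENT];
    uint16_t e_type;
    uint16_t e_machine;
    uint32_t e_version;
    uint64_t e_entry;
    uint64_t e_phoff;
    uint64_t e_shoff;
    uint32_t e_flags;
    uint16_t e_ehsize;
    uint16_t e_phentsize;
    uint16_t e_phnum;
    uint16_t e_shentsize;
    uint16_t e_shnum;
    uint16_t e_shstrndx;
} Elf64_Ehdr;

/* Hypothetical helper: returns true only if every check passes. */
bool elf_is_valid(const Elf64_Ehdr* ehdr, size_t file_size) {
    /* The file must be large enough to contain a full header. */
    if (ehdr == NULL || file_size < sizeof(Elf64_Ehdr))
        return false;

    /* Magic number: 0x7f 'E' 'L' 'F' in bytes 0 to 3. */
    if (ehdr->e_ident[EI_MAG0] != 0x7f || ehdr->e_ident[EI_MAG1] != 'E' ||
        ehdr->e_ident[EI_MAG2] != 'L' || ehdr->e_ident[EI_MAG3] != 'F')
        return false;

    /* Class, endianness and version: bytes 4, 5 and 6. */
    if (ehdr->e_ident[EI_CLASS] != ELFCLASS64)
        return false;
    if (ehdr->e_ident[EI_DATA] != ELFDATA2LSB)
        return false;
    if (ehdr->e_ident[EI_VERSION] != EV_CURRENT)
        return false;

    /* Fields outside e_ident. */
    if (ehdr->e_type != ET_EXEC)
        return false;
    if (ehdr->e_machine != EM_X86_64)
        return false;

    return true;
}
```

Returning a single boolean keeps the sketch short; a loader that wants to report why a file was rejected could return a distinct error code per check instead.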
49+
50+
Beware that a compiler does not always produce a file of type `ET_EXEC`: its output may be of type `ET_REL` (value 1), a relocatable object. To obtain an executable we need to link it using a linker. For example, if we generated the file `example.elf` with the `ET_REL` type, we can use `ld` (or another equivalent linker):
51+
52+
```sh
53+
ld -o example.o example.elf
54+
```
55+
56+
For basic executables, we most likely don't need to include any linker script.
57+
58+
If we want to know the type of an ELF file, we can use the `readelf` command on a Unix-like OS:
59+
```sh
60+
readelf -e example.elf
61+
```
62+
63+
This will print out all the header information, including the type.
64+
65+
66+
2567
## Caveats
2668

2769
As we can already see from the above restrictions there is plenty of room for improvement. There are also some other things to keep in mind:

99_Appendices/C_Language_Info.md

Lines changed: 0 additions & 5 deletions
@@ -199,8 +199,3 @@ If unfamiliar with caching, each page can have a set of caching attributes appli
199199
One of these is write-combine. It works by queueing up writes to nearby areas of memory, until a buffer is full, and then flushing them to main memory in a single access. This is much faster than accessing main memory each write.
200200

201201
However if we're working with an older x86 cpu, or another platform, this solution is not available. Hence `volatile` would do the job.
202-
203-
### Useful Links
204-
205-
* [IBM inline assembly guide](https://www.ibm.com/docs/en/xl-c-and-cpp-linux/13.1.4?topic=compatibility-inline-assembly-statements) This is not gcc related, but the syntax is identical, and it contains useful and concise info on the constraintsa
206-
* [Fedora Inline assembly list of constraints](https://dmalcolm.fedorapeople.org/gcc/2015-08-31/rst-experiment/how-to-use-inline-assembly-language-in-c-code.html#simple-constraints)

99_Appendices/D_Nasm.md

Lines changed: 4 additions & 3 deletions
@@ -79,15 +79,16 @@ quad_var:
7979
dq 133.463 ; Example with a real number
8080
```
8181

82-
But what if we want to declare a string? Well in this case we can use a different syntax for db:
82+
If we want to declare a string we need to use a different syntax for db:
8383

8484
```nasm
8585
string_var:
8686
db "Hello", 10
8787
```
88-
What does it mean? We are simply declaring a variable (string_variable) that starts at 'H', and fill the consecutive bytes with the next letters. But what about the last number? It is just an extra byte, that represents the newline character. So what we are really storing is the string _"Hello\\n"_
8988

90-
Now what we have seen so far is valid for a variable that can be initialized with a value, but what if we don't know the value yet, but we want just to "label" it with a variable name? Well is pretty simple, we have equivalent directives for reserving memory:
89+
The above code means that we are declaring a variable (`string_var`) that starts at 'H' and fills the consecutive bytes with the next letters. And what about the last number? It is just an extra byte that represents the newline character, so what we are really storing is the string _"Hello\\n"_
90+
91+
What we have seen so far is valid for a variable that can be initialized with a value, but what if we don't know the value yet and just want to "label" it with a variable name? It's pretty simple: we have equivalent directives for reserving memory:
9192

9293
| Directive | Description |
9394
|-------------|---------------------------------|

99_Appendices/E_Cross_Compilers.md

Lines changed: 4 additions & 4 deletions
@@ -36,7 +36,7 @@ There's three environment variables we're going to use during the build process:
3636

3737
```
3838
export PREFIX="/your/path/to/cross/compiler"
39-
export TARGET="riscv64-elf"
39+
export TARGET="x86_64-elf"
4040
export PATH="$PREFIX/bin:$PATH"
4141
```
4242

@@ -154,16 +154,16 @@ Qemu is quite a large program, so it's recommended to make use of all cores when
154154

155155
## GDB
156156

157-
The steps for building GDB are similar to binutils and GCC. We'll create a temporary working directory and move into it. Gdb has a few extra dependencies we'll need:
157+
The steps for building GDB are similar to binutils and GCC. We'll create a temporary working directory and move into it. GDB has a few extra dependencies we'll need (package names may differ depending on the distribution used):
158158

159159
* libncurses-dev
160-
* libsource-highligh-dev
160+
* libsource-highlight-dev
161161

162162
```bash
163163
path/to/gdb_sources/configure --target=$TARGET --host=x86_64-linux-gnu --prefix="$PREFIX" --disable-werror --enable-tui --enable-source-highlight
164164
```
165165

166-
The last two options enable compiling the text-user-interface (`--enable-tui`) and source code highlighting (`--enable-source-highlight) which are nice-to-haves. These flags can be safely omitted if these aren't features we want.
166+
The last two options enable compiling the text-user-interface (`--enable-tui`) and source code highlighting (`--enable-source-highlight`) which are nice-to-haves. These flags can be safely omitted if these aren't features we want.
167167

168168
The `--target=` flag is special here in that it can also take an option `all` which builds gdb with support for every single architecture it can support. If we're going to develop on one machine but test on multiple architectures (via qemu or real hardware) this is nice. It allows a single instance of gdb to debug multiple architectures without needing different versions of gdb. Often this is how the 'gdb-multiarch' package is created for distros that have it.
169169

99_Appendices/H_Useful_Resources.md

Lines changed: 5 additions & 0 deletions
@@ -81,6 +81,11 @@ This appendix is a collection of links we found useful developing our own kernel
8181
- Osdev Wiki Page ELF Tutorial: [https://wiki.osdev.org/ELF_Tutorial](https://wiki.osdev.org/ELF_Tutorial)
8282
- x86-64 psABI: [https://gitlab.com/x86-psABIs/x86-64-ABI](https://gitlab.com/x86-psABIs/x86-64-ABI)
8383

84+
## C Language Infos
85+
86+
* [IBM inline assembly guide](https://www.ibm.com/docs/en/xl-c-and-cpp-linux/13.1.4?topic=compatibility-inline-assembly-statements) This is not gcc related, but the syntax is identical, and it contains useful and concise info on the constraints
87+
* [Fedora Inline assembly list of constraints](https://dmalcolm.fedorapeople.org/gcc/2015-08-31/rst-experiment/how-to-use-inline-assembly-language-in-c-code.html#simple-constraints)
88+
8489
## Nasm
8590

8691
* Nasm Struct Section: [https://www.nasm.us/xdoc/2.15/html/nasmdoc5.html#section-5.9.1](https://www.nasm.us/xdoc/2.15/html/nasmdoc5.html#section-5.9.1)

99_Appendices/J_Updates.md

Lines changed: 1 addition & 0 deletions
@@ -45,4 +45,5 @@ Fourth book release
4545

4646
* Typo fixes
4747
* Expand Syscall example chapter
48+
* Expand the ELF chapter and fix the code in one of the examples
4849

README.md

Lines changed: 1 addition & 1 deletion
@@ -71,7 +71,7 @@ The [latest-master](https://github.com/dreamportdev/Osdev-Notes/releases/tag/lat
7171
* [The Tar File System](08_VirtualFileSystem/03_TarFileSystem.md)
7272
* [Part 9: Loading & Executing ELFs](09_Loading_Elf/README.md)
7373
* [Theory](09_Loading_Elf/01_Elf_Theory.md)
74-
* [Loading and Running](02_Loading_Elf/03_Loading_And_Running.md)
74+
* [Loading and Running](09_Loading_Elf/02_Loading_And_Running.md)
7575
* [Part 10: Going Beyond](10_Going_Beyond/README.md)
7676
* [Extras: Appendices](99_Appendices/README.md)
7777
* [General Troubleshooting](99_Appendices/A_Troubleshooting.md)
