Tasks
Suggestion
Please read through this document before working on any task.
Design Document
Download the design document template of project 2. Read through questions in the document for motivations, and fill it in afterwards.
Argument Passing
Currently, the fn execute(mut file: File, argv: Vec<String>)
does not pass arguments to new processes. You need to implement it in this task.
- Currently the
argv
argument is not used. To implement this functionality, you need to pass theargv
to the user program. - The kernel must put the arguments for the initial function on the stack to allow the user program to assess it. The user program will access the arguments following the normal calling convention (see RISC-V calling convention).
- You can impose a reasonable limit on the length of the command line arguments. For example, you could limit the arguments to those that will fit in a single page (4 kB).
Hint
You can parse argument strings any way you like. A viable implementation maybe like:
- Put
String
s in theargv
at the top (high address) of the stack. You should terminate those string with"\0"
when putting them on the stack.- Push the address of each string (i.e. the contents of user
argv[]
) on the stack. Those values are pointers, you should align them to 8 bytes to ensure fast access. Don't forget thatargv[]
ends with aNULL
.- Prepare the argument
argc
andargv
, and then deliver them to the user program. They are the first and second arguments ofmain
.Here is an argument passing example in 80x86 provided by Pintos. Use the example to help you understand how to set up the stack.
Process Control Syscalls
TacOS currently supports only one syscall — exit
, which terminates the calling process. You will add support for the following new syscalls: halt
, exec
, wait
.
Rust has no stable application binary interfaces at the moment. Therefore, all syscalls are represented in C signiture and user programs are also written in C. (Though, it's not hard at all to write a Rust user program if you try.)
The capability of switching in/out of the user mode is well handled by the skeleton code, and arguments passed by users will be forwarded to:
// src/kernel/trap/syscall.rs
pub fn syscall_handler(id: usize, args: [usize; 3]) -> isize
id
and args
are the number identifier and arguments (at most 3) of the syscall being called, and the return value equals the syscall's return value.
Basically, you dispatch an environment call from users to one of syscalls based on the id
. Then, implement each syscall's functionality as described below.
halt
void halt(void);
Terminates TacOS by calling the shutdown
function in src/sbi.rs
. This should be seldom used, because you lose some information about possible deadlock situations, etc.
exit
void exit(int status);
Terminates the current user program, returning status to the kernel. If the process’s parent waits for it (see below), this is the status that will be returned. Conventionally, a status of 0 indicates success and non-zero values indicate errors.
Note
Every user program that finishes in normally calls exit – even a program that returns from main calls exit indirectly.
exec
int exec(const char* pathname, const char* argv[]);
Runs the executable whose path is given in pathname
, passing any given arguments, and returns the new process’s program id (pid). argv
is a NULL
-terminated array of arguments, where the first argument is usually the executable's name. If the program cannot load or run for any reason, return -1. Thus, the parent process cannot return from a call to exec until it knows whether the child process successfully loaded its executable. You must use appropriate synchronization to ensure this.
Note
Keep in mind
exec
is different from Unixexec
. It can be thought of as a Unixfork
+ Unixexec
.
Tip: Access User Memory Space
exec
is the first syscall where a user passes a pointer to the kernel. Be careful and check if the pointer is valid. You may look up the page table to see if the pointer lays in a valid virtual memory region. Also, do make sure that each character within a string has a valid address. Don't just check the first address.With any invalid arguments, it's just fine to return -1. As for other syscalls with pointer arguments, please always check the validity of their memory addresses.
wait
int wait(int pid);
Waits for a child process pid and retrieves the child’s exit status. If pid is still alive, waits until it terminates. Then, returns the status that pid passed to exit. If pid did not call exit but was terminated by the kernel (e.g. killed due to an exception), wait must return -1. It is perfectly legal for a parent process to wait for child processes that have already terminated by the time the parent calls wait, but the kernel must still allow the parent to retrieve its child’s exit status, or learn that the child was terminated by the kernel.
wait must fail and return -1 immediately if any of the following conditions are true:
-
pid does not refer to a direct child of the calling process. pid is a direct child of the calling process if and only if the calling process received pid as a return value from a successful call to exec. Note that children are not inherited: if A spawns child B and B spawns child process C, then A cannot wait for C, even if B is dead. A call to wait(C) by process A must fail. Similarly, orphaned processes are not assigned to a new parent if their parent process exits before they do.
-
The process that calls wait has already called wait on pid. That is, a process may wait for any given child at most once.
Processes may spawn any number of children, wait for them in any order, and may even exit without having waited for some or all of their children. Your design should consider all the ways in which waits can occur. All of a process’s resources, including its struct thread, must be freed whether its parent ever waits for it or not, and regardless of whether the child exits before or after its parent.
Tip
Implementing wait requires considerably more work than the other syscalls. Schedule your work wisely.
File Operation Syscalls
In addition to process control syscalls, you will also implement the following file operation syscalls,
including open
, read
, write
, remove
, seek
, tell
, fstat
, close
. A lot of syscalls, right? Don't worry. It's not hard at all to implement them, as the skeleton code has a basic thread-safe file system already. Your implementations will simply call the appropriate functions in the file system library.
open
int open(const char* pathname, int flags);
Opens the file located at pathname
. If the specified file does not exist, it may optionally (if O_CREATE
is specified in flags
) be created by open
. Returns a non-negative integer handle called a file descriptor (fd), or -1 if the file could not be opened.
The argument flags
must include one of the following access modes: O_RDONLY
, O_WRONLY
, or O_RDWR
. These request opening the file read-only, write-only, or read/write, respectively.
File descriptors numbered 0 and 1 are reserved for the console: 0 is standard input and 1 is standard output. open
should never return either of these file descriptors, which are valid as system call arguments only as explicitly described below.
Each process has an independent set of file descriptors. File descriptors in TacOS are not inherited by child processes.
When a single file is opened more than once, whether by a single process or different processes, each open returns a new file descriptor. Different file descriptors for a single file are closed independently in separate calls to close
and they do not share a file position.
read
int read(int fd, void* buffer, unsigned size);
Reads size
bytes from the file open as fd
into buffer. Returns the number of bytes actually read (0 at end of file), or -1 if the file could not be read (due to a condition other than end of file, such as fd is invalid). File descriptor 0 reads from the keyboard.
write
int write(int fd, const void* buffer, unsigned size);
Writes size
bytes from buffer to the open file with file descriptor fd
. Returns the number of bytes actually written. Returns -1 if fd does not correspond to an entry in the file descriptor table. Writing past end-of-file would normally extend the file, and file growth is implemented by the basic file system.
File descriptor 1 and 2 should write to the console. You can simply use the kprint
macro which also applys a lock on Stdout
for you, making sure that output by different processes won't be interleaved on the console.
Tip
Before you implement syscall
write
for fd 1, many test cases won't work properly.
remove
int remove(const char* pathname);
Deletes the file at pathname
. Returns 0 if successful, -1 otherwise. A file may be removed regardless of whether it is open or closed, and removing an open file does not close it.
What happens when an open file is removed?
You should implement the standard Unix semantics for files. That is, when a file is removed any process which has a file descriptor for that file may continue to use that descriptor. This means that they can read and write from the file. The file will not have a name, and no other processes will be able to open it, but it will continue to exist until all file descriptors referring to the file are closed or the machine shuts down.
seek
void seek(int fd, unsigned position);
Changes the next byte to be read or written in open file fd
to position
, expressed in bytes from the beginning of the file (0 is the file’s start). If fd does not correspond to an entry in the file descriptor table, this function should do nothing.
A seek past the current end of a file is not an error. A later read obtains 0 bytes, indicating end of file. A later write extends the file, filling any unwritten gap with zeros. These semantics are implemented in the file system and do not require any special effort in the syscall implementation.
tell
int tell(int fd)
Returns the position of the next byte to be read or written in open file fd
, expressed in bytes from the beginning of the file. If the operation is unsuccessful, it returns -1.
fstat
// user/lib/fstat.h
typedef struct {
uint64 ino; // Inode number
uint64 size; // Size of file in bytes
} stat;
int fstat(int fd, stat* buf);
Fills the buf
with the statistic of the open file fd
. Returns -1 if fd does not correspond to an entry in the file descriptor table.
close
int close (int fd);
Closes file descriptor fd
. Exiting or terminating a process must implicitly close all its open file descriptors, as if by calling this function for each one. If the operation is unsuccessful, it returns -1.