From the course: Advanced Linux: The Linux Kernel

Understand system call mechanics - CentOS Tutorial

Understand system call mechanics

- [Teacher] Let's talk more about system calls and their mechanics: what's going on with a system call, and what really is a system call? System calls, again, are functions implemented by the kernel that are meant to be called from user space, so there's going to be a correlation between a function you call in user space and a function in the kernel. Now, for a recent kernel like 5.3, there are about 340 system calls. That's a moderately big number, and some of those are really obscure special cases that were used by only one program, but lots of the common functions you might be familiar with, like read and write, are system calls and they're in that list.
When you have the Linux kernel source code, you can look in the "include" directory, under "uapi" (the user-space API), underneath "asm-generic", at the file "unistd.h", and in there it lists the system calls. You might want to note that the number of system calls actually varies from architecture to architecture. On ARM, there might be some system calls that are not available that are available on x86-64. It's a small number, but you want to be aware that it's not an absolute fixed number when we say about 340.
And because you're going to be calling these functions from user space, there are man pages for them, and the functions you can call are documented in section two. Typically there is a one-to-one correspondence between a function you call and a function in the kernel, like the read function. If you look up read in the manual, that documents the function you call, the one implemented in the library, but in the kernel the read function could be a little bit different. Sometimes the parameter list is a little different, because the library does some of the work, but typically they're pretty similar.
Now, your program is going to call "read", and that's going to go into, say, the standard C library, and the standard C library is going to somehow poke the kernel and say, "Do the read function, and here are the parameters." How that poking works is architecture dependent. There's probably a special assembly instruction that the library code will execute to tell the kernel, "Hey, do the system call," and there has to be some protocol for how the library tells the kernel which system call to do and what the parameters are.
Because that depends on the architecture, you have to get a standard C library built for ARM, or MIPS, or x86-64, and so forth, since each one has to have the special assembly instruction for making a system call. Wouldn't it be a pain to do that yourself? If you wanted your program to run on ARM, you'd have to change the code that makes the system call. That's why having a library is so handy: you just call the function and the library does the right thing.
So the library invokes the kernel, and the kernel figures out which system call it is, gets the parameters, and does it. Then it has to return, and maybe there was an error. The kernel has to somehow tell the library that things worked out, or that things didn't work out and here's the error. The way that works is the function in the kernel returns a negative value through the ordinary return mechanism, and the absolute value of that negative value is an error code number, saying you didn't have permission, or that was an invalid value, or whatever the error was. So the library gets the return value back from the kernel, and if that value is negative, the library sets a global variable in your process's address space called "errno" to the absolute value of what it got back from the kernel.
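
The return-value translation just described can be modeled in a few lines. This is only a sketch of the convention, not the C library's actual code: the function name wrapper_result is made up for illustration, and it treats any negative kernel return value as an error, which is a simplification.

```python
import errno
import os


def wrapper_result(kernel_ret):
    """Model what a library wrapper does with the raw kernel return value.

    Returns (value handed back to the program, value stored in errno).
    The second item is None when errno would be left untouched.
    Hypothetical helper, simplified: any negative value counts as an error.
    """
    if kernel_ret < 0:
        # Kernel reported failure: caller sees -1, errno gets the absolute value.
        return -1, -kernel_ret
    # Success: pass the kernel's value through and do not touch errno.
    return kernel_ret, None


# "You didn't have permission": the kernel would return -EACCES.
ret, err = wrapper_result(-errno.EACCES)
print(ret, err, os.strerror(err))

# A successful call, for example a read that returned 5 bytes.
print(wrapper_result(5))
```

The second case is the reason a stale "errno" is possible: on success the wrapper passes the value through and never resets the error variable.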
So if it got back -10 from the kernel, "errno" will be 10, and that 10 will mean whatever error number 10 is. But then the library returns the value -1 to your program. So if you called read and did something wrong, the library would call read in the kernel through its mechanism, the kernel would return, say, -10, and then in your program the return value you got from read would be -1, but "errno" would be set to 10, so you can look at "errno" to find out what the error was. It's the library that sets "errno", and typically the library will not bother setting "errno" if there wasn't an error. So "errno" could have some stale value in it, and you want to be aware of that. When you make a system call, you always want to check whether it worked, and if you got -1, it didn't. Don't look at "errno" to decide whether there was an error; look at it only after the return value tells you there was one.
I put the source code for Linux kernel 5.3.0 on my computer here underneath the /usr/src directory. In the Linux source tree we have a subdirectory, "include/uapi/asm-generic", and in it the file "unistd.h", which is about 30K in size. If we count the lines in it to get an idea, it's 903 lines long. That file is supposed to have stuff about system calls, so let's just grep -i, to ignore case, for "read" in that file, and I'm going to use that handy little shell shortcut, "!$", which means the last word on the previous command line. If you didn't know that, it's going to save you a lot of time. So we grep for "read" in any case in that header file and we get a number of lines. There are some comments, and towards the top we see "#define __NR_read 63". So we're defining a macro, and what that means is "__NR_read" has the value 63, so system call "read" is number 63. Then we go down a couple of lines and see "readv 65", so it turns out there's a system call called "readv", and there's a "pread", and there's a "preadv", and there's a "readlinkat", and so forth.
So there are several system calls that have read in their name, but the one called just "read" is number 63. That's really trivia: you'll probably never use 63 yourself, and you don't need to care about it, and different architectures can have different numbers for read, but the library needs to know that. So when you build a standard C library, it would use this header file, so that when user space calls "read" in the library, the library invokes the kernel saying, "I'm trying to do system call number 63." That's the key to how the library tells the kernel which system call to do.
And if we were to grep this file for, let's say, "define __NR", we're guessing there's going to be one line like that for every system call. So we grep for that in our "unistd.h" and count the matches: 340. That's how I got my estimate of how many system calls there are.
Let's look at the documentation. Let's look at "read", so we say "man read", and if we look in the upper left-hand corner, it says "read(2)". That means section 2, which is the system calls. We see "read - read from a file descriptor", and then a synopsis of what to include. That "unistd.h" is the user-space header file. It has the same name as the header file we just looked at, but remember, the file we just looked at is in the kernel. That's a kernel header file, not a user-space file, and you don't want to confuse those; they might be completely different. Then we see the "read" function takes three parameters and returns a size. Now if we scroll down in the man page, system call man pages have a return value section. We see what happens when it worked: it says the "number of bytes read is returned", or, in the parens there, "zero indicates end of file". And if we go to the second paragraph in there, it says, "On error, -1 is returned, and errno is set appropriately." That phrase is pretty much cut and pasted into every system call man page. So that's what we just explained. Let's look at another system call.
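
Before moving on, the three outcomes from the read man page (a byte count, zero at end of file, and -1 with "errno" set) can be observed by calling the C library's read directly. This is a minimal sketch, assuming a Linux system where the C library can be loaded with ctypes.CDLL(None); the helper name c_read is made up for this example.

```python
import ctypes
import errno
import os

# Load the C library the process is already linked against (Linux assumption).
# use_errno=True lets ctypes capture the library's errno after each call.
libc = ctypes.CDLL(None, use_errno=True)
libc.read.argtypes = [ctypes.c_int, ctypes.c_void_p, ctypes.c_size_t]
libc.read.restype = ctypes.c_ssize_t


def c_read(fd, count):
    """Call the library's read(); return (return value, errno or None, data).

    Hypothetical helper for illustration only.
    """
    buf = ctypes.create_string_buffer(count)
    ret = libc.read(fd, buf, count)
    if ret == -1:
        # Only consult errno after the return value says there was an error.
        return ret, ctypes.get_errno(), b""
    return ret, None, buf.raw[:ret]


r, w = os.pipe()
os.write(w, b"hello")
ok = c_read(r, 16)    # bytes available: the return value is the byte count
os.close(w)
eof = c_read(r, 16)   # writer closed and pipe drained: zero means end of file
os.close(r)
bad = c_read(-1, 16)  # not a valid file descriptor: -1, errno is EBADF

print(ok, eof, bad, errno.errorcode[bad[1]])
```

Note that the helper checks the return value first and reads the error number only in the -1 case, which is the order the man page implies.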
So let's say "man kill", not to be quoted out of context, and we look at the man page, and, oops, if we look in the upper left-hand corner it says one. We are looking at section one. Section one is for commands, so this is probably what you knew: you knew there was a kill command, and you are probably used to using kill -9 to kill a process. In fact, maybe we shouldn't pronounce "kill" as "kill"; maybe we should pronounce it "send signal", because we can send any signal with the "kill" command. It's really a generic way to send a signal. Number nine is the "kill" signal, but there are other signals we can send too. So if we want to know about the system call for "kill", we have to say "man 2", and that sounds bad out of context too, doesn't it, "man 2 kill"? There we see what we need to include, we see the function takes a couple of parameters, and we see a description and the return value: on success, "(at least one signal was sent), zero is returned". So zero means success for "kill", while zero meant end of file for "read". But again, "On error, -1 is returned, and errno is set appropriately."
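
The same convention can be checked for kill. This is a sketch under the same assumptions as any ctypes example on Linux (the C library is loadable with ctypes.CDLL(None)); c_kill is a made-up helper name. It sends signal 0, which per the kill(2) man page performs the existence and permission checks without actually sending a signal, so the script does not terminate itself.

```python
import ctypes
import errno
import os
import signal

libc = ctypes.CDLL(None, use_errno=True)  # Linux assumption
libc.kill.argtypes = [ctypes.c_int, ctypes.c_int]
libc.kill.restype = ctypes.c_int


def c_kill(pid, sig):
    """Call the library's kill(); return (return value, errno or None).

    Hypothetical helper for illustration only.
    """
    ret = libc.kill(pid, sig)
    if ret == -1:
        return ret, ctypes.get_errno()
    return ret, None


# The "kill" signal that kill -9 sends is signal number 9.
SIGKILL_NUMBER = int(signal.SIGKILL)

# Signal 0 to our own process: checks only, nothing is delivered.
ok = c_kill(os.getpid(), 0)

# A signal number that does not exist: the call fails with EINVAL.
bad = c_kill(os.getpid(), 9999)

print(SIGKILL_NUMBER, ok, bad)
```

Here zero is the success value, while for read zero meant end of file, so the meaning of a non-negative return always comes from that call's man page; only the -1 error convention is shared.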
