Understanding Linux Kernel Stack
Kernel stack pages
- In kernel 2.4.0, the kernel mode stack sits right above the task_struct. In do_fork, line 669 allocates two pages: the low addresses hold the task_struct, the high end holds pt_regs, and the kernel mode stack starts right below pt_regs.
Figure credit to Linux内核源代码情景分析
- In kernel 3.4.0, thread_info shares two pages with the kernel mode stack; see the definition of thread_union. The user mode registers are saved in pt_regs, which sits at the top of the two pages, the same as in kernel 2.4.0. The call trace is: do_fork -> copy_process -> copy_thread -> task_pt_regs. The kernel mode registers used in process switching are saved in thread_struct, which is a member of task_struct.
Figure credit to Understanding the Linux Kernel
- In kernel 3.18 aarch64, thread_union is 4 pages large.
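The two-page layout described for kernel 3.4.0 can be sketched in user space. This is only a mock under stated assumptions: `PAGE_SIZE` is taken as 4096, the `thread_info` and `pt_regs` bodies are hypothetical stand-ins with far fewer fields than the real ones, `mock_task_pt_regs` and `mock_stack_room` are made-up helper names, and any padding that a real architecture keeps at the top of the stack is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE   4096u              /* assumption: 4 KiB pages */
#define THREAD_SIZE (2 * PAGE_SIZE)    /* two pages, as described for 3.4.0 */

/* Hypothetical stand-ins: the real structs have many more fields. */
struct thread_info { unsigned long flags; void *task; };
struct pt_regs     { unsigned long regs[16]; };

/* thread_info and the kernel mode stack share the same pages. */
union thread_union {
    struct thread_info thread_info;                        /* low addresses */
    unsigned long stack[THREAD_SIZE / sizeof(unsigned long)];
};

/* pt_regs occupies the highest addresses of the stack area. */
static struct pt_regs *mock_task_pt_regs(union thread_union *tu)
{
    return (struct pt_regs *)((uintptr_t)tu + THREAD_SIZE) - 1;
}

/* Bytes left for the kernel mode stack between thread_info and pt_regs. */
static size_t mock_stack_room(void)
{
    return THREAD_SIZE - sizeof(struct thread_info) - sizeof(struct pt_regs);
}
```

The kernel mode stack starts right below the address returned by `mock_task_pt_regs` and grows down toward `thread_info`, so a stack overflow would overwrite `thread_info` first.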
Kernel stack usages
For AArch64 kernel 3.10.
- The kernel stack of a process is defined in union thread_union. The thread_info struct is at the beginning, while the remaining part is the kernel stack.
- struct thread_info is different from struct thread_struct, which is in task_struct. thread_struct contains cpu_context, which holds the callee-saved registers (x19, …, x28, x29, x30, sp).
- The end of the kernel stack is pt_regs, which contains all the general-purpose registers plus sp, pc, and pstate. The middle part of the kernel stack is similar to a user stack: the frame pointer (x29) points to the bottom of the current frame, which holds the frame pointer of the previous frame.
- From the above figure, it is easy to see that kernel stack frames are linked by the frame pointer, and from unwind_frame we know that the return address is at frame pointer + 8, at a higher address.
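The frame-pointer chain described above can be walked with a few lines of C. This is a user-space sketch, not the kernel's unwind_frame: the names `frame_record`, `mock_frame`, `mock_unwind_frame`, and `mock_backtrace` are hypothetical, the chain is assumed to end at a zero frame pointer, and the stack-bounds checks a real unwinder needs are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* AArch64 frame record: x29 points at this pair. */
struct frame_record {
    uint64_t prev_fp;   /* at fp + 0: frame pointer of the previous frame */
    uint64_t lr;        /* at fp + 8: return address                      */
};

struct mock_frame {
    uint64_t fp;
    uint64_t pc;
};

/* One unwind step; returns 0 on success, -1 at the end of the chain. */
static int mock_unwind_frame(struct mock_frame *f)
{
    if (f->fp == 0)
        return -1;

    const struct frame_record *rec =
        (const struct frame_record *)(uintptr_t)f->fp;

    f->pc = rec->lr;        /* *(fp + 8) */
    f->fp = rec->prev_fp;   /* *(fp)     */
    return 0;
}

/* Walk the chain, storing up to max return addresses in out. */
static size_t mock_backtrace(uint64_t fp, uint64_t *out, size_t max)
{
    struct mock_frame f = { fp, 0 };
    size_t n = 0;

    while (n < max && mock_unwind_frame(&f) == 0)
        out[n++] = f.pc;
    return n;
}
```

Each step only reads two words, which is why a backtrace is cheap as long as every function keeps x29 pointing at its frame record.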
cpu_context and pt_regs
struct cpu_context {
	unsigned long x19;
	unsigned long x20;
	unsigned long x21;
	unsigned long x22;
	unsigned long x23;
	unsigned long x24;
	unsigned long x25;
	unsigned long x26;
	unsigned long x27;
	unsigned long x28;
	unsigned long fp;	// x29
	unsigned long sp;
	unsigned long pc;	// x30 (lr) at the time of the switch
};

struct pt_regs {
	union {
		struct user_pt_regs user_regs;
		struct {
			u64 regs[31];
			u64 sp;
			u64 pc;
			u64 pstate;
		};
	};
	u64 orig_x0;
	u64 syscallno;
	u64 orig_addr_limit;
	u64 unused;	// maintain 16 byte alignment
};
Both of them contain the stack pointer sp and pc.
pt_regs is at the high end of the kernel stack and is mainly used for saving user registers during user-kernel mode switches. Therefore, after returning to user space, the first instruction executed is at pt_regs->pc. cpu_context is in task_struct->thread_struct and is mainly used for saving registers across context switches. So right after a context switch to a process, execution resumes at its cpu_context->pc.
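To make the layout of the two structs concrete, here is a user-space mirror that compiles outside the kernel. It is a sketch under assumptions: `u64` is defined locally, `user_pt_regs` is assumed to have the same four members as the anonymous struct, and the two accessors `mock_user_resume_pc` and `mock_kernel_resume_pc` are hypothetical helpers that only name the two resume points described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t u64;

/* Assumed to mirror the anonymous struct inside pt_regs. */
struct user_pt_regs { u64 regs[31]; u64 sp; u64 pc; u64 pstate; };

struct pt_regs {
    union {
        struct user_pt_regs user_regs;
        struct { u64 regs[31]; u64 sp; u64 pc; u64 pstate; };
    };
    u64 orig_x0;
    u64 syscallno;
    u64 orig_addr_limit;
    u64 unused;             /* keeps sizeof(struct pt_regs) a multiple of 16 */
};

struct cpu_context {
    unsigned long x19, x20, x21, x22, x23, x24, x25, x26, x27, x28;
    unsigned long fp, sp, pc;
};

/* Where execution continues after returning to user space. */
static u64 mock_user_resume_pc(const struct pt_regs *regs)
{
    return regs->pc;
}

/* Where execution continues after this task is switched in. */
static unsigned long mock_kernel_resume_pc(const struct cpu_context *ctx)
{
    return ctx->pc;
}
```

With this layout, sp sits right after the 31 general-purpose registers (offset 31 * 8 = 248), followed by pc and pstate, and the trailing `unused` field pads the struct to 304 bytes, a multiple of 16.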
User-kernel (syscall) mode switch
The switch from user mode to kernel mode happens in kernel_entry in entry.S. Note that the stack grows from high addresses to low addresses, while in the memory layout of pt_regs, low-numbered registers are at low addresses and high-numbered registers are at high addresses, so the high-numbered registers need to be pushed first. Correspondingly, the switch from kernel mode back to user mode pops all these registers, as shown in kernel_exit.
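The push order can be checked with a small simulation of a descending stack. This is a simplified sketch, not the assembly in entry.S: `mock_cpu`, `push`, `mock_kernel_entry`, and `mock_kernel_exit` are hypothetical names, registers are pushed one word at a time rather than in pairs, and the extra pt_regs fields after pstate are ignored.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NREGS 31

/* Hypothetical register file: x0..x30 plus the user sp, pc and pstate. */
struct mock_cpu {
    uint64_t x[NREGS];
    uint64_t sp_el0;
    uint64_t elr;
    uint64_t spsr;
};

/* Push one word on a stack that grows toward lower addresses. */
static uint64_t *push(uint64_t *sp, uint64_t v)
{
    *--sp = v;
    return sp;
}

/* Save registers from the high end of the frame down to regs[0]. */
static uint64_t *mock_kernel_entry(uint64_t *sp, const struct mock_cpu *cpu)
{
    sp = push(sp, cpu->spsr);      /* pstate: highest address */
    sp = push(sp, cpu->elr);       /* pc */
    sp = push(sp, cpu->sp_el0);    /* sp */
    for (int i = NREGS - 1; i >= 0; i--)
        sp = push(sp, cpu->x[i]);  /* x30 first, x0 last */
    return sp;                     /* now points at regs[0] */
}

/* Restore in the opposite order, popping from low to high addresses. */
static uint64_t *mock_kernel_exit(uint64_t *sp, struct mock_cpu *cpu)
{
    for (int i = 0; i < NREGS; i++)
        cpu->x[i] = *sp++;
    cpu->sp_el0 = *sp++;
    cpu->elr = *sp++;
    cpu->spsr = *sp++;
    return sp;
}
```

After `mock_kernel_entry`, the returned stack pointer can be read as an array whose element 0 is x0 and whose element 31 is the user sp, which matches the field order of pt_regs.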
Context switch
- During a context switch, cpu_switch_to in entry.S saves the registers to cpu_context.
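What cpu_switch_to does with cpu_context can be mimicked in plain C. This is a mock under assumptions, not the real assembly: `mock_cpu`, `mock_task`, and `mock_cpu_switch_to` are hypothetical names, the register file is an ordinary array, and the final `ret` is modelled by copying x30 into a `pc` field.

```c
#include <assert.h>
#include <string.h>

struct cpu_context {
    unsigned long x19, x20, x21, x22, x23, x24, x25, x26, x27, x28;
    unsigned long fp, sp, pc;
};

/* Hypothetical task: only the saved context matters here. */
struct mock_task { struct cpu_context ctx; };

/* Hypothetical register file: x0..x30, sp, and the current pc. */
struct mock_cpu {
    unsigned long x[31];
    unsigned long sp;
    unsigned long pc;
};

static void mock_cpu_switch_to(struct mock_cpu *cpu,
                               struct mock_task *prev,
                               const struct mock_task *next)
{
    /* Save x19..x29 (11 words), then sp and the return address in x30. */
    memcpy(&prev->ctx, &cpu->x[19], 11 * sizeof(unsigned long));
    prev->ctx.sp = cpu->sp;
    prev->ctx.pc = cpu->x[30];

    /* Load the same set from the next task. */
    memcpy(&cpu->x[19], &next->ctx, 11 * sizeof(unsigned long));
    cpu->sp = next->ctx.sp;
    cpu->x[30] = next->ctx.pc;

    /* "ret": continue at the address held in x30. */
    cpu->pc = cpu->x[30];
}
```

Only callee-saved registers are handled because the switch is an ordinary function call from the kernel's point of view, so the caller has already preserved x0 to x18 if it needs them.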
References
- Understanding the Linux Kernel
- Linux内核源代码情景分析