Kernel Virtualization with KVM/QEMU: Guest OS, QEMU, and KVM Workflow
This post walks through how the Guest OS, QEMU, and KVM work together on the x86 platform. As the figure shows, QEMU passes its commands to KVM through the KVM APIs:
1. Create the VM
system_fd = open("/dev/kvm", xxx);
vm_fd = ioctl(system_fd, KVM_CREATE_VM, xxx);
2. Create the VCPU
vcpu_fd = kvm_vm_ioctl(vm_fd, KVM_CREATE_VCPU, xxx);
3. Run KVM
status = kvm_vcpu_ioctl(vcpu_fd, KVM_RUN, xxx);
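The three steps above can be sketched against the raw KVM ioctl interface, without QEMU's kvm_vm_ioctl and kvm_vcpu_ioctl wrappers. This is a minimal sketch, not QEMU's code: the struct vm_handles type and the function names are made up for illustration, and only the ioctl requests come from <linux/kvm.h>. It assumes a Linux host with the kernel headers installed, and it returns a negative errno value when /dev/kvm is missing or inaccessible.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/kvm.h>

/* Everything the three steps produce (illustrative type, not from QEMU). */
struct vm_handles {
    int system_fd;          /* /dev/kvm itself */
    int vm_fd;              /* one per virtual machine */
    int vcpu_fd;            /* one per virtual CPU */
    struct kvm_run *run;    /* shared area KVM fills in on every exit */
    size_t run_size;
};

/* Release whatever has been acquired so far; safe to call repeatedly. */
void destroy_vm(struct vm_handles *h)
{
    if (h->run != NULL)
        munmap(h->run, h->run_size);
    if (h->vcpu_fd >= 0)
        close(h->vcpu_fd);
    if (h->vm_fd >= 0)
        close(h->vm_fd);
    if (h->system_fd >= 0)
        close(h->system_fd);
    h->run = NULL;
    h->run_size = 0;
    h->system_fd = h->vm_fd = h->vcpu_fd = -1;
}

/* Save errno before cleanup can overwrite it, then report it negated. */
static int fail(struct vm_handles *h)
{
    int err = errno != 0 ? errno : EIO;
    destroy_vm(h);
    return -err;
}

/* Steps 1 and 2. Returns 0 on success or a negative errno value. */
int create_vm_and_vcpu(struct vm_handles *h)
{
    int size;
    void *area;

    h->system_fd = h->vm_fd = h->vcpu_fd = -1;
    h->run = NULL;
    h->run_size = 0;

    /* 1. Open the KVM device and create a VM (machine type 0). */
    h->system_fd = open("/dev/kvm", O_RDWR);
    if (h->system_fd < 0)
        return fail(h);
    h->vm_fd = ioctl(h->system_fd, KVM_CREATE_VM, 0UL);
    if (h->vm_fd < 0)
        return fail(h);

    /* 2. Create VCPU number 0 inside that VM. */
    h->vcpu_fd = ioctl(h->vm_fd, KVM_CREATE_VCPU, 0UL);
    if (h->vcpu_fd < 0)
        return fail(h);

    /* Map the per-VCPU kvm_run area that describes each exit. */
    size = ioctl(h->system_fd, KVM_GET_VCPU_MMAP_SIZE, 0UL);
    if (size < 0)
        return fail(h);
    if ((size_t)size < sizeof(struct kvm_run)) {
        errno = EINVAL;
        return fail(h);
    }
    area = mmap(NULL, (size_t)size, PROT_READ | PROT_WRITE, MAP_SHARED,
                h->vcpu_fd, 0);
    if (area == MAP_FAILED)
        return fail(h);
    h->run = area;
    h->run_size = (size_t)size;
    return 0;
}

/* 3. Enter the guest once. Returns the exit reason or a negative errno. */
int run_vcpu_once(struct vm_handles *h)
{
    if (ioctl(h->vcpu_fd, KVM_RUN, 0UL) < 0)
        return -errno;
    return (int)h->run->exit_reason;
}
```

A real VMM also has to register guest memory (KVM_SET_USER_MEMORY_REGION) and set up the VCPU registers before KVM_RUN does anything useful; the sketch leaves that out to stay close to the three steps listed above.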
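Once QEMU has entered KVM through the KVM APIs, KVM switches into the Guest OS. If the running Guest OS needs to perform I/O, that is, access a physical device, QEMU and KVM have to emulate the access. If KVM can emulate it, KVM does so and then switches back to the Guest OS. If the emulation belongs to QEMU, control leaves KVM and returns to QEMU; once the device model in QEMU has finished emulating, QEMU calls kvm_vcpu_ioctl(vcpu_fd, KVM_RUN, xxx) again to re-enter KVM, which then switches back to the Guest OS.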
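The user-space half of this loop can be sketched as follows. It is an illustration under stated assumptions rather than QEMU's device model: handle_exit, run_loop, DEMO_PORT and the demo_port_log buffer are hypothetical names, and the only "device" is a one-byte output port that records what the guest writes. The struct kvm_run fields and the KVM_EXIT_* constants come from <linux/kvm.h>.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

enum exit_action { RESUME_GUEST, STOP_VM };

/* Hypothetical device model: bytes written to this port are logged. */
#define DEMO_PORT 0x3f8
char demo_port_log[64];
size_t demo_port_len;

/* Decide what to do with one exit that KVM handed back to user space. */
enum exit_action handle_exit(struct kvm_run *run)
{
    switch (run->exit_reason) {
    case KVM_EXIT_IO: {
        /* The I/O payload lives inside the kvm_run mapping itself. */
        const uint8_t *data = (const uint8_t *)run + run->io.data_offset;
        uint32_t i;

        if (run->io.port != DEMO_PORT ||
            run->io.direction != KVM_EXIT_IO_OUT ||
            run->io.size != 1)
            return STOP_VM;             /* nothing emulates this access */
        for (i = 0; i < run->io.count; i++) {
            if (demo_port_len < sizeof(demo_port_log))
                demo_port_log[demo_port_len++] = (char)data[i];
        }
        return RESUME_GUEST;            /* emulated: go back into KVM */
    }
    case KVM_EXIT_HLT:
    default:
        return STOP_VM;
    }
}

/* Enter KVM, emulate in user space when asked to, then enter KVM again. */
int run_loop(int vcpu_fd, struct kvm_run *run)
{
    for (;;) {
        if (ioctl(vcpu_fd, KVM_RUN, 0UL) < 0) {
            if (errno == EINTR)
                continue;               /* interrupted by a signal: retry */
            return -errno;
        }
        if (handle_exit(run) == STOP_VM)
            return 0;
    }
}
```

Exits that KVM emulates by itself never reach handle_exit: in that case the KVM_RUN ioctl simply does not return, which is the "KVM emulates, then switches back to the Guest OS" path described above.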
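(Figure erratum: if KVM can emulate the access, then after emulating it control should return layer by layer to kvm_x86_ops->run(vcpu) before entering the Guest OS, rather than entering it directly. The figure was already finished and is hard to change.)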
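QEMU is an ordinary application, so its entry point is of course main. Functions wrapped in type_init, however, are hooked up before main runs: each one is registered by a constructor that executes ahead of main. The code analyzed here emulates an x86 board based on the i440 chipset. main calls kvm_init to create a VM (virtual machine), then calls the machine hardware initialization functions, which emulate PCI, memory, and so on. Next it calls qemu_thread_create, which in turn calls pthread_create to create a thread; each VCPU runs on its own thread. The thread function, qemu_kvm_cpu_thread_fn, calls kvm_init_vcpu to create a VCPU (virtual CPU) and then calls kvm_vcpu_ioctl with KVM_RUN, which enters KVM.

The first function executed inside KVM has the same name, kvm_vcpu_ioctl, and the call eventually reaches kvm_x86_ops->run() to enter the Guest OS. If the Guest OS writes to a port, it executes an I/O instruction, which causes an exit from the Guest OS and a call to kvm_x86_ops->handle_exit. That pointer is set to vmx_handle_exit, which eventually calls kvm_vmx_exit_handlers[exit_reason](vcpu). kvm_vmx_exit_handlers is an array of function pointers indexed by exit reason, so the type of event that caused the exit selects the handler. Here the exit was caused by an I/O port access, so handle_io is chosen.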
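How a type_init-style function gets hooked up before main can be sketched with the GCC/Clang constructor attribute. To my understanding, QEMU's macro expands to a constructor that only registers the function, and the registered functions are called later from main. The sketch below is a simplified stand-in and not QEMU's module.h: demo_type_init, register_init and call_registered_inits are hypothetical names.

```c
#include <assert.h>

#define MAX_INIT_FNS 8

typedef void (*init_fn)(void);

/* Zero-initialized static storage, so it is valid before constructors run. */
static init_fn init_fns[MAX_INIT_FNS];
static int init_fn_count;

int devices_registered;     /* bumped by the demo init function below */

/* Remember an init function so it can be called later, in order. */
void register_init(init_fn fn)
{
    if (init_fn_count < MAX_INIT_FNS)
        init_fns[init_fn_count++] = fn;
}

/* Simplified stand-in for type_init: emit a constructor, which the loader
 * runs before main, whose only job is to register fn. */
#define demo_type_init(fn)                                          \
    static void __attribute__((constructor)) do_init_##fn(void)     \
    {                                                               \
        register_init(fn);                                          \
    }

/* A device file would define something like this and wrap it in the macro. */
static void demo_device_register_types(void)
{
    devices_registered++;
}
demo_type_init(demo_device_register_types)

/* Called early in main: run everything the constructors registered. */
void call_registered_inits(void)
{
    int i;

    for (i = 0; i < init_fn_count; i++)
        init_fns[i]();
}

int registered_count(void)
{
    return init_fn_count;
}
```

The point of the two-stage design is that each device source file can announce itself without main having to know its name: by the time main starts, the list is already populated, and main decides when to walk it.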
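static int (*kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = {
	[EXIT_REASON_EXCEPTION_NMI]           = handle_exception,
	[EXIT_REASON_EXTERNAL_INTERRUPT]      = handle_external_interrupt,
	[EXIT_REASON_TRIPLE_FAULT]            = handle_triple_fault,
	[EXIT_REASON_NMI_WINDOW]              = handle_nmi_window,
	[EXIT_REASON_IO_INSTRUCTION]          = handle_io,
	[EXIT_REASON_CR_ACCESS]               = handle_cr,
	[EXIT_REASON_DR_ACCESS]               = handle_dr,
	[EXIT_REASON_CPUID]                   = handle_cpuid,
	[EXIT_REASON_MSR_READ]                = handle_rdmsr,
	[EXIT_REASON_MSR_WRITE]               = handle_wrmsr,
	[EXIT_REASON_PENDING_INTERRUPT]       = handle_interrupt_window,
	[EXIT_REASON_HLT]                     = handle_halt,
	[EXIT_REASON_INVD]                    = handle_invd,
	[EXIT_REASON_INVLPG]                  = handle_invlpg,
	[EXIT_REASON_VMCALL]                  = handle_vmcall,
	[EXIT_REASON_VMCLEAR]                 = handle_vmclear,
	[EXIT_REASON_VMLAUNCH]                = handle_vmlaunch,
	[EXIT_REASON_VMPTRLD]                 = handle_vmptrld,
	[EXIT_REASON_VMPTRST]                 = handle_vmptrst,
	[EXIT_REASON_VMREAD]                  = handle_vmread,
	[EXIT_REASON_VMRESUME]                = handle_vmresume,
	[EXIT_REASON_VMWRITE]                 = handle_vmwrite,
	[EXIT_REASON_VMOFF]                   = handle_vmoff,
	[EXIT_REASON_VMON]                    = handle_vmon,
	[EXIT_REASON_TPR_BELOW_THRESHOLD]     = handle_tpr_below_threshold,
	[EXIT_REASON_APIC_ACCESS]             = handle_apic_access,
	[EXIT_REASON_WBINVD]                  = handle_wbinvd,
	[EXIT_REASON_XSETBV]                  = handle_xsetbv,
	[EXIT_REASON_TASK_SWITCH]             = handle_task_switch,
	[EXIT_REASON_MCE_DURING_VMENTRY]      = handle_machine_check,
	[EXIT_REASON_EPT_VIOLATION]           = handle_ept_violation,
	[EXIT_REASON_EPT_MISCONFIG]           = handle_ept_misconfig,
	[EXIT_REASON_PAUSE_INSTRUCTION]       = handle_pause,
	[EXIT_REASON_MWAIT_INSTRUCTION]       = handle_invalid_op,
	[EXIT_REASON_MONITOR_INSTRUCTION]     = handle_invalid_op,
};

If the handle_io function in KVM can handle the access, the guest is re-entered once handling completes. If the access has to be emulated in QEMU, then after the KVM code has run, control returns to QEMU, which calls its kvm_handle_io function. If that succeeds, QEMU calls kvm_vcpu_ioctl with KVM_RUN again to re-enter KVM; otherwise it exits with an error.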
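The table-driven dispatch used by kvm_vmx_exit_handlers can be shown in miniature. The sketch below is a toy model and not kernel code: struct demo_vcpu, the DEMO_EXIT_* values and the handlers are all hypothetical. It follows the return convention as I understand it from the KVM handlers, where a positive value means "re-enter the guest" and 0 means "return to user space", so treat that convention as an assumption here.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative exit reasons; a sparse numbering is what makes designated
 * initializers useful. */
enum demo_exit_reason {
    DEMO_EXIT_HLT = 12,
    DEMO_EXIT_IO  = 30
};

struct demo_vcpu {
    unsigned int exit_reason;
    int io_exits;
    int halted;
};

/* Handled inside the "kernel": tell the caller to re-enter the guest. */
static int demo_handle_io(struct demo_vcpu *vcpu)
{
    vcpu->io_exits++;
    return 1;
}

/* Needs user space: tell the caller to leave the run loop. */
static int demo_handle_halt(struct demo_vcpu *vcpu)
{
    vcpu->halted = 1;
    return 0;
}

/* Slots that are not named stay NULL, so unknown reasons are detectable. */
static int (*demo_exit_handlers[])(struct demo_vcpu *vcpu) = {
    [DEMO_EXIT_HLT] = demo_handle_halt,
    [DEMO_EXIT_IO]  = demo_handle_io,
};

/* Look the handler up by exit reason; -1 means nobody handles it. */
int demo_handle_exit(struct demo_vcpu *vcpu)
{
    size_t n = sizeof(demo_exit_handlers) / sizeof(demo_exit_handlers[0]);

    if (vcpu->exit_reason < n && demo_exit_handlers[vcpu->exit_reason])
        return demo_exit_handlers[vcpu->exit_reason](vcpu);
    return -1;
}
```

The bounds check and the NULL check matter because the table is sparse: an exit reason with no entry must be reported as unhandled instead of being called through a null pointer.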