Add support to 'lkvm debug' to inject arbitrary sysrqs using a new
'-s <sysrq>' argument.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
This removes the limit for p9 fids and the huge fid array that came along with
it. Instead, it dynamically allocates fids and stores them in a rb-tree.
This is useful when the guest needs a lot of fids, such as when stress
testing guests.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Recent kernels (>= v3.5-rc1) have an ioctl which allows us to retrieve the
list of page sizes supported for the guest.
So rework the cpu info code to use that ioctl when available, falling
back to the same values we used previously if the ioctl is not present.
We may also need to filter the list of page sizes against the page size
of the memory backing guest RAM - this accounts for the unfortunate amount
of code in setup_mmu_info().
Finally we need to turn the structure as returned by the kernel into the
format expected in the device tree.
Acked-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
We are about to add more logic to find_cpu_info(). To support this we
need to pass kvm through to it, and also restructure the return flow
so we can operate on info before it is returned.
Acked-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Using designated initializers for structs is preferable because it
is self-documenting and more robust against changes to the structure
layout.
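As an illustration of the difference, here is a minimal sketch. The
structure, its fields and the values are hypothetical and are not the
actual kvmtool definitions.

```c
#include <stdint.h>

/* Hypothetical structure, for illustration only; not the kvmtool definition. */
struct cpu_info_example {
	const char *name;
	uint32_t    tb_freq;	/* timebase frequency */
	uint32_t    d_bsize;	/* data cache block size */
	uint32_t    i_bsize;	/* instruction cache block size */
};

/* Positional form: silently changes meaning if fields are reordered or inserted. */
static const struct cpu_info_example positional = { "example", 512000000, 128, 128 };

/* Designated form: each value names its field; unmentioned fields are zeroed. */
static const struct cpu_info_example designated = {
	.name    = "example",
	.tb_freq = 512000000,
	.d_bsize = 128,
	.i_bsize = 128,
};
```

Both objects hold the same values today, but only the designated form
keeps doing so if a field is later added in the middle of the struct.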
Acked-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
On some powerpc platforms we need to make sure we only advertise page
sizes to the guest which are <= the size of the pages backing guest RAM.
So have mmap_hugetlbfs() save the hugetlbfs page size for us, and also
teach mmap_anon_or_hugetlbfs() to set the page size for anonymous mmap.
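The "<= backing page size" rule can be sketched as a small filter. The
function name and the representation of page sizes as shifts are
assumptions made for this example, not the code added by the patch.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Keep only the page-size shifts that do not exceed the backing page size.
 * 'shifts' is compacted in place and the number of kept entries is returned.
 * Assumes every shift is smaller than 64.
 */
static size_t filter_page_shifts(uint32_t *shifts, size_t n,
				 uint64_t backing_page_size)
{
	size_t i, kept = 0;

	for (i = 0; i < n; i++) {
		if ((1ULL << shifts[i]) <= backing_page_size)
			shifts[kept++] = shifts[i];
	}

	return kept;
}
```

With 64K backing pages, for example, shifts 12 and 16 are kept and a
16M shift of 24 is dropped.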
Acked-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
It implements essentially the same logic. The one difference is it sets
MAP_NORESERVE when using anonymous mmap, but I think that is OK.
Reword the comment about hugetlbfs, since we are no longer required to
use hugepages to back the guest.
Acked-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Introduce struct disk_image_params to contain all the disk image parameters.
This is useful for adding more disk image parameters, e.g. disk image
cache mode.
Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
If vhost is enabled for a virtio device, vhost will poll the ioeventfd
on the kernel side and there is no need to poll it in userspace. Otherwise,
the vhost kernel thread and userspace will race to poll it.
Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
VHOST_SET_MEM_TABLE failed: Operation not supported
In vhost_set_memory(), we have:
	if (mem.padding)
		return -EOPNOTSUPP;
So we need to zero struct vhost_memory.
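A minimal sketch of the idea, using local stand-in structures rather
than the kernel headers. The type and function names are hypothetical;
the point is only that calloc() leaves the padding field zeroed.

```c
#include <stdint.h>
#include <stdlib.h>

/* Local stand-ins for illustration; not the definitions from the kernel headers. */
struct mem_region_example {
	uint64_t guest_phys_addr;
	uint64_t memory_size;
	uint64_t userspace_addr;
	uint64_t flags_padding;
};

struct mem_table_example {
	uint32_t nregions;
	uint32_t padding;	/* must be zero, or the table is rejected */
	struct mem_region_example regions[];
};

/* Allocate a table with every byte, including 'padding', set to zero. */
static struct mem_table_example *mem_table_alloc(uint32_t nregions)
{
	struct mem_table_example *mem;

	mem = calloc(1, sizeof(*mem) + nregions * sizeof(mem->regions[0]));
	if (!mem)
		return NULL;

	mem->nregions = nregions;
	return mem;
}
```

Using malloc() here instead would leave the padding field with
whatever happened to be in memory, which is how the error above shows up.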
Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
I think this code was based on an earlier version of the KVM_SET_ONE_REG
API, which at the time was in agraf's tree but not mainline?
Either way it doesn't compile as is, so fix it up.
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
The sed expression for ARCH seems to have been cribbed from the top-level
kernel Makefile, and includes lots of architectures kvmtool doesn't
support - strip it down.
Also call uname -m directly there and get rid of uname_M.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Allow CROSS_COMPILE to be used to prefix CC as is done in the kernel
Makefile. If CROSS_COMPILE is unset it has no effect, and still allows
CC to be overridden.
We need to fix a few places to use ARCH instead of uname_M directly, so
that the overridden setting of ARCH takes effect.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Commit 82ea06e "Introduce KVM_VIRTIO_MMIO_AREA" did just that, but only
for x86, causing the following commit 5c301a3 "Add virtio-mmio support"
to break the build for powerpc.
We follow what x86 did and place it 16MB past the PCI area. I have no
idea if that is actually a good idea, or whether it works at all.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
We need to set the HYPERVISOR flag to let the kernel know we're running
under a hypervisor.
This makes the kernel enable all sorts of para-virtualization options
such as kvm-clock.
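For reference, the hypervisor-present flag is bit 31 of ECX in CPUID
leaf 1. The sketch below shows only the bit manipulation; the macro and
function names are made up for this example and are not taken from the
kvmtool source.

```c
#include <stdint.h>

/* CPUID leaf 1, ECX bit 31: "running under a hypervisor". */
#define EXAMPLE_CPUID_1_ECX_HYPERVISOR	(1u << 31)

/* Return the leaf-1 ECX value to expose to the guest. */
static uint32_t example_guest_cpuid_1_ecx(uint32_t host_ecx)
{
	return host_ecx | EXAMPLE_CPUID_1_ECX_HYPERVISOR;
}
```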
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
[gorcunov@: Add comments on bits]
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Since we process 9p requests serially, there is no point in implementing
flush, but we still need to answer it to prevent the guest kernel from
hanging while it waits for a reply.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
All blk requests are processed in notify_vq(), which runs in the context of
the ioeventfd thread: ioeventfd__thread(). The processing in notify_vq() may
take a long time to complete and all devices share the single ioeventfd
thread, so this might delay calls to other devices' notify_vq() and
starve those devices.
This patch makes virtio blk's notify_vq() just notify the blk thread
instead of doing the real hard read/write work. Tests show that the
overhead of the notification operations is small.
The reasons for using a dedicated thread instead of the thread pool
follow:
1) In the thread pool model, each job handling operation,
thread_pool__do_job(), takes about 6 or 7 mutex_{lock,unlock} ops. Most
of the mutexes are global (job_mutex) and are contended by the threads
in the pool. That is fine for the non performance critical virtio devices,
such as console, rng, etc. But it is not optimal for net and blk devices.
2) Using dedicated threads to handle blk requests opens the door for
the user to set a different IO priority for the blk threads.
3) It also reduces the contention between net and blk devices if they
do not share the thread pool.
Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
The queue size for virtio_blk is 256 and AIO_MAX is 32, so we might run
short of available aio events if the guest issues > 32 requests
simultaneously. The following error is observed when the guest runs a
stressful I/O workload:
Info: disk_image__read error: total=-11
To fix this, let's increase the aio events limit.
Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Use str[n|l] functions to make sure the destination is
not overflowed.
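A bounded path builder could look roughly like the sketch below. It
uses snprintf() as a stand-in for the str[n|l] functions named above,
and the helper name and path layout are assumptions for the example.

```c
#include <stdio.h>

/*
 * Build "<dir>/<name>.sock" into dst. Returns 0 on success, or -1 if the
 * result would not fit. For size > 0, dst is always NUL-terminated.
 */
static int build_sock_path(char *dst, size_t size,
			   const char *dir, const char *name)
{
	int n = snprintf(dst, size, "%s/%s.sock", dir, name);

	if (n < 0 || (size_t)n >= size)
		return -1;

	return 0;
}
```

The caller gets an explicit error on truncation instead of a silently
shortened socket path.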
Socket path generation should probably be moved into
a separate helper, but that is for another patch.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
---------
Before:
---------
*** Compatibility Warning ***
virtio-blk device was not detected
While you have requested a virtio-blk device, the guest kernel did not initialize it.
Please make sure that the guest kernel was compiled with CONFIG_VIRTIO_BLK=y enabled in its .config
*** Compatibility Warning ***
virtio-net device was not detected
While you have requested a virtio-net device, the guest kernel did not initialize it.
Please make sure that the guest kernel was compiled with CONFIG_VIRTIO_NET=y enabled in its .config
# KVM session ended normally.
---------
After:
---------
# KVM compatibility warning.
virtio-blk device was not detected.
While you have requested a virtio-blk device, the guest kernel did not initialize it.
Please make sure that the guest kernel was compiled with CONFIG_VIRTIO_BLK=y enabled in .config.
# KVM compatibility warning.
virtio-net device was not detected.
While you have requested a virtio-net device, the guest kernel did not initialize it.
Please make sure that the guest kernel was compiled with CONFIG_VIRTIO_NET=y enabled in .config.
# KVM session ended normally.
Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
compat_id is initialized to -1 for each type of device. We should add
a compat message only if compat_id == -1, which means we have not yet
added a compat message for this type of device.
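The check can be sketched as follows. The counter and function name are
hypothetical stand-ins for the real compat API, which is not shown here.

```c
/* Hypothetical id allocator standing in for the real compat message registry. */
static int example_next_compat_id;

/*
 * Register a message only once per device type.
 * *compat_id == -1 means nothing has been registered for this type yet.
 */
static int example_compat_add_once(int *compat_id)
{
	if (*compat_id == -1)
		*compat_id = example_next_compat_id++;

	return *compat_id;
}
```

A second device of the same type then reuses the existing id instead of
adding a duplicate message.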
Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
This patch introduces a helper, virtio_compat_add_message(), to simplify
adding a compat message for a virtio device.
Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
We read and write in sectors by default. It makes little sense to add
the extra _sector string for read and write ops/function name.
Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Keep trying if io_submit returns EAGAIN. No need to fail the request.
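The retry pattern, sketched with a generic callback in place of the
real io_submit() call. The callback type and both function names are
invented for this example.

```c
#include <errno.h>

/* Hypothetical submit callback: returns >= 0 on success or a negative errno. */
typedef int (*example_submit_fn)(void *ctx);

/* Call 'submit' again for as long as it reports -EAGAIN. */
static int example_submit_retry(example_submit_fn submit, void *ctx)
{
	int ret;

	do {
		ret = submit(ctx);
	} while (ret == -EAGAIN);

	return ret;
}

/* Test double: reports -EAGAIN until the counter behind ctx reaches zero. */
static int example_flaky_submit(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -EAGAIN;
	}

	return 1;
}
```

This loop spins without backing off, which is only reasonable when
EAGAIN is expected to clear quickly, as when completions free up events.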
Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
The new directory name is simpler and easier to type and remember.
Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
The req_mutex was used to protect the request list. In commit b7b038d, I
removed the use of virtio_blk_req_{pop, push}, which needed the
req_mutex, but I forgot to remove the req_mutex itself. So remove it now.
Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Increase the number of FIDs, since we easily reach the current limit with
simple stress tests.
This should be changed to use an rbtree or something faster in the
future.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Since the 9p functions don't know the size of the fid array, they might
request an FID outside of the allowed range. Use an accessor to prevent
that and to hide the internal implementation from them.
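Such an accessor might look like the sketch below. The structure names
and the array size are placeholders, not the actual 9p definitions.

```c
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_MAX_FID	256	/* hypothetical limit for this sketch */

struct example_fid {
	uint32_t fid;
	int	 in_use;
};

struct example_p9_dev {
	struct example_fid fids[EXAMPLE_MAX_FID];
};

/* Return the slot for 'fid', or NULL when the requested fid is out of range. */
static struct example_fid *example_get_fid(struct example_p9_dev *dev,
					   uint32_t fid)
{
	if (fid >= EXAMPLE_MAX_FID)
		return NULL;

	return &dev->fids[fid];
}
```

Callers never index the array directly, so the storage can later change
without touching them.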
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
We freed the structures but never removed them from the tree or list, so
we freed them again the next time we ran through that structure.
This patch also simplifies irq__exit a bit.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Fixes the following build breakage with '-Werror':
cc1: warnings being treated as errors
x86/boot.c: In function ‘kvm__load_firmware’:
x86/boot.c:29: error: format ‘%lu’ expects type ‘long unsigned int’, but
argument 3 has type ‘__off64_t’
make: *** [x86/boot.o] Error 1
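One common way to avoid this class of warning is to cast the offset to
a type with a known format specifier. The helper below is a sketch of
that approach and is not necessarily the change made in x86/boot.c.

```c
#include <stdio.h>
#include <sys/types.h>

/* Format a file offset portably by casting it to long long. */
static int example_format_offset(char *buf, size_t size, off_t off)
{
	return snprintf(buf, size, "%lld", (long long)off);
}
```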
Signed-off-by: Jean-Philippe Menil <jean-philippe.menil@univ-nantes.fr>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Otherwise I get the following compile problem on my Fedora
machine. The helper is taken from the Linux kernel.
| [cyrill@moon kvm]$ make tags
| x86/include/kvm/barrier.h:11:25: fatal error: asm/barrier.h: No such file or directory compilation terminated.
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
This switches the default behaviour of lkvm when an access
to an unregistered MMIO address happens: we no longer spam
the user with warning messages. If details on unhandled MMIO
accesses are needed, the --debug-mmio option should be passed.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Commit 2084c24 ("do not export kernel's NULL #define to userspace") broke the
KVM tool build:
FYI:
CC framebuffer.o
In file included from include/kvm/framebuffer.h:5:0,
from framebuffer.c:1:
../../include/linux/list.h: In function ‘INIT_HLIST_NODE’:
../../include/linux/list.h:572:12: error: ‘NULL’ undeclared (first use in this function)
../../include/linux/list.h:572:12: note: each undeclared identifier is reported only once for each function it appears in
../../include/linux/list.h: In function ‘hlist_move_list’:
../../include/linux/list.h:657:15: error: ‘NULL’ undeclared (first use in this function)
make: *** [framebuffer.o] Error 1
due to this upstream commit:
2084c24a8141 do not export kernel's NULL #define to userspace
Fix that.
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
CONFIG_FB_VESA is needed for --sdl or --vnc. Update the README accordingly.
Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>