We have always built kvmtool as 64-bit on powerpc, but mainly just out
of habit. There's not AFAIK any reason we *can't* build 32-bit.
So fix up a few places where we were assuming 64-bit, and drop the
Makefile logic that forces 64-bit.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Now that we don't have the kernel header on hand, just define the
minimum set of hcall opcodes and return values we need in order to
build.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Instead of referring to the Linux header including the barrier
macros, copy over the rather simple implementation for the PowerPC
barrier instructions kvmtool uses. This fixes build for powerpc.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
kvmtool used the in-kernel version of the device tree handling
library. Now that we are a proper userland tool, use the system's
library for that purpose. This also seems to fix a long-standing
warning generated by the Linux copy.
Also fix up a bogus x86 warning (no FDT needed here).
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Similarly to the generic uapi/linux/kvm.h, each architecture
carries a kvm.h header in its arch/*/include/uapi/asm directory.
These contain bits for the architecture specific interface.
Since we use many recent features in kvmtool, the system headers
provided by the distribution are usually not up-to-date.
Copy the Linux v4.1-rc6 versions of those files for all supported
architectures into the kvmtool tree to get access to the full glory.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This is usually 0 for most archs. On mips we have two types:
TE (type 0) and MIPS-VZ (type 1). Default to 1 on mips.
Signed-off-by: Andreas Herrmann <andreas.herrmann@caviumnetworks.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
The recent introduction of bi-endianness on arm/arm64 had the
odd effect of breaking virtio-pci support on these platforms, as the
device endian field defaults to being VIRTIO_ENDIAN_HOST, which
is the wrong thing to have on a bi-endian capable architecture.
The fix is to check for the endianness on the ioport path the
same way we do it for mmio, which implies passing the vcpu all
the way down. Patch is a bit ugly, but aligns MMIO and ioport nicely.
Tested on arm64 and x86.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
In order to be able to find out about the endianness of a virtual
CPU, it is necessary to pass a pointer to the kvm_cpu structure
down to the MMIO accessors.
This patch just pushes such pointer as far as required for the
MMIO accessors to have a play with the vcpu.
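
As a rough illustration of why an accessor wants the vcpu, here is a minimal sketch. The struct vcpu_sketch type and mmio_read_u32() are hypothetical stand-ins, not kvmtool code; they only show an accessor picking the byte order from the vcpu it was handed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct kvm_cpu: only the endianness matters here. */
struct vcpu_sketch {
	int big_endian;	/* non-zero if the vcpu currently runs big-endian */
};

/* Decode a 32-bit value from raw MMIO bytes using the vcpu's byte order. */
uint32_t mmio_read_u32(const struct vcpu_sketch *vcpu, const uint8_t *data)
{
	assert(vcpu && data);

	if (vcpu->big_endian)
		return ((uint32_t)data[0] << 24) | ((uint32_t)data[1] << 16) |
		       ((uint32_t)data[2] << 8) | (uint32_t)data[3];

	return ((uint32_t)data[3] << 24) | ((uint32_t)data[2] << 16) |
	       ((uint32_t)data[1] << 8) | (uint32_t)data[0];
}
```

Without the vcpu pointer the accessor would have to assume a single fixed byte order for every access.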
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
This patch changes VIRTIO_DEFAULT_TRANS to take a struct kvm parameter,
allowing architectures to choose the default transport dynamically.
For ARM, this is driven by an arch-specific cmdline option.
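
A minimal sketch of the idea, under stated assumptions: struct kvm_sketch, its force_pci flag and the enum names are hypothetical, and the choice of MMIO as the fallback is only an illustration. The point is that the macro now evaluates against a VM instance instead of being a compile-time constant.

```c
#include <assert.h>

enum virtio_trans_sketch { TRANS_PCI, TRANS_MMIO };

/* Hypothetical config: a flag set by an arch-specific command-line option. */
struct kvm_sketch {
	int force_pci;
};

static inline enum virtio_trans_sketch default_trans(const struct kvm_sketch *kvm)
{
	assert(kvm);
	return kvm->force_pci ? TRANS_PCI : TRANS_MMIO;
}

/* The default is now an expression of the VM, not a fixed constant. */
#define VIRTIO_DEFAULT_TRANS(kvm) default_trans(kvm)
```

An architecture with only one transport could still define the macro to ignore its argument.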
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Now that we have some common OF PCI definitions in of_pci.h, make use
of them when generating the devicetree for spapr_pci on ppc.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
All architectures are now doing the same thing for irq__alloc_line:
1. Initialise a global counter to some fixed offset
2. Return the current value of the counter and increment it
This is better off in core code, with each architecture specifying the
initial offset, which is specific to the interrupt controller being used
by the guest.
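
The two steps above can be sketched as follows. KVM_IRQ_OFFSET and its value of 5 are assumptions made for illustration; the real offset depends on the interrupt controller of each architecture.

```c
#include <assert.h>

/* Hypothetical per-architecture offset; the real value is arch-specific. */
#define KVM_IRQ_OFFSET 5

static_assert(KVM_IRQ_OFFSET >= 0, "line numbers are non-negative");

/* Step 1: a global counter initialised to the fixed offset. */
static int next_line = KVM_IRQ_OFFSET;

/* Step 2: return the current value of the counter and increment it. */
int irq__alloc_line(void)
{
	return next_line++;
}
```

Only the offset differs between architectures, which is why the function itself can live in core code.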
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Since irq__register_device no longer registers a device with anything,
rename it to irq__alloc_line, which better describes what is actually
going on.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
With the removal of the x86 irq rbtree, the only parameter used by
irq__register_device is actually used to return the new line.
This patch removes all of the parameters from irq__register_device and
returns the allocated line directly.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
In preparation for moving the irq allocation into generic code, remove
the pin parameter from irq__register_device and temporarily place the
onus on the emulation driver to allocate the pin (which is always 1 and
only used on PCI anyway).
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Rather than performing all config accesses via ioports, map in a 24-bit
memory-mapped configuration space directly below the PCI MMIO region.
This will allow architectures to support PCI without having to support
legacy ioports in the guest kernel. Instead, kvm tool can forward the
config accesses directly to the relevant ioport config callbacks.
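
To make the forwarding concrete, here is a sketch of how a 24-bit offset into such a region could be split into the fields of a config access. It assumes the offset uses the same bit layout as the legacy ioport config address (bus, device, function, register); pci_cfg_decode() and its struct are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical decoded form of a config-space access. */
struct pci_cfg_fields {
	uint8_t bus;		/* bits 23:16 */
	uint8_t dev;		/* bits 15:11 */
	uint8_t fn;		/* bits 10:8  */
	uint8_t reg;		/* bits 7:2, 32-bit register number */
	uint8_t byte_off;	/* bits 1:0, byte within the register */
};

/* Split a 24-bit offset into the memory-mapped config space. */
struct pci_cfg_fields pci_cfg_decode(uint32_t offset)
{
	struct pci_cfg_fields f;

	assert(offset < (1u << 24));

	f.bus = (offset >> 16) & 0xff;
	f.dev = (offset >> 11) & 0x1f;
	f.fn = (offset >> 8) & 0x7;
	f.reg = (offset >> 2) & 0x3f;
	f.byte_off = offset & 0x3;
	return f;
}
```

With the fields recovered, the access can be handed to the same callback an ioport config access would reach.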
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Currently the only use of the periodic timer tick in kvmtool is to
handle reading from stdin. Though functional, this periodic tick can be
problematic on slow (e.g. FPGA) platforms and can cause low interactivity or
even stop the execution from progressing at all.
This patch removes the periodic tick in favour of a dedicated thread blocked
waiting for input from the console. In order to reflect the new behaviour,
the old 'kvm__arch_periodic_tick' function is renamed to 'kvm__arch_read_term'.
In making this change it is necessary to actively flush the emulated serial
console's output buffer after the guest writes to it, as otherwise flushing
only happens with terminal input. Similarly, it is no longer necessary to
flush the buffer when we process input.
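
As an illustration of blocking on input instead of polling from a timer, here is a minimal sketch of what the body of such a thread's loop might do. term_wait_and_read() is a hypothetical helper, not the kvmtool function; it uses poll() with an infinite timeout so the caller sleeps until a byte arrives.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/*
 * Block until fd is readable, then read one byte into *out.
 * Returns 1 on success, 0 on end of file, -1 on error.
 */
int term_wait_and_read(int fd, char *out)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };

	assert(out);

	/* A timeout of -1 means: sleep until something happens on fd. */
	if (poll(&pfd, 1, -1) < 0)
		return -1;

	return (int)read(fd, out, 1);
}
```

A dedicated thread calling this in a loop consumes no CPU time while the console is idle, which is the property the periodic tick lacked.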
Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
xics_init() assumes kvm->nrcpus is already set up. kvm->nrcpus is set up
in kvm_cpu_init().
Unfortunately xics_init() and kvm_cpu_init() both use base_init(). So
depending on the order randomly determined by the compiler, xics_init()
may be initialised first, see kvm->nrcpus as 0 and not set up any of the
icp VCPU pointers. This manifests itself later in boot when trying to
raise an IRQ, resulting in a null pointer dereference/segv.
This moves xics_init() to use dev_base_init() to ensure it happens after
kvm_cpu_init().
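
The ordering problem can be reproduced with a toy model of levelled init calls. Everything below is a hypothetical sketch (the level names mirror the message, the structs and functions do not come from kvmtool): two entries at the same level run in list order, while an entry at a later level always runs afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical init levels, lowest runs first. */
enum init_level { BASE_INIT, DEV_BASE_INIT, NUM_INIT_LEVELS };

struct vm_sketch {
	int nrcpus;	/* set by the cpu init step */
	int icp_slots;	/* set by the xics init step from nrcpus */
};

struct init_entry {
	enum init_level level;
	void (*fn)(struct vm_sketch *vm);
};

void cpu_init_sketch(struct vm_sketch *vm)  { vm->nrcpus = 4; }
void xics_init_sketch(struct vm_sketch *vm) { vm->icp_slots = vm->nrcpus; }

/* Run every entry level by level; order inside one level is just list order. */
void run_inits(struct vm_sketch *vm, const struct init_entry *entries, size_t n)
{
	assert(vm && entries);

	for (int level = 0; level < NUM_INIT_LEVELS; level++)
		for (size_t i = 0; i < n; i++)
			if ((int)entries[i].level == level)
				entries[i].fn(vm);
}
```

If both steps sit at BASE_INIT and the xics step happens to be listed first, icp_slots stays 0; moving it to DEV_BASE_INIT makes the result independent of list order.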
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
On some powerpc systems, reboot is implemented by an RTAS call by the
name of "system-reboot". Currently we don't implement it in kvmtool,
which means instead the guest prints an error and spins.
This is particularly annoying because when the guest kernel panics it
will try to reboot, and end up spinning in the guest.
We can't implement reboot properly, ie. causing a reboot, but it's still
preferable to implement it as halt rather than not implementing it at
all.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
If an architecture other than x86 wants to make use of ioport devices, the
interrupt lines will likely need remapping from their fixed values.
This patch allows an architecture callback, ioport__map_irq, to map
interrupts as appropriate.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Architectures without a legacy ioport may wish to emulate one, but not
at address 0x0.
This patch introduces KVM_IOPORT_AREA, which each architecture defines
to be the start of the ioport region (i.e. where port addresses are
offset from).
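
A small sketch of the offset arithmetic this implies. The base address 0x10000000 and the 64K window size are made-up values for illustration, and ioport_from_guest_addr() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values: each architecture would pick its own base. */
#define KVM_IOPORT_AREA 0x10000000ULL
#define KVM_IOPORT_SIZE 0x10000ULL

/* Translate a guest physical address inside the area into a port number. */
uint16_t ioport_from_guest_addr(uint64_t addr)
{
	assert(addr >= KVM_IOPORT_AREA);
	assert(addr - KVM_IOPORT_AREA < KVM_IOPORT_SIZE);

	return (uint16_t)(addr - KVM_IOPORT_AREA);
}
```

On an architecture with real ioports the base would simply be 0 and the translation becomes a no-op.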
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Kvmtool suppresses any output to a console that has not been elected
as *the* console.
While this makes sense on the input side (we want the input to be sent
to one console driver only), it seems to be the wrong thing to do on
the output side, as it effectively prevents the guest from switching
from one console to another (think earlyprintk using 8250 to virtio
console).
After all, the guest *does* poke this device and outputs something
there.
Just remove the kvm->cfg.active_console test from the output paths.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
The kernel can handle a missing timebase-frequency property much better
than one that claims zero.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
We should hard-code less of this stuff, but for now this works.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
In xics_init() we set the maximum server to kvm->nrcpus, and then set
the nr_servers using maximum server + 1.
That is off by one, in the harmless direction.
Simplify it to just set nr_servers = kvm->nrcpus.
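
The off-by-one is easy to see side by side. The helper names below are hypothetical, and the sketch assumes servers are numbered from 0, so nrcpus vcpus need exactly nrcpus servers.

```c
#include <assert.h>

/* Old calculation: treats nrcpus as the highest server number. */
int nr_servers_old(int nrcpus)
{
	int max_server = nrcpus;

	assert(nrcpus > 0);
	return max_server + 1;	/* one more than needed */
}

/* New calculation: servers are numbered 0 .. nrcpus - 1. */
int nr_servers_new(int nrcpus)
{
	assert(nrcpus > 0);
	return nrcpus;
}
```

The old value over-allocates by one entry, which wastes a slot but never indexes out of bounds.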
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Commit 21692d1 (Beautify debug output) broke the powerpc build because
it changed the signature for kvm__dump_mem() but didn't update all callers.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Currently, only x86 has architecture command-line options (for setting
the BIOS video mode); however, this is likely to become more common in the
future.
This patch adds some simple macros and a struct definition to allow
architectures to augment the command-line options with private
definitions. The BIOS video mode option (--vidmode) is also migrated to
the new framework.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Commit 8d35d32d0148 ("kvm tools: add generic device registration
mechanism") introduced a tree-based device lookup-by-bus mechanism as
well as iterators to enumerate the devices on a particular bus.
Whilst both x86 and ppc were converted by the original patch, the spapr
pci changes were incomplete, so include the required changes here.
Compile-tested only on ppc64 970mp. Note that I had to hack the Makefile
in order to build guest_init.o with a toolchain defaulting to ppc:

   $(GUEST_INIT): guest/init.c
           $(E) " LINK " $@
  -        $(Q) $(CC) -static guest/init.c -o $@
  -        $(Q) $(LD) -r -b binary -o guest/guest_init.o $(GUEST_INIT)
  +        $(Q) $(CC) -m64 -static guest/init.c -o $@
  +        $(Q) $(LD) -m elf64ppc -r -b binary -o guest/guest_init.o $(GUEST_INIT)

   $(DEPS):

Acked-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
PCI devices are currently registered into the pci_devices array via the
pci__register function, which can then be indexed later by architecture
code to construct device tree nodes. For MMIO devices, there is no such
utility.
Rather than invent a similar mechanism for MMIO, this patch creates a
global device registration mechanism, which allows the device type to be
specified when registered or indexing a device. Current users of the pci
registration code are migrated to the new infrastructure and virtio MMIO
devices are registered at init time.
As part of the device registration, allocation of the device number is
moved out of irq__register_device and performed when adding the device
header to the relevant bus tree, allowing us to maintain separate device
numberspaces for each bus.
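
A minimal sketch of per-bus device numbering, assuming a much simpler structure than the real registration code: the types and device_register_sketch() are hypothetical, and a plain counter per bus stands in for the bus tree.

```c
#include <assert.h>

enum bus_type_sketch { BUS_PCI, BUS_MMIO, BUS_MAX };

/* Hypothetical device header: the bus it lives on and its number there. */
struct device_header_sketch {
	enum bus_type_sketch bus_type;
	int dev_num;	/* filled in at registration time */
};

/* One counter per bus, so each bus has its own device numberspace. */
static int next_dev_num[BUS_MAX];

/* Register a device and return the number it was given on its bus. */
int device_register_sketch(struct device_header_sketch *dev)
{
	assert(dev && dev->bus_type < BUS_MAX);

	dev->dev_num = next_dev_num[dev->bus_type]++;
	return dev->dev_num;
}
```

The first PCI device and the first MMIO device both get number 0, because the counters are independent.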
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Some architectures may provide only a restricted PCI implementation and
therefore prefer MMIO as the transport for virtio devices.
This patch allows the arch backend to specify the default virtio
transport. Some devices (e.g. net) allow the transport to be overridden
by the user and are left alone by this change.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
The _FDT macro is useful when generating device trees for a guest, so
make it available to other architectures.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
In commit e3d3ced "kernel load/firmware cleanup", the call to
kvm__arch_setup_firmware() was moved. Previously it ran more or less at
the end of the init sequence, but that commit moved it into kvm__init(),
which is a core_init() call and so runs quite early.
This broke booting powerpc guests, as setup_fdt() needs to be called
later in the setup sequence. In particular it looks at kvm->nrcpus,
which is uninitialised at that point.
In general setup_fdt() needs to run late in the sequence, as it encodes
the setup of the machine into the device tree.
So move setup_fdt() out of kvm__arch_setup_firmware() and make it a
firmware_init() call of its own.
With this patch I am able to boot guests again on HV KVM.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
In commit 06e6648 "move kvm_cpus into struct kvm", kvm_cpu__init() became
kvm_cpu__arch_init() called from a new kvm_cpu__init(), and the call was moved
from the end of the init sequence to much earlier, and in particular prior to
irq__init().
This leads to a segfault on powerpc, because kvm_cpu__arch_init() calls into
xics_cpu_register(), which dereferences vcpu->kvm.icp which is uninitialised
until irq__init().
Later in commit a48488d "use init/exit where possible", irq__init() was pulled
out of the init sequence and made a dev_base_init() routine, on x86. On powerpc
the call to irq__init() was dropped entirely.
Finally, we now have a circular dependency between kvm_cpu__init() (which needs
kvm->arch.icp), and irq__init() (which needs kvm->nrcpus). This is caused by
the combination of commit 89f40a7 "move nrcpus into struct kvm_config",
which moved the global nrcpus into kvm->cfg, and commit 06e6648 "move kvm_cpus
into struct kvm", which moved the setup of kvm->nrcpus from kvm->cfg into
kvm_cpu__init().
To fix it we drop irq__init() entirely; if we ever have a non-xics irq option
we can bring it back. We turn xics_system_init() into xics_init(), and have it
do the allocation and setup of the icp/ics, including the per-vcpu setup,
removing the dependency from kvm_cpu__init() (via kvm_cpu__arch_init()).
xics_init() is a base_init() routine (it can't be core_init()), which should be
early enough, fingers crossed.
Finally drop irq__exit(), it does nothing and is never called.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Several caused by commit 8074303 "remove global kvm object",
ioport__setup_arch(), term_getc_iov() & term_getc() in the
spapr_hvcons.c code, and kvm_cpu__reboot() in rtas_power_off().
Commit 221b584 "move active_console into struct kvm_config" added
checks in h_put_term_char() & h_get_term_char() of
kvm->cfg.active_console but needs to be vcpu->kvm->cfg.active_console.
That commit also missed updates to term_putc() & term_getc() in
spapr_rtas.c, and I'm guessing that we need similar checks of
active_console in rtas_put_term_char() & rtas_get_term_char().
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
There's no reason the array of guest specific vcpus is global. Move it into
struct kvm.
Also split up arch specific vcpu init from the generic code and call it from
the kvm_cpu initializer.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
This config option was 'extern'ed between different objects. Clean it up
and move it into struct kvm_config.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Move all the non-arch specific members into a generic struct, and the arch specific
members into an arch specific kvm_arch. This prevents code duplication across different
archs.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Recent kernels (>= v3.5-rc1) have an ioctl which allows us to retrieve the
list of page sizes supported for the guest.
So rework the cpu info code to use that ioctl when available, falling
back to the same values we used previously if the ioctl is not present.
We may also need to filter the list of page sizes against the page size
of the memory backing guest RAM - this accounts for the unfortunate amount
of code in setup_mmu_info().
Finally we need to turn the structure as returned by the kernel into the
format expected in the device tree.
Acked-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
We are about to add more logic to find_cpu_info(). To support this we
need to pass kvm through to it, and also restructure the return flow
so we can operate on info before it is returned.
Acked-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Using designated initializers for structs is preferable because it
is self documenting, and more robust against changes to the structure
layout.
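
For comparison, here is the same struct initialised both ways. The struct cpu_info_sketch type is a hypothetical cut-down description, not the real kvmtool structure, and the values are only illustrative.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cut-down CPU description. */
struct cpu_info_sketch {
	const char *name;
	uint32_t tb_freq;	/* timebase frequency in Hz */
	uint32_t d_bsize;	/* data cache block size in bytes */
	uint32_t i_bsize;	/* instruction cache block size in bytes */
};

/* Positional: correct only while the field order never changes. */
const struct cpu_info_sketch positional = { "POWER7", 512000000, 128, 128 };

/* Designated: each value names its field, so reordering is harmless. */
const struct cpu_info_sketch designated = {
	.name    = "POWER7",
	.tb_freq = 512000000,
	.d_bsize = 128,
	.i_bsize = 128,
};

/* Field-by-field comparison, returns non-zero if both describe the same CPU. */
int cpu_info_equal(const struct cpu_info_sketch *a, const struct cpu_info_sketch *b)
{
	assert(a && b);

	return strcmp(a->name, b->name) == 0 && a->tb_freq == b->tb_freq &&
	       a->d_bsize == b->d_bsize && a->i_bsize == b->i_bsize;
}
```

Both initialisers produce the same contents today, but only the designated form keeps doing so if a field is inserted in the middle of the struct.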
Acked-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
On some powerpc platforms we need to make sure we only advertise page
sizes to the guest which are <= the size of the pages backing guest RAM.
So have mmap_hugetlbfs() save the hugetlbfs page size for us, and also
teach mmap_anon_or_hugetlbfs() to set the page size for anonymous mmap.
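
A sketch of the filtering this enables once the backing page size is known. filter_page_shifts() is a hypothetical helper, and page sizes are expressed as shifts (log2 of the size in bytes) purely as a convenience of the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Copy into out[] only those page shifts whose page size is <= the size of
 * the pages backing guest RAM. Returns the number of entries kept.
 */
size_t filter_page_shifts(const uint32_t *shifts, size_t n,
			  uint64_t backing_page_size, uint32_t *out)
{
	size_t kept = 0;

	assert(shifts && out);

	for (size_t i = 0; i < n; i++) {
		assert(shifts[i] < 64);
		if ((1ULL << shifts[i]) <= backing_page_size)
			out[kept++] = shifts[i];
	}

	return kept;
}
```

With 64K backing pages, a candidate list of 4K, 64K and 16M pages is reduced to 4K and 64K.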
Acked-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
It implements essentially the same logic. The one difference is it sets
MAP_NORESERVE when using anonymous mmap, but I think that is OK.
Reword the comment about hugetlbfs, we are no longer required to use
hugepages to back the guest.
Acked-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
I think this code was based on an earlier version of the KVM_SET_ONE_REG
API, which at the time was in agraf's tree but not mainline?
Either way it doesn't compile as is, so fix it up.
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Commit 82ea06e "Introduce KVM_VIRTIO_MMIO_AREA" did just that, but only
for x86, causing the following commit 5c301a3 "Add virtio-mmio support"
to break the build for powerpc.
We follow what x86 did and place it 16MB past the PCI area. I have no
idea if that is actually a good idea, or whether it works at all.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
This patch adds a "--firmware" command line option to "vm run". You can use
this to try to boot with SeaBIOS, for example:
  ./vm run --firmware=/usr/share/seabios/bios.bin \
    --disk $HOME/images/debian_lenny_amd64_standard.qcow2
This doesn't boot yet for obvious reasons but at least people can now start to
play with external BIOS images easily.
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Yang Bai <hamo.by@gmail.com>
Cc: Matt Evans <matt@ozlabs.org>
Cc: Ron Minnich <rminnich@gmail.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>
Cc: John Floren <john@jfloren.net>
Cc: Sasha Levin <levinsasha928@gmail.com>
Cc: Asias He <asias.hejun@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pekka Enberg <penberg@kernel.org>