76 Commits

Author SHA1 Message Date
Andre Przywara 52c22e6e64 use <poll.h> instead of <sys/poll.h>
The manpage of poll(2) states that the prototype of poll is defined
in <poll.h>. Use that header file instead of <sys/poll.h> to allow
compilation against musl-libc.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-07-20 18:25:48 +01:00
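For illustration, a caller of poll() that relies only on the POSIX header might look like the sketch below. The helper name `fd_readable` is hypothetical and is not taken from kvmtool.

```c
#include <poll.h>	/* POSIX home of poll(); used instead of <sys/poll.h> */

/*
 * Hypothetical helper: returns 1 if fd becomes readable within
 * timeout_ms, 0 on timeout, and -1 if poll() itself fails.
 */
static int fd_readable(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ret = poll(&pfd, 1, timeout_ms);

	if (ret < 0)
		return -1;

	return (ret > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}
```

Nothing in the helper depends on which libc is in use, which is the point of including the standard header.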
Andre Przywara 823c7fd8e9 qcow: fix signedness bugs
Some functions in qcow.c return u64, but are checked against < 0
because they want to check for the -1 error return value.
Do an explicit comparison against the casted -1 to express this
properly.
This was silently compiled out by gcc, but clang complained about it.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-07-20 18:25:48 +01:00
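The signedness problem described above can be sketched as follows. `find_offset` and `lookup_failed` are made-up names for illustration, not functions from qcow.c.

```c
#include <stdint.h>

typedef uint64_t u64;

/* Hypothetical lookup: returns an offset, or (u64)-1 on error. */
static u64 find_offset(int ok)
{
	return ok ? 4096 : (u64)-1;
}

static int lookup_failed(u64 off)
{
	/*
	 * "off < 0" can never be true for an unsigned type, so the
	 * error check has to compare against the cast -1 explicitly.
	 */
	return off == (u64)-1;
}
```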
Andre Przywara 15542bab78 avoid casts when initializing structures
Due to our kernel heritage we have code in kvmtool that relies on
the (still) implicit -std=gnu89 compiler switch.
It turns out that this just affects some structure initialization,
where we currently provide a cast to the type, which upsets GCC for
anything beyond -std=gnu89 (for instance gnu99 or gnu11).
We do need the casts when initializing structures that are not
assigned to the same type, so we put it there explicitly.

This allows us to compile with all the three GNU standards GCC
currently supports: gnu89/90, gnu99 and gnu11.
GCC threatens people with moving to gnu11 as the new default standard,
so let's fix this sooner rather than later.
(Compiling without GNU extensions still breaks and I don't bother to
fix that without very good reasons.)

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-07-20 18:25:47 +01:00
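A rough sketch of the two cases the message distinguishes, assuming a GNU dialect of C. `struct range` and both names are hypothetical and do not come from the kvmtool sources.

```c
struct range {
	unsigned long start;
	unsigned long len;
};

/* Declaration of the same type: a plain brace initializer, no cast. */
static struct range boot_range = { .start = 0x1000, .len = 0x200 };

/*
 * Assignment after the declaration: here the compound literal
 * (the "cast" to the type) is still required.
 */
static void range_reset(struct range *r)
{
	*r = (struct range){ .start = 0, .len = 0 };
}
```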
Andreas Herrmann 69f50425bd kvm tools: Fix print format warnings
This should fix the following warnings

 builtin-stat.c:93:3: warning: format '%llu' expects argument of type 'long long unsigned int', but argument 2 has type '__u64' [-Wformat]
 builtin-run.c:188:4: warning: format '%Lu' expects argument of type 'long long unsigned int', but argument 3 has type '__u64' [-Wformat]
 builtin-run.c:554:3: warning: format '%llu' expects argument of type 'long long unsigned int', but argument 2 has type 'u64' [-Wformat]
 builtin-run.c:554:3: warning: format '%llu' expects argument of type 'long long unsigned int', but argument 3 has type 'u64' [-Wformat]
 builtin-run.c:645:3: warning: format '%Lu' expects argument of type 'long long unsigned int', but argument 4 has type 'u64' [-Wformat]
 disk/core.c:330:4: warning: format '%llu' expects argument of type 'long long unsigned int', but argument 4 has type '__dev_t' [-Wformat]
 disk/core.c:330:4: warning: format '%llu' expects argument of type 'long long unsigned int', but argument 5 has type '__dev_t' [-Wformat]
 disk/core.c:330:4: warning: format '%llu' expects argument of type 'long long unsigned int', but argument 6 has type '__ino64_t' [-Wformat]
 mmio.c:134:5: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 4 has type 'u64' [-Wformat]
 util/util.c:101:7: warning: format '%lld' expects argument of type 'long long int', but argument 3 has type 'u64' [-Wformat]
 util/util.c:113:7: warning: format '%lld' expects argument of type 'long long int', but argument 2 has type 'u64' [-Wformat]
 hw/pci-shmem.c:339:3: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 2 has type 'u64' [-Wformat]
 hw/pci-shmem.c:340:3: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 2 has type 'u64' [-Wformat]

as observed when compiling on mips64.

Signed-off-by: Andreas Herrmann <andreas.herrmann@caviumnetworks.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:55 +01:00
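One common way to silence this class of warning is to cast the value to the type the format string expects. The sketch below shows that approach; it is an assumption about the style of fix, not a copy of the patch, and `format_u64` is a hypothetical helper.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * u64 may be "unsigned long" on some 64-bit targets, so cast to
 * unsigned long long to match the %llu conversion everywhere.
 */
static int format_u64(char *buf, size_t len, uint64_t v)
{
	return snprintf(buf, len, "%llu", (unsigned long long)v);
}
```

The `PRIu64` macro from `<inttypes.h>` is an alternative when the argument really is a `uint64_t`.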
Sasha Levin 379d476de1 kvm tools: remove unneeded checks in qcow code
We already know q!=NULL at that point, no need to retest.

Reviewed-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:53 +01:00
Sasha Levin 71bf426ac9 kvm tools: remove unneeded check from disk code
We already know 'disk' is non-null.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:53 +01:00
Sasha Levin a4d8c55eb2 kvm tools: Specify names for VM internal threads
Give threads a meaningful name. This makes debugging much easier, and
everything else much prettier.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
[ penberg@kernel.org: specify vcpu names ]
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:53 +01:00
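On Linux, one way to name a thread is prctl(PR_SET_NAME). The sketch below assumes that mechanism; the message does not say which call the patch uses, and both helper names are hypothetical.

```c
#include <string.h>
#include <sys/prctl.h>

/*
 * Name the calling thread. Linux keeps at most 15 characters plus
 * the terminating NUL; longer names are truncated.
 */
static int set_thread_name(const char *name)
{
	return prctl(PR_SET_NAME, name);
}

/* Read the calling thread's name back into a 16-byte buffer. */
static int get_thread_name(char buf[16])
{
	return prctl(PR_GET_NAME, buf);
}
```

A name set this way appears in tools such as `top -H` and in debugger thread listings.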
Sasha Levin 49a8afd1b9 kvm tools: use init/exit where possible
Switch to using init/exit calls instead of the repeating call blocks in builtin-run.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:53 +01:00
Sasha Levin 3b55dcde7f kvm tools: disk image related cleanup
Move io debug delay into kvm_config, the parser out of builtin-run into the disk code
and make the init/exit functions match the rest of the code in style.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:52 +01:00
Asias He a67da3beff kvm tools: Add initial virtio-scsi support
This patch brings virtio-scsi support to kvm tool.

With the introduction of tcm_vhost (vhost-scsi)

   tcm_vhost: Initial merge for vhost level target fabric driver

we can implement virtio-scsi by simply having vhost-scsi handle the
SCSI commands.

How to use:
1) Setup the tcm_vhost target through /sys/kernel/config

   [Stefan Hajnoczi, Thanks for the script to setup tcm_vhost]

   ** Setup wwpn and tpgt
   $ wwpn="naa.0"
   $ tpgt=/sys/kernel/config/target/vhost/$wwpn/tpgt_0
   $ nexus=$tpgt/nexus
   $ mkdir -p $tpgt
   $ echo -n $wwpn > $nexus

   ** Setup lun using /dev/ram
   $ n=0
   $ lun=$tpgt/lun/lun_${n}
   $ data=/sys/kernel/config/target/core/iblock_0/data_${n}
   $ ram=/dev/ram${n}
   $ mkdir -p $lun
   $ mkdir -p $data
   $ echo -n udev_path=${ram} > $data/control
   $ echo -n 1 > $data/enable
   $ ln -s $data $lun

2) Run kvm tool with the new disk option '-d scsi:$wwpn:$tpgt', e.g.
   $ lkvm run -k /boot/bzImage -d ~/img/sid.img -d scsi:naa.0:0

Signed-off-by: Asias He <asias.hejun@gmail.com>
Cc: Nicholas A. Bellinger <nab@linux-iscsi.org>
Cc: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:52 +01:00
Asias He 5236b50516 kvm tools: Enable O_DIRECT support
With Direct I/O, file reads and writes go directly from the applications
to the storage device, bypassing the operating system read and write
caches. This is useful for applications that manage their own caches.

Open a disk image with O_DIRECT:
   $ lkvm run -d ~/img/test.img,direct

The original readonly flag is still supported.
Open a disk image with O_DIRECT and readonly:
   $ lkvm run -d ~/img/test.img,direct,ro

Signed-off-by: Asias He <asias.hejun@gmail.com>
Acked-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:52 +01:00
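The `path,direct,ro` syntax shown above could be parsed roughly as follows. This is a sketch only: `struct disk_opts` and `parse_disk_opts` are hypothetical and do not mirror kvmtool's actual parser or its struct disk_image_params.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parameter block for one -d argument. */
struct disk_opts {
	char path[256];
	bool direct;
	bool readonly;
};

/* Parse "path[,direct][,ro]". Returns 0 on success, -1 on bad input. */
static int parse_disk_opts(const char *arg, struct disk_opts *opts)
{
	const char *p = strchr(arg, ',');
	size_t len = p ? (size_t)(p - arg) : strlen(arg);

	if (len == 0 || len >= sizeof(opts->path))
		return -1;

	memcpy(opts->path, arg, len);
	opts->path[len] = '\0';
	opts->direct = false;
	opts->readonly = false;

	/* Walk the comma-separated flags that follow the path. */
	while (p) {
		const char *flag = p + 1;
		size_t flen;

		p = strchr(flag, ',');
		flen = p ? (size_t)(p - flag) : strlen(flag);

		if (flen == 6 && !strncmp(flag, "direct", 6))
			opts->direct = true;
		else if (flen == 2 && !strncmp(flag, "ro", 2))
			opts->readonly = true;
		else
			return -1;	/* unknown or empty flag */
	}

	return 0;
}
```

A caller would then add O_DIRECT to the open(2) flags when `direct` is set, and open the image read-only when `readonly` is set.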
Asias He 97f16d6688 kvm tools: Introduce struct disk_image_params
Introduce struct disk_image_params to contain all the disk image parameters.
This is useful for adding more disk image parameters, e.g. disk image
cache mode.

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:52 +01:00
Asias He fea74936f1 kvm tools: Increase AIO_MAX to 256
The queue size for virtio_blk is 256 but AIO_MAX is 32, so we might run
short of available aio events if the guest issues more than 32 requests
simultaneously. The following error is observed when the guest runs a
stressed I/O workload.

  Info: disk_image__read error: total=-11

To fix this, let's increase the aio events limit.

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:52 +01:00
Asias He b7c2f3bc49 kvm tools: Code cleanup for disk/raw.c
Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:52 +01:00
Asias He 449ca0aefe kvm tools: Code cleanup for disk/qcow.c
Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:52 +01:00
Asias He 54c84566ec kvm tools: Code cleanup for disk/core.c
Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:52 +01:00
Asias He dcd3cd8e4e kvm tools: Simplify disk read write function name
We read and write in sectors by default. It makes little sense to add
the extra _sector string for read and write ops/function name.

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:51 +01:00
Asias He 58fd2b7b27 kvm tools: Make raw block device work
Previously, we used an mmapped host root partition as the guest's root
filesystem. Now that a virtio-9p based root filesystem is supported,
the mmapped host root partition approach is no longer used.

For some users it is useful to use a raw block device as the guest's
disk backend, e.g. to bypass the host's fs layer.

This patch makes a raw block device work as a disk image. Users can
read from and write to a raw block device, because block devices now
use DISK_IMAGE_REGULAR instead of DISK_IMAGE_MMAP.

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:51 +01:00
Sasha Levin 9f9207c5ad kvm tools: Fixes for disk image module
Fixes include:
 - Error handling
 - Cleanup
 - Standard init/uninit

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
2015-06-01 16:39:51 +01:00
Cyrill Gorcunov 599ed2a84c kvm tools: Rename pr_error to pr_err to follow kernel convention
The kernel already has a pr_err helper, so let's do the same.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:50 +01:00
Sasha Levin 3a60be0694 kvm tools: Trivial cleanup
Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:50 +01:00
Lan Tianyu 2d2179c1d1 kvm tools, qcow: Add support for growing refcount blocks
This patch enables allocating new refcount blocks, so kvm tools can
grow qcow2 images much larger.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:50 +01:00
Asias He 51476e4751 kvm tools: Drop write operation in ro_ops_nowrite
The ro_ops_nowrite disk operations are supposed to contain no write op.
However, there is one. Let's remove it.

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:49 +01:00
Asias He 54e0fcf4f6 kvm tools: Remove unnecessary assignment in disk/raw.c
ro_ops is never used after the assignment, so no need to do the
assignment.

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:49 +01:00
Asias He 618cbb7c90 kvm tools: Get multiple io events at a time
This patch reduces the number of calls to io_getevents() by getting
multiple io events at a time instead of one in disk image thread.

Signed-off-by: Asias He <asias.hejun@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:49 +01:00
Lan Tianyu e184700adb kvm tools, qcow: Add the support for copy-on-write cluster
When a write request hits a cluster without the copied flag,
allocate a new cluster and write the original data with the modification
to the new cluster. This also adds write support for qcow2 compressed
images. After testing, the image file passes "qemu-img check".
Performance still needs to be improved.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:49 +01:00
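In the QCOW2 format, the copied and compressed flags are the two top bits of an L2 table entry. The decision described above can be sketched as below; the macro and function names are illustrative and may not match the identifiers used in disk/qcow.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits of a QCOW2 L2 entry (names here are illustrative). */
#define OFLAG_COPIED		(1ULL << 63)
#define OFLAG_COMPRESSED	(1ULL << 62)

/* A cluster without the copied flag must not be written in place. */
static bool needs_cow(uint64_t l2_entry)
{
	return !(l2_entry & OFLAG_COPIED);
}

/* Compressed clusters need decompression before they can be modified. */
static bool is_compressed(uint64_t l2_entry)
{
	return (l2_entry & OFLAG_COMPRESSED) != 0;
}
```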
Sasha Levin 7dd5639103 kvm tools: Use correct config defines
For some reason some of the defines were set to HAS_VIRTIO instead of HAS_AIO.

This broke raw blk device.

Reported-and-tested-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:48 +01:00
Sasha Levin 59e8453a72 kvm tools: Remove async flag from QCOW
QCOW disk image async flag was erroneously enabled, while QCOW doesn't support
async ops yet.

This has caused a hang when booting QCOW images.

Reported-and-tested-by: Richard -rw- Weinberger <richard.weinberger@gmail.com>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:48 +01:00
Sasha Levin f41a132b0a kvm tools: Use native vectored AIO in virtio-blk
This patch hooks AIO support into virtio-blk, allowing for faster IO.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
[ penberg@kernel.org: wrap libaio include with CONFIG_HAS_AIO ]
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:48 +01:00
Sasha Levin 8b52f877bf kvm tools: Split io request from completion
This patch splits IO request processing from completion notification.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:48 +01:00
Sasha Levin 70e1ec58c5 kvm tools: Remove qcow nowrite function
It is no longer needed due to previous changes.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:48 +01:00
Sasha Levin 5af21162e5 kvm tools: Add optional callback on disk op completion
This patch adds an optional callback to be called when a disk op completes.

Currently there's not much use for it, but it is the infrastructure for adding
aio support.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:48 +01:00
Sasha Levin b41ca15a23 kvm tools: Modify behaviour on missing ops ptr
In case a read or write op ptr is missing simply ignore it instead of
critically failing. This provides an easier way to prevent read or write
in specific scenarios.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:48 +01:00
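The behaviour described above might look like the following sketch. The types and the choice of returning 0 for a missing op are assumptions made for illustration, not the actual kvmtool code.

```c
#include <stddef.h>
#include <sys/types.h>

struct disk;

/* Hypothetical ops table; a NULL pointer means "op not provided". */
struct disk_ops {
	ssize_t (*read)(struct disk *d, void *buf, size_t len);
	ssize_t (*write)(struct disk *d, const void *buf, size_t len);
};

struct disk {
	const struct disk_ops *ops;
};

static ssize_t disk_write(struct disk *d, const void *buf, size_t len)
{
	if (!d->ops->write)
		return 0;	/* ignore the request instead of failing hard */

	return d->ops->write(d, buf, len);
}
```

With this pattern, a read-only backend simply leaves `write` unset instead of supplying a stub.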
Sasha Levin 9d15c39a6e kvm tools: Modify disk ops usage
This patch modifies the definition and usage of ops for read only, mmap and
regular IO.

There is no longer a mix between iov and mmap, and read only no longer implies
mmap (although it will try to use it first).

This allows for more flexibility defining different ops for different
scenarios.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:48 +01:00
Sasha Levin 2534c9b641 kvm tools: Remove the non-iov interface from disk image ops
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:48 +01:00
Sasha Levin 38c396e485 kvm tools: Switch to using an enum for disk image types
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:48 +01:00
Lan Tianyu af68c51ae7 kvm tools: Add support for the read operation of qcow and qcow2 compressed image
This patch adds decompression for qcow and qcow2 images that are found
to be compressed. It also splits the read cluster function in two, one
for qcow and one for qcow2, to make it easier to support both kinds of
images. It also adds some macros for qcow.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
[ penberg@kernel.org: make zlib optional ]
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:48 +01:00
Pekka Enberg 3ecac800ec kvm tools, qcow: Add support for writing to zero refcount clusters
This patch adds support for writing to zero refcount clusters. Refcount blocks
are cached like L2 tables and flushed upon VIRTIO_BLK_T_FLUSH and when
evicted from the LRU cache.

With this patch applied, 'qemu-img check' no longer complains about referenced
clusters with zero reference count after

  dd if=/dev/zero of=/mnt/tmp

where '/mnt' is a freshly generated QCOW2 image.

Cc: Asias He <asias.hejun@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Prasad Joshi <prasadjoshi124@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:46 +01:00
Pekka Enberg 17f68274f5 kvm tools, qcow: Force read-only mode for QCOW images
The QCOW write support isn't stable enough for wide-spread use so force
read-only mode for QCOW images.

Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:46 +01:00
Pekka Enberg e94cdf08a3 kvm tools, qcow: Rename L2 table lookup functions
In preparation for refcount block caching, rename L2 table lookup functions to
use less generic names.

Cc: Prasad Joshi <prasadjoshi124@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:46 +01:00
Pekka Enberg 7b4eb53047 kvm tools, qcow: Move L2 cache to 'struct qcow_l1_table'
In preparation for refcount block cache, move L2 cache data structures to
'struct qcow_l1_table'.

Cc: Prasad Joshi <prasadjoshi124@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:46 +01:00
Pekka Enberg 3fb67b939e kvm tools, qcow: Unify L1 and L2 variable names
This patch unifies qcow_read_cluster() and qcow_write_cluster() L1 and L2 table
variable names to make the code more readable.

Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:46 +01:00
Pekka Enberg 473aaa2d4c kvm tools: Rename 'struct qcow_table' to 'struct qcow_l1_table'
This patch renames the ambiguous 'struct qcow_table' to 'struct qcow_l1_table'
in preparation for introducing 'struct qcow_refcount_table'.

Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:46 +01:00
Sasha Levin eab301d059 kvm tools: Fix warning on 32bit
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:46 +01:00
Pekka Enberg 121dd76e2b kvm tools, qcow: Fix copy-on-write image corruption
We don't handle refcount table properly so make sure we only write to clusters
that have the "copied" flag set.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:46 +01:00
Sasha Levin ff6462e808 kvm tools: Implement VIRTIO_BLK_T_GET_ID
Return device id when requested by virtio-blk.
Device id is currently based on the device information and the inode
number of the underlying disk image.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:46 +01:00
Pekka Enberg b2ebe61b25 kvm tools, qcow: I/O error on compressed sectors
We currently don't support compressed sectors in QCOW images so warn the user
about it and return an I/O error.

Cc: Asias He <asias.hejun@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Prasad Joshi <prasadjoshi124@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:46 +01:00
Pekka Enberg aff88976e8 kvm tools, qcow: Flush only dirty L2 tables
This patch improves qcow_l2_cache_write() to only flush dirty L2 tables.

Cc: Asias He <asias.hejun@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Prasad Joshi <prasadjoshi124@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:46 +01:00
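Flushing only dirty entries can be sketched as below. `struct cached_table`, `flush_dirty` and the `write_ok` stand-in are hypothetical; the real qcow_l2_cache_write() operates on kvmtool's own cache structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cached table; only the dirty flag matters here. */
struct cached_table {
	bool dirty;
};

/* Stand-in for the real write-out routine; always succeeds. */
static int write_ok(struct cached_table *t)
{
	(void)t;
	return 0;
}

/*
 * Write out every dirty table and clear its flag on success.
 * Returns the number of tables written, or -1 on the first error.
 */
static int flush_dirty(struct cached_table *tables, size_t n,
		       int (*write_fn)(struct cached_table *))
{
	int written = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!tables[i].dirty)
			continue;	/* clean tables cost no I/O */
		if (write_fn(&tables[i]) < 0)
			return -1;
		tables[i].dirty = false;
		written++;
	}

	return written;
}
```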
Pekka Enberg a4e46515fd kvm tools, qcow: Delayed L2 table writeout
This patch delays writeout for new L2 tables like we do for L1 tables. If an L2
table has non-allocated clusters, we mark that in the in-memory L2 table but
don't actually write it to disk until the L2 table is thrown out of LRU cache
or when qcow_disk_flush() is called. That makes writes to new clusters volatile
before VIRTIO_BLK_T_FLUSH is issued without corrupting the QCOW image on I/O
error.

Cc: Asias He <asias.hejun@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Prasad Joshi <prasadjoshi124@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:46 +01:00
Pekka Enberg 4bd7e48bc8 kvm tools, qcow: Use big endian order for L2 table entries
Don't keep the in-memory array in CPU byte order to simplify delayed L2 table
writeout.

Cc: Asias He <asias.hejun@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Prasad Joshi <prasadjoshi124@gmail.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2015-06-01 16:39:46 +01:00
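Keeping the in-memory table in on-disk (big endian) order means entries are converted on each access rather than once at load time. A minimal sketch with byte-wise accessors follows; `l2_get` and `l2_set` are hypothetical, and the actual code most likely uses byte-swap helpers instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Read entry idx from a table kept in big endian byte order. */
static uint64_t l2_get(const uint8_t *table, size_t idx)
{
	const uint8_t *p = table + idx * 8;
	uint64_t v = 0;
	int i;

	for (i = 0; i < 8; i++)
		v = (v << 8) | p[i];

	return v;
}

/* Store v into entry idx, most significant byte first. */
static void l2_set(uint8_t *table, size_t idx, uint64_t v)
{
	uint8_t *p = table + idx * 8;
	int i;

	for (i = 7; i >= 0; i--) {
		p[i] = (uint8_t)(v & 0xff);
		v >>= 8;
	}
}
```

Because the buffer already has the on-disk layout, a delayed writeout can write it as-is without a conversion pass.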