Compare commits

...

288 Commits

Author SHA1 Message Date
Michael Tokarev 6bbce8b464 Update version for 8.0.5 release
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-21 19:23:20 +03:00
Marc-André Lureau fcf58d6f20 tpm: fix crash when FD >= 1024 and unnecessary errors due to EINTR
Replace select() with poll() to fix a crash when QEMU has a large number
of FDs. Also use RETRY_ON_EINTR to avoid unnecessary errors due to EINTR.

Cc: qemu-stable@nongnu.org
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2020133
Fixes: 56a3c24ffc ("tpm: Probe for connected TPM 1.2 or TPM 2")
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
(cherry picked from commit 8e32ddff69)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-20 10:18:14 +03:00
Janosch Frank 0ef930a29f s390x/ap: fix missing subsystem reset registration
A subsystem reset contains a reset of AP resources which has been
missing.  Adding the AP bridge to the list of device types that need
reset fixes this issue.

Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Fixes: a51b3153 ("s390x/ap: base Adjunct Processor (AP) object model")
Message-ID: <20230823142219.1046522-2-seiden@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 297ec01f0b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-20 10:18:14 +03:00
Marc-André Lureau 6c575436cd ui: fix crash when there are no active_console
Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x0000555555888630 in dpy_ui_info_supported (con=0x0) at ../ui/console.c:812
812	    return con->hw_ops->ui_info != NULL;
(gdb) bt
#0  0x0000555555888630 in dpy_ui_info_supported (con=0x0) at ../ui/console.c:812
#1  0x00005555558a44b1 in protocol_client_msg (vs=0x5555578c76c0, data=0x5555581e93f0 <incomplete sequence \373>, len=24) at ../ui/vnc.c:2585
#2  0x00005555558a19ac in vnc_client_read (vs=0x5555578c76c0) at ../ui/vnc.c:1607
#3  0x00005555558a1ac2 in vnc_client_io (ioc=0x5555581eb0e0, condition=G_IO_IN, opaque=0x5555578c76c0) at ../ui/vnc.c:1635

Fixes:
https://issues.redhat.com/browse/RHEL-2600

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Albert Esteve <aesteve@redhat.com>
(cherry picked from commit 48a35e12fa)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-20 10:18:14 +03:00
Stefan Berger 36540b367e hw/tpm: TIS on sysbus: Remove unsupport ppi command line option
The ppi command line option for the TIS device on sysbus never worked
and caused an immediate segfault. Remove support for it since it also
needs support in the firmware and needs testing inside the VM.

Reproducer with the ppi=on option passed:

qemu-system-aarch64 \
   -machine virt,gic-version=3 \
   -m 4G  \
   -nographic -no-acpi \
   -chardev socket,id=chrtpm,path=/tmp/mytpm1/swtpm-sock \
   -tpmdev emulator,id=tpm0,chardev=chrtpm \
   -device tpm-tis-device,tpmdev=tpm0,ppi=on
[...]
Segmentation fault (core dumped)

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20230713171955.149236-1-stefanb@linux.ibm.com
(cherry picked from commit 4c46fe2ed4)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-20 10:18:14 +03:00
Leon Schuermann 70c97e75d7 target/riscv/pmp.c: respect mseccfg.RLB for pmpaddrX changes
When the rule-lock bypass (RLB) bit is set in the mseccfg CSR, the PMP
configuration lock bits must not apply. While this behavior is
implemented for the pmpcfgX CSRs, this bit is not respected for
changes to the pmpaddrX CSRs. This patch ensures that pmpaddrX CSR
writes work even on locked regions when the global rule-lock bypass is
enabled.

Signed-off-by: Leon Schuermann <leons@opentitan.org>
Reviewed-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20230829215046.1430463-1-leon@is.currently.online>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit 4e3adce124)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-20 10:18:14 +03:00
Daniel Henrique Barboza 1805e05db3 target/riscv: fix satp_mode_finalize() when satp_mode.supported = 0
In the same emulated RISC-V host, the 'host' KVM CPU takes 4 times
longer to boot than the 'rv64' KVM CPU.

The reason is an unintended behavior of riscv_cpu_satp_mode_finalize()
when satp_mode.supported = 0, i.e. when cpu_init() does not set
satp_mode_max_supported(). satp_mode_max_from_map(map) does:

31 - __builtin_clz(map)

This means that, if satp_mode.supported = 0, satp_mode_supported_max
wil be '31 - 32'. But this is C, so satp_mode_supported_max will gladly
set it to UINT_MAX (4294967295). After that, if the user didn't set a
satp_mode, set_satp_mode_default_map(cpu) will make

cfg.satp_mode.map = cfg.satp_mode.supported

So satp_mode.map = 0. And then satp_mode_map_max will be set to
satp_mode_max_from_map(cpu->cfg.satp_mode.map), i.e. also UINT_MAX. The
guard "satp_mode_map_max > satp_mode_supported_max" doesn't protect us
here since both are UINT_MAX.

And finally we have 2 loops:

        for (int i = satp_mode_map_max - 1; i >= 0; --i) {

Which are, in fact, 2 loops from UINT_MAX -1 to -1. This is where the
extra delay when booting the 'host' CPU is coming from.

Commit 43d1de32f8 already set a precedence for satp_mode.supported = 0
in a different manner. We're doing the same here. If supported == 0,
interpret as 'the CPU wants the OS to handle satp mode alone' and skip
satp_mode_finalize().

We'll also put a guard in satp_mode_max_from_map() to assert out if map
is 0 since the function is not ready to deal with it.

Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Fixes: 6f23aaeb9b ("riscv: Allow user to set the satp mode")
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230817152903.694926-1-dbarboza@ventanamicro.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit 3a2fc23563)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-20 10:18:14 +03:00
Conor Dooley e94ea3c6db hw/riscv: virt: Fix riscv,pmu DT node path
On a dtb dumped from the virt machine, dt-validate complains:
soc: pmu: {'riscv,event-to-mhpmcounters': [[1, 1, 524281], [2, 2, 524284], [65561, 65561, 524280], [65563, 65563, 524280], [65569, 65569, 524280]], 'compatible': ['riscv,pmu']} should not be valid under {'type': 'object'}
        from schema $id: http://devicetree.org/schemas/simple-bus.yaml#
That's pretty cryptic, but running the dtb back through dtc produces
something a lot more reasonable:
Warning (simple_bus_reg): /soc/pmu: missing or empty reg/ranges property

Moving the riscv,pmu node out of the soc bus solves the problem.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20230727-groom-decline-2c57ce42841c@spud>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit 9ff3140631)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-20 10:18:14 +03:00
LIU Zhiwei 9bac2bcf10 linux-user/riscv: Use abi type for target_ucontext
We should not use types dependend on host arch for target_ucontext.
This bug is found when run rv32 applications.

Signed-off-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230811055438.1945-1-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit ae7d4d625c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-20 10:18:14 +03:00
Jason Chien c00a9ec061 hw/intc: Make rtc variable names consistent
The variables whose values are given by cpu_riscv_read_rtc() should be named
"rtc". The variables whose value are given by cpu_riscv_read_rtc_raw()
should be named "rtc_r".

Signed-off-by: Jason Chien <jason.chien@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20230728082502.26439-2-jason.chien@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit 9382a9eafc)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-20 10:18:14 +03:00
Jason Chien fd1a0c89c6 hw/intc: Fix upper/lower mtime write calculation
When writing the upper mtime, we should keep the original lower mtime
whose value is given by cpu_riscv_read_rtc() instead of
cpu_riscv_read_rtc_raw(). The same logic applies to writes to lower mtime.

Signed-off-by: Jason Chien <jason.chien@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20230728082502.26439-1-jason.chien@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit e0922b73ba)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-20 10:18:14 +03:00
Thomas Huth a57e4cc6fe hw/char/riscv_htif: Fix the console syscall on big endian hosts
Values that have been read via cpu_physical_memory_read() from the
guest's memory have to be swapped in case the host endianess differs
from the guest.

Fixes: a6e13e31d5 ("riscv_htif: Support console output via proxy syscall")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng@tinylab.org>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-Id: <20230721094720.902454-3-thuth@redhat.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit 058096f1c5)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: context fix in hw/char/riscv_htif.c for #include)
2023-09-20 10:18:14 +03:00
Thomas Huth aeb931d82b include/exec: Provide the tswap() functions for target independent code, too
In some cases of target independent code, it would be useful to have access
to the functions that swap endianess in case it differs between guest and
host. Thus re-implement the tswapXX() functions in a new header that can be
included separately. The check whether the swapping is needed continues to
be done at compile-time for target specific code, while it is done at
run-time in target-independent code.

Message-Id: <20230411183418.1640500-3-thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 24be3369ad)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(trivial change needed for the next commit 058096f1c5
 "hw/char/riscv_htif: Fix the console syscall on big endian hosts")
2023-09-20 10:17:02 +03:00
Thomas Huth 3af03de983 hw/char/riscv_htif: Fix printing of console characters on big endian hosts
The character that should be printed is stored in the 64 bit "payload"
variable. The code currently tries to print it by taking the address
of the variable and passing this pointer to qemu_chr_fe_write(). However,
this only works on little endian hosts where the least significant bits
are stored on the lowest address. To do this in a portable way, we have
to store the value in an uint8_t variable instead.

Fixes: 5033606780 ("RISC-V HTIF Console")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng@tinylab.org>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230721094720.902454-2-thuth@redhat.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit c255946e3d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-12 13:37:02 +03:00
Colton Lewis 53a4e7ef42 arm64: Restore trapless ptimer access
Due to recent KVM changes, QEMU is setting a ptimer offset resulting
in unintended trap and emulate access and a consequent performance
hit. Filter out the PTIMER_CNT register to restore trapless ptimer
access.

Quoting Andrew Jones:

Simply reading the CNT register and writing back the same value is
enough to set an offset, since the timer will have certainly moved
past whatever value was read by the time it's written.  QEMU
frequently saves and restores all registers in the get-reg-list array,
unless they've been explicitly filtered out (with Linux commit
680232a94c12, KVM_REG_ARM_PTIMER_CNT is now in the array). So, to
restore trapless ptimer accesses, we need a QEMU patch to filter out
the register.

See
https://lore.kernel.org/kvmarm/gsntttsonus5.fsf@coltonlewis-kvm.c.googlers.com/T/#m0770023762a821db2a3f0dd0a7dc6aa54e0d0da9
for additional context.

Cc: qemu-stable@nongnu.org
Signed-off-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Colton Lewis <coltonlewis@google.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Colton Lewis <coltonlewis@google.com>
Message-id: 20230831190052.129045-1-coltonlewis@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 682814e2a3)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-11 22:33:58 +03:00
Kevin Wolf 41af7a9bc4 virtio: Drop out of coroutine context in virtio_load()
virtio_load() as a whole should run in coroutine context because it
reads from the migration stream and we don't want this to block.

However, it calls virtio_set_features_nocheck() and devices don't
expect their .set_features callback to run in a coroutine and therefore
call functions that may not be called in coroutine context. To fix this,
drop out of coroutine context for calling virtio_set_features_nocheck().

Without this fix, the following crash was reported:

  #0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
  #1  0x00007efc738c05d3 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
  #2  0x00007efc73873d26 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
  #3  0x00007efc738477f3 in __GI_abort () at abort.c:79
  #4  0x00007efc7384771b in __assert_fail_base (fmt=0x7efc739dbcb8 "", assertion=assertion@entry=0x560aebfbf5cf "!qemu_in_coroutine()",
     file=file@entry=0x560aebfcd2d4 "../block/graph-lock.c", line=line@entry=275, function=function@entry=0x560aebfcd34d "void bdrv_graph_rdlock_main_loop(void)") at assert.c:92
  #5  0x00007efc7386ccc6 in __assert_fail (assertion=0x560aebfbf5cf "!qemu_in_coroutine()", file=0x560aebfcd2d4 "../block/graph-lock.c", line=275,
     function=0x560aebfcd34d "void bdrv_graph_rdlock_main_loop(void)") at assert.c:101
  #6  0x0000560aebcd8dd6 in bdrv_register_buf ()
  #7  0x0000560aeb97ed97 in ram_block_added.llvm ()
  #8  0x0000560aebb8303f in ram_block_add.llvm ()
  #9  0x0000560aebb834fa in qemu_ram_alloc_internal.llvm ()
  #10 0x0000560aebb2ac98 in vfio_region_mmap ()
  #11 0x0000560aebb3ea0f in vfio_bars_register ()
  #12 0x0000560aebb3c628 in vfio_realize ()
  #13 0x0000560aeb90f0c2 in pci_qdev_realize ()
  #14 0x0000560aebc40305 in device_set_realized ()
  #15 0x0000560aebc48e07 in property_set_bool.llvm ()
  #16 0x0000560aebc46582 in object_property_set ()
  #17 0x0000560aebc4cd58 in object_property_set_qobject ()
  #18 0x0000560aebc46ba7 in object_property_set_bool ()
  #19 0x0000560aeb98b3ca in qdev_device_add_from_qdict ()
  #20 0x0000560aebb1fbaf in virtio_net_set_features ()
  #21 0x0000560aebb46b51 in virtio_set_features_nocheck ()
  #22 0x0000560aebb47107 in virtio_load ()
  #23 0x0000560aeb9ae7ce in vmstate_load_state ()
  #24 0x0000560aeb9d2ee9 in qemu_loadvm_state_main ()
  #25 0x0000560aeb9d45e1 in qemu_loadvm_state ()
  #26 0x0000560aeb9bc32c in process_incoming_migration_co.llvm ()
  #27 0x0000560aebeace56 in coroutine_trampoline.llvm ()

Cc: qemu-stable@nongnu.org
Buglink: https://issues.redhat.com/browse/RHEL-832
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230905145002.46391-3-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 92e2e6a867)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-11 22:31:56 +03:00
Marc-André Lureau 5929f53091 qxl: don't assert() if device isn't yet initialized
If the PCI BAR isn't yet mapped or was unmapped, QXL_IO_SET_MODE will
assert(). Instead, report a guest bug and keep going.

This can be reproduced with:

cat << EOF | ./qemu-system-x86_64 -vga qxl -m 2048 -nodefaults -qtest stdio
outl 0xcf8 0x8000101c
outl 0xcfc 0xc000
outl 0xcf8 0x80001001
outl 0xcfc 0x01000000
outl 0xc006 0x00
EOF

Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1829

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 95bef686e4)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:40:11 +03:00
Thomas Huth c68b844d33 hw/net/vmxnet3: Fix guest-triggerable assert()
The assert() that checks for valid MTU sizes can be triggered by
the guest (e.g. with the reproducer code from the bug ticket
https://gitlab.com/qemu-project/qemu/-/issues/517 ). Let's avoid
this problem by simply logging the error and refusing to activate
the device instead.

Fixes: d05dcd94ae ("net: vmxnet3: validate configuration values during activate")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
[Mjt: change format specifier from %d to %u for uint32_t argument]
(cherry picked from commit 90a0778421)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:40:11 +03:00
Markus Armbruster 42edb4723a docs tests: Fix use of migrate_set_parameter
docs/multi-thread-compression.txt uses parameter names with
underscores instead of dashes.  Wrong since day one.

docs/rdma.txt, tests/qemu-iotests/181, and tests/qtest/test-hmp.c are
wrong the same way since commit cbde7be900 (v6.0.0).  Hard to see,
as test-hmp doesn't check whether the commands work, and iotest 181
appears to be unaffected.

Fixes: 263170e679 (docs: Add a doc about multiple thread compression)
Fixes: cbde7be900 (migrate: remove QMP/HMP commands for speed, downtime and cache size)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit b21a6e31a1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:40:11 +03:00
Thomas Huth 45b61f730d qemu-options.hx: Rephrase the descriptions of the -hd* and -cdrom options
The current description says that these options will create a device
on the IDE bus, which is only true on x86. So rephrase these sentences
a little bit to speak of "default bus" instead.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit bcd8e24308)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:40:11 +03:00
Hang Yu c2ec46c694 hw/i2c/aspeed: Fix TXBUF transmission start position error
According to the ast2600 datasheet and the linux aspeed i2c driver,
the TXBUF transmission start position should be TXBUF[0] instead
of TXBUF[1],so the arg pool_start is useless,and the address is not
included in TXBUF.So even if Tx Count equals zero,there is at least
1 byte data needs to be transmitted,and M_TX_CMD should not be cleared
at this condition.The driver url is:
https://github.com/AspeedTech-BMC/linux/blob/aspeed-master-v5.15/drivers/i2c/busses/i2c-ast2600.c

Signed-off-by: Hang Yu <francis_yuu@stu.pku.edu.cn>
Fixes: 6054fc73e8 ("aspeed/i2c: Add support for pool buffer transfers")
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit 961faf3ddb)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:40:11 +03:00
Hang Yu 4a398e64ba hw/i2c/aspeed: Fix Tx count and Rx size error in buffer pool mode
Fixed inconsistency between the regisiter bit field definition header file
and the ast2600 datasheet. The reg name is I2CD1C:Pool Buffer Control
Register in old register mode and  I2CC0C: Master/Slave Pool Buffer Control
Register in new register mode. They share bit field
[12:8]:Transmit Data Byte Count and bit field
[29:24]:Actual Received Pool Buffer Size according to the datasheet.
According to the ast2600 datasheet,the actual Tx count is
Transmit Data Byte Count plus 1, and the max Rx size is
Receive Pool Buffer Size plus 1, both in Pool Buffer Control Register.
The version before forgot to plus 1, and mistake Rx count for Rx size.

Signed-off-by: Hang Yu <francis_yuu@stu.pku.edu.cn>
Fixes: 3be3d6ccf2 ("aspeed: i2c: Migrate to registerfields API")
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit 97b8aa5ae9)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:40:11 +03:00
Niklas Cassel 4f6c553717 hw/ide/ahci: fix broken SError handling
When encountering an NCQ error, you should not write the NCQ tag to the
SError register. This is completely wrong.

The SError register has a clear definition, where each bit represents a
different error, see PxSERR definition in AHCI 1.3.1.

If we write a random value (like the NCQ tag) in SError, e.g. Linux will
read SError, and will trigger arbitrary error handling depending on the
NCQ tag that happened to be executing.

In case of success, ncq_cb() will call ncq_finish().
In case of error, ncq_cb() will call ncq_err() (which will clear
ncq_tfs->used), and then call ncq_finish(), thus using ncq_tfs->used is
sufficient to tell if finished should get set or not.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230609140844.202795-9-nks@flawful.org
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 9f89423537)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:40:11 +03:00
Niklas Cassel 9c7e2253eb hw/ide/ahci: fix ahci_write_fis_sdb()
When there is an error, we need to raise a TFES error irq, see AHCI 1.3.1,
5.3.13.1 SDB:Entry.

If ERR_STAT is set, we jump to state ERR:FatalTaskfile, which will raise
a TFES IRQ unconditionally, regardless if the I bit is set in the FIS or
not.

Thus, we should never raise a normal IRQ after having sent an error IRQ.

It is valid to signal successfully completed commands as finished in the
same SDB FIS that generates the error IRQ. The important thing is that
commands that did not complete successfully (e.g. commands that were
aborted, do not get the finished bit set).

Before this commit, there was never a TFES IRQ raised on NCQ error.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230609140844.202795-8-nks@flawful.org
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 7e85cb0db4)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:40:11 +03:00
Niklas Cassel f7cca09987 hw/ide/ahci: PxCI should not get cleared when ERR_STAT is set
For NCQ, PxCI is cleared on command queued successfully.
For non-NCQ, PxCI is cleared on command completed successfully.
Successfully means ERR_STAT, BUSY and DRQ are all cleared.

A command that has ERR_STAT set, does not get to clear PxCI.
See AHCI 1.3.1, section 5.3.8, states RegFIS:Entry and RegFIS:ClearCI,
and 5.3.16.5 ERR:FatalTaskfile.

In the case of non-NCQ commands, not clearing PxCI is needed in order
for host software to be able to see which command slot that failed.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Message-id: 20230609140844.202795-7-nks@flawful.org
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 1a16ce64fd)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:58 +03:00
Niklas Cassel 2eaf7775fc hw/ide/ahci: PxSACT and PxCI is cleared when PxCMD.ST is cleared
According to AHCI 1.3.1 definition of PxSACT:
This field is cleared when PxCMD.ST is written from a '1' to a '0' by
software. This field is not cleared by a COMRESET or a software reset.

According to AHCI 1.3.1 definition of PxCI:
This field is also cleared when PxCMD.ST is written from a '1' to a '0'
by software.

Clearing PxCMD.ST is part of the error recovery procedure, see
AHCI 1.3.1, section "6.2 Error Recovery".

If we don't clear PxCI on error recovery, the previous command will
incorrectly still be marked as pending after error recovery.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230609140844.202795-6-nks@flawful.org
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit d73b84d0b6)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Niklas Cassel 7bcd32128b hw/ide/ahci: simplify and document PxCI handling
The AHCI spec states that:
For NCQ, PxCI is cleared on command queued successfully.

For non-NCQ, PxCI is cleared on command completed successfully.
(A non-NCQ command that completes with error does not clear PxCI.)

The current QEMU implementation either clears PxCI in check_cmd(),
or in ahci_cmd_done().

check_cmd() will clear PxCI for a command if handle_cmd() returns 0.
handle_cmd() will return -1 if BUSY or DRQ is set.

The QEMU implementation for NCQ commands will currently not set BUSY
or DRQ, so they will always have PxCI cleared by handle_cmd().
ahci_cmd_done() will never even get called for NCQ commands.

Non-NCQ commands are executed by ide_bus_exec_cmd().
Non-NCQ commands in QEMU are implemented either in a sync or in an async
way.

For non-NCQ commands implemented in a sync way, the command handler will
return true, and when ide_bus_exec_cmd() sees that a command handler
returns true, it will call ide_cmd_done() (which will call
ahci_cmd_done()). For a command implemented in a sync way,
ahci_cmd_done() will do nothing (since busy_slot is not set). Instead,
after ide_bus_exec_cmd() has finished, check_cmd() will clear PxCI for
these commands.

For non-NCQ commands implemented in an async way (using either aiocb or
pio_aiocb), the command handler will return false, ide_bus_exec_cmd()
will not call ide_cmd_done(), instead it is expected that the async
callback function will call ide_cmd_done() once the async command is
done. handle_cmd() will set busy_slot, if and only if BUSY or DRQ is
set, and this is checked _after_ ide_bus_exec_cmd() has returned.
handle_cmd() will return -1, so check_cmd() will not clear PxCI.
When the async callback calls ide_cmd_done() (which will call
ahci_cmd_done()), it will see that busy_slot is set, and
ahci_cmd_done() will clear PxCI.

This seems racy, since busy_slot is set _after_ ide_bus_exec_cmd() has
returned. The callback might come before busy_slot gets set. And it is
quite confusing that ahci_cmd_done() will be called for all non-NCQ
commands when the command is done, but will only clear PxCI in certain
cases, even though it will always write a D2H FIS and raise an IRQ.

Even worse, in the case where ahci_cmd_done() does not clear PxCI, it
still raises an IRQ. Host software might thus read an old PxCI value,
since PxCI is cleared (by check_cmd()) after the IRQ has been raised.

Try to simplify this by always setting busy_slot for non-NCQ commands,
such that ahci_cmd_done() will always be responsible for clearing PxCI
for non-NCQ commands.

For NCQ commands, clear PxCI when we receive the D2H FIS, but before
raising the IRQ, see AHCI 1.3.1, section 5.3.8, states RegFIS:Entry and
RegFIS:ClearCI.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Message-id: 20230609140844.202795-5-nks@flawful.org
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit e2a5d9b3d9)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Niklas Cassel 362a4d8658 hw/ide/ahci: write D2H FIS when processing NCQ command
The way that BUSY + PxCI is cleared for NCQ (FPDMA QUEUED) commands is
described in SATA 3.5a Gold:

11.15 FPDMA QUEUED command protocol
DFPDMAQ2: ClearInterfaceBsy
"Transmit Register Device to Host FIS with the BSY bit cleared to zero
and the DRQ bit cleared to zero and Interrupt bit cleared to zero to
mark interface ready for the next command."

PxCI is currently cleared by handle_cmd(), but we don't write the D2H
FIS to the FIS Receive Area that actually caused PxCI to be cleared.

Similar to how ahci_pio_transfer() calls ahci_write_fis_pio() with an
additional parameter to write a PIO Setup FIS without raising an IRQ,
add a parameter to ahci_write_fis_d2h() so that ahci_write_fis_d2h()
also can write the FIS to the FIS Receive Area without raising an IRQ.

Change process_ncq_command() to call ahci_write_fis_d2h() without
raising an IRQ (similar to ahci_pio_transfer()), such that the FIS
Receive Area is in sync with the PxTFD shadow register.

E.g. Linux reads status and error fields from the FIS Receive Area
directly, so it is wise to keep the FIS Receive Area and the PxTFD
shadow register in sync.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Message-id: 20230609140844.202795-4-nks@flawful.org
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 2967dc8209)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Niklas Cassel 67894ec9fd hw/ide/core: set ERR_STAT in unsupported command completion
Currently, the first time sending an unsupported command
(e.g. READ LOG DMA EXT) will not have ERR_STAT set in the completion.
Sending the unsupported command again, will correctly have ERR_STAT set.

When ide_cmd_permitted() returns false, it calls ide_abort_command().
ide_abort_command() first calls ide_transfer_stop(), which will call
ide_transfer_halt() and ide_cmd_done(), after that ide_abort_command()
sets ERR_STAT in status.

ide_cmd_done() for AHCI will call ahci_write_fis_d2h() which writes the
current status in the FIS, and raises an IRQ. (The status here will not
have ERR_STAT set!).

Thus, we cannot call ide_transfer_stop() before setting ERR_STAT, as
ide_transfer_stop() will result in the FIS being written and an IRQ
being raised.

The reason why it works the second time, is that ERR_STAT will still
be set from the previous command, so when writing the FIS, the
completion will correctly have ERR_STAT set.

Set ERR_STAT before writing the FIS (calling cmd_done), so that we will
raise an error IRQ correctly when receiving an unsupported command.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230609140844.202795-3-nks@flawful.org
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit c3461c6264)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Richard Henderson 956b96f9e2 target/ppc: Flush inputs to zero with NJ in ppc_store_vscr
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1779
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit af03aeb631)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Nicholas Piggin ea25506b5d ppc/vof: Fix missed fields in VOF cleanup
Failing to reset the of_instance_last makes ihandle allocation continue
to increase, which causes record-replay replay fail to match the
recorded trace.

Not resetting claimed_base makes VOF eventually run out of memory after
some resets.

Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Fixes: fc8c745d50 ("spapr: Implement Open Firmware client interface")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit 7b8589d7ce)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Maksim Kostin fcb49ea23c hw/ppc/e500: fix broken snapshot replay
ppce500_reset_device_tree is registered for system reset, but after
c4b075318e this function rerandomizes rng-seed via
qemu_guest_getrandom_nofail. And when loading a snapshot, it tries to read
EVENT_RANDOM that doesn't exist, so we have an error:

  qemu-system-ppc: Missing random event in the replay log

To fix this, use qemu_register_reset_nosnapshotload instead of
qemu_register_reset.

Reported-by: Vitaly Cheptsov <cheptsov@ispras.ru>
Fixes: c4b075318e ("hw/ppc: pass random seed to fdt ")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1634
Signed-off-by: Maksim Kostin <maksim.kostin@ispras.ru>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit 6ec65b69ba)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Fabiano Rosas e8bb4dc55a block-migration: Ensure we don't crash during migration cleanup
We can fail the blk_insert_bs() at init_blk_migration(), leaving the
BlkMigDevState without a dirty_bitmap and BlockDriverState. Account
for the possibly missing elements when doing cleanup.

Fix the following crashes:

Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x0000555555ec83ef in bdrv_release_dirty_bitmap (bitmap=0x0) at ../block/dirty-bitmap.c:359
359         BlockDriverState *bs = bitmap->bs;
 #0  0x0000555555ec83ef in bdrv_release_dirty_bitmap (bitmap=0x0) at ../block/dirty-bitmap.c:359
 #1  0x0000555555bba331 in unset_dirty_tracking () at ../migration/block.c:371
 #2  0x0000555555bbad98 in block_migration_cleanup_bmds () at ../migration/block.c:681

Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x0000555555e971ff in bdrv_op_unblock (bs=0x0, op=BLOCK_OP_TYPE_BACKUP_SOURCE, reason=0x0) at ../block.c:7073
7073        QLIST_FOREACH_SAFE(blocker, &bs->op_blockers[op], list, next) {
 #0  0x0000555555e971ff in bdrv_op_unblock (bs=0x0, op=BLOCK_OP_TYPE_BACKUP_SOURCE, reason=0x0) at ../block.c:7073
 #1  0x0000555555e9734a in bdrv_op_unblock_all (bs=0x0, reason=0x0) at ../block.c:7095
 #2  0x0000555555bbae13 in block_migration_cleanup_bmds () at ../migration/block.c:690

Signed-off-by: Fabiano Rosas <farosas@suse.de>
Message-id: 20230731203338.27581-1-farosas@suse.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit f187609f27)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Philippe Mathieu-Daudé 3c934310ff docs/about/license: Update LICENSE URL
In early 2021 (see commit 2ad784339e "docs: update README to use
GitLab repo URLs") almost all of the code base was converted to
point to GitLab instead of git.qemu.org. During 2023, git.qemu.org
switched from a git mirror to a http redirect to GitLab (see [1]).

Update the LICENSE URL to match its previous content, displaying
the file raw content similarly to gitweb 'blob_plain' format ([2]).

[1] https://lore.kernel.org/qemu-devel/CABgObfZu3mFc8tM20K-yXdt7F-7eV-uKZN4sKDarSeu7DYoRbA@mail.gmail.com/
[2] https://git-scm.com/docs/gitweb#Documentation/gitweb.txt-blobplain

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230822125716.55295-1-philmd@linaro.org>
(cherry picked from commit 09a3fffae0)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Richard Henderson d4c0ac705d target/arm: Fix 64-bit SSRA
Typo applied byte-wise shift instead of double-word shift.

Cc: qemu-stable@nongnu.org
Fixes: 631e565450 ("target/arm: Create gen_gvec_[us]sra")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1737
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230821022025.397682-1-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit cd1e4db736)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Richard Henderson 09640031ed target/arm: Fix SME ST1Q
A typo, noted in the bug report, resulting in an
incorrect write offset.

Cc: qemu-stable@nongnu.org
Fixes: 7390e0e9ab ("target/arm: Implement SME LD1, ST1")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1833
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230818214255.146905-1-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 4b3520fd93)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Akihiko Odaki f5cb21416e accel/kvm: Specify default IPA size for arm64
Before this change, the default KVM type, which is used for non-virt
machine models, was 0.

The kernel documentation says:
> On arm64, the physical address size for a VM (IPA Size limit) is
> limited to 40bits by default. The limit can be configured if the host
> supports the extension KVM_CAP_ARM_VM_IPA_SIZE. When supported, use
> KVM_VM_TYPE_ARM_IPA_SIZE(IPA_Bits) to set the size in the machine type
> identifier, where IPA_Bits is the maximum width of any physical
> address used by the VM. The IPA_Bits is encoded in bits[7-0] of the
> machine type identifier.
>
> e.g, to configure a guest to use 48bit physical address size::
>
>     vm_fd = ioctl(dev_fd, KVM_CREATE_VM, KVM_VM_TYPE_ARM_IPA_SIZE(48));
>
> The requested size (IPA_Bits) must be:
>
>  ==   =========================================================
>   0   Implies default size, 40bits (for backward compatibility)
>   N   Implies N bits, where N is a positive integer such that,
>       32 <= N <= Host_IPA_Limit
>  ==   =========================================================

> Host_IPA_Limit is the maximum possible value for IPA_Bits on the host
> and is dependent on the CPU capability and the kernel configuration.
> The limit can be retrieved using KVM_CAP_ARM_VM_IPA_SIZE of the
> KVM_CHECK_EXTENSION ioctl() at run-time.
>
> Creation of the VM will fail if the requested IPA size (whether it is
> implicit or explicit) is unsupported on the host.
https://docs.kernel.org/virt/kvm/api.html#kvm-create-vm

So if Host_IPA_Limit < 40, specifying 0 as the type will fail. This
actually confused libvirt, which uses "none" machine model to probe the
KVM availability, on M2 MacBook Air.

Fix this by using Host_IPA_Limit as the default type when
KVM_CAP_ARM_VM_IPA_SIZE is available.

Cc: qemu-stable@nongnu.org
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-id: 20230727073134.134102-3-akihiko.odaki@daynix.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 1ab445af8c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Akihiko Odaki aa152711db kvm: Introduce kvm_arch_get_default_type hook
kvm_arch_get_default_type() returns the default KVM type. This hook is
particularly useful to derive a KVM type that is valid for "none"
machine model, which is used by libvirt to probe the availability of
KVM.

For MIPS, the existing mips_kvm_type() is reused. This function ensures
the availability of VZ which is mandatory to use KVM on the current
QEMU.

Cc: qemu-stable@nongnu.org
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-id: 20230727073134.134102-2-akihiko.odaki@daynix.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: added doc comment for new function]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 5e0d65909c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Thomas Huth f2f8e74ff4 include/hw/virtio/virtio-gpu: Fix virtio-gpu with blob on big endian hosts
Using "-device virtio-gpu,blob=true" currently does not work on big
endian hosts (like s390x). The guest kernel prints an error message
like:

 [drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1200 (command 0x10c)

and the display stays black. When running QEMU with "-d guest_errors",
it shows an error message like this:

 virtio_gpu_create_mapping_iov: nr_entries is too big (83886080 > 16384)

which indicates that this value has not been properly byte-swapped.
And indeed, the virtio_gpu_create_blob_bswap() function (that should
swap the fields in the related structure) fails to swap some of the
entries. After correctly swapping all missing values here, too, the
virtio-gpu device is now also working with blob=true on s390x hosts.

Fixes: e0933d91b1 ("virtio-gpu: Add virtio_gpu_resource_create_blob")
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2230469
Message-Id: <20230815122007.928049-1-thuth@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit d194362910)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Ilya Leoshkevich 96fd3b8508 target/s390x: Check reserved bits of VFMIN/VFMAX's M5
VFMIN and VFMAX should raise a specification exceptions when bits 1-3
of M5 are set.

Cc: qemu-stable@nongnu.org
Fixes: da4807527f ("s390x/tcg: Implement VECTOR FP (MAXIMUM|MINIMUM)")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230804234621.252522-1-iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 6a2ea61518)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Ilya Leoshkevich 62ac9cbb6f target/s390x: Fix VSTL with a large length
The length is always truncated to 16 bytes. Do not probe more than
that.

Cc: qemu-stable@nongnu.org
Fixes: 0e0a5b49ad ("s390x/tcg: Implement VECTOR STORE WITH LENGTH")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230804235624.263260-1-iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 6db3518ba4)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Ilya Leoshkevich 14f78932e0 target/s390x: Use a 16-bit immediate in VREP
Unlike most other instructions that contain an immediate element index,
VREP's one is 16-bit, and not 4-bit. The code uses only 8 bits, so
using, e.g., 0x101 does not lead to a specification exception.

Fix by checking all 16 bits.

Cc: qemu-stable@nongnu.org
Fixes: 28d08731b1 ("s390x/tcg: Implement VECTOR REPLICATE")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230807163459.849766-1-iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 23e87d419f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Ilya Leoshkevich 179a37924d target/s390x: Fix the "ignored match" case in VSTRS
Currently the emulation of VSTRS recognizes partial matches in presence
of \0 in the haystack, which, according to PoP, is not correct:

    If the ZS flag is one and a zero byte was detected
    in the second operand, then there can not be a
    partial match ...

Add a check for this. While at it, fold a number of explicitly handled
special cases into the generic logic.

Cc: qemu-stable@nongnu.org
Reported-by: Claudio Fontana <cfontana@suse.de>
Closes: https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg00633.html
Fixes: 1d706f3141 ("target/s390x: vxeh2: vector string search")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230804233748.218935-3-iii@linux.ibm.com>
Tested-by: Claudio Fontana <cfontana@suse.de>
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 791b2b6a93)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Bernhard Beschow b4b3aac5b5 hw/sd/sdhci: Do not force sdhci_mmio_*_ops onto all SD controllers
Since commit c0a55a0c9d "hw/sd/sdhci: Support big endian SD host controller
interfaces" sdhci_common_realize() forces all SD card controllers to use either
sdhci_mmio_le_ops or sdhci_mmio_be_ops, depending on the "endianness" property.
However, there are device models which use different MMIO ops: TYPE_IMX_USDHC
uses usdhc_mmio_ops and TYPE_S3C_SDHCI uses sdhci_s3c_mmio_ops.

Forcing sdhci_mmio_le_ops breaks SD card handling on the "sabrelite" board, for
example. Fix this by defaulting the io_ops to little endian and switch to big
endian in sdhci_common_realize() only if there is a matchig big endian variant
available.

Fixes: c0a55a0c9d ("hw/sd/sdhci: Support big endian SD host controller
interfaces")

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Message-Id: <20230709080950.92489-1-shentey@gmail.com>
(cherry picked from commit 3b83079015)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Luca Bonissi af0c16fae9 Fixed incorrect LLONG alignment for openrisc and cris
OpenRISC (or1k) has long long alignment to 4 bytes, but currently not
defined in abitypes.h. This lead to incorrect packing of /epoll_event/
structure and eventually infinite loop while waiting for file
descriptor[s] event[s].

Fixed also CRIS alignments (1 byte for all types).

Signed-off-by: Luca Bonissi <qemu@bonslack.org>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1770
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 6ee960823d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Richard Henderson 40cfe12cb6 include/exec/user: Set ABI_LLONG_ALIGNMENT to 4 for nios2
Based on gcc's nios2.h setting BIGGEST_ALIGNMENT to 32 bits.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit ea9812d93f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Richard Henderson 43d2db4928 include/exec/user: Set ABI_LLONG_ALIGNMENT to 4 for microblaze
Based on gcc's microblaze.h setting BIGGEST_ALIGNMENT to 32 bits.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit e73f27003e)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Nathan Egge ee638bc5b5 linux-user/elfload: Set V in ELF_HWCAP for RISC-V
Set V bit for hwcap if misa is set.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1793
Signed-off-by: Nathan Egge <negge@xiph.org>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Tested-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-Id: <20230803131424.40744-1-negge@xiph.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 4333f0924c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Klaus Jensen bb5f9036d5 hw/nvme: fix null pointer access in ruh update
The Reclaim Unit Update operation in I/O Management Receive does not
verify the presence of a configured endurance group prior to accessing
it.

Fix this.

Cc: qemu-stable@nongnu.org
Fixes: 73064edfb8 ("hw/nvme: flexible data placement emulation")
Reviewed-by: Jesper Wendel Devantier <j.devantier@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
(cherry picked from commit 3439ba9c5d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Klaus Jensen 43328764f7 hw/nvme: fix null pointer access in directive receive
nvme_directive_receive() does not check if an endurance group has been
configured (set) prior to testing if flexible data placement is enabled
or not.

Fix this.

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1815
Fixes: 73064edfb8 ("hw/nvme: flexible data placement emulation")
Reviewed-by: Jesper Wendel Devantier <j.devantier@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
(cherry picked from commit 6c8f8456cb)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Ankit Kumar f47369c3d1 hw/nvme: fix CRC64 for guard tag
The nvme CRC64 generator expects the caller to pass inverted seed value.
Pass inverted crc value for metadata buffer.

Cc: qemu-stable@nongnu.org
Fixes: 44219b6029 ("hw/nvme: 64-bit pi support")
Signed-off-by: Ankit Kumar <ankit.kumar@samsung.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
(cherry picked from commit dbdb13f931)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Klaus Jensen cbd3c5db76 hw/nvme: fix compliance issue wrt. iosqes/iocqes
As of prior to this patch, the controller checks the value of CC.IOCQES
and CC.IOSQES prior to enabling the controller. As reported by Ben in
GitLab issue #1691, this is not spec compliant. The controller should
only check these values when queues are created.

This patch moves these checks to nvme_create_cq(). We do not need to
check it in nvme_create_sq() since that will error out if the completion
queue is not already created.

Also, since the controller exclusively supports SQEs of size 64 bytes
and CQEs of size 16 bytes, hard code that.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1691
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
(cherry picked from commit 6a33f2e920)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Klaus Jensen dd496f92b9 hw/nvme: fix oob memory read in fdp events log
As reported by Trend Micro's Zero Day Initiative, an oob memory read
vulnerability exists in nvme_fdp_events(). The host-provided offset is
not verified.

Fix this.

This is only exploitable when Flexible Data Placement mode (fdp=on) is
enabled.

Fixes: CVE-2023-4135
Fixes: 73064edfb8 ("hw/nvme: flexible data placement emulation")
Reported-by: Trend Micro's Zero Day Initiative
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
(cherry picked from commit ecb1b7b082)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Dongli Zhang a11a2007a5 dump: kdump-zlib data pages not dumped with pvtime/aarch64
The kdump-zlib data pages are not dumped from aarch64 host when the
'pvtime' is involved, that is, when the block->target_end is not aligned to
page_size. In the below example, it is expected to dump two blocks.

(qemu) info mtree -f
... ...
  00000000090a0000-00000000090a0fff (prio 0, ram): pvtime KVM
... ...
  0000000040000000-00000001bfffffff (prio 0, ram): mach-virt.ram KVM
... ...

However, there is an issue with get_next_page() so that the pages for
"mach-virt.ram" will not be dumped.

At line 1296, although we have reached at the end of the 'pvtime' block,
since it is not aligned to the page_size (e.g., 0x10000), it will not break
at line 1298.

1255 static bool get_next_page(GuestPhysBlock **blockptr, uint64_t *pfnptr,
1256                           uint8_t **bufptr, DumpState *s)
... ...
1294             memcpy(buf + addr % page_size, hbuf, n);
1295             addr += n;
1296             if (addr % page_size == 0) {
1297                 /* we filled up the page */
1298                 break;
1299             }

As a result, get_next_page() will continue to the next
block ("mach-virt.ram"). Finally, when get_next_page() returns to the
caller:

- 'pfnptr' is referring to the 'pvtime'
- but 'blockptr' is referring to the "mach-virt.ram"

When get_next_page() is called the next time, "*pfnptr += 1" still refers
to the prior 'pvtime'. It will exit immediately because it is out of the
range of the current "mach-virt.ram".

The fix is to break when it is time to come to the next block, so that both
'pfnptr' and 'blockptr' refer to the same block.

Fixes: 94d788408d ("dump: fix kdump to work over non-aligned blocks")
Cc: Joe Jin <joe.jin@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20230713055819.30497-1-dongli.zhang@oracle.com>
(cherry picked from commit 8a64609eea)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
BALATON Zoltan 07b7ec0af0 hw/i2c: Fix bitbang_i2c_data trace event
The clock and data values were logged swapped. Correct the trace event
text to match what is logged. Also fix a typo in a comment nearby.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 8ada214a90)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Zhao Liu abb4828d5d hw/smbios: Fix core count in type4
>From SMBIOS 3.0 specification, core count field means:

Core Count is the number of cores detected by the BIOS for this
processor socket. [1]

Before 003f230e37 ("machine: Tweak the order of topology members in
struct CpuTopology"), MachineState.smp.cores means "the number of cores
in one package", and it's correct to use smp.cores for core count.

But 003f230e37 changes the smp.cores' meaning to "the number of cores
in one die" and doesn't change the original smp.cores' use in smbios as
well, which makes core count in type4 go wrong.

Fix this issue with the correct "cores per socket" caculation.

[1] SMBIOS 3.0.0, section 7.5.6, Processor Information - Core Count

Fixes: 003f230e37 ("machine: Tweak the order of topology members in struct CpuTopology")
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20230628135437.1145805-5-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 196ea60a73)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Zhao Liu ce2e3879a4 hw/smbios: Fix thread count in type4
>From SMBIOS 3.0 specification, thread count field means:

Thread Count is the total number of threads detected by the BIOS for
this processor socket. It is a processor-wide count, not a
thread-per-core count. [1]

So here we should use threads per socket other than threads per core.

[1] SMBIOS 3.0.0, section 7.5.8, Processor Information - Thread Count

Fixes: c97294ec1b ("SMBIOS: Build aggregate smbios tables and entry point")
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20230628135437.1145805-4-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 7298fd7de5)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Zhao Liu c107dab494 hw/smbios: Fix smbios_smp_sockets caculation
smp.sockets is the number of sockets which is configured by "-smp" (
otherwise, the default is 1). Trying to recalculate it here with another
rules leads to errors, such as:

1. 003f230e37 ("machine: Tweak the order of topology members in struct
   CpuTopology") changes the meaning of smp.cores but doesn't fix
   original smp.cores uses.

   With the introduction of cluster, now smp.cores means the number of
   cores in one cluster. So smp.cores * smp.threads just means the
   threads in a cluster not in a socket.

2. On the other hand, we shouldn't use smp.cpus here because it
   indicates the initial number of online CPUs at the boot time, and is
   not mathematically related to smp.sockets.

So stop reinventing the another wheel and use the topo values that
has been calculated.

Fixes: 003f230e37 ("machine: Tweak the order of topology members in struct CpuTopology")
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20230628135437.1145805-3-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit d79a284a44)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Zhao Liu cfff72b21e machine: Add helpers to get cores/threads per socket
The number of cores/threads per socket are needed for smbios, and are
also useful for other modules.

Provide the helpers to wrap the calculation of cores/threads per socket
so that we can avoid calculation errors caused by other modules miss
topology changes.

Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20230628135437.1145805-2-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit a1d027be95)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:39:41 +03:00
Alexander Bulekov e7f12ce43d pnv_lpc: disable reentrancy detection for lpc-hc
As lpc-hc is designed for re-entrant calls from xscom, mark it
re-entrancy safe.

Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
[clg: mark opb_master_regs as re-entrancy safe also ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230526073850.2772197-1-clg@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
(cherry picked from commit 76f9ebffcd)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:38:01 +03:00
Alexander Bulekov 48c04e42f0 loongarch: mark loongarch_ipi_iocsr re-entrnacy safe
loongarch_ipi_iocsr MRs rely on re-entrant IO through the ipi_send
function. As such, mark these MRs re-entrancy-safe.

Fixes: a2e1753b80 ("memory: prevent dma-reentracy issues")
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Message-Id: <20230506112145.3563708-1-alxndr@bu.edu>
Signed-off-by: Song Gao <gaosong@loongson.cn>
(cherry picked from commit 6d0589e0e6)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:38:01 +03:00
Alexander Bulekov 305ffdeca8 apic: disable reentrancy detection for apic-msi
As the code is designed for re-entrant calls to apic-msi, mark apic-msi
as reentrancy-safe.

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <20230427211013.2994127-9-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 50795ee051)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:38:01 +03:00
Alexander Bulekov 151649da1b raven: disable reentrancy detection for iomem
As the code is designed for re-entrant calls from raven_io_ops to
pci-conf, mark raven_io_ops as reentrancy-safe.

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20230427211013.2994127-8-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 6dad5a6810)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:38:01 +03:00
Alexander Bulekov 83d080e85a bcm2835_property: disable reentrancy detection for iomem
As the code is designed for re-entrant calls from bcm2835_property to
bcm2835_mbox and back into bcm2835_property, mark iomem as
reentrancy-safe.

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230427211013.2994127-7-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 985c4a4e54)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:38:01 +03:00
Thomas Huth 0f0fb19d2b lsi53c895a: disable reentrancy detection for MMIO region, too
While trying to use a SCSI disk on the LSI controller with an
older version of Fedora (25), I'm getting:

 qemu: warning: Blocked re-entrant IO on MemoryRegion: lsi-mmio at addr: 0x34

and the SCSI controller is not usable. Seems like we have to
disable the reentrancy checker for the MMIO region, too, to
get this working again.

The problem could be reproduced it like this:

./qemu-system-x86_64 -accel kvm -m 2G -machine q35 \
 -device lsi53c810,id=lsi1 -device scsi-hd,drive=d0 \
 -drive if=none,id=d0,file=.../somedisk.qcow2 \
 -cdrom Fedora-Everything-netinst-i386-25-1.3.iso

Where somedisk.qcow2 is an image that contains already some partitions
and file systems.

In the boot menu of Fedora, go to
"Troubleshooting" -> "Rescue a Fedora system" -> "3) Skip to shell"

Then check "dmesg | grep -i 53c" for failure messages, and try to mount
a partition from somedisk.qcow2.

Message-Id: <20230516090556.553813-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit d139fe9ad8)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:38:01 +03:00
Alexander Bulekov db43c7db20 lsi53c895a: disable reentrancy detection for script RAM
As the code is designed to use the memory APIs to access the script ram,
disable reentrancy checks for the pseudo-RAM ram_io MemoryRegion.

In the future, ram_io may be converted from an IO to a proper RAM MemoryRegion.

Reported-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <20230427211013.2994127-6-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit bfd6e7ae6a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:38:01 +03:00
Alexander Bulekov fd9de51ea3 hw: replace most qemu_bh_new calls with qemu_bh_new_guarded
This protects devices from bh->mmio reentrancy issues.

Thanks: Thomas Huth <thuth@redhat.com> for diagnosing OS X test failure.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230427211013.2994127-5-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit f63192b054)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:38:01 +03:00
Alexander Bulekov db56206f78 checkpatch: add qemu_bh_new/aio_bh_new checks
Advise authors to use the _guarded versions of the APIs, instead.

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <20230427211013.2994127-4-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit ef56ffbdd6)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:38:01 +03:00
Alexander Bulekov 6a33d4b345 async: avoid use-after-free on re-entrancy guard
A BH callback can free the BH, causing a use-after-free in aio_bh_call.
Fix that by keeping a local copy of the re-entrancy guard pointer.

Buglink: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=58513
Fixes: 9c86c97f12 ("async: Add an optional reentrancy guard to the BH API")
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20230501141956.3444868-1-alxndr@bu.edu>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 7915bd06f2)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:38:01 +03:00
Alexander Bulekov 932cf49f06 async: Add an optional reentrancy guard to the BH API
Devices can pass their MemoryReentrancyGuard (from their DeviceState),
when creating new BHes. Then, the async API will toggle the guard
before/after calling the BH call-back. This prevents bh->mmio reentrancy
issues.

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <20230427211013.2994127-3-alxndr@bu.edu>
[thuth: Fix "line over 90 characters" checkpatch.pl error]
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 9c86c97f12)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:01:33 +03:00
Alexander Bulekov a08c78dda7 memory: prevent dma-reentracy issues
Add a flag to the DeviceState, when a device is engaged in PIO/MMIO/DMA.
This flag is set/checked prior to calling a device's MemoryRegion
handlers, and set when device code initiates DMA.  The purpose of this
flag is to prevent two types of DMA-based reentrancy issues:

1.) mmio -> dma -> mmio case
2.) bh -> dma write -> mmio case

These issues have led to problems such as stack-exhaustion and
use-after-frees.

Summary of the problem from Peter Maydell:
https://lore.kernel.org/qemu-devel/CAFEAcA_23vc7hE3iaM-JVA6W38LK4hJoWae5KcknhPRD5fPBZA@mail.gmail.com

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/62
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/540
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/541
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/556
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/557
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/827
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1282
Resolves: CVE-2023-0330

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230427211013.2994127-2-alxndr@bu.edu>
[thuth: Replace warn_report() with warn_report_once()]
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit a2e1753b80)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-09-10 19:01:33 +03:00
Michael Tokarev 83a9cdbd65 Update version for 8.0.4 release
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-07 15:05:10 +03:00
Matt Borgerson 7cb0210fcc target/i386: Check CR0.TS before enter_mmx
When CR0.TS=1, execution of x87 FPU, MMX, and some SSE instructions will
cause a Device Not Available (DNA) exception (#NM). System software uses
this exception event to lazily context switch FPU state.

Before this patch, enter_mmx helpers may be generated just before #NM
generation, prematurely resetting FPU state before the guest has a
chance to save it.

Signed-off-by: Matt Borgerson <contact@mborgerson.com>
Message-ID: <CADc=-s5F10muEhLs4f3mxqsEPAHWj0XFfOC2sfFMVHrk9fcpMg@mail.gmail.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit b2ea6450d8)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-05 20:49:49 +03:00
Nicholas Piggin 979cdfbbfd target/ppc: Fix VRMA page size for ISA v3.0
Until v2.07s, the VRMA page size (L||LP) was encoded in LPCR[VRMASD].
In v3.0 that moved to the partition table PS field.

The powernv machine can now run KVM HPT guests on POWER9/10 CPUs with
this fix and the patch to add ASDR.

Fixes: 3367c62f52 ("target/ppc: Support for POWER9 native hash")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230730111842.39292-1-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
(cherry picked from commit 0e2a3ec368)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-05 20:49:49 +03:00
Nicholas Piggin b96bb74e3a target/ppc: Fix pending HDEC when entering PM state
HDEC is defined to not wake from PM state. There is a check in the HDEC
timer to avoid setting the interrupt if we are in a PM state, but no
check on PM entry to lower HDEC if it already fired. This can cause a
HDECR wake up and  QEMU abort with unsupported exception in Power Save
mode.

Fixes: 4b236b621b ("ppc: Initial HDEC support")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230726182230.433945-4-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
(cherry picked from commit 9915dac484)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-05 20:49:49 +03:00
Nicholas Piggin bfe876cb30 target/ppc: Implement ASDR register for ISA v3.0 for HPT
The ASDR register was introduced in ISA v3.0. It has not been
implemented for HPT. With HPT, ASDR is the format of the slbmte RS
operand (containing VSID), which matches the ppc_slb_t field.

Fixes: 3367c62f52 ("target/ppc: Support for POWER9 native hash")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230726182230.433945-2-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
(cherry picked from commit 9201af0969)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-05 20:49:49 +03:00
Hawkins Jiawei 1d711f97a5 vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mq()
According to VirtIO standard, "The class, command and
command-specific-data are set by the driver,
and the device sets the ack byte.
There is little it can do except issue a diagnostic
if ack is not VIRTIO_NET_OK."

Therefore, QEMU should stop sending the queued SVQ commands and
cancel the device startup if the device's ack is not VIRTIO_NET_OK.

Yet the problem is that, vhost_vdpa_net_load_mq() returns 1 based on
`*s->status != VIRTIO_NET_OK` when the device's ack is VIRTIO_NET_ERR.
As a result, net->nc->info->load() also returns 1, this makes
vhost_net_start_one() incorrectly assume the device state is
successfully loaded by vhost_vdpa_net_load() and return 0, instead of
goto `fail` label to cancel the device startup, as vhost_net_start_one()
only cancels the device startup when net->nc->info->load() returns a
negative value.

This patch fixes this problem by returning -EIO when the device's
ack is not VIRTIO_NET_OK.

Fixes: f64c7cda69 ("vdpa: Add vhost_vdpa_net_load_mq")
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <ec515ebb0b4f56368751b9e318e245a5d994fa72.1688438055.git.yin31149@gmail.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit f45fd95ec9)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-05 20:49:49 +03:00
Hawkins Jiawei f43e4e2594 vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mac()
According to VirtIO standard, "The class, command and
command-specific-data are set by the driver,
and the device sets the ack byte.
There is little it can do except issue a diagnostic
if ack is not VIRTIO_NET_OK."

Therefore, QEMU should stop sending the queued SVQ commands and
cancel the device startup if the device's ack is not VIRTIO_NET_OK.

Yet the problem is that, vhost_vdpa_net_load_mac() returns 1 based on
`*s->status != VIRTIO_NET_OK` when the device's ack is VIRTIO_NET_ERR.
As a result, net->nc->info->load() also returns 1, this makes
vhost_net_start_one() incorrectly assume the device state is
successfully loaded by vhost_vdpa_net_load() and return 0, instead of
goto `fail` label to cancel the device startup, as vhost_net_start_one()
only cancels the device startup when net->nc->info->load() returns a
negative value.

This patch fixes this problem by returning -EIO when the device's
ack is not VIRTIO_NET_OK.

Fixes: f73c0c43ac ("vdpa: extract vhost_vdpa_net_load_mac from vhost_vdpa_net_load")
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <a21731518644abbd0c495c5b7960527c5911f80d.1688438055.git.yin31149@gmail.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit b479bc3c9d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-05 20:49:49 +03:00
Hawkins Jiawei ade1bed2b7 vdpa: Fix possible use-after-free for VirtQueueElement
QEMU uses vhost_handle_guest_kick() to forward guest's available
buffers to the vdpa device in SVQ avail ring.

In vhost_handle_guest_kick(), a `g_autofree` `elem` is used to
iterate through the available VirtQueueElements. This `elem` is
then passed to `svq->ops->avail_handler`, specifically to the
vhost_vdpa_net_handle_ctrl_avail(). If this handler fails to
process the CVQ command, vhost_handle_guest_kick() regains
ownership of the `elem`, and either frees it or requeues it.

Yet the problem is that, vhost_vdpa_net_handle_ctrl_avail()
mistakenly frees the `elem`, even if it fails to forward the
CVQ command to vdpa device. This can result in a use-after-free
for the `elem` in vhost_handle_guest_kick().

This patch solves this problem by refactoring
vhost_vdpa_net_handle_ctrl_avail() to only freeing the `elem` if
it owns it.

Fixes: bd907ae4b0 ("vdpa: manual forward CVQ buffers")
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Message-Id: <e3f2d7db477734afe5c6a5ab3fa8b8317514ea34.1688746840.git.yin31149@gmail.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 031b1abacb)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-05 20:49:49 +03:00
Zhenzhong Duan e85ab8f753 vfio/pci: Disable INTx in vfio_realize error path
When vfio realize fails, INTx isn't disabled if it has been enabled.
This may confuse host side with unhandled interrupt report.

Fixes: c5478fea27 ("vfio/pci: Respond to KVM irqchip change notifier")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
(cherry picked from commit adee0da036)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-05 08:39:54 +03:00
Thomas Huth 48be003029 include/hw/i386/x86-iommu: Fix struct X86IOMMU_MSIMessage for big endian hosts
The first bitfield here is supposed to be used as a 64-bit equivalent
to the "uint64_t msi_addr" in the union. To make this work correctly
on big endian hosts, too, the __addr_hi field has to be part of the
bitfield, and the the bitfield members must be declared with "uint64_t"
instead of "uint32_t" - otherwise the values are placed in the wrong
bytes on big endian hosts.

Same applies to the 32-bit "msi_data" field: __resved1 must be part
of the bitfield, and the members must be declared with "uint32_t"
instead of "uint16_t".

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230802135723.178083-7-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit e1e56c07d1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-04 19:14:46 +03:00
Thomas Huth dab9a65dfa hw/i386/x86-iommu: Fix endianness issue in x86_iommu_irq_to_msi_message()
The values in "msg" are assembled in host endian byte order (the other
field are also not swapped), so we must not swap the __addr_head here.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230802135723.178083-6-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit 37cf5cecb0)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-04 19:14:46 +03:00
Thomas Huth e0711f74b2 hw/i386/intel_iommu: Fix index calculation in vtd_interrupt_remap_msi()
The values in "addr" are populated locally in this function in host
endian byte order, so we must not swap the index_l field here.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230802135723.178083-5-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit fcd8027423)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-04 19:14:46 +03:00
Thomas Huth 4f558fd185 hw/i386/intel_iommu: Fix struct VTDInvDescIEC on big endian hosts
On big endian hosts, we need to reverse the bitfield order in the
struct VTDInvDescIEC, just like it is already done for the other
bitfields in the various structs of the intel-iommu device.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230802135723.178083-4-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit 4572b22cf9)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-04 19:14:46 +03:00
Thomas Huth b3c94ecf3c hw/i386/intel_iommu: Fix endianness problems related to VTD_IR_TableEntry
The code already tries to do some endianness handling here, but
currently fails badly:
- While it already swaps the data when logging errors / tracing, it fails
  to byteswap the value before e.g. accessing entry->irte.present
- entry->irte.source_id is swapped with le32_to_cpu(), though this is
  a 16-bit value
- The whole union is apparently supposed to be swapped via the 64-bit
  data[2] array, but the struct is a mixture between 32 bit values
  (the first 8 bytes) and 64 bit values (the second 8 bytes), so this
  cannot work as expected.

Fix it by converting the struct to two proper 64-bit bitfields, and
by swapping the values only once for everybody right after reading
the data from memory.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230802135723.178083-3-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit 642ba89672)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-04 19:14:46 +03:00
Thomas Huth bc5740e178 hw/i386/intel_iommu: Fix trivial endianness problems
After reading the guest memory with dma_memory_read(), we have
to make sure that we byteswap the little endian data to the host's
byte order.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230802135723.178083-2-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit cc2a08480e)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-04 19:14:46 +03:00
Yuri Benditovich 715e8123ed pci: do not respond config requests after PCI device eject
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2224964

In migration with VF failover, Windows guest and ACPI hot
unplug we do not need to satisfy config requests, otherwise
the guest immediately detects the device and brings up its
driver. Many network VF's are stuck on the guest PCI bus after
the migration.

Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Message-Id: <20230728084049.191454-1-yuri.benditovich@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 348e354417)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-04 19:14:46 +03:00
Helge Deller 868b90e44a target/hppa: Move iaoq registers and thus reduce generated code size
On hppa the Instruction Address Offset Queue (IAOQ) registers specifies
the next to-be-executed instructions addresses. Each generated TB writes those
registers at least once, so those registers are used heavily in generated
code.

Looking at the generated assembly, for a x86-64 host this code
to write the address $0x7ffe826f into iaoq_f is generated:
0x7f73e8000184:  c7 85 d4 01 00 00 6f 82  movl     $0x7ffe826f, 0x1d4(%rbp)
0x7f73e800018c:  fe 7f
0x7f73e800018e:  c7 85 d8 01 00 00 73 82  movl     $0x7ffe8273, 0x1d8(%rbp)
0x7f73e8000196:  fe 7f

With the trivial change, by moving the variables iaoq_f and iaoq_b to
the top of struct CPUArchState, the offset to %rbp is reduced (from
0x1d4 to 0), which allows the x86-64 tcg to generate 3 bytes less of
generated code per move instruction:
0x7fc1e800018c:  c7 45 00 6f 82 fe 7f     movl     $0x7ffe826f, (%rbp)
0x7fc1e8000193:  c7 45 04 73 82 fe 7f     movl     $0x7ffe8273, 4(%rbp)

Overall this is a reduction of generated code (not a reduction of
number of instructions).
A test run with checks the generated code size by running "/bin/ls"
with qemu-user shows that the code size shrinks from 1616767 to 1569273
bytes, which is ~97% of the former size.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: qemu-stable@nongnu.org
(cherry picked from commit f8c0fd9804)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-04 19:14:46 +03:00
zhenwei pi 60c42b8623 cryptodev: Handle unexpected request to avoid crash
Generally guest side should discover which services the device is
able to offer, then do requests on device.

However it's also possible to break this rule in a guest. Handle
unexpected request here to avoid NULL pointer dereference.

Fixes: e7a775fd ('cryptodev: Account statistics')
Cc: Gonglei <arei.gonglei@huawei.com>
Cc: Mauro Matteo Cascella <mcascell@redhat.com>
Cc: Xiao Lei <nop.leixiao@gmail.com>
Cc: Yongkang Jia <kangel@zju.edu.cn>
Reported-by: Yiming Tao <taoym@zju.edu.cn>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230803024314.29962-3-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 15b11a1da6)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-04 19:14:46 +03:00
zhenwei pi 49f1e02bac virtio-crypto: verify src&dst buffer length for sym request
For symmetric algorithms, the length of ciphertext must be as same
as the plaintext.
The missing verification of the src_len and the dst_len in
virtio_crypto_sym_op_helper() may lead buffer overflow/divulged.

This patch is originally written by Yiming Tao for QEMU-SECURITY,
resend it(a few changes of error message) in qemu-devel.

Fixes: CVE-2023-3180
Fixes: 04b9b37edda("virtio-crypto: add data queue processing handler")
Cc: Gonglei <arei.gonglei@huawei.com>
Cc: Mauro Matteo Cascella <mcascell@redhat.com>
Cc: Yiming Tao <taoym@zju.edu.cn>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230803024314.29962-2-pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 9d38a84347)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-04 19:14:46 +03:00
Li Feng fd902c54e5 vhost: fix the fd leak
When the vhost-user reconnect to the backend, the notifer should be
cleanup. Otherwise, the fd resource will be exhausted.

Fixes: f9a09ca3ea ("vhost: add support for configure interrupt")

Signed-off-by: Li Feng <fengli@smartx.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20230731121018.2856310-2-fengli@smartx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
(cherry picked from commit 18f2971ce4)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-04 19:14:46 +03:00
Eric Auger 18963f458f hw/virtio-iommu: Fix potential OOB access in virtio_iommu_handle_command()
In the virtio_iommu_handle_command() when a PROBE request is handled,
output_size takes a value greater than the tail size and on a subsequent
iteration we can get a stack out-of-band access. Initialize the
output_size on each iteration.

The issue was found with ASAN. Credits to:
Yiming Tao(Zhejiang University)
Gaoning Pan(Zhejiang University)

Fixes: 1733eebb9e ("virtio-iommu: Implement RESV_MEM probe request")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reported-by: Mauro Matteo Cascella <mcascell@redhat.com>
Cc: qemu-stable@nongnu.org

Message-Id: <20230717162126.11693-1-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit cf2f89edf3)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-04 19:14:46 +03:00
Peter Maydell 71e05c42cc target/m68k: Fix semihost lseek offset computation
The arguments for deposit64 are (value, start, length, fieldval); this
appears to have thought they were (value, fieldval, start,
length). Reorder the parameters to match the actual function.

Cc: qemu-stable@nongnu.org
Fixes: 950272506d ("target/m68k: Use semihosting/syscalls.h")
Reported-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230801154519.3505531-1-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 8caaae7319)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-04 19:14:46 +03:00
Keith Packard 3d81ba8da4 target/nios2: Fix semihost lseek offset computation
The arguments for deposit64 are (value, start, length, fieldval); this
appears to have thought they were (value, fieldval, start,
length). Reorder the parameters to match the actual function.

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Fixes: d1e23cbaa4 ("target/nios2: Use semihosting/syscalls.h")
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230731235245.295513-1-keithp@keithp.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 71e2dd6aa1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-04 19:14:46 +03:00
Keith Packard adef4fe350 target/nios2: Pass semihosting arg to exit
Instead of using R_ARG0 (the semihost function number), use R_ARG1
(the provided exit status).

Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230801152245.332749-1-keithp@keithp.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit c11d5bdae7)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-04 19:14:46 +03:00
David Woodhouse f8592e9431 hw/xen: fix off-by-one in xen_evtchn_set_gsi()
Coverity points out (CID 1508128) a bounds checking error. We need to check
for gsi >= IOAPIC_NUM_PINS, not just greater-than.

Also fix up an assert() that has the same problem, that Coverity didn't see.

Fixes: 4f81baa33e ("hw/xen: Support GSI mapping to PIRQ")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230801175747.145906-2-dwmw2@infradead.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit cf885b1957)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-04 19:14:46 +03:00
Daniel P. Berrangé 5300472ec0 io: remove io watch if TLS channel is closed during handshake
The TLS handshake make take some time to complete, during which time an
I/O watch might be registered with the main loop. If the owner of the
I/O channel invokes qio_channel_close() while the handshake is waiting
to continue the I/O watch must be removed. Failing to remove it will
later trigger the completion callback which the owner is not expecting
to receive. In the case of the VNC server, this results in a SEGV as
vnc_disconnect_start() tries to shutdown a client connection that is
already gone / NULL.

CVE-2023-3354
Reported-by: jiangyegen <jiangyegen@huawei.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
(cherry picked from commit 10be627d2b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-04 19:14:46 +03:00
Anthony PERARD ca93a302a0 xen-block: Avoid leaks on new error path
Commit 1898293990 ("xen-block: Use specific blockdev driver")
introduced a new error path, without taking care of allocated
resources.

So only allocate the qdicts after the error check, and free both
`filename` and `driver` when we are about to return and thus taking
care of both success and error path.

Coverity only spotted the leak of qdicts (*_layer variables).

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Fixes: Coverity CID 1508722, 1398649
Fixes: 1898293990 ("xen-block: Use specific blockdev driver")
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230704171819.42564-1-anthony.perard@citrix.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
(cherry picked from commit aa36243514)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-04 19:14:46 +03:00
Anthony PERARD 157529eee6 thread-pool: signal "request_cond" while locked
thread_pool_free() might have been called on the `pool`, which would
be a reason for worker_thread() to quit. In this case,
`pool->request_cond` is been destroyed.

If worker_thread() didn't managed to signal `request_cond` before it
been destroyed by thread_pool_free(), we got:
    util/qemu-thread-posix.c:198: qemu_cond_signal: Assertion `cond->initialized' failed.

One backtrace:
    __GI___assert_fail (assertion=0x55555614abcb "cond->initialized", file=0x55555614ab88 "util/qemu-thread-posix.c", line=198,
	function=0x55555614ad80 <__PRETTY_FUNCTION__.17104> "qemu_cond_signal") at assert.c:101
    qemu_cond_signal (cond=0x7fffb800db30) at util/qemu-thread-posix.c:198
    worker_thread (opaque=0x7fffb800dab0) at util/thread-pool.c:129
    qemu_thread_start (args=0x7fffb8000b20) at util/qemu-thread-posix.c:505
    start_thread (arg=<optimized out>) at pthread_create.c:486

Reported here:
    https://lore.kernel.org/all/ZJwoK50FcnTSfFZ8@MacBook-Air-de-Roger.local/T/#u

To avoid issue, keep lock while sending a signal to `request_cond`.

Fixes: 900fa208f5 ("thread-pool: replace semaphore with condition variable")
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230714152720.5077-1-anthony.perard@citrix.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
(cherry picked from commit f4f71363fc)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-04 19:14:46 +03:00
Helge Deller 5a87bcee89 linux-user/armeb: Fix __kernel_cmpxchg() for armeb
Commit 7f4f0d9ea8 ("linux-user/arm: Implement __kernel_cmpxchg with host
atomics") switched to use qatomic_cmpxchg() to swap a word with the memory
content, but missed to endianess-swap the oldval and newval values when
emulating an armeb CPU, which expects words to be stored in big endian in
the guest memory.

The bug can be verified with qemu >= v7.0 on any little-endian host, when
starting the armeb binary of the upx program, which just hangs without
this patch.

Cc: qemu-stable@nongnu.org
Signed-off-by: Helge Deller <deller@gmx.de>
Reported-by: "Markus F.X.J. Oberhumer" <markus@oberhumer.com>
Reported-by: John Reiser <jreiser@BitWagon.com>
Closes: https://github.com/upx/upx/issues/687
Message-Id: <ZMQVnqY+F+5sTNFd@p100>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 38dd78c41e)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-04 19:14:46 +03:00
Richard Henderson f8e673df7e target/ppc: Disable goto_tb with architectural singlestep
The change to use translator_use_goto_tb went too far, as the
CF_SINGLE_STEP flag managed by the translator only handles
gdb single stepping and not the architectural single stepping
modeled in DisasContext.singlestep_enabled.

Fixes: 6e9cc373ec ("target/ppc: Use translator_use_goto_tb")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1795
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 2e718e6657)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-04 19:14:46 +03:00
Richard Henderson 357b42486c util/interval-tree: Use qatomic_set_mb in rb_link_node
Ensure that the stores to rb_left and rb_right are complete before
inserting the new node into the tree.  Otherwise a concurrent reader
could see garbage in the new leaf.

Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 4c8baa02d3)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: s/qatomic_set_mb/qatomic_mb_set/ for 8.0 - it was renamed later)
2023-08-04 19:13:51 +03:00
Richard Henderson b2ec463649 util/interval-tree: Use qatomic_read for left/right while searching
Fixes a race condition (generally without optimization) in which
the subtree is re-read after the protecting if condition.

Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 055b86e0f0)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-08-01 08:59:58 +03:00
Peter Maydell 2eee26f579 target/arm: Avoid writing to constant TCGv in trans_CSEL()
In commit 0b188ea05a we changed the implementation of
trans_CSEL() to use tcg_constant_i32(). However, this change
was incorrect, because the implementation of the function
sets up the TCGv_i32 rn and rm to be either zero or else
a TCG temp created in load_reg(), and these TCG temps are
then in both cases written to by the emitted TCG ops.
The result is that we hit a TCG assertion:

qemu-system-arm: ../../tcg/tcg.c:4455: tcg_reg_alloc_mov: Assertion `!temp_readonly(ots)' failed.

(or on a non-debug build, just produce a garbage result)

Adjust the code so that rn and rm are always writeable
temporaries whether the instruction is using the special
case "0" or a normal register as input.

Cc: qemu-stable@nongnu.org
Fixes: 0b188ea05a ("target/arm: Use tcg_constant in trans_CSEL")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230727103906.2641264-1-peter.maydell@linaro.org
(cherry picked from commit 2b0d656ab6)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 21:06:29 +03:00
Peter Maydell 2bff614256 target/arm: Special case M-profile in debug_helper.c code
A lot of the code called from helper_exception_bkpt_insn() is written
assuming A-profile, but we will also call this helper on M-profile
CPUs when they execute a BKPT insn.  This used to work by accident,
but recent changes mean that we will hit an assert when some of this
code calls down into lower level functions that end up calling
arm_security_space_below_el3(), arm_el_is_aa64(), and other functions
that now explicitly assert that the guest CPU is not M-profile.

Handle M-profile directly to avoid the assertions:
 * in arm_debug_target_el(), M-profile debug exceptions always
   go to EL1
 * in arm_debug_exception_fsr(), M-profile always uses the short
   format FSR (compare commit d7fe699be5, though in this case
   the code in arm_v7m_cpu_do_interrupt() does not need to
   look at the FSR value at all)

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1775
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230721143239.1753066-1-peter.maydell@linaro.org
(cherry picked from commit 5d78893f39)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 09:10:51 +03:00
Peter Maydell 220869aae1 hw/arm/smmu: Handle big-endian hosts correctly
The implementation of the SMMUv3 has multiple places where it reads a
data structure from the guest and directly operates on it without
doing a guest-to-host endianness conversion.  Since all SMMU data
structures are little-endian, this means that the SMMU doesn't work
on a big-endian host.  In particular, this causes the Avocado test
  machine_aarch64_virt.py:Aarch64VirtMachine.test_alpine_virt_tcg_gic_max
to fail on an s390x host.

Add appropriate byte-swapping on reads and writes of guest in-memory
data structures so that the device works correctly on big-endian
hosts.

As part of this we constrain queue_read() to operate only on Cmd
structs and queue_write() on Evt structs, because in practice these
are the only data structures the two functions are used with, and we
need to know what the data structure is to be able to byte-swap its
parts correctly.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20230717132641.764660-1-peter.maydell@linaro.org
Cc: qemu-stable@nongnu.org
(cherry picked from commit c6445544d4)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 09:10:51 +03:00
Viktor Prutyanov 123b4291f9 virtio-net: pass Device-TLB enable/disable events to vhost
If vhost is enabled for virtio-net, Device-TLB enable/disable events
must be passed to vhost for proper IOMMU unmap flag selection.

Signed-off-by: Viktor Prutyanov <viktor@daynix.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230626091258.24453-3-viktor@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit cd9b834688)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 09:10:51 +03:00
Viktor Prutyanov 8eed78e2bf vhost: register and change IOMMU flag depending on Device-TLB state
The guest can disable or never enable Device-TLB. In these cases,
it can't be used even if enabled in QEMU. So, check Device-TLB state
before registering IOMMU notifier and select unmap flag depending on
that. Also, implement a way to change IOMMU notifier flag if Device-TLB
state is changed.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2001312
Signed-off-by: Viktor Prutyanov <viktor@daynix.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230626091258.24453-2-viktor@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit ee071f67f7)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 09:10:51 +03:00
Viktor Prutyanov 5f3fe5657d virtio-pci: add handling of PCI ATS and Device-TLB enable/disable
According to PCIe Address Translation Services specification 5.1.3.,
ATS Control Register has Enable bit to enable/disable ATS. Guest may
enable/disable PCI ATS and, accordingly, Device-TLB for the VirtIO PCI
device. So, raise/lower a flag and call a trigger function to pass this
event to a device implementation.

Signed-off-by: Viktor Prutyanov <viktor@daynix.com>
Message-Id: <20230512135122.70403-2-viktor@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 206e91d143)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 09:10:51 +03:00
Thomas Huth 0827053612 target/loongarch: Fix the CSRRD CPUID instruction on big endian hosts
The test in tests/avocado/machine_loongarch.py is currently failing
on big endian hosts like s390x. By comparing the traces between running
the QEMU_EFI.fd bios on a s390x and on a x86 host, it's quickly obvious
that the CSRRD instruction for the CPUID is behaving differently. And
indeed: The code currently does a long read (i.e. 64 bit) from the
address that points to the CPUState->cpu_index field (with tcg_gen_ld_tl()
in the trans_csrrd() function). But this cpu_index field is only an "int"
(i.e. 32 bit). While this dirty pointer magic works on little endian hosts,
it of course fails on big endian hosts. Fix it by using a proper helper
function instead.

Message-Id: <20230720175307.854460-1-thuth@redhat.com>
Reviewed-by: Song Gao <gaosong@loongson.cn>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit c34ad45992)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 09:10:51 +03:00
Ilya Leoshkevich c8b714f047 target/s390x: Fix assertion failure in VFMIN/VFMAX with type 13
Type 13 is reserved, so using it should result in specification
exception. Due to an off-by-1 error the code triggers an assertion at a
later point in time instead.

Cc: qemu-stable@nongnu.org
Fixes: da4807527f ("s390x/tcg: Implement VECTOR FP (MAXIMUM|MINIMUM)")
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230724082032.66864-8-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit ff537b0370)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 09:10:51 +03:00
Ilya Leoshkevich c5498fdda0 target/s390x: Make MC raise specification exception when class >= 16
MC requires bit positions 8-11 (upper 4 bits of class) to be zeros,
otherwise it must raise a specification exception.

Cc: qemu-stable@nongnu.org
Fixes: 20d143e2ca ("s390x/tcg: Implement MONITOR CALL")
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230724082032.66864-6-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 9c028c057a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 09:10:51 +03:00
Ilya Leoshkevich 76507abbe1 target/s390x: Fix ICM with M3=0
When the mask is zero, access exceptions should still be recognized for
1 byte at the second-operand address. CC should be set to 0.

Cc: qemu-stable@nongnu.org
Fixes: e023e832d0 ("s390x: translate engine for s390x CPU")
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230724082032.66864-5-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit a2025557ed)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 09:10:51 +03:00
Ilya Leoshkevich e5e8a86064 target/s390x: Fix CONVERT TO LOGICAL/FIXED with out-of-range inputs
CONVERT TO LOGICAL/FIXED deviate from IEEE 754 in that they raise an
inexact exception on out-of-range inputs. float_flag_invalid_cvti
aligns nicely with that behavior, so convert it to
S390_IEEE_MASK_INEXACT.

Cc: qemu-stable@nongnu.org
Fixes: defb0e3157 ("s390x: Implement opcode helpers")
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230724082032.66864-4-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 53684e344a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 09:10:51 +03:00
Ilya Leoshkevich 6bd56e0f82 target/s390x: Fix CLM with M3=0
When the mask is zero, access exceptions should still be recognized for
1 byte at the second-operand address. CC should be set to 0.

Cc: qemu-stable@nongnu.org
Fixes: defb0e3157 ("s390x: Implement opcode helpers")
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230724082032.66864-3-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 4b6e4c0b82)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 09:10:51 +03:00
Ilya Leoshkevich bdbf5e1016 target/s390x: Make CKSM raise an exception if R2 is odd
R2 designates an even-odd register pair; the instruction should raise
a specification exception when R2 is not even.

Cc: qemu-stable@nongnu.org
Fixes: e023e832d0 ("s390x: translate engine for s390x CPU")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230724082032.66864-2-iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 761b0aa938)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 09:10:51 +03:00
Ilya Leoshkevich 6f7c39a912 tcg/{i386, s390x}: Add earlyclobber to the op_add2's first output
i386 and s390x implementations of op_add2 require an earlyclobber,
which is currently missing. This breaks VCKSM in s390x guests. E.g., on
x86_64 the following op:

    add2_i32 tmp2,tmp3,tmp2,tmp3,tmp3,tmp2   dead: 0 2 3 4 5  pref=none,0xffff

is translated to:

    addl     %ebx, %r12d
    adcl     %r12d, %ebx

Introduce a new C_N1_O1_I4 constraint, and make sure that earlyclobber
of aliased outputs is honored.

Cc: qemu-stable@nongnu.org
Fixes: 82790a8709 ("tcg: Add markup for output requires new register")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230719221310.1968845-7-iii@linux.ibm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 22d2e5351a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 09:10:51 +03:00
Jordan Niethe 59a728a031 tcg/ppc: Fix race in goto_tb implementation
Commit 20b6643324 ("tcg/ppc: Reorg goto_tb implementation") modified
goto_tb to ensure only a single instruction was patched to prevent
incorrect behavior if a thread was in the middle of multiple
instructions when they were replaced. However this introduced a race
between loading the jmp target into TCG_REG_TB and patching and
executing the direct branch.

The relevant part of the goto_tb implementation:

    ld TCG_REG_TB, TARGET_ADDR_LOCATION(TCG_REG_TB)
  patch_location:
    mtctr TCG_REG_TB
    bctr

tb_target_set_jmp_target() will replace 'patch_location' with a direct
branch if the target is in range. The direct branch now relies on
TCG_REG_TB being set up correctly by the ld. Prior to this commit
multiple instructions were patched in for the direct branch case; these
instructions would initialize TCG_REG_TB to the same value as the branch
target.

Imagine the following sequence:

1) Thread A is executing the goto_tb sequence and loads the jmp
   target into TCG_REG_TB.

2) Thread B updates the jmp target address and calls
   tb_target_set_jmp_target(). This patches a new direct branch into the
   goto_tb sequence.

3) Thread A executes the newly patched direct branch. The value in
   TCG_REG_TB still contains the old jmp target.

TCG_REG_TB MUST contain the translation block's tc.ptr. Execution will
eventually crash after performing memory accesses generated from a
faulty value in TCG_REG_TB.

This presents as segfaults or illegal instruction exceptions.

Do not revert commit 20b6643324 as it did fix a different race
condition. Instead remove the direct branch optimization and always use
indirect branches.

The direct branch optimization can be re-added later with a race free
sequence.

Fixes: 20b6643324 ("tcg/ppc: Reorg goto_tb implementation")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1726
Reported-by: Anushree Mathur <anushree.mathur@linux.vnet.ibm.com>
Tested-by: Anushree Mathur <anushree.mathur@linux.vnet.ibm.com>
Tested-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Co-developed-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Message-Id: <20230717093001.13167-1-jniethe5@gmail.com>
(cherry picked from commit 736a1588c1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 09:10:51 +03:00
Denis V. Lunev 5a61789df8 qemu-nbd: regression with arguments passing into nbd_client_thread()
Unfortunately
    commit 03b6762144
    (8.0:  feb0814b3b)
    Author: Denis V. Lunev <den@openvz.org>
    Date:   Mon Jul 17 16:55:40 2023 +0200
    qemu-nbd: pass structure into nbd_client_thread instead of plain char*
has introduced a regression. struct NbdClientOpts resides on stack inside
'if' block. This specifically means that this stack space could be reused
once the execution will leave that block of the code.

This means that parameters passed into nbd_client_thread could be
overwritten at any moment.

The patch moves the data to the namespace of main() function effectively
preserving it for the whole process lifetime.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Eric Blake <eblake@redhat.com>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
CC: <qemu-stable@nongnu.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20230727105828.324314-1-den@openvz.org>
Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit e5b815b0de)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: add reference to feb0814b3b for 8.0 branch)
2023-07-31 09:10:51 +03:00
Denis V. Lunev bdfecfbc1d qemu-nbd: fix regression with qemu-nbd --fork run over ssh
Commit e6df58a557
    Author: Hanna Reitz <hreitz@redhat.com>
    Date:   Wed May 8 23:18:18 2019 +0200
    qemu-nbd: Do not close stderr

has introduced an interesting regression. Original behavior of
    ssh somehost qemu-nbd /home/den/tmp/file -f raw --fork
was the following:
 * qemu-nbd was started as a daemon
 * the command execution is done and ssh exited with success

The patch has changed this behavior and 'ssh' command now hangs forever.

According to the normal specification of the daemon() call, we should
endup with STDERR pointing to /dev/null. That should be done at the
very end of the successful startup sequence when the pipe to the
bootstrap process (used for diagnostics) is no longer needed.

This could be achived in the same way as done for 'qemu-nbd -c' case.
That was commit 0eaf453e, also fixing up e6df58a5. STDOUT copying to
STDERR does the trick.

This also leads to proper 'ssh' connection closing which fixes my
original problem.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Eric Blake <eblake@redhat.com>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
CC: Hanna Reitz <hreitz@redhat.com>
CC: <qemu-stable@nongnu.org>
Message-ID: <20230717145544.194786-3-den@openvz.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 5c56dd27a2)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 08:52:38 +03:00
Denis V. Lunev feb0814b3b qemu-nbd: pass structure into nbd_client_thread instead of plain char*
We are going to pass additional flag inside next patch.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Eric Blake <eblake@redhat.com>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
CC: <qemu-stable@nongnu.org>
Message-ID: <20230717145544.194786-2-den@openvz.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 03b6762144)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 08:52:38 +03:00
Helge Deller f90a8b9357 linux-user: Fix signed math overflow in brk() syscall
Fix the math overflow when calculating the new_malloc_size.

new_host_brk_page and brk_page are unsigned integers. If userspace
reduces the heap, new_host_brk_page is lower than brk_page which results
in a huge positive number (but should actually be negative).

Fix it by adding a proper check and as such make the code more readable.

Signed-off-by: Helge Deller <deller@gmx.de>
Tested-by: "Markus F.X.J. Oberhumer" <markus@oberhumer.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Fixes: 86f04735ac ("linux-user: Fix brk() to release pages")
Cc: qemu-stable@nongnu.org
Buglink: https://github.com/upx/upx/issues/683
(cherry picked from commit eac78a4b0b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 08:52:38 +03:00
Helge Deller c4a4731408 linux-user: Prohibit brk() to to shrink below initial heap address
Since commit 86f04735ac ("linux-user: Fix brk() to release pages") it's
possible for userspace applications to reduce their memory footprint by
calling brk() with a lower address and free up memory. Before that commit
guest heap memory was never unmapped.

But the Linux kernel prohibits to reduce brk() below the initial memory
address which is set at startup by the set_brk() function in binfmt_elf.c.
Such a range check was missed in commit 86f04735ac.

This patch adds the missing check by storing the initial brk value in
initial_target_brk and verify any new brk addresses against that value.

Tested with the i386 upx binary from
https://github.com/upx/upx/releases/download/v4.0.2/upx-4.0.2-i386_linux.tar.xz

Signed-off-by: Helge Deller <deller@gmx.de>
Tested-by: "Markus F.X.J. Oberhumer" <markus@oberhumer.com>
Fixes: 86f04735ac ("linux-user: Fix brk() to release pages")
Cc: qemu-stable@nongnu.org
Buglink: https://github.com/upx/upx/issues/683
(cherry picked from commit dfe49864af)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 08:52:38 +03:00
Helge Deller 0102c92a1c linux-user: Fix qemu brk() to not zero bytes on current page
The qemu brk() implementation is too aggressive and cleans remaining bytes
on the current page above the last brk address.

But some existing applications are buggy and read/write bytes above their
current heap address. On a phyiscal machine this does not trigger a
runtime error as long as the access happens on the same page. Additionally
the Linux kernel allocates only full pages and does no zeroing on already
allocated pages, even if the brk address is lowered.

Fix qemu to behave the same way as the kernel does. Do not touch already
allocated pages, and - when running with different page sizes of guest and
host - zero out only those memory areas where the host page size is bigger
than the guest page size.

Signed-off-by: Helge Deller <deller@gmx.de>
Tested-by: "Markus F.X.J. Oberhumer" <markus@oberhumer.com>
Fixes: 86f04735ac ("linux-user: Fix brk() to release pages")
Cc: qemu-stable@nongnu.org
Buglink: https://github.com/upx/upx/issues/683
(cherry picked from commit 15ad98536a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 08:52:38 +03:00
Klaus Jensen 5de88d6e10 hw/nvme: fix endianness issue for shadow doorbells
In commit 2fda0726e5 ("hw/nvme: fix missing endian conversions for
doorbell buffers"), we fixed shadow doorbells for big-endian guests
running on little endian hosts. But I did not fix little-endian guests
on big-endian hosts. Fix this.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1765
Fixes: 3f7fe8de3d ("hw/nvme: Implement shadow doorbell buffer support")
Cc: qemu-stable@nongnu.org
Reported-by: Thomas Huth <thuth@redhat.com>
Tested-by: Cédric Le Goater <clg@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
(cherry picked from commit ea3c76f149)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 08:52:38 +03:00
Andreas Schwab 0167759c9a linux-user: Make sure initial brk(0) is page-aligned
Fixes: 86f04735ac ("linux-user: Fix brk() to release pages")
Signed-off-by: Andreas Schwab <schwab@suse.de>
Message-Id: <mvmpm55qnno.fsf@suse.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit d28b3c90cf)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 08:52:38 +03:00
Mauro Matteo Cascella 35720b3d90 ui/vnc-clipboard: fix infinite loop in inflate_buffer (CVE-2023-3255)
A wrong exit condition may lead to an infinite loop when inflating a
valid zlib buffer containing some extra bytes in the `inflate_buffer`
function. The bug only occurs post-authentication. Return the buffer
immediately if the end of the compressed data has been reached
(Z_STREAM_END).

Fixes: CVE-2023-3255
Fixes: 0bf41cab ("ui/vnc: clipboard support")
Reported-by: Kevin Denis <kevin.denis@synacktiv.com>
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20230704084210.101822-1-mcascell@redhat.com>
(cherry picked from commit d921fea338)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 08:52:38 +03:00
Philippe Mathieu-Daudé d1063b6551 linux-user/arm: Do not allocate a commpage at all for M-profile CPUs
Since commit fbd3c4cff6 ("linux-user/arm: Mark the commpage
executable") executing bare-metal (linked with rdimon.specs)
cortex-M code fails as:

  $ qemu-arm -cpu cortex-m3 ~/hello.exe.m3
  qemu-arm: ../../accel/tcg/user-exec.c:492: page_set_flags: Assertion `last <= GUEST_ADDR_MAX' failed.
  Aborted (core dumped)

Commit 4f5c67f8df ("linux-user/arm: Take more care allocating
commpage") already took care of not allocating a commpage for
M-profile CPUs, however it had to be reverted as commit 6cda41daa2.

Re-introduce the M-profile fix from commit 4f5c67f8df.

Fixes: fbd3c4cff6 ("linux-user/arm: Mark the commpage executable")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1755
Reported-by: Christophe Lyon <christophe.lyon@linaro.org>
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230711153408.68389-1-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit d713cf4d6c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 08:52:38 +03:00
Richard Henderson fa72d8bcf4 tcg: Fix info_in_idx increment in layout_arg_by_ref
Off by one error, failing to take into account that layout_arg_1
already incremented info_in_idx for the first piece.  We only
need care for the n-1 TCG_CALL_ARG_BY_REF_N pieces here.

Cc: qemu-stable@nongnu.org
Fixes: 313bdea84d ("tcg: Add TCG_CALL_{RET,ARG}_BY_REF")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1751
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit e18ed26ce7)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 08:52:38 +03:00
Pierrick Bouvier 7b336dcd06 linux-user/syscall: Implement execve without execveat
Support for execveat syscall was implemented in 55bbe4 and is available
since QEMU 8.0.0. It relies on host execveat, which is widely available
on most of Linux kernels today.

However, this change breaks qemu-user self emulation, if "host" qemu
version is less than 8.0.0. Indeed, it does not implement yet execveat.
This strange use case happens with most of distribution today having
binfmt support.

With a concrete failing example:
$ qemu-x86_64-7.2 qemu-x86_64-8.0 /bin/bash -c /bin/ls
/bin/bash: line 1: /bin/ls: Function not implemented
-> not implemented means execve returned ENOSYS

qemu-user-static 7.2 and 8.0 can be conveniently grabbed from debian
packages qemu-user-static* [1].

One usage of this is running wine-arm64 from linux-x64 (details [2]).
This is by updating qemu embedded in docker image that we ran into this
issue.

The solution to update host qemu is not always possible. Either it's
complicated or ask you to recompile it, or simply is not accessible
(GitLab CI, GitHub Actions). Thus, it could be worth to implement execve
without relying on execveat, which is the goal of this patch.

This patch was tested with example presented in this commit message.

[1] http://ftp.us.debian.org/debian/pool/main/q/qemu/
[1] https://www.linaro.org/blog/emulate-windows-on-arm/

Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <20230705121023.973284-1-pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 7a8d9f3a0e)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 08:52:38 +03:00
Olaf Hering c280ac3b60 hw/ide/piix: properly initialize the BMIBA register
According to the 82371FB documentation (82371FB.pdf, 2.3.9. BMIBA-BUS
MASTER INTERFACE BASE ADDRESS REGISTER, April 1997), the register is
32bit wide. To properly reset it to default values, all 32bit need to be
cleared. Bit #0 "Resource Type Indicator (RTE)" needs to be enabled.

The initial change wrote just the lower 8 bit, leaving parts of the "Bus
Master Interface Base Address" address at bit 15:4 unchanged.

Fixes: e6a71ae327 ("Add support for 82371FB (Step A1) and Improved support for 82371SB (Function 1)")

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230712074721.14728-1-olaf@aepfle.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 230dfd9257)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 08:52:38 +03:00
Marcin Nowakowski 520d5fb4cb target/mips: enable GINVx support for I6400 and I6500
GINVI and GINVT operations are supported on MIPS I6400 and I6500 cores,
so indicate that properly in CP0.Config5 register bits [16:15].

Cc: qemu-stable@nongnu.org
Signed-off-by: Marcin Nowakowski <marcin.nowakowski@fungible.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230630072806.3093704-1-marcin.nowakowski@fungible.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit baf21eebc3)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 08:52:38 +03:00
Ilya Leoshkevich b2b1b99da9 target/s390x: Fix LRA when DAT is off
LRA should perform DAT regardless of whether it's on or off.
Disable DAT check for MMU_S390_LRA.

Fixes: defb0e3157 ("s390x: Implement opcode helpers")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: qemu-stable@nongnu.org
Message-Id: <20230704081506.276055-7-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit b0ef81062d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 08:52:38 +03:00
Ilya Leoshkevich 523f529d40 target/s390x: Fix LRA overwriting the top 32 bits on DAT error
When a DAT error occurs, LRA is supposed to write the error information
to the bottom 32 bits of R1, and leave the top 32 bits of R1 alone.

Fix by passing the original value of R1 into helper and copying the
top 32 bits to the return value.

Fixes: d8fe4a9c28 ("target-s390: Convert LRA")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: qemu-stable@nongnu.org
Message-Id: <20230704081506.276055-6-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 6da311a60d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 08:52:38 +03:00
Ilya Leoshkevich eefa524832 target/s390x: Fix MVCRL with a large value in R0
Using a large R0 causes an assertion error:

    qemu-s390x: target/s390x/tcg/mem_helper.c:183: access_prepare_nf: Assertion `size > 0 && size <= 4096' failed.

Even though PoP explicitly advises against using more than 8 bits for the
size, an emulator crash is never a good thing.

Fix by truncating the size to 8 bits.

Fixes: ea0a1053e2 ("s390x/tcg: Implement Miscellaneous-Instruction-Extensions Facility 3 for the s390x")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: qemu-stable@nongnu.org
Message-Id: <20230704081506.276055-5-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 92a5753461)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 08:52:38 +03:00
Ilya Leoshkevich aa308958e6 target/s390x: Fix MDEB and MDEBR
These instructions multiply 32 bits by 32 bits, not 32 bits by 64 bits.

Fixes: 83b00736f3 ("target-s390: Convert FP MULTIPLY")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: qemu-stable@nongnu.org
Message-Id: <20230704081506.276055-4-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit fed9a4fe0c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 08:52:38 +03:00
Ilya Leoshkevich 70ba7cbf50 target/s390x: Fix EPSW CC reporting
EPSW should explicitly calculate and insert CC, like IPM does.

Fixes: e30a9d3fea ("target-s390: Implement EPSW")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: qemu-stable@nongnu.org
Message-Id: <20230704081506.276055-3-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 110b1bac2e)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 08:52:38 +03:00
Avihai Horon f48e3ec581 vfio: Fix null pointer dereference bug in vfio_bars_finalize()
vfio_realize() has the following flow:
1. vfio_bars_prepare() -- sets VFIOBAR->size.
2. msix_early_setup().
3. vfio_bars_register() -- allocates VFIOBAR->mr.

After vfio_bars_prepare() is called msix_early_setup() can fail. If it
does fail, vfio_bars_register() is never called and VFIOBAR->mr is not
allocated.

In this case, vfio_bars_finalize() is called as part of the error flow
to free the bars' resources. However, vfio_bars_finalize() calls
object_unparent() for VFIOBAR->mr after checking only VFIOBAR->size, and
thus we get a null pointer dereference.

Fix it by checking VFIOBAR->mr in vfio_bars_finalize().

Fixes: 89d5202edc ("vfio/pci: Allow relocating MSI-X MMIO")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
(cherry picked from commit 8af87a3ec7)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 08:52:38 +03:00
Helge Deller 43462f7706 linux-user: Fix accept4(SOCK_NONBLOCK) syscall
The Linux accept4() syscall allows two flags only: SOCK_NONBLOCK and
SOCK_CLOEXEC, and returns -EINVAL if any other bits have been set.

Change the qemu implementation accordingly, which means we can not use
the fcntl_flags_tbl[] translation table which allows too many other
values.

Beside the correction in behaviour, this actually fixes the accept4()
emulation for hppa, mips and alpha targets for which SOCK_NONBLOCK is
different than TARGET_SOCK_NONBLOCK (aka O_NONBLOCK).

The fix can be verified with the testcase of the debian lwt package,
which hangs forever in a read() syscall without this patch.

Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit dca4c8384d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 08:52:38 +03:00
Helge Deller 741df485e8 linux-user: Fix fcntl() and fcntl64() to return O_LARGEFILE for 32-bit targets
When running a 32-bit guest on a 64-bit host, fcntl[64](F_GETFL) should
return with the TARGET_O_LARGEFILE flag set, because all 64-bit hosts
support large files unconditionally.

But on 64-bit hosts, O_LARGEFILE has the value 0, so the flag
translation can't be done with the fcntl_flags_tbl[]. Instead add the
TARGET_O_LARGEFILE flag afterwards.

Note that for 64-bit guests the compiler will optimize away this code,
since TARGET_O_LARGEFILE is zero.

Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit e0ddf8eac9)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-31 08:52:37 +03:00
Nicholas Piggin 73d6ac24c8 hw/ppc: Fix clock update drift
The clock update logic reads the clock twice to compute the new clock
value, with a value derived from the later time subtracted from a value
derived from the earlier time. The delta causes time to be lost.

This can ultimately result in time becoming unsynchronized between CPUs
and that can cause OS lockups, timeouts, watchdogs, etc. This can be
seen running a KVM guest (that causes lots of TB updates) on a powernv
SMP machine.

Fix this by reading the clock once.

Cc: qemu-stable@nongnu.org
Fixes: dbdd25065e ("Implement time-base start/stop helpers.")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Message-ID: <20230629020713.327745-1-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
(cherry picked from commit 2ad2e113de)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-17 14:23:05 +03:00
Fiona Ebner 720db12b8b qemu_cleanup: begin drained section after vm_shutdown()
in order to avoid requests being stuck in a BlockBackend's request
queue during cleanup. Having such requests can lead to a deadlock [0]
with a virtio-scsi-pci device using iothread that's busy with IO when
initiating a shutdown with QMP 'quit'.

There is a race where such a queued request can continue sometime
(maybe after bdrv_child_free()?) during bdrv_root_unref_child() [1].
The completion will hold the AioContext lock and wait for the BQL
during SCSI completion, but the main thread will hold the BQL and
wait for the AioContext as part of bdrv_root_unref_child(), leading to
the deadlock [0].

[0]:

> Thread 3 (Thread 0x7f3bbd87b700 (LWP 135952) "qemu-system-x86"):
> #0  __lll_lock_wait (futex=futex@entry=0x564183365f00 <qemu_global_mutex>, private=0) at lowlevellock.c:52
> #1  0x00007f3bc1c0d843 in __GI___pthread_mutex_lock (mutex=0x564183365f00 <qemu_global_mutex>) at ../nptl/pthread_mutex_lock.c:80
> #2  0x0000564182939f2e in qemu_mutex_lock_impl (mutex=0x564183365f00 <qemu_global_mutex>, file=0x564182b7f774 "../softmmu/physmem.c", line=2593) at ../util/qemu-thread-posix.c:94
> #3  0x000056418247cc2a in qemu_mutex_lock_iothread_impl (file=0x564182b7f774 "../softmmu/physmem.c", line=2593) at ../softmmu/cpus.c:504
> #4  0x00005641826d5325 in prepare_mmio_access (mr=0x5641856148a0) at ../softmmu/physmem.c:2593
> #5  0x00005641826d6fe7 in address_space_stl_internal (as=0x56418679b310, addr=4276113408, val=16418, attrs=..., result=0x0, endian=DEVICE_LITTLE_ENDIAN) at /home/febner/repos/qemu/memory_ldst.c.inc:318
> #6  0x00005641826d7154 in address_space_stl_le (as=0x56418679b310, addr=4276113408, val=16418, attrs=..., result=0x0) at /home/febner/repos/qemu/memory_ldst.c.inc:357
> #7  0x0000564182374b07 in pci_msi_trigger (dev=0x56418679b0d0, msg=...) at ../hw/pci/pci.c:359
> #8  0x000056418237118b in msi_send_message (dev=0x56418679b0d0, msg=...) at ../hw/pci/msi.c:379
> #9  0x0000564182372c10 in msix_notify (dev=0x56418679b0d0, vector=8) at ../hw/pci/msix.c:542
> #10 0x000056418243719c in virtio_pci_notify (d=0x56418679b0d0, vector=8) at ../hw/virtio/virtio-pci.c:77
> #11 0x00005641826933b0 in virtio_notify_vector (vdev=0x5641867a34a0, vector=8) at ../hw/virtio/virtio.c:1985
> #12 0x00005641826948d6 in virtio_irq (vq=0x5641867ac078) at ../hw/virtio/virtio.c:2461
> #13 0x0000564182694978 in virtio_notify (vdev=0x5641867a34a0, vq=0x5641867ac078) at ../hw/virtio/virtio.c:2473
> #14 0x0000564182665b83 in virtio_scsi_complete_req (req=0x7f3bb000e5d0) at ../hw/scsi/virtio-scsi.c:115
> #15 0x00005641826670ce in virtio_scsi_complete_cmd_req (req=0x7f3bb000e5d0) at ../hw/scsi/virtio-scsi.c:641
> #16 0x000056418266736b in virtio_scsi_command_complete (r=0x7f3bb0010560, resid=0) at ../hw/scsi/virtio-scsi.c:712
> #17 0x000056418239aac6 in scsi_req_complete (req=0x7f3bb0010560, status=2) at ../hw/scsi/scsi-bus.c:1526
> #18 0x000056418239e090 in scsi_handle_rw_error (r=0x7f3bb0010560, ret=-123, acct_failed=false) at ../hw/scsi/scsi-disk.c:242
> #19 0x000056418239e13f in scsi_disk_req_check_error (r=0x7f3bb0010560, ret=-123, acct_failed=false) at ../hw/scsi/scsi-disk.c:265
> #20 0x000056418239e482 in scsi_dma_complete_noio (r=0x7f3bb0010560, ret=-123) at ../hw/scsi/scsi-disk.c:340
> #21 0x000056418239e5d9 in scsi_dma_complete (opaque=0x7f3bb0010560, ret=-123) at ../hw/scsi/scsi-disk.c:371
> #22 0x00005641824809ad in dma_complete (dbs=0x7f3bb000d9d0, ret=-123) at ../softmmu/dma-helpers.c:107
> #23 0x0000564182480a72 in dma_blk_cb (opaque=0x7f3bb000d9d0, ret=-123) at ../softmmu/dma-helpers.c:127
> #24 0x00005641827bf78a in blk_aio_complete (acb=0x7f3bb00021a0) at ../block/block-backend.c:1563
> #25 0x00005641827bfa5e in blk_aio_write_entry (opaque=0x7f3bb00021a0) at ../block/block-backend.c:1630
> #26 0x000056418295638a in coroutine_trampoline (i0=-1342102448, i1=32571) at ../util/coroutine-ucontext.c:177
> #27 0x00007f3bc0caed40 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> #28 0x00007f3bbd8757f0 in ?? ()
> #29 0x0000000000000000 in ?? ()
>
> Thread 1 (Thread 0x7f3bbe3e9280 (LWP 135944) "qemu-system-x86"):
> #0  __lll_lock_wait (futex=futex@entry=0x5641856f2a00, private=0) at lowlevellock.c:52
> #1  0x00007f3bc1c0d8d1 in __GI___pthread_mutex_lock (mutex=0x5641856f2a00) at ../nptl/pthread_mutex_lock.c:115
> #2  0x0000564182939f2e in qemu_mutex_lock_impl (mutex=0x5641856f2a00, file=0x564182c0e319 "../util/async.c", line=728) at ../util/qemu-thread-posix.c:94
> #3  0x000056418293a140 in qemu_rec_mutex_lock_impl (mutex=0x5641856f2a00, file=0x564182c0e319 "../util/async.c", line=728) at ../util/qemu-thread-posix.c:149
> #4  0x00005641829532d5 in aio_context_acquire (ctx=0x5641856f29a0) at ../util/async.c:728
> #5  0x000056418279d5df in bdrv_set_aio_context_commit (opaque=0x5641856e6e50) at ../block.c:7493
> #6  0x000056418294e288 in tran_commit (tran=0x56418630bfe0) at ../util/transactions.c:87
> #7  0x000056418279d880 in bdrv_try_change_aio_context (bs=0x5641856f7130, ctx=0x56418548f810, ignore_child=0x0, errp=0x0) at ../block.c:7626
> #8  0x0000564182793f39 in bdrv_root_unref_child (child=0x5641856f47d0) at ../block.c:3242
> #9  0x00005641827be137 in blk_remove_bs (blk=0x564185709880) at ../block/block-backend.c:914
> #10 0x00005641827bd689 in blk_remove_all_bs () at ../block/block-backend.c:583
> #11 0x0000564182798699 in bdrv_close_all () at ../block.c:5117
> #12 0x000056418248a5b2 in qemu_cleanup () at ../softmmu/runstate.c:821
> #13 0x0000564182738603 in qemu_default_main () at ../softmmu/main.c:38
> #14 0x0000564182738631 in main (argc=30, argv=0x7ffd675a8a48) at ../softmmu/main.c:48
>
> (gdb) p *((QemuMutex*)0x5641856f2a00)
> $1 = {lock = {__data = {__lock = 2, __count = 2, __owner = 135952, ...
> (gdb) p *((QemuMutex*)0x564183365f00)
> $2 = {lock = {__data = {__lock = 2, __count = 0, __owner = 135944, ...

[1]:

> Thread 1 "qemu-system-x86" hit Breakpoint 5, bdrv_drain_all_end () at ../block/io.c:551
> #0  bdrv_drain_all_end () at ../block/io.c:551
> #1  0x00005569810f0376 in bdrv_graph_wrlock (bs=0x0) at ../block/graph-lock.c:156
> #2  0x00005569810bd3e0 in bdrv_replace_child_noperm (child=0x556982e2d7d0, new_bs=0x0) at ../block.c:2897
> #3  0x00005569810bdef2 in bdrv_root_unref_child (child=0x556982e2d7d0) at ../block.c:3227
> #4  0x00005569810e8137 in blk_remove_bs (blk=0x556982e42880) at ../block/block-backend.c:914
> #5  0x00005569810e7689 in blk_remove_all_bs () at ../block/block-backend.c:583
> #6  0x00005569810c2699 in bdrv_close_all () at ../block.c:5117
> #7  0x0000556980db45b2 in qemu_cleanup () at ../softmmu/runstate.c:821
> #8  0x0000556981062603 in qemu_default_main () at ../softmmu/main.c:38
> #9  0x0000556981062631 in main (argc=30, argv=0x7ffd7a82a418) at ../softmmu/main.c:48
> [Switching to Thread 0x7fe76dab2700 (LWP 103649)]
>
> Thread 3 "qemu-system-x86" hit Breakpoint 4, blk_inc_in_flight (blk=0x556982e42880) at ../block/block-backend.c:1505
> #0  blk_inc_in_flight (blk=0x556982e42880) at ../block/block-backend.c:1505
> #1  0x00005569810e8f36 in blk_wait_while_drained (blk=0x556982e42880) at ../block/block-backend.c:1312
> #2  0x00005569810e9231 in blk_co_do_pwritev_part (blk=0x556982e42880, offset=3422961664, bytes=4096, qiov=0x556983028060, qiov_offset=0, flags=0) at ../block/block-backend.c:1402
> #3  0x00005569810e9a4b in blk_aio_write_entry (opaque=0x556982e2cfa0) at ../block/block-backend.c:1628
> #4  0x000055698128038a in coroutine_trampoline (i0=-2090057872, i1=21865) at ../util/coroutine-ucontext.c:177
> #5  0x00007fe770f50d40 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> #6  0x00007ffd7a829570 in ?? ()
> #7  0x0000000000000000 in ?? ()

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20230706131418.423713-1-f.ebner@proxmox.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit ca2a5e630d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-09 16:02:39 +03:00
Laurent Vivier bcb1e0522e virtio-net: correctly report maximum tx_queue_size value
Maximum value for tx_queue_size depends on the backend type.
1024 for vDPA/vhost-user, 256 for all the others.

The value is returned by virtio_net_max_tx_queue_size() to set the
parameter:

    n->net_conf.tx_queue_size = MIN(virtio_net_max_tx_queue_size(n),
                                    n->net_conf.tx_queue_size);

But the parameter checking uses VIRTQUEUE_MAX_SIZE (1024).

So the parameter is silently ignored and ethtool reports a different
value than the one provided by the user.

   ... -netdev tap,... -device virtio-net,tx_queue_size=1024

    # ethtool -g enp0s2
    Ring parameters for enp0s2:
    Pre-set maximums:
    RX:		256
    RX Mini:	n/a
    RX Jumbo:	n/a
    TX:		256
    Current hardware settings:
    RX:		256
    RX Mini:	n/a
    RX Jumbo:	n/a
    TX:		256

   ... -netdev vhost-user,... -device virtio-net,tx_queue_size=2048

    Invalid tx_queue_size (= 2048), must be a power of 2 between 256 and 1024

With this patch the correct maximum value is checked and displayed.

For vDPA/vhost-user:

    Invalid tx_queue_size (= 2048), must be a power of 2 between 256 and 1024

For all the others:

    Invalid tx_queue_size (= 512), must be a power of 2 between 256 and 256

Fixes: 2eef278b9e ("virtio-net: fix tx queue size for !vhost-user")
Cc: mst@redhat.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 4271f40383)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-09 14:46:14 +03:00
Michael Tokarev a342ce9dfe Update version for 8.0.3 release
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-09 00:34:33 +03:00
Richard Henderson fb64b62378 target/arm: Fix SME full tile indexing
For the outer product set of insns, which take an entire matrix
tile as output, the argument is not a combined tile+column.
Therefore using get_tile_rowcol was incorrect, as we extracted
the tile number from itself.

The test case relies only on assembler support for SME, since
no release of GCC recognizes -march=armv9-a+sme yet.

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1620
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230622151201.1578522-5-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: dropped now-unneeded changes to sysregs CFLAGS]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 1f51573f79)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: fixup context in tests/tcg/aarch64/Makefile.target)
2023-07-08 09:17:22 +03:00
Mark Cave-Ayland d2402a83a7 accel/tcg: Assert one page in tb_invalidate_phys_page_range__locked
Ensure that that both the start and last addresses are within
the same guest page.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230629082522.606219-3-mark.cave-ayland@ilande.co.uk>
[rth: Use tcg_debug_assert, simplify the expression]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit e665cf72fe)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-02 22:10:28 +03:00
Mark Cave-Ayland 78e8c9c1a1 accel/tcg: Fix start page passed to tb_invalidate_phys_page_range__locked
Due to a copy-paste error in tb_invalidate_phys_range, the wrong
start address was passed to tb_invalidate_phys_page_range__locked.
Correct is to use the start of each page in turn.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Fixes: e506ad6a05 ("accel/tcg: Pass last not end to tb_invalidate_phys_range")
Message-Id: <20230629082522.606219-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 3307e08c6f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-02 22:09:17 +03:00
Dongwon Kim 477ab906d1 ui/gtk: set the area of the scanout texture correctly
x and y offsets and width and height of the scanout texture
is not correctly configured in case guest scanout frame is
dmabuf.

Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Message-ID: <20230621213150.29573-1-dongwon.kim@intel.com>
(cherry picked from commit 37802a24eb)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-02 07:49:12 +03:00
Richard Henderson df1e45c9df linux-user: Avoid mmap of the last byte of the reserved_va
There is an overflow problem in mmap_find_vma_reserved:
when reserved_va == UINT32_MAX, end may overflow to 0.
Rather than a larger rewrite at this time, simply avoid
the final byte of the VA, which avoids searching the
final page, which avoids the overflow.

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1741
Fixes: 95059f9c ("include/exec: Change reserved_va semantics to last byte")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <20230629080835.71371-1-richard.henderson@linaro.org>
(cherry picked from commit 605a8b5491)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-07-01 13:57:27 +03:00
Shameer Kolothum 383fb8c05c vfio/pci: Call vfio_prepare_kvm_msi_virq_batch() in MSI retry path
When vfio_enable_vectors() returns with less than requested nr_vectors
we retry with what kernel reported back. But the retry path doesn't
call vfio_prepare_kvm_msi_virq_batch() and this results in,

qemu-system-aarch64: vfio: Error: Failed to enable 4 MSI vectors, retry with 1
qemu-system-aarch64: ../hw/vfio/pci.c:602: vfio_commit_kvm_msi_virq_batch: Assertion `vdev->defer_kvm_irq_routing' failed

Fixes: dc580d51f7 ("vfio: defer to commit kvm irq routing when enable msi/msix")
Reviewed-by: Longpeng <longpeng2@huawei.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
(cherry picked from commit c174088923)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-30 19:02:13 +03:00
Zhenzhong Duan 58b3e4ff5f vfio/pci: Fix a segfault in vfio_realize
The kvm irqchip notifier is only registered if the device supports
INTx, however it's unconditionally removed in vfio realize error
path. If the assigned device does not support INTx, this will cause
QEMU to crash when vfio realize fails. Change it to conditionally
remove the notifier only if the notify hook is setup.

Before fix:
(qemu) device_add vfio-pci,host=81:11.1,id=vfio1,bus=root1,xres=1
Connection closed by foreign host.

After fix:
(qemu) device_add vfio-pci,host=81:11.1,id=vfio1,bus=root1,xres=1
Error: vfio 0000:81:11.1: xres and yres properties require display=on
(qemu)

Fixes: c5478fea27 ("vfio/pci: Respond to KVM irqchip change notifier")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
(cherry picked from commit 357bd7932a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-30 19:00:39 +03:00
Nicholas Piggin 55ee115e7a target/ppc: Fix decrementer time underflow and infinite timer loop
It is possible to store a very large value to the decrementer that it
does not raise the decrementer exception so the timer is scheduled, but
the next time value wraps and is treated as in the past.

This can occur if (u64)-1 is stored on a zero-triggered exception, or
(u64)-1 is stored twice on an underflow-triggered exception, for
example.

If such a value is set in DECAR, it gets stored to the decrementer by
the timer function, which then immediately causes another timer, which
hangs QEMU.

Clamp the decrementer to the implemented width, and use that as the
value for the timer calculation, effectively preventing this overflow.

Reported-by: sdicaro@DDCI.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20230530131214.373524-1-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
(cherry picked from commit 09d2db9f46)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-30 09:18:28 +03:00
Laurent Vivier ce6331222d vhost: fix vhost_dev_enable_notifiers() error case
in vhost_dev_enable_notifiers(), if virtio_bus_set_host_notifier(true)
fails, we call vhost_dev_disable_notifiers() that executes
virtio_bus_set_host_notifier(false) on all queues, even on queues that
have failed to be initialized.

This triggers a core dump in memory_region_del_eventfd():

 virtio_bus_set_host_notifier: unable to init event notifier: Too many open files (-24)
 vhost VQ 1 notifier binding failed: 24
 .../softmmu/memory.c:2611: memory_region_del_eventfd: Assertion `i != mr->ioeventfd_nb' failed.

Fix the problem by providing to vhost_dev_disable_notifiers() the
number of queues to disable.

Fixes: 8771589b6f ("vhost: simplify vhost_dev_enable_notifiers")
Cc: longpeng2@huawei.com
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20230602162735.3670785-1-lvivier@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 92099aa4e9)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-30 09:13:00 +03:00
Eugenio Pérez 246b0cf1ac vdpa: mask _F_CTRL_GUEST_OFFLOADS for vhost vdpa devices
QEMU does not emulate it so it must be disabled as long as the backend
does not support it.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230602173328.1917385-1-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
(cherry picked from commit 51e84244a7)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-30 09:11:26 +03:00
Nicholas Piggin 5e8838524a icount: don't adjust virtual time backwards after warp
The icount-based QEMU_CLOCK_VIRTUAL runs ahead of the RT clock at times.
When warping, it is possible it is still ahead at the end of the warp,
which causes icount adaptive mode to adjust it backward. This can result
in the machine observing time going backwards.

Prevent this by clamping adaptive adjustment to 0 at minimum.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-ID: <20230627061406.241847-1-npiggin@gmail.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 67f85346ca)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-29 18:17:49 +03:00
Markus Armbruster a76c5126ec Revert "hw/sparc64/niagara: Use blk_name() instead of open-coding it"
This reverts commit 1881f336a3.

This commit breaks "-drive if=pflash,readonly=on,file=image.iso".  It
claims to merely replace an open-coded version of blk_name() by a
call, but that's not the case.  Sorry for the inconvenience!

Reported-by: Jakub Jermář <jakub@jermar.eu>
Cc: qemu-stable@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230515151104.1350155-1-armbru@redhat.com>
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
(cherry picked from commit ac5e8c1dec)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-28 18:57:11 +03:00
Vivek Kasireddy 8c792a532e virtio-gpu: Make non-gl display updates work again when blob=true
In the case where the console does not have gl capability, and
if blob is set to true, make sure that the display updates still
work. Commit e86a93f554 accidentally broke this by misplacing
the return statement (in resource_flush) causing the updates to
be silently ignored.

Fixes: e86a93f554 ("virtio-gpu: splitting one extended mode guest fb into n-scanouts")
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Dongwon Kim <dongwon.kim@intel.com>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20230623060454.3749910-1-vivek.kasireddy@intel.com>
(cherry picked from commit 34e29d85a7)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-28 18:52:17 +03:00
Marc-André Lureau dc4c852d4d ui: return NULL when getting cursor without a console
VNC may try to get the current cursor even when there are no consoles
and crashes. Simple reproducer is qemu with -nodefaults.

Fixes: (again)
https://gitlab.com/qemu-project/qemu/-/issues/1548

Fixes: commit 385ac97f8 ("ui: keep current cursor with QemuConsole")
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230428154807.2143652-1-marcandre.lureau@redhat.com>
(cherry picked from commit 333e7599a0)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-28 18:50:48 +03:00
Ani Sinha aab37b2002 vhost-vdpa: do not cleanup the vdpa/vhost-net structures if peer nic is present
When a peer nic is still attached to the vdpa backend, it is too early to free
up the vhost-net and vdpa structures. If these structures are freed here, then
QEMU crashes when the guest is being shut down. The following call chain
would result in an assertion failure since the pointer returned from
vhost_vdpa_get_vhost_net() would be NULL:

do_vm_stop() -> vm_state_notify() -> virtio_set_status() ->
virtio_net_vhost_status() -> get_vhost_net().

Therefore, we defer freeing up the structures until at guest shutdown
time when qemu_cleanup() calls net_cleanup() which then calls
qemu_del_net_client() which would eventually call vhost_vdpa_cleanup()
again to free up the structures. This time, the loop in net_cleanup()
ensures that vhost_vdpa_cleanup() will be called one last time when
all the peer nics are detached and freed.

All unit tests pass with this change.

CC: imammedo@redhat.com
CC: jusual@redhat.com
CC: mst@redhat.com
Fixes: CVE-2023-3301
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20230619065209.442185-1-anisinha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit a0d7215e33)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: context change for stable-8.0)
2023-06-26 19:55:29 +03:00
Eugenio Pérez d33534a4c7 vdpa: fix not using CVQ buffer in case of error
Bug introducing when refactoring.  Otherway, the guest never received
the used buffer.

Fixes: be4278b65f ("vdpa: extract vhost_vdpa_net_cvq_add from vhost_vdpa_net_handle_ctrl_avail")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230602173451.1917999-1-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
(cherry picked from commit d45243bcfc)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-26 19:55:29 +03:00
Prasad Pandit 46fe2deaef vhost: release virtqueue objects in error path
vhost_dev_start function does not release virtqueue objects when
event_notifier_init() function fails. Release virtqueue objects
and log a message about function failure.

Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
Message-Id: <20230529114333.31686-3-ppandit@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fixes: f9a09ca3ea ("vhost: add support for configure interrupt")
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: qemu-stable@nongnu.org
Acked-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 77ece20ba0)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-26 19:55:29 +03:00
Prasad Pandit c10525874c vhost: release memory_listener object in error path
vhost_dev_start function does not release memory_listener object
in case of an error. This may crash the guest when vhost is unable
to set memory table:

  stack trace of thread 125653:
  Program terminated with signal SIGSEGV, Segmentation fault
  #0  memory_listener_register (qemu-kvm + 0x6cda0f)
  #1  vhost_dev_start (qemu-kvm + 0x699301)
  #2  vhost_net_start (qemu-kvm + 0x45b03f)
  #3  virtio_net_set_status (qemu-kvm + 0x665672)
  #4  qmp_set_link (qemu-kvm + 0x548fd5)
  #5  net_vhost_user_event (qemu-kvm + 0x552c45)
  #6  tcp_chr_connect (qemu-kvm + 0x88d473)
  #7  tcp_chr_new_client (qemu-kvm + 0x88cf83)
  #8  tcp_chr_accept (qemu-kvm + 0x88b429)
  #9  qio_net_listener_channel_func (qemu-kvm + 0x7ac07c)
  #10 g_main_context_dispatch (libglib-2.0.so.0 + 0x54e2f)

Release memory_listener objects in the error path.

Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
Message-Id: <20230529114333.31686-2-ppandit@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Fixes: c471ad0e9b ("vhost_net: device IOTLB support")
Cc: qemu-stable@nongnu.org
Acked-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 1e3ffb34f7)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-26 19:55:29 +03:00
Helge Deller b62e5d8ac1 target/hppa: Update to SeaBIOS-hppa version 8
Update SeaBIOS-hppa to version 8.

Fixes:
- boot of HP-UX with SMP, and
- reboot of Linux and HP-UX with SMP

Enhancements:
- show qemu version in boot menu
- adds exit menu entry in boot menu to quit emulation
- allow to trace PCD_CHASSIS codes & machine run status

Signed-off-by: Helge Deller <deller@gmx.de>
(cherry picked from commit 34ec3aea54)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-26 19:35:09 +03:00
Helge Deller 06f32b3dcf target/hppa: New SeaBIOS-hppa version 7
Update SeaBIOS-hppa to version 7 which fixes a boot problem
with Debian-12 install CD images.

The problem with Debian-12 is, that the ramdisc got bigger
than what the firmware could load in one call to the LSI
scsi driver.

Signed-off-by: Helge Deller <deller@gmx.de>
(cherry picked from commit bb9c998ca9)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: pick this one before picking next 34ec3aea54 "SeaBIOS-hppa version 8")
2023-06-26 19:34:58 +03:00
Helge Deller 29c753001b target/hppa: Provide qemu version via fw_cfg to firmware
Give current QEMU version string to SeaBIOS-hppa via fw_cfg interface so
that the firmware can show the QEMU version in the boot menu info.

Signed-off-by: Helge Deller <deller@gmx.de>
(cherry picked from commit 069d296669)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-26 19:34:58 +03:00
Helge Deller 8fa1462292 target/hppa: Fix OS reboot issues
When the OS triggers a reboot, the reset helper function sends a
qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET) together with an
EXCP_HLT exception to halt the CPUs.

So, at reboot when initializing the CPUs again, make sure to set all
instruction pointers to the firmware entry point, disable any interrupts,
disable data and instruction translations, enable PSW_Q bit  and tell qemu
to unhalt (halted=0) the CPUs again.

This fixes the various reboot issues which were seen when rebooting a
Linux VM, including the case where even the monarch CPU has been virtually
halted from the OS (e.g. via "chcpu -d 0" inside the Linux VM).

Signed-off-by: Helge Deller <deller@gmx.de>
(cherry picked from commit 50ba97e928)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-26 19:34:58 +03:00
Peter Maydell deb40cf67a pc-bios/keymaps: Use the official xkb name for Arabic layout, not the legacy synonym
The xkb official name for the Arabic keyboard layout is 'ara'.
However xkb has for at least the past 15 years also permitted it to
be named via the legacy synonym 'ar'.  In xkeyboard-config 2.39 this
synoynm was removed, which breaks compilation of QEMU:

FAILED: pc-bios/keymaps/ar
/home/fred/qemu-git/src/qemu/build-full/qemu-keymap -f pc-bios/keymaps/ar -l ar
xkbcommon: ERROR: Couldn't find file "symbols/ar" in include paths
xkbcommon: ERROR: 1 include paths searched:
xkbcommon: ERROR: 	/usr/share/X11/xkb
xkbcommon: ERROR: 3 include paths could not be added:
xkbcommon: ERROR: 	/home/fred/.config/xkb
xkbcommon: ERROR: 	/home/fred/.xkb
xkbcommon: ERROR: 	/etc/xkb
xkbcommon: ERROR: Abandoning symbols file "(unnamed)"
xkbcommon: ERROR: Failed to compile xkb_symbols
xkbcommon: ERROR: Failed to compile keymap

The upstream xkeyboard-config change removing the compat
mapping is:
https://gitlab.freedesktop.org/xkeyboard-config/xkeyboard-config/-/commit/470ad2cd8fea84d7210377161d86b31999bb5ea6

Make QEMU always ask for the 'ara' xkb layout, which should work on
both older and newer xkeyboard-config.  We leave the QEMU name for
this keyboard layout as 'ar'; it is not the only one where our name
for it deviates from the xkb standard name.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20230620162024.1132013-1-peter.maydell@linaro.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1709
(cherry picked from commit 497fad3897)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-26 01:31:47 +03:00
Peter Maydell cf7950282d host-utils: Avoid using __builtin_subcll on buggy versions of Apple Clang
We use __builtin_subcll() to do a 64-bit subtract with borrow-in and
borrow-out when the host compiler supports it.  Unfortunately some
versions of Apple Clang have a bug in their implementation of this
intrinsic which means it returns the wrong value.  The effect is that
a QEMU built with the affected compiler will hang when emulating x86
or m68k float80 division.

The upstream LLVM issue is:
https://github.com/llvm/llvm-project/issues/55253

The commit that introduced the bug apparently never made it into an
upstream LLVM release without the subsequent fix
https://github.com/llvm/llvm-project/commit/fffb6e6afdbaba563189c1f715058ed401fbc88d
but unfortunately it did make it into Apple Clang 14.0, as shipped
in Xcode 14.3 (14.2 is reported to be OK). The Apple bug number is
FB12210478.

Add ifdefs to avoid use of __builtin_subcll() on Apple Clang version
14 or greater.  There is not currently a version of Apple Clang which
has the bug fix -- when one appears we should be able to add an upper
bound to the ifdef condition so we can start using the builtin again.
We make the lower bound a conservative "any Apple clang with major
version 14 or greater" because the consequences of incorrectly
disabling the builtin when it would work are pretty small and the
consequences of not disabling it when we should are pretty bad.

Many thanks to those users who both reported this bug and also
did a lot of work in identifying the root cause; in particular
to Daniel Bertalan and osy.

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1631
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1659
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Tested-by: Daniel Bertalan <dani@danielbertalan.dev>
Tested-by: Tested-By: Solra Bizna <solra@bizna.name>
Message-id: 20230622130823.1631719-1-peter.maydell@linaro.org
(cherry picked from commit b0438861ef)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-26 01:31:07 +03:00
Bastian Koppelmann 72a23f2991 target/tricore: Add CHECK_REG_PAIR() for insn accessing 64 bit regs
some insns were not checking if an even index was used to access a 64
bit register. In the worst case that could lead to a buffer overflow as
reported in https://gitlab.com/qemu-project/qemu/-/issues/1698.

Reported-by: Siqi Chen <coc.cyqh@gmail.com>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-Id: <20230612113245.56667-4-kbastian@mail.uni-paderborn.de>
(cherry picked from commit 6991777ec4)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-23 09:46:01 +03:00
Siqi Chen b9e1415e16 target/tricore: Fix out-of-bounds index in imask instruction
When translating  "imask" instruction of Tricore architecture, QEMU did not check whether the register index was out of bounds, resulting in a global-buffer-overflow.

Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1698
Reported-by: Siqi Chen <coc.cyqh@gmail.com>
Signed-off-by: Siqi Chen <coc.cyqh@gmail.com>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-Id: <20230612065633.149152-1-coc.cyqh@gmail.com>
Message-Id: <20230612113245.56667-2-kbastian@mail.uni-paderborn.de>
(cherry picked from commit d34b092cab)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-23 09:44:47 +03:00
Peter Maydell 4553eee156 hw/timer/nrf51_timer: Don't lose time when timer is queried in tight loop
The nrf51_timer has a free-running counter which we implement using
the pattern of using two fields (update_counter_ns, counter) to track
the last point at which we calculated the counter value, and the
counter value at that time.  Then we can find the current counter
value by converting the difference in wall-clock time between then
and now to a tick count that we need to add to the counter value.

Unfortunately the nrf51_timer's implementation of this has a bug
which means it loses time every time update_counter() is called.
After updating s->counter it always sets s->update_counter_ns to
'now', even though the actual point when s->counter hit the new value
will be some point in the past (half a tick, say).  In the worst case
(guest code in a tight loop reading the counter, icount mode) the
counter is continually queried less than a tick after it was last
read, so s->counter never advances but s->update_counter_ns does, and
the guest never makes forward progress.

The fix for this is to only advance update_counter_ns to the
timestamp of the last tick, not all the way to 'now'.  (This is the
pattern used in hw/misc/mps2-fpgaio.c's counter.)

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20230606134917.3782215-1-peter.maydell@linaro.org
(cherry picked from commit d2f9a79a8c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-22 10:38:38 +03:00
Peter Maydell 22d71f9eb3 hw/intc/allwinner-a10-pic: Handle IRQ levels other than 0 or 1
In commit 2c5fa0778c we fixed an endianness bug in the Allwinner
A10 PIC model; however in the process we introduced a regression.
This is because the old code was robust against the incoming 'level'
argument being something other than 0 or 1, whereas the new code was
not.

In particular, the allwinner-sdhost code treats its IRQ line
as 0-vs-non-0 rather than 0-vs-1, so when the SD controller
set its IRQ line for any reason other than transmit the
interrupt controller would ignore it. The observed effect
was a guest timeout when rebooting the guest kernel.

Handle level values other than 0 or 1, to restore the old
behaviour.

Fixes: 2c5fa0778c ("hw/intc/allwinner-a10-pic: Don't use set_bit()/clear_bit()")
(Mjt:  af08c70ef5 in stable-8.0)
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Message-id: 20230606104609.3692557-2-peter.maydell@linaro.org
(cherry picked from commit f837b468cd)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-22 10:37:43 +03:00
Peter Maydell f38ca28c66 target/arm: Return correct result for LDG when ATA=0
The LDG instruction loads the tag from a memory address (identified
by [Xn + offset]), and then merges that tag into the destination
register Xt. We implemented this correctly for the case when
allocation tags are enabled, but didn't get it right when ATA=0:
instead of merging the tag bits into Xt, we merged them into the
memory address [Xn + offset] and then set Xt to that.

Merge the tag bits into the old Xt value, as they should be.

Cc: qemu-stable@nongnu.org
Fixes: c15294c1e3 ("target/arm: Implement LDG, STG, ST2G instructions")
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 7e2788471f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-22 10:35:22 +03:00
Peter Maydell 2bdaf89162 target/arm: Fix return value from LDSMIN/LDSMAX 8/16 bit atomics
The atomic memory operations are supposed to return the old memory
data value in the destination register.  This value is not
sign-extended, even if the operation is the signed minimum or
maximum.  (In the pseudocode for the instructions the returned data
value is passed to ZeroExtend() to create the value in the register.)

We got this wrong because we were doing a 32-to-64 zero extend on the
result for 8 and 16 bit data values, rather than the correct amount
of zero extension.

Fix the bug by using ext8u and ext16u for the MO_8 and MO_16 data
sizes rather than ext32u.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230602155223.2040685-2-peter.maydell@linaro.org
(cherry picked from commit 243705aa6e)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-22 10:34:30 +03:00
Cédric Le Goater fb8b14025b aspeed/hace: Initialize g_autofree pointer
As mentioned in docs/devel/style.rst "Automatic memory deallocation":

* Variables declared with g_auto* MUST always be initialized,
  otherwise the cleanup function will use uninitialized stack memory

This avoids QEMU to coredump when running the "hash test" command
under Zephyr.

Cc: Steven Lee <steven_lee@aspeedtech.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: qemu-stable@nongnu.org
Fixes: c5475b3f9a ("hw: Model ASPEED's Hash and Crypto Engine")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Message-Id: <20230421131547.2177449-1-clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit c8f48b120b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-16 16:15:56 +03:00
Yin Wang 4a83e27b21 hw/riscv: qemu crash when NUMA nodes exceed available CPUs
Command "qemu-system-riscv64 -machine virt
-m 2G -smp 1 -numa node,mem=1G -numa node,mem=1G"
would trigger this problem.Backtrace with:
 #0  0x0000555555b5b1a4 in riscv_numa_get_default_cpu_node_id  at ../hw/riscv/numa.c:211
 #1  0x00005555558ce510 in machine_numa_finish_cpu_init  at ../hw/core/machine.c:1230
 #2  0x00005555558ce9d3 in machine_run_board_init  at ../hw/core/machine.c:1346
 #3  0x0000555555aaedc3 in qemu_init_board  at ../softmmu/vl.c:2513
 #4  0x0000555555aaf064 in qmp_x_exit_preconfig  at ../softmmu/vl.c:2609
 #5  0x0000555555ab1916 in qemu_init  at ../softmmu/vl.c:3617
 #6  0x000055555585463b in main  at ../softmmu/main.c:47
This commit fixes the issue by adding parameter checks.

Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Yin Wang <yin.wang@intel.com>
Message-Id: <20230519023758.1759434-1-yin.wang@intel.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit b9cedbf19c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-14 13:02:36 +03:00
Nicholas Piggin e7d265340e target/ppc: Fix PMU hflags calculation
Some of the PMU hflags bits can go out of synch, for example a store to
MMCR0 with PMCjCE=1 fails to update hflags correctly and results in
hflags mismatch:

  qemu: fatal: TCG hflags mismatch (current:0x2408003d rebuilt:0x240a003d)

This can be reproduced by running perf on a recent machine.

Some of the fragility here is the duplication of PMU hflags calculations.
This change consolidates that in a single place to update pmu-related
hflags, to be called after a well defined state changes.

The post-load PMU update is pulled out of the MSR update because it does
not depend on the MSR value.

Fixes: 8b3d1c49a9 ("target/ppc: Add new PMC HFLAGS")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20230530130447.372617-1-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
(cherry picked from commit 6494d2c1fd)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-11 11:15:29 +03:00
Nicholas Piggin 1de8291e29 target/ppc: Fix nested-hv HEAI delivery
ppc hypervisors turn HEAI interrupts into program interrupts injected
into the guest that executed the illegal instruction, if the hypervisor
doesn't handle it some other way.

The nested-hv implementation failed to account for this HEAI->program
conversion. The virtual hypervisor wants to see the HEAI when running
a nested guest, so that interrupt type can be returned to its KVM
caller.

Fixes: 7cebc5db2e ("target/ppc: Introduce a vhyp framework for nested HV support")
Cc: balaton@eik.bme.hu
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20230530132127.385001-1-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
(cherry picked from commit 6c242e79b8)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-11 11:14:51 +03:00
Nicholas Piggin 3c6346625b target/ppc: Fix lqarx to set cpu_reserve
lqarx does not set cpu_reserve, which causes stqcx. to never succeed.

Cc: qemu-stable@nongnu.org
Fixes: 94bf265867 ("target/ppc: Use atomic load for LQ and LQARX")
Fixes: 57b38ffd0c ("target/ppc: Use tcg_gen_qemu_{ld,st}_i128 for LQARX, LQ, STQ")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230605025445.161932-1-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
(cherry picked from commit e025e8f5a8)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-11 11:07:26 +03:00
Anastasia Belova d09e35feb5 vnc: move assert in vnc_worker_thread_loop
job may be NULL if queue->exit is true. Check
it before dereference job.

Fixes: f31f9c1080 ("vnc: add magic cookie to VncState")
Signed-off-by: Anastasia Belova <abelova@astralinux.ru>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit bdfca8a22f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-11 11:03:26 +03:00
Mattias Nissler 967e42986f hw/remote: Fix vfu_cfg trace offset format
The printed offset value is prefixed with 0x, but was actually printed
in decimal. To spare others the confusion, adjust the format specifier
to hexadecimal.

Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
Reviewed-by: Jagannathan Raman <jag.raman@oracle.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 5fb9e82955)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-11 11:02:28 +03:00
Christian Schoenebeck b9d2887be4 9pfs: prevent opening special files (CVE-2023-2861)
The 9p protocol does not specifically define how server shall behave when
client tries to open a special file, however from security POV it does
make sense for 9p server to prohibit opening any special file on host side
in general. A sane Linux 9p client for instance would never attempt to
open a special file on host side, it would always handle those exclusively
on its guest side. A malicious client however could potentially escape
from the exported 9p tree by creating and opening a device file on host
side.

With QEMU this could only be exploited in the following unsafe setups:

  - Running QEMU binary as root AND 9p 'local' fs driver AND 'passthrough'
    security model.

or

  - Using 9p 'proxy' fs driver (which is running its helper daemon as
    root).

These setups were already discouraged for safety reasons before,
however for obvious reasons we are now tightening behaviour on this.

Fixes: CVE-2023-2861
Reported-by: Yanwu Shen <ywsPlz@gmail.com>
Reported-by: Jietao Xiao <shawtao1125@gmail.com>
Reported-by: Jinku Li <jkli@xidian.edu.cn>
Reported-by: Wenbo Shen <shenwenbo@zju.edu.cn>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <E1q6w7r-0000Q0-NM@lizzy.crudebyte.com>
(cherry picked from commit f6b0de53fb)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-08 23:46:46 +03:00
Mark Somerville 828af6b31f qga: Fix suspend on Linux guests without systemd
Allow the Linux guest agent to attempt each of the suspend methods
(systemctl, pm-* and writing to /sys) in turn.

Prior to this guests without systemd failed to suspend due to
`guest_suspend` returning early regardless of the return value of
`systemd_supports_mode`.

Signed-off-by: Mark Somerville <mark@qpok.net>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
(cherry picked from commit 86dcb6ab9b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-08 13:30:40 +03:00
Jagannathan Raman fe88635449 docs: fix multi-process QEMU documentation
Fix a typo in the system documentation for multi-process QEMU.

Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 7771e8b863)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-07 22:49:05 +03:00
David Woodhouse 6a69a58b1c hw/xen: Fix broken check for invalid state in xs_be_open()
Coverity points out that if (!s && !s->impl) isn't really what we intended
to do here. CID 1508131.

Fixes: 0324751272 ("hw/xen: Add emulated implementation of XenStore operations")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230412185102.441523-6-dwmw2@infradead.org>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
(cherry picked from commit c9bdfe8d58)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-07 22:47:56 +03:00
David Woodhouse 1797de7f82 hw/xen: Fix memory leak in libxenstore_open() for Xen
There was a superfluous allocation of the XS handle, leading to it
being leaked on both the error path and the success path (where it gets
allocated again).

Spotted by Coverity (CID 1508098).

Fixes: ba2a92db1f ("hw/xen: Add xenstore operations to allow redirection to internal emulation")
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20230412185102.441523-3-dwmw2@infradead.org>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
(cherry picked from commit 8442232eba)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-07 22:47:46 +03:00
Thomas Huth 903c71251b hw/mips/malta: Fix the malta machine on big endian hosts
Booting a Linux kernel with the malta machine is currently broken
on big endian hosts. The cpu_to_gt32 macro wants to byteswap a value
for little endian targets only, but uses the wrong way to do this:
cpu_to_[lb]e32 works the other way round on big endian hosts! Fix
it by using the same ways on both, big and little endian hosts.

Fixes: 0c8427baf0 ("hw/mips/malta: Use bootloader helper to set BAR registers")
Cc: qemu-stable@nongnu.org
Message-Id: <20230330152613.232082-1-thuth@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit dc96009afd)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-07 12:36:29 +03:00
Ilya Leoshkevich 454d4e4380 s390x/tcg: Fix CPU address returned by STIDP
In qemu-user-s390x, /proc/cpuinfo contains:

	processor 0: version = 00,  identification = 000000,  machine = 8561
	processor 1: version = 00,  identification = 400000,  machine = 8561

The highest nibble is supposed to contain the CPU address, but it's off
by 2 bits. Fix the shift value and provide a symbolic constant for it.

With the fix we get:

	processor 0: version = 00,  identification = 000000,  machine = 8561
	processor 1: version = 00,  identification = 100000,  machine = 8561

Fixes: 076d4d39b6 ("s390x/cpumodel: wire up cpu type + id for TCG")
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230605113950.1169228-2-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 71b11cbe1c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-07 12:36:29 +03:00
Ilya Leoshkevich 5cd229064a tests/tcg/s390x: Test MXDB and MXDBR
Add a small test to prevent regressions.

Cc: qemu-stable@nongnu.org
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230601223027.795501-3-iii@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 2b956244a9)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-07 12:36:29 +03:00
Ilya Leoshkevich 1b4417178e target/s390x: Fix MXDB and MXDBR
These instructions multiply 64 bits by 64 bits, not 128 bits by 64 bits.

Reported-by: Tulio Magno Quites Machado Filho <tuliom@redhat.com>
Fixes: 2b91240f95 ("target/s390x: Use Int128 for passing float128")
Cc: qemu-stable@nongnu.org
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2211472
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230601223027.795501-2-iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit a7f4add793)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-07 12:36:29 +03:00
Ilya Leoshkevich 373cc0f3b5 tests/tcg/s390x: Test single-stepping SVC
Add a small test to prevent regressions.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230510230213.330134-3-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit be4a4cb429)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-07 12:36:29 +03:00
Ilya Leoshkevich 9b7c1e431e linux-user/s390x: Fix single-stepping SVC
Currently single-stepping SVC executes two instructions. The reason is
that EXCP_DEBUG for the SVC instruction itself is masked by EXCP_SVC.
Fix by re-raising EXCP_DEBUG.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230510230213.330134-2-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 01b9990a3f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-07 12:36:29 +03:00
Ilya Leoshkevich 0a3a9ae1f2 tests/tcg/s390x: Test LOCFHR
Add a small test to prevent regressions.

Cc: qemu-stable@nongnu.org
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230526181240.1425579-5-iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 230976232f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-07 12:36:29 +03:00
Ilya Leoshkevich 0d4bced374 target/s390x: Fix LOCFHR taking the wrong half of R2
LOCFHR should write top-to-top, but QEMU erroneously writes
bottom-to-top.

Fixes: 45aa9aa3b7 ("target/s390x: Implement load-on-condition-2 insns")
Cc: qemu-stable@nongnu.org
Reported-by: Mikhail Mitskevich <mitskevichmn@gmail.com>
Closes: https://gitlab.com/qemu-project/qemu/-/issues/1668
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230526181240.1425579-4-iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 3180b17362)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-07 12:36:29 +03:00
Ilya Leoshkevich 8776c6cf6a tests/tcg/s390x: Test LCBB
Add a test to prevent regressions.

Cc: qemu-stable@nongnu.org
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230526181240.1425579-3-iii@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 05d000fb4d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-07 12:36:29 +03:00
Ilya Leoshkevich 76d4eb3a5e target/s390x: Fix LCBB overwriting the top 32 bits
LCBB is supposed to overwrite only the bottom 32 bits, but QEMU
erroneously overwrites the entire register.

Fixes: 6d9303322e ("s390x/tcg: Implement LOAD COUNT TO BLOCK BOUNDARY")
Cc: qemu-stable@nongnu.org
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230526181240.1425579-2-iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 079181b9bc)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-07 12:36:29 +03:00
Ilya Leoshkevich 6a9f9e6499 tests/tcg/s390x: Test EXECUTE of relative branches
Add a small test to prevent regressions.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230426235813.198183-3-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit bfa72590df)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: forgotten testcase for commit b858c53ef6)
2023-06-07 12:35:15 +03:00
Peter Maydell f81a5d6863 target/arm: Explicitly select short-format FSR for M-profile
For M-profile, there is no guest-facing A-profile format FSR, but we
still use the env->exception.fsr field to pass fault information from
the point where a fault is raised to the code in
arm_v7m_cpu_do_interrupt() which interprets it and sets the M-profile
specific fault status registers.  So it doesn't matter whether we
fill in env->exception.fsr in the short format or the LPAE format, as
long as both sides agree.  As it happens arm_v7m_cpu_do_interrupt()
assumes short-form.

In compute_fsr_fsc() we weren't explicitly choosing short-form for
M-profile, but instead relied on it falling out in the wash because
arm_s1_regime_using_lpae_format() would be false.  This was broken in
commit 452c67a4 when we added v8R support, because we said "PMSAv8 is
always LPAE format" (as it is for v8R), forgetting that we were
implicitly using this code path on M-profile. At that point we would
hit a g_assert_not_reached():
 ERROR:../../target/arm/internals.h:549:arm_fi_to_lfsc: code should not be reached

#7  0x0000555555e055f7 in arm_fi_to_lfsc (fi=0x7fffecff9a90) at ../../target/arm/internals.h:549
#8  0x0000555555e05a27 in compute_fsr_fsc (env=0x555557356670, fi=0x7fffecff9a90, target_el=1, mmu_idx=1, ret_fsc=0x7fffecff9a1c)
    at ../../target/arm/tlb_helper.c:95
#9  0x0000555555e05b62 in arm_deliver_fault (cpu=0x555557354800, addr=268961344, access_type=MMU_INST_FETCH, mmu_idx=1, fi=0x7fffecff9a90)
    at ../../target/arm/tlb_helper.c:132
#10 0x0000555555e06095 in arm_cpu_tlb_fill (cs=0x555557354800, address=268961344, size=1, access_type=MMU_INST_FETCH, mmu_idx=1, probe=false, retaddr=0)
    at ../../target/arm/tlb_helper.c:260

The specific assertion changed when commit fcc7404eff added
"assert not M-profile" to arm_is_secure_below_el3(), because the
conditions being checked in compute_fsr_fsc() include
arm_el_is_aa64(), which will end up calling arm_is_secure_below_el3()
and asserting before we try to call arm_fi_to_lfsc():

#7  0x0000555555efaf43 in arm_is_secure_below_el3 (env=0x5555574665a0) at ../../target/arm/cpu.h:2396
#8  0x0000555555efb103 in arm_is_el2_enabled (env=0x5555574665a0) at ../../target/arm/cpu.h:2448
#9  0x0000555555efb204 in arm_el_is_aa64 (env=0x5555574665a0, el=1) at ../../target/arm/cpu.h:2509
#10 0x0000555555efbdfd in compute_fsr_fsc (env=0x5555574665a0, fi=0x7fffecff99e0, target_el=1, mmu_idx=1, ret_fsc=0x7fffecff996c)

Avoid the assertion and the incorrect FSR format selection by
explicitly making M-profile use the short-format in this function.

Fixes: 452c67a427 ("target/arm: Enable TTBCR_EAE for ARMv8-R AArch32")a
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1658
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230523131726.866635-1-peter.maydell@linaro.org
(cherry picked from commit d7fe699be5)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-31 09:42:37 +03:00
Clément Chigot 505f0c68c9 hw/arm/xlnx-zynqmp: fix unsigned error when checking the RPUs number
When passing --smp with a number lower than XLNX_ZYNQMP_NUM_APU_CPUS,
the expression (ms->smp.cpus - XLNX_ZYNQMP_NUM_APU_CPUS) will result
in a positive number as ms->smp.cpus is a unsigned int.
This will raise the following error afterwards, as Qemu will try to
instantiate some additional RPUs.
  | $ qemu-system-aarch64 --smp 1 -M xlnx-zcu102
  | **
  | ERROR:../src/tcg/tcg.c:777:tcg_register_thread:
  |   assertion failed: (n < tcg_max_ctxs)

Signed-off-by: Clément Chigot <chigot@adacore.com>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Tested-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Message-id: 20230524143714.565792-1-chigot@adacore.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit c9ba1c9f02)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-31 09:42:37 +03:00
Tommy Wu cdda1ce158 hw/dma/xilinx_axidma: Check DMASR.HALTED to prevent infinite loop.
When we receive a packet from the xilinx_axienet and then try to s2mem
through the xilinx_axidma, if the descriptor ring buffer is full in the
xilinx axidma driver, we’ll assert the DMASR.HALTED in the
function : stream_process_s2mem and return 0. In the end, we’ll be stuck in
an infinite loop in axienet_eth_rx_notify.

This patch checks the DMASR.HALTED state when we try to push data
from xilinx axi-enet to xilinx axi-dma. When the DMASR.HALTED is asserted,
we will not keep pushing the data and then prevent the infinte loop.

Signed-off-by: Tommy Wu <tommy.wu@sifive.com>
Reviewed-by: Edgar E. Iglesias <edgar@zeroasic.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Message-id: 20230519062137.1251741-1-tommy.wu@sifive.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 31afe04586)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-31 09:42:37 +03:00
Volker Rümelin bc8e883065 ui/sdl2: disable SDL_HINT_GRAB_KEYBOARD on Windows
Windows sends an extra left control key up/down input event for
every right alt key up/down input event for keyboards with
international layout. Since commit 830473455f ("ui/sdl2: fix
handling of AltGr key on Windows") QEMU uses a Windows low level
keyboard hook procedure to reliably filter out the special left
control key and to grab the keyboard on Windows.

The SDL2 version 2.0.16 introduced its own Windows low level
keyboard hook procedure to grab the keyboard. Windows calls this
callback before the QEMU keyboard hook procedure. This disables
the special left control key filter when the keyboard is grabbed.

To fix the problem, disable the SDL2 Windows low level keyboard
hook procedure.

Reported-by: Bernhard Beschow <shentey@gmail.com>
Signed-off-by: Volker Rümelin <vr_qemu@t-online.de>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230418062823.5683-1-vr_qemu@t-online.de>
(cherry picked from commit 1dfea3f212)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-31 09:42:37 +03:00
Bernhard Beschow e0968d21e2 ui/sdl2: Grab Alt+F4 also under Windows
SDL doesn't grab Alt+F4 under Windows by default. Pressing Alt+F4 thus closes
the VM immediately without confirmation, possibly leading to data loss. Fix
this by always grabbing Alt+F4 on Windows hosts, too.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Volker Rümelin <vr_qemu@t-online.de>
Message-Id: <20230417192139.43263-3-shentey@gmail.com>
(cherry picked from commit 083db9db44)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-31 09:42:37 +03:00
Bernhard Beschow 772a83c6db ui/sdl2: Grab Alt+Tab also in fullscreen mode
By default, SDL grabs Alt+Tab only in non-fullscreen mode. This causes Alt+Tab
to switch tasks on the host rather than in the VM in fullscreen mode while it
switches tasks in non-fullscreen mode in the VM. Fix this confusing behavior
by grabbing Alt+Tab in fullscreen mode, always causing tasks to be switched in
the VM.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Volker Rümelin <vr_qemu@t-online.de>
Message-Id: <20230417192139.43263-2-shentey@gmail.com>
(cherry picked from commit efc00a3709)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-31 09:42:37 +03:00
Marc-André Lureau 9e36edcf03 ui/dbus: fix compilation when GBM && !OPENGL
commit 4814d3cbf ("ui/dbus: restrict opengl to gbm-enabled config")
assumes that whenever GBM is available, OpenGL is. This is not always
the case, let's further restrict opengl-related paths and fix some
compilation issues.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230515132348.1024663-1-marcandre.lureau@redhat.com>
(cherry picked from commit 0b31e48d62)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-31 09:42:37 +03:00
Marc-André Lureau e0baf24b4a ui/sdl2: fix surface_gl_update_texture: Assertion 'gls' failed
Before sdl2_gl_update() is called, sdl2_gl_switch() may decide to
destroy the console window and its associated shaders.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1644
Fixes: c84ab0a500 ("ui/console: optionally update after gfx switch")

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Bin Meng <bin.meng@windriver.com>
Message-Id: <20230511074217.4171842-1-marcandre.lureau@redhat.com>
(cherry picked from commit b3a654d82e)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-31 09:42:37 +03:00
Erico Nunes eef92fac91 ui/gtk-egl: fix scaling for cursor position in scanout mode
vc->gfx.w and vc->gfx.h are not updated appropriately in this code path,
which leads to a different scaling factor for rendering the cursor on
some edge cases (e.g. the focus has left and re-entered the gtk window).
This can be reproduced using vhost-user-gpu with the gtk ui on the x11
backend.
Use the surface dimensions which are already updated accordingly.

Signed-off-by: Erico Nunes <ernunes@redhat.com>
Acked-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230320160856.364319-2-ernunes@redhat.com>
(cherry picked from commit f8a951bb95)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-31 09:42:36 +03:00
Erico Nunes 7fd387715b ui/gtk: use widget size for cursor motion event
The gd_motion_event size has some calculations for the cursor position,
which also take into account things like different size of the
framebuffer compared to the window size.
The use of window size makes things more difficult though, as at least
in the case of Wayland includes the size of ui elements like a menu bar
at the top of the window. This leads to a wrong position calculation by
a few pixels.
Fix it by using the size of the widget, which already returns the size
of the actual space to render the framebuffer.

Signed-off-by: Erico Nunes <ernunes@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Message-Id: <20230320160856.364319-1-ernunes@redhat.com>
(cherry picked from commit 2f31663ed4)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-31 09:42:36 +03:00
Erico Nunes 76b7002ec7 ui/gtk: fix passing y0_top parameter to scanout
The dmabuf->y0_top flag is passed to .dpy_gl_scanout_dmabuf(), however
in the gtk ui both implementations dropped it when doing the next
scanout_texture call.

Fixes flipped linux console using vhost-user-gpu with the gtk ui
display.

Signed-off-by: Erico Nunes <ernunes@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230220175605.43759-1-ernunes@redhat.com>
(cherry picked from commit 94400fa53f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-31 09:42:36 +03:00
Bernhard Beschow 880f7d12be hw/ppc/prep: Fix wiring of PIC -> CPU interrupt
Commit cef2e7148e ("hw/isa/i82378: Remove intermediate IRQ forwarder")
passes s->cpu_intr to i8259_init() in i82378_realize() directly. However, s-
>cpu_intr isn't initialized yet since that happens after the south bridge's
pci_realize_and_unref() in board code. Fix this by initializing s->cpu_intr
before realizing the south bridge.

Fixes: cef2e7148e ("hw/isa/i82378: Remove intermediate IRQ forwarder")
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20230304114043.121024-4-shentey@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
(cherry picked from commit 2237af5e60)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-31 09:42:36 +03:00
Richard Purdie 864ce70c1c target/ppc: Fix fallback to MFSS for MFFS* instructions on pre 3.0 ISAs
The following commits changed the code such that the fallback to MFSS for MFFSCRN,
MFFSCRNI, MFFSCE and MFFSL on pre 3.0 ISAs was removed and became an illegal instruction:

  bf8adfd88b - target/ppc: Move mffscrn[i] to decodetree
  394c2e2fda - target/ppc: Move mffsce to decodetree
  3e5bce70ef - target/ppc: Move mffsl to decodetree

The hardware will handle them as a MFFS instruction as the code did previously.
This means applications that were segfaulting under qemu when encountering these
instructions which is used in glibc libm functions for example.

The fallback for MFFSCDRN and MFFSCDRNI added in a later patch was also missing.

This patch restores the fallback to MFSS for these instructions on pre 3.0s ISAs
as the hardware decoder would, fixing the segfaulting libm code. It doesn't have
the fallback for 3.0 onwards to match hardware behaviour.

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Reviewed-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230510111913.1718734-1-richard.purdie@linuxfoundation.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
(cherry picked from commit 5260ecffd2)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-31 09:42:36 +03:00
Thomas Huth 25b846c85e scripts/device-crash-test: Add a parameter to run with TCG only
We're currently facing the problem that the device-crash-test script
runs twice as long in the CI when a runner supports KVM - which sometimes
results in a timeout of the CI job. To get a more deterministic runtime
here, add an option to the script that allows to run it with TCG only.

Reported-by: Eldon Stegall <eldon-qemu@eldondev.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230414145845.456145-3-thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230424092249.58552-6-alex.bennee@linaro.org>
(cherry picked from commit 8b869aa591)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-31 09:42:36 +03:00
Thomas Huth eca6ebee52 gitlab-ci: Avoid to re-run "configure" in the device-crash-test jobs
After "make check-venv" had been added to these jobs, they started
to re-run "configure" each time since our logic in the makefile
thinks that some files are out of date here. Avoid it with the same
trick that we are using in buildtest-template.yml already by disabling
the up-to-date check via NINJA=":".

Fixes: 1d8cf47e5b ("tests: run 'device-crash-test' from tests/venv")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230414145845.456145-2-thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230424092249.58552-5-alex.bennee@linaro.org>
(cherry picked from commit 4d3bd91b26)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-31 09:42:36 +03:00
Michael Tokarev f7f686b61c Update version for 8.0.2 release
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-30 14:59:47 +03:00
Kevin Wolf bea933e430 block/export: Fix null pointer dereference in error path
There are some error paths in blk_exp_add() that jump to 'fail:' before
'exp' is even created. So we can't just unconditionally access exp->blk.

Add a NULL check, and switch from exp->blk to blk, which is available
earlier, just to be extra sure that we really cover all cases where
BlockDevOps could have been set for it (in practice, this only happens
in drv->create() today, so this part of the change isn't strictly
necessary).

Fixes: Coverity CID 1509238
Fixes: de79b52604
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510203601.418015-3-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit a184563778)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-30 00:16:58 +03:00
Michael Tokarev dabb4183d1 Update version for 8.0.1 release
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-29 18:19:22 +03:00
Paolo Bonzini ff692a15bb virtio: qmp: fix memory leak
The VirtioInfoList is already allocated by QAPI_LIST_PREPEND and
need not be allocated by the caller.

Fixes Coverity CID 1508724.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 0bfd14149b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-27 12:54:07 +03:00
Igor Mammedov 134253a4fe machine: do not crash if default RAM backend name has been stolen
QEMU aborts when default RAM backend should be used (i.e. no
explicit '-machine memory-backend=' specified) but user
has created an object which 'id' equals to default RAM backend
name used by board.

 $QEMU -machine pc \
       -object memory-backend-ram,id=pc.ram,size=4294967296

 Actual results:
 QEMU 7.2.0 monitor - type 'help' for more information
 (qemu) Unexpected error in object_property_try_add() at ../qom/object.c:1239:
 qemu-kvm: attempt to add duplicate property 'pc.ram' to object (type 'container')
 Aborted (core dumped)

Instead of abort, check for the conflicting 'id' and exit with
an error, suggesting how to remedy the issue.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2207886
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230522131717.3780533-1-imammedo@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit a37531f238)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-26 19:46:01 +03:00
Thomas Huth e49884a909 hw/scsi/lsi53c895a: Fix reentrancy issues in the LSI controller (CVE-2023-0330)
We cannot use the generic reentrancy guard in the LSI code, so
we have to manually prevent endless reentrancy here. The problematic
lsi_execute_script() function has already a way to detect whether
too many instructions have been executed - we just have to slightly
change the logic here that it also takes into account if the function
has been called too often in a reentrant way.

The code in fuzz-lsi53c895a-test.c has been taken from an earlier
patch by Mauro Matteo Cascella.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1563
Message-Id: <20230522091011.1082574-1-thuth@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit b987718bbb)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-26 18:56:39 +03:00
Paolo Bonzini 9d622451fd usb/ohci: Set pad to 0 after frame update
When the OHCI controller's framenumber is incremented, HccaPad1 register
should be set to zero (Ref OHCI Spec 4.4)

ReactOS uses hccaPad1 to determine if the OHCI hardware is running,
consequently it fails this check in current qemu master.

Signed-off-by: Ryan Wendland <wendland@live.com.au>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1048
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 6301460ce9)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-26 10:49:10 +03:00
Akihiko Odaki 668aeea0ec util/vfio-helpers: Use g_file_read_link()
When _FORTIFY_SOURCE=2, glibc version is 2.35, and GCC version is
12.1.0, the compiler complains as follows:

In file included from /usr/include/features.h:490,
                 from /usr/include/bits/libc-header-start.h:33,
                 from /usr/include/stdint.h:26,
                 from /usr/lib/gcc/aarch64-unknown-linux-gnu/12.1.0/include/stdint.h:9,
                 from /home/alarm/q/var/qemu/include/qemu/osdep.h:94,
                 from ../util/vfio-helpers.c:13:
In function 'readlink',
    inlined from 'sysfs_find_group_file' at ../util/vfio-helpers.c:116:9,
    inlined from 'qemu_vfio_init_pci' at ../util/vfio-helpers.c:326:18,
    inlined from 'qemu_vfio_open_pci' at ../util/vfio-helpers.c:517:9:
/usr/include/bits/unistd.h:119:10: error: argument 2 is null but the corresponding size argument 3 value is 4095 [-Werror=nonnull]
  119 |   return __glibc_fortify (readlink, __len, sizeof (char),
      |          ^~~~~~~~~~~~~~~

This error implies the allocated buffer can be NULL. Use
g_file_read_link(), which allocates buffer automatically to avoid the
error.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
(cherry picked from commit dbdea0dbfe)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-26 10:47:30 +03:00
Stefan Hajnoczi fae9449998 rtl8139: fix large_send_mss divide-by-zero
If the driver sets large_send_mss to 0 then a divide-by-zero occurs.
Even if the division wasn't a problem, the for loop that emits MSS-sized
packets would never terminate.

Solve these issues by skipping offloading when large_send_mss=0.

This issue was found by OSS-Fuzz as part of Alexander Bulekov's device
fuzzing work. The reproducer is:

  $ cat << EOF | ./qemu-system-i386 -display none -machine accel=qtest, -m \
  512M,slots=1,maxmem=0xffff000000000000 -machine q35 -nodefaults -device \
  rtl8139,netdev=net0 -netdev user,id=net0 -device \
  pc-dimm,id=nv1,memdev=mem1,addr=0xb800a64602800000 -object \
  memory-backend-ram,id=mem1,size=2M  -qtest stdio
  outl 0xcf8 0x80000814
  outl 0xcfc 0xe0000000
  outl 0xcf8 0x80000804
  outw 0xcfc 0x06
  write 0xe0000037 0x1 0x04
  write 0xe00000e0 0x2 0x01
  write 0x1 0x1 0x04
  write 0x3 0x1 0x98
  write 0xa 0x1 0x8c
  write 0xb 0x1 0x02
  write 0xc 0x1 0x46
  write 0xd 0x1 0xa6
  write 0xf 0x1 0xb8
  write 0xb800a646028c000c 0x1 0x08
  write 0xb800a646028c000e 0x1 0x47
  write 0xb800a646028c0010 0x1 0x02
  write 0xb800a646028c0017 0x1 0x06
  write 0xb800a646028c0036 0x1 0x80
  write 0xe00000d9 0x1 0x40
  EOF

Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1582
Closes: https://gitlab.com/qemu-project/qemu/-/issues/1582
Cc: qemu-stable@nongnu.org
Cc: Peter Maydell <peter.maydell@linaro.org>
Fixes: 6d71357a3b ("rtl8139: honor large send MSS value")
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 792676c165)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-24 18:07:21 +03:00
Akihiko Odaki 02bd13ae3a igb: Always copy ethernet header
igb_receive_internal() used to check the iov length to determine
copy the iovs to a contiguous buffer, but the check is flawed in two
ways:
- It does not ensure that iovcnt > 0.
- It does not take virtio-net header into consideration.

The size of this copy is just 22 octets, which can be even less than
the code size required for checks. This (wrong) optimization is probably
not worth so just remove it. Removing this also allows igb to assume
aligned accesses for the ethernet header.

Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit dc9ef1bf45)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-24 16:37:48 +03:00
Akihiko Odaki c84bcff3d3 e1000e: Always copy ethernet header
e1000e_receive_internal() used to check the iov length to determine
copy the iovs to a contiguous buffer, but the check is flawed in two
ways:
- It does not ensure that iovcnt > 0.
- It does not take virtio-net header into consideration.

The size of this copy is just 18 octets, which can be even less than
the code size required for checks. This (wrong) optimization is probably
not worth so just remove it.

Fixes: 6f3fbe4ed0 ("net: Introduce e1000e device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 310a128eae)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-24 16:35:30 +03:00
Akihiko Odaki 5c4f2f1b60 net/net_rx_pkt: Use iovec for net_rx_pkt_set_protocols()
igb does not properly ensure the buffer passed to
net_rx_pkt_set_protocols() is contiguous for the entire L2/L3/L4 header.
Allow it to pass scattered data to net_rx_pkt_set_protocols().

Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 2f0fa232b8)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-24 16:34:46 +03:00
Akihiko Odaki ba3c7bf178 igb: Clear IMS bits when committing ICR access
The datasheet says contradicting statements regarding ICR accesses so it
is not reliable to determine the behavior of ICR accesses. However,
e1000e does clear IMS bits when reading ICR accesses and Linux also
expects ICR accesses will clear IMS bits according to:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/intel/igb/igb_main.c?h=v6.2#n8048

Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit f0b1df5c45)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-24 16:34:28 +03:00
Akihiko Odaki 6e260100d0 igb: Do not require CTRL.VME for tx VLAN tagging
While the datasheet of e1000e says it checks CTRL.VME for tx VLAN
tagging, igb's datasheet has no such statements. It also says for
"CTRL.VLE":
> This register only affects the VLAN Strip in Rx it does not have any
> influence in the Tx path in the 82576.
(Appendix A. Changes from the 82575)

There is no "CTRL.VLE" so it is more likely that it is a mistake of
CTRL.VME.

Fixes: fba7c3b788 ("igb: respect VMVIR and VMOLR for VLAN")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit e209716749)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-24 16:34:04 +03:00
Akihiko Odaki 9ff3fe63fc igb: Fix Rx packet type encoding
igb's advanced descriptor uses a packet type encoding different from
one used in e1000e's extended descriptor. Fix the logic to encode
Rx packet type accordingly.

Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit ed447c60b3)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-24 16:33:54 +03:00
Akihiko Odaki 0f7ca2bf2c e1000x: Fix BPRC and MPRC
Before this change, e1000 and the common code updated BPRC and MPRC
depending on the matched filter, but e1000e and igb decided to update
those counters by deriving the packet type independently. This
inconsistency caused a multicast packet to be counted twice.

Updating BPRC and MPRC depending on are fundamentally flawed anyway as
a filter can be used for different types of packets. For example, it is
possible to filter broadcast packets with MTA.

Always determine what counters to update by inspecting the packets.

Fixes: 3b27430177 ("e1000: Implementing various counters")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit f3f9b726af)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-24 16:31:45 +03:00
timothee.cocault@gmail.com eb134d1d58 e1000e: Fix tx/rx counters
The bytes and packets counter registers are cleared on read.

Copying the "total counter" registers to the "good counter" registers has
side effects.
If the "total" register is never read by the OS, it only gets incremented.
This leads to exponential growth of the "good" register.

This commit increments the counters individually to avoid this.

Signed-off-by: Timothée Cocault <timothee.cocault@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 8d689f6aae)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-23 22:07:47 +03:00
Kevin Wolf a7002f15c8 nbd/server: Fix drained_poll to wake coroutine in right AioContext
nbd_drained_poll() generally runs in the main thread, not whatever
iothread the NBD server coroutine is meant to run in, so it can't
directly reenter the coroutines to wake them up.

The code seems to have the right intention, it specifies the correct
AioContext when it calls qemu_aio_coroutine_enter(). However, this
functions doesn't schedule the coroutine to run in that AioContext, but
it assumes it is already called in the home thread of the AioContext.

To fix this, add a new thread-safe qio_channel_wake_read() that can be
called in the main thread to wake up the coroutine in its AioContext,
and use this in nbd_drained_poll().

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230517152834.277483-3-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 7c1f51bf38)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-22 20:54:40 +03:00
Kevin Wolf d001f222e3 graph-lock: Disable locking for now
In QEMU 8.0, we've been seeing deadlocks in bdrv_graph_wrlock(). They
come from callers that hold an AioContext lock, which is not allowed
during polling. In theory, we could temporarily release the lock, but
callers are inconsistent about whether they hold a lock, and if they do,
some are also confused about which one they hold. While all of this is
fixable, it's not trivial, and the best course of action for 8.0.1 is
probably just disabling the graph locking code temporarily.

We don't currently rely on graph locking yet. It is supposed to replace
the AioContext lock eventually to enable multiqueue support, but as long
as we still have the AioContext lock, it is sufficient without the graph
lock. Once the AioContext lock goes away, the deadlock doesn't exist any
more either and this commit can be reverted. (Of course, it can also be
reverted while the AioContext lock still exists if the callers have been
fixed.)

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230517152834.277483-2-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 80fc5d2600)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-22 20:54:40 +03:00
Stefan Hajnoczi 84d839e499 block: compile out assert_bdrv_graph_readable() by default
reader_count() is a performance bottleneck because the global
aio_context_list_lock mutex causes thread contention. Put this debugging
assertion behind a new ./configure --enable-debug-graph-lock option and
disable it by default.

The --enable-debug-graph-lock option is also enabled by the more general
--enable-debug option.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230501173443.153062-1-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 58a2e3f5c3)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: pick this one up so the next patch which disables this applies cleanly)
2023-05-22 20:53:40 +03:00
Stefan Hajnoczi a0b89ba845 tested: add test for nested aio_poll() in poll handlers
Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230502184134.534703-3-stefanha@redhat.com>
[kwolf: Restrict to CONFIG_POSIX, Windows doesn't support polling]
Tested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 844a12a63e)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-22 19:38:44 +03:00
Stefan Hajnoczi a91defe16b aio-posix: do not nest poll handlers
QEMU's event loop supports nesting, which means that event handler
functions may themselves call aio_poll(). The condition that triggered a
handler must be reset before the nested aio_poll() call, otherwise the
same handler will be called and immediately re-enter aio_poll. This
leads to an infinite loop and stack exhaustion.

Poll handlers are especially prone to this issue, because they typically
reset their condition by finishing the processing of pending work.
Unfortunately it is during the processing of pending work that nested
aio_poll() calls typically occur and the condition has not yet been
reset.

Disable a poll handler during ->io_poll_ready() so that a nested
aio_poll() call cannot invoke ->io_poll_ready() again. As a result, the
disabled poll handler and its associated fd handler do not run during
the nested aio_poll(). Calling aio_set_fd_handler() from inside nested
aio_poll() could cause it to run again. If the fd handler is pending
inside nested aio_poll(), then it will also run again.

In theory fd handlers can be affected by the same issue, but they are
more likely to reset the condition before calling nested aio_poll().

This is a special case and it's somewhat complex, but I don't see a way
around it as long as nested aio_poll() is supported.

Cc: qemu-stable@nongnu.org
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2186181
Fixes: c382706925 ("block: Mark bdrv_co_io_(un)plug() and callers GRAPH_RDLOCK")
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230502184134.534703-2-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 6d740fb01b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-22 19:38:44 +03:00
Mauro Matteo Cascella 81d13aa5e0 virtio-crypto: fix NULL pointer dereference in virtio_crypto_free_request
Ensure op_info is not NULL in case of QCRYPTODEV_BACKEND_ALG_SYM algtype.

Fixes: 0e660a6f90 ("crypto: Introduce RSA algorithm")
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Reported-by: Yiming Tao <taoym@zju.edu.cn>
Message-Id: <20230509075317.1132301-1-mcascell@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: zhenwei pi<pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 3e69908907)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-22 19:38:44 +03:00
Eugenio Pérez 302ac06ab9 virtio-net: not enable vq reset feature unconditionally
The commit 93a97dc520 ("virtio-net: enable vq reset feature") enables
unconditionally vq reset feature as long as the device is emulated.
This makes impossible to actually disable the feature, and it causes
migration problems from qemu version previous than 7.2.

The entire final commit is unneeded as device system already enable or
disable the feature properly.

This reverts commit 93a97dc520.
Fixes: 93a97dc520 ("virtio-net: enable vq reset feature")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>

Message-Id: <20230504101447.389398-1-eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 1fac00f70b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-22 19:38:44 +03:00
Leonardo Bras adc49750d2 hw/pci: Disable PCI_ERR_UNCOR_MASK register for machine type < 8.0
Since it's implementation on v8.0.0-rc0, having the PCI_ERR_UNCOR_MASK
set for machine types < 8.0 will cause migration to fail if the target
QEMU version is < 8.0.0 :

qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x10a read: 40 device: 0 cmask: ff wmask: 0 w1cmask:0
qemu-system-x86_64: Failed to load PCIDevice:config
qemu-system-x86_64: Failed to load e1000e:parent_obj
qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:02.0/e1000e'
qemu-system-x86_64: load of migration failed: Invalid argument

The above test migrated a 7.2 machine type from QEMU master to QEMU 7.2.0,
with this cmdline:

./qemu-system-x86_64 -M pc-q35-7.2 [-incoming XXX]

In order to fix this, property x-pcie-err-unc-mask was introduced to
control when PCI_ERR_UNCOR_MASK is enabled. This property is enabled by
default, but is disabled if machine type <= 7.2.

Fixes: 010746ae1d ("hw/pci/aer: Implement PCI_ERR_UNCOR_MASK register")
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Leonardo Bras <leobras@redhat.com>
Message-Id: <20230503002701.854329-1-leobras@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1576
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 5ed3dabe57)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-22 19:38:44 +03:00
Hawkins Jiawei a9144eed6c vhost: fix possible wrap in SVQ descriptor ring
QEMU invokes vhost_svq_add() when adding a guest's element
into SVQ. In vhost_svq_add(), it uses vhost_svq_available_slots()
to check whether QEMU can add the element into SVQ. If there is
enough space, then QEMU combines some out descriptors and some
in descriptors into one descriptor chain, and adds it into
`svq->vring.desc` by vhost_svq_vring_write_descs().

Yet the problem is that, `svq->shadow_avail_idx - svq->shadow_used_idx`
in vhost_svq_available_slots() returns the number of occupied elements,
or the number of descriptor chains, instead of the number of occupied
descriptors, which may cause wrapping in SVQ descriptor ring.

Here is an example. In vhost_handle_guest_kick(), QEMU forwards
as many available buffers to device by virtqueue_pop() and
vhost_svq_add_element(). virtqueue_pop() returns a guest's element,
and then this element is added into SVQ by vhost_svq_add_element(),
a wrapper to vhost_svq_add(). If QEMU invokes virtqueue_pop() and
vhost_svq_add_element() `svq->vring.num` times,
vhost_svq_available_slots() thinks QEMU just ran out of slots and
everything should work fine. But in fact, virtqueue_pop() returns
`svq->vring.num` elements or descriptor chains, more than
`svq->vring.num` descriptors due to guest memory fragmentation,
and this causes wrapping in SVQ descriptor ring.

This bug is valid even before marking the descriptors used.
If the guest memory is fragmented, SVQ must add chains
so it can try to add more descriptors than possible.

This patch solves it by adding `num_free` field in
VhostShadowVirtqueue structure and updating this field
in vhost_svq_add() and vhost_svq_get_buf(), to record
the number of free descriptors.

Fixes: 100890f7ca ("vhost: Shadow virtqueue buffers forwarding")
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230509084817.3973-1-yin31149@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
(cherry picked from commit 5d410557de)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-19 23:23:50 +03:00
Xinyu Li 0de5117819 target/i386: fix avx2 instructions vzeroall and vpermdq
vzeroall: xmm_regs should be used instead of xmm_t0
vpermdq: bit 3 and 7 of imm should be considered

Signed-off-by: Xinyu Li <lixinyu20s@ict.ac.cn>
Message-Id: <20230510145222.586487-1-lixinyu20s@ict.ac.cn>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 056d649007)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18 20:46:55 +03:00
Paolo Bonzini db8051ad59 target/i386: fix operand size for VCOMI/VUCOMI instructions
Compared to other SSE instructions, VUCOMISx and VCOMISx are different:
the single and double precision versions are distinguished through a
prefix, however they use no-prefix and 0x66 for SS and SD respectively.
Scalar values usually are associated with 0xF2 and 0xF3.

Because of these, they incorrectly perform a 128-bit memory load instead
of a 32- or 64-bit load.  Fix this by writing a custom decoding function.

I tested that the reproducer is fixed and the test-avx output does not
change.

Reported-by: Gabriele Svelto <gsvelto@mozilla.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1637
Fixes: f8d19eec0d ("target/i386: reimplement 0x0f 0x28-0x2f, add AVX", 2022-10-18)
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 2b55e479e6)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18 20:46:47 +03:00
Paolo Bonzini 1e029102e6 scsi-generic: fix buffer overflow on block limits inquiry
Using linux 6.x guest, at boot time, an inquiry on a scsi-generic
device makes qemu crash.  This is caused by a buffer overflow when
scsi-generic patches the block limits VPD page.

Do the operations on a temporary on-stack buffer that is guaranteed
to be large enough.

Reported-by: Théo Maillart <tmaillart@freebox.fr>
Analyzed-by: Théo Maillart <tmaillart@freebox.fr>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 9bd634b2f5)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18 20:46:29 +03:00
Richard Henderson c283a4bc76 target/arm: Fix vd == vm overlap in sve_ldff1_z
If vd == vm, copy vm to scratch, so that we can pre-zero
the output and still access the gather indicies.

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1612
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230504104232.1877774-1-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit a6771f2f5c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18 17:56:51 +03:00
Eric Blake c0ad2a9191 migration: Attempt disk reactivation in more failure scenarios
Commit fe904ea824 added a fail_inactivate label, which tries to
reactivate disks on the source after a failure while s->state ==
MIGRATION_STATUS_ACTIVE, but didn't actually use the label if
qemu_savevm_state_complete_precopy() failed.  This failure to
reactivate is also present in commit 6039dd5b1c (also covering the new
s->state == MIGRATION_STATUS_DEVICE state) and 403d18ae (ensuring
s->block_inactive is set more reliably).

Consolidate the two labels back into one - no matter HOW migration is
failed, if there is any chance we can reach vm_start() after having
attempted inactivation, it is essential that we have tried to restart
disks before then.  This also makes the cleanup more like
migrate_fd_cancel().

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20230502205212.134680-1-eblake@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 6dab4c93ec)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: minor context tweak near added comment in migration/migration.c)
2023-05-18 16:59:30 +03:00
Eric Blake d2a811dd7d migration: Minor control flow simplification
No need to declare a temporary variable.

Suggested-by: Juan Quintela <quintela@redhat.com>
Fixes: 1df36e8c6289 ("migration: Handle block device inactivation failures better")
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 5d39f44d7a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18 16:57:44 +03:00
Eric Blake cb898262a4 migration: Handle block device inactivation failures better
Consider what happens when performing a migration between two host
machines connected to an NFS server serving multiple block devices to
the guest, when the NFS server becomes unavailable.  The migration
attempts to inactivate all block devices on the source (a necessary
step before the destination can take over); but if the NFS server is
non-responsive, the attempt to inactivate can itself fail.  When that
happens, the destination fails to get the migrated guest (good,
because the source wasn't able to flush everything properly):

  (qemu) qemu-kvm: load of migration failed: Input/output error

at which point, our only hope for the guest is for the source to take
back control.  With the current code base, the host outputs a message, but then appears to resume:

  (qemu) qemu-kvm: qemu_savevm_state_complete_precopy_non_iterable: bdrv_inactivate_all() failed (-1)

  (src qemu)info status
   VM status: running

but a second migration attempt now asserts:

  (src qemu) qemu-kvm: ../block.c:6738: int bdrv_inactivate_recurse(BlockDriverState *): Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed.

Whether the guest is recoverable on the source after the first failure
is debatable, but what we do not want is to have qemu itself fail due
to an assertion.  It looks like the problem is as follows:

In migration.c:migration_completion(), the source sets 'inactivate' to
true (since COLO is not enabled), then tries
savevm.c:qemu_savevm_state_complete_precopy() with a request to
inactivate block devices.  In turn, this calls
block.c:bdrv_inactivate_all(), which fails when flushing runs up
against the non-responsive NFS server.  With savevm failing, we are
now left in a state where some, but not all, of the block devices have
been inactivated; but migration_completion() then jumps to 'fail'
rather than 'fail_invalidate' and skips an attempt to reclaim those
those disks by calling bdrv_activate_all().  Even if we do attempt to
reclaim disks, we aren't taking note of failure there, either.

Thus, we have reached a state where the migration engine has forgotten
all state about whether a block device is inactive, because we did not
set s->block_inactive in enough places; so migration allows the source
to reach vm_start() and resume execution, violating the block layer
invariant that the guest CPUs should not be restarted while a device
is inactive.  Note that the code in migration.c:migrate_fd_cancel()
will also try to reactivate all block devices if s->block_inactive was
set, but because we failed to set that flag after the first failure,
the source assumes it has reclaimed all devices, even though it still
has remaining inactivated devices and does not try again.  Normally,
qmp_cont() will also try to reactivate all disks (or correctly fail if
the disks are not reclaimable because NFS is not yet back up), but the
auto-resumption of the source after a migration failure does not go
through qmp_cont().  And because we have left the block layer in an
inconsistent state with devices still inactivated, the later migration
attempt is hitting the assertion failure.

Since it is important to not resume the source with inactive disks,
this patch marks s->block_inactive before attempting inactivation,
rather than after succeeding, in order to prevent any vm_start() until
it has successfully reactivated all devices.

See also https://bugzilla.redhat.com/show_bug.cgi?id=2058982

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Acked-by: Lukas Straub <lukasstraub2@web.de>
Tested-by: Lukas Straub <lukasstraub2@web.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 403d18ae38)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18 16:57:30 +03:00
Michael Tokarev 45a67df841 linux-user: fix getgroups/setgroups allocations
linux-user getgroups(), setgroups(), getgroups32() and setgroups32()
used alloca() to allocate grouplist arrays, with unchecked gidsetsize
coming from the "guest".  With NGROUPS_MAX being 65536 (linux, and it
is common for an application to allocate NGROUPS_MAX for getgroups()),
this means a typical allocation is half the megabyte on the stack.
Which just overflows stack, which leads to immediate SIGSEGV in actual
system getgroups() implementation.

An example of such issue is aptitude, eg
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=811087#72

Cap gidsetsize to NGROUPS_MAX (return EINVAL if it is larger than that),
and use heap allocation for grouplist instead of alloca().  While at it,
fix coding style and make all 4 implementations identical.

Try to not impose random limits - for example, allow gidsetsize to be
negative for getgroups() - just do not allocate negative-sized grouplist
in this case but still do actual getgroups() call.  But do not allow
negative gidsetsize for setgroups() since its argument is unsigned.

Capping by NGROUPS_MAX seems a bit arbitrary, - we can do more, it is
not an error if set size will be NGROUPS_MAX+1. But we should not allow
integer overflow for the array being allocated. Maybe it is enough to
just call g_try_new() and return ENOMEM if it fails.

Maybe there's also no need to convert setgroups() since this one is
usually smaller and known beforehand (KERN_NGROUPS_MAX is actually 63, -
this is apparently a kernel-imposed limit for runtime group set).

The patch fixes aptitude segfault mentioned above.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <20230409105327.1273372-1-mjt@msgid.tls.msk.ru>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit 1e35d32789)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18 14:44:13 +03:00
Daniil Kovalev 69a6ea7c4b linux-user: Fix mips fp64 executables loading
If a program requires fr1, we should set the FR bit of CP0 control status
register and add F64 hardware flag. The corresponding `else if` branch
statement is copied from the linux kernel sources (see `arch_check_elf` function
in linux/arch/mips/kernel/elf.c).

Signed-off-by: Daniil Kovalev <dkovalev@compiler-toolchain-for.me>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Message-Id: <20230404052153.16617-1-dkovalev@compiler-toolchain-for.me>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit a0f8d2701b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18 14:44:13 +03:00
Alex Bennée 0b1b5a4204 tests/docker: bump the xtensa base to debian:11-slim
Stretch is going out of support so things like security updates will
fail. As the toolchain itself is binary it hopefully won't mind the
underlying OS being updated.

Message-Id: <20230503091244.1450613-3-alex.bennee@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reported-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 3217b84f3c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18 14:44:13 +03:00
Lizhi Yang eb82a80f51 docs/about/emulation: fix typo
Duplicated word "are".

Signed-off-by: Lizhi Yang <sledgeh4w@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230511080119.99018-1-sledgeh4w@gmail.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit c70bb9a771)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-17 12:27:58 +03:00
Claudio Imbrenda 8ad637881f util/async-teardown: wire up query-command-line-options
Add new -run-with option with an async-teardown=on|off parameter. It is
visible in the output of query-command-line-options QMP command, so it
can be discovered and used by libvirt.

The option -async-teardown is now redundant, deprecate it.

Reported-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Fixes: c891c24b1a ("os-posix: asynchronous teardown for shutdown on Linux")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-Id: <20230505120051.36605-2-imbrenda@linux.ibm.com>
[thuth: Add curly braces to fix error with GCC 8.5, fix bug in deprecated.rst]
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 80bd81cadd)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: context tweak in docs/about/deprecated.rst)
2023-05-17 12:27:58 +03:00
Claudio Imbrenda 21b54a683d s390x/pv: Fix spurious warning with asynchronous teardown
Kernel commit 292a7d6fca33 ("KVM: s390: pv: fix asynchronous teardown
for small VMs") causes the KVM_PV_ASYNC_CLEANUP_PREPARE ioctl to fail
if the VM is not larger than 2GiB. QEMU would attempt it and fail,
print an error message, and then proceed with a normal teardown.

Avoid attempting to use asynchronous teardown altogether when the VM is
not larger than 2 GiB. This will avoid triggering the error message and
also avoid pointless overhead; normal teardown is fast enough for small
VMs.

Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Fixes: c3a073c610 ("s390x/pv: Add support for asynchronous teardown for reboot")
Link: https://lore.kernel.org/all/20230421085036.52511-2-imbrenda@linux.ibm.com/
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-Id: <20230510105531.30623-2-imbrenda@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
[thuth: Fix inline function parameter in pv.h]
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 88693ab2a5)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-17 12:27:58 +03:00
Richard Henderson 36cd9bc8e2 tcg/i386: Set P_REXW in tcg_out_addi_ptr
The REXW bit must be set to produce a 64-bit pointer result; the
bit is disabled in 32-bit mode, so we can do this unconditionally.

Fixes: 7d9e1ee424 ("tcg/i386: Adjust assert in tcg_out_addi_ptr")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1592
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1642
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 988998503b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-17 12:27:58 +03:00
Jason Andryuk 117f33c9a7 9pfs/xen: Fix segfault on shutdown
xen_9pfs_free can't use gnttabdev since it is already closed and NULL-ed
out when free is called.  Do the teardown in _disconnect().  This
matches the setup done in _connect().

trace-events are also added for the XenDevOps functions.

Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Message-Id: <20230502143722.15613-1-jandryuk@gmail.com>
[C.S.: - Remove redundant return in xen_9pfs_free().
       - Add comment to trace-events. ]
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
(cherry picked from commit 92e667f6fd)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-17 12:27:58 +03:00
Ilya Leoshkevich e347aa89dd s390x/tcg: Fix LDER instruction format
It's RRE, not RXE.

Found by running valgrind's none/tests/s390x/bfp-2.

Fixes: 86b59624c4 ("s390x/tcg: Implement LOAD LENGTHENED short HFP to long HFP")
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230511134726.469651-1-iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 970641de01)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-17 12:27:58 +03:00
Ilya Leoshkevich b858c53ef6 target/s390x: Fix EXECUTE of relative branches
Fix a problem similar to the one fixed by commit 703d03a4aa
("target/s390x: Fix EXECUTE of relative long instructions"), but now
for relative branches.

Reported-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230426235813.198183-2-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit e8ecdfeb30)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-16 19:08:39 +03:00
Shivaprasad G Bhat 7ceebe3f90 tcg: ppc64: Fix mask generation for vextractdm
In function do_extractm() the mask is calculated as
dup_const(1 << (element_width - 1)). '1' being signed int
works fine for MO_8,16,32. For MO_64, on PPC64 host
this ends up becoming 0 on compilation. The vextractdm
uses MO_64, and it ends up having mask as 0.

Explicitly use 1ULL instead of signed int 1 like its
used everywhere else.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1536
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Lucas Mateus Castro <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <168319292809.1159309.5817546227121323288.stgit@ltc-boston1.aus.stglabs.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
(cherry picked from commit 6a5d81b172)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-15 18:42:50 +03:00
Cédric Le Goater 950882af67 async: Suppress GCC13 false positive in aio_bh_poll()
GCC13 reports an error :

../util/async.c: In function ‘aio_bh_poll’:
include/qemu/queue.h:303:22: error: storing the address of local variable ‘slice’ in ‘*ctx.bh_slice_list.sqh_last’ [-Werror=dangling-pointer=]
  303 |     (head)->sqh_last = &(elm)->field.sqe_next;                          \
      |     ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
../util/async.c:169:5: note: in expansion of macro ‘QSIMPLEQ_INSERT_TAIL’
  169 |     QSIMPLEQ_INSERT_TAIL(&ctx->bh_slice_list, &slice, next);
      |     ^~~~~~~~~~~~~~~~~~~~
../util/async.c:161:17: note: ‘slice’ declared here
  161 |     BHListSlice slice;
      |                 ^~~~~
../util/async.c:161:17: note: ‘ctx’ declared here

But the local variable 'slice' is removed from the global context list
in following loop of the same routine. Add a pragma to silent GCC.

Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20230420202939.1982044-1-clg@kaod.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d66ba6dc1c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: cherry-picked to stable-8.0 to eliminate CI failures on win*)
2023-05-15 18:03:51 +03:00
Peter Maydell e09f912550 target/arm: Correct AArch64.S2MinTxSZ 32-bit EL1 input size check
In check_s2_mmu_setup() we have a check that is attempting to
implement the part of AArch64.S2MinTxSZ that is specific to when EL1
is AArch32:

    if !s1aarch64 then
        // EL1 is AArch32
        min_txsz = Min(min_txsz, 24);

Unfortunately we got this wrong in two ways:

(1) The minimum txsz corresponds to a maximum inputsize, but we got
the sense of the comparison wrong and were faulting for all
inputsizes less than 40 bits

(2) We try to implement this as an extra check that happens after
we've done the same txsz checks we would do for an AArch64 EL1, but
in fact the pseudocode is *loosening* the requirements, so that txsz
values that would fault for an AArch64 EL1 do not fault for AArch32
EL1, because it does Min(old_min, 24), not Max(old_min, 24).

You can see this also in the text of the Arm ARM in table D8-8, which
shows that where the implemented PA size is less than 40 bits an
AArch32 EL1 is still OK with a configured stage2 T0SZ for a 40 bit
IPA, whereas if EL1 is AArch64 then the T0SZ must be big enough to
constrain the IPA to the implemented PA size.

Because of part (2), we can't do this as a separate check, but
have to integrate it into aa64_va_parameters(). Add a new argument
to that function to indicate that EL1 is 32-bit. All the existing
callsites except the one in get_phys_addr_lpae() can pass 'false',
because they are either doing a lookup for a stage 1 regime or
else they don't care about the tsz/tsz_oob fields.

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1627
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230509092059.3176487-1-peter.maydell@linaro.org
(cherry picked from commit 478dccbb99)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-14 11:25:57 +03:00
Peter Maydell 80a2c1b5fe ui: Fix pixel colour channel order for PNG screenshots
When we take a PNG screenshot the ordering of the colour channels in
the data is not correct, resulting in the image having weird
colouring compared to the actual display.  (Specifically, on a
little-endian host the blue and red channels are swapped; on
big-endian everything is wrong.)

This happens because the pixman idea of the pixel data and the libpng
idea differ.  PIXMAN_a8r8g8b8 defines that pixels are 32-bit values,
with A in bits 24-31, R in bits 16-23, G in bits 8-15 and B in bits
0-7.  This means that on little-endian systems the bytes in memory
are
   B G R A
and on big-endian systems they are
   A R G B

libpng, on the other hand, thinks of pixels as being a series of
values for each channel, so its format PNG_COLOR_TYPE_RGB_ALPHA
always wants bytes in the order
   R G B A

This isn't the same as the pixman order for either big or little
endian hosts.

The alpha channel is also unnecessary bulk in the output PNG file,
because there is no alpha information in a screenshot.

To handle the endianness issue, we already define in ui/qemu-pixman.h
various PIXMAN_BE_* and PIXMAN_LE_* values that give consistent
byte-order pixel channel formats.  So we can use PIXMAN_BE_r8g8b8 and
PNG_COLOR_TYPE_RGB, which both have an in-memory byte order of
    R G B
and 3 bytes per pixel.

(PPM format screenshots get this right; they already use the
PIXMAN_BE_r8g8b8 format.)

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1622
Fixes: 9a0a119a38 ("Added parameter to take screenshot with screendump as PNG")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20230502135548.2451309-1-peter.maydell@linaro.org
(cherry picked from commit cd22a0f520)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-14 11:24:05 +03:00
Peter Maydell 3148fe1ac8 target/arm: Fix handling of SW and NSW bits for stage 2 walks
We currently don't correctly handle the VSTCR_EL2.SW and VTCR_EL2.NSW
configuration bits.  These allow configuration of whether the stage 2
page table walks for Secure IPA and NonSecure IPA should do their
descriptor reads from Secure or NonSecure physical addresses. (This
is separate from how the translation table base address and other
parameters are set: an NS IPA always uses VTTBR_EL2 and VTCR_EL2
for its base address and walk parameters, regardless of the NSW bit,
and similarly for Secure.)

Provide a new function ptw_idx_for_stage_2() which returns the
MMU index to use for descriptor reads, and use it to set up
the .in_ptw_idx wherever we call get_phys_addr_lpae().

For a stage 2 walk, wherever we call get_phys_addr_lpae():
 * .in_ptw_idx should be ptw_idx_for_stage_2() of the .in_mmu_idx
 * .in_secure should be true if .in_mmu_idx is Stage2_S

This allows us to correct S1_ptw_translate() so that it consistently
always sets its (out_secure, out_phys) to the result it gets from the
S2 walk (either by calling get_phys_addr_lpae() or by TLB lookup).
This makes better conceptual sense because the S2 walk should return
us an (address space, address) tuple, not an address that we then
randomly assign to S or NS.

Our previous handling of SW and NSW was broken, so guest code
trying to use these bits to put the s2 page tables in the "other"
address space wouldn't work correctly.

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1600
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230504135425.2748672-3-peter.maydell@linaro.org
(cherry picked from commit fcc0b0418f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-14 11:23:49 +03:00
Richard Henderson 4b59b5bd14 accel/tcg: Fix atomic_mmu_lookup for reads
A copy-paste bug had us looking at the victim cache for writes.

Cc: qemu-stable@nongnu.org
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Fixes: 08dff435e2 ("tcg: Probe the proper permissions for atomic ops")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230505204049.352469-1-richard.henderson@linaro.org>
(cherry picked from commit 8c313254e6)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-11 15:40:16 +03:00
Jonathan Cameron 488ad8b302 hw/pci-bridge: pci_expander_bridge fix type in pxb_cxl_dev_reset()
Reproduce issue with

configure --enable-qom-cast-debug ...

qemu-system-x86_64 -display none -machine q35,cxl=on -device pxb-cxl,bus=pcie.0

  hw/pci-bridge/pci_expander_bridge.c:54:PXB_DEV: Object 0x5570e0b1ada0 is not an instance of type pxb
  Aborted

The type conversion results in the right state structure, but PXB_DEV is
not a parent of PXB_CXL_DEV hence the error. Rather than directly
cleaning up the inheritance, this is the minimal fix which will be
followed by the cleanup.

Fixes: 154070eaf6 ("hw/pxb-cxl: Support passthrough HDM Decoders unless overridden")
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20230420142750.6950-2-Jonathan.Cameron@huawei.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 9136f661c7)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-10 20:50:38 +03:00
Bin Meng f91d0db71e target/riscv: Restore the predicate() NULL check behavior
When reading a non-existent CSR QEMU should raise illegal instruction
exception, but currently it just exits due to the g_assert() check.

This actually reverts commit 0ee342256a.
Some comments are also added to indicate that predicate() must be
provided for an implemented CSR.

Reported-by: Fei Wu <fei2.wu@intel.com>
Signed-off-by: Bin Meng <bmeng@tinylab.org>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Message-Id: <20230417043054.3125614-1-bmeng@tinylab.org>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit eae04c4c13)
(mjt: context edit after ce3af0bbbc "target/riscv: add support for Zcmt extension")
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-10 20:50:38 +03:00
LIU Zhiwei f5301431e8 target/riscv: Fix itrigger when icount is used
When I boot a ubuntu image, QEMU output a "Bad icount read" message and exit.
The reason is that when execute helper_mret or helper_sret, it will
cause a call to icount_get_raw_locked (), which needs set can_do_io flag
on cpustate.

Thus we setting this flag when execute these two instructions.

Signed-off-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230324064011.976-1-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit df3ac6da47)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-10 20:50:38 +03:00
Kevin Wolf 3b02d0db4a block: Don't call no_coroutine_fns in qmp_block_resize()
This QMP handler runs in a coroutine, so it must use the corresponding
no_co_wrappers instead.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2185688
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230504115750.54437-5-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 0c7d204f50)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-10 20:50:38 +03:00
Kevin Wolf e0deae4f49 block: bdrv/blk_co_unref() for calls in coroutine context
These functions must not be called in coroutine context, because they
need write access to the graph.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230504115750.54437-4-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit b2ab5f545f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-10 20:50:38 +03:00
Kevin Wolf 38a598aee3 block: Consistently call bdrv_activate() outside coroutine
Migration code can call bdrv_activate() in coroutine context, whereas
other callers call it outside of coroutines. As it calls other code that
is not supposed to run in coroutines, standardise on running outside of
coroutines.

This adds a no_co_wrapper to switch to the main loop before calling
bdrv_activate().

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230504115750.54437-3-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit da4afaff07)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-10 20:50:38 +03:00
Kevin Wolf 2197a94cb4 block: Fix use after free in blockdev_mark_auto_del()
job_cancel_locked() drops the job list lock temporarily and it may call
aio_poll(). We must assume that the list has changed after this call.
Also, with unlucky timing, it can end up freeing the job during
job_completed_txn_abort_locked(), making the job pointer invalid, too.

For both reasons, we can't just continue at block_job_next_locked(job).
Instead, start at the head of the list again after job_cancel_locked()
and skip those jobs that we already cancelled (or that are completing
anyway).

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230503140142.474404-1-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit e2626874a3)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-10 20:50:38 +03:00
Paolo Bonzini 8322e5300f meson: leave unnecessary modules out of the build
meson.build files choose whether to build modules based on foo.found()
expressions.  If a feature is enabled (e.g. --enable-gtk), these expressions
are true even if the code is not used by any emulator, and this results
in an unexpected difference between modular and non-modular builds.

For non-modular builds, the files are not included in any binary, and
therefore the source files are never processed.  For modular builds,
however, all .so files are unconditionally built by default, and therefore
a normal "make" tries to build them.  However, the corresponding trace-*.h
files are absent due to this conditional:

if have_system
  trace_events_subdirs += [
    ...
    'ui',
    ...
  ]
endif

which was added to avoid wasting time running tracetool on unused trace-events
files.  This causes a compilation failure; fix it by skipping module builds
entirely if (depending on the module directory) have_block or have_system
are false.

Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit ef709860ea)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-10 20:50:38 +03:00
Shivaprasad G Bhat 4dc5df865c softfloat: Fix the incorrect computation in float32_exp2
The float32_exp2 function is computing wrong exponent of 2.

For example, with the following set of values {0.1, 2.0, 2.0, -1.0},
the expected output would be {1.071773, 4.000000, 4.000000, 0.500000}.
Instead, the function is computing {1.119102, 3.382044, 3.382044, -0.191022}

Looking at the code, the float32_exp2() attempts to do this

                  2     3     4     5           n
  x        x     x     x     x     x           x
 e  = 1 + --- + --- + --- + --- + --- + ... + --- + ...
           1!    2!    3!    4!    5!          n!

But because of the typo it ends up doing

  x        x     x     x     x     x           x
 e  = 1 + --- + --- + --- + --- + --- + ... + --- + ...
           1!    2!    3!    4!    5!          n!

This is because instead of the xnp which holds the numerator, parts_muladd
is using the xp which is just 'x'.  Commit '572c4d862ff2' refactored this
function, and mistakenly used xp instead of xnp.

Cc: qemu-stable@nongnu.org
Fixes: 572c4d862f "softfloat: Convert float32_exp2 to FloatParts"
Partially-Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1623
Reported-By: Luca Barbato (https://gitlab.com/lu-zero)
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Message-Id: <168304110865.537992.13059030916325018670.stgit@localhost.localdomain>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 1098cc3fcf)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-10 20:50:38 +03:00
Stefan Hajnoczi a458252c16 block/export: call blk_set_dev_ops(blk, NULL, NULL)
Most export types install BlockDeviceOps pointers. It is easy to forget
to remove them because that happens automatically via the "drive" qdev
property in hw/ but not block/export/.

Put blk_set_dev_ops(blk, NULL, NULL) calls in the core export.c code so
the export types don't need to remember.

This fixes the nbd and vhost-user-blk export types.

Fixes: fd6afc501a ("nbd/server: Use drained block ops to quiesce the server")
Fixes: ca858a5fe9 ("vhost-user-blk-server: notify client about disk resize")
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20230502211119.720647-1-stefanha@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit de79b52604)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-10 20:50:38 +03:00
Peter Maydell f6227dd60d hw/net/allwinner-sun8i-emac: Correctly byteswap descriptor fields
In allwinner-sun8i-emac we just read directly from guest memory into
a host FrameDescriptor struct and back.  This only works on
little-endian hosts.  Reading and writing of descriptors is already
abstracted into functions; make those functions also handle the
byte-swapping so that TransferDescriptor structs as seen by the rest
of the code are always in host-order, and fix two places that were
doing ad-hoc descriptor reading without using the functions.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230424165053.1428857-3-peter.maydell@linaro.org
(cherry picked from commit a4ae17e5ec)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-10 20:50:38 +03:00
Peter Maydell 2daa9e4d7e hw/sd/allwinner-sdhost: Correctly byteswap descriptor fields
In allwinner_sdhost_process_desc() we just read directly from
guest memory into a host TransferDescriptor struct and back.
This only works on little-endian hosts. Abstract the reading
and writing of descriptors into functions that handle the
byte-swapping so that TransferDescriptor structs as seen by
the rest of the code are always in host-order.

This fixes a failure of one of the avocado tests on s390.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230424165053.1428857-2-peter.maydell@linaro.org
(cherry picked from commit 3e20d90824)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-10 20:50:38 +03:00
Peter Maydell 6944823a6f target/arm: Define and use new load_cpu_field_low32()
In several places in the 32-bit Arm translate.c, we try to use
load_cpu_field() to load from a CPUARMState field into a TCGv_i32
where the field is actually 64-bit. This works on little-endian
hosts, but gives the wrong half of the register on big-endian.

Add a new load_cpu_field_low32() which loads the low 32 bits
of a 64-bit field into a TCGv_i32. The new macro includes a
compile-time check against accidentally using it on a field
of the wrong size. Use it to fix the two places in the code
where we were using load_cpu_field() on a 64-bit field.

This fixes a bug where on big-endian hosts the guest would
crash after executing an ERET instruction, and a more corner
case one where some UNDEFs for attempted accesses to MSR
banked registers from Secure EL1 might go to the wrong EL.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230424153909.1419369-2-peter.maydell@linaro.org
(cherry picked from commit 7f3a3d3dc4)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-10 20:50:38 +03:00
Peter Maydell af08c70ef5 hw/intc/allwinner-a10-pic: Don't use set_bit()/clear_bit()
The Allwinner PIC model uses set_bit() and clear_bit() to update the
values in its irq_pending[] array when an interrupt arrives.  However
it is using these functions wrongly: they work on an array of type
'long', and it is passing an array of type 'uint32_t'.  Because the
code manually figures out the right array element, this works on
little-endian hosts and on 32-bit big-endian hosts, where bits 0..31
in a 'long' are in the same place as they are in a 'uint32_t'.
However it breaks on 64-bit big-endian hosts.

Remove the use of set_bit() and clear_bit() in favour of using
deposit32() on the array element.  This fixes a bug where on
big-endian 64-bit hosts the guest kernel would hang early on in
bootup.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230424152833.1334136-1-peter.maydell@linaro.org
(cherry picked from commit 2c5fa0778c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-10 20:50:38 +03:00
Peter Maydell 975f12aa52 hw/arm/raspi: Use arm_write_bootloader() to write boot code
When writing the secondary-CPU stub boot loader code to the guest,
use arm_write_bootloader() instead of directly calling
rom_add_blob_fixed().  This fixes a bug on big-endian hosts, because
arm_write_bootloader() will correctly byte-swap the host-byte-order
array values into the guest-byte-order to write into the guest
memory.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230424152717.1333930-4-peter.maydell@linaro.org
(cherry picked from commit 0acbdb4c4a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-10 20:50:38 +03:00
Cédric Le Goater 5477a21350 hw/arm/aspeed: Use arm_write_bootloader() to write the bootloader
When writing the secondary-CPU stub boot loader code to the guest,
use arm_write_bootloader() instead of directly calling
rom_add_blob_fixed().  This fixes a bug on big-endian hosts, because
arm_write_bootloader() will correctly byte-swap the host-byte-order
array values into the guest-byte-order to write into the guest
memory.

Cc: qemu-stable@nongnu.org
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230424152717.1333930-3-peter.maydell@linaro.org
[PMM: Moved the "make arm_write_bootloader() function public" part
 to its own patch; updated commit message to note that this fixes
 an actual bug; adjust to the API changes noted in previous commit]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 902bba549f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-10 20:50:38 +03:00
Cédric Le Goater 168f193c5b hw/arm/boot: Make write_bootloader() public as arm_write_bootloader()
The arm boot.c code includes a utility function write_bootloader()
which assists in writing a boot-code fragment into guest memory,
including handling endianness and fixing it up with entry point
addresses and similar things.  This is useful not just for the boot.c
code but also in board model code, so rename it to
arm_write_bootloader() and make it globally visible.

Since we are making it public, make its API a little neater: move the
AddressSpace* argument to be next to the hwaddr argument, and allow
the fixupcontext array to be const, since we never modify it in this
function.

Cc: qemu-stable@nongnu.org
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20230424152717.1333930-2-peter.maydell@linaro.org
[PMM: Split out from another patch by Cédric, added doc comment]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 0fe43f0abf)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-10 20:50:38 +03:00
Peter Maydell 61ef050639 hw/net/msf2-emac: Don't modify descriptor in-place in emac_store_desc()
The msf2-emac ethernet controller has functions emac_load_desc() and
emac_store_desc() which read and write the in-memory descriptor
blocks and handle conversion between guest and host endianness.

As currently written, emac_store_desc() does the endianness
conversion in-place; this means that it effectively consumes the
input EmacDesc struct, because on a big-endian host the fields will
be overwritten with the little-endian versions of their values.
Unfortunately, in all the callsites the code continues to access
fields in the EmacDesc struct after it has called emac_store_desc()
-- specifically, it looks at the d.next field.

The effect of this is that on a big-endian host networking doesn't
work because the address of the next descriptor is corrupted.

We could fix this by making the callsite avoid using the struct; but
it's more robust to have emac_store_desc() leave its input alone.

(emac_load_desc() also does an in-place conversion, but here this is
fine, because the function is supposed to be initializing the
struct.)

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 20230424151919.1333299-1-peter.maydell@linaro.org
(cherry picked from commit d565f58b38)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-10 20:50:38 +03:00
Akihiko Odaki f0c5a78029 target/arm: Initialize debug capabilities only once
kvm_arm_init_debug() used to be called several times on a SMP system as
kvm_arch_init_vcpu() calls it. Move the call to kvm_arch_init() to make
sure it will be called only once; otherwise it will overwrite pointers
to memory allocated with the previous call and leak it.

Fixes: e4482ab7e3 ("target-arm: kvm - add support for HW assisted debug")
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-id: 20230405153644.25300-1-akihiko.odaki@daynix.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit ad5c6ddea3)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-10 20:50:38 +03:00
Peter Maydell 9448a0fa11 docs/about/deprecated.rst: Add "since 7.1" tag to dtb-kaslr-seed deprecation
In commit 5242876f37 we deprecated the dtb-kaslr-seed property of
the virt board, but forgot the "since n.n" tag in the documentation
of this in deprecated.rst.

This deprecation note first appeared in the 7.1 release, so
retrospectively add the correct "since 7.1" annotation to it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20230420122256.1023709-1-peter.maydell@linaro.org
(cherry picked from commit ac64ebbecf)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-10 20:50:38 +03:00
Alex Bennée 8c3cf36260 qemu-options: finesse the recommendations around -blockdev
We are a bit premature in recommending -blockdev/-device as the best
way to configure block devices. It seems there are times the more
human friendly -drive still makes sense especially when -snapshot is
involved.

Improve the language to hopefully make things clearer.

Suggested-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230424092249.58552-7-alex.bennee@linaro.org>
(cherry picked from commit c1654c3e37)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-10 20:50:38 +03:00
Wang Liang f528cfc3fa block/monitor: Fix crash when executing HMP commit
hmp_commit() calls blk_is_available() from a non-coroutine context (and
in the main loop). blk_is_available() is a co_wrapper_mixed_bdrv_rdlock
function, and in the non-coroutine context it calls AIO_WAIT_WHILE(),
which crashes if the aio_context lock is not taken before.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1615
Signed-off-by: Wang Liang <wangliangzz@inspur.com>
Message-Id: <20230424103902.45265-1-wangliangzz@126.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 8c1e8fb2e7)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-10 20:50:38 +03:00
Igor Mammedov bb47b5bc2e acpi: pcihp: allow repeating hot-unplug requests
with Q35 using ACPI PCI hotplug by default, user's request to unplug
device is ignored when it's issued before guest OS has been booted.
And any additional attempt to request device hot-unplug afterwards
results in following error:

  "Device XYZ is already in the process of unplug"

arguably it can be considered as a regression introduced by [2],
before which it was possible to issue unplug request multiple
times.

Accept new uplug requests after timeout (1ms). This brings ACPI PCI
hotplug on par with native PCIe unplug behavior [1] and allows user
to repeat unplug requests at propper times.
Set expire timeout to arbitrary 1msec so user won't be able to
flood guest with SCI interrupts by calling device_del in tight loop.

PS:
ACPI spec doesn't mandate what OSPM can do with GPEx.status
bits set before it's booted => it's impl. depended.
Status bits may be retained (I tested with one Windows version)
or cleared (Linux since 2.6 kernel times) during guest's ACPI
subsystem initialization.
Clearing status bits (though not wrong per se) hides the unplug
event from guest, and it's upto user to repeat device_del later
when guest is able to handle unplug requests.

1) 18416c62e3 ("pcie: expire pending delete")
2)
Fixes: cce8944cc9 ("qdev-monitor: Forbid repeated device_del")
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
CC: mst@redhat.com
CC: anisinha@redhat.com
CC: jusual@redhat.com
CC: kraxel@redhat.com
Message-Id: <20230418090449.2155757-1-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
(cherry picked from commit 0f689cf5ad)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-10 20:50:38 +03:00
Axel Heider 134a1a3320 hw/timer/imx_epit: fix limit check
Fix the limit check. If the limit is less than the compare value,
the timer can never reach this value, thus it will never fire.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1491
Signed-off-by: Axel Heider <axel.heider@hensoldt.net>
Message-id: 168070611775.20412.2883242077302841473-2@git.sr.ht
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 25d758175d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-04-27 08:51:16 +03:00
Axel Heider ac7f07ebc8 hw/timer/imx_epit: don't shadow variable
Fix issue reported by Coverity.

Signed-off-by: Axel Heider <axel.heider@hensoldt.net>
Message-id: 168070611775.20412.2883242077302841473-1@git.sr.ht
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 542fd43d79)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-04-27 08:51:16 +03:00
Yang Zhong 3ed99d232c target/i386: Change wrong XFRM value in SGX CPUID leaf
The previous patch wrongly replaced FEAT_XSAVE_XCR0_{LO|HI} with
FEAT_XSAVE_XSS_{LO|HI} in CPUID(EAX=12,ECX=1):{ECX,EDX}.  As a result,
SGX enclaves only supported SSE and x87 feature (xfrm=0x3).

Fixes: 301e90675c ("target/i386: Enable support for XSAVES based features")
Signed-off-by: Yang Zhong <yang.zhong@linux.intel.com>
Reviewed-by: Yang Weijiang <weijiang.yang@intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Message-Id: <20230406064041.420039-1-yang.zhong@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 72497cff89)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-04-27 08:51:16 +03:00
Paolo Bonzini 6f7b9f7b6f vnc: avoid underflow when accessing user-provided address
If hostlen is zero, there is a possibility that addrstr[hostlen - 1]
underflows and, if a closing bracked is there, hostlen - 2 is passed
to g_strndup() on the next line.  If websocket==false then
addrstr[0] would be a colon, but if websocket==true this could in
principle happen.

Fix it by checking hostlen.

Reported by Coverity.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 3f9c41c5df)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-04-26 18:17:51 +03:00
302 changed files with 3414 additions and 1341 deletions
+3 -3
View File
@@ -102,8 +102,8 @@ crash-test-debian:
IMAGE: debian-amd64
script:
- cd build
- make check-venv
- tests/venv/bin/python3 scripts/device-crash-test -q ./qemu-system-i386
- make NINJA=":" check-venv
- tests/venv/bin/python3 scripts/device-crash-test -q --tcg-only ./qemu-system-i386
build-system-fedora:
extends:
@@ -145,7 +145,7 @@ crash-test-fedora:
IMAGE: fedora
script:
- cd build
- make check-venv
- make NINJA=":" check-venv
- tests/venv/bin/python3 scripts/device-crash-test -q ./qemu-system-ppc
- tests/venv/bin/python3 scripts/device-crash-test -q ./qemu-system-riscv32
+1 -1
View File
@@ -1 +1 @@
8.0.0
8.0.5
+3 -1
View File
@@ -2382,7 +2382,7 @@ static int kvm_init(MachineState *ms)
KVMState *s;
const KVMCapabilityInfo *missing_cap;
int ret;
int type = 0;
int type;
uint64_t dirty_log_manual_caps;
qemu_mutex_init(&kml_slots_lock);
@@ -2447,6 +2447,8 @@ static int kvm_init(MachineState *ms)
type = mc->kvm_type(ms, kvm_type);
} else if (mc->kvm_type) {
type = mc->kvm_type(ms, NULL);
} else {
type = kvm_arch_get_default_type(ms);
}
do {
+1 -1
View File
@@ -1830,7 +1830,7 @@ static void *atomic_mmu_lookup(CPUArchState *env, target_ulong addr,
} else /* if (prot & PAGE_READ) */ {
tlb_addr = tlbe->addr_read;
if (!tlb_hit(tlb_addr, addr)) {
if (!VICTIM_TLB_HIT(addr_write, addr)) {
if (!VICTIM_TLB_HIT(addr_read, addr)) {
tlb_fill(env_cpu(env), addr, size,
MMU_DATA_LOAD, mmu_idx, retaddr);
index = tlb_index(env, mmu_idx, addr);
+9 -4
View File
@@ -1093,6 +1093,9 @@ tb_invalidate_phys_page_range__locked(struct page_collection *pages,
TranslationBlock *current_tb = retaddr ? tcg_tb_lookup(retaddr) : NULL;
#endif /* TARGET_HAS_PRECISE_SMC */
/* Range may not cross a page. */
tcg_debug_assert(((start ^ last) & TARGET_PAGE_MASK) == 0);
/*
* We remove all the TBs in the range [start, last].
* XXX: see if in some cases it could be faster to invalidate all the code
@@ -1183,15 +1186,17 @@ void tb_invalidate_phys_range(tb_page_addr_t start, tb_page_addr_t last)
index_last = last >> TARGET_PAGE_BITS;
for (index = start >> TARGET_PAGE_BITS; index <= index_last; index++) {
PageDesc *pd = page_find(index);
tb_page_addr_t bound;
tb_page_addr_t page_start, page_last;
if (pd == NULL) {
continue;
}
assert_page_locked(pd);
bound = (index << TARGET_PAGE_BITS) | ~TARGET_PAGE_MASK;
bound = MIN(bound, last);
tb_invalidate_phys_page_range__locked(pages, pd, start, bound, 0);
page_start = index << TARGET_PAGE_BITS;
page_last = page_start | ~TARGET_PAGE_MASK;
page_last = MIN(page_last, last);
tb_invalidate_phys_page_range__locked(pages, pd,
page_start, page_last, 0);
}
page_collection_unlock(pages);
}
+10
View File
@@ -191,6 +191,11 @@ static int cryptodev_backend_account(CryptoDevBackend *backend,
if (algtype == QCRYPTODEV_BACKEND_ALG_ASYM) {
CryptoDevBackendAsymOpInfo *asym_op_info = op_info->u.asym_op_info;
len = asym_op_info->src_len;
if (unlikely(!backend->asym_stat)) {
error_report("cryptodev: Unexpected asym operation");
return -VIRTIO_CRYPTO_NOTSUPP;
}
switch (op_info->op_code) {
case VIRTIO_CRYPTO_AKCIPHER_ENCRYPT:
CryptodevAsymStatIncEncrypt(backend, len);
@@ -210,6 +215,11 @@ static int cryptodev_backend_account(CryptoDevBackend *backend,
} else if (algtype == QCRYPTODEV_BACKEND_ALG_SYM) {
CryptoDevBackendSymOpInfo *sym_op_info = op_info->u.sym_op_info;
len = sym_op_info->src_len;
if (unlikely(!backend->sym_stat)) {
error_report("cryptodev: Unexpected sym operation");
return -VIRTIO_CRYPTO_NOTSUPP;
}
switch (op_info->op_code) {
case VIRTIO_CRYPTO_CIPHER_ENCRYPT:
CryptodevSymStatIncEncrypt(backend, len);
+2 -9
View File
@@ -112,12 +112,8 @@ static int tpm_util_request(int fd,
void *response,
size_t responselen)
{
fd_set readfds;
GPollFD fds[1] = { {.fd = fd, .events = G_IO_IN } };
int n;
struct timeval tv = {
.tv_sec = 1,
.tv_usec = 0,
};
n = write(fd, request, requestlen);
if (n < 0) {
@@ -127,11 +123,8 @@ static int tpm_util_request(int fd,
return -EFAULT;
}
FD_ZERO(&readfds);
FD_SET(fd, &readfds);
/* wait for a second */
n = select(fd + 1, &readfds, NULL, NULL, &tv);
n = RETRY_ON_EINTR(g_poll(fds, 1, 1000));
if (n != 1) {
return -errno;
}
+1 -1
View File
@@ -680,7 +680,7 @@ int coroutine_fn bdrv_co_create_opts_simple(BlockDriver *drv,
ret = 0;
out:
blk_unref(blk);
blk_co_unref(blk);
return ret;
}
+9 -1
View File
@@ -2018,7 +2018,15 @@ void blk_activate(BlockBackend *blk, Error **errp)
return;
}
bdrv_activate(bs, errp);
/*
* Migration code can call this function in coroutine context, so leave
* coroutine context if necessary.
*/
if (qemu_in_coroutine()) {
bdrv_co_activate(bs, errp);
} else {
bdrv_activate(bs, errp);
}
}
bool coroutine_fn blk_co_is_inserted(BlockBackend *blk)
+3 -3
View File
@@ -355,7 +355,7 @@ block_crypto_co_create_generic(BlockDriverState *bs, int64_t size,
ret = 0;
cleanup:
qcrypto_block_free(crypto);
blk_unref(blk);
blk_co_unref(blk);
return ret;
}
@@ -661,7 +661,7 @@ block_crypto_co_create_luks(BlockdevCreateOptions *create_options, Error **errp)
ret = 0;
fail:
bdrv_unref(bs);
bdrv_co_unref(bs);
return ret;
}
@@ -730,7 +730,7 @@ fail:
bdrv_co_delete_file_noerr(bs);
}
bdrv_unref(bs);
bdrv_co_unref(bs);
qapi_free_QCryptoBlockCreateOptions(create_opts);
qobject_unref(cryptoopts);
return ret;
+5 -1
View File
@@ -192,7 +192,10 @@ BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp)
return exp;
fail:
blk_unref(blk);
if (blk) {
blk_set_dev_ops(blk, NULL, NULL);
blk_unref(blk);
}
aio_context_release(ctx);
if (exp) {
g_free(exp->id);
@@ -219,6 +222,7 @@ static void blk_exp_delete_bh(void *opaque)
assert(exp->refcount == 0);
QLIST_REMOVE(exp, next);
exp->drv->delete(exp);
blk_set_dev_ops(exp->blk, NULL, NULL);
blk_unref(exp->blk);
qapi_event_send_block_export_deleted(exp->id);
g_free(exp->id);
-1
View File
@@ -346,7 +346,6 @@ static void vduse_blk_exp_delete(BlockExport *exp)
blk_remove_aio_context_notifier(exp->blk, blk_aio_attached, blk_aio_detach,
vblk_exp);
blk_set_dev_ops(exp->blk, NULL, NULL);
ret = vduse_dev_destroy(vblk_exp->dev);
if (ret != -EBUSY) {
unlink(vblk_exp->recon_file);
+27
View File
@@ -30,8 +30,10 @@ BdrvGraphLock graph_lock;
/* Protects the list of aiocontext and orphaned_reader_count */
static QemuMutex aio_context_list_lock;
#if 0
/* Written and read with atomic operations. */
static int has_writer;
#endif
/*
* A reader coroutine could move from an AioContext to another.
@@ -88,6 +90,7 @@ void unregister_aiocontext(AioContext *ctx)
g_free(ctx->bdrv_graph);
}
#if 0
static uint32_t reader_count(void)
{
BdrvGraphRWlock *brdv_graph;
@@ -105,10 +108,17 @@ static uint32_t reader_count(void)
assert((int32_t)rd >= 0);
return rd;
}
#endif
void bdrv_graph_wrlock(void)
{
GLOBAL_STATE_CODE();
/*
* TODO Some callers hold an AioContext lock when this is called, which
* causes deadlocks. Reenable once the AioContext locking is cleaned up (or
* AioContext locks are gone).
*/
#if 0
assert(!qatomic_read(&has_writer));
/* Make sure that constantly arriving new I/O doesn't cause starvation */
@@ -139,11 +149,13 @@ void bdrv_graph_wrlock(void)
} while (reader_count() >= 1);
bdrv_drain_all_end();
#endif
}
void bdrv_graph_wrunlock(void)
{
GLOBAL_STATE_CODE();
#if 0
QEMU_LOCK_GUARD(&aio_context_list_lock);
assert(qatomic_read(&has_writer));
@@ -155,10 +167,13 @@ void bdrv_graph_wrunlock(void)
/* Wake up all coroutine that are waiting to read the graph */
qemu_co_enter_all(&reader_queue, &aio_context_list_lock);
#endif
}
void coroutine_fn bdrv_graph_co_rdlock(void)
{
/* TODO Reenable when wrlock is reenabled */
#if 0
BdrvGraphRWlock *bdrv_graph;
bdrv_graph = qemu_get_current_aio_context()->bdrv_graph;
@@ -223,10 +238,12 @@ void coroutine_fn bdrv_graph_co_rdlock(void)
qemu_co_queue_wait(&reader_queue, &aio_context_list_lock);
}
}
#endif
}
void coroutine_fn bdrv_graph_co_rdunlock(void)
{
#if 0
BdrvGraphRWlock *bdrv_graph;
bdrv_graph = qemu_get_current_aio_context()->bdrv_graph;
@@ -249,6 +266,7 @@ void coroutine_fn bdrv_graph_co_rdunlock(void)
if (qatomic_read(&has_writer)) {
aio_wait_kick();
}
#endif
}
void bdrv_graph_rdlock_main_loop(void)
@@ -265,11 +283,20 @@ void bdrv_graph_rdunlock_main_loop(void)
void assert_bdrv_graph_readable(void)
{
/* reader_count() is slow due to aio_context_list_lock lock contention */
/* TODO Reenable when wrlock is reenabled */
#if 0
#ifdef CONFIG_DEBUG_GRAPH_LOCK
assert(qemu_in_main_thread() || reader_count());
#endif
#endif
}
void assert_bdrv_graph_writable(void)
{
assert(qemu_in_main_thread());
/* TODO Reenable when wrlock is reenabled */
#if 0
assert(qatomic_read(&has_writer));
#endif
}
+6 -4
View File
@@ -214,15 +214,17 @@ void hmp_commit(Monitor *mon, const QDict *qdict)
error_report("Device '%s' not found", device);
return;
}
if (!blk_is_available(blk)) {
error_report("Device '%s' has no medium", device);
return;
}
bs = bdrv_skip_implicit_filters(blk_bs(blk));
aio_context = bdrv_get_aio_context(bs);
aio_context_acquire(aio_context);
if (!blk_is_available(blk)) {
error_report("Device '%s' has no medium", device);
aio_context_release(aio_context);
return;
}
ret = bdrv_commit(bs);
aio_context_release(aio_context);
+3 -3
View File
@@ -613,8 +613,8 @@ static int coroutine_fn parallels_co_create(BlockdevCreateOptions* opts,
ret = 0;
out:
blk_unref(blk);
bdrv_unref(bs);
blk_co_unref(blk);
bdrv_co_unref(bs);
return ret;
exit:
@@ -691,7 +691,7 @@ parallels_co_create_opts(BlockDriver *drv, const char *filename,
done:
qobject_unref(qdict);
bdrv_unref(bs);
bdrv_co_unref(bs);
qapi_free_BlockdevCreateOptions(create_options);
return ret;
}
+3 -3
View File
@@ -915,8 +915,8 @@ static int coroutine_fn qcow_co_create(BlockdevCreateOptions *opts,
g_free(tmp);
ret = 0;
exit:
blk_unref(qcow_blk);
bdrv_unref(bs);
blk_co_unref(qcow_blk);
bdrv_co_unref(bs);
qcrypto_block_free(crypto);
return ret;
}
@@ -1015,7 +1015,7 @@ qcow_co_create_opts(BlockDriver *drv, const char *filename,
fail:
g_free(backing_fmt);
qobject_unref(qdict);
bdrv_unref(bs);
bdrv_co_unref(bs);
qapi_free_BlockdevCreateOptions(create_options);
return ret;
}
+7 -7
View File
@@ -3705,7 +3705,7 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
goto out;
}
blk_unref(blk);
blk_co_unref(blk);
blk = NULL;
/*
@@ -3785,7 +3785,7 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
}
}
blk_unref(blk);
blk_co_unref(blk);
blk = NULL;
/* Reopen the image without BDRV_O_NO_FLUSH to flush it before returning.
@@ -3810,9 +3810,9 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
ret = 0;
out:
blk_unref(blk);
bdrv_unref(bs);
bdrv_unref(data_bs);
blk_co_unref(blk);
bdrv_co_unref(bs);
bdrv_co_unref(data_bs);
return ret;
}
@@ -3943,8 +3943,8 @@ finish:
}
qobject_unref(qdict);
bdrv_unref(bs);
bdrv_unref(data_bs);
bdrv_co_unref(bs);
bdrv_co_unref(data_bs);
qapi_free_BlockdevCreateOptions(create_options);
return ret;
}
+3 -3
View File
@@ -748,8 +748,8 @@ static int coroutine_fn bdrv_qed_co_create(BlockdevCreateOptions *opts,
ret = 0; /* success */
out:
g_free(l1_table);
blk_unref(blk);
bdrv_unref(bs);
blk_co_unref(blk);
bdrv_co_unref(bs);
return ret;
}
@@ -819,7 +819,7 @@ bdrv_qed_co_create_opts(BlockDriver *drv, const char *filename,
fail:
qobject_unref(qdict);
bdrv_unref(bs);
bdrv_co_unref(bs);
qapi_free_BlockdevCreateOptions(create_options);
return ret;
}
+3 -3
View File
@@ -886,8 +886,8 @@ static int coroutine_fn vdi_co_do_create(BlockdevCreateOptions *create_options,
ret = 0;
exit:
blk_unref(blk);
bdrv_unref(bs_file);
blk_co_unref(blk);
bdrv_co_unref(bs_file);
g_free(bmap);
return ret;
}
@@ -975,7 +975,7 @@ vdi_co_create_opts(BlockDriver *drv, const char *filename,
done:
qobject_unref(qdict);
qapi_free_BlockdevCreateOptions(create_options);
bdrv_unref(bs_file);
bdrv_co_unref(bs_file);
return ret;
}
+3 -3
View File
@@ -2053,8 +2053,8 @@ static int coroutine_fn vhdx_co_create(BlockdevCreateOptions *opts,
ret = 0;
delete_and_exit:
blk_unref(blk);
bdrv_unref(bs);
blk_co_unref(blk);
bdrv_co_unref(bs);
g_free(creator);
return ret;
}
@@ -2144,7 +2144,7 @@ vhdx_co_create_opts(BlockDriver *drv, const char *filename,
fail:
qobject_unref(qdict);
bdrv_unref(bs);
bdrv_co_unref(bs);
qapi_free_BlockdevCreateOptions(create_options);
return ret;
}
+9 -9
View File
@@ -2306,7 +2306,7 @@ exit:
if (pbb) {
*pbb = blk;
} else {
blk_unref(blk);
blk_co_unref(blk);
blk = NULL;
}
}
@@ -2516,12 +2516,12 @@ vmdk_co_do_create(int64_t size,
if (strcmp(blk_bs(backing)->drv->format_name, "vmdk")) {
error_setg(errp, "Invalid backing file format: %s. Must be vmdk",
blk_bs(backing)->drv->format_name);
blk_unref(backing);
blk_co_unref(backing);
ret = -EINVAL;
goto exit;
}
ret = vmdk_read_cid(blk_bs(backing), 0, &parent_cid);
blk_unref(backing);
blk_co_unref(backing);
if (ret) {
error_setg(errp, "Failed to read parent CID");
goto exit;
@@ -2542,14 +2542,14 @@ vmdk_co_do_create(int64_t size,
blk_bs(extent_blk)->filename);
created_size += cur_size;
extent_idx++;
blk_unref(extent_blk);
blk_co_unref(extent_blk);
}
/* Check whether we got excess extents */
extent_blk = extent_fn(-1, extent_idx, flat, split, compress, zeroed_grain,
opaque, NULL);
if (extent_blk) {
blk_unref(extent_blk);
blk_co_unref(extent_blk);
error_setg(errp, "List of extents contains unused extents");
ret = -EINVAL;
goto exit;
@@ -2590,7 +2590,7 @@ vmdk_co_do_create(int64_t size,
ret = 0;
exit:
if (blk) {
blk_unref(blk);
blk_co_unref(blk);
}
g_free(desc);
g_free(parent_desc_line);
@@ -2641,7 +2641,7 @@ vmdk_co_create_opts_cb(int64_t size, int idx, bool flat, bool split,
errp)) {
goto exit;
}
bdrv_unref(bs);
bdrv_co_unref(bs);
exit:
g_free(ext_filename);
return blk;
@@ -2797,12 +2797,12 @@ static BlockBackend * coroutine_fn vmdk_co_create_cb(int64_t size, int idx,
return NULL;
}
blk_set_allow_write_beyond_eof(blk, true);
bdrv_unref(bs);
bdrv_co_unref(bs);
if (size != -1) {
ret = vmdk_init_extent(blk, size, flat, compress, zeroed_grain, errp);
if (ret) {
blk_unref(blk);
blk_co_unref(blk);
blk = NULL;
}
}
+3 -3
View File
@@ -1082,8 +1082,8 @@ static int coroutine_fn vpc_co_create(BlockdevCreateOptions *opts,
}
out:
blk_unref(blk);
bdrv_unref(bs);
blk_co_unref(blk);
bdrv_co_unref(bs);
return ret;
}
@@ -1162,7 +1162,7 @@ vpc_co_create_opts(BlockDriver *drv, const char *filename,
fail:
qobject_unref(qdict);
bdrv_unref(bs);
bdrv_co_unref(bs);
qapi_free_BlockdevCreateOptions(create_options);
return ret;
}
+16 -6
View File
@@ -153,12 +153,22 @@ void blockdev_mark_auto_del(BlockBackend *blk)
JOB_LOCK_GUARD();
for (job = block_job_next_locked(NULL); job;
job = block_job_next_locked(job)) {
if (block_job_has_bdrv(job, blk_bs(blk))) {
do {
job = block_job_next_locked(NULL);
while (job && (job->job.cancelled ||
job->job.deferred_to_main_loop ||
!block_job_has_bdrv(job, blk_bs(blk))))
{
job = block_job_next_locked(job);
}
if (job) {
/*
* This drops the job lock temporarily and polls, so we need to
* restart processing the list from the start after this.
*/
job_cancel_locked(&job->job, false);
}
}
} while (job);
dinfo->auto_del = 1;
}
@@ -2430,7 +2440,7 @@ void coroutine_fn qmp_block_resize(const char *device, const char *node_name,
return;
}
blk = blk_new_with_bs(bs, BLK_PERM_RESIZE, BLK_PERM_ALL, errp);
blk = blk_co_new_with_bs(bs, BLK_PERM_RESIZE, BLK_PERM_ALL, errp);
if (!blk) {
return;
}
@@ -2445,7 +2455,7 @@ void coroutine_fn qmp_block_resize(const char *device, const char *node_name,
bdrv_co_lock(bs);
bdrv_drained_end(bs);
blk_unref(blk);
blk_co_unref(blk);
bdrv_co_unlock(bs);
}
Vendored
+1
View File
@@ -806,6 +806,7 @@ for opt do
--enable-debug)
# Enable debugging options that aren't excessively noisy
debug_tcg="yes"
meson_option_parse --enable-debug-graph-lock ""
meson_option_parse --enable-debug-mutex ""
meson_option_add -Doptimization=0
fortify_source="no"
+6 -2
View File
@@ -111,6 +111,10 @@ Use ``-machine acpi=off`` instead.
The HAXM project has been retired (see https://github.com/intel/haxm#status).
Use "whpx" (on Windows) or "hvf" (on macOS) instead.
``-async-teardown`` (since 8.1)
'''''''''''''''''''''''''''''''
Use ``-run-with async-teardown=on`` instead.
QEMU Machine Protocol (QMP) commands
------------------------------------
@@ -219,8 +223,8 @@ Use the more generic event ``DEVICE_UNPLUG_GUEST_ERROR`` instead.
System emulator machines
------------------------
Arm ``virt`` machine ``dtb-kaslr-seed`` property
''''''''''''''''''''''''''''''''''''''''''''''''
Arm ``virt`` machine ``dtb-kaslr-seed`` property (since 7.1)
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
The ``dtb-kaslr-seed`` property on the ``virt`` board has been
deprecated; use the new name ``dtb-randomness`` instead. The new name
+1 -1
View File
@@ -99,7 +99,7 @@ depending on the guest architecture.
- Yes
- A configurable 32 bit soft core now owned by Cadence
A number of features are are only available when running under
A number of features are only available when running under
emulation including :ref:`Record/Replay<replay>` and :ref:`TCG Plugins`.
.. _Semihosting:
+1 -1
View File
@@ -8,4 +8,4 @@ QEMU is a trademark of Fabrice Bellard.
QEMU is released under the `GNU General Public
License <https://www.gnu.org/licenses/gpl-2.0.txt>`__, version 2. Parts
of QEMU have specific licenses, see file
`LICENSE <https://git.qemu.org/?p=qemu.git;a=blob_plain;f=LICENSE>`__.
`LICENSE <https://gitlab.com/qemu-project/qemu/-/raw/master/LICENSE>`__.
+7
View File
@@ -61,6 +61,7 @@ There are several old APIs that use the main loop AioContext:
* LEGACY qemu_aio_set_event_notifier() - monitor an event notifier
* LEGACY timer_new_ms() - create a timer
* LEGACY qemu_bh_new() - create a BH
* LEGACY qemu_bh_new_guarded() - create a BH with a device re-entrancy guard
* LEGACY qemu_aio_wait() - run an event loop iteration
Since they implicitly work on the main loop they cannot be used in code that
@@ -72,8 +73,14 @@ Instead, use the AioContext functions directly (see include/block/aio.h):
* aio_set_event_notifier() - monitor an event notifier
* aio_timer_new() - create a timer
* aio_bh_new() - create a BH
* aio_bh_new_guarded() - create a BH with a device re-entrancy guard
* aio_poll() - run an event loop iteration
The qemu_bh_new_guarded/aio_bh_new_guarded APIs accept a "MemReentrancyGuard"
argument, which is used to check for and prevent re-entrancy problems. For
BHs associated with devices, the reentrancy-guard is contained in the
corresponding DeviceState and named "mem_reentrancy_guard".
The AioContext can be obtained from the IOThread using
iothread_get_aio_context() or for the main loop using qemu_get_aio_context().
Code that takes an AioContext argument works both in IOThreads or the main
+6 -6
View File
@@ -117,13 +117,13 @@ to support the multiple thread compression migration:
{qemu} migrate_set_capability compress on
3. Set the compression thread count on source:
{qemu} migrate_set_parameter compress_threads 12
{qemu} migrate_set_parameter compress-threads 12
4. Set the compression level on the source:
{qemu} migrate_set_parameter compress_level 1
{qemu} migrate_set_parameter compress-level 1
5. Set the decompression thread count on destination:
{qemu} migrate_set_parameter decompress_threads 3
{qemu} migrate_set_parameter decompress-threads 3
6. Start outgoing migration:
{qemu} migrate -d tcp:destination.host:4444
@@ -133,9 +133,9 @@ to support the multiple thread compression migration:
The following are the default settings:
compress: off
compress_threads: 8
decompress_threads: 2
compress_level: 1 (which means best speed)
compress-threads: 8
decompress-threads: 2
compress-level: 1 (which means best speed)
So, only the first two steps are required to use the multiple
thread compression in migration. You can do more if the default
+1 -1
View File
@@ -89,7 +89,7 @@ RUNNING:
First, set the migration speed to match your hardware's capabilities:
QEMU Monitor Command:
$ migrate_set_parameter max_bandwidth 40g # or whatever is the MAX of your RDMA device
$ migrate_set_parameter max-bandwidth 40g # or whatever is the MAX of your RDMA device
Next, on the destination machine, add the following to the QEMU command line:
+1 -1
View File
@@ -4,7 +4,7 @@ Multi-process QEMU
==================
This document describes how to configure and use multi-process qemu.
For the design document refer to docs/devel/qemu-multiprocess.
For the design document refer to docs/devel/multi-process.rst.
1) Configuration
----------------
+2 -2
View File
@@ -1293,8 +1293,8 @@ static bool get_next_page(GuestPhysBlock **blockptr, uint64_t *pfnptr,
memcpy(buf + addr % page_size, hbuf, n);
addr += n;
if (addr % page_size == 0) {
/* we filled up the page */
if (addr % page_size == 0 || addr >= block->target_end) {
/* we filled up the page or the current block is finished */
break;
}
} else {
+1 -1
View File
@@ -5135,7 +5135,7 @@ float32 float32_exp2(float32 a, float_status *status)
float64_unpack_canonical(&rp, float64_one, status);
for (i = 0 ; i < 15 ; i++) {
float64_unpack_canonical(&tp, float32_exp2_coefficients[i], status);
rp = *parts_muladd(&tp, &xp, &rp, 0, status);
rp = *parts_muladd(&tp, &xnp, &rp, 0, status);
xnp = *parts_mul(&xnp, &xp, status);
}
+25 -2
View File
@@ -26,6 +26,7 @@
#include "qemu/xattr.h"
#include "9p-iov-marshal.h"
#include "hw/9pfs/9p-proxy.h"
#include "hw/9pfs/9p-util.h"
#include "fsdev/9p-iov-marshal.h"
#define PROGNAME "virtfs-proxy-helper"
@@ -338,6 +339,28 @@ static void resetugid(int suid, int sgid)
}
}
/*
* Open regular file or directory. Attempts to open any special file are
* rejected.
*
* returns file descriptor or -1 on error
*/
static int open_regular(const char *pathname, int flags, mode_t mode)
{
int fd;
fd = open(pathname, flags, mode);
if (fd < 0) {
return fd;
}
if (close_if_special_file(fd) < 0) {
return -1;
}
return fd;
}
/*
* send response in two parts
* 1) ProxyHeader
@@ -682,7 +705,7 @@ static int do_create(struct iovec *iovec)
if (ret < 0) {
goto unmarshal_err_out;
}
ret = open(path.data, flags, mode);
ret = open_regular(path.data, flags, mode);
if (ret < 0) {
ret = -errno;
}
@@ -707,7 +730,7 @@ static int do_open(struct iovec *iovec)
if (ret < 0) {
goto err_out;
}
ret = open(path.data, flags);
ret = open_regular(path.data, flags, 0);
if (ret < 0) {
ret = -errno;
}
+39
View File
@@ -13,6 +13,8 @@
#ifndef QEMU_9P_UTIL_H
#define QEMU_9P_UTIL_H
#include "qemu/error-report.h"
#ifdef O_PATH
#define O_PATH_9P_UTIL O_PATH
#else
@@ -95,6 +97,7 @@ static inline int errno_to_dotl(int err) {
#endif
#define qemu_openat openat
#define qemu_fstat fstat
#define qemu_fstatat fstatat
#define qemu_mkdirat mkdirat
#define qemu_renameat renameat
@@ -108,6 +111,38 @@ static inline void close_preserve_errno(int fd)
errno = serrno;
}
/**
* close_if_special_file() - Close @fd if neither regular file nor directory.
*
* @fd: file descriptor of open file
* Return: 0 on regular file or directory, -1 otherwise
*
* CVE-2023-2861: Prohibit opening any special file directly on host
* (especially device files), as a compromised client could potentially gain
* access outside exported tree under certain, unsafe setups. We expect
* client to handle I/O on special files exclusively on guest side.
*/
static inline int close_if_special_file(int fd)
{
struct stat stbuf;
if (qemu_fstat(fd, &stbuf) < 0) {
close_preserve_errno(fd);
return -1;
}
if (!S_ISREG(stbuf.st_mode) && !S_ISDIR(stbuf.st_mode)) {
error_report_once(
"9p: broken or compromised client detected; attempt to open "
"special file (i.e. neither regular file, nor directory)"
);
close(fd);
errno = ENXIO;
return -1;
}
return 0;
}
static inline int openat_dir(int dirfd, const char *name)
{
return qemu_openat(dirfd, name,
@@ -142,6 +177,10 @@ again:
return -1;
}
if (close_if_special_file(fd) < 0) {
return -1;
}
serrno = errno;
/* O_NONBLOCK was only needed to open the file. Let's drop it. We don't
* do that with O_PATH since fcntl(F_SETFL) isn't supported, and openat()
+6
View File
@@ -48,3 +48,9 @@ v9fs_readlink(uint16_t tag, uint8_t id, int32_t fid) "tag %d id %d fid %d"
v9fs_readlink_return(uint16_t tag, uint8_t id, char* target) "tag %d id %d name %s"
v9fs_setattr(uint16_t tag, uint8_t id, int32_t fid, int32_t valid, int32_t mode, int32_t uid, int32_t gid, int64_t size, int64_t atime_sec, int64_t mtime_sec) "tag %u id %u fid %d iattr={valid %d mode %d uid %d gid %d size %"PRId64" atime=%"PRId64" mtime=%"PRId64" }"
v9fs_setattr_return(uint16_t tag, uint8_t id) "tag %u id %u"
# xen-9p-backend.c
xen_9pfs_alloc(char *name) "name %s"
xen_9pfs_connect(char *name) "name %s"
xen_9pfs_disconnect(char *name) "name %s"
xen_9pfs_free(char *name) "name %s"
+26 -14
View File
@@ -25,6 +25,8 @@
#include "qemu/iov.h"
#include "fsdev/qemu-fsdev.h"
#include "trace.h"
#define VERSIONS "1"
#define MAX_RINGS 8
#define MAX_RING_ORDER 9
@@ -61,6 +63,7 @@ typedef struct Xen9pfsDev {
int num_rings;
Xen9pfsRing *rings;
MemReentrancyGuard mem_reentrancy_guard;
} Xen9pfsDev;
static void xen_9pfs_disconnect(struct XenLegacyDevice *xendev);
@@ -336,6 +339,8 @@ static void xen_9pfs_disconnect(struct XenLegacyDevice *xendev)
Xen9pfsDev *xen_9pdev = container_of(xendev, Xen9pfsDev, xendev);
int i;
trace_xen_9pfs_disconnect(xendev->name);
for (i = 0; i < xen_9pdev->num_rings; i++) {
if (xen_9pdev->rings[i].evtchndev != NULL) {
qemu_set_fd_handler(qemu_xen_evtchn_fd(xen_9pdev->rings[i].evtchndev),
@@ -344,40 +349,41 @@ static void xen_9pfs_disconnect(struct XenLegacyDevice *xendev)
xen_9pdev->rings[i].local_port);
xen_9pdev->rings[i].evtchndev = NULL;
}
}
}
static int xen_9pfs_free(struct XenLegacyDevice *xendev)
{
Xen9pfsDev *xen_9pdev = container_of(xendev, Xen9pfsDev, xendev);
int i;
if (xen_9pdev->rings[0].evtchndev != NULL) {
xen_9pfs_disconnect(xendev);
}
for (i = 0; i < xen_9pdev->num_rings; i++) {
if (xen_9pdev->rings[i].data != NULL) {
xen_be_unmap_grant_refs(&xen_9pdev->xendev,
xen_9pdev->rings[i].data,
xen_9pdev->rings[i].intf->ref,
(1 << xen_9pdev->rings[i].ring_order));
xen_9pdev->rings[i].data = NULL;
}
if (xen_9pdev->rings[i].intf != NULL) {
xen_be_unmap_grant_ref(&xen_9pdev->xendev,
xen_9pdev->rings[i].intf,
xen_9pdev->rings[i].ref);
xen_9pdev->rings[i].intf = NULL;
}
if (xen_9pdev->rings[i].bh != NULL) {
qemu_bh_delete(xen_9pdev->rings[i].bh);
xen_9pdev->rings[i].bh = NULL;
}
}
g_free(xen_9pdev->id);
xen_9pdev->id = NULL;
g_free(xen_9pdev->tag);
xen_9pdev->tag = NULL;
g_free(xen_9pdev->path);
xen_9pdev->path = NULL;
g_free(xen_9pdev->security_model);
xen_9pdev->security_model = NULL;
g_free(xen_9pdev->rings);
xen_9pdev->rings = NULL;
}
static int xen_9pfs_free(struct XenLegacyDevice *xendev)
{
trace_xen_9pfs_free(xendev->name);
return 0;
}
@@ -389,6 +395,8 @@ static int xen_9pfs_connect(struct XenLegacyDevice *xendev)
V9fsState *s = &xen_9pdev->state;
QemuOpts *fsdev;
trace_xen_9pfs_connect(xendev->name);
if (xenstore_read_fe_int(&xen_9pdev->xendev, "num-rings",
&xen_9pdev->num_rings) == -1 ||
xen_9pdev->num_rings > MAX_RINGS || xen_9pdev->num_rings < 1) {
@@ -443,7 +451,9 @@ static int xen_9pfs_connect(struct XenLegacyDevice *xendev)
xen_9pdev->rings[i].ring.out = xen_9pdev->rings[i].data +
XEN_FLEX_RING_SIZE(ring_order);
xen_9pdev->rings[i].bh = qemu_bh_new(xen_9pfs_bh, &xen_9pdev->rings[i]);
xen_9pdev->rings[i].bh = qemu_bh_new_guarded(xen_9pfs_bh,
&xen_9pdev->rings[i],
&xen_9pdev->mem_reentrancy_guard);
xen_9pdev->rings[i].out_cons = 0;
xen_9pdev->rings[i].out_size = 0;
xen_9pdev->rings[i].inprogress = false;
@@ -496,6 +506,8 @@ out:
static void xen_9pfs_alloc(struct XenLegacyDevice *xendev)
{
trace_xen_9pfs_alloc(xendev->name);
xenstore_write_be_str(xendev, "versions", VERSIONS);
xenstore_write_be_int(xendev, "max-rings", MAX_RINGS);
xenstore_write_be_int(xendev, "max-ring-page-order", MAX_RING_ORDER);
+10
View File
@@ -357,6 +357,16 @@ void acpi_pcihp_device_unplug_request_cb(HotplugHandler *hotplug_dev,
* acpi_pcihp_eject_slot() when the operation is completed.
*/
pdev->qdev.pending_deleted_event = true;
/* if unplug was requested before OSPM is initialized,
* linux kernel will clear GPE0.sts[] bits during boot, which effectively
* hides unplug event. And than followup qmp_device_del() calls remain
* blocked by above flag permanently.
* Unblock qmp_device_del() by setting expire limit, so user can
* repeat unplug request later when OSPM has been booted.
*/
pdev->qdev.pending_deleted_expires_ms =
qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL); /* 1 msec */
s->acpi_pcihp_pci_status[bsel].down |= (1U << slot);
acpi_send_event(DEVICE(hotplug_dev), ACPI_PCI_HOTPLUG_STATUS);
}
+20 -18
View File
@@ -200,33 +200,35 @@ struct AspeedMachineState {
static void aspeed_write_smpboot(ARMCPU *cpu,
const struct arm_boot_info *info)
{
static const uint32_t poll_mailbox_ready[] = {
AddressSpace *as = arm_boot_address_space(cpu, info);
static const ARMInsnFixup poll_mailbox_ready[] = {
/*
* r2 = per-cpu go sign value
* r1 = AST_SMP_MBOX_FIELD_ENTRY
* r0 = AST_SMP_MBOX_FIELD_GOSIGN
*/
0xee100fb0, /* mrc p15, 0, r0, c0, c0, 5 */
0xe21000ff, /* ands r0, r0, #255 */
0xe59f201c, /* ldr r2, [pc, #28] */
0xe1822000, /* orr r2, r2, r0 */
{ 0xee100fb0 }, /* mrc p15, 0, r0, c0, c0, 5 */
{ 0xe21000ff }, /* ands r0, r0, #255 */
{ 0xe59f201c }, /* ldr r2, [pc, #28] */
{ 0xe1822000 }, /* orr r2, r2, r0 */
0xe59f1018, /* ldr r1, [pc, #24] */
0xe59f0018, /* ldr r0, [pc, #24] */
{ 0xe59f1018 }, /* ldr r1, [pc, #24] */
{ 0xe59f0018 }, /* ldr r0, [pc, #24] */
0xe320f002, /* wfe */
0xe5904000, /* ldr r4, [r0] */
0xe1520004, /* cmp r2, r4 */
0x1afffffb, /* bne <wfe> */
0xe591f000, /* ldr pc, [r1] */
AST_SMP_MBOX_GOSIGN,
AST_SMP_MBOX_FIELD_ENTRY,
AST_SMP_MBOX_FIELD_GOSIGN,
{ 0xe320f002 }, /* wfe */
{ 0xe5904000 }, /* ldr r4, [r0] */
{ 0xe1520004 }, /* cmp r2, r4 */
{ 0x1afffffb }, /* bne <wfe> */
{ 0xe591f000 }, /* ldr pc, [r1] */
{ AST_SMP_MBOX_GOSIGN },
{ AST_SMP_MBOX_FIELD_ENTRY },
{ AST_SMP_MBOX_FIELD_GOSIGN },
{ 0, FIXUP_TERMINATOR }
};
static const uint32_t fixupcontext[FIXUP_MAX] = { 0 };
rom_add_blob_fixed("aspeed.smpboot", poll_mailbox_ready,
sizeof(poll_mailbox_ready),
info->smp_loader_start);
arm_write_bootloader("aspeed.smpboot", as, info->smp_loader_start,
poll_mailbox_ready, fixupcontext);
}
static void aspeed_reset_secondary(ARMCPU *cpu,
+8 -27
View File
@@ -60,26 +60,6 @@ AddressSpace *arm_boot_address_space(ARMCPU *cpu,
return cpu_get_address_space(cs, asidx);
}
typedef enum {
FIXUP_NONE = 0, /* do nothing */
FIXUP_TERMINATOR, /* end of insns */
FIXUP_BOARDID, /* overwrite with board ID number */
FIXUP_BOARD_SETUP, /* overwrite with board specific setup code address */
FIXUP_ARGPTR_LO, /* overwrite with pointer to kernel args */
FIXUP_ARGPTR_HI, /* overwrite with pointer to kernel args (high half) */
FIXUP_ENTRYPOINT_LO, /* overwrite with kernel entry point */
FIXUP_ENTRYPOINT_HI, /* overwrite with kernel entry point (high half) */
FIXUP_GIC_CPU_IF, /* overwrite with GIC CPU interface address */
FIXUP_BOOTREG, /* overwrite with boot register address */
FIXUP_DSB, /* overwrite with correct DSB insn for cpu */
FIXUP_MAX,
} FixupType;
typedef struct ARMInsnFixup {
uint32_t insn;
FixupType fixup;
} ARMInsnFixup;
static const ARMInsnFixup bootloader_aarch64[] = {
{ 0x580000c0 }, /* ldr x0, arg ; Load the lower 32-bits of DTB */
{ 0xaa1f03e1 }, /* mov x1, xzr */
@@ -150,9 +130,10 @@ static const ARMInsnFixup smpboot[] = {
{ 0, FIXUP_TERMINATOR }
};
static void write_bootloader(const char *name, hwaddr addr,
const ARMInsnFixup *insns, uint32_t *fixupcontext,
AddressSpace *as)
void arm_write_bootloader(const char *name,
AddressSpace *as, hwaddr addr,
const ARMInsnFixup *insns,
const uint32_t *fixupcontext)
{
/* Fix up the specified bootloader fragment and write it into
* guest memory using rom_add_blob_fixed(). fixupcontext is
@@ -214,8 +195,8 @@ static void default_write_secondary(ARMCPU *cpu,
fixupcontext[FIXUP_DSB] = CP15_DSB_INSN;
}
write_bootloader("smpboot", info->smp_loader_start,
smpboot, fixupcontext, as);
arm_write_bootloader("smpboot", as, info->smp_loader_start,
smpboot, fixupcontext);
}
void arm_write_secure_board_setup_dummy_smc(ARMCPU *cpu,
@@ -1186,8 +1167,8 @@ static void arm_setup_direct_kernel_boot(ARMCPU *cpu,
fixupcontext[FIXUP_ENTRYPOINT_LO] = entry;
fixupcontext[FIXUP_ENTRYPOINT_HI] = entry >> 32;
write_bootloader("bootloader", info->loader_start,
primary_loader, fixupcontext, as);
arm_write_bootloader("bootloader", as, info->loader_start,
primary_loader, fixupcontext);
if (info->write_board_setup) {
info->write_board_setup(cpu, info);
+34 -30
View File
@@ -16,6 +16,7 @@
#include "qemu/units.h"
#include "qemu/cutils.h"
#include "qapi/error.h"
#include "hw/arm/boot.h"
#include "hw/arm/bcm2836.h"
#include "hw/registerfields.h"
#include "qemu/error-report.h"
@@ -124,20 +125,22 @@ static const char *board_type(uint32_t board_rev)
static void write_smpboot(ARMCPU *cpu, const struct arm_boot_info *info)
{
static const uint32_t smpboot[] = {
0xe1a0e00f, /* mov lr, pc */
0xe3a0fe00 + (BOARDSETUP_ADDR >> 4), /* mov pc, BOARDSETUP_ADDR */
0xee100fb0, /* mrc p15, 0, r0, c0, c0, 5;get core ID */
0xe7e10050, /* ubfx r0, r0, #0, #2 ;extract LSB */
0xe59f5014, /* ldr r5, =0x400000CC ;load mbox base */
0xe320f001, /* 1: yield */
0xe7953200, /* ldr r3, [r5, r0, lsl #4] ;read mbox for our core*/
0xe3530000, /* cmp r3, #0 ;spin while zero */
0x0afffffb, /* beq 1b */
0xe7853200, /* str r3, [r5, r0, lsl #4] ;clear mbox */
0xe12fff13, /* bx r3 ;jump to target */
0x400000cc, /* (constant: mailbox 3 read/clear base) */
static const ARMInsnFixup smpboot[] = {
{ 0xe1a0e00f }, /* mov lr, pc */
{ 0xe3a0fe00 + (BOARDSETUP_ADDR >> 4) }, /* mov pc, BOARDSETUP_ADDR */
{ 0xee100fb0 }, /* mrc p15, 0, r0, c0, c0, 5;get core ID */
{ 0xe7e10050 }, /* ubfx r0, r0, #0, #2 ;extract LSB */
{ 0xe59f5014 }, /* ldr r5, =0x400000CC ;load mbox base */
{ 0xe320f001 }, /* 1: yield */
{ 0xe7953200 }, /* ldr r3, [r5, r0, lsl #4] ;read mbox for our core */
{ 0xe3530000 }, /* cmp r3, #0 ;spin while zero */
{ 0x0afffffb }, /* beq 1b */
{ 0xe7853200 }, /* str r3, [r5, r0, lsl #4] ;clear mbox */
{ 0xe12fff13 }, /* bx r3 ;jump to target */
{ 0x400000cc }, /* (constant: mailbox 3 read/clear base) */
{ 0, FIXUP_TERMINATOR }
};
static const uint32_t fixupcontext[FIXUP_MAX] = { 0 };
/* check that we don't overrun board setup vectors */
QEMU_BUILD_BUG_ON(SMPBOOT_ADDR + sizeof(smpboot) > MVBAR_ADDR);
@@ -145,9 +148,8 @@ static void write_smpboot(ARMCPU *cpu, const struct arm_boot_info *info)
QEMU_BUILD_BUG_ON((BOARDSETUP_ADDR & 0xf) != 0
|| (BOARDSETUP_ADDR >> 4) >= 0x100);
rom_add_blob_fixed_as("raspi_smpboot", smpboot, sizeof(smpboot),
info->smp_loader_start,
arm_boot_address_space(cpu, info));
arm_write_bootloader("raspi_smpboot", arm_boot_address_space(cpu, info),
info->smp_loader_start, smpboot, fixupcontext);
}
static void write_smpboot64(ARMCPU *cpu, const struct arm_boot_info *info)
@@ -161,26 +163,28 @@ static void write_smpboot64(ARMCPU *cpu, const struct arm_boot_info *info)
* the primary CPU goes into the kernel. We put these variables inside
* a rom blob, so that the reset for ROM contents zeroes them for us.
*/
static const uint32_t smpboot[] = {
0xd2801b05, /* mov x5, 0xd8 */
0xd53800a6, /* mrs x6, mpidr_el1 */
0x924004c6, /* and x6, x6, #0x3 */
0xd503205f, /* spin: wfe */
0xf86678a4, /* ldr x4, [x5,x6,lsl #3] */
0xb4ffffc4, /* cbz x4, spin */
0xd2800000, /* mov x0, #0x0 */
0xd2800001, /* mov x1, #0x0 */
0xd2800002, /* mov x2, #0x0 */
0xd2800003, /* mov x3, #0x0 */
0xd61f0080, /* br x4 */
static const ARMInsnFixup smpboot[] = {
{ 0xd2801b05 }, /* mov x5, 0xd8 */
{ 0xd53800a6 }, /* mrs x6, mpidr_el1 */
{ 0x924004c6 }, /* and x6, x6, #0x3 */
{ 0xd503205f }, /* spin: wfe */
{ 0xf86678a4 }, /* ldr x4, [x5,x6,lsl #3] */
{ 0xb4ffffc4 }, /* cbz x4, spin */
{ 0xd2800000 }, /* mov x0, #0x0 */
{ 0xd2800001 }, /* mov x1, #0x0 */
{ 0xd2800002 }, /* mov x2, #0x0 */
{ 0xd2800003 }, /* mov x3, #0x0 */
{ 0xd61f0080 }, /* br x4 */
{ 0, FIXUP_TERMINATOR }
};
static const uint32_t fixupcontext[FIXUP_MAX] = { 0 };
static const uint64_t spintables[] = {
0, 0, 0, 0
};
rom_add_blob_fixed_as("raspi_smpboot", smpboot, sizeof(smpboot),
info->smp_loader_start, as);
arm_write_bootloader("raspi_smpboot", as, info->smp_loader_start,
smpboot, fixupcontext);
rom_add_blob_fixed_as("raspi_spintables", spintables, sizeof(spintables),
SPINTABLE_ADDR, as);
}
+1 -2
View File
@@ -192,8 +192,7 @@ static int get_pte(dma_addr_t baseaddr, uint32_t index, uint64_t *pte,
dma_addr_t addr = baseaddr + index * sizeof(*pte);
/* TODO: guarantee 64-bit single-copy atomicity */
ret = dma_memory_read(&address_space_memory, addr, pte, sizeof(*pte),
MEMTXATTRS_UNSPECIFIED);
ret = ldq_le_dma(&address_space_memory, addr, pte, MEMTXATTRS_UNSPECIFIED);
if (ret != MEMTX_OK) {
info->type = SMMU_PTW_ERR_WALK_EABT;
+31 -8
View File
@@ -98,20 +98,34 @@ static void smmuv3_write_gerrorn(SMMUv3State *s, uint32_t new_gerrorn)
trace_smmuv3_write_gerrorn(toggled & pending, s->gerrorn);
}
static inline MemTxResult queue_read(SMMUQueue *q, void *data)
static inline MemTxResult queue_read(SMMUQueue *q, Cmd *cmd)
{
dma_addr_t addr = Q_CONS_ENTRY(q);
MemTxResult ret;
int i;
return dma_memory_read(&address_space_memory, addr, data, q->entry_size,
MEMTXATTRS_UNSPECIFIED);
ret = dma_memory_read(&address_space_memory, addr, cmd, sizeof(Cmd),
MEMTXATTRS_UNSPECIFIED);
if (ret != MEMTX_OK) {
return ret;
}
for (i = 0; i < ARRAY_SIZE(cmd->word); i++) {
le32_to_cpus(&cmd->word[i]);
}
return ret;
}
static MemTxResult queue_write(SMMUQueue *q, void *data)
static MemTxResult queue_write(SMMUQueue *q, Evt *evt_in)
{
dma_addr_t addr = Q_PROD_ENTRY(q);
MemTxResult ret;
Evt evt = *evt_in;
int i;
ret = dma_memory_write(&address_space_memory, addr, data, q->entry_size,
for (i = 0; i < ARRAY_SIZE(evt.word); i++) {
cpu_to_le32s(&evt.word[i]);
}
ret = dma_memory_write(&address_space_memory, addr, &evt, sizeof(Evt),
MEMTXATTRS_UNSPECIFIED);
if (ret != MEMTX_OK) {
return ret;
@@ -291,7 +305,7 @@ static void smmuv3_init_regs(SMMUv3State *s)
static int smmu_get_ste(SMMUv3State *s, dma_addr_t addr, STE *buf,
SMMUEventInfo *event)
{
int ret;
int ret, i;
trace_smmuv3_get_ste(addr);
/* TODO: guarantee 64-bit single-copy atomicity */
@@ -304,6 +318,9 @@ static int smmu_get_ste(SMMUv3State *s, dma_addr_t addr, STE *buf,
event->u.f_ste_fetch.addr = addr;
return -EINVAL;
}
for (i = 0; i < ARRAY_SIZE(buf->word); i++) {
le32_to_cpus(&buf->word[i]);
}
return 0;
}
@@ -313,7 +330,7 @@ static int smmu_get_cd(SMMUv3State *s, STE *ste, uint32_t ssid,
CD *buf, SMMUEventInfo *event)
{
dma_addr_t addr = STE_CTXPTR(ste);
int ret;
int ret, i;
trace_smmuv3_get_cd(addr);
/* TODO: guarantee 64-bit single-copy atomicity */
@@ -326,6 +343,9 @@ static int smmu_get_cd(SMMUv3State *s, STE *ste, uint32_t ssid,
event->u.f_ste_fetch.addr = addr;
return -EINVAL;
}
for (i = 0; i < ARRAY_SIZE(buf->word); i++) {
le32_to_cpus(&buf->word[i]);
}
return 0;
}
@@ -407,7 +427,7 @@ static int smmu_find_ste(SMMUv3State *s, uint32_t sid, STE *ste,
return -EINVAL;
}
if (s->features & SMMU_FEATURE_2LVL_STE) {
int l1_ste_offset, l2_ste_offset, max_l2_ste, span;
int l1_ste_offset, l2_ste_offset, max_l2_ste, span, i;
dma_addr_t l1ptr, l2ptr;
STEDesc l1std;
@@ -431,6 +451,9 @@ static int smmu_find_ste(SMMUv3State *s, uint32_t sid, STE *ste,
event->u.f_ste_fetch.addr = l1ptr;
return -EINVAL;
}
for (i = 0; i < ARRAY_SIZE(l1std.word); i++) {
le32_to_cpus(&l1std.word[i]);
}
span = L1STD_SPAN(&l1std);
+1 -1
View File
@@ -213,7 +213,7 @@ static void xlnx_zynqmp_create_rpu(MachineState *ms, XlnxZynqMPState *s,
const char *boot_cpu, Error **errp)
{
int i;
int num_rpus = MIN(ms->smp.cpus - XLNX_ZYNQMP_NUM_APU_CPUS,
int num_rpus = MIN((int)(ms->smp.cpus - XLNX_ZYNQMP_NUM_APU_CPUS),
XLNX_ZYNQMP_NUM_RPU_CPUS);
if (num_rpus <= 0) {
+2 -1
View File
@@ -127,7 +127,8 @@ bool virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *conf,
} else {
s->ctx = qemu_get_aio_context();
}
s->bh = aio_bh_new(s->ctx, notify_guest_bh, s);
s->bh = aio_bh_new_guarded(s->ctx, notify_guest_bh, s,
&DEVICE(vdev)->mem_reentrancy_guard);
s->batch_notify_vqs = bitmap_new(conf->num_queues);
*dataplane = s;
+3 -2
View File
@@ -633,8 +633,9 @@ XenBlockDataPlane *xen_block_dataplane_create(XenDevice *xendev,
} else {
dataplane->ctx = qemu_get_aio_context();
}
dataplane->bh = aio_bh_new(dataplane->ctx, xen_block_dataplane_bh,
dataplane);
dataplane->bh = aio_bh_new_guarded(dataplane->ctx, xen_block_dataplane_bh,
dataplane,
&DEVICE(xendev)->mem_reentrancy_guard);
return dataplane;
}
+6 -5
View File
@@ -763,14 +763,15 @@ static XenBlockDrive *xen_block_drive_create(const char *id,
drive = g_new0(XenBlockDrive, 1);
drive->id = g_strdup(id);
file_layer = qdict_new();
driver_layer = qdict_new();
rc = stat(filename, &st);
if (rc) {
error_setg_errno(errp, errno, "Could not stat file '%s'", filename);
goto done;
}
file_layer = qdict_new();
driver_layer = qdict_new();
if (S_ISBLK(st.st_mode)) {
qdict_put_str(file_layer, "driver", "host_device");
} else {
@@ -778,7 +779,6 @@ static XenBlockDrive *xen_block_drive_create(const char *id,
}
qdict_put_str(file_layer, "filename", filename);
g_free(filename);
if (mode && *mode != 'w') {
qdict_put_bool(file_layer, "read-only", true);
@@ -813,7 +813,6 @@ static XenBlockDrive *xen_block_drive_create(const char *id,
qdict_put_str(file_layer, "locking", "off");
qdict_put_str(driver_layer, "driver", driver);
g_free(driver);
qdict_put(driver_layer, "file", file_layer);
@@ -824,6 +823,8 @@ static XenBlockDrive *xen_block_drive_create(const char *id,
qobject_unref(driver_layer);
done:
g_free(filename);
g_free(driver);
if (*errp) {
xen_block_drive_destroy(drive, NULL);
return NULL;
+7 -5
View File
@@ -29,6 +29,7 @@
#include "chardev/char-fe.h"
#include "qemu/timer.h"
#include "qemu/error-report.h"
#include "exec/tswap.h"
#define RISCV_DEBUG_HTIF 0
#define HTIF_DEBUG(fmt, ...) \
@@ -167,11 +168,11 @@ static void htif_handle_tohost_write(HTIFState *s, uint64_t val_written)
} else {
uint64_t syscall[8];
cpu_physical_memory_read(payload, syscall, sizeof(syscall));
if (syscall[0] == PK_SYS_WRITE &&
syscall[1] == HTIF_DEV_CONSOLE &&
syscall[3] == HTIF_CONSOLE_CMD_PUTC) {
if (tswap64(syscall[0]) == PK_SYS_WRITE &&
tswap64(syscall[1]) == HTIF_DEV_CONSOLE &&
tswap64(syscall[3]) == HTIF_CONSOLE_CMD_PUTC) {
uint8_t ch;
cpu_physical_memory_read(syscall[2], &ch, 1);
cpu_physical_memory_read(tswap64(syscall[2]), &ch, 1);
qemu_chr_fe_write(&s->chr, &ch, 1);
resp = 0x100 | (uint8_t)payload;
} else {
@@ -190,7 +191,8 @@ static void htif_handle_tohost_write(HTIFState *s, uint64_t val_written)
s->tohost = 0; /* clear to indicate we read */
return;
} else if (cmd == HTIF_CONSOLE_CMD_PUTC) {
qemu_chr_fe_write(&s->chr, (uint8_t *)&payload, 1);
uint8_t ch = (uint8_t)payload;
qemu_chr_fe_write(&s->chr, &ch, 1);
resp = 0x100 | (uint8_t)payload;
} else {
qemu_log("HTIF device %d: unknown command\n", device);
+2 -1
View File
@@ -985,7 +985,8 @@ static void virtser_port_device_realize(DeviceState *dev, Error **errp)
return;
}
port->bh = qemu_bh_new(flush_queued_data_bh, port);
port->bh = qemu_bh_new_guarded(flush_queued_data_bh, port,
&dev->mem_reentrancy_guard);
port->elem = NULL;
}
+10
View File
@@ -197,3 +197,13 @@ void machine_parse_smp_config(MachineState *ms,
return;
}
}
unsigned int machine_topo_get_cores_per_socket(const MachineState *ms)
{
return ms->smp.cores * ms->smp.clusters * ms->smp.dies;
}
unsigned int machine_topo_get_threads_per_socket(const MachineState *ms)
{
return ms->smp.threads * machine_topo_get_cores_per_socket(ms);
}
+9
View File
@@ -43,6 +43,7 @@ GlobalProperty hw_compat_7_2[] = {
{ "e1000e", "migrate-timadj", "off" },
{ "virtio-mem", "x-early-migration", "false" },
{ "migration", "x-preempt-pre-7-2", "true" },
{ TYPE_PCI_DEVICE, "x-pcie-err-unc-mask", "off" },
};
const size_t hw_compat_7_2_len = G_N_ELEMENTS(hw_compat_7_2);
@@ -1332,6 +1333,14 @@ void machine_run_board_init(MachineState *machine, const char *mem_path, Error *
}
} else if (machine_class->default_ram_id && machine->ram_size &&
numa_uses_legacy_mem()) {
if (object_property_find(object_get_objects_root(),
machine_class->default_ram_id)) {
error_setg(errp, "object name '%s' is reserved for the default"
" RAM backend, it can't be used for any other purposes."
" Change the object's 'id' to something else",
machine_class->default_ram_id);
return;
}
if (!create_default_memdev(current_machine, mem_path, errp)) {
return;
}
+10 -4
View File
@@ -1591,7 +1591,10 @@ static void qxl_set_mode(PCIQXLDevice *d, unsigned int modenr, int loadvm)
}
d->guest_slots[0].slot = slot;
assert(qxl_add_memslot(d, 0, devmem, QXL_SYNC) == 0);
if (qxl_add_memslot(d, 0, devmem, QXL_SYNC) != 0) {
qxl_set_guest_bug(d, "device isn't initialized yet");
return;
}
d->guest_primary.surface = surface;
qxl_create_guest_primary(d, 0, QXL_SYNC);
@@ -2201,11 +2204,14 @@ static void qxl_realize_common(PCIQXLDevice *qxl, Error **errp)
qemu_add_vm_change_state_handler(qxl_vm_change_state_handler, qxl);
qxl->update_irq = qemu_bh_new(qxl_update_irq_bh, qxl);
qxl->update_irq = qemu_bh_new_guarded(qxl_update_irq_bh, qxl,
&DEVICE(qxl)->mem_reentrancy_guard);
qxl_reset_state(qxl);
qxl->update_area_bh = qemu_bh_new(qxl_render_update_area_bh, qxl);
qxl->ssd.cursor_bh = qemu_bh_new(qemu_spice_cursor_refresh_bh, &qxl->ssd);
qxl->update_area_bh = qemu_bh_new_guarded(qxl_render_update_area_bh, qxl,
&DEVICE(qxl)->mem_reentrancy_guard);
qxl->ssd.cursor_bh = qemu_bh_new_guarded(qemu_spice_cursor_refresh_bh, &qxl->ssd,
&DEVICE(qxl)->mem_reentrancy_guard);
}
static void qxl_realize_primary(PCIDevice *dev, Error **errp)
+26 -7
View File
@@ -498,6 +498,8 @@ static void virtio_gpu_resource_flush(VirtIOGPU *g,
struct virtio_gpu_resource_flush rf;
struct virtio_gpu_scanout *scanout;
pixman_region16_t flush_region;
bool within_bounds = false;
bool update_submitted = false;
int i;
VIRTIO_GPU_FILL_CMD(rf);
@@ -518,13 +520,28 @@ static void virtio_gpu_resource_flush(VirtIOGPU *g,
rf.r.x < scanout->x + scanout->width &&
rf.r.x + rf.r.width >= scanout->x &&
rf.r.y < scanout->y + scanout->height &&
rf.r.y + rf.r.height >= scanout->y &&
console_has_gl(scanout->con)) {
dpy_gl_update(scanout->con, 0, 0, scanout->width,
scanout->height);
rf.r.y + rf.r.height >= scanout->y) {
within_bounds = true;
if (console_has_gl(scanout->con)) {
dpy_gl_update(scanout->con, 0, 0, scanout->width,
scanout->height);
update_submitted = true;
}
}
}
return;
if (update_submitted) {
return;
}
if (!within_bounds) {
qemu_log_mask(LOG_GUEST_ERROR, "%s: flush bounds outside scanouts"
" bounds for flush %d: %d %d %d %d\n",
__func__, rf.resource_id, rf.r.x, rf.r.y,
rf.r.width, rf.r.height);
cmd->error = VIRTIO_GPU_RESP_ERR_INVALID_PARAMETER;
return;
}
}
if (!res->blob &&
@@ -1339,8 +1356,10 @@ void virtio_gpu_device_realize(DeviceState *qdev, Error **errp)
g->ctrl_vq = virtio_get_queue(vdev, 0);
g->cursor_vq = virtio_get_queue(vdev, 1);
g->ctrl_bh = qemu_bh_new(virtio_gpu_ctrl_bh, g);
g->cursor_bh = qemu_bh_new(virtio_gpu_cursor_bh, g);
g->ctrl_bh = qemu_bh_new_guarded(virtio_gpu_ctrl_bh, g,
&qdev->mem_reentrancy_guard);
g->cursor_bh = qemu_bh_new_guarded(virtio_gpu_cursor_bh, g,
&qdev->mem_reentrancy_guard);
QTAILQ_INIT(&g->reslist);
QTAILQ_INIT(&g->cmdq);
QTAILQ_INIT(&g->fenceq);
+8 -3
View File
@@ -168,6 +168,11 @@ static inline int stream_idle(struct Stream *s)
return !!(s->regs[R_DMASR] & DMASR_IDLE);
}
static inline int stream_halted(struct Stream *s)
{
return !!(s->regs[R_DMASR] & DMASR_HALTED);
}
static void stream_reset(struct Stream *s)
{
s->regs[R_DMASR] = DMASR_HALTED; /* starts up halted. */
@@ -269,7 +274,7 @@ static void stream_process_mem2s(struct Stream *s, StreamSink *tx_data_dev,
uint64_t addr;
bool eop;
if (!stream_running(s) || stream_idle(s)) {
if (!stream_running(s) || stream_idle(s) || stream_halted(s)) {
return;
}
@@ -326,7 +331,7 @@ static size_t stream_process_s2mem(struct Stream *s, unsigned char *buf,
unsigned int rxlen;
size_t pos = 0;
if (!stream_running(s) || stream_idle(s)) {
if (!stream_running(s) || stream_idle(s) || stream_halted(s)) {
return 0;
}
@@ -407,7 +412,7 @@ xilinx_axidma_data_stream_can_push(StreamSink *obj,
XilinxAXIDMAStreamSink *ds = XILINX_AXI_DMA_DATA_STREAM(obj);
struct Stream *s = &ds->dma->streams[1];
if (!stream_running(s) || stream_idle(s)) {
if (!stream_running(s) || stream_idle(s) || stream_halted(s)) {
ds->dma->notify = notify;
ds->dma->notify_opaque = notify_opaque;
return false;
+13 -2
View File
@@ -122,6 +122,7 @@ static FWCfgState *create_fw_cfg(MachineState *ms)
{
FWCfgState *fw_cfg;
uint64_t val;
const char qemu_version[] = QEMU_VERSION;
fw_cfg = fw_cfg_init_mem(FW_CFG_IO_BASE, FW_CFG_IO_BASE + 4);
fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, ms->smp.cpus);
@@ -147,6 +148,10 @@ static FWCfgState *create_fw_cfg(MachineState *ms)
fw_cfg_add_i16(fw_cfg, FW_CFG_BOOT_DEVICE, ms->boot_config.order[0]);
qemu_register_boot_set(fw_cfg_boot_set, fw_cfg);
fw_cfg_add_file(fw_cfg, "/etc/qemu-version",
g_memdup(qemu_version, sizeof(qemu_version)),
sizeof(qemu_version));
return fw_cfg;
}
@@ -417,10 +422,16 @@ static void hppa_machine_reset(MachineState *ms, ShutdownCause reason)
/* Start all CPUs at the firmware entry point.
* Monarch CPU will initialize firmware, secondary CPUs
* will enter a small idle look and wait for rendevouz. */
* will enter a small idle loop and wait for rendevouz. */
for (i = 0; i < smp_cpus; i++) {
cpu_set_pc(CPU(cpu[i]), firmware_entry);
CPUState *cs = CPU(cpu[i]);
cpu_set_pc(cs, firmware_entry);
cpu[i]->env.psw = PSW_Q;
cpu[i]->env.gr[5] = CPU_HPA + i * 0x1000;
cs->exception_index = -1;
cs->halted = 0;
}
/* already initialized by machine_hppa_init()? */
+9 -27
View File
@@ -226,7 +226,7 @@ static int aspeed_i2c_dma_read(AspeedI2CBus *bus, uint8_t *data)
return 0;
}
static int aspeed_i2c_bus_send(AspeedI2CBus *bus, uint8_t pool_start)
static int aspeed_i2c_bus_send(AspeedI2CBus *bus)
{
AspeedI2CClass *aic = ASPEED_I2C_GET_CLASS(bus->controller);
int ret = -1;
@@ -236,10 +236,10 @@ static int aspeed_i2c_bus_send(AspeedI2CBus *bus, uint8_t pool_start)
uint32_t reg_byte_buf = aspeed_i2c_bus_byte_buf_offset(bus);
uint32_t reg_dma_len = aspeed_i2c_bus_dma_len_offset(bus);
int pool_tx_count = SHARED_ARRAY_FIELD_EX32(bus->regs, reg_pool_ctrl,
TX_COUNT);
TX_COUNT) + 1;
if (SHARED_ARRAY_FIELD_EX32(bus->regs, reg_cmd, TX_BUFF_EN)) {
for (i = pool_start; i < pool_tx_count; i++) {
for (i = 0; i < pool_tx_count; i++) {
uint8_t *pool_base = aic->bus_pool_base(bus);
trace_aspeed_i2c_bus_send("BUF", i + 1, pool_tx_count,
@@ -273,7 +273,7 @@ static int aspeed_i2c_bus_send(AspeedI2CBus *bus, uint8_t pool_start)
}
SHARED_ARRAY_FIELD_DP32(bus->regs, reg_cmd, TX_DMA_EN, 0);
} else {
trace_aspeed_i2c_bus_send("BYTE", pool_start, 1,
trace_aspeed_i2c_bus_send("BYTE", 0, 1,
bus->regs[reg_byte_buf]);
ret = i2c_send(bus->bus, bus->regs[reg_byte_buf]);
}
@@ -293,7 +293,7 @@ static void aspeed_i2c_bus_recv(AspeedI2CBus *bus)
uint32_t reg_dma_len = aspeed_i2c_bus_dma_len_offset(bus);
uint32_t reg_dma_addr = aspeed_i2c_bus_dma_addr_offset(bus);
int pool_rx_count = SHARED_ARRAY_FIELD_EX32(bus->regs, reg_pool_ctrl,
RX_COUNT);
RX_SIZE) + 1;
if (SHARED_ARRAY_FIELD_EX32(bus->regs, reg_cmd, RX_BUFF_EN)) {
uint8_t *pool_base = aic->bus_pool_base(bus);
@@ -418,7 +418,7 @@ static void aspeed_i2c_bus_cmd_dump(AspeedI2CBus *bus)
uint32_t reg_intr_sts = aspeed_i2c_bus_intr_sts_offset(bus);
uint32_t reg_dma_len = aspeed_i2c_bus_dma_len_offset(bus);
if (SHARED_ARRAY_FIELD_EX32(bus->regs, reg_cmd, RX_BUFF_EN)) {
count = SHARED_ARRAY_FIELD_EX32(bus->regs, reg_pool_ctrl, TX_COUNT);
count = SHARED_ARRAY_FIELD_EX32(bus->regs, reg_pool_ctrl, TX_COUNT) + 1;
} else if (SHARED_ARRAY_FIELD_EX32(bus->regs, reg_cmd, RX_DMA_EN)) {
count = bus->regs[reg_dma_len];
} else { /* BYTE mode */
@@ -446,10 +446,8 @@ static void aspeed_i2c_bus_cmd_dump(AspeedI2CBus *bus)
*/
static void aspeed_i2c_bus_handle_cmd(AspeedI2CBus *bus, uint64_t value)
{
uint8_t pool_start = 0;
uint32_t reg_intr_sts = aspeed_i2c_bus_intr_sts_offset(bus);
uint32_t reg_cmd = aspeed_i2c_bus_cmd_offset(bus);
uint32_t reg_pool_ctrl = aspeed_i2c_bus_pool_ctrl_offset(bus);
uint32_t reg_dma_len = aspeed_i2c_bus_dma_len_offset(bus);
if (!aspeed_i2c_check_sram(bus)) {
@@ -483,27 +481,11 @@ static void aspeed_i2c_bus_handle_cmd(AspeedI2CBus *bus, uint64_t value)
SHARED_ARRAY_FIELD_DP32(bus->regs, reg_cmd, M_START_CMD, 0);
/*
* The START command is also a TX command, as the slave
* address is sent on the bus. Drop the TX flag if nothing
* else needs to be sent in this sequence.
*/
if (SHARED_ARRAY_FIELD_EX32(bus->regs, reg_cmd, TX_BUFF_EN)) {
if (SHARED_ARRAY_FIELD_EX32(bus->regs, reg_pool_ctrl, TX_COUNT)
== 1) {
SHARED_ARRAY_FIELD_DP32(bus->regs, reg_cmd, M_TX_CMD, 0);
} else {
/*
* Increase the start index in the TX pool buffer to
* skip the address byte.
*/
pool_start++;
}
} else if (SHARED_ARRAY_FIELD_EX32(bus->regs, reg_cmd, TX_DMA_EN)) {
if (SHARED_ARRAY_FIELD_EX32(bus->regs, reg_cmd, TX_DMA_EN)) {
if (bus->regs[reg_dma_len] == 0) {
SHARED_ARRAY_FIELD_DP32(bus->regs, reg_cmd, M_TX_CMD, 0);
}
} else {
} else if (!SHARED_ARRAY_FIELD_EX32(bus->regs, reg_cmd, TX_BUFF_EN)) {
SHARED_ARRAY_FIELD_DP32(bus->regs, reg_cmd, M_TX_CMD, 0);
}
@@ -520,7 +502,7 @@ static void aspeed_i2c_bus_handle_cmd(AspeedI2CBus *bus, uint64_t value)
if (SHARED_ARRAY_FIELD_EX32(bus->regs, reg_cmd, M_TX_CMD)) {
aspeed_i2c_set_state(bus, I2CD_MTXD);
if (aspeed_i2c_bus_send(bus, pool_start)) {
if (aspeed_i2c_bus_send(bus)) {
SHARED_ARRAY_FIELD_DP32(bus->regs, reg_intr_sts, TX_NAK, 1);
i2c_end_transfer(bus->bus);
} else {
+1 -1
View File
@@ -70,7 +70,7 @@ static int bitbang_i2c_ret(bitbang_i2c_interface *i2c, int level)
return level & i2c->last_data;
}
/* Leave device data pin unodified. */
/* Leave device data pin unmodified. */
static int bitbang_i2c_nop(bitbang_i2c_interface *i2c)
{
return bitbang_i2c_ret(i2c, i2c->device_out);
+1 -1
View File
@@ -5,7 +5,7 @@ bitbang_i2c_state(const char *old_state, const char *new_state) "state %s -> %s"
bitbang_i2c_addr(uint8_t addr) "Address 0x%02x"
bitbang_i2c_send(uint8_t byte) "TX byte 0x%02x"
bitbang_i2c_recv(uint8_t byte) "RX byte 0x%02x"
bitbang_i2c_data(unsigned dat, unsigned clk, unsigned old_out, unsigned new_out) "dat %u clk %u out %u -> %u"
bitbang_i2c_data(unsigned clk, unsigned dat, unsigned old_out, unsigned new_out) "clk %u dat %u out %u -> %u"
# core.c
+14 -9
View File
@@ -755,6 +755,8 @@ static int vtd_get_pdire_from_pdir_table(dma_addr_t pasid_dir_base,
return -VTD_FR_PASID_TABLE_INV;
}
pdire->val = le64_to_cpu(pdire->val);
return 0;
}
@@ -779,6 +781,9 @@ static int vtd_get_pe_in_pasid_leaf_table(IntelIOMMUState *s,
pe, entry_size, MEMTXATTRS_UNSPECIFIED)) {
return -VTD_FR_PASID_TABLE_INV;
}
for (size_t i = 0; i < ARRAY_SIZE(pe->val); i++) {
pe->val[i] = le64_to_cpu(pe->val[i]);
}
/* Do translation type check */
if (!vtd_pe_type_check(x86_iommu, pe)) {
@@ -3322,14 +3327,15 @@ static int vtd_irte_get(IntelIOMMUState *iommu, uint16_t index,
return -VTD_FR_IR_ROOT_INVAL;
}
trace_vtd_ir_irte_get(index, le64_to_cpu(entry->data[1]),
le64_to_cpu(entry->data[0]));
entry->data[0] = le64_to_cpu(entry->data[0]);
entry->data[1] = le64_to_cpu(entry->data[1]);
trace_vtd_ir_irte_get(index, entry->data[1], entry->data[0]);
if (!entry->irte.present) {
error_report_once("%s: detected non-present IRTE "
"(index=%u, high=0x%" PRIx64 ", low=0x%" PRIx64 ")",
__func__, index, le64_to_cpu(entry->data[1]),
le64_to_cpu(entry->data[0]));
__func__, index, entry->data[1], entry->data[0]);
return -VTD_FR_IR_ENTRY_P;
}
@@ -3337,14 +3343,13 @@ static int vtd_irte_get(IntelIOMMUState *iommu, uint16_t index,
entry->irte.__reserved_2) {
error_report_once("%s: detected non-zero reserved IRTE "
"(index=%u, high=0x%" PRIx64 ", low=0x%" PRIx64 ")",
__func__, index, le64_to_cpu(entry->data[1]),
le64_to_cpu(entry->data[0]));
__func__, index, entry->data[1], entry->data[0]);
return -VTD_FR_IR_IRTE_RSVD;
}
if (sid != X86_IOMMU_SID_INVALID) {
/* Validate IRTE SID */
source_id = le32_to_cpu(entry->irte.source_id);
source_id = entry->irte.source_id;
switch (entry->irte.sid_vtype) {
case VTD_SVT_NONE:
break;
@@ -3398,7 +3403,7 @@ static int vtd_remap_irq_get(IntelIOMMUState *iommu, uint16_t index,
irq->trigger_mode = irte.irte.trigger_mode;
irq->vector = irte.irte.vector;
irq->delivery_mode = irte.irte.delivery_mode;
irq->dest = le32_to_cpu(irte.irte.dest_id);
irq->dest = irte.irte.dest_id;
if (!iommu->intr_eime) {
#define VTD_IR_APIC_DEST_MASK (0xff00ULL)
#define VTD_IR_APIC_DEST_SHIFT (8)
@@ -3453,7 +3458,7 @@ static int vtd_interrupt_remap_msi(IntelIOMMUState *iommu,
goto out;
}
index = addr.addr.index_h << 15 | le16_to_cpu(addr.addr.index_l);
index = addr.addr.index_h << 15 | addr.addr.index_l;
#define VTD_IR_MSI_DATA_SUBHANDLE (0x0000ffff)
#define VTD_IR_MSI_DATA_RESERVED (0xffff0000)
+9
View File
@@ -321,12 +321,21 @@ typedef enum VTDFaultReason {
/* Interrupt Entry Cache Invalidation Descriptor: VT-d 6.5.2.7. */
struct VTDInvDescIEC {
#if HOST_BIG_ENDIAN
uint64_t reserved_2:16;
uint64_t index:16; /* Start index to invalidate */
uint64_t index_mask:5; /* 2^N for continuous int invalidation */
uint64_t resved_1:22;
uint64_t granularity:1; /* If set, it's global IR invalidation */
uint64_t type:4; /* Should always be 0x4 */
#else
uint32_t type:4; /* Should always be 0x4 */
uint32_t granularity:1; /* If set, it's global IR invalidation */
uint32_t resved_1:22;
uint32_t index_mask:5; /* 2^N for continuous int invalidation */
uint32_t index:16; /* Start index to invalidate */
uint32_t reserved_2:16;
#endif
};
typedef struct VTDInvDescIEC VTDInvDescIEC;
+2 -2
View File
@@ -1587,7 +1587,7 @@ static int allocate_pirq(XenEvtchnState *s, int type, int gsi)
found:
pirq_inuse_word(s, pirq) |= pirq_inuse_bit(pirq);
if (gsi >= 0) {
assert(gsi <= IOAPIC_NUM_PINS);
assert(gsi < IOAPIC_NUM_PINS);
s->gsi_pirq[gsi] = pirq;
}
s->pirq[pirq].gsi = gsi;
@@ -1601,7 +1601,7 @@ bool xen_evtchn_set_gsi(int gsi, int level)
assert(qemu_mutex_iothread_locked());
if (!s || gsi < 0 || gsi > IOAPIC_NUM_PINS) {
if (!s || gsi < 0 || gsi >= IOAPIC_NUM_PINS) {
return false;
}
+1 -1
View File
@@ -1688,7 +1688,7 @@ static struct qemu_xs_handle *xs_be_open(void)
XenXenstoreState *s = xen_xenstore_singleton;
struct qemu_xs_handle *h;
if (!s && !s->impl) {
if (!s || !s->impl) {
errno = -ENOSYS;
return NULL;
}
+1 -1
View File
@@ -63,7 +63,7 @@ void x86_iommu_irq_to_msi_message(X86IOMMUIrq *irq, MSIMessage *msg_out)
msg.redir_hint = irq->redir_hint;
msg.dest = irq->dest;
msg.__addr_hi = irq->dest & 0xffffff00;
msg.__addr_head = cpu_to_le32(0xfee);
msg.__addr_head = 0xfee;
/* Keep this from original MSI address bits */
msg.__not_used = irq->msi_addr_last_bits;
+81 -32
View File
@@ -41,9 +41,10 @@
#include "trace.h"
static void check_cmd(AHCIState *s, int port);
static int handle_cmd(AHCIState *s, int port, uint8_t slot);
static void handle_cmd(AHCIState *s, int port, uint8_t slot);
static void ahci_reset_port(AHCIState *s, int port);
static bool ahci_write_fis_d2h(AHCIDevice *ad);
static bool ahci_write_fis_d2h(AHCIDevice *ad, bool d2h_fis_i);
static void ahci_clear_cmd_issue(AHCIDevice *ad, uint8_t slot);
static void ahci_init_d2h(AHCIDevice *ad);
static int ahci_dma_prepare_buf(const IDEDMA *dma, int32_t limit);
static bool ahci_map_clb_address(AHCIDevice *ad);
@@ -328,6 +329,11 @@ static void ahci_port_write(AHCIState *s, int port, int offset, uint32_t val)
ahci_check_irq(s);
break;
case AHCI_PORT_REG_CMD:
if ((pr->cmd & PORT_CMD_START) && !(val & PORT_CMD_START)) {
pr->scr_act = 0;
pr->cmd_issue = 0;
}
/* Block any Read-only fields from being set;
* including LIST_ON and FIS_ON.
* The spec requires to set ICC bits to zero after the ICC change
@@ -591,9 +597,8 @@ static void check_cmd(AHCIState *s, int port)
if ((pr->cmd & PORT_CMD_START) && pr->cmd_issue) {
for (slot = 0; (slot < 32) && pr->cmd_issue; slot++) {
if ((pr->cmd_issue & (1U << slot)) &&
!handle_cmd(s, port, slot)) {
pr->cmd_issue &= ~(1U << slot);
if (pr->cmd_issue & (1U << slot)) {
handle_cmd(s, port, slot);
}
}
}
@@ -618,7 +623,7 @@ static void ahci_init_d2h(AHCIDevice *ad)
return;
}
if (ahci_write_fis_d2h(ad)) {
if (ahci_write_fis_d2h(ad, true)) {
ad->init_d2h_sent = true;
/* We're emulating receiving the first Reg H2D Fis from the device;
* Update the SIG register, but otherwise proceed as normal. */
@@ -801,8 +806,14 @@ static void ahci_write_fis_sdb(AHCIState *s, NCQTransferState *ncq_tfs)
pr->scr_act &= ~ad->finished;
ad->finished = 0;
/* Trigger IRQ if interrupt bit is set (which currently, it always is) */
if (sdb_fis->flags & 0x40) {
/*
* TFES IRQ is always raised if ERR_STAT is set, regardless of I bit.
* If ERR_STAT is not set, trigger SDBS IRQ if interrupt bit is set
* (which currently, it always is).
*/
if (sdb_fis->status & ERR_STAT) {
ahci_trigger_irq(s, ad, AHCI_PORT_IRQ_BIT_TFES);
} else if (sdb_fis->flags & 0x40) {
ahci_trigger_irq(s, ad, AHCI_PORT_IRQ_BIT_SDBS);
}
}
@@ -850,7 +861,7 @@ static void ahci_write_fis_pio(AHCIDevice *ad, uint16_t len, bool pio_fis_i)
}
}
static bool ahci_write_fis_d2h(AHCIDevice *ad)
static bool ahci_write_fis_d2h(AHCIDevice *ad, bool d2h_fis_i)
{
AHCIPortRegs *pr = &ad->port_regs;
uint8_t *d2h_fis;
@@ -864,7 +875,7 @@ static bool ahci_write_fis_d2h(AHCIDevice *ad)
d2h_fis = &ad->res_fis[RES_FIS_RFIS];
d2h_fis[0] = SATA_FIS_TYPE_REGISTER_D2H;
d2h_fis[1] = (1 << 6); /* interrupt bit */
d2h_fis[1] = d2h_fis_i ? (1 << 6) : 0; /* interrupt bit */
d2h_fis[2] = s->status;
d2h_fis[3] = s->error;
@@ -890,7 +901,10 @@ static bool ahci_write_fis_d2h(AHCIDevice *ad)
ahci_trigger_irq(ad->hba, ad, AHCI_PORT_IRQ_BIT_TFES);
}
ahci_trigger_irq(ad->hba, ad, AHCI_PORT_IRQ_BIT_DHRS);
if (d2h_fis_i) {
ahci_trigger_irq(ad->hba, ad, AHCI_PORT_IRQ_BIT_DHRS);
}
return true;
}
@@ -998,7 +1012,6 @@ static void ncq_err(NCQTransferState *ncq_tfs)
ide_state->error = ABRT_ERR;
ide_state->status = READY_STAT | ERR_STAT;
ncq_tfs->drive->port_regs.scr_err |= (1 << ncq_tfs->tag);
qemu_sglist_destroy(&ncq_tfs->sglist);
ncq_tfs->used = 0;
}
@@ -1008,7 +1021,7 @@ static void ncq_finish(NCQTransferState *ncq_tfs)
/* If we didn't error out, set our finished bit. Errored commands
* do not get a bit set for the SDB FIS ACT register, nor do they
* clear the outstanding bit in scr_act (PxSACT). */
if (!(ncq_tfs->drive->port_regs.scr_err & (1 << ncq_tfs->tag))) {
if (ncq_tfs->used) {
ncq_tfs->drive->finished |= (1 << ncq_tfs->tag);
}
@@ -1120,6 +1133,24 @@ static void process_ncq_command(AHCIState *s, int port, const uint8_t *cmd_fis,
return;
}
/*
* A NCQ command clears the bit in PxCI after the command has been QUEUED
* successfully (ERROR not set, BUSY and DRQ cleared).
*
* For NCQ commands, PxCI will always be cleared here.
*
* (Once the NCQ command is COMPLETED, the device will send a SDB FIS with
* the interrupt bit set, which will clear PxSACT and raise an interrupt.)
*/
ahci_clear_cmd_issue(ad, slot);
/*
* In reality, for NCQ commands, PxCI is cleared after receiving a D2H FIS
* without the interrupt bit set, but since ahci_write_fis_d2h() can raise
* an IRQ on error, we need to call them in reverse order.
*/
ahci_write_fis_d2h(ad, false);
ncq_tfs->used = 1;
ncq_tfs->drive = ad;
ncq_tfs->slot = slot;
@@ -1192,6 +1223,7 @@ static void handle_reg_h2d_fis(AHCIState *s, int port,
{
IDEState *ide_state = &s->dev[port].port.ifs[0];
AHCICmdHdr *cmd = get_cmd_header(s, port, slot);
AHCIDevice *ad = &s->dev[port];
uint16_t opts = le16_to_cpu(cmd->opts);
if (cmd_fis[1] & 0x0F) {
@@ -1268,11 +1300,19 @@ static void handle_reg_h2d_fis(AHCIState *s, int port,
/* Reset transferred byte counter */
cmd->status = 0;
/*
* A non-NCQ command clears the bit in PxCI after the command has COMPLETED
* successfully (ERROR not set, BUSY and DRQ cleared).
*
* For non-NCQ commands, PxCI will always be cleared by ahci_cmd_done().
*/
ad->busy_slot = slot;
/* We're ready to process the command in FIS byte 2. */
ide_bus_exec_cmd(&s->dev[port].port, cmd_fis[2]);
}
static int handle_cmd(AHCIState *s, int port, uint8_t slot)
static void handle_cmd(AHCIState *s, int port, uint8_t slot)
{
IDEState *ide_state;
uint64_t tbl_addr;
@@ -1283,12 +1323,12 @@ static int handle_cmd(AHCIState *s, int port, uint8_t slot)
if (s->dev[port].port.ifs[0].status & (BUSY_STAT|DRQ_STAT)) {
/* Engine currently busy, try again later */
trace_handle_cmd_busy(s, port);
return -1;
return;
}
if (!s->dev[port].lst) {
trace_handle_cmd_nolist(s, port);
return -1;
return;
}
cmd = get_cmd_header(s, port, slot);
/* remember current slot handle for later */
@@ -1298,7 +1338,7 @@ static int handle_cmd(AHCIState *s, int port, uint8_t slot)
ide_state = &s->dev[port].port.ifs[0];
if (!ide_state->blk) {
trace_handle_cmd_badport(s, port);
return -1;
return;
}
tbl_addr = le64_to_cpu(cmd->tbl_addr);
@@ -1307,7 +1347,7 @@ static int handle_cmd(AHCIState *s, int port, uint8_t slot)
DMA_DIRECTION_TO_DEVICE, MEMTXATTRS_UNSPECIFIED);
if (!cmd_fis) {
trace_handle_cmd_badfis(s, port);
return -1;
return;
} else if (cmd_len != 0x80) {
ahci_trigger_irq(s, &s->dev[port], AHCI_PORT_IRQ_BIT_HBFS);
trace_handle_cmd_badmap(s, port, cmd_len);
@@ -1331,15 +1371,6 @@ static int handle_cmd(AHCIState *s, int port, uint8_t slot)
out:
dma_memory_unmap(s->as, cmd_fis, cmd_len, DMA_DIRECTION_TO_DEVICE,
cmd_len);
if (s->dev[port].port.ifs[0].status & (BUSY_STAT|DRQ_STAT)) {
/* async command, complete later */
s->dev[port].busy_slot = slot;
return -1;
}
/* done handling the command */
return 0;
}
/* Transfer PIO data between RAM and device */
@@ -1493,23 +1524,41 @@ static int ahci_dma_rw_buf(const IDEDMA *dma, bool is_write)
return 1;
}
static void ahci_clear_cmd_issue(AHCIDevice *ad, uint8_t slot)
{
IDEState *ide_state = &ad->port.ifs[0];
if (!(ide_state->status & ERR_STAT) &&
!(ide_state->status & (BUSY_STAT | DRQ_STAT))) {
ad->port_regs.cmd_issue &= ~(1 << slot);
}
}
/* Non-NCQ command is done - This function is never called for NCQ commands. */
static void ahci_cmd_done(const IDEDMA *dma)
{
AHCIDevice *ad = DO_UPCAST(AHCIDevice, dma, dma);
IDEState *ide_state = &ad->port.ifs[0];
trace_ahci_cmd_done(ad->hba, ad->port_no);
/* no longer busy */
if (ad->busy_slot != -1) {
ad->port_regs.cmd_issue &= ~(1 << ad->busy_slot);
ahci_clear_cmd_issue(ad, ad->busy_slot);
ad->busy_slot = -1;
}
/* update d2h status */
ahci_write_fis_d2h(ad);
/*
* In reality, for non-NCQ commands, PxCI is cleared after receiving a D2H
* FIS with the interrupt bit set, but since ahci_write_fis_d2h() will raise
* an IRQ, we need to call them in reverse order.
*/
ahci_write_fis_d2h(ad, true);
if (ad->port_regs.cmd_issue && !ad->check_bh) {
ad->check_bh = qemu_bh_new(ahci_check_cmd_bh, ad);
if (!(ide_state->status & ERR_STAT) &&
ad->port_regs.cmd_issue && !ad->check_bh) {
ad->check_bh = qemu_bh_new_guarded(ahci_check_cmd_bh, ad,
&ad->mem_reentrancy_guard);
qemu_bh_schedule(ad->check_bh);
}
}
+1
View File
@@ -321,6 +321,7 @@ struct AHCIDevice {
bool init_d2h_sent;
AHCICmdHdr *cur_cmd;
NCQTransferState ncq_tfs[AHCI_MAX_CMDS];
MemReentrancyGuard mem_reentrancy_guard;
};
struct AHCIPCIState {
+4 -2
View File
@@ -513,6 +513,7 @@ BlockAIOCB *ide_issue_trim(
BlockCompletionFunc *cb, void *cb_opaque, void *opaque)
{
IDEState *s = opaque;
IDEDevice *dev = s->unit ? s->bus->slave : s->bus->master;
TrimAIOCB *iocb;
/* Paired with a decrement in ide_trim_bh_cb() */
@@ -520,7 +521,8 @@ BlockAIOCB *ide_issue_trim(
iocb = blk_aio_get(&trim_aiocb_info, s->blk, cb, cb_opaque);
iocb->s = s;
iocb->bh = qemu_bh_new(ide_trim_bh_cb, iocb);
iocb->bh = qemu_bh_new_guarded(ide_trim_bh_cb, iocb,
&DEVICE(dev)->mem_reentrancy_guard);
iocb->ret = 0;
iocb->qiov = qiov;
iocb->i = -1;
@@ -531,9 +533,9 @@ BlockAIOCB *ide_issue_trim(
void ide_abort_command(IDEState *s)
{
ide_transfer_stop(s);
s->status = READY_STAT | ERR_STAT;
s->error = ABRT_ERR;
ide_transfer_stop(s);
}
static void ide_set_retry(IDEState *s)
+1 -1
View File
@@ -118,7 +118,7 @@ static void piix_ide_reset(DeviceState *dev)
pci_set_word(pci_conf + PCI_COMMAND, 0x0000);
pci_set_word(pci_conf + PCI_STATUS,
PCI_STATUS_DEVSEL_MEDIUM | PCI_STATUS_FAST_BACK);
pci_set_byte(pci_conf + 0x20, 0x01); /* BMIBA: 20-23h */
pci_set_long(pci_conf + 0x20, 0x1); /* BMIBA: 20-23h */
}
static bool pci_piix_init_bus(PCIIDEState *d, unsigned i, Error **errp)
+2 -5
View File
@@ -49,12 +49,9 @@ static void aw_a10_pic_update(AwA10PICState *s)
static void aw_a10_pic_set_irq(void *opaque, int irq, int level)
{
AwA10PICState *s = opaque;
uint32_t *pending_reg = &s->irq_pending[irq / 32];
if (level) {
set_bit(irq % 32, (void *)&s->irq_pending[irq / 32]);
} else {
clear_bit(irq % 32, (void *)&s->irq_pending[irq / 32]);
}
*pending_reg = deposit32(*pending_reg, irq % 32, 1, !!level);
aw_a10_pic_update(s);
}
+7
View File
@@ -885,6 +885,13 @@ static void apic_realize(DeviceState *dev, Error **errp)
memory_region_init_io(&s->io_memory, OBJECT(s), &apic_io_ops, s, "apic-msi",
APIC_SPACE_SIZE);
/*
* apic-msi's apic_mem_write can call into ioapic_eoi_broadcast, which can
* write back to apic-msi. As such mark the apic-msi region re-entrancy
* safe.
*/
s->io_memory.disable_reentrancy_guard = true;
s->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, apic_timer, s);
local_apics[s->id] = s;
+4
View File
@@ -215,6 +215,10 @@ static void loongarch_ipi_init(Object *obj)
for (cpu = 0; cpu < MAX_IPI_CORE_NUM; cpu++) {
memory_region_init_io(&s->ipi_iocsr_mem[cpu], obj, &loongarch_ipi_ops,
&lams->ipi_core[cpu], "loongarch_ipi_iocsr", 0x48);
/* loongarch_ipi_iocsr performs re-entrant IO through ipi_send */
s->ipi_iocsr_mem[cpu].disable_reentrancy_guard = true;
sysbus_init_mmio(sbd, &s->ipi_iocsr_mem[cpu]);
memory_region_init_io(&s->ipi64_iocsr_mem[cpu], obj, &loongarch_ipi64_ops,
+6 -5
View File
@@ -64,13 +64,13 @@ static void riscv_aclint_mtimer_write_timecmp(RISCVAclintMTimerState *mtimer,
uint64_t next;
uint64_t diff;
uint64_t rtc_r = cpu_riscv_read_rtc(mtimer);
uint64_t rtc = cpu_riscv_read_rtc(mtimer);
/* Compute the relative hartid w.r.t the socket */
hartid = hartid - mtimer->hartid_base;
mtimer->timecmp[hartid] = value;
if (mtimer->timecmp[hartid] <= rtc_r) {
if (mtimer->timecmp[hartid] <= rtc) {
/*
* If we're setting an MTIMECMP value in the "past",
* immediately raise the timer interrupt
@@ -81,7 +81,7 @@ static void riscv_aclint_mtimer_write_timecmp(RISCVAclintMTimerState *mtimer,
/* otherwise, set up the future timer interrupt */
qemu_irq_lower(mtimer->timer_irqs[hartid]);
diff = mtimer->timecmp[hartid] - rtc_r;
diff = mtimer->timecmp[hartid] - rtc;
/* back to ns (note args switched in muldiv64) */
uint64_t ns_diff = muldiv64(diff, NANOSECONDS_PER_SECOND, timebase_freq);
@@ -208,11 +208,12 @@ static void riscv_aclint_mtimer_write(void *opaque, hwaddr addr,
return;
} else if (addr == mtimer->time_base || addr == mtimer->time_base + 4) {
uint64_t rtc_r = cpu_riscv_read_rtc_raw(mtimer->timebase_freq);
uint64_t rtc = cpu_riscv_read_rtc(mtimer);
if (addr == mtimer->time_base) {
if (size == 4) {
/* time_lo for RV32/RV64 */
mtimer->time_delta = ((rtc_r & ~0xFFFFFFFFULL) | value) - rtc_r;
mtimer->time_delta = ((rtc & ~0xFFFFFFFFULL) | value) - rtc_r;
} else {
/* time for RV64 */
mtimer->time_delta = value - rtc_r;
@@ -220,7 +221,7 @@ static void riscv_aclint_mtimer_write(void *opaque, hwaddr addr,
} else {
if (size == 4) {
/* time_hi for RV32/RV64 */
mtimer->time_delta = (value << 32 | (rtc_r & 0xFFFFFFFF)) - rtc_r;
mtimer->time_delta = (value << 32 | (rtc & 0xFFFFFFFF)) - rtc_r;
} else {
qemu_log_mask(LOG_GUEST_ERROR,
"aclint-mtimer: invalid time_hi write: %08x",
-2
View File
@@ -29,7 +29,6 @@
#include "qemu/datadir.h"
#include "qapi/error.h"
#include "elf.h"
#include "kvm_mips.h"
#include "hw/char/serial.h"
#include "hw/intc/loongson_liointc.h"
#include "hw/mips/mips.h"
@@ -617,7 +616,6 @@ static void loongson3v_machine_class_init(ObjectClass *oc, void *data)
mc->max_cpus = LOONGSON_MAX_VCPUS;
mc->default_ram_id = "loongson3.highram";
mc->default_ram_size = 1600 * MiB;
mc->kvm_type = mips_kvm_type;
mc->minimum_page_bits = 14;
}
+2 -2
View File
@@ -629,9 +629,9 @@ static void bl_setup_gt64120_jump_kernel(void **p, uint64_t run_addr,
/* Bus endianess is always reversed */
#if TARGET_BIG_ENDIAN
#define cpu_to_gt32 cpu_to_le32
#define cpu_to_gt32(x) (x)
#else
#define cpu_to_gt32 cpu_to_be32
#define cpu_to_gt32(x) bswap32(x)
#endif
/* setup MEM-to-PCI0 mapping as done by YAMON */
+1 -1
View File
@@ -189,7 +189,7 @@ static void do_hash_operation(AspeedHACEState *s, int algo, bool sg_mode,
bool acc_mode)
{
struct iovec iov[ASPEED_HACE_MAX_SG];
g_autofree uint8_t *digest_buf;
g_autofree uint8_t *digest_buf = NULL;
size_t digest_len = 0;
int niov = 0;
int i;
+7
View File
@@ -382,6 +382,13 @@ static void bcm2835_property_init(Object *obj)
memory_region_init_io(&s->iomem, OBJECT(s), &bcm2835_property_ops, s,
TYPE_BCM2835_PROPERTY, 0x10);
/*
* bcm2835_property_ops call into bcm2835_mbox, which in-turn reads from
* iomem. As such, mark iomem as re-entracy safe.
*/
s->iomem.disable_reentrancy_guard = true;
sysbus_init_mmio(SYS_BUS_DEVICE(s), &s->iomem);
sysbus_init_irq(SYS_BUS_DEVICE(s), &s->mbox_irq);
}
+4 -2
View File
@@ -228,8 +228,10 @@ static void imx_rngc_realize(DeviceState *dev, Error **errp)
sysbus_init_mmio(sbd, &s->iomem);
sysbus_init_irq(sbd, &s->irq);
s->self_test_bh = qemu_bh_new(imx_rngc_self_test, s);
s->seed_bh = qemu_bh_new(imx_rngc_seed, s);
s->self_test_bh = qemu_bh_new_guarded(imx_rngc_self_test, s,
&dev->mem_reentrancy_guard);
s->seed_bh = qemu_bh_new_guarded(imx_rngc_seed, s,
&dev->mem_reentrancy_guard);
}
static void imx_rngc_reset(DeviceState *dev)
+1 -1
View File
@@ -914,7 +914,7 @@ static void mac_dbdma_realize(DeviceState *dev, Error **errp)
{
DBDMAState *s = MAC_DBDMA(dev);
s->bh = qemu_bh_new(DBDMA_run_bh, s);
s->bh = qemu_bh_new_guarded(DBDMA_run_bh, s, &dev->mem_reentrancy_guard);
}
static void mac_dbdma_class_init(ObjectClass *oc, void *data)
+15 -7
View File
@@ -350,8 +350,13 @@ static void allwinner_sun8i_emac_get_desc(AwSun8iEmacState *s,
FrameDescriptor *desc,
uint32_t phys_addr)
{
dma_memory_read(&s->dma_as, phys_addr, desc, sizeof(*desc),
uint32_t desc_words[4];
dma_memory_read(&s->dma_as, phys_addr, &desc_words, sizeof(desc_words),
MEMTXATTRS_UNSPECIFIED);
desc->status = le32_to_cpu(desc_words[0]);
desc->status2 = le32_to_cpu(desc_words[1]);
desc->addr = le32_to_cpu(desc_words[2]);
desc->next = le32_to_cpu(desc_words[3]);
}
static uint32_t allwinner_sun8i_emac_next_desc(AwSun8iEmacState *s,
@@ -400,10 +405,15 @@ static uint32_t allwinner_sun8i_emac_tx_desc(AwSun8iEmacState *s,
}
static void allwinner_sun8i_emac_flush_desc(AwSun8iEmacState *s,
FrameDescriptor *desc,
const FrameDescriptor *desc,
uint32_t phys_addr)
{
dma_memory_write(&s->dma_as, phys_addr, desc, sizeof(*desc),
uint32_t desc_words[4];
desc_words[0] = cpu_to_le32(desc->status);
desc_words[1] = cpu_to_le32(desc->status2);
desc_words[2] = cpu_to_le32(desc->addr);
desc_words[3] = cpu_to_le32(desc->next);
dma_memory_write(&s->dma_as, phys_addr, &desc_words, sizeof(desc_words),
MEMTXATTRS_UNSPECIFIED);
}
@@ -638,8 +648,7 @@ static uint64_t allwinner_sun8i_emac_read(void *opaque, hwaddr offset,
break;
case REG_TX_CUR_BUF: /* Transmit Current Buffer */
if (s->tx_desc_curr != 0) {
dma_memory_read(&s->dma_as, s->tx_desc_curr, &desc, sizeof(desc),
MEMTXATTRS_UNSPECIFIED);
allwinner_sun8i_emac_get_desc(s, &desc, s->tx_desc_curr);
value = desc.addr;
} else {
value = 0;
@@ -652,8 +661,7 @@ static uint64_t allwinner_sun8i_emac_read(void *opaque, hwaddr offset,
break;
case REG_RX_CUR_BUF: /* Receive Current Buffer */
if (s->rx_desc_curr != 0) {
dma_memory_read(&s->dma_as, s->rx_desc_curr, &desc, sizeof(desc),
MEMTXATTRS_UNSPECIFIED);
allwinner_sun8i_emac_get_desc(s, &desc, s->rx_desc_curr);
value = desc.addr;
} else {
value = 0;
+5 -6
View File
@@ -637,9 +637,8 @@ xmit_seg(E1000State *s)
e1000x_inc_reg_if_not_full(s->mac_reg, TPT);
e1000x_grow_8reg_if_not_full(s->mac_reg, TOTL, s->tx.size + 4);
s->mac_reg[GPTC] = s->mac_reg[TPT];
s->mac_reg[GOTCL] = s->mac_reg[TOTL];
s->mac_reg[GOTCH] = s->mac_reg[TOTH];
e1000x_inc_reg_if_not_full(s->mac_reg, GPTC);
e1000x_grow_8reg_if_not_full(s->mac_reg, GOTCL, s->tx.size + 4);
}
static void
@@ -827,12 +826,10 @@ receive_filter(E1000State *s, const uint8_t *buf, int size)
}
if (ismcast && (rctl & E1000_RCTL_MPE)) { /* promiscuous mcast */
e1000x_inc_reg_if_not_full(s->mac_reg, MPRC);
return 1;
}
if (isbcast && (rctl & E1000_RCTL_BAM)) { /* broadcast enabled */
e1000x_inc_reg_if_not_full(s->mac_reg, BPRC);
return 1;
}
@@ -923,6 +920,7 @@ e1000_receive_iov(NetClientState *nc, const struct iovec *iov, int iovcnt)
size_t desc_offset;
size_t desc_size;
size_t total_size;
eth_pkt_types_e pkt_type;
if (!e1000x_hw_rx_enabled(s->mac_reg)) {
return -1;
@@ -972,6 +970,7 @@ e1000_receive_iov(NetClientState *nc, const struct iovec *iov, int iovcnt)
size -= 4;
}
pkt_type = get_eth_packet_type(PKT_GET_ETH_HDR(filter_buf));
rdh_start = s->mac_reg[RDH];
desc_offset = 0;
total_size = size + e1000x_fcs_len(s->mac_reg);
@@ -1037,7 +1036,7 @@ e1000_receive_iov(NetClientState *nc, const struct iovec *iov, int iovcnt)
}
} while (desc_offset < total_size);
e1000x_update_rx_total_stats(s->mac_reg, size, total_size);
e1000x_update_rx_total_stats(s->mac_reg, pkt_type, size, total_size);
n = E1000_ICS_RXT0;
if ((rdt = s->mac_reg[RDT]) < s->mac_reg[RDH])
+15 -36
View File
@@ -711,9 +711,8 @@ e1000e_on_tx_done_update_stats(E1000ECore *core, struct NetTxPkt *tx_pkt)
g_assert_not_reached();
}
core->mac[GPTC] = core->mac[TPT];
core->mac[GOTCL] = core->mac[TOTL];
core->mac[GOTCH] = core->mac[TOTH];
e1000x_inc_reg_if_not_full(core->mac, GPTC);
e1000x_grow_8reg_if_not_full(core->mac, GOTCL, tot_len);
}
static void
@@ -1488,24 +1487,10 @@ e1000e_write_to_rx_buffers(E1000ECore *core,
}
static void
e1000e_update_rx_stats(E1000ECore *core,
size_t data_size,
size_t data_fcs_size)
e1000e_update_rx_stats(E1000ECore *core, size_t pkt_size, size_t pkt_fcs_size)
{
e1000x_update_rx_total_stats(core->mac, data_size, data_fcs_size);
switch (net_rx_pkt_get_packet_type(core->rx_pkt)) {
case ETH_PKT_BCAST:
e1000x_inc_reg_if_not_full(core->mac, BPRC);
break;
case ETH_PKT_MCAST:
e1000x_inc_reg_if_not_full(core->mac, MPRC);
break;
default:
break;
}
eth_pkt_types_e pkt_type = net_rx_pkt_get_packet_type(core->rx_pkt);
e1000x_update_rx_total_stats(core->mac, pkt_type, pkt_size, pkt_fcs_size);
}
static inline bool
@@ -1700,12 +1685,9 @@ static ssize_t
e1000e_receive_internal(E1000ECore *core, const struct iovec *iov, int iovcnt,
bool has_vnet)
{
static const int maximum_ethernet_hdr_len = (ETH_HLEN + 4);
uint32_t n = 0;
uint8_t min_buf[ETH_ZLEN];
uint8_t buf[ETH_ZLEN];
struct iovec min_iov;
uint8_t *filter_buf;
size_t size, orig_size;
size_t iov_ofs = 0;
E1000E_RxRing rxr;
@@ -1728,24 +1710,21 @@ e1000e_receive_internal(E1000ECore *core, const struct iovec *iov, int iovcnt,
net_rx_pkt_unset_vhdr(core->rx_pkt);
}
filter_buf = iov->iov_base + iov_ofs;
orig_size = iov_size(iov, iovcnt);
size = orig_size - iov_ofs;
/* Pad to minimum Ethernet frame length */
if (size < sizeof(min_buf)) {
iov_to_buf(iov, iovcnt, iov_ofs, min_buf, size);
memset(&min_buf[size], 0, sizeof(min_buf) - size);
if (size < sizeof(buf)) {
iov_to_buf(iov, iovcnt, iov_ofs, buf, size);
memset(&buf[size], 0, sizeof(buf) - size);
e1000x_inc_reg_if_not_full(core->mac, RUC);
min_iov.iov_base = filter_buf = min_buf;
min_iov.iov_len = size = sizeof(min_buf);
min_iov.iov_base = buf;
min_iov.iov_len = size = sizeof(buf);
iovcnt = 1;
iov = &min_iov;
iov_ofs = 0;
} else if (iov->iov_len < maximum_ethernet_hdr_len) {
/* This is very unlikely, but may happen. */
iov_to_buf(iov, iovcnt, iov_ofs, min_buf, maximum_ethernet_hdr_len);
filter_buf = min_buf;
} else {
iov_to_buf(iov, iovcnt, iov_ofs, buf, ETH_HLEN + 4);
}
/* Discard oversized packets if !LPE and !SBP. */
@@ -1754,9 +1733,9 @@ e1000e_receive_internal(E1000ECore *core, const struct iovec *iov, int iovcnt,
}
net_rx_pkt_set_packet_type(core->rx_pkt,
get_eth_packet_type(PKT_GET_ETH_HDR(filter_buf)));
get_eth_packet_type(PKT_GET_ETH_HDR(buf)));
if (!e1000e_receive_filter(core, filter_buf, size)) {
if (!e1000e_receive_filter(core, buf, size)) {
trace_e1000e_rx_flt_dropped();
return orig_size;
}
+20 -8
View File
@@ -80,7 +80,6 @@ bool e1000x_rx_group_filter(uint32_t *mac, const uint8_t *buf)
f = mta_shift[(rctl >> E1000_RCTL_MO_SHIFT) & 3];
f = (((buf[5] << 8) | buf[4]) >> f) & 0xfff;
if (mac[MTA + (f >> 5)] & (1 << (f & 0x1f))) {
e1000x_inc_reg_if_not_full(mac, MPRC);
return true;
}
@@ -212,23 +211,36 @@ e1000x_rxbufsize(uint32_t rctl)
void
e1000x_update_rx_total_stats(uint32_t *mac,
size_t data_size,
size_t data_fcs_size)
eth_pkt_types_e pkt_type,
size_t pkt_size,
size_t pkt_fcs_size)
{
static const int PRCregs[6] = { PRC64, PRC127, PRC255, PRC511,
PRC1023, PRC1522 };
e1000x_increase_size_stats(mac, PRCregs, data_fcs_size);
e1000x_increase_size_stats(mac, PRCregs, pkt_fcs_size);
e1000x_inc_reg_if_not_full(mac, TPR);
mac[GPRC] = mac[TPR];
e1000x_inc_reg_if_not_full(mac, GPRC);
/* TOR - Total Octets Received:
* This register includes bytes received in a packet from the <Destination
* Address> field through the <CRC> field, inclusively.
* Always include FCS length (4) in size.
*/
e1000x_grow_8reg_if_not_full(mac, TORL, data_size + 4);
mac[GORCL] = mac[TORL];
mac[GORCH] = mac[TORH];
e1000x_grow_8reg_if_not_full(mac, TORL, pkt_size + 4);
e1000x_grow_8reg_if_not_full(mac, GORCL, pkt_size + 4);
switch (pkt_type) {
case ETH_PKT_BCAST:
e1000x_inc_reg_if_not_full(mac, BPRC);
break;
case ETH_PKT_MCAST:
e1000x_inc_reg_if_not_full(mac, MPRC);
break;
default:
break;
}
}
void
+3 -2
View File
@@ -91,8 +91,9 @@ e1000x_update_regs_on_link_up(uint32_t *mac, uint16_t *phy)
}
void e1000x_update_rx_total_stats(uint32_t *mac,
size_t data_size,
size_t data_fcs_size);
eth_pkt_types_e pkt_type,
size_t pkt_size,
size_t pkt_fcs_size);
void e1000x_core_prepare_eeprom(uint16_t *eeprom,
const uint16_t *templ,
+56 -66
View File
@@ -67,6 +67,11 @@ typedef struct IGBTxPktVmdqCallbackContext {
NetClientState *nc;
} IGBTxPktVmdqCallbackContext;
typedef struct L2Header {
struct eth_header eth;
struct vlan_header vlan;
} L2Header;
static ssize_t
igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
bool has_vnet, bool *external_tx);
@@ -402,7 +407,7 @@ igb_tx_insert_vlan(IGBCore *core, uint16_t qn, struct igb_tx *tx,
}
}
if (insert_vlan && e1000x_vlan_enabled(core->mac)) {
if (insert_vlan) {
net_tx_pkt_setup_vlan_header_ex(tx->tx_pkt, vlan,
core->mac[VET] & 0xffff);
}
@@ -538,9 +543,8 @@ igb_on_tx_done_update_stats(IGBCore *core, struct NetTxPkt *tx_pkt, int qn)
g_assert_not_reached();
}
core->mac[GPTC] = core->mac[TPT];
core->mac[GOTCL] = core->mac[TOTL];
core->mac[GOTCH] = core->mac[TOTH];
e1000x_inc_reg_if_not_full(core->mac, GPTC);
e1000x_grow_8reg_if_not_full(core->mac, GOTCL, tot_len);
if (core->mac[MRQC] & 1) {
uint16_t pool = qn % IGB_NUM_VM_POOLS;
@@ -961,15 +965,16 @@ igb_rx_is_oversized(IGBCore *core, uint16_t qn, size_t size)
return size > (lpe ? max_ethernet_lpe_size : max_ethernet_vlan_size);
}
static uint16_t igb_receive_assign(IGBCore *core, const struct eth_header *ehdr,
static uint16_t igb_receive_assign(IGBCore *core, const L2Header *l2_header,
size_t size, E1000E_RSSInfo *rss_info,
bool *external_tx)
{
static const int ta_shift[] = { 4, 3, 2, 0 };
const struct eth_header *ehdr = &l2_header->eth;
uint32_t f, ra[2], *macp, rctl = core->mac[RCTL];
uint16_t queues = 0;
uint16_t oversized = 0;
uint16_t vid = lduw_be_p(&PKT_GET_VLAN_HDR(ehdr)->h_tci) & VLAN_VID_MASK;
uint16_t vid = be16_to_cpu(l2_header->vlan.h_tci) & VLAN_VID_MASK;
bool accepted = false;
int i;
@@ -1227,7 +1232,6 @@ igb_build_rx_metadata(IGBCore *core,
struct virtio_net_hdr *vhdr;
bool hasip4, hasip6;
EthL4HdrProto l4hdr_proto;
uint32_t pkt_type;
*status_flags = E1000_RXD_STAT_DD;
@@ -1266,28 +1270,29 @@ igb_build_rx_metadata(IGBCore *core,
trace_e1000e_rx_metadata_ack();
}
if (hasip6 && (core->mac[RFCTL] & E1000_RFCTL_IPV6_DIS)) {
trace_e1000e_rx_metadata_ipv6_filtering_disabled();
pkt_type = E1000_RXD_PKT_MAC;
} else if (l4hdr_proto == ETH_L4_HDR_PROTO_TCP ||
l4hdr_proto == ETH_L4_HDR_PROTO_UDP) {
pkt_type = hasip4 ? E1000_RXD_PKT_IP4_XDP : E1000_RXD_PKT_IP6_XDP;
} else if (hasip4 || hasip6) {
pkt_type = hasip4 ? E1000_RXD_PKT_IP4 : E1000_RXD_PKT_IP6;
} else {
pkt_type = E1000_RXD_PKT_MAC;
}
trace_e1000e_rx_metadata_pkt_type(pkt_type);
if (pkt_info) {
if (rss_info->enabled) {
*pkt_info = rss_info->type;
*pkt_info = rss_info->enabled ? rss_info->type : 0;
if (hasip4) {
*pkt_info |= E1000_ADVRXD_PKT_IP4;
}
*pkt_info |= (pkt_type << 4);
} else {
*status_flags |= E1000_RXD_PKT_TYPE(pkt_type);
if (hasip6) {
*pkt_info |= E1000_ADVRXD_PKT_IP6;
}
switch (l4hdr_proto) {
case ETH_L4_HDR_PROTO_TCP:
*pkt_info |= E1000_ADVRXD_PKT_TCP;
break;
case ETH_L4_HDR_PROTO_UDP:
*pkt_info |= E1000_ADVRXD_PKT_UDP;
break;
default:
break;
}
}
if (hdr_info) {
@@ -1438,29 +1443,17 @@ igb_write_to_rx_buffers(IGBCore *core,
static void
igb_update_rx_stats(IGBCore *core, const E1000E_RingInfo *rxi,
size_t data_size, size_t data_fcs_size)
size_t pkt_size, size_t pkt_fcs_size)
{
e1000x_update_rx_total_stats(core->mac, data_size, data_fcs_size);
switch (net_rx_pkt_get_packet_type(core->rx_pkt)) {
case ETH_PKT_BCAST:
e1000x_inc_reg_if_not_full(core->mac, BPRC);
break;
case ETH_PKT_MCAST:
e1000x_inc_reg_if_not_full(core->mac, MPRC);
break;
default:
break;
}
eth_pkt_types_e pkt_type = net_rx_pkt_get_packet_type(core->rx_pkt);
e1000x_update_rx_total_stats(core->mac, pkt_type, pkt_size, pkt_fcs_size);
if (core->mac[MRQC] & 1) {
uint16_t pool = rxi->idx % IGB_NUM_VM_POOLS;
core->mac[PVFGORC0 + (pool * 64)] += data_size + 4;
core->mac[PVFGORC0 + (pool * 64)] += pkt_size + 4;
core->mac[PVFGPRC0 + (pool * 64)]++;
if (net_rx_pkt_get_packet_type(core->rx_pkt) == ETH_PKT_MCAST) {
if (pkt_type == ETH_PKT_MCAST) {
core->mac[PVFMPRC0 + (pool * 64)]++;
}
}
@@ -1602,14 +1595,13 @@ static ssize_t
igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
bool has_vnet, bool *external_tx)
{
static const int maximum_ethernet_hdr_len = (ETH_HLEN + 4);
uint16_t queues = 0;
uint32_t n = 0;
uint8_t min_buf[ETH_ZLEN];
union {
L2Header l2_header;
uint8_t octets[ETH_ZLEN];
} buf;
struct iovec min_iov;
struct eth_header *ehdr;
uint8_t *filter_buf;
size_t size, orig_size;
size_t iov_ofs = 0;
E1000E_RxRing rxr;
@@ -1635,24 +1627,21 @@ igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
net_rx_pkt_unset_vhdr(core->rx_pkt);
}
filter_buf = iov->iov_base + iov_ofs;
orig_size = iov_size(iov, iovcnt);
size = orig_size - iov_ofs;
/* Pad to minimum Ethernet frame length */
if (size < sizeof(min_buf)) {
iov_to_buf(iov, iovcnt, iov_ofs, min_buf, size);
memset(&min_buf[size], 0, sizeof(min_buf) - size);
if (size < sizeof(buf)) {
iov_to_buf(iov, iovcnt, iov_ofs, &buf, size);
memset(&buf.octets[size], 0, sizeof(buf) - size);
e1000x_inc_reg_if_not_full(core->mac, RUC);
min_iov.iov_base = filter_buf = min_buf;
min_iov.iov_len = size = sizeof(min_buf);
min_iov.iov_base = &buf;
min_iov.iov_len = size = sizeof(buf);
iovcnt = 1;
iov = &min_iov;
iov_ofs = 0;
} else if (iov->iov_len < maximum_ethernet_hdr_len) {
/* This is very unlikely, but may happen. */
iov_to_buf(iov, iovcnt, iov_ofs, min_buf, maximum_ethernet_hdr_len);
filter_buf = min_buf;
} else {
iov_to_buf(iov, iovcnt, iov_ofs, &buf, sizeof(buf.l2_header));
}
/* Discard oversized packets if !LPE and !SBP. */
@@ -1660,11 +1649,12 @@ igb_receive_internal(IGBCore *core, const struct iovec *iov, int iovcnt,
return orig_size;
}
ehdr = PKT_GET_ETH_HDR(filter_buf);
net_rx_pkt_set_packet_type(core->rx_pkt, get_eth_packet_type(ehdr));
net_rx_pkt_set_protocols(core->rx_pkt, filter_buf, size);
net_rx_pkt_set_packet_type(core->rx_pkt,
get_eth_packet_type(&buf.l2_header.eth));
net_rx_pkt_set_protocols(core->rx_pkt, iov, iovcnt, iov_ofs);
queues = igb_receive_assign(core, ehdr, size, &rss_info, external_tx);
queues = igb_receive_assign(core, &buf.l2_header, size,
&rss_info, external_tx);
if (!queues) {
trace_e1000e_rx_flt_dropped();
return orig_size;
@@ -2464,16 +2454,16 @@ igb_set_ims(IGBCore *core, int index, uint32_t val)
static void igb_commit_icr(IGBCore *core)
{
/*
* If GPIE.NSICR = 0, then the copy of IAM to IMS will occur only if at
* If GPIE.NSICR = 0, then the clear of IMS will occur only if at
* least one bit is set in the IMS and there is a true interrupt as
* reflected in ICR.INTA.
*/
if ((core->mac[GPIE] & E1000_GPIE_NSICR) ||
(core->mac[IMS] && (core->mac[ICR] & E1000_ICR_INT_ASSERTED))) {
igb_set_ims(core, IMS, core->mac[IAM]);
} else {
igb_update_interrupt_state(core);
igb_clear_ims_bits(core, core->mac[IAM]);
}
igb_update_interrupt_state(core);
}
static void igb_set_icr(IGBCore *core, int index, uint32_t val)
+5
View File
@@ -641,6 +641,11 @@ union e1000_adv_rx_desc {
#define E1000_STATUS_NUM_VFS_SHIFT 14
#define E1000_ADVRXD_PKT_IP4 BIT(4)
#define E1000_ADVRXD_PKT_IP6 BIT(6)
#define E1000_ADVRXD_PKT_TCP BIT(8)
#define E1000_ADVRXD_PKT_UDP BIT(9)
static inline uint8_t igb_ivar_entry_rx(uint8_t i)
{
return i < 8 ? i * 4 : (i - 8) * 4 + 2;
+10 -6
View File
@@ -118,14 +118,18 @@ static void emac_load_desc(MSF2EmacState *s, EmacDesc *d, hwaddr desc)
d->next = le32_to_cpu(d->next);
}
static void emac_store_desc(MSF2EmacState *s, EmacDesc *d, hwaddr desc)
static void emac_store_desc(MSF2EmacState *s, const EmacDesc *d, hwaddr desc)
{
/* Convert from host endianness into LE. */
d->pktaddr = cpu_to_le32(d->pktaddr);
d->pktsize = cpu_to_le32(d->pktsize);
d->next = cpu_to_le32(d->next);
EmacDesc outd;
/*
* Convert from host endianness into LE. We use a local struct because
* calling code may still want to look at the fields afterwards.
*/
outd.pktaddr = cpu_to_le32(d->pktaddr);
outd.pktsize = cpu_to_le32(d->pktsize);
outd.next = cpu_to_le32(d->next);
address_space_write(&s->dma_as, desc, MEMTXATTRS_UNSPECIFIED, d, sizeof *d);
address_space_write(&s->dma_as, desc, MEMTXATTRS_UNSPECIFIED, &outd, sizeof outd);
}
static void msf2_dma_tx(MSF2EmacState *s)
+5 -9
View File
@@ -103,7 +103,7 @@ net_rx_pkt_pull_data(struct NetRxPkt *pkt,
iov, iovcnt, ploff, pkt->tot_len);
}
eth_get_protocols(pkt->vec, pkt->vec_len, &pkt->hasip4, &pkt->hasip6,
eth_get_protocols(pkt->vec, pkt->vec_len, 0, &pkt->hasip4, &pkt->hasip6,
&pkt->l3hdr_off, &pkt->l4hdr_off, &pkt->l5hdr_off,
&pkt->ip6hdr_info, &pkt->ip4hdr_info, &pkt->l4hdr_info);
@@ -186,17 +186,13 @@ size_t net_rx_pkt_get_total_len(struct NetRxPkt *pkt)
return pkt->tot_len;
}
void net_rx_pkt_set_protocols(struct NetRxPkt *pkt, const void *data,
size_t len)
void net_rx_pkt_set_protocols(struct NetRxPkt *pkt,
const struct iovec *iov, size_t iovcnt,
size_t iovoff)
{
const struct iovec iov = {
.iov_base = (void *)data,
.iov_len = len
};
assert(pkt);
eth_get_protocols(&iov, 1, &pkt->hasip4, &pkt->hasip6,
eth_get_protocols(iov, iovcnt, iovoff, &pkt->hasip4, &pkt->hasip6,
&pkt->l3hdr_off, &pkt->l4hdr_off, &pkt->l5hdr_off,
&pkt->ip6hdr_info, &pkt->ip4hdr_info, &pkt->l4hdr_info);
}
+6 -4
View File
@@ -55,12 +55,14 @@ size_t net_rx_pkt_get_total_len(struct NetRxPkt *pkt);
* parse and set packet analysis results
*
* @pkt: packet
* @data: pointer to the data buffer to be parsed
* @len: data length
* @iov: received data scatter-gather list
* @iovcnt: number of elements in iov
* @iovoff: data start offset in the iov
*
*/
void net_rx_pkt_set_protocols(struct NetRxPkt *pkt, const void *data,
size_t len);
void net_rx_pkt_set_protocols(struct NetRxPkt *pkt,
const struct iovec *iov, size_t iovcnt,
size_t iovoff);
/**
* fetches packet analysis results
+3
View File
@@ -2154,6 +2154,9 @@ static int rtl8139_cplus_transmit_one(RTL8139State *s)
int large_send_mss = (txdw0 >> CP_TC_LGSEN_MSS_SHIFT) &
CP_TC_LGSEN_MSS_MASK;
if (large_send_mss == 0) {
goto skip_offload;
}
DPRINTF("+++ C+ mode offloaded task TSO IP data %d "
"frame data %d specified MSS=%d\n",
+10 -6
View File
@@ -805,7 +805,6 @@ static uint64_t virtio_net_get_features(VirtIODevice *vdev, uint64_t features,
}
if (!get_vhost_net(nc->peer)) {
virtio_add_feature(&features, VIRTIO_F_RING_RESET);
return features;
}
@@ -1835,9 +1834,12 @@ static int virtio_net_process_rss(NetClientState *nc, const uint8_t *buf,
VIRTIO_NET_HASH_REPORT_UDPv6,
VIRTIO_NET_HASH_REPORT_UDPv6_EX
};
struct iovec iov = {
.iov_base = (void *)buf,
.iov_len = size
};
net_rx_pkt_set_protocols(pkt, buf + n->host_hdr_len,
size - n->host_hdr_len);
net_rx_pkt_set_protocols(pkt, &iov, 1, n->host_hdr_len);
net_rx_pkt_get_protocols(pkt, &hasip4, &hasip6, &l4hdr_proto);
net_hash_type = virtio_net_get_hash_type(hasip4, hasip6, l4hdr_proto,
n->rss_data.hash_types);
@@ -2917,7 +2919,8 @@ static void virtio_net_add_queue(VirtIONet *n, int index)
n->vqs[index].tx_vq =
virtio_add_queue(vdev, n->net_conf.tx_queue_size,
virtio_net_handle_tx_bh);
n->vqs[index].tx_bh = qemu_bh_new(virtio_net_tx_bh, &n->vqs[index]);
n->vqs[index].tx_bh = qemu_bh_new_guarded(virtio_net_tx_bh, &n->vqs[index],
&DEVICE(vdev)->mem_reentrancy_guard);
}
n->vqs[index].tx_waiting = 0;
@@ -3627,12 +3630,12 @@ static void virtio_net_device_realize(DeviceState *dev, Error **errp)
}
if (n->net_conf.tx_queue_size < VIRTIO_NET_TX_QUEUE_MIN_SIZE ||
n->net_conf.tx_queue_size > VIRTQUEUE_MAX_SIZE ||
n->net_conf.tx_queue_size > virtio_net_max_tx_queue_size(n) ||
!is_power_of_2(n->net_conf.tx_queue_size)) {
error_setg(errp, "Invalid tx_queue_size (= %" PRIu16 "), "
"must be a power of 2 between %d and %d",
n->net_conf.tx_queue_size, VIRTIO_NET_TX_QUEUE_MIN_SIZE,
VIRTQUEUE_MAX_SIZE);
virtio_net_max_tx_queue_size(n));
virtio_cleanup(vdev);
return;
}
@@ -3948,6 +3951,7 @@ static void virtio_net_class_init(ObjectClass *klass, void *data)
vdc->vmsd = &vmstate_virtio_net_device;
vdc->primary_unplug_pending = primary_unplug_pending;
vdc->get_vhost = virtio_net_get_vhost;
vdc->toggle_device_iotlb = vhost_toggle_device_iotlb;
}
static const TypeInfo virtio_net_info = {
+10 -2
View File
@@ -1439,7 +1439,10 @@ static void vmxnet3_activate_device(VMXNET3State *s)
vmxnet3_setup_rx_filtering(s);
/* Cache fields from shared memory */
s->mtu = VMXNET3_READ_DRV_SHARED32(d, s->drv_shmem, devRead.misc.mtu);
assert(VMXNET3_MIN_MTU <= s->mtu && s->mtu <= VMXNET3_MAX_MTU);
if (s->mtu < VMXNET3_MIN_MTU || s->mtu > VMXNET3_MAX_MTU) {
qemu_log_mask(LOG_GUEST_ERROR, "vmxnet3: Bad MTU size: %u\n", s->mtu);
return;
}
VMW_CFPRN("MTU is %u", s->mtu);
s->max_rx_frags =
@@ -2001,7 +2004,12 @@ vmxnet3_receive(NetClientState *nc, const uint8_t *buf, size_t size)
get_eth_packet_type(PKT_GET_ETH_HDR(buf)));
if (vmxnet3_rx_filter_may_indicate(s, buf, size)) {
net_rx_pkt_set_protocols(s->rx_pkt, buf, size);
struct iovec iov = {
.iov_base = (void *)buf,
.iov_len = size
};
net_rx_pkt_set_protocols(s->rx_pkt, &iov, 1, 0);
vmxnet3_rx_need_csum_calculate(s->rx_pkt, buf, size);
net_rx_pkt_attach_data(s->rx_pkt, buf, size, s->rx_vlan_stripping);
bytes_indicated = vmxnet3_indicate_packet(s) ? size : -1;
+42 -43
View File
@@ -1504,7 +1504,7 @@ static void nvme_post_cqes(void *opaque)
req->cqe.status = cpu_to_le16((req->status << 1) | cq->phase);
req->cqe.sq_id = cpu_to_le16(sq->sqid);
req->cqe.sq_head = cpu_to_le16(sq->head);
addr = cq->dma_addr + cq->tail * n->cqe_size;
addr = cq->dma_addr + (cq->tail << NVME_CQES);
ret = pci_dma_write(PCI_DEVICE(n), addr, (void *)&req->cqe,
sizeof(req->cqe));
if (ret) {
@@ -4333,7 +4333,13 @@ static uint16_t nvme_io_mgmt_send_ruh_update(NvmeCtrl *n, NvmeRequest *req)
uint32_t npid = (cdw10 >> 1) + 1;
unsigned int i = 0;
g_autofree uint16_t *pids = NULL;
uint32_t maxnpid = n->subsys->endgrp.fdp.nrg * n->subsys->endgrp.fdp.nruh;
uint32_t maxnpid;
if (!ns->endgrp || !ns->endgrp->fdp.enabled) {
return NVME_FDP_DISABLED | NVME_DNR;
}
maxnpid = n->subsys->endgrp.fdp.nrg * n->subsys->endgrp.fdp.nruh;
if (unlikely(npid >= MIN(NVME_FDP_MAXPIDS, maxnpid))) {
return NVME_INVALID_FIELD | NVME_DNR;
@@ -4607,7 +4613,8 @@ static void nvme_init_sq(NvmeSQueue *sq, NvmeCtrl *n, uint64_t dma_addr,
QTAILQ_INSERT_TAIL(&(sq->req_list), &sq->io_req[i], entry);
}
sq->bh = qemu_bh_new(nvme_process_sq, sq);
sq->bh = qemu_bh_new_guarded(nvme_process_sq, sq,
&DEVICE(sq->ctrl)->mem_reentrancy_guard);
if (n->dbbuf_enabled) {
sq->db_addr = n->dbbuf_dbs + (sqid << 3);
@@ -5091,6 +5098,11 @@ static uint16_t nvme_fdp_events(NvmeCtrl *n, uint32_t endgrpid,
}
log_size = sizeof(NvmeFdpEventsLog) + ebuf->nelems * sizeof(NvmeFdpEvent);
if (off >= log_size) {
return NVME_INVALID_FIELD | NVME_DNR;
}
trans_len = MIN(log_size - off, buf_len);
elog = g_malloc0(log_size);
elog->num_events = cpu_to_le32(ebuf->nelems);
@@ -5253,7 +5265,8 @@ static void nvme_init_cq(NvmeCQueue *cq, NvmeCtrl *n, uint64_t dma_addr,
}
}
n->cq[cqid] = cq;
cq->bh = qemu_bh_new(nvme_post_cqes, cq);
cq->bh = qemu_bh_new_guarded(nvme_post_cqes, cq,
&DEVICE(cq->ctrl)->mem_reentrancy_guard);
}
static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeRequest *req)
@@ -5265,10 +5278,18 @@ static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeRequest *req)
uint16_t qsize = le16_to_cpu(c->qsize);
uint16_t qflags = le16_to_cpu(c->cq_flags);
uint64_t prp1 = le64_to_cpu(c->prp1);
uint32_t cc = ldq_le_p(&n->bar.cc);
uint8_t iocqes = NVME_CC_IOCQES(cc);
uint8_t iosqes = NVME_CC_IOSQES(cc);
trace_pci_nvme_create_cq(prp1, cqid, vector, qsize, qflags,
NVME_CQ_FLAGS_IEN(qflags) != 0);
if (iosqes != NVME_SQES || iocqes != NVME_CQES) {
trace_pci_nvme_err_invalid_create_cq_entry_size(iosqes, iocqes);
return NVME_MAX_QSIZE_EXCEEDED | NVME_DNR;
}
if (unlikely(!cqid || cqid > n->conf_ioqpairs || n->cq[cqid] != NULL)) {
trace_pci_nvme_err_invalid_create_cq_cqid(cqid);
return NVME_INVALID_QID | NVME_DNR;
@@ -6767,6 +6788,7 @@ static uint16_t nvme_dbbuf_config(NvmeCtrl *n, const NvmeRequest *req)
PCIDevice *pci = PCI_DEVICE(n);
uint64_t dbs_addr = le64_to_cpu(req->cmd.dptr.prp1);
uint64_t eis_addr = le64_to_cpu(req->cmd.dptr.prp2);
uint32_t v;
int i;
/* Address should be page aligned */
@@ -6784,6 +6806,8 @@ static uint16_t nvme_dbbuf_config(NvmeCtrl *n, const NvmeRequest *req)
NvmeCQueue *cq = n->cq[i];
if (sq) {
v = cpu_to_le32(sq->tail);
/*
* CAP.DSTRD is 0, so offset of ith sq db_addr is (i<<3)
* nvme_process_db() uses this hard-coded way to calculate
@@ -6791,7 +6815,7 @@ static uint16_t nvme_dbbuf_config(NvmeCtrl *n, const NvmeRequest *req)
*/
sq->db_addr = dbs_addr + (i << 3);
sq->ei_addr = eis_addr + (i << 3);
pci_dma_write(pci, sq->db_addr, &sq->tail, sizeof(sq->tail));
pci_dma_write(pci, sq->db_addr, &v, sizeof(sq->tail));
if (n->params.ioeventfd && sq->sqid != 0) {
if (!nvme_init_sq_ioeventfd(sq)) {
@@ -6801,10 +6825,12 @@ static uint16_t nvme_dbbuf_config(NvmeCtrl *n, const NvmeRequest *req)
}
if (cq) {
v = cpu_to_le32(cq->head);
/* CAP.DSTRD is 0, so offset of ith cq db_addr is (i<<3)+(1<<2) */
cq->db_addr = dbs_addr + (i << 3) + (1 << 2);
cq->ei_addr = eis_addr + (i << 3) + (1 << 2);
pci_dma_write(pci, cq->db_addr, &cq->head, sizeof(cq->head));
pci_dma_write(pci, cq->db_addr, &v, sizeof(cq->head));
if (n->params.ioeventfd && cq->cqid != 0) {
if (!nvme_init_cq_ioeventfd(cq)) {
@@ -6857,7 +6883,7 @@ static uint16_t nvme_directive_receive(NvmeCtrl *n, NvmeRequest *req)
case NVME_DIRECTIVE_IDENTIFY:
switch (doper) {
case NVME_DIRECTIVE_RETURN_PARAMS:
if (ns->endgrp->fdp.enabled) {
if (ns->endgrp && ns->endgrp->fdp.enabled) {
id.supported |= 1 << NVME_DIRECTIVE_DATA_PLACEMENT;
id.enabled |= 1 << NVME_DIRECTIVE_DATA_PLACEMENT;
id.persistent |= 1 << NVME_DIRECTIVE_DATA_PLACEMENT;
@@ -6969,7 +6995,7 @@ static void nvme_process_sq(void *opaque)
}
while (!(nvme_sq_empty(sq) || QTAILQ_EMPTY(&sq->req_list))) {
addr = sq->dma_addr + sq->head * n->sqe_size;
addr = sq->dma_addr + (sq->head << NVME_SQES);
if (nvme_addr_read(n, addr, (void *)&cmd, sizeof(cmd))) {
trace_pci_nvme_err_addr_read(addr);
trace_pci_nvme_err_cfs();
@@ -7196,34 +7222,6 @@ static int nvme_start_ctrl(NvmeCtrl *n)
NVME_CAP_MPSMAX(cap));
return -1;
}
if (unlikely(NVME_CC_IOCQES(cc) <
NVME_CTRL_CQES_MIN(n->id_ctrl.cqes))) {
trace_pci_nvme_err_startfail_cqent_too_small(
NVME_CC_IOCQES(cc),
NVME_CTRL_CQES_MIN(cap));
return -1;
}
if (unlikely(NVME_CC_IOCQES(cc) >
NVME_CTRL_CQES_MAX(n->id_ctrl.cqes))) {
trace_pci_nvme_err_startfail_cqent_too_large(
NVME_CC_IOCQES(cc),
NVME_CTRL_CQES_MAX(cap));
return -1;
}
if (unlikely(NVME_CC_IOSQES(cc) <
NVME_CTRL_SQES_MIN(n->id_ctrl.sqes))) {
trace_pci_nvme_err_startfail_sqent_too_small(
NVME_CC_IOSQES(cc),
NVME_CTRL_SQES_MIN(cap));
return -1;
}
if (unlikely(NVME_CC_IOSQES(cc) >
NVME_CTRL_SQES_MAX(n->id_ctrl.sqes))) {
trace_pci_nvme_err_startfail_sqent_too_large(
NVME_CC_IOSQES(cc),
NVME_CTRL_SQES_MAX(cap));
return -1;
}
if (unlikely(!NVME_AQA_ASQS(aqa))) {
trace_pci_nvme_err_startfail_asqent_sz_zero();
return -1;
@@ -7236,8 +7234,6 @@ static int nvme_start_ctrl(NvmeCtrl *n)
n->page_bits = page_bits;
n->page_size = page_size;
n->max_prp_ents = n->page_size / sizeof(uint64_t);
n->cqe_size = 1 << NVME_CC_IOCQES(cc);
n->sqe_size = 1 << NVME_CC_IOSQES(cc);
nvme_init_cq(&n->admin_cq, n, acq, 0, 0, NVME_AQA_ACQS(aqa) + 1, 1);
nvme_init_sq(&n->admin_sq, n, asq, 0, 0, NVME_AQA_ASQS(aqa) + 1);
@@ -7555,7 +7551,7 @@ static uint64_t nvme_mmio_read(void *opaque, hwaddr addr, unsigned size)
static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
{
PCIDevice *pci = PCI_DEVICE(n);
uint32_t qid;
uint32_t qid, v;
if (unlikely(addr & ((1 << 2) - 1))) {
NVME_GUEST_ERR(pci_nvme_ub_db_wr_misaligned,
@@ -7622,7 +7618,8 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
start_sqs = nvme_cq_full(cq) ? 1 : 0;
cq->head = new_head;
if (!qid && n->dbbuf_enabled) {
pci_dma_write(pci, cq->db_addr, &cq->head, sizeof(cq->head));
v = cpu_to_le32(cq->head);
pci_dma_write(pci, cq->db_addr, &v, sizeof(cq->head));
}
if (start_sqs) {
NvmeSQueue *sq;
@@ -7682,6 +7679,8 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
sq->tail = new_tail;
if (!qid && n->dbbuf_enabled) {
v = cpu_to_le32(sq->tail);
/*
* The spec states "the host shall also update the controller's
* corresponding doorbell property to match the value of that entry
@@ -7695,7 +7694,7 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
* including ones that run on Linux, are not updating Admin Queues,
* so we can't trust reading it for an appropriate sq tail.
*/
pci_dma_write(pci, sq->db_addr, &sq->tail, sizeof(sq->tail));
pci_dma_write(pci, sq->db_addr, &v, sizeof(sq->tail));
}
qemu_bh_schedule(sq->bh);
@@ -8206,8 +8205,8 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
id->wctemp = cpu_to_le16(NVME_TEMPERATURE_WARNING);
id->cctemp = cpu_to_le16(NVME_TEMPERATURE_CRITICAL);
id->sqes = (0x6 << 4) | 0x6;
id->cqes = (0x4 << 4) | 0x4;
id->sqes = (NVME_SQES << 4) | NVME_SQES;
id->cqes = (NVME_CQES << 4) | NVME_CQES;
id->nn = cpu_to_le32(NVME_MAX_NAMESPACES);
id->oncs = cpu_to_le16(NVME_ONCS_WRITE_ZEROES | NVME_ONCS_TIMESTAMP |
NVME_ONCS_FEATURES | NVME_ONCS_DSM |
+2 -2
View File
@@ -115,7 +115,7 @@ static void nvme_dif_pract_generate_dif_crc64(NvmeNamespace *ns, uint8_t *buf,
uint64_t crc = crc64_nvme(~0ULL, buf, ns->lbasz);
if (pil) {
crc = crc64_nvme(crc, mbuf, pil);
crc = crc64_nvme(~crc, mbuf, pil);
}
dif->g64.guard = cpu_to_be64(crc);
@@ -246,7 +246,7 @@ static uint16_t nvme_dif_prchk_crc64(NvmeNamespace *ns, NvmeDifTuple *dif,
uint64_t crc = crc64_nvme(~0ULL, buf, ns->lbasz);
if (pil) {
crc = crc64_nvme(crc, mbuf, pil);
crc = crc64_nvme(~crc, mbuf, pil);
}
trace_pci_nvme_dif_prchk_guard_crc64(be64_to_cpu(dif->g64.guard), crc);
+7 -2
View File
@@ -30,6 +30,13 @@
#define NVME_FDP_MAX_EVENTS 63
#define NVME_FDP_MAXPIDS 128
/*
* The controller only supports Submission and Completion Queue Entry Sizes of
* 64 and 16 bytes respectively.
*/
#define NVME_SQES 6
#define NVME_CQES 4
QEMU_BUILD_BUG_ON(NVME_MAX_NAMESPACES > NVME_NSID_BROADCAST - 1);
typedef struct NvmeCtrl NvmeCtrl;
@@ -530,8 +537,6 @@ typedef struct NvmeCtrl {
uint32_t page_size;
uint16_t page_bits;
uint16_t max_prp_ents;
uint16_t cqe_size;
uint16_t sqe_size;
uint32_t max_q_ents;
uint8_t outstanding_aers;
uint32_t irq_status;
+1
View File
@@ -168,6 +168,7 @@ pci_nvme_err_invalid_create_cq_size(uint16_t size) "failed creating completion q
pci_nvme_err_invalid_create_cq_addr(uint64_t addr) "failed creating completion queue, addr=0x%"PRIx64""
pci_nvme_err_invalid_create_cq_vector(uint16_t vector) "failed creating completion queue, vector=%"PRIu16""
pci_nvme_err_invalid_create_cq_qflags(uint16_t qflags) "failed creating completion queue, qflags=%"PRIu16""
pci_nvme_err_invalid_create_cq_entry_size(uint8_t iosqes, uint8_t iocqes) "iosqes %"PRIu8" iocqes %"PRIu8""
pci_nvme_err_invalid_identify_cns(uint16_t cns) "identify, invalid cns=0x%"PRIx16""
pci_nvme_err_invalid_getfeat(int dw10) "invalid get features, dw10=0x%"PRIx32""
pci_nvme_err_invalid_setfeat(uint32_t dw10) "invalid set features, dw10=0x%"PRIx32""
+1 -1
View File
@@ -311,7 +311,7 @@ static void pxb_cxl_dev_reset(DeviceState *dev)
* The CXL specification allows for host bridges with no HDM decoders
* if they only have a single root port.
*/
if (!PXB_DEV(dev)->hdm_for_passthrough) {
if (!PXB_CXL_DEV(dev)->hdm_for_passthrough) {
dsp_count = pcie_count_ds_ports(hb->bus);
}
/* Initial reset will have 0 dsp so wait until > 0 */
+7
View File
@@ -294,6 +294,13 @@ static void raven_pcihost_initfn(Object *obj)
memory_region_init(&s->pci_memory, obj, "pci-memory", 0x3f000000);
address_space_init(&s->pci_io_as, &s->pci_io, "raven-io");
/*
* Raven's raven_io_ops use the address-space API to access pci-conf-idx
* (which is also owned by the raven device). As such, mark the
* pci_io_non_contiguous as re-entrancy safe.
*/
s->pci_io_non_contiguous.disable_reentrancy_guard = true;
/* CPU address space */
memory_region_add_subregion(address_space_mem, PCI_IO_BASE_ADDR,
&s->pci_io);
+2
View File
@@ -79,6 +79,8 @@ static Property pci_props[] = {
DEFINE_PROP_STRING("failover_pair_id", PCIDevice,
failover_pair_id),
DEFINE_PROP_UINT32("acpi-index", PCIDevice, acpi_index, 0),
DEFINE_PROP_BIT("x-pcie-err-unc-mask", PCIDevice, cap_present,
QEMU_PCIE_ERR_UNC_MASK_BITNR, true),
DEFINE_PROP_END_OF_LIST()
};
+13 -2
View File
@@ -62,6 +62,17 @@ static void pci_adjust_config_limit(PCIBus *bus, uint32_t *limit)
}
}
static bool is_pci_dev_ejected(PCIDevice *pci_dev)
{
/*
* device unplug was requested and the guest acked it,
* so we stop responding config accesses even if the
* device is not deleted (failover flow)
*/
return pci_dev && pci_dev->partially_hotplugged &&
!pci_dev->qdev.pending_deleted_event;
}
void pci_host_config_write_common(PCIDevice *pci_dev, uint32_t addr,
uint32_t limit, uint32_t val, uint32_t len)
{
@@ -75,7 +86,7 @@ void pci_host_config_write_common(PCIDevice *pci_dev, uint32_t addr,
* allowing direct removal of unexposed functions.
*/
if ((pci_dev->qdev.hotplugged && !pci_get_function_0(pci_dev)) ||
!pci_dev->has_power) {
!pci_dev->has_power || is_pci_dev_ejected(pci_dev)) {
return;
}
@@ -100,7 +111,7 @@ uint32_t pci_host_config_read_common(PCIDevice *pci_dev, uint32_t addr,
* allowing direct removal of unexposed functions.
*/
if ((pci_dev->qdev.hotplugged && !pci_get_function_0(pci_dev)) ||
!pci_dev->has_power) {
!pci_dev->has_power || is_pci_dev_ejected(pci_dev)) {
return ~0x0;
}
+7 -4
View File
@@ -112,10 +112,13 @@ int pcie_aer_init(PCIDevice *dev, uint8_t cap_ver, uint16_t offset,
pci_set_long(dev->w1cmask + offset + PCI_ERR_UNCOR_STATUS,
PCI_ERR_UNC_SUPPORTED);
pci_set_long(dev->config + offset + PCI_ERR_UNCOR_MASK,
PCI_ERR_UNC_MASK_DEFAULT);
pci_set_long(dev->wmask + offset + PCI_ERR_UNCOR_MASK,
PCI_ERR_UNC_SUPPORTED);
if (dev->cap_present & QEMU_PCIE_ERR_UNC_MASK) {
pci_set_long(dev->config + offset + PCI_ERR_UNCOR_MASK,
PCI_ERR_UNC_MASK_DEFAULT);
pci_set_long(dev->wmask + offset + PCI_ERR_UNCOR_MASK,
PCI_ERR_UNC_SUPPORTED);
}
pci_set_long(dev->config + offset + PCI_ERR_UNCOR_SEVER,
PCI_ERR_UNC_SEVERITY_DEFAULT);
+1 -1
View File
@@ -712,7 +712,7 @@ static int ppce500_prep_device_tree(PPCE500MachineState *machine,
p->kernel_base = kernel_base;
p->kernel_size = kernel_size;
qemu_register_reset(ppce500_reset_device_tree, p);
qemu_register_reset_nosnapshotload(ppce500_reset_device_tree, p);
p->notifier.notify = ppce500_init_notify;
qemu_add_machine_init_done_notifier(&p->notifier);

Some files were not shown because too many files have changed in this diff Show More