Commit Graph

28878 Commits

Author SHA1 Message Date
Joerg Roedel 94441c3bd9 iommu/core: Remove global iommu_ops and register_iommu
With all IOMMU drivers being converted to bus_set_iommu the
global iommu_ops are no longer required. The same is true
for the deprecated register_iommu function.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-10-21 14:37:23 +02:00
Joerg Roedel a1b60c1cd9 iommu/core: Convert iommu_found to iommu_present
With per-bus iommu_ops the iommu_found function needs to
work on a bus_type too. This patch adds a bus_type parameter
to that function and converts all call-places.
The function is also renamed to iommu_present because the
function now checks if an iommu is present for a given bus
and does not check for a global iommu anymore.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-10-21 14:37:20 +02:00
Joerg Roedel 905d66c1e5 iommu/core: Add bus_type parameter to iommu_domain_alloc
This is necessary to store a pointer to the bus-specific
iommu_ops in the iommu-domain structure. It will be used
later to call into bus-specific iommu-ops.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-10-21 14:37:19 +02:00
Joerg Roedel ff21776d12 Driver core: Add iommu_ops to bus_type
This is the starting point to make the iommu_ops used for
the iommu-api a per-bus-type structure. It is required to
easily implement bus-specific setup in the iommu-layer.
The first user will be the iommu-group attribute in sysfs.

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-10-21 14:37:19 +02:00
Joerg Roedel 39d4ebb959 iommu/core: Define iommu_ops and register_iommu only with CONFIG_IOMMU_API
This makes it impossible to compile an iommu driver into the
kernel without selecting CONFIG_IOMMU_API.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-10-21 14:37:18 +02:00
Steffen Klassert 07a5fa4abd crypto: Add userspace report for cipher type algorithms
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21 14:24:07 +02:00
Steffen Klassert 792608e9c2 crypto: Add userspace report for rng type algorithms
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21 14:24:06 +02:00
Steffen Klassert a55465dca7 crypto: Add userspace report for pcompress type algorithms
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21 14:24:06 +02:00
Steffen Klassert 6ad414fe71 crypto: Add userspace report for aead type algorithms
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21 14:24:06 +02:00
Steffen Klassert 50496a1fab crypto: Add userspace report for blkcipher type algorithms
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21 14:24:05 +02:00
Steffen Klassert f4d663ce63 crypto: Add userspace report for shash type algorithms
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21 14:24:04 +02:00
Steffen Klassert 6c5a86f529 crypto: Add userspace report for larval type algorithms
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21 14:24:04 +02:00
Steffen Klassert a38f7907b9 crypto: Add userspace configuration API
This patch adds a basic userspace configuration API for the crypto layer.
With this it is possible to instantiate, remove and to show crypto
algorithms from userspace.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21 14:24:03 +02:00
Steffen Klassert 64a947b133 crypto: Add a flag to identify crypto instances
The upcomming crypto user configuration api needs to identify
crypto instances. This patch adds a flag that is set if the
algorithm is an instance that is build from templates.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-10-21 14:24:01 +02:00
Eric W. Biederman e09eff7fc1 macvtap: Fix the minor device number allocation
On systems that create and delete lots of dynamic devices the
31bit linux ifindex fails to fit in the 16bit macvtap minor,
resulting in unusable macvtap devices.  I have systems running
automated tests that that hit this condition in just a few days.

Use a linux idr allocator to track which mavtap minor numbers
are available and and to track the association between macvtap
minor numbers and macvtap network devices.

Remove the unnecessary unneccessary check to see if the network
device we have found is indeed a macvtap device.  With macvtap
specific data structures it is impossible to find any other
kind of networking device.

Increase the macvtap minor range from 65536 to the full 20 bits
that is supported by linux device numbers.  It doesn't solve the
original problem but there is no penalty for a larger minor
device range.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-21 02:53:07 -04:00
Ian Campbell a8605c6063 net: add opaque struct around skb frag page
I've split this bit out of the skb frag destructor patch since it helps enforce
the use of the fragment API.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-21 02:52:53 -04:00
Daniel Vetter 24dd85ff72 io-mapping: ensure io_mapping_map_atomic _is_ atomic
For the !HAVE_ATOMIC_IOMAP case the stub functions did not call
pagefault_disable/_enable. The i915 driver relies on the map
actually being atomic, otherwise it can deadlock with it's own
pagefault handler in the gtt pwrite fastpath.

This is exercised by gem_mmap_gtt from the intel-gpu-toosl gem
testsuite.

v2: Chris Wilson noted the lack of an include.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38115
Cc: stable@kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-10-20 15:26:17 -07:00
Maciej Żenczykowski 6cc7a765c2 net: allow CAP_NET_RAW to set socket options IP{,V6}_TRANSPARENT
Up till now the IP{,V6}_TRANSPARENT socket options (which actually set
the same bit in the socket struct) have required CAP_NET_ADMIN
privileges to set or clear the option.

- we make clearing the bit not require any privileges.
- we allow CAP_NET_ADMIN to set the bit (as before this change)
- we allow CAP_NET_RAW to set this bit, because raw
  sockets already pretty much effectively allow you
  to emulate socket transparency.

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-20 18:21:36 -04:00
Eric Dumazet 05bdd2f143 net: constify skbuff and Qdisc elements
Preliminary patch before tcp constification

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-20 17:45:43 -04:00
Stephen Warren a5818a8bd0 pinctrl: get_group_pins() const fixes
get_group_pins() "returns" a pointer to an array of const objects, through
a pointer parameter. Fix the prototype so what's pointed at by the returned
pointer is const, rather than the function parameter being const.

This also allows the removal of a cast in each of the two current pinmux
drivers.

Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2011-10-20 11:41:49 +02:00
Ian Campbell 30d3c128ea mm: add a "struct page_frag" type containing a page, offset and length
A few network drivers currently use skb_frag_struct for this purpose but I have
patches which add additional fields and semantics there which these other uses
do not want.

A structure for reference sub-page regions seems like a generally useful thing
so do so instead of adding a network subsystem specific structure.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Jens Axboe <jaxboe@fusionio.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-20 04:58:32 -04:00
Steve French b957ae9c53 [CIFS] Fixup trivial checkpatch warning
Signed-off-by: Steve French <smfrench@gmail.com>
2011-10-19 21:27:11 -05:00
Ian Campbell a0bec1cd8f net: do not take an additional reference in skb_frag_set_page
I audited all of the callers in the tree and only one of them (pktgen) expects
it to do so. Taking this reference is pretty obviously confusing and error
prone.

In particular I looked at the following commits which switched callers of
(__)skb_frag_set_page to the skb paged fragment api:

6a930b9f16 cxgb3: convert to SKB paged frag API.
5dc3e196ea myri10ge: convert to SKB paged frag API.
0e0634d20d vmxnet3: convert to SKB paged frag API.
86ee8130a4 virtionet: convert to SKB paged frag API.
4a22c4c919 sfc: convert to SKB paged frag API.
18324d690d cassini: convert to SKB paged frag API.
b061b39e3a benet: convert to SKB paged frag API.
b7b6a688d2 bnx2: convert to SKB paged frag API.
804cf14ea5 net: xfrm: convert to SKB frag APIs
ea2ab69379 net: convert core to skb paged frag APIs

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 19:40:39 -04:00
Dan Carpenter 4f25af2782 filter: use unsigned int to silence static checker warning
This is just a cleanup.

My testing version of Smatch warns about this:
net/core/filter.c +380 check_load_and_stores(6)
	warn: check 'flen' for negative values

flen comes from the user.  We try to clamp the values here between 1
and BPF_MAXINSNS but the clamp doesn't work because it could be
negative.  This is a bug, but it's not exploitable.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 19:35:51 -04:00
Eric W. Biederman 672d82c18d class: Implement support for class attrs in tagged sysfs directories.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 19:24:15 -04:00
Eric W. Biederman 487505c257 sysfs: Implement support for tagged files in sysfs.
Looking up files in sysfs is hard to understand and analyize because we
currently allow placing untagged files in tagged directories.  In the
implementation of that we have two subtly different meanings of NULL.
NULL meaning there is no tag on a directory entry and NULL meaning
we don't care which namespace the lookup is performed for.  This
multiple uses of NULL have resulted in subtle bugs (since fixed)
in the code.

Currently it is only the bonding driver that needs to have an untagged
file in a tagged directory.

To untagle this mess I am adding support for tagged files to sysfs.
Modifying the bonding driver to implement bonding_masters as a tagged
file.  Registering bonding_masters once for each network namespace.
Then I am removing support for untagged entries in tagged sysfs
directories.

Resulting in code that is much easier to reason about.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 19:24:14 -04:00
Richard Cochran 4dc360c5e7 net: validate HWTSTAMP ioctl parameters
This patch adds a sanity check on the values provided by user space for
the hardware time stamping configuration. If the values lie outside of
the absolute limits, then the ioctl request will be denied.

Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 17:00:35 -04:00
Trond Myklebust fba730050d NFS: Don't rely on PageError in nfs_readpage_release_partial
Don't rely on the PageError flag to tell us if one of the partial reads of
the page failed. Instead, replace that with a dedicated flag in the
struct nfs_page.

Then clean out redundant uses of the PageError flag: the VM no longer
checks it for reads.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-19 13:58:38 -07:00
Trond Myklebust b8ef70639b NFS: Get rid of the unused nfs_write_data->flags field
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-19 13:37:34 -07:00
Trond Myklebust a1940805d0 NFS: Get rid of the unused nfs_read_data->flags field
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-19 13:37:34 -07:00
Andy Fleming 3d153a7c8b net: Allow skb_recycle_check to be done in stages
skb_recycle_check resets the skb if it's eligible for recycling.
However, there are times when a driver might want to optionally
manipulate the skb data with the skb before resetting the skb,
but after it has determined eligibility.  We do this by splitting the
eligibility check from the skb reset, creating two inline functions to
accomplish that task.

Signed-off-by: Andy Fleming <afleming@freescale.com>
Acked-by: David Daney <david.daney@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 15:59:45 -04:00
Jeff Layton f06ac72e92 cifs, freezer: add wait_event_freezekillable and have cifs use it
CIFS currently uses wait_event_killable to put tasks to sleep while
they await replies from the server. That function though does not
allow the freezer to run. In many cases, the network interface may
be going down anyway, in which case the reply will never come. The
client then ends up blocking the computer from suspending.

Fix this by adding a new wait_event_freezable variant --
wait_event_freezekillable. The idea is to combine the behavior of
wait_event_killable and wait_event_freezable -- put the task to
sleep and only allow it to be awoken by fatal signals, but also
allow the freezer to do its job.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
2011-10-19 15:30:40 -04:00
J. Bruce Fields 8b289b2c23 nfsd4: implement new 4.1 open reclaim types
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-10-19 11:52:12 -04:00
Tejun Heo bd87b5898a block: drop @tsk from attempt_plug_merge() and explain sync rules
attempt_plug_merge() accesses elevator without holding queue_lock and
may call into ->elevator_bio_merge_fn().  The elvator is guaranteed to
be valid because it's accessed iff the plugged list has requests and
elevator is never exited with live requests, so as long as the
elevator method can deal with unlocked access, this is safe.

Explain the sync rules around attempt_plug_merge() and drop the
unnecessary @tsk parameter.

This patch doesn't introduce any functional change.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-10-19 14:33:08 +02:00
Tejun Heo bc9fcbf9cb block: move blk_throtl prototypes to block/blk.h
blk_throtl interface is block internal and there's no reason to have
them in linux/blkdev.h.  Move them to block/blk.h.

This patch doesn't introduce any functional change.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-10-19 14:31:18 +02:00
Jens Axboe 5c04b426f2 Merge branch 'v3.1-rc10' into for-3.2/core
Conflicts:
	block/blk-core.c
	include/linux/blkdev.h

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-10-19 14:30:42 +02:00
Yevgeny Petrilin f3a9d1f25d mlx4_en: Controlling FCS header removal
Canceling FCS removal where FW allows for better alignment
of incoming data.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 03:42:26 -04:00
Eric Dumazet 9e903e0852 net: add skb frag size accessors
To ease skb->truesize sanitization, its better to be able to localize
all references to skb frags size.

Define accessors : skb_frag_size() to fetch frag size, and
skb_frag_size_{set|add|sub}() to manipulate it.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 03:10:46 -04:00
Michael Hennerich 3f48e73543 Input: adp5589-keys - add support for the ADP5585 derivatives
The ADP5585 family keypad decoder and IO expander is similar to the ADP5589,
however it features less IO pins, and lacks hardware assisted key-lock
functionality. Unfortunately the register addresses are different, as well as
the event codes and bit organization within the port related registers.

Move ADP5589 Register defines from the header file into the main source file.
Add new defines while making sure we don't break existing platform_data.
Add register address translation, and turn device specific defines into variables.
Introduce some helper functions and disable functions that doesn't
exist on the added devices.

Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2011-10-18 21:26:55 -07:00
Trond Myklebust 0c2e53f11a NFS: Remove the unused "lookupfh()" version of nfs4_proc_lookup()
...and also remove the associated nfs_v4_clientops entry.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-18 16:13:51 -07:00
Jiri Slaby fa90e1c935 TTY: make tty_add_file non-failing
If tty_add_file fails at the point it is now, we have to revert all
the changes we did to the tty. It means either decrease all refcounts
if this was a tty reopen or delete the tty if it was newly allocated.

There was a try to fix this in v3.0-rc2 using tty_release in 0259894c7
(TTY: fix fail path in tty_open). But instead it introduced a NULL
dereference. It's because tty_release dereferences
filp->private_data, but that one is set even in our tty_add_file. And
when tty_add_file fails, it's still NULL/garbage. Hence tty_release
cannot be called there.

To circumvent the original leak (and the current NULL deref) we split
tty_add_file into two functions, making the latter non-failing. In
that case we may do the former early in open, where handling failures
is easy. The latter stays as it is now. So there is no change in
functionality.

The original bug (leak) was introduced by f573bd176 (tty: Remove
__GFP_NOFAIL from tty_add_file()). Thanks Dan for reporting this.

Later, we may split tty_release into more functions and call only some
of them in this fail path instead. (If at all possible.)

Introduced-in: v2.6.37-rc2
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable <stable@vger.kernel.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-18 14:22:37 -07:00
Thomas Meyer 8193c42906 tty: Support compat_ioctl get/set termios_locked
When running a Fedora 15 (x86) on an x86_64 kernel, in the boot process
plymouthd complains about those two missing ioctls:
[    2.581783] ioctl32(plymouthd:186): Unknown cmd fd(10) cmd(00005457){t:'T';sz:0} arg(ffb6a5d0) on /dev/tty1
[    2.581803] ioctl32(plymouthd:186): Unknown cmd fd(10) cmd(00005456){t:'T';sz:0} arg(ffb6a680) on /dev/tty1

both ioctl functions work on the 'struct termios' resp. 'struct termios2',
which has the same size (36 bytes resp. 44 bytes) on x86 and x86_64,
so it's just a matter of converting the pointer from userland.

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-18 14:17:11 -07:00
Jason Baron 07613b0b5e dynamic_debug: consolidate repetitive struct _ddebug descriptor definitions
Replace the repetitive struct _ddebug descriptor definitions with a new
DECLARE_DYNAMIC_DEBUG_META_DATA(name, fmt) macro.

[akpm@linux-foundation.org: s/DECLARE/DEFINE/]
Signed-off-by: Jason Baron <jbaron@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-18 11:22:00 -07:00
Kai Jiang 27a90700a4 uio: Support physical addresses >32 bits on 32-bit systems
To support >32-bit physical addresses for UIO_MEM_PHYS type we need to
extend the width of 'addr' in struct uio_mem.  Numerous platforms like
embedded PPC, ARM, and X86 have support for systems with larger physical
address than logical.

Since 'addr' may contain a physical, logical, or virtual address the
easiest solution is to just change the type to 'phys_addr_t' which
should always be greater than or equal to the sizeof(void *) such that
it can properly hold any of the address types.

For physical address we can support up to a 44-bit physical address on a
typical 32-bit system as we utilize remap_pfn_range() for the mapping of
the memory region and pfn's are represnted by shifting the address by
the page size (typically 4k).

Signed-off-by: Kai Jiang <Kai.Jiang@freescale.com>
Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Hans J. Koch <hjk@hansjkoch.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-18 11:18:57 -07:00
Trond Myklebust a9a4a87a59 NFS: Use the inode->i_version to cache NFSv4 change attribute information
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-18 09:14:34 -07:00
Trond Myklebust d77385f238 SUNRPC: Fix rpc_sockaddr2uaddr
rpc_sockaddr2uaddr is only used by net/sunrpc/rpcb_clnt.c, where
it is used in a non-blockable context in at least one case.

Add non-blocking capability by adding a gfp_t argument

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-18 09:13:32 -07:00
Peng Tao c1225158a8 SUNRPC/NFS: make rpc pipe upcall generic
The same function is used by idmap, gss and blocklayout code. Make it
generic.

Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Cc: stable@kernel.org [3.0]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-18 09:08:12 -07:00
Marc Kleine-Budde f861c2b80c can: remove references to berlios mailinglist
The BerliOS project, which currently hosts our mailinglist, will
close with the end of the year. Now take the chance and remove all
occurrences of the mailinglist address from the source files.

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-17 19:22:46 -04:00
John W. Linville 41ebe9cde7 Merge branch 'master' of git://git.infradead.org/users/linville/wireless-next into for-davem 2011-10-17 15:05:26 -04:00
Christoph Hellwig 456be1484f loop: remove the incorrect write_begin/write_end shortcut
Currently the loop device tries to call directly into write_begin/write_end
instead of going through ->write if it can.  This is a fairly nasty shortcut
as write_begin and write_end are only callbacks for the generic write code
and expect to be called with filesystem specific locks held.

This code currently causes various issues for clustered filesystems as it
doesn't take the required cluster locks, and it also causes issues for XFS
as it doesn't properly lock against the swapext ioctl as called by the
defragmentation tools.  This in case causes data corruption if
defragmentation hits a busy loop device in the wrong time window, as
reported by RH QA.

The reason why we have this shortcut is that it saves a data copy when
doing a transformation on the loop device, which is the technical term
for using cryptoloop (or an XOR transformation).  Given that cryptoloop
has been deprecated in favour of dm-crypt my opinion is that we should
simply drop this shortcut instead of finding complicated ways to to
introduce a formal interface for this shortcut.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2011-10-17 12:57:20 +02:00