Commit Graph

38952 Commits

Author SHA1 Message Date
Peter Zijlstra
ea81174789 sched, idle: Fix the idle polling state logic
Mike reported that commit 7d1a9417 ("x86: Use generic idle loop")
regressed several workloads and caused excessive reschedule
interrupts.

The patch in question failed to notice that the x86 code had an
inverted sense of the polling state versus the new generic code (x86:
default polling, generic: default !polling).

Fix the two prominent x86 mwait based idle drivers and introduce a few
new generic polling helpers (fixing the wrong smp_mb__after_clear_bit
usage).

Also switch the idle routines to using tif_need_resched() which is an
immediate TIF_NEED_RESCHED test as opposed to need_resched which will
end up being slightly different.

Reported-by: Mike Galbraith <bitbucket@online.de>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: lenb@kernel.org
Cc: tglx@linutronix.de
Link: http://lkml.kernel.org/n/tip-nc03imb0etuefmzybzj7sprf@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-25 13:53:10 +02:00
Peter Zijlstra
3150398626 sched: Remove {set,clear}_need_resched
Preemption semantics are going to change which mandate a change.

All DRM usage sites are already broken and will not be affected (much)
by this change. DRM people are aware and will remove the last few
stragglers.

For now, leave an empty stub that generates a warning, once all users
are gone we can remove this.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: airlied@linux.ie
Cc: daniel.vetter@ffwll.ch
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/n/tip-qfc1el2zvhxiyut4ai99ij4n@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-25 13:53:09 +02:00
Roy Franz
ed37ddffe2 efi: Add proper definitions for some EFI function pointers.
The x86/AMD64 EFI stubs must use a call wrapper to convert between
the Linux and EFI ABIs, so void pointers are sufficient.  For ARM,
the ABIs are compatible, so we can directly invoke the function
pointers.  The functions that are used by the ARM stub are updated
to match the EFI definitions.
Also add some EFI types used by EFI functions.

Signed-off-by: Roy Franz <roy.franz@linaro.org>
Acked-by: Mark Salter <msalter@redhat.com>
Reviewed-by: Grant Likely <grant.likely@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-09-25 12:34:33 +01:00
Shuah Khan
56fa484969 iommu: Change iommu driver to call io_page_fault trace event
Change iommu driver call io_page_fault trace event. This iommu_error class
event can be enabled to trigger when an iommu error occurs. Trace information
includes driver name, device name, iova, and flags.

Testing:
Added trace calls to iommu_prepare_identity_map() for testing some of the
conditions that are hard to trigger. Here is the trace from the testing:

       swapper/0-1     [003] ....     2.003774: io_page_fault: IOMMU:pci 0000:00:02.0 iova=0x00000000cb800000 flags=0x0002
       swapper/0-1     [003] ....     2.004098: io_page_fault: IOMMU:pci 0000:00:1d.0 iova=0x00000000cadc6000 flags=0x0002
       swapper/0-1     [003] ....     2.004115: io_page_fault: IOMMU:pci 0000:00:1a.0 iova=0x00000000cadc6000 flags=0x0002
       swapper/0-1     [003] ....     2.004129: io_page_fault: IOMMU:pci 0000:00:1f.0 iova=0x0000000000000000 flags=0x0002

Signed-off-by: Shuah Khan <shuah.kh@samsung.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2013-09-25 11:07:04 +02:00
Rob Herring
b0b8c960ff of: clean-up ifdefs in of_irq.h
Much of of_irq.h is needlessly ifdef'ed. Clean this up and minimize the
amount ifdef'ed code. This fixes some  build warnings when CONFIG_OF
is not enabled (seen on i386 and x86_64):

include/linux/of_irq.h:82:7: warning: 'struct device_node' declared inside parameter list [enabled by default]
include/linux/of_irq.h:82:7: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default]
include/linux/of_irq.h:87:47: warning: 'struct device_node' declared inside parameter list [enabled by default]

Compile tested on i386, sparc and arm.

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Grant Likely <grant.likely@linaro.org>
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
2013-09-24 21:12:32 -05:00
Andrew Morton
0608f43da6 revert "memcg, vmscan: integrate soft reclaim tighter with zone shrinking code"
Revert commit 3b38722efd ("memcg, vmscan: integrate soft reclaim
tighter with zone shrinking code")

I merged this prematurely - Michal and Johannes still disagree about the
overall design direction and the future remains unclear.

Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-24 17:00:26 -07:00
Andrew Morton
b1aff7fcf8 revert "vmscan, memcg: do softlimit reclaim also for targeted reclaim"
Revert commit a5b7c87f92 ("vmscan, memcg: do softlimit reclaim also
for targeted reclaim")

I merged this prematurely - Michal and Johannes still disagree about the
overall design direction and the future remains unclear.

Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-24 17:00:26 -07:00
Andrew Morton
694fbc0fe7 revert "memcg: enhance memcg iterator to support predicates"
Revert commit de57780dc6 ("memcg: enhance memcg iterator to support
predicates")

I merged this prematurely - Michal and Johannes still disagree about the
overall design direction and the future remains unclear.

Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-24 17:00:26 -07:00
Michal Hocko
9809b18fcf watchdog: update watchdog_thresh properly
watchdog_tresh controls how often nmi perf event counter checks per-cpu
hrtimer_interrupts counter and blows up if the counter hasn't changed
since the last check.  The counter is updated by per-cpu
watchdog_hrtimer hrtimer which is scheduled with 2/5 watchdog_thresh
period which guarantees that hrtimer is scheduled 2 times per the main
period.  Both hrtimer and perf event are started together when the
watchdog is enabled.

So far so good.  But...

But what happens when watchdog_thresh is updated from sysctl handler?

proc_dowatchdog will set a new sampling period and hrtimer callback
(watchdog_timer_fn) will use the new value in the next round.  The
problem, however, is that nobody tells the perf event that the sampling
period has changed so it is ticking with the period configured when it
has been set up.

This might result in an ear ripping dissonance between perf and hrtimer
parts if the watchdog_thresh is increased.  And even worse it might lead
to KABOOM if the watchdog is configured to panic on such a spurious
lockup.

This patch fixes the issue by updating both nmi perf even counter and
hrtimers if the threshold value has changed.

The nmi one is disabled and then reinitialized from scratch.  This has
an unpleasant side effect that the allocation of the new event might
fail theoretically so the hard lockup detector would be disabled for
such cpus.  On the other hand such a memory allocation failure is very
unlikely because the original event is deallocated right before.

It would be much nicer if we just changed perf event period but there
doesn't seem to be any API to do that right now.  It is also unfortunate
that perf_event_alloc uses GFP_KERNEL allocation unconditionally so we
cannot use on_each_cpu() and do the same thing from the per-cpu context.
The update from the current CPU should be safe because
perf_event_disable removes the event atomically before it clears the
per-cpu watchdog_ev so it cannot change anything under running handler
feet.

The hrtimer is simply restarted (thanks to Don Zickus who has pointed
this out) if it is queued because we cannot rely it will fire&adopt to
the new sampling period before a new nmi event triggers (when the
treshold is decreased).

[akpm@linux-foundation.org: the UP version of __smp_call_function_single ended up in the wrong place]
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-24 17:00:25 -07:00
Philip Avinash
f1a4c52ff5 ARM: davinci: gpio: use gpiolib API instead of inline functions
Remove NEED_MACH_GPIO_H config select option for ARCH_DAVINCI
to start using gpiolib interface for davinci platforms. This makes
it easier to use the gpio driver on other platforms as it breaks
dependency on mach-davinci.

Latencies for gpio_get/set APIs will increase. On measurement,
latency was found to have increased by 18 microsecond with
gpiolib API as compared to inline APIs.

Measurement was done on DA850 EVM for gpio_get_value() API by
taking the printk timing across the call with interrupts disabled.

  inline gpio API with interrupt disabled
  [   29.734337] before gpio_get
  [   29.736847] after gpio_get

  Time difference 0.00251

  gpio library with interrupt disabled
  [  272.876763] before gpio_get
  [  272.879291] after gpio_get

  Time difference 0.002528
  Latency increased by (0.002528 -  0.00251) = 18 microsecond.

While at it, remove GPIO_TYPE_DAVINCI enum definition as
gpio-davinci.c is converted to Linux device driver model.

Signed-off-by: Philip Avinash <avinashphilip@ti.com>
Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
[nsekhar@ti.com: minor edits to commit message]
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
2013-09-25 04:16:37 +05:30
Radim Krčmář
98fda16929 kvm: remove .done from struct kvm_async_pf
'.done' is used to mark the completion of 'async_pf_execute()', but
'cancel_work_sync()' returns true when the work was canceled, so we
use it instead.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-24 19:12:12 +02:00
Li, Zhen-Hua
82aeef0bf0 x86/iommu: correct ICS register offset
According to Intel Vt-D specs, the offset of Invalidation complete
status register should be 0x9C, not 0x98.

See Intel's VT-d spec, Revision 1.3, Chapter 10.4, Page 98;

Signed-off-by: Li, Zhen-Hua <zhen-hual@hp.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2013-09-24 13:04:07 +02:00
David Howells
f36f8c75ae KEYS: Add per-user_namespace registers for persistent per-UID kerberos caches
Add support for per-user_namespace registers of persistent per-UID kerberos
caches held within the kernel.

This allows the kerberos cache to be retained beyond the life of all a user's
processes so that the user's cron jobs can work.

The kerberos cache is envisioned as a keyring/key tree looking something like:

	struct user_namespace
	  \___ .krb_cache keyring		- The register
		\___ _krb.0 keyring		- Root's Kerberos cache
		\___ _krb.5000 keyring		- User 5000's Kerberos cache
		\___ _krb.5001 keyring		- User 5001's Kerberos cache
			\___ tkt785 big_key	- A ccache blob
			\___ tkt12345 big_key	- Another ccache blob

Or possibly:

	struct user_namespace
	  \___ .krb_cache keyring		- The register
		\___ _krb.0 keyring		- Root's Kerberos cache
		\___ _krb.5000 keyring		- User 5000's Kerberos cache
		\___ _krb.5001 keyring		- User 5001's Kerberos cache
			\___ tkt785 keyring	- A ccache
				\___ krbtgt/REDHAT.COM@REDHAT.COM big_key
				\___ http/REDHAT.COM@REDHAT.COM user
				\___ afs/REDHAT.COM@REDHAT.COM user
				\___ nfs/REDHAT.COM@REDHAT.COM user
				\___ krbtgt/KERNEL.ORG@KERNEL.ORG big_key
				\___ http/KERNEL.ORG@KERNEL.ORG big_key

What goes into a particular Kerberos cache is entirely up to userspace.  Kernel
support is limited to giving you the Kerberos cache keyring that you want.

The user asks for their Kerberos cache by:

	krb_cache = keyctl_get_krbcache(uid, dest_keyring);

The uid is -1 or the user's own UID for the user's own cache or the uid of some
other user's cache (requires CAP_SETUID).  This permits rpc.gssd or whatever to
mess with the cache.

The cache returned is a keyring named "_krb.<uid>" that the possessor can read,
search, clear, invalidate, unlink from and add links to.  Active LSMs get a
chance to rule on whether the caller is permitted to make a link.

Each uid's cache keyring is created when it first accessed and is given a
timeout that is extended each time this function is called so that the keyring
goes away after a while.  The timeout is configurable by sysctl but defaults to
three days.

Each user_namespace struct gets a lazily-created keyring that serves as the
register.  The cache keyrings are added to it.  This means that standard key
search and garbage collection facilities are available.

The user_namespace struct's register goes away when it does and anything left
in it is then automatically gc'd.

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Simo Sorce <simo@redhat.com>
cc: Serge E. Hallyn <serge.hallyn@ubuntu.com>
cc: Eric W. Biederman <ebiederm@xmission.com>
2013-09-24 10:35:19 +01:00
David Howells
ab3c3587f8 KEYS: Implement a big key type that can save to tmpfs
Implement a big key type that can save its contents to tmpfs and thus
swapspace when memory is tight.  This is useful for Kerberos ticket caches.

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Simo Sorce <simo@redhat.com>
2013-09-24 10:35:18 +01:00
David Howells
b2a4df200d KEYS: Expand the capacity of a keyring
Expand the capacity of a keyring to be able to hold a lot more keys by using
the previously added associative array implementation.  Currently the maximum
capacity is:

	(PAGE_SIZE - sizeof(header)) / sizeof(struct key *)

which, on a 64-bit system, is a little more 500.  However, since this is being
used for the NFS uid mapper, we need more than that.  The new implementation
gives us effectively unlimited capacity.

With some alterations, the keyutils testsuite runs successfully to completion
after this patch is applied.  The alterations are because (a) keyrings that
are simply added to no longer appear ordered and (b) some of the errors have
changed a bit.

Signed-off-by: David Howells <dhowells@redhat.com>
2013-09-24 10:35:18 +01:00
David Howells
3cb989501c Add a generic associative array implementation.
Add a generic associative array implementation that can be used as the
container for keyrings, thereby massively increasing the capacity available
whilst also speeding up searching in keyrings that contain a lot of keys.

This may also be useful in FS-Cache for tracking cookies.

Documentation is added into Documentation/associative_array.txt

Some of the properties of the implementation are:

 (1) Objects are opaque pointers.  The implementation does not care where they
     point (if anywhere) or what they point to (if anything).

     [!] NOTE: Pointers to objects _must_ be zero in the two least significant
     	       bits.

 (2) Objects do not need to contain linkage blocks for use by the array.  This
     permits an object to be located in multiple arrays simultaneously.
     Rather, the array is made up of metadata blocks that point to objects.

 (3) Objects are labelled as being one of two types (the type is a bool value).
     This information is stored in the array, but has no consequence to the
     array itself or its algorithms.

 (4) Objects require index keys to locate them within the array.

 (5) Index keys must be unique.  Inserting an object with the same key as one
     already in the array will replace the old object.

 (6) Index keys can be of any length and can be of different lengths.

 (7) Index keys should encode the length early on, before any variation due to
     length is seen.

 (8) Index keys can include a hash to scatter objects throughout the array.

 (9) The array can iterated over.  The objects will not necessarily come out in
     key order.

(10) The array can be iterated whilst it is being modified, provided the RCU
     readlock is being held by the iterator.  Note, however, under these
     circumstances, some objects may be seen more than once.  If this is a
     problem, the iterator should lock against modification.  Objects will not
     be missed, however, unless deleted.

(11) Objects in the array can be looked up by means of their index key.

(12) Objects can be looked up whilst the array is being modified, provided the
     RCU readlock is being held by the thread doing the look up.

The implementation uses a tree of 16-pointer nodes internally that are indexed
on each level by nibbles from the index key.  To improve memory efficiency,
shortcuts can be emplaced to skip over what would otherwise be a series of
single-occupancy nodes.  Further, nodes pack leaf object pointers into spare
space in the node rather than making an extra branch until as such time an
object needs to be added to a full node.

Signed-off-by: David Howells <dhowells@redhat.com>
2013-09-24 10:35:17 +01:00
David Howells
ccc3e6d9c9 KEYS: Define a __key_get() wrapper to use rather than atomic_inc()
Define a __key_get() wrapper to use rather than atomic_inc() on the key usage
count as this makes it easier to hook in refcount error debugging.

Signed-off-by: David Howells <dhowells@redhat.com>
2013-09-24 10:35:16 +01:00
David Howells
4bdf0bc300 KEYS: Introduce a search context structure
Search functions pass around a bunch of arguments, each of which gets copied
with each call.  Introduce a search context structure to hold these.

Whilst we're at it, create a search flag that indicates whether the search
should be directly to the description or whether it should iterate through all
keys looking for a non-description match.

This will be useful when keyrings use a generic data struct with generic
routines to manage their content as the search terms can just be passed
through to the iterator callback function.

Also, for future use, the data to be supplied to the match function is
separated from the description pointer in the search context.  This makes it
clear which is being supplied.

Signed-off-by: David Howells <dhowells@redhat.com>
2013-09-24 10:35:15 +01:00
David Howells
16feef4340 KEYS: Consolidate the concept of an 'index key' for key access
Consolidate the concept of an 'index key' for accessing keys.  The index key
is the search term needed to find a key directly - basically the key type and
the key description.  We can add to that the description length.

This will be useful when turning a keyring into an associative array rather
than just a pointer block.

Signed-off-by: David Howells <dhowells@redhat.com>
2013-09-24 10:35:15 +01:00
David Howells
a5b4bd2874 KEYS: Use bool in make_key_ref() and is_key_possessed()
Make make_key_ref() take a bool possession parameter and make
is_key_possessed() return a bool.

Signed-off-by: David Howells <dhowells@redhat.com>
2013-09-24 10:35:14 +01:00
KV Sujith
118150f22d gpio: davinci: move to platform device
Modify DaVinci GPIO driver to become a platform device
driver.

The driver does not have platform driver structure or
a probe. Instead, it has pure_initcall function for
initialization. The platform specific informaiton is
obtained using the DaVinci specific davinci_soc_info
structure. This is a problem for Device Tree (DT)
implementation.

As a first stage of DT conversion, we implement a probe.

Additional notes:

- The driver registration happens as  postcore_initcall.
  This is required since machine init functions like
  da850_lcd_hw_init() make use of GPIO.
- Start using devres APIs for simpler error handling.

Signed-off-by: KV Sujith <sujithkv@ti.com>
[avinashphilip@ti.com: Move global definition of
		       "davinci_gpio_controller" to local]
Signed-off-by: Philip Avinash <avinashphilip@ti.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
[nsekhar@ti.com: drop unused structure member, rebase to new
		 clean-up patch and fix error messages]
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
2013-09-24 10:31:51 +05:30
Li Zefan
2ff2a7d03b cgroup: kill css_id
The only user of css_id was memcg, and it has been convered to use
cgroup->id, so kill css_id.

Signed-off-by: Li Zefan <lizefan@huwei.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-09-23 21:44:16 -04:00
Bjorn Helgaas
7dab9ef4f0 PCI/ACPI: Name _OSC #defines more consistently
Make PCI Host Bridge _OSC #defines more consistent.  No functional change.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-23 17:40:45 -06:00
Bjorn Helgaas
335b15097d ACPI: Write OSC_PCI_CONTROL_MASKS like OSC_PCI_SUPPORT_MASKS
We write OSC_PCI_SUPPORT_MASKS as a simple 0x1f, so do the same
for OSC_PCI_CONTROL_MASKS.  No functional change.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-23 17:40:45 -06:00
Bjorn Helgaas
b448193416 ACPI: Remove unused OSC_PCI_NATIVE_HOTPLUG
OSC_PCI_NATIVE_HOTPLUG is completely unused, so remove it.  No functional
change.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-23 17:40:45 -06:00
Bjorn Helgaas
c867847360 ACPI: Tidy acpi_run_osc() declarations
Move the acpi_run_osc() prototype next to the related structure and
update comments.  No functional change.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-23 17:40:45 -06:00
Bjorn Helgaas
b938a229c8 ACPI: Rename OSC_QUERY_TYPE to OSC_QUERY_DWORD
OSC_QUERY_TYPE isn't a "type"; it's an index into the _OSC Capabilities
Buffer of DWORDs.  Rename OSC_QUERY_TYPE, OSC_SUPPORT_TYPE, and
OSC_CONTROL_TYPE to OSC_QUERY_DWORD, etc., to make this clear.
No functional change.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-23 17:40:45 -06:00
Bjorn Helgaas
dedf1e4dfd ACPI: Write _OSC bit field definitions in hex
Update _OSC definition comments to correspond to the 1-based spec wording
(DWORD 1, etc.)  Write _OSC field #defines as hex to make clear that they
are bits in a 32-bit DWORD, not arbitrary values.  No functional change.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-23 17:40:45 -06:00
Jiang Liu
d536bf3dc9 ACPI / processor: use apic_id and remove duplicated _MAT evaluation
Since APIC id is saved in processor struct, just use it and
remove the duplicated _MAT evaluation.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-09-24 01:39:39 +02:00
Nicolas Pitre
14d2ca615a ARM: GIC: interface to send a SGI directly
The regular gic_raise_softirq() takes as input a CPU mask which is not
adequate when we need to send an IPI to a CPU which is not represented
in the kernel to GIC mapping.  That is the case with the b.L switcher
when GIC migration to the inbound CPU has not yet occurred.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
2013-09-23 18:47:28 -04:00
Nicolas Pitre
eeb446581b ARM: GIC: function to retrieve the physical address of the SGIR
In order to have early assembly code signal other CPUs in the system,
we need to get the physical address for the SGIR register used to
send IPIs.  Because the register will be used with a precomputed CPU
interface ID number, there is no need for any locking in the assembly
code where this register is written to.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
2013-09-23 18:47:28 -04:00
Greg Kroah-Hartman
dc4fea795b Merge branch 'master' into usb-next
We have USB fixes now in Linus's tree that we need to properly sort out
with reverts and the like in the usb-next branch, so merge them together
and do it by hand.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-23 13:24:10 -07:00
Lee Jones
71e1980c8d iio: pressure-core: st: Provide support for the Vdd_IO power supply
The power to some of the sensors are controlled by regulators. In most
cases these are 'always on', but if not they will fail to work until
the regulator is enabled using the relevant APIs. This patch allows for
the Vdd_IO power supply to be specified by either platform data or
Device Tree.

Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2013-09-23 20:17:58 +01:00
Lee Jones
774487611c iio: pressure-core: st: Provide support for the Vdd power supply
The power to some of the sensors are controlled by regulators. In most
cases these are 'always on', but if not they will fail to work until
the regulator is enabled using the relevant APIs. This patch allows for
the Vdd power supply to be specified by either platform data or Device
Tree.

Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2013-09-23 20:17:54 +01:00
Paul E. McKenney
2a855b644c rcu: Make list_splice_init_rcu() account for RCU readers
The list_splice_init_rcu() function allows a list visible to RCU readers
to be spliced into another list visible to RCU readers.  This is OK,
except for the use of INIT_LIST_HEAD(), which does pointer updates
without doing anything to make those updates safe for concurrent readers.

Of course, most of the time INIT_LIST_HEAD() is being used in reader-free
contexts, such as initialization or cleanup, so it is OK for it to update
pointers in an unsafe-for-RCU-readers manner.  This commit therefore
creates an INIT_LIST_HEAD_RCU() that uses ACCESS_ONCE() to make the updates
reader-safe.  The reason that we can use ACCESS_ONCE() instead of the more
typical rcu_assign_pointer() is that list_splice_init_rcu() is updating the
pointers to reference something that is already visible to readers, so
that there is no problem with pre-initialized values.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-09-23 09:13:49 -07:00
Theodore Ts'o
47d06e532e random: run random_int_secret_init() run after all late_initcalls
The some platforms (e.g., ARM) initializes their clocks as
late_initcalls for some unknown reason.  So make sure
random_int_secret_init() is run after all of the late_initcalls are
run.

Cc: stable@vger.kernel.org
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-09-23 06:35:06 -04:00
Li Zefan
cd64647f04 hung_task: Change sysctl_hung_task_check_count to 'int'
As 'sysctl_hung_task_check_count' is 'unsigned long' when this
value is assigned to max_count in check_hung_uninterruptible_tasks(),
it's truncated to 'int' type.

This causes a minor artifact: if we write 2^32 to sysctl.hung_task_check_count,
hung task detection will be effectively disabled.

With this fix, it will still truncate the user input to 32 bits, but
reading sysctl.hung_task_check_count reflects the actual truncated value.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/523FFF4E.9050401@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-23 11:10:49 +02:00
Rusty Russell
3f2b9c9cdf module: remove rmmod --wait option.
The option to wait for a module reference count to reach zero was in
the initial module implementation, but it was never supported in
modprobe (you had to use rmmod --wait).  After discussion with Lucas,
It has been deprecated (with a 10 second sleep) in kmod for the last
year.

This finally removes it: the flag will evoke a printk warning and a
normal (non-blocking) remove attempt.

Cc: Lucas De Marchi <lucas.de.marchi@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-09-23 15:44:58 +09:30
Linus Torvalds
68cf8d0c72 Merge branch 'for-3.12/core' of git://git.kernel.dk/linux-block
Pull block IO fixes from Jens Axboe:
 "After merge window, no new stuff this time only a collection of neatly
  confined and simple fixes"

* 'for-3.12/core' of git://git.kernel.dk/linux-block:
  cfq: explicitly use 64bit divide operation for 64bit arguments
  block: Add nr_bios to block_rq_remap tracepoint
  If the queue is dying then we only call the rq->end_io callout. This leaves bios setup on the request, because the caller assumes when the blk_execute_rq_nowait/blk_execute_rq call has completed that the rq->bios have been cleaned up.
  bio-integrity: Fix use of bs->bio_integrity_pool after free
  blkcg: relocate root_blkg setting and clearing
  block: Convert kmalloc_node(...GFP_ZERO...) to kzalloc_node(...)
  block: trace all devices plug operation
2013-09-22 15:00:11 -07:00
Greg Kroah-Hartman
3ffdea3fec Merge tag 'iio-for-3.13a' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-next
Jonathan writes:

First round of new drivers, functionality and cleanups for IIO in the 3.13 cycle

A number of new drivers and some new functionality + a lot of cleanups
all over IIO.

New Core Elements

1) New INT_TIME info_mask element for integration time, which may have
   different effects on measurement noise and similar, than an amplifier
   and hence is different from existing SCALE.  Already existed in some
   drivers as a custom attribute.

2) Introduce a iio_push_buffers_with_timestamp helper to cover the common
   case of filling the last 64 bits of data to be passed to the buffer with
   a timestamp.  Applied to lots of drivers. Cuts down on repeated code and
   moves a slightly fiddly bit of logic into a single location.

3) Introduce info_mask_[shared_by_dir/shared_by_all] elements to allow support
   of elements such as sampling_frequency which is typically shared by all
   input channels on a device.  This reduces code and makes these controls
   available from in kernel consumers of IIO devices.

New drivers

1) MCP3422/3/4 ADC

2) TSL4531 ambient light sensor

3) TCS3472/5 color light sensor

4) GP2AP020A00F ambient light / proximity sensor

5) LPS001WP support added to ST pressure sensor driver.

New driver functionality

1) ti_am335x_adc Add buffered sampling support.
   This device has a hardware fifo that is fed directly into an IIO kfifo
   buffer based on a watershed interrupt.  Note this will act as an example
   of how to handle this increasingly common type of device.
   The only previous example - sca3000 - take a less than optimal approach
   which is largely why it is still in staging.
   A couple of little cleanups for that new functionality followed later.

Core cleanups:

1) MAINTAINERS - Sachin actually brought my email address up to date because
   I said I'd do it and never got around to it :)

2) Assign buffer list elements as single element lists to simplify the
   iio_buffer_is_active logic.

3) wake_up_interruptible_poll instead of wake_up_interruptible to only wake
   up threads waiting for poll notifications.

4) Add O_CLOEXEC flag to anon_inode_get_fd call for IIO event interface.

5) Change iio_push_to_buffers to take a void * pointer so as to avoid some
   annoying and unnecessary type casts.

6) iio_compute_scan_bytes incorrectly took a long rather than unsigned long.

7) Various minor tidy ups.

Driver cleanups (in no particular order)

1) Another set of devm_ allocations patches from Sachin Kamat.

2) tsl2x7x - 0 to NULL cleanup.

3) hmc5843 - fix missing > in MODULE_AUTHOR

4) Set of strict_strto* to kstrto* conversions.

5) mxs-lradc - fix ordering of resource removal to match creation

6) mxs-lradc - add MODULE_ALIAS

7) adc7606 - drop a work pending test duplicated in core functions.

8) hmc5843 - devm_ allocation patch

9) Series of redundant breaks removed.

10) ad2s1200 - pr_err -> dev_err

11) adjd_s311 - use INT_TIME

12)  ST sensors - large set of cleanups from Lee Jones and removed restriction
    to using only triggers provided by the st_sensors themselves from
    Dennis Ciocca.

13) dummy and tmp006 provide sampling_frequency via info_mask_shared_by_all.

14) tcs3472 - fix incorrect buffer size and wrong device pointer used in
    suspend / resume functions.

15) max1363 - use defaults for buffer setup ops as provided by the triggered
    buffer helpers as they are the same as were specified in max1363 driver.

16) Trivial tidy ups in a number of other drivers.
2013-09-22 11:30:12 -07:00
Jun'ichi Nomura
75afb35299 block: Add nr_bios to block_rq_remap tracepoint
Adding the number of bios in a remapped request to 'block_rq_remap'
tracepoint.

Request remapper clones bios in a request to track the completion
status of each bio. So the number of bios can be useful information
for investigation.

Related discussions:
  http://www.redhat.com/archives/dm-devel/2013-August/msg00084.html
  http://www.redhat.com/archives/dm-devel/2013-September/msg00024.html

Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-09-21 13:57:47 -06:00
Lars-Peter Clausen
d2c3d072c4 iio: Add iio_push_buffers_with_timestamp() helper
Drivers using software buffers often store the timestamp in their data buffer
before calling iio_push_to_buffers() with that data buffer. Storing the
timestamp in the buffer usually involves some ugly pointer arithmetic. This
patch adds a new helper function called iio_push_buffers_with_timestamp() which
is similar to iio_push_to_buffers but takes an additional timestamp parameter.
The function will help to hide to uglyness in one central place instead of
exposing it in every driver. If timestamps are enabled for the IIO device
iio_push_buffers_with_timestamp() will store the timestamp as the last element
in buffer, before passing the buffer on to iio_push_buffers(). The buffer needs
large enough to hold the timestamp in this case. If timestamps are disabled
iio_push_buffers_with_timestamp() will behave just like iio_push_buffers().

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Cc: Oleksandr Kravchenko <o.v.kravchenko@globallogic.com>
Cc: Josh Wu <josh.wu@atmel.com>
Cc: Denis Ciocca <denis.ciocca@gmail.com>
Cc: Manuel Stahl <manuel.stahl@iis.fraunhofer.de>
Cc: Ge Gao <ggao@invensense.com>
Cc: Peter Meerwald <pmeerw@pmeerw.net>
Cc: Jacek Anaszewski <j.anaszewski@samsung.com>
Cc: Fabio Estevam <fabio.estevam@freescale.com>
Cc: Marek Vasut <marex@denx.de>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2013-09-21 19:23:50 +01:00
Zubair Lutfullah
ca9a563805 iio: ti_am335x_adc: Add continuous sampling support
Previously the driver had only one-shot reading functionality.
This patch adds continuous sampling support to the driver.

Continuous sampling starts when buffer is enabled.
HW IRQ wakes worker thread that pushes samples to userspace.
Sampling stops when buffer is disabled by userspace.

Patil Rachna (TI) laid the ground work for ADC HW register access.
Russ Dill (TI) fixed bugs in the driver relevant to FIFOs and IRQs.

I fixed channel scanning so multiple ADC channels can be read
simultaneously and pushed to userspace.
Restructured the driver to fit IIO ABI.
And added INDIO_BUFFER_HARDWARE mode.

Signed-off-by: Zubair Lutfullah <zubair.lutfullah@gmail.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Russ Dill <Russ.Dill@ti.com>
Acked-by: Lee Jones <lee.jones@linaro.org>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
2013-09-21 19:23:47 +01:00
Andre Naujoks
c26d436cbf lib: introduce upper case hex ascii helpers
To be able to use the hex ascii functions in case sensitive environments
the array hex_asc_upper[] and the needed functions for hex_byte_pack_upper()
are introduced.

Signed-off-by: Andre Naujoks <nautsch2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-20 15:38:26 -04:00
Laxman Dewangan
6112fe60ac regmap: add helper macro to set min/max range of register
Add helper macro to set the min and max value of the register range.

This is useful when initialising the register ranges of the device like

static const struct regmap_range readable_ranges[] = {
	regmap_reg_range(DEVICE_REG0, DEVICE_REG10),
};

Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
2013-09-20 17:50:46 +01:00
Mike Snitzer
f84cb8a46a dm mpath: disable WRITE SAME if it fails
Workaround the SCSI layer's problematic WRITE SAME heuristics by
disabling WRITE SAME in the DM multipath device's queue_limits if an
underlying device disabled it.

The WRITE SAME heuristics, with both the original commit 5db44863b6
("[SCSI] sd: Implement support for WRITE SAME") and the updated commit
66c28f971 ("[SCSI] sd: Update WRITE SAME heuristics"), default to enabling
WRITE SAME(10) even without successfully determining it is supported.
After the first failed WRITE SAME the SCSI layer will disable WRITE SAME
for the device (by setting sdkp->device->no_write_same which results in
'max_write_same_sectors' in device's queue_limits to be set to 0).

When a device is stacked ontop of such a SCSI device any changes to that
SCSI device's queue_limits do not automatically propagate up the stack.
As such, a DM multipath device will not have its WRITE SAME support
disabled.  This causes the block layer to continue to issue WRITE SAME
requests to the mpath device which causes paths to fail and (if mpath IO
isn't configured to queue when no paths are available) it will result in
actual IO errors to the upper layers.

This fix doesn't help configurations that have additional devices
stacked ontop of the mpath device (e.g. LVM created linear DM devices
ontop).  A proper fix that restacks all the queue_limits from the bottom
of the device stack up will need to be explored if SCSI will continue to
use this model of optimistically allowing op codes and then disabling
them after they fail for the first time.

Before this patch:

EXT4-fs (dm-6): mounted filesystem with ordered data mode. Opts: (null)
device-mapper: multipath: XXX snitm debugging: got -EREMOTEIO (-121)
device-mapper: multipath: XXX snitm debugging: failing WRITE SAME IO with error=-121
end_request: critical target error, dev dm-6, sector 528
dm-6: WRITE SAME failed. Manually zeroing.
device-mapper: multipath: Failing path 8:112.
end_request: I/O error, dev dm-6, sector 4616
dm-6: WRITE SAME failed. Manually zeroing.
end_request: I/O error, dev dm-6, sector 4616
end_request: I/O error, dev dm-6, sector 5640
end_request: I/O error, dev dm-6, sector 6664
end_request: I/O error, dev dm-6, sector 7688
end_request: I/O error, dev dm-6, sector 524288
Buffer I/O error on device dm-6, logical block 65536
lost page write due to I/O error on dm-6
JBD2: Error -5 detected when updating journal superblock for dm-6-8.
end_request: I/O error, dev dm-6, sector 524296
Aborting journal on device dm-6-8.
end_request: I/O error, dev dm-6, sector 524288
Buffer I/O error on device dm-6, logical block 65536
lost page write due to I/O error on dm-6
JBD2: Error -5 detected when updating journal superblock for dm-6-8.

# cat /sys/block/sdh/queue/write_same_max_bytes
0
# cat /sys/block/dm-6/queue/write_same_max_bytes
33553920

After this patch:

EXT4-fs (dm-6): mounted filesystem with ordered data mode. Opts: (null)
device-mapper: multipath: XXX snitm debugging: got -EREMOTEIO (-121)
device-mapper: multipath: XXX snitm debugging: WRITE SAME I/O failed with error=-121
end_request: critical target error, dev dm-6, sector 528
dm-6: WRITE SAME failed. Manually zeroing.

# cat /sys/block/sdh/queue/write_same_max_bytes
0
# cat /sys/block/dm-6/queue/write_same_max_bytes
0

It should be noted that WRITE SAME support wasn't enabled in DM
multipath until v3.10.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: stable@vger.kernel.org # 3.10+
2013-09-20 10:36:34 -04:00
Jason Low
f48627e686 sched/balancing: Periodically decay max cost of idle balance
This patch builds on patch 2 and periodically decays that max value to
do idle balancing per sched domain by approximately 1% per second. Also
decay the rq's max_idle_balance_cost value.

Signed-off-by: Jason Low <jason.low2@hp.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379096813-3032-4-git-send-email-jason.low2@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-20 12:03:46 +02:00
Jason Low
9bd721c55c sched/balancing: Consider max cost of idle balance per sched domain
In this patch, we keep track of the max cost we spend doing idle load balancing
for each sched domain. If the avg time the CPU remains idle is less then the
time we have already spent on idle balancing + the max cost of idle balancing
in the sched domain, then we don't continue to attempt the balance. We also
keep a per rq variable, max_idle_balance_cost, which keeps track of the max
time spent on newidle load balances throughout all its domains so that we can
determine the avg_idle's max value.

By using the max, we avoid overrunning the average. This further reduces the
chance we attempt balancing when the CPU is not idle for longer than the cost
to balance.

Signed-off-by: Jason Low <jason.low2@hp.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379096813-3032-3-git-send-email-jason.low2@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-09-20 12:03:44 +02:00
Linus Torvalds
b75ff5e84b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) If the local_df boolean is set on an SKB we have to allocate a
    unique ID even if IP_DF is set in the ipv4 headers, from Ansis
    Atteka.

 2) Some fixups for the new chipset support that went into the sfc
    driver, from Ben Hutchings.

 3) Because SCTP bypasses a good chunk of, and actually duplicates, the
    logic of the ipv6 output path, some IPSEC things don't get done
    properly.  Integrate SCTP better into the ipv6 output path so that
    these problems are fixed and such issues don't get missed in the
    future either.  From Daniel Borkmann.

 4) Fix skge regressions added by the DMA mapping error return checking
    added in v3.10, from Mikulas Patocka.

 5) Kill some more IRQF_DISABLED references, from Michael Opdenacker.

 6) Fix races and deadlocks in the bridging code, from Hong Zhiguo.

 7) Fix error handling in tun_set_iff(), in particular don't leak
    resources.  From Jason Wang.

 8) Prevent format-string injection into xen-netback driver, from Kees
    Cook.

 9) Fix regression added to netpoll ARP packet handling, in particular
    check for the right ETH_P_ARP protocol code.  From Sonic Zhang.

10) Try to deal with AMD IOMMU errors when using r8169 chips, from
    Francois Romieu.

11) Cure freezes due to recent changes in the rt2x00 wireless driver,
    from Stanislaw Gruszka.

12) Don't do SPI transfers (which can sleep) in interrupt context in
    cw1200 driver, from Solomon Peachy.

13) Fix LEDs handling bug in 5720 tg3 chips already handled for 5719.
    From Nithin Sujir.

14) Make xen_netbk_count_skb_slots() count the actual number of slots
    that will be used, taking into consideration packing and other
    issues that the transmit path will run into.  From David Vrabel.

15) Use the correct maximum age when calculating the bridge
    message_age_timer, from Chris Healy.

16) Get rid of memory leaks in mcs7780 IRDA driver, from Alexey
    Khoroshilov.

17) Netfilter conntrack extensions were converted to RCU but are not
    always freed properly using kfree_rcu().  Fix from Michal Kubecek.

18) VF reset recovery not being done correctly in qlcnic driver, from
    Manish Chopra.

19) Fix inverted test in ATM nicstar driver, from Andy Shevchenko.

20) Missing workqueue destroy in cxgb4 error handling, from Wei Yang.

21) Internal switch not initialized properly in bgmac driver, from Rafał
    Miłecki.

22) Netlink messages report wrong local and remote addresses in IPv6
    tunneling, from Ding Zhi.

23) ICMP redirects should not generate socket errors in DCCP and SCTP.
    We're still working out how this should be handled for RAW and UDP
    sockets.  From Daniel Borkmann and Duan Jiong.

24) We've had several bugs wherein the network namespace's loopback
    device gets accessed after it is free'd, NULL it out so that we can
    catch these problems more readily.  From Eric W Biederman.

25) Fix regression in TCP RTO calculations, from Neal Cardwell.

26) Fix too early free of xen-netback network device when VIFs still
    exist.  From Paul Durrant.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (87 commits)
  netconsole: fix a deadlock with rtnl and netconsole's mutex
  netpoll: fix NULL pointer dereference in netpoll_cleanup
  skge: fix broken driver
  ip: generate unique IP identificator if local fragmentation is allowed
  ip: use ip_hdr() in __ip_make_skb() to retrieve IP header
  xen-netback: Don't destroy the netdev until the vif is shut down
  net:dccp: do not report ICMP redirects to user space
  cnic: Fix crash in cnic_bnx2x_service_kcq()
  bnx2x, cnic, bnx2i, bnx2fc: Fix bnx2i and bnx2fc regressions.
  vxlan: Avoid creating fdb entry with NULL destination
  tcp: fix RTO calculated from cached RTT
  drivers: net: phy: cicada.c: clears warning Use #include <linux/io.h> instead of <asm/io.h>
  net loopback: Set loopback_dev to NULL when freed
  batman-adv: set the TAG flag for the vid passed to BLA
  netfilter: nfnetlink_queue: use network skb for sequence adjustment
  net: sctp: rfc4443: do not report ICMP redirects to user space
  net: usb: cdc_ether: use usb.h macros whenever possible
  net: usb: cdc_ether: fix checkpatch errors and warnings
  net: usb: cdc_ether: Use wwan interface for Telit modules
  ip6_tunnels: raddr and laddr are inverted in nl msg
  ...
2013-09-19 13:57:28 -05:00
Linus Torvalds
e9ff04dd94 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull ceph fixes from Sage Weil:
 "These fix several bugs with RBD from 3.11 that didn't get tested in
  time for the merge window: some error handling, a use-after-free, and
  a sequencing issue when unmapping and image races with a notify
  operation.

  There is also a patch fixing a problem with the new ceph + fscache
  code that just went in"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  fscache: check consistency does not decrement refcount
  rbd: fix error handling from rbd_snap_name()
  rbd: ignore unmapped snapshots that no longer exist
  rbd: fix use-after free of rbd_dev->disk
  rbd: make rbd_obj_notify_ack() synchronous
  rbd: complete notifies before cleaning up osd_client and rbd_dev
  libceph: add function to ensure notifies are complete
2013-09-19 12:50:37 -05:00