[Source](http://yarchive.net/comp/linux/semaphores.html "Permalink to Semaphores (Linus Torvalds) ")

# Semaphores (Linus Torvalds)

* * *
From: torvalds@transmeta.com (Linus Torvalds)
Newsgroups: comp.os.linux.development.system
Subject: [Re: NT kernel guy playing with Linux][5]
Date: 27 Jun 1999 03:25:59 GMT

In article <7l3oan$mrj$1@wire.cadcamlab.org>,
Peter Samuelson <psamuels@sampo.creighton.edu> wrote:

>I say this in the hope of sparing others the confusion I had about what
>a spinlock is. (It looked a lot like what CS calls a semaphore, so why
>don't they just call it a semaphore, and what's this semaphore_t...?)

If your CS courses didn't tell you the difference between a semaphore
and a spinlock, your CS courses were bad (or just didn't cover much
about concurrency, which is fairly common).

Blame your professors, don't blame the Linux kernel code.

>A spinlock is a semaphore used for very short critical sections; while
>waiting for a spinlock to be released, the kernel sits on the CPU and
>does nothing.

A spinlock is a mutual exclusion mechanism, not a semaphore (a semaphore
is a very specific _kind_ of mutual exclusion).

But yes, you're right in that it's (by design) a busy-waiting one.
That's why they are called "spinlocks" - they "spin" waiting for the
lock to go away.

> A `semaphore_t' is a longer-term semaphore which has the
>advantage that the kernel puts the needy process to sleep and goes and
>does something else rather than busy-waiting.

A semaphore also has the ability to have more than one process enter the
critical region. Basically semaphores were (as far as I know) first
proposed by Dijkstra, and they explicitly imply a "sleep"/"wakeup"
behaviour, ie they are _not_ spinlocks. They originally had operations
called "P()" and "V()", but nobody ever remembers whether P() was down()
or up(), so nobody uses those names any more. Dijkstra was probably a
bit heavy on drugs or something (I think the official explanation is
that P and V are the first letters in some Dutch words, but I personally
find the drug overdose story much more believable).

Also, unlike basic spinlocks, a semaphore has a "count" value: each
process that does a down() operation decrements the count if positive,
until the count would go negative. Only then do they sleep. The
original intent of this was to allow multiple entries to the region
protected by the semaphore: by initializing the count to 4, you allow
four down() operations and only the fifth one will block.

However, almost all practical use of semaphores is a special case where
the counter is initialized to 1, and where they are used as simple
mutual exclusion with only one user allowed in the critical region.
Such a semaphore is often called a "mutex" semaphore for MUTual
EXclusion.

I've never really seen anybody use the more complex case of semaphores,
although I do know of cases where it can be useful. For example, one use
of a more complex semaphore is as a "throttle", where you do something
like this:

	/* Maximum concurrent users */
	#define MAX_CONCURRENT_USERS 20
	struct semaphore sem;

	init_sema(&sem, MAX_CONCURRENT_USERS);

and then each user does a down() on the semaphore before starting an
operation. It won't block until you have 20 users - you've not created
a mutual exclusion, but you HAVE created a throttling mechanism. See?
Potentially useful, as I said, but the common case (and the only case
currently in use in the Linux kernel - even though the implementation
definitely can handle the general case) is certainly the mutex one.

>However, you need to grab a spinlock for the purpose of grabbing a
>semaphore_t, so for short critical sections a spinlock (even with the
>potential of busy-waiting) is more efficient.

Only in bad implementations or on bad hardware.

All reasonably modern CPU's can do a semaphore without having grabbed a
spinlock. Often a spinlock is needed for the _contention_ case, but if
done right that is very rare. The contention case for a semaphore is
usually the very expensive one, because that's when you have to
re-schedule etc.

So basically spinlocks are much simpler, and faster under short-lived
contention, so that's why they tend to be used. Also, semaphores cannot
be used by interrupt handlers, as Linux doesn't allow interrupt handlers
to sleep, so anything that protects interrupts needs to be a spinlock.

In addition to semaphores, there are other mutual exclusion notions, the
most popular being a "read-write" lock - something that requires
exclusive access for writers, but allows any number of readers at a
time. Linux has the spinning version of this, but not the blocking one.
We'll probably add a blocking version some day, as it's often very
useful, but it hasn't been a major issue yet.

For example, the per-VM memory management semaphore could very usefully
be a blocking read-write lock, but without heavy thread contention a
mutex semaphore is basically equivalent.

> Since the usual practice
>is to make your critical sections as short as possible, the result is
>that the kernel uses a lot more spinlocks than semaphore_t's.

True. Semaphores are really only useful around anything that does IO,
for example. When the potential contention period is multiple
milliseconds as opposed to nano- or microseconds, blocking operations
(ie operations that cause a re-schedule on contention) are the right way
to go.

Note that some people believe in a mix-and-match approach, where you
have a spinlock that gets upgraded to a semaphore if it waits too long.
Personally, I think that only makes sense if (a) you're in user space
and don't know what the scheduling rules are or (b) your locking is so
badly designed that you have tons of short-lived contention on your
semaphores, so you want to try the light approach first.

		Linus

* * *

Date: Sun, 8 Apr 2001 20:08:13 -0700 (PDT)
From: Linus Torvalds <torvalds@transmeta.com>
Subject: [Re: rw_semaphores][6]
Newsgroups: fa.linux.kernel

On Sun, 8 Apr 2001, Andrew Morton wrote:
>
> One issue with the current implementation is the (necessary)
> design where one reader (or writer) is sleeping in
> down_x_failed_biased(), with its bias in place, whereas
> all other sleepers are sleeping in down_x_failed(), with
> their bias not in place.

Why not do this the same way the _real_ semaphores do it?

You have a fast-path that has a single integer, and a slow path which has
more state and is protected by a spinlock. The _only_ worry I see is to
make sure that we properly have a "contention" state, because both readers
and writers need to be able to know when they should wake things up. But
there's a _trivial_ contention state. Read on:

Forget about the BIAS stuff, and go back to the old simple "negative is
writer, positive is reader" implementation:

	0          - unlocked
	1..n       - 1-n readers
	0xc0000000 - writer
	other <0   - contention

Do you see anything wrong with this?

Implementation:

 - fast path:

	down_read:
		lock incl (%sem)
		js __down_read_failed

	down_write:
		xorl %eax,%eax
		movl $0xc0000000,%r
		lock cmpxchgl %r,(%sem)
		jne __down_write_failed

	up_read:
		lock decl (%sem)
		js __up_read_wakeup

	up_write:
		lock andl $0x3fffffff,(%sem)
		jne __up_write_wakeup

The above are all fairly obvious for the non-failure case, agreed?
Including the guarantee that _if_ there is contention, the "up()"
routines will always go to the slow path to handle the contention.
Now, the _slow_ case could be your generic "protected by a spinlock" code,
although I have a suggestion. As follows:

The only half-way "subtle" case is that __down_write_failed needs to make
sure that it marks itself as a contender (the down_read() fast-case code
will have done so already by virtue of incrementing the counter, which is
guaranteed to have resulted in a "contention" value).

While down_read() automatically gets a "contention value" on failure,
"down_write()" needs to do extra work. The extra work is not all that
expensive: the simplest thing is to just do

	subl $0x8000,(%sem)

at the beginning - which will cause a "contention" value regardless of
whether it was pure-reader before (it goes negative by the simple
assumption that there will never be more than 32k concurrent readers), or
whether it was locked by a writer (it stays negative due to the 0x40000000
bit in the write lock logic). In both cases, both an up_write() and an
up_read() will notice that they need to handle the contention of a waiting
writer-to-be.

Would you agree that the above fast-path implementation guarantees that we
always get to the spinlock-protected slow path?

Now, the above _heavily_ favours readers. There's no question that the
above is unfair. I think that's ok. I think fairness and efficiency tend
to be at odds. But queuing theory shows that the faster you can get out of
the critical region, the smaller your queue will be - exponentially. So
speed is worth it.

Let's take an example at this point:

 - lock is zero

 - writer(1) gets lock: lock is 0xc0000000

 - reader(2) tries to get in: lock becomes 0xc0000001, synchronizes on
   spinlock.

 - another writer(3) tries to get in: lock becomes 0xbfff8001, synchronizes
   on spinlock.

 - writer(1) does up_write: lock becomes 0x3fff8001, != 0, so writer decides
   it needs to wake up, and synchronizes on spinlock.

 - another reader(4) comes in on another CPU, increments, and notices that
   it can do so without it being negative: lock becomes 0x3fff8002 and
   this one does NOT synchronize on the spinlock.

End result: reader(4) "stole base" and actually got the lock without ever
seeing any contention, and we now have (1), (2) and (3) who are serialized
inside the spinlock.
So we get to what the serializers have to do, ie the slow path:

First, let's do the __up_read/write_wakeup() case, because that one is the
easy one. In fact, I think it ends up being the same function:

	spin_lock(&sem->lock);
	wake_up(&sem->waiters);
	spin_unlock(&sem->lock);

and we're all done. The only optimization here (which we should do for
regular semaphores too) is to use the same spinlock for the semaphore lock
and the wait-queue lock.

The above is fairly obviously correct, and sufficient: we have shown that
we'll get here if there is contention, and the only other thing that the
wakeup could sanely do would possibly be to select which process to wake.
Let's not do that yet.

The harder case is __down_read/write_failed(). Here is my suggested
pseudo-code:

	__down_write_failed(sem)
	{
		DECLARE_WAIT_QUEUE(wait, current);

		lock subl $0x8000,(%sem)	/* Contention marker */
		spin_lock(&sem->lock);
		add_wait_queue_exclusive(&sem->wait, &wait);
		for (;;) {
			unsigned int value, newvalue;

			set_task_state(TASK_SLEEPING);
			value = sem->value;

			/*
			 * Ignore other pending writers: but if there
			 * are pending readers or a write holder we
			 * should sleep
			 */
			if (value & 0xc0007fff) {
				spin_unlock(&sem->lock);
				schedule();
				spin_lock(&sem->lock);
				continue;
			}

			/*
			 * This also undoes our contention marker thing, while
			 * leaving other waiters' contention markers in place
			 */
			newvalue = (value + 0x8000) | 0xc0000000;
			if (lock_cmpxchg(sem->value, value, newvalue))
				break;	/* GOT IT! */

			/* Damn, somebody else changed it from under us */
			continue;
		}
		remove_wait_queue(&sem->wait, &wait);
		spin_unlock(&sem->lock);
	}

The down_read() slow case is equivalent, but ends up being much simpler
(because we don't need to play with contention markers or ignore other
people's contention markers):

	__down_read_failed(sem)
	{
		DECLARE_WAIT_QUEUE(wait, current);

		spin_lock(&sem->lock);
		add_wait_queue(&sem->wait, &wait);
		for (;;) {
			set_task_state(TASK_SLEEPING);
			/*
			 * Yah! We already did our "inc", so if we ever see
			 * a positive value we're all done.
			 */
			if (sem->value > 0)
				break;
			spin_unlock(&sem->lock);
			schedule();
			spin_lock(&sem->lock);
		}
		remove_wait_queue(&sem->wait, &wait);
		spin_unlock(&sem->lock);
	}

Can anybody shoot any holes in this? I haven't actually tested it, but
race conditions in locking primitives are slippery things, and I'd much
rather have an algorithm we can _think_ about and prove to be working. And
I think the above one is provably correct.

Not that I want to go to that kind of extremes.

Anybody? Andrew? Mind giving the above a whirl on your testbed? Or can you
see some thinko in it without even testing?

Note in particular how the above keeps the non-contention "down_read()" /
"up_read()" cases as two single instructions, no slower than a spinlock.

(There are the following assumptions in the code: there are at most 32k
active readers, and also at most 32k pending writers. The limits come from
the writer contention marker logic. You have the 32 bits split up as:

 - 2 bits "write lock owner" (it's really only one bit, but the other bit
   is set so that the writer contention marker won't turn the most
   negative number into a positive one, so we have one "extra" bit set to
   keep the thing negative for the whole duration of a write lock)
 - 15 "reader count" bits
 - 15 "pending writer count" bits

and the 15 bits is why you have the 32k user limitation. I think it's an
acceptable one - and if not you can expand on it by adding extra fields
that are only accessed within the spinlock).
		Linus

* * *

Date: Sun, 8 Apr 2001 21:18:20 -0700 (PDT)
From: Linus Torvalds <torvalds@transmeta.com>
Subject: [Re: rw_semaphores][6]
Newsgroups: fa.linux.kernel

The "down_writer_failed()" case was wrong:

On Sun, 8 Apr 2001, Linus Torvalds wrote:
>
> __down_write_failed(sem)
> {
>	....
>	/*
>	 * Ignore other pending writers: but if there
>	 * are pending readers or a write holder we
>	 * should sleep
>	 */
>	if (value & 0xc0007fff) {
	....

The "value & 0xc0007fff" test is wrong, because it's actually always true
for some normal contention case (multiple pending writers waiting for a
reader to release the lock). Because even if there is no write lock
holder, other pending writers (and we're one) will have caused the high
bits to be set, so the above would end up causing us to think that we
can't get the lock even if there is no real lock holder.

The comment is right, though. It's just the test that is simplistic and
wrong.

The pending readers part is correct, and obvious enough:

	(value & 0x7fff) != 0

implies that there are readers. In which case we should try again. Simple
enough.

The pending writers part is slightly more subtle: we know that there is at
least _one_ pending writer, namely us. It turns out that we must check the
two high bits, and the logic is:

 - 11: only pending writers, no write lock holder (if we had a write lock
       holder, he would have set the bits to 11, but a pending writer
       would have borrowed from the lower bit, so you'd get bit pattern
       10).
 - 10: we have a real lock holder, and the pending writers borrowed from
       the low lock bit when they did the "subl $0x8000" to mark off
       contention.
 - 01: must not happen. BUG.
 - 00: we had a real write lock holder, but he released the lock and
       cleared both bits.

So the "is there a write lock holder" test basically becomes

	(value & 0xc0000000) == 0x80000000

and the full test should be

	if ((value & 0x7fff) || ((value & 0xc0000000) == 0x80000000)) {
		spin_unlock();
		schedule();
		spin_lock();
		continue;
	}

which might be rewritten some simpler way. I'm too lazy to think about it
even more.
For similar reasons, the "newvalue" calculation was subtly bogus: we must
make sure that we maintain the correct logic for the two upper bits in the
presence of _other_ pending writers. We can't just do the unconditional
binary "or" operation to set the two upper bits, because then we'd break
the above rules if there are other pending writers. So the newvalue
calculation should be something along the lines of

	/* Undo _our_ contention marker */
	newvalue = value + 0x8000;

	/* Get rid of stale writer lock bits */
	newvalue &= 0x3fffffff;

	/*
	 * If there were no other pending writers (newvalue == 0), we set
	 * both high bits, otherwise we only set bit 31.
	 * (see above on the "borrow bit being clear" logic).
	 */
	if (!newvalue)
		newvalue = 0xc0000000;
	newvalue |= 0x80000000;

And THEN I think the algorithm in the email I'm following up to should
actually really work.

Does anybody find any other details I missed?

And no, I have not yet actually _tested_ any of this. But all my code
works on the first try (or, in this case, second try if you want to be a
stickler for details).

No wonder we didn't get this right first time through. It's not really all
that horribly complicated, but the _details_ kill you.

		Linus

* * *

Date: Mon, 9 Apr 2001 22:43:53 -0700 (PDT)
From: Linus Torvalds <torvalds@transmeta.com>
Subject: [Re: rw_semaphores][7]
Newsgroups: fa.linux.kernel

On Tue, 10 Apr 2001, Tachino Nobuhiro wrote:
>
> I am not familiar with semaphore or x86, so this may not be correct,
> but if the following sequence is possible, the writer can call wake_up()
> before the reader calls add_wait_queue() and reader may sleep forever.
> Is it possible?

The ordering is certainly possible, but if it happens,
__down_read_failed() won't actually sleep, because it will notice that the
value is positive and just return immediately. So it will do some
unnecessary work (add itself to the wait-queue only to remove itself
immediately again), but it will do the right thing.

		Linus

* * *

Date: Tue, 10 Apr 2001 12:42:07 -0700 (PDT)
From: Linus Torvalds <torvalds@transmeta.com>
Subject: [Re: [PATCH] i386 rw_semaphores fix][8]
Newsgroups: fa.linux.kernel

On Tue, 10 Apr 2001, David Howells wrote:
>
> Here's a patch that fixes RW semaphores on the i386 architecture. It is very
> simple in the way it works.

XADD only works on Pentium+.

That's no problem if we make this SMP-specific - I doubt anybody actually
uses SMP on i486's even if the machines exist, as I think they all had
special glue logic that Linux would have trouble with anyway. But the
advantage of being able to use one generic kernel that works on plain UP
i386 machines as well as SMP P6+ machines is big enough that I would want
to be able to say "CONFIG_X86_GENERIC" + "CONFIG_SMP".

Even if it would be noticeably slower (ie a fallback to a spinlock might
be perfectly ok).

If you do this, I would suggest having asm-i386/{rwsem.h|rwsem-xadd.h},
and just having a

	#ifndef CONFIG_XADD
	#include <asm/rwsem.h>
	#else
	#include <asm/rwsem-xadd.h>
	#endif

(And adding "CONFIG_XADD" to the list of generated optimization
configuration options in arch/i386/config.in, of course).

That way we don't make the semaphore.h file even more unreadable.

		Linus

* * *
Date: Tue, 10 Apr 2001 13:16:10 -0700 (PDT)
From: Linus Torvalds <torvalds@transmeta.com>
Subject: [Re: [PATCH] i386 rw_semaphores fix][8]
Newsgroups: fa.linux.kernel

On Tue, 10 Apr 2001, Andi Kleen wrote:
>
> I guess 386 could live with an exception handler that emulates it.

That approach is fine, although I'd personally prefer to take the
exception just once and just rewrite the instruction as a "call". The
places that need xadd would have to follow some strict guidelines (long
modrms or other instructions to pad out to enough size, and have the
arguments in fixed registers)

> (BTW a generic exception handler for CMPXCHG would also be very useful
> for glibc -- currently it has special checking code for 386 in its mutexes)
> The 386 are so slow that nobody would probably notice a bit more slowness
> by a few exceptions.

Ehh. I find that the slower the machine is, the more easily I _notice_
that it is slow. So..

		Linus

* * *

Date: Tue, 10 Apr 2001 17:55:09 -0700 (PDT)
From: Linus Torvalds <torvalds@transmeta.com>
Subject: [Re: [PATCH] i386 rw_semaphores fix][8]
Newsgroups: fa.linux.kernel

On Wed, 11 Apr 2001, David Weinehall wrote:
> >
> > Yes, and with CMPXCHG handler in the kernel it wouldn't be needed
> > (the other 686 optimizations like memcpy also work on 386)
>
> But the code would be much slower, and it's on 386's and similarly
> slow beasts you need every cycle you can get, NOT on a Pentium IV.

Note that the "fixup" approach is not necessarily very painful at all,
from a performance standpoint (either on 386 or on newer CPU's). It's not
really that hard to just replace the instruction in the "undefined
instruction" handler by having strict rules about how to use the "xadd"
instruction.

For example, you would not actually fix up the xadd to be a function call
to something that emulates "xadd" itself on a 386. You would fix up the
whole sequence of "inline down_write()" with a simple call to an
out-of-line "i386_down_write()" function.

Note that down_write() on an old 386 is likely to be complicated enough
that you want to do it out-of-line anyway, so the code-path you take
(after the first time you've trapped on that particular location) would be
the one you would take for an optimized 386 kernel anyway. And similarly,
the restrictions you place on non-386-code to make it fixable are simple
enough that it probably shouldn't make a difference for performance on
modern chips.

		Linus

* * *
Date: Wed, 11 Apr 2001 11:41:06 -0700 (PDT)
From: Linus Torvalds <torvalds@transmeta.com>
Subject: [Re: [PATCH] i386 rw_semaphores fix][9]
Newsgroups: fa.linux.kernel

On Wed, 11 Apr 2001, David Howells wrote:
>
> > These numbers are infinity :)
>
> I know, but I think Linus may be happy with the resolution for the moment. It
> can be extended later by siphoning off excess quantities of waiters into a
> separate counter (as is done now) and by making the access count use a larger
> part of the variable.

I'm certainly ok with the count being limited to "thousands". I don't see
people being able to exploit it any practical way. But we should remember
to be careful: starting thousands of threads and trying to make them all
take page faults and overflowing the read counter would be a _nasty_
attack. It would probably not be particularly easy to arrange, but still.

Note that blocking locks are different from spinlocks: for spinlocks we
can get by with just 7 bits in a byte, and be guaranteed that that is
enough for any machine with less than 128 processors. For the blocking
locks, that is not true.

(Right now the default "max user processes" ulimit already protects us
from this exploit, I just wanted to make sure that people _think_ about
this).

So a 16-bit count is _fine_. And I could live with less.

We should remember the issue, though. If we ever decide to expand it, it
would be easy enough to make an alternative "rwsem-reallybig.h" that uses
cmpxchg8b instead, or whatever. You don't need to start splitting the
counts up to expand them past 16 bits, you could have a simple system
where the _read_ locks only look at one (fast) 32-bit word for their fast
case, and only the write lock actually needs to use cmpxchg8b.

(I think it's reasonable to optimize the read-lock more than the
write-lock: in the cases where you want to do rw-locking, the common
reason is because you really _want_ to allow many concurrent readers. That
also implies that the read case is the common one).

So you could fairly easily expand past 16-bit counters by using a 31-bit
counter for the reader, and making the high bit in the reader count be the
"contention" bit. Then the slow path (and the write path) would use the
64-bit operations offered by cmpxchg8b.

And yes, it's a Pentium+ instruction (and this time I -checked- ;), but by
the time you worry about hundreds of thousands of threads I think you can
safely just say "you'd better be running on a big, modern machine", and
just make the code conditional on CONFIG_PENTIUM+

So no real trickiness needed for expanding the number space, but certainly
also no real _reason_ for it at this time.

		Linus

* * *
Date: Mon, 16 Apr 2001 10:05:57 -0700 (PDT)
From: Linus Torvalds <torvalds@transmeta.com>
Subject: [Re: rw_semaphores][10]
Newsgroups: fa.linux.kernel

On Mon, 16 Apr 2001 yodaiken@fsmlabs.com wrote:
>
> I'm trying to imagine a case where 32,000 sharing a semaphore was anything but a
> major failure and I can't. To me: the result of an attempt by the 32,768th locker
> should be a kernel panic. Is there a reasonable scenario where this is wrong?

Hint: "I'm trying to imagine a case when writing all zeroes to /etc/passwd
is anything but a major failure, but I can't. So why don't we make
/etc/passwd world-writable?"

Right. Security.

There is _never_ any excuse for panic'ing because of some inherent
limitation of the data structures. You can return -ENOMEM, -EAGAIN or
something like that, but you must _not_ allow a panic (or a roll-over,
which would just result in corrupted kernel data structures).

Note that the limit is probably really easy to work around even without
extending the number of bits: a sleeper that notices that the count is
even _halfway_ to rolling around could easily do something like:

 - undo "this process" action
 - sleep for 1 second
 - try again from the beginning.

I certainly agree that no _reasonable_ pattern can cause the failure, but
we need to worry about people who are malicious. The above trivial
approach would take care of that, while not penalizing any non-malicious
users.

So I'm not worried about this at all. I just want people _always_ to think
about "how could I mis-use this if I was _truly_ evil", and making sure it
doesn't cause problems for others on the system.

(NOTE: This does not mean that the kernel has to do anything _reasonable_
under all circumstances. There are cases where Linux has decided that
"this is not something a reasonable program can do, and if you try to do
it, we'll give you random results back - but they will not be _security_
holes". We don't need to be _nice_ to unreasonable requests. We just must
never panic, otherwise crash, or allow unreasonable requests to mess up
_other_ people)

		Linus

* * *
Date: Fri, 20 Apr 2001 10:46:01 -0700 (PDT)
From: Linus Torvalds <torvalds@transmeta.com>
Subject: [Re: [andrea@suse.de: Re: generic rwsem [Re: Alpha "process table][11]
Newsgroups: fa.linux.kernel

On Fri, 20 Apr 2001, David Howells wrote:
>
> The file should only be used for the 80386 and maybe early 80486's where
> CMPXCHG doesn't work properly, everything above that can use the XADD
> implementation.

Why are those not using the generic files? The generic code is obviously
more maintainable.

> But if you want it totally non-inline, then that can be done. However, whilst
> developing it, I did notice that that slowed things down, hence why I wanted
> it kept in line.

I want to keep the _fast_ case in-line.

I do not care at ALL about the stupid spinlock version. That should be the
_fallback_, and it should be out-of-line. It is always going to be the
slowest implementation, modulo bugs in architecture-specific code.

For i386 and i486, there is no reason to try to maintain a complex fast
case. The machines are unquestionably going away - we should strive to not
burden them unnecessarily, but we should _not_ try to save two cycles.

In short:

 - the only case that _really_ matters for performance is the uncontended
   read-lock for "reasonable" machines. An i386 no longer counts as
   reasonable, and designing for it would be silly. And the write-lock
   case is much less compelling.
 - We should avoid any inlines where the inline code is >2* the
   out-of-line code. Icache issues can overcome any cycle gains, and do
   not show up well in benchmarks (benchmarks tend to have very hot
   icaches). Note that this is less important for the out-of-line code in
   another segment that doesn't get brought into the icache at all for the
   non-contention case, but that should still be taken _somewhat_ into
   account if only because of kernel size issues.

Both of the above rules imply that the generic spin-lock implementation
should be out-of-line.

> (1) asm-i386/rwsem-spin.h is wrong, and can probably be replaced with the
>     generic spinlock implementation without inconveniencing people much.
>     (though someone has commented that they'd want this to be inline as
>     cycles are precious on the slow 80386).

Icache is also precious on the 386, which has no L2 in 99% of all cases.
Make it out-of-line.

> (2) "fix up linux/rwsem-spinlock.h": do you want the whole generic spinlock
>     implementation made non-inline then?

Yes. People who care about performance _will_ have architecture-specific
inlines on architectures where they make sense (ie 99% of them).

		Linus

* * *
|
||
|
|
||
|
|
||
|
Date: Fri, 20 Apr 2001 16:45:32 -0700 (PDT)
From: Linus Torvalds <torvalds@transmeta.com>
Subject: [Re: x86 rwsem in 2.4.4pre[234] are still buggy [was Re: rwsem][12]
Newsgroups: fa.linux.kernel

On Fri, 20 Apr 2001, Andrea Arcangeli wrote:
>
> While dropping the list_empty check to speed up the fast path I faced the same
> complexity of the 2.4.4pre4 lib/rwsem.c and so before reinventing the wheel I
> read how the problem was solved in 2.4.4pre4.

I would suggest the following:

 - the generic semaphores should use the lock that already exists in the
   wait-queue as the semaphore spinlock.

 - the generic semaphores should _not_ drop the lock. Right now it drops
   the semaphore lock when it goes into the slow path, only to re-acquire
   it. This is due to bad interfacing with the generic slow-path routines.

   I suspect that this lock-drop is why Andrea sees problems with the
   generic semaphores. The changes to "count" and "sleeper" aren't
   actually atomic, because we don't hold the lock over them all. And
   re-using the lock means that we don't need the two levels of
   spinlocking for adding ourselves to the wait queue. Easily done by just
   moving the locking _out_ of the wait-queue helper functions, no?

 - the generic semaphores are entirely out-of-line, and are just declared
   universally as regular FASTCALL() functions.

The fast-path x86 code looks ok to me. The debugging stuff makes it less
readable than it should be, I suspect, and is probably not worth it at
this stage. The users of rw-semaphores are so well-defined (and so well
debugged) that the debugging code only makes the code harder to follow
right now.

Comments? Andrea? Your patches have looked ok, but I absolutely refuse to
see the non-inlined fast-path for reasonable x86 hardware..

Linus

* * *

Date: Sat, 21 Apr 2001 10:18:06 -0700 (PDT)
From: Linus Torvalds <torvalds@transmeta.com>
Subject: [Re: x86 rwsem in 2.4.4pre[234] are still buggy [was Re: rwsem][13]
Newsgroups: fa.linux.kernel

On Sat, 21 Apr 2001, Russell King wrote:
>
> Erm, spin_lock()? What if up_read or up_write gets called from interrupt
> context (is this allowed)?

Currently that is not allowed.

We allow it for regular semaphores, but not for rw-semaphores.

We may some day have to revisit that issue, but I suspect we won't have
much reason to.

Linus

* * *

From: Linus Torvalds <torvalds@osdl.org>
Newsgroups: fa.linux.kernel
Subject: Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
Date: Fri, 16 Dec 2005 21:43:03 UTC
Message-ID: <[fa.fv9t5bi.g0g5qg@ifi.uio.no][14]>
Original-Message-ID: <[Pine.LNX.4.64.0512161339140.3698@g5.osdl.org][15]>

On Fri, 16 Dec 2005, Thomas Gleixner wrote:

> On Thu, 2005-12-15 at 21:32 +0100, Geert Uytterhoeven wrote:
> > > Why have the "MUTEX" part in there? Shouldn't that just be DECLARE_SEM
> > > (oops, I mean DEFINE_SEM). Especially that MUTEX_LOCKED! What is that?
> > > How does a MUTEX start off as locked. It can't, since a mutex must
> > > always have an owner (which, by the way, helped us in the -rt patch to
> > > find our "compat_semaphores"). So who's the owner of a
> > > DEFINE_SEM_MUTEX_LOCKED?
> >
> > No one. It's not really a mutex, but a completion.
>
> Well, then let us use a completion and not some semantically wrong
> workaround

It is _not_ wrong to have a semaphore start out in locked state.

For example, it makes perfect sense if the data structures that the
semaphore needs need initialization. The way you _should_ handle that is
to make the semaphore come up as locked, and the data structures in some
"don't matter" state, and then the thing that initializes stuff can do so
properly and then release the semaphore.

Yes, in some cases such a locked semaphore is only used once, and ends up
being a "completion", but that doesn't invalidate the fact that this is
a perfectly fine way to handle a real issue.

Linus

* * *

From: Linus Torvalds <torvalds@osdl.org>
Newsgroups: fa.linux.kernel
Subject: Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation
Date: Fri, 16 Dec 2005 22:21:08 UTC
Message-ID: <[fa.g0a54rd.h0o5at@ifi.uio.no][16]>
Original-Message-ID: <[Pine.LNX.4.64.0512161414370.3698@g5.osdl.org][17]>

On Fri, 16 Dec 2005, Thomas Gleixner wrote:
>
> Well, in case of a semaphore it is a semantically correct use case. In
> case of a mutex it is not.

I disagree.

Think of "initialization" as a user. The system starts out initializing
stuff, and as such the mutex should start out being held. It's that
simple. It _is_ mutual exclusion, with one user being the early bootup
state.

Linus

* * *

From: Linus Torvalds <torvalds@osdl.org>
Newsgroups: fa.linux.kernel
Subject: Re: [PATCH 1/12]: MUTEX: Implement mutexes
Date: Sun, 18 Dec 2005 18:43:42 UTC
Message-ID: <[fa.g09n3bh.i067av@ifi.uio.no][18]>
Original-Message-ID: <[Pine.LNX.4.64.0512181027220.4827@g5.osdl.org][19]>

On Sun, 18 Dec 2005, Russell King wrote:
>
> On Sat, Dec 17, 2005 at 10:30:41PM -0800, Linus Torvalds wrote:
> > An interrupt can never change the value without changing it back, except
> > for the old-fashioned use of "up()" as a completion (which I don't think
> > we do any more - we used to do it for IO completion a looong time ago).
>
> I doubt you can guarantee that statement, or has the kernel source
> been audited for this recently?

Well, _if_ it's a noticeable performance win, we should just do it. We
already know that people don't call "down()" in interrupts (it just
wouldn't work), we can instrument "up()" too.

It's easy enough to add a "might_sleep()" to the up(). Not strictly true,
but conceptually it would make sense to make up/down match in that sense.
We'd have to mark the (few) places that do down_trylock() + up() in
interrupt context with a special "up_in_interrupt()", but that would be ok
even from a documentation standpoint.

> However, the real consideration is stability - if a semaphore was
> used for a completion and it was merged, would it be found and
> fixed? Probably not, because it won't cause any problems on
> architectures where semaphores have atomic properties.

Actually, the reason we have completions is that using semaphores as
completions caused some really subtle problems that had nothing to do with
atomicity of the operations themselves, so if you find somebody who uses a
semaphore from an interrupt, I think we want to know about it.

Completions actually have another - and more important - property than the
fact that they have a more logical name for a particular usage.

The completion has "don't touch me" guarantees. A thread or interrupt that
does an "up()" on a semaphore may still touch the memory that was
allocated for the semaphore after the "down()" part has been released.

And THAT was the reason for the completions: we allocate them on the stack
or in temporary allocations, and the thing that does the "down()" to wait
for something to finish will also do the _release_ of the memory.

With semaphores, that caused problems, because the side doing the "up()"
would thus possibly touch memory that got released from under it.

This problem happens only on SMP (since you need to have the downer and
the upper running at the same time), but it's totally independent of the
other atomicity issues. And almost any semaphore that is used as a
completion from an interrupt will have this problem, so yes, if you find
somebody doing an "up()" in interrupt context, we'll fix it.

It would be good to make the rules clear, that you can never touch a
semaphore from irq context without changing it back before you return.

Of course, that still leaves the following sequence

	if (!down_trylock(..)) {
		... do something ..
		up(..);
	}

which is actually used from interrupts too. At least the console layer
does that (printk() can happen from interrupts, and we do a down_trylock
on the console semaphore). But that one shouldn't mess with the _count_,
although it does mean that the wait-queue preparation etc (for when the
fast case fails) does still need to be protected against interrupts.

But that would be the slow case, so from a performance standpoint, it
would still allow the case that really _matters_ to be done with
interrupts enabled.

> Unless of course sparse can be extended to detect the use of unbalanced
> semaphores in interrupt contexts.

In theory, yes, but in practice I'd much rather just do the stupid brute
force things.

> > (Of course, maybe it's not worth it. It might not be a big performance
> > issue).
>
> Balancing the elimination of 4 instructions per semaphore operation,
> totalling about 4 to 6 cycles, vs stability I'd go for stability
> unless we can prove the above assertion via (eg) sparse.

I agree, if ARM interrupt disables are fast. For example, on x86 (where
this isn't needed, because you can have an "interrupt-safe" decrement by
just having it as a single instruction, even if it isn't SMP-safe),
disabling and re-enabling interrupts is just one instruction each, but the
combination is usually something like 50+ cycles. So if this was an issue
on x86, we'd definitely care.

But if you don't think it's a big issue on ARM, it just doesn't matter.

Linus

* * *

From: Linus Torvalds <torvalds@osdl.org>
Newsgroups: fa.linux.kernel
Subject: Re: [PATCH 1/12]: MUTEX: Implement mutexes
Date: Sun, 18 Dec 2005 19:55:47 UTC
Message-ID: <[fa.fva543d.j0g6ir@ifi.uio.no][20]>
Original-Message-ID: <[Pine.LNX.4.64.0512181153080.4827@g5.osdl.org][21]>

On Sun, 18 Dec 2005, James Bottomley wrote:
>
> Actually, I don't think you want might_sleep(): there are a few cases
> where we do an up() from under a spinlock, which will spuriously trigger
> this. I'd suggest WARN_ON(in_interrupt()) instead.

Ahh, good point. Yes.

However, if even the arm people aren't all that interested in this, maybe
it simply doesn't matter. A lot of other architectures either have
"decrement in memory" or can already use ll/sc for it.

(of course, on some architectures, ll/sc is really really slow, so they
might well prefer using a normal load and store instead).

Linus

* * *

From: Linus Torvalds <torvalds@osdl.org>
Newsgroups: fa.linux.kernel
Subject: Re: [patch 00/15] Generic Mutex Subsystem
Date: Mon, 19 Dec 2005 19:12:41 UTC
Message-ID: <[fa.g19j43e.h026iq@ifi.uio.no][22]>
Original-Message-ID: <[Pine.LNX.4.64.0512191053400.4827@g5.osdl.org][23]>

On Mon, 19 Dec 2005, Ingo Molnar wrote:
>
> in fact, generic mutexes are _more_ fair than struct semaphore in their
> wait logic. In the stock semaphore implementation, when a waiter is
> woken up, it will retry the lock, and if it fails, it goes back to the
> _tail_ of the queue again - waiting one full cycle again.

Ingo, I don't think that is true.

It shouldn't be true, at least. The whole point with the "sleeper" count
was to not have that happen. Of course, bugs happen, so I won't guarantee
that's actually true, but ..

If you are woken up as a waiter on a semaphore, you shouldn't fail to get
it. You will be woken first, and nobody else will get at it, because the
count has been kept negative or zero even by the waiters, so that a
fast-path user shouldn't be able to get the lock without going through the
slow path and adding itself (last) to the list.

But hey, somebody should test it with <n> kernel threads that just do
down()/up() and some make-believe work in between to make sure there
really is contention, and count how many times they actually get the
semaphore. That code has been changed so many times that it may not work
the way it is advertised ;)

[ Oh. I'm looking at the semaphore code, and I realize that we have a
  "wake_up(&sem->wait)" in the __down() path because we had some race long
  ago that we fixed by band-aiding over it. Which means that we wake up
  sleepers that shouldn't be woken up. THAT may well be part of the
  performance problem.. The semaphores are really meant to wake up just
  one at a time, but because of that race hack they'll wake up _two_ at a
  time - once by up(), once by down().

  That also destroys the fairness. Does anybody remember why it's that
  way? ]

Ho humm.. That's interesting.

Linus

* * *

From: Linus Torvalds <torvalds@osdl.org>
Newsgroups: fa.linux.kernel
Subject: Re: [patch 00/15] Generic Mutex Subsystem
Date: Mon, 19 Dec 2005 19:57:13 UTC
Message-ID: <[fa.g1a13rj.h0c6qh@ifi.uio.no][24]>
Original-Message-ID: <[Pine.LNX.4.64.0512191148460.4827@g5.osdl.org][25]>

On Mon, 19 Dec 2005, Benjamin LaHaise wrote:
>
> The only thing I can see as an improvement that a mutex can offer over
> the current semaphore implementation is if we can perform the same
> optimization that spinlocks perform in the unlock operation: don't use
> a locked, serialising instruction in the up() codepath. That might be
> a bit tricky to implement, but it's definitely a win on the P4 where the
> cost of serialisation can be quite high.

Good point. However, it really _is_ hard, because we also need to know if
the mutex was under contention. A spinlock doesn't care, so we can just
overwrite the lock value. A mutex would always care, in order to know
whether it needs to do the slow wakeup path.

So I suspect you can't avoid serializing the unlock path for a mutex. The
issue of "was there contention while I held it" fundamentally _is_ a
serializing question.

> > [ Oh. I'm looking at the semaphore code, and I realize that we have a
> >   "wake_up(&sem->wait)" in the __down() path because we had some race long
> >   ago that we fixed by band-aiding over it. Which means that we wake up
> >   sleepers that shouldn't be woken up. THAT may well be part of the
> >   performance problem.. The semaphores are really meant to wake up just
> >   one at a time, but because of that race hack they'll wake up _two_ at a
> >   time - once by up(), once by down().
> >
> >   That also destroys the fairness. Does anybody remember why it's that
> >   way? ]
>
> History?

Oh, absolutely, I already checked the old BK history too, and that extra
wake_up() has been there at least since before we even started using BK.
So it's very much historical, I'm just wondering if somebody remembers far
enough back that we'd know.

I don't see why it's needed (since we re-try the "atomic_add_negative()"
inside the semaphore wait lock, and any up() that saw contention should
have always been guaranteed to do a wakeup that should fill the race in
between that atomic_add_negative() and the thing going to sleep).

It may be that it is _purely_ historical, and simply isn't needed. That
would be funny/sad, in the sense that we've had it there for years and
years ;)

Linus

* * *

From: Linus Torvalds <torvalds@osdl.org>
Newsgroups: fa.linux.kernel
Subject: Re: [patch 00/15] Generic Mutex Subsystem
Date: Mon, 19 Dec 2005 20:14:24 UTC
Message-ID: <[fa.fvpr2re.jg27qq@ifi.uio.no][26]>
Original-Message-ID: <[Pine.LNX.4.64.0512191203120.4827@g5.osdl.org][27]>

On Mon, 19 Dec 2005, Ingo Molnar wrote:
>
> average cost per op: 206.59 usecs
> average cost per op: 512.13 usecs

(mutex vs semaphore).

That looks suspiciously like exactly double the cost, so I do believe that
the double wake_up() might be exactly what is going on.

However:

> hm, removing that wakeup quickly causes hung test-tasks.

So clearly it really is still hiding some bug.

> and even considering that the current semaphore implementation may have
> a fairness bug, i cannot imagine that making it more fair would also
> speed it up.

That's not the point. The extra wakeup() in the semaphore code wakes up
two processes for every single up(), so the semaphores end up not just
being unfair, they also end up doing twice the work (because it will
result in the other processes effectively just doing the down() twice).

> I personally find the semaphore implementation clever but too complex,
> maybe that's a reason why such bugs might be hiding there. (possibly
> for many years already ...)

Oh, absolutely. It is too complex.

And don't get me wrong: if it's easier to just ignore the performance bug,
and introduce a new "struct mutex" that just doesn't have it, I'm all for
it. However, if so, I do NOT want to do the unnecessary renaming. "struct
semaphore" should stay as "struct semaphore", and we should not affect old
code in the _least_.

Then code can switch to "struct mutex" if people want to. And if one
reason for it ends up being that the code avoids a performance bug in the
process, all the better ;)

IOW, I really think this should be a series of small patches that don't
touch old users of "struct semaphore" at all. None of this "semaphore" to
"arch_semaphore" stuff, and the new "struct mutex" would not re-use _any_
of the names that the old "struct semaphore" uses.

Linus

* * *

From: Linus Torvalds <torvalds@osdl.org>
Newsgroups: fa.linux.kernel
Subject: Re: [patch 04/15] Generic Mutex Subsystem,
 add-atomic-call-func-x86_64.patch
Date: Tue, 20 Dec 2005 20:13:01 UTC
Message-ID: <[fa.g09n334.i0672i@ifi.uio.no][28]>
Original-Message-ID: <[Pine.LNX.4.64.0512201202200.4827@g5.osdl.org][29]>

On Tue, 20 Dec 2005, Russell King wrote:
>
> Also, Nico has an alternative idea for mutexes which does not
> involve decrementing or incrementing - it's an atomic swap.
> That works out at about the same cycle count on non-Intel ARM
> CPUs as the present semaphore path. I'm willing to bet that
> it will be faster than the present semaphore path on Intel ARM
> CPUs.

Don't be so sure, especially not in the future.

An atomic "swap" operation is, from a CPU design standpoint, fundamentally
more expensive than a "load + store".

Now, most ARM architectures don't notice this, because they are all
in-order, and not SMP-aware anyway. No subtle memory ordering, no nothing.
Which is the only case when "swap" basically becomes a cheap "load+store".

What I'm trying to say is that a plain "load + store" is almost always
going to be the best option in the long run.

It's also almost certainly always the best option for UP + non-preempt,
for both present and future CPU's. The reason is simply that a
microarchitecture will _always_ be optimized for that case, since it's
pretty much by definition the common situation.

Is preemption even the common case on ARM? I'd assume not. Why are people
so interested in the preemption case? IOW, why don't you just do

	ldr	lr,[%0]
	subs	lr, lr, %1
	str	lr,[%0]
	blmi	failure

as the _base_ timings, since that should be the common case. That's the
drop-dead fastest UP case.

There's an additional advantage of the regular load/store case: if some
CPU has scheduling issues, you can actually write this out as C code (with
an extra empty ASM to make sure that the compiler doesn't move anything
out of the critical region). Again, that probably doesn't matter on most
ARM chips, but in the general case it sure does matter.

(Btw, inlining _any_ of these except perhaps the above trivial case, is
probably wrong. None of the ARM chips tend to have caches all that big, I
bet).

Linus

* * *

From: Linus Torvalds <torvalds@osdl.org>
Newsgroups: fa.linux.kernel
Subject: Re: [patch 04/15] Generic Mutex Subsystem,
 add-atomic-call-func-x86_64.patch
Date: Tue, 20 Dec 2005 22:07:04 UTC
Message-ID: <[fa.g09r4b6.i066ak@ifi.uio.no][30]>
Original-Message-ID: <[Pine.LNX.4.64.0512201354210.4827@g5.osdl.org][31]>

On Tue, 20 Dec 2005, Nicolas Pitre wrote:
>
> I mean...... what is it with mutexes that you dislike to the point of
> bending backward that far, and even after seeing the numbers, with such
> a semaphore implementation that _I_ even wouldn't trust people to use
> correctly?

Quite frankly, what has disgusted me about this mutex discussion is the
totally specious arguments for the new mutexes that just rub me entirely
the wrong way.

If it had _started_ with a mutex implementation that was faster, simpler,
and didn't rename the old and working semaphores, I'd have been perfectly
fine with it.

As it is, the discussion has been pretty much everything but that.

And then people who argue about single cycles, end up dismissing the
single cycles when I argue that "ld+st" is faster - like you just did.

Be consistent, dammit. If single cycles matter, they matter. If they
don't, then the existing code is better, since it's existing and works.
You can't have it both ways.

In other words: if people didn't mix up issues that had nothing to do with
this into it, I'd be happier. I've already said that a mutex that does
_not_ replace semaphore (and doesn't mess with naming) is acceptable.

We've done that before. But do it RIGHT, dammit. And don't mix existing
semaphores into it (for example, completions didn't change any old users).

Linus

* * *

From: Linus Torvalds <torvalds@osdl.org>
Newsgroups: fa.linux.kernel
Subject: Re: [patch 2/8] mutex subsystem, add asm-generic/mutex-[dec|xchg].h
Date: Thu, 22 Dec 2005 23:58:36 UTC
Message-ID: <[fa.hb476cr.k76389@ifi.uio.no][32]>
Original-Message-ID: <[Pine.LNX.4.64.0512221550290.14098@g5.osdl.org][33]>

On Fri, 23 Dec 2005, Ingo Molnar wrote:
>
> add the two generic mutex fastpath implementations.

Now this looks more like it. This is readable code without any #ifdef's in
the middle.

Now the only #ifdef's seem to be for mutex debugging. Might it be
worthwhile to have a generic debugging, that just uses spinlocks and just
accept that it's going to be slow, but shared across absolutely all
architectures?

Then you could have <linux/mutex.h> just doing a single

	#ifdef CONFIG_MUTEX_DEBUG
	# include <asm-generic/mutex-dbg.h>
	#else
	# include <asm/mutex.h>
	#endif

and have mutex-dbg.h just contain prototypes (no point in inlining them,
they're going to be big anyway) and then have a

	obj$(CONFIG_MUTEX_DEBUG) += mutex-debug.c

in the kernel/ subdirectory? That way you could _really_ have a clean
separation, with absolutely zero pollution of any architecture mess or
debugging #ifdef's in any implementation code.

At that point I'd like to switch to mutexes just because the code is
cleaner!

Linus

* * *

From: Linus Torvalds <torvalds@osdl.org>
Newsgroups: fa.linux.kernel
Subject: Re: [patch 08/19] mutex subsystem, core
Date: Tue, 03 Jan 2006 15:40:21 UTC
Message-ID: <[fa.g0a14jd.i0g62p@ifi.uio.no][34]>
Original-Message-ID: <[Pine.LNX.4.64.0601030736531.3668@g5.osdl.org][35]>

On Tue, 3 Jan 2006, Ingo Molnar wrote:
> >
> > Is this an interrupt deadlock, or do you not allow interrupt contexts
> > to even trylock a mutex?
>
> correct, no irq contexts are allowed. This is also checked for if
> CONFIG_DEBUG_MUTEXES is enabled.

Note that semaphores are definitely used from interrupt context, and as
such you can't replace them with mutexes if you do this.

The prime example is the console semaphore. See kernel/printk.c, look for
"down_trylock()", and realize that they are all about interrupts.

Linus

* * *

From: Linus Torvalds <torvalds@linux-foundation.org>
Newsgroups: fa.linux.kernel
Subject: Re: [PATCH] Replace completions with semaphores
Date: Tue, 15 Apr 2008 17:00:38 UTC
Message-ID: <[fa.z+KqPzoOtCdF/f8Xq6ejdEs5kZ4@ifi.uio.no][36]>

On Tue, 15 Apr 2008, Andi Kleen wrote:
>
> > - probably add support for completions to do counting
>
> But that's just a semaphore, isn't it?

Exactly. But the point here is:

 - nobody should use semaphores anyway (use mutexes)
 - making *more* code use semaphores is wrong
 - completions have a different _mental_ model

IOW, this is not about implementation issues. It's about how you think
about the operations.

We should _not_ implement completions as semaphores, simply because we
want to get *rid* of semaphores some day.

So rather than this long and involved patch series that first makes
semaphores generic, and then makes them be used as completions, I'd much
rather just skip this whole pointless exercise entirely.

Why have "generic semaphores" at all, if we want to get rid of them?

Linus

* * *

From: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
Newsgroups: fa.linux.kernel
|
||
|
Subject: Re: [PATCH] Replace completions with semaphores
|
||
|
Date: Tue, 15 Apr 2008 18:20:47 UTC
|
||
|
Message-ID: <[fa.6KvRw3P6/PVf+3AnODUXgA0i0AU@ifi.uio.no][37]>
|
||
|
|
||
|
On Tue, 15 Apr 2008, Matthew Wilcox wrote:
|
||
|
>
|
||
|
> > In other words, what makes me not like this is hat we first turn
|
||
|
> > semaphores into the generic code (which is largely what completions were:
|
||
|
> > just a special case of the generic semaphores!) and then turns completions
|
||
|
> > into these things. That just doesn't make any sense to me!
|
||
|
>
|
||
|
> Blame me for not realising that completions were semaphores under a
|
||
|
> different name.
|
||
|
|
||
|
The origin of completions is literally the semaphore code - just
|
||
|
simplified to use spinlocks and be usable as just a mutex. We used to use
|
||
|
semaphores, and because of the subtle race with lockless semaphores I
|
||
|
wrote that stupid completion code as a "generic semaphore with a very
|
||
|
specific usage scenario" and called them "completions".
|
||
|
|
||
|
The completions _could_ have been extended/used as mutex semaphores, but
|
||
|
the difference was really the mental model for them. That then limited the
|
||
|
implementation of them: the functions working on completions are defined
|
||
|
on purpose to be limited - it doesn't really have "up()" and "down()"
|
||
|
functions: "complete()" is really a up(), but "wait_for_completion()" is
|
||
|
more like a "wait_until_I_could_do_a_trydown()" function.
|
||
|
Would it make sense to use completions for countable events too? Yeah. In
fact, we have some things that really would like to do counting, both in
the sense of "wait for <n> events to all complete" _and_ in the sense of
"allow up to <n> events to be outstanding". Both of which could be done as
a counting function (just make "complete" increment the counter, and then
make "wait for <n> events" initialize it to negative, while "allow <n>
outstanding events" would be a positive counter, and make
"wait_for_completion()" basically be a "decrement and wait until it
is zero").

IOW, completions() really follow the same patterns as semaphores, and it
*does* make sense to just have one single code-base. But if we want to
make semaphores go away, I think that it would be better to implement
semaphores in terms of "extended completions" rather than the other way
around. That way, we could one day really get rid of semaphores entirely.

Linus

* * *

[Index][1] [Home][2] [About][3] [Blog][4]

[1]: http://yarchive.net/index.html
[2]: http://yarchive.net/home.html
[3]: http://yarchive.net/about.html
[4]: http://yarchive.net/blog
[5]: http://groups.google.com/groups/search?as_ugroup=comp.os.linux.development.system&as_uauthors=Linus+Torvalds&as_usubject=playing+kernel+linux&as_drrb=b&as_mind=26&as_minm=6&as_miny=1999&as_maxd=28&as_maxm=6&as_maxy=1999&sitesearch=groups.google.com
[6]: http://groups.google.com/groups/search?as_ugroup=fa.linux.kernel&as_uauthors=Linus+Torvalds&as_usubject=rw_semaphores&as_drrb=b&as_mind=7&as_minm=4&as_miny=2001&as_maxd=9&as_maxm=4&as_maxy=2001&sitesearch=groups.google.com
[7]: http://groups.google.com/groups/search?as_ugroup=fa.linux.kernel&as_uauthors=Linus+Torvalds&as_usubject=rw_semaphores&as_drrb=b&as_mind=8&as_minm=4&as_miny=2001&as_maxd=10&as_maxm=4&as_maxy=2001&sitesearch=groups.google.com
[8]: http://groups.google.com/groups/search?as_ugroup=fa.linux.kernel&as_uauthors=Linus+Torvalds&as_usubject=rw_semaphores+patch+i386&as_drrb=b&as_mind=9&as_minm=4&as_miny=2001&as_maxd=11&as_maxm=4&as_maxy=2001&sitesearch=groups.google.com
[9]: http://groups.google.com/groups/search?as_ugroup=fa.linux.kernel&as_uauthors=Linus+Torvalds&as_usubject=rw_semaphores+patch+i386&as_drrb=b&as_mind=10&as_minm=4&as_miny=2001&as_maxd=12&as_maxm=4&as_maxy=2001&sitesearch=groups.google.com
[10]: http://groups.google.com/groups/search?as_ugroup=fa.linux.kernel&as_uauthors=Linus+Torvalds&as_usubject=rw_semaphores&as_drrb=b&as_mind=15&as_minm=4&as_miny=2001&as_maxd=17&as_maxm=4&as_maxy=2001&sitesearch=groups.google.com
[11]: http://groups.google.com/groups/search?as_ugroup=fa.linux.kernel&as_uauthors=Linus+Torvalds&as_usubject=generic+process+andrea&as_drrb=b&as_mind=19&as_minm=4&as_miny=2001&as_maxd=21&as_maxm=4&as_maxy=2001&sitesearch=groups.google.com
[12]: http://groups.google.com/groups/search?as_ugroup=fa.linux.kernel&as_uauthors=Linus+Torvalds&as_usubject=buggy+rwsem+still&as_drrb=b&as_mind=19&as_minm=4&as_miny=2001&as_maxd=21&as_maxm=4&as_maxy=2001&sitesearch=groups.google.com
[13]: http://groups.google.com/groups/search?as_ugroup=fa.linux.kernel&as_uauthors=Linus+Torvalds&as_usubject=buggy+rwsem+still&as_drrb=b&as_mind=20&as_minm=4&as_miny=2001&as_maxd=22&as_maxm=4&as_maxy=2001&sitesearch=groups.google.com
[14]: http://groups.google.com/groups/search?as_umsgid=fa.fv9t5bi.g0g5qg%40ifi.uio.no
[15]: http://mid.gmane.org/Pine.LNX.4.64.0512161339140.3698%40g5.osdl.org
[16]: http://groups.google.com/groups/search?as_umsgid=fa.g0a54rd.h0o5at%40ifi.uio.no
[17]: http://mid.gmane.org/Pine.LNX.4.64.0512161414370.3698%40g5.osdl.org
[18]: http://groups.google.com/groups/search?as_umsgid=fa.g09n3bh.i067av%40ifi.uio.no
[19]: http://mid.gmane.org/Pine.LNX.4.64.0512181027220.4827%40g5.osdl.org
[20]: http://groups.google.com/groups/search?as_umsgid=fa.fva543d.j0g6ir%40ifi.uio.no
[21]: http://mid.gmane.org/Pine.LNX.4.64.0512181153080.4827%40g5.osdl.org
[22]: http://groups.google.com/groups/search?as_umsgid=fa.g19j43e.h026iq%40ifi.uio.no
[23]: http://mid.gmane.org/Pine.LNX.4.64.0512191053400.4827%40g5.osdl.org
[24]: http://groups.google.com/groups/search?as_umsgid=fa.g1a13rj.h0c6qh%40ifi.uio.no
[25]: http://mid.gmane.org/Pine.LNX.4.64.0512191148460.4827%40g5.osdl.org
[26]: http://groups.google.com/groups/search?as_umsgid=fa.fvpr2re.jg27qq%40ifi.uio.no
[27]: http://mid.gmane.org/Pine.LNX.4.64.0512191203120.4827%40g5.osdl.org
[28]: http://groups.google.com/groups/search?as_umsgid=fa.g09n334.i0672i%40ifi.uio.no
[29]: http://mid.gmane.org/Pine.LNX.4.64.0512201202200.4827%40g5.osdl.org
[30]: http://groups.google.com/groups/search?as_umsgid=fa.g09r4b6.i066ak%40ifi.uio.no
[31]: http://mid.gmane.org/Pine.LNX.4.64.0512201354210.4827%40g5.osdl.org
[32]: http://groups.google.com/groups/search?as_umsgid=fa.hb476cr.k76389%40ifi.uio.no
[33]: http://mid.gmane.org/Pine.LNX.4.64.0512221550290.14098%40g5.osdl.org
[34]: http://groups.google.com/groups/search?as_umsgid=fa.g0a14jd.i0g62p%40ifi.uio.no
[35]: http://mid.gmane.org/Pine.LNX.4.64.0601030736531.3668%40g5.osdl.org
[36]: http://groups.google.com/groups/search?as_umsgid=fa.z+KqPzoOtCdF%2ff8Xq6ejdEs5kZ4%40ifi.uio.no
[37]: http://groups.google.com/groups/search?as_umsgid=fa.6KvRw3P6%2fPVf+3AnODUXgA0i0AU%40ifi.uio.no