aha/include/linux/eventpoll.h
Davide Libenzi 3419b23a91 [PATCH] epoll: use unlocked wqueue operations
A few days ago Arjan signaled a lockdep red flag on epoll locks, and
precisely between the epoll's device structure lock (->lock) and the wait
queue head lock (->lock).

Like I explained in another email, and directly to Arjan, this can't happen
in reality because of the explicit check at eventpoll.c:592, that does not
allow to drop an epoll fd inside the same epoll fd.  Since lockdep is
working on per-structure locks, it will never be able to know of policies
enforced in other parts of the code.

It was decided time ago of having the ability to drop epoll fds inside
other epoll fds, that triggers a very trick wakeup operations (due to
possibly reentrant callback-driven wakeups) handled by the
ep_poll_safewake() function.  While looking again at the code though, I
noticed that all the operations done on the epoll's main structure wait
queue head (->wq) are already protected by the epoll lock (->lock), so that
locked-style functions can be used to manipulate the ->wq member.  This
makes both a lock-acquire save, and lockdep happy.

Running totalmess on my dual opteron for a while did not reveal any problem
so far:

http://www.xmailserver.org/totalmess.c

Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:13 -07:00

103 lines
2.6 KiB
C

/*
* include/linux/eventpoll.h ( Efficent event polling implementation )
* Copyright (C) 2001,...,2006 Davide Libenzi
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* Davide Libenzi <davidel@xmailserver.org>
*
*/
#ifndef _LINUX_EVENTPOLL_H
#define _LINUX_EVENTPOLL_H
#include <linux/types.h>
/* Valid opcodes to issue to sys_epoll_ctl() */
#define EPOLL_CTL_ADD 1
#define EPOLL_CTL_DEL 2
#define EPOLL_CTL_MOD 3
/* Set the One Shot behaviour for the target file descriptor */
#define EPOLLONESHOT (1 << 30)
/* Set the Edge Triggered behaviour for the target file descriptor */
#define EPOLLET (1 << 31)
/*
* On x86-64 make the 64bit structure have the same alignment as the
* 32bit structure. This makes 32bit emulation easier.
*/
#ifdef __x86_64__
#define EPOLL_PACKED __attribute__((packed))
#else
#define EPOLL_PACKED
#endif
struct epoll_event {
__u32 events;
__u64 data;
} EPOLL_PACKED;
#ifdef __KERNEL__
/* Forward declarations to avoid compiler errors */
struct file;
#ifdef CONFIG_EPOLL
/* Used to initialize the epoll bits inside the "struct file" */
static inline void eventpoll_init_file(struct file *file)
{
INIT_LIST_HEAD(&file->f_ep_links);
spin_lock_init(&file->f_ep_lock);
}
/* Used to release the epoll bits inside the "struct file" */
void eventpoll_release_file(struct file *file);
/*
* This is called from inside fs/file_table.c:__fput() to unlink files
* from the eventpoll interface. We need to have this facility to cleanup
* correctly files that are closed without being removed from the eventpoll
* interface.
*/
static inline void eventpoll_release(struct file *file)
{
/*
* Fast check to avoid the get/release of the semaphore. Since
* we're doing this outside the semaphore lock, it might return
* false negatives, but we don't care. It'll help in 99.99% of cases
* to avoid the semaphore lock. False positives simply cannot happen
* because the file in on the way to be removed and nobody ( but
* eventpoll ) has still a reference to this file.
*/
if (likely(list_empty(&file->f_ep_links)))
return;
/*
* The file is being closed while it is still linked to an epoll
* descriptor. We need to handle this by correctly unlinking it
* from its containers.
*/
eventpoll_release_file(file);
}
#else
static inline void eventpoll_init_file(struct file *file) {}
static inline void eventpoll_release(struct file *file) {}
#endif
#endif /* #ifdef __KERNEL__ */
#endif /* #ifndef _LINUX_EVENTPOLL_H */