mirror of https://github.com/adulau/aha.git, synced 2025-04-23 11:46:36 +00:00
Linux-2.6.12-rc2
Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!
commit 1da177e4c3
17291 changed files with 6718755 additions and 0 deletions
COPYING (Normal file, 356 additions)
@@ -0,0 +1,356 @@

NOTE! This copyright does *not* cover user programs that use kernel
services by normal system calls - this is merely considered normal use
of the kernel, and does *not* fall under the heading of "derived work".
Also note that the GPL below is copyrighted by the Free Software
Foundation, but the instance of code that it refers to (the Linux
kernel) is copyrighted by me and others who actually wrote it.

Also note that the only valid version of the GPL as far as the kernel
is concerned is _this_ particular version of the license (ie v2, not
v2.2 or v3.x or whatever), unless explicitly otherwise stated.

Linus Torvalds

----------------------------------------

GNU GENERAL PUBLIC LICENSE
Version 2, June 1991

Copyright (C) 1989, 1991 Free Software Foundation, Inc.
59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.

Preamble

The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users. This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it. (Some other Free Software Foundation software is covered by
the GNU Library General Public License instead.) You can apply it to
your programs, too.

When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
this service if you wish), that you receive source code or can get it
if you want it, that you can change the software or use pieces of it
in new free programs; and that you know you can do these things.

To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.

For example, if you distribute copies of such a program, whether
gratis or for a fee, you must give the recipients all the rights that
you have. You must make sure that they, too, receive or can get the
source code. And you must show them these terms so they know their
rights.

We protect your rights with two steps: (1) copyright the software, and
(2) offer you this license which gives you legal permission to copy,
distribute and/or modify the software.

Also, for each author's protection and ours, we want to make certain
that everyone understands that there is no warranty for this free
software. If the software is modified by someone else and passed on, we
want its recipients to know that what they have is not the original, so
that any problems introduced by others will not reflect on the original
authors' reputations.

Finally, any free program is threatened constantly by software
patents. We wish to avoid the danger that redistributors of a free
program will individually obtain patent licenses, in effect making the
program proprietary. To prevent this, we have made it clear that any
patent must be licensed for everyone's free use or not licensed at all.

The precise terms and conditions for copying, distribution and
modification follow.

GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

0. This License applies to any program or other work which contains
a notice placed by the copyright holder saying it may be distributed
under the terms of this General Public License. The "Program", below,
refers to any such program or work, and a "work based on the Program"
means either the Program or any derivative work under copyright law:
that is to say, a work containing the Program or a portion of it,
either verbatim or with modifications and/or translated into another
language. (Hereinafter, translation is included without limitation in
the term "modification".) Each licensee is addressed as "you".

Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope. The act of
running the Program is not restricted, and the output from the Program
is covered only if its contents constitute a work based on the
Program (independent of having been made by running the Program).
Whether that is true depends on what the Program does.

1. You may copy and distribute verbatim copies of the Program's
source code as you receive it, in any medium, provided that you
conspicuously and appropriately publish on each copy an appropriate
copyright notice and disclaimer of warranty; keep intact all the
notices that refer to this License and to the absence of any warranty;
and give any other recipients of the Program a copy of this License
along with the Program.

You may charge a fee for the physical act of transferring a copy, and
you may at your option offer warranty protection in exchange for a fee.

2. You may modify your copy or copies of the Program or any portion
of it, thus forming a work based on the Program, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:

a) You must cause the modified files to carry prominent notices
stating that you changed the files and the date of any change.

b) You must cause any work that you distribute or publish, that in
whole or in part contains or is derived from the Program or any
part thereof, to be licensed as a whole at no charge to all third
parties under the terms of this License.

c) If the modified program normally reads commands interactively
when run, you must cause it, when started running for such
interactive use in the most ordinary way, to print or display an
announcement including an appropriate copyright notice and a
notice that there is no warranty (or else, saying that you provide
a warranty) and that users may redistribute the program under
these conditions, and telling the user how to view a copy of this
License. (Exception: if the Program itself is interactive but
does not normally print such an announcement, your work based on
the Program is not required to print an announcement.)

These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based
on the Program, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote it.

Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Program.

In addition, mere aggregation of another work not based on the Program
with the Program (or with a work based on the Program) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.

3. You may copy and distribute the Program (or a work based on it,
under Section 2) in object code or executable form under the terms of
Sections 1 and 2 above provided that you also do one of the following:

a) Accompany it with the complete corresponding machine-readable
source code, which must be distributed under the terms of Sections
1 and 2 above on a medium customarily used for software interchange; or,

b) Accompany it with a written offer, valid for at least three
years, to give any third party, for a charge no more than your
cost of physically performing source distribution, a complete
machine-readable copy of the corresponding source code, to be
distributed under the terms of Sections 1 and 2 above on a medium
customarily used for software interchange; or,

c) Accompany it with the information you received as to the offer
to distribute corresponding source code. (This alternative is
allowed only for noncommercial distribution and only if you
received the program in object code or executable form with such
an offer, in accord with Subsection b above.)

The source code for a work means the preferred form of the work for
making modifications to it. For an executable work, complete source
code means all the source code for all modules it contains, plus any
associated interface definition files, plus the scripts used to
control compilation and installation of the executable. However, as a
special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary
form) with the major components (compiler, kernel, and so on) of the
operating system on which the executable runs, unless that component
itself accompanies the executable.

If distribution of executable or object code is made by offering
access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.

4. You may not copy, modify, sublicense, or distribute the Program
except as expressly provided under this License. Any attempt
otherwise to copy, modify, sublicense or distribute the Program is
void, and will automatically terminate your rights under this License.
However, parties who have received copies, or rights, from you under
this License will not have their licenses terminated so long as such
parties remain in full compliance.

5. You are not required to accept this License, since you have not
signed it. However, nothing else grants you permission to modify or
distribute the Program or its derivative works. These actions are
prohibited by law if you do not accept this License. Therefore, by
modifying or distributing the Program (or any work based on the
Program), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Program or works based on it.

6. Each time you redistribute the Program (or any work based on the
Program), the recipient automatically receives a license from the
original licensor to copy, distribute or modify the Program subject to
these terms and conditions. You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties to
this License.

7. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Program at all. For example, if a patent
license would not permit royalty-free redistribution of the Program by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Program.

If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply and the section as a whole is intended to apply in other
circumstances.

It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system, which is
implemented by public license practices. Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.

This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.

8. If the distribution and/or use of the Program is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Program under this License
may add an explicit geographical distribution limitation excluding
those countries, so that distribution is permitted only in or among
countries not thus excluded. In such case, this License incorporates
the limitation as if written in the body of this License.

9. The Free Software Foundation may publish revised and/or new versions
of the General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.

Each version is given a distinguishing version number. If the Program
specifies a version number of this License which applies to it and "any
later version", you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation. If the Program does not specify a version number of
this License, you may choose any version ever published by the Free Software
Foundation.

10. If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission. For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this. Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.

NO WARRANTY

11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
REPAIR OR CORRECTION.

12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.

END OF TERMS AND CONDITIONS

How to Apply These Terms to Your New Programs

If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.

To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.

<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA


Also add information on how to contact you by electronic and paper mail.

If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:

Gnomovision version 69, Copyright (C) year name of author
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.

The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, the commands you use may
be called something other than `show w' and `show c'; they could even be
mouse-clicks or menu items--whatever suits your program.

You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the program, if
necessary. Here is a sample; alter the names:

Yoyodyne, Inc., hereby disclaims all copyright interest in the program
`Gnomovision' (which makes passes at compilers) written by James Hacker.

<signature of Ty Coon>, 1 April 1989
Ty Coon, President of Vice

This General Public License does not permit incorporating your program into
proprietary programs. If your program is a subroutine library, you may
consider it more useful to permit linking proprietary applications with the
library. If this is what you want to do, use the GNU Library General
Public License instead of this License.
294
Documentation/00-INDEX
Normal file
294
Documentation/00-INDEX
Normal file
|
@ -0,0 +1,294 @@
|
|||
|
||||
This is a brief list of all the files in ./linux/Documentation and what
|
||||
they contain. If you add a documentation file, please list it here in
|
||||
alphabetical order as well, or risk being hunted down like a rabid dog.
|
||||
Please try and keep the descriptions small enough to fit on one line.
|
||||
Thanks -- Paul G.
|
||||
|
||||
Following translations are available on the WWW:
|
||||
|
||||
- Japanese, maintained by the JF Project (JF@linux.or.jp), at
|
||||
http://www.linux.or.jp/JF/
|
||||
|
||||
00-INDEX
|
||||
- this file.
|
||||
BK-usage/
|
||||
- directory with info on BitKeeper.
|
||||
BUG-HUNTING
|
||||
- brute force method of doing binary search of patches to find bug.
|
||||
Changes
|
||||
- list of changes that break older software packages.
|
||||
CodingStyle
|
||||
- how the boss likes the C code in the kernel to look.
|
||||
DMA-API.txt
|
||||
- DMA API, pci_ API & extensions for non-consistent memory machines.
|
||||
DMA-mapping.txt
|
||||
- info for PCI drivers using DMA portably across all platforms.
|
||||
DocBook/
|
||||
- directory with DocBook templates etc. for kernel documentation.
|
||||
IO-mapping.txt
|
||||
- how to access I/O mapped memory from within device drivers.
|
||||
IPMI.txt
|
||||
- info on Linux Intelligent Platform Management Interface (IPMI) Driver.
|
||||
IRQ-affinity.txt
|
||||
- how to select which CPU(s) handle which interrupt events on SMP.
|
||||
ManagementStyle
|
||||
- how to (attempt to) manage kernel hackers.
|
||||
MSI-HOWTO.txt
|
||||
- the Message Signaled Interrupts (MSI) Driver Guide HOWTO and FAQ.
|
||||
RCU/
|
||||
- directory with info on RCU (read-copy update).
|
||||
README.DAC960
|
||||
- info on Mylex DAC960/DAC1100 PCI RAID Controller Driver for Linux.
|
||||
SAK.txt
|
||||
- info on Secure Attention Keys.
|
||||
SubmittingDrivers
|
||||
- procedure to get a new driver source included into the kernel tree.
|
||||
SubmittingPatches
|
||||
- procedure to get a source patch included into the kernel tree.
|
||||
VGA-softcursor.txt
|
||||
- how to change your VGA cursor from a blinking underscore.
|
||||
arm/
|
||||
- directory with info about Linux on the ARM architecture.
|
||||
basic_profiling.txt
|
||||
- basic instructions for those who wants to profile Linux kernel.
|
||||
binfmt_misc.txt
|
||||
- info on the kernel support for extra binary formats.
|
||||
block/
|
||||
- info on the Block I/O (BIO) layer.
|
||||
cachetlb.txt
|
||||
- describes the cache/TLB flushing interfaces Linux uses.
|
||||
cciss.txt
|
||||
- info, major/minor #'s for Compaq's SMART Array Controllers.
|
||||
cdrom/
|
||||
- directory with information on the CD-ROM drivers that Linux has.
|
||||
cli-sti-removal.txt
|
||||
- cli()/sti() removal guide.
|
||||
computone.txt
|
||||
- info on Computone Intelliport II/Plus Multiport Serial Driver.
|
||||
cpqarray.txt
|
||||
- info on using Compaq's SMART2 Intelligent Disk Array Controllers.
|
||||
cpu-freq/
|
||||
- info on CPU frequency and voltage scaling.
|
||||
cris/
|
||||
- directory with info about Linux on CRIS architecture.
|
||||
crypto/
|
||||
- directory with info on the Crypto API.
|
||||
debugging-modules.txt
|
||||
- some notes on debugging modules after Linux 2.6.3.
|
||||
device-mapper/
|
||||
- directory with info on Device Mapper.
|
||||
devices.txt
|
||||
- plain ASCII listing of all the nodes in /dev/ with major minor #'s.
|
||||
digiepca.txt
|
||||
- info on Digi Intl. {PC,PCI,EISA}Xx and Xem series cards.
|
||||
dnotify.txt
|
||||
- info about directory notification in Linux.
|
||||
driver-model/
|
||||
- directory with info about Linux driver model.
|
||||
dvb/
|
||||
- info on Linux Digital Video Broadcast (DVB) subsystem.
|
||||
early-userspace/
|
||||
- info about initramfs, klibc, and userspace early during boot.
|
||||
eisa.txt
|
||||
- info on EISA bus support.
|
||||
exception.txt
|
||||
- how Linux v2.2 handles exceptions without verify_area etc.
|
||||
fb/
|
||||
- directory with info on the frame buffer graphics abstraction layer.
|
||||
filesystems/
|
||||
- directory with info on the various filesystems that Linux supports.
|
||||
firmware_class/
|
||||
- request_firmware() hotplug interface info.
|
||||
floppy.txt
|
||||
- notes and driver options for the floppy disk driver.
|
||||
ftape.txt
|
||||
- notes about the floppy tape device driver.
|
||||
hayes-esp.txt
|
||||
- info on using the Hayes ESP serial driver.
|
||||
highuid.txt
|
||||
- notes on the change from 16 bit to 32 bit user/group IDs.
|
||||
hpet.txt
|
||||
- High Precision Event Timer Driver for Linux.
|
||||
hw_random.txt
|
||||
- info on Linux support for random number generator in i8xx chipsets.
|
||||
i2c/
|
||||
- directory with info about the I2C bus/protocol (2 wire, kHz speed).
|
||||
i2o/
|
||||
- directory with info about the Linux I2O subsystem.
|
||||
i386/
|
||||
- directory with info about Linux on Intel 32 bit architecture.
|
||||
ia64/
|
||||
- directory with info about Linux on Intel 64 bit architecture.
|
||||
ide.txt
|
||||
- important info for users of ATA devices (IDE/EIDE disks and CD-ROMS).
|
||||
initrd.txt
|
||||
- how to use the RAM disk as an initial/temporary root filesystem.
|
||||
input/
|
||||
- info on Linux input device support.
|
||||
io_ordering.txt
|
||||
- info on ordering I/O writes to memory-mapped addresses.
|
||||
ioctl-number.txt
|
||||
- how to implement and register device/driver ioctl calls.
|
||||
iostats.txt
|
||||
- info on I/O statistics Linux kernel provides.
|
||||
isapnp.txt
|
||||
- info on Linux ISA Plug & Play support.
|
||||
isdn/
|
||||
- directory with info on the Linux ISDN support, and supported cards.
|
||||
java.txt
|
||||
- info on the in-kernel binary support for Java(tm).
|
||||
kbuild/
|
||||
- directory with info about the kernel build process.
|
||||
kernel-doc-nano-HOWTO.txt
|
||||
- mini HowTo on generation and location of kernel documentation files.
|
||||
kernel-docs.txt
|
||||
- listing of various WWW + books that document kernel internals.
|
||||
kernel-parameters.txt
|
||||
- summary listing of command line / boot prompt args for the kernel.
|
||||
kobject.txt
|
||||
- info of the kobject infrastructure of the Linux kernel.
|
||||
laptop-mode.txt
|
||||
- How to conserve battery power using laptop-mode.
|
||||
ldm.txt
|
||||
- a brief description of LDM (Windows Dynamic Disks).
|
||||
locks.txt
|
||||
- info on file locking implementations, flock() vs. fcntl(), etc.
|
||||
logo.gif
|
||||
- Full colour GIF image of Linux logo (penguin).
|
||||
logo.txt
|
||||
- Info on creator of above logo & site to get additional images from.
|
||||
m68k/
|
||||
- directory with info about Linux on Motorola 68k architecture.
|
||||
magic-number.txt
|
||||
- list of magic numbers used to mark/protect kernel data structures.
|
||||
mandatory.txt
|
||||
- info on the Linux implementation of Sys V mandatory file locking.
|
||||
mca.txt
|
||||
- info on supporting Micro Channel Architecture (e.g. PS/2) systems.
|
||||
md.txt
|
||||
- info on boot arguments for the multiple devices driver.
|
||||
memory.txt
|
||||
- info on typical Linux memory problems.
|
||||
mips/
|
||||
- directory with info about Linux on MIPS architecture.
|
||||
mono.txt
|
||||
- how to execute Mono-based .NET binaries with the help of BINFMT_MISC.
|
||||
moxa-smartio
|
||||
- info on installing/using Moxa multiport serial driver.
|
||||
mtrr.txt
|
||||
- how to use PPro Memory Type Range Registers to increase performance.
|
||||
nbd.txt
|
||||
- info on a TCP implementation of a network block device.
|
||||
networking/
|
||||
- directory with info on various aspects of networking with Linux.
|
||||
nfsroot.txt
|
||||
- short guide on setting up a diskless box with NFS root filesystem.
|
||||
nmi_watchdog.txt
|
||||
- info on NMI watchdog for SMP systems.
|
||||
numastat.txt
|
||||
- info on how to read Numa policy hit/miss statistics in sysfs.
|
||||
oops-tracing.txt
|
||||
- how to decode those nasty internal kernel error dump messages.
|
||||
paride.txt
|
||||
- information about the parallel port IDE subsystem.
|
||||
parisc/
|
||||
- directory with info on using Linux on PA-RISC architecture.
|
||||
parport.txt
|
||||
- how to use the parallel-port driver.
|
||||
parport-lowlevel.txt
|
||||
- description and usage of the low level parallel port functions.
|
||||
pci.txt
|
||||
- info on the PCI subsystem for device driver authors.
|
||||
pm.txt
|
||||
- info on Linux power management support.
|
||||
pnp.txt
|
||||
- Linux Plug and Play documentation.
|
||||
power/
|
||||
- directory with info on Linux PCI power management.
|
||||
powerpc/
|
||||
- directory with info on using Linux with the PowerPC.
|
||||
preempt-locking.txt
|
||||
- info on locking under a preemptive kernel.
|
||||
ramdisk.txt
|
||||
- short guide on how to set up and use the RAM disk.
|
||||
riscom8.txt
|
||||
- notes on using the RISCom/8 multi-port serial driver.
|
||||
rocket.txt
|
||||
- info on the Comtrol RocketPort multiport serial driver.
|
||||
rpc-cache.txt
|
||||
- introduction to the caching mechanisms in the sunrpc layer.
|
||||
rtc.txt
|
||||
- notes on how to use the Real Time Clock (aka CMOS clock) driver.
|
||||
s390/
|
||||
- directory with info on using Linux on the IBM S390.
|
||||
sched-coding.txt
|
||||
- reference for various scheduler-related methods in the O(1) scheduler.
|
||||
sched-design.txt
|
||||
- goals, design and implementation of the Linux O(1) scheduler.
|
||||
sched-domains.txt
|
||||
- information on scheduling domains.
|
||||
sched-stats.txt
|
||||
- information on schedstats (Linux Scheduler Statistics).
|
||||
scsi/
|
||||
- directory with info on Linux scsi support.
|
||||
serial/
|
||||
- directory with info on the low level serial API.
|
||||
serial-console.txt
|
||||
- how to set up Linux with a serial line console as the default.
|
||||
sgi-visws.txt
|
||||
- short blurb on the SGI Visual Workstations.
|
||||
sh/
|
||||
- directory with info on porting Linux to a new architecture.
|
||||
smart-config.txt
|
||||
- description of the Smart Config makefile feature.
|
||||
smp.txt
|
||||
- a few notes on symmetric multi-processing.
|
||||
sonypi.txt
|
||||
- info on Linux Sony Programmable I/O Device support.
|
||||
sound/
|
||||
- directory with info on sound card support.
|
||||
sparc/
|
||||
- directory with info on using Linux on Sparc architecture.
|
||||
specialix.txt
|
||||
- info on hardware/driver for specialix IO8+ multiport serial card.
|
||||
spinlocks.txt
|
||||
- info on using spinlocks to provide exclusive access in kernel.
|
||||
stallion.txt
|
||||
- info on using the Stallion multiport serial driver.
|
||||
svga.txt
|
||||
- short guide on selecting video modes at boot via VGA BIOS.
|
||||
sx.txt
|
||||
- info on the Specialix SX/SI multiport serial driver.
|
||||
sysctl/
|
||||
- directory with info on the /proc/sys/* files.
|
||||
sysrq.txt
|
||||
- info on the magic SysRq key.
|
||||
telephony/
|
||||
- directory with info on telephony (e.g. voice over IP) support.
|
||||
time_interpolators.txt
|
||||
- info on time interpolators.
|
||||
tipar.txt
|
||||
- information about Parallel link cable for Texas Instruments handhelds.
|
||||
tty.txt
|
||||
- guide to the locking policies of the tty layer.
|
||||
unicode.txt
|
||||
- info on the Unicode character/font mapping used in Linux.
|
||||
uml/
|
||||
- directory with information about User Mode Linux.
|
||||
usb/
|
||||
- directory with info regarding the Universal Serial Bus.
|
||||
video4linux/
|
||||
- directory with info regarding video/TV/radio cards and Linux.
|
||||
vm/
|
||||
- directory with info on the Linux vm code.
|
||||
voyager.txt
|
||||
- guide to running Linux on the Voyager architecture.
|
||||
watchdog/
|
||||
- how to auto-reboot Linux if it has "fallen and can't get up". ;-)
|
||||
x86_64/
|
||||
- directory with info on Linux support for AMD x86-64 (Hammer) machines.
|
||||
xterm-linux.xpm
|
||||
- XPM image of penguin logo (see logo.txt) sitting on an xterm.
|
||||
zorro.txt
|
||||
- info on writing drivers for Zorro bus devices found on Amigas.
|
51
Documentation/BK-usage/00-INDEX
Normal file
|
@ -0,0 +1,51 @@
|
|||
bk-kernel-howto.txt: Description of kernel workflow under BitKeeper
|
||||
|
||||
bk-make-sum: Create summary of changesets in one repository and not
|
||||
another, typically in preparation to be sent to an upstream maintainer.
|
||||
Typical usage:
|
||||
cd my-updated-repo
|
||||
bk-make-sum ~/repo/original-repo
|
||||
mv /tmp/linus.txt ../original-repo.txt
|
||||
|
||||
bksend: Create readable text output containing summary of changes, GNU
|
||||
patch of the changes, and BK metadata of changes (as needed for proper
|
||||
importing into BitKeeper by an upstream maintainer). This output is
|
||||
suitable for emailing BitKeeper changes. The recipient of this output
|
||||
may pipe it directly to 'bk receive'.
|
||||
|
||||
bz64wrap: helper script. Uncompressed input is piped to this script,
|
||||
which compresses its input, and then outputs the uu-/base64-encoded
|
||||
version of the compressed input.
|
||||
|
||||
cpcset: Copy changeset between unrelated repositories.
|
||||
Attempts to preserve changeset user, user address, description, in
|
||||
addition to the changeset (the patch) itself.
|
||||
Typical usage:
|
||||
cd my-updated-repo
|
||||
bk changes # looking for a changeset...
|
||||
cpcset 1.1511 . ../another-repo
|
||||
|
||||
csets-to-patches: Produces a delta of two BK repositories, in the form
|
||||
of individual files, each containing a single cset as a GNU patch.
|
||||
Output is several files, each with the filename "/tmp/rev-$REV.patch"
|
||||
Typical usage:
|
||||
cd my-updated-repo
|
||||
bk changes -L ~/repo/original-repo 2>&1 | \
|
||||
perl csets-to-patches
|
||||
|
||||
cset-to-linus: Produces a delta of two BK repositories, in the form of
|
||||
changeset descriptions, with 'diffstat' output created for each
|
||||
individual changeset.
|
||||
Typical usage:
|
||||
cd my-updated-repo
|
||||
bk changes -L ~/repo/original-repo 2>&1 | \
|
||||
perl cset-to-linus > summary.txt
|
||||
|
||||
gcapatch: Generates patch containing changes in local repository.
|
||||
Typical usage:
|
||||
cd my-updated-repo
|
||||
gcapatch > foo.patch
|
||||
|
||||
unbz64wrap: Reverse an encoded, compressed data stream created by
|
||||
bz64wrap into an uncompressed, typically text/plain output.
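
The framing that bz64wrap adds and unbz64wrap strips in the mimencode case can be shown in isolation. The sketch below deliberately leaves out the bzip2 and base64 steps and only demonstrates how a payload is bracketed by a "begin-base64" line and a "====" line and then recovered from surrounding text. The function names wrap and unwrap and the sample payload are hypothetical, chosen for illustration only.

```shell
#!/bin/sh
# Illustration only: marker framing without compression or encoding.

# wrap: bracket stdin with the begin/end marker lines.
wrap() {
    echo "begin-base64 644 -"
    cat
    echo "===="
}

# unwrap: print only the lines between the markers, ignoring anything
# outside them (for example mail headers or a signature).
unwrap() {
    show=
    while IFS= read -r line; do
        case $line in
            begin-base64*) show=YES ;;
            ====)          show= ;;
            *) [ -n "$show" ] && printf '%s\n' "$line" ;;
        esac
    done
    return 0
}

# A placeholder payload surrounded by unrelated text.
{ echo "Subject: patch"; echo "sample-payload" | wrap; echo "-- sig"; } | unwrap
```

Only the line between the two markers survives, which is why the wrapped stream can be pasted into a mail body and still be decoded on the receiving side.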
|
||||
|
283
Documentation/BK-usage/bk-kernel-howto.txt
Normal file
|
@ -0,0 +1,283 @@
|
|||
|
||||
Doing the BK Thing, Penguin-Style
|
||||
|
||||
|
||||
|
||||
|
||||
This set of notes is intended mainly for kernel developers, occasional
|
||||
or full-time, but sysadmins and power users may find parts of it useful
|
||||
as well. It assumes at least a basic familiarity with CVS, both at a
|
||||
user level (use on the cmd line) and at a higher level (client-server model).
|
||||
Due to the author's background, an operation may be described in terms
|
||||
of CVS, or in terms of how that operation differs from CVS.
|
||||
|
||||
This is -not- intended to be BitKeeper documentation. Always run
|
||||
"bk help <command>" or in X "bk helptool <command>" for reference
|
||||
documentation.
|
||||
|
||||
|
||||
BitKeeper Concepts
|
||||
------------------
|
||||
|
||||
In the true nature of the Internet itself, BitKeeper is a distributed
|
||||
system. When applied to revision control, this means doing away with
|
||||
client-server, and changing to a parent-child model... essentially
|
||||
peer-to-peer. On the developer's end, this also represents a
|
||||
fundamental disruption in the standard workflow of changes, commits,
|
||||
and merges. You will need to take a few minutes to think about
|
||||
how to best work under BitKeeper, and re-optimize things a bit.
|
||||
In some sense it is a bit radical, because it might be described as
|
||||
tossing changes out into a maelstrom and having them magically
|
||||
land at the right destination... but I'm getting ahead of myself.
|
||||
|
||||
Let's start with this progression:
|
||||
Each BitKeeper source tree on disk is a repository unto itself.
|
||||
Each repository has a parent (except the root/original, of course).
|
||||
Each repository contains a set of changesets ("csets").
|
||||
Each cset is one or more changed files, bundled together.
|
||||
|
||||
Each tree is a repository, so all changes are checked into the local
|
||||
tree. When a change is checked in, all modified files are grouped
|
||||
into a logical unit, the changeset. Internally, BK links these
|
||||
changesets in a tree, representing various converging and diverging
|
||||
lines of development. These changesets are the bread and butter of
|
||||
the BK system.
|
||||
|
||||
After the concept of changesets, the next thing you need to get used
|
||||
to is having multiple copies of source trees lying around. This -really-
|
||||
takes some getting used to, for some people. Separate source trees
|
||||
are the means in BitKeeper by which you delineate parallel lines
|
||||
of development, both minor and major. What would be branches in
|
||||
CVS become separate source trees, or "clones" in BitKeeper [heh,
|
||||
or Star Wars] terminology.
|
||||
|
||||
Clones and changesets are the tools from which most of the power of
|
||||
BitKeeper is derived. As mentioned earlier, each clone has a parent,
|
||||
the tree used as the source when the new clone was created. In a
|
||||
CVS-like setup, the parent would be a remote server on the Internet,
|
||||
and the child is your local clone of that tree.
|
||||
|
||||
Once you have established a common baseline between two source trees --
|
||||
a common parent -- then you can merge changesets between those two
|
||||
trees with ease. Merging changes into a tree is called a "pull", and
|
||||
is analogous to 'cvs update'. A pull downloads all the changesets in
|
||||
the remote tree you do not have, and merges them. Sending changes in
|
||||
one tree to another tree is called a "push". Push sends all changes
|
||||
in the local tree the remote does not yet have, and merges them.
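
The rule that a pull or a push transfers every changeset the other side lacks is a plain set difference. A minimal sketch in sh, assuming each tree can be modelled as a space-separated list of changeset identifiers; the identifiers and the missing_from helper are hypothetical and are not part of BitKeeper:

```shell
#!/bin/sh
# Illustration only: models trees as lists of made-up changeset ids.

# missing_from HAVE WANT: print each id in WANT that is absent from HAVE.
missing_from() {
    have=" $1 "
    for cset in $2; do
        case "$have" in
            *" $cset "*) ;;                 # already present, nothing to transfer
            *) printf '%s\n' "$cset" ;;     # would be transferred
        esac
    done
}

local_tree="a b c x"      # hypothetical ids in the local clone
remote_tree="a b c d e"   # hypothetical ids in the remote tree

echo "pull would fetch:"
missing_from "$local_tree" "$remote_tree"
echo "push would send:"
missing_from "$remote_tree" "$local_tree"
```

Because the transfer is all-or-nothing in this model, a tree that has accumulated changes from several sources will send all of them when pushed.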
|
||||
|
||||
From these concepts come some initial command examples:
|
||||
|
||||
1) bk clone -q http://linux.bkbits.net/linux-2.5 linus-2.5
|
||||
Download a 2.5 stock kernel tree, naming it "linus-2.5" in the local dir.
|
||||
The "-q" disables listing every single file as it is downloaded.
|
||||
|
||||
2) bk clone -ql linus-2.5 alpha-2.5
|
||||
Create a separate source tree for the Alpha AXP architecture.
|
||||
The "-l" uses hard links instead of copying data, since both trees are
|
||||
on the local disk. You can also replace the above with "bk lclone -q ..."
|
||||
|
||||
You only clone a tree -once-. After cloning, the tree lives a long time
|
||||
on disk, being updated by pushes and pulls.
|
||||
|
||||
3) cd alpha-2.5 ; bk pull http://gkernel.bkbits.net/alpha-2.5
|
||||
Download changes in "alpha-2.5" repository which are not present
|
||||
in the local repository, and merge them into the source tree.
|
||||
|
||||
4) bk -r co -q
|
||||
Because every tree is a repository, files must be checked out before
|
||||
they will be in their standard places in the source tree.
|
||||
|
||||
5) bk vi fs/inode.c # example change...
|
||||
bk citool # checkin, using X tool
|
||||
bk push bk://gkernel@bkbits.net/alpha-2.5 # upload change
|
||||
Typical example of a BK sequence that would replace the analogous CVS
|
||||
situation,
|
||||
vi fs/inode.c
|
||||
cvs commit
|
||||
|
||||
As this is just supposed to be a quick BK intro, for more in-depth
|
||||
tutorials, live working demos, and docs, see http://www.bitkeeper.com/
|
||||
|
||||
|
||||
|
||||
BK and Kernel Development Workflow
|
||||
----------------------------------
|
||||
Currently the latest 2.5 tree is available via "bk clone $URL"
|
||||
and "bk pull $URL" at http://linux.bkbits.net/linux-2.5
|
||||
This should change in a few weeks to a kernel.org URL.
|
||||
|
||||
|
||||
A big part of using BitKeeper is organizing the various trees you have
|
||||
on your local disk, and organizing the flow of changes among those
|
||||
trees, and remote trees. If one were to graph the relationships between
|
||||
the trees in a desired BK setup, you are likely to see a few-many-few graph, like
|
||||
this:
|
||||
|
||||
linux-2.5
|
||||
|
|
||||
merge-to-linus-2.5
|
||||
/ | |
|
||||
/ | |
|
||||
vm-hacks bugfixes filesys personal-hacks
|
||||
\ | | /
|
||||
\ | | /
|
||||
\ | | /
|
||||
testing-and-validation
|
||||
|
||||
Since a "bk push" sends all changes not in the target tree, and
|
||||
since a "bk pull" receives all changes not in the source tree, you want
|
||||
to make sure you are only pushing specific changes to the desired tree,
|
||||
not all changes from "peer parent" trees. For example, pushing a change
|
||||
from the testing-and-validation tree would probably be a bad idea,
|
||||
because it will push all changes from vm-hacks, bugfixes, filesys, and
|
||||
personal-hacks trees into the target tree.
|
||||
|
||||
One would typically work on only one "theme" at a time, either
|
||||
vm-hacks or bugfixes or filesys, keeping those changes isolated in
|
||||
their own tree during development, and only merging the isolated changes with
|
||||
other changes when going upstream (to Linus or other maintainers) or
|
||||
downstream (to your "union" trees, like testing-and-validation above).
|
||||
|
||||
It should be noted that some of this separation is not just recommended
|
||||
practice, it's actually [for now] -enforced- by BitKeeper. BitKeeper
|
||||
requires that changesets maintain a certain order, which is the reason
|
||||
that "bk push" sends all local changesets the remote doesn't have. This
|
||||
separation may look like a lot of wasted disk space at first, but it
|
||||
helps when two unrelated changes may "pollute" the same area of code, or
|
||||
don't follow the same pace of development, or any other of the standard
|
||||
reasons why one creates a development branch.
|
||||
|
||||
Small development branches (clones) will appear and disappear:
|
||||
|
||||
-------- A --------- B --------- C --------- D -------
|
||||
\ /
|
||||
-----short-term devel branch-----
|
||||
|
||||
While long-term branches will parallel a tree (or trees), with periodic
|
||||
merge points. In this first example, we pull from a tree (pulls,
|
||||
"\") periodically, such as what occurs when tracking changes in a
|
||||
vendor tree, never pushing changes back up the line:
|
||||
|
||||
-------- A --------- B --------- C --------- D -------
|
||||
\ \ \
|
||||
----long-term devel branch-----------------
|
||||
|
||||
And then a more common case in Linux kernel development, a long term
|
||||
branch with periodic merges back into the tree (pushes, "/"):
|
||||
|
||||
-------- A --------- B --------- C --------- D -------
|
||||
\ \ / \
|
||||
----long-term devel branch-----------------
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Submitting Changes to Linus
|
||||
---------------------------
|
||||
There's a bit of an art, or style, of submitting changes to Linus.
|
||||
Since Linus's tree is now (you might say) fully integrated into the
|
||||
distributed BitKeeper system, there are several prerequisites to
|
||||
properly submitting a BitKeeper change. All these prereq's are just
|
||||
general cleanliness of BK usage, so as people become experts at BK, feel
|
||||
free to optimize this process further (assuming Linus agrees, of
|
||||
course).
|
||||
|
||||
|
||||
|
||||
0) Make sure your tree was originally cloned from the linux-2.5 tree
|
||||
created by Linus. If your tree does not have this as its ancestor, it
|
||||
is impossible to reliably exchange changesets.
|
||||
|
||||
|
||||
|
||||
1) Pay attention to your commit text. The commit message that
|
||||
accompanies each changeset you submit will live on forever in history,
|
||||
and is used by Linus to accurately summarize the changes in each
|
||||
pre-patch. Remember that there is no context, so
|
||||
"fix for new scheduler changes"
|
||||
would be too vague, but
|
||||
"fix mips64 arch for new scheduler switch_to(), TIF_xxx semantics"
|
||||
would be much better.
|
||||
|
||||
You can and should use the command "bk comment -C<rev>" to update the
|
||||
commit text, and improve it after the fact. This is very useful for
|
||||
development: poor, quick descriptions during development, which get
|
||||
cleaned up using "bk comment" before issuing the "bk push" to submit the
|
||||
changes.
|
||||
|
||||
|
||||
|
||||
2) Include an Internet-available URL for Linus to pull from, such as
|
||||
|
||||
Pull from: http://gkernel.bkbits.net/net-drivers-2.5
|
||||
|
||||
|
||||
|
||||
3) Include a summary and "diffstat -p1" of each changeset that will be
|
||||
downloaded, when Linus issues a "bk pull". The author auto-generates
|
||||
these summaries using "bk changes -L <parent>", to obtain a listing
|
||||
of all the pending-to-send changesets, and their commit messages.
|
||||
|
||||
It is important to show Linus what he will be downloading when he issues
|
||||
a "bk pull", to reduce the time required to sift the changes once they
|
||||
are downloaded to Linus's local machine.
|
||||
|
||||
IMPORTANT NOTE: One of the features of BK is that your repository does
|
||||
not have to be up to date, in order for Linus to receive your changes.
|
||||
It is considered a courtesy to keep your repository fairly recent, to
|
||||
lessen any potential merge work Linus may need to do.
|
||||
|
||||
|
||||
4) Split up your changes. Each maintainer<->Linus situation is likely
|
||||
to be slightly different here, so take this just as general advice. The
|
||||
author splits up changes according to "themes" when merging with Linus.
|
||||
Simultaneous pushes from local development go to special trees which
|
||||
exist solely to house changes "queued" for Linus. Example of the trees:
|
||||
|
||||
net-drivers-2.5 -- on-going net driver maintenance
|
||||
vm-2.5 -- VM-related changes
|
||||
fs-2.5 -- filesystem-related changes
|
||||
|
||||
Linus then has much more freedom for pulling changes. He could (for
|
||||
example) issue a "bk pull" on vm-2.5 and fs-2.5 trees, to merge their
|
||||
changes, but hold off net-drivers-2.5 because of a change that needs
|
||||
more discussion.
|
||||
|
||||
Other maintainers may find that a single linus-pull-from tree is
|
||||
adequate for passing BK changesets to him.
|
||||
|
||||
|
||||
|
||||
Frequently Answered Questions
|
||||
-----------------------------
|
||||
1) How do I change the e-mail address shown in the changelog?
|
||||
A. When you run "bk citool" or "bk commit", set environment
|
||||
variables BK_USER and BK_HOST to the desired username
|
||||
and host/domain name.
|
||||
|
||||
|
||||
2) How do I use tags / get a diff between two kernel versions?
|
||||
A. Pass the tags Linus uses to 'bk export'.
|
||||
|
||||
ChangeSets are in a forward-progressing order, so it's pretty easy
|
||||
to get a snapshot starting and ending at any two points in time.
|
||||
Linus puts tags on each release and pre-release, so you could use
|
||||
these two examples:
|
||||
|
||||
bk export -tpatch -hdu -rv2.5.4,v2.5.5 | less
|
||||
# creates patch-2.5.5 essentially
|
||||
bk export -tpatch -du -rv2.5.5-pre1,v2.5.5 | less
|
||||
# changes from pre1 to final
|
||||
|
||||
A tag is just an alias for a specific changeset... and since changesets
|
||||
are ordered, a tag is thus a marker for a specific point in time (or
|
||||
specific state of the tree).
|
||||
|
||||
|
||||
3) Is there an easy way to generate One Big Patch versus mainline,
|
||||
for my long-lived kernel branch?
|
||||
A. Yes. This requires BK 3.x, though.
|
||||
|
||||
bk export -tpatch -r`bk repogca bk://linux.bkbits.net/linux-2.5`,+
|
||||
|
34
Documentation/BK-usage/bk-make-sum
Executable file
|
@ -0,0 +1,34 @@
|
|||
#!/bin/sh -e
|
||||
# DIR=$HOME/BK/axp-2.5
|
||||
# cd $DIR
|
||||
|
||||
LINUS_REPO=$1
|
||||
DIRBASE=`basename $PWD`
|
||||
|
||||
{
|
||||
cat <<EOT
|
||||
Please do a
|
||||
|
||||
bk pull bk://gkernel.bkbits.net/$DIRBASE
|
||||
|
||||
This will update the following files:
|
||||
|
||||
EOT
|
||||
|
||||
bk export -tpatch -hdu -r`bk repogca $LINUS_REPO`,+ | diffstat -p1 2>/dev/null
|
||||
|
||||
cat <<EOT
|
||||
|
||||
through these ChangeSets:
|
||||
|
||||
EOT
|
||||
|
||||
bk changes -L -d'$unless(:MERGE:){ChangeSet|:CSETREV:\n}' $LINUS_REPO |
|
||||
bk -R prs -h -d'$unless(:MERGE:){<:P:@:HOST:> (:D: :I:)\n$each(:C:){ (:C:)\n}\n}' -
|
||||
|
||||
} > /tmp/linus.txt
|
||||
|
||||
cat <<EOT
|
||||
Mail text in /tmp/linus.txt; please check and send using your favourite
|
||||
mailer.
|
||||
EOT
|
36
Documentation/BK-usage/bksend
Executable file
|
@ -0,0 +1,36 @@
|
|||
#!/bin/sh
|
||||
# A script to format BK changeset output in a manner that is easy to read.
|
||||
# Andreas Dilger <adilger@turbolabs.com> 13/02/2002
|
||||
#
|
||||
# Add diffstat output after Changelog <adilger@turbolabs.com> 21/02/2002
|
||||
|
||||
PROG=bksend
|
||||
|
||||
usage() {
|
||||
echo "usage: $PROG -r<rev>"
|
||||
echo -e "\twhere <rev> is of the form '1.23', '1.23..', '1.23..1.27',"
|
||||
echo -e "\tor '+' to indicate the most recent revision"
|
||||
|
||||
exit 1
|
||||
}
|
||||
|
||||
case $1 in
|
||||
-r) REV=$2; shift ;;
|
||||
-r*) REV=`echo $1 | sed 's/^-r//'` ;;
|
||||
*) echo "$PROG: no revision given, you probably don't want that";;
|
||||
esac
|
||||
|
||||
[ -z "$REV" ] && usage
|
||||
|
||||
echo "You can import this changeset into BK by piping this whole message to:"
|
||||
echo "'| bk receive [path to repository]' or apply the patch as usual."
|
||||
|
||||
SEP="\n===================================================================\n\n"
|
||||
echo -e $SEP
|
||||
env PAGER=/bin/cat bk changes -r$REV
|
||||
echo
|
||||
bk export -tpatch -du -h -r$REV | diffstat
|
||||
echo; echo
|
||||
bk export -tpatch -du -h -r$REV
|
||||
echo -e $SEP
|
||||
bk send -wgzip_uu -r$REV -
|
41
Documentation/BK-usage/bz64wrap
Executable file
|
@ -0,0 +1,41 @@
|
|||
#!/bin/sh
|
||||
|
||||
# bz64wrap - the sending side of a bzip2 | base64 stream
|
||||
# Andreas Dilger <adilger@clusterfs.com> Jan 2002
|
||||
|
||||
|
||||
PATH=$PATH:/usr/bin:/usr/local/bin:/usr/freeware/bin
|
||||
|
||||
# A program to generate base64 encoding on stdout
|
||||
BASE64_ENCODE="uuencode -m /dev/stdout"
|
||||
BASE64_BEGIN=
|
||||
BASE64_END=
|
||||
|
||||
BZIP=NO
|
||||
BASE64=NO
|
||||
|
||||
# Test if we have the bzip program installed
|
||||
bzip2 -c /dev/null > /dev/null 2>&1 && BZIP=YES
|
||||
|
||||
# Test if uuencode can handle the -m (MIME) encoding option
|
||||
$BASE64_ENCODE < /dev/null > /dev/null 2>&1 && BASE64=YES
|
||||
|
||||
if [ $BASE64 = NO ]; then
|
||||
BASE64_ENCODE=mimencode
|
||||
BASE64_BEGIN="begin-base64 644 -"
|
||||
BASE64_END="===="
|
||||
|
||||
$BASE64_ENCODE < /dev/null > /dev/null 2>&1 && BASE64=YES
|
||||
fi
|
||||
|
||||
if [ $BZIP = NO -o $BASE64 = NO ]; then
|
||||
echo "$0: can't use bz64 encoding: bzip2=$BZIP, $BASE64_ENCODE=$BASE64"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# Sadly, mimencode does not appear to have good "begin" and "end" markers
|
||||
# like uuencode does, and it is picky about getting the right start/end of
|
||||
# the base64 stream, so we handle this internally.
|
||||
echo "$BASE64_BEGIN"
|
||||
bzip2 -9 | $BASE64_ENCODE
|
||||
echo "$BASE64_END"
|
36
Documentation/BK-usage/cpcset
Executable file
|
@ -0,0 +1,36 @@
|
|||
#!/bin/sh
|
||||
#
|
||||
# Purpose: Copy changeset patch and description from one
|
||||
# repository to another, unrelated one.
|
||||
#
|
||||
# usage: cpcset [revision] [from-repository] [to-repository]
|
||||
#
|
||||
|
||||
REV=$1
|
||||
FROM=$2
|
||||
TO=$3
|
||||
TMPF=/tmp/cpcset.$$
|
||||
|
||||
rm -f $TMPF*
|
||||
|
||||
CWD_SAVE=`pwd`
|
||||
cd $FROM
|
||||
bk changes -r$REV | \
|
||||
grep -v '^ChangeSet' | \
|
||||
sed -e 's/^ //g' > $TMPF.log
|
||||
|
||||
USERHOST=`bk changes -r$REV | grep '^ChangeSet' | awk '{print $4}'`
|
||||
export BK_USER=`echo $USERHOST | awk '-F@' '{print $1}'`
|
||||
export BK_HOST=`echo $USERHOST | awk '-F@' '{print $2}'`
|
||||
|
||||
bk export -tpatch -hdu -r$REV > $TMPF.patch && \
|
||||
cd $CWD_SAVE && \
|
||||
cd $TO && \
|
||||
bk import -tpatch -CFR -y"`cat $TMPF.log`" $TMPF.patch . && \
|
||||
bk commit -y"`cat $TMPF.log`"
|
||||
|
||||
rm -f $TMPF*
|
||||
|
||||
echo changeset $REV copied.
|
||||
echo ""
|
||||
|
49
Documentation/BK-usage/cset-to-linus
Executable file
|
@ -0,0 +1,49 @@
|
|||
#!/usr/bin/perl -w
|
||||
|
||||
use strict;
|
||||
|
||||
my ($lhs, $rev, $tmp, $rhs, $s);
|
||||
my @cset_text = ();
|
||||
my @pipe_text = ();
|
||||
my $have_cset = 0;
|
||||
|
||||
while (<>) {
|
||||
next if /^---/;
|
||||
|
||||
if (($lhs, $tmp, $rhs) = (/^(ChangeSet\@)([^,]+)(, .*)$/)) {
|
||||
&cset_rev if ($have_cset);
|
||||
|
||||
$rev = $tmp;
|
||||
$have_cset = 1;
|
||||
|
||||
push(@cset_text, $_);
|
||||
}
|
||||
|
||||
elsif ($have_cset) {
|
||||
push(@cset_text, $_);
|
||||
}
|
||||
}
|
||||
&cset_rev if ($have_cset);
|
||||
exit(0);
|
||||
|
||||
|
||||
sub cset_rev {
|
||||
my $empty_cset = 0;
|
||||
|
||||
open PIPE, "bk export -tpatch -hdu -r $rev | diffstat -p1 2>/dev/null |" or die;
|
||||
while ($s = <PIPE>) {
|
||||
$empty_cset = 1 if ($s =~ /0 files changed/);
|
||||
push(@pipe_text, $s);
|
||||
}
|
||||
close(PIPE);
|
||||
|
||||
if (! $empty_cset) {
|
||||
print @cset_text;
|
||||
print @pipe_text;
|
||||
print "\n\n";
|
||||
}
|
||||
|
||||
@pipe_text = ();
|
||||
@cset_text = ();
|
||||
}
|
||||
|
44
Documentation/BK-usage/csets-to-patches
Executable file
|
@ -0,0 +1,44 @@
|
|||
#!/usr/bin/perl -w
|
||||
|
||||
use strict;
|
||||
|
||||
my ($lhs, $rev, $tmp, $rhs, $s);
|
||||
my @cset_text = ();
|
||||
my @pipe_text = ();
|
||||
my $have_cset = 0;
|
||||
|
||||
while (<>) {
|
||||
next if /^---/;
|
||||
|
||||
if (($lhs, $tmp, $rhs) = (/^(ChangeSet\@)([^,]+)(, .*)$/)) {
|
||||
&cset_rev if ($have_cset);
|
||||
|
||||
$rev = $tmp;
|
||||
$have_cset = 1;
|
||||
|
||||
push(@cset_text, $_);
|
||||
}
|
||||
|
||||
elsif ($have_cset) {
|
||||
push(@cset_text, $_);
|
||||
}
|
||||
}
|
||||
&cset_rev if ($have_cset);
|
||||
exit(0);
|
||||
|
||||
|
||||
sub cset_rev {
|
||||
my $empty_cset = 0;
|
||||
|
||||
system("bk export -tpatch -du -r $rev > /tmp/rev-$rev.patch");
|
||||
|
||||
if (! $empty_cset) {
|
||||
print @cset_text;
|
||||
print @pipe_text;
|
||||
print "\n\n";
|
||||
}
|
||||
|
||||
@pipe_text = ();
|
||||
@cset_text = ();
|
||||
}
|
||||
|
8
Documentation/BK-usage/gcapatch
Executable file
|
@ -0,0 +1,8 @@
|
|||
#!/bin/sh
|
||||
#
|
||||
# Purpose: Generate GNU diff of local changes versus canonical top-of-tree
|
||||
#
|
||||
# Usage: gcapatch > foo.patch
|
||||
#
|
||||
|
||||
bk export -tpatch -hdu -r`bk repogca bk://linux.bkbits.net/linux-2.5`,+
|
25
Documentation/BK-usage/unbz64wrap
Executable file
|
@ -0,0 +1,25 @@
|
|||
#!/bin/sh
|
||||
|
||||
# unbz64wrap - the receiving side of a bzip2 | base64 stream
|
||||
# Andreas Dilger <adilger@clusterfs.com> Jan 2002
|
||||
|
||||
# Sadly, mimencode does not appear to have good "begin" and "end" markers
|
||||
# like uuencode does, and it is picky about getting the right start/end of
|
||||
# the base64 stream, so we handle this explicitly here.
|
||||
|
||||
PATH=$PATH:/usr/bin:/usr/local/bin:/usr/freeware/bin
|
||||
|
||||
if mimencode -u < /dev/null > /dev/null 2>&1 ; then
|
||||
SHOW=
|
||||
while read LINE; do
|
||||
case $LINE in
|
||||
begin-base64*) SHOW=YES ;;
|
||||
====) SHOW= ;;
|
||||
*) [ "$SHOW" ] && echo "$LINE" ;;
|
||||
esac
|
||||
done | mimencode -u | bunzip2
|
||||
exit $?
|
||||
else
|
||||
cat - | uudecode -o /dev/stdout | bunzip2
|
||||
exit $?
|
||||
fi
|
92
Documentation/BUG-HUNTING
Normal file
|
@ -0,0 +1,92 @@
|
|||
[Sat Mar 2 10:32:33 PST 1996 KERNEL_BUG-HOWTO lm@sgi.com (Larry McVoy)]
|
||||
|
||||
This is how to track down a bug if you know nothing about kernel hacking.
|
||||
It's a brute force approach but it works pretty well.
|
||||
|
||||
You need:
|
||||
|
||||
. A reproducible bug - it has to happen predictably (sorry)
|
||||
. All the kernel tar files from a revision that worked to the
|
||||
revision that doesn't
|
||||
|
||||
You will then do:
|
||||
|
||||
. Rebuild a revision that you believe works, install, and verify that.
|
||||
. Do a binary search over the kernels to figure out which one
|
||||
introduced the bug. I.e., suppose 1.3.28 didn't have the bug, but
|
||||
you know that 1.3.69 does. Pick a kernel in the middle and build
|
||||
that, like 1.3.50. Build & test; if it works, pick the mid point
|
||||
between .50 and .69, else the mid point between .28 and .50.
|
||||
. You'll narrow it down to the kernel that introduced the bug. You
|
||||
can probably do better than this but it gets tricky.
|
||||
|
||||
. Narrow it down to a subdirectory
|
||||
|
||||
- Copy kernel that works into "test". Let's say that 3.62 works,
|
||||
but 3.63 doesn't. So you diff -r those two kernels and come
|
||||
up with a list of directories that changed. For each of those
|
||||
directories:
|
||||
|
||||
Copy the non-working directory next to the working directory
|
||||
as "dir.63".
|
||||
One directory at a time, try moving the working directory to
|
||||
"dir.62" and moving dir.63 into its place:
|
||||
|
||||
mv dir dir.62
|
||||
mv dir.63 dir
|
||||
find dir -name '*.[oa]' -print | xargs rm -f
|
||||
|
||||
And then rebuild and retest. Assuming that all related
|
||||
changes were contained in the sub directory, this should
|
||||
isolate the change to a directory.
|
||||
|
||||
Problems: changes in header files may have occurred; I've
|
||||
found in my case that they were self explanatory - you may
|
||||
or may not want to give up when that happens.
|
||||
|
||||
. Narrow it down to a file
|
||||
|
||||
- You can apply the same technique to each file in the directory,
|
||||
hoping that the changes in that file are self contained.
|
||||
|
||||
. Narrow it down to a routine
|
||||
|
||||
- You can take the old file and the new file and manually create
|
||||
a merged file that has
|
||||
|
||||
#ifdef VER62
|
||||
routine()
|
||||
{
|
||||
...
|
||||
}
|
||||
#else
|
||||
routine()
|
||||
{
|
||||
...
|
||||
}
|
||||
#endif
|
||||
|
||||
And then walk through that file, one routine at a time and
|
||||
prefix it with
|
||||
|
||||
#define VER62
|
||||
/* both routines here */
|
||||
#undef VER62
|
||||
|
||||
Then recompile, retest, move the ifdefs until you find the one
|
||||
that makes the difference.
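
The binary search over kernel revisions described in the steps above can be sketched in sh. This is only an outline under stated assumptions: revisions are identified by their final patchlevel number (28 through 69 in the example), and is_good is a hypothetical stand-in for the manual build, install and test cycle. The FIRST_BAD variable exists purely to make the sketch runnable.

```shell
#!/bin/sh
# Sketch of the bisection step. is_good is a placeholder: in practice
# you would build, boot and test kernel 1.3.$1 by hand and report the result.
FIRST_BAD=${FIRST_BAD:-63}      # pretend the bug appeared in 1.3.63

is_good() {
    [ "$1" -lt "$FIRST_BAD" ]
}

# bisect GOOD BAD: GOOD is a patchlevel known to work, BAD one known to fail.
# Prints the first failing patchlevel.
bisect() {
    good=$1
    bad=$2
    while [ $((bad - good)) -gt 1 ]; do
        mid=$(( (good + bad) / 2 ))
        if is_good "$mid"; then
            good=$mid       # bug was introduced after mid
        else
            bad=$mid        # bug was introduced at mid or earlier
        fi
    done
    echo "$bad"
}

echo "first bad kernel: 1.3.$(bisect 28 69)"
```

Each pass halves the remaining range, so the 41 candidate kernels between 1.3.28 and 1.3.69 need about six build-and-test cycles rather than forty.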
|
||||
|
||||
Finally, you take all the info that you have, kernel revisions, bug
|
||||
description, the extent to which you have narrowed it down, and pass
|
||||
that off to whomever you believe is the maintainer of that section.
|
||||
A post to linux.dev.kernel isn't such a bad idea if you've done some
|
||||
work to narrow it down.
|
||||
|
||||
If you get it down to a routine, you'll probably get a fix in 24 hours.
|
||||
|
||||
My apologies to Linus and the other kernel hackers for describing this
|
||||
brute force approach, it's hardly what a kernel hacker would do. However,
|
||||
it does work and it lets non-hackers help fix bugs. And it is cool
|
||||
because Linux snapshots will let you do this - something that you can't
|
||||
do with vendor supplied releases.
|
||||
|
410
Documentation/Changes
Normal file
|
@ -0,0 +1,410 @@
|
|||
Intro
|
||||
=====
|
||||
|
||||
This document is designed to provide a list of the minimum levels of
|
||||
software necessary to run the 2.6 kernels, as well as provide brief
|
||||
instructions regarding any other "Gotchas" users may encounter when
|
||||
trying life on the Bleeding Edge. If upgrading from a pre-2.4.x
|
||||
kernel, please consult the Changes file included with 2.4.x kernels for
|
||||
additional information; most of that information will not be repeated
|
||||
here. Basically, this document assumes that your system is already
|
||||
functional and running at least 2.4.x kernels.
|
||||
|
||||
This document is originally based on my "Changes" file for 2.0.x kernels
|
||||
and therefore owes credit to the same people as that file (Jared Mauch,
|
||||
Axel Boldt, Alessandro Sigala, and countless other users all over the
|
||||
'net).
|
||||
|
||||
The latest revision of this document, in various formats, can always
|
||||
be found at <http://cyberbuzz.gatech.edu/kaboom/linux/Changes-2.4/>.
|
||||
|
||||
Feel free to translate this document. If you do so, please send me a
|
||||
URL to your translation for inclusion in future revisions of this
|
||||
document.
|
||||
|
||||
See the file <http://oblom.rnc.ru/linux/kernel/Changes.ru>, which is
|
||||
a Russian translation of this document.
|
||||
|
||||
Visit <http://www2.adi.uam.es/~ender/tecnico/> for the Spanish translation
|
||||
of this document in various formats.
|
||||
|
||||
A German version of this file can be found at
|
||||
<http://www.stefan-winter.de/Changes-2.4.0.txt>.
|
||||
|
||||
Last updated: October 29th, 2002
|
||||
|
||||
Chris Ricker (kaboom@gatech.edu or chris.ricker@genetics.utah.edu).
|
||||
|
||||
Current Minimal Requirements
============================

Upgrade to at *least* these software revisions before thinking you've
encountered a bug! If you're unsure what version you're currently
running, the suggested command should tell you.

Again, keep in mind that this list assumes you are already
functionally running a Linux 2.4 kernel. Also, not all tools are
necessary on all systems; obviously, if you don't have any PCMCIA (PC
Card) hardware, for example, you probably needn't concern yourself
with pcmcia-cs.

o  Gnu C                  2.95.3                  # gcc --version
o  Gnu make               3.79.1                  # make --version
o  binutils               2.12                    # ld -v
o  util-linux             2.10o                   # fdformat --version
o  module-init-tools      0.9.10                  # depmod -V
o  e2fsprogs              1.29                    # tune2fs
o  jfsutils               1.1.3                   # fsck.jfs -V
o  reiserfsprogs          3.6.3                   # reiserfsck -V 2>&1|grep reiserfsprogs
o  xfsprogs               2.6.0                   # xfs_db -V
o  pcmcia-cs              3.1.21                  # cardmgr -V
o  quota-tools            3.09                    # quota -V
o  PPP                    2.4.0                   # pppd --version
o  isdn4k-utils           3.1pre1                 # isdnctrl 2>&1|grep version
o  nfs-utils              1.0.5                   # showmount --version
o  procps                 3.2.0                   # ps --version
o  oprofile               0.5.3                   # oprofiled --version
|
||||
|
||||
Kernel compilation
==================

GCC
---

The gcc version requirements may vary depending on the type of CPU in your
computer. The next paragraph applies to users of x86 CPUs, but not
necessarily to users of other CPUs. Users of other CPUs should obtain
information about their gcc version requirements from another source.

The recommended compiler for the kernel is gcc 2.95.x (x >= 3), and it
should be used when you need absolute stability. You may use gcc 3.0.x
instead if you wish, although it may cause problems. Later versions of gcc
have not received much testing for Linux kernel compilation, and there are
almost certainly bugs (mainly, but not exclusively, in the kernel) that
will need to be fixed in order to use these compilers. In any case, using
pgcc instead of plain gcc is just asking for trouble.

The Red Hat gcc 2.96 compiler subtree can also be used to build this tree.
You should ensure you use gcc-2.96-74 or later. gcc-2.96-54 will not build
the kernel correctly.

In addition, please pay attention to compiler optimization. Anything
greater than -O2 may not be wise. Similarly, if you choose to use gcc-2.95.x
or derivatives, be sure not to use -fstrict-aliasing (which, depending on
your version of gcc 2.95.x, may necessitate using -fno-strict-aliasing).

Make
----

You will need Gnu make 3.79.1 or later to build the kernel.

Binutils
--------

Linux on IA-32 has recently switched from using as86 to using gas for
assembling the 16-bit boot code, removing the need for as86 to compile
your kernel. This change does, however, mean that you need a recent
release of binutils.
|
||||
|
||||
System utilities
================

Architectural changes
---------------------

DevFS has been obsoleted in favour of udev
(http://www.kernel.org/pub/linux/utils/kernel/hotplug/)

32-bit UID support is now in place. Have fun!

Linux documentation for functions is transitioning to inline
documentation via specially-formatted comments near their
definitions in the source. These comments can be combined with the
SGML templates in the Documentation/DocBook directory to make DocBook
files, which can then be converted by DocBook stylesheets to PostScript,
HTML, PDF files, and several other formats. In order to convert from
DocBook format to a format of your choice, you'll need to install Jade as
well as the desired DocBook stylesheets.
|
||||
|
||||
Util-linux
----------

New versions of util-linux provide *fdisk support for larger disks,
support new options to mount, recognize more supported partition
types, have a fdformat which works with 2.4 kernels, and similar goodies.
You'll probably want to upgrade.

Ksymoops
--------

If the unthinkable happens and your kernel oopses, you'll need a 2.4
version of ksymoops to decode the report; see REPORTING-BUGS in the
root of the Linux source for more information.

Module-Init-Tools
-----------------

A new module loader is now in the kernel that requires module-init-tools
to use. It is backward compatible with the 2.4.x series kernels.

Mkinitrd
--------

These changes to the /lib/modules file tree layout also require that
mkinitrd be upgraded.

E2fsprogs
---------

The latest version of e2fsprogs fixes several bugs in fsck and
debugfs. Obviously, it's a good idea to upgrade.
|
||||
|
||||
JFSutils
--------

The jfsutils package contains the utilities for the file system.
The following utilities are available:
o fsck.jfs - initiate replay of the transaction log, and check
  and repair a JFS formatted partition.
o mkfs.jfs - create a JFS formatted partition.
o other file system utilities are also available in this package.

Reiserfsprogs
-------------

The reiserfsprogs package should be used for reiserfs-3.6.x
(Linux kernels 2.4.x). It is a combined package and contains working
versions of mkreiserfs, resize_reiserfs, debugreiserfs and
reiserfsck. These utils work on both i386 and alpha platforms.

Xfsprogs
--------

The latest version of xfsprogs contains mkfs.xfs, xfs_db, and the
xfs_repair utilities, among others, for the XFS filesystem. It is
architecture independent and any version from 2.0.0 onward should
work correctly with this version of the XFS kernel code (2.6.0 or
later is recommended, due to some significant improvements).

Pcmcia-cs
---------

PCMCIA (PC Card) support is now partially implemented in the main
kernel source. Pay attention when you recompile your kernel ;-).
Also, be sure to upgrade to the latest pcmcia-cs release.
|
||||
|
||||
Quota-tools
-----------

Support for 32 bit uid's and gid's is required if you want to use
the newer version 2 quota format. Quota-tools versions 3.07 and
newer have this support. Use the recommended version or newer
from the table above.
|
||||
|
||||
Intel IA32 microcode
--------------------

A driver has been added to allow updating of Intel IA32 microcode,
accessible as both a devfs regular file and as a normal (misc)
character device. If you are not using devfs you may need to:

  mkdir /dev/cpu
  mknod /dev/cpu/microcode c 10 184
  chmod 0644 /dev/cpu/microcode

as root before you can use this. You'll probably also want to
get the user-space microcode_ctl utility to use with this.

Powertweak
----------

If you are running v0.1.17 or earlier, you should upgrade to
version v0.99.0 or higher. Running old versions may cause problems
with programs using shared memory.

udev
----

udev is a userspace application for populating /dev dynamically with
only entries for devices actually present. udev replaces devfs.
|
||||
|
||||
Networking
==========

General changes
---------------

If you have advanced network configuration needs, you should probably
consider using the network tools from ip-route2.
|
||||
|
||||
Packet Filter / NAT
-------------------

The packet filtering and NAT code uses the same tools as the previous 2.4.x
kernel series (iptables). It still includes backwards-compatibility modules
for 2.2.x-style ipchains and 2.0.x-style ipfwadm.
|
||||
|
||||
PPP
---

The PPP driver has been restructured to support multilink and to
enable it to operate over diverse media layers. If you use PPP,
upgrade pppd to at least 2.4.0.

If you are not using devfs, you must have the device file /dev/ppp
which can be made by:

  mknod /dev/ppp c 108 0

as root.

If you use devfsd and build ppp support as modules, you will need
the following in your /etc/devfsd.conf file:

  LOOKUP PPP MODLOAD

Isdn4k-utils
------------

Due to changes in the length of the phone number field, isdn4k-utils
needs to be recompiled or (preferably) upgraded.

NFS-utils
---------

In 2.4 and earlier kernels, the nfs server needed to know about any
client that expected to be able to access files via NFS. This
information would be given to the kernel by "mountd" when the client
mounted the filesystem, or by "exportfs" at system startup. exportfs
would take information about active clients from /var/lib/nfs/rmtab.

This approach is quite fragile as it depends on rmtab being correct
which is not always easy, particularly when trying to implement
fail-over. Even when the system is working well, rmtab suffers from
getting lots of old entries that never get removed.

With 2.6 we have the option of having the kernel tell mountd when it
gets a request from an unknown host, and mountd can give appropriate
export information to the kernel. This removes the dependency on
rmtab and means that the kernel only needs to know about currently
active clients.

To enable this new functionality, you need to:

  mount -t nfsd nfsd /proc/fs/nfs

before running exportfs or mountd. It is recommended that all NFS
services be protected from the internet-at-large by a firewall where
that is possible.
|
||||
|
||||
Getting updated software
========================

Kernel compilation
******************

gcc 2.95.3
----------
o  <ftp://ftp.gnu.org/gnu/gcc/gcc-2.95.3.tar.gz>

Make
----
o  <ftp://ftp.gnu.org/gnu/make/>

Binutils
--------
o  <ftp://ftp.kernel.org/pub/linux/devel/binutils/>

System utilities
****************

Util-linux
----------
o  <ftp://ftp.kernel.org/pub/linux/utils/util-linux/>

Ksymoops
--------
o  <ftp://ftp.kernel.org/pub/linux/utils/kernel/ksymoops/v2.4/>

Module-Init-Tools
-----------------
o  <ftp://ftp.kernel.org/pub/linux/kernel/people/rusty/modules/>

Mkinitrd
--------
o  <ftp://rawhide.redhat.com/pub/rawhide/SRPMS/SRPMS/>

E2fsprogs
---------
o  <http://prdownloads.sourceforge.net/e2fsprogs/e2fsprogs-1.29.tar.gz>

JFSutils
--------
o  <http://jfs.sourceforge.net/>

Reiserfsprogs
-------------
o  <http://www.namesys.com/pub/reiserfsprogs/reiserfsprogs-3.6.3.tar.gz>

Xfsprogs
--------
o  <ftp://oss.sgi.com/projects/xfs/download/>

Pcmcia-cs
---------
o  <ftp://pcmcia-cs.sourceforge.net/pub/pcmcia-cs/pcmcia-cs-3.1.21.tar.gz>

Quota-tools
-----------
o  <http://sourceforge.net/projects/linuxquota/>

Jade
----
o  <ftp://ftp.jclark.com/pub/jade/jade-1.2.1.tar.gz>

DocBook Stylesheets
-------------------
o  <http://nwalsh.com/docbook/dsssl/>

Intel P6 microcode
------------------
o  <http://www.urbanmyth.org/microcode/>

Powertweak
----------
o  <http://powertweak.sourceforge.net/>

udev
----
o  <http://www.kernel.org/pub/linux/utils/kernel/hotplug/udev.html>

Networking
**********

PPP
---
o  <ftp://ftp.samba.org/pub/ppp/ppp-2.4.0.tar.gz>

Isdn4k-utils
------------
o  <ftp://ftp.isdn4linux.de/pub/isdn4linux/utils/isdn4k-utils.v3.1pre1.tar.gz>

NFS-utils
---------
o  <http://sourceforge.net/project/showfiles.php?group_id=14>

Iptables
--------
o  <http://www.iptables.org/downloads.html>

Ip-route2
---------
o  <ftp://ftp.tux.org/pub/net/ip-routing/iproute2-2.2.4-now-ss991023.tar.gz>

OProfile
--------
o  <http://oprofile.sf.net/download/>

NFS-Utils
---------
o  <http://nfs.sourceforge.net/>
|
||||
|
431
Documentation/CodingStyle
Normal file
|
@ -0,0 +1,431 @@
|
|||
|
||||
		Linux kernel coding style

This is a short document describing the preferred coding style for the
linux kernel. Coding style is very personal, and I won't _force_ my
views on anybody, but this is what goes for anything that I have to be
able to maintain, and I'd prefer it for most other things too. Please
at least consider the points made here.

First off, I'd suggest printing out a copy of the GNU coding standards,
and NOT read it. Burn them, it's a great symbolic gesture.

Anyway, here goes:
|
||||
|
||||
|
||||
		Chapter 1: Indentation

Tabs are 8 characters, and thus indentations are also 8 characters.
There are heretic movements that try to make indentations 4 (or even 2!)
characters deep, and that is akin to trying to define the value of PI to
be 3.

Rationale: The whole idea behind indentation is to clearly define where
a block of control starts and ends. Especially when you've been looking
at your screen for 20 straight hours, you'll find it a lot easier to see
how the indentation works if you have large indentations.

Now, some people will claim that having 8-character indentations makes
the code move too far to the right, and makes it hard to read on an
80-character terminal screen. The answer to that is that if you need
more than 3 levels of indentation, you're screwed anyway, and should fix
your program.

In short, 8-char indents make things easier to read, and have the added
benefit of warning you when you're nesting your functions too deep.
Heed that warning.

Don't put multiple statements on a single line unless you have
something to hide:

	if (condition) do_this;
	  do_something_everytime;

Outside of comments, documentation and except in Kconfig, spaces are never
used for indentation, and the above example is deliberately broken.

Get a decent editor and don't leave whitespace at the end of lines.
|
||||
|
||||
|
||||
		Chapter 2: Breaking long lines and strings

Coding style is all about readability and maintainability using commonly
available tools.

The limit on the length of lines is 80 columns and this is a hard limit.

Statements longer than 80 columns will be broken into sensible chunks.
Descendants are always substantially shorter than the parent and are placed
substantially to the right. The same applies to function headers with a long
argument list. Long strings are likewise broken into shorter strings.

void fun(int a, int b, int c)
{
	if (condition)
		printk(KERN_WARNING "Warning this is a long printk with "
						"3 parameters a: %u b: %u "
						"c: %u \n", a, b, c);
	else
		next_statement;
}
|
||||
|
||||
		Chapter 3: Placing Braces

The other issue that always comes up in C styling is the placement of
braces. Unlike the indent size, there are few technical reasons to
choose one placement strategy over the other, but the preferred way, as
shown to us by the prophets Kernighan and Ritchie, is to put the opening
brace last on the line, and put the closing brace first, thusly:

	if (x is true) {
		we do y
	}

However, there is one special case, namely functions: they have the
opening brace at the beginning of the next line, thus:

	int function(int x)
	{
		body of function
	}

Heretic people all over the world have claimed that this inconsistency
is ... well ... inconsistent, but all right-thinking people know that
(a) K&R are _right_ and (b) K&R are right. Besides, functions are
special anyway (you can't nest them in C).

Note that the closing brace is empty on a line of its own, _except_ in
the cases where it is followed by a continuation of the same statement,
ie a "while" in a do-statement or an "else" in an if-statement, like
this:

	do {
		body of do-loop
	} while (condition);

and

	if (x == y) {
		..
	} else if (x > y) {
		...
	} else {
		....
	}

Rationale: K&R.

Also, note that this brace-placement also minimizes the number of empty
(or almost empty) lines, without any loss of readability. Thus, as the
supply of new-lines on your screen is not a renewable resource (think
25-line terminal screens here), you have more empty lines to put
comments on.
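
The brace rules above use pseudo-code, so here is a minimal compilable
sketch that exercises each of them in one place. The function name
count_steps and its parameters are hypothetical and exist only for
illustration.

```c
/*
 * Hypothetical example: count how many unit steps separate x from y,
 * giving up after "limit" steps.
 */
int count_steps(int x, int y, int limit)
{
	int steps = 0;

	if (x == y) {
		steps = 0;
	} else if (x > y) {
		do {
			x--;
			steps++;
		} while (x > y && steps < limit);
	} else {
		while (x < y && steps < limit) {
			x++;
			steps++;
		}
	}
	return steps;
}
```

Note that only the function gets its opening brace on a line of its own;
every other block opens at the end of the line that introduces it, and
"else" and "while" follow the closing brace on the same line.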
|
||||
|
||||
|
||||
		Chapter 4: Naming

C is a Spartan language, and so should your naming be. Unlike Modula-2
and Pascal programmers, C programmers do not use cute names like
ThisVariableIsATemporaryCounter. A C programmer would call that
variable "tmp", which is much easier to write, and not the least more
difficult to understand.

HOWEVER, while mixed-case names are frowned upon, descriptive names for
global variables are a must. To call a global function "foo" is a
shooting offense.

GLOBAL variables (to be used only if you _really_ need them) need to
have descriptive names, as do global functions. If you have a function
that counts the number of active users, you should call that
"count_active_users()" or similar, you should _not_ call it "cntusr()".

Encoding the type of a function into the name (so-called Hungarian
notation) is brain damaged - the compiler knows the types anyway and can
check those, and it only confuses the programmer. No wonder MicroSoft
makes buggy programs.

LOCAL variable names should be short, and to the point. If you have
some random integer loop counter, it should probably be called "i".
Calling it "loop_counter" is non-productive, if there is no chance of it
being mis-understood. Similarly, "tmp" can be just about any type of
variable that is used to hold a temporary value.

If you are afraid to mix up your local variable names, you have another
problem, which is called the function-growth-hormone-imbalance syndrome.
See next chapter.
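
As a sketch of how these naming rules combine, the following uses the
count_active_users() name suggested above. The struct user type and its
"active" field are assumptions made up for this example and are not taken
from the kernel.

```c
#include <stddef.h>

/* Hypothetical type, for illustration only. */
struct user {
	int active;
};

/* Global function: the name says what it does. */
int count_active_users(const struct user *users, size_t nr_users)
{
	size_t i;	/* throwaway loop counter: "i" is enough */
	int count = 0;

	for (i = 0; i < nr_users; i++)
		if (users[i].active)
			count++;
	return count;
}
```

The globally visible name is descriptive, while the locals stay short
because the function is small enough that they cannot be confused.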
|
||||
|
||||
|
||||
		Chapter 5: Functions

Functions should be short and sweet, and do just one thing. They should
fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24,
as we all know), and do one thing and do that well.

The maximum length of a function is inversely proportional to the
complexity and indentation level of that function. So, if you have a
conceptually simple function that is just one long (but simple)
case-statement, where you have to do lots of small things for a lot of
different cases, it's OK to have a longer function.

However, if you have a complex function, and you suspect that a
less-than-gifted first-year high-school student might not even
understand what the function is all about, you should adhere to the
maximum limits all the more closely. Use helper functions with
descriptive names (you can ask the compiler to in-line them if you think
it's performance-critical, and it will probably do a better job of it
than you would have done).

Another measure of the function is the number of local variables. They
shouldn't exceed 5-10, or you're doing something wrong. Re-think the
function, and split it into smaller pieces. A human brain can
generally easily keep track of about 7 different things, anything more
and it gets confused. You know you're brilliant, but maybe you'd like
to understand what you did 2 weeks from now.
|
||||
|
||||
|
||||
		Chapter 6: Centralized exiting of functions

Albeit deprecated by some people, the equivalent of the goto statement is
used frequently by compilers in the form of the unconditional jump
instruction.

The goto statement comes in handy when a function exits from multiple
locations and some common work such as cleanup has to be done.

The rationale is:

- unconditional statements are easier to understand and follow
- nesting is reduced
- errors by not updating individual exit points when making
    modifications are prevented
- saves the compiler work to optimize redundant code away ;)

int fun(int a)
{
	int result = 0;
	char *buffer = kmalloc(SIZE, GFP_KERNEL);

	if (buffer == NULL)
		return -ENOMEM;

	if (condition1) {
		while (loop1) {
			...
		}
		result = 1;
		goto out;
	}
	...
out:
	kfree(buffer);
	return result;
}
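
The example above is schematic, so here is a user-space sketch of the same
pattern that can be compiled as is. It substitutes malloc()/free() for
kmalloc()/kfree(); the function copy_and_measure and its parameters are
hypothetical.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: copy src into a scratch buffer of "limit" bytes
 * and return its length, or a negative errno value on failure.
 */
int copy_and_measure(const char *src, size_t limit)
{
	int result;
	size_t len;
	char *scratch = malloc(limit);

	if (scratch == NULL)
		return -ENOMEM;		/* nothing to clean up yet */

	len = strlen(src);
	if (len >= limit) {
		result = -EINVAL;
		goto out;		/* error path shares the cleanup */
	}

	memcpy(scratch, src, len + 1);
	result = (int)len;
out:
	free(scratch);			/* single exit point frees the buffer */
	return result;
}
```

Both the success path and the error path leave through "out", so a later
change that adds another failure case only needs a goto and cannot forget
the free().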
|
||||
|
||||
		Chapter 7: Commenting

Comments are good, but there is also a danger of over-commenting. NEVER
try to explain HOW your code works in a comment: it's much better to
write the code so that the _working_ is obvious, and it's a waste of
time to explain badly written code.

Generally, you want your comments to tell WHAT your code does, not HOW.
Also, try to avoid putting comments inside a function body: if the
function is so complex that you need to separately comment parts of it,
you should probably go back to chapter 5 for a while. You can make
small comments to note or warn about something particularly clever (or
ugly), but try to avoid excess. Instead, put the comments at the head
of the function, telling people what it does, and possibly WHY it does
it.
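
One possible shape for such a head-of-function comment is sketched below.
The function clamp_retry_delay and its bounds are invented for this
example; the point is only that the comment states WHAT and WHY, and the
body is plain enough to need no comments of its own.

```c
/*
 * clamp_retry_delay - keep a retry delay within sane bounds
 *
 * Returns delay_ms limited to the range [10, 5000].  The floor stops a
 * misconfigured caller from busy-looping; the ceiling stops a dead
 * device from stalling the caller for minutes.
 */
int clamp_retry_delay(int delay_ms)
{
	if (delay_ms < 10)
		return 10;
	if (delay_ms > 5000)
		return 5000;
	return delay_ms;
}
```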
|
||||
|
||||
|
||||
		Chapter 8: You've made a mess of it

That's OK, we all do. You've probably been told by your long-time Unix
user helper that "GNU emacs" automatically formats the C sources for
you, and you've noticed that yes, it does do that, but the defaults it
uses are less than desirable (in fact, they are worse than random
typing - an infinite number of monkeys typing into GNU emacs would never
make a good program).

So, you can either get rid of GNU emacs, or change it to use saner
values. To do the latter, you can stick the following in your .emacs file:

(defun linux-c-mode ()
  "C mode with adjusted defaults for use with the Linux kernel."
  (interactive)
  (c-mode)
  (c-set-style "K&R")
  (setq tab-width 8)
  (setq indent-tabs-mode t)
  (setq c-basic-offset 8))

This will define the M-x linux-c-mode command. When hacking on a
module, if you put the string -*- linux-c -*- somewhere on the first
two lines, this mode will be automatically invoked. Also, you may want
to add

(setq auto-mode-alist (cons '("/usr/src/linux.*/.*\\.[ch]$" . linux-c-mode)
			auto-mode-alist))

to your .emacs file if you want to have linux-c-mode switched on
automagically when you edit source files under /usr/src/linux.

But even if you fail in getting emacs to do sane formatting, not
everything is lost: use "indent".

Now, again, GNU indent has the same brain-dead settings that GNU emacs
has, which is why you need to give it a few command line options.
However, that's not too bad, because even the makers of GNU indent
recognize the authority of K&R (the GNU people aren't evil, they are
just severely misguided in this matter), so you just give indent the
options "-kr -i8" (stands for "K&R, 8 character indents"), or use
"scripts/Lindent", which indents in the latest style.

"indent" has a lot of options, and especially when it comes to comment
re-formatting you may want to take a look at the man page. But
remember: "indent" is not a fix for bad programming.
|
||||
|
||||
|
||||
		Chapter 9: Configuration-files

For configuration options (arch/xxx/Kconfig, and all the Kconfig files),
somewhat different indentation is used.

Help text is indented with 2 spaces.

if CONFIG_EXPERIMENTAL
	tristate CONFIG_BOOM
	default n
	help
	  Apply nitroglycerine inside the keyboard (DANGEROUS)
	bool CONFIG_CHEER
	depends on CONFIG_BOOM
	default y
	help
	  Output nice messages when you explode
endif

Generally, CONFIG_EXPERIMENTAL should surround all options not considered
stable. All options that are known to trash data (experimental
write-support for file-systems, for instance) should be denoted
(DANGEROUS), other experimental options should be denoted (EXPERIMENTAL).
|
||||
|
||||
|
||||
		Chapter 10: Data structures

Data structures that have visibility outside the single-threaded
environment they are created and destroyed in should always have
reference counts. In the kernel, garbage collection doesn't exist (and
outside the kernel garbage collection is slow and inefficient), which
means that you absolutely _have_ to reference count all your uses.

Reference counting means that you can avoid locking, and allows multiple
users to have access to the data structure in parallel - and not having
to worry about the structure suddenly going away from under them just
because they slept or did something else for a while.

Note that locking is _not_ a replacement for reference counting.
Locking is used to keep data structures coherent, while reference
counting is a memory management technique. Usually both are needed, and
they are not to be confused with each other.
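
A minimal user-space sketch of the get/put discipline described above
follows. The struct session type and its helpers are hypothetical, and the
counter is a plain int to keep the example self-contained; real kernel
code would use an atomic counter (for example atomic_t) so that
concurrent users can take and drop references safely.

```c
#include <stdlib.h>

/* Hypothetical refcounted object, for illustration only. */
struct session {
	int refcount;	/* number of live references */
	int id;
};

struct session *session_create(int id)
{
	struct session *s = malloc(sizeof(*s));

	if (s == NULL)
		return NULL;
	s->refcount = 1;	/* the creator holds the first reference */
	s->id = id;
	return s;
}

struct session *session_get(struct session *s)
{
	s->refcount++;
	return s;
}

/* Returns 1 if this call dropped the last reference and freed s. */
int session_put(struct session *s)
{
	if (--s->refcount > 0)
		return 0;
	free(s);
	return 1;
}
```

Every user that keeps a pointer calls session_get() first and
session_put() when done, so the object is freed exactly once, by whoever
happens to drop the last reference.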
|
||||
|
||||
Many data structures can indeed have two levels of reference counting,
when there are users of different "classes". The subclass count counts
the number of subclass users, and decrements the global count just once
when the subclass count goes to zero.

Examples of this kind of "multi-level-reference-counting" can be found in
memory management ("struct mm_struct": mm_users and mm_count), and in
filesystem code ("struct super_block": s_count and s_active).
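
The two-level scheme can be sketched as follows. This is a simplified
model loosely patterned on the users/count split named above, not the
actual mm_struct or super_block code; struct shared_map, its fields, and
its helpers are all hypothetical, and flags stand in for the real
teardown and free operations.

```c
/* Hypothetical two-level refcounted object, for illustration only. */
struct shared_map {
	int users;		/* subclass count: users of the contents */
	int count;		/* global count: references to the struct */
	int contents_released;	/* stands in for tearing down contents */
	int struct_released;	/* stands in for freeing the struct */
};

void shared_map_init(struct shared_map *m)
{
	m->users = 1;
	m->count = 1;	/* all users together hold ONE global reference */
	m->contents_released = 0;
	m->struct_released = 0;
}

void shared_map_grab(struct shared_map *m)
{
	m->count++;
}

void shared_map_drop(struct shared_map *m)
{
	if (--m->count == 0)
		m->struct_released = 1;
}

void shared_map_get_user(struct shared_map *m)
{
	m->users++;
}

void shared_map_put_user(struct shared_map *m)
{
	if (--m->users == 0) {
		m->contents_released = 1;
		shared_map_drop(m);	/* give back the users' global ref */
	}
}
```

The contents go away when the last user leaves, but the structure itself
survives until every holder of a global reference has dropped it.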
|
||||
|
||||
Remember: if another thread can find your data structure, and you don't
have a reference count on it, you almost certainly have a bug.
|
||||
|
||||
|
||||
		Chapter 11: Macros, Enums, Inline functions and RTL

Names of macros defining constants and labels in enums are capitalized.

#define CONSTANT 0x12345

Enums are preferred when defining several related constants.

CAPITALIZED macro names are appreciated but macros resembling functions
may be named in lower case.

Generally, inline functions are preferable to macros resembling functions.

Macros with multiple statements should be enclosed in a do - while block:

#define macrofun(a, b, c) 			\
	do {					\
		if (a == 5)			\
			do_this(b, c);		\
	} while (0)
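
To see why the do - while wrapper matters, consider using the macro as the
body of an if that has an else. The sketch below is self-contained:
do_this(), the "calls" counter, and run_macro() are hypothetical stand-ins,
and the macro arguments are additionally parenthesized as a precaution.

```c
static int calls;

/* Hypothetical stand-in for real work. */
static void do_this(int b, int c)
{
	calls += b + c;
}

#define macrofun(a, b, c)			\
	do {					\
		if ((a) == 5)			\
			do_this((b), (c));	\
	} while (0)

int run_macro(int a)
{
	calls = 0;
	if (a > 0)
		macrofun(a, 1, 2);	/* expands to ONE statement */
	else
		calls = -1;		/* still pairs with "if (a > 0)" */
	return calls;
}
```

With a bare "if" in the macro, the else would silently attach to the
macro's own if; with a plain { } block, the semicolon after the macro call
would end the statement and leave the else dangling as a syntax error. The
do - while (0) form avoids both.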
|
||||
|
||||
Things to avoid when using macros:

1) macros that affect control flow:

#define FOO(x)					\
	do {					\
		if (blah(x) < 0)		\
			return -EBUGGERED;	\
	} while(0)

is a _very_ bad idea. It looks like a function call but exits the "calling"
function; don't break the internal parsers of those who will read the code.

2) macros that depend on having a local variable with a magic name:

#define FOO(val) bar(index, val)

might look like a good thing, but it's confusing as hell when one reads the
code and it's prone to breakage from seemingly innocent changes.

3) macros with arguments that are used as l-values: FOO(x) = y; will
bite you if somebody e.g. turns FOO into an inline function.

4) forgetting about precedence: macros defining constants using expressions
must enclose the expression in parentheses. Beware of similar issues with
macros using parameters.

#define CONSTANT 0x4000
#define CONSTEXP (CONSTANT | 3)
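
The precedence problem is easiest to see side by side. In the sketch below,
CONSTEXP_BAD, DOUBLE_BAD, DOUBLE and the four small functions are
hypothetical names added for illustration; only CONSTANT and CONSTEXP come
from the text above.

```c
#define CONSTANT	0x4000
#define CONSTEXP_BAD	CONSTANT | 3	/* no parentheses: fragile */
#define CONSTEXP	(CONSTANT | 3)

#define DOUBLE_BAD(x)	x * 2		/* parameter not parenthesized */
#define DOUBLE(x)	((x) * 2)

int scaled_bad(void)
{
	return CONSTEXP_BAD * 2;	/* parses as 0x4000 | (3 * 2) */
}

int scaled_good(void)
{
	return CONSTEXP * 2;		/* parses as (0x4000 | 3) * 2 */
}

int doubled_bad(int n)
{
	return DOUBLE_BAD(n + 1);	/* parses as n + (1 * 2) */
}

int doubled_good(int n)
{
	return DOUBLE(n + 1);		/* parses as (n + 1) * 2 */
}
```

The macro expands textually, so the operators surrounding the use site
decide how the expression groups unless the macro body and each parameter
carry their own parentheses.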
|
||||
|
||||
The cpp manual deals with macros exhaustively. The gcc internals manual also
covers RTL which is used frequently with assembly language in the kernel.
|
||||
|
||||
|
||||
		Chapter 12: Printing kernel messages

Kernel developers like to be seen as literate. Do mind the spelling
of kernel messages to make a good impression. Do not use crippled
words like "dont" and use "do not" or "don't" instead.

Kernel messages do not have to be terminated with a period.

Printing numbers in parentheses (%d) adds no value and should be avoided.
|
||||
|
||||
|
||||
		Chapter 13: References

The C Programming Language, Second Edition
by Brian W. Kernighan and Dennis M. Ritchie.
Prentice Hall, Inc., 1988.
ISBN 0-13-110362-8 (paperback), 0-13-110370-9 (hardback).
URL: http://cm.bell-labs.com/cm/cs/cbook/

The Practice of Programming
by Brian W. Kernighan and Rob Pike.
Addison-Wesley, Inc., 1999.
ISBN 0-201-61586-X.
URL: http://cm.bell-labs.com/cm/cs/tpop/

GNU manuals - where in compliance with K&R and this text - for cpp, gcc,
gcc internals and indent, all available from http://www.gnu.org

WG14 is the international standardization working group for the programming
language C, URL: http://std.dkuug.dk/JTC1/SC22/WG14/

--
Last updated on 16 February 2004 by a community effort on LKML.
|
526
Documentation/DMA-API.txt
Normal file
|
@ -0,0 +1,526 @@
|
|||
               Dynamic DMA mapping using the generic device
               ============================================

        James E.J. Bottomley <James.Bottomley@HansenPartnership.com>

This document describes the DMA API. For a more gentle introduction
phrased in terms of the pci_ equivalents (and actual examples) see
DMA-mapping.txt

This API is split into two pieces. Part I describes the API and the
corresponding pci_ API. Part II describes the extensions to the API
for supporting non-consistent memory machines. Unless you know that
your driver absolutely has to support non-consistent platforms (this
is usually only legacy platforms) you should only use the API
described in part I.
|
||||
|
||||
Part I - pci_ and dma_ Equivalent API
-------------------------------------

To get the pci_ API, you must #include <linux/pci.h>
To get the dma_ API, you must #include <linux/dma-mapping.h>
|
||||
|
||||
|
||||
Part Ia - Using large dma-coherent buffers
|
||||
------------------------------------------
|
||||
|
||||
void *
|
||||
dma_alloc_coherent(struct device *dev, size_t size,
|
||||
dma_addr_t *dma_handle, int flag)
|
||||
void *
|
||||
pci_alloc_consistent(struct pci_dev *dev, size_t size,
|
||||
dma_addr_t *dma_handle)
|
||||
|
||||
Consistent memory is memory for which a write by either the device or
|
||||
the processor can immediately be read by the processor or device
|
||||
without having to worry about caching effects.
|
||||
|
||||
This routine allocates a region of <size> bytes of consistent memory.
|
||||
It also returns a <dma_handle> which may be cast to an unsigned
|
||||
integer the same width as the bus and used as the physical address
|
||||
base of the region.
|
||||
|
||||
Returns: a pointer to the allocated region (in the processor's virtual
|
||||
address space) or NULL if the allocation failed.
|
||||
|
||||
Note: consistent memory can be expensive on some platforms, and the
|
||||
minimum allocation length may be as big as a page, so you should
|
||||
consolidate your requests for consistent memory as much as possible.
|
||||
The simplest way to do that is to use the dma_pool calls (see below).
|
||||
|
||||
The flag parameter (dma_alloc_coherent only) allows the caller to
|
||||
specify the GFP_ flags (see kmalloc) for the allocation (the
|
||||
implementation may choose to ignore flags that affect the location of
|
||||
the returned memory, like GFP_DMA). For pci_alloc_consistent, you
|
||||
must assume GFP_ATOMIC behaviour.
|
||||
|
||||
void
|
||||
dma_free_coherent(struct device *dev, size_t size, void *cpu_addr,
|
||||
dma_addr_t dma_handle)
|
||||
void
|
||||
pci_free_consistent(struct pci_dev *dev, size_t size, void *cpu_addr,
|
||||
dma_addr_t dma_handle)
|
||||
|
||||
Free the region of consistent memory you previously allocated. dev,
|
||||
size and dma_handle must all be the same as those passed into the
|
||||
consistent allocate. cpu_addr must be the virtual address returned by
|
||||
the consistent allocate.
|
||||
|
||||
|
||||
Part Ib - Using small dma-coherent buffers
|
||||
------------------------------------------
|
||||
|
||||
To get this part of the dma_ API, you must #include <linux/dmapool.h>
|
||||
|
||||
Many drivers need lots of small dma-coherent memory regions for DMA
|
||||
descriptors or I/O buffers. Rather than allocating in units of a page
|
||||
or more using dma_alloc_coherent(), you can use DMA pools. These work
|
||||
much like a kmem_cache_t, except that they use the dma-coherent allocator
|
||||
not __get_free_pages(). Also, they understand common hardware constraints
|
||||
for alignment, like queue heads needing to be aligned on N byte boundaries.
|
||||
|
||||
|
||||
struct dma_pool *
|
||||
dma_pool_create(const char *name, struct device *dev,
|
||||
size_t size, size_t align, size_t alloc);
|
||||
|
||||
struct pci_pool *
|
||||
pci_pool_create(const char *name, struct pci_dev *dev,
|
||||
size_t size, size_t align, size_t alloc);
|
||||
|
||||
The pool create() routines initialize a pool of dma-coherent buffers
|
||||
for use with a given device. They must be called in a context which
|
||||
can sleep.
|
||||
|
||||
The "name" is for diagnostics (like a kmem_cache_t name); dev and size
|
||||
are like what you'd pass to dma_alloc_coherent(). The device's hardware
|
||||
alignment requirement for this type of data is "align" (which is expressed
|
||||
in bytes, and must be a power of two). If your device has no boundary
|
||||
crossing restrictions, pass 0 for alloc; passing 4096 says memory allocated
|
||||
from this pool must not cross 4KByte boundaries.
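The boundary rule comes down to simple address arithmetic. The following is a minimal userspace sketch, not kernel code: the helper name crosses_boundary() is hypothetical, and it assumes the boundary is a power of two.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the kernel API): report whether a block
 * of 'size' bytes starting at bus address 'addr' would straddle a multiple
 * of 'boundary'.  'boundary' is assumed to be a power of two; 0 means "no
 * restriction", matching the meaning of the alloc argument described above.
 */
static bool crosses_boundary(uint64_t addr, size_t size, uint64_t boundary)
{
	if (boundary == 0 || size == 0)
		return false;
	/* The first and last byte must fall in the same boundary-sized window. */
	return (addr & ~(boundary - 1)) != ((addr + size - 1) & ~(boundary - 1));
}
```

With a 4096 byte boundary, a 32 byte block at 0x1ff0 would be rejected by this check, while the same block at 0x1fe0 ends exactly at the boundary and is fine.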
|
||||
|
||||
|
||||
void *dma_pool_alloc(struct dma_pool *pool, int gfp_flags,
|
||||
dma_addr_t *dma_handle);
|
||||
|
||||
void *pci_pool_alloc(struct pci_pool *pool, int gfp_flags,
|
||||
dma_addr_t *dma_handle);
|
||||
|
||||
This allocates memory from the pool; the returned memory will meet the size
|
||||
and alignment requirements specified at creation time. Pass GFP_ATOMIC to
|
||||
prevent blocking, or if it's permitted (not in_interrupt, not holding SMP locks),
|
||||
pass GFP_KERNEL to allow blocking. Like dma_alloc_coherent(), this returns
|
||||
two values: an address usable by the cpu, and the dma address usable by the
|
||||
pool's device.
|
||||
|
||||
|
||||
void dma_pool_free(struct dma_pool *pool, void *vaddr,
|
||||
dma_addr_t addr);
|
||||
|
||||
void pci_pool_free(struct pci_pool *pool, void *vaddr,
|
||||
dma_addr_t addr);
|
||||
|
||||
This puts memory back into the pool. The pool is what was passed to
|
||||
the pool allocation routine; the cpu and dma addresses are what
|
||||
were returned when that routine allocated the memory being freed.
|
||||
|
||||
|
||||
void dma_pool_destroy(struct dma_pool *pool);
|
||||
|
||||
void pci_pool_destroy(struct pci_pool *pool);
|
||||
|
||||
The pool destroy() routines free the resources of the pool. They must be
|
||||
called in a context which can sleep. Make sure you've freed all allocated
|
||||
memory back to the pool before you destroy it.
|
||||
|
||||
|
||||
Part Ic - DMA addressing limitations
|
||||
------------------------------------
|
||||
|
||||
int
|
||||
dma_supported(struct device *dev, u64 mask)
|
||||
int
|
||||
pci_dma_supported(struct pci_dev *dev, u64 mask)
|
||||
|
||||
Checks to see if the device can support DMA to the memory described by
|
||||
mask.
|
||||
|
||||
Returns: 1 if it can and 0 if it can't.
|
||||
|
||||
Notes: This routine merely tests to see if the mask is possible. It
|
||||
won't change the current mask settings. It is more intended as an
|
||||
internal API for use by the platform than an external API for use by
|
||||
driver writers.
|
||||
|
||||
int
|
||||
dma_set_mask(struct device *dev, u64 mask)
|
||||
int
|
||||
pci_set_dma_mask(struct pci_dev *dev, u64 mask)
|
||||
|
||||
Checks to see if the mask is possible and updates the device
|
||||
parameters if it is.
|
||||
|
||||
Returns: 0 if successful and a negative error if not.
|
||||
|
||||
u64
|
||||
dma_get_required_mask(struct device *dev)
|
||||
|
||||
After setting the mask with dma_set_mask(), this API returns the
|
||||
mask (within that already set) that the platform actually
|
||||
requires to operate efficiently. Usually this means the returned mask
|
||||
is the minimum required to cover all of memory. Examining the
|
||||
required mask gives drivers with variable descriptor sizes the
|
||||
opportunity to use smaller descriptors as necessary.
|
||||
|
||||
Requesting the required mask does not alter the current mask. If you
|
||||
wish to take advantage of it, you should issue another dma_set_mask()
|
||||
call to lower the mask again.
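To make "the minimum required to cover all of memory" concrete, here is a small userspace sketch. It is only a model of the idea, not how any platform implements dma_get_required_mask(); both function names and the descriptor policy are assumptions made for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical model: the smallest mask of the form 2^n - 1 that covers
 * the highest physical address present in the machine.
 */
static uint64_t min_covering_mask(uint64_t max_phys_addr)
{
	uint64_t mask = 0;

	/* Grow the mask one bit at a time until it covers the address. */
	while (mask < max_phys_addr)
		mask = (mask << 1) | 1;
	return mask;
}

/*
 * Hypothetical driver policy: only pay for 64-bit descriptors when the
 * required mask does not fit in 32 bits.
 */
static int needs_64bit_descriptors(uint64_t required_mask)
{
	return required_mask > 0xffffffffULL;
}
```

On a machine whose memory ends below 4GB, this model yields a mask that fits in 32 bits, so a driver following this policy could stay with its smaller descriptor format.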
|
||||
|
||||
|
||||
Part Id - Streaming DMA mappings
|
||||
--------------------------------
|
||||
|
||||
dma_addr_t
|
||||
dma_map_single(struct device *dev, void *cpu_addr, size_t size,
|
||||
enum dma_data_direction direction)
|
||||
dma_addr_t
|
||||
pci_map_single(struct pci_dev *hwdev, void *cpu_addr, size_t size,
|
||||
int direction)
|
||||
|
||||
Maps a piece of processor virtual memory so it can be accessed by the
|
||||
device and returns the physical handle of the memory.
|
||||
|
||||
The direction for both APIs may be converted freely by casting.
|
||||
However the dma_ API uses a strongly typed enumerator for its
|
||||
direction:
|
||||
|
||||
DMA_NONE           = PCI_DMA_NONE           no direction (used for debugging)
DMA_TO_DEVICE      = PCI_DMA_TODEVICE       data is going from the memory to the device
DMA_FROM_DEVICE    = PCI_DMA_FROMDEVICE     data is coming from the device to the memory
DMA_BIDIRECTIONAL  = PCI_DMA_BIDIRECTIONAL  direction isn't known
|
||||
|
||||
Notes: Not all memory regions in a machine can be mapped by this
|
||||
API. Further, regions that appear to be physically contiguous in
|
||||
kernel virtual space may not be contiguous as physical memory. Since
|
||||
this API does not provide any scatter/gather capability, it will fail
|
||||
if the user tries to map a non physically contiguous piece of memory.
|
||||
For this reason, it is recommended that memory mapped by this API be
|
||||
obtained only from sources which are guaranteed to be physically contiguous
|
||||
(like kmalloc).
|
||||
|
||||
Further, the physical address of the memory must be within the
|
||||
dma_mask of the device (the dma_mask represents a bit mask of the
|
||||
addressable region for the device, i.e. if the physical address of
|
||||
the memory anded with the dma_mask is still equal to the physical
|
||||
address, then the device can perform DMA to the memory). In order to
|
||||
ensure that the memory allocated by kmalloc is within the dma_mask,
|
||||
the driver may specify various platform dependent flags to restrict
|
||||
the physical memory range of the allocation (e.g. on x86, GFP_DMA
|
||||
guarantees the memory to be within the first 16MB of available physical memory,
|
||||
as required by ISA devices).
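The dma_mask test described above can be written out directly. This is a userspace sketch with hypothetical helper names, and the buffer variant assumes the usual contiguous mask of the form 2^n - 1, so that checking the first and last byte is sufficient.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The address is reachable if ANDing it with the mask leaves it unchanged. */
static bool addr_within_dma_mask(uint64_t phys_addr, uint64_t dma_mask)
{
	return (phys_addr & dma_mask) == phys_addr;
}

/*
 * Hypothetical extension to a whole buffer: for a contiguous mask it is
 * enough to check the first and the last byte of the region.
 */
static bool buffer_within_dma_mask(uint64_t phys_addr, size_t size,
				   uint64_t dma_mask)
{
	if (size == 0)
		return addr_within_dma_mask(phys_addr, dma_mask);
	return addr_within_dma_mask(phys_addr, dma_mask) &&
	       addr_within_dma_mask(phys_addr + size - 1, dma_mask);
}
```

For a 24-bit mask (0x00ffffff), an address of 0x01000000 fails the test because bit 24 is lost by the AND.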
|
||||
|
||||
Note also that the above constraints on physical contiguity and
|
||||
dma_mask may not apply if the platform has an IOMMU (a device which
|
||||
supplies a physical to virtual mapping between the I/O memory bus and
|
||||
the device). However, to be portable, device driver writers may *not*
|
||||
assume that such an IOMMU exists.
|
||||
|
||||
Warnings: Memory coherency operates at a granularity called the cache
|
||||
line width. In order for memory mapped by this API to operate
|
||||
correctly, the mapped region must begin exactly on a cache line
|
||||
boundary and end exactly on one (to prevent two separately mapped
|
||||
regions from sharing a single cache line). Since the cache line size
|
||||
may not be known at compile time, the API will not enforce this
|
||||
requirement. Therefore, it is recommended that driver writers who
|
||||
don't take special care to determine the cache line size at run time
|
||||
only map virtual regions that begin and end on page boundaries (which
|
||||
are guaranteed also to be cache line boundaries).
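As an illustration of the alignment requirement, the following userspace sketch checks whether a region starts and ends on a cache line boundary and pads a length up to a whole number of lines. The helper names are hypothetical, and the line size is assumed to be a power of two supplied by the caller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * True if the region begins on a cache line boundary and its length is a
 * whole number of cache lines (so it also ends on a boundary).
 * 'line_size' is assumed to be a power of two.
 */
static bool region_is_line_aligned(uintptr_t start, size_t len,
				   size_t line_size)
{
	return (start & (line_size - 1)) == 0 &&
	       (len & (line_size - 1)) == 0;
}

/* Round a length up to the next multiple of the cache line size. */
static size_t round_up_to_line(size_t len, size_t line_size)
{
	return (len + line_size - 1) & ~(line_size - 1);
}
```

A driver that sizes its buffers with something like round_up_to_line() and starts them on a line boundary avoids sharing a cache line between two separately mapped regions.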
|
||||
|
||||
DMA_TO_DEVICE synchronisation must be done after the last modification
|
||||
of the memory region by the software and before it is handed off to
|
||||
the device. Once this primitive is used, memory covered by this
|
||||
primitive should be treated as read only by the device. If the device
|
||||
may write to it at any point, it should be DMA_BIDIRECTIONAL (see
|
||||
below).
|
||||
|
||||
DMA_FROM_DEVICE synchronisation must be done before the driver
|
||||
accesses data that may be changed by the device. This memory should
|
||||
be treated as read only by the driver. If the driver needs to write
|
||||
to it at any point, it should be DMA_BIDIRECTIONAL (see below).
|
||||
|
||||
DMA_BIDIRECTIONAL requires special handling: it means that the driver
|
||||
isn't sure if the memory was modified before being handed off to the
|
||||
device and also isn't sure if the device will also modify it. Thus,
|
||||
you must always sync bidirectional memory twice: once before the
|
||||
memory is handed off to the device (to make sure all memory changes
|
||||
are flushed from the processor) and once before the data may be
|
||||
accessed after being used by the device (to make sure any processor
|
||||
cache lines are updated with data that the device may have changed).
|
||||
|
||||
void
|
||||
dma_unmap_single(struct device *dev, dma_addr_t dma_addr, size_t size,
|
||||
enum dma_data_direction direction)
|
||||
void
|
||||
pci_unmap_single(struct pci_dev *hwdev, dma_addr_t dma_addr,
|
||||
size_t size, int direction)
|
||||
|
||||
Unmaps the region previously mapped. All the parameters passed in
|
||||
must be identical to those passed in (and returned) by the mapping
|
||||
API.
|
||||
|
||||
dma_addr_t
|
||||
dma_map_page(struct device *dev, struct page *page,
|
||||
unsigned long offset, size_t size,
|
||||
enum dma_data_direction direction)
|
||||
dma_addr_t
|
||||
pci_map_page(struct pci_dev *hwdev, struct page *page,
|
||||
unsigned long offset, size_t size, int direction)
|
||||
void
|
||||
dma_unmap_page(struct device *dev, dma_addr_t dma_address, size_t size,
|
||||
enum dma_data_direction direction)
|
||||
void
|
||||
pci_unmap_page(struct pci_dev *hwdev, dma_addr_t dma_address,
|
||||
size_t size, int direction)
|
||||
|
||||
API for mapping and unmapping pages. All the notes and warnings
|
||||
for the other mapping APIs apply here. Also, although the <offset>
|
||||
and <size> parameters are provided to do partial page mapping, it is
|
||||
recommended that you never use these unless you really know what the
|
||||
cache width is.
|
||||
|
||||
int
|
||||
dma_mapping_error(dma_addr_t dma_addr)
|
||||
|
||||
int
|
||||
pci_dma_mapping_error(dma_addr_t dma_addr)
|
||||
|
||||
In some circumstances dma_map_single and dma_map_page will fail to create
|
||||
a mapping. A driver can check for these errors by testing the returned
|
||||
dma address with dma_mapping_error(). A non zero return value means the mapping
|
||||
could not be created and the driver should take appropriate action (e.g.
|
||||
reduce current DMA mapping usage or delay and try again later).
|
||||
|
||||
int
|
||||
dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
|
||||
enum dma_data_direction direction)
|
||||
int
|
||||
pci_map_sg(struct pci_dev *hwdev, struct scatterlist *sg,
|
||||
int nents, int direction)
|
||||
|
||||
Maps a scatter gather list from the block layer.
|
||||
|
||||
Returns: the number of physical segments mapped (this may be shorter
|
||||
than <nents> passed in if the block layer determines that some
|
||||
elements of the scatter/gather list are physically adjacent and thus
|
||||
may be mapped with a single entry).
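The reason the returned count can be smaller than <nents> can be seen in a toy model. The sketch below is userspace code; struct seg is a hypothetical stand-in for a scatterlist entry, and it only models merging of physically adjacent elements (a platform with an IOMMU may merge in other ways).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one scatter/gather element. */
struct seg {
	uint64_t phys_addr;
	size_t len;
};

/*
 * Count how many physical segments remain once elements that start exactly
 * where their predecessor ends are merged into it.
 */
static int count_merged_segments(const struct seg *sg, int nents)
{
	int i, mapped = 0;

	for (i = 0; i < nents; i++) {
		/* A new segment starts unless this element is adjacent. */
		if (i == 0 ||
		    sg[i - 1].phys_addr + sg[i - 1].len != sg[i].phys_addr)
			mapped++;
	}
	return mapped;
}
```

Three elements at 0x1000 (4KB), 0x2000 (4KB) and 0x8000 (2KB) would come back as two segments in this model, because the first two are adjacent.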
|
||||
|
||||
Please note that the sg cannot be mapped again if it has been mapped once.
|
||||
The mapping process is allowed to destroy information in the sg.
|
||||
|
||||
As with the other mapping interfaces, dma_map_sg can fail. When it
|
||||
does, 0 is returned and a driver must take appropriate action. It is
|
||||
critical that the driver do something; in the case of a block driver,
|
||||
aborting the request or even oopsing is better than doing nothing and
|
||||
corrupting the filesystem.
|
||||
|
||||
void
|
||||
dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nhwentries,
|
||||
enum dma_data_direction direction)
|
||||
void
|
||||
pci_unmap_sg(struct pci_dev *hwdev, struct scatterlist *sg,
|
||||
int nents, int direction)
|
||||
|
||||
Unmap the previously mapped scatter/gather list. All the parameters
|
||||
must be the same as those passed in to the scatter/gather mapping
|
||||
API.
|
||||
|
||||
Note: <nents> must be the number you passed in, *not* the number of
|
||||
physical entries returned.
|
||||
|
||||
void
|
||||
dma_sync_single(struct device *dev, dma_addr_t dma_handle, size_t size,
|
||||
enum dma_data_direction direction)
|
||||
void
|
||||
pci_dma_sync_single(struct pci_dev *hwdev, dma_addr_t dma_handle,
|
||||
size_t size, int direction)
|
||||
void
|
||||
dma_sync_sg(struct device *dev, struct scatterlist *sg, int nelems,
|
||||
enum dma_data_direction direction)
|
||||
void
|
||||
pci_dma_sync_sg(struct pci_dev *hwdev, struct scatterlist *sg,
|
||||
int nelems, int direction)
|
||||
|
||||
Synchronise a single contiguous or scatter/gather mapping. All the
|
||||
parameters must be the same as those passed into the single mapping
|
||||
API.
|
||||
|
||||
Notes: You must do this:
|
||||
|
||||
- Before reading values that have been written by DMA from the device
|
||||
(use the DMA_FROM_DEVICE direction)
|
||||
- After writing values that will be written to the device using DMA
|
||||
(use the DMA_TO_DEVICE direction)
|
||||
- Before *and* after handing memory to the device if the memory is
|
||||
DMA_BIDIRECTIONAL
|
||||
|
||||
See also dma_map_single().
|
||||
|
||||
|
||||
Part II - Advanced dma_ usage
|
||||
-----------------------------
|
||||
|
||||
Warning: These pieces of the DMA API have no PCI equivalent. They
|
||||
should also not be used in the majority of cases, since they cater for
|
||||
unlikely corner cases that don't belong in usual drivers.
|
||||
|
||||
If you don't understand how cache line coherency works between a
|
||||
processor and an I/O device, you should not be using this part of the
|
||||
API at all.
|
||||
|
||||
void *
|
||||
dma_alloc_noncoherent(struct device *dev, size_t size,
|
||||
dma_addr_t *dma_handle, int flag)
|
||||
|
||||
Identical to dma_alloc_coherent() except that the platform will
|
||||
choose to return either consistent or non-consistent memory as it sees
|
||||
fit. By using this API, you are guaranteeing to the platform that you
|
||||
have all the correct and necessary sync points for this memory in the
|
||||
driver should it choose to return non-consistent memory.
|
||||
|
||||
Note: where the platform can return consistent memory, it will
|
||||
guarantee that the sync points become nops.
|
||||
|
||||
Warning: Handling non-consistent memory is a real pain. You should
|
||||
only ever use this API if you positively know your driver will be
|
||||
required to work on one of the rare (usually non-PCI) architectures
|
||||
that simply cannot make consistent memory.
|
||||
|
||||
void
|
||||
dma_free_noncoherent(struct device *dev, size_t size, void *cpu_addr,
|
||||
dma_addr_t dma_handle)
|
||||
|
||||
Free memory allocated by the nonconsistent API. All parameters must
be identical to those passed in to (and returned by)
dma_alloc_noncoherent().
|
||||
|
||||
int
|
||||
dma_is_consistent(dma_addr_t dma_handle)
|
||||
|
||||
Returns true if the memory pointed to by the dma_handle is actually
|
||||
consistent.
|
||||
|
||||
int
|
||||
dma_get_cache_alignment(void)
|
||||
|
||||
Returns the processor cache alignment. This is the absolute minimum
|
||||
alignment *and* width that you must observe when either mapping
|
||||
memory or doing partial flushes.
|
||||
|
||||
Notes: This API may return a number *larger* than the actual cache
|
||||
line, but it will guarantee that one or more cache lines fit exactly
|
||||
into the width returned by this call. It will also always be a power
|
||||
of two for easy alignment.
|
||||
|
||||
void
|
||||
dma_sync_single_range(struct device *dev, dma_addr_t dma_handle,
|
||||
unsigned long offset, size_t size,
|
||||
enum dma_data_direction direction)
|
||||
|
||||
Does a partial sync, starting at offset and continuing for size. You
|
||||
must be careful to observe the cache alignment and width when doing
|
||||
anything like this. You must also be extra careful about accessing
|
||||
memory you intend to sync partially.
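One way to observe the alignment and width is to widen the range you want to sync so that it covers whole alignment units. The following is a userspace sketch under stated assumptions: the struct and function names are hypothetical, and 'align' stands for a power-of-two value such as the one dma_get_cache_alignment() is described as returning.

```c
#include <stddef.h>

/* Hypothetical description of a partial sync request. */
struct sync_range {
	unsigned long offset;
	size_t size;
};

/*
 * Widen [offset, offset + size) so that it starts and ends on a multiple
 * of 'align' (assumed to be a power of two).
 */
static struct sync_range widen_to_alignment(unsigned long offset, size_t size,
					    unsigned long align)
{
	unsigned long start = offset & ~(align - 1);
	unsigned long end = (offset + size + align - 1) & ~(align - 1);
	struct sync_range r = { start, end - start };

	return r;
}
```

Whether the widened range is safe to sync still depends on what else lives in those cache lines, which is why the text asks for extra care with partially synced memory.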
|
||||
|
||||
void
|
||||
dma_cache_sync(void *vaddr, size_t size,
|
||||
enum dma_data_direction direction)
|
||||
|
||||
Do a partial sync of memory that was allocated by
|
||||
dma_alloc_noncoherent(), starting at virtual address vaddr and
|
||||
continuing on for size. Again, you *must* observe the cache line
|
||||
boundaries when doing this.
|
||||
|
||||
int
|
||||
dma_declare_coherent_memory(struct device *dev, dma_addr_t bus_addr,
|
||||
dma_addr_t device_addr, size_t size, int flags)
|
||||
|
||||
|
||||
Declare region of memory to be handed out by dma_alloc_coherent when
|
||||
it's asked for coherent memory for this device.
|
||||
|
||||
bus_addr is the physical address to which the memory is currently
|
||||
assigned in the bus responding region (this will be used by the
|
||||
platform to perform the mapping)
|
||||
|
||||
device_addr is the physical address the device needs to be programmed
|
||||
with actually to address this memory (this will be handed out as the
|
||||
dma_addr_t in dma_alloc_coherent())
|
||||
|
||||
size is the size of the area (must be a multiple of PAGE_SIZE).
|
||||
|
||||
flags can be or'd together and are:
|
||||
|
||||
DMA_MEMORY_MAP - request that the memory returned from
|
||||
dma_alloc_coherent() be directly writeable.
|
||||
|
||||
DMA_MEMORY_IO - request that the memory returned from
|
||||
dma_alloc_coherent() be addressable using read/write/memcpy_toio etc.
|
||||
|
||||
One or both of these flags must be present.
|
||||
|
||||
DMA_MEMORY_INCLUDES_CHILDREN - make the declared memory be allocated by
|
||||
dma_alloc_coherent of any child devices of this one (for memory residing
|
||||
on a bridge).
|
||||
|
||||
DMA_MEMORY_EXCLUSIVE - only allocate memory from the declared regions.
|
||||
Do not allow dma_alloc_coherent() to fall back to system memory when
|
||||
it's out of memory in the declared region.
|
||||
|
||||
The return value will be either DMA_MEMORY_MAP or DMA_MEMORY_IO and
|
||||
must correspond to a passed in flag (i.e. no returning DMA_MEMORY_IO
|
||||
if only DMA_MEMORY_MAP were passed in) for success or zero for
|
||||
failure.
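The rules for the flags and the return value can be summarised as two small checks. This is a userspace sketch only: the SKETCH_ constants use illustrative values chosen for this example, and both helper names are hypothetical. A real driver must use the kernel's own DMA_MEMORY_ definitions.

```c
#include <stdbool.h>

/* Illustrative values only; not taken from the kernel headers. */
enum {
	SKETCH_DMA_MEMORY_MAP               = 0x01,
	SKETCH_DMA_MEMORY_IO                = 0x02,
	SKETCH_DMA_MEMORY_INCLUDES_CHILDREN = 0x04,
	SKETCH_DMA_MEMORY_EXCLUSIVE         = 0x08,
};

/* The caller must request at least one of MAP or IO. */
static bool declare_flags_valid(int flags)
{
	return (flags & (SKETCH_DMA_MEMORY_MAP | SKETCH_DMA_MEMORY_IO)) != 0;
}

/*
 * A return value is acceptable if it is zero (failure) or exactly one of
 * MAP / IO, and that one was actually requested by the caller.
 */
static bool declare_result_valid(int flags, int ret)
{
	if (ret == 0)
		return true;
	if (ret != SKETCH_DMA_MEMORY_MAP && ret != SKETCH_DMA_MEMORY_IO)
		return false;
	return (flags & ret) != 0;
}
```

In this model a caller that passed only the MAP flag can never see an IO result, which mirrors the guarantee stated above.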
|
||||
|
||||
Note, for DMA_MEMORY_IO returns, all subsequent memory returned by
|
||||
dma_alloc_coherent() may no longer be accessed directly, but instead
|
||||
must be accessed using the correct bus functions. If your driver
|
||||
isn't prepared to handle this contingency, it should not specify
|
||||
DMA_MEMORY_IO in the input flags.
|
||||
|
||||
As a simplification for the platforms, only *one* such region of
|
||||
memory may be declared per device.
|
||||
|
||||
For reasons of efficiency, most platforms choose to track the declared
|
||||
region only at the granularity of a page. For smaller allocations,
|
||||
you should use the dma_pool API.
|
||||
|
||||
void
|
||||
dma_release_declared_memory(struct device *dev)
|
||||
|
||||
Remove the memory region previously declared from the system. This
|
||||
API performs *no* in-use checking for this region and will return
|
||||
unconditionally having removed all the required structures. It is the
|
||||
driver's job to ensure that no parts of this memory region are
|
||||
currently in use.
|
||||
|
||||
void *
|
||||
dma_mark_declared_memory_occupied(struct device *dev,
|
||||
dma_addr_t device_addr, size_t size)
|
||||
|
||||
This is used to occupy specific regions of the declared space
|
||||
(dma_alloc_coherent() will hand out the first free region it finds).
|
||||
|
||||
device_addr is the *device* address of the region requested.
|
||||
|
||||
size is the size (and should be a multiple of the page size).
|
||||
|
||||
The return value will be either a pointer to the processor virtual
|
||||
address of the memory, or an error (via PTR_ERR()) if any part of the
|
||||
region is occupied.
|
||||
|
||||
|
881
Documentation/DMA-mapping.txt
Normal file
|
@ -0,0 +1,881 @@
|
|||
Dynamic DMA mapping
|
||||
===================
|
||||
|
||||
David S. Miller <davem@redhat.com>
|
||||
Richard Henderson <rth@cygnus.com>
|
||||
Jakub Jelinek <jakub@redhat.com>
|
||||
|
||||
This document describes the DMA mapping system in terms of the pci_
|
||||
API. For a similar API that works for generic devices, see
|
||||
DMA-API.txt.
|
||||
|
||||
Most of the 64bit platforms have special hardware that translates bus
|
||||
addresses (DMA addresses) into physical addresses. This is similar to
|
||||
how page tables and/or a TLB translates virtual addresses to physical
|
||||
addresses on a CPU. This is needed so that e.g. PCI devices can
|
||||
access with a Single Address Cycle (32bit DMA address) any page in the
|
||||
64bit physical address space. Previously in Linux those 64bit
|
||||
platforms had to set artificial limits on the maximum RAM size in the
|
||||
system, so that the virt_to_bus() static scheme works (the DMA address
|
||||
translation tables were simply filled on bootup to map each bus
|
||||
address to the physical page __pa(bus_to_virt())).
|
||||
|
||||
So that Linux can use the dynamic DMA mapping, it needs some help from the
|
||||
drivers, namely it has to take into account that DMA addresses should be
|
||||
mapped only for the time they are actually used and unmapped after the DMA
|
||||
transfer.
|
||||
|
||||
The following API will work of course even on platforms where no such
|
||||
hardware exists, see e.g. include/asm-i386/pci.h for how it is implemented on
|
||||
top of the virt_to_bus interface.
|
||||
|
||||
First of all, you should make sure
|
||||
|
||||
#include <linux/pci.h>
|
||||
|
||||
is in your driver. This file provides the definition of the
|
||||
dma_addr_t (which can hold any valid DMA address for the platform)
|
||||
type which should be used everywhere you hold a DMA (bus) address
|
||||
returned from the DMA mapping functions.
|
||||
|
||||
What memory is DMA'able?
|
||||
|
||||
The first piece of information you must know is what kernel memory can
|
||||
be used with the DMA mapping facilities. There has been an unwritten
|
||||
set of rules regarding this, and this text is an attempt to finally
|
||||
write them down.
|
||||
|
||||
If you acquired your memory via the page allocator
|
||||
(i.e. __get_free_page*()) or the generic memory allocators
|
||||
(i.e. kmalloc() or kmem_cache_alloc()) then you may DMA to/from
|
||||
that memory using the addresses returned from those routines.
|
||||
|
||||
This means specifically that you may _not_ use the memory/addresses
|
||||
returned from vmalloc() for DMA. It is possible to DMA to the
|
||||
_underlying_ memory mapped into a vmalloc() area, but this requires
|
||||
walking page tables to get the physical addresses, and then
|
||||
translating each of those pages back to a kernel address using
|
||||
something like __va(). [ EDIT: Update this when we integrate
|
||||
Gerd Knorr's generic code which does this. ]
|
||||
|
||||
This rule also means that you may not use kernel image addresses
|
||||
(i.e. items in the kernel's data/text/bss segment, or your driver's)
|
||||
nor may you use kernel stack addresses for DMA. Both of these items
|
||||
might be mapped somewhere entirely different than the rest of physical
|
||||
memory.
|
||||
|
||||
Also, this means that you cannot take the return of a kmap()
|
||||
call and DMA to/from that. This is similar to vmalloc().
|
||||
|
||||
What about block I/O and networking buffers? The block I/O and
|
||||
networking subsystems make sure that the buffers they use are valid
|
||||
for you to DMA from/to.
|
||||
|
||||
DMA addressing limitations
|
||||
|
||||
Does your device have any DMA addressing limitations? For example, is
|
||||
your device only capable of driving the low order 24-bits of address
|
||||
on the PCI bus for SAC DMA transfers? If so, you need to inform the
|
||||
PCI layer of this fact.
|
||||
|
||||
By default, the kernel assumes that your device can address the full
|
||||
32-bits in a SAC cycle. For a 64-bit DAC capable device, this needs
|
||||
to be increased. And for a device with limitations, as discussed in
|
||||
the previous paragraph, it needs to be decreased.
|
||||
|
||||
pci_alloc_consistent() by default will return 32-bit DMA addresses.
|
||||
PCI-X specification requires PCI-X devices to support 64-bit
|
||||
addressing (DAC) for all transactions. And at least one platform (SGI
|
||||
SN2) requires 64-bit consistent allocations to operate correctly when
|
||||
the IO bus is in PCI-X mode. Therefore, like with pci_set_dma_mask(),
|
||||
it's good practice to call pci_set_consistent_dma_mask() to set the
|
||||
appropriate mask even if your device only supports 32-bit DMA
|
||||
(default) and especially if it's a PCI-X device.
|
||||
|
||||
For correct operation, you must interrogate the PCI layer in your
|
||||
device probe routine to see if the PCI controller on the machine can
|
||||
properly support the DMA addressing limitation your device has. It is
|
||||
good style to do this even if your device holds the default setting,
|
||||
because this shows that you did think about these issues with respect to your
|
||||
device.
|
||||
|
||||
The query is performed via a call to pci_set_dma_mask():
|
||||
|
||||
int pci_set_dma_mask(struct pci_dev *pdev, u64 device_mask);
|
||||
|
||||
The query for consistent allocations is performed via a call to
|
||||
pci_set_consistent_dma_mask():
|
||||
|
||||
int pci_set_consistent_dma_mask(struct pci_dev *pdev, u64 device_mask);
|
||||
|
||||
Here, pdev is a pointer to the PCI device struct of your device, and
|
||||
device_mask is a bit mask describing which bits of a PCI address your
|
||||
device supports. It returns zero if your card can perform DMA
|
||||
properly on the machine given the address mask you provided.
|
||||
|
||||
If it returns non-zero, your device cannot perform DMA properly on
|
||||
this platform, and attempting to do so will result in undefined
|
||||
behavior. You must either use a different mask, or not use DMA.
|
||||
|
||||
This means that in the failure case, you have three options:
|
||||
|
||||
1) Use another DMA mask, if possible (see below).
|
||||
2) Use some non-DMA mode for data transfer, if possible.
|
||||
3) Ignore this device and do not initialize it.
|
||||
|
||||
It is recommended that your driver print a kernel KERN_WARNING message
|
||||
when you end up performing either #2 or #3. In this manner, if a user
|
||||
of your driver reports that performance is bad or that the device is not
|
||||
even detected, you can ask them for the kernel messages to find out
|
||||
exactly why.
|
||||
|
||||
The standard 32-bit addressing PCI device would do something like
|
||||
this:
|
||||
|
||||
if (pci_set_dma_mask(pdev, DMA_32BIT_MASK)) {
|
||||
printk(KERN_WARNING
|
||||
"mydev: No suitable DMA available.\n");
|
||||
goto ignore_this_device;
|
||||
}
|
||||
|
||||
Another common scenario is a 64-bit capable device. The approach
|
||||
here is to try for 64-bit DAC addressing, but back down to a
|
||||
32-bit mask should that fail. The PCI platform code may fail the
|
||||
64-bit mask not because the platform is not capable of 64-bit
|
||||
addressing. Rather, it may fail in this case simply because
|
||||
32-bit SAC addressing is done more efficiently than DAC addressing.
|
||||
Sparc64 is one platform which behaves in this way.
|
||||
|
||||
Here is how you would handle a 64-bit capable device which can drive
|
||||
all 64-bits when accessing streaming DMA:
|
||||
|
||||
int using_dac;
|
||||
|
||||
if (!pci_set_dma_mask(pdev, DMA_64BIT_MASK)) {
|
||||
using_dac = 1;
|
||||
} else if (!pci_set_dma_mask(pdev, DMA_32BIT_MASK)) {
|
||||
using_dac = 0;
|
||||
} else {
|
||||
printk(KERN_WARNING
|
||||
"mydev: No suitable DMA available.\n");
|
||||
goto ignore_this_device;
|
||||
}
|
||||
|
||||
If a card is capable of using 64-bit consistent allocations as well,
|
||||
the case would look like this:
|
||||
|
||||
int using_dac, consistent_using_dac;
|
||||
|
||||
if (!pci_set_dma_mask(pdev, DMA_64BIT_MASK)) {
|
||||
using_dac = 1;
|
||||
consistent_using_dac = 1;
|
||||
pci_set_consistent_dma_mask(pdev, DMA_64BIT_MASK);
|
||||
} else if (!pci_set_dma_mask(pdev, DMA_32BIT_MASK)) {
|
||||
using_dac = 0;
|
||||
consistent_using_dac = 0;
|
||||
pci_set_consistent_dma_mask(pdev, DMA_32BIT_MASK);
|
||||
} else {
|
||||
printk(KERN_WARNING
|
||||
"mydev: No suitable DMA available.\n");
|
||||
goto ignore_this_device;
|
||||
}
|
||||
|
||||
pci_set_consistent_dma_mask() will always be able to set the same or a
|
||||
smaller mask as pci_set_dma_mask(). However for the rare case that a
|
||||
device driver only uses consistent allocations, one would have to
|
||||
check the return value from pci_set_consistent_dma_mask().
|
||||
|
||||
If your 64-bit device is going to be an enormous consumer of DMA
|
||||
mappings, this can be problematic since the DMA mappings are a
|
||||
finite resource on many platforms. Please see the "DAC Addressing
|
||||
for Address Space Hungry Devices" section near the end of this
|
||||
document for how to handle this case.
|
||||
|
||||
Finally, if your device can only drive the low 24-bits of
|
||||
address during PCI bus mastering you might do something like:
|
||||
|
||||
if (pci_set_dma_mask(pdev, 0x00ffffff)) {
|
||||
printk(KERN_WARNING
|
||||
"mydev: 24-bit DMA addressing not available.\n");
|
||||
goto ignore_this_device;
|
||||
}
|
||||
|
||||
When pci_set_dma_mask() is successful, and returns zero, the PCI layer
|
||||
saves away this mask you have provided. The PCI layer will use this
|
||||
information later when you make DMA mappings.
|
||||
|
||||
There is one case we are aware of that is worth
mentioning in this documentation. If your device supports multiple
|
||||
functions (for example a sound card provides playback and record
|
||||
functions) and the various different functions have _different_
|
||||
DMA addressing limitations, you may wish to probe each mask and
|
||||
only provide the functionality which the machine can handle. It
|
||||
is important that the last call to pci_set_dma_mask() be for the
|
||||
most specific mask.
|
||||
|
||||
Here is pseudo-code showing how this might be done:
|
||||
|
||||
#define PLAYBACK_ADDRESS_BITS DMA_32BIT_MASK
|
||||
#define RECORD_ADDRESS_BITS 0x00ffffff
|
||||
|
||||
struct my_sound_card *card;
|
||||
struct pci_dev *pdev;
|
||||
|
||||
...
|
||||
if (!pci_set_dma_mask(pdev, PLAYBACK_ADDRESS_BITS)) {
|
||||
card->playback_enabled = 1;
|
||||
} else {
|
||||
card->playback_enabled = 0;
|
||||
printk(KERN_WARNING "%s: Playback disabled due to DMA limitations.\n",
|
||||
card->name);
|
||||
}
|
||||
if (!pci_set_dma_mask(pdev, RECORD_ADDRESS_BITS)) {
|
||||
card->record_enabled = 1;
|
||||
} else {
|
||||
card->record_enabled = 0;
|
||||
printk(KERN_WARNING "%s: Record disabled due to DMA limitations.\n",
|
||||
card->name);
|
||||
}
|
||||
|
||||
A sound card was used as an example here because this genre of PCI
|
||||
devices seems to be littered with ISA chips given a PCI front end,
|
||||
and thus retaining the 16MB DMA addressing limitations of ISA.
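
To make the arithmetic behind these masks concrete, here is a small
user-space sketch (not kernel code). It only models the idea that a
mask describes which address bits a device can drive. The names
toy_addr_fits_mask() and toy_mask_bytes() and the TOY_* constants are
hypothetical and exist purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants; the values mirror the masks used in the text. */
#define TOY_DMA_24BIT_MASK 0x0000000000ffffffULL
#define TOY_DMA_32BIT_MASK 0x00000000ffffffffULL

/*
 * Return 1 if a device limited to `mask` can drive `addr`, i.e. no
 * address bit outside the mask is set, and 0 otherwise.
 */
static int toy_addr_fits_mask(uint64_t addr, uint64_t mask)
{
	return (addr & ~mask) == 0;
}

/* Number of bytes addressable under a contiguous low-bit mask. */
static uint64_t toy_mask_bytes(uint64_t mask)
{
	assert(mask != UINT64_MAX);	/* mask + 1 would overflow */
	return mask + 1;
}
```

Under this model toy_mask_bytes(TOY_DMA_24BIT_MASK) is 16MB, which is
the ISA limit mentioned above, and the first address that fails the
24-bit check is 0x01000000.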
|
||||
|
||||
Types of DMA mappings
|
||||
|
||||
There are two types of DMA mappings:
|
||||
|
||||
- Consistent DMA mappings which are usually mapped at driver
|
||||
initialization, unmapped at the end and for which the hardware should
|
||||
guarantee that the device and the CPU can access the data
|
||||
in parallel and will see updates made by each other without any
|
||||
explicit software flushing.
|
||||
|
||||
Think of "consistent" as "synchronous" or "coherent".
|
||||
|
||||
The current default is to return consistent memory in the low 32
|
||||
bits of the PCI bus space. However, for future compatibility you
|
||||
should set the consistent mask even if this default is fine for your
|
||||
driver.
|
||||
|
||||
Good examples of what to use consistent mappings for are:
|
||||
|
||||
- Network card DMA ring descriptors.
|
||||
- SCSI adapter mailbox command data structures.
|
||||
- Device firmware microcode executed out of
|
||||
main memory.
|
||||
|
||||
The invariant these examples all require is that any CPU store
|
||||
to memory is immediately visible to the device, and vice
|
||||
versa. Consistent mappings guarantee this.
|
||||
|
||||
IMPORTANT: Consistent DMA memory does not preclude the usage of
|
||||
proper memory barriers. The CPU may reorder stores to
|
||||
consistent memory just as it may normal memory. Example:
|
||||
if it is important for the device to see the first word
|
||||
of a descriptor updated before the second, you must do
|
||||
something like:
|
||||
|
||||
desc->word0 = address;
|
||||
wmb();
|
||||
desc->word1 = DESC_VALID;
|
||||
|
||||
in order to get correct behavior on all platforms.
|
||||
|
||||
- Streaming DMA mappings which are usually mapped for one DMA transfer,
|
||||
unmapped right after it (unless you use pci_dma_sync_* below) and for which
|
||||
hardware can optimize for sequential accesses.
|
||||
|
||||
Think of "streaming" as "asynchronous" or "outside the coherency
|
||||
domain".
|
||||
|
||||
Good examples of what to use streaming mappings for are:
|
||||
|
||||
- Networking buffers transmitted/received by a device.
|
||||
- Filesystem buffers written/read by a SCSI device.
|
||||
|
||||
The interfaces for using this type of mapping were designed in
|
||||
such a way that an implementation can make whatever performance
|
||||
optimizations the hardware allows. To this end, when using
|
||||
such mappings you must be explicit about what you want to happen.
|
||||
|
||||
Neither type of DMA mapping has alignment restrictions that come
|
||||
from PCI, although some devices may have such restrictions.
|
||||
|
||||
Using Consistent DMA mappings.
|
||||
|
||||
To allocate and map large (PAGE_SIZE or so) consistent DMA regions,
|
||||
you should do:
|
||||
|
||||
dma_addr_t dma_handle;
|
||||
|
||||
cpu_addr = pci_alloc_consistent(dev, size, &dma_handle);
|
||||
|
||||
where dev is a struct pci_dev *. You should pass NULL for PCI-like buses
|
||||
where devices don't have struct pci_dev (like ISA, EISA). This may be
|
||||
called in interrupt context.
|
||||
|
||||
This argument is needed because the DMA translations may be bus
specific (and are often private to the bus to which the device is
attached).
|
||||
|
||||
Size is the length of the region you want to allocate, in bytes.
|
||||
|
||||
This routine will allocate RAM for that region, so it acts similarly to
|
||||
__get_free_pages (but takes size instead of a page order). If your
|
||||
driver needs regions sized smaller than a page, you may prefer using
|
||||
the pci_pool interface, described below.
|
||||
|
||||
The consistent DMA mapping interfaces, for non-NULL dev, will by
|
||||
default return a DMA address which is SAC (Single Address Cycle)
|
||||
addressable. Even if the device indicates (via PCI dma mask) that it
|
||||
may address the upper 32-bits and thus perform DAC cycles, consistent
|
||||
allocation will only return > 32-bit PCI addresses for DMA if the
|
||||
consistent dma mask has been explicitly changed via
|
||||
pci_set_consistent_dma_mask(). This is true of the pci_pool interface
|
||||
as well.
|
||||
|
||||
pci_alloc_consistent returns two values: the virtual address which you
|
||||
can use to access it from the CPU and dma_handle which you pass to the
|
||||
card.
|
||||
|
||||
The cpu return address and the DMA bus master address are both
|
||||
guaranteed to be aligned to the smallest PAGE_SIZE order which
|
||||
is greater than or equal to the requested size. This invariant
|
||||
exists (for example) to guarantee that if you allocate a chunk
|
||||
which is smaller than or equal to 64 kilobytes, the extent of the
|
||||
buffer you receive will not cross a 64K boundary.
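
The alignment rule above is plain arithmetic, and the following
user-space sketch works through it. It assumes a 4096 byte page size,
and the helpers toy_alloc_order() and toy_crosses_boundary() are
hypothetical names used only for this illustration, not kernel
interfaces.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096ULL	/* assumed page size for this sketch */

/* Smallest order such that (TOY_PAGE_SIZE << order) >= size. */
static unsigned int toy_alloc_order(uint64_t size)
{
	unsigned int order = 0;

	assert(size > 0);
	while ((TOY_PAGE_SIZE << order) < size)
		order++;
	return order;
}

/* Return 1 if [addr, addr + len) straddles a multiple of `boundary`. */
static int toy_crosses_boundary(uint64_t addr, uint64_t len, uint64_t boundary)
{
	return (addr / boundary) != ((addr + len - 1) / boundary);
}
```

For example, a 40K request rounds up to order 4 (64K) under these
assumptions. A buffer aligned to 64K therefore stays inside one 64K
window, whereas the same length starting at an address that is only
page aligned could straddle two windows.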
|
||||
|
||||
To unmap and free such a DMA region, you call:
|
||||
|
||||
pci_free_consistent(dev, size, cpu_addr, dma_handle);
|
||||
|
||||
where dev, size are the same as in the above call and cpu_addr and
|
||||
dma_handle are the values pci_alloc_consistent returned to you.
|
||||
This function may not be called in interrupt context.
|
||||
|
||||
If your driver needs lots of smaller memory regions, you can write
|
||||
custom code to subdivide pages returned by pci_alloc_consistent,
|
||||
or you can use the pci_pool API to do that. A pci_pool is like
|
||||
a kmem_cache, but it uses pci_alloc_consistent not __get_free_pages.
|
||||
Also, it understands common hardware constraints for alignment,
|
||||
like queue heads needing to be aligned on N byte boundaries.
|
||||
|
||||
Create a pci_pool like this:
|
||||
|
||||
struct pci_pool *pool;
|
||||
|
||||
pool = pci_pool_create(name, dev, size, align, alloc);
|
||||
|
||||
The "name" is for diagnostics (like a kmem_cache name); dev and size
|
||||
are as above. The device's hardware alignment requirement for this
|
||||
type of data is "align" (which is expressed in bytes, and must be a
|
||||
power of two). If your device has no boundary crossing restrictions,
|
||||
pass 0 for alloc; passing 4096 says memory allocated from this pool
|
||||
must not cross 4KByte boundaries (but at that time it may be better to
|
||||
go for pci_alloc_consistent directly instead).
|
||||
|
||||
Allocate memory from a pci pool like this:
|
||||
|
||||
cpu_addr = pci_pool_alloc(pool, flags, &dma_handle);
|
||||
|
||||
flags are SLAB_KERNEL if blocking is permitted (not in_interrupt nor
|
||||
holding SMP locks), SLAB_ATOMIC otherwise. Like pci_alloc_consistent,
|
||||
this returns two values, cpu_addr and dma_handle.
|
||||
|
||||
Free memory that was allocated from a pci_pool like this:
|
||||
|
||||
pci_pool_free(pool, cpu_addr, dma_handle);
|
||||
|
||||
where pool is what you passed to pci_pool_alloc, and cpu_addr and
|
||||
dma_handle are the values pci_pool_alloc returned. This function
|
||||
may be called in interrupt context.
|
||||
|
||||
Destroy a pci_pool by calling:
|
||||
|
||||
pci_pool_destroy(pool);
|
||||
|
||||
Make sure you've called pci_pool_free for all memory allocated
|
||||
from a pool before you destroy the pool. This function may not
|
||||
be called in interrupt context.
|
||||
|
||||
DMA Direction
|
||||
|
||||
The interfaces described in subsequent portions of this document
|
||||
take a DMA direction argument, which is an integer and takes on
|
||||
one of the following values:
|
||||
|
||||
PCI_DMA_BIDIRECTIONAL
|
||||
PCI_DMA_TODEVICE
|
||||
PCI_DMA_FROMDEVICE
|
||||
PCI_DMA_NONE
|
||||
|
||||
You should provide the exact DMA direction if you know it.
|
||||
|
||||
PCI_DMA_TODEVICE means "from main memory to the PCI device"
|
||||
PCI_DMA_FROMDEVICE means "from the PCI device to main memory"
|
||||
It is the direction in which the data moves during the DMA
|
||||
transfer.
|
||||
|
||||
You are _strongly_ encouraged to specify this as precisely
|
||||
as you possibly can.
|
||||
|
||||
If you absolutely cannot know the direction of the DMA transfer,
|
||||
specify PCI_DMA_BIDIRECTIONAL. It means that the DMA can go in
|
||||
either direction. The platform guarantees that you may legally
|
||||
specify this, and that it will work, but possibly at the
cost of performance.
|
||||
|
||||
The value PCI_DMA_NONE is to be used for debugging. You can
hold this in a data structure before you come to know the
|
||||
precise direction, and this will help catch cases where your
|
||||
direction tracking logic has failed to set things up properly.
|
||||
|
||||
Another advantage of specifying this value precisely (outside of
|
||||
potential platform-specific optimizations of such) is for debugging.
|
||||
Some platforms actually have a write permission boolean which DMA
|
||||
mappings can be marked with, much like page protections in the user
|
||||
program address space. Such platforms can and do report errors in the
|
||||
kernel logs when the PCI controller hardware detects violation of the
|
||||
permission setting.
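
As a rough illustration of how a direction could translate into such
permissions, consider the user-space sketch below. The enum and the
two helpers are hypothetical stand-ins, not the kernel's PCI_DMA_*
definitions, and real platforms may implement the check differently
or not at all.

```c
#include <assert.h>

/* Hypothetical stand-ins for the PCI_DMA_* direction values. */
enum toy_dma_dir {
	TOY_DMA_BIDIRECTIONAL,
	TOY_DMA_TODEVICE,
	TOY_DMA_FROMDEVICE,
	TOY_DMA_NONE
};

/* May the device write to main memory through such a mapping? */
static int toy_device_may_write(enum toy_dma_dir dir)
{
	return dir == TOY_DMA_FROMDEVICE || dir == TOY_DMA_BIDIRECTIONAL;
}

/* May the device read from main memory through such a mapping? */
static int toy_device_may_read(enum toy_dma_dir dir)
{
	return dir == TOY_DMA_TODEVICE || dir == TOY_DMA_BIDIRECTIONAL;
}
```

In this model a mapping created with the "to device" direction is
read-only from the device's point of view, so a stray device write
into it is exactly the kind of violation such a platform could report.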
|
||||
|
||||
Only streaming mappings specify a direction; consistent mappings
|
||||
implicitly have a direction attribute setting of
|
||||
PCI_DMA_BIDIRECTIONAL.
|
||||
|
||||
The SCSI subsystem provides mechanisms for you to easily obtain
|
||||
the direction to use, in the SCSI command:
|
||||
|
||||
scsi_to_pci_dma_dir(SCSI_DIRECTION)
|
||||
|
||||
Where SCSI_DIRECTION is obtained from the 'sc_data_direction'
|
||||
member of the SCSI command your driver is working on. The
|
||||
mentioned interface above returns a value suitable for passing
|
||||
into the streaming DMA mapping interfaces below.
|
||||
|
||||
For Networking drivers, it's a rather simple affair. For transmit
|
||||
packets, map/unmap them with the PCI_DMA_TODEVICE direction
|
||||
specifier. For receive packets, just the opposite, map/unmap them
|
||||
with the PCI_DMA_FROMDEVICE direction specifier.
|
||||
|
||||
Using Streaming DMA mappings
|
||||
|
||||
The streaming DMA mapping routines can be called from interrupt
|
||||
context. There are two versions of each map/unmap, one which will
|
||||
map/unmap a single memory region, and one which will map/unmap a
|
||||
scatterlist.
|
||||
|
||||
To map a single region, you do:
|
||||
|
||||
	struct pci_dev *pdev = mydev->pdev;
	dma_addr_t dma_handle;
	void *addr = buffer->ptr;
	size_t size = buffer->len;

	dma_handle = pci_map_single(pdev, addr, size, direction);

and to unmap it:

	pci_unmap_single(pdev, dma_handle, size, direction);
|
||||
|
||||
You should call pci_unmap_single when the DMA activity is finished, e.g.
|
||||
from the interrupt which told you that the DMA transfer is done.
|
||||
|
||||
Using cpu pointers like this for single mappings has a disadvantage:
|
||||
you cannot reference HIGHMEM memory in this way. Thus, there is a
|
||||
map/unmap interface pair akin to pci_{map,unmap}_single. These
|
||||
interfaces deal with page/offset pairs instead of cpu pointers.
|
||||
Specifically:
|
||||
|
||||
	struct pci_dev *pdev = mydev->pdev;
	dma_addr_t dma_handle;
	struct page *page = buffer->page;
	unsigned long offset = buffer->offset;
	size_t size = buffer->len;

	dma_handle = pci_map_page(pdev, page, offset, size, direction);

	...

	pci_unmap_page(pdev, dma_handle, size, direction);
|
||||
|
||||
Here, "offset" means byte offset within the given page.
|
||||
|
||||
With scatterlists, you map a region gathered from several regions by:
|
||||
|
||||
int i, count = pci_map_sg(dev, sglist, nents, direction);
|
||||
struct scatterlist *sg;
|
||||
|
||||
for (i = 0, sg = sglist; i < count; i++, sg++) {
|
||||
hw_address[i] = sg_dma_address(sg);
|
||||
hw_len[i] = sg_dma_len(sg);
|
||||
}
|
||||
|
||||
where nents is the number of entries in the sglist.
|
||||
|
||||
The implementation is free to merge several consecutive sglist entries
|
||||
into one (e.g. if DMA mapping is done with PAGE_SIZE granularity, any
|
||||
consecutive sglist entries can be merged into one provided the first one
|
||||
ends and the second one starts on a page boundary - in fact this is a huge
|
||||
advantage for cards which either cannot do scatter-gather or have very
|
||||
limited number of scatter-gather entries) and returns the actual number
|
||||
of sg entries it mapped them to. On failure 0 is returned.
|
||||
|
||||
Then you should loop count times (note: this can be less than nents times)
|
||||
and use sg_dma_address() and sg_dma_len() macros where you previously
|
||||
accessed sg->address and sg->length as shown above.
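
To see why the returned count can be smaller than nents, the
following user-space sketch models a page-granular IOMMU that hands
out bus pages consecutively. It is only a toy: struct toy_sg,
toy_map_sg() and the 4096 byte page size are assumptions made for
this illustration and do not reflect any particular platform's
implementation.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096ULL	/* assumed mapping granularity */

/* Hypothetical, simplified stand-in for struct scatterlist. */
struct toy_sg {
	uint64_t cpu_addr;	/* input: start of the region */
	uint32_t length;	/* input: length in bytes */
	uint64_t dma_address;	/* output: valid for the first `count` entries */
	uint32_t dma_length;	/* output: valid for the first `count` entries */
};

/*
 * Bus pages are handed out consecutively starting at bus_base, which
 * must be page aligned.  An entry is merged into the previous DMA
 * segment when the previous entry ended on a page boundary and this
 * one starts on one.  Returns the number of DMA segments produced.
 */
static int toy_map_sg(struct toy_sg *sg, int nents, uint64_t bus_base)
{
	uint64_t next_bus = bus_base;
	int prev_end_aligned = 0;
	int i, count = 0;

	assert(bus_base % TOY_PAGE_SIZE == 0);

	for (i = 0; i < nents; i++) {
		uint64_t start = sg[i].cpu_addr;
		uint64_t end = start + sg[i].length;
		uint64_t first_page = start - (start % TOY_PAGE_SIZE);
		uint64_t npages = (end - first_page + TOY_PAGE_SIZE - 1) /
				  TOY_PAGE_SIZE;

		if (count > 0 && prev_end_aligned &&
		    start % TOY_PAGE_SIZE == 0) {
			/* Contiguous in bus space: extend the last segment. */
			sg[count - 1].dma_length += sg[i].length;
		} else {
			sg[count].dma_address = next_bus +
						(start % TOY_PAGE_SIZE);
			sg[count].dma_length = sg[i].length;
			count++;
		}
		next_bus += npages * TOY_PAGE_SIZE;
		prev_end_aligned = (end % TOY_PAGE_SIZE == 0);
	}
	return count;
}
```

With three input entries, of which the first two are whole pages and
the third starts in the middle of a page, this model returns a count
of two. The driver would then program two hardware descriptors even
though the list still has three entries.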
|
||||
|
||||
To unmap a scatterlist, just call:
|
||||
|
||||
pci_unmap_sg(dev, sglist, nents, direction);
|
||||
|
||||
Again, make sure DMA activity has already finished.
|
||||
|
||||
PLEASE NOTE: The 'nents' argument to the pci_unmap_sg call must be
|
||||
the _same_ one you passed into the pci_map_sg call,
|
||||
it should _NOT_ be the 'count' value _returned_ from the
|
||||
pci_map_sg call.
|
||||
|
||||
Every pci_map_{single,sg} call should have its pci_unmap_{single,sg}
|
||||
counterpart, because the bus address space is a shared resource (although
|
||||
in some ports the mapping is per bus, so fewer devices contend for the
|
||||
same bus address space) and you could render the machine unusable by eating
|
||||
all bus addresses.
|
||||
|
||||
If you need to use the same streaming DMA region multiple times and touch
|
||||
the data in between the DMA transfers, the buffer needs to be synced
|
||||
properly in order for the cpu and device to see the most up-to-date and
|
||||
correct copy of the DMA buffer.
|
||||
|
||||
So, firstly, just map it with pci_map_{single,sg}, and after each DMA
|
||||
transfer call either:
|
||||
|
||||
pci_dma_sync_single_for_cpu(dev, dma_handle, size, direction);
|
||||
|
||||
or:
|
||||
|
||||
pci_dma_sync_sg_for_cpu(dev, sglist, nents, direction);
|
||||
|
||||
as appropriate.
|
||||
|
||||
Then, if you wish to let the device get at the DMA area again,
|
||||
finish accessing the data with the cpu, and then before actually
|
||||
giving the buffer to the hardware call either:
|
||||
|
||||
pci_dma_sync_single_for_device(dev, dma_handle, size, direction);
|
||||
|
||||
or:
|
||||
|
||||
pci_dma_sync_sg_for_device(dev, sglist, nents, direction);
|
||||
|
||||
as appropriate.
|
||||
|
||||
After the last DMA transfer call one of the DMA unmap routines
|
||||
pci_unmap_{single,sg}. If you don't touch the data from the first pci_map_*
|
||||
call till pci_unmap_*, then you don't have to call the pci_dma_sync_*
|
||||
routines at all.
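
One way to keep these rules straight is to think of a streaming
buffer as being owned by either the cpu or the device at any moment.
The user-space sketch below models only that bookkeeping. The toy_*
names are hypothetical, and nothing here touches real hardware or the
kernel API.

```c
#include <assert.h>

enum toy_owner { TOY_OWNER_CPU, TOY_OWNER_DEVICE };

/* Hypothetical bookkeeping for one streaming DMA buffer. */
struct toy_stream_buf {
	enum toy_owner owner;
	int mapped;
};

/* Mapping hands the buffer to the device. */
static void toy_map(struct toy_stream_buf *b)
{
	b->mapped = 1;
	b->owner = TOY_OWNER_DEVICE;
}

/* After a transfer, sync_for_cpu hands the buffer back to the cpu. */
static void toy_sync_for_cpu(struct toy_stream_buf *b)
{
	assert(b->mapped);
	b->owner = TOY_OWNER_CPU;
}

/* Before the next transfer, sync_for_device returns it to the device. */
static void toy_sync_for_device(struct toy_stream_buf *b)
{
	assert(b->mapped);
	b->owner = TOY_OWNER_DEVICE;
}

/* Unmapping ends the mapping; the cpu owns the memory again. */
static void toy_unmap(struct toy_stream_buf *b)
{
	assert(b->mapped);
	b->mapped = 0;
	b->owner = TOY_OWNER_CPU;
}

/* The cpu may only look at the data while it owns the buffer. */
static int toy_cpu_may_touch(const struct toy_stream_buf *b)
{
	return b->owner == TOY_OWNER_CPU;
}
```

In this model a driver that never needs the cpu to look at the data
between map and unmap goes straight from toy_map() to toy_unmap(),
which corresponds to not calling the sync routines at all.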
|
||||
|
||||
Here is pseudo code which shows a situation in which you would need
|
||||
to use the pci_dma_sync_*() interfaces.
|
||||
|
||||
my_card_setup_receive_buffer(struct my_card *cp, char *buffer, int len)
|
||||
{
|
||||
dma_addr_t mapping;
|
||||
|
||||
mapping = pci_map_single(cp->pdev, buffer, len, PCI_DMA_FROMDEVICE);
|
||||
|
||||
cp->rx_buf = buffer;
|
||||
cp->rx_len = len;
|
||||
cp->rx_dma = mapping;
|
||||
|
||||
give_rx_buf_to_card(cp);
|
||||
}
|
||||
|
||||
...
|
||||
|
||||
my_card_interrupt_handler(int irq, void *devid, struct pt_regs *regs)
|
||||
{
|
||||
struct my_card *cp = devid;
|
||||
|
||||
...
|
||||
if (read_card_status(cp) == RX_BUF_TRANSFERRED) {
|
||||
struct my_card_header *hp;
|
||||
|
||||
/* Examine the header to see if we wish
|
||||
* to accept the data. But synchronize
|
||||
* the DMA transfer with the CPU first
|
||||
* so that we see updated contents.
|
||||
*/
|
||||
pci_dma_sync_single_for_cpu(cp->pdev, cp->rx_dma,
|
||||
cp->rx_len,
|
||||
PCI_DMA_FROMDEVICE);
|
||||
|
||||
/* Now it is safe to examine the buffer. */
|
||||
hp = (struct my_card_header *) cp->rx_buf;
|
||||
if (header_is_ok(hp)) {
|
||||
pci_unmap_single(cp->pdev, cp->rx_dma, cp->rx_len,
|
||||
PCI_DMA_FROMDEVICE);
|
||||
pass_to_upper_layers(cp->rx_buf);
|
||||
make_and_setup_new_rx_buf(cp);
|
||||
} else {
|
||||
/* Just sync the buffer and give it back
|
||||
* to the card.
|
||||
*/
|
||||
pci_dma_sync_single_for_device(cp->pdev,
|
||||
cp->rx_dma,
|
||||
cp->rx_len,
|
||||
PCI_DMA_FROMDEVICE);
|
||||
give_rx_buf_to_card(cp);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
Drivers converted fully to this interface should not use virt_to_bus any
|
||||
longer, nor should they use bus_to_virt. Some drivers have to be changed a
|
||||
little bit, because there is no longer an equivalent to bus_to_virt in the
|
||||
dynamic DMA mapping scheme - you have to always store the DMA addresses
|
||||
returned by the pci_alloc_consistent, pci_pool_alloc, and pci_map_single
|
||||
calls (pci_map_sg stores them in the scatterlist itself if the platform
|
||||
supports dynamic DMA mapping in hardware) in your driver structures and/or
|
||||
in the card registers.
|
||||
|
||||
All PCI drivers should be using these interfaces with no exceptions.
|
||||
It is planned to completely remove virt_to_bus() and bus_to_virt() as
|
||||
they are entirely deprecated. Some ports already do not provide these
|
||||
as it is impossible to correctly support them.
|
||||
|
||||
64-bit DMA and DAC cycle support
|
||||
|
||||
Do you understand all of the text above? Great, then you already
|
||||
know how to use 64-bit DMA addressing under Linux. Simply make
|
||||
the appropriate pci_set_dma_mask() calls based upon your card's
|
||||
capabilities, then use the mapping APIs above.
|
||||
|
||||
It is that simple.
|
||||
|
||||
Well, not for some odd devices. See the next section for information
|
||||
about that.
|
||||
|
||||
DAC Addressing for Address Space Hungry Devices
|
||||
|
||||
There exists a class of devices which do not mesh well with the PCI
|
||||
DMA mapping API. By definition these "mappings" are a finite
|
||||
resource. The number of total available mappings per bus is platform
|
||||
specific, but there will always be a reasonable amount.
|
||||
|
||||
What is "reasonable"? Reasonable means that networking and block I/O
|
||||
devices need not worry about using too many mappings.
|
||||
|
||||
As an example of a problematic device, consider compute cluster cards.
|
||||
They can potentially need to access gigabytes of memory at once via
|
||||
DMA. Dynamic mappings are unsuitable for this kind of access pattern.
|
||||
|
||||
To this end we've provided a small API by which a device driver
|
||||
may use DAC cycles to directly address all of physical memory.
|
||||
Not all platforms support this, but most do. It is easy to determine
|
||||
whether the platform will work properly at probe time.
|
||||
|
||||
First, understand that there may be a SEVERE performance penalty for
|
||||
using these interfaces on some platforms. Therefore, you MUST only
|
||||
use these interfaces if it is absolutely required. 99% of devices can
|
||||
use the normal APIs without any problems.
|
||||
|
||||
Note that for streaming type mappings you must either use these
|
||||
interfaces, or the dynamic mapping interfaces above. You may not mix
|
||||
usage of both for the same device. Such an act is illegal and is
|
||||
guaranteed to put a banana in your tailpipe.
|
||||
|
||||
However, consistent mappings may in fact be used in conjunction with
|
||||
these interfaces. Remember that, as defined, consistent mappings are
|
||||
always going to be SAC addressable.
|
||||
|
||||
The first thing your driver needs to do is query the PCI platform
|
||||
layer with your device's DAC addressing capabilities:
|
||||
|
||||
int pci_dac_set_dma_mask(struct pci_dev *pdev, u64 mask);
|
||||
|
||||
This routine behaves identically to pci_set_dma_mask. You may not
|
||||
use the following interfaces if this routine fails.
|
||||
|
||||
Next, DMA addresses using this API are kept track of using the
|
||||
dma64_addr_t type. It is guaranteed to be big enough to hold any
|
||||
DAC address the platform layer will give to you from the following
|
||||
routines. If you have consistent mappings as well, you still
|
||||
use plain dma_addr_t to keep track of those.
|
||||
|
||||
All mappings obtained here will be direct. The mappings are not
|
||||
translated, and this is the purpose of this dialect of the DMA API.
|
||||
|
||||
All routines work with page/offset pairs. This is the _ONLY_ way to
|
||||
portably refer to any piece of memory. If you have a cpu pointer
|
||||
(which may be validly DMA'd too) you may easily obtain the page
|
||||
and offset using something like this:
|
||||
|
||||
struct page *page = virt_to_page(ptr);
|
||||
unsigned long offset = offset_in_page(ptr);
|
||||
|
||||
Here are the interfaces:
|
||||
|
||||
dma64_addr_t pci_dac_page_to_dma(struct pci_dev *pdev,
|
||||
struct page *page,
|
||||
unsigned long offset,
|
||||
int direction);
|
||||
|
||||
The DAC address for the tuple PAGE/OFFSET is returned. The direction
|
||||
argument is the same as for pci_{map,unmap}_single(). The same rules
|
||||
for cpu/device access apply here as for the streaming mapping
|
||||
interfaces. To reiterate:
|
||||
|
||||
The cpu may touch the buffer before pci_dac_page_to_dma.
|
||||
The device may touch the buffer after pci_dac_page_to_dma
|
||||
is made, but the cpu may NOT.
|
||||
|
||||
When the DMA transfer is complete, invoke:
|
||||
|
||||
void pci_dac_dma_sync_single_for_cpu(struct pci_dev *pdev,
|
||||
dma64_addr_t dma_addr,
|
||||
size_t len, int direction);
|
||||
|
||||
This must be done before the CPU looks at the buffer again.
|
||||
This interface behaves identically to pci_dma_sync_{single,sg}_for_cpu().
|
||||
|
||||
And likewise, if you wish to let the device get back at the buffer after
|
||||
the cpu has read/written it, invoke:
|
||||
|
||||
void pci_dac_dma_sync_single_for_device(struct pci_dev *pdev,
|
||||
dma64_addr_t dma_addr,
|
||||
size_t len, int direction);
|
||||
|
||||
before letting the device access the DMA area again.
|
||||
|
||||
If you need to get back to the PAGE/OFFSET tuple from a dma64_addr_t
|
||||
the following interfaces are provided:
|
||||
|
||||
struct page *pci_dac_dma_to_page(struct pci_dev *pdev,
|
||||
dma64_addr_t dma_addr);
|
||||
unsigned long pci_dac_dma_to_offset(struct pci_dev *pdev,
|
||||
dma64_addr_t dma_addr);
|
||||
|
||||
This is possible with the DAC interfaces purely because they are
|
||||
not translated in any way.
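
The round trip can be pictured with the user-space sketch below. It
assumes the simplest possible direct mapping, in which the DAC
address is the page frame number shifted by the page size plus the
offset. A real platform may add fixed bits or an offset of its own.
The toy_* helpers and the 4096 byte page size are assumptions made
for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12	/* assumed 4096 byte pages */
#define TOY_PAGE_SIZE  (1ULL << TOY_PAGE_SHIFT)

/* Toy direct mapping: the DAC address is the physical address itself. */
static uint64_t toy_page_to_dma(uint64_t page_frame, uint64_t offset)
{
	assert(offset < TOY_PAGE_SIZE);
	return (page_frame << TOY_PAGE_SHIFT) + offset;
}

/* Recover the page frame number from a DAC address. */
static uint64_t toy_dma_to_page(uint64_t dma_addr)
{
	return dma_addr >> TOY_PAGE_SHIFT;
}

/* Recover the byte offset within the page from a DAC address. */
static uint64_t toy_dma_to_offset(uint64_t dma_addr)
{
	return dma_addr & (TOY_PAGE_SIZE - 1);
}
```

Because no translation table sits in between, the two reverse helpers
need nothing but the address itself. A dynamically translated mapping
could not be reversed this way.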
|
||||
|
||||
Optimizing Unmap State Space Consumption
|
||||
|
||||
On many platforms, pci_unmap_{single,page}() is simply a nop.
|
||||
Therefore, keeping track of the mapping address and length is a waste
|
||||
of space. Instead of filling your drivers up with ifdefs and the like
|
||||
to "work around" this (which would defeat the whole purpose of a
|
||||
portable API) the following facilities are provided.
|
||||
|
||||
Actually, instead of describing the macros one by one, we'll
|
||||
transform some example code.
|
||||
|
||||
1) Use DECLARE_PCI_UNMAP_{ADDR,LEN} in state saving structures.
|
||||
Example, before:
|
||||
|
||||
struct ring_state {
|
||||
struct sk_buff *skb;
|
||||
dma_addr_t mapping;
|
||||
__u32 len;
|
||||
};
|
||||
|
||||
after:
|
||||
|
||||
struct ring_state {
|
||||
struct sk_buff *skb;
|
||||
DECLARE_PCI_UNMAP_ADDR(mapping)
|
||||
DECLARE_PCI_UNMAP_LEN(len)
|
||||
};
|
||||
|
||||
NOTE: DO NOT put a semicolon at the end of the DECLARE_*()
|
||||
macro.
|
||||
|
||||
2) Use pci_unmap_{addr,len}_set to set these values.
|
||||
Example, before:
|
||||
|
||||
ringp->mapping = FOO;
|
||||
ringp->len = BAR;
|
||||
|
||||
after:
|
||||
|
||||
pci_unmap_addr_set(ringp, mapping, FOO);
|
||||
pci_unmap_len_set(ringp, len, BAR);
|
||||
|
||||
3) Use pci_unmap_{addr,len} to access these values.
|
||||
Example, before:
|
||||
|
||||
pci_unmap_single(pdev, ringp->mapping, ringp->len,
|
||||
PCI_DMA_FROMDEVICE);
|
||||
|
||||
after:
|
||||
|
||||
pci_unmap_single(pdev,
|
||||
pci_unmap_addr(ringp, mapping),
|
||||
pci_unmap_len(ringp, len),
|
||||
PCI_DMA_FROMDEVICE);
|
||||
|
||||
It really should be self-explanatory. We treat the ADDR and LEN
|
||||
separately, because it is possible for an implementation to only
|
||||
need the address in order to perform the unmap operation.
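
For the curious, here is a user-space sketch of how macros of this
kind can be written. It is modeled only loosely on the idea; the real
definitions live in each architecture's headers and may differ. All
TOY_* and toy_* names, including the TOY_NEED_UNMAP_STATE switch, are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t toy_dma_addr_t;	/* stand-in for dma_addr_t */

/* Set to 0 to model a platform where unmap is a nop. */
#define TOY_NEED_UNMAP_STATE 1

#if TOY_NEED_UNMAP_STATE
/* The declaration macros supply their own semicolon. */
#define TOY_DECLARE_UNMAP_ADDR(name)		toy_dma_addr_t name;
#define TOY_DECLARE_UNMAP_LEN(name)		uint32_t name;
#define toy_unmap_addr(ptr, name)		((ptr)->name)
#define toy_unmap_addr_set(ptr, name, val)	(((ptr)->name) = (val))
#define toy_unmap_len(ptr, name)		((ptr)->name)
#define toy_unmap_len_set(ptr, name, val)	(((ptr)->name) = (val))
#else
/* Nothing is stored, so the members vanish from the structure. */
#define TOY_DECLARE_UNMAP_ADDR(name)
#define TOY_DECLARE_UNMAP_LEN(name)
#define toy_unmap_addr(ptr, name)		(0)
#define toy_unmap_addr_set(ptr, name, val)	do { } while (0)
#define toy_unmap_len(ptr, name)		(0)
#define toy_unmap_len_set(ptr, name, val)	do { } while (0)
#endif

struct toy_ring_state {
	void *skb;
	TOY_DECLARE_UNMAP_ADDR(mapping)
	TOY_DECLARE_UNMAP_LEN(len)
};
```

This also suggests why the declaration macros must not be followed by
a semicolon: when they expand to nothing, the extra semicolon would be
left alone inside the structure, which strict ISO C does not allow.
With TOY_NEED_UNMAP_STATE set to 0, the structure in this sketch
shrinks to the single skb pointer.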
|
||||
|
||||
Platform Issues
|
||||
|
||||
If you are just writing drivers for Linux and do not maintain
|
||||
an architecture port for the kernel, you can safely skip down
|
||||
to "Closing".
|
||||
|
||||
1) Struct scatterlist requirements.
|
||||
|
||||
Struct scatterlist must contain, at a minimum, the following
|
||||
members:
|
||||
|
||||
struct page *page;
|
||||
unsigned int offset;
|
||||
unsigned int length;
|
||||
|
||||
The base address is specified by a "page+offset" pair.
|
||||
|
||||
Previous versions of struct scatterlist contained a "void *address"
|
||||
field that was sometimes used instead of page+offset. As of Linux
|
||||
2.5, page+offset is always used, and the "address" field has been
|
||||
deleted.
|
||||
|
||||
2) More to come...
|
||||
|
||||
Handling Errors
|
||||
|
||||
DMA address space is limited on some architectures and an allocation
|
||||
failure can be determined by:
|
||||
|
||||
- checking if pci_alloc_consistent returns NULL or pci_map_sg returns 0
|
||||
|
||||
- checking the returned dma_addr_t of pci_map_single and pci_map_page
|
||||
by using pci_dma_mapping_error():
|
||||
|
||||
dma_addr_t dma_handle;
|
||||
|
||||
dma_handle = pci_map_single(dev, addr, size, direction);
|
||||
if (pci_dma_mapping_error(dma_handle)) {
|
||||
/*
|
||||
* reduce current DMA mapping usage,
|
||||
* delay and try again later or
|
||||
* reset driver.
|
||||
*/
|
||||
}
|
||||
|
||||
Closing
|
||||
|
||||
This document, and the API itself, would not be in its current
|
||||
form without the feedback and suggestions from numerous individuals.
|
||||
We would like to specifically mention, in no particular order, the
|
||||
following people:
|
||||
|
||||
Russell King <rmk@arm.linux.org.uk>
|
||||
Leo Dagum <dagum@barrel.engr.sgi.com>
|
||||
Ralf Baechle <ralf@oss.sgi.com>
|
||||
Grant Grundler <grundler@cup.hp.com>
|
||||
Jay Estabrook <Jay.Estabrook@compaq.com>
|
||||
Thomas Sailer <sailer@ife.ee.ethz.ch>
|
||||
Andrea Arcangeli <andrea@suse.de>
|
||||
Jens Axboe <axboe@suse.de>
|
||||
David Mosberger-Tang <davidm@hpl.hp.com>
|
195
Documentation/DocBook/Makefile
Normal file
|
@ -0,0 +1,195 @@
|
|||
###
|
||||
# This makefile is used to generate the kernel documentation,
|
||||
# primarily based on in-line comments in various source files.
|
||||
# See Documentation/kernel-doc-nano-HOWTO.txt for instructions on how
# to document the SRC - and how to read it.
|
||||
# To add a new book the only step required is to add the book to the
|
||||
# list of DOCBOOKS.
|
||||
|
||||
DOCBOOKS := wanbook.xml z8530book.xml mcabook.xml videobook.xml \
|
||||
kernel-hacking.xml kernel-locking.xml via-audio.xml \
|
||||
deviceiobook.xml procfs-guide.xml tulip-user.xml \
|
||||
writing_usb_driver.xml scsidrivers.xml sis900.xml \
|
||||
kernel-api.xml journal-api.xml lsm.xml usb.xml \
|
||||
gadget.xml libata.xml mtdnand.xml librs.xml
|
||||
|
||||
###
|
||||
# The build process is as follows (targets):
|
||||
# (xmldocs)
|
||||
# file.tmpl --> file.xml +--> file.ps (psdocs)
|
||||
# +--> file.pdf (pdfdocs)
|
||||
# +--> DIR=file (htmldocs)
|
||||
# +--> man/ (mandocs)
|
||||
|
||||
###
|
||||
# The targets that may be used.
|
||||
.PHONY: xmldocs sgmldocs psdocs pdfdocs htmldocs mandocs installmandocs
|
||||
|
||||
BOOKS := $(addprefix $(obj)/,$(DOCBOOKS))
|
||||
xmldocs: $(BOOKS)
|
||||
sgmldocs: xmldocs
|
||||
|
||||
PS := $(patsubst %.xml, %.ps, $(BOOKS))
|
||||
psdocs: $(PS)
|
||||
|
||||
PDF := $(patsubst %.xml, %.pdf, $(BOOKS))
|
||||
pdfdocs: $(PDF)
|
||||
|
||||
HTML := $(patsubst %.xml, %.html, $(BOOKS))
|
||||
htmldocs: $(HTML)
|
||||
|
||||
MAN := $(patsubst %.xml, %.9, $(BOOKS))
|
||||
mandocs: $(MAN)
|
||||
|
||||
installmandocs: mandocs
|
||||
$(MAKEMAN) install Documentation/DocBook/man
|
||||
|
||||
###
|
||||
#External programs used
|
||||
KERNELDOC = scripts/kernel-doc
|
||||
DOCPROC = scripts/basic/docproc
|
||||
SPLITMAN = $(PERL) $(srctree)/scripts/split-man
|
||||
MAKEMAN = $(PERL) $(srctree)/scripts/makeman
|
||||
|
||||
###
|
||||
# DOCPROC is used for two purposes:
|
||||
# 1) To generate a dependency list for a .tmpl file
|
||||
# 2) To preprocess a .tmpl file and call kernel-doc with
|
||||
# appropriate parameters.
|
||||
# The following rules are used to generate the .xml documentation
|
||||
# required to generate the final targets. (ps, pdf, html).
|
||||
quiet_cmd_docproc = DOCPROC $@
|
||||
cmd_docproc = SRCTREE=$(srctree)/ $(DOCPROC) doc $< >$@
|
||||
define rule_docproc
|
||||
set -e; \
|
||||
$(if $($(quiet)cmd_$(1)),echo ' $($(quiet)cmd_$(1))';) \
|
||||
$(cmd_$(1)); \
|
||||
( \
|
||||
echo 'cmd_$@ := $(cmd_$(1))'; \
|
||||
echo $@: `SRCTREE=$(srctree) $(DOCPROC) depend $<`; \
|
||||
) > $(dir $@).$(notdir $@).cmd
|
||||
endef
|
||||
|
||||
%.xml: %.tmpl FORCE
|
||||
$(call if_changed_rule,docproc)
|
||||
|
||||
###
|
||||
#Read in all saved dependency files
|
||||
cmd_files := $(wildcard $(foreach f,$(BOOKS),$(dir $(f)).$(notdir $(f)).cmd))
|
||||
|
||||
ifneq ($(cmd_files),)
|
||||
include $(cmd_files)
|
||||
endif
|
||||
|
||||
###
|
||||
# Changes in kernel-doc force a rebuild of all documentation
|
||||
$(BOOKS): $(KERNELDOC)
|
||||
|
||||
###
|
||||
# procfs guide uses a .c file as example code.
|
||||
# This requires an explicit dependency
|
||||
C-procfs-example = procfs_example.xml
|
||||
C-procfs-example2 = $(addprefix $(obj)/,$(C-procfs-example))
|
||||
$(obj)/procfs-guide.xml: $(C-procfs-example2)
|
||||
|
||||
###
|
||||
# Rules to generate postscript, PDF and HTML
|
||||
# db2html creates a directory. Generate a html file used for timestamp
|
||||
|
||||
quiet_cmd_db2ps = DB2PS $@
|
||||
cmd_db2ps = db2ps -o $(dir $@) $<
|
||||
%.ps : %.xml
|
||||
@(which db2ps > /dev/null 2>&1) || \
|
||||
(echo "*** You need to install DocBook stylesheets ***"; \
|
||||
exit 1)
|
||||
$(call cmd,db2ps)
|
||||
|
||||
quiet_cmd_db2pdf = DB2PDF $@
|
||||
cmd_db2pdf = db2pdf -o $(dir $@) $<
|
||||
%.pdf : %.xml
|
||||
@(which db2pdf > /dev/null 2>&1) || \
|
||||
(echo "*** You need to install DocBook stylesheets ***"; \
|
||||
exit 1)
|
||||
$(call cmd,db2pdf)
|
||||
|
||||
quiet_cmd_db2html = DB2HTML $@
|
||||
cmd_db2html = db2html -o $(patsubst %.html,%,$@) $< && \
|
||||
echo '<a HREF="$(patsubst %.html,%,$(notdir $@))/book1.html"> \
|
||||
Goto $(patsubst %.html,%,$(notdir $@))</a><p>' > $@
|
||||
|
||||
%.html: %.xml
|
||||
@(which db2html > /dev/null 2>&1) || \
|
||||
(echo "*** You need to install DocBook stylesheets ***"; \
|
||||
exit 1)
|
||||
@rm -rf $@ $(patsubst %.html,%,$@)
|
||||
$(call cmd,db2html)
|
||||
@if [ ! -z "$(PNG-$(basename $(notdir $@)))" ]; then \
|
||||
cp $(PNG-$(basename $(notdir $@))) $(patsubst %.html,%,$@); fi
|
||||
|
||||
###
|
||||
# Rule to generate man files - output is placed in the man subdirectory
|
||||
|
||||
%.9: %.xml
|
||||
ifneq ($(KBUILD_SRC),)
|
||||
$(Q)mkdir -p $(objtree)/Documentation/DocBook/man
|
||||
endif
|
||||
$(SPLITMAN) $< $(objtree)/Documentation/DocBook/man "$(VERSION).$(PATCHLEVEL).$(SUBLEVEL)"
|
||||
$(MAKEMAN) convert $(objtree)/Documentation/DocBook/man $<
|
||||
|
||||
###
|
||||
# Rules to generate postscript and PNG images from .fig format files
|
||||
quiet_cmd_fig2eps = FIG2EPS $@
|
||||
cmd_fig2eps = fig2dev -Leps $< $@
|
||||
|
||||
%.eps: %.fig
|
||||
@(which fig2dev > /dev/null 2>&1) || \
|
||||
(echo "*** You need to install transfig ***"; \
|
||||
exit 1)
|
||||
$(call cmd,fig2eps)
|
||||
|
||||
quiet_cmd_fig2png = FIG2PNG $@
|
||||
cmd_fig2png = fig2dev -Lpng $< $@
|
||||
|
||||
%.png: %.fig
|
||||
@(which fig2dev > /dev/null 2>&1) || \
|
||||
(echo "*** You need to install transfig ***"; \
|
||||
exit 1)
|
||||
$(call cmd,fig2png)
|
||||
|
||||
###
|
||||
# Rule to convert a .c file to inline XML documentation
|
||||
%.xml: %.c
|
||||
@echo ' GEN $@'
|
||||
@( \
|
||||
echo "<programlisting>"; \
|
||||
expand --tabs=8 < $< | \
|
||||
sed -e "s/&/\\&amp;/g" \
|
||||
-e "s/</\\&lt;/g" \
|
||||
-e "s/>/\\&gt;/g"; \
|
||||
echo "</programlisting>") > $@
|
||||
|
||||
###
|
||||
# Help targets as used by the top-level makefile
|
||||
dochelp:
|
||||
@echo ' Linux kernel internal documentation in different formats:'
|
||||
@echo ' xmldocs (XML DocBook), psdocs (Postscript), pdfdocs (PDF)'
|
||||
@echo ' htmldocs (HTML), mandocs (man pages, use installmandocs to install)'
|
||||
|
||||
###
|
||||
# Temporary files left by various tools
|
||||
clean-files := $(DOCBOOKS) \
|
||||
$(patsubst %.xml, %.dvi, $(DOCBOOKS)) \
|
||||
$(patsubst %.xml, %.aux, $(DOCBOOKS)) \
|
||||
$(patsubst %.xml, %.tex, $(DOCBOOKS)) \
|
||||
$(patsubst %.xml, %.log, $(DOCBOOKS)) \
|
||||
$(patsubst %.xml, %.out, $(DOCBOOKS)) \
|
||||
$(patsubst %.xml, %.ps, $(DOCBOOKS)) \
|
||||
$(patsubst %.xml, %.pdf, $(DOCBOOKS)) \
|
||||
$(patsubst %.xml, %.html, $(DOCBOOKS)) \
|
||||
$(patsubst %.xml, %.9, $(DOCBOOKS)) \
|
||||
$(C-procfs-example)
|
||||
|
||||
clean-dirs := $(patsubst %.xml,%,$(DOCBOOKS))
|
||||
|
||||
#man put files in man subdir - traverse down
|
||||
subdir- := man/
|
341
Documentation/DocBook/deviceiobook.tmpl
Normal file
|
@ -0,0 +1,341 @@
|
|||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
|
||||
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
|
||||
|
||||
<book id="DoingIO">
|
||||
<bookinfo>
|
||||
<title>Bus-Independent Device Accesses</title>
|
||||
|
||||
<authorgroup>
|
||||
<author>
|
||||
<firstname>Matthew</firstname>
|
||||
<surname>Wilcox</surname>
|
||||
<affiliation>
|
||||
<address>
|
||||
<email>matthew@wil.cx</email>
|
||||
</address>
|
||||
</affiliation>
|
||||
</author>
|
||||
</authorgroup>
|
||||
|
||||
<authorgroup>
|
||||
<author>
|
||||
<firstname>Alan</firstname>
|
||||
<surname>Cox</surname>
|
||||
<affiliation>
|
||||
<address>
|
||||
<email>alan@redhat.com</email>
|
||||
</address>
|
||||
</affiliation>
|
||||
</author>
|
||||
</authorgroup>
|
||||
|
||||
<copyright>
|
||||
<year>2001</year>
|
||||
<holder>Matthew Wilcox</holder>
|
||||
</copyright>
|
||||
|
||||
<legalnotice>
|
||||
<para>
|
||||
This documentation is free software; you can redistribute
|
||||
it and/or modify it under the terms of the GNU General Public
|
||||
License as published by the Free Software Foundation; either
|
||||
version 2 of the License, or (at your option) any later
|
||||
version.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
This program is distributed in the hope that it will be
|
||||
useful, but WITHOUT ANY WARRANTY; without even the implied
|
||||
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
|
||||
See the GNU General Public License for more details.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
You should have received a copy of the GNU General Public
|
||||
License along with this program; if not, write to the Free
|
||||
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
|
||||
MA 02111-1307 USA
|
||||
</para>
|
||||
|
||||
<para>
|
||||
For more details see the file COPYING in the source
|
||||
distribution of Linux.
|
||||
</para>
|
||||
</legalnotice>
|
||||
</bookinfo>
|
||||
|
||||
<toc></toc>
|
||||
|
||||
<chapter id="intro">
|
||||
<title>Introduction</title>
|
||||
<para>
|
||||
Linux provides an API which abstracts performing IO across all busses
|
||||
and devices, allowing device drivers to be written independently of
|
||||
bus type.
|
||||
</para>
|
||||
</chapter>
|
||||
|
||||
<chapter id="bugs">
|
||||
<title>Known Bugs And Assumptions</title>
|
||||
<para>
|
||||
None.
|
||||
</para>
|
||||
</chapter>
|
||||
|
||||
<chapter id="mmio">
|
||||
<title>Memory Mapped IO</title>
|
||||
<sect1>
|
||||
<title>Getting Access to the Device</title>
|
||||
<para>
|
||||
The most widely supported form of IO is memory mapped IO.
|
||||
That is, a part of the CPU's address space is interpreted
|
||||
not as accesses to memory, but as accesses to a device. Some
|
||||
architectures define devices to be at a fixed address, but most
|
||||
have some method of discovering devices. The PCI bus walk is a
|
||||
good example of such a scheme. This document does not cover how
|
||||
to receive such an address, but assumes you are starting with one.
|
||||
Physical addresses are of type unsigned long.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
This address should not be used directly. Instead, to get an
|
||||
address suitable for passing to the accessor functions described
|
||||
below, you should call <function>ioremap</function>.
|
||||
An address suitable for accessing the device will be returned to you.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
After you've finished using the device (say, in your module's
|
||||
exit routine), call <function>iounmap</function> in order to return
|
||||
the address space to the kernel. Most architectures allocate new
|
||||
address space each time you call <function>ioremap</function>, and
|
||||
they can run out unless you call <function>iounmap</function>.
|
||||
</para>
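<para>
The map, access, unmap sequence described above can be sketched in
userspace as follows. This is only a model, not kernel code:
<function>sim_ioremap</function>, <function>sim_iounmap</function>,
<function>sim_readl</function> and <function>sim_writel</function> are
hypothetical stand-ins for the real functions, the bus address
0xfebf0000 is made up, and a static array plays the part of the
device registers.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bus address and window size, chosen for illustration. */
#define SIM_PHYS_BASE   0xfebf0000UL
#define SIM_REGION_SIZE 64UL

/* A plain array stands in for the device's register window. */
static uint32_t sim_device[SIM_REGION_SIZE / 4];
static int sim_live_mappings;

/* Mimics ioremap(): turn a bus address into a usable cookie, or NULL. */
static volatile void *sim_ioremap(unsigned long phys, unsigned long size)
{
	if (phys < SIM_PHYS_BASE || size == 0 || size > SIM_REGION_SIZE ||
	    phys - SIM_PHYS_BASE > SIM_REGION_SIZE - size)
		return NULL;
	sim_live_mappings++;
	return (volatile uint8_t *)sim_device + (phys - SIM_PHYS_BASE);
}

/* Mimics iounmap(): hand the mapping back. */
static void sim_iounmap(volatile void *addr)
{
	if (addr)
		sim_live_mappings--;
}

/* Mimics readl(): one 32-bit load from the mapped window. */
static uint32_t sim_readl(const volatile void *addr)
{
	return *(const volatile uint32_t *)addr;
}

/* Mimics writel(): one 32-bit store; value first, address second. */
static void sim_writel(uint32_t val, volatile void *addr)
{
	*(volatile uint32_t *)addr = val;
}

/* Driver-shaped usage: map, touch a register at offset 4, unmap. */
static int sim_probe_and_remove(uint32_t *id_out)
{
	volatile uint8_t *regs = sim_ioremap(SIM_PHYS_BASE, SIM_REGION_SIZE);

	if (!regs)
		return -1;
	sim_writel(0x12345678u, regs + 4);
	*id_out = sim_readl(regs + 4);
	sim_iounmap(regs);
	return 0;
}
```
]]></programlisting>
<para>
The counter <function>sim_live_mappings</function> returns to zero only
when every successful mapping is paired with an unmap, which is the
discipline the text asks of a real driver.
</para>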
|
||||
</sect1>
|
||||
|
||||
<sect1>
|
||||
<title>Accessing the Device</title>
|
||||
<para>
|
||||
The part of the interface most used by drivers is reading and
|
||||
writing memory-mapped registers on the device. Linux provides
|
||||
interfaces to read and write 8-bit, 16-bit, 32-bit and 64-bit
|
||||
quantities. Due to a historical accident, these are named byte,
|
||||
word, long and quad accesses. Both read and write accesses are
|
||||
supported; there is no prefetch support at this time.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The functions are named <function>readb</function>,
|
||||
<function>readw</function>, <function>readl</function>,
|
||||
<function>readq</function>, <function>readb_relaxed</function>,
|
||||
<function>readw_relaxed</function>, <function>readl_relaxed</function>,
|
||||
<function>readq_relaxed</function>, <function>writeb</function>,
|
||||
<function>writew</function>, <function>writel</function> and
|
||||
<function>writeq</function>.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Some devices (such as framebuffers) would like to use larger
|
||||
transfers than 8 bytes at a time. For these devices, the
|
||||
<function>memcpy_toio</function>, <function>memcpy_fromio</function>
|
||||
and <function>memset_io</function> functions are provided.
|
||||
Do not use memset or memcpy on IO addresses; they
|
||||
are not guaranteed to copy data in order.
|
||||
</para>
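<para>
To show what an in-order bulk copy means, here is a minimal userspace
model. The names <function>sim_memcpy_toio</function>,
<function>sim_memcpy_fromio</function> and
<function>sim_memset_io</function> are hypothetical stand-ins, and the
byte-at-a-time loop is just one simple way to get strictly ascending
accesses; real architectures may implement these functions differently.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy to the device window: ascending addresses, one store per byte. */
static void sim_memcpy_toio(volatile uint8_t *dst, const uint8_t *src,
			    size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		dst[i] = src[i];
}

/* Copy from the device window: ascending addresses, one load per byte. */
static void sim_memcpy_fromio(uint8_t *dst, const volatile uint8_t *src,
			      size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		dst[i] = src[i];
}

/* Fill the device window with one value, again in ascending order. */
static void sim_memset_io(volatile uint8_t *dst, uint8_t value, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		dst[i] = value;
}
```
]]></programlisting>
<para>
The volatile qualifier keeps the compiler from merging, dropping or
reordering the individual accesses, which an ordinary memcpy is free
to do.
</para>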
|
||||
|
||||
<para>
|
||||
The read and write functions are defined to be ordered. That is, the
|
||||
compiler is not permitted to reorder the I/O sequence. When the
|
||||
ordering may safely be optimised by the compiler, you can use <function>
|
||||
__readb</function> and friends to indicate the relaxed ordering. Use
|
||||
this with care.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
While the basic functions are defined to be synchronous with respect
|
||||
to each other and ordered with respect to each other, the busses the
|
||||
devices sit on may themselves have asynchronicity. In particular, many
|
||||
authors are burned by the fact that PCI bus writes are posted
|
||||
asynchronously. A driver author must issue a read from the same
|
||||
device to ensure that writes have occurred in the specific cases the
|
||||
author cares about. This kind of property cannot be hidden from driver
|
||||
writers in the API. In some cases, the read used to flush the device
|
||||
may be expected to fail (if the card is resetting, for example). In
|
||||
that case, the read should be done from config space, which is
|
||||
guaranteed to soft-fail if the card doesn't respond.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The following is an example of flushing a write to a device when
|
||||
the driver would like to ensure the write's effects are visible prior
|
||||
to continuing execution.
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
static inline void
|
||||
qla1280_disable_intrs(struct scsi_qla_host *ha)
|
||||
{
|
||||
struct device_reg *reg;
|
||||
|
||||
reg = ha->iobase;
|
||||
/* disable risc and host interrupts */
|
||||
WRT_REG_WORD(&amp;reg->ictrl, 0);
|
||||
/*
|
||||
* The following read will ensure that the above write
|
||||
* has been received by the device before we return from this
|
||||
* function.
|
||||
*/
|
||||
RD_REG_WORD(&amp;reg->ictrl);
|
||||
ha->flags.ints_enabled = 0;
|
||||
}
|
||||
</programlisting>
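<para>
The effect of write posting, and of the flushing read, can be modelled
in a few lines of userspace C. Everything here is an assumption made
for illustration: <function>struct sim_bus</function>, the queue depth
of 8 and the <function>sim_</function> functions are hypothetical, and
a real bridge is far more complicated.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdint.h>

#define SIM_POST_DEPTH 8	/* hypothetical posting-buffer depth */

struct sim_bus {
	uint32_t device_reg;			/* what the device has seen */
	uint32_t posted[SIM_POST_DEPTH];	/* writes still in flight */
	int n_posted;
};

/* Deliver every posted write to the device, oldest first. */
static void sim_drain(struct sim_bus *bus)
{
	int i;

	for (i = 0; i < bus->n_posted; i++)
		bus->device_reg = bus->posted[i];
	bus->n_posted = 0;
}

/* Posted write: returns once the bridge has queued the value. */
static void sim_write_reg(struct sim_bus *bus, uint32_t val)
{
	if (bus->n_posted == SIM_POST_DEPTH)
		sim_drain(bus);		/* buffer full, bridge must drain */
	bus->posted[bus->n_posted++] = val;
}

/* A read may not pass earlier posted writes, so it drains them first. */
static uint32_t sim_read_reg(struct sim_bus *bus)
{
	sim_drain(bus);
	return bus->device_reg;
}

/* Same shape as the example above: write, then read back to flush. */
static void sim_disable_intrs(struct sim_bus *bus)
{
	sim_write_reg(bus, 0);
	(void)sim_read_reg(bus);
}
```
]]></programlisting>
<para>
In this model the device register still holds its old value after
<function>sim_write_reg</function> returns, and changes only once
<function>sim_read_reg</function> has been called.
</para>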
|
||||
|
||||
<para>
|
||||
In addition to write posting, on some large multiprocessing systems
|
||||
(e.g. SGI Challenge, Origin and Altix machines) posted writes won't
|
||||
be strongly ordered coming from different CPUs. Thus it's important
|
||||
to properly protect parts of your driver that do memory-mapped writes
|
||||
with locks and use <function>mmiowb</function> to make sure they
|
||||
arrive in the order intended. Issuing a regular <function>readX
|
||||
</function> will also ensure write ordering, but should only be used
|
||||
when the driver has to be sure that the write has actually arrived
|
||||
at the device (not that it's simply ordered with respect to other
|
||||
writes), since a full <function>readX</function> is a relatively
|
||||
expensive operation.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Generally, one should use <function>mmiowb</function> prior to
|
||||
releasing a spinlock that protects regions using <function>writeb
|
||||
</function> or similar functions that aren't surrounded by <function>
|
||||
readb</function> calls, which will ensure ordering and flushing. The
|
||||
following pseudocode illustrates what might occur if write ordering
|
||||
isn't guaranteed via <function>mmiowb</function> or one of the
|
||||
<function>readX</function> functions.
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
CPU A: spin_lock_irqsave(&amp;dev_lock, flags)
|
||||
CPU A: ...
|
||||
CPU A: writel(newval, ring_ptr);
|
||||
CPU A: spin_unlock_irqrestore(&amp;dev_lock, flags)
|
||||
...
|
||||
CPU B: spin_lock_irqsave(&amp;dev_lock, flags)
|
||||
CPU B: writel(newval2, ring_ptr);
|
||||
CPU B: ...
|
||||
CPU B: spin_unlock_irqrestore(&amp;dev_lock, flags)
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
In the case above, newval2 could be written to ring_ptr before
|
||||
newval. Fixing it is easy though:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
CPU A: spin_lock_irqsave(&amp;dev_lock, flags)
|
||||
CPU A: ...
|
||||
CPU A: writel(newval, ring_ptr);
|
||||
CPU A: mmiowb(); /* ensure no other writes beat us to the device */
|
||||
CPU A: spin_unlock_irqrestore(&amp;dev_lock, flags)
|
||||
...
|
||||
CPU B: spin_lock_irqsave(&amp;dev_lock, flags)
|
||||
CPU B: writel(newval2, ring_ptr);
|
||||
CPU B: ...
|
||||
CPU B: mmiowb();
|
||||
CPU B: spin_unlock_irqrestore(&amp;dev_lock, flags)
|
||||
</programlisting>
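<para>
The two sequences above can be replayed with a small userspace model.
It assumes, purely for illustration, that each CPU has its own queue of
posted writes and that the fabric, left to itself, delivers the queues
in the least helpful order. <function>struct sim_fabric</function> and
all <function>sim_</function> functions are hypothetical; locking is
left out because the model runs the two CPUs one after the other.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SIM_NCPUS 2
#define SIM_DEPTH 4	/* hypothetical per-CPU posting depth */

struct sim_fabric {
	uint32_t ring_ptr;			/* register as the device sees it */
	uint32_t pending[SIM_NCPUS][SIM_DEPTH];	/* posted, not yet delivered */
	int n_pending[SIM_NCPUS];
};

/* mmiowb() stand-in: deliver this CPU's posted writes, oldest first. */
static void sim_mmiowb(struct sim_fabric *f, int cpu)
{
	int i;

	for (i = 0; i < f->n_pending[cpu]; i++)
		f->ring_ptr = f->pending[cpu][i];
	f->n_pending[cpu] = 0;
}

/* writel() stand-in: the write is only queued on the issuing CPU. */
static void sim_writel(struct sim_fabric *f, int cpu, uint32_t val)
{
	if (f->n_pending[cpu] == SIM_DEPTH)
		sim_mmiowb(f, cpu);	/* queue full, forced delivery */
	f->pending[cpu][f->n_pending[cpu]++] = val;
}

/* Worst case: leftover writes arrive highest-numbered CPU first. */
static void sim_fabric_settle(struct sim_fabric *f)
{
	int cpu;

	for (cpu = SIM_NCPUS - 1; cpu >= 0; cpu--)
		sim_mmiowb(f, cpu);
}

/*
 * CPU A (0) writes newval, then CPU B (1) writes newval2.
 * Returns the value the device ends up with.
 */
static uint32_t sim_two_writers(int use_mmiowb, uint32_t newval,
				uint32_t newval2)
{
	struct sim_fabric f;

	memset(&f, 0, sizeof f);
	sim_writel(&f, 0, newval);
	if (use_mmiowb)
		sim_mmiowb(&f, 0);	/* before CPU A drops the lock */
	sim_writel(&f, 1, newval2);
	if (use_mmiowb)
		sim_mmiowb(&f, 1);	/* before CPU B drops the lock */
	sim_fabric_settle(&f);
	return f.ring_ptr;
}
```
]]></programlisting>
<para>
Without the barrier the model leaves the device holding newval, the
stale value from CPU A; with it the device ends up holding newval2, as
the program order intended.
</para>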
|
||||
|
||||
<para>
|
||||
See tg3.c for a real world example of how to use <function>mmiowb
|
||||
</function>.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
PCI ordering rules also guarantee that PIO read responses arrive
|
||||
after any outstanding DMA writes from that bus, since for some devices
|
||||
the result of a <function>readb</function> call may signal to the
|
||||
driver that a DMA transaction is complete. In many cases, however,
|
||||
the driver may want to indicate that the next
|
||||
<function>readb</function> call has no relation to any previous DMA
|
||||
writes performed by the device. The driver can use
|
||||
<function>readb_relaxed</function> for these cases, although only
|
||||
some platforms will honor the relaxed semantics. Using the relaxed
|
||||
read functions will provide significant performance benefits on
|
||||
platforms that support it. The qla2xxx driver provides examples
|
||||
of how to use <function>readX_relaxed</function>. In many cases,
|
||||
a majority of the driver's <function>readX</function> calls can
|
||||
safely be converted to <function>readX_relaxed</function> calls, since
|
||||
only a few will indicate or depend on DMA completion.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
<sect1>
|
||||
<title>ISA legacy functions</title>
|
||||
<para>
|
||||
On older kernels (2.2 and earlier) the ISA bus could be read or
|
||||
written with these functions and without ioremap being used. This is
|
||||
no longer true in Linux 2.4. A set of equivalent functions exists for
|
||||
easy legacy driver porting. The functions available are prefixed
|
||||
with 'isa_' and are <function>isa_readb</function>,
|
||||
<function>isa_writeb</function>, <function>isa_readw</function>,
|
||||
<function>isa_writew</function>, <function>isa_readl</function>,
|
||||
<function>isa_writel</function>, <function>isa_memcpy_fromio</function>
|
||||
and <function>isa_memcpy_toio</function>.
|
||||
</para>
|
||||
<para>
|
||||
These functions should not be used in new drivers, and will
|
||||
eventually be going away.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
</chapter>
|
||||
|
||||
<chapter>
|
||||
<title>Port Space Accesses</title>
|
||||
<sect1>
|
||||
<title>Port Space Explained</title>
|
||||
|
||||
<para>
|
||||
Another form of IO commonly supported is Port Space. This is a
|
||||
range of addresses separate from the normal memory address space.
|
||||
Access to these addresses is generally not as fast as accesses
|
||||
to the memory mapped addresses, and it also has a potentially
|
||||
smaller address space.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Unlike memory mapped IO, no preparation is required
|
||||
to access port space.
|
||||
</para>
|
||||
|
||||
</sect1>
|
||||
<sect1>
|
||||
<title>Accessing Port Space</title>
|
||||
<para>
|
||||
Accesses to this space are provided through a set of functions
|
||||
which allow 8-bit, 16-bit and 32-bit accesses; also
|
||||
known as byte, word and long. These functions are
|
||||
<function>inb</function>, <function>inw</function>,
|
||||
<function>inl</function>, <function>outb</function>,
|
||||
<function>outw</function> and <function>outl</function>.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Some variants are provided for these functions. Some devices
|
||||
require that accesses to their ports are slowed down. This
|
||||
functionality is provided by appending a <function>_p</function>
|
||||
to the end of the function. There are also equivalents to memcpy.
|
||||
The <function>ins</function> and <function>outs</function>
|
||||
functions copy bytes, words or longs from or to the given port.
|
||||
</para>
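<para>
The difference between the string variants and memcpy is that the
memory pointer advances while the port stays the same. The userspace
model below illustrates this with a port backed by a small FIFO.
<function>struct sim_port</function> and the
<function>sim_</function> functions are hypothetical stand-ins; they
only imitate the argument order of the byte-wide port functions.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_FIFO_SIZE 16	/* hypothetical device FIFO depth */

struct sim_port {
	uint8_t fifo[SIM_FIFO_SIZE];
	size_t head;	/* next byte to read */
	size_t tail;	/* next free slot */
};

/* outb-style: one byte to the port; extra bytes are dropped when full. */
static void sim_outb(uint8_t val, struct sim_port *p)
{
	if (p->tail < SIM_FIFO_SIZE)
		p->fifo[p->tail++] = val;
}

/* inb-style: one byte from the port; 0xff when nothing is waiting. */
static uint8_t sim_inb(struct sim_port *p)
{
	return p->head < p->tail ? p->fifo[p->head++] : 0xff;
}

/* outsb-style: the buffer pointer advances, the port does not. */
static void sim_outsb(struct sim_port *p, const uint8_t *buf, size_t count)
{
	while (count--)
		sim_outb(*buf++, p);
}

/* insb-style: repeated reads of one port fill consecutive bytes. */
static void sim_insb(struct sim_port *p, uint8_t *buf, size_t count)
{
	while (count--)
		*buf++ = sim_inb(p);
}
```
]]></programlisting>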
|
||||
</sect1>
|
||||
|
||||
</chapter>
|
||||
|
||||
<chapter id="pubfunctions">
|
||||
<title>Public Functions Provided</title>
|
||||
!Einclude/asm-i386/io.h
|
||||
</chapter>
|
||||
|
||||
</book>
|
752
Documentation/DocBook/gadget.tmpl
Normal file
|
@ -0,0 +1,752 @@
|
|||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
|
||||
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
|
||||
|
||||
<book id="USB-Gadget-API">
|
||||
<bookinfo>
|
||||
<title>USB Gadget API for Linux</title>
|
||||
<date>20 August 2004</date>
|
||||
<edition>20 August 2004</edition>
|
||||
|
||||
<legalnotice>
|
||||
<para>
|
||||
This documentation is free software; you can redistribute
|
||||
it and/or modify it under the terms of the GNU General Public
|
||||
License as published by the Free Software Foundation; either
|
||||
version 2 of the License, or (at your option) any later
|
||||
version.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
This program is distributed in the hope that it will be
|
||||
useful, but WITHOUT ANY WARRANTY; without even the implied
|
||||
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
|
||||
See the GNU General Public License for more details.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
You should have received a copy of the GNU General Public
|
||||
License along with this program; if not, write to the Free
|
||||
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
|
||||
MA 02111-1307 USA
|
||||
</para>
|
||||
|
||||
<para>
|
||||
For more details see the file COPYING in the source
|
||||
distribution of Linux.
|
||||
</para>
|
||||
</legalnotice>
|
||||
<copyright>
|
||||
<year>2003-2004</year>
|
||||
<holder>David Brownell</holder>
|
||||
</copyright>
|
||||
|
||||
<author>
|
||||
<firstname>David</firstname>
|
||||
<surname>Brownell</surname>
|
||||
<affiliation>
|
||||
<address><email>dbrownell@users.sourceforge.net</email></address>
|
||||
</affiliation>
|
||||
</author>
|
||||
</bookinfo>
|
||||
|
||||
<toc></toc>
|
||||
|
||||
<chapter><title>Introduction</title>
|
||||
|
||||
<para>This document presents a Linux-USB "Gadget"
|
||||
kernel mode
|
||||
API, for use within peripherals and other USB devices
|
||||
that embed Linux.
|
||||
It provides an overview of the API structure,
|
||||
and shows how that fits into a system development project.
|
||||
This is the first such API released on Linux to address
|
||||
a number of important problems, including: </para>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem><para>Supports USB 2.0, for high speed devices which
|
||||
can stream data at several dozen megabytes per second.
|
||||
</para></listitem>
|
||||
<listitem><para>Handles devices with dozens of endpoints just as
|
||||
well as ones with just two fixed-function ones. Gadget drivers
|
||||
can be written so they're easy to port to new hardware.
|
||||
</para></listitem>
|
||||
<listitem><para>Flexible enough to expose more complex USB device
|
||||
capabilities such as multiple configurations, multiple interfaces,
|
||||
composite devices,
|
||||
and alternate interface settings.
|
||||
</para></listitem>
|
||||
<listitem><para>USB "On-The-Go" (OTG) support, in conjunction
|
||||
with updates to the Linux-USB host side.
|
||||
</para></listitem>
|
||||
<listitem><para>Sharing data structures and API models with the
|
||||
Linux-USB host side API. This helps the OTG support, and
|
||||
looks forward to more-symmetric frameworks (where the same
|
||||
I/O model is used by both host and device side drivers).
|
||||
</para></listitem>
|
||||
<listitem><para>Minimalist, so it's easier to support new device
|
||||
controller hardware. I/O processing doesn't imply large
|
||||
demands for memory or CPU resources.
|
||||
</para></listitem>
|
||||
</itemizedlist>
|
||||
|
||||
|
||||
<para>Most Linux developers will not be able to use this API, since they
|
||||
have USB "host" hardware in a PC, workstation, or server.
|
||||
Linux users with embedded systems are more likely to
|
||||
have USB peripheral hardware.
|
||||
To distinguish drivers running inside such hardware from the
|
||||
more familiar Linux "USB device drivers",
|
||||
which are host side proxies for the real USB devices,
|
||||
a different term is used:
|
||||
the drivers inside the peripherals are "USB gadget drivers".
|
||||
In USB protocol interactions, the device driver is the master
|
||||
(or "client driver")
|
||||
and the gadget driver is the slave (or "function driver").
|
||||
</para>
|
||||
|
||||
<para>The gadget API resembles the host side Linux-USB API in that both
|
||||
use queues of request objects to package I/O buffers, and those requests
|
||||
may be submitted or canceled.
|
||||
They share common definitions for the standard USB
|
||||
<emphasis>Chapter 9</emphasis> messages, structures, and constants.
|
||||
Also, both APIs bind and unbind drivers to devices.
|
||||
The APIs differ in detail, since the host side's current
|
||||
URB framework exposes a number of implementation details
|
||||
and assumptions that are inappropriate for a gadget API.
|
||||
While the model for control transfers and configuration
|
||||
management is necessarily different (one side is a hardware-neutral master,
|
||||
the other is a hardware-aware slave), the endpoint I/O API used here
|
||||
should also be usable for an overhead-reduced host side API.
|
||||
</para>
|
||||
|
||||
</chapter>
|
||||
|
||||
<chapter id="structure"><title>Structure of Gadget Drivers</title>
|
||||
|
||||
<para>A system running inside a USB peripheral
|
||||
normally has at least three layers inside the kernel to handle
|
||||
USB protocol processing, and may have additional layers in
|
||||
user space code.
|
||||
The "gadget" API is used by the middle layer to interact
|
||||
with the lowest level (which directly handles hardware).
|
||||
</para>
|
||||
|
||||
<para>In Linux, from the bottom up, these layers are:
|
||||
</para>
|
||||
|
||||
<variablelist>
|
||||
|
||||
<varlistentry>
|
||||
<term><emphasis>USB Controller Driver</emphasis></term>
|
||||
|
||||
<listitem>
|
||||
<para>This is the lowest software level.
|
||||
It is the only layer that talks to hardware,
|
||||
through registers, fifos, dma, irqs, and the like.
|
||||
The <filename>&lt;linux/usb_gadget.h&gt;</filename> API abstracts
|
||||
the peripheral controller endpoint hardware.
|
||||
That hardware is exposed through endpoint objects, which accept
|
||||
streams of IN/OUT buffers, and through callbacks that interact
|
||||
with gadget drivers.
|
||||
Since normal USB devices only have one upstream
|
||||
port, they only have one of these drivers.
|
||||
The controller driver can support any number of different
|
||||
gadget drivers, but only one of them can be used at a time.
|
||||
</para>
|
||||
|
||||
<para>Examples of such controller hardware include
|
||||
the PCI-based NetChip 2280 USB 2.0 high speed controller,
|
||||
the SA-11x0 or PXA-25x UDC (found within many PDAs),
|
||||
and a variety of other products.
|
||||
</para>
|
||||
|
||||
</listitem></varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term><emphasis>Gadget Driver</emphasis></term>
|
||||
|
||||
<listitem>
|
||||
<para>The lower boundary of this driver implements hardware-neutral
|
||||
USB functions, using calls to the controller driver.
|
||||
Because such hardware varies widely in capabilities and restrictions,
|
||||
and is used in embedded environments where space is at a premium,
|
||||
the gadget driver is often configured at compile time
|
||||
to work with endpoints supported by one particular controller.
|
||||
Gadget drivers may be portable to several different controllers,
|
||||
using conditional compilation.
|
||||
(Recent kernels substantially simplify the work involved in
|
||||
supporting new hardware, by <emphasis>autoconfiguring</emphasis>
|
||||
endpoints for many bulk-oriented drivers.)
|
||||
Gadget driver responsibilities include:
|
||||
</para>
|
||||
<itemizedlist>
|
||||
<listitem><para>handling setup requests (ep0 protocol responses)
|
||||
possibly including class-specific functionality
|
||||
</para></listitem>
|
||||
<listitem><para>returning configuration and string descriptors
|
||||
</para></listitem>
|
||||
<listitem><para>(re)setting configurations and interface
|
||||
altsettings, including enabling and configuring endpoints
|
||||
</para></listitem>
|
||||
<listitem><para>handling life cycle events, such as managing
|
||||
bindings to hardware,
|
||||
USB suspend/resume, remote wakeup,
|
||||
and disconnection from the USB host.
|
||||
</para></listitem>
|
||||
<listitem><para>managing IN and OUT transfers on all currently
|
||||
enabled endpoints
|
||||
</para></listitem>
|
||||
</itemizedlist>
|
||||
|
||||
<para>
|
||||
Such drivers may be modules of proprietary code, although
|
||||
that approach is discouraged in the Linux community.
|
||||
</para>
|
||||
</listitem></varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term><emphasis>Upper Level</emphasis></term>
|
||||
|
||||
<listitem>
|
||||
<para>Most gadget drivers have an upper boundary that connects
|
||||
to some Linux driver or framework.
|
||||
Through that boundary flows the data which the gadget driver
|
||||
produces and/or consumes through protocol transfers over USB.
|
||||
Examples include:
|
||||
</para>
|
||||
<itemizedlist>
|
||||
<listitem><para>user mode code, using generic (gadgetfs)
|
||||
or application specific files in
|
||||
<filename>/dev</filename>
|
||||
</para></listitem>
|
||||
<listitem><para>networking subsystem (for network gadgets,
|
||||
like the CDC Ethernet Model gadget driver)
|
||||
</para></listitem>
|
||||
<listitem><para>data capture drivers, perhaps video4Linux or
|
||||
a scanner driver; or test and measurement hardware.
|
||||
</para></listitem>
|
||||
<listitem><para>input subsystem (for HID gadgets)
|
||||
</para></listitem>
|
||||
<listitem><para>sound subsystem (for audio gadgets)
|
||||
</para></listitem>
|
||||
<listitem><para>file system (for PTP gadgets)
|
||||
</para></listitem>
|
||||
<listitem><para>block i/o subsystem (for usb-storage gadgets)
|
||||
</para></listitem>
|
||||
<listitem><para>... and more </para></listitem>
|
||||
</itemizedlist>
|
||||
</listitem></varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term><emphasis>Additional Layers</emphasis></term>
|
||||
|
||||
<listitem>
|
||||
<para>Other layers may exist.
|
||||
These could include kernel layers, such as network protocol stacks,
|
||||
as well as user mode applications building on standard POSIX
|
||||
system call APIs such as
|
||||
<emphasis>open()</emphasis>, <emphasis>close()</emphasis>,
|
||||
<emphasis>read()</emphasis> and <emphasis>write()</emphasis>.
|
||||
On newer systems, POSIX Async I/O calls may be an option.
|
||||
Such user mode code will not necessarily be subject to
|
||||
the GNU General Public License (GPL).
|
||||
</para>
|
||||
</listitem></varlistentry>
|
||||
|
||||
|
||||
</variablelist>
|
||||
|
||||
<para>OTG-capable systems will also need to include a standard Linux-USB
|
||||
host side stack,
|
||||
with <emphasis>usbcore</emphasis>,
|
||||
one or more <emphasis>Host Controller Drivers</emphasis> (HCDs),
|
||||
<emphasis>USB Device Drivers</emphasis> to support
|
||||
the OTG "Targeted Peripheral List",
|
||||
and so forth.
|
||||
There will also be an <emphasis>OTG Controller Driver</emphasis>,
|
||||
which is visible to gadget and device driver developers only indirectly.
|
||||
That helps the host and device side USB controllers implement the
|
||||
two new OTG protocols (HNP and SRP).
|
||||
Roles switch (host to peripheral, or vice versa) using HNP
|
||||
during USB suspend processing, and SRP can be viewed as a
|
||||
more battery-friendly kind of device wakeup protocol.
|
||||
</para>
|
||||
|
||||
<para>Over time, reusable utilities are evolving to help make some
|
||||
gadget driver tasks simpler.
|
||||
For example, building configuration descriptors from vectors of
|
||||
descriptors for the configuration's interfaces and endpoints is
|
||||
now automated, and many drivers now use autoconfiguration to
|
||||
choose hardware endpoints and initialize their descriptors.
|
||||
|
||||
A potential example of particular interest
|
||||
is code implementing standard USB-IF protocols for
|
||||
HID, networking, storage, or audio classes.
|
||||
Some developers are interested in KDB or KGDB hooks, to let
|
||||
target hardware be remotely debugged.
|
||||
Most such USB protocol code doesn't need to be hardware-specific,
|
||||
any more than network protocols like X11, HTTP, or NFS are.
|
||||
Such gadget-side interface drivers should eventually be combined,
|
||||
to implement composite devices.
|
||||
</para>
|
||||
|
||||
</chapter>
|
||||
|
||||
|
||||
<chapter id="api"><title>Kernel Mode Gadget API</title>
|
||||
|
||||
<para>Gadget drivers declare themselves through a
|
||||
<emphasis>struct usb_gadget_driver</emphasis>, which is responsible for
|
||||
most parts of enumeration for a <emphasis>struct usb_gadget</emphasis>.
|
||||
The response to a set_configuration usually involves
|
||||
enabling one or more of the <emphasis>struct usb_ep</emphasis> objects
|
||||
exposed by the gadget, and submitting one or more
|
||||
<emphasis>struct usb_request</emphasis> buffers to transfer data.
|
||||
Understand those four data types, and their operations, and
|
||||
you will understand how this API works.
|
||||
</para>
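<para>The relationship between endpoints and requests can be
sketched with a small userspace mock.
It covers only two of the four data types, and none of it is the
real API: <emphasis>struct mock_ep</emphasis>,
<emphasis>struct mock_request</emphasis>, their fields, and the
<emphasis>mock_</emphasis> functions are hypothetical names chosen
to resemble the concepts described above.
The function <emphasis>mock_ep_complete_next()</emphasis> plays the
part of the controller driver finishing a transfer.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

#define MOCK_STATUS_PENDING 1	/* hypothetical "still queued" marker */

struct mock_ep;

/* One I/O buffer plus the callback to run when it has been transferred. */
struct mock_request {
	void *buf;
	size_t length;		/* bytes the driver wants moved */
	size_t actual;		/* bytes actually moved */
	int status;		/* 0 on success */
	void (*complete)(struct mock_ep *ep, struct mock_request *req);
	struct mock_request *next;
};

/* An endpoint is a FIFO of requests that accepts work once enabled. */
struct mock_ep {
	const char *name;
	int enabled;
	struct mock_request *head, *tail;
};

/* What a driver would do in response to set_configuration. */
static int mock_ep_enable(struct mock_ep *ep)
{
	ep->enabled = 1;
	ep->head = ep->tail = NULL;
	return 0;
}

/* Submit a request; it waits in line behind earlier ones. */
static int mock_ep_queue(struct mock_ep *ep, struct mock_request *req)
{
	if (!ep->enabled)
		return -1;
	req->status = MOCK_STATUS_PENDING;
	req->actual = 0;
	req->next = NULL;
	if (ep->tail)
		ep->tail->next = req;
	else
		ep->head = req;
	ep->tail = req;
	return 0;
}

/* Controller side: finish the oldest request and run its callback. */
static int mock_ep_complete_next(struct mock_ep *ep)
{
	struct mock_request *req = ep->head;

	if (!req)
		return -1;
	ep->head = req->next;
	if (!ep->head)
		ep->tail = NULL;
	req->actual = req->length;
	req->status = 0;
	if (req->complete)
		req->complete(ep, req);
	return 0;
}

/* Example completion callback: just counts how often it ran. */
static int mock_completions;

static void mock_count_complete(struct mock_ep *ep, struct mock_request *req)
{
	(void)ep;
	(void)req;
	mock_completions++;
}
```
]]></programlisting>
<para>Queueing on a disabled endpoint fails in this mock, which
mirrors the point that endpoints are enabled as part of
configuration before any data moves.
</para>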
|
||||
|
||||
<note><title>Incomplete Data Type Descriptions</title>
|
||||
|
||||
<para>This documentation was prepared using the standard Linux
|
||||
kernel <filename>docproc</filename> tool, which turns text
|
||||
and in-code comments into SGML DocBook and then into usable
|
||||
formats such as HTML or PDF.
|
||||
Other than the "Chapter 9" data types, most of the significant
|
||||
data types and functions are described here.
|
||||
</para>
|
||||
|
||||
<para>However, docproc does not understand all the C constructs
|
||||
that are used, so some relevant information is likely omitted from
|
||||
what you are reading.
|
||||
One example of such information is endpoint autoconfiguration.
|
||||
You'll have to read the header file, and use example source
|
||||
code (such as that for "Gadget Zero"), to fully understand the API.
|
||||
</para>
|
||||
|
||||
<para>The part of the API implementing some basic
|
||||
driver capabilities is specific to the version of the
|
||||
Linux kernel that's in use.
|
||||
The 2.6 kernel includes a <emphasis>driver model</emphasis>
|
||||
framework that has no analogue on earlier kernels;
|
||||
so those parts of the gadget API are not fully portable.
|
||||
(They are implemented on 2.4 kernels, but in a different way.)
|
||||
The driver model state is another part of this API that is
|
||||
ignored by the kerneldoc tools.
|
||||
</para>
|
||||
</note>
|
||||
|
||||
<para>The core API does not expose
|
||||
every possible hardware feature, only the most widely available ones.
|
||||
There are significant hardware features, such as device-to-device DMA
|
||||
(without temporary storage in a memory buffer)
|
||||
that would be added using hardware-specific APIs.
|
||||
</para>
|
||||
|
||||
<para>This API allows drivers to use conditional compilation to handle
|
||||
endpoint capabilities of different hardware, but doesn't require that.
|
||||
Hardware tends to have arbitrary restrictions, relating to
|
||||
transfer types, addressing, packet sizes, buffering, and availability.
|
||||
As a rule, such differences only matter for "endpoint zero" logic
|
||||
that handles device configuration and management.
|
||||
The API supports limited run-time
|
||||
detection of capabilities, through naming conventions for endpoints.
|
||||
Many drivers will be able to at least partially autoconfigure
|
||||
themselves.
|
||||
In particular, driver init sections will often have endpoint
|
||||
autoconfiguration logic that scans the hardware's list of endpoints
|
||||
to find ones matching the driver requirements
|
||||
(relying on those conventions), to eliminate some of the most
|
||||
common reasons for conditional compilation.
|
||||
</para>
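<para>The name-based matching described above can be sketched in plain C.
This is a simplified user-space model, not the kernel's autoconfiguration
code: the <emphasis>sketch_</emphasis> names are hypothetical, and the name
grammar assumed here is "ep", an optional number, an optional "in" or "out",
and an optional "-bulk", "-int" or "-iso" suffix, with names such as "ep-a"
treated as unrestricted.
</para>

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Simplified model of one endpoint as listed by a controller driver. */
struct sketch_ep {
	const char *name;	/* e.g. "ep1in-bulk", "ep2out-int", "ep-a" */
	int claimed;		/* nonzero once a driver has taken it */
};

/*
 * Return nonzero if the endpoint name permits the requested direction
 * ("in" or "out") and transfer type ("bulk", "int" or "iso").
 */
int sketch_ep_matches(const char *name, const char *dir, const char *type)
{
	const char *p;
	size_t n;

	if (strncmp(name, "ep", 2) != 0 || strcmp(name, "ep0") == 0)
		return 0;	/* unknown convention, or the control endpoint */
	p = name + 2;
	if (*p == '-')
		return 1;	/* "ep-a" style: no restrictions encoded */
	while (isdigit((unsigned char)*p))
		p++;		/* skip the endpoint number */
	n = strcspn(p, "-");	/* length of the direction part, may be 0 */
	if (n != 0 && (strlen(dir) != n || strncmp(p, dir, n) != 0))
		return 0;	/* direction is fixed and differs */
	p += n;
	if (*p == '-' && strcmp(p + 1, type) != 0)
		return 0;	/* type is fixed and differs */
	return 1;
}

/* Claim and return the first unclaimed endpoint that matches, or NULL. */
struct sketch_ep *sketch_ep_autoconfig(struct sketch_ep *eps, size_t count,
				       const char *dir, const char *type)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (!eps[i].claimed &&
		    sketch_ep_matches(eps[i].name, dir, type)) {
			eps[i].claimed = 1;
			return &eps[i];
		}
	}
	return NULL;
}
```

<para>A driver's init code would call something like this once per endpoint
it needs, and fail to load if any request returns NULL.
</para>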
|
||||
|
||||
<para>Like the Linux-USB host side API, this API exposes
|
||||
the "chunky" nature of USB messages: I/O requests are in terms
|
||||
of one or more "packets", and packet boundaries are visible to drivers.
|
||||
Compared to RS-232 serial protocols, USB resembles
|
||||
synchronous protocols like HDLC
|
||||
(N bytes per frame, multipoint addressing, host as the primary
|
||||
station and devices as secondary stations)
|
||||
more than asynchronous ones
|
||||
(tty style: 8 data bits per frame, no parity, one stop bit).
|
||||
So for example the controller drivers won't buffer
|
||||
two single byte writes into a single two-byte USB IN packet,
|
||||
although gadget drivers may do so when they implement
|
||||
protocols where packet boundaries (and "short packets")
|
||||
are not significant.
|
||||
</para>
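<para>To make the packet arithmetic concrete, here is a small sketch of how
one transfer maps onto packets. The helper names are hypothetical and
<emphasis>maxpacket</emphasis> is assumed to be nonzero. Two separate
one-byte requests would each be evaluated on their own, giving two one-byte
packets rather than one two-byte packet.
</para>

```c
#include <stddef.h>

/*
 * Number of data packets needed to carry len bytes when each packet
 * holds at most maxpacket bytes. A zero-length transfer still takes
 * one (empty) packet.
 */
size_t sketch_packet_count(size_t len, size_t maxpacket)
{
	if (len == 0)
		return 1;
	return (len + maxpacket - 1) / maxpacket;
}

/* Size in bytes of the final packet of the transfer. */
size_t sketch_last_packet_len(size_t len, size_t maxpacket)
{
	size_t rem = len % maxpacket;

	if (len == 0)
		return 0;
	return rem ? rem : maxpacket;
}

/*
 * Nonzero when the data ends exactly on a full packet. A protocol that
 * relies on a short packet to mark the end of a transfer then needs an
 * extra zero-length packet.
 */
int sketch_needs_zlp(size_t len, size_t maxpacket)
{
	return len != 0 && len % maxpacket == 0;
}
```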
|
||||
|
||||
<sect1 id="lifecycle"><title>Driver Life Cycle</title>
|
||||
|
||||
<para>Gadget drivers make endpoint I/O requests to hardware without
|
||||
needing to know many details of the hardware, but driver
|
||||
setup/configuration code needs to handle some differences.
|
||||
Use the API like this:
|
||||
</para>
|
||||
|
||||
<orderedlist numeration='arabic'>
|
||||
|
||||
<listitem><para>Register a driver for the particular device side
|
||||
usb controller hardware,
|
||||
such as the net2280 on PCI (USB 2.0),
|
||||
sa11x0 or pxa25x as found in Linux PDAs,
|
||||
and so on.
|
||||
At this point the device is logically in the USB ch9 initial state
|
||||
("attached"), drawing no power and not usable
|
||||
(since it does not yet support enumeration).
|
||||
No host should see the device yet, since it has not
|
||||
activated the data line pullup used by the host to
|
||||
detect a device, even if VBUS power is available.
|
||||
</para></listitem>
|
||||
|
||||
<listitem><para>Register a gadget driver that implements some higher level
|
||||
device function. That will then bind() to a usb_gadget, which
|
||||
activates the data line pullup sometime after detecting VBUS.
|
||||
</para></listitem>
|
||||
|
||||
<listitem><para>The hardware driver can now start enumerating.
|
||||
The steps it handles are to accept USB power and set_address requests.
|
||||
Other steps are handled by the gadget driver.
|
||||
If the gadget driver module is unloaded before the host starts to
|
||||
enumerate, steps before step 7 are skipped.
|
||||
</para></listitem>
|
||||
|
||||
<listitem><para>The gadget driver's setup() call returns usb descriptors,
|
||||
based both on what the bus interface hardware provides and on the
|
||||
functionality being implemented.
|
||||
That can involve alternate settings or configurations,
|
||||
unless the hardware prevents such operation.
|
||||
For OTG devices, each configuration descriptor includes
|
||||
an OTG descriptor.
|
||||
</para></listitem>
|
||||
|
||||
<listitem><para>The gadget driver handles the last step of enumeration,
|
||||
when the USB host issues a set_configuration call.
|
||||
It enables all endpoints used in that configuration,
|
||||
with all interfaces in their default settings.
|
||||
That involves using a list of the hardware's endpoints, enabling each
|
||||
endpoint according to its descriptor.
|
||||
It may also involve using <function>usb_gadget_vbus_draw</function>
|
||||
to let more power be drawn from VBUS, as allowed by that configuration.
|
||||
For OTG devices, setting a configuration may also involve reporting
|
||||
HNP capabilities through a user interface.
|
||||
</para></listitem>
|
||||
|
||||
<listitem><para>Do real work and perform data transfers, possibly involving
|
||||
changes to interface settings or switching to new configurations, until the
|
||||
device is disconnect()ed from the host.
|
||||
Queue any number of transfer requests to each endpoint.
|
||||
It may be suspended and resumed several times before being disconnected.
|
||||
On disconnect, the drivers go back to step 3 (above).
|
||||
</para></listitem>
|
||||
|
||||
<listitem><para>When the gadget driver module is being unloaded,
|
||||
the driver unbind() callback is issued. That lets the controller
|
||||
driver be unloaded.
|
||||
</para></listitem>
|
||||
|
||||
</orderedlist>
|
||||
|
||||
<para>Drivers will normally be arranged so that just loading the
|
||||
gadget driver module (or statically linking it into a Linux kernel)
|
||||
allows the peripheral device to be enumerated, but some drivers
|
||||
will defer enumeration until some higher level component (like
|
||||
a user mode daemon) enables it.
|
||||
Note that at this lowest level there are no policies about how
|
||||
ep0 configuration logic is implemented,
|
||||
except that it should obey USB specifications.
|
||||
Such issues are in the domain of gadget drivers,
|
||||
including knowing about implementation constraints
|
||||
imposed by some USB controllers
|
||||
or understanding that composite devices might happen to
|
||||
be built by integrating reusable components.
|
||||
</para>
|
||||
|
||||
<para>Note that the lifecycle above can be slightly different
|
||||
for OTG devices.
|
||||
Other than providing an additional OTG descriptor in each
|
||||
configuration, only the HNP-related differences are particularly
|
||||
visible to driver code.
|
||||
They involve reporting requirements during the SET_CONFIGURATION
|
||||
request, and the option to invoke HNP during some suspend callbacks.
|
||||
Also, SRP changes the semantics of
|
||||
<function>usb_gadget_wakeup</function>
|
||||
slightly.
|
||||
</para>
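<para>As an illustration of the extra OTG descriptor, the sketch below appends
a three-byte OTG descriptor to a configuration being built in a buffer and
patches the total length. It is a user-space model with hypothetical names.
It assumes the buffer starts with a standard 9-byte configuration descriptor
(total length stored little-endian at offsets 2 and 3), and it simply appends
at the current end of the buffer, so calling it right after the configuration
descriptor header has been written is the assumed usage.
</para>

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_DT_OTG	0x09	/* OTG descriptor type */
#define SKETCH_OTG_SRP	0x01	/* bmAttributes: SRP supported */
#define SKETCH_OTG_HNP	0x02	/* bmAttributes: HNP supported */

/*
 * buf holds len bytes of descriptors and has room for cap bytes.
 * When is_otg is set, append an OTG descriptor and update the
 * configuration's total length. Returns the new length, or 0 if the
 * buffer is too small or does not hold a configuration header yet.
 */
size_t sketch_add_otg_descriptor(uint8_t *buf, size_t len, size_t cap,
				 int is_otg)
{
	size_t total;

	if (!is_otg)
		return len;		/* non-OTG: nothing to add */
	if (len < 9 || cap < len || cap - len < 3)
		return 0;
	buf[len + 0] = 3;		/* bLength */
	buf[len + 1] = SKETCH_DT_OTG;	/* bDescriptorType */
	buf[len + 2] = SKETCH_OTG_SRP | SKETCH_OTG_HNP;
	total = len + 3;
	buf[2] = (uint8_t)(total & 0xff);	/* wTotalLength, low byte */
	buf[3] = (uint8_t)(total >> 8);		/* wTotalLength, high byte */
	return total;
}
```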
|
||||
|
||||
</sect1>
|
||||
|
||||
<sect1 id="ch9"><title>USB 2.0 Chapter 9 Types and Constants</title>
|
||||
|
||||
<para>Gadget drivers
|
||||
rely on common USB structures and constants
|
||||
defined in the
|
||||
<filename><linux/usb_ch9.h></filename>
|
||||
header file, which is standard in Linux 2.6 kernels.
|
||||
These are the same types and constants used by host
|
||||
side drivers (and usbcore).
|
||||
</para>
|
||||
|
||||
!Iinclude/linux/usb_ch9.h
|
||||
</sect1>
|
||||
|
||||
<sect1 id="core"><title>Core Objects and Methods</title>
|
||||
|
||||
<para>These are declared in
|
||||
<filename><linux/usb_gadget.h></filename>,
|
||||
and are used by gadget drivers to interact with
|
||||
USB peripheral controller drivers.
|
||||
</para>
|
||||
|
||||
<!-- yeech, this is ugly in nsgmls PDF output.
|
||||
|
||||
the PDF bookmark and refentry output nesting is wrong,
|
||||
and the member/argument documentation indents ugly.
|
||||
|
||||
plus something (docproc?) adds whitespace before the
|
||||
descriptive paragraph text, so it can't line up right
|
||||
unless the explanations are trivial.
|
||||
-->
|
||||
|
||||
!Iinclude/linux/usb_gadget.h
|
||||
</sect1>
|
||||
|
||||
<sect1 id="utils"><title>Optional Utilities</title>
|
||||
|
||||
<para>The core API is sufficient for writing a USB Gadget Driver,
|
||||
but some optional utilities are provided to simplify common tasks.
|
||||
These utilities include endpoint autoconfiguration.
|
||||
</para>
|
||||
|
||||
!Edrivers/usb/gadget/usbstring.c
|
||||
!Edrivers/usb/gadget/config.c
|
||||
<!-- !Edrivers/usb/gadget/epautoconf.c -->
|
||||
</sect1>
|
||||
|
||||
</chapter>
|
||||
|
||||
<chapter id="controllers"><title>Peripheral Controller Drivers</title>
|
||||
|
||||
<para>The first hardware supporting this API was the NetChip 2280
|
||||
controller, which supports USB 2.0 high speed and is based on PCI.
|
||||
This is the <filename>net2280</filename> driver module.
|
||||
The driver supports Linux kernel versions 2.4 and 2.6;
|
||||
contact NetChip Technologies for development boards and product
|
||||
information.
|
||||
</para>
|
||||
|
||||
<para>Other hardware working in the "gadget" framework includes:
|
||||
Intel's PXA 25x and IXP42x series processors
|
||||
(<filename>pxa2xx_udc</filename>),
|
||||
Toshiba TC86c001 "Goku-S" (<filename>goku_udc</filename>),
|
||||
Renesas SH7705/7727 (<filename>sh_udc</filename>),
|
||||
MediaQ 11xx (<filename>mq11xx_udc</filename>),
|
||||
Hynix HMS30C7202 (<filename>h7202_udc</filename>),
|
||||
National 9303/4 (<filename>n9604_udc</filename>),
|
||||
Texas Instruments OMAP (<filename>omap_udc</filename>),
|
||||
Sharp LH7A40x (<filename>lh7a40x_udc</filename>),
|
||||
and more.
|
||||
Most of those are full speed controllers.
|
||||
</para>
|
||||
|
||||
<para>At this writing, there are people at work on drivers in
|
||||
this framework for several other USB device controllers,
|
||||
with plans to make many of them widely available.
|
||||
</para>
|
||||
|
||||
<!-- !Edrivers/usb/gadget/net2280.c -->
|
||||
|
||||
<para>A partial USB simulator,
|
||||
the <filename>dummy_hcd</filename> driver, is available.
|
||||
It can act like a net2280, a pxa25x, or an sa11x0 in terms
|
||||
of available endpoints and device speeds; and it simulates
|
||||
control, bulk, and to some extent interrupt transfers.
|
||||
That lets you develop some parts of a gadget driver on a normal PC,
|
||||
without any special hardware, and perhaps with the assistance
|
||||
of tools such as GDB running with User Mode Linux.
|
||||
At least one person has expressed interest in adapting that
|
||||
approach, hooking it up to a simulator for a microcontroller.
|
||||
Such simulators can help debug subsystems where the runtime hardware
|
||||
is unfriendly to software development, or is not yet available.
|
||||
</para>
|
||||
|
||||
<para>Support for other controllers is expected to be developed
|
||||
and contributed
|
||||
over time, as this driver framework evolves.
|
||||
</para>
|
||||
|
||||
</chapter>
|
||||
|
||||
<chapter id="gadget"><title>Gadget Drivers</title>
|
||||
|
||||
<para>In addition to <emphasis>Gadget Zero</emphasis>
|
||||
(used primarily for testing and development with drivers
|
||||
for usb controller hardware), other gadget drivers exist.
|
||||
</para>
|
||||
|
||||
<para>There's an <emphasis>ethernet</emphasis> gadget
|
||||
driver, which implements one of the most useful
|
||||
<emphasis>Communications Device Class</emphasis> (CDC) models.
|
||||
One of the standards for cable modem interoperability even
|
||||
specifies the use of this ethernet model as one of two
|
||||
mandatory options.
|
||||
Gadgets using this code look to a USB host as if they're
|
||||
an Ethernet adapter.
|
||||
It provides access to a network where the gadget's CPU is one host,
|
||||
which could easily be bridging, routing, or firewalling
|
||||
access to other networks.
|
||||
Since some hardware can't fully implement the CDC Ethernet
|
||||
requirements, this driver also implements a "good parts only"
|
||||
subset of CDC Ethernet.
|
||||
(That subset doesn't advertise itself as CDC Ethernet,
|
||||
to avoid creating problems.)
|
||||
</para>
|
||||
|
||||
<para>Support for Microsoft's <emphasis>RNDIS</emphasis>
|
||||
protocol has been contributed by Pengutronix and Auerswald GmbH.
|
||||
This is like CDC Ethernet, but it runs on slightly more USB hardware
|
||||
(but less than the CDC subset).
|
||||
However, its main claim to fame is being able to connect directly to
|
||||
recent versions of Windows, using drivers that Microsoft bundles
|
||||
and supports, making it much simpler to network with Windows.
|
||||
</para>
|
||||
|
||||
<para>There is also support for user mode gadget drivers,
|
||||
using <emphasis>gadgetfs</emphasis>.
|
||||
This provides a <emphasis>User Mode API</emphasis> that presents
|
||||
each endpoint as a single file descriptor. I/O is done using
|
||||
normal <emphasis>read()</emphasis> and <emphasis>write()</emphasis> calls.
|
||||
Familiar tools like GDB and pthreads can be used to
|
||||
develop and debug user mode drivers, so that once a robust
|
||||
controller driver is available many applications for it
|
||||
won't require new kernel mode software.
|
||||
Linux 2.6 <emphasis>Async I/O (AIO)</emphasis>
|
||||
support is available, so that user mode software
|
||||
can stream data with only slightly more overhead
|
||||
than a kernel driver.
|
||||
</para>
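<para>Because each endpoint appears as a file descriptor, ordinary POSIX I/O
helpers can be reused in user mode drivers. The sketch below is a generic
"write everything" loop. Nothing in it is specific to gadgetfs: the function
name is hypothetical, and whether an endpoint file can ever return a short
write, or how individual calls map onto USB packets, is not stated here and
should be treated as an open assumption.
</para>

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Write exactly len bytes to fd, retrying after short writes and
 * after interruption by a signal. Returns 0 on success, -1 on error.
 */
int sketch_write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted: try again */
			return -1;
		}
		if (n == 0)
			return -1;		/* no progress: give up */
		p += n;
		len -= (size_t)n;
	}
	return 0;
}
```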
|
||||
|
||||
<para>There's a USB Mass Storage class driver, which provides
|
||||
a different solution for interoperability with systems such
|
||||
as MS-Windows and MacOS.
|
||||
That <emphasis>File-backed Storage</emphasis> driver uses a
|
||||
file or block device as backing store for a drive,
|
||||
like the <filename>loop</filename> driver.
|
||||
The USB host uses the BBB, CB, or CBI versions of the mass
|
||||
storage class specification, using transparent SCSI commands
|
||||
to access the data from the backing store.
|
||||
</para>
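<para>The mapping from a block request to the backing store can be sketched
as simple arithmetic. This is an illustration only: the function name is
hypothetical and the 512-byte block size is an assumption, not something
stated above.
</para>

```c
#include <stdint.h>

#define SKETCH_SECTOR_SIZE 512u	/* assumed logical block size */

/*
 * Translate a request for nblocks blocks starting at logical block
 * lba into a byte range of a backing file of file_size bytes.
 * Returns 0 and fills *offset and *length, or -1 if the request
 * runs past the end of the backing store.
 */
int sketch_lba_to_range(uint32_t lba, uint32_t nblocks, uint64_t file_size,
			uint64_t *offset, uint64_t *length)
{
	uint64_t off = (uint64_t)lba * SKETCH_SECTOR_SIZE;
	uint64_t len = (uint64_t)nblocks * SKETCH_SECTOR_SIZE;

	if (off > file_size || len > file_size - off)
		return -1;
	*offset = off;
	*length = len;
	return 0;
}
```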
|
||||
|
||||
<para>There's a "serial line" driver, useful for TTY style
|
||||
operation over USB.
|
||||
The latest version of that driver supports CDC ACM style
|
||||
operation, like a USB modem, and so on most hardware it can
|
||||
interoperate easily with MS-Windows.
|
||||
One interesting use of that driver is in boot firmware (like a BIOS),
|
||||
which can sometimes use that model with very small systems without
|
||||
real serial lines.
|
||||
</para>
|
||||
|
||||
<para>Support for other kinds of gadget is expected to
|
||||
be developed and contributed
|
||||
over time, as this driver framework evolves.
|
||||
</para>
|
||||
|
||||
</chapter>
|
||||
|
||||
<chapter id="otg"><title>USB On-The-Go (OTG)</title>
|
||||
|
||||
<para>USB OTG support on Linux 2.6 was initially developed
|
||||
by Texas Instruments for
|
||||
<ulink url="http://www.omap.com">OMAP</ulink> 16xx and 17xx
|
||||
series processors.
|
||||
Other OTG systems should work in similar ways, but the
|
||||
hardware level details could be very different.
|
||||
</para>
|
||||
|
||||
<para>Systems need specialized hardware support to implement OTG,
|
||||
notably including a special <emphasis>Mini-AB</emphasis> jack
|
||||
and associated transceiver to support <emphasis>Dual-Role</emphasis>
|
||||
operation:
|
||||
they can act either as a host, using the standard
|
||||
Linux-USB host side driver stack,
|
||||
or as a peripheral, using this "gadget" framework.
|
||||
To do that, the system software relies on small additions
|
||||
to those programming interfaces,
|
||||
and on a new internal component (here called an "OTG Controller")
|
||||
affecting which driver stack connects to the OTG port.
|
||||
In each role, the system can re-use the existing pool of
|
||||
hardware-neutral drivers, layered on top of the controller
|
||||
driver interfaces (<emphasis>usb_bus</emphasis> or
|
||||
<emphasis>usb_gadget</emphasis>).
|
||||
Such drivers need at most minor changes, and most of the calls
|
||||
added to support OTG can also benefit non-OTG products.
|
||||
</para>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem><para>Gadget drivers test the <emphasis>is_otg</emphasis>
|
||||
flag, and use it to determine whether or not to include
|
||||
an OTG descriptor in each of their configurations.
|
||||
</para></listitem>
|
||||
<listitem><para>Gadget drivers may need changes to support the
|
||||
two new OTG protocols, exposed in new gadget attributes
|
||||
such as the <emphasis>b_hnp_enable</emphasis> flag.
|
||||
HNP support should be reported through a user interface
|
||||
(two LEDs could suffice), and is triggered in some cases
|
||||
when the host suspends the peripheral.
|
||||
SRP support can be user-initiated just like remote wakeup,
|
||||
probably by pressing the same button.
|
||||
</para></listitem>
|
||||
<listitem><para>On the host side, USB device drivers need
|
||||
to be taught to trigger HNP at appropriate moments, using
|
||||
<function>usb_suspend_device()</function>.
|
||||
That also conserves battery power, which is useful even
|
||||
for non-OTG configurations.
|
||||
</para></listitem>
|
||||
<listitem><para>Also on the host side, a driver must support the
|
||||
OTG "Targeted Peripheral List". That's just a whitelist,
|
||||
used to reject peripherals not supported with a given
|
||||
Linux OTG host.
|
||||
<emphasis>This whitelist is product-specific;
|
||||
each product must modify <filename>otg_whitelist.h</filename>
|
||||
to match its interoperability specification.
|
||||
</emphasis>
|
||||
</para>
|
||||
<para>Non-OTG Linux hosts, like PCs and workstations,
|
||||
normally have some solution for adding drivers, so that
|
||||
peripherals that aren't recognized can eventually be supported.
|
||||
That approach is unreasonable for consumer products that may
|
||||
never have their firmware upgraded, and where it's usually
|
||||
unrealistic to expect traditional PC/workstation/server kinds
|
||||
of support model to work.
|
||||
For example, it's often impractical to change device firmware
|
||||
once the product has been distributed, so driver bugs can't
|
||||
normally be fixed if they're found after shipment.
|
||||
</para></listitem>
|
||||
</itemizedlist>
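<para>The whitelist check from the last item can be pictured as a table
lookup. This is a sketch under stated assumptions: the structure, the
function name, and matching on vendor and product ID alone are all
hypothetical and do not reflect the actual layout of
<filename>otg_whitelist.h</filename>.
</para>

```c
#include <stddef.h>
#include <stdint.h>

/* One entry of a hypothetical product-specific peripheral list. */
struct sketch_tpl_entry {
	uint16_t vendor;
	uint16_t product;
};

/*
 * Return 1 if the device identified by vendor/product appears in the
 * list, 0 otherwise. An OTG host would reject devices that return 0.
 */
int sketch_is_targeted(const struct sketch_tpl_entry *tpl, size_t count,
		       uint16_t vendor, uint16_t product)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (tpl[i].vendor == vendor && tpl[i].product == product)
			return 1;
	}
	return 0;
}
```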
|
||||
|
||||
<para>
|
||||
Additional changes are needed below those hardware-neutral
|
||||
<emphasis>usb_bus</emphasis> and <emphasis>usb_gadget</emphasis>
|
||||
driver interfaces; those aren't discussed here in any detail.
|
||||
Those affect the hardware-specific code for each USB Host or Peripheral
|
||||
controller, and how the HCD initializes (since OTG can be active only
|
||||
on a single port).
|
||||
They also involve what may be called an <emphasis>OTG Controller
|
||||
Driver</emphasis>, managing the OTG transceiver and the OTG state
|
||||
machine logic as well as much of the root hub behavior for the
|
||||
OTG port.
|
||||
The OTG controller driver needs to activate and deactivate USB
|
||||
controllers depending on the relevant device role.
|
||||
Some related changes were needed inside usbcore, so that it
|
||||
can identify OTG-capable devices and respond appropriately
|
||||
to HNP or SRP protocols.
|
||||
</para>
|
||||
|
||||
</chapter>
|
||||
|
||||
</book>
|
||||
<!--
|
||||
vim:syntax=sgml:sw=4
|
||||
-->
|
333
Documentation/DocBook/journal-api.tmpl
Normal file
|
@ -0,0 +1,333 @@
|
|||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
|
||||
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
|
||||
|
||||
<book id="LinuxJBDAPI">
|
||||
<bookinfo>
|
||||
<title>The Linux Journalling API</title>
|
||||
<authorgroup>
|
||||
<author>
|
||||
<firstname>Roger</firstname>
|
||||
<surname>Gammans</surname>
|
||||
<affiliation>
|
||||
<address>
|
||||
<email>rgammans@computer-surgery.co.uk</email>
|
||||
</address>
|
||||
</affiliation>
|
||||
</author>
|
||||
</authorgroup>
|
||||
|
||||
<authorgroup>
|
||||
<author>
|
||||
<firstname>Stephen</firstname>
|
||||
<surname>Tweedie</surname>
|
||||
<affiliation>
|
||||
<address>
|
||||
<email>sct@redhat.com</email>
|
||||
</address>
|
||||
</affiliation>
|
||||
</author>
|
||||
</authorgroup>
|
||||
|
||||
<copyright>
|
||||
<year>2002</year>
|
||||
<holder>Roger Gammans</holder>
|
||||
</copyright>
|
||||
|
||||
<legalnotice>
|
||||
<para>
|
||||
This documentation is free software; you can redistribute
|
||||
it and/or modify it under the terms of the GNU General Public
|
||||
License as published by the Free Software Foundation; either
|
||||
version 2 of the License, or (at your option) any later
|
||||
version.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
This program is distributed in the hope that it will be
|
||||
useful, but WITHOUT ANY WARRANTY; without even the implied
|
||||
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
|
||||
See the GNU General Public License for more details.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
You should have received a copy of the GNU General Public
|
||||
License along with this program; if not, write to the Free
|
||||
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
|
||||
MA 02111-1307 USA
|
||||
</para>
|
||||
|
||||
<para>
|
||||
For more details see the file COPYING in the source
|
||||
distribution of Linux.
|
||||
</para>
|
||||
</legalnotice>
|
||||
</bookinfo>
|
||||
|
||||
<toc></toc>
|
||||
|
||||
<chapter id="Overview">
|
||||
<title>Overview</title>
|
||||
<sect1>
|
||||
<title>Details</title>
|
||||
<para>
|
||||
The journalling layer is easy to use. You need to
|
||||
first of all create a journal_t data structure. There are
|
||||
two calls to do this dependent on how you decide to allocate the physical
|
||||
media on which the journal resides. The journal_init_inode() call
|
||||
is for journals stored in filesystem inodes, or the journal_init_dev()
|
||||
call can be used for journals stored on a raw device (in a contiguous range
|
||||
of blocks). A journal_t is a typedef for a struct, and both calls return a pointer to one, so when
|
||||
you are finally finished make sure you call journal_destroy() on it
|
||||
to free up any used kernel memory.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Once you have got your journal_t object you need to 'mount' or load the journal
|
||||
file, unless of course you haven't initialised it yet - in which case you
|
||||
need to call journal_create().
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Most of the time however your journal file will already have been created, but
|
||||
before you load it you must call journal_wipe() to empty the journal file.
|
||||
Hang on, you say, what if the filesystem wasn't cleanly umount()'d? Well, it is the
|
||||
job of the client file system to detect this and skip the call to journal_wipe().
|
||||
</para>
|
||||
|
||||
<para>
|
||||
In either case the next call should be to journal_load() which prepares the
|
||||
journal file for use. Note that journal_wipe(..,0) calls journal_skip_recovery()
|
||||
for you if it detects any outstanding transactions in the journal and similarly
|
||||
journal_load() will call journal_recover() if necessary.
|
||||
I would advise reading fs/ext3/super.c for examples on this stage.
|
||||
[RGG: Why is the journal_wipe() call necessary - doesn't this needlessly
|
||||
complicate the API? Or isn't it a good idea for the journal layer to hide
dirty mounts from the client fs?]
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Now you can go ahead and start modifying the underlying
|
||||
filesystem. Almost.
|
||||
</para>
|
||||
|
||||
|
||||
<para>
|
||||
|
||||
You still need to actually journal your filesystem changes; this
is done by wrapping them into transactions. Additionally you
also need to wrap the modification of each of the buffers
with calls to the journal layer, so it knows what modifications
you are actually making. To do this use journal_start(), which
returns a transaction handle.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
journal_start()
|
||||
and its counterpart journal_stop(), which indicates the end of a transaction,
|
||||
are nestable calls, so you can reenter a transaction if necessary,
|
||||
but remember you must call journal_stop() the same number of times as
|
||||
journal_start() before the transaction is completed (or more accurately
|
||||
leaves the update phase). Ext3/VFS makes use of this feature to simplify
|
||||
quota support.
|
||||
</para>
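<para>The nesting rule can be modelled with a reference count. The sketch
below is a user-space model, not the jbd implementation: all
<emphasis>sketch_</emphasis> names are hypothetical, and the per-task
"current handle" is represented by an explicit structure.
</para>

```c
#include <stddef.h>

/* Model of a transaction handle with a nesting count. */
struct sketch_handle {
	int refs;	/* how many starts are still outstanding */
	int closed;	/* set once the outermost stop has run */
};

/* Model of the per-task state that remembers the open handle. */
struct sketch_task {
	struct sketch_handle *current_handle;
};

/*
 * Start a transaction, or re-enter the one this task already has open.
 * storage is only used when a new handle is needed.
 */
struct sketch_handle *sketch_journal_start(struct sketch_task *task,
					   struct sketch_handle *storage)
{
	if (task->current_handle) {
		task->current_handle->refs++;	/* nested start */
		return task->current_handle;
	}
	storage->refs = 1;
	storage->closed = 0;
	task->current_handle = storage;
	return storage;
}

/*
 * Returns 1 when this was the outermost stop and the handle is now
 * complete, 0 for a nested stop, -1 if no transaction is open.
 */
int sketch_journal_stop(struct sketch_task *task)
{
	struct sketch_handle *h = task->current_handle;

	if (!h)
		return -1;
	if (--h->refs > 0)
		return 0;
	h->closed = 1;
	task->current_handle = NULL;
	return 1;
}
```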
|
||||
|
||||
<para>
|
||||
Inside each transaction you need to wrap the modifications to the
|
||||
individual buffers (blocks). Before you start to modify a buffer you
|
||||
need to call journal_get_{create,write,undo}_access() as appropriate;
this allows the journalling layer to copy the unmodified data if it
needs to. After all, the buffer may be part of a previously uncommitted
transaction.
At this point you are at last ready to modify a buffer, and once
you have done so you need to call journal_dirty_{meta,}data().
Or if you've asked for access to a buffer you now know is no longer
required to be pushed back on the device you can call journal_forget()
in much the same way as you might have used bforget() in the past.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
A journal_flush() may be called at any time to commit and checkpoint
|
||||
all your transactions.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Then at umount time, in your put_super() (2.4) or write_super() (2.5)
|
||||
you can then call journal_destroy() to clean up your in-core journal object.
|
||||
</para>
|
||||
|
||||
|
||||
<para>
|
||||
Unfortunately there are a couple of ways the journal layer can cause a deadlock.
The first thing to note is that each task can only have
a single outstanding transaction at any one time; remember nothing
commits until the outermost journal_stop(). This means
you must complete the transaction at the end of each file/inode/address
etc. operation you perform, so that the journalling system isn't re-entered
on another journal. This matters because transactions can't be nested/batched
across differing journals, and a filesystem other than
yours (say ext3) may be modified in a later syscall.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The second case to bear in mind is that journal_start() can
|
||||
block if there isn't enough space in the journal for your transaction
|
||||
(based on the passed nblocks param) - when it blocks it merely(!) needs to
|
||||
wait for transactions to complete and be committed from other tasks,
|
||||
so essentially we are waiting for journal_stop(). So to avoid
|
||||
deadlocks you must treat journal_start/stop() as if they
|
||||
were semaphores and include them in your semaphore ordering rules to prevent
|
||||
deadlocks. Note that journal_extend() has similar blocking behaviour to
|
||||
journal_start() so you can deadlock here just as easily as on journal_start().
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Try to reserve the right number of blocks the first time. ;-) This will
be the maximum number of blocks you are going to touch in this transaction.
I advise having a look at ext3_jbd.h, at least, to see the basis on which
ext3 makes these decisions.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Another wriggle to watch out for is your on-disk block allocation strategy.
|
||||
Why? Because, if you undo a delete, you need to ensure you haven't reused any
of the freed blocks in a later transaction. One simple way of doing this
is to make sure any blocks you allocate only have checkpointed transactions
|
||||
listed against them. Ext3 does this in ext3_test_allocatable().
|
||||
</para>
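<para>One way to picture such a check is with two bitmaps: the live
allocation bitmap and a copy reflecting what has safely reached disk. The
sketch below is a simplified model with hypothetical names. It assumes one
bit per block, a set bit meaning "in use", and it is not a transcription of
ext3_test_allocatable().
</para>

```c
#include <stddef.h>
#include <stdint.h>

/*
 * A block may be handed out only if it is free in the live bitmap and
 * also free in the committed copy, if there is one. A block freed by
 * a transaction that could still be undone is clear in the live bitmap
 * but still set in the committed copy, so it is refused.
 */
int sketch_test_allocatable(const uint8_t *live, const uint8_t *committed,
			    unsigned int block)
{
	unsigned int byte = block / 8;
	uint8_t mask = (uint8_t)(1u << (block % 8));

	if (live[byte] & mask)
		return 0;	/* in use right now */
	if (committed != NULL && (committed[byte] & mask))
		return 0;	/* freed, but the free is not yet safe */
	return 1;
}
```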
|
||||
|
||||
<para>
|
||||
Locking is also provided through journal_{un,}lock_updates();
ext3 uses this when it wants a window with a clean and stable fs for a moment.
For example:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
|
||||
journal_lock_updates() //stop new stuff happening..
|
||||
journal_flush() // checkpoint everything.
|
||||
..do stuff on stable fs
|
||||
journal_unlock_updates() // carry on with filesystem use.
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
The opportunities for abuse and DOS attacks with this should be obvious,
|
||||
if you allow unprivileged userspace to trigger codepaths containing these
|
||||
calls.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
A new feature of jbd since 2.5.25 is commit callbacks. With the new
journal_callback_set() function you can now ask the journalling layer
to call you back when the transaction is finally committed to disk, so that
you can do some of your own management. The key to this is the journal_callback
struct; this maintains the internal callback information but you can
extend it like this:
|
||||
</para>
|
||||
<programlisting>
|
||||
	struct myfs_callback_s {
		/* Data structure element required by jbd. */
		struct journal_callback for_jbd;
		/* Stuff for myfs allocated together. */
		myfs_inode *i_commited;
	};
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
This would be useful if you needed to know when data was committed to a
|
||||
particular inode.
|
||||
</para>
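<para>Inside the callback, the enclosing structure can be recovered from the
embedded member with the usual offsetof arithmetic. The sketch below uses
stand-in types so that it compiles on its own: every
<emphasis>sketch_</emphasis> name is hypothetical, and the callback signature
(pointer to the embedded struct plus an error code) is an assumption rather
than a statement of the jbd interface.
</para>

```c
#include <stddef.h>

/* Stand-in for the structure the journalling layer knows about. */
struct sketch_journal_callback {
	void (*func)(struct sketch_journal_callback *cb, int error);
};

/* Stand-in for a filesystem inode. */
struct sketch_inode {
	int committed;
};

/* Filesystem-private callback data wrapping the journal's part. */
struct sketch_myfs_callback {
	struct sketch_journal_callback for_jbd;	/* embedded member */
	struct sketch_inode *inode;		/* private payload */
};

/*
 * Called with a pointer to the embedded member. Step back to the start
 * of the enclosing structure to reach the private payload.
 */
void sketch_myfs_commit_done(struct sketch_journal_callback *cb, int error)
{
	struct sketch_myfs_callback *mine =
		(struct sketch_myfs_callback *)((char *)cb -
			offsetof(struct sketch_myfs_callback, for_jbd));

	if (!error)
		mine->inode->committed = 1;
}
```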
|
||||
|
||||
</sect1>
|
||||
|
||||
<sect1>
|
||||
<title>Summary</title>
|
||||
<para>
|
||||
Using the journal is a matter of wrapping the different context changes,
namely each mount, each modification (transaction) and each changed buffer,
to tell the journalling layer about them.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Here is some pseudocode to give you an idea of how it works.
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
  journal_t *my_jrnl = journal_init_{dev,inode}(...);
  if (!initialised) journal_create(my_jrnl);
  if (clean) journal_wipe(my_jrnl, ...);
  journal_load(my_jrnl);

  foreach(transaction) { /* transactions must be
                            completed before
                            a syscall returns to
                            userspace */

          handle_t *xact = journal_start(my_jrnl, nblocks);
          foreach(bh) {
                journal_get_{create,write,undo}_access(xact, bh);
                if (myfs_modify(bh)) { /* returns true
                                          if it makes changes */
                        journal_dirty_{meta,}data(xact, bh);
                } else {
                        journal_forget(xact, bh);
                }
          }
          journal_stop(xact);
  }
  journal_destroy(my_jrnl);
|
||||
</programlisting>
|
||||
</sect1>
|
||||
|
||||
</chapter>
|
||||
|
||||
<chapter id="adt">
|
||||
<title>Data Types</title>
|
||||
<para>
|
||||
The journalling layer uses typedefs to 'hide' the concrete definitions
|
||||
of the structures used. As a client of the JBD layer you can
|
||||
just rely on using the pointer as a magic cookie of some sort.
|
||||
|
||||
Obviously the hiding is not enforced as this is 'C'.
|
||||
</para>
|
||||
<sect1><title>Structures</title>
|
||||
!Iinclude/linux/jbd.h
|
||||
</sect1>
|
||||
</chapter>
|
||||
|
||||
<chapter id="calls">
|
||||
<title>Functions</title>
|
||||
<para>
|
||||
The functions here are split into two groups: those that
affect a journal as a whole, and those which are used to
manage transactions.
|
||||
</para>
|
||||
<sect1><title>Journal Level</title>
|
||||
!Efs/jbd/journal.c
|
||||
!Efs/jbd/recovery.c
|
||||
</sect1>
|
||||
<sect1><title>Transaction Level</title>
|
||||
!Efs/jbd/transaction.c
|
||||
</sect1>
|
||||
</chapter>
|
||||
<chapter>
|
||||
<title>See also</title>
|
||||
<para>
|
||||
<citation>
|
||||
<ulink url="ftp://ftp.uk.linux.org/pub/linux/sct/fs/jfs/journal-design.ps.gz">
|
||||
Journaling the Linux ext2fs Filesystem, LinuxExpo 98, Stephen Tweedie
|
||||
</ulink>
|
||||
</citation>
|
||||
</para>
|
||||
<para>
|
||||
<citation>
|
||||
<ulink url="http://olstrans.sourceforge.net/release/OLS2000-ext3/OLS2000-ext3.html">
|
||||
Ext3 Journalling FileSystem, OLS 2000, Dr. Stephen Tweedie
|
||||
</ulink>
|
||||
</citation>
|
||||
</para>
|
||||
</chapter>
|
||||
|
||||
</book>
|
342
Documentation/DocBook/kernel-api.tmpl
Normal file
|
@ -0,0 +1,342 @@
|
|||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
|
||||
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
|
||||
|
||||
<book id="LinuxKernelAPI">
|
||||
<bookinfo>
|
||||
<title>The Linux Kernel API</title>
|
||||
|
||||
<legalnotice>
|
||||
<para>
|
||||
This documentation is free software; you can redistribute
|
||||
it and/or modify it under the terms of the GNU General Public
|
||||
License as published by the Free Software Foundation; either
|
||||
version 2 of the License, or (at your option) any later
|
||||
version.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
This program is distributed in the hope that it will be
|
||||
useful, but WITHOUT ANY WARRANTY; without even the implied
|
||||
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
|
||||
See the GNU General Public License for more details.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
You should have received a copy of the GNU General Public
|
||||
License along with this program; if not, write to the Free
|
||||
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
|
||||
MA 02111-1307 USA
|
||||
</para>
|
||||
|
||||
<para>
|
||||
For more details see the file COPYING in the source
|
||||
distribution of Linux.
|
||||
</para>
|
||||
</legalnotice>
|
||||
</bookinfo>
|
||||
|
||||
<toc></toc>
|
||||
|
||||
<chapter id="Basics">
|
||||
<title>Driver Basics</title>
|
||||
<sect1><title>Driver Entry and Exit points</title>
|
||||
!Iinclude/linux/init.h
|
||||
</sect1>
|
||||
|
||||
<sect1><title>Atomic and pointer manipulation</title>
|
||||
!Iinclude/asm-i386/atomic.h
|
||||
!Iinclude/asm-i386/unaligned.h
|
||||
</sect1>
|
||||
|
||||
<!-- FIXME:
|
||||
kernel/sched.c has no docs, which stuffs up the sgml. Comment
|
||||
out until somebody adds docs. KAO
|
||||
<sect1><title>Delaying, scheduling, and timer routines</title>
|
||||
X!Ekernel/sched.c
|
||||
</sect1>
|
||||
KAO -->
|
||||
</chapter>
|
||||
|
||||
<chapter id="adt">
|
||||
<title>Data Types</title>
|
||||
<sect1><title>Doubly Linked Lists</title>
|
||||
!Iinclude/linux/list.h
|
||||
</sect1>
|
||||
</chapter>
|
||||
|
||||
<chapter id="libc">
|
||||
<title>Basic C Library Functions</title>
|
||||
|
||||
<para>
|
||||
When writing drivers, you cannot in general use routines which are
|
||||
from the C Library. Some of the functions have been found generally
|
||||
useful and they are listed below. The behaviour of these functions
|
||||
may vary slightly from those defined by ANSI, and these deviations
|
||||
are noted in the text.
|
||||
</para>
|
||||
|
||||
<sect1><title>String Conversions</title>
|
||||
!Ilib/vsprintf.c
|
||||
!Elib/vsprintf.c
|
||||
</sect1>
|
||||
<sect1><title>String Manipulation</title>
|
||||
!Ilib/string.c
|
||||
!Elib/string.c
|
||||
</sect1>
|
||||
<sect1><title>Bit Operations</title>
|
||||
!Iinclude/asm-i386/bitops.h
|
||||
</sect1>
|
||||
</chapter>
|
||||
|
||||
<chapter id="mm">
|
||||
<title>Memory Management in Linux</title>
|
||||
<sect1><title>The Slab Cache</title>
|
||||
!Emm/slab.c
|
||||
</sect1>
|
||||
<sect1><title>User Space Memory Access</title>
|
||||
!Iinclude/asm-i386/uaccess.h
|
||||
!Iarch/i386/lib/usercopy.c
|
||||
</sect1>
|
||||
</chapter>
|
||||
|
||||
<chapter id="kfifo">
|
||||
<title>FIFO Buffer</title>
|
||||
<sect1><title>kfifo interface</title>
|
||||
!Iinclude/linux/kfifo.h
|
||||
!Ekernel/kfifo.c
|
||||
</sect1>
|
||||
</chapter>
|
||||
|
||||
<chapter id="proc">
|
||||
<title>The proc filesystem</title>
|
||||
|
||||
<sect1><title>sysctl interface</title>
|
||||
!Ekernel/sysctl.c
|
||||
</sect1>
|
||||
</chapter>
|
||||
|
||||
<chapter id="debugfs">
|
||||
<title>The debugfs filesystem</title>
|
||||
|
||||
<sect1><title>debugfs interface</title>
|
||||
!Efs/debugfs/inode.c
|
||||
!Efs/debugfs/file.c
|
||||
</sect1>
|
||||
</chapter>
|
||||
|
||||
<chapter id="vfs">
|
||||
<title>The Linux VFS</title>
|
||||
<sect1><title>The Directory Cache</title>
|
||||
!Efs/dcache.c
|
||||
!Iinclude/linux/dcache.h
|
||||
</sect1>
|
||||
<sect1><title>Inode Handling</title>
|
||||
!Efs/inode.c
|
||||
!Efs/bad_inode.c
|
||||
</sect1>
|
||||
<sect1><title>Registration and Superblocks</title>
|
||||
!Efs/super.c
|
||||
</sect1>
|
||||
<sect1><title>File Locks</title>
|
||||
!Efs/locks.c
|
||||
!Ifs/locks.c
|
||||
</sect1>
|
||||
</chapter>
|
||||
|
||||
<chapter id="netcore">
|
||||
<title>Linux Networking</title>
|
||||
<sect1><title>Socket Buffer Functions</title>
|
||||
!Iinclude/linux/skbuff.h
|
||||
!Enet/core/skbuff.c
|
||||
</sect1>
|
||||
<sect1><title>Socket Filter</title>
|
||||
!Enet/core/filter.c
|
||||
</sect1>
|
||||
<sect1><title>Generic Network Statistics</title>
|
||||
!Iinclude/linux/gen_stats.h
|
||||
!Enet/core/gen_stats.c
|
||||
!Enet/core/gen_estimator.c
|
||||
</sect1>
|
||||
</chapter>
|
||||
|
||||
<chapter id="netdev">
|
||||
<title>Network device support</title>
|
||||
<sect1><title>Driver Support</title>
|
||||
!Enet/core/dev.c
|
||||
</sect1>
|
||||
<sect1><title>8390 Based Network Cards</title>
|
||||
!Edrivers/net/8390.c
|
||||
</sect1>
|
||||
<sect1><title>Synchronous PPP</title>
|
||||
!Edrivers/net/wan/syncppp.c
|
||||
</sect1>
|
||||
</chapter>
|
||||
|
||||
<chapter id="modload">
|
||||
<title>Module Support</title>
|
||||
<sect1><title>Module Loading</title>
|
||||
!Ekernel/kmod.c
|
||||
</sect1>
|
||||
<sect1><title>Inter Module support</title>
|
||||
<para>
|
||||
Refer to the file kernel/module.c for more information.
|
||||
</para>
|
||||
<!-- FIXME: Removed for now since no structured comments in source
|
||||
X!Ekernel/module.c
|
||||
-->
|
||||
</sect1>
|
||||
</chapter>
|
||||
|
||||
<chapter id="hardware">
|
||||
<title>Hardware Interfaces</title>
|
||||
<sect1><title>Interrupt Handling</title>
|
||||
!Iarch/i386/kernel/irq.c
|
||||
</sect1>
|
||||
|
||||
<sect1><title>MTRR Handling</title>
|
||||
!Earch/i386/kernel/cpu/mtrr/main.c
|
||||
</sect1>
|
||||
<sect1><title>PCI Support Library</title>
|
||||
!Edrivers/pci/pci.c
|
||||
</sect1>
|
||||
<sect1><title>PCI Hotplug Support Library</title>
|
||||
!Edrivers/pci/hotplug/pci_hotplug_core.c
|
||||
</sect1>
|
||||
<sect1><title>MCA Architecture</title>
|
||||
<sect2><title>MCA Device Functions</title>
|
||||
<para>
|
||||
Refer to the file arch/i386/kernel/mca.c for more information.
|
||||
</para>
|
||||
<!-- FIXME: Removed for now since no structured comments in source
|
||||
X!Earch/i386/kernel/mca.c
|
||||
-->
|
||||
</sect2>
|
||||
<sect2><title>MCA Bus DMA</title>
|
||||
!Iinclude/asm-i386/mca_dma.h
|
||||
</sect2>
|
||||
</sect1>
|
||||
</chapter>
|
||||
|
||||
<chapter id="devfs">
|
||||
<title>The Device File System</title>
|
||||
!Efs/devfs/base.c
|
||||
</chapter>
|
||||
|
||||
<chapter id="security">
|
||||
<title>Security Framework</title>
|
||||
!Esecurity/security.c
|
||||
</chapter>
|
||||
|
||||
<chapter id="pmfuncs">
|
||||
<title>Power Management</title>
|
||||
!Ekernel/power/pm.c
|
||||
</chapter>
|
||||
|
||||
<chapter id="blkdev">
|
||||
<title>Block Devices</title>
|
||||
!Edrivers/block/ll_rw_blk.c
|
||||
</chapter>
|
||||
|
||||
<chapter id="miscdev">
|
||||
<title>Miscellaneous Devices</title>
|
||||
!Edrivers/char/misc.c
|
||||
</chapter>
|
||||
|
||||
<chapter id="viddev">
|
||||
<title>Video4Linux</title>
|
||||
!Edrivers/media/video/videodev.c
|
||||
</chapter>
|
||||
|
||||
<chapter id="snddev">
|
||||
<title>Sound Devices</title>
|
||||
!Esound/sound_core.c
|
||||
<!-- FIXME: Removed for now since no structured comments in source
|
||||
X!Isound/sound_firmware.c
|
||||
-->
|
||||
</chapter>
|
||||
|
||||
<chapter id="uart16x50">
|
||||
<title>16x50 UART Driver</title>
|
||||
!Edrivers/serial/serial_core.c
|
||||
!Edrivers/serial/8250.c
|
||||
</chapter>
|
||||
|
||||
<chapter id="z85230">
|
||||
<title>Z85230 Support Library</title>
|
||||
!Edrivers/net/wan/z85230.c
|
||||
</chapter>
|
||||
|
||||
<chapter id="fbdev">
|
||||
<title>Frame Buffer Library</title>
|
||||
|
||||
<para>
|
||||
The frame buffer drivers depend heavily on four data structures.
|
||||
These structures are declared in include/linux/fb.h. They are
|
||||
fb_info, fb_var_screeninfo, fb_fix_screeninfo and fb_monospecs.
|
||||
The last three can be made available to and from userland.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
fb_info defines the current state of a particular video card.
|
||||
Inside fb_info, there exists a fb_ops structure which is a
|
||||
collection of needed functions to make fbdev and fbcon work.
|
||||
fb_info is only visible to the kernel.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
fb_var_screeninfo is used to describe the features of a video card
|
||||
that are user defined. With fb_var_screeninfo, things such as
|
||||
depth and the resolution may be defined.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The next structure is fb_fix_screeninfo. This defines the
|
||||
properties of a card that are created when a mode is set and can't
|
||||
be changed otherwise. A good example of this is the start of the
|
||||
frame buffer memory. This "locks" the address of the frame buffer
|
||||
memory, so that it cannot be changed or moved.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The last structure is fb_monospecs. In the old API, fb_monospecs
was of little importance, which allowed forbidden things such as
setting a mode of 800x600 on a fixed-frequency monitor. With
the new API, fb_monospecs prevents such things and, if used
correctly, can prevent a monitor from being cooked. fb_monospecs
will not be useful until kernels 2.5.x.
|
||||
</para>
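<para>
The two structures that describe the current mode, fb_var_screeninfo
and fb_fix_screeninfo, can be read from user space. The following is
a minimal sketch, assuming a Linux system with the linux/fb.h header
installed and a frame buffer device node to open. The helper names
fb_query and fb_visible_bytes are made up for this illustration and
are not part of any kernel or library API.
</para>

```c
#include <fcntl.h>
#include <linux/fb.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: read the variable and fixed screen information
 * from a frame buffer device node. Returns 0 on success, -1 on error.
 */
int fb_query(const char *path, struct fb_var_screeninfo *var,
	     struct fb_fix_screeninfo *fix)
{
	int fd = open(path, O_RDONLY);
	int ret = -1;

	if (fd < 0)
		return -1;
	if (ioctl(fd, FBIOGET_VSCREENINFO, var) == 0 &&
	    ioctl(fd, FBIOGET_FSCREENINFO, fix) == 0)
		ret = 0;
	close(fd);
	return ret;
}

/*
 * Hypothetical helper: bytes covered by the visible screen. The
 * resolution comes from the user-definable structure, the line pitch
 * from the fixed one.
 */
unsigned long fb_visible_bytes(const struct fb_var_screeninfo *var,
			       const struct fb_fix_screeninfo *fix)
{
	return (unsigned long)var->yres * fix->line_length;
}
```

<para>
A caller would typically pass a path such as /dev/fb0 to fb_query and
then inspect xres, yres and bits_per_pixel in the variable structure.
</para>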
|
||||
|
||||
<sect1><title>Frame Buffer Memory</title>
|
||||
!Edrivers/video/fbmem.c
|
||||
</sect1>
|
||||
<sect1><title>Frame Buffer Console</title>
|
||||
!Edrivers/video/console/fbcon.c
|
||||
</sect1>
|
||||
<sect1><title>Frame Buffer Colormap</title>
|
||||
!Edrivers/video/fbcmap.c
|
||||
</sect1>
|
||||
<!-- FIXME:
|
||||
drivers/video/fbgen.c has no docs, which stuffs up the sgml. Comment
|
||||
out until somebody adds docs. KAO
|
||||
<sect1><title>Frame Buffer Generic Functions</title>
|
||||
X!Idrivers/video/fbgen.c
|
||||
</sect1>
|
||||
KAO -->
|
||||
<sect1><title>Frame Buffer Video Mode Database</title>
|
||||
!Idrivers/video/modedb.c
|
||||
!Edrivers/video/modedb.c
|
||||
</sect1>
|
||||
<sect1><title>Frame Buffer Macintosh Video Mode Database</title>
|
||||
!Idrivers/video/macmodes.c
|
||||
</sect1>
|
||||
<sect1><title>Frame Buffer Fonts</title>
|
||||
<para>
|
||||
Refer to the file drivers/video/console/fonts.c for more information.
|
||||
</para>
|
||||
<!-- FIXME: Removed for now since no structured comments in source
|
||||
X!Idrivers/video/console/fonts.c
|
||||
-->
|
||||
</sect1>
|
||||
</chapter>
|
||||
</book>
|
1349
Documentation/DocBook/kernel-hacking.tmpl
Normal file
File diff suppressed because it is too large
2088
Documentation/DocBook/kernel-locking.tmpl
Normal file
File diff suppressed because it is too large
282
Documentation/DocBook/libata.tmpl
Normal file
|
@ -0,0 +1,282 @@
|
|||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
|
||||
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
|
||||
|
||||
<book id="libataDevGuide">
|
||||
<bookinfo>
|
||||
<title>libATA Developer's Guide</title>
|
||||
|
||||
<authorgroup>
|
||||
<author>
|
||||
<firstname>Jeff</firstname>
|
||||
<surname>Garzik</surname>
|
||||
</author>
|
||||
</authorgroup>
|
||||
|
||||
<copyright>
|
||||
<year>2003</year>
|
||||
<holder>Jeff Garzik</holder>
|
||||
</copyright>
|
||||
|
||||
<legalnotice>
|
||||
<para>
|
||||
The contents of this file are subject to the Open
|
||||
Software License version 1.1 that can be found at
|
||||
<ulink url="http://www.opensource.org/licenses/osl-1.1.txt">http://www.opensource.org/licenses/osl-1.1.txt</ulink> and is included herein
|
||||
by reference.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Alternatively, the contents of this file may be used under the terms
|
||||
of the GNU General Public License version 2 (the "GPL") as distributed
|
||||
in the kernel source COPYING file, in which case the provisions of
|
||||
the GPL are applicable instead of the above. If you wish to allow
|
||||
the use of your version of this file only under the terms of the
|
||||
GPL and not to allow others to use your version of this file under
|
||||
the OSL, indicate your decision by deleting the provisions above and
|
||||
replace them with the notice and other provisions required by the GPL.
|
||||
If you do not delete the provisions above, a recipient may use your
|
||||
version of this file under either the OSL or the GPL.
|
||||
</para>
|
||||
|
||||
</legalnotice>
|
||||
</bookinfo>
|
||||
|
||||
<toc></toc>
|
||||
|
||||
<chapter id="libataThanks">
|
||||
<title>Thanks</title>
|
||||
<para>
|
||||
The bulk of the ATA knowledge comes thanks to long conversations with
|
||||
Andre Hedrick (www.linux-ide.org).
|
||||
</para>
|
||||
<para>
|
||||
Thanks to Alan Cox for pointing out similarities
|
||||
between SATA and SCSI, and in general for motivation to hack on
|
||||
libata.
|
||||
</para>
|
||||
<para>
|
||||
libata's device detection
|
||||
method, ata_pio_devchk, and in general all the early probing was
|
||||
based on extensive study of Hale Landis's probe/reset code in his
|
||||
ATADRVR driver (www.ata-atapi.com).
|
||||
</para>
|
||||
</chapter>
|
||||
|
||||
<chapter id="libataDriverApi">
|
||||
<title>libata Driver API</title>
|
||||
<sect1>
|
||||
<title>struct ata_port_operations</title>
|
||||
|
||||
<programlisting>
|
||||
void (*port_disable) (struct ata_port *);
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
Called from ata_bus_probe() and ata_bus_reset() error paths,
|
||||
as well as when unregistering from the SCSI module (rmmod, hot
|
||||
unplug).
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
void (*dev_config) (struct ata_port *, struct ata_device *);
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
Called after IDENTIFY [PACKET] DEVICE is issued to each device
|
||||
found. Typically used to apply device-specific fixups prior to
|
||||
issue of SET FEATURES - XFER MODE, and prior to operation.
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
void (*set_piomode) (struct ata_port *, struct ata_device *);
|
||||
void (*set_dmamode) (struct ata_port *, struct ata_device *);
|
||||
void (*post_set_mode) (struct ata_port *ap);
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
Hooks called prior to the issue of SET FEATURES - XFER MODE
|
||||
command. dev->pio_mode is guaranteed to be valid when
|
||||
->set_piomode() is called, and dev->dma_mode is guaranteed to be
|
||||
valid when ->set_dmamode() is called. ->post_set_mode() is
|
||||
called unconditionally, after the SET FEATURES - XFER MODE
|
||||
command completes successfully.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
->set_piomode() is always called (if present), but
|
||||
->set_dmamode() is only called if DMA is possible.
|
||||
</para>
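<para>
The call order described above can be modelled in a few lines of
user-space C. This is only a sketch under stated assumptions: the
mock_ and demo_ types and functions are hypothetical stand-ins, not
libata code, and the xfer_ok argument stands in for the outcome of
the SET FEATURES - XFER MODE command.
</para>

```c
/* Mock stand-ins for struct ata_device and struct ata_port (hypothetical). */
struct mock_dev {
	int pio_mode;      /* valid before set_piomode runs */
	int dma_mode;      /* valid before set_dmamode runs */
	int dma_possible;  /* non-zero if a DMA mode is usable */
};

struct mock_port;

struct mock_port_ops {
	void (*set_piomode)(struct mock_port *ap, struct mock_dev *dev);
	void (*set_dmamode)(struct mock_port *ap, struct mock_dev *dev);
	void (*post_set_mode)(struct mock_port *ap);
};

struct mock_port {
	const struct mock_port_ops *ops;
	int pio_calls, dma_calls, post_calls;  /* bookkeeping for the demo */
};

/*
 * Model of the hook order. Returns 0 on success, -1 if the modelled
 * SET FEATURES - XFER MODE command fails.
 */
int mock_set_mode(struct mock_port *ap, struct mock_dev *dev, int xfer_ok)
{
	/* PIO hook: always called when the driver provides it */
	if (ap->ops->set_piomode)
		ap->ops->set_piomode(ap, dev);
	/* DMA hook: only called when DMA is possible */
	if (dev->dma_possible && ap->ops->set_dmamode)
		ap->ops->set_dmamode(ap, dev);
	/* stand-in for issuing SET FEATURES - XFER MODE */
	if (!xfer_ok)
		return -1;
	/* only reached after the command completed successfully */
	if (ap->ops->post_set_mode)
		ap->ops->post_set_mode(ap);
	return 0;
}

/* Example driver hooks that only count how often they run. */
static void demo_set_piomode(struct mock_port *ap, struct mock_dev *dev)
{
	(void)dev;
	ap->pio_calls++;
}

static void demo_set_dmamode(struct mock_port *ap, struct mock_dev *dev)
{
	(void)dev;
	ap->dma_calls++;
}

static void demo_post_set_mode(struct mock_port *ap)
{
	ap->post_calls++;
}

const struct mock_port_ops demo_ops = {
	.set_piomode = demo_set_piomode,
	.set_dmamode = demo_set_dmamode,
	.post_set_mode = demo_post_set_mode,
};
```

<para>
For a PIO-only device the counters end up as one PIO call, no DMA
call and one post-set call, which mirrors the rule stated in the text.
</para>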
|
||||
|
||||
<programlisting>
|
||||
void (*tf_load) (struct ata_port *ap, struct ata_taskfile *tf);
|
||||
void (*tf_read) (struct ata_port *ap, struct ata_taskfile *tf);
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
->tf_load() is called to load the given taskfile into hardware
|
||||
registers / DMA buffers. ->tf_read() is called to read the
|
||||
hardware registers / DMA buffers, to obtain the current set of
|
||||
taskfile register values.
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
void (*exec_command)(struct ata_port *ap, struct ata_taskfile *tf);
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
Causes an ATA command, previously loaded with
|
||||
->tf_load(), to be initiated in hardware.
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
u8 (*check_status)(struct ata_port *ap);
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
Reads the Status ATA shadow register from hardware. On some
|
||||
hardware, this has the side effect of clearing the interrupt
|
||||
condition.
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
void (*dev_select)(struct ata_port *ap, unsigned int device);
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
Issues the low-level hardware command(s) that causes one of N
|
||||
hardware devices to be considered 'selected' (active and
|
||||
available for use) on the ATA bus.
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
void (*phy_reset) (struct ata_port *ap);
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
The very first step in the probe phase. Actions typically vary
depending on the bus type. After waking up the device and probing
for device presence (PATA and SATA), a soft reset (SRST) is
typically performed. Drivers typically use the helper
functions ata_bus_reset() or sata_phy_reset() for this hook.
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
void (*bmdma_setup) (struct ata_queued_cmd *qc);
|
||||
void (*bmdma_start) (struct ata_queued_cmd *qc);
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
When setting up an IDE BMDMA transaction, these hooks arm
|
||||
(->bmdma_setup) and fire (->bmdma_start) the hardware's DMA
|
||||
engine.
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
void (*qc_prep) (struct ata_queued_cmd *qc);
|
||||
int (*qc_issue) (struct ata_queued_cmd *qc);
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
Higher-level hooks, these two hooks can potentially supersede
|
||||
several of the above taskfile/DMA engine hooks. ->qc_prep is
|
||||
called after the buffers have been DMA-mapped, and is typically
|
||||
used to populate the hardware's DMA scatter-gather table.
|
||||
Most drivers use the standard ata_qc_prep() helper function, but
|
||||
more advanced drivers roll their own.
|
||||
</para>
|
||||
<para>
|
||||
->qc_issue is used to make a command active, once the hardware
|
||||
and S/G tables have been prepared. IDE BMDMA drivers use the
|
||||
helper function ata_qc_issue_prot() for taskfile protocol-based
|
||||
dispatch. More advanced drivers roll their own ->qc_issue
|
||||
implementation, using this as the "issue new ATA command to
|
||||
hardware" hook.
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
void (*eng_timeout) (struct ata_port *ap);
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
This is a high level error handling function, called from the
|
||||
error handling thread, when a command times out.
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
irqreturn_t (*irq_handler)(int, void *, struct pt_regs *);
|
||||
void (*irq_clear) (struct ata_port *);
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
->irq_handler is the interrupt handling routine registered with
|
||||
the system, by libata. ->irq_clear is called during probe just
|
||||
before the interrupt handler is registered, to be sure hardware
|
||||
is quiet.
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
u32 (*scr_read) (struct ata_port *ap, unsigned int sc_reg);
|
||||
void (*scr_write) (struct ata_port *ap, unsigned int sc_reg,
|
||||
u32 val);
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
Read and write standard SATA phy registers. Currently only used
|
||||
if ->phy_reset hook called the sata_phy_reset() helper function.
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
int (*port_start) (struct ata_port *ap);
|
||||
void (*port_stop) (struct ata_port *ap);
|
||||
void (*host_stop) (struct ata_host_set *host_set);
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
->port_start() is called just after the data structures for each
|
||||
port are initialized. Typically this is used to alloc per-port
|
||||
DMA buffers / tables / rings, enable DMA engines, and similar
|
||||
tasks.
|
||||
</para>
|
||||
<para>
|
||||
->host_stop() is called when the rmmod or hot unplug process
|
||||
begins. The hook must stop all hardware interrupts, DMA
|
||||
engines, etc.
|
||||
</para>
|
||||
<para>
|
||||
->port_stop() is called after ->host_stop(). Its sole function
|
||||
is to release DMA/memory resources, now that they are no longer
|
||||
actively being used.
|
||||
</para>
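<para>
The ordering of the three hooks above can be sketched as a user-space
model. Everything named toy_ is hypothetical and exists only for this
illustration; the calloc() call stands in for allocating a per-port
DMA table, and the 256-byte size is arbitrary.
</para>

```c
#include <stdlib.h>

#define TOY_NUM_PORTS 2

struct toy_port {
	void *dma_table;  /* per-port table, owned between start and stop */
};

struct toy_host {
	struct toy_port ports[TOY_NUM_PORTS];
	int hw_active;    /* non-zero while interrupts and DMA may run */
};

/* Models ->port_start(): allocate per-port resources. */
int toy_port_start(struct toy_port *ap)
{
	ap->dma_table = calloc(1, 256);
	return ap->dma_table ? 0 : -1;
}

/* Models ->host_stop(): quiesce the hardware before anything is freed. */
void toy_host_stop(struct toy_host *host)
{
	host->hw_active = 0;
}

/* Models ->port_stop(): release what port_start allocated. */
void toy_port_stop(struct toy_port *ap)
{
	free(ap->dma_table);
	ap->dma_table = NULL;
}

/* Start every port; undo the ports already started if one fails. */
int toy_attach(struct toy_host *host)
{
	size_t i;

	for (i = 0; i < TOY_NUM_PORTS; i++) {
		if (toy_port_start(&host->ports[i]) != 0) {
			while (i-- > 0)
				toy_port_stop(&host->ports[i]);
			return -1;
		}
	}
	host->hw_active = 1;
	return 0;
}

/* Teardown: host_stop first, then port_stop for each port. */
void toy_detach(struct toy_host *host)
{
	size_t i;

	toy_host_stop(host);
	for (i = 0; i < TOY_NUM_PORTS; i++)
		toy_port_stop(&host->ports[i]);
}
```

<para>
Stopping the hardware before freeing the tables is the point of the
ordering: once toy_host_stop() has run, nothing can still be using the
memory that toy_port_stop() releases.
</para>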
|
||||
|
||||
</sect1>
|
||||
</chapter>
|
||||
|
||||
<chapter id="libataExt">
|
||||
<title>libata Library</title>
|
||||
!Edrivers/scsi/libata-core.c
|
||||
</chapter>
|
||||
|
||||
<chapter id="libataInt">
|
||||
<title>libata Core Internals</title>
|
||||
!Idrivers/scsi/libata-core.c
|
||||
</chapter>
|
||||
|
||||
<chapter id="libataScsiInt">
|
||||
<title>libata SCSI translation/emulation</title>
|
||||
!Edrivers/scsi/libata-scsi.c
|
||||
!Idrivers/scsi/libata-scsi.c
|
||||
</chapter>
|
||||
|
||||
<chapter id="PiixInt">
|
||||
<title>ata_piix Internals</title>
|
||||
!Idrivers/scsi/ata_piix.c
|
||||
</chapter>
|
||||
|
||||
<chapter id="SILInt">
|
||||
<title>sata_sil Internals</title>
|
||||
!Idrivers/scsi/sata_sil.c
|
||||
</chapter>
|
||||
|
||||
</book>
|
289
Documentation/DocBook/librs.tmpl
Normal file
|
@ -0,0 +1,289 @@
|
|||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
|
||||
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
|
||||
|
||||
<book id="Reed-Solomon-Library-Guide">
|
||||
<bookinfo>
|
||||
<title>Reed-Solomon Library Programming Interface</title>
|
||||
|
||||
<authorgroup>
|
||||
<author>
|
||||
<firstname>Thomas</firstname>
|
||||
<surname>Gleixner</surname>
|
||||
<affiliation>
|
||||
<address>
|
||||
<email>tglx@linutronix.de</email>
|
||||
</address>
|
||||
</affiliation>
|
||||
</author>
|
||||
</authorgroup>
|
||||
|
||||
<copyright>
|
||||
<year>2004</year>
|
||||
<holder>Thomas Gleixner</holder>
|
||||
</copyright>
|
||||
|
||||
<legalnotice>
|
||||
<para>
|
||||
This documentation is free software; you can redistribute
|
||||
it and/or modify it under the terms of the GNU General Public
|
||||
License version 2 as published by the Free Software Foundation.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
This program is distributed in the hope that it will be
|
||||
useful, but WITHOUT ANY WARRANTY; without even the implied
|
||||
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
|
||||
See the GNU General Public License for more details.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
You should have received a copy of the GNU General Public
|
||||
License along with this program; if not, write to the Free
|
||||
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
|
||||
MA 02111-1307 USA
|
||||
</para>
|
||||
|
||||
<para>
|
||||
For more details see the file COPYING in the source
|
||||
distribution of Linux.
|
||||
</para>
|
||||
</legalnotice>
|
||||
</bookinfo>
|
||||
|
||||
<toc></toc>
|
||||
|
||||
<chapter id="intro">
|
||||
<title>Introduction</title>
|
||||
<para>
|
||||
The generic Reed-Solomon Library provides encoding, decoding
|
||||
and error correction functions.
|
||||
</para>
|
||||
<para>
|
||||
Reed-Solomon codes are used in communication and storage
|
||||
applications to ensure data integrity.
|
||||
</para>
|
||||
<para>
|
||||
This documentation is provided for developers who want to utilize
|
||||
the functions provided by the library.
|
||||
</para>
|
||||
</chapter>
|
||||
|
||||
<chapter id="bugs">
|
||||
<title>Known Bugs And Assumptions</title>
|
||||
<para>
|
||||
None.
|
||||
</para>
|
||||
</chapter>
|
||||
|
||||
<chapter id="usage">
|
||||
<title>Usage</title>
|
||||
<para>
|
||||
This chapter provides examples of how to use the library.
|
||||
</para>
|
||||
<sect1>
|
||||
<title>Initializing</title>
|
||||
<para>
|
||||
The init function init_rs returns a pointer to a
|
||||
rs decoder structure, which holds the necessary
|
||||
information for encoding, decoding and error correction
|
||||
with the given polynomial. It either uses an existing
|
||||
matching decoder or creates a new one. On creation all
|
||||
the lookup tables for fast en/decoding are created.
|
||||
The function may take a while, so make sure not to
|
||||
call it in critical code paths.
|
||||
</para>
|
||||
<programlisting>
|
||||
/* the Reed Solomon control structure */
|
||||
static struct rs_control *rs_decoder;
|
||||
|
||||
/* Symbolsize is 10 (bits)
* Primitive polynomial is x^10+x^3+1
* first consecutive root is 0
* primitive element to generate roots = 1
* generator polynomial degree (number of roots) = 6
|
||||
*/
|
||||
rs_decoder = init_rs (10, 0x409, 0, 1, 6);
|
||||
</programlisting>
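<para>
The second argument in the listing above, 0x409, is the primitive
polynomial written as a bit mask: bit n is set for each x^n term, so
x^10+x^3+1 becomes 0x400 + 0x008 + 0x001. The following user-space
sketch shows the conversion. The helper names poly_to_mask and
example_gfpoly are hypothetical and not part of the library.
</para>

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of the Reed-Solomon library): build
 * the bit mask of a polynomial over GF(2) from the exponents of its
 * non-zero terms.
 */
unsigned int poly_to_mask(const unsigned int *exponents, size_t count)
{
	unsigned int mask = 0;
	size_t i;

	for (i = 0; i < count; i++)
		mask |= 1u << exponents[i];
	return mask;
}

/* x^10 + x^3 + 1, the polynomial used in the initialization example */
unsigned int example_gfpoly(void)
{
	static const unsigned int terms[] = { 10, 3, 0 };

	return poly_to_mask(terms, sizeof(terms) / sizeof(terms[0]));
}
```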
|
||||
</sect1>
|
||||
<sect1>
|
||||
<title>Encoding</title>
|
||||
<para>
|
||||
The encoder calculates the Reed-Solomon code over
|
||||
the given data length and stores the result in
|
||||
the parity buffer. Note that the parity buffer must
|
||||
be initialized before calling the encoder.
|
||||
</para>
|
||||
<para>
|
||||
The expanded data can be inverted on the fly by
providing a non-zero inversion mask. The expanded data is
XOR'ed with the mask. This is used, e.g., for FLASH
ECC, where data of all 0xFF is inverted to all 0x00.
The Reed-Solomon code for all 0x00 is all 0x00. The
code is inverted before storing to FLASH, so it is 0xFF
too. This prevents ECC errors when reading from an
erased FLASH.
|
||||
</para>
|
||||
<para>
|
||||
The databytes are expanded to the given symbol size
|
||||
on the fly. There is no support for encoding continuous
|
||||
bitstreams with a symbol size != 8 at the moment. If
it becomes necessary, such functionality should not be
difficult to implement.
|
||||
</para>
|
||||
<programlisting>
|
||||
/* Parity buffer. Size = number of roots */
|
||||
uint16_t par[6];
|
||||
/* Initialize the parity buffer */
|
||||
memset(par, 0, sizeof(par));
|
||||
/* Encode 512 byte in data8. Store parity in buffer par */
|
||||
encode_rs8 (rs_decoder, data8, 512, par, 0);
|
||||
</programlisting>
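<para>
The effect of the inversion mask on erased FLASH can be demonstrated
with a toy encoder. The sketch below is not the library's code: it is
a minimal systematic Reed-Solomon encoder over GF(2^8), assuming the
field polynomial 0x11d, six roots starting at alpha^0 and a 128-byte
page. All toy_ names are hypothetical. As with the library encoder,
the caller must clear the parity buffer first.
</para>

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_NROOTS 6      /* parity symbols, as in the text's example */
#define TOY_GFPOLY 0x11d  /* x^8+x^4+x^3+x^2+1, an assumption of this sketch */

static uint8_t gf_exp[510];  /* alpha^i, stored twice to avoid a modulo */
static uint8_t gf_log[256];
static uint8_t genpoly[TOY_NROOTS + 1];

static uint8_t gf_mul(uint8_t a, uint8_t b)
{
	if (a == 0 || b == 0)
		return 0;
	return gf_exp[gf_log[a] + gf_log[b]];
}

/* Build the field tables and g(x) = (x + alpha^0)...(x + alpha^5). */
void toy_rs_init(void)
{
	unsigned int x = 1;
	int i, j;

	for (i = 0; i < 255; i++) {
		gf_exp[i] = gf_exp[i + 255] = (uint8_t)x;
		gf_log[x] = (uint8_t)i;
		x <<= 1;
		if (x & 0x100)
			x ^= TOY_GFPOLY;
	}
	memset(genpoly, 0, sizeof(genpoly));
	genpoly[0] = 1;
	for (i = 0; i < TOY_NROOTS; i++) {
		/* multiply g(x) by (x + alpha^i), highest coefficient first */
		for (j = i + 1; j > 0; j--)
			genpoly[j] = genpoly[j - 1] ^ gf_mul(genpoly[j], gf_exp[i]);
		genpoly[0] = gf_mul(genpoly[0], gf_exp[i]);
	}
}

/* Parity = remainder of (data ^ invmsk) * x^TOY_NROOTS divided by g(x). */
void toy_rs_encode(const uint8_t *data, size_t len, uint8_t invmsk,
		   uint8_t par[TOY_NROOTS])
{
	size_t i;
	int j;

	for (i = 0; i < len; i++) {
		uint8_t fb = (uint8_t)((data[i] ^ invmsk) ^ par[TOY_NROOTS - 1]);

		for (j = TOY_NROOTS - 1; j > 0; j--)
			par[j] = par[j - 1] ^ gf_mul(fb, genpoly[j]);
		par[0] = gf_mul(fb, genpoly[0]);
	}
}

/* Returns 1 if an erased page (all 0xFF) yields stored parity of all 0xFF. */
int erased_page_parity_is_ff(void)
{
	uint8_t page[128];
	uint8_t par[TOY_NROOTS] = { 0 };
	int i;

	memset(page, 0xFF, sizeof(page));
	toy_rs_init();
	toy_rs_encode(page, sizeof(page), 0xFF, par);
	for (i = 0; i < TOY_NROOTS; i++)
		if ((uint8_t)(par[i] ^ 0xFF) != 0xFF)  /* invert before storing */
			return 0;
	return 1;
}
```

<para>
Because the code is linear, the inverted page of all 0x00 encodes to
parity of all 0x00, and inverting that parity gives the 0xFF bytes an
erased FLASH would return anyway.
</para>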
|
||||
</sect1>
|
||||
<sect1>
|
||||
<title>Decoding</title>
|
||||
<para>
|
||||
The decoder calculates the syndrome over
|
||||
the given data length and the received parity symbols
|
||||
and corrects errors in the data.
|
||||
</para>
|
||||
<para>
|
||||
If a syndrome is available from a hardware decoder
|
||||
then the syndrome calculation is skipped.
|
||||
</para>
|
||||
<para>
|
||||
The correction of the data buffer can be suppressed
|
||||
by providing a correction pattern buffer and an error
|
||||
location buffer to the decoder. The decoder stores the
|
||||
calculated error location and the correction bitmask
|
||||
in the given buffers. This is useful for hardware
|
||||
decoders which use a weird bit ordering scheme.
|
||||
</para>
|
||||
<para>
|
||||
The databytes are expanded to the given symbol size
|
||||
on the fly. There is no support for decoding continuous
|
||||
bitstreams with a symbol size != 8 at the moment. If
it becomes necessary, such functionality should not be
difficult to implement.
|
||||
</para>
|
||||
|
||||
<sect2>
|
||||
<title>
|
||||
Decoding with syndrome calculation, direct data correction
|
||||
</title>
|
||||
<programlisting>
|
||||
/* Parity buffer. Size = number of roots */
|
||||
uint16_t par[6];
|
||||
uint8_t data8[512];
|
||||
int numerr;
|
||||
/* Receive data */
|
||||
.....
|
||||
/* Receive parity */
|
||||
.....
|
||||
/* Decode 512 byte in data8.*/
|
||||
numerr = decode_rs8 (rs_decoder, data8, par, 512, NULL, 0, NULL, 0, NULL);
|
||||
</programlisting>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>
|
||||
Decoding with syndrome given by hardware decoder, direct data correction
|
||||
</title>
|
||||
<programlisting>
|
||||
/* Parity buffer. Size = number of roots */
|
||||
uint16_t par[6], syn[6];
|
||||
uint8_t data8[512];
|
||||
int numerr;
|
||||
/* Receive data */
|
||||
.....
|
||||
/* Receive parity */
|
||||
.....
|
||||
/* Get syndrome from hardware decoder */
|
||||
.....
|
||||
/* Decode 512 byte in data8.*/
|
||||
numerr = decode_rs8 (rs_decoder, data8, par, 512, syn, 0, NULL, 0, NULL);
|
||||
</programlisting>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>
|
||||
Decoding with syndrome given by hardware decoder, no direct data correction.
|
||||
</title>
|
||||
<para>
|
||||
Note: It's not necessary to give data and received parity to the decoder.
|
||||
</para>
|
||||
<programlisting>
|
||||
/* Parity buffer. Size = number of roots */
|
||||
uint16_t par[6], syn[6], corr[8];
|
||||
uint8_t data8[512];
int numerr, errpos[8], i;
|
||||
/* Receive data */
|
||||
.....
|
||||
/* Receive parity */
|
||||
.....
|
||||
/* Get syndrome from hardware decoder */
|
||||
.....
|
||||
/* Decode 512 byte in data8.*/
|
||||
numerr = decode_rs8 (rs_decoder, NULL, NULL, 512, syn, 0, errpos, 0, corr);
|
||||
for (i = 0; i < numerr; i++) {
|
||||
do_error_correction_in_your_buffer(errpos[i], corr[i]);
|
||||
}
|
||||
</programlisting>
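<para>
The placeholder do_error_correction_in_your_buffer() in the listing
above is left to the caller. One possible shape is sketched below in
plain user-space C. The helper names are hypothetical, and the sketch
assumes that each reported position indexes a symbol in the caller's
data buffer and that positions at or beyond the data length refer to
parity and can be skipped. A caller with an unusual bit ordering would
remap the position and pattern before applying them.
</para>

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: apply one correction pattern to the buffer.
 * Returns 1 if a data byte was changed, 0 if the position was outside
 * the data (assumed here to mean a parity symbol).
 */
int apply_correction(uint8_t *buf, size_t len, int pos, uint16_t pattern)
{
	if (pos < 0 || (size_t)pos >= len)
		return 0;
	buf[pos] ^= (uint8_t)pattern;  /* the pattern is a bitmask to XOR in */
	return 1;
}

/*
 * Hypothetical helper: apply all corrections reported by the decoder.
 * A negative numerr (decode failure) applies nothing.
 */
int apply_corrections(uint8_t *buf, size_t len, const int *errpos,
		      const uint16_t *corr, int numerr)
{
	int i, fixed = 0;

	for (i = 0; i < numerr; i++)
		fixed += apply_correction(buf, len, errpos[i], corr[i]);
	return fixed;
}
```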
|
||||
</sect2>
|
||||
</sect1>
|
||||
<sect1>
|
||||
<title>Cleanup</title>
|
||||
<para>
|
||||
The function free_rs frees the allocated resources,
|
||||
if the caller is the last user of the decoder.
|
||||
</para>
|
||||
<programlisting>
|
||||
/* Release resources */
|
||||
free_rs(rs_decoder);
|
||||
</programlisting>
|
||||
</sect1>
|
||||
|
||||
</chapter>
|
||||
|
||||
<chapter id="structs">
|
||||
<title>Structures</title>
|
||||
<para>
|
||||
This chapter contains the autogenerated documentation of the structures which are
|
||||
used in the Reed-Solomon Library and are relevant for a developer.
|
||||
</para>
|
||||
!Iinclude/linux/rslib.h
|
||||
</chapter>
|
||||
|
||||
<chapter id="pubfunctions">
|
||||
<title>Public Functions Provided</title>
|
||||
<para>
|
||||
This chapter contains the autogenerated documentation of the Reed-Solomon functions
|
||||
which are exported.
|
||||
</para>
|
||||
!Elib/reed_solomon/reed_solomon.c
|
||||
</chapter>
|
||||
|
||||
<chapter id="credits">
|
||||
<title>Credits</title>
|
||||
<para>
|
||||
The library code for encoding and decoding was written by Phil Karn.
|
||||
</para>
|
||||
<programlisting>
|
||||
Copyright 2002, Phil Karn, KA9Q
|
||||
May be used under the terms of the GNU General Public License (GPL)
|
||||
</programlisting>
|
||||
<para>
|
||||
The wrapper functions and interfaces were written by Thomas Gleixner.
|
||||
</para>
|
||||
<para>
|
||||
Many users have provided bugfixes, improvements and helping hands for testing.
|
||||
Thanks a lot.
|
||||
</para>
|
||||
<para>
|
||||
The following people have contributed to this document:
|
||||
</para>
|
||||
<para>
|
||||
Thomas Gleixner <email>tglx@linutronix.de</email>
|
||||
</para>
|
||||
</chapter>
|
||||
</book>
|
265
Documentation/DocBook/lsm.tmpl
Normal file
|
@ -0,0 +1,265 @@
|
|||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
|
||||
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
|
||||
|
||||
<article class="whitepaper" id="LinuxSecurityModule" lang="en">
|
||||
<articleinfo>
|
||||
<title>Linux Security Modules: General Security Hooks for Linux</title>
|
||||
<authorgroup>
|
||||
<author>
|
||||
<firstname>Stephen</firstname>
|
||||
<surname>Smalley</surname>
|
||||
<affiliation>
|
||||
<orgname>NAI Labs</orgname>
|
||||
<address><email>ssmalley@nai.com</email></address>
|
||||
</affiliation>
|
||||
</author>
|
||||
<author>
|
||||
<firstname>Timothy</firstname>
|
||||
<surname>Fraser</surname>
|
||||
<affiliation>
|
||||
<orgname>NAI Labs</orgname>
|
||||
<address><email>tfraser@nai.com</email></address>
|
||||
</affiliation>
|
||||
</author>
|
||||
<author>
|
||||
<firstname>Chris</firstname>
|
||||
<surname>Vance</surname>
|
||||
<affiliation>
|
||||
<orgname>NAI Labs</orgname>
|
||||
<address><email>cvance@nai.com</email></address>
|
||||
</affiliation>
|
||||
</author>
|
||||
</authorgroup>
|
||||
</articleinfo>
|
||||
|
||||
<sect1><title>Introduction</title>
|
||||
|
||||
<para>
|
||||
In March 2001, the National Security Agency (NSA) gave a presentation
|
||||
about Security-Enhanced Linux (SELinux) at the 2.5 Linux Kernel
|
||||
Summit. SELinux is an implementation of flexible and fine-grained
|
||||
nondiscretionary access controls in the Linux kernel, originally
|
||||
implemented as its own particular kernel patch. Several other
|
||||
security projects (e.g. RSBAC, Medusa) have also developed flexible
|
||||
access control architectures for the Linux kernel, and various
|
||||
projects have developed particular access control models for Linux
|
||||
(e.g. LIDS, DTE, SubDomain). Each project has developed and
|
||||
maintained its own kernel patch to support its security needs.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
In response to the NSA presentation, Linus Torvalds made a set of
|
||||
remarks that described a security framework he would be willing to
|
||||
consider for inclusion in the mainstream Linux kernel. He described a
|
||||
general framework that would provide a set of security hooks to
|
||||
control operations on kernel objects and a set of opaque security
|
||||
fields in kernel data structures for maintaining security attributes.
|
||||
This framework could then be used by loadable kernel modules to
|
||||
implement any desired model of security. Linus also suggested the
|
||||
possibility of migrating the Linux capabilities code into such a
|
||||
module.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The Linux Security Modules (LSM) project was started by WireX to
|
||||
develop such a framework. LSM is a joint development effort by
|
||||
several security projects, including Immunix, SELinux, SGI and Janus,
|
||||
and several individuals, including Greg Kroah-Hartman and James
|
||||
Morris, to develop a Linux kernel patch that implements this
|
||||
framework. The patch is currently tracking the 2.4 series and is
|
||||
targeted for integration into the 2.5 development series. This
|
||||
technical report provides an overview of the framework and the example
|
||||
capabilities security module provided by the LSM kernel patch.
|
||||
</para>
|
||||
|
||||
</sect1>
|
||||
|
||||
<sect1 id="framework"><title>LSM Framework</title>
|
||||
|
||||
<para>
|
||||
The LSM kernel patch provides a general kernel framework to support
|
||||
security modules. In particular, the LSM framework is primarily
|
||||
focused on supporting access control modules, although future
|
||||
development is likely to address other security needs such as
|
||||
auditing. By itself, the framework does not provide any additional
|
||||
security; it merely provides the infrastructure to support security
|
||||
modules. The LSM kernel patch also moves most of the capabilities
|
||||
logic into an optional security module, with the system defaulting
|
||||
to the traditional superuser logic. This capabilities module
|
||||
is discussed further in <xref linkend="cap"/>.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The LSM kernel patch adds security fields to kernel data structures
|
||||
and inserts calls to hook functions at critical points in the kernel
|
||||
code to manage the security fields and to perform access control. It
|
||||
also adds functions for registering and unregistering security
|
||||
modules, and adds a general <function>security</function> system call
|
||||
to support new system calls for security-aware applications.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The LSM security fields are simply <type>void*</type> pointers. For
|
||||
process and program execution security information, security fields
|
||||
were added to <structname>struct task_struct</structname> and
|
||||
<structname>struct linux_binprm</structname>. For filesystem security
|
||||
information, a security field was added to
|
||||
<structname>struct super_block</structname>. For pipe, file, and socket
|
||||
security information, security fields were added to
|
||||
<structname>struct inode</structname> and
|
||||
<structname>struct file</structname>. For packet and network device security
|
||||
information, security fields were added to
|
||||
<structname>struct sk_buff</structname> and
|
||||
<structname>struct net_device</structname>. For System V IPC security
|
||||
information, security fields were added to
|
||||
<structname>struct kern_ipc_perm</structname> and
|
||||
<structname>struct msg_msg</structname>; additionally, the definitions
|
||||
for <structname>struct msg_msg</structname>, <structname>struct
|
||||
msg_queue</structname>, and <structname>struct
|
||||
shmid_kernel</structname> were moved to header files
|
||||
(<filename>include/linux/msg.h</filename> and
|
||||
<filename>include/linux/shm.h</filename> as appropriate) to allow
|
||||
the security modules to use these definitions.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Each LSM hook is a function pointer in a global table,
|
||||
security_ops. This table is a
|
||||
<structname>security_operations</structname> structure as defined by
|
||||
<filename>include/linux/security.h</filename>. Detailed documentation
|
||||
for each hook is included in this header file. At present, this
|
||||
structure consists of a collection of substructures that group related
|
||||
hooks based on the kernel object (e.g. task, inode, file, sk_buff,
|
||||
etc.) as well as some top-level hook function pointers for system
|
||||
operations. This structure is likely to be flattened in the future
|
||||
for performance. The placement of the hook calls in the kernel code
|
||||
is described by the "called:" lines in the per-hook documentation in
|
||||
the header file. The hook calls can also be easily found in the
|
||||
kernel code by looking for the string "security_ops->".
|
||||
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Linus mentioned per-process security hooks in his original remarks as a
|
||||
possible alternative to global security hooks. However, if LSM were
|
||||
to start from the perspective of per-process hooks, then the base
|
||||
framework would have to deal with how to handle operations that
|
||||
involve multiple processes (e.g. kill), since each process might have
|
||||
its own hook for controlling the operation. This would require a
|
||||
general mechanism for composing hooks in the base framework.
|
||||
Additionally, LSM would still need global hooks for operations that
|
||||
have no process context (e.g. network input operations).
|
||||
Consequently, LSM provides global security hooks, but a security
|
||||
module is free to implement per-process hooks (where that makes sense)
|
||||
by storing a security_ops table in each process' security field and
|
||||
then invoking these per-process hooks from the global hooks.
|
||||
The problem of composition is thus deferred to the module.
|
||||
</para>
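<para>
The per-process arrangement described above can be sketched in plain
userspace C. This is only a model under stated assumptions:
<structname>struct task</structname>, <structname>struct
hook_table</structname>, <function>module_task_kill</function> and the
"deny if either side denies" composition rule are all hypothetical names
and choices made for illustration, not part of the LSM patch.
</para>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct task;

struct hook_table {
	/* One illustrative hook: may 'sender' signal 'target'? 0 allows. */
	int (*task_kill)(struct task *sender, struct task *target, int sig);
};

struct task {
	int pid;
	void *security;		/* opaque field owned by the security module */
};

/* A per-process policy that refuses every signal. */
static int deny_kill(struct task *sender, struct task *target, int sig)
{
	(void)sender;
	(void)target;
	(void)sig;
	return -1;
}

static struct hook_table deny_all = { .task_kill = deny_kill };

/*
 * The module's single global hook.  It consults the table stored in the
 * security field of each process involved, if there is one.  The
 * composition rule (deny as soon as either side denies) is this sketch's
 * own choice: the framework leaves that decision to the module.
 */
int module_task_kill(struct task *sender, struct task *target, int sig)
{
	struct task *involved[2] = { sender, target };
	size_t i;

	assert(sender != NULL && target != NULL);

	for (i = 0; i < 2; i++) {
		struct hook_table *ops = involved[i]->security;

		if (ops != NULL && ops->task_kill != NULL) {
			int rc = ops->task_kill(sender, target, sig);

			if (rc != 0)
				return rc;
		}
	}
	return 0;
}
```

<para>
Attaching <varname>deny_all</varname> to either task's security field
makes the global hook refuse the signal; with no table attached, the
operation is allowed.
</para>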
|
||||
|
||||
<para>
|
||||
The global security_ops table is initialized to a set of hook
|
||||
functions provided by a dummy security module that provides
|
||||
traditional superuser logic. A <function>register_security</function>
|
||||
function (in <filename>security/security.c</filename>) is provided to
|
||||
allow a security module to set security_ops to refer to its own hook
|
||||
functions, and an <function>unregister_security</function> function is
|
||||
provided to revert security_ops to the dummy module hooks. This
|
||||
mechanism is used to set the primary security module, which is
|
||||
responsible for making the final decision for each hook.
|
||||
</para>
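<para>
As a rough userspace illustration of this mechanism, the sketch below
models a one-hook table, a dummy module with superuser logic, and the
two registration functions. The single <function>capable</function>
hook, the integer arguments, the <function>hook_capable</function> call
site and the rule that registration fails while another primary module
is installed are assumptions made for the example; the real structure
and return codes are defined in
<filename>include/linux/security.h</filename> and
<filename>security/security.c</filename>.
</para>

```c
#include <assert.h>
#include <stddef.h>

/* Flattened, one-hook stand-in for struct security_operations. */
struct security_operations {
	int (*capable)(int euid, int cap);	/* 0 allows, non-zero denies */
};

/* Dummy module: traditional superuser logic, euid 0 may do anything. */
static int dummy_capable(int euid, int cap)
{
	(void)cap;
	return euid == 0 ? 0 : -1;
}

static struct security_operations dummy_security_ops = {
	.capable = dummy_capable,
};

/* The global table starts out pointing at the dummy hooks. */
static struct security_operations *security_ops = &dummy_security_ops;

int register_security(struct security_operations *ops)
{
	assert(ops != NULL && ops->capable != NULL);

	/* Assumption: only one primary module may be installed at a time. */
	if (security_ops != &dummy_security_ops)
		return -1;
	security_ops = ops;
	return 0;
}

int unregister_security(struct security_operations *ops)
{
	/* Only the currently installed primary module may be removed. */
	if (ops != security_ops || ops == &dummy_security_ops)
		return -1;
	security_ops = &dummy_security_ops;	/* revert to the dummy hooks */
	return 0;
}

/* What a hook call site in the base kernel boils down to. */
int hook_capable(int euid, int cap)
{
	return security_ops->capable(euid, cap);
}

/* Example module: refuses (hypothetical) capability 1 even to euid 0. */
static int strict_capable(int euid, int cap)
{
	if (cap == 1)
		return -1;
	return dummy_capable(euid, cap);
}

static struct security_operations strict_ops = {
	.capable = strict_capable,
};
```

<para>
After <function>register_security</function> succeeds with
<varname>strict_ops</varname>, every call through
<function>hook_capable</function> reaches the module's hook; after
<function>unregister_security</function>, the dummy behaviour returns.
</para>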
|
||||
|
||||
<para>
|
||||
LSM also provides a simple mechanism for stacking additional security
|
||||
modules with the primary security module. It defines
|
||||
<function>register_security</function> and
|
||||
<function>unregister_security</function> hooks in the
|
||||
<structname>security_operations</structname> structure and provides
|
||||
<function>mod_reg_security</function> and
|
||||
<function>mod_unreg_security</function> functions that invoke these
|
||||
hooks after performing some sanity checking. A security module can
|
||||
call these functions in order to stack with other modules. However,
|
||||
the actual details of how this stacking is handled are deferred to the
|
||||
module, which can implement these hooks in any way it wishes
|
||||
(including always returning an error if it does not wish to support
|
||||
stacking). In this manner, LSM again defers the problem of
|
||||
composition to the module.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Although the LSM hooks are organized into substructures based on
|
||||
kernel object, all of the hooks can be viewed as falling into two
|
||||
major categories: hooks that are used to manage the security fields
|
||||
and hooks that are used to perform access control. Examples of the
|
||||
first category of hooks include the
|
||||
<function>alloc_security</function> and
|
||||
<function>free_security</function> hooks defined for each kernel data
|
||||
structure that has a security field. These hooks are used to allocate
|
||||
and free security structures for kernel objects. The first category
|
||||
of hooks also includes hooks that set information in the security
|
||||
field after allocation, such as the <function>post_lookup</function>
|
||||
hook in <structname>struct inode_security_ops</structname>. This hook
|
||||
is used to set security information for inodes after successful lookup
|
||||
operations. An example of the second category of hooks is the
|
||||
<function>permission</function> hook in
|
||||
<structname>struct inode_security_ops</structname>. This hook checks
|
||||
permission when accessing an inode.
|
||||
</para>
|
||||
|
||||
</sect1>
|
||||
|
||||
<sect1 id="cap"><title>LSM Capabilities Module</title>
|
||||
|
||||
<para>
|
||||
The LSM kernel patch moves most of the existing POSIX.1e capabilities
|
||||
logic into an optional security module stored in the file
|
||||
<filename>security/capability.c</filename>. This change allows
|
||||
users who do not want to use capabilities to omit this code entirely
|
||||
from their kernel, instead using the dummy module for traditional
|
||||
superuser logic or any other module that they desire. This change
|
||||
also allows the developers of the capabilities logic to maintain and
|
||||
enhance their code more freely, without needing to integrate patches
|
||||
back into the base kernel.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
In addition to moving the capabilities logic, the LSM kernel patch
|
||||
could move the capability-related fields from the kernel data
|
||||
structures into the new security fields managed by the security
|
||||
modules. However, at present, the LSM kernel patch leaves the
|
||||
capability fields in the kernel data structures. In his original
|
||||
remarks, Linus suggested that this might be preferable so that other
|
||||
security modules can be easily stacked with the capabilities module
|
||||
without needing to chain multiple security structures on the security field.
|
||||
It also avoids imposing extra overhead on the capabilities module
|
||||
to manage the security fields. However, the LSM framework could
|
||||
certainly support such a move if it is determined to be desirable,
|
||||
with only a few additional changes described below.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
At present, the capabilities logic for computing process capabilities
|
||||
on <function>execve</function> and <function>set*uid</function>,
|
||||
checking capabilities for a particular process, saving and checking
|
||||
capabilities for netlink messages, and handling the
|
||||
<function>capget</function> and <function>capset</function> system
|
||||
calls has been moved into the capabilities module. There are still a
|
||||
few locations in the base kernel where capability-related fields are
|
||||
directly examined or modified, but the current version of the LSM
|
||||
patch does allow a security module to completely replace the
|
||||
assignment and testing of capabilities. These few locations would
|
||||
need to be changed if the capability-related fields were moved into
|
||||
the security field. The following is a list of known locations that
|
||||
still perform such direct examination or modification of
|
||||
capability-related fields:
|
||||
<itemizedlist>
|
||||
<listitem><para><filename>fs/open.c</filename>:<function>sys_access</function></para></listitem>
|
||||
<listitem><para><filename>fs/lockd/host.c</filename>:<function>nlm_bind_host</function></para></listitem>
|
||||
<listitem><para><filename>fs/nfsd/auth.c</filename>:<function>nfsd_setuser</function></para></listitem>
|
||||
<listitem><para><filename>fs/proc/array.c</filename>:<function>task_cap</function></para></listitem>
|
||||
</itemizedlist>
|
||||
</para>
|
||||
|
||||
</sect1>
|
||||
|
||||
</article>
|
3
Documentation/DocBook/man/Makefile
Normal file
|
@ -0,0 +1,3 @@
|
|||
# Rules are put in Documentation/DocBook
|
||||
|
||||
clean-files := *.9.gz *.sgml manpage.links manpage.refs
|
107
Documentation/DocBook/mcabook.tmpl
Normal file
|
@ -0,0 +1,107 @@
|
|||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
|
||||
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
|
||||
|
||||
<book id="MCAGuide">
|
||||
<bookinfo>
|
||||
<title>MCA Driver Programming Interface</title>
|
||||
|
||||
<authorgroup>
|
||||
<author>
|
||||
<firstname>Alan</firstname>
|
||||
<surname>Cox</surname>
|
||||
<affiliation>
|
||||
<address>
|
||||
<email>alan@redhat.com</email>
|
||||
</address>
|
||||
</affiliation>
|
||||
</author>
|
||||
<author>
|
||||
<firstname>David</firstname>
|
||||
<surname>Weinehall</surname>
|
||||
</author>
|
||||
<author>
|
||||
<firstname>Chris</firstname>
|
||||
<surname>Beauregard</surname>
|
||||
</author>
|
||||
</authorgroup>
|
||||
|
||||
<copyright>
|
||||
<year>2000</year>
|
||||
<holder>Alan Cox</holder>
|
||||
<holder>David Weinehall</holder>
|
||||
<holder>Chris Beauregard</holder>
|
||||
</copyright>
|
||||
|
||||
<legalnotice>
|
||||
<para>
|
||||
This documentation is free software; you can redistribute
|
||||
it and/or modify it under the terms of the GNU General Public
|
||||
License as published by the Free Software Foundation; either
|
||||
version 2 of the License, or (at your option) any later
|
||||
version.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
This program is distributed in the hope that it will be
|
||||
useful, but WITHOUT ANY WARRANTY; without even the implied
|
||||
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
|
||||
See the GNU General Public License for more details.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
You should have received a copy of the GNU General Public
|
||||
License along with this program; if not, write to the Free
|
||||
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
|
||||
MA 02111-1307 USA
|
||||
</para>
|
||||
|
||||
<para>
|
||||
For more details see the file COPYING in the source
|
||||
distribution of Linux.
|
||||
</para>
|
||||
</legalnotice>
|
||||
</bookinfo>
|
||||
|
||||
<toc></toc>
|
||||
|
||||
<chapter id="intro">
|
||||
<title>Introduction</title>
|
||||
<para>
|
||||
The MCA bus functions provide a generalised interface to find MCA
|
||||
bus cards, to claim them for a driver, and to read and manipulate POS
|
||||
registers without being aware of the motherboard internals or
|
||||
certain deep magic specific to onboard devices.
|
||||
</para>
|
||||
<para>
|
||||
The basic interface to the MCA bus devices is the slot. Each slot
|
||||
is numbered and virtual slot numbers are assigned to the internal
|
||||
devices. Using a pci_dev, as other buses do, does not really make
|
||||
sense in the MCA context, as the MCA bus resources require
|
||||
card-specific interpretation.
|
||||
</para>
|
||||
<para>
|
||||
Finally the MCA bus functions provide a parallel set of DMA
|
||||
functions mimicking the ISA bus DMA functions as closely as possible,
|
||||
although also supporting the additional DMA functionality on the
|
||||
MCA bus controllers.
|
||||
</para>
|
||||
</chapter>
|
||||
<chapter id="bugs">
|
||||
<title>Known Bugs And Assumptions</title>
|
||||
<para>
|
||||
None.
|
||||
</para>
|
||||
</chapter>
|
||||
|
||||
<chapter id="pubfunctions">
|
||||
<title>Public Functions Provided</title>
|
||||
!Earch/i386/kernel/mca.c
|
||||
</chapter>
|
||||
|
||||
<chapter id="dmafunctions">
|
||||
<title>DMA Functions Provided</title>
|
||||
!Iinclude/asm-i386/mca_dma.h
|
||||
</chapter>
|
||||
|
||||
</book>
|
1320
Documentation/DocBook/mtdnand.tmpl
Normal file
File diff suppressed because it is too large
591
Documentation/DocBook/procfs-guide.tmpl
Normal file
|
@ -0,0 +1,591 @@
|
|||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
|
||||
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" [
|
||||
<!ENTITY procfsexample SYSTEM "procfs_example.xml">
|
||||
]>
|
||||
|
||||
<book id="LKProcfsGuide">
|
||||
<bookinfo>
|
||||
<title>Linux Kernel Procfs Guide</title>
|
||||
|
||||
<authorgroup>
|
||||
<author>
|
||||
<firstname>Erik</firstname>
|
||||
<othername>(J.A.K.)</othername>
|
||||
<surname>Mouw</surname>
|
||||
<affiliation>
|
||||
<orgname>Delft University of Technology</orgname>
|
||||
<orgdiv>Faculty of Information Technology and Systems</orgdiv>
|
||||
<address>
|
||||
<email>J.A.K.Mouw@its.tudelft.nl</email>
|
||||
<pob>PO BOX 5031</pob>
|
||||
<postcode>2600 GA</postcode>
|
||||
<city>Delft</city>
|
||||
<country>The Netherlands</country>
|
||||
</address>
|
||||
</affiliation>
|
||||
</author>
|
||||
</authorgroup>
|
||||
|
||||
<revhistory>
|
||||
<revision>
|
||||
<revnumber>1.0 </revnumber>
|
||||
<date>May 30, 2001</date>
|
||||
<revremark>Initial revision posted to linux-kernel</revremark>
|
||||
</revision>
|
||||
<revision>
|
||||
<revnumber>1.1 </revnumber>
|
||||
<date>June 3, 2001</date>
|
||||
<revremark>Revised after comments from linux-kernel</revremark>
|
||||
</revision>
|
||||
</revhistory>
|
||||
|
||||
<copyright>
|
||||
<year>2001</year>
|
||||
<holder>Erik Mouw</holder>
|
||||
</copyright>
|
||||
|
||||
|
||||
<legalnotice>
|
||||
<para>
|
||||
This documentation is free software; you can redistribute it
|
||||
and/or modify it under the terms of the GNU General Public
|
||||
License as published by the Free Software Foundation; either
|
||||
version 2 of the License, or (at your option) any later
|
||||
version.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
This documentation is distributed in the hope that it will be
|
||||
useful, but WITHOUT ANY WARRANTY; without even the implied
|
||||
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
|
||||
PURPOSE. See the GNU General Public License for more details.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
You should have received a copy of the GNU General Public
|
||||
License along with this program; if not, write to the Free
|
||||
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
|
||||
MA 02111-1307 USA
|
||||
</para>
|
||||
|
||||
<para>
|
||||
For more details see the file COPYING in the source
|
||||
distribution of Linux.
|
||||
</para>
|
||||
</legalnotice>
|
||||
</bookinfo>
|
||||
|
||||
|
||||
|
||||
|
||||
<toc>
|
||||
</toc>
|
||||
|
||||
|
||||
|
||||
|
||||
<preface>
|
||||
<title>Preface</title>
|
||||
|
||||
<para>
|
||||
This guide describes the use of the procfs file system from
|
||||
within the Linux kernel. The idea to write this guide came up on
|
||||
the #kernelnewbies IRC channel (see <ulink
|
||||
url="http://www.kernelnewbies.org/">http://www.kernelnewbies.org/</ulink>),
|
||||
when Jeff Garzik explained the use of procfs and forwarded me a
|
||||
message Alexander Viro wrote to the linux-kernel mailing list. I
|
||||
agreed to write it up nicely, so here it is.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
I'd like to thank Jeff Garzik
|
||||
<email>jgarzik@pobox.com</email> and Alexander Viro
|
||||
<email>viro@parcelfarce.linux.theplanet.co.uk</email> for their input,
|
||||
Tim Waugh <email>twaugh@redhat.com</email> for his <ulink
|
||||
url="http://people.redhat.com/twaugh/docbook/selfdocbook/">Selfdocbook</ulink>,
|
||||
and Marc Joosen <email>marcj@historia.et.tudelft.nl</email> for
|
||||
proofreading.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
This documentation was written while working on the LART
|
||||
computing board (<ulink
|
||||
url="http://www.lart.tudelft.nl/">http://www.lart.tudelft.nl/</ulink>),
|
||||
which is sponsored by the Mobile Multi-media Communications
|
||||
(<ulink
|
||||
url="http://www.mmc.tudelft.nl/">http://www.mmc.tudelft.nl/</ulink>)
|
||||
and Ubiquitous Communications (<ulink
|
||||
url="http://www.ubicom.tudelft.nl/">http://www.ubicom.tudelft.nl/</ulink>)
|
||||
projects.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Erik
|
||||
</para>
|
||||
</preface>
|
||||
|
||||
|
||||
|
||||
|
||||
<chapter id="intro">
|
||||
<title>Introduction</title>
|
||||
|
||||
<para>
|
||||
The <filename class="directory">/proc</filename> file system
|
||||
(procfs) is a special file system in the Linux kernel. It's a
|
||||
virtual file system: it is not associated with a block device
|
||||
but exists only in memory. The files in the procfs are there to
|
||||
allow userland programs access to certain information from the
|
||||
kernel (like process information in <filename
|
||||
class="directory">/proc/[0-9]+/</filename>), but also for debug
|
||||
purposes (like <filename>/proc/ksyms</filename>).
|
||||
</para>
|
||||
|
||||
<para>
|
||||
This guide describes the use of the procfs file system from
|
||||
within the Linux kernel. It starts by introducing all relevant
|
||||
functions to manage the files within the file system. After that
|
||||
it shows how to communicate with userland, and some tips and
|
||||
tricks will be pointed out. Finally a complete example will be
|
||||
shown.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Note that the files in <filename
|
||||
class="directory">/proc/sys</filename> are sysctl files: they
|
||||
don't belong to procfs and are governed by a completely
|
||||
different API described in the Kernel API book.
|
||||
</para>
|
||||
</chapter>
|
||||
|
||||
|
||||
|
||||
|
||||
<chapter id="managing">
|
||||
<title>Managing procfs entries</title>
|
||||
|
||||
<para>
|
||||
This chapter describes the functions that various kernel
|
||||
components use to populate the procfs with files, symlinks,
|
||||
device nodes, and directories.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
A minor note before we start: if you want to use any of the
|
||||
procfs functions, be sure to include the correct header file!
|
||||
This should be one of the first lines in your code:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
#include <linux/proc_fs.h>
|
||||
</programlisting>
|
||||
|
||||
|
||||
|
||||
|
||||
<sect1 id="regularfile">
|
||||
<title>Creating a regular file</title>
|
||||
|
||||
<funcsynopsis>
|
||||
<funcprototype>
|
||||
<funcdef>struct proc_dir_entry* <function>create_proc_entry</function></funcdef>
|
||||
<paramdef>const char* <parameter>name</parameter></paramdef>
|
||||
<paramdef>mode_t <parameter>mode</parameter></paramdef>
|
||||
<paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef>
|
||||
</funcprototype>
|
||||
</funcsynopsis>
|
||||
|
||||
<para>
|
||||
This function creates a regular file with the name
|
||||
<parameter>name</parameter>, file mode
|
||||
<parameter>mode</parameter> in the directory
|
||||
<parameter>parent</parameter>. To create a file in the root of
|
||||
the procfs, use <constant>NULL</constant> as
|
||||
<parameter>parent</parameter> parameter. When successful, the
|
||||
function will return a pointer to the freshly created
|
||||
<structname>struct proc_dir_entry</structname>; otherwise it
|
||||
will return <constant>NULL</constant>. <xref
|
||||
linkend="userland"/> describes how to do something useful with
|
||||
regular files.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Note that passing a path that spans multiple directories is
|
||||
explicitly supported. For example,
|
||||
<function>create_proc_entry</function>(<parameter>"drivers/via0/info"</parameter>)
|
||||
will create the <filename class="directory">via0</filename>
|
||||
directory if necessary, with standard
|
||||
<constant>0755</constant> permissions.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
If you only want to be able to read the file, the function
|
||||
<function>create_proc_read_entry</function> described in <xref
|
||||
linkend="convenience"/> may be used to create and initialise
|
||||
the procfs entry in one single call.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
|
||||
|
||||
|
||||
<sect1>
|
||||
<title>Creating a symlink</title>
|
||||
|
||||
<funcsynopsis>
|
||||
<funcprototype>
|
||||
<funcdef>struct proc_dir_entry*
|
||||
<function>proc_symlink</function></funcdef> <paramdef>const
|
||||
char* <parameter>name</parameter></paramdef>
|
||||
<paramdef>struct proc_dir_entry*
|
||||
<parameter>parent</parameter></paramdef> <paramdef>const
|
||||
char* <parameter>dest</parameter></paramdef>
|
||||
</funcprototype>
|
||||
</funcsynopsis>
|
||||
|
||||
<para>
|
||||
This creates a symlink in the procfs directory
|
||||
<parameter>parent</parameter> that points from
|
||||
<parameter>name</parameter> to
|
||||
<parameter>dest</parameter>. This translates in userland to
|
||||
<literal>ln -s</literal> <parameter>dest</parameter>
|
||||
<parameter>name</parameter>.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
<sect1>
|
||||
<title>Creating a directory</title>
|
||||
|
||||
<funcsynopsis>
|
||||
<funcprototype>
|
||||
<funcdef>struct proc_dir_entry* <function>proc_mkdir</function></funcdef>
|
||||
<paramdef>const char* <parameter>name</parameter></paramdef>
|
||||
<paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef>
|
||||
</funcprototype>
|
||||
</funcsynopsis>
|
||||
|
||||
<para>
|
||||
Create a directory <parameter>name</parameter> in the procfs
|
||||
directory <parameter>parent</parameter>.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
|
||||
|
||||
|
||||
<sect1>
|
||||
<title>Removing an entry</title>
|
||||
|
||||
<funcsynopsis>
|
||||
<funcprototype>
|
||||
<funcdef>void <function>remove_proc_entry</function></funcdef>
|
||||
<paramdef>const char* <parameter>name</parameter></paramdef>
|
||||
<paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef>
|
||||
</funcprototype>
|
||||
</funcsynopsis>
|
||||
|
||||
<para>
|
||||
Removes the entry <parameter>name</parameter> in the directory
|
||||
<parameter>parent</parameter> from the procfs. Entries are
|
||||
removed by their <emphasis>name</emphasis>, not by the
|
||||
<structname>struct proc_dir_entry</structname> returned by the
|
||||
various create functions. Note that this function doesn't
|
||||
recursively remove entries.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Be sure to free the <structfield>data</structfield> entry from
|
||||
the <structname>struct proc_dir_entry</structname> before
|
||||
<function>remove_proc_entry</function> is called (that is: if
|
||||
there was some <structfield>data</structfield> allocated, of
|
||||
course). See <xref linkend="usingdata"/> for more information
|
||||
on using the <structfield>data</structfield> entry.
|
||||
</para>
|
||||
</sect1>
|
||||
</chapter>
|
||||
|
||||
|
||||
|
||||
|
||||
<chapter id="userland">
|
||||
<title>Communicating with userland</title>
|
||||
|
||||
<para>
|
||||
Instead of reading (or writing) information directly from
|
||||
kernel memory, procfs works with <emphasis>call back
|
||||
functions</emphasis> for files: functions that are called when
|
||||
a specific file is being read or written. Such functions have
|
||||
to be initialised after the procfs file is created by setting
|
||||
the <structfield>read_proc</structfield> and/or
|
||||
<structfield>write_proc</structfield> fields in the
|
||||
<structname>struct proc_dir_entry*</structname> that the
|
||||
function <function>create_proc_entry</function> returned:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
struct proc_dir_entry* entry;
|
||||
|
||||
entry->read_proc = read_proc_foo;
|
||||
entry->write_proc = write_proc_foo;
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
If you only want to use the
|
||||
<structfield>read_proc</structfield>, the function
|
||||
<function>create_proc_read_entry</function> described in <xref
|
||||
linkend="convenience"/> may be used to create and initialise the
|
||||
procfs entry in one single call.
|
||||
</para>
|
||||
|
||||
|
||||
|
||||
<sect1>
|
||||
<title>Reading data</title>
|
||||
|
||||
<para>
|
||||
The read function is a call back function that allows userland
|
||||
processes to read data from the kernel. The read function
|
||||
should have the following format:
|
||||
</para>
|
||||
|
||||
<funcsynopsis>
|
||||
<funcprototype>
|
||||
<funcdef>int <function>read_func</function></funcdef>
|
||||
<paramdef>char* <parameter>page</parameter></paramdef>
|
||||
<paramdef>char** <parameter>start</parameter></paramdef>
|
||||
<paramdef>off_t <parameter>off</parameter></paramdef>
|
||||
<paramdef>int <parameter>count</parameter></paramdef>
|
||||
<paramdef>int* <parameter>eof</parameter></paramdef>
|
||||
<paramdef>void* <parameter>data</parameter></paramdef>
|
||||
</funcprototype>
|
||||
</funcsynopsis>
|
||||
|
||||
<para>
|
||||
The read function should write its information into the
|
||||
<parameter>page</parameter>. For proper use, the function
|
||||
should start writing at an offset of
|
||||
<parameter>off</parameter> in <parameter>page</parameter> and
|
||||
write at most <parameter>count</parameter> bytes, but because
|
||||
most read functions are quite simple and only return a small
|
||||
amount of information, these two parameters are usually
|
||||
ignored (this breaks pagers like <literal>more</literal> and
|
||||
<literal>less</literal>, but <literal>cat</literal> still
|
||||
works).
|
||||
</para>
|
||||
|
||||
<para>
|
||||
If the <parameter>off</parameter> and
|
||||
<parameter>count</parameter> parameters are properly used,
|
||||
<parameter>eof</parameter> should be used to signal that the
|
||||
end of the file has been reached by writing
|
||||
<literal>1</literal> to the memory location
|
||||
<parameter>eof</parameter> points to.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The parameter <parameter>start</parameter> doesn't seem to be
|
||||
used anywhere in the kernel. The <parameter>data</parameter>
|
||||
parameter can be used to create a single call back function for
|
||||
several files, see <xref linkend="usingdata"/>.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The <function>read_func</function> function must return the
|
||||
number of bytes written into the <parameter>page</parameter>.
|
||||
</para>
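<para>
The simple case mentioned above, where <parameter>off</parameter> and
<parameter>count</parameter> are ignored, might look roughly like the
following. It is written as a userspace model so that it can be compiled
on its own: <varname>foo_value</varname>,
<constant>SKETCH_PAGE_SIZE</constant> and the function name are
assumptions for illustration, and the buffer behind
<parameter>page</parameter> is assumed to be at least that large.
</para>

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>	/* off_t */

#define SKETCH_PAGE_SIZE 4096	/* assumed size of the buffer behind 'page' */

static int foo_value = 42;	/* hypothetical state exported by the file */

/*
 * Simple read call back: the output is small, so 'off' and 'count' are
 * ignored and everything is written at the start of 'page'.
 */
int read_proc_foo(char *page, char **start, off_t off,
		  int count, int *eof, void *data)
{
	int len;

	(void)start;
	(void)off;
	(void)count;
	(void)data;
	assert(page != NULL && eof != NULL);

	len = snprintf(page, SKETCH_PAGE_SIZE, "value = %d\n", foo_value);

	*eof = 1;	/* everything was produced in this one call */
	return len;	/* number of bytes written into 'page' */
}
```

<para>
Because the whole output is produced on every call regardless of
<parameter>off</parameter>, this style is only suitable for short
outputs.
</para>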
|
||||
|
||||
<para>
|
||||
<xref linkend="example"/> shows how to use a read call back
|
||||
function.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
|
||||
|
||||
|
||||
<sect1>
|
||||
<title>Writing data</title>
|
||||
|
||||
<para>
|
||||
The write call back function allows a userland process to write
|
||||
data to the kernel, so it has some kind of control over the
|
||||
kernel. The write function should have the following format:
|
||||
</para>
|
||||
|
||||
<funcsynopsis>
|
||||
<funcprototype>
|
||||
<funcdef>int <function>write_func</function></funcdef>
|
||||
<paramdef>struct file* <parameter>file</parameter></paramdef>
|
||||
<paramdef>const char* <parameter>buffer</parameter></paramdef>
|
||||
<paramdef>unsigned long <parameter>count</parameter></paramdef>
|
||||
<paramdef>void* <parameter>data</parameter></paramdef>
|
||||
</funcprototype>
|
||||
</funcsynopsis>
|
||||
|
||||
<para>
|
||||
The write function should read <parameter>count</parameter>
|
||||
bytes at maximum from the <parameter>buffer</parameter>. Note
|
||||
that the <parameter>buffer</parameter> doesn't live in the
|
||||
kernel's memory space, so it should first be copied to kernel
|
||||
space with <function>copy_from_user</function>. The
|
||||
<parameter>file</parameter> parameter is usually
|
||||
ignored. <xref linkend="usingdata"/> shows how to use the
|
||||
<parameter>data</parameter> parameter.
|
||||
</para>
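<para>
A minimal sketch of such a write function follows, again as a userspace
model. <function>copy_from_user_sketch</function> is a stand-in that
only imitates the return convention of
<function>copy_from_user</function> (the number of bytes that could not
be copied, so zero on success); <constant>FOO_LEN</constant>,
<varname>foo_value</varname> and the function names are hypothetical.
</para>

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define FOO_LEN 8		/* hypothetical maximum length of the value */

struct file;			/* opaque here; the sketch ignores it */

static char foo_value[FOO_LEN + 1];

/*
 * Userspace stand-in for copy_from_user(): returns the number of bytes
 * that could NOT be copied.  A NULL source models a faulting pointer.
 */
static unsigned long copy_from_user_sketch(void *to, const void *from,
					   unsigned long n)
{
	assert(to != NULL);
	if (from == NULL)
		return n;
	memcpy(to, from, n);
	return 0;
}

int write_proc_foo(struct file *file, const char *buffer,
		   unsigned long count, void *data)
{
	/* Never read more than the kernel-side buffer can hold. */
	unsigned long len = count > FOO_LEN ? FOO_LEN : count;

	(void)file;
	(void)data;

	if (copy_from_user_sketch(foo_value, buffer, len))
		return -EFAULT;

	foo_value[len] = '\0';
	return (int)len;	/* number of bytes actually consumed */
}
```

<para>
Clamping <parameter>count</parameter> before the copy is the important
step: the caller controls <parameter>count</parameter>, so the
destination size has to be enforced by the call back itself.
</para>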
|
||||
|
||||
<para>
|
||||
Again, <xref linkend="example"/> shows how to use this call back
|
||||
function.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
|
||||
|
||||
|
||||
<sect1 id="usingdata">
|
||||
<title>A single call back for many files</title>
|
||||
|
||||
<para>
|
||||
When a large number of almost identical files is used, it's
|
||||
quite inconvenient to use a separate call back function for
|
||||
each file. A better approach is to have a single call back
|
||||
function that distinguishes between the files by using the
|
||||
<structfield>data</structfield> field in <structname>struct
|
||||
proc_dir_entry</structname>. First of all, the
|
||||
<structfield>data</structfield> field has to be initialised:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
struct proc_dir_entry* entry;
|
||||
struct my_file_data *file_data;
|
||||
|
||||
file_data = kmalloc(sizeof(struct my_file_data), GFP_KERNEL);
|
||||
entry->data = file_data;
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
The <structfield>data</structfield> field is a <type>void
|
||||
*</type>, so it can be initialised with anything.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Now that the <structfield>data</structfield> field is set, the
|
||||
<function>read_proc</function> and
|
||||
<function>write_proc</function> can use it to distinguish
|
||||
between files because they get it passed into their
|
||||
<parameter>data</parameter> parameter:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
int foo_read_func(char *page, char **start, off_t off,
|
||||
int count, int *eof, void *data)
|
||||
{
|
||||
int len;
|
||||
|
||||
if(data == file_data) {
|
||||
/* special case for this file */
|
||||
} else {
|
||||
/* normal processing */
|
||||
}
|
||||
|
||||
return len;
|
||||
}
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
Be sure to free the <structfield>data</structfield> field
|
||||
when removing the procfs entry.
|
||||
</para>
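<para>
The pairing of allocation and release can be modelled in user space
as below. All names here (<structname>struct demo_entry</structname>,
<structname>struct demo_file_data</structname>,
<function>demo_create</function>, <function>demo_remove</function>)
are invented for the sketch; in kernel code the allocation would be
done with <function>kmalloc</function> and released with
<function>kfree</function> around the removal of the procfs entry.
</para>

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct proc_dir_entry. */
struct demo_entry {
	void *data;
};

struct demo_file_data {
	char label[16];
};

/* Allocate an entry together with its private data. */
static struct demo_entry *demo_create(const char *label)
{
	struct demo_entry *entry;
	struct demo_file_data *file_data;

	entry = malloc(sizeof(*entry));
	if (entry == NULL)
		return NULL;

	file_data = malloc(sizeof(*file_data));
	if (file_data == NULL) {
		free(entry);
		return NULL;
	}

	strncpy(file_data->label, label, sizeof(file_data->label) - 1);
	file_data->label[sizeof(file_data->label) - 1] = '\0';
	entry->data = file_data;

	return entry;
}

/* Release the private data together with the entry. */
static void demo_remove(struct demo_entry *entry)
{
	if (entry == NULL)
		return;

	free(entry->data);	/* kernel code: kfree(entry->data) */
	entry->data = NULL;
	free(entry);
}
```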
|
||||
</sect1>
|
||||
</chapter>
|
||||
|
||||
|
||||
|
||||
|
||||
<chapter id="tips">
|
||||
<title>Tips and tricks</title>
|
||||
|
||||
|
||||
|
||||
|
||||
<sect1 id="convenience">
|
||||
<title>Convenience functions</title>
|
||||
|
||||
<funcsynopsis>
|
||||
<funcprototype>
|
||||
<funcdef>struct proc_dir_entry* <function>create_proc_read_entry</function></funcdef>
|
||||
<paramdef>const char* <parameter>name</parameter></paramdef>
|
||||
<paramdef>mode_t <parameter>mode</parameter></paramdef>
|
||||
<paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef>
|
||||
<paramdef>read_proc_t* <parameter>read_proc</parameter></paramdef>
|
||||
<paramdef>void* <parameter>data</parameter></paramdef>
|
||||
</funcprototype>
|
||||
</funcsynopsis>
|
||||
|
||||
<para>
|
||||
This function creates a regular file in exactly the same way
|
||||
as <function>create_proc_entry</function> from <xref
|
||||
linkend="regularfile"/> does, but also allows setting the read
|
||||
function <parameter>read_proc</parameter> in one call. This
|
||||
function can set the <parameter>data</parameter> as well, as
|
||||
explained in <xref linkend="usingdata"/>.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
|
||||
|
||||
<sect1>
|
||||
<title>Modules</title>
|
||||
|
||||
<para>
|
||||
If procfs is being used from within a module, be sure to set
|
||||
the <structfield>owner</structfield> field in the
|
||||
<structname>struct proc_dir_entry</structname> to
|
||||
<constant>THIS_MODULE</constant>.
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
struct proc_dir_entry* entry;
|
||||
|
||||
entry->owner = THIS_MODULE;
|
||||
</programlisting>
|
||||
</sect1>
|
||||
|
||||
|
||||
|
||||
|
||||
<sect1>
|
||||
<title>Mode and ownership</title>
|
||||
|
||||
<para>
|
||||
Sometimes it is useful to change the mode and/or ownership of
|
||||
a procfs entry. Here is an example that shows how to achieve
|
||||
that:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
struct proc_dir_entry* entry;
|
||||
|
||||
entry->mode = S_IWUSR |S_IRUSR | S_IRGRP | S_IROTH;
|
||||
entry->uid = 0;
|
||||
entry->gid = 100;
|
||||
</programlisting>
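<para>
The symbolic mode bits used above combine to the familiar octal
value <literal>0644</literal>. The small user-space sketch below
makes that explicit; <function>demo_mode</function> is a hypothetical
helper, and the constants come from the standard
<filename>sys/stat.h</filename> header rather than from kernel
headers.
</para>

```c
#include <sys/stat.h>

/* Owner read/write, group read, others read: rw-r--r-- */
static unsigned int demo_mode(void)
{
	return S_IWUSR | S_IRUSR | S_IRGRP | S_IROTH;
}
```

<para>
Either spelling can be used; the symbolic form documents the intent,
the octal form is shorter.
</para>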
|
||||
|
||||
</sect1>
|
||||
</chapter>
|
||||
|
||||
|
||||
|
||||
|
||||
<chapter id="example">
|
||||
<title>Example</title>
|
||||
|
||||
<!-- be careful with the example code: it shouldn't be wider than
|
||||
approx. 60 columns, or otherwise it won't fit properly on a page
|
||||
-->
|
||||
|
||||
&procfsexample;
|
||||
|
||||
</chapter>
|
||||
</book>
|
224
Documentation/DocBook/procfs_example.c
Normal file
|
@ -0,0 +1,224 @@
|
|||
/*
|
||||
* procfs_example.c: an example proc interface
|
||||
*
|
||||
* Copyright (C) 2001, Erik Mouw (J.A.K.Mouw@its.tudelft.nl)
|
||||
*
|
||||
* This file accompanies the procfs-guide in the Linux kernel
|
||||
* source. Its main use is to demonstrate the concepts and
|
||||
* functions described in the guide.
|
||||
*
|
||||
* This software has been developed while working on the LART
|
||||
* computing board (http://www.lart.tudelft.nl/), which is
|
||||
* sponsored by the Mobile Multi-media Communications
|
||||
* (http://www.mmc.tudelft.nl/) and Ubiquitous Communications
|
||||
* (http://www.ubicom.tudelft.nl/) projects.
|
||||
*
|
||||
* The author can be reached at:
|
||||
*
|
||||
* Erik Mouw
|
||||
* Information and Communication Theory Group
|
||||
* Faculty of Information Technology and Systems
|
||||
* Delft University of Technology
|
||||
* P.O. Box 5031
|
||||
* 2600 GA Delft
|
||||
* The Netherlands
|
||||
*
|
||||
*
|
||||
* This program is free software; you can redistribute
|
||||
* it and/or modify it under the terms of the GNU General
|
||||
* Public License as published by the Free Software
|
||||
* Foundation; either version 2 of the License, or (at your
|
||||
* option) any later version.
|
||||
*
|
||||
* This program is distributed in the hope that it will be
|
||||
* useful, but WITHOUT ANY WARRANTY; without even the implied
|
||||
* warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
|
||||
* PURPOSE. See the GNU General Public License for more
|
||||
* details.
|
||||
*
|
||||
* You should have received a copy of the GNU General Public
|
||||
* License along with this program; if not, write to the
|
||||
* Free Software Foundation, Inc., 59 Temple Place,
|
||||
* Suite 330, Boston, MA 02111-1307 USA
|
||||
*
|
||||
*/
|
||||
|
||||
#include <linux/module.h>
|
||||
#include <linux/kernel.h>
|
||||
#include <linux/init.h>
|
||||
#include <linux/proc_fs.h>
|
||||
#include <linux/jiffies.h>
|
||||
#include <asm/uaccess.h>
|
||||
|
||||
|
||||
#define MODULE_VERS "1.0"
|
||||
#define MODULE_NAME "procfs_example"
|
||||
|
||||
#define FOOBAR_LEN 8
|
||||
|
||||
struct fb_data_t {
|
||||
char name[FOOBAR_LEN + 1];
|
||||
char value[FOOBAR_LEN + 1];
|
||||
};
|
||||
|
||||
|
||||
static struct proc_dir_entry *example_dir, *foo_file,
|
||||
*bar_file, *jiffies_file, *symlink;
|
||||
|
||||
|
||||
struct fb_data_t foo_data, bar_data;
|
||||
|
||||
|
||||
static int proc_read_jiffies(char *page, char **start,
|
||||
off_t off, int count,
|
||||
int *eof, void *data)
|
||||
{
|
||||
int len;
|
||||
|
||||
len = sprintf(page, "jiffies = %lu\n",
|
||||
jiffies);
|
||||
|
||||
return len;
|
||||
}
|
||||
|
||||
|
||||
static int proc_read_foobar(char *page, char **start,
|
||||
off_t off, int count,
|
||||
int *eof, void *data)
|
||||
{
|
||||
int len;
|
||||
struct fb_data_t *fb_data = (struct fb_data_t *)data;
|
||||
|
||||
/* DON'T DO THAT - buffer overruns are bad */
|
||||
len = sprintf(page, "%s = '%s'\n",
|
||||
fb_data->name, fb_data->value);
|
||||
|
||||
return len;
|
||||
}
|
||||
|
||||
|
||||
static int proc_write_foobar(struct file *file,
|
||||
const char *buffer,
|
||||
unsigned long count,
|
||||
void *data)
|
||||
{
|
||||
int len;
|
||||
struct fb_data_t *fb_data = (struct fb_data_t *)data;
|
||||
|
||||
if(count > FOOBAR_LEN)
|
||||
len = FOOBAR_LEN;
|
||||
else
|
||||
len = count;
|
||||
|
||||
if(copy_from_user(fb_data->value, buffer, len))
|
||||
return -EFAULT;
|
||||
|
||||
fb_data->value[len] = '\0';
|
||||
|
||||
return len;
|
||||
}
|
||||
|
||||
|
||||
static int __init init_procfs_example(void)
|
||||
{
|
||||
int rv = 0;
|
||||
|
||||
/* create directory */
|
||||
example_dir = proc_mkdir(MODULE_NAME, NULL);
|
||||
if(example_dir == NULL) {
|
||||
rv = -ENOMEM;
|
||||
goto out;
|
||||
}
|
||||
|
||||
example_dir->owner = THIS_MODULE;
|
||||
|
||||
/* create jiffies using convenience function */
|
||||
jiffies_file = create_proc_read_entry("jiffies",
|
||||
0444, example_dir,
|
||||
proc_read_jiffies,
|
||||
NULL);
|
||||
if(jiffies_file == NULL) {
|
||||
rv = -ENOMEM;
|
||||
goto no_jiffies;
|
||||
}
|
||||
|
||||
jiffies_file->owner = THIS_MODULE;
|
||||
|
||||
/* create foo and bar files using same callback
|
||||
* functions
|
||||
*/
|
||||
foo_file = create_proc_entry("foo", 0644, example_dir);
|
||||
if(foo_file == NULL) {
|
||||
rv = -ENOMEM;
|
||||
goto no_foo;
|
||||
}
|
||||
|
||||
strcpy(foo_data.name, "foo");
|
||||
strcpy(foo_data.value, "foo");
|
||||
foo_file->data = &foo_data;
|
||||
foo_file->read_proc = proc_read_foobar;
|
||||
foo_file->write_proc = proc_write_foobar;
|
||||
foo_file->owner = THIS_MODULE;
|
||||
|
||||
bar_file = create_proc_entry("bar", 0644, example_dir);
|
||||
if(bar_file == NULL) {
|
||||
rv = -ENOMEM;
|
||||
goto no_bar;
|
||||
}
|
||||
|
||||
strcpy(bar_data.name, "bar");
|
||||
strcpy(bar_data.value, "bar");
|
||||
bar_file->data = &bar_data;
|
||||
bar_file->read_proc = proc_read_foobar;
|
||||
bar_file->write_proc = proc_write_foobar;
|
||||
bar_file->owner = THIS_MODULE;
|
||||
|
||||
/* create symlink */
|
||||
symlink = proc_symlink("jiffies_too", example_dir,
|
||||
"jiffies");
|
||||
if(symlink == NULL) {
|
||||
rv = -ENOMEM;
|
||||
goto no_symlink;
|
||||
}
|
||||
|
||||
symlink->owner = THIS_MODULE;
|
||||
|
||||
/* everything OK */
|
||||
printk(KERN_INFO "%s %s initialised\n",
|
||||
MODULE_NAME, MODULE_VERS);
|
||||
return 0;
|
||||
|
||||
no_symlink:
|
||||
remove_proc_entry("bar", example_dir);
|
||||
no_bar:
|
||||
remove_proc_entry("foo", example_dir);
|
||||
no_foo:
|
||||
remove_proc_entry("jiffies", example_dir);
|
||||
no_jiffies:
|
||||
remove_proc_entry(MODULE_NAME, NULL);
|
||||
out:
|
||||
return rv;
|
||||
}
|
||||
|
||||
|
||||
static void __exit cleanup_procfs_example(void)
|
||||
{
|
||||
remove_proc_entry("jiffies_too", example_dir);
|
||||
remove_proc_entry("bar", example_dir);
|
||||
remove_proc_entry("foo", example_dir);
|
||||
remove_proc_entry("jiffies", example_dir);
|
||||
remove_proc_entry(MODULE_NAME, NULL);
|
||||
|
||||
printk(KERN_INFO "%s %s removed\n",
|
||||
MODULE_NAME, MODULE_VERS);
|
||||
}
|
||||
|
||||
|
||||
module_init(init_procfs_example);
|
||||
module_exit(cleanup_procfs_example);
|
||||
|
||||
MODULE_AUTHOR("Erik Mouw");
|
||||
MODULE_DESCRIPTION("procfs examples");
|
193
Documentation/DocBook/scsidrivers.tmpl
Normal file
|
@ -0,0 +1,193 @@
|
|||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
|
||||
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
|
||||
|
||||
<book id="scsidrivers">
|
||||
<bookinfo>
|
||||
<title>SCSI Subsystem Interfaces</title>
|
||||
|
||||
<authorgroup>
|
||||
<author>
|
||||
<firstname>Douglas</firstname>
|
||||
<surname>Gilbert</surname>
|
||||
<affiliation>
|
||||
<address>
|
||||
<email>dgilbert@interlog.com</email>
|
||||
</address>
|
||||
</affiliation>
|
||||
</author>
|
||||
</authorgroup>
|
||||
<pubdate>2003-08-11</pubdate>
|
||||
|
||||
<copyright>
|
||||
<year>2002</year>
|
||||
<year>2003</year>
|
||||
<holder>Douglas Gilbert</holder>
|
||||
</copyright>
|
||||
|
||||
<legalnotice>
|
||||
<para>
|
||||
This documentation is free software; you can redistribute
|
||||
it and/or modify it under the terms of the GNU General Public
|
||||
License as published by the Free Software Foundation; either
|
||||
version 2 of the License, or (at your option) any later
|
||||
version.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
This program is distributed in the hope that it will be
|
||||
useful, but WITHOUT ANY WARRANTY; without even the implied
|
||||
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
|
||||
See the GNU General Public License for more details.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
You should have received a copy of the GNU General Public
|
||||
License along with this program; if not, write to the Free
|
||||
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
|
||||
MA 02111-1307 USA
|
||||
</para>
|
||||
|
||||
<para>
|
||||
For more details see the file COPYING in the source
|
||||
distribution of Linux.
|
||||
</para>
|
||||
</legalnotice>
|
||||
|
||||
</bookinfo>
|
||||
|
||||
<toc></toc>
|
||||
|
||||
<chapter id="intro">
|
||||
<title>Introduction</title>
|
||||
<para>
|
||||
This document outlines the interface between the Linux scsi mid level
|
||||
and lower level drivers. Lower level drivers are variously called HBA
|
||||
(host bus adapter) drivers, host drivers (HD) or pseudo adapter drivers.
|
||||
The latter alludes to the fact that a lower level driver may be a
|
||||
bridge to another IO subsystem (and the "ide-scsi" driver is an example
|
||||
of this). There can be many lower level drivers active in a running
|
||||
system, but only one per hardware type. For example, the aic7xxx driver
|
||||
controls Adaptec controllers based on the 7xxx chip series. Most lower
|
||||
level drivers can control one or more scsi hosts (a.k.a. scsi initiators).
|
||||
</para>
|
||||
<para>
|
||||
This document can be found in an ASCII text file in the linux kernel
|
||||
source: <filename>Documentation/scsi/scsi_mid_low_api.txt</filename> .
|
||||
It currently holds a little more information than this document. The
|
||||
<filename>drivers/scsi/hosts.h</filename> and <filename>
|
||||
drivers/scsi/scsi.h</filename> headers contain descriptions of members
|
||||
of important structures for the scsi subsystem.
|
||||
</para>
|
||||
</chapter>
|
||||
|
||||
<chapter id="driver-struct">
|
||||
<title>Driver structure</title>
|
||||
<para>
|
||||
Traditionally a lower level driver for the scsi subsystem has been
|
||||
at least two files in the drivers/scsi directory. For example, a
|
||||
driver called "xyz" has a header file "xyz.h" and a source file
|
||||
"xyz.c". [Actually there is no good reason why this couldn't all
|
||||
be in one file.] Some drivers that have been ported to several operating
|
||||
systems (e.g. aic7xxx which has separate files for generic and
|
||||
OS-specific code) have more than two files. Such drivers tend to have
|
||||
their own directory under the drivers/scsi directory.
|
||||
</para>
|
||||
<para>
|
||||
scsi_module.c is normally included at the end of a lower
|
||||
level driver. For it to work a declaration like this is needed before
|
||||
it is included:
|
||||
<programlisting>
|
||||
static Scsi_Host_Template driver_template = DRIVER_TEMPLATE;
|
||||
/* DRIVER_TEMPLATE should contain pointers to supported interface
|
||||
functions. Scsi_Host_Template is defined in hosts.h */
|
||||
#include "scsi_module.c"
|
||||
</programlisting>
|
||||
</para>
|
||||
<para>
|
||||
The scsi_module.c assumes the name "driver_template" is appropriately
|
||||
defined. It contains 2 functions:
|
||||
<orderedlist>
|
||||
<listitem><para>
|
||||
init_this_scsi_driver() called during builtin and module driver
|
||||
initialization: invokes mid level's scsi_register_host()
|
||||
</para></listitem>
|
||||
<listitem><para>
|
||||
exit_this_scsi_driver() called during closedown: invokes
|
||||
mid level's scsi_unregister_host()
|
||||
</para></listitem>
|
||||
</orderedlist>
|
||||
</para>
|
||||
<para>
|
||||
When a new, lower level driver is being added to Linux, the following
|
||||
files (all found in the drivers/scsi directory) will need some attention:
|
||||
Makefile, Config.help and Config.in . It is probably best to look at what
|
||||
an existing lower level driver does in this regard.
|
||||
</para>
|
||||
</chapter>
|
||||
|
||||
<chapter id="intfunctions">
|
||||
<title>Interface Functions</title>
|
||||
!EDocumentation/scsi/scsi_mid_low_api.txt
|
||||
</chapter>
|
||||
|
||||
<chapter id="locks">
|
||||
<title>Locks</title>
|
||||
<para>
|
||||
Each Scsi_Host instance has a spin_lock called Scsi_Host::default_lock
|
||||
which is initialized in scsi_register() [found in hosts.c]. Within the
|
||||
same function the Scsi_Host::host_lock pointer is initialized to point
|
||||
at default_lock with the scsi_assign_lock() function. Thereafter
|
||||
lock and unlock operations performed by the mid level use the
|
||||
Scsi_Host::host_lock pointer.
|
||||
</para>
|
||||
<para>
|
||||
Lower level drivers can override the use of Scsi_Host::default_lock by
|
||||
using scsi_assign_lock(). The earliest opportunity to do this would
|
||||
be in the detect() function after it has invoked scsi_register(). It
|
||||
could be replaced by a coarser grain lock (e.g. per driver) or a
|
||||
lock of equal granularity (i.e. per host). Using finer grain locks
|
||||
(e.g. per scsi device) may be possible by juggling locks in
|
||||
queuecommand().
|
||||
</para>
|
||||
</chapter>
|
||||
|
||||
<chapter id="changes">
|
||||
<title>Changes since lk 2.4 series</title>
|
||||
<para>
|
||||
io_request_lock has been replaced by several finer grained locks. The lock
|
||||
relevant to lower level drivers is Scsi_Host::host_lock and there is one
|
||||
per scsi host.
|
||||
</para>
|
||||
<para>
|
||||
The older error handling mechanism has been removed. This means the
|
||||
lower level interface functions abort() and reset() have been removed.
|
||||
</para>
|
||||
<para>
|
||||
In the 2.4 series the scsi subsystem configuration descriptions were
|
||||
aggregated with the configuration descriptions from all other Linux
|
||||
subsystems in the Documentation/Configure.help file. In the 2.5 series,
|
||||
the scsi subsystem now has its own (much smaller) drivers/scsi/Config.help
|
||||
file.
|
||||
</para>
|
||||
</chapter>
|
||||
|
||||
<chapter id="credits">
|
||||
<title>Credits</title>
|
||||
<para>
|
||||
The following people have contributed to this document:
|
||||
<orderedlist>
|
||||
<listitem><para>
|
||||
Mike Anderson <email>andmike@us.ibm.com</email>
|
||||
</para></listitem>
|
||||
<listitem><para>
|
||||
James Bottomley <email>James.Bottomley@steeleye.com</email>
|
||||
</para></listitem>
|
||||
<listitem><para>
|
||||
Patrick Mansfield <email>patmans@us.ibm.com</email>
|
||||
</para></listitem>
|
||||
</orderedlist>
|
||||
</para>
|
||||
</chapter>
|
||||
|
||||
</book>
|
585
Documentation/DocBook/sis900.tmpl
Normal file
|
@ -0,0 +1,585 @@
|
|||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
|
||||
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
|
||||
|
||||
<book id="SiS900Guide">
|
||||
|
||||
<bookinfo>
|
||||
|
||||
<title>SiS 900/7016 Fast Ethernet Device Driver</title>
|
||||
|
||||
<authorgroup>
|
||||
<author>
|
||||
<firstname>Ollie</firstname>
|
||||
<surname>Lho</surname>
|
||||
</author>
|
||||
|
||||
<author>
|
||||
<firstname>Lei Chun</firstname>
|
||||
<surname>Chang</surname>
|
||||
</author>
|
||||
</authorgroup>
|
||||
|
||||
<edition>Document Revision: 0.3 for SiS900 driver v1.06 &amp; v1.07</edition>
|
||||
<pubdate>November 16, 2000</pubdate>
|
||||
|
||||
<copyright>
|
||||
<year>1999</year>
|
||||
<holder>Silicon Integrated System Corp.</holder>
|
||||
</copyright>
|
||||
|
||||
<legalnotice>
|
||||
<para>
|
||||
This program is free software; you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
the Free Software Foundation; either version 2 of the License, or
|
||||
(at your option) any later version.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
This program is distributed in the hope that it will be useful,
|
||||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
GNU General Public License for more details.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
You should have received a copy of the GNU General Public License
|
||||
along with this program; if not, write to the Free Software
|
||||
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
|
||||
</para>
|
||||
</legalnotice>
|
||||
|
||||
<abstract>
|
||||
<para>
|
||||
This document gives some information on installation and usage of SiS 900/7016
|
||||
device driver under Linux.
|
||||
</para>
|
||||
</abstract>
|
||||
|
||||
</bookinfo>
|
||||
|
||||
<toc></toc>
|
||||
|
||||
<chapter id="intro">
|
||||
<title>Introduction</title>
|
||||
|
||||
<para>
|
||||
This document describes revisions 1.06 and 1.07 of the SiS 900/7016 Fast Ethernet
|
||||
device driver under Linux. The driver is developed by Silicon Integrated
|
||||
System Corp. and distributed freely under the GNU General Public License (GPL).
|
||||
The driver can be compiled as a loadable module and used under Linux kernel
|
||||
version 2.2.x. (rev. 1.06)
|
||||
With minimal changes, the driver can also be used under 2.3.x and 2.4.x kernel
|
||||
(rev. 1.07), please see
|
||||
<xref linkend="install"/>. If you intend to
|
||||
use the driver for earlier kernels, you are on your own.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The driver is tested with usual TCP/IP applications including
|
||||
FTP, Telnet, Netscape etc. and is used constantly by the developers.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Please send all comments/fixes/questions to
|
||||
<ulink url="mailto:lcchang@sis.com.tw">Lei-Chun Chang</ulink>.
|
||||
</para>
|
||||
</chapter>
|
||||
|
||||
<chapter id="changes">
|
||||
<title>Changes</title>
|
||||
|
||||
<para>
|
||||
Changes made in Revision 1.07
|
||||
|
||||
<orderedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
Separation of sis900.c and sis900.h in order to move most
|
||||
constant definition to sis900.h (many of those constants were
|
||||
corrected)
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
Clean up PCI detection; the pci-scan from Donald Becker was not used,
|
||||
just simple pci_find_*.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
MII detection is modified to support multiple MII transceivers.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
Bugs in read_eeprom, mdio_* were removed.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
Many comments irrelevant to sis900 were removed/changed and
|
||||
more comments were added to reflect the real situation.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
Clean up of physical/virtual address space mess in buffer
|
||||
descriptors.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
Better transmit/receive error handling.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
The driver now uses zero-copy single buffer management
|
||||
scheme to improve performance.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
Names of variables were changed to be more consistent.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
Clean up of auto-negotiation and timer code.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
Automatic detection and change of PHY on the fly.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
Bug in mac probing fixed.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
Fix 630E equalizer problem by modifying the equalizer workaround rule.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
Support for ICS1893 10/100 Integrated PHYceiver.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
Support for media select by ifconfig.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
Added kernel-doc extractable documentation.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
</orderedlist>
|
||||
</para>
|
||||
</chapter>
|
||||
|
||||
<chapter id="tested">
|
||||
<title>Tested Environment</title>
|
||||
|
||||
<para>
|
||||
This driver is developed on the following hardware
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
|
||||
<para>
|
||||
Intel Celeron 500 with SiS 630 (rev 02) chipset
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
|
||||
<para>
|
||||
SiS 900 (rev 01) and SiS 7016/7014 Fast Ethernet Card
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
</itemizedlist>
|
||||
|
||||
and tested with these software environments
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
|
||||
<para>
|
||||
Red Hat Linux version 6.2
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
|
||||
<para>
|
||||
Linux kernel version 2.4.0
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
|
||||
<para>
|
||||
Netscape version 4.6
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
|
||||
<para>
|
||||
NcFTP 3.0.0 beta 18
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
|
||||
<para>
|
||||
Samba version 2.0.3
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
</itemizedlist>
|
||||
|
||||
</para>
|
||||
|
||||
</chapter>
|
||||
|
||||
<chapter id="files">
|
||||
<title>Files in This Package</title>
|
||||
|
||||
<para>
|
||||
In the package you can find these files:
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<variablelist>
|
||||
|
||||
<varlistentry>
|
||||
<term>sis900.c</term>
|
||||
<listitem>
|
||||
<para>
|
||||
Driver source file in C
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>sis900.h</term>
|
||||
<listitem>
|
||||
<para>
|
||||
Header file for sis900.c
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>sis900.sgml</term>
|
||||
<listitem>
|
||||
<para>
|
||||
DocBook SGML source of the document
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>sis900.txt</term>
|
||||
<listitem>
|
||||
<para>
|
||||
Driver document in plain text
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
</variablelist>
|
||||
</para>
|
||||
</chapter>
|
||||
|
||||
<chapter id="install">
|
||||
<title>Installation</title>
|
||||
|
||||
<para>
|
||||
Silicon Integrated System Corp. is cooperating closely with core Linux Kernel
|
||||
developers. The revisions of SiS 900 driver are distributed by the usual channels
|
||||
for kernel tar files and patches. Those kernel tar files for official kernel and
|
||||
patches for kernel pre-release can be downloaded from the
|
||||
<ulink url="http://ftp.kernel.org/pub/linux/kernel/">official kernel ftp site</ulink>
|
||||
and its mirrors.
|
||||
The 1.06 revision can be found in kernel version later than 2.3.15 and pre-2.2.14,
|
||||
and 1.07 revision can be found in kernel version 2.4.0.
|
||||
If you have no prior experience in networking under Linux, please read
|
||||
<ulink url="http://www.tldp.org/">Ethernet HOWTO</ulink> and
|
||||
<ulink url="http://www.tldp.org/">Networking HOWTO</ulink> available from
|
||||
Linux Documentation Project (LDP).
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The driver is bundled in release later than 2.2.11 and 2.3.15 so this
|
||||
is the easiest case.
|
||||
Be sure you have the appropriate packages for compiling kernel source.
|
||||
Those packages are listed in Documentation/Changes in kernel source
|
||||
distribution. If you have to install the driver other than those bundled
|
||||
in kernel release, you should have your driver file
|
||||
<filename>sis900.c</filename> and <filename>sis900.h</filename>
|
||||
copied into <filename class="directory">/usr/src/linux/drivers/net/</filename> first.
|
||||
There are two alternative ways to install the driver
|
||||
</para>
|
||||
|
||||
<sect1>
|
||||
<title>Building the driver as loadable module</title>
|
||||
|
||||
<para>
|
||||
To build the driver as a loadable kernel module you have to reconfigure
|
||||
the kernel to activate network support by
|
||||
</para>
|
||||
|
||||
<para><screen>
|
||||
make menuconfig
|
||||
</screen></para>
|
||||
|
||||
<para>
|
||||
Choose <quote>Loadable module support ---></quote>,
|
||||
then select <quote>Enable loadable module support</quote>.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Choose <quote>Network Device Support ---></quote>, select
|
||||
<quote>Ethernet (10 or 100Mbit)</quote>.
|
||||
Then select <quote>EISA, VLB, PCI and on board controllers</quote>,
|
||||
and choose <quote>SiS 900/7016 PCI Fast Ethernet Adapter support</quote>
|
||||
to <quote>M</quote>.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
After reconfiguring the kernel, you can make the driver module by
|
||||
</para>
|
||||
|
||||
<para><screen>
|
||||
make modules
|
||||
</screen></para>
|
||||
|
||||
<para>
|
||||
The driver should be compiled with no errors. After compiling the driver,
|
||||
the driver can be installed to proper place by
|
||||
</para>
|
||||
|
||||
<para><screen>
|
||||
make modules_install
|
||||
</screen></para>
|
||||
|
||||
<para>
|
||||
Load the driver into kernel by
|
||||
</para>
|
||||
|
||||
<para><screen>
|
||||
insmod sis900
|
||||
</screen></para>
|
||||
|
||||
<para>
|
||||
When the driver is loaded into memory, some informational messages can be viewed with
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<screen>
|
||||
dmesg
|
||||
</screen>
|
||||
|
||||
or
|
||||
|
||||
<screen>
|
||||
cat /var/log/messages
|
||||
</screen>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
If the driver is loaded properly you will have messages similar to this:
|
||||
</para>
|
||||
|
||||
<para><screen>
|
||||
sis900.c: v1.07.06 11/07/2000
|
||||
eth0: SiS 900 PCI Fast Ethernet at 0xd000, IRQ 10, 00:00:e8:83:7f:a4.
|
||||
eth0: SiS 900 Internal MII PHY transceiver found at address 1.
|
||||
eth0: Using SiS 900 Internal MII PHY as default
|
||||
</screen></para>
|
||||
|
||||
<para>
|
||||
showing the version of the driver and the results of probing routine.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Once the driver is loaded, network can be brought up by
|
||||
</para>
|
||||
|
||||
<para><screen>
|
||||
/sbin/ifconfig eth0 IPADDR broadcast BROADCAST netmask NETMASK media TYPE
|
||||
</screen></para>
|
||||
|
||||
<para>
|
||||
where IPADDR, BROADCAST, NETMASK are your IP address, broadcast address and
|
||||
netmask respectively. TYPE is used to set medium type used by the device.
|
||||
Typical values are "10baseT"(twisted-pair 10Mbps Ethernet) or "100baseT"
|
||||
(twisted-pair 100Mbps Ethernet). For more information on how to configure
|
||||
network interface, please refer to
|
||||
<ulink url="http://www.tldp.org/">Networking HOWTO</ulink>.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The link status is also shown by kernel messages. For example, after the
|
||||
network interface is activated, you may have the message:
|
||||
</para>
|
||||
|
||||
<para><screen>
|
||||
eth0: Media Link On 100mbps full-duplex
|
||||
</screen></para>
|
||||
|
||||
<para>
|
||||
If you try to unplug the twisted pair (TP) cable you will get
|
||||
</para>
|
||||
|
||||
<para><screen>
|
||||
eth0: Media Link Off
|
||||
</screen></para>
|
||||
|
||||
<para>
|
||||
indicating that the link has failed.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
<sect1>
|
||||
<title>Building the driver into kernel</title>
|
||||
|
||||
<para>
|
||||
If you want to make the driver into kernel, choose <quote>Y</quote>
|
||||
rather than <quote>M</quote> on
|
||||
<quote>SiS 900/7016 PCI Fast Ethernet Adapter support</quote>
|
||||
when configuring the kernel. Build the kernel image in the usual way
|
||||
</para>
|
||||
|
||||
<para><screen>
|
||||
make clean
|
||||
|
||||
make bzlilo
|
||||
</screen></para>
|
||||
|
||||
<para>
|
||||
The next time the system reboots, the driver will already be in memory.
|
||||
</para>
|
||||
|
||||
</sect1>
|
||||
</chapter>
|
||||
|
||||
<chapter id="problems">
|
||||
<title>Known Problems and Bugs</title>
|
||||
|
||||
<para>
|
||||
There are some known problems and bugs. If you find any other bugs please
|
||||
report them to <ulink url="mailto:lcchang@sis.com.tw">lcchang@sis.com.tw</ulink>.
|
||||
|
||||
<orderedlist>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
The AM79C901 HomePNA PHY is not thoroughly tested; there may be some
|
||||
bugs in the <quote>on the fly</quote> change of transceiver.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
A bug is hidden somewhere in the receive buffer management code,
|
||||
which causes a NULL pointer reference in the kernel. This fault is
|
||||
caught before bad things happen and reported with the message:
|
||||
|
||||
<computeroutput>
|
||||
eth0: NULL pointer encountered in Rx ring, skipping
|
||||
</computeroutput>
|
||||
|
||||
which can be viewed with <literal remap="tt">dmesg</literal> or
|
||||
<literal remap="tt">cat /var/log/messages</literal>.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
Changing the media type from 10Mbps to 100Mbps twisted-pair Ethernet
|
||||
with ifconfig causes the media link to go down.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
</orderedlist>
|
||||
</para>
|
||||
</chapter>
|
||||
|
||||
<chapter id="RHistory">
|
||||
<title>Revision History</title>
|
||||
|
||||
<para>
|
||||
<itemizedlist>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
November 13, 2000, Revision 1.07, seventh release, 630E problem fixed
|
||||
and further clean up.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
November 4, 1999, Revision 1.06, Second release, lots of clean up
|
||||
and optimization.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
August 8, 1999, Revision 1.05, Initial Public Release
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
</itemizedlist>
|
||||
</para>
|
||||
</chapter>
|
||||
|
||||
<chapter id="acknowledgements">
|
||||
<title>Acknowledgements</title>
|
||||
|
||||
<para>
|
||||
This driver was originally derived from
|
||||
<ulink url="mailto:becker@cesdis1.gsfc.nasa.gov">Donald Becker</ulink>'s
|
||||
<ulink url="ftp://cesdis.gsfc.nasa.gov/pub/linux/drivers/kern-2.3/pci-skeleton.c"
|
||||
>pci-skeleton</ulink> and
|
||||
<ulink url="ftp://cesdis.gsfc.nasa.gov/pub/linux/drivers/kern-2.3/rtl8139.c"
|
||||
>rtl8139</ulink> drivers. Donald also provided various suggestions
|
||||
regarding improvements made in revision 1.06.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The 1.05 revision was created by
|
||||
<ulink url="mailto:cmhuang@sis.com.tw">Jim Huang</ulink>; AMD 79c901
|
||||
support was added by <ulink url="mailto:lcs@sis.com.tw">Chin-Shan Li</ulink>.
|
||||
</para>
|
||||
</chapter>
|
||||
|
||||
<chapter id="functions">
|
||||
<title>List of Functions</title>
|
||||
!Idrivers/net/sis900.c
|
||||
</chapter>
|
||||
|
||||
</book>
|
327
Documentation/DocBook/tulip-user.tmpl
Normal file
|
@ -0,0 +1,327 @@
|
|||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
|
||||
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
|
||||
|
||||
<book id="TulipUserGuide">
|
||||
<bookinfo>
|
||||
<title>Tulip Driver User's Guide</title>
|
||||
|
||||
<authorgroup>
|
||||
<author>
|
||||
<firstname>Jeff</firstname>
|
||||
<surname>Garzik</surname>
|
||||
<affiliation>
|
||||
<address>
|
||||
<email>jgarzik@pobox.com</email>
|
||||
</address>
|
||||
</affiliation>
|
||||
</author>
|
||||
</authorgroup>
|
||||
|
||||
<copyright>
|
||||
<year>2001</year>
|
||||
<holder>Jeff Garzik</holder>
|
||||
</copyright>
|
||||
|
||||
<legalnotice>
|
||||
<para>
|
||||
This documentation is free software; you can redistribute
|
||||
it and/or modify it under the terms of the GNU General Public
|
||||
License as published by the Free Software Foundation; either
|
||||
version 2 of the License, or (at your option) any later
|
||||
version.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
This program is distributed in the hope that it will be
|
||||
useful, but WITHOUT ANY WARRANTY; without even the implied
|
||||
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
|
||||
See the GNU General Public License for more details.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
You should have received a copy of the GNU General Public
|
||||
License along with this program; if not, write to the Free
|
||||
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
|
||||
MA 02111-1307 USA
|
||||
</para>
|
||||
|
||||
<para>
|
||||
For more details see the file COPYING in the source
|
||||
distribution of Linux.
|
||||
</para>
|
||||
</legalnotice>
|
||||
</bookinfo>
|
||||
|
||||
<toc></toc>
|
||||
|
||||
<chapter id="intro">
|
||||
<title>Introduction</title>
|
||||
<para>
|
||||
The Tulip Ethernet Card Driver
|
||||
is maintained by Jeff Garzik (<email>jgarzik@pobox.com</email>).
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The Tulip driver was developed by Donald Becker and changed by
|
||||
Jeff Garzik, Takashi Manabe and a cast of thousands.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
For 2.4.x and later kernels, the Linux Tulip driver is available at
|
||||
<ulink url="http://sourceforge.net/projects/tulip/">http://sourceforge.net/projects/tulip/</ulink>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
This driver is for the Digital "Tulip" Ethernet adapter interface.
|
||||
It should work with most DEC 21*4*-based chips/ethercards, as well as
|
||||
with work-alike chips from Lite-On (PNIC), Macronix (MXIC), and ASIX.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The original author may be reached as becker@scyld.com, or C/O
|
||||
Scyld Computing Corporation,
|
||||
410 Severn Ave., Suite 210,
|
||||
Annapolis MD 21403
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Additional information on Donald Becker's tulip.c
|
||||
is available at <ulink url="http://www.scyld.com/network/tulip.html">http://www.scyld.com/network/tulip.html</ulink>
|
||||
</para>
|
||||
|
||||
</chapter>
|
||||
|
||||
<chapter id="drvr-compat">
|
||||
<title>Driver Compatibility</title>
|
||||
|
||||
<para>
|
||||
This device driver is designed for the DECchip "Tulip", Digital's
|
||||
single-chip ethernet controllers for PCI (now owned by Intel).
|
||||
Supported members of the family
|
||||
are the 21040, 21041, 21140, 21140A, 21142, and 21143. Similar work-alike
|
||||
chips from Lite-On, Macronix, ASIX, Compex and others listed below are also
|
||||
supported.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
These chips are used on at least 140 unique PCI board designs. The great
|
||||
number of chips and board designs supported is the reason for the
|
||||
driver size and complexity. Almost all of the increased complexity is in the
|
||||
board configuration and media selection code. There is very little
|
||||
increase in the operational critical path length.
|
||||
</para>
|
||||
</chapter>
|
||||
|
||||
<chapter id="board-settings">
|
||||
<title>Board-specific Settings</title>
|
||||
|
||||
<para>
|
||||
PCI bus devices are configured by the system at boot time, so no jumpers
|
||||
need to be set on the board. The system BIOS preferably should assign the
|
||||
PCI INTA signal to an otherwise unused system IRQ line.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Some boards have EEPROM tables with a default media entry. The factory default
|
||||
is usually "autoselect". This should only be overridden when using
|
||||
transceiver connections without link beat, e.g. 10base2 or AUI, or (rarely!)
|
||||
for forcing full-duplex when used with old link partners that do not do
|
||||
autonegotiation.
|
||||
</para>
|
||||
</chapter>
|
||||
|
||||
<chapter id="driver-operation">
|
||||
<title>Driver Operation</title>
|
||||
|
||||
<sect1><title>Ring buffers</title>
|
||||
|
||||
<para>
|
||||
The Tulip can use either ring buffers or lists of Tx and Rx descriptors.
|
||||
This driver uses statically allocated rings of Rx and Tx descriptors, set at
|
||||
compile time by RX_RING_SIZE and TX_RING_SIZE. This version of the driver allocates skbuffs
|
||||
for the Rx ring buffers at open() time and passes the skb->data field to the
|
||||
Tulip as receive data buffers. When an incoming frame is less than
|
||||
RX_COPYBREAK bytes long, a fresh skbuff is allocated and the frame is
|
||||
copied to the new skbuff. When the incoming frame is larger, the skbuff is
|
||||
passed directly up the protocol stack and replaced by a newly allocated
|
||||
skbuff.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The RX_COPYBREAK value is chosen to trade off the memory wasted by
|
||||
using a full-sized skbuff for small frames vs. the copying costs of larger
|
||||
frames. For small frames the copying cost is negligible (esp. considering
|
||||
that we are pre-loading the cache with immediately useful header
|
||||
information). For large frames the copying cost is non-trivial, and the
|
||||
larger copy might flush the cache of useful data. A subtle aspect of this
|
||||
choice is that the Tulip only receives into longword aligned buffers, thus
|
||||
the IP header at offset 14 isn't longword aligned for further processing.
|
||||
Copied frames are put into the new skbuff at an offset of "+2", thus copying
|
||||
has the beneficial effect of aligning the IP header and preloading the
|
||||
cache.
|
||||
</para>
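<para>
The copy-or-pass decision described above can be sketched as a small
user-space model. This is not the driver's code: the
<literal>ex_</literal> names, the threshold of 100 bytes and the 1536-byte
buffer size are assumptions made for illustration, and plain
<literal>malloc()</literal> stands in for skbuff allocation.
</para>

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical values for this sketch; the driver defines its own. */
#define EX_RX_COPYBREAK 100   /* frames shorter than this are copied */
#define EX_PKT_BUF_SZ   1536  /* assumed size of a full receive buffer */
#define EX_IP_ALIGN     2     /* the "+2" offset: 14 + 2 = 16, longword aligned */

struct ex_buf {
    unsigned char *head;  /* start of the allocation */
    unsigned char *data;  /* start of the frame inside the allocation */
    size_t len;           /* frame length in bytes */
};

/* Allocate a buffer with 'reserve' bytes of headroom before the frame. */
static struct ex_buf *ex_alloc(size_t size, size_t reserve)
{
    struct ex_buf *b = malloc(sizeof(*b));

    if (!b)
        return NULL;
    b->head = malloc(size + reserve);
    if (!b->head) {
        free(b);
        return NULL;
    }
    b->data = b->head + reserve;
    b->len = 0;
    return b;
}

static void ex_free(struct ex_buf *b)
{
    if (b) {
        free(b->head);
        free(b);
    }
}

/*
 * Handle one received frame of frame_len bytes sitting in the ring slot.
 * Returns the buffer to pass up the stack, or NULL if allocation failed
 * (the slot is left untouched in that case).
 */
static struct ex_buf *ex_rx_frame(struct ex_buf **slot, size_t frame_len,
                                  int *copied)
{
    struct ex_buf *rx, *fresh;

    assert(slot && *slot && copied);
    rx = *slot;

    if (frame_len < EX_RX_COPYBREAK) {
        /* Small frame: copy it out at offset +2, keep the ring buffer. */
        struct ex_buf *small = ex_alloc(frame_len, EX_IP_ALIGN);

        if (!small)
            return NULL;
        memcpy(small->data, rx->data, frame_len);
        small->len = frame_len;
        *copied = 1;
        return small;
    }

    /* Large frame: pass the ring buffer up and refill the slot. */
    fresh = ex_alloc(EX_PKT_BUF_SZ, 0);
    if (!fresh)
        return NULL;
    rx->len = frame_len;
    *slot = fresh;
    *copied = 0;
    return rx;
}
```

<para>
In this model a 60-byte frame comes back in a new, small buffer and the ring
slot keeps its original buffer, while a 1000-byte frame hands the original
buffer upward and leaves a newly allocated one in the slot.
</para>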
|
||||
|
||||
</sect1>
|
||||
|
||||
<sect1><title>Synchronization</title>
|
||||
<para>
|
||||
The driver runs as two independent, single-threaded flows of control. One
|
||||
is the send-packet routine, which enforces single-threaded use by the
|
||||
dev->tbusy flag. The other thread is the interrupt handler, which is single
|
||||
threaded by the hardware and other software.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The send packet thread has partial control over the Tx ring and 'dev->tbusy'
|
||||
flag. It sets the tbusy flag whenever it's queuing a Tx packet. If the next
|
||||
queue slot is empty, it clears the tbusy flag when finished; otherwise it sets
|
||||
the 'tp->tx_full' flag.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The interrupt handler has exclusive control over the Rx ring and records stats
|
||||
from the Tx ring. (The Tx-done interrupt can't be selectively turned off, so
|
||||
we can't avoid the interrupt overhead by having the Tx routine reap the Tx
|
||||
stats.) After reaping the stats, it marks the queue entry as empty by setting
|
||||
the 'base' to zero. Iff the 'tp->tx_full' flag is set, it clears both the
|
||||
tx_full and tbusy flags.
|
||||
</para>
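<para>
The flag handshake between the two flows of control can be modelled in a few
lines. The following is a single-threaded user-space sketch under stated
assumptions, not the driver's code: the <literal>ex_</literal> names, the
ring size of 4 and the <literal>slot_used</literal> array are hypothetical,
and no real locking or interrupt context is involved.
</para>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EX_TX_RING_SIZE 4  /* hypothetical ring size for this sketch */

struct ex_tx_state {
    bool slot_used[EX_TX_RING_SIZE];
    unsigned int cur_tx;    /* next slot the send routine fills */
    unsigned int dirty_tx;  /* oldest slot not yet reaped */
    bool tbusy;             /* models dev->tbusy */
    bool tx_full;           /* models tp->tx_full */
};

/* Send path: queue one packet. Returns false if the queue is busy. */
static bool ex_start_xmit(struct ex_tx_state *s)
{
    assert(s != NULL);
    if (s->tbusy)
        return false;

    s->tbusy = true;  /* set while a packet is being queued */
    s->slot_used[s->cur_tx % EX_TX_RING_SIZE] = true;
    s->cur_tx++;

    if (!s->slot_used[s->cur_tx % EX_TX_RING_SIZE])
        s->tbusy = false;   /* next slot is empty: accept more packets */
    else
        s->tx_full = true;  /* ring is full: tbusy stays set */
    return true;
}

/* Interrupt path: reap one completed Tx entry. */
static void ex_tx_done_irq(struct ex_tx_state *s)
{
    assert(s != NULL);
    if (s->dirty_tx == s->cur_tx)
        return;  /* nothing outstanding */

    s->slot_used[s->dirty_tx % EX_TX_RING_SIZE] = false;  /* mark empty */
    s->dirty_tx++;

    if (s->tx_full) {
        /* Only the interrupt path reopens a full queue. */
        s->tx_full = false;
        s->tbusy = false;
    }
}
```

<para>
With a ring of four entries, the fourth call to
<literal>ex_start_xmit()</literal> leaves both flags set, further calls are
refused, and a single call to <literal>ex_tx_done_irq()</literal> clears both
flags again.
</para>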
|
||||
|
||||
</sect1>
|
||||
|
||||
</chapter>
|
||||
|
||||
<chapter id="errata">
|
||||
<title>Errata</title>
|
||||
|
||||
<para>
|
||||
The old DEC databooks were light on details.
|
||||
The 21040 databook claims that CSR13, CSR14, and CSR15 should each be the last
|
||||
register of the set CSR12-15 written. Hmmm, now how is that possible?
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The DEC SROM format is very badly designed and not precisely defined, leading to
|
||||
part of the media selection junkheap below. Some boards do not have EEPROM
|
||||
media tables and need to be patched up. Worse, other boards use the DEC
|
||||
design kit media table when it isn't correct for their board.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
We cannot use MII interrupts because there is no defined GPIO pin to attach
|
||||
them to. The MII transceiver status is polled using a kernel timer.
|
||||
</para>
|
||||
</chapter>
|
||||
|
||||
<chapter id="changelog">
|
||||
<title>Driver Change History</title>
|
||||
|
||||
<sect1><title>Version 0.9.14 (February 20, 2001)</title>
|
||||
<itemizedlist>
|
||||
<listitem><para>Fix PNIC problems (Manfred Spraul)</para></listitem>
|
||||
<listitem><para>Add new PCI id for Accton comet</para></listitem>
|
||||
<listitem><para>Support Davicom tulips</para></listitem>
|
||||
<listitem><para>Fix oops in eeprom parsing</para></listitem>
|
||||
<listitem><para>Enable workarounds for early PCI chipsets</para></listitem>
|
||||
<listitem><para>IA64, hppa csr0 support</para></listitem>
|
||||
<listitem><para>Support media types 5, 6</para></listitem>
|
||||
<listitem><para>Interpret a bit more of the 21142 SROM extended media type 3</para></listitem>
|
||||
<listitem><para>Add missing delay in eeprom reading</para></listitem>
|
||||
</itemizedlist>
|
||||
</sect1>
|
||||
|
||||
<sect1><title>Version 0.9.11 (November 3, 2000)</title>
|
||||
<itemizedlist>
|
||||
<listitem><para>Eliminate extra bus accesses when sharing interrupts (prumpf)</para></listitem>
|
||||
<listitem><para>Barrier following ownership descriptor bit flip (prumpf)</para></listitem>
|
||||
<listitem><para>Endianness fixes for >14 addresses in setup frames (prumpf)</para></listitem>
|
||||
<listitem><para>Report link beat to kernel/userspace via netif_carrier_*. (kuznet)</para></listitem>
|
||||
<listitem><para>Better spinlocking in set_rx_mode.</para></listitem>
|
||||
<listitem><para>Fix I/O resource request failure error messages (DaveM catch)</para></listitem>
|
||||
<listitem><para>Handle DMA allocation failure.</para></listitem>
|
||||
</itemizedlist>
|
||||
</sect1>
|
||||
|
||||
<sect1><title>Version 0.9.10 (September 6, 2000)</title>
|
||||
<itemizedlist>
|
||||
<listitem><para>Simple interrupt mitigation (via jamal)</para></listitem>
|
||||
<listitem><para>More PCI ids</para></listitem>
|
||||
</itemizedlist>
|
||||
</sect1>
|
||||
|
||||
<sect1><title>Version 0.9.9 (August 11, 2000)</title>
|
||||
<itemizedlist>
|
||||
<listitem><para>More PCI ids</para></listitem>
|
||||
</itemizedlist>
|
||||
</sect1>
|
||||
|
||||
<sect1><title>Version 0.9.8 (July 13, 2000)</title>
|
||||
<itemizedlist>
|
||||
<listitem><para>Correct signed/unsigned comparison for dummy frame index</para></listitem>
|
||||
<listitem><para>Remove outdated references to struct enet_statistics</para></listitem>
|
||||
</itemizedlist>
|
||||
</sect1>
|
||||
|
||||
<sect1><title>Version 0.9.7 (June 17, 2000)</title>
|
||||
<itemizedlist>
|
||||
<listitem><para>Timer cleanups (Andrew Morton)</para></listitem>
|
||||
<listitem><para>Alpha compile fix (somebody?)</para></listitem>
|
||||
</itemizedlist>
|
||||
</sect1>
|
||||
|
||||
<sect1><title>Version 0.9.6 (May 31, 2000)</title>
|
||||
<itemizedlist>
|
||||
<listitem><para>Revert 21143-related support flag patch</para></listitem>
|
||||
<listitem><para>Add HPPA/media-table debugging printk</para></listitem>
|
||||
</itemizedlist>
|
||||
</sect1>
|
||||
|
||||
<sect1><title>Version 0.9.5 (May 30, 2000)</title>
|
||||
<itemizedlist>
|
||||
<listitem><para>HPPA support (willy@puffingroup)</para></listitem>
|
||||
<listitem><para>CSR6 bits and tulip.h cleanup (Chris Smith)</para></listitem>
|
||||
<listitem><para>Improve debugging messages a bit</para></listitem>
|
||||
<listitem><para>Add delay after CSR13 write in t21142_start_nway</para></listitem>
|
||||
<listitem><para>Remove unused ETHER_STATS code</para></listitem>
|
||||
<listitem><para>Convert 'extern inline' to 'static inline' in tulip.h (Chris Smith)</para></listitem>
|
||||
<listitem><para>Update DS21143 support flags in tulip_chip_info[]</para></listitem>
|
||||
<listitem><para>Use spin_lock_irq, not _irqsave/restore, in tulip_start_xmit()</para></listitem>
|
||||
<listitem><para>Add locking to set_rx_mode()</para></listitem>
|
||||
<listitem><para>Fix race with chip setting DescOwned bit (Hal Murray)</para></listitem>
|
||||
<listitem><para>Request 100% of PIO and MMIO resource space assigned to card</para></listitem>
|
||||
<listitem><para>Remove error message from pci_enable_device failure</para></listitem>
|
||||
</itemizedlist>
|
||||
</sect1>
|
||||
|
||||
<sect1><title>Version 0.9.4.3 (April 14, 2000)</title>
|
||||
<itemizedlist>
|
||||
<listitem><para>mod_timer fix (Hal Murray)</para></listitem>
|
||||
<listitem><para>PNIC2 resuscitation (Chris Smith)</para></listitem>
|
||||
</itemizedlist>
|
||||
</sect1>
|
||||
|
||||
<sect1><title>Version 0.9.4.2 (March 21, 2000)</title>
|
||||
<itemizedlist>
|
||||
<listitem><para>Fix 21041 CSR7, CSR13/14/15 handling</para></listitem>
|
||||
<listitem><para>Merge some PCI ids from tulip 0.91x</para></listitem>
|
||||
<listitem><para>Merge some HAS_xxx flags and flag settings from tulip 0.91x</para></listitem>
|
||||
<listitem><para>asm/io.h fix (submitted by many) and cleanup</para></listitem>
|
||||
<listitem><para>s/HAS_NWAY143/HAS_NWAY/</para></listitem>
|
||||
<listitem><para>Cleanup 21041 mode reporting</para></listitem>
|
||||
<listitem><para>Small code cleanups</para></listitem>
|
||||
</itemizedlist>
|
||||
</sect1>
|
||||
|
||||
<sect1><title>Version 0.9.4.1 (March 18, 2000)</title>
|
||||
<itemizedlist>
|
||||
<listitem><para>Finish PCI DMA conversion (davem)</para></listitem>
|
||||
<listitem><para>Do not netif_start_queue() at end of tulip_tx_timeout() (kuznet)</para></listitem>
|
||||
<listitem><para>PCI DMA fix (kuznet)</para></listitem>
|
||||
<listitem><para>eeprom.c code cleanup</para></listitem>
|
||||
<listitem><para>Remove Xircom Tulip crud</para></listitem>
|
||||
</itemizedlist>
|
||||
</sect1>
|
||||
</chapter>
|
||||
|
||||
</book>
|
979
Documentation/DocBook/usb.tmpl
Normal file
|
@ -0,0 +1,979 @@
|
|||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
|
||||
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
|
||||
|
||||
<book id="Linux-USB-API">
|
||||
<bookinfo>
|
||||
<title>The Linux-USB Host Side API</title>
|
||||
|
||||
<legalnotice>
|
||||
<para>
|
||||
This documentation is free software; you can redistribute
|
||||
it and/or modify it under the terms of the GNU General Public
|
||||
License as published by the Free Software Foundation; either
|
||||
version 2 of the License, or (at your option) any later
|
||||
version.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
This program is distributed in the hope that it will be
|
||||
useful, but WITHOUT ANY WARRANTY; without even the implied
|
||||
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
|
||||
See the GNU General Public License for more details.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
You should have received a copy of the GNU General Public
|
||||
License along with this program; if not, write to the Free
|
||||
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
|
||||
MA 02111-1307 USA
|
||||
</para>
|
||||
|
||||
<para>
|
||||
For more details see the file COPYING in the source
|
||||
distribution of Linux.
|
||||
</para>
|
||||
</legalnotice>
|
||||
</bookinfo>
|
||||
|
||||
<toc></toc>
|
||||
|
||||
<chapter id="intro">
|
||||
<title>Introduction to USB on Linux</title>
|
||||
|
||||
<para>A Universal Serial Bus (USB) is used to connect a host,
|
||||
such as a PC or workstation, to a number of peripheral
|
||||
devices. USB uses a tree structure, with the host at the
|
||||
root (the system's master), hubs as interior nodes, and
|
||||
peripheral devices as leaves (and slaves).
|
||||
Modern PCs support several such trees of USB devices, usually
|
||||
one USB 2.0 tree (480 Mbit/sec each) with
|
||||
a few USB 1.1 trees (12 Mbit/sec each) that are used when you
|
||||
connect a USB 1.1 device directly to the machine's "root hub".
|
||||
</para>
|
||||
|
||||
<para>That master/slave asymmetry was designed in part for
|
||||
ease of use. It is not physically possible to assemble
|
||||
(legal) USB cables incorrectly: all upstream "to-the-host"
|
||||
connectors are the rectangular type, matching the sockets on
|
||||
root hubs, and the downstream connectors are the squarish type
|
||||
(or they are built in to the peripheral).
|
||||
Software doesn't need to deal with distributed autoconfiguration
|
||||
since the pre-designated master node manages all that.
|
||||
At the electrical level, bus protocol overhead is reduced by
|
||||
eliminating arbitration and moving scheduling into host software.
|
||||
</para>
|
||||
|
||||
<para>USB 1.0 was announced in January 1996, and was revised
|
||||
as USB 1.1 (with improvements in hub specification and
|
||||
support for interrupt-out transfers) in September 1998.
|
||||
USB 2.0 was released in April 2000, including high speed
|
||||
transfers and transaction translating hubs (used for USB 1.1
|
||||
and 1.0 backward compatibility).
|
||||
</para>
|
||||
|
||||
<para>USB support was added to Linux early in the 2.2 kernel series
|
||||
shortly before the 2.3 development forked off. Updates
|
||||
from 2.3 were regularly folded back into 2.2 releases, bringing
|
||||
new features such as <filename>/sbin/hotplug</filename> support,
|
||||
more drivers, and more robustness.
|
||||
The 2.5 kernel series continued such improvements, and also
|
||||
worked on USB 2.0 support,
|
||||
higher performance,
|
||||
better consistency between host controller drivers,
|
||||
API simplification (to make bugs less likely),
|
||||
and providing internal "kerneldoc" documentation.
|
||||
</para>
|
||||
|
||||
<para>Linux can run inside USB devices as well as on
|
||||
the hosts that control the devices.
|
||||
Because the Linux 2.x USB support evolved to support mass market
|
||||
platforms such as Apple Macintosh or PC-compatible systems,
|
||||
it didn't address design concerns for device-side USB systems.
|
||||
So it can't be used inside mass-market PDAs, or other peripherals.
|
||||
USB device drivers running inside those Linux peripherals
|
||||
don't do the same things as the ones running inside hosts,
|
||||
and so they've been given a different name:
|
||||
they're called <emphasis>gadget drivers</emphasis>.
|
||||
This document does not present gadget drivers.
|
||||
</para>
|
||||
|
||||
</chapter>
|
||||
|
||||
<chapter id="host">
|
||||
<title>USB Host-Side API Model</title>
|
||||
|
||||
<para>Within the kernel,
|
||||
host-side drivers for USB devices talk to the "usbcore" APIs.
|
||||
There are two types of public "usbcore" APIs, targeted at two different
|
||||
layers of USB driver. Those are
|
||||
<emphasis>general purpose</emphasis> drivers, exposed through
|
||||
driver frameworks such as block, character, or network devices;
|
||||
and drivers that are <emphasis>part of the core</emphasis>,
|
||||
which are involved in managing a USB bus.
|
||||
Such core drivers include the <emphasis>hub</emphasis> driver,
|
||||
which manages trees of USB devices, and several different kinds
|
||||
of <emphasis>host controller driver (HCD)</emphasis>,
|
||||
which control individual busses.
|
||||
</para>
|
||||
|
||||
<para>The device model seen by USB drivers is relatively complex.
|
||||
</para>
|
||||
|
||||
<itemizedlist>
|
||||
|
||||
<listitem><para>USB supports four kinds of data transfer
|
||||
(control, bulk, interrupt, and isochronous). Two transfer
|
||||
types use bandwidth as it's available (control and bulk),
|
||||
while the other two types of transfer (interrupt and isochronous)
|
||||
are scheduled to provide guaranteed bandwidth.
|
||||
</para></listitem>
|
||||
|
||||
<listitem><para>The device description model includes one or more
|
||||
"configurations" per device, only one of which is active at a time.
|
||||
Devices that are capable of high speed operation must also support
|
||||
full speed configurations, along with a way to ask about the
|
||||
"other speed" configurations that might be used.
|
||||
</para></listitem>
|
||||
|
||||
<listitem><para>Configurations have one or more "interfaces", each
|
||||
of which may have "alternate settings". Interfaces may be
|
||||
standardized by USB "Class" specifications, or may be specific to
|
||||
a vendor or device.</para>
|
||||
|
||||
<para>USB device drivers actually bind to interfaces, not devices.
|
||||
Think of them as "interface drivers", though you
|
||||
may not see many devices where the distinction is important.
|
||||
<emphasis>Most USB devices are simple, with only one configuration,
|
||||
one interface, and one alternate setting.</emphasis>
|
||||
</para></listitem>
|
||||
|
||||
<listitem><para>Interfaces have one or more "endpoints", each of
|
||||
which supports one type and direction of data transfer such as
|
||||
"bulk out" or "interrupt in". The entire configuration may have
|
||||
up to sixteen endpoints in each direction, allocated as needed
|
||||
among all the interfaces.
|
||||
</para></listitem>
|
||||
|
||||
<listitem><para>Data transfer on USB is packetized; each endpoint
|
||||
has a maximum packet size.
|
||||
Drivers must often be aware of conventions such as flagging the end
|
||||
of bulk transfers using "short" (including zero length) packets.
|
||||
</para></listitem>
|
||||
|
||||
<listitem><para>The Linux USB API supports synchronous calls for
|
||||
control and bulk messaging.
|
||||
It also supports asynchronous calls for all kinds of data transfer,
|
||||
using request structures called "URBs" (USB Request Blocks).
|
||||
</para></listitem>
|
||||
|
||||
</itemizedlist>
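<para>
Several points in the list above reduce to small bit and arithmetic rules.
The sketch below is a stand-alone illustration, not part of the kernel API:
the <literal>EX_</literal> and <literal>ex_</literal> names are hypothetical,
although the mask values follow the endpoint descriptor layout in chapter 9
of the USB specification. Whether a terminating short packet is actually
required depends on the protocol the device speaks.
</para>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Endpoint descriptor fields as laid out in chapter 9 of the USB spec. */
#define EX_ENDPOINT_NUMBER_MASK   0x0f  /* bEndpointAddress bits 3..0 */
#define EX_ENDPOINT_DIR_IN        0x80  /* bEndpointAddress bit 7 */
#define EX_ENDPOINT_XFERTYPE_MASK 0x03  /* bmAttributes bits 1..0 */

enum ex_xfer_type {
    EX_XFER_CONTROL = 0,
    EX_XFER_ISOC    = 1,
    EX_XFER_BULK    = 2,
    EX_XFER_INT     = 3
};

/* Endpoint number, 0..15: sixteen endpoints per direction. */
static unsigned int ex_ep_number(uint8_t ep_address)
{
    return ep_address & EX_ENDPOINT_NUMBER_MASK;
}

/* True for device-to-host ("in") endpoints. */
static bool ex_ep_is_in(uint8_t ep_address)
{
    return (ep_address & EX_ENDPOINT_DIR_IN) != 0;
}

static enum ex_xfer_type ex_ep_type(uint8_t attributes)
{
    return (enum ex_xfer_type)(attributes & EX_ENDPOINT_XFERTYPE_MASK);
}

/* Interrupt and isochronous transfers are scheduled with reserved bandwidth. */
static bool ex_is_periodic(enum ex_xfer_type type)
{
    return type == EX_XFER_INT || type == EX_XFER_ISOC;
}

/*
 * Number of packets needed to send 'len' bytes when the end of the
 * transfer is flagged by a short packet. The final packet carries
 * len % max_packet bytes, which is zero when len is an exact multiple.
 */
static size_t ex_packets_with_short_end(size_t len, size_t max_packet,
                                        size_t *last_len)
{
    assert(max_packet > 0);
    if (last_len)
        *last_len = len % max_packet;
    return len / max_packet + 1;
}
```

<para>
For example, an endpoint address of 0x81 decodes to endpoint 1 in the "in"
direction, and a 1024-byte transfer on an endpoint with a 512-byte maximum
packet size takes two full packets plus one zero-length packet.
</para>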
|
||||
|
||||
<para>Accordingly, the USB Core API exposed to device drivers
|
||||
covers quite a lot of territory. You'll probably need to consult
|
||||
the USB 2.0 specification, available online from www.usb.org at
|
||||
no cost, as well as class or device specifications.
|
||||
</para>
|
||||
|
||||
<para>The only host-side drivers that actually touch hardware
|
||||
(reading/writing registers, handling IRQs, and so on) are the HCDs.
|
||||
In theory, all HCDs provide the same functionality through the same
|
||||
API. In practice, that's becoming more true on the 2.5 kernels,
|
||||
but there are still differences that crop up especially with
|
||||
fault handling. Different controllers don't necessarily report
|
||||
the same aspects of failures, and recovery from faults (including
|
||||
software-induced ones like unlinking an URB) isn't yet fully
|
||||
consistent.
|
||||
Device driver authors should make a point of doing disconnect
|
||||
testing (while the device is active) with each different host
|
||||
controller driver, to make sure drivers don't have bugs of
|
||||
their own as well as to make sure they aren't relying on some
|
||||
HCD-specific behavior.
|
||||
(You will need external USB 1.1 and/or
|
||||
USB 2.0 hubs to perform all those tests.)
|
||||
</para>
|
||||
|
||||
</chapter>
|
||||
|
||||
<chapter><title>USB-Standard Types</title>
|
||||
|
||||
<para>In <filename><linux/usb_ch9.h></filename> you will find
|
||||
the USB data types defined in chapter 9 of the USB specification.
|
||||
These data types are used throughout USB, and in APIs including
|
||||
this host side API, gadget APIs, and usbfs.
|
||||
</para>
|
||||
|
||||
!Iinclude/linux/usb_ch9.h
|
||||
|
||||
</chapter>
|
||||
|
||||
<chapter><title>Host-Side Data Types and Macros</title>
|
||||
|
||||
<para>The host side API exposes several layers to drivers, some of
|
||||
which are more necessary than others.
|
||||
These support lifecycle models for host side drivers
|
||||
and devices, and support passing buffers through usbcore to
|
||||
some HCD that performs the I/O for the device driver.
|
||||
</para>
|
||||
|
||||
|
||||
!Iinclude/linux/usb.h
|
||||
|
||||
</chapter>
|
||||
|
||||
<chapter><title>USB Core APIs</title>
|
||||
|
||||
<para>There are two basic I/O models in the USB API.
|
||||
The most elemental one is asynchronous: drivers submit requests
|
||||
in the form of an URB, and the URB's completion callback
|
||||
handles the next step.
|
||||
All USB transfer types support that model, although there
|
||||
are special cases for control URBs (which always have setup
|
||||
and status stages, but may not have a data stage) and
|
||||
isochronous URBs (which allow large packets and include
|
||||
per-packet fault reports).
|
||||
Built on top of that is synchronous API support, where a
|
||||
driver calls a routine that allocates one or more URBs,
|
||||
submits them, and waits until they complete.
|
||||
There are synchronous wrappers for single-buffer control
|
||||
and bulk transfers (which are awkward to use in some
|
||||
driver disconnect scenarios), and for scatterlist based
|
||||
streaming i/o (bulk or interrupt).
|
||||
</para>
|
||||
|
||||
<para>USB drivers need to provide buffers that can be
|
||||
used for DMA, although they don't necessarily need to
|
||||
provide the DMA mapping themselves.
|
||||
There are APIs to use when allocating DMA buffers,
|
||||
which can prevent use of bounce buffers on some systems.
|
||||
In some cases, drivers may be able to rely on 64-bit DMA
|
||||
to eliminate another kind of bounce buffer.
|
||||
</para>
|
||||
|
||||
!Edrivers/usb/core/urb.c
|
||||
!Edrivers/usb/core/message.c
|
||||
!Edrivers/usb/core/file.c
|
||||
!Edrivers/usb/core/usb.c
|
||||
!Edrivers/usb/core/hub.c
|
||||
</chapter>
|
||||
|
||||
<chapter><title>Host Controller APIs</title>
|
||||
|
||||
<para>These APIs are only for use by host controller drivers,
|
||||
most of which implement standard register interfaces such as
|
||||
EHCI, OHCI, or UHCI.
|
||||
UHCI was one of the first interfaces, designed by Intel and
|
||||
also used by VIA; it doesn't do much in hardware.
|
||||
OHCI was designed later, to have the hardware do more work
|
||||
(bigger transfers, tracking protocol state, and so on).
|
||||
EHCI was designed with USB 2.0; its design has features that
|
||||
resemble OHCI (hardware does much more work) as well as
|
||||
UHCI (some parts of ISO support, TD list processing).
|
||||
</para>
|
||||
|
||||
<para>There are host controllers other than the "big three",
|
||||
although most PCI based controllers (and a few non-PCI based
|
||||
ones) use one of those interfaces.
|
||||
Not all host controllers use DMA; some use PIO, and there
|
||||
is also a simulator.
|
||||
</para>
|
||||
|
||||
<para>The same basic APIs are available to drivers for all
|
||||
those controllers.
|
||||
For historical reasons they are in two layers:
|
||||
<structname>struct usb_bus</structname> is a rather thin
|
||||
layer that became available in the 2.2 kernels, while
|
||||
<structname>struct usb_hcd</structname> is a more featureful
|
||||
layer (available in later 2.4 kernels and in 2.5) that
|
||||
lets HCDs share common code, to shrink driver size
|
||||
and significantly reduce hcd-specific behaviors.
|
||||
</para>
|
||||
|
||||
!Edrivers/usb/core/hcd.c
|
||||
!Edrivers/usb/core/hcd-pci.c
|
||||
!Edrivers/usb/core/buffer.c
|
||||
</chapter>
|
||||
|
||||
<chapter>
|
||||
<title>The USB Filesystem (usbfs)</title>
|
||||
|
||||
<para>This chapter presents the Linux <emphasis>usbfs</emphasis>.
|
||||
You may prefer to avoid writing new kernel code for your
|
||||
USB driver; that's the problem that usbfs set out to solve.
|
||||
User mode device drivers are usually packaged as applications
|
||||
or libraries, and may use usbfs through some programming library
|
||||
that wraps it. Such libraries include
|
||||
<ulink url="http://libusb.sourceforge.net">libusb</ulink>
|
||||
for C/C++, and
|
||||
<ulink url="http://jUSB.sourceforge.net">jUSB</ulink> for Java.
|
||||
</para>
|
||||
|
||||
<note><title>Unfinished</title>
|
||||
<para>This particular documentation is incomplete,
|
||||
especially with respect to the asynchronous mode.
|
||||
As of kernel 2.5.66 the code and this (new) documentation
|
||||
need to be cross-reviewed.
|
||||
</para>
|
||||
</note>
|
||||
|
||||
<para>Configure usbfs into Linux kernels by enabling the
|
||||
<emphasis>USB filesystem</emphasis> option (CONFIG_USB_DEVICEFS),
|
||||
and you get basic support for user mode USB device drivers.
|
||||
Until relatively recently it was often (confusingly) called
|
||||
<emphasis>usbdevfs</emphasis>, although it wasn't solving the same problem
|
||||
<emphasis>devfs</emphasis> was.
|
||||
Every USB device will appear in usbfs, regardless of whether or
|
||||
not it has a kernel driver; but only devices with kernel drivers
|
||||
show up in devfs.
|
||||
</para>
|
||||
|
||||
<sect1>
|
||||
<title>What files are in "usbfs"?</title>
|
||||
|
||||
<para>Conventionally mounted at
|
||||
<filename>/proc/bus/usb</filename>, usbfs
|
||||
features include:
|
||||
<itemizedlist>
|
||||
<listitem><para><filename>/proc/bus/usb/devices</filename>
|
||||
... a text file
|
||||
showing each of the USB devices known to the kernel,
|
||||
and their configuration descriptors.
|
||||
You can also poll() this to learn about new devices.
|
||||
</para></listitem>
|
||||
<listitem><para><filename>/proc/bus/usb/BBB/DDD</filename>
|
||||
... magic files
|
||||
exposing each device's configuration descriptors, and
|
||||
supporting a series of ioctls for making device requests,
|
||||
including I/O to devices. (Purely for access by programs.)
|
||||
</para></listitem>
|
||||
</itemizedlist>
|
||||
</para>
|
||||
|
||||
<para>Each bus is given a number (BBB) based on when it was
|
||||
enumerated; within each bus, each device is given a similar
|
||||
number (DDD).
|
||||
Those BBB/DDD paths are not "stable" identifiers;
|
||||
expect them to change even if you always leave the devices
|
||||
plugged in to the same hub port.
|
||||
<emphasis>Don't even think of saving these in application
|
||||
configuration files.</emphasis>
|
||||
Stable identifiers are available, for user mode applications
|
||||
that want to use them. HID and networking devices expose
|
||||
these stable IDs, so that for example you can be sure that
|
||||
you told the right UPS to power down its second server.
|
||||
"usbfs" doesn't (yet) expose those IDs.
|
||||
</para>
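<para>As a minimal sketch of the BBB/DDD naming, the helper below
builds a device file path from a bus and device number.
The function name <function>usbfs_device_path</function> is
hypothetical, and the three-digit, zero-padded decimal form is an
assumption based on how usbfs entries such as
<filename>/proc/bus/usb/001/002</filename> conventionally appear.
Use such a path only for the lifetime of the current connection.
</para>

```c
#include <assert.h>
#include <stdio.h>

/*
 * Build "<mount_point>/BBB/DDD" into buf.
 * Returns 0 on success, -1 if the result did not fit.
 * (Hypothetical helper; not part of any kernel or library API.)
 */
int usbfs_device_path(char *buf, size_t len, const char *mount_point,
                      unsigned bus, unsigned dev)
{
    assert(buf != NULL && mount_point != NULL);

    /* Assumed format: three-digit, zero-padded decimal numbers. */
    int n = snprintf(buf, len, "%s/%03u/%03u", mount_point, bus, dev);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

<para>Because the numbers are reassigned on enumeration, compute
the path each time from freshly discovered bus and device numbers
rather than caching it.
</para>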
|
||||
|
||||
</sect1>
|
||||
|
||||
<sect1>
|
||||
<title>Mounting and Access Control</title>
|
||||
|
||||
<para>There are a number of mount options for usbfs, which will
|
||||
be of most interest to you if you need to override the default
|
||||
access control policy.
|
||||
That policy is that only root may read or write device files
|
||||
(<filename>/proc/bus/usb/BBB/DDD</filename>), although anyone may read
|
||||
the <filename>devices</filename>
|
||||
or <filename>drivers</filename> files.
|
||||
I/O requests to the device also need the CAP_SYS_RAWIO capability.
|
||||
</para>
|
||||
|
||||
<para>The significance of that is that by default, all user mode
|
||||
device drivers need super-user privileges.
|
||||
You can change modes or ownership in a driver setup
|
||||
when the device hotplugs, or maybe just start the
|
||||
driver right then, as a privileged server (or some activity
|
||||
within one).
|
||||
That's the most secure approach for multi-user systems,
|
||||
but for single user systems ("trusted" by that user)
|
||||
it's more convenient just to grant everyone all access
|
||||
(using the <emphasis>devmode=0666</emphasis> option)
|
||||
so the driver can start whenever it's needed.
|
||||
</para>
|
||||
|
||||
<para>The mount options for usbfs, usable in /etc/fstab or
|
||||
in command line invocations of <emphasis>mount</emphasis>, are:
|
||||
|
||||
<variablelist>
|
||||
<varlistentry>
|
||||
<term><emphasis>busgid</emphasis>=NNNNN</term>
|
||||
<listitem><para>Controls the GID used for the
|
||||
/proc/bus/usb/BBB
|
||||
directories. (Default: 0)</para></listitem></varlistentry>
|
||||
<varlistentry><term><emphasis>busmode</emphasis>=MMM</term>
|
||||
<listitem><para>Controls the file mode used for the
|
||||
/proc/bus/usb/BBB
|
||||
directories. (Default: 0555)
|
||||
</para></listitem></varlistentry>
|
||||
<varlistentry><term><emphasis>busuid</emphasis>=NNNNN</term>
|
||||
<listitem><para>Controls the UID used for the
|
||||
/proc/bus/usb/BBB
|
||||
directories. (Default: 0)</para></listitem></varlistentry>
|
||||
|
||||
<varlistentry><term><emphasis>devgid</emphasis>=NNNNN</term>
|
||||
<listitem><para>Controls the GID used for the
|
||||
/proc/bus/usb/BBB/DDD
|
||||
files. (Default: 0)</para></listitem></varlistentry>
|
||||
<varlistentry><term><emphasis>devmode</emphasis>=MMM</term>
|
||||
<listitem><para>Controls the file mode used for the
|
||||
/proc/bus/usb/BBB/DDD
|
||||
files. (Default: 0644)</para></listitem></varlistentry>
|
||||
<varlistentry><term><emphasis>devuid</emphasis>=NNNNN</term>
|
||||
<listitem><para>Controls the UID used for the
|
||||
/proc/bus/usb/BBB/DDD
|
||||
files. (Default: 0)</para></listitem></varlistentry>
|
||||
|
||||
<varlistentry><term><emphasis>listgid</emphasis>=NNNNN</term>
|
||||
<listitem><para>Controls the GID used for the
|
||||
/proc/bus/usb/devices and drivers files.
|
||||
(Default: 0)</para></listitem></varlistentry>
|
||||
<varlistentry><term><emphasis>listmode</emphasis>=MMM</term>
|
||||
<listitem><para>Controls the file mode used for the
|
||||
/proc/bus/usb/devices and drivers files.
|
||||
(Default: 0444)</para></listitem></varlistentry>
|
||||
<varlistentry><term><emphasis>listuid</emphasis>=NNNNN</term>
|
||||
<listitem><para>Controls the UID used for the
|
||||
/proc/bus/usb/devices and drivers files.
|
||||
(Default: 0)</para></listitem></varlistentry>
|
||||
</variablelist>
|
||||
|
||||
</para>
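<para>The options above are passed as a comma-separated string.
The sketch below composes one such string for the device files;
<function>usbfs_dev_options</function> is a hypothetical helper,
and the choice of decimal for the GID and octal for the mode is an
assumption that matches how the defaults are written above.
</para>

```c
#include <assert.h>
#include <stdio.h>

/*
 * Compose "devgid=<gid>,devmode=<mode>" into buf.
 * Returns 0 on success, -1 if the result did not fit.
 * (Hypothetical helper; only the option names come from the text.)
 */
int usbfs_dev_options(char *buf, size_t len, unsigned gid, unsigned mode)
{
    assert(buf != NULL);
    assert(mode <= 07777);          /* permission bits only */

    /* Assumption: GID in decimal, file mode in octal. */
    int n = snprintf(buf, len, "devgid=%u,devmode=%04o", gid, mode);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

<para>The resulting string, for example
<emphasis>devgid=500,devmode=0664</emphasis>, could be placed in the
options field of an <filename>/etc/fstab</filename> entry or handed
to <emphasis>mount</emphasis> with its <emphasis>-o</emphasis> flag.
</para>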
|
||||
|
||||
<para>Note that many Linux distributions hard-wire the mount options
|
||||
for usbfs in their init scripts, such as
|
||||
<filename>/etc/rc.d/rc.sysinit</filename>,
|
||||
rather than making it easy to set this per-system
|
||||
policy in <filename>/etc/fstab</filename>.
|
||||
</para>
|
||||
|
||||
</sect1>
|
||||
|
||||
<sect1>
|
||||
<title>/proc/bus/usb/devices</title>
|
||||
|
||||
<para>This file is handy for status viewing tools in user
|
||||
mode, which can scan the text format and ignore most of it.
|
||||
More detailed device status (including class and vendor
|
||||
status) is available from device-specific files.
|
||||
For information about the current format of this file,
|
||||
see the
|
||||
<filename>Documentation/usb/proc_usb_info.txt</filename>
|
||||
file in your Linux kernel sources.
|
||||
</para>
|
||||
|
||||
<para>Otherwise the main use for this file from programs
|
||||
is to poll() it to get notifications of USB devices
|
||||
as they're plugged or unplugged.
|
||||
To see what changed, you'd need to read the file and
|
||||
compare "before" and "after" contents, scan the filesystem,
|
||||
or see its hotplug event.
|
||||
</para>
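<para>A minimal sketch of that poll() usage follows.
The function name <function>wait_for_usb_change</function> is
hypothetical, and it assumes the caller has already opened the
<filename>devices</filename> file and keeps that descriptor open
between calls, so that changes occurring between waits are not
missed.
</para>

```c
#include <assert.h>
#include <poll.h>

/*
 * Wait until fd becomes readable or timeout_ms elapses.
 * Returns 1 if fd reported POLLIN, 0 on timeout, -1 on error.
 * (Hypothetical helper; fd is assumed to refer to the open
 * usbfs "devices" file.)
 */
int wait_for_usb_change(int fd, int timeout_ms)
{
    assert(fd >= 0);

    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc = poll(&pfd, 1, timeout_ms);

    if (rc < 0)
        return -1;              /* poll() itself failed */
    if (rc == 0)
        return 0;               /* nothing happened in time */

    /* Anything other than POLLIN (e.g. POLLNVAL) is treated as an error. */
    return (pfd.revents & POLLIN) ? 1 : -1;
}
```

<para>A return value of 1 only says that something changed; the
caller still has to re-read the file, or scan the filesystem, to
find out what.
</para>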
|
||||
|
||||
</sect1>
|
||||
|
||||
<sect1>
|
||||
<title>/proc/bus/usb/BBB/DDD</title>
|
||||
|
||||
<para>Use these files in one of these basic ways:
|
||||
</para>
|
||||
|
||||
<para><emphasis>They can be read,</emphasis>
|
||||
producing first the device descriptor
|
||||
(18 bytes) and then the descriptors for the current configuration.
|
||||
See the USB 2.0 spec for details about those binary data formats.
|
||||
You'll need to convert most multibyte values from little endian
|
||||
format to your native host byte order, although a few of the
|
||||
fields in the device descriptor (both of the BCD-encoded fields,
|
||||
and the vendor and product IDs) will be byteswapped for you.
|
||||
Note that configuration descriptors include descriptors for
|
||||
interfaces, altsettings, endpoints, and maybe additional
|
||||
class descriptors.
|
||||
</para>
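<para>The descriptors that follow the device descriptor can be
walked using the length and type bytes that begin every standard
descriptor. The sketch below assumes the layout from Chapter 9 of
the USB 2.0 spec (bLength at offset 0, bDescriptorType at offset 1,
and a little-endian wTotalLength at offset 2 of a configuration
descriptor); the function names are hypothetical.
</para>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard descriptor type codes (USB 2.0 spec, Chapter 9). */
enum { DT_DEVICE = 1, DT_CONFIG = 2, DT_INTERFACE = 4, DT_ENDPOINT = 5 };

/* Decode a little-endian 16-bit value, independent of host byte order. */
static uint16_t le16_at(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Count the descriptors of the given type in buf[0..len).
 * Returns -1 if a descriptor is shorter than 2 bytes or the
 * descriptors do not end exactly at len.
 */
int count_descriptors(const uint8_t *buf, size_t len, uint8_t type)
{
    assert(buf != NULL);

    int count = 0;
    size_t off = 0;

    while (off + 2 <= len) {
        uint8_t blen = buf[off];            /* bLength */

        if (blen < 2 || off + blen > len)
            return -1;
        if (buf[off + 1] == type)           /* bDescriptorType */
            count++;
        off += blen;
    }
    return off == len ? count : -1;
}

/* Return wTotalLength of a configuration descriptor, or -1 if cfg is not one. */
int config_total_length(const uint8_t *cfg, size_t len)
{
    assert(cfg != NULL);

    if (len < 4 || cfg[1] != DT_CONFIG)
        return -1;
    return le16_at(cfg + 2);
}
```

<para>When applied to data read from a device file, skip the first
18 bytes (the device descriptor) before calling
<function>config_total_length</function>.
</para>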
|
||||
|
||||
<para><emphasis>Perform USB operations</emphasis> using
|
||||
<emphasis>ioctl()</emphasis> requests to make endpoint I/O
|
||||
requests (synchronously or asynchronously) or manage
|
||||
the device.
|
||||
These requests need the CAP_SYS_RAWIO capability,
|
||||
as well as filesystem access permissions.
|
||||
Only one ioctl request can be made on one of these
|
||||
device files at a time.
|
||||
This means that if you are synchronously reading an endpoint
|
||||
from one thread, you won't be able to write to a different
|
||||
endpoint from another thread until the read completes.
|
||||
This works for <emphasis>half duplex</emphasis> protocols,
|
||||
but otherwise you'd use asynchronous I/O requests.
|
||||
</para>
|
||||
|
||||
</sect1>
|
||||
|
||||
|
||||
<sect1>
|
||||
<title>Life Cycle of User Mode Drivers</title>
|
||||
|
||||
<para>Such a driver first needs to find a device file
|
||||
for a device it knows how to handle.
|
||||
Maybe it was told about it because a
|
||||
<filename>/sbin/hotplug</filename> event handling agent
|
||||
chose that driver to handle the new device.
|
||||
Or maybe it's an application that scans all the
|
||||
/proc/bus/usb device files, and ignores most devices.
|
||||
In either case, it should <function>read()</function> all
|
||||
the descriptors from the device file,
|
||||
and check them against what it knows how to handle.
|
||||
It might just reject everything except a particular
|
||||
vendor and product ID, or need a more complex policy.
|
||||
</para>
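<para>The simplest policy mentioned above, accepting only one
vendor and product ID, might be sketched as follows.
The function names are hypothetical. The field offsets (idVendor at
byte 8, idProduct at byte 10) are taken from the standard 18-byte
device descriptor, and the code assumes, as described earlier in
this chapter, that usbfs has already converted those two fields to
host byte order.
</para>

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define USB_DEVICE_DESC_SIZE 18
#define USB_DT_DEVICE_TYPE   1

/*
 * Check an 18-byte device descriptor against a vendor/product pair.
 * Returns 1 on match, 0 on mismatch, -1 if it is not a device descriptor.
 * Assumes idVendor/idProduct are already in host byte order.
 */
int descriptor_matches(const uint8_t *desc, uint16_t vendor, uint16_t product)
{
    uint16_t id_vendor, id_product;

    assert(desc != NULL);
    if (desc[0] != USB_DEVICE_DESC_SIZE || desc[1] != USB_DT_DEVICE_TYPE)
        return -1;

    memcpy(&id_vendor, desc + 8, sizeof id_vendor);
    memcpy(&id_product, desc + 10, sizeof id_product);
    return id_vendor == vendor && id_product == product;
}

/*
 * Read the device descriptor from a device file and check it.
 * Returns 1 on match, 0 on mismatch, -1 on any error.
 */
int device_file_matches(const char *path, uint16_t vendor, uint16_t product)
{
    uint8_t desc[USB_DEVICE_DESC_SIZE];

    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    ssize_t n = read(fd, desc, sizeof desc);
    close(fd);
    if (n != (ssize_t)sizeof desc)
        return -1;

    return descriptor_matches(desc, vendor, product);
}
```

<para>A scanning application would call
<function>device_file_matches</function> on each device file and
collect every match, not just the first one.
</para>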
|
||||
|
||||
<para>Never assume there will only be one such device
|
||||
on the system at a time!
|
||||
If your code can't handle more than one device at
|
||||
a time, at least detect when there's more than one, and
|
||||
have your users choose which device to use.
|
||||
</para>
|
||||
|
||||
<para>Once your user mode driver knows what device to use,
|
||||
it interacts with it in either of two styles.
|
||||
The simple style is to make only control requests; some
|
||||
devices don't need more complex interactions than those.
|
||||
(An example might be software using vendor-specific control
|
||||
requests for some initialization or configuration tasks,
|
||||
with a kernel driver for the rest.)
|
||||
</para>
|
||||
|
||||
<para>More likely, you need a more complex style driver:
|
||||
one using non-control endpoints, reading or writing data
|
||||
and claiming exclusive use of an interface.
|
||||
<emphasis>Bulk</emphasis> transfers are easiest to use,
|
||||
but only their sibling <emphasis>interrupt</emphasis> transfers
|
||||
work with low speed devices.
|
||||
Both interrupt and <emphasis>isochronous</emphasis> transfers
|
||||
offer service guarantees because their bandwidth is reserved.
|
||||
Such "periodic" transfers are awkward to use through usbfs,
|
||||
unless you're using the asynchronous calls. However, interrupt
|
||||
transfers can also be used in a synchronous "one shot" style.
|
||||
</para>
|
||||
|
||||
<para>Your user-mode driver should never need to worry
|
||||
about cleaning up request state when the device is
|
||||
disconnected, although it should close its open file
|
||||
descriptors as soon as it starts seeing the ENODEV
|
||||
errors.
|
||||
</para>
|
||||
|
||||
</sect1>
|
||||
|
||||
<sect1><title>The ioctl() Requests</title>
|
||||
|
||||
<para>To use these ioctls, you need to include the following
|
||||
headers in your userspace program:
|
||||
<programlisting>#include <linux/usb.h>
|
||||
#include <linux/usbdevice_fs.h>
|
||||
#include <asm/byteorder.h></programlisting>
|
||||
The standard USB device model requests, from "Chapter 9 |