This is a discussion on memdup(3) within the mailing.openbsd.tech forums, part of the OpenBSD category.
memdup(3) has strdup(3) semantics but without strings.

Canonical code for duplicating buffer looks like:

	rpl = malloc(len);
	if (!rpl)
		...
	memcpy(rpl, src, len);

Mistakes happen and two lengths in snippet above will be different.
To prevent this memdup(3) was created:

	rpl = memdup(src, len);
	if (!rpl)
		...
	...

Index: memdup.c
===================================================================
RCS file: memdup.c
diff -N memdup.c
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ memdup.c	14 Jul 2008 12:27:18 -0000
@@ -0,0 +1,31 @@
+/*
+ * $OpenBSD$
+ *
+ * Copyright (c) 2008 Alexey Dobriyan <adobriyan@gmail.com>
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+#include <sys/types.h>
+#include <stdlib.h>
+#include <string.h>
+
+void *
+memdup(const void *src, size_t len)
+{
+	void *dst;
+
+	dst = malloc(len);
+	if (dst)
+		memcpy(dst, src, len);
+	return dst;
+}
Index: memdup.3
===================================================================
RCS file: memdup.3
diff -N memdup.3
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ memdup.3	14 Jul 2008 12:27:18 -0000
@@ -0,0 +1,66 @@
+.\" $OpenBSD$
+.\"
+.\"Copyright (c) 2008 Alexey Dobriyan <adobriyan@gmail.com>
+.\"
+.\"Permission to use, copy, modify, and distribute this software for any
+.\"purpose with or without fee is hereby granted, provided that the above
+.\"copyright notice and this permission notice appear in all copies.
+.\"
+.\"THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+.\"WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+.\"MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+.\"ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+.\"WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+.\"ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+.\"OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+.\"
+.Dd $Mdocdate: Jul 14 2008 $
+.Dt MEMDUP 3
+.Os
+.Sh NAME
+.Nm memdup
+.Nd save a copy of a buffer
+.Sh SYNOPSIS
+.Fd #include <string.h>
+.Ft void *
+.Fn memdup "const void *src, size_t len"
+.Sh DESCRIPTION
+The
+.Fn memdup
+function allocates sufficient memory for a copy of the buffer
+.Fa src ,
+does the copy, and returns a pointer to it.
+The pointer may subsequently be used as an argument to the function
+.Xr free 3 .
+.Pp
+If insufficient memory is available,
+.Dv NULL
+is returned.
+.Sh EXAMPLES
+The following will point
+.Va p
+to an allocated area of memory containing the replica of buffer
+.Va src
+which has length
+.Va len :
+.Bd -literal -offset indent
+void *p;
+
+p = memdup(src, len);
+if (!p) {
+	fprintf(stderr, "Out of memory.\en");
+	exit(1);
+}
+.Ed
+.Sh ERRORS
+The
+.Fn memdup
+function may fail and set the external variable
+.Va errno
+for any of the errors specified for the library function
+.Xr malloc 3 .
+.Sh SEE ALSO
+.Xr free 3 ,
+.Xr malloc 3 ,
+.Xr memcpy 3 ,
+.Xr strdup 3
Index: Makefile.inc
===================================================================
RCS file: /cvs/src/lib/libc/string/Makefile.inc,v
retrieving revision 1.20
diff -u -r1.20 Makefile.inc
--- Makefile.inc	25 Oct 2007 22:41:02 -0000	1.20
+++ Makefile.inc	14 Jul 2008 12:31:27 -0000
@@ -3,7 +3,8 @@
 # string sources
 .PATH: ${LIBCSRCDIR}/arch/${MACHINE_ARCH}/string ${LIBCSRCDIR}/string
 
-SRCS+=	bm.c memccpy.c memrchr.c strcasecmp.c strcasestr.c strcoll.c strdup.c \
+SRCS+=	bm.c memccpy.c memdup.c memrchr.c strcasecmp.c strcasestr.c strcoll.c \
+	strdup.c \
 	strerror.c strerror_r.c strlcat.c strmode.c strsignal.c strtok.c \
 	strxfrm.c \
 	wcscat.c wcschr.c wcscmp.c wcscpy.c wcscspn.c wcslcat.c wcslcpy.c \
@@ -139,7 +140,7 @@
 	${LIBCSRCDIR}/string/rindex.c
 
 MAN+=	bm.3 bcmp.3 bcopy.3 bstring.3 bzero.3 ffs.3 memccpy.3 memchr.3 \
-	memcmp.3 memcpy.3 memmove.3 memset.3 strcasecmp.3 strcat.3 \
+	memcmp.3 memcpy.3 memdup.3 memmove.3 memset.3 strcasecmp.3 strcat.3 \
 	strchr.3 strcmp.3 strcoll.3 strcpy.3 strcspn.3 strerror.3 \
 	string.3 strlen.3 strmode.3 strdup.3 strpbrk.3 strrchr.3 strsep.3 \
 	strsignal.3 strspn.3 strstr.3 strtok.3 strxfrm.3 swab.3 strlcpy.3 \
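For anyone who wants to try the proposed function outside libc, here is a minimal standalone sketch. The name `my_memdup` is hypothetical and chosen only to avoid clashing with anything a libc might declare; the body follows the posted memdup.c.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Standalone copy of the proposed function under a hypothetical name.
 * Returns a malloc'd replica of the first len bytes of src, or NULL
 * if the allocation fails. The caller releases the result with free().
 */
void *
my_memdup(const void *src, size_t len)
{
	void *dst;

	dst = malloc(len);		/* len is written once here ... */
	if (dst != NULL)
		memcpy(dst, src, len);	/* ... and reused unchanged here */
	return dst;
}
```

One caveat of this sketch: when len is 0, malloc(3) is allowed to return either NULL or a unique pointer, so a NULL result does not necessarily mean the system is out of memory in that case.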
| |||
On Mon, Jul 14, 2008 at 04:49:34PM +0400, Alexey Dobriyan wrote:
> memdup(3) has strdup(3) semantics but without strings.
>
> Canonical code for duplicating buffer looks like:
>
> 	rpl = malloc(len);
> 	if (!rpl)
> 		...
> 	memcpy(rpl, src, len);
>
> Mistakes happen and two lengths in snippet above will be different.
> To prevent this memdup(3) was created:
>
> 	rpl = memdup(src, len);
> 	if (!rpl)
> 		...
> 	...

This kind of code is very easy to write. The big question is: do we
want it.

That's non-standard stuff, it's not ANSI, it's not POSIX, it's not single
unix.

So what's the point ?

It doesn't really help at writing portable code (more the contrary).

In case you don't know, strdup isn't even in ANSI C, and that's not an
oversight. The design of the standard libc separates strings and memory
functions, and you'll notice that strdup combines the two, thus `tying'
string handling functions to a specific set of memory allocation.

That, and the fact that historically, strdup is not even well specified,
so that if you use strdup in a C++ program, you don't even know how you
should free the memory (is it free ? is it delete ? *both* options have
existed in the past, courtesy of NeXtStep).
| |||
On Monday, July 14, Alexey Dobriyan wrote:
> +
> +void *
> +memdup(const void *src, size_t len)
> +{
> +	void *dst;
> +
> +	dst = malloc(len);
> +	if (dst)
> +		memcpy(dst, src, len);

So, the memcpy() will access things past the end of 'src' if I "expand"?
Stupid interface.

-Toby.
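The concern above is that a single length cannot describe both the source and a larger destination. A minimal sketch of what an "expanding" duplicate would need follows. The function `memdup_grow` and its parameters are hypothetical and not part of the posted patch: it takes the source length and the new length separately, copies only what the source actually holds, and zero-fills the rest.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: allocate newlen bytes,
 * copy at most srclen bytes from src, and zero-fill any remainder.
 * src must point to at least srclen readable bytes.
 */
void *
memdup_grow(const void *src, size_t srclen, size_t newlen)
{
	unsigned char *dst;
	size_t ncopy = srclen < newlen ? srclen : newlen;

	dst = malloc(newlen);
	if (dst == NULL)
		return NULL;
	memcpy(dst, src, ncopy);		/* never reads past src */
	memset(dst + ncopy, 0, newlen - ncopy);	/* defined contents for the tail */
	return dst;
}
```

With the two-argument memdup() from the patch, passing a len larger than the source buffer would read out of bounds, which is the behaviour being objected to.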
| |||
On Mon, Jul 14, 2008 at 03:06:14PM +0200, Marc Espie wrote:
> On Mon, Jul 14, 2008 at 04:49:34PM +0400, Alexey Dobriyan wrote:
> > memdup(3) has strdup(3) semantics but without strings.
> >
> > Canonical code for duplicating buffer looks like:
> >
> > 	rpl = malloc(len);
> > 	if (!rpl)
> > 		...
> > 	memcpy(rpl, src, len);
> >
> > Mistakes happen and two lengths in snippet above will be different.
> > To prevent this memdup(3) was created:
> >
> > 	rpl = memdup(src, len);
> > 	if (!rpl)
> > 		...
> > 	...
>
> This kind of code is very easy to write. The big question is: do we
> want it.
>
> That's non-standard stuff, it's not ANSI, it's not POSIX, it's not single
> unix.
>
> So what's the point ?

The point is writing "len" once, so it won't diverge from itself.

> It doesn't really help at writing portable code (more the contrary).

strlcpy() wasn't in ANSI C, POSIX et al when it was created, yet it was
added to libc and from there spread. Of course, bugs aren't that common
with malloc+memcpy sequence, but they definitely happen.

malloc+memcpy sequence exists extracted in quite a few projects.
Sometimes, under different name, sometimes with different prototype,
but people definitely use this idiom, so let's help them?

Plenty of examples in OpenBSD codebase alone.

> In case you don't know, strdup isn't even in ANSI C, and that's not an
> oversight. The design of the standard libc separates strings and memory
> functions, and you'll notice that strdup combines the two, thus `tying'
> string handling functions to a specific set of memory allocation.
>
> That, and the fact that historically, strdup is not even well specified,
> so that if you use strdup in a C++ program, you don't even know how you
> should free the memory (is it free ? is it delete ? *both* options have
> existed in the past, courtesy of NeXtStep).
| |||
On Mon, Jul 14, 2008 at 06:38:49PM +0400, Alexey Dobriyan wrote:
> On Mon, Jul 14, 2008 at 03:06:14PM +0200, Marc Espie wrote:
> > On Mon, Jul 14, 2008 at 04:49:34PM +0400, Alexey Dobriyan wrote:
> > > memdup(3) has strdup(3) semantics but without strings.
> > >
> > > Canonical code for duplicating buffer looks like:
> > >
> > > 	rpl = malloc(len);
> > > 	if (!rpl)
> > > 		...
> > > 	memcpy(rpl, src, len);
> > >
> > > Mistakes happen and two lengths in snippet above will be different.
> > > To prevent this memdup(3) was created:
> > >
> > > 	rpl = memdup(src, len);
> > > 	if (!rpl)
> > > 		...
> > > 	...
> >
> > This kind of code is very easy to write. The big question is: do we
> > want it.
> >
> > That's non-standard stuff, it's not ANSI, it's not POSIX, it's not single
> > unix.
> >
> > So what's the point ?
>
> The point is writing "len" once, so it won't diverge from itself.

But you trade an explicit free for an implicit one.
I find this a confusing interface that adds nothing.
It is also not part of a standard.

>
> > It doesn't really help at writing portable code (more the contrary).
>
> strlcpy() wasn't in ANSI C, POSIX et al when it was created, yet it was
> added to libc and from there spread. Of course, bugs aren't that common
> with malloc+memcpy sequence, but they definitely happen.
>
> malloc+memcpy sequence exists extracted in quite a few projects.
> Sometimes, under different name, sometimes with different prototype,
> but people definitely use this idiom, so let's help them?
>
> Plenty of examples in OpenBSD codebase alone.

Sure and they aren't broken.

>
> > In case you don't know, strdup isn't even in ANSI C, and that's not an
> > oversight. The design of the standard libc separates strings and memory
> > functions, and you'll notice that strdup combines the two, thus `tying'
> > string handling functions to a specific set of memory allocation.
> >
> > That, and the fact that historically, strdup is not even well specified,
> > so that if you use strdup in a C++ program, you don't even know how you
> > should free the memory (is it free ? is it delete ? *both* options have
> > existed in the past, courtesy of NeXtStep).
| |||
* Alexey Dobriyan wrote:
> memdup(3) has strdup(3) semantics but without strings.
>
> Canonical code for duplicating buffer looks like:
>
> 	rpl = malloc(len);
> 	if (!rpl)
> 		...
> 	memcpy(rpl, src, len);
>
> Mistakes happen and two lengths in snippet above will be different.
> To prevent this memdup(3) was created:
>
> 	rpl = memdup(src, len);
> 	if (!rpl)
> 		...
> 	...

I dislike this a lot. It is not available anywhere else, for a start.
| |||
On Mon, Jul 14, 2008 at 09:51:17AM -0500, Marco Peereboom wrote:
> On Mon, Jul 14, 2008 at 06:38:49PM +0400, Alexey Dobriyan wrote:
> > On Mon, Jul 14, 2008 at 03:06:14PM +0200, Marc Espie wrote:
> > > On Mon, Jul 14, 2008 at 04:49:34PM +0400, Alexey Dobriyan wrote:
> > > > memdup(3) has strdup(3) semantics but without strings.
> > > >
> > > > Canonical code for duplicating buffer looks like:
> > > >
> > > > 	rpl = malloc(len);
> > > > 	if (!rpl)
> > > > 		...
> > > > 	memcpy(rpl, src, len);
> > > >
> > > > Mistakes happen and two lengths in snippet above will be different.
> > > > To prevent this memdup(3) was created:
> > > >
> > > > 	rpl = memdup(src, len);
> > > > 	if (!rpl)
> > > > 		...
> > > > 	...
> > >
> > > This kind of code is very easy to write. The big question is: do we
> > > want it.
> > >
> > > That's non-standard stuff, it's not ANSI, it's not POSIX, it's not single
> > > unix.
> > >
> > > So what's the point ?
> >
> > The point is writing "len" once, so it won't diverge from itself.
>
> But you trade an explicit free for an implicit one.

malloc, you mean.

> I find this a confusing interface that adds nothing.

Do you find strdup() confusing?

> It is also not part of a standard.

Eventually it will be.

> > Plenty of examples in OpenBSD codebase alone.
>
> Sure and they aren't broken.
| |||
On 2008/07/14 18:38, Alexey Dobriyan wrote:
> malloc+memcpy sequence exists extracted in quite a few projects.
> Sometimes, under different name, sometimes with different prototype,

And of course, sometimes with the same name and a different prototype, e.g.

int memdup(u_char **to, const u_char *from, size_t size)
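To illustrate how the out-parameter prototype quoted above differs from the proposed one, here is a rough sketch. Only the prototype comes from the thread; the body, the return convention (0 on success, -1 on failure), and the name `memdup_to` are assumptions made for this example, and `unsigned char` stands in for `u_char` to keep it portable.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical out-parameter variant. On success, stores the new
 * buffer in *to and returns 0. On allocation failure, leaves *to
 * untouched and returns -1. (Return values are assumed, not taken
 * from any existing project.)
 */
int
memdup_to(unsigned char **to, const unsigned char *from, size_t size)
{
	unsigned char *copy;

	copy = malloc(size);
	if (copy == NULL)
		return -1;
	memcpy(copy, from, size);
	*to = copy;
	return 0;
}
```

Code written against this shape checks an int status and reads the pointer from `*to`, so it would not compile against a libc function returning `void *` under the same name, which is the clash the message above points at.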
| |||
On Mon, Jul 14, 2008 at 07:12:03PM +0400, Alexey Dobriyan wrote:
> On Mon, Jul 14, 2008 at 09:51:17AM -0500, Marco Peereboom wrote:
> > On Mon, Jul 14, 2008 at 06:38:49PM +0400, Alexey Dobriyan wrote:
> > > On Mon, Jul 14, 2008 at 03:06:14PM +0200, Marc Espie wrote:
> > > > On Mon, Jul 14, 2008 at 04:49:34PM +0400, Alexey Dobriyan wrote:
> > > > > memdup(3) has strdup(3) semantics but without strings.
> > > > >
> > > > > Canonical code for duplicating buffer looks like:
> > > > >
> > > > > 	rpl = malloc(len);
> > > > > 	if (!rpl)
> > > > > 		...
> > > > > 	memcpy(rpl, src, len);
> > > > >
> > > > > Mistakes happen and two lengths in snippet above will be different.
> > > > > To prevent this memdup(3) was created:
> > > > >
> > > > > 	rpl = memdup(src, len);
> > > > > 	if (!rpl)
> > > > > 		...
> > > > > 	...
> > > >
> > > > This kind of code is very easy to write. The big question is: do we
> > > > want it.
> > > >
> > > > That's non-standard stuff, it's not ANSI, it's not POSIX, it's not single
> > > > unix.
> > > >
> > > > So what's the point ?
> > >
> > > The point is writing "len" once, so it won't diverge from itself.
> >
> > But you trade an explicit free for an implicit one.
>
> malloc, you mean.

No I mean free. malloc always needs to be paired with free; in this
case you need to pair strdup with free. Not very nice when debugging
memory leaks.

example:
grep malloc * | grep mypointer
grep free * | grep mypointer
memdup FAIL

>
> > I find this a confusing interface that adds nothing.
>
> Do you find strdup() confusing?

Extremely. This is one of those APIs I always have to go back to the
man page to use. It is non-obvious.

>
> > It is also not part of a standard.
>
> Eventually it will be.

Maybe but until then we might be able to keep a sane interface instead.
I'll be the first to admit that currently there is way too much crap in
these standards. You can't leave committee people alone; unmitigated
committee work always ends up in disaster.
| ||||
On Mon, 14 Jul 2008, Alexey Dobriyan wrote:
> memdup(3) has strdup(3) semantics but without strings.

Please demonstrate that we need it.

The extensions to POSIX that OpenBSD has made in the past have been
because of either terrible semantics in existing functions (strl*) or
clear patterns of misuse (strtonum). There is no API with bad semantics
in this case, so it is up to you to demonstrate misuse. Personally, I
can't recall ever wanting memdup.

-d