This is a discussion on "identify duplicate addresses?" within the MySQL forums, part of the Database Server Software category.
Hello all -

I'm working on a survey system where a user can register to win a prize. Since we don't want people gaming the system, we want to remove any duplicate entries made by the same person to increase their odds.

My thought is that the best way to do this is to look at the street mailing address, since we are mailing a certificate to the winner. (Of course this rules out two people living at the same address, but we can only allow one entry per address in the official rules.) I think that by focusing on the number and street name, it might be pretty good at identifying duplicate addresses.

For instance, these two are the same address:

100 West Oak Street
100 W. Oak St

If I take out the 'West' and 'Street', then I have a numeric part, '100', and also 'Oak'. (This does pose a problem for two people at two different dwellings, one at 100 Oak St and one at 100-1/2 Oak St, but we might want to consider that a duplicate address.)

I could compose a list of all street address extras, such as 'East', 'West', 'E', 'W', '.', 'Street', 'St', etc. Once those are removed, I should have a numeric address and the street name.

Then I have to make equivalences amongst the different ways of writing ordinal numbers -- e.g. 'first' == 'First' == '1st' == '1 st'.

Does this make sense? Is there anything I've overlooked?
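The normalization described in this post can be sketched roughly as follows. This is only an illustration under stated assumptions: the word lists `DIRECTIONS`, `STREET_TYPES` and `ORDINALS` and the function name `naive_key` are hypothetical and far from complete, and the "1 st" joining rule is naive.

```python
import re

# Hypothetical, deliberately tiny word lists; real data needs far more entries.
DIRECTIONS = {"n", "s", "e", "w", "ne", "nw", "se", "sw",
              "north", "south", "east", "west"}
STREET_TYPES = {"st", "street", "ave", "avenue", "rd", "road", "dr", "drive"}
ORDINALS = {"first": "1st", "second": "2nd", "third": "3rd", "fourth": "4th"}


def naive_key(address):
    """Reduce a street line to house number plus bare street name."""
    text = address.lower().replace(".", " ")
    # Join split ordinals such as "1 st" into "1st" before tokenizing.
    text = re.sub(r"\b(\d+)\s+(st|nd|rd|th)\b", r"\1\2", text)
    kept = []
    for token in text.split():
        if token in DIRECTIONS or token in STREET_TYPES:
            continue  # drop the "extras" (directions and street types)
        kept.append(ORDINALS.get(token, token))
    return " ".join(kept)


print(naive_key("100 West Oak Street"))
print(naive_key("100 W. Oak St"))
print(naive_key("123 E St. NW"))
```

Both spellings of the Oak Street address reduce to the same key, `100 oak`. The third call shows a limit of blind stripping: for `123 E St. NW` every word is treated as an extra and only `123` survives, so the street name is lost.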
| |||
Rik Wasmus wrote:
> On Mon, 19 Nov 2007 16:47:36 +0100, <lawpoop@gmail.com> wrote:
>
>> Hello all -
>>
>> I'm working on a survey system where a user can register to win a
>> prize. Since we don't want people gaming the system, we want to remove
>> any duplicate entries made by the same person to increase their odds.
>>
>> My thought is that the best way to do this is to look at the street
>> mailing address, since we are mailing a certificate to the winner.
>> ( Of course this rules out two people living at the same address, but
>> we can only allow one entry per address in the official rules). I
>> think by focusing on the number and street name, it might be pretty
>> good at identifying duplicate addresses.
>>
>> For instance, these two are the same address
>> 100 West Oak Street
>> 100 W. Oak St
>
>> Does this make sense? Is there anything I've overlooked?
>
> Euhm, in Holland a _postal code_ + house number is unique. Is this not
> true for the US/UK/wherever you're from?

Rik,

Nope. Here in the U.S. a 9 digit postal code will identify a few square block area, which may have the same house number on different streets.

What about apartments?

> If you can spare the money, validating the address against a database
> for illegal '-1','appt. c', etc. additions is often also possible. Such
> a service/database can usually be provided by the/a postal company or
> other third parties. The whole database usually doesn't come cheap
> though, for a low traffic site you might be better off with a 'check per
> address' deal.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================
| |||
lawpoop@gmail.com wrote:
> On Nov 19, 12:24 pm, "Rik Wasmus" <luiheidsgoe...@hotmail.com> wrote:
>
>> Euhm, in Holland a _postal code_ + house number is unique. Is this not
>> true for the US/UK/wherever you're from?
>
> I'm from the US. With the old postal code, that's not true. However,
> the USPS introduced extra zip code digits that do make a unique
> address, but not everyone is aware of them. I think that the extended
> zip code by itself identifies a unique address. But I can't rely on my
> users knowing it.

No, it does not. It identifies an area of (at most) a few blocks.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================
| |||
On Mon, 19 Nov 2007 10:37:46 -0800 (PST), lawpoop@gmail.com wrote:
> On Nov 19, 12:15 pm, Jerry Stuckle <jstuck...@attglobal.net> wrote:
>
>>
>> And what do you do with something like 123 E St. NW?
>
> I'm not from a large city, but where I grew up, we didn't have street
> names that were cardinal directions. We had street names that were
> ordinals, such as "East 4th St.", but no "North Street"s. So, is this
> a realistic example? Are there actually East streets in American
> cities?
>
> In the example you gave above, the street name is "East", right? So I
> would have to first strip out the street type ( street, avenue, etc)
> 123 E St. NW -> 123 E NW
>
> Then, if the street has two cardinal directions in the name, I'll take
> the first as the street name, and then treat the second as extra
> information. E.g. Decide that the street name is "East", and not
> "NW".
> 123 E NW -> 123 East
>
> This reasoning assumes that street names never are said to be like
> "123 NW East St." -- which may be a faulty assumption. Do you know of
> such a street?

Certainly. There's one about a mile from my house. 43°00'39"N 88°13'36"W is where the intersection of South Street and North East Avenue is. South East Avenue begins about a half-mile further south. There's also a West Avenue about a half-mile from East Avenue, but it doesn't run as far north as South Street. There's also a North Street that comes in East and West varieties at 43°00'47"N 88°14'09"W, but the street itself runs NE and SW, and it's only contrasting convention that prevents it from being labelled North North Street and South North Street.

Next, take a look at the fine example at 41°53'13"N 87°37'41"W. That's the division between East Wacker Drive and West Wacker Drive. Now, over at 41°52'55"N 87°38'13"W, there's the division between North Wacker Drive and South Wacker Drive. So, 120 Wacker Drive without the direction could be any of four distinct addresses.

To further complicate, Wacker also has a subterranean aspect: there's also Lower Wacker Drive that runs UNDERNEATH Wacker Drive from part of South Wacker Drive all the way through North and West Wacker Drives and some distance into East Wacker Drive, meaning that there are actually *eight* places that could be "120 Wacker Drive". Quantum probability indicates there's also a Charm Wacker Drive and Strange Wacker Drive that we haven't the math to adequately draw maps of....

--
Time is a great teacher, but unfortunately it kills all its pupils.
-- Hector Berlioz
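To make the Wacker Drive point concrete, here is a small sketch that groups addresses which become indistinguishable once directional words are stripped. The function names `strip_directions` and `collisions` and the word list are hypothetical, and the four input strings are simply the "120 Wacker Drive" variants from this post, not verified mailing addresses.

```python
from collections import defaultdict

# Hypothetical, minimal list of directional words.
DIRECTION_WORDS = {"north", "south", "east", "west", "n", "s", "e", "w"}


def strip_directions(address):
    """Lowercase the address and drop any directional words."""
    words = address.lower().replace(".", " ").split()
    return " ".join(w for w in words if w not in DIRECTION_WORDS)


def collisions(addresses):
    """Group addresses that become identical once directions are removed."""
    groups = defaultdict(list)
    for address in addresses:
        groups[strip_directions(address)].append(address)
    # Keep only keys shared by more than one original address.
    return {key: group for key, group in groups.items() if len(group) > 1}


wacker = [
    "120 North Wacker Drive",
    "120 South Wacker Drive",
    "120 East Wacker Drive",
    "120 West Wacker Drive",
]
print(collisions(wacker))
```

All four inputs collapse to the single key `120 wacker drive`, so a deduplication pass that discards directions would treat four different locations as one entry.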
| |||
On Mon, 19 Nov 2007 20:51:31 +0100, Jerry Stuckle <jstucklex@attglobal.net> wrote:
> Rik Wasmus wrote:
>> On Mon, 19 Nov 2007 16:47:36 +0100, <lawpoop@gmail.com> wrote:
>>
>>> Hello all -
>>>
>>> I'm working on a survey system where a user can register to win a
>>> prize. Since we don't want people gaming the system, we want to remove
>>> any duplicate entries made by the same person to increase their odds.
>>>
>>> My thought is that the best way to do this is to look at the street
>>> mailing address, since we are mailing a certificate to the winner.
>>> ( Of course this rules out two people living at the same address, but
>>> we can only allow one entry per address in the official rules). I
>>> think by focusing on the number and street name, it might be pretty
>>> good at identifying duplicate addresses.
>>>
>>> For instance, these two are the same address
>>> 100 West Oak Street
>>> 100 W. Oak St
>>
>>> Does this make sense? Is there anything I've overlooked?
>> Euhm, in Holland a _postal code_ + house number is unique. Is this not
>> true for the US/UK/wherever you're from?
>
> Nope. Here in the U.S. a 9 digit postal code will identify a few square
> block area, which may have the same house number on different streets.

Which is a shame. Then again, it's a lot bigger, so it'd have to be quite long to accomplish that. Here 4 numbers followed by 2 letters (0000AA) is all it takes. I'd assume a street is unique in a postal code though :P

> What about apartments?

For convenience, I meant that as 'part of a house number'. Official addresses like apartment numbers are actually stored with the postal company (and local government) also, so validating those is still possible (naturally, more expensive). Making something up or splitting a house is possible, but if you don't explicitly ask for another address, it's up to the good nature of the postal worker (it's what they're known for...) whether this 'illegal' address is honored. Not a problem if it's 'just a few', but people tend to get a little pissed if their route is actually a couple of dozen addresses longer than their employer has in the books.

It's total bureaucracy here, but it does come with its advantages...

--
Rik Wasmus
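For the Dutch scheme Rik describes (four digits, two letters, plus a house number that includes any apartment suffix), a duplicate check could compare a key like the one sketched below. This is only an illustration: the function name `nl_address_key`, the pattern, and the choice to strip spaces, dots and hyphens from the house number are assumptions, and it does not validate the address against any postal database.

```python
import re

# Pattern taken from the "0000AA" shape mentioned in the post; an assumption,
# not an official validation rule.
POSTCODE_RE = re.compile(r"(\d{4})\s?([A-Za-z]{2})")


def nl_address_key(postcode, house_number):
    """Return a (postcode, house number) tuple usable as a duplicate key."""
    match = POSTCODE_RE.fullmatch(postcode.strip())
    if match is None:
        raise ValueError(f"not a 4-digit, 2-letter postal code: {postcode!r}")
    # Treat "12 a", "12-A" and "12a" as the same house number.
    number = re.sub(r"[\s.\-]+", "", house_number).upper()
    return (match.group(1) + match.group(2).upper(), number)


print(nl_address_key("1234 ab", "12 a"))
print(nl_address_key("1234AB", "12-A"))
```

Both calls produce the same tuple, so two entries that differ only in spacing or capitalization would be flagged as one address. As the replies point out, the same shortcut does not carry over to U.S. ZIP codes.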
| |||
On Nov 19, 1:49 pm, Jerry Stuckle <jstuck...@attglobal.net> wrote:
> No, in this case it is E street. There is also F, G, H, and so on. And
> depending on which quadrant they're in, it can be NE, NW, SE or SW.

Well, the pattern might be that alphabet streets only have spelled-out preceding directions, or that the direction follows the street type.

E.g. where the street name is "E":

East E st.
or
E St. NW

but not

NW E St.
nor
E E St.

>
> And they're all in Washington, DC.
>
> And yes, there are streets named East, West, North and South. Raleigh,
> NC has them.
>
> > In the example you gave above, the street name is "East", right? So I
> > would have to first strip out the street type ( street, avenue, etc)
> > 123 E St. NW -> 123 E NW
>
> > Then, if the street has two cardinal directions in the name, I'll take
> > the first as the street name, and then treat the second as extra
> > information. E.g. Decide that the street name is "East", and not
> > "NW".
> > 123 E NW -> 123 East
>
> > This reasoning assumes that street names never are said to be like
> > "123 NW East St." -- which may be a faulty assumption. Do you know of
> > such a street?
>
> But that could be true, also. Not off hand, but I know of a lot of
> cities which have NW before the street name.

I think there might be a pattern that I can tease out. I'll bet that if the cardinal direction precedes the street name, then the street name is not a cardinal direction. Actually, more generally, I'll bet that if the street name is a cardinal direction, it will be spelled out; otherwise it's a letter name.

In other words:

123 E NW -- street name is "E"
123 North East St. -- street name is "East"
123 N E street -- street name is "E"

I'd guess that anyone who lives on a street that has a cardinal direction for the name never abbreviates the name of the street. For example, if they live on East street, and there is an east East street and a west East street, they might write

E East St
or
W East St

but never

E E St
nor
W E St

Another way to go about it is to assume that there is only one cardinal direction in the street name. For instance, E St. NW is E Street NorthWest, while N E St. is North E St. Actually, it looks like I could simply use the "Cardinal direction street names are never abbreviated" rule to get the same output.

Are there streets that have two cardinal directions in the address? Are any of them named after letters or cardinal directions? Would I ever get East West St. NW or East E St. SE?
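One way to read the examples in this post is positional: directions before the street name are a prefix, the last word before the street type is the name, and directions after the street type are a suffix. The sketch below follows that reading. It is an assumption about how the rule could be coded rather than an established parsing method, and `parse_street`, `DIRECTIONS` and `STREET_TYPES` are hypothetical names with intentionally short word lists.

```python
# Hypothetical lookup tables; both are intentionally incomplete.
DIRECTIONS = {
    "n": "N", "north": "N", "s": "S", "south": "S",
    "e": "E", "east": "E", "w": "W", "west": "W",
    "ne": "NE", "northeast": "NE", "nw": "NW", "northwest": "NW",
    "se": "SE", "southeast": "SE", "sw": "SW", "southwest": "SW",
}
STREET_TYPES = {"st", "street", "ave", "avenue", "dr", "drive"}


def parse_street(line):
    """Split '123 N E St. NW' style lines into number, prefix, name, suffix."""
    tokens = line.lower().replace(".", " ").split()
    number, rest = tokens[0], tokens[1:]
    # The first street-type word that has at least one word in front of it.
    type_index = next(
        (i for i, t in enumerate(rest) if i > 0 and t in STREET_TYPES), None
    )
    if type_index is None:
        raise ValueError(f"no street type found in {line!r}")
    before, after = rest[:type_index], rest[type_index + 1:]
    # Leading directions are a prefix, but always leave one word for the name.
    i = 0
    while i < len(before) - 1 and before[i] in DIRECTIONS:
        i += 1
    return {
        "number": number,
        "prefix": [DIRECTIONS[t] for t in before[:i]],
        "name": " ".join(before[i:]),
        "suffix": [DIRECTIONS[t] for t in after if t in DIRECTIONS],
    }


for example in ("123 E St. NW", "123 North East St.", "123 N E street"):
    print(example, "->", parse_street(example))
```

Under this reading, `123 E St. NW` and `123 N E street` both get the name `e` but differ in prefix and suffix, while `123 North East St.` gets the name `east`. The sketch relies entirely on the street type being present and says nothing about whether real addresses follow the "never abbreviated" convention.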
| |||
On Mon, 19 Nov 2007 13:20:19 -0800 (PST), lawpoop@gmail.com wrote:
> On Nov 19, 1:49 pm, Jerry Stuckle <jstuck...@attglobal.net> wrote:
>
>> No, in this case it is E street. There is also F, G, H, and so on. And
>> depending on which quadrant they're in, it can be NE, NW, SE or SW.
>
> Well, the pattern might be that alphabet streets only have spelled out
> preceding directions, or that the direction follows the street type
>
> E.g. where the street name is "E"
>
> East E st.
> or
> E St. NW
>
> but not
>
> NW E St.
> nor
> E E St.
>
>>
>> And they're all in Washington, DC.

1905 E ST SE
WASHINGTON DC 20003-2593

is a standardized address per USPS standards.

--
Time is a great teacher, but unfortunately it kills all its pupils.
-- Hector Berlioz
| |||
Rik Wasmus wrote:
> On Mon, 19 Nov 2007 20:51:31 +0100, Jerry Stuckle
> <jstucklex@attglobal.net> wrote:
>> Rik Wasmus wrote:
>>> On Mon, 19 Nov 2007 16:47:36 +0100, <lawpoop@gmail.com> wrote:
>>>
>>>> Hello all -
>>>>
>>>> I'm working on a survey system where a user can register to win a
>>>> prize. Since we don't want people gaming the system, we want to remove
>>>> any duplicate entries made by the same person to increase their odds.
>>>>
>>>> My thought is that the best way to do this is to look at the street
>>>> mailing address, since we are mailing a certificate to the winner.
>>>> ( Of course this rules out two people living at the same address, but
>>>> we can only allow one entry per address in the official rules). I
>>>> think by focusing on the number and street name, it might be pretty
>>>> good at identifying duplicate addresses.
>>>>
>>>> For instance, these two are the same address
>>>> 100 West Oak Street
>>>> 100 W. Oak St
>>>
>>>> Does this make sense? Is there anything I've overlooked?
>>> Euhm, in Holland a _postal code_ + house number is unique. Is this
>>> not true for the US/UK/wherever you're from?
>>
>> Nope. Here in the U.S. a 9 digit postal code will identify a few
>> square block area, which may have the same house number on different
>> streets.
>
> Which is a shame. Then again, it's a lot bigger, so it'd have to be
> quite long to accomplish that. Here 4 numbers followed by 2 letters
> (0000AA) is all it takes. I'd assume a street is unique in a postal code
> though :P
>

That would be a nice idea, but we have a lot of streets only 1/2 block long with a circle and about a half-dozen houses.

It might be possible, but that would take intelligence. And this is a quasi-governmental entity, after all. :-)

>> What about apartments?
>
> For convenience, I meant that as 'part of a house number'. Official
> addresses like apartment numbers are actually stored with the postal
> company (and local government) also, so validating those is still possible
> (naturally, more expensive). Making something up/splitting a house is
> possible, but if you don't explicitly ask for another address it's up to
> the good nature of the postal worker (it's what they're known for...)
> whether this 'illegal' address is honored. Not a problem if it's 'just
> a few', but people tend to get a little pissed if their route is
> actually a couple of dozen more than their employer has in the books.
>
> It's total bureaucracy here, but it does come with its advantages...

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================
| |||
On Nov 19, 7:35 pm, "Peter H. Coffin" <hell...@ninehells.com> wrote:
> On Mon, 19 Nov 2007 10:37:46 -0800 (PST), lawp...@gmail.com wrote:
> > On Nov 19, 12:15 pm, Jerry Stuckle <jstuck...@attglobal.net> wrote:
>
> >> And what do you do with something like 123 E St. NW?
>
> > I'm not from a large city, but where I grew up, we didn't have street
> > names that were cardinal directions. We had street names that were
> > ordinals, such as "East 4th St.", but no "North Street"s. So, is this
> > a realistic example? Are there actually East streets in American
> > cities?
>
> > In the example you gave above, the street name is "East", right? So I
> > would have to first strip out the street type ( street, avenue, etc)
> > 123 E St. NW -> 123 E NW
>
> > Then, if the street has two cardinal directions in the name, I'll take
> > the first as the street name, and then treat the second as extra
> > information. E.g. Decide that the street name is "East", and not
> > "NW".
> > 123 E NW -> 123 East
>
> > This reasoning assumes that street names never are said to be like
> > "123 NW East St." -- which may be a faulty assumption. Do you know of
> > such a street?
>
> Certainly. There's one about a mile from my house. 43°00'39"N 88°13'36"W
> is where the intersection of South Street and North East Avenue is.
> South East Avenue begins about a half-mile further south. There's also a
> West Avenue about a half-mile from East Avenue, but doesn't run as far
> north as South Street. There's also a North Street that comes in East
> and West varieties at 43°00'47"N 88°14'09"W, but the street itself runs
> NE and SW, and it's only contrasting convention that prevents it from
> being labelled North North Street and South North Street.
>
> Next, take a look at the fine example at 41°53'13"N 87°37'41"W. That's
> the division between East Wacker Drive and West Wacker Drive. Now, over
> at 41°52'55"N 87°38'13"W, there's the division between North Wacker
> Drive and South Wacker Drive. So, 120 Wacker Drive without the
> direction could be any of four distinct addresses. To further
> complicate, Wacker also has a subterranean aspect: there's also Lower
> Wacker Drive that runs UNDERNEATH Wacker Drive from part of South Wacker
> Drive all the way through North and West Wacker Drives and some distance
> into East Wacker Drive, meaning that there's actually *eight* places
> that could be "120 Wacker Drive". Quantum probability indicates there's
> also a Charm Wacker Drive and Strange Wacker Drive that we haven't the
> math to adequately draw maps of....
>
> --
> Time is a great teacher, but unfortunately it kills all its pupils.
> -- Hector Berlioz

Did no one think that this might get confusing?
| ||||
On Mon, 19 Nov 2007 15:13:20 -0800 (PST), strawberry wrote:
>> Next, take a look at the fine example at 41°53'13"N 87°37'41"W. That's
>> the division between East Wacker Drive and West Wacker Drive. Now, over
>> at 41°52'55"N 87°38'13"W, there's the division between North Wacker
>> Drive and South Wacker Drive. So, 120 Wacker Drive without the
>> direction could be any of four distinct addresses. To further
>> complicate, Wacker also has a subterranean aspect: there's also Lower
>> Wacker Drive that runs UNDERNEATH Wacker Drive from part of South Wacker
>> Drive all the way through North and West Wacker Drives and some distance
>> into East Wacker Drive, meaning that there's actually *eight* places
>> that could be "120 Wacker Drive". Quantum probability indicates there's
>> also a Charm Wacker Drive and Strange Wacker Drive that we haven't the
>> math to adequately draw maps of....
>
> Did no one think that this might get confusing?

It just growed that way.

--
100. Finally, to keep my subjects permanently locked in a mindless trance, I will provide each of them with free unlimited Internet access.
--Peter Anspach's list of things to do as an Evil Overlord