On May 6, 2:36*pm, Pat <pat.ca...@service-now.com> wrote:
> On May 6, 2:26 pm, joel garry <joel-ga...@home.com> wrote:
>
>
>
>
>
> > On May 6, 1:16 pm, Pat <pat.ca...@service-now.com> wrote:
>
> > > I have a table set with about 340k rows in it. User's execute queries
> > > against this table for arbitrary search strings which are resolved
> > > against the name table in the column. The table contains mixed case,
> > > but the users expect to match their search regardless of case. So, for
> > > example, both of these should match a search term of "badg":
>
> > > ARabidBadger
> > > somebadglowercase
>
> > > A typical search, for all those entries whose name contains "badg"
> > > would look like this:
>
> > > SELECT "sys_id" FROM (SELECT /*+ FIRST_ROWS(250) */ a.*, ROWNUM rnum
> > > FROM (SELECT cmdb_ci0."sys_id" FROM cmdb_ci cmdb_ci0 *WHERE
> > > lower(cmdb_ci0."name") LIKE '%badge%' ORDER BY lower(cmdb_ci0."name"))
> > > a where ROWNUM <= 250) WHERE rnum > 0
>
> > > To make this query efficient, I added a covering index
>
> > > create index cmdb_ci_lower_name_sys_id on cmdb_ci (lower("name"),
> > > "sys_id");
>
> > > And our query plan looks like this and completes in about 120 ms. This
> > > was worst case since only 1 row was returned. Short search strings
> > > like "b" will hit their stop key quickly and exit after finding 250
> > > matches without having to exhaust the index.
>
> > > Still, worst case, this query does a full index scan in about 120ms;
>
> > > SQL> SELECT "sys_id" FROM (SELECT /*+ FIRST_ROWS(250) */ a.*, ROWNUM
> > > rnum FROM (SELECT cmdb_ci0."sys_id" FROM cmdb_ci cmdb_ci0 *WHERE
> > > lower(cmdb_ci0."name") LIKE '%badge%' ORDER BY lower(cmdb_ci0."name"))
> > > a where ROWNUM <= 250) WHERE rnum > 0;
>
> > > Elapsed: 00:00:00.12
>
> > > Execution Plan
> > > ----------------------------------------------------------
> > > Plan hash value: 424525705
>
> > > ---------------------------------------------------------------------------**---------------------
> > > | Id *| Operation * * * * *| Name ** * * * * * * * * *| Rows*| Bytes | Cost (%CPU)|
> > > Time * * |
> > > ---------------------------------------------------------------------------**---------------------
> > > | * 0 | SELECT STATEMENT * | * * * * * * * * * * * * * | * 250 | 11750 | * *46 * (0)|
> > > 00:00:01 |
> > > |* *1 | *VIEW * * * * * * *| * * * * * * * * * * * * * | * 250 | 11750 | * *46 * (0)| 00:00:01 |
> > > |* *2 | * COUNT STOPKEY * *| * * * * * * * * * * * * * | * * * | * * * | * * * * * *| * * * * *|
> > > | * 3 | * *VIEW * * * * * *| * * * * * * * * * * * * * | * 251 | *8534 | * *46 * (0)|00:00:01
> > > |
> > > |* *4 | * * INDEX FULL SCAN| CMDB_CI_LOWER_NAME_SYS_ID | 16997 |
> > > 813K| * *46 * * *(0)| 00:00:01 |
> > > ---------------------------------------------------------------------------**---------------------
>
> > > Predicate Information (identified by operation id):
> > > ---------------------------------------------------
>
> > > * *1 - filter("RNUM">0)
> > > * *2 - filter(ROWNUM<=250)
> > > * *4 - filter(LOWER("name") LIKE '%badge%')
>
> > > Statistics
> > > ----------------------------------------------------------
> > > * * * * 141 *recursive calls
> > > * * * * * 0 *db block gets
> > > * * * *2964 *consistent gets
> > > * * * * * 0 *physical reads
> > > * * * * * 0 *redo size
> > > * * * * 439 *bytes sent via SQL*Net to client
> > > * * * * 400 *bytes received via SQL*Net from client
> > > * * * * * 2 *SQL*Net roundtrips to/from client
> > > * * * * * 0 *sorts (memory)
> > > * * * * * 0 *sorts (disk)
> > > * * * * * 1 *rows processed
>
> > > Recently though, we started storing NLS characters in this column
> > > (european customers), so we'll have values like:
>
> > > badger
> > > bädger <-- note the accented a
>
> > > The users want both of the above to collate next to each other.
>
> > > So we, not unnaturally, put the database in linguisitc mode:
>
> > > alter session set nls_sort=german;
> > > alter session set nls_comp=linguistic;
>
> > > Then we issue the same query. The query plan here involves a sort, but
> > > the query returns a grand total of 1 record, so the sort isn't
> > > material to the response time.
>
> > ...
>
> > > 1) Am I doing something wrong here? Why does putting the system into
> > > NLS_SORT and NLS_COMP make this big a difference on performance?
> > > 2) Is there an index I should add here that would allow some form of
> > > efficient querying across this set? I know I'm going to have to do a
> > > full scan here (contains queries being what they are), but why is my
> > > full scan 10X slower in linguistic mode?
> > > 3) Is there some other recommended approach to getting linguistic
> > > collation working efficiently? Shadow columns aren't really practical
> > > here (and even then I'm not sure what I'd put in the shadow unless I
> > > wrote my own german to english unaccenter).
>
> > > Any help would be much appreciated.
>
> > See metalink Note:30779.1
>
> > "Setting NLS_SORT to anything other than BINARY causes a sort to use a
> > full table scan, regardless of the path chosen by the optimizer.
> > BINARY is the exception because indexes are built according to a
> > binary order of keys. Thus the optimizer can use an index to satisfy
> > the ORDER BY clause when NLS_SORT is set to BINARY. If NLS_SORT is set
> > to any linguistic sort, the optimizer must include a full table scan
> > and a full sort in the execution plan. *"
>
> > Things may be different in different versions, I wouldn't know.
>
> > jg
> > --
> > @home.com is bogus.http://forums.oracle.com/forums/thre...ID=2507441?
>
> Even though my query plan says INDEX FAST FULL SCAN he's actually
> doing TABLE ACCESS FULL?
I dunno, I think tracing would be called for to see for sure. The
idea of lying optimizer plans that are obtained for things actually
running is a bit of a thought-provoker for me.
> SQL> select "name" from cmdb_ci where "sys_id" like '%abc%' order by
> "name";
>
> 1530 rows selected.
>
> Elapsed: 00:00:00.20
>
> Execution Plan
> ----------------------------------------------------------
> Plan hash value: 502630801
>
> ---------------------------------------------------------------------------*-----------
> | Id *| Operation * * * * *| Name * *| Rows *| Bytes |TempSpc| Cost (%CPU)|
> Time * * |
> ---------------------------------------------------------------------------*-----------
> | * 0 | SELECT STATEMENT * | * * * * | 16997| * 813K| * * * | *3190 * (3)|
> 00:00:39 |
> | * 1 | *SORT ORDER BY * * | * * * * | 16997 | * 813K| *2008K| *3190 * (3)|
> 00:00:39 |
> |* *2 | * TABLE ACCESS FULL| CMDB_CI | 16997 | * 813K| * * * | *2979
> (3)| 00:00:36 |
> ---------------------------------------------------------------------------*-----------
>
> Predicate Information (identified by operation id):
> ---------------------------------------------------
>
> * *2 - filter("sys_id" LIKE '%abc%')
>
> Statistics
> ----------------------------------------------------------
> * * * * 148 *recursive calls
> * * * * * 0 *db block gets
> * * * 13165 *consistent gets
> * * * * * 0 *physical reads
> * * * * * 0 *redo size
> * * * 40810 *bytes sent via SQL*Net to client
> * * * *1511 *bytes received via SQL*Net from client
> * * * * 103 *SQL*Net roundtrips to/from client
> * * * * * 1 *sorts (memory)
> * * * * * 0 *sorts (disk)
> * * * *1530 *rows processed
>
> So in this case, a TABLE ACCESS FULL, followed by a sort in binary
> mode is 5x faster than an index full scan in linguistic mode?
>
> If the only problem was that linguistic forced him to table scan, I'd
> expect him to return data in 200ms or so (the time a table scan takes)
> as opposed to 1.3 seconds, neh?-
Well, maybe something else is going on, like applying some semantics
to every row... been too long since I worked with this stuff to do
anything but guess.
jg
--
@home.com is bogus.
Now not only do you drop your keys or phone in the toilet, it
automatically flushes if you stand up to get them.