This is a discussion on Improvement of search for a binary operator within the Pgsql Patches forums, part of the PostgreSQL category; --> Here is a gprof result of the pgbench. % cumulative self self total time seconds seconds calls s/call s/call ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Here is a gprof result of the pgbench. % cumulative self self total time seconds seconds calls s/call s/call name 4.77 4.04 4.04 14000 0.00 0.00 OpernameGetCandidates 4.70 8.02 3.98 2007008 0.00 0.00 HeapTupleSatisfiesSnapshot 4.50 11.83 3.81 14001 0.00 0.00 base_yyparse 3.78 15.03 3.20 2838312 0.00 0.00 AllocSetAlloc 3.73 18.18 3.15 40001 0.00 0.00 heapgetpage The OpernameGetCandidates called from oper. The function of oper is search for a binary operator. It does the following processing: (1)Create candidates of operator that matches operator name and operator kind by OpernameGetCandidates. (2)Find an operator that matches exactly ltypeId and rtypeId from the candidates of operator by binary_oper_exact. (3)If not found, find an operator from the candidates of operator by oper_select_candidate. The oper_select_candidate accepts coerce type and resolve the conflict. I think that we can search the system catalog cache instead of retrieval from the candidates of operator in the binary_oper_exact, and reverse the order of executing (1) and (2) for performance improvement. Because for a general operator such as '=', the number of the candidates of operator exceeds 50. And in common cases the typeIds matches exactly. I tested following commands: pgbench -i pgbench -t 4000 (5 times) results (tps): 1 2 3 4 5 average ----------------------------------------------------------- original: 214.33 184.60 192.52 158.62 170.04 184.02 patched : 223.86 198.72 207.48 165.70 179.67 195.09 regards, --- Atsushi Ogawa *** ./src/backend/catalog/namespace.c.orig Tue Apr 25 23:16:57 2006 --- ./src/backend/catalog/namespace.c Wed Apr 26 00:14:33 2006 *************** *** 681,686 **** --- 681,744 ---- return visible; } + /* + * OpernameGet + * Given a possibly-qualified operator name, left and right type IDs, + * find an operator in the current search path. + */ + Oid + OpernameGet(List *names, Oid ltypeId, Oid rtypeId) + { + char *schemaname; + char *opername; + Oid namespaceId; + HeapTuple tup = NULL; + + /* deconstruct the name list */ + DeconstructQualifiedName(names, &schemaname, &opername); + + if (schemaname) + { + /* use exact schema given */ + namespaceId = LookupExplicitNamespace(schemaname); + + tup = SearchSysCache(OPERNAMENSP, + CStringGetDatum(opername), + ObjectIdGetDatum(ltypeId), + ObjectIdGetDatum(rtypeId), + ObjectIdGetDatum(namespaceId)); + } + else + { + /* we need namespace search */ + ListCell *nsp; + + recomputeNamespacePath(); + + foreach(nsp, namespaceSearchPath) + { + namespaceId = lfirst_oid(nsp); + + tup = SearchSysCache(OPERNAMENSP, + CStringGetDatum(opername), + ObjectIdGetDatum(ltypeId), + ObjectIdGetDatum(rtypeId), + ObjectIdGetDatum(namespaceId)); + if (HeapTupleIsValid(tup)) + break; + } + } + + if (HeapTupleIsValid(tup)) + { + Oid operOid = HeapTupleGetOid(tup); + ReleaseSysCache(tup); + return operOid; + } + + return InvalidOid; + } + /* * OpernameGetCandidates *** ./src/backend/parser/parse_oper.c.orig Tue Apr 25 22:12:52 2006 --- ./src/backend/parser/parse_oper.c Wed Apr 26 00:19:04 2006 *************** *** 29,36 **** #include "utils/typcache.h" ! static Oid binary_oper_exact(Oid arg1, Oid arg2, ! FuncCandidateList candidates); static FuncDetailCode oper_select_candidate(int nargs, Oid *input_typeids, FuncCandidateList candidates, --- 29,35 ---- #include "utils/typcache.h" ! static Oid binary_oper_exact(List *opname, Oid arg1, Oid arg2); static FuncDetailCode oper_select_candidate(int nargs, Oid *input_typeids, FuncCandidateList candidates, *************** *** 387,397 **** * be reduced to its base type to find an "exact" match. */ static Oid ! binary_oper_exact(Oid arg1, Oid arg2, ! FuncCandidateList candidates) { FuncCandidateList cand; bool was_unknown = false; /* Unspecified type for one of the arguments? then use the other */ if ((arg1 == UNKNOWNOID) && (arg2 != InvalidOid)) --- 386,396 ---- * be reduced to its base type to find an "exact" match. */ static Oid ! binary_oper_exact(List *opname, Oid arg1, Oid arg2) { FuncCandidateList cand; bool was_unknown = false; + Oid operOid; /* Unspecified type for one of the arguments? then use the other */ if ((arg1 == UNKNOWNOID) && (arg2 != InvalidOid)) *************** *** 405,415 **** was_unknown = true; } ! for (cand = candidates; cand != NULL; cand = cand->next) ! { ! if (arg1 == cand->args[0] && arg2 == cand->args[1]) ! return cand->oid; ! } if (was_unknown) { --- 404,412 ---- was_unknown = true; } ! operOid = OpernameGet(opname, arg1, arg2); ! if (OidIsValid(operOid)) ! return operOid; if (was_unknown) { *************** *** 418,428 **** if (basetype != arg1) { ! for (cand = candidates; cand != NULL; cand = cand->next) ! { ! if (basetype == cand->args[0] && basetype == cand->args[1]) ! return cand->oid; ! } } } --- 415,423 ---- if (basetype != arg1) { ! operOid = OpernameGet(opname, basetype, basetype); ! if (OidIsValid(operOid)) ! return operOid; } } *************** *** 509,530 **** FuncDetailCode fdresult = FUNCDETAIL_NOTFOUND; HeapTuple tup = NULL; ! /* Get binary operators of given name */ ! clist = OpernameGetCandidates(opname, 'b'); ! /* No operators found? Then fail... */ ! if (clist != NULL) { /* ! * Check for an "exact" match. */ - operOid = binary_oper_exact(ltypeId, rtypeId, clist); - if (!OidIsValid(operOid)) - { - /* - * Otherwise, search for the most suitable candidate. - */ /* * Unspecified type for one of the arguments? then use the other * (XXX this is probably dead code?) --- 504,526 ---- FuncDetailCode fdresult = FUNCDETAIL_NOTFOUND; HeapTuple tup = NULL; ! /* ! * Check for an "exact" match. ! */ ! operOid = binary_oper_exact(opname, ltypeId, rtypeId); ! if (!OidIsValid(operOid)) { /* ! * Otherwise, search for the most suitable candidate. */ + /* Get binary operators of given name */ + clist = OpernameGetCandidates(opname, 'b'); + + /* No operators found? Then fail... */ + if (clist != NULL) + { /* * Unspecified type for one of the arguments? then use the other * (XXX this is probably dead code?) *************** *** 537,548 **** inputOids[1] = rtypeId; fdresult = oper_select_candidate(2, inputOids, clist, &operOid); } - if (OidIsValid(operOid)) - tup = SearchSysCache(OPEROID, - ObjectIdGetDatum(operOid), - 0, 0, 0); } if (!HeapTupleIsValid(tup) && !noError) op_error(pstate, opname, 'b', ltypeId, rtypeId, fdresult, location); --- 533,545 ---- inputOids[1] = rtypeId; fdresult = oper_select_candidate(2, inputOids, clist, &operOid); } } + if (OidIsValid(operOid)) + tup = SearchSysCache(OPEROID, + ObjectIdGetDatum(operOid), + 0, 0, 0); + if (!HeapTupleIsValid(tup) && !noError) op_error(pstate, opname, 'b', ltypeId, rtypeId, fdresult, location); ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| |||
| Atsushi Ogawa <a_ogawa@hi-ho.ne.jp> writes: > The OpernameGetCandidates called from oper. The function of oper is > search for a binary operator. It does the following processing: > (1)Create candidates of operator that matches operator name and > operator kind by OpernameGetCandidates. > (2)Find an operator that matches exactly ltypeId and rtypeId from > the candidates of operator by binary_oper_exact. > (3)If not found, find an operator from the candidates of operator by > oper_select_candidate. The oper_select_candidate accepts coerce type > and resolve the conflict. > I think that we can search the system catalog cache instead of > retrieval from the candidates of operator in the binary_oper_exact, > and reverse the order of executing (1) and (2) for performance > improvement. AFAICT, this will make things a bit faster if there is an exact match, and quite a bit slower if there is not (especially if the search path is long). I've known for awhile that OpernameGetCandidates is a bottleneck, but I don't want a solution that optimizes some cases at the price of making others worse. pgbench is not a good model of the real world for such tradeoffs. Whatever solution we find, it should be applied to the unary operator paths as well. It's not appropriate to introduce gratuitous differences between the binary and unary operator paths. (Again, pgbench is a poor model of the real world ... I don't think it even uses any unary operators.) regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| |||
| Atsushi Ogawa <a_ogawa@hi-ho.ne.jp> writes: > I think that we can search the system catalog cache instead of > retrieval from the candidates of operator in the binary_oper_exact, > and reverse the order of executing (1) and (2) for performance > improvement. I've been thinking more about how this might be improved. In a test involving the same trivial query repeated many times ("select * from tab where foo = 'constant'") I see oprofile results like this: samples % symbol name 95234 5.5458 AllocSetAlloc 89622 5.2190 hash_search 86038 5.0103 OpernameGetCandidates 76747 4.4692 SearchCatCache 64042 3.7294 base_yyparse 39473 2.2987 LWLockAcquire 32195 1.8748 base_yylex 25024 1.4572 nocachegetattr 21913 1.2761 MemoryContextAllocZeroAligned 21268 1.2385 _bt_compare 19665 1.1452 fmgr_info_cxt_security 18430 1.0732 LWLockRelease 17312 1.0081 expression_tree_walker .... 7787 0.4535 SearchCatCacheList .... The problem with the patch you proposed is that when there *isn't* an exact match, it costs N extra SearchCatCache probes, where N is the length of the search_path. We need a solution that limits the damage in that case. The solution I'm considering is to add an additional namespace.c routine defined like Oid OpnameGetOprid(List *opname, Oid oprleft, Oid oprright) that returns the first exact match in the search path, or 0 if none. (parse_oper.c would try this before OpernameGetCandidates.) The secret is that it should use a SearchCatCacheList lookup instead of repeated SearchCatCache probles. If the list lookup is successful, it will normally produce a very short result (typically only one member) and so finding the first visible member if any is cheap. With this approach the penalty for failure is just one extra SearchCatCacheList lookup (which has cost comparable to SearchCatCache, I believe, once the cache is set up) instead of N SearchCatCache lookups. In the example above this should add at most about half a percent of total system runtime. The potential of making a large dent in OpernameGetCandidates' five percent makes that look like a good trade. If this works out we might want to think about converting some of the other namespace.c routines to use similar tactics. I'm not sure I've ever seen them high in a profile though. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| ||||
| I wrote: > The solution I'm considering is to add an additional namespace.c routine > defined like > Oid OpnameGetOprid(List *opname, Oid oprleft, Oid oprright) I coded this up, and it seems to be a win just on code cleanliness grounds, because there are several places that want to do a search for an operator with exact input types given; this routine fits their needs exactly while OpernameGetCandidates was a poor fit. It seems to behave as advertised in terms of getting rid of OpernameGetCandidates overhead in the exact-match case --- this only makes for a few percent overall improvement in the test case I'm looking at, but that's about what's expected. I made a non-exact-match test case by changing char(N) to varchar(N) in the table definitions, and what I see is samples % symbol name 189998 11.9853 SearchCatCache 68888 4.3455 AllocSetAlloc 66793 4.2134 hash_search 49284 3.1089 OpernameGetCandidates 46472 2.9315 base_yyparse 28268 1.7832 LWLockAcquire 27487 1.7339 DirectFunctionCall1 27071 1.7077 DLMoveToFront 26375 1.6638 nocachegetattr 24956 1.5743 SearchSysCache 21461 1.3538 base_yylex 20176 1.2727 heap_getsysattr 19466 1.2279 FunctionCall2 18148 1.1448 CatalogCacheComputeHashValue 17608 1.1107 _bt_compare 16606 1.0475 MemoryContextAllocZeroAligned .... 6870 0.4334 SearchCatCacheList This case is significantly slower than the other one anyway, and the reason is evidently a lot more SearchCatCache calls. AFAICT those can all be blamed on the getBaseType() calls that are sprinkled through parse_func.c and parse_oper.c. I thought those would probably come back to haunt us from a performance standpoint someday :-( Anyway, I can't measure any performance difference in the non-exact-match case. Maybe if we got rid of the getBaseType calls it'd be worth checking again, but for now this seems like a win. I'll go ahead and commit it. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster |