Unix Technical Forum

Re: Avoid huge perfomance loss on string concatenation

This is a discussion on Re: Avoid huge perfomance loss on string concatenation within the Pgsql General forums, part of the PostgreSQL category; --> In response to "Andrus" <eetasoft@online.ee>: > Thank you very much for quick reply. Keep the mailing list included in ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql General

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-10-2008, 12:13 AM
Bill Moran
 
Posts: n/a
Default Re: Avoid huge perfomance loss on string concatenation

In response to "Andrus" <eetasoft@online.ee>:

> Thank you very much for quick reply.


Keep the mailing list included in this discussion.

> > can you please give us the types of dok.kuupaev and dok.kellaaeg? I
> > think a simple fix is possible here.

>
> dok.kuupaev type is DATE
>
> dok.kellaaeg type is character(5) NOT NULL DEFAULT ''
> and is used to represent dokument time in format hh mm
>
> Database encoding is UTF-8 , cluster locale is estonian, OS is Windows 2003
> server.
>
> >> You provide zero information on the table layout

>
> dok table full definition is below.
> What other information do you need ?


That's pretty much it.

> >>, and the explain output
> >> has been horribly mangled by your MUA.

>
> I used copy and paste from pgAdmin.
> I checked my message and it seems that explain output is OK, havent found
> any truncation.
> So I do'nt understand this.


I had to cut/paste the data into a text editor and reformat it before I
could work with it. The arbitrary line-wrap is painful:

"Nested Loop Left Join (cost=68.75..5064.86 rows=1 width=24) (actual
time=8.081..26995.552 rows=567 loops=1)"
" Join Filter: ((toode.grupp = artliik.grupp) AND (toode.liik =
artliik.liik))"
" -> Nested Loop (cost=68.75..5062.19 rows=1 width=43) (actual
time=8.045..26965.731 rows=567 loops=1)"
" -> Index Scan using toode_pkey on toode (cost=0.00..8.27 rows=1
width=43) (actual time=0.023..0.026 rows=1 loops=1)"
" Index Cond: ('NAH S'::bpchar = toode)"
" -> Nested Loop (cost=68.75..5053.91 rows=1 width=24) (actual
time=8.016..26964.698 rows=567 loops=1)"
" -> Index Scan using dok_kuupaev_idx on dok
(cost=0.00..4326.16 rows=10 width=4) (actual time=0.059..67.985 rows=3543
loops=1)"
" Index Cond: ((kuupaev >= '2007-11-01'::date) AND
(kuupaev <= '2007-12-04'::date))"
" Filter: ((((kuupaev)::text || (kellaaeg)::text) >=
'2007-11-01'::text) AND (((kuupaev)::text || (kellaaeg)::text) <=
'2007-12-0423 59'::text))"
" -> Bitmap Heap Scan on rid (cost=68.75..72.76 rows=1
width=28) (actual time=7.577..7.577 rows=0 loops=3543)"
" Recheck Cond: ((dok.dokumnr = rid.dokumnr) AND
(rid.toode = 'NAH S'::bpchar))"
" -> BitmapAnd (cost=68.75..68.75 rows=1 width=0)
(actual time=7.574..7.574 rows=0 loops=3543)"
" -> Bitmap Index Scan on rid_dokumnr_idx
(cost=0.00..5.13 rows=83 width=0) (actual time=0.037..0.037 rows=14
loops=3543)"
" Index Cond: (dok.dokumnr = rid.dokumnr)"
" -> Bitmap Index Scan on rid_toode_idx
(cost=0.00..63.03 rows=1354 width=0) (actual time=7.528..7.528 rows=21144
loops=3543)"
" Index Cond: (toode = 'NAH S'::bpchar)"
" -> Seq Scan on artliik (cost=0.00..2.27 rows=27 width=19) (actual
time=0.007..0.020 rows=27 loops=567)"
"Total runtime: 26996.399 ms"

> >> I would suspect the problem is that there's no index that can be used
> >> for that final comparison.

>
> Postgres must use index on kuupaev in both queries.


I'm not sure what that comment is supposed to mean.

PG is using the index for the condition
dok.kuupaev BETWEEN '2007-11-01' AND '2007-12-04'

but there is no index that matches the expression
dok.kuupaev||dok.kellaaeg BETWEEN '2007-11-01' AND '2007-12-0423 59'

If you look at your explain output, you'll see that step is taking a
lot of time, and it's inside a nested loop, which means it's run
repeatedly.

> This index filters out most rows.


Have you run a VACUUM ANALYZE on the tables involved with this query
recently? It's possible that PG has outdated statistics and is
running a poor plan as a result.

> >> Do you have an index along the lines of
> >> CREATE INDEX dokindex ON dok (kuupaeve||kellaaeg) ?

>
> I do'nt have this index.
> dok.kuupaev||dok.kellaaeg conditon should applied after index search is
> performed.


The query planner doesn't seem to think so.

> It filters out only a small number of rows additionally to the plain kuupaev
> filter.
> So adding index on dok.kuupaev||dok.kellaaeg is not reasonable IMHO.
>
> Please confirm that most reasonable way to fix this to add this index, I
> will add this.


Just add the index and rerun to see if it helps. If it doesn't, then
drop the index. I have absolutely no way to investigate this for you.

> >> Overall, the fact that you're concatenating two text fields to generate a
> >> date field tends to suggest that your database schema has some fairly
> >> major design problems, but I can only speculate at this point.

>
> This schema is migrated from dbms where there was no datetime support.
> char(5) field is used to express time in form hh mm
> This schema is deployed in a large number of servers.
> Its change would be very expensive. change requires huge amout of work time
> to re-write applications, create database conversion scripts, re-write
> pl/sql triggers, test and fix new bugs causes by change.


I understand. However, that doesn't change the fact that such a change
will improve performance and accuracy of the data.

For example, in your query, you have a text string meant to represent
a date: '2007-12-0423 59'

This is not a valid date/time, but PostgreSQL has no way to know that
because it's just a text string. As a result, you're query is liable
to give you outright incorrect results.

>
> Andrus.
>
>
> CREATE TABLE firma1.dok
> (
> doktyyp character(1) NOT NULL,
> dokumnr integer NOT NULL DEFAULT nextval('dok_dokumnr_seq'::regclass),
> kuupaev date NOT NULL,
> oper character(3),
> klient character(12),
> laonr numeric(2),
> raha character(3),
> tasudok character(25),
> knr character(10),
> tasukuup date,
> yksus character(10),
> sihtyksus character(10),
> pais2obj character(10),
> saaja character(12),
> krdokumnr integer,
> eimuuda ebool,
> kasutaja character(10),
> username character(10),
> kellaaeg character(5) NOT NULL DEFAULT ''::bpchar,
> arvekonto character(10),
> maksetin character(5),
> exchrate numeric(11,6),
> ratefound date,
> kurss numeric(10,5),
> tekst1 text,
> viitenr character(20),
> objrealt ebool,
> arvenumber character(25),
> pais3obj character(10),
> pais4obj character(10),
> pais5obj character(10),
> pais6obj character(10),
> pais7obj character(10),
> pais8obj character(10),
> pais9obj character(10),
> masin character(5),
> tegmasin character(5),
> guid character(36) NOT NULL,
> doksumma numeric(12,2),
> kinnitatud ebool,
> tasumata numeric(12,2),
> sularaha numeric(12,2),
> kaardimaks numeric(12,2),
> kalkliik character(1),
> kalktoode character(20),
> inventuur ebool,
> algus date,
> lopp date,
> taidetud ebool,
> kaal numeric(7,3),
> "timestamp" character(14) NOT NULL DEFAULT to_char(now(),
> 'YYYYMMDDHH24MISS'::text),
> vmnr integer,
> tellimus character(25),
> volitaisik character(36),
> liikmesrii character(2),
> tehingulii character(2),
> tarneklaus character(10),
> statprots character(2),
> CONSTRAINT dok_pkey PRIMARY KEY (dokumnr),
> CONSTRAINT dok_arvekonto_fkey FOREIGN KEY (arvekonto)
> REFERENCES firma1.konto (kontonr) MATCH SIMPLE
> ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
> CONSTRAINT dok_kalktoode_fkey FOREIGN KEY (kalktoode)
> REFERENCES firma1.toode (toode) MATCH SIMPLE
> ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
> CONSTRAINT dok_kasutaja_fkey FOREIGN KEY (kasutaja)
> REFERENCES kasutaja (kasutaja) MATCH SIMPLE
> ON UPDATE CASCADE ON DELETE SET NULL DEFERRABLE INITIALLY IMMEDIATE,
> CONSTRAINT dok_klient_fkey FOREIGN KEY (klient)
> REFERENCES firma1.klient (kood) MATCH SIMPLE
> ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
> CONSTRAINT dok_knr_fkey FOREIGN KEY (knr)
> REFERENCES firma1.konto (kontonr) MATCH SIMPLE
> ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
> CONSTRAINT dok_krdokumnr_fkey FOREIGN KEY (krdokumnr)
> REFERENCES firma1.dok (dokumnr) MATCH SIMPLE
> ON UPDATE CASCADE ON DELETE SET NULL DEFERRABLE INITIALLY IMMEDIATE,
> CONSTRAINT dok_liikmesrii_fkey FOREIGN KEY (liikmesrii)
> REFERENCES riik (kood) MATCH SIMPLE
> ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
> CONSTRAINT dok_maksetin_fkey FOREIGN KEY (maksetin)
> REFERENCES maksetin (maksetin) MATCH SIMPLE
> ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
> CONSTRAINT dok_oper_fkey FOREIGN KEY (oper)
> REFERENCES alamdok (oper) MATCH SIMPLE
> ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
> CONSTRAINT dok_pais2obj_fkey FOREIGN KEY (pais2obj)
> REFERENCES firma1.yksus2 (yksus) MATCH SIMPLE
> ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
> CONSTRAINT dok_pais3obj_fkey FOREIGN KEY (pais3obj)
> REFERENCES firma1.yksus3 (yksus) MATCH SIMPLE
> ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
> CONSTRAINT dok_pais4obj_fkey FOREIGN KEY (pais4obj)
> REFERENCES firma1.yksus4 (yksus) MATCH SIMPLE
> ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
> CONSTRAINT dok_pais5obj_fkey FOREIGN KEY (pais5obj)
> REFERENCES firma1.yksus5 (yksus) MATCH SIMPLE
> ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
> CONSTRAINT dok_pais6obj_fkey FOREIGN KEY (pais6obj)
> REFERENCES firma1.yksus6 (yksus) MATCH SIMPLE
> ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
> CONSTRAINT dok_pais7obj_fkey FOREIGN KEY (pais7obj)
> REFERENCES firma1.yksus7 (yksus) MATCH SIMPLE
> ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
> CONSTRAINT dok_pais8obj_fkey FOREIGN KEY (pais8obj)
> REFERENCES firma1.yksus8 (yksus) MATCH SIMPLE
> ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
> CONSTRAINT dok_pais9obj_fkey FOREIGN KEY (pais9obj)
> REFERENCES firma1.yksus9 (yksus) MATCH SIMPLE
> ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
> CONSTRAINT dok_raha_fkey FOREIGN KEY (raha)
> REFERENCES raha (raha) MATCH SIMPLE
> ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
> CONSTRAINT dok_saaja_fkey FOREIGN KEY (saaja)
> REFERENCES firma1.klient (kood) MATCH SIMPLE
> ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
> CONSTRAINT dok_sihtyksus_fkey FOREIGN KEY (sihtyksus)
> REFERENCES firma1.yksus1 (yksus) MATCH SIMPLE
> ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
> CONSTRAINT dok_statprots_fkey FOREIGN KEY (statprots)
> REFERENCES transpor (kood) MATCH SIMPLE
> ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
> CONSTRAINT dok_tarneklaus_fkey FOREIGN KEY (tarneklaus)
> REFERENCES tarnekla (kood) MATCH SIMPLE
> ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
> CONSTRAINT dok_tehingulii_fkey FOREIGN KEY (tehingulii)
> REFERENCES tehingul (kood) MATCH SIMPLE
> ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
> CONSTRAINT dok_username_fkey FOREIGN KEY (username)
> REFERENCES kasutaja (kasutaja) MATCH SIMPLE
> ON UPDATE CASCADE ON DELETE SET NULL DEFERRABLE INITIALLY IMMEDIATE,
> CONSTRAINT dok_vmnr_fkey FOREIGN KEY (vmnr)
> REFERENCES firma1.vmaks (vmnr) MATCH SIMPLE
> ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
> CONSTRAINT dok_volitaisik_fkey FOREIGN KEY (volitaisik)
> REFERENCES firma1.kaardika (guid) MATCH SIMPLE
> ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
> CONSTRAINT dok_yksus_fkey FOREIGN KEY (yksus)
> REFERENCES firma1.yksus1 (yksus) MATCH SIMPLE
> ON UPDATE CASCADE ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE,
> CONSTRAINT dok_check CHECK (krdokumnr IS NULL OR (doktyyp = ANY
> (ARRAY['G'::bpchar, 'O'::bpchar]))),
> CONSTRAINT dok_dokumnr_check CHECK (dokumnr > 0),
> CONSTRAINT dok_guid_check CHECK (guid <> ''::bpchar)
> )
> WITHOUT OIDS;
>
> ALTER TABLE firma1.dok OWNER TO eeva_owner;
>
> CREATE INDEX dok_klient_idx
> ON firma1.dok
> USING btree
> (klient);
>
> CREATE INDEX dok_krdokumnr_idx
> ON firma1.dok
> USING btree
> (krdokumnr);
>
> CREATE INDEX dok_kuupaev_idx
> ON firma1.dok
> USING btree
> (kuupaev);
>
> CREATE INDEX dok_tasudok_idx
> ON firma1.dok
> USING btree
> (tasudok);
>
> CREATE UNIQUE INDEX dok_tasudok_unique_idx
> ON firma1.dok
> USING btree
> (doktyyp, tasudok)
> WHERE doktyyp = ANY (ARRAY['T'::bpchar, 'U'::bpchar]);
>
> CREATE INDEX dok_tasumata_idx
> ON firma1.dok
> USING btree
> (tasumata);
>
> CREATE INDEX dok_tellimus_idx
> ON firma1.dok
> USING btree
> (tellimus);
>
> CREATE TRIGGER dok_btrig
> BEFORE INSERT
> ON firma1.dok
> FOR EACH ROW
> EXECUTE PROCEDURE firma1.dok_seq_trig();
>
>
>
>



--
Bill Moran
http://www.potentialtech.com

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-10-2008, 12:13 AM
Andrus
 
Posts: n/a
Default Re: Avoid huge perfomance loss on string concatenation

> I'm not sure what that comment is supposed to mean.
>
> PG is using the index for the condition
> dok.kuupaev BETWEEN '2007-11-01' AND '2007-12-04'
>
> but there is no index that matches the expression
> dok.kuupaev||dok.kellaaeg BETWEEN '2007-11-01' AND '2007-12-0423 59'
>
> If you look at your explain output, you'll see that step is taking a
> lot of time, and it's inside a nested loop, which means it's run
> repeatedly.


Postgres must use index to filter out rows matching to the condition

dok.kuupaev BETWEEN '2007-11-01' AND '2007-12-04'

this returns 121 rows.

Additional condition

dok.kuupaev||dok.kellaaeg BETWEEN '2007-11-01' AND '2007-12-0423 59'

should be evaluated only for these 121 rows.

> Have you run a VACUUM ANALYZE on the tables involved with this query
> recently? It's possible that PG has outdated statistics and is
> running a poor plan as a result.


I have the follwing command in end of postgresql.conf file:

stats_start_collector = on
stats_row_level = on
autovacuum = on

So autovacuum should be running.

> Just add the index and rerun to see if it helps. If it doesn't, then
> drop the index. I have absolutely no way to investigate this for you.


I tried

CREATE INDEX dok_kuupaev_kellaaeg_idx ON dok ((kuupaev||kellaaeg));

but got error

ERROR: functions in index expression must be marked IMMUTABLE
SQL state: 42P17

How to create such index ?

> For example, in your query, you have a text string meant to represent
> a date: '2007-12-0423 59'
>
> This is not a valid date/time, but PostgreSQL has no way to know that
> because it's just a text string. As a result, you're query is liable
> to give you outright incorrect results.


My fields are kuupaev date and kellaaeg char(5)
kellaaeg is in format hh mm

I compare this always with kuupaev||kellaaeg

'2007-12-0423 59' is my valid datetime for to be used for this conversion.

I can probably convert kuupaev||kellaaeg to a datetime and use datetime
comparison instead of this. Will this increase perfomance ?

Andrus.


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 10:04 PM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0
www.UnixAdminTalk.com