Total estimated I/O cost for statement

Discussion:

(too old to reply)

1dg618

2011-10-25 22:52:07 UTC

Below are two total estimated I/O cost for statement 1 for my two kind-
of similar queries. What does the number mean? The doc is confusing.

I assume the lower the number the better?

Total estimated I/O cost for statement 1 (at line 1): 20019.

vs.

Total estimated I/O cost for statement 1 (at line 1): 63543.

a***@gmail.com

2011-10-25 23:15:49 UTC

Permalink

IO cost is a unit-less number. Essentially I think it's a weighting applied to the (estimated) logical and physical IO done by the query - 2 units for a logical IO and 18 units for a physical IO.

If I recall correctly, the first access of any page is assumed to be physical, and subsequent accesses assumed to be logical (i.e. perfect caching but with a cold cache initially). It can and will deviate from your real-world results.

Hope that helps,
Antony.

Antony

2011-10-25 23:21:06 UTC

Permalink

To answer your other question, a qualified "yes" lower is better - in the sense that the ASE optimizer will seek to minimise this value in choosing the final plan. It's possible in practice, though, that the plan producing the fastest response to your query might not be the same as the one giving the lowest *estiamted* io cost. :)

1dg618

2011-10-26 00:13:47 UTC

Permalink

Post by Antony
To answer your other question, a qualified "yes" lower is better - in the sense that the ASE optimizer will seek to minimise this value in choosing the final plan. It's possible in practice, though, that the plan producing the fastest response to your query might not be the same as the one giving the lowest *estiamted* io cost. :)

You're correct. (1) has the lower *estimate* and the slower time.
Where as (2) has the higher *estimate* and the faster time. It's
always interesting because "things" make sense... and here comes the
wrench.

(1) Total estimated I/O cost for statement 1 (at line 1): 20019.

vs.

(2) Total estimated I/O cost for statement 1 (at line 1): 63543.

1dg618

2011-10-26 00:20:42 UTC

Permalink

Post by 1dg618

You're correct. (1) has the lower *estimate* and the slower time.
Where as (2) has the higher *estimate* and the faster time. It's
always interesting because "things" make sense... and here comes the
wrench.
(1) Total estimated I/O cost for statement 1 (at line 1): 20019.
vs.
(2) Total estimated I/O cost for statement 1 (at line 1): 63543.

In the query plan for (2), "Index contains all needed columns. Base
table will not be read." The estimate was higher and the time was
faster.

Positioning by key.
Index contains all needed columns. Base table will not be read.
Keys are:

In the query plan for (1), it didn't have the "Index contains all
needed columns. Base table will not be read." The estimate was lower
and the time was slower.

Positioning by key.
Keys are:

I guess, I'll have to put together the tables, queries and plan and
post it.

Derek Asirvadem

2011-10-29 06:21:08 UTC

Permalink

Post by 1dg618

Post by 1dg618
(1) Total estimated I/O cost for statement 1 (at line 1): 20019.
vs.
(2) Total estimated I/O cost for statement 1 (at line 1): 63543.

In the query plan for (2), "Index contains all needed columns. Base
table will not be read." The estimate was higher and the time was
faster.
Positioning by key.
Index contains all needed columns. Base table will not be read.
In the query plan for (1), it didn't have the "Index contains all
needed columns. Base table will not be read." The estimate was lower
and the time was slower.
Positioning by key.

Well, that goes to show, you cannot ever take one single stat or
metric, or clause of a sentence, hold it up separate from the context,
forgetting the rest of the sats, metrics, clauses, and examine it in
isolation. That is fragmented, and cannot be relied upon. No, you
need to take all the stats, metrics, etc, and evaluate them as a
whole.

When I see that (2) is a covered query, I would take that as being the
overriding indicator of performance; the *estimated* IO cost becomes
very secondary.

Post by 1dg618
I guess, I'll have to put together the tables, queries and plan and
post it.

Yes, please. And:

SET STATISTICS IO ON
SET STATISTICS TIME ON

If you have installed it, DBISQL provides a very attractive visual
rather than text plan.

Regards
Derek

1dg618

2011-10-31 16:36:31 UTC

Permalink

Hopefully, this helps out?

CREATE TABLE first
(
one_id INT NOT NULL,
two_id INT NOT NULL,
date DATETIME NOT NULL,
junk_id INT NOT NULL,
CONSTRAINT first_pkc_1
PRIMARY KEY CLUSTERED ( one_id, two_id, date, junk_id ) on
'default'
)
LOCK allpages ON 'default' partition BY roundrobin 1
GO

CREATE TABLE second
(
junk_id INT NOT NULL,
junk_desc varchar(255) NULL,
CONSTRAINT second_pkc_1
PRIMARY KEY CLUSTERED ( junk_id ) on 'default'
)
LOCK allpages ON 'default' partition BY roundrobin 1
GO

select
a.one_id,
a.two_id,
a.date,
b.junk_id
from
first a
left join second b on a.junk_id = b.junk_id
where
a.date >= '20091020'
and
a.date <= '20091120'
and
a.one_id = 120326
and
a.two_id = 1

QUERY PLAN FOR STATEMENT 1 (at line 1).
STEP 1
The type of query is SELECT.
5 operator(s) under root
|ROOT:EMIT Operator (VA = 5)
|
| |RESTRICT Operator (VA = 4)(0)(0)(0)(0)(11)
| |
| | |MERGE JOIN Operator (Join Type: Left Outer Join) (VA =
3)
| | | Using Worktable2 for internal storage.
| | | Key Count: 1
| | | Key Ordering: ASC
| | |
| | | |SORT Operator (VA = 1)
| | | | Average Row width is 22.000000
| | | | Using Worktable1 for internal storage.
| | | |
| | | | |SCAN Operator (VA = 0)
| | | | | FROM TABLE
| | | | | first
| | | | | a
| | | | | Using Clustered Index.
| | | | | Index : first_pkc_1
| | | | | Forward Scan.
| | | | | Positioning by key.
| | | | | Keys are:
| | | | | one_id ASC
| | | | | two_id ASC
| | | | | date ASC
| | | | | Using I/O Size 16 Kbytes for data pages.
| | | | | With MRU Buffer Replacement Strategy for
data pages.
| | |
| | | |SCAN Operator (VA = 2)
| | | | FROM TABLE
| | | | second
| | | | b
| | | | Using Clustered Index.
| | | | Index : XPKsecone
| | | | Forward Scan.
| | | | Positioning at index start.
| | | | Index contains all needed columns. Base table
will not be read.
| | | | Using I/O Size 16 Kbytes for index leaf pages.
| | | | With LRU Buffer Replacement Strategy for index
leaf pages.

Total estimated I/O cost for statement 1 (at line 1): 77052.

The first table is filled with about 300 million records and the
second table is filled with about 2 million records.

According to the show plan, we it is using work tables and what do
work tables mean?

"Using Worktable2 for internal storage." and "Using Worktable1 for
internal storage."

Is there anything I can do to get rid of the work tables?

Derek Asirvadem

2011-11-02 00:54:23 UTC

Permalink

This post might be inappropriate. Click to display it.

Antony

2011-11-02 03:03:42 UTC

Permalink

| | |MERGE JOIN Operator (Join Type: Left Outer Join) (VA = 3)

If merge joins are causing you troubles, I'd ignore them:

1> set merge_join off
2> go

I'm loathe to recommend you disable them by default - there must be some use for them - but I have yet to see them cause anything but trouble :)

At lease - try the query plan again without merge joins enabled ...

1dg618

2011-11-02 16:39:21 UTC

Permalink

Suggestions on a good Sybase book that talks about creating tables,
indexes, constraints, etc., to take advantage of the "mechanics" of
Sybase? I realize, I don't know the mechanics of Sybase. Each vendor's
implementation is different. If you understand it, you can take
advantage of it. Am I wrong?

Derek Asirvadem

2011-11-02 21:45:45 UTC

Permalink

On Nov 3, 3:39 am, 1dg618 <***@gmail.com> wrote:

Unfortunately there are no *good* books. The Jeffery Garbus book is
the best of the bad lot. For the level of your questions, it will
increase your understanding. No question that you should know as much
as you can about your platform, so that you can administer it better.

It is true that platform-specific info will allow you to take
advantage of that, but that is at the higher end, let's say the last
20% of performance. The issues here are not internals or platform-
specific. They fall squarely within the first 80%. Really, they are
Data Modelling errors. Sybase is by far the best at taking whatever
you have modelled and making the most of it: the better/more standard-
compliant your model, the better Sybase will handle it.
- if you add the FK Constraint the QP will correct itself (eg. change
from merge join to nested loop join; otherwise MJ is probably the best
it can do)
- if you correct the code, you can examine one table only
- if you correct the nullable FK (serious error) the problem
disappears

None of that is "taking advantage of platform-specific info"

1dg618

2011-11-04 13:11:29 UTC

Permalink

Unfortunately there are no *good* books. The Jeffery Garbus book is
the best of the bad lot. For the level of your questions, it will
increase your understanding. No question that you should know as much
as you can about your platform, so that you can administer it better.
It is true that platform-specific info will allow you to take
advantage of that, but that is at the higher end, let's say the last
20% of performance. The issues here are not internals or platform-
specific. They fall squarely within the first 80%. Really, they are
Data Modelling errors. Sybase is by far the best at taking whatever
you have modelled and making the most of it: the better/more standard-
compliant your model, the better Sybase will handle it.
- if you add the FK Constraint the QP will correct itself (eg. change
from merge join to nested loop join; otherwise MJ is probably the best
it can do)
- if you correct the code, you can examine one table only
- if you correct the nullable FK (serious error) the problem
disappears
None of that is "taking advantage of platform-specific info"

Thank you.

Derek Asirvadem

2011-11-06 14:33:07 UTC

Permalink

Post by 1dg618
Thank you.

My pleasure.

Regards
Derek

Continue reading on narkive:

Search results for 'Total estimated I/O cost for statement' (Questions and Answers)

replies

Why is Obama attacking Caterpillar and other companies for telling the truth about the cost of Obamacare?

started 2010-03-28 10:47:36 UTC

politics