问一条sql执行效率的问题讨论第2页: - oracle

群组首页 → 数据库 → oracle → 论坛

发表回复

问一条sql执行效率的问题

獨自等待 2010-08-15

left join 应该可以我支持楼上的 ,但还可以考虑用存储过程 100W commit一次

yunmanfan 2010-08-16

我来做的话,就写个程序,在程序里处理,是不是可行?

zhang19841017 2010-08-16

先用union all　把数据插进表Ｃ，把表Ａ删掉，　把Ｃ表重命名为Ａ　就可以了

select * from A union all select id, 0 as status from B
into C

这样应该比left join 效率要高

grantsw 2010-08-17

查看 execute plan .

fsj0101 2010-08-20

zhang19841017 写道

这样，A和B相同的数据不就重复了？还得删除重复的数据。

fsj0101 2010-08-20

改成not exists 应该不会太慢，1000万的数据，保守１个小时之内

今夜有雨 2010-08-20

谢谢大家，最后采用了full join的方式，两张表full join一下，存入第三张表。

dingjun1 2010-08-20

测试过程
原来的表cons，customerid客户ID，这是存放的打卡记录，共有6403364行,不同customerid数：30568，在customerid创建了索引。
create table test_my as select customerid,opcount from cons;/*15 seconds*/
create index test_my_cid on test_my(customerid);/*40 seconds*/
去掉两个customerid
delete from test_my where customerid in(28158,16);
commit;
执行下面的三条语句
从一个表中查询另一个表不存在的customerid。
select/*+ gather_plan_statistics */ customerid,opcount from test_my t where not exists(select customerid from cons a where t.customerid=a.customerid);
select/*+ gather_plan_statistics */ * from ( select customerid from cons minus select customerid from test_my );
select/*+ gather_plan_statistics */ a.customerid from cons a left join test_my b on (a.customerid=b.customerid ) where b.customerid is null;

--这条SQL把表写反了，但是不影响参考
select/*+ gather_plan_statistics */ customerid,opcount from test_my t where
not exists(select customerid from cons a where t.customerid=a.customerid);
call     count       cpu    elapsed       disk      query    current        rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      6.90      19.96      37714      26275          0           0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total        3      6.90      19.97      37714      26275          0           0
Rows     Row Source Operation
------- ---------------------------------------------------
      0 HASH JOIN RIGHT ANTI (cr=26275 pr=37714 pw=19530 time=19969254 us)
6403364   INDEX FAST FULL SCAN I_CONS_CUSTOMERID (cr=14109 pr=6554 pw=0 time=10193 us)(object id 152141)
6403069   TABLE ACCESS FULL TEST_MY (cr=12166 pr=11630 pw=0 time=10970 us)

select/*+ gather_plan_statistics */ * from ( select customerid from cons
minus select customerid from test_my )

call     count       cpu    elapsed       disk      query    current        rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1     12.96      21.91      59457      61308          0           2
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total        3     12.96      21.91      59457      61308          0           2

Rows     Row Source Operation
------- ---------------------------------------------------
      2 VIEW (cr=61308 pr=59457 pw=0 time=21909468 us)
      2   MINUS (cr=61308 pr=59457 pw=0 time=21909463 us)
30568    SORT UNIQUE (cr=49142 pr=47514 pw=0 time=13923686 us)
6403364     TABLE ACCESS FULL CONS (cr=49142 pr=47514 pw=0 time=6403410 us)
30566    SORT UNIQUE (cr=12166 pr=11943 pw=0 time=7971044 us)
6403069     TABLE ACCESS FULL TEST_MY (cr=12166 pr=11943 pw=0 time=12481 us)

没有做distinct 一个customerid存在多条记录
select/*+ gather_plan_statistics */ a.customerid from cons a left join
test_my b on (a.customerid=b.customerid ) where b.customerid is null

call     count       cpu    elapsed       disk      query    current        rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        3      6.85      21.35      65211      61308          0         295
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total        5      6.85      21.35      65211      61308          0         295

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 57

Rows     Row Source Operation
------- ---------------------------------------------------
    295 HASH JOIN RIGHT ANTI (cr=61308 pr=65211 pw=16647 time=16633288 us)
6403069   TABLE ACCESS FULL TEST_MY (cr=12166 pr=0 pw=0 time=49 us)
6403364   TABLE ACCESS FULL CONS (cr=49142 pr=48564 pw=0 time=19226485 us)
-------------------------------------------------------------------------------------------

从执行计划上看，not exists会被优化器优化为半连接。
如果走索引INDEX FAST FULL SCAN会减少一些逻辑读。

select/*+ gather_plan_statistics */ customerid,opcount from test_my where customerid not in(select customerid from cons)
这条SQL实在太慢了，就没有等。
执行过程应该是从test_my中取出一条记录，然后就到cons中进行扫描一次，那么逻辑读大致可以理解为test_my * cons;
其它的三种方式逻辑读大致可以理解为test_my + cons。不同的是创建hash表过滤与唯一值排序后过滤的区别。

dingjun1 2010-08-20

今夜有雨写道

谢谢大家，最后采用了full join的方式，两张表full join一下，存入第三张表。

40秒内能完成么？

TeddyWang 2010-08-24

我爱小白写道

可以采用临时表，通过临时表把数据插入A中去

我觉得这个比较可行，在orale10g以后你可以使用forall来执行整块的插入。

首先把符合要求的数据弄出来放在全局临时表中，然后用forall来1000件1000件的载入，问题应该不大。
效率大概在每40000件5分钟左右。

TYPE data_ttype IS TABLE OF A%ROWTYPE      INDEX BY PLS_INTEGER;
data_tbl            data_ttype;
FORALL i IN INDICES OF data_tbl SAVE EXCEPTIONS
      INSERT INTO A
      VALUES data_tbl(i);

但是如果你的目标表是在dblink从另一个数据库影射过来的话，forall不好用。

发表回复

>>返回群组首页

问一条sql执行效率的问题

相关讨论

相关资源推荐