Are there general ABAP-specific tips related to performance of big SELECT
queries?
In particular, is it possible to close once and for all the question of FOR ALL ENTRIES IN
vs JOIN
?
Are there general ABAP-specific tips related to performance of big SELECT
queries?
In particular, is it possible to close once and for all the question of FOR ALL ENTRIES IN
vs JOIN
?
A few (more or less) ABAP-specific hints:
Avoid SELECT * where it s not needed, try to select only the fields that are required. Reason: Every value might be mapped several times during the process (DB Disk --> DB Memory --> Network --> DB Driver --> ABAP internal). It s easy to save the CPU cycles if you don t need the fields anyway. Be very careful if you SELECT * a table that contains BLOB fields like STRING, this can totally kill your DB performance because the blob contents are usually stored on different pages.
Don t SELECT ... ENDSELECT for small to medium result sets, use SELECT ... INTO TABLE instead.
Reason: SELECT ... INTO TABLE performs a single fetch and doesn t keep the cursor open while SELECT ... ENDSELECT will typically fetch a single row for every loop iteration.
This was a kind of urban myth - there is no performance degradation for using SELECT
as a loop statement. However, this will keep an open cursor during the loop which can lead to unwanted (but not strictly performance-related) effects.
For large result sets, use a cursor and an internal table. Reason: Same as above, and you ll avoid eating up too much heap space.
Don t ORDER BY, use SORT instead. Reason: Better scalability of the application server.
Be careful with nested SELECT statements. While they can be very handy for small inner result sets , they are a huge performance hog if the nested query returns a large result set.
Measure, Measure, Measure Never assume anything if you re worried about performance. Create a representative set of test data and run tests for different implementations. Learn how to use ST05 and SAT.
There won t be a way to close your second question "once and for all". First of all, FOR ALL ENTRIES IN joins a database table and an internal (memory) table while JOIN only operates on database tables. Since the database knows nothing about the internal ABAP memory, the FOR ALL ENTRIES IN statement will be transformed to a set of WHERE statements - just try and use the ST05 to trace this. Second, you can t add values from the second table when using FOR ALL ENTRIES IN. Third, be aware that FOR ALL ENTRIES IN always implies DISTINCT. There are a few other pitfalls - be sure to consult the on-line ABAP reference, they are all listed there.
If the number of records in the second table is small, both statements should be more or less equal in performance - the database optimizer should just preselect all values from the second table and use a smart joining algorithm to filter through the first table. My recommendation: Use whatever feels good, don t try to tweak your code to illegibility.
If the number of records in the second table exceeds a certain value, Bad Things [TM] happen with FOR ALL ENTRIES IN - the contents of the table are split into multiple sets, then the query is transformed (see above) and re-run for each set.
Another note: The "Avoid SELECT *" statement is true in general, but I can tell you where it is false.
When you are going to take most of the fields anyway, and where you have several queries (in the same program, or different programs that are likely to be run around the same time) which take most of the fields, especially if they are different fields that are missing.
This is because the App Server Data buffers are based on the select query signature. If you make sure to use the same query, then you can ensure that the buffer can be used instead of hitting the database again. In this case, SELECT * is better than selecting 90% of the fields, because you make it much more likely that the buffer will be used.
Also note that as of the last version I tested, the ABAP DB layer wasn t smart enough to recognize SELECT A, B as being the same as SELECT B, A, which means you should always put the fields you take in the same order (preferable the table order) in order to make sure again that the data buffer on the application is being well used.
I usually follow the rules stated in this pdf from SAP: "Efficient Database Programming with ABAP" It shows a lot of tips in optimizing queries.
This question will never be completely answered.
ABAP statement for accessing database is interpreted several times by different components of whole system (SAP and DB). Behavior of each component depends from component itself, its version and settings. Main part of interpretation is done in DB adapter on SAP side.
The only viable approach for reaching maximum performance is measurement on particular system (SAP version and DB vendor and version).
There are also quite extensive hints and tips in transaction SE30. It even allows you (depending on authorisations) to write code snippets of your own & measure it.
Unfortunately we can t close the "for all entries" vs join debate as it is very dependent on how your landscape is set up, wich database server you are using, the efficiency of your table indexes etc.
The simplistic answer is let the DB server do as much as possible. For the "for all entries" vs join question this means join. Except every experienced ABAP programmer knows that it s never that simple. You have to try different scenarios and measure like vwegert said. Also remember to measure in your live system as well, as sometimes the hardware configuration or dataset is significantly different to have entirely different results in your live system than test.
I usually follow the following conventions:
select *
, Select only the required fields.Select single
, or else use SELECT UP TO TO 1 ROWS, ENDSELECT
.FOR ALL ENTRIES
.Two more points in addition to the other answers:
usually you use JOIN
for two or more tables in the database and you use FOR ALL ENTRIES IN
to join database tables with a table you have in memory. If you can, JOIN
.
usually the IN
operator is more convinient than FOR ALL ENTRIES IN
. But the kernel translates IN
into a long select statement. The length of such a statement is limited and you get a dump when it gets too long. In this case you are forced to use FOR ALL ENTRIES IN
despite the performance implications.
With in-memory database technologies, it s best if you can finish all data and calculations on the database side with JOINs and database aggregation functions like SUM.
But if you can t, at least try to avoid accessing database in LOOPs. Also avoid reading the database without using indexes, of course.
What to look for in performance analyzer in VS 2008 I am using VS Team system and got the performance wizard and reports going. What benchmarks/process do I use? There is a lot of stuff in the ...
We have a number of items coming in from a web service; each item containing an unknown number of properties. We are storing them in a database with the following Schema. Items - ItemID - ...
Are there some noticeable outcomes in terms of performance or other aspects to follow semantic HTML?
I m using Visual Studio 2008 (with the latest service pack) I also have ReSharper 4.5 installed. ReSharper Code analysis/ scan is turned off. OS: Windows 7 Enterprise Edition It takes me a long time ...
As a learning experience I recently tried implementing Quicksort with 3 way partitioning in C#. Apart from needing to add an extra range check on the left/right variables before the recursive call, ...
I ve tried searching for this but it s pretty difficult to put into words. Basically, our site will run fine for most users without any issues. Sometimes though, those of us who use the site pretty ...
Python s paster serve app.ini is taking longer than I would like to be ready for the first request. I know how to profile requests with middleware, but how do I profile the initialization time? I ...
So I m running perl 5.10 on a core 2 duo macbook pro compiled with threading support: usethreads=define, useithreads=define. I ve got a simple script to read 4 gzipped files containing aroud 750000 ...