Mendip Data Systems

Database applications for businesses and schools

LogoTransparent

Speed Comparison Tests                                                  Page 7 (of 8)

Screenshots

Click any image to view a larger version ...

1 2 3 Return To Top Page 7 of 7 Return to Access Articles 4 5 6

Click to download:  

 

                                                                                                                 

7.   Check Record Exists                                            Updated 27/02/2019

These tests compare the time required to check the existence of a specified record in a large data table using 4 different methods. This was partly done as a follow up to an example database uploaded at Utter Access forum by UA member, cheekybuddha, in this thread VBA - Search Value In Table 

 

There are two similar versions of this set of speed tests.

The main difference is the search field is NOT INDEXED in one and INDEXED in the other.

 

In a real world application, fields being searched regularly should ALWAYS be INDEXED.

The index will increase file size but dramatically reduce search times. Access looks up the location of the data in the index so it can be retrieved very quickly in a query or SQL statement.

 

INDEXING makes all searches much faster but there are some other interesting differences in the results for each version.

 

In each of these tests, a 'reference' table tblSource containing 10,000 different UK postcodes (deliberately kept small to reduce file size) is used to populate the test table tblData which is initially empty.

 

In order to get a large data table, those records are appended repeatedly

For example, 100 batches (default) of 10,000 records to give a total 1,000,000records in tblData.

 

One RANDOM record is then replaced by a 'dummy' postcode 'XM4 5HQ' used in the record check.

For info, this postcode is used to sort letters addressed to Santa Claus!!!

 

Once the data table has been populated, click the Run button to start running the tests.

 

The recordset is then looped through repeatedly, exiting each loop as soon as the search postcode is found, and the total time required is measured.

 

 

NOTE:

The applications have been set to compact automatically when closed to reduce file 'bloat'.

This does carry a small risk of corruption so you may wish to create a backup copy or disable automatic compacting.

 

However, I haven’t experienced any issues during repeated testing with these applications.

SpeedTest9b-NotIndexed0 -StartForm

The 4 methods being compared are :

a)   DLookup

b)   DCount

c)   SQL Count(*)

d)   SQL SELECT

 

The code used for each test is as follows:

a)  DLookup

 

 For Q = 1 To LC 'loop count

       'look for matching record

        blnRet = Not IsNull(DLookup("ID", "tblData", "Postcode = 'XM4 5HQ'"))

 

        If blnRet Then GoTo NextStep

        If blnCancel Then GoTo EndRoutine

 

 NextStep:

       blnRet = False

 Next Q

b)   DCount

 

 For Q = 1 To LC 'loop count

       'check for non-zero count

       blnRet = DCount("ID", "tblData", "Postcode = 'XM4 5HQ'") > 0

       

       If blnRet Then GoTo NextStep

       If blnCancel Then GoTo EndRoutine

       

 NextStep:

       blnRet = False

 Next Q

c)   SQL Count(*)

 

 For Q = 1 To LC 'loop count

      'check for non-zero count

      With CurrentDb.OpenRecordset("SELECT COUNT(*) FROM tblData

            WHERE Postcode = 'XM4 5HQ';")

            blnRet = .Fields(0) > 0

           .Close

       End With

       

       If blnRet Then GoTo NextStep

       If blnCancel Then GoTo EndRoutine

 NextStep:

        blnRet = False

 Next Q

d)   SQL SELECT

 

 For Q = 1 To LC 'loop count

     'check for non-zero record count

      With CurrentDb.OpenRecordset("SELECT ID FROM tblData

            WHERE Postcode = 'XM4 5HQ';")

            blnRet = .RecordCount > 0

           .Close

      End With

 

     If blnRet Then GoTo NextStep

     If blnCancel Then GoTo EndRoutine

 

 NextStep:

     blnRet = False

 Next Q

1.   Search field NOT Indexed

The search field (Postcode) is NOT INDEXED which makes the search time much SLOWER.

 

As the position of the search postcode record is randomly selected, its position affects the speed of certain tests

 

Each of the tests measure the time needed to detect the existence of the postcode ‘XM4 5HQ’.

 

Each test is then repeated by looping through the code multiple times

The number of loops used can be varied with default=100.

So, in this case 1,000,000 records are looped through 100 times i.e. 100,000,000 records.

 

The position of the random record can be varied by clicking the 'Randomise Selection' button.

In the first example, the random record ID = 61600 – close to the start of the recordset.

SpeedTest9b-NotIndexed1-LowID

In this case, DLookup and SQL SELECT are very fast with the DCount & SQL Count(*) MUCH slower.

This isn’t surprising as the two Count tests need to check the entire file.

 

 

In the second example, the ID is 551357 – near the middle of the file.

As a result, it takes longer to detect the record using DLookup or SQL SELECT.

However, the two Count test results are almost the same as before

SpeedTest9b-NotIndexed3-HighID SpeedTest9b-NotIndexed4-2M_LowID SpeedTest9b-NotIndexed5-2M_VeryHighID

In the third example, the ID is 983734 – close to the end of the file

As a result, the times for the 4 tests are very similar but the two Count tests are still slower

Next the number of records in the data table were doubled to 2 million

Here are the results where the search record was close to the start of the recordset.

Once again where the ID value is near the end of the recordset

Here are the full results for 1 million records together with the average times

SpeedTest9b-NotIndexed6-1M_Results SpeedTest9b-NotIndexed7-1M_ResultsSummary

Here are the full results for the larger recordset of 2 million records

SpeedTest9b-NotIndexed9-2M_ResultsSummary SpeedTest9b-NotIndexed8-2M_Results

To a certain extent, the most effective method depends on the position of the record being checked (which will of course be unknown!).

 

Overall, using DLookup or SQL SELECT give very similar outcomes to each other, as do DCount or SQLCount(*).  However, the two Count methods are ALWAYS SLOWER for fields that are NOT INDEXED

2.   Indexed Search Field

For comparison, the first test is IDENTICAL to that used in the original non-indexed example.

i.e. 1 million records in tblData and 100 loops in the speed tests.

The results are shown below left (together with the equivalent results from the first test on the right)

SpeedTest9b-NotIndexed3-HighID

INDEXED

NOT INDEXED

All TESTS are VERY FAST – approximately 250x faster due to the INDEXING

 

The tests were then repeated using 1000 loops instead of 100:

SpeedTest9a-Indexed2-1M-1K loops SpeedTest9a-Indexed3-1M-10K loops

And once again using 10,000 loops of 1 million records

i.e. a total of 10 thousand million records checked... in around 10 seconds!

In each case, all results are similar to each other.

However, the two domain functions are, perhaps surprisingly, slightly faster.

 

The times are almost identical because Access is searching the saved INDEXES so it doesn’t need to search the entire file whichever method is used. The time to check 1 million records is about 0.09 seconds!

 

Here is a summary of the results.

The chart scale makes the differences seem more significant than is really the case

SpeedTest9a-Indexed4-1M-Results SpeedTest9a-Indexed5-1M-ResultSummary

The tests were then run once more using an even larger dataset of 10 million records.

SpeedTest9b-NotIndexed2-MidID SpeedTest9a-Indexed1-1M-100 loops SpeedTest9a-Indexed7-10M-ResultSummary SpeedTest9a-Indexed6-10M-Results SpeedTest9a-Indexed6-10M-Results

The outcomes for 10 million records are effectively IDENTICAL to those with 1 million records

- approx 10 seconds in each case.

 

The benefits of indexing become even more apparent the larger the dataset used.

Conversely, if your tables have only a few records, indexing provides minimal benefit

 

Apart from increased file size, there is another disadvantage of indexing.

Running action queries on an indexed field will take significantly longer (up to 50% in tests).

This is because the index needs to be updated as well as the individual records

 

It matters very little which method is used to check if a record exists in an indexed field as there is very little variation between the results. However, the two domain function are marginally faster

 

 

NOTE:

Creating huge datasets takes an EXTREMELY long time and significantly increases file size.

This 10 million recordset took about 40 minutes to complete. The file became almost 760 MB bigger

 

If you try to create a very large dataset, you will be warned about the implications.

SpeedTest9-Warning2 SpeedTest9-Warning1

As the Access file size limit is 2GB, the MAXIMIMUM possible number of records is around 30 million.

I tested this and the file was well over 1.9 GB before compacting.

 

To prevent the risk of hitting the file size limit and consequent datafile corruption, the application will not allow you to add more than 25,000,000 records.

If you really must try the largest dataset allowed, you need be aware that it will take SEVERAL hours to create that many records.

7

Speed Comparison tests 9.2A - INDEXED                   Approx 1.8 MB (zipped)

 

Speed Comparison Tests 9.2B - NOT INDEXED           Approx 1.8 MB (zipped)  

 

Check Record Exists     (PDF version of this article)                                                                        

8

To provide feedback on this article, please enter your name, e-mail address and company (optional) below, add your comment and press the submit button. Thanks

* Required