Pubblicato
2010-04-27 18:56
It is said that optimizing too early is the root of all evil. However it is not so easy to say when is the right time.
Looking at CiviCRM performance there are a number of instances where even on medium sized installations search queries take a long time to execute.
One of the searches that caught my eye is the AJAX search at the top left in the menu bar.
Returning a maximum of ten entries from a medium sized database (~50k records) should take negligible time and on the CiviCRM test data this request was taking around 3 seconds (putting full load on server).
The culprit query from CRM/Contact/Page/AJAX.php is:
$query = "
SELECT DISTINCT(cc.id) as id, CONCAT_WS( ' :: ', {$select} ) as data
FROM civicrm_contact cc {$from}
{$aclFrom}
{$additionalFrom}
{$whereClause}
ORDER BY sort_name
LIMIT 0, {$limit}
";
First of all, MySQL does not have DISTINCT(field) statement that returns rows distinct only in value of a single field.
Still MySQL does not give errors on
SELECT DISTINCT(cc.id) as id, CONCAT_WS( ' :: ', {$select} ) as data
but silently interprets the above as
SELECT DISTINCT cc.id as id, CONCAT_WS( ' :: ', {$select} ) as data
which has a different meaning entirely (and can actually return non-distinct values in cc.id if the data is different).
So to achieve what the initial version aims for (in MySQL), the following syntax works
SELECT cc.id as id, CONCAT_WS( ' :: ', {$select} ) as data
FROM ...
GROUP BY cc.id
This will return distinct values in cc.id and, if there are multiple records per cc.id, random records from the rest of the columns (note: MySQL extends SQL standard in such a way that it is allowed to select columns that are neither aggregated nor listed in ORDER BY clause).
Another idea for improving the performance of the above query comes from the interactive nature of this function and the fact that ORDER BY really puts heavy load on the server.
In this scenario, While we are typing the start of the search term, first two or three letters, do we really care if the results are the first ten entries, strictly sorted, that satisfy the search criteria? Or is it enough for the user to see the full search drop-down to know that it is necessary to refine the search.
The general consensus was that the later was true, especially given the fact that the query written that way reduces the load on the server by several orders of magnitude (main query in the test environment went from 1.43 s to 0.02 s).
The final query now looks like this:
SELECT id, data
FROM (
SELECT cc.id as id, CONCAT_WS( ' :: ', {$select} ) as data, sort_name
FROM civicrm_contact cc {$from}
{$aclFrom}
{$additionalFrom}
{$whereClause}
LIMIT 0, {$limit}
) t
ORDER BY sort_name
and significantly (tangibly) improves the responsiveness of the AJAX search function.
Filed under
Comments
It isn't always obvious that there are more than the shown results that match. So perhaps the limit should be eleven instead of ten, and then show "more..." in place of the eleventh entry.
I know what you mean. Still, this is an AJAX drop-down. The general agreement was that people would rather refine (type more chars or retype search term) than conclude that a contact does not exists by carefully scrolling through the drop-down list. Do you feel show "more..." is a must?
Hi,
We went a bit further and now we are trying a different approach:
- we search on query% instead of %query%
- we do 3 searches, on sort_name=query%, first_name=query% and email=query%
- and concatenate the results sort, first, emails
Works quite better, for instance, searching "ma" will find
Macadam ltd info@macadam.com (org name)
Malarme, John (last name)
Dupond, Maurice mdupond@froggy.com (first name)
Mitch Alan malan@yahoo.com (email)
but will leave out all the xxx@gmail.com, cause the "ma" is in the middle
X+
P.S. Actually, it would have leaved it out anyway, because it wasn't searching on the email, but get my point
$config->includeWildCardInName
if that is true we should search for %query% else we can search for query%
we did something similar to the above in a previous release (1.x) and most folks did not like it.
lobo
Hi,
They didn't like it because you only searched on the sort_name, hence you couldn't find a contact from his first name only for instance.
(I'm talking about the autocomplete, not the search in general).
X+