// oops, there was an error
DisplayErrorPage($cl->GetLastError());
} else {
// everything was good DisplaySearchResult($result);
}
For the sake of completeness, let’s also see how you would group matches by year and compute certain statistics using the native API:
$cl = new SphinxClient();
$cl->SetFilterRange("year_published", 1990, 1999);
$cl->SetSortMode(SPH_SORT_EXTENDED, "price DESC");
$cl->SetGroupBy("year_published", SPH_GROUPBY_ATTR);
$cl->SetSelect("*, MIN(price) AS minprice,
MAX(price) AS maxprice, AVG(price) AS avgprice");
$result = $cl->Query("mysql", "booksindex");
You can see that as we add more processing to the query, the code starts to look more and more like SQL. But we build the query from particular pieces rather than express it as a single statement. In fact, sometimes this might be even handier to program than SQL, which often leaves you building the SQL statement string from pieces. However, sometimes it’s not so handy, and so Sphinx also provides an SQL interface.
Using SphinxQL
Sphinx’s SQL interface actually has two parts. First, searchd supports the MySQL wire protocol, meaning that you can use any existing MySQL client to talk to searchd.
MySQL protocol support can be enabled using the listen directive in the configuration file, as shown earlier.
For a start, ye olde command-line MySQL client works nicely:
$ mysql -h 127.0.0.1 -P 9306
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 1 Server version: 0.9.9-dev (r1734)
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql> SELECT * FROM test1 WHERE MATCH('test') -> ORDER BY group_id ASC OPTION ranker=bm25;
+---+---+---+---+
| id | weight | group_id | date_added | +---+---+---+---+
| 4 | 1442 | 2 | 1231721236 |
| 2 | 2421 | 123 | 1231721236 |
| 1 | 2421 | 456 | 1231721236 | +---+---+---+---+
3 rows in set (0.00 sec)
Note that in the sample just shown, the mysqld server does not have to be actually running or even installed. The protocol implementation is entirely Sphinx-side and does not depend on MySQL in any way. You don’t even need MySQL client libraries for this—they enable indexer to talk to MySQL, but searchd does not use them at all.
Second, queries sent over this wire protocol are expected to be in so-called SphinxQL, which is our own implementation of SQL syntax. SphinxQL aims to be compatible with MySQL where possible, but adds some extensions of its own to make Sphinx-specific features accessible through the SQL interface, too. SELECT statements are almost identical to MySQL, and the grouping example from the previous section can be expressed as follows:
SELECT *, MIN(price) AS minprice, MAX(price) AS maxprice, AVG(price) AS avgprice FROM booksindex
WHERE MATCH('mysql') AND year_published BETWEEN 1990 AND 1999 GROUP BY year_published
ORDER BY price DESC
Supported statements that manipulate data are SELECT, INSERT, REPLACE, and DELETE. Two more statements used on a daily basis are SHOW META and SHOW WARNINGS, which return extra information and a list of warnings associated with the last executed search query, respectively. A number of other statements are supported; refer to the bundled documentation for the complete and most up-to-date list.
The everyday workhorse is, of course, SELECT. It generally mimics regular SQL syntax (in its MySQL dialect), but handles some things differently (it has to) and adds Sphinx-specific extensions. The most important differences follow.
The “table” list
In regular SQL, a comma-separated list of the tables triggers a join, but the list of full-text indexes in SphinxQL is more like a union: it means that all the listed indexes should be searched and matches should be combined together.
/* SphinxQL dialect syntax */
SELECT *
FROM index1, index2, index3
WHERE MATCH('mysql') ORDER BY price DESC /* Equivalent fully compliant SQL syntax */
SELECT *
FROM ( SELECT * FROM index1 UNION SELECT * FROM index2
Accessing searchd | 35
UNION SELECT * FROM index3 ) WHERE MATCH('mysql') ORDER BY price DESC
Sphinx does not support joins on full-text search results, but does support searches through multiple indexes. So the shorter SphinxQL form improves clarity without conflicting with anything else.
LIMIT clause
The difference in LIMIT clause handling is that Sphinx, because of its internal design decisions, always enforces some limit on the result set. It defaults to LIMIT 0,20. OPTION extension
This clause lets you control a number of query-time settings that are custom to Sphinx—namely, you can choose a full-text match ranking function; set thresholds for query time and found matches; set a retry count and delay interval for distributed querying; and so on.
WITHIN GROUP ORDER BY extension
This clause lets you control which row will be selected to represent the group returned in the result set when using GROUP BY. This is something that regular SQL does not have. A SphinxQL query that groups books by year and returns the most popular book within a given year could look like this:
SELECT * FROM books GROUP BY year_published ORDER BY year_published DESC
WITHIN GROUP ORDER BY sales_count DESC
Finally, native API result sets contain not just matching rows, but also certain meta-information such as elapsed query time on the server side, number of rows found, and per-keyword statistics. These can’t be returned with a normal SQL result set, which carries only matches. Using the SQL API, the meta-information can be accessed using the SHOW META statement following the query you want to check:
mysql> SELECT * FROM test1 WHERE MATCH('test|one|two');
+---+---+---+---+
| hits[0] | 5 |
| keyword[1] | one |
| docs[1] | 1 |
| hits[1] | 2 |
| keyword[2] | two |
| docs[2] | 1 |
| hits[2] | 2 | +---+---+
12 rows in set (0.00 sec)