Grammar
Commands
Other Grammar
SELECT
selectStatement |
|
|
|
|
|
Selects data from one or more tables. UNION ALL
combines rows from multiple select statements. ORDER BY
sorts the result based on the given expressions. LIMIT
(or FETCH FIRST
) limits the number of rows returned by the query with no limit applied if unspecified or specified as null or less than zero. The LIMIT
(or FETCH FIRST
) clause is executed after the ORDER BY
clause to support top-N type queries. OFFSET
clause skips that many rows before beginning to return rows. An optional hint may be used to override decisions made by the query optimizer.
Example:
SELECT * FROM TEST LIMIT 1000;
SELECT * FROM TEST LIMIT 1000 OFFSET 100;
SELECT full_name FROM SALES_PERSON WHERE ranking >= 5.0
UNION ALL SELECT reviewer_name FROM CUSTOMER_REVIEW WHERE score >= 8.0
UPSERT VALUES
UPSERT INTO tableName |
| VALUES ( constantTerm |
| ) |
|
Inserts if not present and updates otherwise the value in the table. The list of columns is optional and if not present, the values will map to the column in the order they are declared in the schema. The values must evaluate to constants.
Use the ON DUPLICATE KEY
clause (available in Phoenix 4.9) if you need the UPSERT
to be atomic. Performance will be slower in this case as the row needs to be read on the server side when the commit is done. Use IGNORE
if you do not want the UPSERT
performed if the row already exists. Otherwise, with UPDATE
, the expression will be evaluated and the result used to set the column, for example to perform an atomic increment. An UPSERT
to the same row in the same commit batch will be processed in the order of execution.
Example:
UPSERT INTO TEST VALUES('foo','bar',3);
UPSERT INTO TEST(NAME,ID) VALUES('foo',123);
UPSERT INTO TEST(ID, COUNTER) VALUES(123, 0) ON DUPLICATE KEY UPDATE COUNTER = COUNTER + 1;
UPSERT INTO TEST(ID, MY_COL) VALUES(123, 0) ON DUPLICATE KEY IGNORE;
UPSERT SELECT
Inserts if not present and updates otherwise rows in the table based on the results of running another query. The values are set based on their matching position between the source and target tables. The list of columns is optional and if not present will map to the column in the order they are declared in the schema. If auto commit is on, and both a) the target table matches the source table, and b) the select performs no aggregation, then the population of the target table will be done completely on the server-side (with constraint violations logged, but otherwise ignored). Otherwise, data is buffered on the client and, if auto commit is on, committed in row batches as specified by the UpsertBatchSize connection property (or the phoenix.mutate.upsertBatchSize HBase
config property which defaults to 10000 rows)
Example:
UPSERT INTO test.targetTable(col1, col2) SELECT col3, col4 FROM test.sourceTable WHERE col5 < 100
UPSERT INTO foo SELECT * FROM bar;
DELETE
Deletes the rows selected by the where clause. If auto commit is on, the deletion is performed completely server-side.
Example:
DELETE FROM TEST;
DELETE FROM TEST WHERE ID=123;
DELETE FROM TEST WHERE NAME LIKE 'foo%';
DECLARE CURSOR
DECLARE CURSOR cursorName FOR selectStatement |
Creates a cursor for the select statement
Example:
DECLARE CURSOR TEST_CURSOR FOR SELECT * FROM TEST_TABLE
OPEN CURSOR
OPEN CURSOR cursorName |
Opens already declared cursor to perform FETCH
operations
Example:
OPEN CURSOR TEST_CURSOR
FETCH NEXT
FETCH NEXT |
| FROM cursorName |
Retrieves next or next n rows from already opened cursor
Example:
FETCH NEXT FROM TEST_CURSOR
FETCH NEXT 10 ROWS FROM TEST_CURSOR
CLOSE
CLOSE cursorName |
Closes an already open cursor
Example:
CLOSE TEST_CURSOR
CREATE TABLE
CREATE TABLE |
| tableRef |
( columnDef |
|
| ) |
|
|
Creates a new table. The HBase
table and any column families referenced are created if they don't already exist. All table, column family and column names are uppercased unless they are double quoted in which case they are case sensitive. Column families that exist in the HBase
table but are not listed are ignored. At create time, to improve query performance, an empty key value is added to the first column family of any existing rows or the default column family if no column families are explicitly defined. Upserts will also add this empty key value. This improves query performance by having a key value column we can guarantee always being there and thus minimizing the amount of data that must be projected and subsequently returned back to the client. HBase
table and column configuration options may be passed through as key/value pairs to configure the HBase
table as desired. Note that when using the IF NOT EXISTS
clause, if a table already exists, then no change will be made to it. Additionally, no validation is done to check whether the existing table metadata matches the proposed table metadata. so it's better to use DROP TABLE
followed by CREATE TABLE
is the table metadata may be changing.
Example:
CREATE TABLE my_schema.my_table ( id BIGINT not null primary key, date Date)
CREATE TABLE my_table ( id INTEGER not null primary key desc, date DATE not null,
m.db_utilization DECIMAL, i.db_utilization)
m.DATA_BLOCK_ENCODING='DIFF'
CREATE TABLE stats.prod_metrics ( host char(50) not null, created_date date not null,
txn_count bigint CONSTRAINT pk PRIMARY KEY (host, created_date) )
CREATE TABLE IF NOT EXISTS "my_case_sensitive_table"
( "id" char(10) not null primary key, "value" integer)
DATA_BLOCK_ENCODING='NONE',VERSIONS=5,MAX_FILESIZE=2000000 split on (?, ?, ?)
CREATE TABLE IF NOT EXISTS my_schema.my_table (
org_id CHAR(15), entity_id CHAR(15), payload binary(1000),
CONSTRAINT pk PRIMARY KEY (org_id, entity_id) )
TTL=86400
DROP TABLE
DROP TABLE |
| tableRef |
|
Drops a table. The optional CASCADE
keyword causes any views on the table to be dropped as well. When dropping a table, by default the underlying HBase
data and index tables are dropped. The phoenix.schema.dropMetaData may be used to override this and keep the HBase
table for point-in-time queries.
Example:
DROP TABLE my_schema.my_table;
DROP TABLE IF EXISTS my_table;
DROP TABLE my_schema.my_table CASCADE;
CREATE FUNCTION
CREATE |
| FUNCTION funcName |
( |
| ) |
RETURNS dataType AS className |
|
Creates a new function. The function name is uppercased unless they are double quoted in which case they are case sensitive. The function accepts zero or more arguments. The class name and jar path should be in single quotes. The jar path is optional and if not specified then the class name will be loaded from the jars present in directory configured for hbase.dynamic.jars.dir.
Example:
CREATE FUNCTION my_reverse(varchar) returns varchar as 'com.mypackage.MyReverseFunction' using jar 'hdfs:/localhost:8080/hbase/lib/myjar.jar'
CREATE FUNCTION my_reverse(varchar) returns varchar as 'com.mypackage.MyReverseFunction'
CREATE FUNCTION my_increment(integer, integer constant defaultvalue='10') returns integer as 'com.mypackage.MyIncrementFunction' using jar '/hbase/lib/myincrement.jar'
CREATE TEMPORARY FUNCTION my_reverse(varchar) returns varchar as 'com.mypackage.MyReverseFunction' using jar 'hdfs:/localhost:8080/hbase/lib/myjar.jar'
DROP FUNCTION
DROP FUNCTION |
| funcName |
Drops a function.
Example:
DROP FUNCTION IF EXISTS my_reverse
DROP FUNCTION my_reverse
CREATE VIEW
CREATE VIEW |
| newTableRef |
|
|
tableOptions |
Creates a new view over an existing HBase
or Phoenix table. As expected, the WHERE
expression is always automatically applied to any query run against the view. As with CREATE TABLE
, the table, column family, and column names are uppercased unless they are double quoted. The newTableRef may refer directly to an HBase
table, in which case, the table, column family, and column names must match the existing metadata exactly or an exception will occur. When a view is mapped directly to an HBase
table, no empty key value will be added to rows and the view will be read-only. A view will be updatable (i.e. referenceable in a DML
statement such as UPSERT
or DELETE
) if its WHERE
clause expression contains only simple equality expressions separated by ANDs
. Updatable views are not required to set the columns which appear in the equality expressions, as the equality expressions define the default values for those columns. If they are set, then they must match the value used in the WHERE
clause, or an error will occur. All columns from the existingTableRef are included as columns in the new view as are columns defined in the columnDef list. An ALTER VIEW
statement may be issued against a view to remove or add columns, however, no changes may be made to the primary key constraint. In addition, columns referenced in the WHERE
clause are not allowed to be removed. Once a view is created for a table, that table may no longer altered or dropped until all of its views have been dropped.
Example:
CREATE VIEW "my_hbase_table"
( k VARCHAR primary key, "v" UNSIGNED_LONG) default_column_family='a';
CREATE VIEW my_view ( new_col SMALLINT )
AS SELECT * FROM my_table WHERE k = 100;
CREATE VIEW my_view_on_view
AS SELECT * FROM my_view WHERE new_col > 70;
DROP VIEW
DROP VIEW |
| tableRef |
|
Drops a view. The optional CASCADE
keyword causes any views derived from the view to be dropped as well. When dropping a view, the actual table data is not affected. However, index data for the view will be deleted.
Example:
DROP VIEW my_view
DROP VIEW IF EXISTS my_schema.my_view
DROP VIEW IF EXISTS my_schema.my_view CASCADE
CREATE SEQUENCE
CREATE SEQUENCE |
| sequenceRef |
|
|
|
|
|
|
Creates a monotonically increasing sequence. START
controls the initial sequence value while INCREMENT
controls by how much the sequence is incremented after each call to NEXT VALUE FOR
. By default, the sequence will start with 1 and be incremented by 1. Specify CYCLE
to indicate that the sequence should continue to generate values after reaching either its MINVALUE
or MAXVALUE
. After an ascending sequence reaches its MAXVALUE
, it generates its MINVALUE
. After a descending sequence reaches its MINVALUE
, it generates its MAXVALUE
. CACHE
controls how many sequence values will be reserved from the server, cached on the client, and doled out as need by subsequent NEXT VALUE FOR
calls for that client connection to the cluster to save on RPC
calls. If not specified, the phoenix.sequence.cacheSize config parameter defaulting to 100 will be used for the CACHE
value.
Example:
CREATE SEQUENCE my_sequence;
CREATE SEQUENCE my_sequence START WITH -1000
CREATE SEQUENCE my_sequence INCREMENT BY 10
CREATE SEQUENCE my_schema.my_sequence START 0 CACHE 10
DROP SEQUENCE
DROP SEQUENCE |
| sequenceRef |
Drops a sequence.
Example:
DROP SEQUENCE my_sequence
DROP SEQUENCE IF EXISTS my_schema.my_sequence
ALTER
ALTER |
| tableRef |
|
Alters an existing table by adding or removing columns or updating table options. When a column is dropped from a table, the data in that column is deleted as well. PK
columns may not be dropped, and only nullable PK
columns may be added. For a view, the data is not affected when a column is dropped. Note that creating or dropping columns only affects subsequent queries and data modifications. Snapshot queries that are connected at an earlier timestamp will still use the prior schema that was in place when the data was written.
Example:
ALTER TABLE my_schema.my_table ADD d.dept_id char(10) VERSIONS=10
ALTER TABLE my_table ADD dept_name char(50), parent_id char(15) null primary key
ALTER TABLE my_table DROP COLUMN d.dept_id, parent_id;
ALTER VIEW my_view DROP COLUMN new_col;
ALTER TABLE my_table SET IMMUTABLE_ROWS=true,DISABLE_WAL=true;
CREATE INDEX
CREATE |
| INDEX |
| indexName |
ON tableRef ( expression |
|
| ) |
|
ASYNC |
|
|
Creates a new secondary index on a table or view. The index will be automatically kept in sync with the table as the data changes. At query time, the optimizer will use the index if it contains all columns referenced in the query and produces the most efficient execution plan. If a table has rows that are write-once and append-only, then the table may set the IMMUTABLE_ROWS
property to true (either up-front in the CREATE TABLE
statement or afterwards in an ALTER TABLE
statement). This reduces the overhead at write time to maintain the index. Otherwise, if this property is not set on the table, then incremental index maintenance will be performed on the server side when the data changes. As of the 4.3 release, functional indexes are supported which allow arbitrary expressions rather than solely column names to be indexed. As of the 4.4.0 release, you can specify the ASYNC
keyword to create the index using a map reduce job.
Example:
CREATE INDEX my_idx ON sales.opportunity(last_updated_date DESC)
CREATE INDEX my_idx ON log.event(created_date DESC) INCLUDE (name, payload) SALT_BUCKETS=10
CREATE INDEX IF NOT EXISTS my_comp_idx ON server_metrics ( gc_time DESC, created_date DESC )
DATA_BLOCK_ENCODING='NONE',VERSIONS=?,MAX_FILESIZE=2000000 split on (?, ?, ?)
CREATE INDEX my_idx ON sales.opportunity(UPPER(contact_name))
DROP INDEX
Drops an index from a table. When dropping an index, the data in the index is deleted. Note that since metadata is versioned, snapshot queries connecting at an earlier time stamp may still use the index, as the HBase
table backing the index is not deleted.
Example:
DROP INDEX my_idx ON sales.opportunity
DROP INDEX IF EXISTS my_idx ON server_metrics
ALTER INDEX
Alters the state of an existing index. DISABLE
will cause the no further index maintenance to be performed on the index and it will no longer be considered for use in queries. REBUILD
will completely rebuild the index and upon completion will enable the index to be used in queries again. UNUSABLE
will cause the index to no longer be considered for use in queries, however index maintenance will continue to be performed. USABLE
will cause the index to again be considered for use in queries. Note that a disabled index must be rebuild and cannot be set as USABLE
.
Example:
ALTER INDEX my_idx ON sales.opportunity DISABLE
ALTER INDEX IF EXISTS my_idx ON server_metrics REBUILD
EXPLAIN
EXPLAIN |
|
Computes the logical steps necessary to execute the given command. Each step is represented as a string in a single column result set row.
Example:
EXPLAIN SELECT NAME, COUNT(*) FROM TEST GROUP BY NAME HAVING COUNT(*) > 2;
EXPLAIN SELECT entity_id FROM CORE.CUSTOM_ENTITY_DATA WHERE organization_id='00D300000000XHP' AND SUBSTR(entity_id,1,3) = '002' AND created_date < CURRENT_DATE()-1;
UPDATE STATISTICS
UPDATE STATISTICS tableRef |
|
|
Updates the statistics on the table and by default all of its associated index tables. To only update the table, use the COLUMNS
option and to only update the INDEX
, use the INDEX
option. The statistics for a single index may also be updated by using its full index name for the tableRef. The default guidepost properties may be overridden by specifying their values after the SET
keyword. Note that when a major compaction occurs, the default guidepost properties will be used again.
Example:
UPDATE STATISTICS my_table
UPDATE STATISTICS my_schema.my_table INDEX
UPDATE STATISTICS my_index
UPDATE STATISTICS my_table COLUMNS
UPDATE STATISTICS my_table SET phoenix.stats.guidepost.width=50000000
CREATE SCHEMA
CREATE SCHEMA |
| schemaName |
creates a schema and corresponding name-space in hbase. To enable namespace mapping, see https://phoenix.apache.org/tuning.html
User that execute this command should have admin permissions to create namespace in HBase
.
Example:
CREATE SCHEMA IF NOT EXISTS my_schema
CREATE SCHEMA my_schema
USE
USE |
|
Sets a default schema for the connection and is used as a target schema for all statements issued from the connection that do not specify schema name explicitly. USE DEFAULT
unset the schema for the connection so that no schema will be used for the statements issued from the connection.
schemaName should already be existed for the USE SCHEMA
statement to succeed. see CREATE SCHEMA
for creating schema.
Example:
USE my_schema
USE DEFAULT
DROP SCHEMA
DROP SCHEMA |
| schemaName |
Drops a schema and corresponding name-space from hbase. To enable namespace mapping, see https://phoenix.apache.org/tuning.html
This statement succeed only when schema doesn't hold any tables.
Example:
DROP SCHEMA IF EXISTS my_schema
DROP SCHEMA my_schema
GRANT
GRANT permissionString |
| TO |
| userString |
Grant permissions at table, schema or user level. Permissions are managed by HBase
in hbase:acl table, hence access controls need to be enabled. This feature will be available from Phoenix 4.14 version onwards.
Possible permissions are R - Read, W - Write, X - Execute, C - Create and A - Admin. To enable/disable access controls, see https://hbase.apache.org/book.html#hbase.accesscontrol.configuration
Permissions should be granted on base tables. It will be propagated to all its indexes and views. Group permissions are applicable to all users in the group and schema permissions are applicable to all tables with that schema. Grant statements without table/schema specified are assigned at GLOBAL
level.
Phoenix doesn't expose Execute('X') functionality to end users. However, it is required for mutable tables with secondary indexes.
Important Note:
Every user requires 'RX
' permissions on all Phoenix SYSTEM
tables in order to work correctly. Users also require 'RWX
' permissions on SYSTEM.SEQUENCE
table for using SEQUENCES
.
Example:
GRANT 'RXC' TO 'User1'
GRANT 'RWXC' TO GROUP 'Group1'
GRANT 'A' ON Table1 TO 'User2'
GRANT 'RWX' ON my_schema.my_table TO 'User2'
GRANT 'A' ON SCHEMA my_schema TO 'User3'
REVOKE
REVOKE |
| FROM |
| userString |
Revoke permissions at table, schema or user level. Permissions are managed by HBase
in hbase:acl table, hence access controls need to be enabled. This feature will be available from Phoenix 4.14 version onwards.
To enable/disable access controls, see https://hbase.apache.org/book.html#hbase.accesscontrol.configuration
Group permissions are applicable to all users in the group and schema permissions are applicable to all tables with that schema. Permissions should be revoked on base tables. It will be propagated to all its indexes and views. Revoke statements without table/schema specified are assigned at GLOBAL
level.
Revoke removes all the permissions at that level.
Important Note:
Revoke permissions needs to be exactly at the same level as permissions assigned via Grant permissions statement. Level refers to table, schema or user. Revoking any of 'RX
' permissions on any Phoenix SYSTEM
tables will cause exceptions. Revoking any of 'RWX
' permissions on SYSTEM.SEQUENCE
will cause exceptions while accessing sequences.
The examples below are for revoking permissions granted using the examples from GRANT
statement above.
Example:
REVOKE FROM 'User1'
REVOKE FROM GROUP 'Group1'
REVOKE ON Table1 FROM 'User2'
REVOKE ON my_schema.my_table FROM 'User2'
REVOKE ON SCHEMA my_schema FROM 'User3'
Constraint
CONSTRAINT constraintName PRIMARY KEY ( columnName |
|
|
| ) |
Defines a multi-part primary key constraint. Each column may be declared to be sorted in ascending or descending ordering. The default is ascending. One primary key column can also be designated as ROW_TIMESTAMP
provided it is of one of the types: BIGINT, UNSIGNED_LONG, DATE, TIME
and TIMESTAMP
.
Example:
CONSTRAINT my_pk PRIMARY KEY (host,created_date)
CONSTRAINT my_pk PRIMARY KEY (host ASC,created_date DESC)
CONSTRAINT my_pk PRIMARY KEY (host ASC,created_date DESC ROW_TIMESTAMP)
Options
| name = value |
|
Sets a built-in Phoenix table property or an HBase
table or column descriptor metadata attribute. The name is case insensitive. If the name is a known HColumnDescriptor
attribute, then the value is applied to the specified column family or, if omitted, to all column families. Otherwise, the HBase
metadata attribute value is applied to the HTableDescriptor
. Note that no validation is performed on the property name or value, so unknown or misspelled options will end up as adhoc metadata attributes values on the HBase
table.
Built-in Phoenix table options include:
SALT_BUCKETS
numeric property causes an extra byte to be transparently prepended to every row key to ensure an evenly distributed read and write load across all region servers. This is especially useful when your row key is always monotonically increasing and causing hot spotting on a single region server. However, even if it's not, it often improves performance by ensuring an even distribution of data across your cluster. The byte is determined by hashing the row key and modding it with the SALT_BUCKETS
value. The value may be from 0 to 256, with 0 being a special means of turning salting off for an index in which the data table is salted (since by default an index has the same number of salt buckets as its data table). If split points are not defined for the table, the table will automatically be pre-split at each possible salt bucket value. For more information, see http://phoenix.incubator.apache.org/salted.html
DISABLE_WAL
boolean option when true causes HBase
not to write data to the write-ahead-log, thus making updates faster at the expense of potentially losing data in the event of a region server failure. This option is useful when updating a table which is not the source-of-truth and thus making the lose of data acceptable.
IMMUTABLE_ROWS
boolean option when true declares that your table has rows which are write-once, append-only (i.e. the same row is never updated). With this option set, indexes added to the table are managed completely on the client-side, with no need to perform incremental index maintenance, thus improving performance. Deletes of rows in immutable tables are allowed with some restrictions if there are indexes on the table. Namely, the WHERE
clause may not filter on columns not contained by every index. Upserts are expected to never update an existing row (failure to follow this will result in invalid indexes). For more information, see http://phoenix.incubator.apache.org/secondary_indexing.html
MULTI_TENANT
boolean option when true enables views to be created over the table across different tenants. This option is useful to share the same physical HBase
table across many different tenants. For more information, see http://phoenix.incubator.apache.org/multi-tenancy.html
DEFAULT_COLUMN_FAMILY
string option determines the column family used used when none is specified. The value is case sensitive. If this option is not present, a column family name of '0' is used.
STORE_NULLS
boolean option (available as of Phoenix 4.3) determines whether or not null values should be explicitly stored in HBase
. This option is generally only useful if a table is configured to store multiple versions in order to facilitate doing flashback queries (i.e. queries to look at the state of a record in the past).
TRANSACTIONAL
option (available as of Phoenix 4.7) determines whether a table (and its secondary indexes) are tranactional. The default value is FALSE
, but may be overriden with the phoenix.table.istransactional.default property. A table may be altered to become transactional, but it cannot be transitioned back to be non transactional. For more information on transactions, see http://phoenix.apache.org/transactions.html
UPDATE_CACHE_FREQUENCY
option (available as of Phoenix 4.7) determines how often the server will be checked for meta data updates (for example, the addition or removal of a table column or the updates of table statistics). Possible values are ALWAYS
(the default), NEVER
, and a millisecond numeric value. An ALWAYS
value will cause the client to check with the server each time a statement is executed that references a table (or once per commit for an UPSERT VALUES
statement). A millisecond value indicates how long the client will hold on to its cached version of the metadata before checking back with the server for updates.
APPEND_ONLY_SCHEMA
boolean option (available as of Phoenix 4.8) when true declares that columns will only be added but never removed from a table. With this option set we can prevent the RPC
from the client to the server to fetch the table metadata when the client already has all columns declared in a CREATE TABLE
/VIEW IF NOT EXISTS
statement.
AUTO_PARTITION_SEQ
string option (available as of Phoenix 4.8) when set on a base table determines the sequence used to automatically generate a WHERE
clause with the first PK
column and the unique identifier from the sequence for child views. With this option set, we prevent allocating a sequence in the event that the view already exists.
The GUIDE_POSTS_WIDTH
option (available as of Phoenix 4.9) enables specifying a different guidepost width per table. The guidepost width determines the byte sized chunk of work over which a query will be parallelized. A value of 0 means that no guideposts should be collected for the table. A value of null removes any table specific guidepost setting, causing the global server-side phoenix.stats.guidepost.width config parameter to be used again. For more information, see the Statistics Collection page.
Example:
IMMUTABLE_ROWS=true
DEFAULT_COLUMN_FAMILY='a'
SALT_BUCKETS=10
DATA_BLOCK_ENCODING='NONE',a.VERSIONS=10
MAX_FILESIZE=2000000000,MEMSTORE_FLUSHSIZE=80000000
UPDATE_CACHE_FREQUENCY=300000
GUIDE_POSTS_WIDTH=30000000
CREATE SEQUENCE id;
CREATE TABLE base_table (partition_id INTEGER, val DOUBLE) AUTO_PARTITION_SEQ=id;
CREATE VIEW my_view AS SELECT * FROM base_table;
The view statement for my_view will be : WHERE partition_id = 1
Hint
|
|
An advanced features that overrides default query processing behavior for decisions such as whether to use a range scan versus skip scan and an index versus no index. Note that strict parsing is not done on hints. If hints are misspelled or invalid, they are silently ignored.
Example:
SKIP_SCAN,NO_INDEX
USE_SORT_MERGE_JOIN
NO_CACHE
INDEX(employee emp_name_idx emp_start_date_idx)
SMALL
Scan Hint
SKIP_SCAN | ||
RANGE_SCAN |
Use the SKIP_SCAN
hint to force a skip scan to be performed on the query when it otherwise would not be. This option may improve performance if a query does not include the leading primary key column, but does include other, very selective primary key columns.
Use the RANGE_SCAN
hint to force a range scan to be performed on the query. This option may improve performance if a query filters on a range for non selective leading primary key column along with other primary key columns
Example:
SKIP_SCAN
RANGE_SCAN
Cache Hint
NO_CACHE
Use the NO_CACHE
hint to prevent the results of the query from populating the HBase
block cache. This is useful in situation where you're doing a full table scan and know that it's unlikely that the rows being returned will be queried again.
Example:
NO_CACHE
Index Hint
INDEX | ||
NO_INDEX | ||
USE_INDEX_OVER_DATA_TABLE | ||
USE_DATA_OVER_INDEX_TABLE |
Use the INDEX
(<table_name> <index_name>...) to suggest which index to use for a given query. Double quotes may be used to surround a table_name and/or index_name to make them case sensitive. As of the 4.3 release, this will force an index to be used, even if it doesn't contain all referenced columns, by joining back to the data table to retrieve any columns not contained by the index.
Use the NO_INDEX
hint to force the data table to be used for a query.
Use the USE_INDEX_OVER_DATA_TABLE
hint to act as a tiebreaker for choosing the index table over the data table when all other criteria are equal. Note that this is the default optimizer decision.
Use the USE_DATA_OVER_INDEX_TABLE
hint to act as a tiebreaker for choosing the data table over the index table when all other criteria are equal.
Example:
INDEX(employee emp_name_idx emp_start_date_idx)
NO_INDEX
USE_INDEX_OVER_DATA_TABLE
USE_DATA_OVER_INDEX_TABLE
Small Hint
SMALL
Use the SMALL
hint to reduce the number of RPCs
done between the client and server when a query is executed. Generally, if the query is a point lookup or returns data that is likely in a single data block (64 KB by default), performance may improve when using this hint.
Example:
SMALL
Seek To Column Hint
SEEK_TO_COLUMN | ||
NO_SEEK_TO_COLUMN |
Use the SEEK_TO_COLUMN
hint to force the server to seek to navigate between columns instead of doing a next. If there are many versions of the same column value or if there are many columns between the columns that are projected, then this may be more efficient.
Use the NO_SEEK_TO_COLUMN
hint to force the server to do a next to navigate between columns instead of a seek. If there are few versions of the same column value or if the columns that are projected are adjacent to each other, then this may be more efficient.
Example:
SEEK_TO_COLUMN
NO_SEEK_TO_COLUMN
Join Hint
USE_SORT_MERGE_JOIN | ||
NO_STAR_JOIN | ||
NO_CHILD_PARENT_JOIN_OPTIMIZATION |
Use the USE_SORT_MERGE_JOIN
hint to force the optimizer to use a sort merge join instead of a broadcast hash join when both sides of the join are bigger than will fit in the server-side memory. Currently the optimizer will not make this determination itself, so this hint is required to override the default behavior of using a hash join.
Use the NO_STAR_JOIN
hint to prevent the optimizer from using the star join query to broadcast the results of the querying one common table to all region servers. This is useful when the results of the querying the one common table is too large and would likely be substantially filtered when joined against one or more of the other joined tables.
Use the NO_CHILD_PARENT_JOIN_OPTIMIZATION
hint to prevent the optimizer from doing point lookups between a child table (such as a secondary index) and a parent table (such as the data table) for a correlated subquery.
Example:
NO_STAR_JOIN
Serial Hint
SERIAL
Use the SERIAL
hint to force a query to be executed serially as opposed to being parallelized along the guideposts and region boundaries.
Example:
SERIAL
Column Def
columnRef dataType |
|
|
|
Define a new primary key column. The column name is case insensitive by default and case sensitive if double quoted. The sort order of a primary key may be ascending (ASC
) or descending (DESC
). The default is ascending. You may also specify a default value (Phoenix 4.9 or above) for the column with a constant expression. If the column is the only column that forms the primary key, then it can be designated as ROW_TIMESTAMP
column provided its data type is one of these: BIGINT, UNSIGNED_LONG, DATE, TIME
and TIMESTAMP
.
Example:
id char(15) not null primary key
key integer null
m.response_time bigint
created_date date not null primary key row_timestamp
key integer null
m.response_time bigint
Table Ref
| tableName |
References a table or view with an optional schema name qualifier
Example:
Sales.Contact
HR.Employee
Department
Sequence Ref
| sequenceName |
References a sequence with an optional schema name qualifier
Example:
my_id_generator
my_seq_schema.id_generator
Column Ref
| columnName |
References a column with an optional family name qualifier
Example:
e.salary
dept_name
Select Expression
* | ||||||||||||||||||
| ||||||||||||||||||
|
An expression in a SELECT
statement. All columns in a table may be selected using *, and all columns in a column family may be selected using <familyName>.*.
Example:
*
cf.*
ID AS VALUE
VALUE + 1 VALUE_PLUS_ONE
Select Statement
SELECT |
|
| selectExpression |
|
FROM tableSpec |
|
|
|
|
Selects data from a table. DISTINCT
filters out duplicate results while ALL
, the default, includes all results. FROM
identifies the table being queried. Columns may be dynamically defined in parenthesis after the table name and then used in the query. Joins are processed in reverse order through a broadcast hash join mechanism. For best performance, order tables from largest to smallest in terms of how many rows you expect to be used from each table. GROUP BY
groups the the result by the given expression(s). HAVING
filters rows after grouping. An optional hint may be used to override decisions made by the query optimizer.
Example:
SELECT * FROM TEST;
SELECT DISTINCT NAME FROM TEST;
SELECT ID, COUNT(1) FROM TEST GROUP BY ID;
SELECT NAME, SUM(VAL) FROM TEST GROUP BY NAME HAVING COUNT(1) > 2;
SELECT d.dept_id,e.dept_id,e.name FROM DEPT d JOIN EMPL e ON e.dept_id = d.dept_id;
Split Point
value | ||
bindParameter |
Defines a split point for a table. Use a bind parameter with preparedStatement.setBinary(int,byte[]) to supply arbitrary bytes.
Example:
'A'
Table Spec
aliasedTableRef | ||||||||||||||||||
|
An optionally aliased table reference, or an optionally aliased select statement in paranthesis.
Example:
PRODUCT_METRICS AS PM
PRODUCT_METRICS(referrer VARCHAR)
( SELECT feature FROM PRODUCT_METRICS ) AS PM
Aliased Table Ref
| tableName |
|
|
|
A reference to an optionally aliased table optionally followed by dynamic column definitions.
Example:
PRODUCT_METRICS AS PM
PRODUCT_METRICS(referrer VARCHAR)
PRODUCT_METRICS TABLESAMPLE (12.08)
Join Type
INNER | ||||||||||||||||
|
The type of join
Example:
INNER
LEFT OUTER
RIGHT
Func Argument
dataType |
|
|
|
|
The function argument is sql data type. It can be constant and also we can provide default,min and max values for the argument in single quotes.
Example:
VARCHAR
INTEGER DEFAULTVALUE='100'
INTEGER CONSTANT DEFAULTVALUE='10' MINVALUE='1' MAXVALUE='15'
Class Name
String
Canonical class name in single quotes.
Example:
'com.mypackage.MyReverseFunction'
Jar Path
String
Hdfs path of jar in single quotes.
Example:
'hdfs://localhost:8080:/hbase/lib/myjar.jar'
'/tmp/lib/myjar.jar'
Order
expression |
|
|
Sorts the result by an expression.
Example:
NAME DESC NULLS LAST
Expression
andCondition |
|
Value or condition.
Example:
ID=1 OR NAME='Hi'
And Condition
booleanCondition |
|
Condition separated by AND
.
Example:
FOO!='bar' AND ID=1
Boolean Condition
| condition |
Boolean condition.
Example:
ID=1 AND NAME='Hi'
Condition
operand |
|
Boolean value or condition. When comparing with LIKE
, the wildcards characters are _
(any one character) and %
(any characters). ILIKE
is the same, but the search is case insensitive. To search for the characters %
and _
, the characters need to be escaped. The escape character is \
(backslash). Patterns that end with an escape character are invalid and the expression returns NULL
. BETWEEN
does an inclusive comparison for both operands.
Example:
FOO = 'bar'
NAME LIKE 'Jo%'
IN (1, 2, 3)
NOT EXISTS (SELECT 1 FROM FOO WHERE BAR < 10)
N BETWEEN 1 and 100
RHS Operand
operand | ||||||||||||||||||
|
Right-hand side operand
Example:
s.my_col
ANY(my_col + 1)
ALL(select foo from bar where bas > 5)
Operand
summand |
|
A string concatenation.
Example:
'foo'|| s
Summand
factor |
|
An addition or subtraction of numeric or date type values
Example:
a + b
a - b
Factor
term |
|
A multiplication, division, or modulus of numeric type values.
Example:
c * d
e / 5
f % 10
Term
|
|
A term which may use subscript notation if it's an array.
Example:
'Hello'
23
my_array[my_index]
array_col[1]
Array Constructor
ARRAY [ expression |
| ] |
Constructs an ARRAY
out of the list of expressions.
Example:
ARRAY[1.0,2.2,3.3]
ARRAY['foo','bas']
ARRAY[col1,col2,col3+1,?]
Sequence
|
| FOR sequenceRef |
Gets the CURRENT
or NEXT
value for a sequence, a monotonically incrementing BIGINT
value. Each call to NEXT VALUE FOR
increments the sequence value and returns the current value. The NEXT
<n> VALUES
syntax may be used to reserve <n> consecutive sequence values. A sequence is only increment once for a given statement, so multiple references to the same sequence by NEXT VALUE FOR
produce the same value. Use CURRENT VALUE FOR
to access the last sequence allocated with NEXT VALUE FOR
for cluster connection of your client. If no NEXT VALUE FOR
had been previously called, an error will occur. These calls are only allowed in the SELECT
expressions or UPSERT VALUES
expressions.
Example:
NEXT VALUE FOR my_table_id
NEXT 5 VALUES FOR my_table_id
CURRENT VALUE FOR my_schema.my_id_generator
Cast
CAST ( expression AS dataType ) |
The CAST
operator coerces the given expression to a different dataType. This is useful, for example, to convert a BIGINT
or INTEGER
to a DECIMAL
or DOUBLE
to prevent truncation to a whole number during arithmetic operations. It is also useful to coerce from a more precise type to a less precise type since this type of coercion will not automatically occur, for example from a TIMESTAMP
to a DATE
. If the coercion is not possible, an error will occur.
Example:
CAST ( my_int AS DECIMAL )
CAST ( my_timestamp AS DATE )
Row Value Constructor
( expression , expression |
| ) |
A row value constructor is a list of other terms which are treated together as a kind of composite structure. They may be compared to each other or to other other terms. The main use case is 1) to enable efficiently stepping through a set of rows in support of query-more type functionality, or 2) to allow IN
clause to perform point gets on composite row keys.
Example:
(col1, col2, 5)
Bind Parameter
? | |||
|
A parameters can be indexed, for example :1
meaning the first parameter.
Example:
:1
?
Value
string | ||
numeric | ||
boolean | ||
null |
A literal value of any data type, or null.
Example:
10
Case
CASE term WHEN expression THEN term |
|
| END |
Returns the first expression where the value is equal to the test expression. If no else part is specified, return NULL
.
Example:
CASE CNT WHEN 0 THEN 'No' WHEN 1 THEN 'One' ELSE 'Some' END
Case When
CASE WHEN expression THEN term |
|
| END |
Returns the first expression where the condition is true. If no else part is specified, return NULL
.
Example:
CASE WHEN CNT<10 THEN 'Low' ELSE 'High' END
Name
| ||||||||||||||||||||||||
quotedName |
Unquoted names are not case sensitive. There is no maximum name length.
Example:
my_column
Quoted Name
" anything " |
Quoted names are case sensitive, and can contain spaces. There is no maximum name length. Two double quotes can be used to create a single double quote inside an identifier.
Example:
"first-name"
Alias
name
An alias is a name that is only valid in the context of the statement.
Example:
A
Null
NULL
NULL
is a value without data type and means 'unknown value'.
Example:
NULL
Data Type
|
|
A type name optionally declared as an array. An array is mapped to java.sql.Array
. Only single dimension arrays are supported and varbinary arrays are not allowed.
Example:
CHAR(15)
VARCHAR
DECIMAL(10,2)
DOUBLE
DATE
VARCHAR ARRAY
CHAR(10) ARRAY [5]
INTEGER []
SQL Data Type
charType | ||
varcharType | ||
decimalType | ||
tinyintType | ||
smallintType | ||
integerType | ||
bigintType | ||
floatType | ||
doubleType | ||
timestampType | ||
dateType | ||
timeType | ||
binaryType | ||
varbinaryType |
A standard SQL data type.
Example:
TINYINT
CHAR(15)
VARCHAR
VARCHAR(1000)
DECIMAL(10,2)
DOUBLE
INTEGER
BINARY(200)
DATE
HBase Data Type
unsignedTimestampType | ||
unsignedDateType | ||
unsignedTimeType | ||
unsignedTinyintType | ||
unsignedSmallintType | ||
unsignedIntType | ||
unsignedLongType | ||
unsignedFloatType | ||
unsignedDoubleType |
A type that maps to a native primitive HBase
value serialized through the Bytes.toBytes() utility methods. Only positive values are allowed.
Example:
UNSIGNED_INT
UNSIGNED_DATE
UNSIGNED_LONG
String
' anything ' |
A string starts and ends with a single quote. Two single quotes can be used to create a single quote inside a string.
Example:
'John''s car'
Boolean
TRUE | ||
FALSE |
A boolean value.
Example:
TRUE
Numeric
int | ||
long | ||
decimal |
The data type of a numeric value is always the lowest possible for the given value. If the number contains a dot this is decimal; otherwise it is int, long, or decimal (depending on the value).
Example:
SELECT -10.05
SELECT 5
SELECT 12345678912345
Int
| number |
The maximum integer number is 2147483647, the minimum is -2147483648.
Example:
10
Long
| number |
Long numbers are between -9223372036854775808 and 9223372036854775807.
Example:
100000
Decimal
| number |
|
A decimal number with fixed precision and scale. Internally, java.lang.BigDecimal
is used.
Example:
SELECT -10.5
Number
0-9 |
|
The maximum length of the number depends on the data type used.
Example:
100
Comments
| |||
| |||
|
Comments can be used anywhere in a command and are ignored by the database. Line comments end with a newline. Block comments cannot be nested, but can be multiple lines long.
Example:
// This is a comment