Package adql.db

Class DBChecker

  • All Implemented Interfaces:
    QueryChecker

    public class DBChecker
    extends java.lang.Object
    implements QueryChecker
    This QueryChecker implementation is able to do the following verifications on an ADQL query:
    1. Check the existence of all table and column references found in a query
    2. Resolve all unknown functions as supported User Defined Functions (UDFs)
    3. Check whether all used geometrical functions are supported
    4. Check whether all used coordinate systems are supported
    5. Check that types of columns and UDFs match with their context

    Check tables and columns

    In addition to checking the existence of tables and columns referenced in the query, this checker also attaches database metadata to these references (ADQLTable and ADQLColumn instances) when they are resolved.

    This information is:

    Note: Knowing DB metadata of ADQLTable and ADQLColumn is particularly useful for the translation of the ADQL query to SQL, because the ADQL name of columns and tables can be replaced in SQL by their DB name, if different. This mapping is done automatically by JDBCTranslator.

    Version:
    1.4 (11/2017)
    Author:
    Grégory Mantelet (CDS;ARI)
    • Field Detail

      • allowedGeo

        protected java.lang.String[] allowedGeo

        List of all allowed geometrical functions (e.g. CONTAINS, REGION, POINT, COORD2, ...).

        If this list is NULL, all geometrical functions are allowed. Otherwise, only the functions in this list are allowed. So, if the list is empty, no geometrical function is allowed.
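
        The NULL / empty / non-empty rule above can be sketched as follows. This is only an illustration, not the library's code: the class GeoAllowListSketch and its method isAllowed are hypothetical names, and the sketch assumes the array is sorted case-insensitively so that a binary search can be used.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of the library) illustrating the allow-list rule. */
final class GeoAllowListSketch {

    /**
     * @param allowed array of allowed names sorted case-insensitively, or null to allow everything
     * @param name    name of the geometrical function found in the query
     * @return true if the function may be used
     */
    static boolean isAllowed(String[] allowed, String name) {
        if (allowed == null)
            return true; // NULL list: no restriction at all
        // An empty array never contains the name, so every function is rejected.
        return Arrays.binarySearch(allowed, name, String.CASE_INSENSITIVE_ORDER) >= 0;
    }
}
```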

        Since:
        1.3
      • allowedCoordSys

        protected java.lang.String[] allowedCoordSys

        List of all allowed coordinate systems.

        Each item of this list must be of the form: "{frame} {refpos} {flavor}". Each of these 3 parts can be either a single value, a list of values expressed with the syntax "({value1}|{value2}|...)", or a '*' to mean all possible values.

        Note: since a default value (corresponding to the empty string: '') should always be possible for each part of a coordinate system, the checker always adds the default value (UNKNOWNFRAME, UNKNOWNREFPOS or SPHERICAL2) to the given list of possible values for each part.

        If this list is NULL, all coordinate systems are allowed. Otherwise, only the coordinate systems matching an item of this list are allowed. So, if the list is empty, none is allowed.

        Since:
        1.3
      • coordSysRegExp

        protected java.lang.String coordSysRegExp

        A regular expression built from the list of allowed coordinate systems. With this regex, it is possible to know whether a coordinate system expression is allowed or not.

        If NULL, all coordinate systems are allowed.
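
        To give an idea of how a pattern such as "ICRS (GEOCENTER|heliocenter) *" could be turned into a regular expression, here is a simplified sketch. It is not the library's implementation: the class CoordSysPatternSketch is hypothetical, '*' is mapped to "any single token" instead of the list of valid values, default values are not added, and the tested expression is assumed to spell out all three parts.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch: converts one "{frame} {refpos} {flavor}" pattern into a regex. */
final class CoordSysPatternSketch {

    /** Converts one part of the pattern (single value, "(a|b|...)" list or '*'). */
    static String partToRegex(String part) {
        if (part.equals("*"))
            return "\\S+"; // simplification: any token, valid or not
        if (part.startsWith("(") && part.endsWith(")")) {
            String[] values = part.substring(1, part.length() - 1).split("\\|");
            StringBuilder sb = new StringBuilder("(?:");
            for (int i = 0; i < values.length; i++) {
                if (i > 0)
                    sb.append('|');
                sb.append(Pattern.quote(values[i].trim()));
            }
            return sb.append(')').toString();
        }
        return Pattern.quote(part);
    }

    /** Builds a case-insensitive pattern; assumes no whitespace inside a "(...)" list. */
    static Pattern compile(String pattern) {
        String[] parts = pattern.trim().split("\\s+");
        if (parts.length != 3)
            throw new IllegalArgumentException("Expected \"{frame} {refpos} {flavor}\" but got: " + pattern);
        String regex = partToRegex(parts[0]) + "\\s+" + partToRegex(parts[1]) + "\\s+" + partToRegex(parts[2]);
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
    }
}
```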

        Since:
        1.3
      • allowedUdfs

        protected FunctionDef[] allowedUdfs

        List of all allowed User Defined Functions (UDFs).

        If this list is NULL, any encountered UDF is allowed. Otherwise, only the UDFs in this list are allowed. So, if the list is empty, no UDF is allowed.

        Since:
        1.3
    • Constructor Detail

      • DBChecker

        public DBChecker()

        Builds a DBChecker with an empty list of tables.

        Verifications done by this object after creation:

        • Existence of tables and columns: NO (even unknown or fake tables and columns are allowed)
        • Existence of User Defined Functions (UDFs): NO (any "unknown" function is allowed)
        • Support of geometrical functions: NO (all valid geometrical functions are allowed)
        • Support of coordinate systems: NO (all valid coordinate systems are allowed)
      • DBChecker

        public DBChecker(java.util.Collection<? extends DBTable> tables)

        Builds a DBChecker with the given list of known tables.

        Verifications done by this object after creation:

        • Existence of tables and columns: OK
        • Existence of User Defined Functions (UDFs): NO (any "unknown" function is allowed)
        • Support of geometrical functions: NO (all valid geometrical functions are allowed)
        • Support of coordinate systems: NO (all valid coordinate systems are allowed)
        Parameters:
        tables - List of all available tables.
      • DBChecker

        public DBChecker(java.util.Collection<? extends DBTable> tables,
                         java.util.Collection<? extends FunctionDef> allowedUdfs)

        Builds a DBChecker with the given list of known tables and with a restricted list of user defined functions.

        Verifications done by this object after creation:

        • Existence of tables and columns: OK
        • Existence of User Defined Functions (UDFs): OK
        • Support of geometrical functions: NO (all valid geometrical functions are allowed)
        • Support of coordinate systems: NO (all valid coordinate systems are allowed)
        Parameters:
        tables - List of all available tables.
        allowedUdfs - List of all allowed user defined functions. If NULL, no verification will be done (and so, all UDFs are allowed). If the list is empty, no "unknown" function (or UDF) is allowed. Note: matching against the items of this list is case insensitive.
        Since:
        1.3
      • DBChecker

        public DBChecker(java.util.Collection<? extends DBTable> tables,
                         java.util.Collection<java.lang.String> allowedGeoFcts,
                         java.util.Collection<java.lang.String> allowedCoordSys)
                  throws ParseException

        Builds a DBChecker with the given list of known tables and with restricted lists of geometrical functions and coordinate systems.

        Verifications done by this object after creation:

        • Existence of tables and columns: OK
        • Existence of User Defined Functions (UDFs): NO (any "unknown" function is allowed)
        • Support of geometrical functions: OK
        • Support of coordinate systems: OK
        Parameters:
        tables - List of all available tables.
        allowedGeoFcts - List of all allowed geometrical functions (e.g. CONTAINS, POINT, UNION, CIRCLE, COORD1). If NULL, no verification will be done (and so, all geometries are allowed). If the list is empty, no geometrical function is allowed. Note: matching against the items of this list is case insensitive.
        allowedCoordSys - List of all allowed coordinate system patterns. The syntax of such a pattern is the following: "{frame} {refpos} {flavor}"; unlike in a coordinate system expression, no part is optional here. Each part of this pattern can be one of the possible values (case insensitive), a list of possible values expressed with the syntax "({value1}|{value2}|...)", or a '*' for any valid value. For instance: "ICRS (GEOCENTER|heliocenter) *". If the given list is NULL, no verification will be done (and so, all coordinate systems are allowed). If it is empty, no coordinate system is allowed (except the default values, generally expressed by an empty string: '').
        Throws:
        ParseException
        Since:
        1.3
      • DBChecker

        public DBChecker(java.util.Collection<? extends DBTable> tables,
                         java.util.Collection<? extends FunctionDef> allowedUdfs,
                         java.util.Collection<java.lang.String> allowedGeoFcts,
                         java.util.Collection<java.lang.String> allowedCoordSys)
                  throws ParseException

        Builds a DBChecker.

        Verifications done by this object after creation:

        • Existence of tables and columns: OK
        • Existence of User Defined Functions (UDFs): OK
        • Support of geometrical functions: OK
        • Support of coordinate systems: OK
        Parameters:
        tables - List of all available tables.
        allowedUdfs - List of all allowed user defined functions. If NULL, no verification will be done (and so, all UDFs are allowed). If the list is empty, no "unknown" function (or UDF) is allowed. Note: matching against the items of this list is case insensitive.
        allowedGeoFcts - List of all allowed geometrical functions (e.g. CONTAINS, POINT, UNION, CIRCLE, COORD1). If NULL, no verification will be done (and so, all geometries are allowed). If the list is empty, no geometrical function is allowed. Note: matching against the items of this list is case insensitive.
        allowedCoordSys - List of all allowed coordinate system patterns. The syntax of such a pattern is the following: "{frame} {refpos} {flavor}"; unlike in a coordinate system expression, no part is optional here. Each part of this pattern can be one of the possible values (case insensitive), a list of possible values expressed with the syntax "({value1}|{value2}|...)", or a '*' for any valid value. For instance: "ICRS (GEOCENTER|heliocenter) *". If the given list is NULL, no verification will be done (and so, all coordinate systems are allowed). If it is empty, no coordinate system is allowed (except the default values, generally expressed by an empty string: '').
        Throws:
        ParseException
        Since:
        1.3
    • Method Detail

      • specialSort

        protected static final java.lang.String[] specialSort(java.util.Collection<java.lang.String> items)
        Transform the given collection of string elements into a sorted array. Only non-NULL and non-empty strings are kept.
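
        The filtering and sorting described above might look like the sketch below. The class SpecialSortSketch and its method are hypothetical names, and two points are assumptions rather than documented behavior: a NULL collection yields an empty array, and the ordering is case insensitive (chosen here so that the result can be used with a case-insensitive binary search).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

/** Hypothetical sketch of "keep non-NULL, non-empty strings and sort them". */
final class SpecialSortSketch {

    static String[] sortedNonEmpty(Collection<String> items) {
        if (items == null)
            return new String[0]; // assumption: no collection means no item
        List<String> kept = new ArrayList<>();
        for (String item : items) {
            if (item != null && !item.isEmpty())
                kept.add(item); // NULL and empty strings are dropped
        }
        String[] result = kept.toArray(new String[0]);
        Arrays.sort(result, String.CASE_INSENSITIVE_ORDER); // assumption: case-insensitive ordering
        return result;
    }
}
```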
        Parameters:
        items - Items to copy and sort.
        Returns:
        A sorted array containing all items of the given collection, except NULL and empty strings.
        Since:
        1.3
      • setTables

        public final void setTables(java.util.Collection<? extends DBTable> tables)

        Sets the list of all available tables.

        Note: If the given collection is an implementation of SearchTableApi, it is used as provided. Otherwise, it is copied into a new SearchTableList.

        Parameters:
        tables - List of DBTables.
      • check

        public final void check(ADQLQuery query)
                         throws ParseException

        Check all the column, table and UDF references inside the given query.

        Note: This query has already been parsed; thus it is already syntactically correct. Only the consistency with the published tables, columns and all the defined UDFs must be checked.

        Specified by:
        check in interface QueryChecker
        Parameters:
        query - The query to check.
        Throws:
        ParseException - An UnresolvedIdentifiersException if some tables or columns cannot be resolved.
        See Also:
        check(ADQLQuery, Stack)
      • resolveTables

        protected java.util.Map<DBTable,ADQLTable> resolveTables(ADQLQuery query,
                                                                       java.util.Stack<SearchColumnList> fathersList,
                                                                       UnresolvedIdentifiersException errors)
        Search all table references inside the given query, resolve them against the available tables, and if there is only one match, attach the matching metadata to them.

        Management of sub-query tables

        If a table is not a DB table reference but a sub-query, the sub-query is first checked (using check(ADQLQuery, Stack); the father list must not contain the tables of the given query, because they are on the same level) and then the corresponding table metadata are generated (using generateDBTable(ADQLQuery, String)) and attached to it.

        Management of "{table}.*" in the SELECT clause

        For each such SELECT item, this function tries to resolve the table name. If only one match is found, the corresponding ADQL table object is retrieved from the list of resolved tables and attached to this SELECT item (thus, the wildcard item will also have the correct metadata, particularly if the referenced table is a sub-query).

        Table alias

        When a simple table (i.e. not a sub-query) is aliased, the metadata of this table are wrapped inside a DBTableAlias in order to keep the original metadata while still declaring the table under its alias instead of its original name. The original name is used only when translating the corresponding FROM item; the rest of the time (i.e. for column references), the alias name must be used.

        In order to avoid unpredictable behavior at execution of the SQL query, the alias will be put in lower case if not defined between double quotes.
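
        The lower-casing rule for aliases can be illustrated with the sketch below. It is only an approximation under stated assumptions: the class AliasCaseSketch is hypothetical, and the alias is assumed to be available as raw text including its double quotes, which may not be how the parser actually represents it.

```java
import java.util.Locale;

/** Hypothetical sketch: lower-cases an alias unless it is written between double quotes. */
final class AliasCaseSketch {

    static String normalizeAlias(String rawAlias) {
        boolean quoted = rawAlias.length() >= 2
                && rawAlias.startsWith("\"")
                && rawAlias.endsWith("\"");
        if (quoted)
            return rawAlias.substring(1, rawAlias.length() - 1); // delimited: keep the case as written
        return rawAlias.toLowerCase(Locale.ROOT); // regular: force lower case
    }
}
```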

        Parameters:
        query - Query in which the existence of tables must be checked.
        fathersList - List of all columns available in the father queries and that should be accessed in sub-queries. Each item of this stack is a list of columns available in each father-level query. Note: this parameter is NULL if this function is called with the root/father query as parameter.
        errors - List of errors to complete in this function each time an unknown table or column is encountered.
        Returns:
        An associative map of all the resolved tables.
      • resolveColumns

        protected void resolveColumns(ADQLQuery query,
                                      java.util.Stack<SearchColumnList> fathersList,
                                      java.util.Map<DBTable,ADQLTable> mapTables,
                                      SearchColumnList list,
                                      UnresolvedIdentifiersException errors)

        Search all column references inside the given query, resolve them thanks to the given tables' metadata, and if there is only one match, attach the matching metadata to them.

        Management of selected columns' references

        A column reference is not only a direct reference to a table column using a column name. It can also be a reference to an item of the SELECT clause (which is then called a "selected column"). That kind of reference can be either an index (an unsigned integer from 1 to N, where N is the number of selected columns), or the name/alias of the column.

        These references are also checked, in a second step, in this function. Thus, column metadata are also attached to them, as for ordinary columns.
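
        The two kinds of "selected column" reference (1-based index or name/alias) could be resolved along the lines of this sketch. The class SelectedColumnRefSketch is hypothetical, and comparing names case-insensitively and rejecting ambiguous names are assumptions made for the illustration.

```java
import java.util.List;

/** Hypothetical sketch: resolves a reference to an item of the SELECT clause. */
final class SelectedColumnRefSketch {

    /**
     * @param selectedNames names/aliases of the SELECT items, in order
     * @param reference     either a 1-based index ("2") or a column name/alias
     * @return zero-based position of the SELECT item, or -1 if there is not exactly one match
     */
    static int resolve(List<String> selectedNames, String reference) {
        if (reference.matches("\\d{1,9}")) {
            int index = Integer.parseInt(reference); // index from 1 to N
            return (index >= 1 && index <= selectedNames.size()) ? index - 1 : -1;
        }
        int found = -1;
        for (int i = 0; i < selectedNames.size(); i++) {
            if (selectedNames.get(i).equalsIgnoreCase(reference)) {
                if (found >= 0)
                    return -1; // more than one match: ambiguous
                found = i;
            }
        }
        return found;
    }
}
```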

        Parameters:
        query - Query in which the existence of columns must be checked.
        fathersList - List of all columns available in the father queries and that should be accessed in sub-queries. Each item of this stack is a list of columns available in each father-level query. Note: this parameter is NULL if this function is called with the root/father query as parameter.
        mapTables - List of all resolved tables.
        list - List of column metadata to complete in this function each time a column reference is resolved.
        errors - List of errors to complete in this function each time an unknown table or column is encountered.
      • resolveColumn

        protected DBColumn resolveColumn(ADQLColumn column,
                                         SearchColumnList dbColumns,
                                         java.util.Stack<SearchColumnList> fathersList)
                                  throws ParseException

        Resolve the given column, that is to say, search for the corresponding DBColumn.

        The third parameter is used only if this function is called inside a sub-query. In this case, the function first tries to resolve the column with the first list (dbColumns). If no match is found, it then tries with the father columns list (fathersList).
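
        The "local list first, then father lists" lookup can be pictured with the sketch below. It uses plain maps instead of SearchColumnList, the class ScopedColumnLookupSketch is hypothetical, and searching the stack from the closest father to the outermost one is an assumption, not something stated above.

```java
import java.util.Map;
import java.util.Stack;

/** Hypothetical sketch: resolves a column name locally, then in the father queries. */
final class ScopedColumnLookupSketch {

    static String resolve(String name, Map<String, String> local, Stack<Map<String, String>> fathers) {
        String found = local.get(name);
        if (found != null)
            return found; // resolved in the current query
        if (fathers != null) { // NULL when called on the root query
            for (int i = fathers.size() - 1; i >= 0; i--) { // assumption: closest father first
                found = fathers.get(i).get(name);
                if (found != null)
                    return found;
            }
        }
        throw new IllegalArgumentException("Unknown column: " + name);
    }
}
```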

        Parameters:
        column - The column to resolve.
        dbColumns - List of all available DBColumns.
        fathersList - List of all columns available in the father queries and that should be accessed in sub-queries. Each item of this stack is a list of columns available in each father-level query. Note: this parameter is NULL if this function is called with the root/father query as parameter.
        Returns:
        The corresponding DBColumn if found. Otherwise an exception is thrown.
        Throws:
        ParseException - An UnresolvedColumnException if the given column can't be resolved or an UnresolvedTableException if its table reference can't be resolved.
      • generateDBTable

        public static DBTable generateDBTable(ADQLQuery subQuery,
                                              java.lang.String tableName)
                                       throws ParseException
        Generate a DBTable corresponding to the given sub-query with the given table name. This DBTable will contain all the DBColumn instances returned by ADQLQuery.getResultingColumns().
        Parameters:
        subQuery - Sub-query in which the specified table must be searched.
        tableName - Name of the table to search.
        Returns:
        The corresponding DBTable if the table has been found in the given sub-query, null otherwise.
        Throws:
        ParseException - Can be used to explain why the table has not been found. Note: not used by default.
      • checkUDFs

        protected void checkUDFs(ADQLQuery query,
                                 UnresolvedIdentifiersException errors)

        Search all UDFs (User Defined Functions) inside the given query, and then check their signature against the list of allowed UDFs.

        Note: When more than one allowed function matches, the function is considered correct and no error is added. However, in case of multiple matches, the return types of the matching functions could differ, and in that case there would be an error when the types are checked later. Throwing an error in such a case could make sense, but the user would then need to cast some parameters to help the parser identify the right function, and type casting is not yet possible in ADQL.

        Parameters:
        query - Query in which UDFs must be checked.
        errors - List of errors to complete in this function each time a UDF does not match any of the allowed UDFs.
        Since:
        1.3
      • isAllParamTypesResolved

        protected final boolean isAllParamTypesResolved(ADQLFunction fct)

        Tell whether the type of all parameters of the given ADQL function is resolved.

        A parameter type may not be resolved for 2 main reasons:

        • the parameter is a column, but this column has not been successfully resolved. Thus its type is still unknown.
        • the parameter is a UDF, but this UDF has not yet been resolved. Thus, as for the column, its return type is still unknown. But it could be known later if the UDF is resolved later; a second try should be done afterwards.
        Parameters:
        fct - ADQL function whose parameter types should be checked.
        Returns:
        true if the type of all parameters is known, false otherwise.
        Since:
        1.3
      • checkGeometryFunction

        protected void checkGeometryFunction(java.lang.String fctName,
                                             ADQLFunction fct,
                                             DBChecker.BinarySearch<java.lang.String,java.lang.String> binSearch,
                                             UnresolvedIdentifiersException errors)

        Check whether the specified geometrical function is allowed by this implementation.

        Note: If the list of allowed geometrical functions is empty, this function will always add an error to the given list. Indeed, it means that no geometrical function is allowed and so the specified function is automatically not supported.

        Parameters:
        fctName - Name of the geometrical function to test.
        fct - The function instance being or containing the geometrical function to check. Note: this function can be the function to test or a function embedding the function under test (e.g. RegionFunction).
        binSearch - The object to use in order to search a function name inside the list of allowed functions. It is able to perform a binary search inside a sorted array of String objects. The key part of this object is its compare function, which must be overridden and tells how to compare the searched item with the items of the array (basically, a case-insensitive comparison between 2 strings).
        errors - List of errors to complete in this function each time a geometrical function is not supported.
        Since:
        1.3
      • resolveCoordinateSystems

        protected void resolveCoordinateSystems(ADQLQuery query,
                                                UnresolvedIdentifiersException errors)

        Search all explicit coordinate system declarations, check their syntax and whether they are allowed by this implementation.

        Note: "explicit" means here all coordinate systems given as StringConstant instances. Only coordinate systems expressed as strings can be parsed and so checked. So if a coordinate system is specified by a column, no check can be done at this stage; such a test can only be performed at execution.

        Parameters:
        query - Query in which coordinate systems must be checked.
        errors - List of errors to complete in this function each time a coordinate system has a wrong syntax or is not supported.
        Since:
        1.3
        See Also:
        checkCoordinateSystem(StringConstant, UnresolvedIdentifiersException)
      • checkCoordinateSystem

        protected void checkCoordinateSystem(STCS.CoordSys coordSys,
                                             ADQLOperand operand,
                                             UnresolvedIdentifiersException errors)
        Check whether the given coordinate system is allowed by this implementation.
        Parameters:
        coordSys - Coordinate system to test.
        operand - The operand representing or containing the coordinate system under test.
        errors - List of errors to complete in this function each time a coordinate system is not supported.
        Since:
        1.3
      • resolveSTCSExpressions

        protected void resolveSTCSExpressions(ADQLQuery query,
                                              DBChecker.BinarySearch<java.lang.String,java.lang.String> binSearch,
                                              UnresolvedIdentifiersException errors)

        Search all STC-S expressions inside the given query, parse them (and so check their syntax) and then determine whether the declared coordinate system and the expressed region are allowed in this implementation.

        Note: In the current ADQL language definition, STC-S expressions can be found only as the sole parameter of the REGION function.

        Parameters:
        query - Query in which STC-S expressions must be checked.
        binSearch - The object to use in order to search a region name inside the list of allowed functions/regions. It is able to perform a binary search inside a sorted array of String objects. The key part of this object is its compare function, which must be overridden and tells how to compare the searched item with the items of the array (basically, a case-insensitive comparison between 2 strings).
        errors - List of errors to complete in this function each time the STC-S syntax is wrong or each time the declared coordinate system or region is not supported.
        Since:
        1.3
        See Also:
        STCS.parseRegion(String), checkRegion(adql.db.STCS.Region, RegionFunction, BinarySearch, UnresolvedIdentifiersException)
      • checkTypes

        protected void checkTypes(ADQLQuery query,
                                  UnresolvedIdentifiersException errors)

        Search all operands whose type is not yet known, try to resolve it now, and check whether it matches the type expected by the syntactic parser.

        Only two kinds of operands may have an unresolved type: columns and user defined functions. Indeed, their type can be resolved only if the list of available columns and UDFs is known, and if columns and UDFs used in the query are resolved successfully.

        When the type of an operand is still unknown, the operand is considered to be of all three kinds of type at once, so this function won't raise an error: the operand automatically matches the expected type. This behavior is correct because if the type is not resolved, the item/operand was not resolved in the previous steps and so an error about this item has already been raised.

        Important note: This function does not check the types exactly, but just roughly by considering only three categories: string, numeric and geometry.
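
        The rough, three-category check described above can be summarized by the following sketch. The class RoughTypeSketch and its enum are hypothetical, and an unresolved type is represented here by null, which is a choice made for the illustration only.

```java
/** Hypothetical sketch of the rough type check over three categories. */
final class RoughTypeSketch {

    enum Category { STRING, NUMERIC, GEOMETRY }

    /**
     * @param actual   category of the operand, or null if its type is still unresolved
     * @param expected category expected by the context
     * @return true if no type error should be reported
     */
    static boolean matches(Category actual, Category expected) {
        // An unresolved operand counts as all three categories at once, so it always matches:
        // the resolution error has already been reported in a previous step.
        return actual == null || actual == expected;
    }
}
```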

        Parameters:
        query - Query in which unknown types must be resolved and checked.
        errors - List of errors to complete in this function each time a type does not match the expected one.
        Since:
        1.3
        See Also:
        UnknownType
      • checkSubQueries

        protected void checkSubQueries(ADQLQuery query,
                                       java.util.Stack<SearchColumnList> fathersList,
                                       SearchColumnList availableColumns,
                                       UnresolvedIdentifiersException errors)

        Search all sub-queries found in the given query but not in the FROM clause. These sub-queries are then checked using check(ADQLQuery, Stack).

        Fathers stack

        Each time a sub-query must be checked with check(ADQLQuery, Stack), the list of all columns available in each of its father queries must be provided. This function builds this stack itself by adding the given list of available columns (= all columns resolved in the given query) at the end of the given stack. If the given stack is empty, a new stack is created.

        This modification of the given stack lasts only for the duration of this function's execution. Before returning, this function removes the last item of the stack.
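
        The temporary push/pop on the fathers stack can be sketched as below. The class FathersStackSketch and the callback parameter are hypothetical; creating a new stack when none is given (null) is an assumption about how "given empty" is handled, and the try/finally is simply one way to guarantee the removal.

```java
import java.util.Stack;
import java.util.function.Consumer;

/** Hypothetical sketch: adds the current level to the fathers stack only while sub-queries are checked. */
final class FathersStackSketch {

    static <T> void withFatherLevel(Stack<T> fathers, T currentLevel, Consumer<Stack<T>> checkSubQuery) {
        Stack<T> stack = (fathers == null) ? new Stack<T>() : fathers; // assumption: no stack given => new one
        stack.push(currentLevel); // columns of the current query become visible to sub-queries
        try {
            checkSubQuery.accept(stack);
        } finally {
            stack.pop(); // the caller gets its stack back unchanged
        }
    }
}
```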

        Parameters:
        query - Query in which sub-queries must be checked.
        fathersList - List of all columns available in the father queries and that should be accessed in sub-queries. Each item of this stack is a list of columns available in each father-level query. Note: this parameter is NULL if this function is called with the root/father query as parameter.
        availableColumns - List of all columns resolved in the given query.
        errors - List of errors to complete in this function each time a semantic error is encountered.
        Since:
        1.3