Hive表DDL操作(二) 第2关:Create/Drop/ALTER 索引

相关知识

为了完成本关任务,你需要掌握:1. 如何导入本地数据到hive的表中,2. 如何创建索引,3. 如何删除索引。

导入本地数据到 hive 表中

  • /home/shoppings.txt目录下数据格式如下:

    • 在数据库shopping中根据数据分隔方式创建表items_info:

       
      1. CREATE TABLE IF NOT EXISTS shopping.items_info(
      2. id INT COMMENT 'item id',
      3. name STRING COMMENT 'item name',
      4. price FLOAT COMMENT 'item price',
      5. category STRING COMMENT 'item category',
      6. brand STRING COMMENT 'item brand',
      7. stock INT COMMENT 'item stock',
      8. address STRUCT COMMENT 'item sales address')
      9. COMMENT 'goods information table'
      10. row format delimited fields terminated by ',' //字段以‘,’分隔
      11. collection items terminated by '-' //集合以‘-’分隔
      12. TBLPROPERTIES ('creator'='Xiaoming','date'='2019-01-01');
    • 进入到数据库shopping中: use shopping;

    • 导入数据到表items_info中:

       
      1. load data local inpath '/home/shoppings.txt'
      2. overwrite into table items_info;

      • 查看导入的数据: select * from items_info;

        Create 创建索引

        创建索引的语法为:

         
        1. CREATE INDEX index_name ON TABLE base_table_name (col_name,…)
        2. AS index_type
        3. [With DEFERRED REBUILD]
        4. [INDXPROPERTIES (property_name=property_value,…)]
        5. [IN TABLE index_table_name]
        6. [[ROW FORMAT …] STORED AS … | STORED BY]
        7. [LOCATION hdfs_path] [TBLPROPERTIES (…)] [COMMENT "index comment"];

        属性参数说明:

        • With DEFERRED REBUILD:用于构建一个空索引。

          例子:

          • 创建索引items_index:
             
            1. create index items_index on table items_info(id)
            2. as 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
            3. with deferred rebuild
            4. IN TABLE items_index_table;

            Alter 修改索引

            修改索引的语法为:

             
            1. ALTER INDEX index_name ON table_name [PARTITION partition_spec] REBUILD;

            ALTER INDEX REBUILD用于重建之前创建索引时使用关键字 WITH DEFERREDREBUILD建立的所有或之前建立的索引,若指定关键字PARTITION,则只针对相应分区建立索引。

            DROP 删除索引

            删除索引的语法为:

             
            1. DROP INDEX [IF EXISTS] index_name ON table_name;
            • 删除索引items_index:
               
              1. drop index if exists items_index on items_info;

              编程要求

              student.txt中的数据格式如下:

              • 创建test2数据库;

              • 根据以上数据创建student表;

              • 将/home/student.txt中的数据导入到表student中;

              • 根据学号Sno创建索引student_index;

              • 删除索引student_index。

                #********* Begin *********#
                echo "
                create database if not exists test2
                LOCATION '/hive/test2'
                WITH DBPROPERTIES('creator'='Floret','date'='2020-11-16');
                CREATE TABLE IF NOT EXISTS test2.student(
                Sno INT COMMENT 'student sno',
                name STRING COMMENT 'student name',
                age INT COMMENT 'student age',
                sex STRING COMMENT 'student sex',
                score STRUCT COMMENT 'student score')
                COMMENT 'students information table'
                row format delimited fields terminated by ',' 
                collection items terminated by '-'
                TBLPROPERTIES ('creator'='Floret','date'='2020-11-16');
                load data local inpath '/home/student.txt'
                overwrite into table student;
                create index student_index on table student(Sno)
                as 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
                with deferred rebuild
                IN TABLE student_index_table;
                drop index if exists student_index on student;
                "
                #********* End *********#