Working with Hive from Python

Jupyter development environment

1. Open Jupyter from the panel on the right and create a Python 3 notebook.


Example: querying Hive from Python

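from pyhive import hive
import pandas as pd

def read_jdbc(host: str, database: str, table: str, query_sql: str) -> pd.DataFrame:
    # 1. Connect to the Hive server (HiveServer2 listens on port 10000 by default)
    conn = hive.Connection(
        host=host, port=10000, database=database)
    try:
        cursor = conn.cursor()
        print('connect hive successfully.')

        # 2. Execute the Hive SQL
        cursor.execute(query_sql)
        print('query hive table successfully.')

        # 3. Return a pandas.DataFrame
        # Hive may name result columns "<table>.<column>"; strip that
        # prefix only when it is actually present.
        prefix = table + '.'
        columns = [col[0] for col in cursor.description]
        col = [c[len(prefix):] if c.startswith(prefix) else c for c in columns]
        result = cursor.fetchall()
    finally:
        # Release the connection even if the query fails
        conn.close()

    return pd.DataFrame(result, columns=col)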



read_jdbc('app-11', 'test', 'employee', 'select * from employee')
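
Step 3 of read_jdbc removes the table prefix from the column names before building the DataFrame. That step can be tried on its own, without a Hive server. The sketch below is only an illustration: strip_table_prefix and fake_description are hypothetical names that do not appear in the example above, and the tuples merely imitate the shape of cursor.description (the column name comes first in each entry).

```python
from typing import List, Sequence, Tuple


def strip_table_prefix(columns: Sequence[str], table: str) -> List[str]:
    """Drop a leading "<table>." from each column name, if present."""
    prefix = table + "."
    return [c[len(prefix):] if c.startswith(prefix) else c for c in columns]


# Hypothetical stand-in for cursor.description: each entry is a tuple
# whose first item is the column name. The other items are placeholders.
fake_description: List[Tuple] = [
    ("employee.id", "INT_TYPE", None, None, None, None, True),
    ("employee.name", "STRING_TYPE", None, None, None, None, True),
    ("salary", "DOUBLE_TYPE", None, None, None, None, True),  # no prefix
]

names = strip_table_prefix([d[0] for d in fake_description], "employee")
print(names)  # ['id', 'name', 'salary']
```

Checking for the prefix, rather than always slicing off a fixed number of characters, leaves a column name untouched when it arrives without the table name in front.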

For a more detailed walkthrough, see the course "Spark快速大数据处理", or search for "Spark 余海峰".