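Working with Hive from Python
Jupyter development environment
1. Open Jupyter from the panel on the right and create a Python 3 project.
Example: querying Hive from Python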
from pyhive import hive
import pandas as pd


def read_jdbc(host: str, database: str, table: str, query_sql: str) -> pd.DataFrame:
    # 1. Connect to the Hive server (HiveServer2, port 10000)
    conn = hive.Connection(host=host, port=10000, database=database)
    try:
        cursor = conn.cursor()
        print('connect hive successfully.')
        # 2. Execute the Hive SQL
        cursor.execute(query_sql)
        print('query hive table successfully.')
        # 3. Return the result as a pandas.DataFrame
        # Hive may prefix column names with "<table>."; strip it only when present
        prefix = table + '.'
        columns = [col[0] for col in cursor.description]
        columns = [c[len(prefix):] if c.startswith(prefix) else c for c in columns]
        result = cursor.fetchall()
        return pd.DataFrame(result, columns=columns)
    finally:
        # Release the connection even if the query fails
        conn.close()


read_jdbc('app-11', 'test', 'employee', 'select * from employee')
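
The column-name handling in step 3 can be tried without a Hive cluster. The sketch below is only an illustration under stated assumptions: `strip_table_prefix` and `fetch_records` are hypothetical helper names, not part of pyhive, and the standard-library `sqlite3` module stands in for Hive, with column aliases written to imitate the `employee.id` style names that Hive can return for `select *`. It relies only on the `description` and `fetchall()` members that any Python DB-API cursor provides.

```python
import sqlite3


def strip_table_prefix(columns, table):
    """Drop a leading "<table>." from each column name, if present."""
    prefix = table + "."
    return [c[len(prefix):] if c.startswith(prefix) else c for c in columns]


def fetch_records(cursor, table):
    """Build a list of dicts from a DB-API cursor that has already run a query."""
    # cursor.description holds one tuple per column; item 0 is the column name
    names = strip_table_prefix([d[0] for d in cursor.description], table)
    return [dict(zip(names, row)) for row in cursor.fetchall()]


# Assumption: sqlite3 stands in for Hive so the sketch runs locally.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# The aliases imitate Hive-style "table.column" names; "age" has no prefix.
cur.execute('SELECT 1 AS "employee.id", \'Ann\' AS "employee.name", 30 AS age')
records = fetch_records(cur, "employee")
conn.close()
print(records)
```

Checking for the prefix before removing it, rather than slicing a fixed number of characters, keeps names like `age` intact when the server returns them without a table prefix.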
For a more detailed walkthrough, see the course "Spark快速大数据处理" (Fast Big Data Processing with Spark), or search for "Spark 余海峰".