A First Look at HBase

string hbaseCluster = "https://charju.azurehdinsight.net";
string hadoopUsername = "<your name>";
string hadoopPassword = "<your password>";

ClusterCredentials creds = new ClusterCredentials(new Uri(hbaseCluster), hadoopUsername, hadoopPassword);
var hbaseClient = new HBaseClient(creds);

// In a WinForms app this call never returns; in a console app it works.
var version = hbaseClient.GetVersion();

Console.WriteLine(Convert.ToString(version));

Code first, because this one is a nasty trap: the code above will not run in a WinForms app, yet it runs fine in a console app. (It cost me several days.)

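In WinForms, debugging with WinDbg showed that during GetVersion the main thread starts a Task and then waits for that Task to complete. Early in the Task's life (roughly the first minute) another thread sits waiting on a WaitHandle; after a while that thread disappears. The main thread then starts its retry calls, and after that nothing more happens.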
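The symptom described here (the UI thread starts a Task and then waits on it forever) resembles the well-known deadlock that occurs when a UI thread blocks on a Task whose continuation needs that same thread. Whether this is really what happens inside GetVersion is an assumption; it was not verified against the SDK source. Under that assumption, a minimal sketch of a workaround is to push the blocking call onto a thread-pool thread. The OffUiThread helper and the stand-in delegate below are hypothetical names used only for illustration.

```csharp
using System;
using System.Threading.Tasks;

public static class OffUiThread
{
    // Runs a blocking call on a thread-pool thread, where no UI
    // SynchronizationContext is installed, and returns a Task for its result.
    public static Task<T> Run<T>(Func<T> blockingCall)
    {
        return Task.Run(blockingCall);
    }

    public static void Main()
    {
        // Stand-in for a blocking SDK call such as hbaseClient.GetVersion().
        Func<string> fakeGetVersion = () => "stand-in version";

        // Blocking on .Result is fine in a console app; in WinForms use await instead.
        string version = Run(fakeGetVersion).Result;
        Console.WriteLine(version);
    }
}
```

In an async WinForms event handler the call would then look like `var version = await OffUiThread.Run(() => hbaseClient.GetVersion());`, so the UI thread is never blocked while the request is in flight.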

 

Anyway, in a console app the code is OK.

My example uses Sina's API to fetch stock quotes, for example http://hq.sinajs.cn/list=sz000977,sh600718. I call it once per second and flush the results into HBase.

 

The entity class for a stock quote:

public class StockEntity
{
    public string Name { get; set; }
    public double TodayOpeningPrice { get; set; }
    public double YesterdayClosingPrice { get; set; }
    public double CurrentPrice { get; set; }
    public double TodayMaxPrice { get; set; }
    public double TodayMinPrice { get; set; }
    public double BidPriceBuy { get; set; }
    public double BidPriceSell { get; set; }
    public int FixtureNumber { get; set; }
    public double FixtureAmount { get; set; }
    public int Buy1Number { get; set; }
    public double Buy1Price { get; set; }
    public int Buy2Number { get; set; }
    public double Buy2Price { get; set; }
    public int Buy3Number { get; set; }
    public double Buy3Price { get; set; }
    public int Buy4Number { get; set; }
    public double Buy4Price { get; set; }
    public int Buy5Number { get; set; }
    public double Buy5Price { get; set; }
    public int Sell1Number { get; set; }
    public double Sell1Price { get; set; }
    public int Sell2Number { get; set; }
    public double Sell2Price { get; set; }
    public int Sell3Number { get; set; }
    public double Sell3Price { get; set; }
    public int Sell4Number { get; set; }
    public double Sell4Price { get; set; }
    public int Sell5Number { get; set; }
    public double Sell5Price { get; set; }

    public DateTime TransactionTime { get; set; }
}
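The post does not show how a response line from the Sina API becomes a StockEntity. A minimal sketch of the first step, splitting one line into fields, might look like the following. It assumes each line has the shape var hq_str_<code>="field,field,..."; and that the fields arrive in the same order as the StockEntity properties; both are assumptions rather than something the post states. SinaQuoteParser and its methods are hypothetical names, and the sample line is made up.

```csharp
using System;
using System.Globalization;

public static class SinaQuoteParser
{
    // Returns the comma-separated fields found between the first and last
    // double quote of one response line, or an empty array if there are none.
    public static string[] SplitFields(string line)
    {
        int start = line.IndexOf('"');
        int end = line.LastIndexOf('"');
        if (start < 0 || end <= start + 1)
        {
            return new string[0];
        }
        return line.Substring(start + 1, end - start - 1).Split(',');
    }

    // Parses a price with the invariant culture so "27.55" is read the same
    // way regardless of the machine's regional settings.
    public static double ParsePrice(string field)
    {
        return double.Parse(field, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        // Made-up sample line, shortened to the first four fields.
        string line = "var hq_str_sz000977=\"Inspur,27.55,27.25,26.91\";";
        string[] f = SplitFields(line);
        Console.WriteLine("{0} open={1} current={2}", f[0], ParsePrice(f[1]), ParsePrice(f[3]));
    }
}
```

From there, filling a StockEntity is a matter of assigning f[0] to Name, ParsePrice(f[1]) to TodayOpeningPrice, and so on down the property list, after checking that the array is long enough.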

 

Once the data has been pulled down, I hand it to a thread-pool thread that writes it into HBase.

ThreadPool.QueueUserWorkItem(new WaitCallback(SaveStockDataToHbase), se);

 

The code that does the actual work:

 1 private void SaveStockDataToHbase(object state)
 2 {
 3     StockEntity se = state as StockEntity;
 4     if (se == null) return; // nothing to store
 5     // Insert data into the HBase table.
 6     string rowKey = Guid.NewGuid().ToString();
 7 
 8     CellSet cellSet = new CellSet();
 9     CellSet.Row cellSetRow = new CellSet.Row { key = Encoding.UTF8.GetBytes(rowKey) };
10     cellSet.rows.Add(cellSetRow);
11 
12 
13     Type t = typeof(StockEntity);
14 
15     foreach (string colname in stockEntityColumns)
16     {
17         var pi = t.GetProperty(colname);
18         object val = pi.GetValue(se);
19 
20         Cell value = new Cell { column = Encoding.UTF8.GetBytes("charju:" + colname), data = Encoding.UTF8.GetBytes(Convert.ToString(val)) };
21         cellSetRow.values.Add(value);
22     }
23 
24     try
25     {
26         hbaseClient.StoreCells(hbaseStockTableName, cellSet);
27     }
28     catch (Exception ex)
29     {
30         Console.WriteLine(ex.Message);
31     }
32 }

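Lines 6-10 create a new row. Lines 17-18 use reflection to read the value of each property of the entity class (otherwise I would have to write a pile of repetitive code), line 20 wraps that value in a Cell, and line 21 adds the cell to the row.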

Line 26 is where the data actually goes into HBase.
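The stockEntityColumns collection used in the loop is not shown in the post. One way it could be built, as a sketch only, is to list the public property names of the entity type once via reflection. ColumnNames and SampleEntity below are hypothetical names; SampleEntity is a cut-down stand-in for StockEntity so that the block compiles on its own.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Cut-down stand-in for StockEntity, for illustration only.
public class SampleEntity
{
    public string Name { get; set; }
    public double CurrentPrice { get; set; }
    public DateTime TransactionTime { get; set; }
}

public static class ColumnNames
{
    // Lists the public instance property names of T; each name can then
    // be used as an HBase column qualifier.
    public static string[] For<T>()
    {
        return typeof(T)
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Select(p => p.Name)
            .ToArray();
    }

    public static void Main()
    {
        foreach (string name in For<SampleEntity>())
        {
            Console.WriteLine("charju:" + name);
        }
    }
}
```

The order in which GetProperties returns properties is not guaranteed, which does not matter here because each cell carries its own column name.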

 

In line 20 above you may have noticed charju: that is the name of my column family. Looking back, here is how the table was created in HBase:

string hbaseCluster = "https://charju.azurehdinsight.net";
string hadoopUsername = "<your name>";
string hadoopPassword = "<your password>";
string hbaseStockTableName = "StockInformation";
HBaseClient hbaseClient;

public void CreateHbaseTable()
{
    // Create a new HBase table, StockInformation, with one column family: charju.
    TableSchema stockTableSchema = new TableSchema();
    stockTableSchema.name = hbaseStockTableName;
    stockTableSchema.columns.Add(new ColumnSchema() { name = "charju" });
    hbaseClient.CreateTable(stockTableSchema);
}

 

The hbaseClient itself is instantiated here:

ClusterCredentials creds = new ClusterCredentials(new Uri(hbaseCluster), hadoopUsername, hadoopPassword);
hbaseClient = new HBaseClient(creds);

 

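Once the data has been written, there are a couple of ways to look at it. One is to enable RDP in the cluster configuration, then remote in and run hbase shell commands; unfortunately RDP kept failing from my VM and I don't know why. The second is to query it with Hive.

After connecting to the cluster's web site, first create the matching table in the Hive editor: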

CREATE EXTERNAL TABLE StockInformation(rowkey STRING, TodayOpeningPrice STRING, YesterdayClosingPrice STRING, CurrentPrice STRING, TodayMaxPrice STRING, TodayMinPrice STRING, BidPriceBuy STRING, BidPriceSell STRING, FixtureNumber STRING, FixtureAmount STRING, Buy1Number STRING, Buy1Price STRING, Buy2Number STRING, Buy2Price STRING, Buy3Number STRING, Buy3Price STRING, Buy4Number STRING, Buy4Price STRING, Buy5Number STRING, Buy5Price STRING, Sell1Number STRING, Sell1Price STRING, Sell2Number STRING, Sell2Price STRING, Sell3Number STRING, Sell3Price STRING, Sell4Number STRING, Sell4Price STRING, Sell5Number STRING, Sell5Price STRING, TransactionTime STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,charju:TodayOpeningPrice,charju:YesterdayClosingPrice,charju:CurrentPrice,charju:TodayMaxPrice,charju:TodayMinPrice,charju:BidPriceBuy,charju:BidPriceSell,charju:FixtureNumber,charju:FixtureAmount,charju:Buy1Number,charju:Buy1Price,charju:Buy2Number,charju:Buy2Price,charju:Buy3Number,charju:Buy3Price,charju:Buy4Number,charju:Buy4Price,charju:Buy5Number,charju:Buy5Price,charju:Sell1Number,charju:Sell1Price,charju:Sell2Number,charju:Sell2Price,charju:Sell3Number,charju:Sell3Price,charju:Sell4Number,charju:Sell4Price,charju:Sell5Number,charju:Sell5Price,charju:TransactionTime')
TBLPROPERTIES ('hbase.table.name' = 'StockInformation');
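Hive treats whitespace inside hbase.columns.mapping as part of the column name, so the mapping string has to be a plain comma-separated list with no spaces between entries. Since writing thirty entries by hand is error-prone, here is a small sketch that builds the string from a list of qualifiers. HiveMapping.Build is a hypothetical helper, not part of any SDK.

```csharp
using System;
using System.Linq;

public static class HiveMapping
{
    // Builds an hbase.columns.mapping value: the row key first, then one
    // family:qualifier entry per column, with no whitespace between entries.
    public static string Build(string family, string[] qualifiers)
    {
        return ":key," + string.Join(",", qualifiers.Select(q => family + ":" + q));
    }

    public static void Main()
    {
        string mapping = Build("charju", new[] { "TodayOpeningPrice", "CurrentPrice" });
        Console.WriteLine(mapping); // prints :key,charju:TodayOpeningPrice,charju:CurrentPrice
    }
}
```

The entries must appear in the same order as the columns in the CREATE EXTERNAL TABLE statement, because Hive pairs them by position.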

Once the table has been created, you can run SQL against it, for example:

select * from StockInformation where buy1number=9800 order by transactiontime

That is today's largest single buy order for Inspur. Of course, things like select count(0) work fine too.

 

 

Useful links:

https://azure.microsoft.com/en-us/documentation/articles/hdinsight-hbase-tutorial-get-started/