Tuesday, December 27, 2016

Connecting to PTT with Java

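I needed to try logging in to PTT over Telnet. With so many PTT Android apps around, I assumed there would be no shortage of open-source code to borrow from,
but when I actually looked, I could hardly find any @@
In the end I referred to an open-source project on GitHub [1] and extracted its connection and screen-fetching parts for this experiment.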

Establishing a connection to PTT

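For the connection, [1] says that Apache Commons Net [2] can be used for Telnet and JSch [3] for SSH.
I originally wanted to experiment with connecting to ptt.cc:443 directly, but that failed XD
So the example extracted here is implemented with Telnet.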

Connecting to PTT with Apache Commons Net

Establishing the connection is actually very simple: just use org.apache.commons.net.telnet.TelnetClient and point it at PTT. The code is as follows:

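import java.io.IOException;
import org.apache.commons.net.telnet.TelnetClient;

private TelnetClient telnetClient = new TelnetClient();

public void connect() throws IOException {
    // No port is given, so TelnetClient uses its default Telnet port (23).
    this.telnetClient.connect("ptt.cc");
}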

Fetching PTT's response

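Once the TelnetClient above is connected, you can obtain its InputStream and OutputStream, which are used to read responses and to send commands respectively.
Since the goal for now is just to fetch PTT's front page, only the InputStream is needed.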

public List<String> getScreen() throws IOException {
  LinkedList<String> screenLines = new LinkedList<String>();
  try (
      // Decode the response as Big5, since PTT sends Big5-encoded text over
      // Telnet.
      InputStreamReader isr =
          new InputStreamReader(this.telnetClient.getInputStream(), "Big5");
      BufferedReader br = new BufferedReader(isr);
      PushbackReader pbr = new PushbackReader(br, 128)) {

    int length = 0;
    char[] buffer = new char[4096];
    while ((length = pbr.read(buffer)) != -1) {
      StringBuilder sb = new StringBuilder();
        
      // Collect and parse the response from the terminal.
      for (int pos = 0; pos < length; pos++) {
        char c = buffer[pos];

        switch (c) {
          case 0x0A: // LF
            screenLines.add(sb.toString());
            sb.setLength(0);
            break;
          case 0x0D: // CR
          case 0x1B: // ESC
            break;
          default:
            if (c < 0x20 || c == 0x7F) {
              break;
            }
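            sb.append(c);
        }
      }

      // This experiment only needs the first chunk (at most 4096 characters),
      // so return after a single read. Text after the last LF in the chunk is
      // discarded.
      return screenLines;
    }

    // The stream ended before any data arrived.
    return screenLines;
  }
}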

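I simplified a great deal of the original code from [1] here, because for now I only want to experiment with printing the screen XD
So the reply, up to 4096 characters from a single read, is converted directly into a list of Strings.
The only check made during the conversion is whether to start a new line; the other control codes, which seemed unnecessary, are simply skipped.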
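
One consequence of skipping only the ESC character (0x1B) itself is that the rest of each ANSI escape sequence, for example the `[33;1m` of a color code, stays in the text. Below is a minimal sketch of a more thorough filter, assuming ESC is kept in the decoded string and whole CSI sequences are removed with a regular expression afterwards. The names AnsiStripper and stripCsi are made up for illustration and are not part of [1].

```java
import java.util.regex.Pattern;

class AnsiStripper {
  // CSI sequence: ESC '[', parameter bytes (0x30-0x3F), intermediate bytes
  // (0x20-0x2F), then one final byte (0x40-0x7E).
  private static final Pattern CSI =
      Pattern.compile("\u001B\\[[0-?]*[ -/]*[@-~]");

  /** Returns the text with every complete CSI sequence removed. */
  static String stripCsi(String text) {
    return CSI.matcher(text).replaceAll("");
  }

  public static void main(String[] args) {
    // A colored "PTT" as it might arrive from the terminal (hypothetical input).
    String colored = "\u001B[33;1mPTT\u001B[m";
    System.out.println(stripCsi(colored));
  }
}
```

Matching on the ESC prefix keeps ordinary brackets in the text, such as "[1]", untouched. This only covers CSI sequences; other escape types would need their own handling.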

Printing the fetched response

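With the response in hand, the first thing I wanted was naturally to see what I had actually received.
Since the data has already been converted into a list of Strings above, printing it is just... iterating over each string in the list XD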

public void printScreen(List<String> screenLines) {
  log.trace("Screen ({} lines):", screenLines.size());
  for (String line : screenLines) {
    log.trace("Line: {}", line);
  }
}

And here is what my experiment program printed:

2016-12-27 15:35:28,088 | Screen (31 lines):
2016-12-27 15:35:28,089 | Line: HTTP/1.1 200 OK
2016-12-27 15:35:28,089 | Line: 
2016-12-27 15:35:28,089 | Line:      ˙      [33;1mPTT[m                ·   ◣        ·  ˙      [1m◢[47m██████[40m◤[m
2016-12-27 15:35:28,089 | Line:              [1m140.112.172.11 [m·     ◢[47m█[1m◥[40m█◤[m            [1m◢[47m█████[m
2016-12-27 15:35:28,089 | Line:   ┌─┐     [33;1m批踢踢實業坊[m        ◢[30;47m▃[37m██[1m◥█[40m◤[m   ·   [1m◢[47m█████[40m◤  [m·[m
2016-12-27 15:35:28,089 | Line:   │[1m–[m└┐   [1mptt.cc       [m     ◢[47m█████[1m◥[40m█◤[m    [1m◢[47m█████[m
2016-12-27 15:35:28,089 | Line:   │[1m–[m  │                   ◢[47m███[40m◤[47m███[1m◥[40m◤[m  [1m◢[47m█████[40m◤  [m˙  ·[m
2016-12-27 15:35:28,089 | Line: ─┘    │┌──┐  ·       ◥◤      [47m████[40m◣[1m◢[47m████[40m    [m·[m
2016-12-27 15:35:28,089 | Line:         └┤  [33;1m–[m│      ·           ◢[47m████[1m◢███[40m◤            [m·[m
2016-12-27 15:35:28,089 | Line:                 │┌───┐         [47m█████[40m▇▇▆▆▅▅▄▄▃▂▁[m
2016-12-27 15:35:28,089 | Line:             ┌─┴┘[1m–[33m–[m  └ ◢[47m█████[40m▇▇▆▆▅▅▄▄▃▃▂▂▁[m
2016-12-27 15:35:28,089 | Line:             │[1m––[m           ◥[47m██[40m      █▇▇▆▆▅▅▄▄▃▂▁[m
2016-12-27 15:35:28,089 | Line: [H[2J                                       [31;41m�[33me[43m�[32me[42m�[34me[44m�[35me▄[30;45m▄[40m   [1;31m批[33m踢[32m踢[34m實[35m業[m坊  [30m�[45me[35;44m▄�[34me[42m�[32me[43m�[33me[41m�[31me[m
2016-12-27 15:35:28,089 | Line:           [1;30;47m�[;47m�[m▇▆▇[1;30;47m�[;36;46m�[;1;30m╲               [;31;41m�[33me[43m�[32me[42m�[34me[44m�[35me[45m�[30me[40m                         [45m�[35me[44m�[34me[42m�[32me[43m�[33me[m
2016-12-27 15:35:28,089 | Line:         ╱[47m [46m [;1;30m╲  [m╱[47m [46m [;1;30m| ╲           [;31;41m�[33me[43m�[32me[42m�[34me[44m�[35me[45m�[30me[40m         [1;31mPtt.cc        [;30;45m�[35me[44m�[34me[42m�[32me[m
2016-12-27 15:35:28,089 | Line:       ╱[1;30m| [47m [46m [40m |╳  [47m [46m [40m| | ╲       [;31;41m�[33me[43m�[32me[42m�[34me[44m�[35me[45m�[30me[40m                                 [45m�[35me[44m�[34me[m
2016-12-27 15:35:28,089 | Line:     ╱[1;30m| | [47m [46m [m╱ [1;30m|╲[47m [46m [40m| | | ╲    [;31;41m�[33me[43m�[32me[42m�[34me[44m�[35me[45m�[30me[40m          [1;47m�[;47m�[m▇▇[1;30;47m�[;36;46m�[40m                  [45m  [m
2016-12-27 15:35:28,089 | Line:   ╱[1;30m| | | [47m�[46m�[40m | | |[47m [;36;46m�[;1;30ms | | | �[;30mt [31;41m�[33me[43m�[32me[42m�[34me[44m�[35me[45m�[30me[40m         [1m╱[47m [46m�[40mB ╱[47m [46m�[40mB                  [45m [m
2016-12-27 15:35:28,090 | Line: ╱[1;30m| | | [m╱[47m [46m [40m [1;30m| | |[47m [46m [40m |�[;30mt[1m、| |�[;41m哈[33;43me[32m�[42me[34m�[44me[35m�[45me[30m�[40me      [1m.[;30m�[1m��[;30ms[;47m�[30;46mt[1;40m\  [47m [46m [40m\                  [m
2016-12-27 15:35:28,090 | Line:   [1;30m| | [m╱[1;30m| [47m [46m [40m | | |[47m [46m [40m | | |[;30m�[1ms [;31;41m�[33me[30;43m|[32m�[47me[34;42m�[1;30;44me[;35;44m�[45me[30m�[40me  [1m_. -' ′||[47m [46m [40m \ [47m [46m [40m \                 [m
2016-12-27 15:35:28,090 | Line: [1m◢[47m◤[30m□[;33m▅▅[47m�[46mf[40m▅▅▅[47m [46m [40m▅▅▅▅[41m�[47mf[43m▅[42m▅[44m▅[45m�[47mf[40m▅▅[47m�[40mf▅▅▅▅[47m�[46mf[40m▅[45m▅[47m [46m [45m�[35mf [;1;30m◣               [m
2016-12-27 15:35:28,090 | Line: [1;47m◤[30m□□[41m [40m [41m [40m [41m [40m [41m [40m [41m [40m [41m [40m [;30;47m�[;46m�[41m [40m [41m [40m [41m [40m [41m [40m [41m [40m [41m [40m [41m [40m [41m [40m [41m [40m [41m [40m [41m [40m [41m [40m [41m [40m [41m [40m [41m [40m [41m [40m [41m [40m [41m [40m [41m [40m [30;47m�[;46m�[41m [40m [35m�[1;30mi[43m�[;30;43me▄[37m�[30me▄▄[35m�[37me[30m▄▄[m
2016-12-27 15:35:28,090 | Line: [47m [1;30m□□ [;30;44m▃[1;35m▃[;44m�[36md[1;30m�[;36;44md[1;30m�[;36;44md[1;30m�[;36;44md[;46m▋[30;44m▃▃▃�[31md�[33md�[32md�[34md�[35md�[30md▃▃▃▃▃▃▃[37m�[36md[30m▃[35m�[1;30md[47m [46m [44m�[45md [40m█[;33m�[41mc�[47mc[41m�[40mc▂[45m▂[47m�[45mc[40m▂[m|
2016-12-27 15:35:28,090 | Line: [1;30;47m□□□[;30;45m�[35m�[30m�[35m�[;46m▋[m▃▂[47m�[m�[46m▋[40m [1;30m▃  [;30;45m�[35m�[30;46m�[31;41m�[;1;43m▅[42m�[30me[44m▄[45m▃[m▁ [1;33;41m�[;30;41m�[40m [1;32m▂ [34m▇�[;34me  [47m [46m [40m  [35;45m�[1;30;47me [46m [47m�[;47me[45m [;1;30m█[;33m�[41mc�[47mc[41m�[40mc�[45mc[40m�[45mc[47m�[45mc[40m�[47mc[30m|[m
2016-12-27 15:35:28,090 | Line: [47m [1;30m□□ [;30;45m�[35m�[30m�[35m�[;46m▋[1;36;43m▃▃▃[;46m▋[31;47m�[36mF[;1m�[;32mi[;42m�[30m�[32;45m�[46md[1;47m�[;33;47md[1m�[31me[;36;47m�[30me[1;36m�[;35;47me[1m�[34me [40m [;32;41m▄[1;42m�[;30;42m�[1;32m�[;32;42m�[44m▃[1m�[;32;44mc[;35m▲[47m [46m [;31m�[30mc[35;45m�[1;30mf[47m [46m [45m�[;45mf [;1;30m█[;32m�[41md�[;47me[32;41m�[1;40md[45m�[;32mc[45m �[;47mc[32;45m�[1;40md[;32;47m▄[m
2016-12-27 15:35:28,090 | Line: [1;30;47m□□□[;34m▄▃[;46m▋[;34m▃▂[30;44m▆[;46m▋[30;44m▅▆[;34m▂▃▄▃▂[30;44m▆▄▂_  [1;37m﹏[36m﹏    [;34;43m▄▄[44m  [43m▄▄[30;44m▃▅▆[;34m▂▄[44m         [m
2016-12-27 15:35:28,090 | Line: [47m [1;30m□□�[;34;44m�[1m�[36m\ [;46m▋[1;30;44m▂    [;46m▋[1;30;44m▂ [36m﹏           [;34m▇█[44m [30m▂▃▄▃▄▅▆[;34m▁▂▃▄▇[44m     [1;37m�[36m�    ﹏  [m
2016-12-27 15:35:28,090 | Line: [47m  [36m▁▂[44m�[34md [31;43m▂▂[34;41m�[44mc  [31;43m▂▂[30;41m�[44mc▂▃▄▃▄▅▆[;34m▁▂▃▄▅▆▅▆█[;1;44m﹏     [36m﹋[37m﹋        ﹏    [m
2016-12-27 15:35:28,090 | Line: 34m▅▇[44m  [1;37m﹏     [;33;44m◥▆[41m▅▅�[44mg [1;37m__﹏�[;34;44m\   [1;36m﹏﹏           [m
2016-12-27 15:35:28,090 | Line: [34m▄▅▆[44m  [1;37m﹏   [36m﹋   ﹏  [;30;44m▂▄▆[;34m▂▄[44m  [1;36m﹏      ﹏[37m﹏ [;34;47m▆▅▆▆▇[44m                 [1mφhtx9[m
2016-12-27 15:35:28,090 | Line:          [1m歡迎來到 [36m批踢踢實業坊 [37m目前有[;32m【106314】[;1m位使用者與您一起欣賞[31m彩虹         [m
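
The � characters scattered through the output are probably a decoding artifact. A likely cause, though not verified against [1], is that a color code sits between the two bytes of a Big5 character, so the decoder sees an incomplete character on each side. The sketch below reproduces that situation with a hand-built byte array and removes CSI sequences at the byte level before decoding. DoubleColorDemo and its methods are hypothetical names, and the bytes are constructed for illustration, not captured from PTT.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

class DoubleColorDemo {
  static final Charset BIG5 = Charset.forName("Big5");

  /** Removes CSI sequences (ESC '[' ... final byte) from raw bytes. */
  static byte[] stripCsi(byte[] in) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    int i = 0;
    while (i < in.length) {
      if (in[i] == 0x1B && i + 1 < in.length && in[i + 1] == '[') {
        i += 2;
        // Skip parameter and intermediate bytes until the final byte
        // (0x40-0x7E), then skip the final byte as well.
        while (i < in.length && !(in[i] >= 0x40 && in[i] <= 0x7E)) {
          i++;
        }
        i++;
      } else {
        out.write(in[i]);
        i++;
      }
    }
    return out.toByteArray();
  }

  /** Builds one Big5 character with a color code between its two bytes. */
  static byte[] splitByColorCode() {
    byte[] ch = "\u6279".getBytes(BIG5); // the character 批, two bytes in Big5
    byte[] esc = {0x1B, '[', '3', '3', 'm'};
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    out.write(ch[0]);
    out.write(esc, 0, esc.length);
    out.write(ch[1]);
    return out.toByteArray();
  }

  public static void main(String[] args) {
    byte[] raw = splitByColorCode();
    // Decoding the raw bytes directly loses the character.
    System.out.println(new String(raw, BIG5));
    // Stripping the color code first lets the two bytes decode together.
    System.out.println(new String(stripCsi(raw), BIG5));
  }
}
```

The trade-off is that the color information is thrown away; a real terminal emulator would have to track the attributes for each half of the character instead.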

References
  1. g21589/PTTCrawler
  2. Apache Commons Net
  3. JSch
