Friday, September 16, 2016

C++ XML parsing & iteration

Notes:
  • The data type that holds a parsed XML file is boost::property_tree::ptree.
    • #include <boost/property_tree/ptree.hpp>
  • read_xml(ifname, ptree) parses the file named ifname into the ptree.
    • #include <boost/property_tree/xml_parser.hpp>
    • Incidentally, property tree can also parse other tree-structured file formats besides XML (e.g. JSON, INI, INFO).
  • To iterate, use the ptree's begin() and end().
    • The iterator type is ptree::iterator (or ptree::const_iterator).
    • Dereferencing an iterator yields a value of type ptree::value_type.
      • ptree::value_type is nothing special: it is pair<const string, ptree>. That is, first is the key and second is the sub-tree.
  • XML attributes are stored in the sub-tree (the second member) of the ptree::value_type whose key is <xmlattr>, one child per attribute.
  • The value stored in a node is accessed with tree.data().
#include <iostream>
#include <string>

#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/xml_parser.hpp>

using namespace std;
using namespace boost::property_tree;

int main() {
  const string ifname("books.xml");
  ptree tree;
  try {
    read_xml(ifname, tree);
  } catch (const xml_parser_error &e) { // missing file or malformed XML
    cerr << e.what() << endl;
    return 1;
  }

  const string sep(2, ' ');

  for (ptree::iterator i = tree.begin(); i != tree.end(); ++i) {
    const ptree::value_type &v = *i;
    const string &k0 = v.first;
    const ptree &t0 = v.second; // reference: avoids copying the whole sub-tree

    cout << k0 << endl; // catalog

    for (ptree::const_iterator j = t0.begin(); j != t0.end(); ++j) {
      const string &k1 = j->first;
      const ptree &t1 = j->second; // reference, not a copy

      const string sep1 = sep;
      cout << sep1 << k1 << endl; // book

      for (ptree::const_iterator k = t1.begin(); k != t1.end(); ++k) {
        const string &k2 = k->first;
        const ptree &t2 = k->second; // reference, not a copy

        const string sep2 = sep1 + sep;
        cout << sep2 << k2 << " = "; // <xmlattr>, author, title, genre, 
                                     // price, publish_date, description

        if (k2 == "<xmlattr>") {
          for (ptree::const_iterator l = t2.begin(); l != t2.end(); ++l) {
            const string &k3 = l->first;
            const ptree &t3 = l->second; // reference, not a copy

            cout << endl;
            const string sep3 = sep2 + sep;
            cout << sep3 << k3 << " = "; // id
            cout << t3.data() << endl;
          } // for (l: t2)
        } else {
          cout << t2.data() << endl;
        }
      } // for (k: t1)
    } // for (j: t0)
  } // for (i: tree)

  return 0;
}

Running the program above on the books.xml sample that Microsoft provides produces the following output:
catalog
  book
    <xmlattr> = 
      id = bk101
    author = Gambardella, Matthew
    title = XML Developer's Guide
    genre = Computer
    price = 44.95
    publish_date = 2000-10-01
    description = An in-depth look at creating applications 
      with XML.
  book
    <xmlattr> = 
      id = bk102
    author = Ralls, Kim
    title = Midnight Rain
    genre = Fantasy
    price = 5.95
    publish_date = 2000-12-16
    description = A former architect battles corporate zombies, 
      an evil sorceress, and her own childhood to become queen 
      of the world.
  book
    <xmlattr> = 
      id = bk103
    author = Corets, Eva
    title = Maeve Ascendant
    genre = Fantasy
    price = 5.95
    publish_date = 2000-11-17
    description = After the collapse of a nanotechnology 
      society in England, the young survivors lay the 
      foundation for a new society.
  book
    <xmlattr> = 
      id = bk104
    author = Corets, Eva
    title = Oberon's Legacy
    genre = Fantasy
    price = 5.95
    publish_date = 2001-03-10
    description = In post-apocalypse England, the mysterious 
      agent known only as Oberon helps to create a new life 
      for the inhabitants of London. Sequel to Maeve 
      Ascendant.
  book
    <xmlattr> = 
      id = bk105
    author = Corets, Eva
    title = The Sundered Grail
    genre = Fantasy
    price = 5.95
    publish_date = 2001-09-10
    description = The two daughters of Maeve, half-sisters, 
      battle one another for control of England. Sequel to 
      Oberon's Legacy.
  book
    <xmlattr> = 
      id = bk106
    author = Randall, Cynthia
    title = Lover Birds
    genre = Romance
    price = 4.95
    publish_date = 2000-09-02
    description = When Carla meets Paul at an ornithology 
      conference, tempers fly as feathers get ruffled.
  book
    <xmlattr> = 
      id = bk107
    author = Thurman, Paula
    title = Splish Splash
    genre = Romance
    price = 4.95
    publish_date = 2000-11-02
    description = A deep sea diver finds true love twenty 
      thousand leagues beneath the sea.
  book
    <xmlattr> = 
      id = bk108
    author = Knorr, Stefan
    title = Creepy Crawlies
    genre = Horror
    price = 4.95
    publish_date = 2000-12-06
    description = An anthology of horror stories about roaches,
      centipedes, scorpions  and other insects.
  book
    <xmlattr> = 
      id = bk109
    author = Kress, Peter
    title = Paradox Lost
    genre = Science Fiction
    price = 6.95
    publish_date = 2000-11-02
    description = After an inadvertant trip through a Heisenberg
      Uncertainty Device, James Salway discovers the problems 
      of being quantum.
  book
    <xmlattr> = 
      id = bk110
    author = O'Brien, Tim
    title = Microsoft .NET: The Programming Bible
    genre = Computer
    price = 36.95
    publish_date = 2000-12-09
    description = Microsoft's .NET initiative is explored in 
      detail in this deep programmer's reference.
  book
    <xmlattr> = 
      id = bk111
    author = O'Brien, Tim
    title = MSXML3: A Comprehensive Guide
    genre = Computer
    price = 36.95
    publish_date = 2000-12-01
    description = The Microsoft MSXML3 parser is covered in 
      detail, with attention to XML DOM interfaces, XSLT processing, 
      SAX and more.
  book
    <xmlattr> = 
      id = bk112
    author = Galos, Mike
    title = Visual Studio 7: A Comprehensive Guide
    genre = Computer
    price = 49.95
    publish_date = 2001-04-16
    description = Microsoft Visual Studio 7 is explored in depth,
      looking at how Visual Basic, Visual C++, C#, and ASP+ are 
      integrated into a comprehensive development 
      environment.