Parsing files with RapidJson

If you need to parse JSON stored in a file from a C++ application, you can use RapidJson, a header-only library.

In a file named minimal.json, I have this JSON:
{
  "res":"OK",
  "tot":419,
  "nr":20,
  "ds":
  [
    {"a": "one/a","b": "one/b"},
    {"a": "two/a","b": "two/b"}
  ]
}
I want to do a few things with this JSON: check its status (stored in "res"), get the "nr" and "tot" integer fields, and get the "ds" array.

As a helper, I have created this class:
#include <string>
#include <rapidjson/document.h>

class RJDoc
{
private:
  rapidjson::Document doc_;

public:
  RJDoc(const std::string& filename);

  bool checkStatus();
  int getTotal();
  int getNumber();
  const rapidjson::Value& getDs();
};
As usual, I wrote a number of test cases to drive the code development. Here is one of them:
TEST(TestRJDoc, CaseMinimal)
{
  RJDoc reader("minimal.json"); // 1
  ASSERT_TRUE(reader.checkStatus()); // 2
  ASSERT_EQ(419, reader.getTotal()); // 3
  ASSERT_EQ(20, reader.getNumber());

  const rapidjson::Value& ds = reader.getDs(); // 4
  ASSERT_EQ(2, ds.Size()); // 5

  ASSERT_STREQ("one/a", ds[0U]["a"].GetString()); // 6
  ASSERT_STREQ("two/b", ds[1U]["b"].GetString());
  ASSERT_THROW(ds[3U]["a"].GetString(), std::logic_error); // 7
}
1. I want to be able to instantiate a reader passing the name of the associated file.
2. The status check should pass if the "res" field exists and it is set to "OK".
3. A couple of utility methods to get "tot" and "nr".
4. Access to the "ds" array should be provided by a function that returns it as a RapidJson Value object (a sort of variant).
5. Ensure that the array size is as expected.
6. An element of an object in a JSON array is retrieved using this syntax. Notice that we have to explicitly tell the compiler that the index is an unsigned value. And remember that RapidJson strings are actually plain C-strings.
7. I need to manage the case of a missing field. For instance, here I am trying to access a field of a non-existing element in the array. The default RapidJson behavior is to let an assertion fail. I want a standard exception to be thrown instead, so that I can catch it and perform some sort of fallback operation.
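About point (6): as far as I can tell, the explicit unsigned suffix is needed because the value class offers both an operator[] taking an index and one taking a C-string key. The literal 0 is also a valid null pointer constant, so the compiler cannot choose between the two. The sketch below reproduces the situation with Node, a stand-in class made up for this purpose (it is not RapidJson code), and firstItem, an equally hypothetical helper.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a JSON value offering both index and key lookup.
class Node
{
public:
  explicit Node(const std::vector<std::string>& items) : items_(items) {}

  // Array-style access by position.
  const std::string& operator[](unsigned index) const { return items_.at(index); }

  // Object-style access; for the sake of the example it looks for an equal string.
  const std::string& operator[](const char* key) const
  {
    for (std::size_t i = 0; i < items_.size(); ++i)
      if (items_[i] == key)
        return items_[i];
    throw std::out_of_range("no such key");
  }

private:
  std::vector<std::string> items_;
};

// node[0] would not compile here: the int literal 0 converts equally well to
// unsigned and to a null const char*, so the call is ambiguous.
// node[0U] is an exact match for the unsigned overload, so it is selected.
std::string firstItem(const Node& node)
{
  return node[0U];
}
```

Only the literal zero is affected in this sketch: a non-zero literal is not a null pointer constant, so it can only match the index overload.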

For point (7), rapidjson.h contains this piece of code:
///////////////////////////////////////////////////////////////////////////////
// RAPIDJSON_ASSERT

//! Assertion.
/*! By default, rapidjson uses C assert() for assertion.
 User can override it by defining RAPIDJSON_ASSERT(x) macro.
*/
#ifndef RAPIDJSON_ASSERT
#include <cassert>
#define RAPIDJSON_ASSERT(x) assert(x)
#endif // RAPIDJSON_ASSERT
So I defined the RAPIDJSON_ASSERT symbol before any RapidJson include in my code:
#include <stdexcept>
#define RAPIDJSON_ASSERT(x) ((x) ? (void)0 : throw std::logic_error("rapidjson exception"))
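The shape of such a macro matters: a form that expands to a single expression behaves like an ordinary statement even inside an unbraced if/else. Here is a minimal sketch of that idea. MYLIB_ASSERT and checkedDivide are hypothetical names invented for the example, standing in for a library assertion hook and for code that uses it. Neither is part of RapidJson.

```cpp
#include <stdexcept>

// Hypothetical stand-in for a library assertion hook such as RAPIDJSON_ASSERT.
// The #ifndef guard mimics the "user can override it" pattern.
// Written as a conditional expression, so it is usable wherever a statement
// or an expression is expected.
#ifndef MYLIB_ASSERT
#define MYLIB_ASSERT(x) \
  ((x) ? static_cast<void>(0) : throw std::logic_error("assertion failed: " #x))
#endif

// Illustrative helper: the unbraced if/else compiles and pairs up as written
// because the macro expands to exactly one expression statement.
int checkedDivide(int a, int b, bool strict)
{
  if (strict)
    MYLIB_ASSERT(b != 0); // throws std::logic_error when b is zero
  else if (b == 0)
    return 0;             // lenient fallback
  return a / b;
}
```

A caller can then wrap the risky call in a try block, catch std::logic_error, and fall back to a default value.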
I have defined my class constructor in this way:
#include <fstream>
#include <sstream>

RJDoc::RJDoc(const std::string& filename)
{
  std::stringstream ss;
  std::ifstream ifs;
  ifs.open(filename.c_str(), std::ios::binary);
  ss << ifs.rdbuf(); // 1
  ifs.close();

  if(doc_.Parse<0>(ss.str().c_str()).HasParseError()) // 2
    throw std::invalid_argument("json parse error"); // 3
}
1. I read the file stream into a string stream.
2. Then I try to parse the string held by the stream through the rapidjson::Document defined as a private data member.
3. What if the file is missing, or the data is malformed or corrupted? In all these cases the parse fails, so I decided to throw a standard invalid_argument exception (and I wrote a few test cases to document this behavior).
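The constructor treats a missing file and malformed JSON in the same way, because a stream that could not be opened simply yields an empty string. If you prefer to tell the two cases apart, a minimal sketch could look like this. readFile is a hypothetical helper, not something RapidJson provides, and throwing std::runtime_error for the missing file is just one possible choice.

```cpp
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: load a whole file into a string, or throw if the
// file cannot be opened.
std::string readFile(const std::string& filename)
{
  std::ifstream ifs(filename.c_str(), std::ios::binary);
  if (!ifs.is_open())
    throw std::runtime_error("cannot open " + filename);

  std::stringstream ss;
  ss << ifs.rdbuf(); // copy the whole stream buffer into the string stream
  return ss.str();
}
```

The returned string could then be handed to the document's Parse call, keeping the invalid_argument exception for genuine parse errors only.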

The other methods are even simpler than the constructor:
#include <cstring>

bool RJDoc::checkStatus()
{
  // Check for the member first: recent RapidJson versions assert in
  // operator[] when the requested member is missing.
  if(!doc_.IsObject() || !doc_.HasMember("res"))
    return false;

  const rapidjson::Value& status = doc_["res"];
  if(!status.IsString())
    return false;

  return std::strcmp(status.GetString(), "OK") == 0; // 1
}

int RJDoc::getTotal()
{
  return doc_["tot"].GetInt(); // 2
}

int RJDoc::getNumber()
{
  return doc_["nr"].GetInt();
}

const rapidjson::Value& RJDoc::getDs()
{
  rapidjson::Value& value = doc_["ds"];
  if(!value.IsArray())
    throw std::logic_error("bad ds"); // 3

  return value;
}
1. With my definition of RAPIDJSON_ASSERT, calling a RapidJson GetXXX() method on an element that is not present in the current document results in a std::logic_error exception. Here I don't want that, which is why I carefully check that "res" is an existing string element before getting its value and comparing it against "OK".
2. No special check for "tot" or "nr": a std::logic_error exception is thrown if they are missing.
3. I really want "ds" to be an array, otherwise there is no use in returning its unexpected value. So I throw, just as if no "ds" were available at all.
