[tor-bugs] #17939 [Onionoo]: Optimise the construction of details documents with field constraints

Fri Jan 8 15:46:41 UTC 2016

#17939: Optimise the construction of details documents with field constraints
-------------------------+---------------------
 Reporter:  fmap         |          Owner:
     Type:  enhancement  |         Status:  new
 Priority:  Low          |      Milestone:
Component:  Onionoo      |        Version:
 Severity:  Minor        |     Resolution:
 Keywords:               |  Actual Points:
Parent ID:               |         Points:
  Sponsor:               |
-------------------------+---------------------

Comment (by karsten):

 I'm afraid I don't fully understand your last comment.  My earlier
 suggestion was to read the entire JSON string into memory and keep only
 the parts with matching field names, without the round trip of
 deserializing the string with Gson, copying the fields we care about into
 a new object, and serializing that object with Gson again.  My hope was
 that we could write a simple text-based parser that only returns the
 indexes where fields start and end in the string.  We could then use those
 indexes to feed the parts we care about into a `StringBuilder`, basically
 skipping the serializing step.  We might be able to build this using some
 lower-level Gson stuff or maybe even entirely without Gson.  I'm not sure
 if you're saying above that such an approach would be too fragile, but I
 could see that being the case.  Worth investigating, maybe.
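
 To make the text-based idea a bit more concrete, here is a minimal sketch
 of what such a scanner might look like.  It is an illustration only, not
 Onionoo code: the class and method names are made up, it only filters
 top-level fields, it assumes the input is well-formed JSON, and it assumes
 field names contain no escape sequences.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/* Hypothetical sketch: copy selected top-level fields of a JSON object
 * into a StringBuilder without deserializing.  Assumes well-formed input
 * and field names without escape sequences. */
public class FieldFilterSketch {

  /* Returns the index just past the string whose opening quote is at i. */
  static int skipString(String s, int i) {
    i++;  // Skip the opening quote.
    while (s.charAt(i) != '"') {
      // A backslash escapes the next character, so step over both.
      i += s.charAt(i) == '\\' ? 2 : 1;
    }
    return i + 1;
  }

  static int skipWhitespace(String s, int i) {
    while (i < s.length() && Character.isWhitespace(s.charAt(i))) {
      i++;
    }
    return i;
  }

  /* Returns the index just past the JSON value starting at i. */
  static int skipValue(String s, int i) {
    char c = s.charAt(i);
    if (c == '"') {
      return skipString(s, i);
    }
    if (c == '{' || c == '[') {
      int depth = 0;
      while (i < s.length()) {
        c = s.charAt(i);
        if (c == '"') {
          // Brackets inside strings must not affect the nesting depth.
          i = skipString(s, i);
          continue;
        }
        if (c == '{' || c == '[') {
          depth++;
        } else if (c == '}' || c == ']') {
          depth--;
          if (depth == 0) {
            return i + 1;
          }
        }
        i++;
      }
      throw new IllegalArgumentException("Unbalanced brackets.");
    }
    // Number, true, false, or null: read up to the next delimiter.
    while (i < s.length() && ",}] \t\r\n".indexOf(s.charAt(i)) < 0) {
      i++;
    }
    return i;
  }

  /* Returns a JSON object containing only the top-level fields in keep. */
  static String filter(String json, Set<String> keep) {
    StringBuilder sb = new StringBuilder("{");
    int i = skipWhitespace(json, 0);
    if (json.charAt(i) != '{') {
      throw new IllegalArgumentException("Expected a JSON object.");
    }
    i = skipWhitespace(json, i + 1);
    while (json.charAt(i) != '}') {
      int nameStart = i;
      int nameEnd = skipString(json, i);
      String name = json.substring(nameStart + 1, nameEnd - 1);
      i = skipWhitespace(json, nameEnd);
      if (json.charAt(i) != ':') {
        throw new IllegalArgumentException("Expected ':' at " + i);
      }
      int valueStart = skipWhitespace(json, i + 1);
      int valueEnd = skipValue(json, valueStart);
      if (keep.contains(name)) {
        if (sb.length() > 1) {
          sb.append(',');
        }
        // Copy name and value verbatim from the original string.
        sb.append(json, nameStart, nameEnd).append(':')
            .append(json, valueStart, valueEnd);
      }
      i = skipWhitespace(json, valueEnd);
      if (json.charAt(i) == ',') {
        i = skipWhitespace(json, i + 1);
      }
    }
    return sb.append('}').toString();
  }

  public static void main(String[] args) {
    String json = "{\"outerField\":\"some text\","
        + "\"innerObject\":{\"innerField\":\"some inner text\"}}";
    Set<String> keep = new HashSet<String>(Arrays.asList("innerObject"));
    // Prints {"innerObject":{"innerField":"some inner text"}}
    System.out.println(filter(json, keep));
  }
}
```

 Even this small version has to know about string escapes and nesting,
 which hints at where the fragility would come from.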

 But here's another approach that still uses Gson, but possibly in a more
 efficient (and less hacky) way:

 {{{
 import java.util.HashSet;
 import java.util.Set;

 import com.google.gson.ExclusionStrategy;
 import com.google.gson.FieldAttributes;
 import com.google.gson.Gson;
 import com.google.gson.GsonBuilder;

 public class Deserializer {
   static class Inner {
     String innerField;
   }
   static class Outer {
     String outerField;
     Inner innerObject;
   }
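   // Skips every field declared in declaringClass whose name is not in
   // includedFieldNames; fields of other classes (e.g., Inner) are kept.
   static class FieldsExclusionStrategy implements ExclusionStrategy {
     private final Class<?> declaringClass;
     private final Set<String> includedFieldNames;
     public FieldsExclusionStrategy(Class<?> declaringClass,
         Set<String> includedFieldNames) {
       this.declaringClass = declaringClass;
       this.includedFieldNames = includedFieldNames;
     }
     @Override
     public boolean shouldSkipField(FieldAttributes field) {
       return this.declaringClass.equals(field.getDeclaringClass()) &&
           !this.includedFieldNames.contains(field.getName());
     }
     @Override
     public boolean shouldSkipClass(Class<?> clazz) {
       // Never exclude whole classes; filtering is per field only.
       return false;
     }
   }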
   public static void main(String[] args) {
     Outer t = new Outer();
     t.outerField = "some text";
     t.innerObject = new Inner();
     t.innerObject.innerField = "some inner text";
     Gson gson = new GsonBuilder().create();
     String json = gson.toJson(t);
     System.out.println(json);

     // Round-trip the JSON through a Gson instance that keeps only
     // innerObject among the fields declared in Outer.
     Set<String> keepFieldNames = new HashSet<String>();
     keepFieldNames.add("innerObject");
     gson = new GsonBuilder().setExclusionStrategies(
         new FieldsExclusionStrategy(Outer.class,
             keepFieldNames)).create();
     System.out.println(gson.toJson(gson.fromJson(json, Outer.class)));
   }
 }
 }}}

 I could imagine that this approach is more efficient, because Gson can
 skip lots of fields and because we're not creating a separate object and
 copying over fields.  And it's less ugly code, because we don't have to
 add new code for each field we're adding.  Oh, and it works for all
 document types.  But it's still using Gson for deserializing and
 serializing.  If this looks promising, let's benchmark it.

 Thanks!

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/17939#comment:5>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online