[tor-bugs] #17939 [Onionoo]: Optimise the construction of details documents with field constraints
Tor Bug Tracker & Wiki
blackhole at torproject.org
Fri Jan 8 15:46:41 UTC 2016
#17939: Optimise the construction of details documents with field constraints
-------------------------+---------------------
Reporter: fmap | Owner:
Type: enhancement | Status: new
Priority: Low | Milestone:
Component: Onionoo | Version:
Severity: Minor | Resolution:
Keywords: | Actual Points:
Parent ID: | Points:
Sponsor: |
-------------------------+---------------------
Comment (by karsten):
I'm afraid I don't fully understand your last comment. My earlier
suggestion was to read the entire JSON string into memory and keep only the
parts with matching field names, but without the round trip of using Gson
to deserialize the string, copying the fields we care about into a new
object, and serializing that object using Gson again. My hope was that we
could write a simple text-based parser that only returns the indexes where
fields start and end in the string, which we could then use to feed the
parts we care about into a `StringBuilder`, basically skipping the
serializing step. We might be able to build this using some lower-level
Gson stuff or maybe even entirely without Gson. I'm not sure if you're
saying above that such an approach would be too fragile, but I could see
that being the case. Worth investigating, maybe.
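To make the text-based idea a bit more concrete, here is a rough sketch of what such a scanner could look like without Gson. The names `FieldSlicer` and `keepFields` are made up for illustration and are not Onionoo code. The sketch assumes well-formed JSON, only looks at top-level fields, and assumes field names contain no escape sequences, which is part of why it may turn out to be too fragile:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/* Hypothetical sketch: copies selected top-level members of a JSON
 * object into a StringBuilder without deserializing anything. */
public class FieldSlicer {

  /* Returns the index just past the string literal whose opening quote
   * is at position start. */
  static int skipString(String json, int start) {
    int i = start + 1;
    while (json.charAt(i) != '"') {
      if (json.charAt(i) == '\\') {
        i++; // Skip the escaped character, e.g., the quote in \".
      }
      i++;
    }
    return i + 1;
  }

  /* Returns the index just past the JSON value starting at start. */
  static int skipValue(String json, int start) {
    char c = json.charAt(start);
    if (c == '"') {
      return skipString(json, start);
    }
    int i = start;
    if (c == '{' || c == '[') {
      int depth = 0;
      while (true) {
        c = json.charAt(i);
        if (c == '"') {
          i = skipString(json, i); // Brackets inside strings don't count.
          continue;
        }
        if (c == '{' || c == '[') {
          depth++;
        } else if (c == '}' || c == ']') {
          depth--;
          if (depth == 0) {
            return i + 1;
          }
        }
        i++;
      }
    }
    /* Numbers, true, false, and null end at the next delimiter. */
    while (i < json.length() && ",}] \t\r\n".indexOf(json.charAt(i)) < 0) {
      i++;
    }
    return i;
  }

  static int skipWhitespace(String json, int i) {
    while (i < json.length() && Character.isWhitespace(json.charAt(i))) {
      i++;
    }
    return i;
  }

  /* Returns a JSON object string containing only those top-level members
   * of json whose names are in keep, in their original order. */
  static String keepFields(String json, Set<String> keep) {
    StringBuilder sb = new StringBuilder("{");
    int i = skipWhitespace(json, 0);
    if (json.charAt(i) != '{') {
      throw new IllegalArgumentException("Not a JSON object.");
    }
    i = skipWhitespace(json, i + 1);
    while (json.charAt(i) != '}') {
      int memberStart = i;
      int nameEnd = skipString(json, i);
      String name = json.substring(memberStart + 1, nameEnd - 1);
      i = skipWhitespace(json, nameEnd);
      if (json.charAt(i) != ':') {
        throw new IllegalArgumentException("Expected ':' at " + i);
      }
      int memberEnd = skipValue(json, skipWhitespace(json, i + 1));
      if (keep.contains(name)) {
        if (sb.length() > 1) {
          sb.append(',');
        }
        sb.append(json, memberStart, memberEnd); // Copy the raw text.
      }
      i = skipWhitespace(json, memberEnd);
      if (json.charAt(i) == ',') {
        i = skipWhitespace(json, i + 1);
      }
    }
    return sb.append('}').toString();
  }

  public static void main(String[] args) {
    String json = "{\"outerField\":\"some text\","
        + "\"innerObject\":{\"innerField\":\"some inner text\"}}";
    Set<String> keep = new HashSet<String>(Arrays.asList("innerObject"));
    /* Prints {"innerObject":{"innerField":"some inner text"}} */
    System.out.println(keepFields(json, keep));
  }
}
```

The scanner never builds objects. It only tracks where each top-level member starts and ends and copies the matching ranges verbatim, so whatever formatting the stored string has is preserved. Whether that is actually faster than Gson would still need to be measured.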
But here's another approach that still uses Gson, but possibly in a more
efficient (and less hacky) way:
{{{
import java.util.HashSet;
import java.util.Set;

import com.google.gson.ExclusionStrategy;
import com.google.gson.FieldAttributes;
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;

public class Deserializer {

  static class Inner {
    String innerField;
  }

  static class Outer {
    String outerField;
    Inner innerObject;
  }

  /* Skips all fields declared in the given class except those whose
   * names are in the given set; fields declared in other classes are
   * never skipped. */
  static class FieldsExclusionStrategy implements ExclusionStrategy {
    Class<?> declaringClass;
    Set<String> includedFieldNames;

    public FieldsExclusionStrategy(Class<?> clazz,
        Set<String> includedFieldNames) {
      this.declaringClass = clazz;
      this.includedFieldNames = includedFieldNames;
    }

    @Override
    public boolean shouldSkipField(FieldAttributes field) {
      return this.declaringClass.equals(field.getDeclaringClass()) &&
          !includedFieldNames.contains(field.getName());
    }

    @Override
    public boolean shouldSkipClass(Class<?> clazz) {
      return false;
    }
  }

  public static void main(String[] args) {
    /* Serialize a complete object without any exclusion strategy. */
    Outer t = new Outer();
    t.outerField = "some text";
    t.innerObject = new Inner();
    t.innerObject.innerField = "some inner text";
    Gson gson = new GsonBuilder().create();
    String json = gson.toJson(t);
    System.out.println(json);

    /* Deserialize and serialize again, keeping only innerObject. */
    Set<String> keepFieldNames = new HashSet<String>();
    keepFieldNames.add("innerObject");
    gson = new GsonBuilder().setExclusionStrategies(
        new FieldsExclusionStrategy(Outer.class,
        keepFieldNames)).create();
    System.out.println(gson.toJson(gson.fromJson(json, Outer.class)));
  }
}
}}}
I could imagine that this approach is more efficient, because Gson can
skip lots of fields and because we're not creating a separate object and
copying fields over. The code is also less ugly, because we don't have to
add new code for each field we add. Oh, and it works for all
document types. But it still uses Gson for both deserializing and
serializing. If this looks promising, let's benchmark it.
Thanks!
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/17939#comment:5>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online